Computer Engineering & Science

A survey of task management techniques

for big data stream computing

LIANG Yi1，HOU Ying1，CHEN Cheng1，JIN Yi2

2017, 39(02): 215-226. doi:

Abstract ( 264 )

PDF (693KB) ( 892 ) 　　

Review attachment

Stream computing is an important part of big data computing, which has become a hot topic in big data research. Task management is one of the essential features of stream computing, and is responsible for resource scheduling and lifecycle management of stream computing tasks. Current researches focus on application requirements, architecture and overall technology of stream computing, and they are lack of dedicated investigation and analysis of task management techniques. Firstly, we present a general abstract function model of task management for stream computing systems. Secondly, we classify and analyze the key techniques for task management based on this model. Finally, we investigate their applications in current stream processing systems, and the integration and optimization of above techniques.

Parallel multi-label K-nearest

neighbor algorithm based on Spark

WANG Jin,XIA Cuiping,OUYANG Weihua,WANG Hong,DENG Xin,CHEN Qiaosong

2017, 39(02): 227-235. doi:

Abstract ( 197 )

PDF (1090KB) ( 676 ) 　　

Review attachment

With the advent of big data era, applications of largescale multilabel data mining have attracted extensive attention.The MultiLabel KNearest Neighbor (MLKNN) is a simple, efficient and widely used method which outperforms other traditional multilabel learning algorithms in many realworld applications. However, as an increasing number of data need to be dealt with, the MLKNN algorithm is unable to meet the requirements of time and memory space. Combined with the parallel mechanism and iterative computation in the memory of Spark, we propose an algorithm based on Spark distributed inmemory computing platform, named SMLKNN. First, in the stage of map,we try to find the K nearest neighbors for each partition of the samples to be tested. Then in the reduce stage, we determine the final K nearest neighbors according to the K nearest neighbors of each partition.Finally, we cluster the label sets of the K nearest neighbors in parallel, and output the target label sets using the maximum posterior probability (MAP) principle. The experiments in standalone and cluster environments show that in the premise of ensuring the classification accuracy, the performance of the SMLKNN has an approximate linear relationship with computing resources, and the proposed algorithm can enhance the processing ability of the MLKNN when dealing with large scale multilabel data.

Parallel anomaly detection based on Isolation Forest

HOU Yongxu1，DUAN Lei1,2，QIN Jianglong3，QIN Pan1，TANG Changjie1

2017, 39(02): 236-244. doi:

Abstract ( 248 )

PDF (1110KB) ( 685 ) 　　

Review attachment

Anomaly detection, which is used in a variety of applications, attracts attention both in industry and academia. Among numerous methods for anomaly detection, the Isolation Forest algorithm, whose characteristics include high efficiency, sound detection accuracy, has wide realworld applications. However, the conventional Isolation forest algorithm can hardly deal with largescale data sets. To break this limitation, we propose a cloud computing platform based algorithm. Specifically, we design and implement a parallel algorithm for anomaly detection based on Isolation Forest, named PIFH,using the Hadoop distributed storage system and the MapReduce distributed computational framework. By parallelizing the processes of detection model construction and anomaly evaluation, its efficiency is improved, and the application range is also extended. Experiments using realworld data sets demonstrate that the proposed algorithm is efficient and scalable.

Design and implementation of

a multistage bufferless highradix router

YANG Wenxiang,DONG Dezun,LI Cunlu,LEI Fei,SUN Kaixuan,WU Ji

2017, 39(02): 245-251. doi:

Abstract ( 184 )

PDF (874KB) ( 532 ) 　　

As the scale of high performance networks is increasing,the design of highradix router architectures is becoming a hotspot in the field of high performance computing. Using highradix routers, the network can achieve lower transmission latency, lower cost of network construction and lower power consumption, and improve network reliability simultaneously. As the radix of high performance routers increases continuously, simply extending the radix of singlestage crossbar fabric can greatly increase internal connection resources inside the routers, and the cost of switches becomes unbearable.So there is a pressing need to design a new fabric for highradix routers. Over the past decade, the structured designs represented by the YARC and the "network within a network" approach have appeared, and future study focuses on solving all kinds of problems such as buffer and arbitration to design better architectures.We implement a highradix router with a multistage Clos network inside and there are corresponding arbitration modules to schedule requests for each stage. Packet memory buffers are implemented at input and output ports, and the network is bufferless besides these memories. We conduct extensive simulations to evaluate the performance in BookSim simulator, and the results show that the highradix router we design works properly and provides good performance.

A scalable memorybuiltinselfrepair

algorithm based on ECC check code

REN Xiujiang1,XIE Xianghui2,SHI Jingjing1

2017, 39(02): 252-257. doi:

Abstract ( 198 )

PDF (841KB) ( 546 ) 　　

Review attachment

With the continuous progress of microelectronic technology, the static random access memory （SRAM） occupies the majority area of modern systemsonachip (SoC), so the defect rate of the SRAM has become an important factor affecting the yield of chips. We propose a scalable memorybuiltinselfrepair algorithm（SMBISR）based on error checking and correcting (ECC) check code. With the same redundant SRAM structure, the correcting capability of the ECC code can enhance the faulttolerant capability, thus increasing the rate of finished product of chips effectively without increasing test time. We implement the algorithm on the RTL, and the evaluation of the backend design shows that its working frequency can reach 1GHz while the area overhead is only 1.5%.

A highperformance parallel encryption algorithm for streams

FEI Xiongwei1,2,LI Kenli2,YANG Wangdong1,2

2017, 39(02): 258-266. doi:

Abstract ( 201 )

PDF (968KB) ( 842 ) 　　

Review attachment

With constant increase of network users and security requirements, protecting user data streams by the advanced encryption standard (AES) is widely used. As for servers, data streams produced by massive users have the properties of high speed and strong burst, but traditional serial encryption can cause service unavailability or low service quality because of low efficiency. Based on the popular central processing unit ( CPU) + graphics processing unit (GPU) environment, we parallelize AES in the form of pipeline to improve encryption performance, and control burst flows using a slipping window to offer highquality stream encryption service. Experimental results show that our proposed parallel AES in heterogeneous environment can meet the requirements of user data streams with high speed and strong burst, improve the speed of dealing, and control flows effectively.

Image cache replacement algorithms based

on social relationships for mobile terminals

WANG Jing1,2, NIU Lijie1,2

2017, 39(02): 267-274. doi:

Abstract ( 139 )

PDF (1000KB) ( 484 ) 　　

Review attachment

As mobile terminals have great involvement in people’s lives, the mobile social APPs have been widely used. A large number of image resources are used in mobile social APPs, such as WeChat’s circles and Instagram’s picture sharing. While browsing pictures in APPs, we can consume more network traffic and the loading speed is affected. So most APPs display thumbnails first, and then load the original images according to the needs of users. On the server,we usually use caching techniques to speed up the thumbnail generation and reduce the I/O of the disk. However, current cache mechanism concerns more about the cache access frequency, last access time and other factors, ignoring social relationships between users and the different access modes how mobile users access the thumbnails and the original pictures. So we divide the cache buffer into two layers: the original cache buffer and the thumbnail cache buffer, and propose two image cache replacement algorithms based on social relationships, which add social relationships among users and relationships between the thumbnails and the original pictures to the traditional cache replacement algorithm, and replace the cache objects by computing the cache value.Experimental results show that the proposed image cache replacement algorithms based on social relationshipscan improve cache hit ratio of the thumbnails and the original images.

A Torus network routing algorithm guided by straight line

DING Yuliang，ZHANG Jianxian，ZHOU Duan，QIU Xuehong

2017, 39(02): 275-279. doi:

Abstract ( 208 )

PDF (517KB) ( 500 ) 　　

Review attachment

To increase the communication efficiency of the network on chip in TORUS topology, we propose a new routing algorithm based on straight line guidance, namely Tline routing. The new algorithm extends the Torus topology to a coordinate plane that is similar to the mesh structure. The route forwarding direction is guided by a straight line formed by the source and destination of packets. The partially adaptive routing is realized according to the congestion condition around the neighboring nodes. Experimental results show that the Tline routing can achieve better routing performance in comparison with the XY and OE routing algorithms, and the average power consumption decreases by about 8%.

Optimization of FC-AE-ASM protocol

LI Pan1,2,TIAN Ze1,2,CAI Yefang1,2,ZHANG Yishu3,YANG Haibo1,2,HUO Weitao1,2,WANG Yuhuan1,2

2017, 39(02): 280-284. doi:

Abstract ( 331 )

PDF (731KB) ( 534 ) 　　

Review attachment

We achieve fibre channel (FC) protocol implementations according to the documents, study indepth on the fibre channelavionics environmentanonymous subscriber messaging (FCAEASM) protocol, and optimize the process of receiving/sending of ASM messages in the process of FCAEASM protocol implementation based on the FCIP Xilinx core. The principles we follow are: using hardware circuits as much as possible, reducing software intervention, improving protocol processing and execution efficiency, thus meeting the low delay and real time demand of the aviation electronic environment. Finally, we implement the optimized protocol processing using the Verilog language. Modelsim simulation results show that the FC link rate is 2.125/1.062 5 Gbps, and the maximum bandwidth of the ASM message sending and receiving of the maximum payload can reach the bandwidth of the FC link line rate, thus satisfying the realtime requirements of the FCAEASM protocol, and providing reference to the construction of fiber channel embedded networks in the avionics environment.

An improved Bloom Filter algorithm under

the Hadoop for duplicated web page removal

HUANG Wei-jian,YANG Hai-long

2017, 39(02): 285-290. doi:

Abstract ( 192 )

PDF (682KB) ( 432 ) 　　

Review attachment

To solve the space waste problem existing in the server space where a lot of duplicated and similar data are stored, we propose an improved Bloom Filter algorithm, which adds an array of bit and dynamically optimizes the number of copies of duplicated data according to the weight calculated by the repeated hits of the bit array. Then, the improved algorithm is parallelized in the Hadoop distributed cluster to further improve the processing efficiency. Experimental results show that compared with traditional web duplicate removal algorithms, the improved Bloom filter algorithm can not only improve the processing efficiency of jobs, but also save the server storage space to a certain extent by dynamically optimizing the number of copies of duplicated data according to the repeated hits of the bit array.

TDMA-based MAC layer access control

for long distance wireless networks

HUA Chao1,2,HUANG Chuan-he1,2

2017, 39(02): 291-296. doi:

Abstract ( 186 )

PDF (456KB) ( 438 ) 　　

Review attachment

With the popularity of mobile terminals in recent years, people expect more convenient and more efficient wireless networks. Today's wireless networks are committed to providing characteristics such as efficiency, fairness and quality of service. However, in the long distance wireless network environment, these requirements are difficult to meet. We propose an uplink transmission scheme of the control terminal, which can perform long distance transmission and short distance transmission at the same time to improve the bandwidth utilization of the base station. Our proposal mainly solves the following problems: (1) how to get the distance of the terminal; (2) how to schedule the transmission of each terminal to improve the bandwidth utilization rate.

A relay-node selection method of

vehicle-to-vehicle network in mountain regions

CAO Dun1,2，LEI Zheng-bao1

2017, 39(02): 297-302. doi:

Abstract ( 124 )

PDF (702KB) ( 451 ) 　　

In vehicle-to-vehicle networks, the selection of an appropriate relay node plays a key role in achieving efficient and reliable message dissemination. Current works mainly study the relay-node selection on the straight road of highways or at the intersection in urban areas, but few focuses on the curves in mountain regions. We develop an exponent-based partitioning broadcast protocol on curves (EPBPC). Without any prior information of vehicles, the EPBPC maps the vehicles’ location on the curve to some points in a straight line, and selects the node in the farthest and thinnest segment as the relay node by the aid of black burst. Simulations in a real mountain region show that the EPBPC outperforms the state-of-the-art protocols in transmission delay and packet delivery ratio: end-to-end delay can be reduced by24.45% in dense vehicle networks, and packet delivery ratio is higher than 99.8%.

A bias field variational image segmentation

model restraining texture information

LI Hu，WANG Xi-li

2017, 39(02): 303-310. doi:

Abstract ( 134 )

PDF (1135KB) ( 539 ) 　　

The variational level set image segmentation based on the bias field can segment intensity inhomogeneity images using the local image information. However, the model cannot do it well when there are textures in the image. To solve the above problem, we propose a bias field variational level set image segmentation model to suppress texture information. The texture can be restrained by the intrinsic texture descriptor based on the texture geometric structure and it can enhance the contrast among different texture regions and smooth the image in the same texture region. It can reduce the mistakes of texture region segmentation by restraining the texture. Experimental results indicate that the proposed model can better segment natural and synthetic texture images in comparison with the bias field variational model.

Segmentation of tumor nests and stomata of

HE-stained breast cancer histopathological images

KAN Xian-xiang，LIU Juan，QU Ai-ping

2017, 39(02): 311-316. doi:

Abstract ( 391 )

PDF (661KB) ( 635 ) 　　

Review attachment

It is common to analyze and diagnose breast cancer (BC) via HE-stained BC histopathological images. Pathologists generally believe that the pathological and morphological features of tumor nests (TNs) and stroma indicate biological behavior of BC, so it becomes particularly important to accurately segment the TNs and stroma. For HE-stained BC histopathology images, we regard the segmentation as the classification of pixels in the image, extract and analyze features, choose the best combination of features, and then classify it as TNs or stroma. And procedures of sample interval, normalization and thresholding are also taken into account. Experimental results show that the proposed method can segment the TNs and stromata accurately and ensure higher accuracy and precision. What's more, it is satisfactory in terms of time efficiency.

An improved fast SLIC segmentation algorithm

MA Jun-fu，WEI Wei

2017, 39(02): 317-322. doi:

Abstract ( 326 )

PDF (924KB) ( 515 ) 　　

Review attachment

Superpixels algorithms have been applied to various fields of computer vision in recent years. Superpixels can capture the redundant information of the image and reduce the complexity of post-processing. As the preprocessing of images, superpixels segmentation needs to meet the demands of real-time properties and accuracy of image processing. We propose an improved fast SLIC segmentation algorithm in the framework of the SLIC algorithm to improve the efficiency of superpixels segmentation. We extract a small number of pixels of the original image to generate a small scale image by reducing the scale of the original image, and employ the SLIC algorithm to do superpixels segmentation of the downscaled image. After the primary superpixel segmentation, we perform K nearest neighbor classification on the pixels of the original image according to the segmentation results of the downscaled image and realize final superpixels segmentation. Experimental results show that with similar accuracy, the proposed algorithm is much faster than the SLIC algorithm when processing the same object.

A new medical image classification method based on

convolution restricted Boltzmann machine

ZHANG Juan,JIANG Yun,HU Xue-wei,XIAO Ji-ze

2017, 39(02): 323-329. doi:

Abstract ( 166 )

PDF (778KB) ( 638 ) 　　

Review attachment

Data mining methods are widely used to analyze medical images in current research. Commonly used mining methods first need to extract features from medical images and then do classification analysis. At present, the statistical features extracted from images are mostly applied, however, it has a strong dependence on the extracted features. We propose a new classification method based on convolution restricted Boltzmann machine (CRBM), which can train the CRBM model by the fast continuous contrastive divergence algorithm. The method can directly and automatically learn features from the mammography image and use these features to do classificature. Experimental results show that the proposed method can improve the classification accuracy of medical images.

A human fall detection algorithm

based on acceleration sensor

SUN Zi-wen,SUN Xiao-wen

2017, 39(02): 330-335. doi:

Abstract ( 325 )

PDF (547KB) ( 766 ) 　　

Review attachment

Aiming at the accuracy degradation problem caused by improper threshold setting for human fall detection threshold algorithms, we introduce the support vector machine to select thresholds. We obtain human motion information from an acceleration sensor, and extract acceleration and angle as classification features. We propose a fall detection model based on thresholds according to the three stages of human fall: free fall, hitting the ground and stationary. The fall detection model sets thresholds by the support vector machine and manual method respectively, and simulation results show that compared with the manual-threshold-setting method, the accuracy of the proposal is higher, which proves the effectiveness of the proposed method.

An integer FPGA architecture of guided filtering

LIU Zhu-hua,YUAN Wen

2017, 39(02): 336-342. doi:

Abstract ( 245 )

PDF (658KB) ( 533 ) 　　

Review attachment

We analyze a high performance FPGA architecture of guided filtering for single image and find out that the mean filter of guided filtering has design defects. For this reason, we propose an integer FPGA architecture of guided filtering. Changing the sequence of data accumulation of the mean filter can reduce the use of memory on the FPGA. At the same time, we calculate the variance and transform coefficients of guided filtering by integer processing. Moreover, the size of image and filtering window can be changed flexibly via parameter adjustment. We implement the new architecture on the FPGA of Altera’s Cyclone, and experimental results show that the maximum error between the integer FPGA architecture’ processing result and the floating-point calculation result after rounding off is less than one. The new architecture greatly reduces the use of hardware resource and effectively improves the speed of data processing. When be implemented on the EP3C40F484C8, it is capable of processing images with dimension of 1024 by 1024 at the frame rate of 162 FPS, and can well meet the requirements of all kinds of real-time image processing.

regularization and its application in face recognition

FU Jun-peng,CHEN Xiu-hong,GE Xiao-qian

2017, 39(02): 343-350. doi:

Abstract ( 188 )

PDF (770KB) ( 539 ) 　　

Review attachment

For high-dimensional data reduction, selection of effective features is important for classification. In order to solve the high-dimensional and small sample size problem in face recognition, starting with the feature selection and subspace learning, we propose a new method of uncorrelated linear discriminant analysis based on L2,1-norm regularization. To add L2,1-norm penalty term to the objective function, this algorithm firstly decomposes the sample matrix by the SVD. Then it presents a series of transformation, transforming its nonlinear Fisher criterion into linear type. Finally, it adds the L2,1-norm penalty term to the linear model, and solves the regularization problem to get a set of optimal discriminant vectors. We project training samples and testing samples onto low-dimensional subspace respectively, and use the nearest Euclidean distance classifier to classify the testing samples. Due to the characteristic of L2,1-norm, which can perform feature selection and subspace learning simultaneously, the recognition performance is greatly improved. Experiments on three standard face databases (ORL, YaleB and PIE) verify the performance of the algorithm, and show the efficiency of dimensionality reduction and the improvement of discriminant ability.

Image fusion based on area characteristics

in finite discrete shearlet domain

CHEN Qing-jiang,ZHANG Yan-bo,CHAI Yu-zhou,WEI Bing-zhe

2017, 39(02): 351-358. doi:

Abstract ( 140 )

PDF (947KB) ( 426 ) 　　

Review attachment

In order to improve the accuracy of multi-focus image and infrared visible image fusion, based on the advantages of the good localization and shift invariance of the finite discrete shearlet transform, we propose a new image fusion algorithm, called FDST, which combines contrast with image gradient information correlation factor. Firstly, the FDST decomposes the registration images, and obtain the low frequency sub-band coefficients and high frequency sub-band coefficients of different scales and directions. The fusion principle of low frequency sub-band coefficients bases on the method of image gradient information correlation factor for weighting. The high frequency sub-band coefficients combine the high frequency and low frequency sub-band coefficients with contrast, and uses contrast as the criterion for choosing metric coefficients which is taken as the fusion rule. Finally, the low frequency information and high frequency information are reconstructed to obtain a fused image by the finite discrete shearlet inverse transform, and both subjective visual evaluation and objective performance assessments of the fusion results are implemented. The results indicate that the algorithm is superior to other fusion algorithms in subjective visual effects and objective evaluation.

Current Issue

Author center

Review center

Online journal