Computer Engineering & Science

A parallel dynamic bit vector based frequent

closed sequence pattern mining algorithm

CHEN Qian,LIU Yun,GAO Yuying

2018, 40(10): 1717-1725. doi:

Abstract ( 295 )

PDF (1035KB) ( 556 ) 　　

For long sequence databases, which have high computational costs both in time and space, a mining model that is more efficient and compact and can extract information completely is a current research hotspot. We propose a parallel dynamic bit vector based frequent closed sequence pattern mining algorithm (PDBV-FCSP), which combines the multicore processor architecture with the DBV data structure to effectively speed up the processing speed of the sequence database. The search space is divided, and the closed check of the preprocessing sequence is executed as early as possible, which reduces the required storage space and the execution time of mining the frequent closed sequence mode, and overcomes the problems of communication overhead, synchronization and data replication of the existing parallel mining algorithms. The dynamic load balancing mechanism for job redistribution is used to solve the load balancing problem of workloads among processors, thus minimizing the idle CPU time. Simulation results show that, compared with the DBVVDF algorithm, the PDBVFCSP algorithm has better performance in terms of running time, memory usage and scalability. And when the core number increases, the performance is better.

Heat dispersion structure design of thin

client computers based on FT processor

LI Hong,LI Jun,GONG Guohui

2018, 40(10): 1726-1730. doi:

Abstract ( 395 )

PDF (557KB) ( 661 ) 　　

Review attachment

Due to the relatively high power consumption of thin client computers based on current domestic processors, the commonly used passive heat dispersion method with the fanfree thermal conductivity is slightly inadequate in the aspect of heat dispersion performance of the whole machine, having the dead zone of heat dispersion in particular. We design a thin client computer based on quadcore FT processor. The thin client computer has a double thermal module, 4 inlet ducts and 2 exhaust ducts, so that there is no heat dead zone inside of the client computer, and it can achieve a good heat dispersion performance of the whole machine. Temperature test data show that, when the computer runs under full load condition in the environment of 25℃the temperature of each test point in the body is within 52℃ , which meets the working temperature requirements of all components. Besides, the outer surface temperature of the machine is lower than 36℃, which brings favorable user experience.

A matrix factorization recommendation algorithm

based on adaptive tag selection

SONG Wei1,2,LI Xuesong1

2018, 40(10): 1731-1736. doi:

Abstract ( 381 )

PDF (484KB) ( 390 ) 　　

Review attachment

Incorporating tags into matrix factorization is a hot topic in the field of recommender system. Based on adaptive tag selection, we propose a new matrix factorization recommendation algorithm. Firstly, we put forward a tagrating sparsity factor, which balances the usage of latent factors and tags in recommendation. Secondly, tag vectors are computed by the number of tags, which reflects the influence of different frequencies of tags on different items. Finally, the overall description of the proposed algorithm is illustrated. Experimental results show that the proposed algorithm has high recommedation accuracy and high convergence speed.

Web services clustering based on Biterm topic model

CHEN Ting1,2,LIU Jianxun1,2,CAO Buqing1,2,LI Run2

2018, 40(10): 1737-1745. doi:

Abstract ( 454 )

PDF (731KB) ( 487 ) 　　

Review attachment

It is not ideal for the huge number of Web services with data sparseness to use traditional modeling methods to cluster them. To solve this problem, we present a Web service clustering method based on the Biterm topic model (BTM). This method firstly employs the BTM to learn the latent topics of Web service description corpus. Secondly, it derives the topic distribution of each Web service. Finally, it uses the K-Means algorithm to cluster Web services. Compared with the LDA and TF-IDF clustering methods, the proposed approach achieves better performance in purity, entropy and F-measure. Our method can effectively solve the data sparseness problem caused by the short text nature of Web service description, and significantly improve service clustering.

Review:Traffic identification based on machine learning

ZHAO Shuang,CHEN Shuhui

2018, 40(10): 1746-1756. doi:

Abstract ( 683 )

PDF (569KB) ( 1446 ) 　　

Traffic identification is an essential stage for network management and security. As the effectiveness of portnumberbased techniques and deep packet inspection techniques is diminishing, machine learning based traffic identification has become particularly notable in the past decade. Given the importance of traffic identification, we first give a brief overview of traffic identification techniques and the basic concepts concerned, including application scenarios, input objects, identification types and evaluation metrics. Then, in the context of machine learning, we detail the development of key techniques, such as data sets acquisition, features extraction and selection, and identification model design. Additionally, we summarize and compare recent mainstream studies. Finally, we discuss the major challenges and prospects of machine learning based traffic identification.

IPv6 property testing for OpenFlow protocol

LI Yuanping1,3,LI Hua1,2,ZHAO Junlan3,RUAN Hongwei1

2018, 40(10): 1757-1765. doi:

Abstract ( 453 )

PDF (1168KB) ( 347 ) 　　

Since the data forwarding and control are separated in SDN networks, the OpenFlow protocol plays an important role in its southbound interface. With the development of the next generation Internet, available IPv4 address resources are almost used up, and the bottleneck of IPv4 networks is highlighted. How to deploy IPv6 networks quickly, making their contribution for social production and life, and realizing longterm coexistence of current networks and IPv6 networks or a smooth transition to IPv6 networks, becomes an urgent problem for the industry and academia.SDN networks provide such an option, so verifying whether the OpenFlow protocol supports IPv6 protocol has attracted our attention. We construct a formal model for the OpenFlow protocol, and build a nondeterministic finite state machine (NFSM). In order to guide the testing, we also achieve a test generation tree. We focus on identifying whether this protocol supports the IPv6 protocol, and generate 167 test cases by a combination test method. We also develop a test engine, which supports efficient test generation algorithms, test operation and identification. We test our test engine on the test cases, and analyze the test results. The quantitative analysis results meet the expected requirements.

A coverage hole recovery algorithm with minimum

energy consumption based on polar coordinates in WSNs

CUI Lizhen，LI Xiaoyu，HU Haidong，GAO Lili

2018, 40(10): 1766-1771. doi:

Abstract ( 408 )

PDF (735KB) ( 422 ) 　　

Aiming at the coverage hole problem in hybrid wireless sensor networks (WSNs), we propose a coverage hole recovery algorithm based on polar coordinates. Firstly, we determine the boundary points of coverage holes by calculating the intersections of statistic nodes' sensing circle, and connect the boundary points to construct coverage hole polygons. Secondly, the location of virtual recovering nodes in every polygon is calculated according to the polar coordinate method. Finally, we build the distance data table between virtual recovering nodes and moving nodes to complete the hole recovery work, that is moving the moving nodes in the table to the locations of virtual nodes that match them. Simulation results show that compared with other similar algorithms, the proposed algorithm can determine and recover the coverage holes in WSNs effectively, and meanwhile it needs less moving nodes and has a shorter average moving distance. In addition, it prolongs the network's life cycle while improving the quality of coverage.

A robust digital watermarking algorithm

based on Slant transform and SVD

XIAO Zhenjiu，GUO Bingying，LI Nan，TANG Xiaoliang

2018, 40(10): 1772-1779. doi:

Abstract ( 497 )

PDF (997KB) ( 418 ) 　　

Aiming at the false alarm errors of singular value decomposition (SVD) and the contradiction between robustness and invisibility of watermarking, we present a robust digital watermarking based on Slant transform and SVD. Firstly, the watermarking image is pretreated by the twofactor encrypting processing of Arnold scrambling and Logistic mapping. The original host image is divided into 8*8 nonoverlapping subblocks, and then the subblocks are transformed by Slant transform and decomposed by the blocked SVD. Thirdly, left singular matrix and singular value matrix of the watermarking image after SVD processing are multiplied as watermarking principal components to embed into the maximum singular value of each subblock. Simulation results show that the proposed watermarking algorithm not only effectively solves the problem of false alarm problem of the traditional SVD watermarking algorithm, but also improves the running speed and watermarking security. At the same time, it has better visibility and robustness to common attacks such
as JPEG compression, noise, filtering and geometric attacks.

Group key operations based on structured public key

ZHOU Jian1,2,SUN Liyan1,CHEN Honglin1

2018, 40(10): 1780-1786. doi:

Abstract ( 306 )

PDF (472KB) ( 369 ) 　　

The structure of public encryption key cannot meet the performance requisition for group key management based on onedecryptionkey oneencryptionkey algorithm. All group members need to participate in rekeying if the public encryption key is updated. To solve the problem, a structured public encryption key is proposed. Decryption keys are composed into a key set, any decryption key with key independence from the set has limited capability to modify the public encryption key, so the rekeying implements successfully by selfconfiguration mode without support of powerful trusted entity, and the new public encryption key does not break the validity of other decryption key members. In this perspective, the proposed public encryption key structure enriches the relationship between decryption key and encryption key, and extends the key operations for dynamic groups. Besides, it is particularly adaptable to group key management of networks in harsh environment.

Software defect prediction based on

cost-sensitive support vector machine

REN Shengbing，LIAO Xiangdang

2018, 40(10): 1787-1795. doi:

Abstract ( 459 )

PDF (786KB) ( 467 ) 　　

Software defect prediction is a typical unbalanced learning problem. We propose a CCS-SVM software defect prediction model based on cost sensitive SVM algorithm improved by the CSSVM and clustering algorithm. In the CCSSVM prediction model, we combine SVM and the cost of class misclassification, take unbalanced data evaluation index as the objective function, and optimize the misclassification cost factor so as to enhance the recognition rate of the minority class samples. We find the center point of each sample through clustering, define the class confidence for each sample according to the distance of the sample to its center point, assign different misclassification cost factors to different samples, and introduce the class confidence of each sample to the optimization problem of cost sensitive SVM, and improve the robustness of the algorithm and classification performance of SVM. To enhance the generalization ability of the model, we use the genetic algorithm to optimize feature selection and model parameters. Experimental results of the NASA Metric Data Program (MDP) dataset show that our method is significantly improved in the Gmean and Fmeasure value for model evaluation.

Software defect prediction based on

deep autoencoder networks

ZHOU Mo，XU Ling，YANG Mengning，LIAO Shengping，YAN Meng

2018, 40(10): 1796-1804. doi:

Abstract ( 638 )

PDF (941KB) ( 626 ) 　　

Software defect prediction is an effective way for improving the quality of software, and the effect of software defect prediction is closely related to data sets’own characteristics. In regard of feature information redundancy and large dimension of data sets, combining with the powerful learning feature ability of deep learning, we propose a software defect prediction method based on deep autoencoder networks. This method firstly uses an unsupervised learning sampling method to do sampling for 6 open source projects data sets to solve class imbalance problem of datasets. We then build a deep autoencoder network model through training, which can reduce the dimension of data sets. The model uses three classifiers for connection and employs the training sets with reduced dimension to train the classifiers. Finally, we use the test sets to do prediction. Experimental results show that the proposed method outperforms the basic software defect prediction model and the software defect prediction model based on existing feature extraction methods under the circumstance of the data sets with large dimension and redundant feature information. Besides, it is adaptive to different classifiers.

Analysis of software service activity

based on socio-technical attributes

HE Peng1,LIU Yaxin1,LI Bing2,ZENG Cheng1

2018, 40(10): 1805-1814. doi:

Abstract ( 460 )

PDF (1077KB) ( 391 ) 　　

As a coarse-grained service in open-source environment, software supports user experience and secondary development, and its activity reflects the development state of the software service. Understanding the factors of the activity of software service is an essential step to improve the quality of community software service. We take the Sourceforge open-source community as the research object and use volume of download, pages and click-through to measure the activity of software service from the perspective of social-technical attributes. Our results show that: (1) to get better software activity, the number of core developers should be moderate (around 17), and in order to ensure the actual contribution of developers, the number of participants of the software service should be within 7. In addition, the roles of developers should be as more as possible and their distribution should be as uniform as possible. (2) The more mature software service is, the more active it tends to be. The effect of other technical attributes is not obvious and is irrelevant to their popularity. However, the activity of software service has a significant linear correlation with the variety of technical attributes.

White box test case prioritization based on

good point set genetic algorithm

SUN Jiaze1,2,WANG Gang1

2018, 40(10): 1815-1821. doi:

Abstract ( 399 )

PDF (1024KB) ( 438 ) 　　

Test case prioritization is an efficient and practical regression testing technique in the process of software evolution, and it is essential for improving early defect detection rate and reducing test cost. To solve the problem of slow convergence speed and poor stability of the traditional genetic algorithm (GA) in white box test case prioritization, we propose a test case prioritization method based on good point set GA (GGA). The GGA encodes the individuals according to the cover matrixes of program entities and uses average cover percentage of program entities as the fitness function. The GGA generates the new generation of population through random sampling selection operator and good point set crossover operator. We select six typical open source projects as benchmark programs, and take statements, branches and methods as program entities in the experiments. Experimental results show that the GGA has faster convergence speed and better stability than the traditional GA. The GGA provides an effective method for test case prioritization in regression testing, which is helpful for finding software defects as early as possible and reducing test cost.

An image fusion algorithm based on complementary

wavelet transform and PCNN in NSCT domain

WANG Jian1,2,ZHANG Xiufei1,REN Ping1,YUAN Wenle1

2018, 40(10): 1822-1828. doi:

Abstract ( 359 )

PDF (817KB) ( 402 ) 　　

Aiming at the shortcomings of the traditional NSCT image fusion algorithm, we propose an image fusion algorithm based on complementary wavelet transform and PCNN in NSCT domain. Firstly, we use the NSCT to decompose the source image and generate a series of low frequency and high frequency subbands. Low frequency subbands are decomposed by twodimensional wavelet into a low frequency subband and three directional subbands. Then the low frequency subband is fused by the local region energy weighting method. And the three directional subbands are fused by the improved Gaussian weighted SML method. The high frequency subbands decomposed by the NSCT are divided into the highest layers and other layers for fusion. The highest layers are fused by using the improved Gaussian weighted SML method, and the other layers are fused by the PCNN method enhanced by edge gradient information. Finally, the fused image is obtained through NSCT inverse transform. Experimental results demonstrate the effectiveness of the proposed algorithm.

A vehicle face detection algorithm

based on selective search

LI Xiying1,2,3,4,ZHOU Zhihao1,2,3,4,L Shuo1,2,3,4

2018, 40(10): 1829-1836. doi:

Abstract ( 376 )

PDF (802KB) ( 437 ) 　　

Vehicle face detection benefits a wide range of applications such as vehicle identification and semantic segmentation of the vehicle. Although great dedication to this task, most existing researches focus on the detection and positioning of the entire area of the vehicle face. We propose a new vehicle face detection method based on selective search. The approach has two steps. Firstly, the vehicle image is denoised through Gaussian filter as the preprocess. Secondly, we use the image segmentation algorithm based on graph presentation to obtain the initial image segmentation regions from the preprocessed images. We calculate the similarity degree of adjacent regions in color, texture, size and coincidence degree, and then merge the initial split area with the one with the highest similarity. We conduct experiments on 4199 vehicle images of CompCars data set from the Chinese University of Hong Kong for vehicle front face detection test. The results demonstrate that the average coincidence degree of vehicle face is 73.74%, which is better than other object detection algorithms. In addition, the proposed method also has better generalization ability across data set without training.

An image segmentation algorithm

using variable precision rough entropy

WU Shangzhi1,SHE Zhiyong2,ZHANG Xia1,ZHAO Huiqin3

2018, 40(10): 1837-1843. doi:

Abstract ( 592 )

PDF (548KB) ( 402 ) 　　

Image segmentation is a technique and process of dividing an image into a number of specific, unique areas and extracting the target of interest. Segmentation results directly affect target feature extraction and description, further target identification, classification and image understanding. Due to the complexity and relevance of image information, there is uncertainty and fuzziness in image segmentation. We use the variable precision rough set to represent the image, combining rough entropy and particle swarm optimization algorithm, propose an image segmentation algorithm based on variable precision rough entropy. We obtain the optimal segmentation threshold corresponding to the maximum rough entropy and then divide the image by the binary segmentation method. Experimental results show that the proposed algorithm is superior to the traditional single threshold segmentation method, and has certain practicability and flexibility.

Preceding vehicles detection based on integrated

learning and constraint of positions information

GENG Lei1,2，PENG Xiaoshuai1,2，XIAO Zhitao1,2，LI Xiuyan1,2，GAN Peng1,2

2018, 40(10): 1844-1850. doi:

Abstract ( 408 )

PDF (1057KB) ( 359 ) 　　

Since the traditional preceding vehicle detection strategies cannot meet the accuracy and real-time requirements simultaneously, we propose an approach which combines AdaBoost, an integrated learning method, with the constraint of positions information. Firstly, regions proposal (RP) are obtained by edge boxes method according to the sequence information of vehicle edges. Secondly, the position information of vehicles in the frame coordinate system is used to filter out nontarget RPs. Finally, the obtained windows are clustered and fed into the AdaBoost classifiers for vehicles detection, and at the same time borders regression is utilized to improve the accuracy of detection results. Experimental results demonstrate that the proposed method has robustness to different detection scenarios and that it can meet the accuracy and real-time requirements of vehicle detection.

A face recognition algorithm based on semi-supervised

LDA feature subspace optimization

JI Mingjun，LIU Mandan,CAI Leqian

2018, 40(10): 1851-1857. doi:

Abstract ( 535 )

PDF (785KB) ( 379 ) 　　

Facial feature extraction is the most important step in face recognition process, and the quality of the feature directly affects the recognition rate. In order to get better face recognition effect, it is necessary to make full use of sample information. To make full use of the information contained in training samples and test samples, we propose a semisupervised linear discriminant analysis (SLDA) feature extraction method based on weighted combination of principal component analysis (PCA) and linear discriminant analysis (LDA). At the same time, inspired by the combinatorial optimization problem, we use the binary genetic algorithm to optimize the feature space obtained by the semisupervised feature extraction method. Experimental results of ORL face database show that the proposed method achieves higher recognition rate than the classical face recognition algorithm and some improved methods.

Image recognition based on feature clustering

adaptive sparse group autoencoder

XIAO Hanxiong,CHEN Xiuhong,TIAN Jin

2018, 40(10): 1858-1866. doi:

Abstract ( 319 )

PDF (1216KB) ( 399 ) 　　

Due to the lack of prior information, the group Lasso model is trained based on the group number parameter that groups the units uniformly, continuously and fixedly, which easily leads to biased estimates about the group structure of variables. We propose a feature clustering adaptive sparse group autoencoder, which uses the feature clustering method to change the grouping of the hidden layer unit in the process of iteration so that it can adaptively change with the convergence of the features, achieving better estimation of group structure of the variables. Experiments show that the model can better capture the relevant information of the group structure of the variables during the training process and improve the image classification performance to a certain extent.

A moving object detection algorithm based on spectral

residual algorithm and clustering algorithm

MA Qin,ZHANG Xingzhong,LI Haifang,DENG Hongxia

2018, 40(10): 1867-1873. doi:

Abstract ( 300 )

PDF (1583KB) ( 374 ) 　　

The traditional target detection algorithm based on feature point matching has low target recognition rate and high false detection rate because of inaccurate matching of feature points and target contour discontinuity. We introduce the improved spectral residual algorithm and k-means clustering algorithm to solve those problems, and propose a moving target detection algorithm based on spectral residual algorithm and clustering algorithm. The method is divided into two parts. Firstly, we extract the speed up robust features (SURF) from every two frames and complete image registration. Then, we use the spectral subtraction algorithm to extract the saliency feature from frame difference results in order to remove noise and false targets caused by inaccurate matching. Secondly, the improved k-means clustering algorithm is introduced to cluster the discontinuous contour curves to obtain a complete target after morphological processing. Experiments show that the target recognition rate of this new algorithm is 90.61% and the false detection rate is 21.23%, which are better than the traditional moving object detection algorithm based on SURF whose corresponding results are 66.60% and 31.91% respectively and the moving object detection algorithm based on improved oriented FAST and rotated BRIEF (ORB) whose corresponding results are 87.57% and 26.80% respectively. Although the average running time of the algorithm is 18 frames/s, it can meet the requirement of video fluency. This algorithm therefore can be used as an effective moving target detection algorithm in dynamic background.

Current Issue

Author center

Review center

Online journal