Computer Engineering & Science

A hybrid big data platform based on private cloud VMs and bare metals

WANG Yong-kun1, LUO Xuan1, JIN Yao-hui1,2

2018, 40(2): 191-199. doi:

The wide application of big data analysis depends on the support of big data platforms, and building such platforms is an important need for many enterprises and institutions. Building a big data platform requires sophisticated, system-wide technologies, with particular attention to system performance and scalability. As data volumes grow, user needs keep increasing, so a platform at its originally planned scale may not meet changing demands. We therefore design a hybrid big data platform that uses both bare metals and private cloud virtual machines (VMs) to balance performance and scalability. Because bare metals generally outperform private cloud VMs, big data platforms built on bare metals generally perform better than those built on private cloud VMs. Cloud servers in a private cloud, however, can be started quickly and conveniently, so the computing and storage nodes of the platform can be flexibly expanded into the private cloud to ensure sufficient processing capacity during peak periods. We implemented this hybrid design in a production environment, and tests there demonstrate its effectiveness.

A hardware dynamic congestion-control mechanism based on packet delay variation

REN Xiu-jiang1, SI Tian-hao1, ZHOU Jian-yi1, XIE Xiang-hui2

2018, 40(2): 200-209. doi:

Interconnection networks have become a technical bottleneck to improving the performance of high-performance computing systems. This paper studies congestion control in high-performance interconnection networks. Targeting the formation process of communication hotspots, a hardware dynamic congestion control mechanism (CMDPD) is proposed. The variation in packet transmission delay is used to predict the network congestion state, and end-to-end network injection is controlled to avoid congestion. A simulation environment is built, and CMDPD is simulated on Fat-tree and Dragonfly networks. The results show that CMDPD improves throughput in the Fat-tree network by 5%~12%.

A Spark based parallel genetic algorithm solving multimodal function extremums

LIU Peng1,2, YE Shuai3, MENG Lei1,2, WANG Can4

2018, 40(2): 210-217. doi:

The Genetic Algorithm (GA) needs many computation iterations when solving multimodal function extremums, so its running efficiency is low on large-scale data, which greatly limits its practical application. The classical parallel platform Hadoop can improve GA running efficiency to some extent, while the more recent parallel platform Spark can expose much more of the GA's parallelism by performing crossover, mutation and other operations in parallel on each computing node. For ease of comparison, a GA for solving multimodal function extremums is designed and implemented on a single node, on Hadoop and on Spark, respectively. Experimental results show that, compared with the single-node and Hadoop implementations, the Spark based implementation not only significantly reduces the running time on large-scale samples but also, owing to its strong randomness, effectively avoids premature convergence.

Design and implementation of a high-resolution digitally controlled oscillator

ZHAO Xin, PAN Tian-qie, WANG Biao

2018, 40(2): 218-223. doi:

As a key component of an All-Digital Phase-Locked Loop (ADPLL), the Digitally Controlled Oscillator (DCO) provides a high-frequency output clock. DCO performance directly affects the frequency range and jitter performance of the ADPLL. This paper proposes a DCO designed entirely from an all-digital standard cell library. The structure includes coarse, medium and fine adjustment stages, and achieves a wide frequency range of 0.5 GHz~2.6 GHz and a high adjustment resolution of 0.8 ps. The DCO is designed and implemented in an advanced process. Based on this DCO, an ADPLL is designed and implemented. The system jitter is less than 2 ps and the power consumption is 10 mW.

Analysis of the factors limiting Linpack efficiency of heterogeneous high-performance computing systems

JIA Xun, WU Gui-ming, XIE Xiang-hui

2018, 40(2): 224-230. doi:

Energy consumption is a major challenge in current high-performance computing (HPC) system design. Heterogeneous computing, with accelerators connected to a host processor, can improve system energy efficiency and has been widely applied in HPC. However, heterogeneous systems have lower Linpack efficiency than homogeneous systems of equivalent scale. To address this problem from the perspective of structural design, and based on the design parameters and performance data of real computing systems, this paper analyzes the main factors limiting the Linpack efficiency of heterogeneous HPC systems and the demands they place on structural design. The analysis is then verified with a Linpack performance model tailored for heterogeneous systems. The results of this study can guide Linpack performance optimization and efficient architecture design for future heterogeneous systems.

E-commerce query suggestion based on log mining

WANG Jing1,2, WANG Ruo-fei1,2

2018, 40(2): 231-237. doi:

Query suggestion can effectively reduce the input burden on users, eliminate query ambiguity, and improve the convenience and accuracy of information retrieval. With the development of e-commerce, query suggestion has also become popular in the product search of e-commerce applications. However, traditional query suggestion methods for Web search are not fully applicable to e-commerce applications. Based on an analysis of different query suggestion techniques, an e-commerce query suggestion method based on log mining is presented, which considers both the search behaviors and the shopping behaviors of users. MapReduce is used in log mining to generate the query words in an offline mode, and query suggestions are offered to users in an online mode. Experimental results show that the presented method improves the accuracy of query suggestions and has good performance.

Control system fault diagnosis using improved fuzzy clustering

WANG Yin-song, SHANG Dan-dan, WANG Yan-fei, ZHANG Wan-jun

2018, 40(2): 236-330. doi:

In order to improve the fault diagnosis accuracy of the sensor and actuator of the control system, we propose a new control system fault diagnosis method by combining the advantages of wavelet analysis for extracting features with the good clustering effect of the fuzzy C-means algorithm based on a weighted density function. Firstly, we use wavelet analysis to extract the features of fault signals to reduce the influence of noise. Secondly, we employ the fuzzy C-means clustering algorithm to classify the data whose features have been extracted. Experimental results show that the proposed algorithm can not only identify the fault of different components, but also diagnose different types of faults on the same part.

Prediction of network events’ hotness based on EKSC algorithm

ZHANG Mao-yuan, SUN Shu-yuan, WANG Yi-bo, MENG Qiong-yao, WANG Qi

2018, 40(2): 238-245. doi:

With the rapid development of the Internet, effectively monitoring and guiding public opinion online is of great significance to social stability. The prediction of network events’ hotness is an important part of public opinion supervision. Because traditional methods ignore the temporal information and the relevance contained in the event time series during prediction, a prediction model based on the EKSC algorithm is proposed. The model uses the EKSC algorithm to cluster the time series of known network public opinion events of each class and construct a class model library. The time series of the known hotness of the event to be predicted is scaled. The least squares method is then used to predict the event by selecting the model with the minimum mean square error in the class library. Experimental results show that this method can effectively predict the hotness of network events.
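The model-selection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the EKSC clustering itself is not shown, the `library` is assumed to already hold one template series per cluster, and the function names and the single-scale-factor fit are illustrative assumptions.

```python
def fit_scale(template, observed):
    """Least-squares scale a minimizing sum((a * template[i] - observed[i])**2)."""
    k = len(observed)
    num = sum(t * y for t, y in zip(template[:k], observed))
    den = sum(t * t for t in template[:k])
    return num / den if den else 0.0


def predict_hotness(library, observed):
    """Pick the template with minimum MSE on the observed prefix and
    return (template index, scaled continuation of that template)."""
    k = len(observed)
    best = None
    for idx, template in enumerate(library):
        a = fit_scale(template, observed)
        mse = sum((a * t - y) ** 2 for t, y in zip(template[:k], observed)) / k
        if best is None or mse < best[0]:
            best = (mse, idx, a)
    _, idx, a = best
    return idx, [a * t for t in library[idx][k:]]
```

Here the known part of an event's hotness series is matched against each template by a scaled least-squares fit, and the best-matching template supplies the forecast for the remaining time steps.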

Distance rectification indoor localization based on cluster analysis optimization

DU Jia-xing, CHEN Ya-wei, ZHANG Jing

2018, 40(2): 246-254. doi:

Localization systems based on the received signal strength indication (RSSI) are vulnerable to environmental effects. We therefore present an RSSI signal processing optimization strategy based on clustering analysis with a Gaussian mixture model. We achieve accurate indoor localization of unknown nodes by optimizing the RSSI and rectifying the distances used in the four-sided centroid localization algorithm. Experiments on Bluetooth 4.0 beacon nodes show that the algorithm effectively improves ranging accuracy and location precision. The location precision is 34.6% higher than that of the conventional weighted centroid algorithm, and the average localization error is less than 0.5 m, which meets the location precision requirement.
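For readers unfamiliar with the baseline mentioned above, the following Python sketch shows RSSI ranging followed by a conventional weighted centroid. It does not reproduce the paper's Gaussian mixture clustering or distance rectification. The log-distance path loss model, the reference RSSI of -59 dBm at 1 m and the path loss exponent of 2.0 are assumptions chosen only for illustration.

```python
def rssi_to_distance(rssi, rssi_at_1m=-59.0, path_loss_exp=2.0):
    """Log-distance path loss model (assumed): distance in metres from RSSI in dBm."""
    return 10 ** ((rssi_at_1m - rssi) / (10 * path_loss_exp))


def weighted_centroid(beacons, distances):
    """Weighted centroid of beacon positions, weighting each beacon by 1/distance."""
    weights = [1.0 / d for d in distances]
    total = sum(weights)
    x = sum(w * bx for w, (bx, _) in zip(weights, beacons)) / total
    y = sum(w * by for w, (_, by) in zip(weights, beacons)) / total
    return x, y
```

Nearer beacons receive larger weights, so noise in the RSSI readings translates directly into position error, which is the sensitivity the abstract's RSSI optimization aims to reduce.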

A multi-user network selection algorithm based on evolutionary game

TANG Tao1,2, LI Le2, XIAO Jing-wei1, WU Wei-nong1, FENG Wen-jiang2

2018, 40(2): 255-260. doi:

With the development and evolution of wireless communication technology, a variety of wide area cellular networks now coexist and overlap with a large number of Wireless Local Area Networks (WLANs). For hotspot regions where a large number of densely distributed users simultaneously initiate the same kind of service request, a multi-user network selection algorithm based on evolutionary game is proposed. The utility function is designed according to the number of users in the selected network, and the replicator dynamic equation of the evolutionary game is given. Simulation results show that the proposed algorithm reaches evolutionary equilibrium quickly and yields a higher average user benefit than the RSSI algorithm, that the user distribution across access networks is more balanced, and that network resources are used reasonably.
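The replicator dynamics mentioned above can be sketched in Python as follows. The utility used here (network capacity shared equally among the users on that network) is a hypothetical stand-in, since the abstract does not give the paper's utility function; the function name and step size are likewise assumptions.

```python
def replicator_step(shares, capacities, n_users, dt=0.1):
    """One Euler step of dx_i/dt = x_i * (u_i - mean utility).

    shares[i] is the fraction of users currently selecting network i.
    """
    # Hypothetical utility: capacity of network i divided by its user count.
    utils = [c / max(x * n_users, 1e-9) for c, x in zip(capacities, shares)]
    mean_u = sum(x * u for x, u in zip(shares, utils))
    new = [x + dt * x * (u - mean_u) for x, u in zip(shares, utils)]
    total = sum(new)  # renormalize to guard against floating-point drift
    return [x / total for x in new]
```

Iterating this step moves users toward networks whose utility exceeds the population average. With this particular utility the shares settle where all networks offer equal utility, that is, in proportion to capacity.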

A multi-objective optimization based clonal selection algorithm in immune invasion detection

ZHANG Feng-bin, FAN Xue-lin, XI Liang

2018, 40(2): 261-267. doi:

In immune invasion detection theory, clonal selection is the key to detector evolution. The traditional clonal selection algorithm, which selects samples by comparing the cumulative affinity between them, has lower time complexity but also causes high overlap among detectors and reduces iterative efficiency. This paper transforms the selection and evolution of detector individuals into the process of solving for a Pareto optimal solution, and proposes a detector clonal selection algorithm based on multi-objective optimization theory. Experiments show that the algorithm can significantly improve the detection range of each population during evolution, reduce the number of memory detectors, and improve the detection rate of the detection system.

Formal verification of several software components in PAR platform

HU Qi-min1,2, XUE Jin-yun1,2, YOU Zhen1,2, CHENG Zhuo1,2

2018, 40(2): 268-274. doi:

The PAR platform is a software platform developed by our research team to support formal and automated software development. The platform fully embodies the advantages of functional abstraction and data abstraction, making software development convenient and reliable. The key to achieving this is a set of reusable software components. To ensure the correctness and reliability of the whole software platform, it is very important to ensure the correctness and reliability of these components. In this paper, we select some typical software components in the PAR platform, formalize their semantics, and prove their correctness with the help of the Coq theorem prover, thereby improving the efficiency of the formal verification of software components.

A fault localization method based on improved program spectrum

YU Xiao-fei, ZHANG Shi, CAI Rui, CHEN Hui-feng, JIANG Jian-min

2018, 40(2): 275-281. doi:

Fault localization is a time-consuming and labor-intensive job in the process of software debugging. Automating debugging is important for replacing manual checks and improving the efficiency of software debugging. In recent years, many researchers have focused on automated fault localization based on program spectrum. Targeting the single-fault case, this paper presents a new fault localization method based on an improved program spectrum. The method relies on the fact that, under the single-fault condition, any test case that fails must have executed the faulty statement. It examines the statements covered by all failing test cases and from them obtains the fault base. The fault base is then used to improve the fault localization method. Finally, the Siemens test suite is used as test data to compare different methods in terms of fault localization effectiveness and efficiency. The results show that the proposed method can greatly improve the effectiveness and efficiency of fault localization.
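The fault base idea can be sketched in Python as below, under the single-fault assumption stated in the abstract. The interpretation of the fault base as the intersection of the statements covered by all failing tests is an inference from the abstract, and the Ochiai score used for ranking is a well-known spectrum formula swapped in for illustration, not necessarily the one the paper uses. All names are hypothetical.

```python
from math import sqrt


def fault_base(coverage, outcomes):
    """Statements covered by every failing test.

    coverage maps test name -> set of covered statement ids;
    outcomes maps test name -> True if the test passed.
    """
    failing = [set(coverage[t]) for t, passed in outcomes.items() if not passed]
    if not failing:
        return set()
    return set.intersection(*failing)


def rank_suspects(coverage, outcomes):
    """Rank fault-base statements by Ochiai score, most suspicious first."""
    base = fault_base(coverage, outcomes)
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    scores = {}
    for stmt in base:
        ef = sum(1 for t, p in outcomes.items() if not p and stmt in coverage[t])
        ep = sum(1 for t, p in outcomes.items() if p and stmt in coverage[t])
        scores[stmt] = ef / sqrt(total_failed * (ef + ep))
    return sorted(scores, key=lambda s: (-scores[s], s))
```

Restricting the ranking to the fault base shrinks the set of candidate statements before any suspiciousness score is computed, which is where the efficiency gain would come from.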

Current researches and future perspectives of crowd counting and crowd density estimation technology

ZHANG Jun-jun, SHI Zhi-guang, LI Ji-cheng

2018, 40(2): 282-291. doi:

Crowd counting and crowd density estimation are an important branch of crowd analysis and provide information that is central to surveillance. Although important progress has been made in this field in recent decades, some challenging problems remain. This paper reviews the current research and development trends of crowd counting and crowd density estimation methods based on computer vision. Firstly, the development background and application directions of crowd counting and crowd density estimation technology are introduced. Secondly, the important methods proposed in recent years are summarized. From the machine learning point of view, they can be divided into two types: shallow learning based methods and deep learning based methods. On the other hand, from the learning model standpoint, they can be divided into the direct method (i.e., the detection based method) and the indirect method (i.e., the pixel-based, texture-based and corner point based methods). This paper introduces the shallow learning based methods of the last twenty years in detail, and briefly summarizes the deep learning based methods of recent years. Then, the performance evaluation techniques for crowd counting and crowd density estimation methods are briefly introduced, and several data sets for testing and evaluating these methods are listed. Finally, the technical challenges in the field are summarized and future research directions are discussed.

An improved SURF algorithm for calligraphy strokes recognition

WANG Min, PANG Shuang-shuang, ZHOU Jun-ni

2018, 40(2): 292-297. doi:

Calligraphy strokes carry rich characteristics of the writer. Whether feature vectors can be correctly extracted and matched directly affects the recognition result. To address the problem that the traditional SURF (Speeded-Up Robust Features) algorithm detects few feature points and has a high false matching rate, a SURF algorithm based on the Contourlet transform is proposed. Before feature points are extracted, the algorithm uses the Contourlet transform to perform sub-band decomposition and directional filtering of calligraphic strokes, obtaining the low-frequency and high-frequency detail components. The minimum Euclidean distance criterion (LEDC) is adopted to calculate the similarity of the low-frequency detail components. After the high-frequency detail components are further decomposed, appropriate thresholds are selected to extract the high-frequency feature points. Then, the SURF feature points are matched, and the RANSAC algorithm is used to eliminate false matching points. Experiments show that the improved SURF algorithm not only extracts the feature points of the strokes better, but also improves the anti-noise performance. The recognition rate is improved by 3%.

An attention analysis method based on EEG supervising face images

LIU Ji-wei1, SHI Yin-jia1, BAI Yu1, YAN Chao-wen2

2018, 40(2): 298-303. doi:

To intelligently analyze the attention of people engaged in a task using computer vision technology, face image videos and the corresponding EEG signals are acquired and analyzed, and a sample library is established that pairs continuous facial image information with EEG information. Using the EEG information to supervise the attention degree assigned to the facial images, a method that judges the attention degree from facial information is applied. The recognition results of SVM (Support Vector Machine) classifiers trained under multiple sample distributions indicate a correlation between facial information and attention degree. It is therefore feasible to use facial images to analyze a person’s attention during a task, which provides objective data for subsequent evaluation in condition monitoring.

A harmonic analysis oriented power system digital simulator

WANG Tong-xun1, LI Han2,3, ZHOU Sheng-jun1, LI Ya-qiong1, TAN Meng1

2018, 40(2): 304-312. doi:

With the continuous expansion of China’s power grid and worsening harmonic pollution, there is an ever-increasing demand for the simulation analysis of harmonics in power systems. In view of the shortcomings of existing power system digital simulators in terms of the integrity and scalability of harmonic analysis, a harmonic analysis oriented power system digital simulator is investigated and designed by combining computer visualization techniques, Web techniques and power system simulation techniques. With harmonic analysis and user requirements fully considered, both traditional and new harmonic analysis models are supported, and SVG and JavaScript are used to draw one-line diagrams and visualize the results. The feasibility of the simulator is verified on an example. The simulator supports multiple types of harmonic analysis, enhances the visualization of harmonic analysis, and has good interactivity and extensibility.

A Twitter hotspot mining method based on semantic clustering of word vectors

LIU Pei-lei, TANG Jin-tao, WANG Ting, XIE Song-xian, YUE Da-peng, LIU Hai-chi

2018, 40(2): 313-319. doi:

With the rapid development of social media, information overload has become a challenge. As a result, how to mine hotspots automatically from large volumes of short and noisy data is an important problem. Social data are real-time and geographic, and usually contain plenty of meta-information. Based on these characteristics, this paper proposes a hotspot mining method that combines users’ behavior patterns with text content analysis. In the content analysis, we cluster text at the word scale rather than the message scale. In addition, semantic clustering of word vectors is used to improve the performance of keyword extraction. Experimental results on real datasets show that this method is better than traditional methods. Specifically, the keywords extracted by this method have strong semantic relevance and good topic segmentation, and the method is superior to traditional hotspot mining methods on the main indexes.

Somatosensory information based misbehavior detection in online examinations

FAN Zi-jian, XU Jing, LIU Wei

2018, 40(2): 320-325. doi:

Examination surveillance is one of the main challenges in online examinations. Traditional approaches mainly focus on identifying examinees and lack flexible and scalable solutions for detecting examinees' misbehavior in online examinations. We provide a new solution that monitors examinees’ behavior based on somatosensory information. To reduce the false alarm rate, a two-dimensional gesture detection scheme is proposed, in which both the duration and the frequency of the detected gesture events are used to describe the target misbehavior. Examinees’ states are discriminated by analyzing the duration and frequency of the events that occur within a time window. Experiments demonstrate that the proposed solution can effectively distinguish examinees’ misbehavior from their normal actions.
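The two-dimensional check described above can be illustrated with a short Python sketch. The event representation, the threshold parameters and the choice to require both thresholds at once are assumptions made for illustration; the abstract does not specify how duration and frequency are combined.

```python
def is_misbehavior(events, window_start, window_len, min_total_duration, min_count):
    """Flag a time window when gesture events in it are both long and frequent.

    events is a list of (start_time, duration) pairs in seconds for one
    gesture type; an event belongs to the window if it starts inside it.
    """
    window_end = window_start + window_len
    in_window = [(s, d) for s, d in events if window_start <= s < window_end]
    total_duration = sum(d for _, d in in_window)
    count = len(in_window)
    # Assumed rule: both dimensions must exceed their thresholds.
    return total_duration >= min_total_duration and count >= min_count
```

Requiring both dimensions means that a single brief gesture, or one long but isolated gesture, does not raise an alarm, which matches the stated goal of reducing false alarms.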

A quasi-human global optimization algorithm for solving the two dimensional rectangular packing problem

DENG Jian-kai1,2, WANG Lei1,2, YIN Ai-hua3

2018, 40(2): 331-340. doi:

We propose a basic algorithm based on corner-occupied actions for solving the 2D rectangular packing problem, together with a three-phase quasi-human global optimization algorithm. In the first phase, an initial configuration is generated. In the second phase, we optimize the priority levels of all rectangles by iteratively calling the local search sub-procedure and the off-trap strategy sub-procedure. In the local search sub-procedure we adopt two neighborhood structures, swap and insertion, instead of a single neighborhood structure, to avoid some limitations. When the search encounters a local optimum, the off-trap strategy sub-procedure is called to jump out of the trap and guide the search into more promising areas. In the third phase, the beauty degree enumeration sub-procedure is called to optimize the selection of corner-occupied actions. We also derive two goodness degree theorems. Experiments on six sets of benchmark instances show that the proposed algorithm outperforms the best algorithms in the literature. For the two benchmark instances zdf6 and zdf7, when the orientation of the rectangles is fixed, the proposed algorithm finds better packing configurations than the best results reported in the literature so far.
