Computer Engineering & Science

Eliminating control divergence on GPGPU via partial warp regrouping

SHEN Li, YANG Yao-hua, WANG Zhi-ying

2019, 41(08): 1335-1342. doi:

GPUs are widely used in current high-performance computing systems, but their performance is severely constrained by control-flow divergence at runtime. To address this problem, warp regrouping methods combine threads that follow the same branch path, within one or more warps, into a new warp. However, these methods perform some unnecessary regrouping, which introduces additional performance overhead. We analyze the sources of regrouping overhead and propose a partial warp regrouping approach. While preserving a given level of efficiency, it avoids regrouping warps that already have a large number of active threads, and thereby avoids the associated overhead. Experimental results indicate that the proposed method significantly reduces unnecessary overhead while maintaining regrouping efficiency.
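
The idea of leaving well-populated warps alone can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' hardware mechanism: the warp width of 32, the `threshold` parameter and the function name `partial_regroup` are all hypothetical, and each input list is assumed to hold the active threads of one warp that follow the same branch path.

```python
WARP_SIZE = 32  # assumed warp width


def partial_regroup(warps, threshold):
    """Regroup only the sparsely populated warps.

    warps: list of lists, each holding the ids of the active threads of one
           warp (all assumed to follow the same branch path).
    threshold: hypothetical cut-off; warps with at least this many active
               threads are left untouched.
    Returns (kept, regrouped).
    """
    kept, pool = [], []
    for active in warps:
        if len(active) >= threshold:
            kept.append(list(active))  # efficient enough: skip regrouping
        else:
            pool.extend(active)        # candidates for regrouping
    # Pack the pooled threads into new warps of at most WARP_SIZE threads.
    regrouped = [pool[i:i + WARP_SIZE] for i in range(0, len(pool), WARP_SIZE)]
    return kept, regrouped
```

With a threshold of 24, a warp with 30 active threads would be kept as it is, while three warps with 10, 12 and 15 active threads would be merged into two new warps.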

Optimization of binary translation conditional transfer instructions based on TCG technology

ZHANG Jia-hao, SHAN Zheng, YUE Feng, FU Li-guo, WANG Jun, LI Ming-liang

2019, 41(08): 1343-1352. doi:

Applying the TCG intermediate representation in binary translation allows programs to be migrated between multiple target platforms. It also makes introducing new platforms more convenient and ensures compatibility between new platforms and mainstream platforms. However, because the translation process does not consider the relationships between instructions, the traditional intermediate representation generates back-end code with many redundant instructions, which reduces the execution efficiency of translated programs. We first analyze the feasibility of instruction optimization and optimize conditional jump instructions. We then improve the intermediate representation via instruction preprocessing, and replace the one-to-many translation model with a many-to-many model for the transformation from intermediate representation to back-end code. We adopt an instruction reduction technique to design optimized translation algorithms for the two modes (CMP-JX and TEST-JX) of conditional jump instructions, and implement them on the open-source binary translation platform QEMU. Experiments on the NPB-3.3 and SPEC CPU 2006 test sets show that, compared with the existing translation modes, the code expansion rate is reduced by an average of 14.62% and the running speed of translated programs is improved by 17.23%, which verifies the effectiveness of the proposed method.

A MapReduce workflow heterogeneous scheduling algorithm based on two-level DAG model

WANG Yu-xin, WANG Fei, WANG Guan, GUO He

2019, 41(08): 1353-1359. doi:

The MapReduce programming model is widely applied in big data processing platforms, and an effective task scheduling algorithm is critical to the efficiency of the model. In our approach, a MapReduce workflow is decomposed into a number of jobs with precedence relationships, and each job has a Map phase and a Reduce phase that both contain multiple tasks. Based on the available resources and task heterogeneity of the computing cluster, we construct a two-level directed acyclic graph (DAG) model for jobs and tasks, and propose a MapReduce workflow heterogeneous scheduling algorithm based on two-level priority ordering (2-MRHS). In the first stage of the algorithm, priority ordering is performed: the priority weights at the job level and the task level are calculated to form the scheduling queue of tasks. In the task assignment stage, the data block subtasks of each task are assigned to the appropriate computing node according to the tasks' earliest finish time (EFT). Experiments on a large number of randomly generated DAG models show that our algorithm achieves a shorter schedule length (makespan) and better stability than the compared algorithms.
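
The task assignment stage can be illustrated with a generic earliest-finish-time rule. The sketch below is a simplification and not the 2-MRHS algorithm itself: it assumes the priority queue has already been built, ignores precedence and data-transfer delays, and the names `assign_by_eft` and `exec_time` are hypothetical.

```python
def assign_by_eft(task_queue, exec_time, num_nodes):
    """Assign each task to the node on which it finishes earliest.

    task_queue: tasks in (already computed) priority order.
    exec_time:  exec_time[task][node] is the run time of task on node,
                which lets heterogeneous nodes have different speeds.
    Returns (assignment, makespan).
    """
    node_free = [0.0] * num_nodes  # time at which each node becomes idle
    assignment = {}
    for task in task_queue:
        # Finish time of this task on every node if started when the node is idle.
        finish = [node_free[n] + exec_time[task][n] for n in range(num_nodes)]
        best = min(range(num_nodes), key=finish.__getitem__)
        assignment[task] = best
        node_free[best] = finish[best]
    return assignment, max(node_free)
```

The makespan returned here is simply the time at which the last node becomes idle, which is the quantity the abstract reports as schedule length.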

Optimization of LightGBM hyper-parameters based on message queuing

NAN Dong-liang1,2, WANG Wei-qing1, WANG Hai-yun1

2019, 41(08): 1360-1365. doi:

To improve the efficiency of hyper-parameter optimization for the light gradient boosting machine (LightGBM) and obtain the globally optimal model, we propose a parallel optimization method for LightGBM hyper-parameters based on message queuing. Each set of hyper-parameters within the pre-selected range is sent to the queue as a message. The nodes, working in parallel, each train a model with a message taken from the queue and verify the accuracy of the model. Finally, the model with the highest accuracy is selected to make predictions on the target dataset. Experimental results show that, compared with traditional grid search, Bayesian optimization, random search and serial optimization with message queuing, the proposed method is the fastest and achieves the largest area under the curve (AUC) value.
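
The queue-and-workers pattern described above can be sketched with Python's standard library. This is a minimal single-machine stand-in, assuming an in-process `queue.Queue` and threads in place of a distributed message queue and cluster nodes; `score_fn` is a hypothetical placeholder for training a LightGBM model with the given hyper-parameters and returning its validation accuracy.

```python
import itertools
import queue
import threading


def parallel_search(grid, score_fn, num_workers=4):
    """Evaluate every hyper-parameter combination via a shared task queue.

    grid:     dict mapping parameter name -> list of candidate values.
    score_fn: hypothetical callable(params) -> score (higher is better).
    Returns (best_score, best_params).
    """
    tasks = queue.Queue()
    for combo in itertools.product(*grid.values()):
        tasks.put(dict(zip(grid.keys(), combo)))  # one message per parameter set

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                params = tasks.get_nowait()
            except queue.Empty:
                return  # queue drained: this worker is done
            score = score_fn(params)
            with lock:
                results.append((score, params))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max(results, key=lambda r: r[0])
```

Because each worker pulls the next message as soon as it is free, slow and fast evaluations balance themselves without a central scheduler, which is the main appeal of the queue-based design.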

Visualization of container orchestration in microservice environment

ZHANG Li-min, GAO Jing, LI Wu-bin, LIU Chen

2019, 41(08): 1366-1373. doi:

With the rapid development of cloud computing, the processing of massive data can no longer depend on any single application program. The microservice architecture brings new design paradigms to application development because it is modular, scalable and highly available. Containers are an emerging lightweight, application-oriented virtualization technology based on a shared Linux kernel. Container technologies, of which Docker is a typical representative, provide an ideal carrier for microservices. At the same time, container orchestration tools, of which Kubernetes is a typical representative, greatly simplify the entire process of creating, integrating, deploying and maintaining containerized microservices. In the transition from “development and maintenance” to “container-oriented”, large and complex combinations of services arise, so the creation and deployment of these microservices become even more important. With usability in mind, we present a visualization method for container orchestration. Experiments demonstrate that microservice deployment with the proposed method provides developers with a friendly service creation interface, simplifies the service creation process and improves development efficiency.

A hybrid learnt clause evaluation algorithm for SAT problem based on frequency

WU Guan-feng1,2, XU Yang2,3, CHEN Qing-shan1,2, HE Xing-xing2,3, CHANG Wen-jing1,2

2019, 41(08): 1374-1380. doi:

To manage learnt clauses effectively, avoid geometric growth in their number, reduce the memory cost of redundant learnt clauses and improve the efficiency of the Boolean satisfiability problem (SAT) solver, learnt clauses need to be evaluated and redundant ones deleted. Traditional evaluation methods are based on the length of learnt clauses and keep the short ones. The two current mainstream clause evaluation methods are the variable state independent decaying sum (VSIDS) and a method based on the literal block distance (LBD); a combination of the two is also used as the basis for clause evaluation. We analyze the relationship between the number of times learnt clauses are used in conflict analysis and problem solving, and combine the usage frequency of learnt clauses with the LBD evaluation algorithm. The combination not only reflects the role of learnt clauses in conflict analysis, but also makes full use of the information relating literals to decision levels. Taking the Syrup solver (the parallel version of GLUCOSE 4.1) as the baseline, we carry out experiments to evaluate the algorithm and the parallel clause sharing strategy. The experimental comparison shows that the proposed hybrid evaluation algorithm outperforms the LBD evaluation algorithm, and the number of solved problems increases significantly.
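
For readers unfamiliar with LBD, the sketch below computes it as the number of distinct decision levels among a clause's literals and then ranks learnt clauses for deletion. The abstract does not give the authors' combination formula, so the ranking used here (lower LBD first, ties broken by higher usage count) is purely an assumption for illustration, as are the names `reduce_learnts`, `use_count` and `keep_ratio`.

```python
def lbd(clause, level_of):
    """Literal block distance: number of distinct decision levels in the clause.

    clause:   list of non-zero ints (DIMACS-style literals).
    level_of: dict mapping variable -> decision level of its assignment.
    """
    return len({level_of[abs(lit)] for lit in clause})


def reduce_learnts(clauses, level_of, use_count, keep_ratio=0.5):
    """Keep the best-ranked fraction of the learnt clauses.

    Assumed ranking: lower LBD is better; among clauses with equal LBD,
    the one used more often in conflict analysis (use_count[i]) is better.
    """
    ranked = sorted(
        range(len(clauses)),
        key=lambda i: (lbd(clauses[i], level_of), -use_count[i]),
    )
    keep = max(1, int(len(clauses) * keep_ratio))
    return [clauses[i] for i in ranked[:keep]]
```

A clause whose literals all come from one decision level has LBD 1 and is kept first; the usage count only matters when two clauses have the same LBD.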

An adaptive DDoS attack detection method based on multiple-kernel learning

ZHANG Chen1, TANG Xiang-yan1,2, CHENG Jie-ren1,2,3, DONG Zhe1, LI Jun-qi1

2019, 41(08): 1381-1389. doi:

The distributed denial of service (DDoS) attack is one of the main threats to internet security. Most current detection methods are based on a single feature and cannot effectively detect early DDoS attacks in a big data environment. We propose a feature adaptive DDoS attack detection method (FADADM) based on multiple kernel learning. We define five features to describe the characteristics of network flow according to the burstiness of DDoS attack flow, the distribution of addresses and the interactivity of communication. Based on an ensemble learning framework, the weight of each dimension is adaptively adjusted by increasing the ratio of variance to mean (IS/M) and reducing the ratio of variance to mean (RS/M). By training the simple multiple kernel learning (SimpleMKL) model, two multiple-kernel learning models (IS/M-SimpleMKL and RS/M-SimpleMKL) with different characteristics are established to identify early DDoS attacks. Experimental results show that the proposed method can detect early DDoS attacks quickly and accurately.

Discussion on cyber-physical system for future networks

YAO Jian-gang1, WEN Wu1,2, KANG Tong1, ZHANG Xiao-feng1, JIN Yong-shun3

2019, 41(08): 1390-1397. doi:

With the development of information and communication technology, the cyber-physical system (CPS) has become an interdependent and deeply integrated large-scale binary composite network. Studying its information network framework, architecture, big data properties and fragmented knowledge is of great significance to future smart grid theory research and practical engineering applications. In view of the Internet of Things, big data and related developments, which have brought human society into an era of information and knowledge explosion, we mainly discuss the opportunities and challenges that the CPS will face in the future. We present the development trends of new energy interconnection, the increasing proportion of electronic information devices, and multi-energy and multi-network convergence. We outline the application prospects of big data and artificial intelligence in generation, transmission, transformation, distribution and utilization, which can support future information-based scientific decision-making and deployment of the smart grid.

A resource allocation and power control strategy based on user partitioning in heterogeneous networks

LIU Hui1,2,3, SONG Jia-wang1,2, DAI Yun-xia1,2, ZHU Bin-xin1,2

2019, 41(08): 1398-1405. doi:

Deploying heterogeneous networks is a key way to increase the capacity of mobile communication systems. However, the high transmit power of the macrocell base station can cause severe interference to the edge users of a microcell; at the same time, macrocell users near a microcell base station also suffer interference from it, so these users should be protected. By studying resource allocation and power control strategies in the downlink of heterogeneous networks, we propose a resource allocation and power control strategy based on user partitioning. Firstly, users are classified according to their signal-to-interference-plus-noise ratio (SINR), and a guard band is designed for the edge users suffering severe interference. In addition, by dynamically adjusting the transmit power of each base station on the guard band, the interference on edge users is mitigated without sacrificing system performance. Simulation results show that, compared with the traditional scheme, the proposed strategy effectively suppresses the interference on macrocell and microcell edge users and improves the throughput of the whole system.

Express delivery route optimization and software design

LI Ling-yu, ZHANG Kun

2019, 41(08): 1406-1412. doi:

The express industry in China has experienced explosive growth in recent years. Improving the efficiency of express delivery and ensuring the traffic safety of couriers has become an urgent issue. Treating express delivery as an application of the traveling salesman problem (TSP), we use the C-W saving algorithm to optimize the delivery route. Monte Carlo simulation shows that the C-W saving algorithm is better than the nearest neighbor algorithm (NNH) currently used by couriers: the path it finds is 7.8% shorter on average. Using the C-W saving algorithm and R Shiny, with route information between distribution points obtained through the Amap API, we develop Delivery Helper, an Internet-based express delivery route optimization tool. Shiny uses the R language to build dynamic interactive web applications, which simplifies web page development. Courier companies and couriers can use Delivery Helper simply by visiting the software's URL. The software is expected to improve couriers' delivery efficiency and reduce the number of times they look down at their mobile phones, thus ensuring traffic safety.
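
The Clarke-Wright (C-W) saving heuristic for a single tour can be sketched as follows. This is a textbook-style illustration rather than the code behind Delivery Helper: it assumes straight-line distances between 2D points instead of road distances from a map API, takes `points[0]` as the depot, and uses hypothetical function names.

```python
import math


def cw_savings_tour(points):
    """Build a closed tour with the Clarke-Wright saving heuristic.

    points: list of (x, y) tuples; points[0] is the depot and at least one
            delivery point is assumed. Returns a list of indices that
            starts and ends at 0.
    """
    n = len(points)

    def d(a, b):
        return math.dist(points[a], points[b])

    routes = {i: [i] for i in range(1, n)}  # route id -> sequence of stops
    route_of = {i: i for i in range(1, n)}  # stop -> id of its route
    # Saving of visiting j directly after i instead of returning to the depot.
    savings = sorted(
        ((d(0, i) + d(0, j) - d(i, j), i, j)
         for i in range(1, n) for j in range(i + 1, n)),
        reverse=True,
    )
    for _, i, j in savings:
        ri, rj = route_of[i], route_of[j]
        if ri == rj:
            continue  # already on the same route
        a, b = routes[ri], routes[rj]
        if i not in (a[0], a[-1]) or j not in (b[0], b[-1]):
            continue  # only route endpoints can be joined
        if a[-1] != i:
            a.reverse()  # make i the tail of route a
        if b[0] != j:
            b.reverse()  # make j the head of route b
        a.extend(b)
        for stop in b:
            route_of[stop] = ri
        del routes[rj]
    (tour,) = routes.values()  # without capacity limits one route remains
    return [0] + tour + [0]


def tour_length(points, tour):
    """Total length of a tour given as a list of point indices."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))
```

On four points at the corners of a unit square, the heuristic returns the perimeter tour of length 4, which is also the optimum for that instance; on larger instances it is a heuristic and gives no optimality guarantee.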

A cross-site script detection method based on MLP-HMM

ZHOU Kang, WAN Liang, DING Hong-wei

2019, 41(08): 1413-1420. doi:

Because the initial prior hypothesis of the hidden Markov model (HMM) is estimated inaccurately in cross-site script detection, and HMM parameters trained with the maximum likelihood criterion classify poorly, we propose a cross-site script detection model based on MLP-HMM. Firstly, we use a natural language processing (NLP) approach to address the high dimensionality of the data. Then, the weights of the whole model are fine-tuned through multi-layer perceptron (MLP) neural network learning to obtain the initial observation matrix. Finally, the observation matrix is fed into the HMM to enhance the model's capacity for parameter construction and classification. Experimental results show that, in cross-site script detection, the HMM combined with the MLP significantly improves the detection rate and reduces the detection time compared with the original HMM and the traditional algorithm.

A cache replicas placement scheme in D2D networks

WEN Kai1,2,3, TAN Xiao1,2

2019, 41(08): 1421-1425. doi:

In a device-to-device (D2D) caching network, the number of cache replicas is an important factor limiting the caching efficiency of the system. With too many replicas, cache resources cannot be fully used, while with too few, popular files are difficult to obtain effectively. To address the cache replicas placement (CRP) problem in D2D caching networks, we use convex programming theory to design a CRP scheme that maximizes the cache hit ratio of the system. Simulation results show that, compared with existing replica placement schemes, the proposed scheme effectively improves the overall cache hit ratio of the D2D caching network.

A requirement oriented formal modeling and verification method for safety critical systems

HU Jun, ZHANG Wei-jun, LI Wan-qian

2019, 41(08): 1426-1433. doi:

In the field of safety-critical systems, explicit system requirements are essential. We apply the idea of model-based systems engineering to model and verify the automatic flight control system (AFCS) based on system requirements. We employ the RSML-e language to model the requirements of the AFCS, propose a method to transform the RSML-e model into a NuSMV 2 model, and use NuSMV 2 to verify the properties of the requirement model. Taking a digital AFCS, the GFC700, as an example, we analyze and verify the model. Experimental results show that the method is feasible for the safety analysis of actual systems.

A reflective software architecture supporting software reusing in design stage and its formalization

LUO Ju-bo1, YING Shi2, LIU Tian-shi1

2019, 41(08): 1434-1443. doi:

Software reuse at the software architecture level is especially important, but reusing software architecture is a very difficult problem in software engineering. There are two main reasons: the lack of information that explicitly describes and supports the process of reusing software architecture, and the lack of an effective reuse approach. Combining meta-information with meta-modeling, reflection and software architecture, we construct a reflective software architecture to support software architecture reuse at the software design stage. We provide a software architecture reuse process based on the reflective software architecture, present the basic principles of its concrete steps, and give Object-Z formal descriptions of concrete operations such as the creation of meta-components, meta-connectors and meta-composites. The design of a supporting tool for software architecture reuse based on the reflective software architecture is also illustrated.

A nasopharyngeal carcinoma CT image segmentation method based on 3D CNNs

XIAO Yin-yan, QUN Hui-min

2019, 41(08): 1444-1452. doi:

Nasopharyngeal carcinoma computed tomography (CT) image segmentation is an essential task for the diagnosis and treatment of nasopharyngeal carcinoma. However, nasopharyngeal carcinoma lesions have varied and complicated shapes, uneven gray levels and fuzzy boundaries, so it is difficult to segment the image accurately. To solve this problem, we propose a nasopharyngeal carcinoma CT image segmentation method based on three-dimensional convolutional neural networks (3D CNNs). In our three-dimensional deep convolutional neural network framework, ordinary convolutions with 3×3×3 kernels are employed in the first 5 layers, dilated convolutions with a dilation factor of 2 are employed in the middle 6 layers, and dilated convolutions with a dilation factor of 4 are employed in the last 6 layers. A residual connection is used for every two convolutional layers, and the softmax function is used to classify pixels. Dilated convolutions help to obtain accurate dense predictions and fine segmentation maps along object boundaries. Residual connections smooth the information propagation in the deep convolutional neural network and improve the training speed. Experimental results show that the proposed method performs better than other mainstream methods for nasopharyngeal carcinoma CT image segmentation.
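
One way to see what the dilation factors buy is to compute the receptive field of the 17-layer stack. The sketch below assumes a kernel size of 3 along each axis and stride-1 convolutions with no pooling; neither assumption is confirmed by the abstract, and the function name is hypothetical.

```python
def receptive_field(layers):
    """Receptive field along one axis of a stack of stride-1 convolutions.

    layers: list of (kernel_size, dilation) pairs, input side first.
    Each layer widens the field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


# 5 ordinary layers, then 6 layers with dilation 2, then 6 with dilation 4.
layers = [(3, 1)] * 5 + [(3, 2)] * 6 + [(3, 4)] * 6
print(receptive_field(layers))  # 83
```

Under these assumptions each output voxel sees 83 input voxels along every axis, whereas 17 undilated layers of the same kernel size would see only 35, which illustrates how dilation enlarges context without adding parameters.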

Road semantic segmentation based on hybrid auto-encoder

ZHOU Fei, TANG Jian, YANG Cheng-song, RUI Ting

2019, 41(08): 1453-1458. doi:

Road detection is an important part of environment perception for unmanned vehicles, and semantic segmentation of environmental scenes with computer vision is one of the key technologies for ensuring their safe driving. We propose a hybrid auto-encoder semantic segmentation model that combines a sparse auto-encoder and a denoising auto-encoder. The sparse semantic encoding of the sparse auto-encoder and the robust semantic encoding of the denoising auto-encoder make the features learned by the model more conducive to semantic segmentation. By choosing a reasonable arrangement order and stacking form for the model, an optimal selection of image semantics can be achieved, yielding a semantic segmentation model with a deep “rich structure” that further improves segmentation performance. Experiments show that this model is simpler, trains faster and achieves better overall segmentation performance.

Salient object detection based on neighborhood optimization mechanism

WEI Wei-yi, WANG Yu, DOU Lei-xiang, WEN Ya-hong

2019, 41(08): 1459-1465. doi:

In salient object detection, the results are not ideal when the difference between the background region and the foreground region is not obvious. To address this problem, we propose a salient object detection algorithm based on a neighborhood optimization mechanism. Firstly, the image is segmented into superpixels. Then, a contrast map and a distribution map are built in the CIELab color space and merged by a new merging method. Finally, under constraints such as spatial distance, a neighborhood updating mechanism is established to optimize the initial saliency maps. Experimental results show that the algorithm is more effective in salient object detection.

A blind image deblurring method based on multiple priors

XU Yu, LIU Hui, SHANG Zhen-hong

2019, 41(08): 1466-1473. doi:

We propose an effective blind image deblurring method based on multiple priors. Our work is motivated by the observation that a good image prior should favor clear images over blurred ones. Existing deblurring methods perform poorly on images from specific scenes, and their results retain some blur, including unclear outlines and details. To address these problems, we combine several priors, namely the dark channel prior, the intensity prior and the gradient prior, and balance them to provide more prior information about outlines and details during restoration, which greatly helps blur kernel estimation. We obtain an overall prior by weighting the three priors and put it into the maximum a posteriori (MAP) estimation framework. The blur kernel is estimated iteratively, and the original image is then restored with a non-blind image restoration method. Our results improve considerably on those of current advanced methods, especially in the outlines and details of various natural scenes.

A self-calibration optimization algorithm for 3D reconstruction of X-ray images

ZHANG Bo-lin1, LIU Rong-hai2, ZHENG Xin2, YANG Ying-chun2, CHEN Lei1, WAN Shu-ting1

2019, 41(08): 1474-1481. doi:

According to the imaging characteristics of X-ray images, we propose a self-calibration method for 3D reconstruction of X-ray images. Firstly, the matching relationship of profiles between two adjacent X-ray images is obtained with the SIFT algorithm. Then, the fundamental matrix is computed from the matching relationship. Thirdly, the initial values of the intrinsic parameters of the X-ray non-destructive testing equipment are estimated from the fundamental matrix. Finally, the intrinsic parameters are optimized with the improved Kruppa equation, yielding the self-calibrated intrinsic parameters for the 3D reconstruction of X-ray images. 3D models are built from the intrinsic parameters before and after optimization, and comparative experiments on shape and key dimension error show that the optimized intrinsic parameters have higher precision and reliability.

Modeling and simulation of aircraft virtual maintenance process based on assembly sequence constraints

QIAN Wen-gao1, GENG Hong1, MA Hong-yan2

2019, 41(08): 1482-1489. doi:

Current aircraft virtual maintenance process models are rigid, do not fully consider the relationships between complex maintenance behaviors, and cannot meet trainees' diverse training needs. Given the complexity of aircraft systems and the non-standard, random operations of trainees, we construct a priority constraint matrix according to the priority constraint relationships among maintenance assembly sequences, and then define the priority constraint timed colored Petri net (TCPN) model. Based on a systematic examination of the maintenance entities and their maintenance operations, we propose a virtual maintenance entity operational meta-model and a process model based on meta-model fusion, and describe how the dynamic virtual maintenance process is implemented with the priority constraint TCPN. Finally, the random disassembly process of a refueling valve of an A320 aircraft is modeled and simulated to verify the model.
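
A priority constraint matrix of this kind can be used to decide which disassembly steps a trainee is currently allowed to perform. The sketch below is a plain-Python illustration and not the TCPN model itself; the matrix convention (`priority[i][j]` is 1 when step i must be completed before step j) and the function name `enabled_steps` are assumptions.

```python
def enabled_steps(priority, done):
    """Return the steps that may be performed next.

    priority: square 0/1 matrix; priority[i][j] == 1 is assumed to mean
              that step i must be completed before step j.
    done:     set of indices of steps already completed.
    """
    n = len(priority)
    return [
        j for j in range(n)
        if j not in done
        and all(not priority[i][j] or i in done for i in range(n))
    ]
```

For example, if step 0 must precede steps 1 and 2, only step 0 is enabled at the start; once it is done, steps 1 and 2 become enabled in either order, which is how a random but valid operation sequence can be accepted.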
