Computer Engineering & Science

A communication scheduling method combining

communication rearrangement and message merging

PENG Jin-tao1,YANG Zhang1,2,LIU Qing-kai1,2，ZHANG Qian1

2020, 42(02): 191-196. doi:

Abstract ( 252 )

PDF (745KB) ( 374 ) 　　

Review attachment

Abstract:
Network communication is critical for high-performance computer applications. At present, with the complication of numerical simulation applications and the increasing scale of parallelism, the need for application software to alleviate congestion and reduce communication protocol overhead is becoming more and more urgent. The traditional message merging method only merges small messages with the goal of reducing the communication protocol overhead and latency. In contrast, from the perspective of scheduling algorithms, this paper proposes an algorithm for reducing the network congestion of large messages through message rearrangement and improving the effective utilization of the network by merging messages based on priority. Experiments show that our algorithm can increase the communication performance of real applications by up to 41%, and on average by 10% for each application.

A combined storage structure for

image processing algorithms on GPU

ZUO Xian-yu1,2，ZHANG Zhe1,5，HUANG Xiang-zhi4,5，GE Qiang1,2，ZHANG Li-tao3,ZANG Wen-qian4,5

2020, 42(02): 197-202. doi:

Abstract ( 196 )

PDF (592KB) ( 312 ) 　　

Review attachment

Most image processing algorithms optimized by GPU can achieve better performance, but the scheduling strategy between data transmission and kernel execution is still the main bottleneck for further improvement in efficiency. To solve this problem, streams are usually used to overlap data transmission and kernel execution, in order to hide some of the data transmission and kernel execution time. However, due to the characteristics of the CUDA programming model and the limitations of GPU resources at hardware level, operations are still serialized when there are so many operations to be execute, even if numerous streams are created. In this paper, a new data storage structure, named Combined Storage Structure (CSS), is proposed, which improves the performance by merging small data transmissions on the single stream into a large one to reduce the fixed cost and the call gap of the operations of data transmission and kernel execution. Experimental results show that CSS can not only improve the performance of GPU-based image processing algorithms in the case of single stream, but also improve the acceleration performance in the case of multiple streams. CSS has good practicability and scalability, and it is suitable for the image processing operations that contain more operators or a large number of small-scale images. In addition, the proposed method provides a new research idea for GPU acceleration of image processing algorithms.

Task scheduling optimization in Spark

environment with unbalanced resources

HU Ya-hong1,SHENG Xia2,Mao Jia-fa1

2020, 42(02): 203-209. doi:

Abstract ( 190 )

PDF (804KB) ( 411 ) 　　

Review attachment

Due to the updating of hardware resources, the computing capacity of nodes in a cluster becomes inconsistent. The emergence of cluster heterogeneity leads to the imbalance of cluster computing resources. At present, Spark big data platforms does not consider the cluster heterogeneity and the node resource utilization in task scheduling, which affects the system performance. This paper constructs a performance evaluation index system of cluster nodes, and proposes to use the priority of nodes to express their computing capacity. The proposed node priority adjustment algorithm can dynamically adjust the priority of each node according to the status of the node during task execution. The Spark Dynamic Adaptive Scheduling Algorithm (SDASA) based on node priority completes task assignment according to the real-time node priority. Experiments show that SDASA can shorten the execution time of tasks in the cluster and improve the overall computing performance of the cluster.

Design of general CFD software PHengLEI

ZHAO Zhong,HE Lei,HE Xian-yao

2020, 42(02): 210-219. doi:

Abstract ( 1491 )

PDF (1866KB) ( 1043 ) 　　

Review attachment

PHengLEI is a software platform of Computational Fluid Dynamics (CFD) developed by China Aerodynamics Research and Development Center (CARDC) with completely independent intellectual property rights. It has been widely used in China. As an important part of the national numerical wind tunnel system, PHengLEI are integrating the discipline physical models including helicopter rotor, aircraft icing, wind engineering, chemical reaction and combustion, virtual flight, high order method on both structured and unstructured grids. In order to meet the needs of integrated development of multi-disciplinary and multi-physical models and to adapt to the next generation of high-performance computer hardware, based on the existing work, the architecture and data structure of the next-generation PHengLEI software are further designed.

Reconfigurable unit design in

a reconfigurable Ethernet packet parser

ZHAO Yu1,2,YIN Shu-juan1,LI Xiang-yu2

2020, 42(02): 220-228. doi:

Abstract ( 251 )

PDF (766KB) ( 427 ) 　　

The ever-changing network protocol standards and user-customized network services require greater flexibility for switch hardware. Under this background, this paper proposes a basic processing unit of an Ethernet switch chip packet parser, which can define protocol parsing rules by software programming and has the advantages of high performance and high flexibility. Hardware parsing logic and look-up table can be flexibly configured, and the extraction, search, match, action, and other parsing processes of packet header content are defined, so as to support different types of protocol parsing tasks. It is composed of two types of basic structures in series or parallel so that the hardware resources can be tailored according to the need. Based on the reconfigurable basic processing unit, the reconfigurable message parser can be constructed, which supports the parsing of custom protocol and unknown protocol. This paper mainly introduces the structure of the reconfigurable basic processing unit and introduces the implementation method of the parser architecture based on the basic processing unit. The synthesis results under 40nm process show that the maximum working clock frequency of the basic processing unit can reach 240MHz. Based on the basic processing unit, the parser, which supports the parsing of four-layer common Ethernet protocols, can process 240 million packets per second. The total storage resources used in the reconfigurable basic processing unit are 87.98K bit, and the design size is about 1.47 million gates.

An indoor positioning algorithm based on multimode data

JIA Yu-fu1,LIU Wen-ping1,HU Sheng-hong2

2020, 42(02): 229-235. doi:

Abstract ( 199 )

PDF (604KB) ( 274 ) 　　

To solve the indoor positioning problem, an indoor positioning algorithm based on WiFi and visual information is proposed. Firstly, the mathematical relationship between the image plane coordinates and the coordinates of the world coordinate system is analyzed and established by using the principle of coordinate transformation. Secondly, the mathematical equation is established with the distance between two beacons as the constraint condition, and the elevation angle of the camera is solved. Thirdly, the plane distance between the observation point and the two beacons is calculated according to the elevation angle. Finally, the distance between the observation point and the beacon is calculated. The marker can locate the observation point. According to the above positioning process, the virtual coordinates of the observation points will be obtained. Using SAE and DNN model and taking multiple custom WiFi features as input and 11 distance categories as labels, the valid values of positioning coordinates can be identified by the distance between nodes. The experimental results show that the algorithm can provide high precision real-time positioning in indoor environment, and the reliability of the algorithm is fully verified.

A format preserving encryption scheme

for sensitive information

ZHANG Yu-lei1，LUO Guang-ping1，ZHANG Yong-jie2，ZHANG Xue-wei1,LIU Xiang-zhen1，WANG Cai-fen3

2020, 42(02): 236-240. doi:

Abstract ( 277 )

PDF (391KB) ( 297 ) 　　

Format preserving encryption has the characteristics of unchanged data format and data length after encryption, and does not destroy the data format constraints, thereby reducing the cost of modifying the data format. The existing format preserving encryption schemes for sensitive information are based on the symmetric encryption system, which has problems such as low key transmission security and high key management cost. This paper proposes a format preserving encryption scheme for sensitive information in identity cryptosystems. Compared with the existing format preserving encryption schemes, the two parties do not need to transmit a key, and the key derivation function is used to generate an encryption key and a decryption key. The use of hybrid encryption improves the security of sensitive information transmission. It is proved that the scheme satisfies the security of identity-based pseudo-random permutation. At the same time, the scheme has cipher text indistinguishability under adaptive selective plaintext attack.

Test case prioritization based on ant colony algorithm

ZHANG Wei-xiang,QI Yu-hua,WEI Bo,ZHANG Min,DOU Zhao-hui

2020, 42(02): 241-249. doi:

Abstract ( 259 )

PDF (656KB) ( 406 ) 　　

By optimizing the execution order of test cases, test case prioritization technology can improve the efficiency of software testing. It is an important research topic for enhanced software and regression testing. Aiming at the demand-based test case prioritization problem, a solution method based on ant colony algorithm is proposed. Based on different test case sequence evaluation strategies, two different implementations of the method are given. Firstly, according to the characteristics of black-box software testing, a general demand-based test case sequence evaluation index is designed. Secondly, the concept of test case aspiration is proposed, which was used to define the distance between test cases. Then, the main design strategies such as pheromone updating strategy, optimal solution set updating strategy, and local optimal solution mutation strategy are given. Finally, the two implementations (distance-based approach and index-based approach) are achieved. The experimental results show that this method had a good global search ability, and has better overall effect than other methods such as particle swarm algorithm, genetic algorithm and random testing.

A variance toss algorithm for parameter reduction of soft set

LIN Lian-hai1,TIAN Li-qin1,2,CAI Ming-kai1,LI Sheng-hong1

2020, 42(02): 250-258. doi:

Abstract ( 159 )

PDF (1225KB) ( 256 ) 　　

Soft set is a theory and tool for dealing with uncertain data. It is usually used in decision theory. The parameter reduction of soft set refers to the removal of redundant parameters that have little effect on decision-making. Since the 0-1 linear programming algorithm was proposed, the problem of parameter reduction of soft set has been basically solved, but the implementation of the 0-1 linear programming algorithm is complex and relies on the integer programming algorithm. Here, considering the practical application background of soft set, combining soft set with probability theory, a soft set parameter reduction method in the context of big data, called the variance toss algorithm, is designed. The time complexity of this algorithm is O(m2n), and 0-1 linear programming is usually regarded as a NP-hard problem. The variance toss algorithm is simple to implement. When the object set (or the complete set) is small and less than twice the size of the attribute set, the effect is poor. However, as the size of the object set (or the complete set) increases, the efficiency will gradually increase, and the final computing efficiency will be better than the 0-1 linear programming algorithm. It has higher efficiency for the soft set with high reduction density.

Adversarial domain adaptation with

self-attention in image classification

CHEN Cheng1,GUO Wei-bin1,LI Qing-yu2

2020, 42(02): 259-265. doi:

Abstract ( 234 )

PDF (557KB) ( 286 ) 　　

Domain adaptation, as a systematic framework for addressing data set migration and adaptation, has grown rapidly in recent years. After the emergence of the generative adversarial network (GAN), the introduction of adversarial ideas has brought new ideas to the unsupervised adaptation problem in the domain adaptation. This paper compares and analyzes the intrinsic relationship between GAN and domain adaptation, and then generates an improved GAN method. A domain adaptation method combining self-attention layer is proposed to improve the defect that long-distance dependence cannot be modeled. At the same time, considering the difference in GAN and domain adaptation tasks, this paper improves the self-attention layer by introducing new learning parameters, so that it has higher precision and robustness in classification tasks. Finally, the effectiveness and feasibility of the proposed algorithm are verified by experiments on the open field adaptive dataset.

Abnormal vehicle detection based on improved

mixed Gaussian model and graphic handle

LIU Yan-ping1，CUI Tong1，ZHOU Zhang-bing2，LI Xiao-cui2，LIU Tian1

2020, 42(02): 266-272. doi:

Abstract ( 161 )

PDF (1000KB) ( 307 ) 　　

Because the traffic safety hazards are becoming more serious in the current life, it has certain practical significance to detect abnormal vehicles in scenes such as pedestrian streets and campuses where vehicles are prohibited from traveling. Aiming at the problem of ghost and cavity in the background model built by mixed Gaussian, an abnormal vehicle detection based on mixed Gaussian modeling of SSIM structural similarity is proposed. The similarity between two image pixels is calculated by SSIM. The secondary background modeling is carried out after the Gaussian modeling, and the exponential function is introduced to optimize the weight update process in the Gaussian modeling process, which improves the update speed. The graphical handle function is used to optimize the connected domain method to detect the abnormal vehicle in the foreground area, it is possible to detect the abnormal vehicle and the labeling frame is closer to the shape of the vehicle. Experimental results of 580 images segmented by video show that the detection rate can reach 90.3%.

Person re-identification based

on joint loss and siamese network

FAN Lin1,2,ZHANG Jing-lei1,2

2020, 42(02): 273-280. doi:

Abstract ( 264 )

PDF (904KB) ( 315 ) 　　

In person re-identification applications, pedestrian images are susceptible to illumination changing, similar wearing and different shooting angles, so it is difficult to distinguish sample pairs, which leads to incorrect matching. Aiming at this problem, a person re-identification optimization algorithm based on joint loss and siamese network is proposed. Firstly, the residual convolutional neural network is used to extract image features. The joint loss including focal loss and cross entropy loss is used to carry out supervised traning on the extracted features, in order to increase the models attention to the indistinguishable pairs. Then, the cosine distance is used to calculate the similarity between images to realize the pedestrian recognition. Finally, a re-ranking algorithm is adopted to reduce the mismatch rate. Experimental results on open datasets including Market-1501 and Duke MTMC-reID show that the matching rate (Rank-N) is 91.2% and 84.4%, respectively, and the mean Average Precision（mAP）is 85.8% and 78.6%.

A thresholding image segmentation algorithm

based on multi-objective particle swarm and

artificial bee colony hybrid optimization

ZHAO Feng1,2，KONG Ling-run1,2，MA Gai-ni1,2

2020, 42(02): 281-290. doi:

Abstract ( 215 )

PDF (963KB) ( 299 ) 　　

In order to accurately separate the objects from the background in images, a thresholding image segmentation algorithm based on multi-objective particle swarm and artificial bee colony hybrid optimization is proposed. Under the framework of multi-objective optimization, the improved inter-class variance criterion and maximum entropy criterion are used as the fitness functions, and then the two fitness functions are optimized by particle swarm and bee colony hybrid optimization to obtain a set of non-dominated solutions. At the same time, the global optimal solution of particle swarm is introduced into the employed bee phase to update the honey source and the search equation is modified, so as to improve the global and local search abilities in the evolution of the bee colony. Finally, the weighted ratio of between-cluster variation and modified intra-cluster variation is adopted to select an optimal solution from a set of non-dominated solutions. Experimental results show that this algorithm can obtains ideal thresholding segmentation results.

An image compression technique based on

adaptive block-based octree color quantization

WU Zhen-hua1,SHEN Hu-jun1,2,GONG Zuo-quan1,FENG Ping1,GONG Tong-yan1,3,DENG Ming-sen1

2020, 42(02): 291-298. doi:

Abstract ( 180 )

PDF (1420KB) ( 314 ) 　　

null

Fast object extraction using human-machine

interactive graph cuts

XU Qiu-ping

2020, 42(02): 299-306. doi:

Abstract ( 155 )

PDF (984KB) ( 308 ) 　　

By characterizing pixel distance with RGB pixel value, based on graph cuts theory, an human-machine interactive fast object extraction method is proposed. Closed polygons are artificially drawn around the object as the initial contour line, a single-side variable width cyclic neighborhood is generated to avoid ineffective overlapping and repeated cutting. Meantime, an energy function is constructed and a s-t network is generated to achieve the object extraction through the minimus cost cutting of the s-t network. Repeat the above steps until converge to the best boundary of the object. Later, convenient, fast, safety-oriented, automatic and manual error correction measures are provided for local errors. Experiments show that method has the advantages of convenient human-machine interaction, efficient and complete error correction, and fast and accurate object extraction.

Tourism demand prediction using echo state

network with improved fruit fly optimization algorithm

CHEN Ming-yang，WANG Lin，YU Xiao-xiao

2020, 42(02): 307-316. doi:

Abstract ( 195 )

PDF (808KB) ( 329 ) 　　

Firstly, this paper improves the standard Fruit Fly Optimization Algorithm (FOA) by adaptively adjusting the number of the fruit fly populations and the search step size and optimizing the initial iteration position, so as to improve the local search ability and search efficiency of the algorithm. Through combining the optimized FOA (AFOA) and Echo State Network (ESN), a two-stage combined prediction model (AFOA-ESN) is proposed. The AFOA optimizes the ESN to obtain its key parameters, which are inputted into the ESN to form the final combined prediction model. Finally, this model is used to predict tourism demand. The experimental results show that the AFOA-ESN model has higher prediction accuracy than the ARIMA, SVM, BPNN, standard ESN and other models.

A fuzzy mixed data clustering algorithm

based on density peaks

CHEN Yi-yan1,2,LI Ye3,LI Cun-jin1

2020, 42(02): 317-324. doi:

Abstract ( 166 )

PDF (722KB) ( 293 ) 　　

By extending CFSFDP algorithm to continuous fuzzy sets and discrete fuzzy sets, an extended CFSFDP algorithm for fuzzy mixed data is proposed, which is named FMD-CFSFDP algorithm. The FMD-CFSFDP algorithm extends the classical information in the sample to fuzzy sets, and achieves the clustering of fuzzy sets by seeking the density peaks. The proposed FMD-CFSFDP algorithm is a kind of density-based clustering algorithm established on fuzzy set for fuzzy mixed data. Firstly, the CFSFDP algorithm and some of its improvement algorithms are briefly introduced, and the mathematical definition of fuzzy mixed data is given. Secondly, by combining the concept of traditional fuzzy Euclidean distance, the improved Euclidean distance for both continuous and discrete fuzzy sets with smaller error is proposed. On the basis, the weight is introduced to establish the overall distance for fuzzy mixed data. By referring to the clustering steps of the CFSFDP algorithm, the clustering steps of FMD-CFSFDP algorithm are given. Furthermore, under the conditions of different sample size, different index number, different cluster number and different fetching rule, random simulation experiments are carried out on the algorithm and the clustering results are analyzed. Finally, the advantages and disadvantages of the FMD-CFSFDP algorithm are summarized respectively. On this basis, some improved schemes are proposed, which provides a reference for future in-depth research.

A black hole pattern mining algorithm

in dynamic spatial network

TAN Sheng-xi,JIA Jin-ping,ZHAO Bin,JI Gen-lin

2020, 42(02): 325-333. doi:

Abstract ( 152 )

PDF (779KB) ( 295 ) 　　

The black hole pattern is a landmark achievement in the study of human moving patterns. However, the black hole pattern has limitations in the evolution modeling of human moving patterns. This paper proposes a black hole pattern with time evolution characteristics. The definition of the new pattern needs to meet the three requirements of group scale, spatial locality and time persistence. This paper proposes a dynamic spatial network model with time evolution characteristics. Based on this model, we define a new black hole pattern and propose a corresponding mining algorithm. In order to improve the efficiency of the pattern mining algorithm, we design a candidate pattern pruning algorithm based on spatiotemporal partitioning, which effectively reduces the searching cost of the mining algorithm in spatiotemporal dimension. Finally, experiments based on real data verify the effectiveness and efficiency of the proposed black hole pattern and mining algorithm.

A fuzzy recommendation method based

on heterogeneous information network

LI Xian, ZHAO Xia, ZHANG Ze-hua, ZHANG Chen-wei

2020, 42(02): 334-340. doi:

Abstract ( 172 )

PDF (575KB) ( 307 ) 　　

With the explosive growth of Internet information, the recommendation system plays an increasingly important role. In order to solve the problem of sparse information in the traditional recommendation system and to reasonably express the user’s preference, a fuzzy recommendation algorithm (HFR) based on heterogeneous information network is proposed. The HFR algorithm constructs a triangular fuzzy scoring model to fuzzify the user’s discrete scoring information. In addition, it also adds the attribute information of the project and uses the meta-path representation. Based on this, the multi-source information is fully utilized and a new similarity measure is proposed. The score is predicted to get the final recommendation result. The experimental results show that the HFR algorithm effectively solves the problem of sparse information and improves the recommendation quality.

null

PAN Dong-hang，YUAN Jing-ling，Li Lin，SHENG De-ming

2020, 42(02): 341-350. doi:

Abstract ( 357 )

PDF (668KB) ( 691 ) 　　

null

Current Issue

Author center

Review center

Online journal