Computer Engineering & Science

Scheduling and optimization of multi-job execution in distributed environment

JI Hang-xu, JIANG Su, ZHAO Yu-hai, WU Gang, WANG Guo-ren

2021, 43(06): 951-961. doi:

Abstract ( 395 )

PDF (943KB) ( 443 ) 　　

Distributed big data computing engines are indispensable tools for scientific research institutions, Internet companies, and government departments to process large-scale data. Their use and promotion have promoted the rapid development of various fields and made great contributions to social progress. However, in the case of multi-job processing, the current mainstream big data computing engines still have many shortcomings in resource allocation and job scheduling. They usually divide multi-jobs into memory resources equally and use first-input-first-output (FIFO) method for scheduling jobs, such a simple resource partitioning method and job scheduling mechanism cannot give full play to system performance. In response to this problem, improvements have been made from the job level of the computing engine: (1) in terms of resource division, the task amount of the job is estimated to judge the difference between the task amount and the pre-allocated resources of job, and the jobs with high waste of cluster resources are merged to fully utilize the computing resources by the extraction of job features; (2) in terms of job scheduling, the features of the jobs in the job pool are extracted so that cluster analysis is conducted for the jobs by multipath K-means algorithm, and then self-balancing polling scheduling algorithm is used to schedule the jobs based on the analyzed results to achieve the load balance. In order to verify the effectiveness of the proposed algorithm, comparative experiments were conducted in a distributed cluster environment using large-scale text data sets. The experimental results show that the proposed job merging algorithm and multi-job scheduling algorithm can reduce the job running time by 5% to 23%, improves the system throughput by 7.5%~29%, and reduce the number of threads started by 40% in the best case.

An aggressive butterfly optimization method in compiler

ZHU Guang-lin, L Fang, LAI Qing-kuan, CHEN Hua-ying, HE Xian-bo,

2021, 43(06): 962-968. doi:

Abstract ( 363 )

PDF (660KB) ( 247 ) 　　

The purpose of compilation optimization technology is to explore the optimization space in the program and improve the efficiency of program compilation or operation. Dead code elimination optimization is one of the widely used compilation optimization technologies. It aims to delete the unreachable codes in the program to improve the execution efficiency of the programs. In many applications, the execution path is often related to the input parameter values at runtime, and some branch paths are combined with the runtime parameter values. There may be dead codes. It is difficult to deal with these dead codes through the existing dead code elimination optimization. To this end, this paper proposes an aggressive butterfly optimization method that relies on data flow analysis. According to the possible values of dynamic runtime parameters, the method uses SSA intermediate representation to automatically generate the branch code similar to butterfly shape for the program. At the stage of program compilation in compiler, it provides feasible optimization basis for related optimization. Finally, the effectiveness and feasibility of this method are verified through experiments.

A resistance-driven automatic routing algorithm in rectangular area of FPD parallel ports

HAN Ao, ZHAO Zhen-yu, LIU Guo-qiang, YANG Tian-hao

2021, 43(06): 969-975. doi:

Abstract ( 220 )

PDF (814KB) ( 308 ) 　　

Flat panel display technology has gradually developed into mainstream screen display technology, and automated routing is one of the important research tasks in the field of panel circuit and touch screen circuit design. Different routing solutions are required according to different routing requirements, such as fixed resistance routing or equal resistance routing. This type of routing task usually requires the routing with a designated routing area and maximum resistance between the two sets of ports. The resistance of each route is limited within the specified range, so as to meet the requirements of the IC to drive the load. Parallel port rectangular area routing is a common and important routing target, and it is necessary to find a suitable space allocation plan. The resistance-driven intelligent routing algorithm in the rectangular routing area of parallel ports is to group ports during the rout planning, then perform multi-segment pre-routing for each group of ports, and finally adjust the resistance to a limited area with an adaptive step. Three comparison experiments are successfully completed for the routing of 30 parallel port instances. Compared with the simple three-segment routing method and the fixed-step resistance adjustment method, it can effectively reduce the average routing time by about 40% and the average memory by 31%, and the routing resistance compliance rate is 100%.

A deep neural network edge computing platform based on TPU+FPGA

LUAN Yi, LIU Chang-hua

2021, 43(06): 976-983. doi:

Abstract ( 318 )

PDF (665KB) ( 373 ) 　　

Aiming at the contradiction between the huge consumption of computing resources by deep neural network in pursuit of accuracy and the restricted environment of edge computing platforms, a new edge computing fabric is explored. The new fabric uses FPGA logic resources to construct the Neural Network Tensor Processing Units (TPUs), and collaborates with ARM CPU in the FPGA chip, which not only accelerates the computation of deep neural network model and improves the accuracy, but also optimizes the power consumption of the entire system significantly. Under this architecture, the accuracy of the compressed MobileNet-V1 is 78.1%, while the power consumption is only 3.4 W. Its comparison with other deep learning edge computing platforms with different computing fabrics shows that the system has obvious advantages for accelerating the computation of small-scale deep neural networks, without reducing the accuracy.

High speed channel modeling based on machine learning

HE Jing, LI Jin-wen, YANG An-yi

2021, 43(06): 984-988. doi:

Abstract ( 284 )

PDF (698KB) ( 346 ) 　　

With the increase of transmission rate, transmission length and structure complexity of high-speed channel, channel modeling technology becomes more complex and difficult. This paper proposes a novel method by combining the popular machine learning method with high-speed channel. A large number of analog data are collected, and deep neural network (DNN) and recurrent neural network (RNN) methods are used to model the channel. Once the model is trained successfully, the eye diagram of the output signal can be predicted by the simulation model, and the signal integrity can be evaluated and analyzed quickly and accurately. In addition, in the high-speed channel, the serious interference and attenuation of the signal limits the transmission distance and transmission rate, which brings difficulties to the test and information collection. In order to recover the ideal signal, the high-speed serial link usually contains complex equalization blocks.The least mean square (LMS) algorithm is adopted to effectively eliminate the interference, reduce the bit error rate and improve the transmission rate.

SIMD optimization of smoothed particle hydrodynamics based on ARM SVE

FAN Xiao-kang, XIA Ze-yu, LONG Si-fan, YANG Can-qun

2021, 43(06): 989-996. doi:

Abstract ( 272 )

PDF (915KB) ( 313 ) 　　

Smoothed Particle Hydrodynamics (SPH) is a meshless particle method that has emerged in recent years. SPH has obvious advantages in dealing with large deformations, moving material surfaces, and free surfaces. In recent years, it has been widely used in the field of numerical simulation, and it is a typical application of scientific computing. As an explicit particle method, SPH needs to calculate a large number of particle interactions in each iteration step, and the amount of calculation is very large. How to improve the calculation efficiency of SPH has become a research hotspot. Scalable Vector Extension (SVE) is the next generation SIMD extension proposed by ARM. This paper proposes a SIMD optimization method for SPH based on SVE, and achieves a good speedup.

An intelligent C++ translator architecture for many-core processors

YU Mao-xue, JIA Dong-ning, WEI Zhi-qiang, XU Jia-li, MA Guang-hao

2021, 43(06): 997-1005. doi:

Abstract ( 272 )

PDF (1305KB) ( 388 ) 　　

Domestic heterogeneous many-core processors are a key link for my country to break international technical barriers and make breakthroughs in the field of high-performance computing. Focusing on the construction of domestically produced super-computing software ecological environment, using the intelligent source code conversion method to revitalize the massive multi-core architecture legacy code is an important way to accelerate the efficiency of software development and promote the development of the field. Aiming at the current situation that domestic computing cores do not support C++ compilation, based on the open source ANTLR language translation tool, this paper proposes an auxiliary framework of the intelligent C++ to C conversion for heterogeneous many-core processors.The framework focuses on the key features of object-oriented languages. Based on the abstract syntax tree, it implements base classes and inheritance classes, function definitions, template instantiation based on tagging, and C language conversion of some STL libraries, and establishes automatic annotation of the code to be converted. The system greatly improves the conversion and transplantation efficiency of C++ code. The automatic conversion and transplantation test of the measurable parallel computing benchmark application BableStream confirms the effectiveness of the conversion method proposed in this paper.

A component retrieval method based on learning to rank

CHEN Hua-ye, WANG Hai-tao, JIANG Ying, CHEN Xing

2021, 43(06): 1006-101. doi:

Abstract ( 215 )

PDF (727KB) ( 229 ) 　　

This paper applies the method of learning to rank to the research of component retrieval. Firstly, the facet description method is used to describe the components comprehensively, and the features of the facet described components are extracted through the Word2vec model and the weight setting method. Secondly, the component semantics analysis and cosine similarity calculation are performed on the component feature description information to obtain the component training data set. Finally, the component training data set and the component data set are used to train the model parameters of the improved Plackett-Luce probabilistic ranking model through the maximum likelihood estimation method, so as to obtain a component ranking model. The component ranking model is applied to the component retrieval to realize a component retrieval method. Experiments show that the method has better effectiveness, recall, precision and efficiency are better than the traditional component retrieval methods.

Variant mining of design pattern with clue constraint

XIAO Zhuo-yu, He Pei, XU Yun-biao, CHEN Guo, GUO Jie, HUANG Jun

2021, 43(06): 1014-1023. doi:

Abstract ( 149 )

PDF (869KB) ( 192 ) 　　

Aiming at the low accuracy issue of variant mining of design pattern, a variant mining method of design pattern is proposed by introducing clue constraints. , which aims to describe variant clues of design pattern based on the constraint satisfaction problem (CSP). The DPVMC (Design Pattern Variant Mining with Clue) algorithm is given, which introduces clues in two stages: structural feature constraints and timing feature constraints. Taking Proxy, Bridge, Command, and Factory Method pattern variants as examples, a two-stage single design pattern variant mining experiment and an integrated design pattern variant mining experiment are designed. Design pattern mining experiments are carried out through 4 mainstream design pattern mining tools and 4 benchmark systems. The experimental results show that this research method achieves good results.

UAV-based target tracking based on unsupervised learning

FANG Meng-hua, JIANG Tian,

2021, 43(06): 1024-1031. doi:

Abstract ( 262 )

PDF (859KB) ( 253 ) 　　

With the development of various researches in the field of computer vision, target tracking has become more and more popular and has been widely used in all walks of life. UAV-based arget tracking has also evolved. Compared with ordinary target tracking, UAV-based target tracking has many advantages, but there are also some challenges. In view of the limited data sets, low data quality, and lack of unified data labeling in UAV-based target tracking, this paper designs a new UAV-based target tracking model based on unsupervised learning. This model improves the backbone network and tracking method of UDT model. Combining the SiamFc network structure and the unsupervised target tracking idea of UDT, the backbone network of the model is improved to an AlexNet lightweight neural network, and target tracking is achieved through forward tracking and multi-frame backward verification methods. Comparative experiment results show that the designed model is better than the model before improvement and other classic tracking models.

An improved image matching algorithm based on OSID

CHEN Xue-song, LEI Man, BI Bo, TANG Jin-ping

2021, 43(06): 1032-1040. doi:

Abstract ( 182 )

PDF (1163KB) ( 262 ) 　　

In order to solve the problem that there are other feature points in the image block without considering one feature point in OSID, and the matching speed of histogram descriptor is slow, an improved binary descriptor based on OSID is proposed. In the process of constructing OSID descriptors, the selection of M is fixed. Therefore, when there are multiple feature points in an image block of a feature point, we propose to try to adapt the value of M, enrich the information contained in the descriptors, and improve the correct matching rate of the algorithm. Meanwhile, the histogram descriptors finally generated by OSID are encoded into binary descriptors, and Hamming distance replaces European distance for image matching, so as to improve the matching speed of the algorithm. The test results on the standard dataset show that the improved OSID has better matching accuracy than the similar descriptors and the original algorithm in complex view changes, image blur and JPEG compression scenarios.

An image splicing detection method based on illuminant color and noise

WEI Wei-yi, ZHAO Xiu-feng, ZHAO Yi-fan

2021, 43(06): 1041-1046. doi:

Abstract ( 235 )

PDF (899KB) ( 279 ) 　　

Image splicing is a common image content tampering operation. It is still a difficult task to detect splicing areas in image forensics. Most of the existing splicing detection methods detect tampered areas based on the inconsistency of an image characteristic, while the actual splicing forgery often causes the change of a variety of image characteristics. Aiming at the problem that the single feature extraction can not fully reflect the image characteristics to lead to the low detection accuracy, this paper proposes a new method for detecting the image splicing efficiently by extracting the mixed features of the illuminant color and noise in the local regions. The experimental results show that the mixed feature extraction method can achieve higher accuracy than single feature extraction method.

Enhancing Mongolian furniture patterns based on weighted transform

DONG Ying-da, ZHANG Cheng-tao, DUO Hua-qiong, DU Yu-yi

2021, 43(06): 1047-1051. doi:

Abstract ( 185 )

PDF (592KB) ( 260 ) 　　

In order to solve the problems of fuzzy patterns and edge distortion of Mongolian traditional furniture, this paper proposes an image enhancement method based on weighted transformation. Firstly, Mongolian furniture patterns are decomposed into RGB components, and lifting wavelet transform, stationary wavelet transform, interpolation algorithm and inverse lifting wavelet transform are adopted to obtain the high- resolution patterns. Finally, the weighted transformation function is used to modify the histogram, and the histogram with the smallest contribution is filtered to obtain the high- resolution and contrast-enhanced patterns. This method is compared with the traditional histogram equalization and bicubic interpolation method. The experimental results show that the proposed method has better PSNR (Peak Signal to Noise Ratio) and SSIM (Structure Similarity Index) than the tradi-tional histogram equalization and bicubic interpolation methods, and can effectively enhance Mongolian furniture patterns. The proposed method is of great value to the repair and protection of traditional Mongolian furniture patterns.

A protein complex recognition algorithm based on graph embedding and topological structure information

XU Zhou-bo, LI Ping, LIU Hua-dong, LI Zhen

2021, 43(06): 1052-1059. doi:

Abstract ( 200 )

PDF (616KB) ( 221 ) 　　

Protein complex is the basis of cell structure and biochemical mechanism. How to recognize protein complex accurately has become a popular research direction in recent years. Traditional algorithms has low sensitivity and F-measure in searching protein complexes based on structural information, and the artificial construction features can not reflect the real information of the graph when the existing supervised learning algorithms use machine learning algorithms to identify protein complexes. In order to solve the aforementioned problems, a graph2vec SVM recognition algorithm is proposed. In this algorithm, the protein complex is regarded as a dense subgraph, and the modularity of the subgraph is considered. graph2vec technology is used to transform the graph information into vectors, and SVM classifier is used to recognize the protein complex, which improves the sensitivity of protein complex re- cognition and F-measure. Compared with four popular unsupervised learning algorithms (ClusterONE, CMC,HC-PIN and Coach) and three supervised learning algorithms (SCI-BN, SCI-SVM and RM), the algorithm shows good performance in terms of accuracy, sensitivity and F-measure.

Optimization of anomalous flow detection by k-nearest neighbor flow sequence algorithm

LIU Yun, WANG Zi-yu

2021, 43(06): 1060-1066. doi:

Abstract ( 167 )

PDF (535KB) ( 344 ) 　　

Through the spatio-temporal anomalous flow detection technology, anomalous traffic cha- racteristics can be found in urban traffic data. Different from the single anomalous flow detection method from a time sequence, this paper proposes a k-nearest neighbor flow sequence algorithm (kNNFS) for detecting anomalous flow distributions from flow sequences. Firstly, the individual flow observation is measured in each time interval at each location. Then, a flow distribution probability database is built for each time interval at each location by calculating the frequency of the observation of a single flow. Finally, a threshold is used to determine whether the distance between a new flow distribution probability and its k nearest neighbor calculated by the KL divergence is an abnormal value. If the distance value is less than the threshold, the new flow distribution probability is updated into the historical flow distribution probability database; otherwise it is an anomalous flow distribution. Simulation results show that the kNNFS algorithm outperforms the DPMM algorithm and the SETMADA algorithm in terms of accuracy and running time.

Community search via random walk based on structure-attribute interaction bipartite graph

LI Ju, MA Hui-fang, LI Qing-qing, SU Yun

2021, 43(06): 1067-1075. doi:

Abstract ( 216 )

PDF (701KB) ( 280 ) 　　

Community search in attribute graph is a local community detection method, which is essentially based on the query nodes provided by users to return personalized subgraphs containing query nodes with similar attributes of attributes together with structural cohesion. This task helps users better understand how and why communities are formed. This paper proposes a random walk mechanism based on the structure-attribute interaction bipartite graph, which can effectively support the community search in the attribute graph. Specifically, this method first constructs the structure probability transfer matrix based on the network topology, then explores the bipartite graph formed by the interaction of structure and attribute to define the two-stage node-attribute-node probability transfer matrix, and effectively fuses it with the structure probability transfer matrix to obtain the probability transfer matrix of attribute graph. Finally, a random walk with restart method is designed, and the parallel conductance value based on fusion structure and attribute is used to accurately query the community. Experiments on real datasets and artificial data sets verify the effectiveness and efficiency of the proposed method.

A question similarity calculation method based on RCNN

YANG De-zhi, KE Xian-xin, YU Qi-chao, YANG Bang-hua

2021, 43(06): 1076-1080. doi:

Abstract ( 183 )

PDF (558KB) ( 307 ) 　　

Using deep learning to calculate the question similarity in search engine and question answering system is a hotspot in the NLP field. Combining convolutional neural network (CNN) and long-short memory network (LSTM), a recursive convolutional neural network (RCNN) question similarity calculation method is proposed. Firstly, the bidirectional recursive neural network is utilized to extract context information, and then the 1D Convolutional Neural Network was used to integrate the word embedded information with the context information. Then the global maximum pooling is used to extract the key information to complete the semantic representation of the two questions.Finally, the similarity of the question pair is judged through the matching layer. The experimental results show that, based on the Quora Question Pairs data set, the accuracy of the question similarity calculation method is 83.57%, which is better than other methods.

A short text multi-label classification method combining similarity graph and random walk model

LI Xiao-hong, WANG Shan-shan, MA Yu-yin, MA Hui-fang

2021, 43(06): 1081-1087. doi:

Abstract ( 330 )

PDF (634KB) ( 305 ) 　　

A short text multi-label classification algorithm combining similarity graph and random walk model is proposed. Firstly, the sample data and labels are used as nodes to create a similarity graph, and the weight between the sample and the label is calculated with the help of an external know- ledge base to obtain the matching degree between the predicted sample and the label set. Secondly, the multi-label data are mapped into a multi-label dependency graph. A random walk is performed on the graph, and the previous matching degree is used as the initial prediction value to calculate the probability distribution of each node. When the probability distribution tends to be stable, the probability distribution of the node is the probability distribution of the label, and then the label set of the predicted text is determined. The experimental results show that the proposed method achieves better performance in the classification of multi-label texts. Compared with similar algorithms, the classification performance is significantly improved.

Speech separation of cooperative training generative adversarial networks under low SNR

WANG Tao, QUAN Hai-yan

2021, 43(06): 1088-1094. doi:

Abstract ( 199 )

PDF (738KB) ( 211 ) 　　

Improving the quality of separated speech under low SNR is the focus of speech separation technology research, while most methods still only train the features of the target speaker's speech under low SNR. Aiming at the shortcoming of current methods, a mixed speech separation method based on cooperative training generative adversarial networks(GAN) is proposed. In order to avoid the extraction of the complex acoustic feature, the generative model uses a fully convolutional neural network to directly extract the high-dimensional features of the time-domain waveform, and the discriminative model obtains the features of the interference speaker by constructing a binary classification convolution neural network. Then, the source of the separated information obtained by the system is no longer single. Experiments show that the proposed method can better recover the information of high-frequency components under low SNR, and the separation performance is better than that of the comparative methods on the two-speaker mixed speech data set.

A process variant merging method based on Petri nets

WANG Wu-song, FANG Huan, ZHENG Xue-wen

2021, 43(06): 1095-1103. doi:

Abstract ( 168 )

PDF (598KB) ( 204 ) 　　

In the process of business integration, it is often necessary to merge several existing processes to form new business processes that meet the actual requirements. How to recognize the common characteristics of existing business flows to eliminate process redundancy has great practical value. Therefore, we propose a process variant merging method based on Petri nets. Firstly,
the matching score algorithm is used to calculate matching scores of different combinations of process variants to select a pair of process variants with the highest matching scores. Secondly, the common part of the combination of process variants is extracted according to the proposed merging algorithm to create the corresponding replica, the difference between the process variants are merged using branchs of configurable connector generated. The merged process model can capture all the behaviors of the input model and trace the nodes in the model to determine which process variants the nodes come from.

Current Issue

Author center

Review center

Online journal