Computer Engineering & Science

A systolic array optimization strategy for switching matrix blocks in advance

JU Xin, CAO Ya-song, WEN Mei, WANG Zhi, FENG Jing

2023, 45(01): 1-9. doi:

The demand for hardware computing power in AI applications increases year by year, driving the evolution of AI accelerators towards higher performance. Research shows that the main computation in AI applications can be transformed into matrix multiplication, and the systolic array has become one of the mainstream matrix multiplication acceleration technologies because of its unique advantages for this operation. However, there is pipeline filling and emptying overhead when a matrix flows into and out of the systolic array, especially for a floating-point systolic array that supports training, whose MAC latency is greater than 1. Untimely switching between matrix blocks leads to a sharp drop in PE utilization. To solve these problems, a theoretical analysis based on typical application scenarios is conducted, and an early switching strategy between matrix blocks is proposed, which can accurately calculate the optimal switching time between matrix blocks in various situations. The design is implemented in RTL. Experimental results show that the hardware overhead of the optimized systolic array increases slightly, while performance improves in all scenarios.

A high precision and high stability oscillator based on RC charging time zero crossing invariance

YUAN Heng-zhou, SANG Hao, YAN Guang-da, FENG Jun, LIANG Bin, GUO Yang

2023, 45(01): 10-16. doi:

An RC charging time zero crossing invariant oscillator is designed to provide high precision and high stability clock signals that are insensitive to voltage and temperature. This paper analyzes and derives the property that the zero crossing time does not vary with the power supply voltage during RC charging, and adopts temperature compensation to keep the zero crossing time as independent of temperature as possible. Simulation results show that the oscillator outputs a stable 2 MHz clock, the frequency varies by less than 1% over a supply range of 2.5 V to 5.5 V and by less than 1% over a temperature range of -40 ℃ to 125 ℃, and the maximum current under PVT conditions is less than 150 μA.

A k-dominant skyline body parallel solving algorithm based on Flink

SUN Guo-zhang, HUANG Shan, ALKAM Zabibul, XU Hao-tong, DUAN Xiao-dong

2023, 45(01): 17-27. doi:

The k-dominated skyline algorithm weakens the domination relationship between data points and is more suitable for high-dimensional data. k-dominated skyline bodies are suitable for multiple users querying with the k-dominated skyline algorithm, but the existing solution algorithms need to be improved in terms of time efficiency and code scalability. Therefore, this paper proposes an optimization algorithm for solving k-dominated skyline bodies. The algorithm stores the candidate set and the intermediate set for each user separately. During the k-domination check, it stores the non-k-dominated skyline points of the candidate set in the intermediate set of the corresponding user, in the order in which data points appear in the two sets, so that the next user can filter and reuse them. This reduces the number of comparisons between data points, avoids repeated computation, and improves query efficiency. A multi-user k-dominated skyline body parallel solving algorithm is also proposed, which effectively reduces the comparison time of data points through the Apache Flink parallel processing framework. Theoretical analysis and experimental data show that the proposed algorithm is highly efficient and handles the multi-user k-dominated skyline problem well.
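As a rough illustration of the k-domination check that this abstract builds on, the following sketch uses the commonly cited definition: a point p k-dominates q if p is no worse than q in at least k dimensions and strictly better in at least one of them. It assumes that smaller values are better, and the function names `k_dominates` and `k_dominant_skyline` are hypothetical. This is a naive quadratic baseline, not the candidate-set and intermediate-set algorithm or the Flink parallelization proposed in the paper.

```python
def k_dominates(p, q, k):
    """Return True if p k-dominates q (assumption: smaller values are better)."""
    # Number of dimensions in which p is no worse than q.
    no_worse = sum(1 for a, b in zip(p, q) if a <= b)
    # A strictly better dimension is always also a "no worse" dimension,
    # so it can be included in the chosen k dimensions.
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse >= k and strictly_better


def k_dominant_skyline(points, k):
    """Naive O(n^2) baseline: keep points that no other point k-dominates.

    k-dominance can be cyclic, so each point is compared against all
    other points rather than only against the current candidates.
    """
    return [
        q
        for i, q in enumerate(points)
        if not any(k_dominates(p, q, k) for j, p in enumerate(points) if j != i)
    ]


points = [(1, 2, 3), (2, 1, 3), (3, 3, 1), (4, 4, 4)]
full_skyline = k_dominant_skyline(points, k=3)   # ordinary skyline for 3 dimensions
weak_skyline = k_dominant_skyline(points, k=2)   # relaxed domination
```

With k equal to the number of dimensions this reduces to the ordinary skyline. With smaller k, points can k-dominate each other in a cycle, which is why the relaxed skyline may shrink sharply or even become empty.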

Deep hierarchical attention matrix factorization

LI Jian-hong, SU Xiao-qian, WU Cai-hong

2023, 45(01): 28-36. doi:

Matrix factorization is widely used in personalized recommendation because of its strong rating prediction ability, so many models based on matrix factorization have been designed to improve recommendation performance. However, the limited ability of these models to mine users' potential preference information results in unsatisfactory recommendations. In order to mine user preferences and obtain better recommendations, a Deep Hierarchical Attention Matrix Factorization method (DeepHAMF) is proposed. Firstly, the original data is input into the multi-layer perceptron, and the self-attention mechanism is also used to encode the input to the multi-layer perceptron, which aims to capture the original preference information. This part is called the self-attention layer. Secondly, the original matrix factorization results and the matrix factorization results after the attention operation are each fused with the output of the multi-layer perceptron by an attention mechanism, so that the users' preference information can be fully mined. This part is called the hierarchical attention layer. Finally, the results of the self-attention and hierarchical attention layers are fitted by the residual network module. Experimental results on public rating data sets show that DeepHAMF outperforms existing rating prediction algorithms.

Intelligent partitioning of airfoil surface flow field data of aircraft

HU Yue-di, SU Xiang, LI Nan, ZHANG Li-mei

2023, 45(01): 37-45. doi:

Advances in the automatic post-processing analysis of aircraft airfoil CFD simulation results can effectively improve the efficiency of product design. Therefore, an intelligent partitioning method for airfoil surface flow field data is proposed, which can effectively obtain the airfoil surface flow field partitioning results. Firstly, the airfoil dataset is obtained by modifying the aerodynamic shapes in batch mode with parameterization, and numerical simulation is conducted to generate flow field calculation results. Secondly, the conformal geometry method is adopted to reduce the dimension of the surface flow field data, and the data is resampled and converted to matrix form so that it can be used as the standard input of the prediction model. Thirdly, a convolutional neural network model is built to train on and predict the flow field data. Finally, the partitioning results are resampled to the airfoil surface by inverse mapping. Experiments show that the proposed intelligent partitioning method can efficiently partition the flow field data on the airfoil surface for different physical quantities, with an accuracy of more than 92% on the test data set.

A multi-constrained QoS dual path routing optimization algorithm based on software defined network

GOU Ping-zhang, MA Lin, GUO Bao-yong, YUAN Chen

2023, 45(01): 46-56. doi:

In order to solve the problems of high routing algorithm complexity, low QoS flow satisfaction, and single link failure in the current software-defined network (SDN) architecture, a multi-constraint QoS dual-path routing optimization algorithm based on software-defined network (SDN_MCQDP) is proposed. The controller is used to obtain the global network state information and generate a directed acyclic graph based on the destination node. In the multi-constraint QoS routing stage, the multi-constraint problem is transformed into a linear programming problem by the Lagrangian relaxation dual algorithm. The reverse link is used to delete redundant dual-path links that meet multi-constraint QoS and to ensure data transmission after link failure. The algorithm is simulated and analyzed in terms of routing calculation time, link utilization, and QoS flow satisfaction. The results show that, compared with the MODLARAC, QT, RMCDP_RD, and H_MCOP algorithms, SDN_MCQDP can effectively reduce the transmission delay and route calculation time, improve the link utilization, and still meet the QoS requirements after link failure.

An anti-counterfeiting pattern based on multilevel blocks with encryption and scrambling

ZHOU Cheng-zhuo, ZHENG Hong, WANG Tian-yu, LIU Chang

2023, 45(01): 57-65. doi:

Two-dimensional code authentication based on anti-counterfeiting techniques has received growing attention from academia and industry. However, existing anti-counterfeiting schemes based on two-dimensional codes have drawbacks such as poor identification experience, high cost, and inadequate anti-counterfeiting ability, and the anti-counterfeiting procedures take place only after sale. Therefore, a novel anti-counterfeiting pattern based on multilevel blocks with encryption and scrambling is proposed. Firstly, the first level expansion is performed to replace every pixel in the original binary message image by a pixel block with a specific size. Secondly, the obtained image is encrypted based on a Logistic chaotic sequence. Thirdly, the image is scrambled by the Arnold transformation. Finally, the second level expansion is performed to replace every pixel in the generated image by a pixel block with a specific size and gray value, so as to form a complex and chaotic anti-counterfeiting pattern. The printed image is decoded, and the readability of the recovered secret message is used to distinguish whether the pattern was printed for the first time or the second time. Experimental results show that the proposed anti-counterfeiting pattern has good confidentiality, a good ability to distinguish commonly used forgery means, a smaller physical size, and a relatively convenient identification procedure.
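The encryption and scrambling steps named in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: it XORs a square binary image with a bit stream thresholded from the logistic map x_{n+1} = r * x_n * (1 - x_n), then applies the standard Arnold cat map (x, y) -> (x + y, x + 2y) mod N. The parameter values (`x0`, `r`, `burn_in`, `rounds`), the threshold of 0.5, and all function names are hypothetical, and the two block expansion levels described in the paper are not modelled.

```python
def logistic_bits(count, x0=0.3141, r=3.99, burn_in=100):
    """Generate `count` key bits by thresholding a logistic-map orbit at 0.5."""
    x = x0
    for _ in range(burn_in):          # discard the transient part of the orbit
        x = r * x * (1.0 - x)
    bits = []
    for _ in range(count):
        x = r * x * (1.0 - x)
        bits.append(1 if x > 0.5 else 0)
    return bits


def xor_with_logistic(img, x0=0.3141, r=3.99):
    """XOR an N x N binary image with the logistic key stream (self-inverse)."""
    n = len(img)
    key = logistic_bits(n * n, x0, r)
    return [[img[i][j] ^ key[i * n + j] for j in range(n)] for i in range(n)]


def arnold(img, rounds=1):
    """Scramble pixel positions with the Arnold cat map."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img


def arnold_inverse(img, rounds=1):
    """Undo `rounds` applications of the Arnold cat map."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(2 * x - y) % n][(y - x) % n] = img[x][y]
        img = out
    return img


message = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
pattern = arnold(xor_with_logistic(message), rounds=3)
recovered = xor_with_logistic(arnold_inverse(pattern, rounds=3))
```

Because the XOR step is its own inverse and the cat map is a permutation of pixel positions, decoding applies the inverse map first and the same key stream second.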

An encrypted knowledge graph storage and retrieval scheme based on searchable encryption

LIN Qing, TENG Fei, TIAN Bo, ZHAO Yue, ZHU Jin-ye, FENG Li

2023, 45(01): 66-76. doi:

With the rapid development of cloud computing, knowledge graph data outsourcing has become a popular trend. Knowledge graphs in many fields, such as medicine and finance, are privacy-sensitive. However, cloud servers are not completely trustworthy. In order to protect the confidentiality and integrity of data on cloud servers, encryption and other methods are used to protect the security of knowledge graph data. This paper proposes an encrypted knowledge graph storage and retrieval scheme based on searchable encryption, which can effectively protect the confidentiality and integrity of data and support retrieval on encrypted data. The scheme fully considers the need for sequential reading of knowledge graph entities and their relationships, thereby optimizing the encrypted index design and improving retrieval efficiency. Experimental results show that the average query time of the one-hop subgraph of the encrypted knowledge graph is 2.09 times that of the non-encrypted knowledge graph, which verifies that the scheme achieves a good balance between security and query efficiency.

Overview of RFID security authentication protocols

KOU Guang-yue, WEI Guo-heng, PING Yuan, LIU Peng

2023, 45(01): 77-84. doi:

With its lightweight advantage, RFID technology plays an important role in the development of Internet of Things (IoT) systems. At the same time, RFID authentication protocols are subject to security threats due to physical limitations. The current mainstream RFID authentication protocols are surveyed and divided into ultra-lightweight, lightweight, middleweight, and heavyweight security authentication protocols according to the magnitude of the encryption algorithm. The security problems of typical security authentication protocols are analyzed, and the security and performance indicators of the improved protocols proposed in recent years are discussed and compared according to the magnitude. Finally, the possible development direction of RFID authentication protocols is discussed.

Specific emitter identification of LightGBM based on ant colony parameters optimization

GU Chu-mei, CAO Jian-jun, WANG Bao-wei, XU Yu-xin

2023, 45(01): 85-94. doi:

In order to improve the accuracy and efficiency of specific emitter identification, a specific emitter identification method of LightGBM based on ant colony parameters optimization is proposed. The lifting wavelet packet transform is used to extract the characteristics of the emitter signal data and construct the characteristic parameter system. The obtained characteristic data set is processed by Z-score standardization. With the objectives of maximum classification accuracy and minimum feature subset size, a mathematical model of LightGBM parameter optimization and feature selection is established. Ant colony optimization is used to optimize six parameters of LightGBM (minimum leaf node data volume, number of decision trees, learning rate, L1 regularization weight, L2 regularization weight, and minimum leaf node sample weight sum). The importance value of each feature is obtained from the optimized LightGBM, and the sequential backward search strategy is used for feature selection. The identification of emitter signals is realized through the LightGBM classifier. Experimental results show that the recognition accuracy of the proposed method is better than that of the comparative feature selection methods (GBDT, XGBoost, and LightGBM) on signal data sets with no noise, a signal-to-noise ratio of 10 dB, and a signal-to-noise ratio of 5 dB. At the same time, the reduction of feature dimension also improves the computational efficiency of specific emitter identification.

A tennis action recognition and evaluation method based on PoseC3D

ZHOU Sheng-ru, CHEN Zhi-gang, DENG Yi-qin

2023, 45(01): 95-103. doi:

To accurately recognize and evaluate tennis actions, this paper combines computer vision with tennis domain knowledge and proposes a tennis action recognition and evaluation method based on PoseC3D. Firstly, a pose estimation model based on ResNet-50 is used to detect human targets in tennis video and extract skeleton keypoints. Secondly, the PoseC3D model is trained on a video data set collected on a professional tennis court, so that it can classify the sub-actions of tennis. Thirdly, the dynamic time warping algorithm is used to evaluate the classified actions. Finally, extensive experiments are carried out on the collected video data set. The results show that the Top-1 accuracy of the proposed tennis action recognition method based on PoseC3D reaches 90.8%. Compared with methods based on graph convolutional networks, such as AGCN and ST-GCN, it has stronger generalization ability. Moreover, the proposed scoring algorithm based on dynamic time warping can give real-time and accurate evaluation scores for the corresponding actions after action classification, reducing the workload of tennis teachers and effectively improving the quality of tennis teaching.
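For readers unfamiliar with dynamic time warping, the sketch below shows the classic dynamic-programming formulation on two one-dimensional sequences. It is a minimal illustration only: the absolute-difference local cost, the use of scalar sequences instead of skeleton keypoint trajectories, and the function name `dtw_distance` are assumptions, and the paper's actual scoring rule is not described in the abstract.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Local cost is the absolute difference between samples (an assumption);
    each step may advance in a, in b, or in both.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


reference = [0.0, 1.0, 2.0, 1.0, 0.0]        # e.g. a joint angle over time
slower = [0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0] # same motion, performed more slowly
distance = dtw_distance(reference, slower)
```

The point of the warping is that a motion performed at a different speed still aligns with the reference at zero cost, so the distance reflects differences in shape rather than in timing.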

A generation and computation algorithm of Lupaş q-Bézier curve via discrete convolution

GENG Meng-yuan, XIE Bin, HAN Li-wen

2023, 45(01): 104-112. doi:

The Lupaş q-Bernstein operator is the first proposed q-integer-based q-analogue of the Bernstein operator in rational form. By using the recurrence formulas in reverse as a pyramid algorithm, the nth degree Lupaş q-Bernstein basis function sequence is generated via discrete convolution. Owing to the commutativity of discrete convolution, for each Lupaş q-Bézier curve of degree n, the hodograph and the collection of n! recursive evaluation algorithms are derived. Unlike the tangent point obtained by the de Casteljau algorithm for a Bézier curve, the de Casteljau algorithm for a Lupaş q-Bézier curve obtains a point on the curve that is one of the two cut points where the line intersects the curve. For quadratic Lupaş q-Bézier curves, necessary and sufficient conditions for computing the left and right cut points are obtained. In addition, a dual cut point algorithm is proposed that computes the left and right cut points simultaneously.

On minimum summary representing sets in general graphs

ZHONG Hao, CHEN Wei-dong

2023, 45(01): 113-118. doi:

In general graphs, the similarity between any two nodes is usually characterized based on the topological structure of the graph. Based on node similarity, this paper proposes a concept named the summary representing set. Finding a summary representing set with the minimum number of nodes in a graph is called the minimum summary representing set problem. It is proved that the minimum summary representing set problem is NP-hard, which means that an exact polynomial-time algorithm is unlikely to exist. A polynomial-time greedy approximation algorithm based on a submodular function is proposed to solve the minimum summary representing set problem approximately.
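The abstract does not give the formal definition of a summary representing set, so the following is only a generic sketch of how a greedy algorithm for a coverage-style submodular objective typically looks. It assumes a hypothetical mapping `represents` from each node to the set of nodes it is similar enough to represent (including itself), and repeatedly picks the node with the largest marginal gain. The paper's actual objective and similarity measure may differ.

```python
def greedy_representing_set(represents):
    """Greedy cover: pick nodes until every node is represented.

    `represents[v]` is the set of nodes that v can represent
    (a hypothetical input; each node is assumed to represent itself).
    """
    uncovered = set(represents)
    chosen = []
    while uncovered:
        # Marginal gain of v = number of still-uncovered nodes it represents.
        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(represents), key=lambda v: len(represents[v] & uncovered))
        gain = represents[best] & uncovered
        if not gain:
            raise ValueError("some nodes cannot be represented by any node")
        chosen.append(best)
        uncovered -= gain
    return chosen


similar = {
    "a": {"a", "b", "c"},
    "b": {"a", "b"},
    "c": {"a", "c"},
    "d": {"d", "e"},
    "e": {"d", "e"},
}
summary = greedy_representing_set(similar)
```

For coverage objectives of this kind, the marginal gain of a node can only shrink as more nodes are chosen, which is the diminishing-returns property that makes the greedy choice a reasonable approximation strategy.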

A multi-factor human body surface area calculation model based on deep feedforward neural network

WANG Yu-lu, LI Fei, YANG Zhen, HUANG Shan, ZHANG Gang, ZHAN Shu

2023, 45(01): 119-126. doi:

Human body surface area (BSA) plays a crucial role in clinical medicine, but most of the existing BSA formulas use only two parameters, height and weight, and estimate the body surface area by fitting a simple function. Doctors also report that the existing clinical BSA formulas have a large calculation error. To solve these problems, a new BSA regression prediction model is proposed. The regression model consists of two parts: firstly, the factors highly correlated with body surface area are selected by correlation and significance analysis; secondly, a regression model is constructed by training a deep feedforward neural network with 104 sets of human body data. Two verification methods, 5-fold cross-validation and an independent test set, are adopted in the experiments. Firstly, the accuracy of the deep feedforward neural network model and of the traditional body surface area formulas is evaluated, and the results are compared and analyzed. Secondly, the accuracy of the deep feedforward neural network model and of three other algorithm models is evaluated, and the results are compared and analyzed. The determination coefficient of the deep feedforward neural network model is higher than that of both traditional methods, six percentage points above the better of the two, and its error is nearly half that of the traditional method. Compared with the three algorithm models, the deep feedforward neural network improves the determination coefficient by two percentage points and reduces the error. The consistency analysis also shows that the 95% consistency limit of the deep feedforward neural network is the smallest and its consistency is the best. These experiments show that the proposed regression framework calculates body surface area better and obtains more accurate predictions.
Key words: body surface area; deep feedforward neural network; regression; prediction; cross-validation

An image recognition model for minor and irregular damage on metal surface based on attention mechanism and deformable convolution

DENG Zhong-gang, DAI Gang, WU Xiang-ning, DENG Yu-jiao, WANG Wen, CHEN Miao, TU Yu, ZHANG Feng, FANG Heng

2023, 45(01): 127-135. doi:

For the detection of minor damage on metal surfaces, traditional target recognition algorithms generalize poorly, general detection algorithms based on deep convolutional neural networks tend to lose the features of small targets, and the traditional square convolution kernels used by these algorithms are not suitable for irregular damage such as long strips. To solve these problems, a cascade neural network target detection model based on attention mechanism and deformable convolution, called ADC-Mask R-CNN, is proposed. The model embeds channel domain attention and spatial domain attention in the ResNet101 backbone network to enhance the detection of minor damage targets, and uses deformable convolution and deformable region of interest pooling to improve the detection of irregular damage. In addition, the detection results are further optimized by cascaded networks. Comparative experiments on metal surface damage data sets show that the ADC-Mask R-CNN model improves the detection performance for minor irregular damage on metal surfaces.

Multi-task deep spatial-temporal network for couriers' pick-up arrival time prediction

WANG Chen-yu, WEN Hao-min, GUO Sheng-nan, LIN You-fang, WAN Huai-yu

2023, 45(01): 136-144. doi:

Predicting a couriers pick-up arrival time, i.e., estimating the arrival time of the courier after a user places a package pick-up order, is a fundamental task in logistics platforms. Accurate pick-up arrival time prediction can optimize logistics efficiency and improve user experience. The problem faces the following challenges: 1) The couriers pick-up arrival time is affected by a variety of complex spatiotemporal factors, including the spatiotemporal characteristics of the target order, as well as the correlations between the target order and other unpicked-up orders. 2) During the execution of the pick-up task, new orders will be continuously assigned to the courier by the system, resulting in changes in the package pick-up route, which brings great difficulties to the arrival time prediction. In response to the above challenges, a multi-task deep spatial-temporal network (named MSTN4PAT) is proposed to accurately predict package pick-up arrival time, which learns the complex spatiotemporal patterns of couriers pick-up arrival time from massive historical data. MSTN4PAT fully exploits the intrinsic relationship between the origin and destination of the target order, uses multi-task learning to model the interaction between orders, and efficiently integrates various features from the perspectives of feature width and feature depth to achieve accurate arrival time predictions. The experimental results on two real-world datasets show that MSTN4PAT significantly outperforms other comparative medels.

An adaptive fast whale optimization algorithm

YANG Bing-yuan, YUAN Jie, GUO Yuan-yuan

2023, 45(01): 145-153. doi:

An adaptive fast whale optimization algorithm (AWOA) is proposed to solve the problems of insufficient local search ability and slow convergence rate of the standard whale optimization algorithm. The algorithm adaptively selects global search or local search according to the degree of individual distribution and achieves a dynamic balance between them. Lévy flight is introduced for the secondary optimization of individuals that deviate strongly from the average position of the sample, to further expand the search area and ensure the global search ability of the algorithm. Standard test functions are used to show that AWOA has a high convergence rate and stability. AWOA is applied to path planning for unmanned vehicles. The simulation results show that it has stable local exploitation and global exploration capabilities.
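The Lévy flight perturbation mentioned in this abstract is commonly generated with Mantegna's algorithm, which draws a heavy-tailed step as u / |v|^(1/beta) from two Gaussian samples. The sketch below shows that generic technique. The stability index `beta = 1.5`, the `scale` factor, and the `perturb` helper that moves an individual relative to a reference position are all assumptions for illustration and are not taken from the paper.

```python
import math
import random


def levy_sigma(beta=1.5):
    """Standard deviation of u in Mantegna's algorithm for index beta."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step: u / |v|^(1/beta), u ~ N(0, sigma^2), v ~ N(0, 1)."""
    u = rng.gauss(0.0, levy_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:                  # guard against a (practically impossible) zero draw
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def perturb(position, reference, scale=0.01, beta=1.5, rng=random):
    """Hypothetical update: jump each coordinate relative to a reference point."""
    return [
        x + scale * levy_step(beta, rng) * (x - ref)
        for x, ref in zip(position, reference)
    ]


rng = random.Random(0)
moved = perturb([1.0, 2.0], [0.0, 0.0], rng=rng)
```

Most steps drawn this way are small, but occasional very large jumps let an individual escape a local optimum, which is the usual motivation for adding Lévy flight to a swarm algorithm.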

Entity disambiguation of Chinese short text using graph model based on topic relations

MA Ying-chao, ZHANG Xiao-bin

2023, 45(01): 154-162. doi:

As an important supporting technology for applications such as knowledge base construction and information retrieval, entity disambiguation plays an important role in the field of Natural Language Processing (NLP). However, in the short text environment, it is difficult for entity disambiguation to extract sufficient context features for disambiguation. Aiming at the characteristics of short texts, this paper proposes a disambiguation method of graph models based on entity topic relations. This method uses TextRank algorithm to infer the topic of corpus constructed by knowledge base information, and uses the result of topic inference as the representation of relationship between entities. By combining the disambiguation score given by the semantic matching model based on BERT, the disambiguation network graph is constructed, and the final disambiguation result is obtained through search and sorting. The data set provided in the short text entity link task of CCKS2020 is used to evaluate the method. The experimental results show that the proposed method is better than other entity linking methods in entity disambiguation of short text, and can effectively solve the entity disambiguation problem of Chinese short text.

Image copy-move forgery detection based on deep feature extraction and DCT transform

WEI Wei-yi, ZHAO Yi-fan, CHEN Guo

2023, 45(01): 163-170. doi:

To address the problems of high time complexity and incomplete localization of tampered regions in image copy-move forgery detection, an image copy-move forgery detection algorithm based on deep feature extraction and discrete cosine transform is proposed. Firstly, the color and texture information of the image are fused to obtain a four-channel image, the adaptive feature extraction threshold is calculated, and deep features are extracted by a feature detector based on a fully convolutional neural network. Secondly, the discrete cosine transform is used to extract block features for preliminary matching, and point feature vectors are adopted to eliminate mismatches. Finally, the tampered regions are accurately located by a convolution operation. Verification on a public dataset demonstrates the advantages of the algorithm in detection efficiency and in the completeness of the located regions.

A hybrid model for event trigger word extraction

YANG Hao, ZHAO Gang, WANG Xing-fen

2023, 45(01): 171-180. doi:

Structural grammatical features and semantic features of events each have their own advantages, and integrating the two helps to represent event trigger words accurately and to complete event trigger word extraction. However, existing feature-based, structure-based, and neural network model-based extraction methods can only capture partial features of events and cannot accurately represent event trigger words. In order to solve these problems, a hybrid model combining event structural grammatical features with event semantic features is proposed to complete the task of event trigger word extraction. The hybrid model first integrates the sentence dependency syntax information into the initial vector model, so that the initial vector incorporates the event structural grammatical features. Then, the initial vector is successively fed into the CNN and BiGRU-E-attention models of the neural network model, and the multi-dimensional semantic features of events are captured. The model also fuses the event structural grammatical features with the event semantic features, and finally completes the extraction of event trigger words. Experimental results on the CEC Chinese Emergency Corpus show that the hybrid model improves the F values in the position recognition and classification tasks of event trigger words by 0.86% and 4.07%, respectively, compared with the baseline model. Experimental results on the ACE2005 English corpus show that the hybrid model improves the F values in the position recognition and classification tasks of event trigger words by 1.4% and 1.5%, respectively, compared with the baseline model. The experimental results show that the hybrid model achieves excellent results in the task of event trigger word extraction.
