Computer Engineering & Science

A hierarchical hardware barrier synchronization design for many-core processors

ZANG Zhao-hu, LI Chen, WANG Yao-hua, CHEN Xiao-wen, GUO Yang

2022, 44(11): 1901-1908. doi:

Synchronization plays an important role in ensuring data consistency and correctness among the threads of a multicore processor. As the number of processor cores increases, so does the cost of synchronization. Barrier synchronization is one of the effective methods for multicore synchronization in parallel applications. Software synchronization methods typically require thousands of cycles to complete synchronization among multiple cores, and this high-latency, serialized synchronization can significantly degrade the performance of multicore programs. Compared with software barrier synchronization, a hardware barrier achieves lower synchronization latency, but a centralized hardware barrier has limited scalability and is difficult to adapt to multicore processor systems. This paper proposes a hierarchical hardware barrier mechanism, called HSync, for multicore processors. It consists of local and global barrier units, which work together to achieve fast synchronization with low hardware overhead. The experimental results show that, compared with the traditional centralized hardware barrier, the hierarchical hardware barrier mechanism improves the performance of the multicore processor system by 1.13 times and reduces network traffic by 74%.
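
As a software analogy of the local/global split described in this abstract, the following minimal sketch lets threads synchronize inside their cluster first and sends only one representative per cluster to a global barrier. The class name HierarchicalBarrier, the helper run_demo, and the cluster sizes are assumptions made for illustration; this is not the HSync hardware design.

```python
import threading


class HierarchicalBarrier:
    """Two-level barrier: local barriers per cluster plus one global barrier."""

    def __init__(self, num_clusters, cluster_size):
        # One local barrier per cluster (hypothetical stand-in for a local unit).
        self.local = [threading.Barrier(cluster_size) for _ in range(num_clusters)]
        # One global barrier with a single party per cluster.
        self.top = threading.Barrier(num_clusters)

    def wait(self, cluster_id):
        local = self.local[cluster_id]
        # Phase 1: wait until every thread of this cluster has arrived.
        # Barrier.wait() returns a distinct index in range(parties).
        index = local.wait()
        # Phase 2: only the thread that drew index 0 takes part globally.
        if index == 0:
            self.top.wait()
        # Phase 3: the other threads are held here until the representative
        # returns from the global barrier, which releases the whole cluster.
        local.wait()


def run_demo(num_clusters=2, cluster_size=3):
    """Return, for each thread, how many arrivals it saw after the barrier."""
    barrier = HierarchicalBarrier(num_clusters, cluster_size)
    arrived, seen_after = [], []
    lock = threading.Lock()

    def worker(cluster_id):
        with lock:
            arrived.append(cluster_id)
        barrier.wait(cluster_id)
        with lock:
            seen_after.append(len(arrived))

    threads = [threading.Thread(target=worker, args=(c,))
               for c in range(num_clusters) for _ in range(cluster_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen_after
```

In this sketch only one party per cluster reaches the global barrier, which is the intuition behind reducing traffic toward a central point; the actual savings depend on the hardware and network, which the sketch does not model.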

Optimization of dot product algorithms on FT-M7002

GUO Pan-pan, CHEN Meng-xue, LIANG Zu-da, MA Xiao-chang, XU Bang-jian

2022, 44(11): 1909-1917. doi:

Different types of dot product algorithms are optimized and implemented on the domestic high-performance FT-M7002 DSP platform, completing the technical chain of the mathematical library for this processor platform. Taking full advantage of the FT-M7002 kernel architecture, optimization methods such as SIMD vector parallelization, DMA dual-channel transmission and SVR transmission are applied to the dot product algorithms. The work fully exploits the vector parallelism of the program and effectively increases the data transmission speed and the program performance. The experimental results show that the average performance ratio of the different types of dot product algorithms after optimization to before optimization on the FT platform ranges from 12.4166 to 45.2338. Compared with the dot product functions of the dsplib library from the TI official website running on the TMS320C6678 processor, the average performance ratio of the FT platform to the TI platform ranges from 1.3716 to 4.5196. The results show that this DSP platform has a clear computational performance advantage over the mainstream TI platform.
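
To illustrate what vector parallelization of a dot product means in general terms, the sketch below keeps several independent partial sums, one per "lane", and combines them at the end. The function name dot_lanes and the lane count of 4 are assumptions for illustration; the code says nothing about the actual FT-M7002 instruction set, DMA, or SVR mechanisms.

```python
def dot_lanes(x, y, lanes=4):
    """Dot product accumulated in `lanes` independent partial sums."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    acc = [0.0] * lanes
    # Largest prefix whose length is a multiple of the lane count.
    n = len(x) - len(x) % lanes
    for i in range(0, n, lanes):
        # Each lane only touches its own accumulator, so on SIMD hardware
        # these multiply-adds could be issued as one vector operation.
        for j in range(lanes):
            acc[j] += x[i + j] * y[i + j]
    total = sum(acc)
    # Scalar tail for the elements that do not fill a whole vector.
    for i in range(n, len(x)):
        total += x[i] * y[i]
    return total
```

Because floating-point addition is not associative, the lane-wise result can differ from a strictly sequential sum in the last bits for general inputs.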

Stack overflow test for embedded operating systems based on behavior monitoring

YANG Xing-da, CHEN Can, FANG Ling,

2022, 44(11): 1918-1923. doi:

Stack testing is an important part of the security evaluation of embedded operating systems. A stack overflow overwrites the data in the adjacent stack, resulting in data corruption and system crashes. However, catching and locating stack overflows can be difficult. Firstly, the overflowing data may invade the private stack of another task in the operating system while the overflowing task itself shows no abnormal behavior, which makes it difficult to determine the root cause of the stack overflow. Secondly, a stack overflow may be exposed later than it occurs because of priority differences between operating system tasks. In this research, a dynamic stack test method based on real-time monitoring of stack allocation and recovery behavior is proposed. Firstly, instrumentation is inserted at the stack behavior test points to collect test codes from the stack under test. Then, an Upper Test (UT) is set up to analyze the test codes and report the test result, which makes it possible to capture and locate stack overflows in real time. In an actual test on a telematics terminal, this method located three stack overflows that caused the system to crash or reset, and the safety of the operating system stack was evaluated. In addition, according to the test results, the static allocation of stack sizes is optimized, which saves up to 42% of the stack space of a single task and compresses the total RAM used by tasks to 63% of the original.
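
The monitoring idea in this abstract can be sketched as a log of allocation and recovery events that an analysis step replays per task. The event format (task, delta_bytes), the function name find_overflows, and the example sizes are hypothetical and only stand in for the instrumentation and Upper Test described above.

```python
def find_overflows(events, stack_sizes):
    """Replay stack events and report where a task exceeds its stack size.

    events: sequence of (task, delta_bytes); a positive delta models an
            allocation, a negative delta models a recovery (assumed format).
    stack_sizes: mapping from task name to its statically allocated size.
    Returns (overflows, peak), where overflows is a list of
    (event_index, task, usage_bytes) and peak maps each task to its
    highest observed usage.
    """
    usage, peak, overflows = {}, {}, []
    for index, (task, delta) in enumerate(events):
        usage[task] = usage.get(task, 0) + delta
        peak[task] = max(peak.get(task, 0), usage[task])
        # Attribute the overflow to the task whose own usage crossed the
        # limit, even if the damage would surface in another task's stack.
        if usage[task] > stack_sizes[task]:
            overflows.append((index, task, usage[task]))
    return overflows, peak
```

The recorded peaks are also the kind of information that would allow static stack sizes to be tightened afterwards, as the abstract reports.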

Programmable quantum simulation of perfect state transfer in quantum spin chain with a quantum photonic chip

ZHAN Jun-wei, ZENG Ru, WANG Yi-zhi, XUE Shi-chuan, HUANG Guang-yao, WU Jun-jie

2022, 44(11): 1924-1931. doi:

In recent years, the physical realization of quantum computing has made rapid progress, and building quantum computing devices for practical use has become the focus of development. Compared with classical simulation, quantum simulation is a more efficient way to study the evolution of quantum spin systems. Perfect state transfer in a one-dimensional quantum spin chain has important research value in the fields of quantum communication and quantum computation. This paper proposes a programmable quantum simulation method for perfect state transfer based on continuous-time quantum walks of two photons, and experimentally demonstrates “periodic and mirror-symmetric” perfect state transfer of two excitations in an XY-type quantum spin chain under two special Hamiltonians with a quantum photonic chip. The results provide a practical and scalable scheme for simulating the dynamics of quantum spin systems.

An improved method for solving partial differential equations using deep neural networks

CHEN Xin-hai, LIU Jie, WAN Qian, GONG Chun-ye,

2022, 44(11): 1932-1940. doi:

Solving partial differential equations plays a vital role in numerical analysis in scientific and engineering fields such as computational fluid dynamics. Due to the multi-scale nature of physics and sensitivity to the quality of the discrete mesh, traditional numerical methods often require complex human-computer interaction and expensive meshing overhead, which limits their application to many real-time simulation and optimal design problems. This paper proposes an improved neural network-based method for solving partial differential equations, named TaylorPINN. It exploits the universal approximation theorem of neural networks and the function-fitting capability of the Taylor formula, and provides a mesh-free numerical solving process. Numerical experiments on the Helmholtz, Klein-Gordon, and Navier-Stokes equations demonstrate that TaylorPINN is able to approximate the underlying mapping between the coordinate inputs and the quantities of interest, yielding accurate predictions. Compared with the widely used physics-informed neural network method, TaylorPINN improves the prediction accuracy by a factor of 3 to 20 across different numerical problems.

A ciphertext verifiable attribute-based searchable encryption scheme for mobile terminals

NIU Shu-fen, ZHANG Mei-ling, ZHOU Si-wei, YAN Sen

2022, 44(11): 1941-1950. doi:

The data of lightweight devices are mostly stored on cloud servers. Because the cloud service is not completely trusted, and traditional single-keyword searchable encryption returns a lot of information irrelevant to the retrieved content, this paper proposes a ciphertext-verifiable attribute-based searchable encryption scheme for mobile terminals. The scheme uses CP-ABE technology to control access granularity and introduces a trusted third-party entity that verifies data integrity and helps users partially decrypt data. Under hardness assumptions, the scheme is proved to be indistinguishable under selective ciphertext-policy chosen-plaintext attacks and indistinguishable under chosen-keyword attacks. Theoretical analysis and numerical simulation show that this scheme is more efficient.

A satellite edge computing resource allocation and offloading algorithm with task dependence

FANG Hai, ZHAO Yang, GAO Yuan, YANG Xu

2022, 44(11): 1951-1958. doi:

To address the offloading decision problem of collaborative edge computing in integrated GEO and LEO satellite networks, an offloading decision algorithm for satellite network edge computing is proposed, which accounts for task dependencies and jointly handles computing resource allocation, wireless resource allocation and task scheduling. Firstly, the task offloading problem is modeled as a joint optimization problem that minimizes task delay and energy consumption. Then, energy consumption and delay are introduced into the definition of subtask priority, and a heuristic search for the offloading strategy is carried out based on dynamic priority. Radio resource allocation is also taken into account alongside the dependencies between subtasks. The simulation results show that, compared with existing work, the proposed algorithm shortens the task execution delay of GEO and LEO collaborative computing and reduces the power consumption of LEO satellites.
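
A priority-driven heuristic over dependent subtasks can be sketched as list scheduling: repeatedly pick the ready subtask with the highest priority and place it on the cheaper site. Everything specific here is an assumption for illustration, including the function name schedule, the weighted cost weight * delay + (1 - weight) * energy, the priority rule, and the site labels "leo" and "geo"; the abstract does not give the paper's actual formulas.

```python
def schedule(tasks, deps, weight=0.5):
    """Greedy list scheduling of dependent subtasks (illustrative only).

    tasks: name -> {site: (delay, energy)} for each candidate site.
    deps:  name -> set of predecessor names that must finish first.
    Returns a list of (task, chosen_site) in execution order.
    """
    def cost(option):
        delay, energy = option
        return weight * delay + (1.0 - weight) * energy

    done, order = set(), []
    while len(done) < len(tasks):
        # A subtask is ready once all of its predecessors are finished.
        ready = [t for t in tasks
                 if t not in done and deps.get(t, set()) <= done]
        if not ready:
            raise ValueError("dependency cycle detected")
        # Assumed priority: the ready subtask whose best-case cost is
        # largest goes first; the name breaks ties deterministically.
        chosen = max(ready,
                     key=lambda t: (min(cost(o) for o in tasks[t].values()), t))
        site = min(tasks[chosen], key=lambda s: cost(tasks[chosen][s]))
        order.append((chosen, site))
        done.add(chosen)
    return order
```

The priority is recomputed on every iteration over the current ready set, which is the sense in which it is "dynamic" in this sketch.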

A survey of backdoor implantation and detection techniques on deep neural network model

MA Ming-yuan, LI Hu, WANG Zi-bin, KUANG Xiao-hui

2022, 44(11): 1959-1968. doi:

As one of the representative technologies in the rapid development of artificial intelligence, deep neural networks are being applied more and more widely, and the security problems they bring have gradually attracted attention. Existing studies mainly focus on how to efficiently construct diverse adversarial samples to fool deep neural network models, and on how to detect adversarial samples and harden deep neural network models. However, as the development of deep neural network models increasingly relies on open-source datasets, pre-trained models, computing frameworks and other third-party resources, the risk of backdoors being implanted into models is increasing. Starting from each stage of the life cycle of deep neural network models, this paper summarizes the techniques and methods for backdoor implantation and detection in deep neural network models, compares and analyzes the main characteristics and applicable scenarios of the different methods, and discusses future development directions of the related technologies.

Indoor positioning of support vector machine optimized by firefly algorithm

ZHONG Chen, YU Xue-xiang, TAI Xiao-man, HAN Yu-chen, XIAO Xing-xing, LIU Qing-hua,

2022, 44(11): 1968-1975. doi:

To address the large positioning fluctuation caused by excessive redundant matching information in indoor positioning fingerprint databases, and the poor positioning timeliness caused by an excessive number of samples in the database, this paper proposes a new indoor positioning method based on a support vector machine (SVM) optimized by the firefly algorithm (FA). Singular Spectrum Analysis (SSA) is used to remove noise during data preprocessing, and the SVM parameters are optimized by FA to establish the indoor positioning regression model. The experimental results show that, compared with current indoor positioning methods, the FA-SVM algorithm converges quickly and improves indoor positioning accuracy and stability.

An analytical improvement of Tsai’s camera plane calibration algorithm

YAO Long-xing, HAN Jiang-tao, ZHANG Zhi-yi

2022, 44(11): 1976-1984. doi:

For the pinhole perspective projection model with radial distortion, a simple and fast camera calibration algorithm is proposed. On the premise that the image center point coincides with the center of the CCD or CMOS sensor, the algorithm separates the internal and external parameters of the camera from the distortion parameters of the camera model, so that the camera calibration can be carried out linearly. This avoids errors caused by nonlinear optimization, reduces the algorithm complexity, moderately improves the calibration accuracy, and saves calculation time. Firstly, the lens distortion coefficient is calibrated according to the principle of cross-ratio invariance under perspective projection. Then, according to the rotation and translation transformation relationships, the internal and external parameters of the camera are solved linearly by making full use of the radial distortion constraint, the orthogonality of the rotation transformation and the unique properties of the rotation matrix. Finally, experiments show that, compared with Tsai’s camera plane calibration algorithm, this algorithm saves about 35% in calibration time and improves accuracy by at least 15%.
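
The cross-ratio invariance mentioned in this abstract can be checked numerically: the cross-ratio of four collinear points is preserved by a projective map but not by a radial-style distortion, which is why it can serve as a constraint for estimating the distortion. The sketch below works on 1D coordinates along a line; the particular map coefficients and the distortion model x(1 + k x^2) with k = 1/10 are assumptions chosen for illustration, not values from the paper.

```python
from fractions import Fraction


def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four distinct collinear points."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def projective(x, p=2, q=1, r=1, s=3):
    """1D projective map x -> (p*x + q) / (r*x + s), with p*s - q*r != 0."""
    return (p * x + q) / (r * x + s)


def distort(x, k=Fraction(1, 10)):
    """Simple odd polynomial distortion x -> x * (1 + k * x**2)."""
    return x * (1 + k * x * x)


# Exact rational arithmetic avoids any floating-point doubt in the check.
points = [Fraction(0), Fraction(1), Fraction(2), Fraction(4)]
original = cross_ratio(*points)
after_projection = cross_ratio(*[projective(x) for x in points])
after_distortion = cross_ratio(*[distort(x) for x in points])
```

Here original and after_projection are equal, while after_distortion differs; the size of that deviation is the kind of signal a cross-ratio-based calibration can use.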

An automatic reading algorithm of pointer meter based on text feature and secondary correction

CHEN Kun-jian, LI Zhu, ZHOU Yi-sha, SHENG Qing-hua

2022, 44(11): 1985-1994. doi:

Existing pointer meter reading algorithms usually detect the scale marks of the meter to identify the value. However, in a meter image the scale marks contain few features, which makes them prone to misdetection. To solve this problem, a new automatic reading algorithm for pointer meters is proposed. The algorithm greatly improves the robustness of meter reading recognition by selecting image features over a larger area. Because the scale value text is common to various meters and has far more image features than the scale marks, the proposed algorithm uses the scale value text as the basis for recognition. Firstly, a convolutional neural network detects the scale value text in the meter image, and the position coordinates of the text are used to fit the center of the meter. A secondary image correction then converts the arc-shaped scale area into a horizontal straight-line area. At the same time, the recognized text values are used to improve the distance-based reading method. The method is compared with other reading algorithms, and the comparison experiments show that it has a high reading accuracy, with a reference error of less than 0.5%, and is more robust under complex shooting conditions.
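
Once the arc-shaped scale has been straightened, a distance-based reading amounts to interpolating between the two recognized scale values on either side of the pointer. The sketch below assumes linear interpolation along the horizontal axis and a reference error defined relative to the full-scale range; the function names and the input format are hypothetical.

```python
def read_meter(pointer_x, marks):
    """Interpolate the reading from recognized scale texts.

    pointer_x: horizontal position of the pointer in the straightened image.
    marks: list of (x_position, value) pairs for the recognized scale texts,
           with distinct x positions (assumed input format).
    """
    ordered = sorted(marks)
    for (x0, v0), (x1, v1) in zip(ordered, ordered[1:]):
        if x0 <= pointer_x <= x1:
            # Linear interpolation between the two neighbouring labels.
            return v0 + (v1 - v0) * (pointer_x - x0) / (x1 - x0)
    raise ValueError("pointer lies outside the labelled scale")


def reference_error(reading, true_value, full_scale):
    """Absolute error as a percentage of the meter's full-scale range."""
    return abs(reading - true_value) / full_scale * 100.0
```

For example, with labels 0, 20 and 40 at x = 0, 100 and 200, a pointer at x = 150 reads 30.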

Towards Anchor-free object detection with diverse receptive fields attention feature refinement network

ZHANG Hai-yan, FU Ying-na, DING Gui-jiang, MENG Qing-yan

2022, 44(11): 1995-2002. doi:

As one of the research hotspots of object detection, anchor-free detection abandons the large number of predefined box settings and predicts pixel by pixel. Even so, it does not deal well with overlapping objects. In addition, the network is weak at obtaining global information about the image, and receptive field mismatch occurs easily. Therefore, this paper proposes two modules: a diverse receptive field attention mechanism (DRAM) and a global context-guided feature fusion module (GCF). Extensive experiments on PASCAL VOC and MS COCO confirm the effectiveness of the method. Compared with the baseline FCOS, the proposed method improves the result on PASCAL VOC by 1.4 points and obtains a mAP of 42.8 on MS COCO. The detection performance is significantly better than that of many advanced algorithms.

Garbage detection based on Mask R-CNN

ZHANG Rui-ping, NING Qian, LEI Yin-jie, CHEN Bing-cai

2022, 44(11): 2003-2009. doi:

In recent years, people have paid more and more attention to garbage classification and recycling, but garbage classification consumes a lot of manpower and material resources, and sorting efficiency is low. To solve the problem that garbage detection methods based on rectangular bounding boxes are not effective enough in multi-class environments, a garbage detection method based on an improved Mask R-CNN is proposed. Instead of the traditional ResNet, this method uses an improved ResNeXt101 as the backbone network for feature extraction, which improves the accuracy of object detection and of background boundary segmentation. Experimental results show that the average classification accuracy of the proposed model is 91.1%, an improvement of 2.35% over the traditional Mask R-CNN model. Finally, an experimental comparison with currently popular object detection algorithms shows that the classification accuracy and segmentation accuracy of the proposed algorithm are excellent, which demonstrates the feasibility and effectiveness of the proposed method for the garbage detection task.

An adaptive filtering remote sensing image segmentation network based on attention mechanism

WU Cong-zhong, DONG Hao, FANG Jing

2022, 44(11): 2010-2018. doi:

Because of the large scale variations in remote sensing images, the large intra-class differences in the background, and the imbalance between foreground and background, it is difficult to segment small objects and object edges in remote sensing images. In convolutional neural networks, the aliasing effect caused by downsampling distorts and loses object information, a problem that is easily overlooked. At the same time, although dilated convolution captures rich receptive field information, interference from redundant background information remains. Accordingly, an adaptive filter segmentation network (ARGNet) based on an attention mechanism is proposed. Experiments on the DeepGlobe Road Extraction dataset and the Inria Aerial Image Labeling dataset show that the proposed network segments objects more accurately.

A low-light image enhancement method based on an end-to-end dual network

CHEN Qing-jiang, LI Jin-yang, QU Mei, HU Qian-nan

2022, 44(11): 2019-2026. doi:

Objective: Due to the uncertainty of the environment, captured images suffer from problems such as low brightness, low contrast and serious information loss. Moreover, images enhanced by existing algorithms tend to be overexposed and cannot meet the input requirements of computer vision tasks. Methods: To solve this problem, a low-illumination image enhancement method based on an end-to-end dual network is proposed, which consists of an Inception module and a URes-Net module. Firstly, low-illumination image samples are synthesized using Retinex theory, and then the dual network model is used for feature extraction, feature fusion and reconstruction. The parameters are continuously adjusted according to the loss on the test set to optimize the model. The resulting dual network model has strong low-illumination image enhancement ability. Results: The experimental results show that the mean values of PSNR and SSIM are 28.6598 dB and 0.8966 respectively, which are better than those of other advanced low-illumination image enhancement methods. Conclusion: Compared with other methods, this method significantly improves brightness and contrast, and the resulting images are more consistent with human visual perception.
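
For readers unfamiliar with the PSNR figure quoted in this abstract, the standard definition is PSNR = 10 log10(MAX^2 / MSE), where MSE is the mean squared error between the reference and the enhanced image. The sketch below assumes 8-bit grayscale images stored as nested lists with a peak value of 255; the function name psnr is chosen for illustration.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(reference) != len(enhanced) or any(
            len(r) != len(e) for r, e in zip(reference, enhanced)):
        raise ValueError("images must have the same shape")
    squared = [(r - e) ** 2
               for row_r, row_e in zip(reference, enhanced)
               for r, e in zip(row_r, row_e)]
    if not squared:
        raise ValueError("images must not be empty")
    mse = sum(squared) / len(squared)
    if mse == 0:
        # Identical images: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the enhanced image is closer to the reference; SSIM, the other metric reported, compares local structure instead and is not covered by this sketch.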

Application of convolutional neural network with covariance matrix in human activity recognition

QUAN Wei-ming, LIU Tian-yi, ZHANG Lei

2022, 44(11): 2027-2036. doi:

At present, deep learning plays an important role in various human activity recognition (HAR) tasks. However, activity data has the particular character of a time series and includes body movements. Existing convolutional neural networks (CNNs) cause the data to become highly correlated when performing convolution operations. Because each layer of the network affects the next, the recognition accuracy of the network is limited. To address this, this paper proposes an improved convolutional neural network with a covariance matrix for HAR scenarios. It builds a decorrelated network structure through matrix transformation to eliminate the correlation problem. When network performance is poor, the structure can replace the existing BN layer to normalize the data. Verification experiments are carried out on four public HAR datasets, and the proposed neural network is compared with a traditional CNN model and a BN-layer model. The results show that the improved neural network improves by 1% to 2% over previous deep learning networks, which demonstrates its effectiveness. Furthermore, the application is ported to a mobile terminal for real-time activity recognition.

Retinal vessel segmentation network with joint attention and Transformer

JIANG Yun, LIU Wen-huan, LIANG Jing

2022, 44(11): 2037-2047. doi:

Retinal vessel segmentation is critical in the diagnosis and treatment planning of many ocular diseases. Accurate segmentation of vascular features from retinal images remains particularly challenging for complex retinal structures and low-contrast fundus structures. A Joint Attention and Transformer Network (JAT-Net) for retinal vessel segmentation is proposed, which focuses on encoding local detail features by applying joint attention to the channel information and location information of the encoding-stage features. To achieve more accurate segmentation, a Transformer is used to enhance the ability to model long-distance contextual information and spatial dependencies. Retinal vessel segmentation experiments on the DRIVE and CHASE datasets yield accuracies of 0.9706 and 0.9774 and F1 scores of 0.8433 and 0.8154, respectively.
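
The accuracy and F1 figures reported in this abstract are standard pixel-level metrics derived from the counts of true and false positives and negatives. The sketch below uses the usual definitions, accuracy = (TP + TN) / total and F1 = 2TP / (2TP + FP + FN); the function name and the example counts are made up for illustration.

```python
def accuracy_and_f1(tp, fp, tn, fn):
    """Return (accuracy, F1) from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one pixel is required")
    accuracy = (tp + tn) / total
    denominator = 2 * tp + fp + fn
    # F1 is the harmonic mean of precision and recall; it is defined as 0
    # here when there are no positives at all.
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1
```

Because vessel pixels are a small minority of a fundus image, accuracy is dominated by the background, which is why F1 is usually reported alongside it.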

Multilingual offline handwritten signature recognition based on Gist and IPCA

HAN Hui, Mahpirat, Hornisa Mamat, ZHU Ya-li, Kurban Ubul,

2022, 44(11): 2048-2055. doi:

Because the effective strokes of offline handwritten signature images are generally sparse and the images contain large areas of uninformative white background, commonly used feature description methods produce a lot of redundancy in the extracted feature data, which affects recognition accuracy. Improving recognition accuracy then requires either a large amount of training data or the extraction and fusion of multiple features, which makes the computation difficult and reduces experimental efficiency because of the excessive amount and dimensionality of the feature data. Therefore, this paper proposes a multilingual offline handwritten signature recognition method based on the Gist and IPCA algorithms, which uses Gist features to focus on the overall layout and strokes of the image, and the batch processing ability of the IPCA algorithm to improve recognition performance and computational efficiency. Three experimental datasets (Chinese, English, and Uyghur) and an SVM classifier are used in the recognition experiments. The results show that the recognition accuracy on the three datasets is 97.97%, 98.43%, and 97.19% respectively, and the recognition accuracy on the mixture of the three datasets is 97.7%. Comparative analysis shows that the proposed method is clearly better than previous related research.

Research of single sample generative adversarial networks based on attention mechanism using linear layers

CHEN Xi, ZHAO Hong-dong, YANG Dong-xu, XU Ke-nan, REN Xing-lin, FENG Hui-jie

2022, 44(11): 2056-2063. doi:

At present, training generative adversarial networks on a single sample has become a focus of research. However, the problems that the model does not converge easily, that the structure of the generated images collapses, and that training is slow still need to be solved urgently. Researchers have proposed using a self-attention model in the generative adversarial network to capture a larger range of the sample and improve the quality of the generated images. It is found, however, that the traditional convolutional self-attention model wastes computing resources because of the redundancy of information in the attention map. A novel linear attention model is proposed, in which a double normalization method is used to alleviate the sensitivity of the attention model to input features, and a new single-sample generative adversarial network model is built using this attention model. In addition, the model uses a residual network and spectral normalization for stable training, reducing the risk of collapse. A large number of experiments show that, compared with existing training models, this model trains faster, generates higher-resolution images, and clearly improves the evaluation indicators.

A load forecasting method for power grid host based on SARIMA-LSTM model

WANG Kun, ZHENG Chen, ZHANG Li-zhong, CHEN Zhi-gang

2022, 44(11): 2064-2070. doi:

With the continuous development of smart grids, improving the prediction of the future operating status of information equipment and setting dynamic threshold intervals that adapt to data changes are major challenges for the IT operation and maintenance of power grids. To solve these problems, this paper proposes a combined time series forecasting model (SARIMA-LSTM). On the basis of the traditional seasonal ARIMA (SARIMA) model, the LSTM model from the field of deep learning is introduced, replacing the traditional error-fitting approach that compensates the prediction result with an error autoregression and suffers from low accuracy and poor results. With this model, the error fluctuation patterns that the traditional ARIMA model cannot capture can be learned, making up for its inability to predict nonlinear data. Finally, the experimental results show that, compared with the ARIMA model and the FAIRIMA model, the SARIMA-LSTM model achieves higher prediction accuracy when predicting actual grid memory load data.
