Computer Engineering & Science

Optimization and parallelization of spectral method for solving underwater acoustic propagation

MA Xian, WANG Yong-xian, ZHU Xiao-qian, TU Hou-wang, LI Peng, YAN Kai-zhuang

2022, 44(3): 381-389. doi:

The efficiency of numerical calculation of underwater acoustic propagation is a key factor in many applications of underwater acoustics. As a numerical method for solving differential equations, the spectral method offers high accuracy and fast convergence, so using the normal-mode spectral method to solve underwater acoustic propagation equations has attracted the attention of many scholars in recent years. However, the spectral method is computationally expensive, and its efficiency still falls short of real-time requirements for large-scale underwater acoustic propagation problems. It is therefore necessary to use a high-performance computing system to optimize and parallelize a typical spectral-method program for underwater sound propagation. Firstly, the calculation flow and hotspot functions of the program are analyzed. Then, optimization through compiler options, calls to the high-performance math library MKL, memory access optimization, and reduction of the amount of computation are studied. Finally, multi-threaded parallel acceleration is carried out on a many-core high-performance computing platform. Tests on the Tianhe-2 many-core platform show that, for the deep-sea waveguide example, the running time of the final parallel optimized version is reduced from 584 seconds for the original serial version to 24 seconds, a reported speedup of 23.98 times. The reduction in calculation time verifies the validity of the method and is of great significance for calculating underwater sound fields in the ocean. Further analysis shows that these optimization and parallelization methods can also serve as a reference for other scientific and engineering numerical problems of the same type on the same platform.

Research on PCIe topology application of AI server

LIN Kai-zhi, ZONG Yan-yan, SUN Long-ling

2022, 44(3): 390-395. doi:

The CPU+GPU architecture is widely used in AI servers to meet data collection and processing requirements in big data, cloud computing, artificial intelligence and other fields. The commonly used CPU+GPU PCIe topologies include Balance Mode, Common Mode, and Cascade Mode. Because application scenarios are complex and diverse, the applicability of each topology needs to be studied against practical requirements. Firstly, the architecture of the three topologies is briefly introduced. Then, experiments are designed, and the applicability of the three topologies is analyzed in terms of point-to-point bandwidth and latency, deep learning performance, and double-precision floating-point performance. Finally, guidance is provided for selecting the PCIe topology of an AI server in practical applications.

A PCB routing resistance calculation method based on machine learning

LIU Guo-qiang, ZHAO Zhen-yu, ZHAO Chen-yu, HAN Ao, YANG Tian-hao

2022, 44(3): 396-402. doi:

In the field of FPD, the routing between FPC ports and IC ports is called PCB routing. Depending on factors such as the shape of the routing area, the interconnect width, and the spacing between interconnects, PCB routing may be regular or irregular, which makes it very difficult to calculate interconnect resistance accurately and quickly. The existing resistance calculation method can compute the resistance of PCB routing of arbitrary shape from the coordinates of the routing's inflection points, but it has very large time and space overhead, which seriously hinders design convergence, and it cannot make effective use of historical routing data. A machine-learning-based method for calculating PCB routing resistance is studied for the first time. Firstly, PCB routing of arbitrary shape is divided into several consecutive quadrilaterals. Secondly, the resistance of each quadrilateral is predicted using the established quadrilateral resistance model. Finally, the resistance values of all quadrilaterals are summed to obtain the resistance of the PCB routing. This “division-prediction-calculation” method computes the resistance of PCB routing of arbitrary shape efficiently and accurately. Compared with the traditional method, the average absolute error of the proposed method is only about 1 ohm, and the memory cost and time cost are reduced by 60.9% and 97.9%, respectively.
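The divide, predict, and sum flow described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the trace is assumed to be given as two equally long lists of boundary points, and `predict_quad_resistance` is a hypothetical stand-in that uses a simple sheet-resistance estimate (length over mean width) in place of the paper's trained model, which the abstract does not specify.

```python
from math import hypot


def split_into_quads(left, right):
    """Pair consecutive boundary points into quadrilaterals.

    left/right are the two edges of the trace as lists of (x, y) points,
    an assumed input format for this sketch.
    """
    if len(left) != len(right) or len(left) < 2:
        raise ValueError("edges need the same number (>= 2) of points")
    return [(left[i], left[i + 1], right[i + 1], right[i])
            for i in range(len(left) - 1)]


def predict_quad_resistance(quad, sheet_resistance=0.05):
    """Hypothetical stand-in for the learned per-quadrilateral predictor.

    Estimates R = Rs * length / mean width. The default sheet resistance
    (ohms per square) is an arbitrary illustrative value.
    """
    a, b, c, d = quad  # a, b lie on the left edge; d, c on the right edge
    mid_start = ((a[0] + d[0]) / 2, (a[1] + d[1]) / 2)
    mid_end = ((b[0] + c[0]) / 2, (b[1] + c[1]) / 2)
    length = hypot(mid_end[0] - mid_start[0], mid_end[1] - mid_start[1])
    width = (hypot(a[0] - d[0], a[1] - d[1])
             + hypot(b[0] - c[0], b[1] - c[1])) / 2
    return sheet_resistance * length / width


def routing_resistance(left, right, predictor=predict_quad_resistance):
    """Sum the predicted resistances of all quadrilaterals in the trace."""
    return sum(predictor(q) for q in split_into_quads(left, right))
```

Passing a different `predictor` callable is where a trained regression model would plug in; the splitting and summation steps stay the same.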

Microprogram control based on OpenVX parallel processor

ZHANG Ke, LI Tao, XING Li-dong

2022, 44(3): 403-410. doi:

To address the large internal storage and complex transfer steps of the microcontroller in an OpenVX parallel processor, an associative memory is used to generate the initial microprogram address, which improves the utilization of the control memory, and the execution conditions of each operation are grouped following the idea of grouped parallel judgment, which speeds up the generation of transfer addresses. The design is verified by mapping five types of OpenVX image processing functions onto the processor. The experimental results show that the structure and method increase the utilization of the control memory by 38.7% and reduce the average number of transfer steps by 50%. Finally, the microprogram is optimized to make the whole system more efficient.

Structure and phase noise of feedforward ring oscillator

SANG Hao, YUAN Heng-zhou, LIANG Bin, CHEN Jian-jun, GUO Yang

2022, 44(3): 411-416. doi:

Based on the requirements of the PLL and clock data recovery circuits in SerDes systems, the structure and working principle of the feedforward ring oscillator are discussed. Building on the traditional structure, the feedforward path is coupled to the source of the main-path inverter, which improves the edge rate of the output signal. Analysis based on the impulse sensitivity function of the Hajimiri model shows that the proposed structure effectively reduces the contribution of thermal noise and flicker noise. Single-source and dual-source feedforward ring oscillators are designed in a 28 nm CMOS process. The simulation results show that, at an oscillation frequency of 2.5 GHz, the phase noise of the two new structures is -99 dBc/Hz@1 MHz and -105 dBc/Hz@1 MHz, respectively, and the FoM is 163 dBc/Hz and 164 dBc/Hz, respectively.

Comparison among embedded system security protection schemes and their application case analysis

CHEN Xiang-guo, SHANG Fan, SONG Jun-qiang

2022, 44(3): 417-426. doi:

With the development of Internet of Things technology, the security protection of embedded systems has become a systemic problem that urgently needs to be addressed. A variety of security protection methods for embedded computer systems are compared, and the ARM TrustZone security protection scheme is analyzed. The main differences between the two TrustZone technologies (TrustZone-A and TrustZone-M) are analyzed, and the applicable scenarios and implementation characteristics of each are given. The implementation principle of a trusted boot process based on TrustZone technology is described, and the feasibility of detecting abnormal programs using TrustZone-A monitor mode code is discussed. Finally, for typical application scenarios, a security protection scheme based on TrustZone-M technology and an implementation example of a secure communication protocol are designed.

A compact attribute-based encryption scheme supporting computing outsourcing in fog computing

WANG Zheng, SUN Xiao

2022, 44(3): 427-435. doi:

Ciphertext-policy attribute-based encryption provides one-to-many access control for IoT systems based on cloud storage. However, existing schemes suffer from problems such as high cost and coarse granularity. Therefore, a compact attribute-based encryption scheme that supports computation outsourcing using fog computing is proposed. It shortens the keys and ciphertexts to reduce client storage cost, offloads part of the computation to fog nodes to improve encryption and decryption efficiency, offers richer policy expressiveness, and can quickly verify the correctness of outsourced decryption. The scheme is proved secure in the sense of indistinguishability under chosen-ciphertext attack. Furthermore, performance analysis shows that the proposal can provide efficient access control for data storage in fog computing when terminal devices are resource-constrained.

Collaborative filtering recommendation based on local sensitive hash in blockchain environment

WANG Jing, QIAN Xiao-dong

2022, 44(3): 436-446. doi:

To address the low recommendation performance caused by massive high-dimensional data in the blockchain environment, the locality-sensitive hashing (LSH) algorithm is optimized to reduce the computation and storage overhead of nearest neighbor search. The principal components of the data distribution are used to reduce the number of poorly chosen projection directions in traditional LSH. Meanwhile, the projection vector weights are quantified, the hash bucket interval is adjusted, and the query result set is further refined according to the number of collisions. Finally, a weighted average strategy is used to predict ratings and generate a recommendation list. Experiments show that, compared with other indexing algorithms, the optimized LSH needs only a small number of hash tables and hash functions to obtain accurate neighbor search results, and the search efficiency is greatly improved. The optimized LSH adapts well to the characteristics of blockchain data, alleviates the impact of high-dimensional large-scale data on recommendation performance, and improves recommendation quality and efficiency.
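As a rough illustration of the pipeline in this abstract (hash, count collisions, then predict by weighted average), the sketch below uses plain random Gaussian projections instead of the principal-component directions the paper describes, so it is not the optimized algorithm itself. The class name, the `bucket_width` and `min_collisions` parameters, and the use of collision counts as neighbor weights are all assumptions made for the example.

```python
import random
from collections import defaultdict
from math import floor


class CollisionCountingLSH:
    """Random-projection LSH that ranks candidates by collision count."""

    def __init__(self, dim, num_tables=8, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = bucket_width
        # One (direction, offset) pair per hash table.
        self.projections = [
            ([rng.gauss(0, 1) for _ in range(dim)], rng.uniform(0, bucket_width))
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(set) for _ in range(num_tables)]

    def _key(self, table_idx, vec):
        direction, offset = self.projections[table_idx]
        dot = sum(d * x for d, x in zip(direction, vec))
        return floor((dot + offset) / self.width)

    def add(self, item_id, vec):
        for t, table in enumerate(self.tables):
            table[self._key(t, vec)].add(item_id)

    def query(self, vec, min_collisions=2):
        """Return {item_id: collisions} for items sharing enough buckets."""
        counts = defaultdict(int)
        for t, table in enumerate(self.tables):
            for item_id in table.get(self._key(t, vec), ()):
                counts[item_id] += 1
        return {i: c for i, c in counts.items() if c >= min_collisions}


def predict_rating(neighbor_weights, neighbor_ratings):
    """Weighted average of the ratings given by the retrieved neighbors."""
    pairs = [(w, neighbor_ratings[i])
             for i, w in neighbor_weights.items() if i in neighbor_ratings]
    total = sum(w for w, _ in pairs)
    if total == 0:
        return None
    return sum(w * r for w, r in pairs) / total
```

A typical use would be to call `query` for the target user's vector and pass the returned collision counts to `predict_rating` together with the neighbors' ratings for the item in question.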

Optimization of intrusion detection feature extraction by cost constraint algorithm

LIU Yun, ZHENG Wen-feng, ZHANG Yi

2022, 44(3): 447-453. doi:

The defense performance of an intrusion detection system is often degraded by class-imbalanced data. To automatically extract data features of scarce classes and improve the accuracy of intrusion detection systems in identifying unknown network attacks, a cost constraint algorithm is proposed. Firstly, a deep neural network based on a stacked autoencoder is built, and sparsity constraints are added to the neurons in the hidden layer. Secondly, the cost objective function is optimized by generating a cost matrix, and costs are assigned to imbalanced data features. Finally, back propagation is used to fine-tune the parameters of the neural network model to obtain the optimal feature vector. The simulation results show that, compared with the FAE algorithm and the NDAE algorithm, the cost constraint algorithm improves intrusion detection accuracy and convergence for multi-dimensional and class-imbalanced data.
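One common way to assign costs to imbalanced classes is a cost-weighted cross-entropy, sketched below. The abstract does not give the paper's cost matrix or objective, so the inverse-frequency costs and the function names here are assumptions chosen only to show how rare classes can be made to weigh more in the loss.

```python
from collections import Counter
from math import log


def class_costs(labels):
    """Assumed cost rule: cost inversely proportional to class frequency."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def cost_weighted_cross_entropy(probs, labels, costs):
    """Mean cross-entropy where each sample is scaled by its class cost.

    probs is a list of dicts mapping class name to predicted probability.
    """
    eps = 1e-12  # guards against log(0)
    total = sum(costs[y] * -log(max(p[y], eps)) for p, y in zip(probs, labels))
    return total / len(labels)
```

With nine "normal" samples and one "attack" sample, the attack class receives a cost of 5.0 and the normal class about 0.56, so a misclassified attack contributes roughly nine times more to the loss than a misclassified normal sample.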

An attributed network node embedding method combining two-level attention mechanism

YANG Fan-yi, MA Hui-fang, YAN Cai-rui, SU Yun

2022, 44(3): 454-462. doi:

Attributed network embedding aims to learn low-dimensional representations of the nodes of a given attributed network, such that nodes with similar topology and attributes are close to each other in the embedding space. The attention mechanism can effectively learn the relative importance of a node's neighbors in the network and aggregate node representations according to that importance. On this basis, a node embedding method that incorporates a two-level attention mechanism for attributed networks is proposed. The method first captures the direct neighbors of a node from the topology and its indirect neighbors from attribute relationships. A node-level attention mechanism is then designed to aggregate the direct neighbor representations and the indirect neighbor representations separately, taking the relative importance of each neighbor into account. Finally, a semantic-level attention mechanism is designed to merge the two embedded representations into the final embedding. Experiments on both real-world and synthetic datasets verify the effectiveness of the proposed method.
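The two-level aggregation described here can be illustrated with a small sketch. It assumes dot-product scores followed by a softmax at both levels and a hypothetical `query_vec` for the semantic level; the paper's actual scoring functions and learned parameters are not given in the abstract. Neighbor lists are assumed to be non-empty.

```python
from math import exp


def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, vectors):
    """Attention-weighted sum of vectors, scored by dot product with query."""
    weights = softmax([dot(query, v) for v in vectors])
    return [sum(w * v[k] for w, v in zip(weights, vectors))
            for k in range(len(query))]


def embed_node(node_vec, direct_neighbors, indirect_neighbors, query_vec):
    """Node-level attention per neighbor type, then semantic-level merge."""
    z_direct = attend(node_vec, direct_neighbors)      # topology neighbors
    z_indirect = attend(node_vec, indirect_neighbors)  # attribute neighbors
    return attend(query_vec, [z_direct, z_indirect])
```

In a trained model `query_vec` and the scoring functions would be learned; here they are fixed inputs so that the data flow is easy to follow.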

A masked face detection algorithm fusing improved channel and layer pruning

LIU Zi-yan, YUAN Lei, ZHU Ming-cheng, MA Shan-shan

2022, 44(3): 463-470. doi:

To address the shortage of computing power and resources when deploying target detection in real scenarios, this paper proposes a model pruning method based on improved channel and layer pruning. Channel pruning is improved by setting an adaptive local safety threshold, layer pruning is carried out by comprehensively evaluating the value of each whole residual structure, and the resulting pruning method is applied to masked face detection. Firstly, to train the YOLOv4 target detection network, a masked face dataset is constructed using a face-based data augmentation method. Secondly, the YOLOv4 model is pruned with the improved channel and layer pruning method to obtain several pruned models, and comparative experiments against YOLOv4 and YOLOv4-tiny are carried out on the masked face dataset. Compared with the YOLOv4 model, the most cost-effective pruned model (Prune-best) reduces the number of parameters and the model size by 75% and 60%, respectively; its inference time is reduced by 3.7 ms, while its mAP decreases by 2.7%. The extreme pruning model (Prune-limit), with a model size of 5.56 MB and 1.428 M parameters, reaches an mAP of 0.662, which is 6.3% higher than that of YOLOv4-tiny, with only 1/4 of the parameters of YOLOv4-tiny. The experimental results show that the proposed pruned models are more cost-effective and more suitable for deploying masked face detection in real scenarios.
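The idea of a local safety threshold in channel pruning can be illustrated as follows. The abstract does not define the threshold, so this sketch assumes a common setup: channels are ranked by the magnitude of a per-channel scale factor (for example a batch-normalization weight), a global threshold removes the smallest fraction, and a per-layer threshold caps it so that no layer is pruned away entirely. The function and parameter names are hypothetical.

```python
def prune_channels(layer_scales, prune_ratio=0.5, local_keep_ratio=0.1):
    """Return the channel indices kept in each layer.

    layer_scales maps a layer name to its per-channel scale factors.
    """
    all_scales = sorted(abs(g) for scales in layer_scales.values() for g in scales)
    cut = int(len(all_scales) * prune_ratio)
    # Channels below this value are pruned if only the global rule applies.
    global_threshold = all_scales[cut] if cut < len(all_scales) else float("inf")

    kept = {}
    for name, scales in layer_scales.items():
        magnitudes = [abs(g) for g in scales]
        # Local safety threshold: always keep at least this many channels.
        min_keep = max(1, int(len(scales) * local_keep_ratio))
        local_threshold = sorted(magnitudes, reverse=True)[min_keep - 1]
        threshold = min(global_threshold, local_threshold)
        kept[name] = [i for i, m in enumerate(magnitudes) if m >= threshold]
    return kept
```

For a layer whose scale factors are all small, the global rule alone would delete every channel and break the network; the local threshold keeps that layer's strongest channel instead.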

An indoor people counting model based on global attention

LI Jing, HE Qiang, ZHANG Chang-lun, WANG Heng-you

2022, 44(3): 471-478. doi:

With the explosive development of artificial intelligence, machine learning, deep learning and related technologies have been widely used in face recognition, pedestrian detection, video tracking and other fields. Among these applications, using target detection for indoor crowd counting has attracted considerable attention. Mutual occlusion within crowds and blurred target features in indoor surveillance footage often lead to low detection accuracy and high false detection and missed detection rates. To solve this problem, an indoor people counting model based on global attention is proposed. The model introduces an attention mechanism, optimizes the object detection algorithm YOLOv3, and enhances detection ability by extracting more features of small or unclear heads. The experimental results show that the improved network model achieves higher recall and average precision.

Recognition of diabetic retinopathy based on attention neural network

ZHANG Tong, MENG Liang

2022, 44(3): 479-485. doi:

The identification of diabetic retinopathy depends mainly on the clinical experience of doctors, the features of the lesions are difficult to distinguish by eye, and the recognition rate is low. To solve these problems, a diabetic retinopathy classification method based on an attention neural network is proposed. Firstly, the retinal images are preprocessed by normalization, histogram equalization and data augmentation. Secondly, 2-DenseNet is proposed by adjusting the classical DenseNet to reduce the number of connections while avoiding vanishing gradients and preserving classification accuracy. At the same time, an attention module is embedded into the network to direct it to focus on features such as exudates, thick blood vessels, and microaneurysms in retinal images, and the network is trained and tested on the preprocessed images. Finally, multiple models are compared on the public Kaggle dataset, and the experimental results show that the network achieves higher classification accuracy for diabetic retinopathy than the other models.

Single hazy image depth estimation fusing the perceptual loss function

ZHANG Lei, WANG Yuan-yu, ZHANG Wen-tao

2022, 44(3): 486-494. doi:

To address the difficulty of estimating indoor and outdoor depth under hazy conditions, a single hazy image depth estimation method fusing the perceptual loss function is proposed. Firstly, a two-scale network model is used to coarsely extract features from the hazy images, which are then locally refined by combining the underlying features. Then, in the upsampling stage, a multi-convolution kernel upsampling method is used to obtain the predicted depth map of the hazy image. Finally, the network is trained with a new composite loss function that combines the pixel-level loss function and the perceptual loss function. The model is trained, tested, and validated on the indoor NYU Depth v2 dataset and the outdoor Make3D dataset. The results show that the two-scale network model combining the multi-convolution kernel upsampling method and the composite loss function can better estimate the depth information of a single hazy image, improve the accuracy and quality of depth estimation under hazy conditions, shorten the training time of the model, and improve the applicability of depth estimation for hazy images.

Network rumor recognition based on naive Bayesian classification

LI Wen-li

2022, 44(3): 495-501. doi:

The spread of rumors can disrupt social order, endanger national stability and cause public panic. The wide use of social platforms makes information spread faster and more widely, increasing the negative impact of rumors. How to identify online rumors quickly and accurately has become a hot issue in the field of information dissemination. Rumor recognition is a binary classification problem. Therefore, based on the idea of Bayesian classification, a naive Bayesian classification algorithm for network rumor recognition is designed. The naive Bayesian classifier is built in Matlab, and the algorithm is verified experimentally with data collected from microblogs. By controlling the training set, the accuracy, precision, recall and F1 value of the identification results are compared, and the behavior of the naive Bayesian classifier on rumors and non-rumors under different training conditions is explored. The research shows that the naive Bayesian classifier is effective for online rumor identification, that the selection and control of the training set have a great influence on the identification results, and that the identification accuracy fluctuates with the training conditions.
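For readers unfamiliar with the technique, a minimal multinomial naive Bayes text classifier with Laplace smoothing, plus the precision, recall and F1 metrics named in the abstract, might look like the sketch below. It is written in Python rather than the Matlab used in the paper, and the tokenized toy inputs, class labels, and smoothing choice are assumptions for illustration only.

```python
from collections import Counter, defaultdict
from math import log


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes over token lists, with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + vocab_size
            # log prior + sum of smoothed log likelihoods
            score = log(doc_count / total_docs)
            for t in tokens:
                score += log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def precision_recall_f1(y_true, y_pred, positive="rumor"):
    """Precision, recall and F1 for the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Varying which documents are passed to `fit` and recomputing the metrics on a held-out set mirrors the kind of training-set comparison the abstract describes.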

A span-based joint entity and relation extraction method

YU Jie, JI Bin, WU Hong-ming, REN Yi, LI Sha-sha, MA Jun, WU Qing-bo

2022, 44(3): 502-508. doi:

Span-based joint extraction models have achieved excellent results in named entity recognition and relation extraction. These models regard text spans as candidate entities and span tuples as candidate relation tuples. Span semantic representations are shared between entity recognition and relation extraction, but existing models cannot capture the semantics of these candidate entities and relations well. To address these problems, a span-based joint extraction framework with attention-based semantic representations is proposed. Specifically, attention is used to calculate semantic representations, including span-specific and contextual ones. Experiments show that the proposed model outperforms previous systems and achieves state-of-the-art results on ACE2005, CoNLL2004 and ADE.
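The candidate-generation step that span-based models share (every span is a candidate entity, every ordered pair of entity spans a candidate relation) can be sketched as below. The `max_len` cap and the function names are assumptions for the example, and the attention-based representations proposed in the paper are not modeled here.

```python
from itertools import permutations


def enumerate_spans(tokens, max_len=3):
    """All (start, end) spans of up to max_len tokens, end exclusive."""
    n = len(tokens)
    return [(s, e)
            for s in range(n)
            for e in range(s + 1, min(s + max_len, n) + 1)]


def candidate_relation_pairs(entity_spans):
    """Ordered (head, tail) pairs of distinct spans classified as entities."""
    return list(permutations(entity_spans, 2))
```

In a full model each span would be scored by an entity classifier first, and only spans predicted to be entities would be passed on to `candidate_relation_pairs`.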

A CycleGAN small sample library amplification method for faulty insulator detection

CUI Ke-bin, PAN Feng

2022, 44(3): 509-515. doi:

Training deep learning models for insulator fault detection requires a large number of faulty insulator samples, but such data are difficult to obtain in practice. Generative Adversarial Networks (GANs) provide a feasible way to augment training samples. This paper augments the defective insulator samples using the Cycle-consistent GAN (CycleGAN) structure, optimizes the model by changing the loss function, and feeds the image synthesized by the forward generator into the reverse generator, thus preserving the overall outline of the sample while adding variation. In an SSD (Single Shot Detector) target detection experiment, dataset expansion using the improved CycleGAN model is compared with expansion using other GAN models. The results show that expanding the dataset with the improved CycleGAN significantly improves the recognition rate of insulator drop detection compared with the other expansion methods.

Design of universal flight parameter testing system for semi-physical simulation flight platform

WANG Jian, ZHANG Hui-xin

2022, 44(3): 516-520. doi:

To address the lack of complete raw data support for semi-physical simulation flight platforms in areas such as flight subject evaluation, platform life research, and the formulation of operating standards, this paper proposes the design of a general-purpose flight parameter test system based on LabVIEW. Simulated cockpit control inputs, such as the flight joystick, throttle lever, switch control panel, and rudder pedals, are used as simulation commands to carry out a joint system test with the 3D visual simulation software of the project stakeholders. The results of multiple tests show that the system can monitor and record 55 kinds of flight simulation data in real time, including flight attitude parameters, environmental parameters, engine parameters, various warning signals, pilot call information and cockpit voice information. It can provide effective and quantifiable data support for aircraft manufacturing, upgrades, regular inspections and troubleshooting, and the coding of pilot operating actions.

Social media KOL based on barrage text mining

ZHOU Zhong-bao, ZHU Wen-jing, WANG Hao, GUO Xiu-yuan, WANG Li-feng

2022, 44(3): 521-529. doi:

Social media Key Opinion Leaders (KOLs) are very popular with advertisers because of their high commercial value. However, given the low entry threshold of the KOL industry and the prevalence of data fraud, advertisers are unable to quickly find a KOL that matches their brand. Against this background, this paper studies the videos released by KOLs on social platforms, analyzes the dynamic topics of the barrage text (bullet-screen comments) in the videos, and describes how the barrage topics change over time. At the same time, a convolutional neural network model is used to perform sentiment analysis on the barrage text of videos containing advertisements, and the audience's emotional polarity towards KOL videos that contain advertisements is further analyzed. The experimental results show that the proposed KOL analysis method evaluates the commercial value of KOLs more comprehensively and specifically, helping advertisers find a suitable KOL efficiently.

Multi-UAV target search based on improved pigeon swarm algorithm

LING Wen-tong, NI Jian-jun, CHEN Yan, TANG Guang-yi

2022, 44(3): 531-535. doi:

UAV target search in a three-dimensional unknown environment is a challenging and practical task. Compared with other intelligent algorithms, the pigeon swarm algorithm has faster convergence and higher search efficiency, and is suitable for target optimization tasks. Therefore, a multi-UAV target search method based on the pigeon swarm algorithm is proposed, in which the UAVs search for the information left by the target. Because the pigeon swarm algorithm tends to fall into local optima, an improved pigeon swarm algorithm based on the differential evolution strategy is proposed. Finally, simulation experiments verify the rationality and effectiveness of the proposed algorithm.
