Computer Engineering & Science

Review of reconfigurable optical interconnection network architecture for HPC and DC

CAO Ji-jun

2022, 44(06): 951-963. doi:

The interconnection network is one of the core components of high-performance computing (HPC) systems and data centers (DC), and it is a global infrastructure that determines overall system performance. With the rapid development of high-performance computing, cloud computing, and big data applications, traditional electrical interconnection networks face severe challenges in performance, energy consumption, and cost when meeting the large-scale, scalable communication requirements of HPC and data center services. In recent years, researchers have proposed a variety of reconfigurable optical interconnection network architectures for HPC systems and data centers. This paper first describes the advantages of optical interconnection networks over electrical ones. It then introduces several typical reconfigurable optical interconnection network architectures and analyzes and compares their characteristics. Finally, it discusses development trends for reconfigurable optical interconnection networks.

Research and application of a SPH parallel algorithm based on particle decomposition

XU Xiao-yang, WANG Si-qi

2022, 44(06): 964-970. doi:

As a typical meshless numerical method, the Smoothed Particle Hydrodynamics (SPH) method has natural advantages in modeling free-surface flows. However, it is computationally expensive and time-consuming. This paper therefore proposes an SPH parallel algorithm based on particle decomposition. The algorithm distributes all particles evenly across the processes, and the send, receive, and broadcast functions are each called only once per time step, which makes the algorithm easy to implement and gives it good scalability. The algorithm is used to simulate 2D dam-break flow and 3D droplet impact onto a liquid film. The results show that it significantly reduces simulation time and is well suited to large-scale 3D computation. The maximum speedup exceeds 30 when the number of particles is greater than one million.
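The even distribution of particles across processes mentioned in this abstract can be illustrated with a minimal sketch. The function name `partition_particles` and the choice of contiguous index ranges are assumptions made for illustration; the abstract does not state how the paper assigns particles to processes.

```python
def partition_particles(num_particles: int, num_procs: int) -> list[tuple[int, int]]:
    """Return a half-open index range (start, stop) for each process.

    Hypothetical helper: the range sizes differ by at most one particle.
    """
    if num_procs <= 0:
        raise ValueError("num_procs must be positive")
    base, extra = divmod(num_particles, num_procs)
    ranges = []
    start = 0
    for rank in range(num_procs):
        # The first `extra` ranks each take one additional particle.
        count = base + (1 if rank < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges
```

For example, `partition_particles(10, 3)` gives the three ranges `(0, 4)`, `(4, 7)`, and `(7, 10)`, so no process holds more than one particle beyond any other.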

Design and implementation of CPU secure boot based on NAND Flash

GONG Rui, SHI Wei, LIU Wei, ZHANG Jian-feng, WANG Lei

2022, 44(06): 971-978. doi:

NAND Flash memory is widely used as a storage device in embedded systems because of its advantages in capacity, cost, and speed. However, owing to its inherent device characteristics, NAND Flash requires a complex driver for reads and writes, and code stored on it cannot be executed directly. NAND Flash is therefore not well suited to storing boot code. Boot code is generally stored on NOR Flash and executed there directly; it then boots the operating system stored in NAND Flash, an arrangement that increases system cost and power consumption. This paper designs and implements a CPU secure boot method based on NAND Flash. A block mapping table is added to the NAND Flash controller, and the code stored in the first block of NAND Flash is used to search for and fill in the table. With this method, some NAND Flash storage blocks can be mapped directly into the memory space, so that boot code stored on NAND Flash can be executed directly without drivers. In addition, an extended BootROM scheme is proposed. Using the NAND Flash address mapping structure, part of the on-chip BootROM is extended to the first block of NAND Flash. Hash comparison is used to verify the extended BootROM, which effectively reduces the design complexity and code size of the on-chip BootROM. The method implements secure boot for a system with a single NAND Flash device, reducing system cost and improving system security.

A floating-point exception detection method based on pile insertion at compile time

GUO Si-yu, WANG Lei

2022, 44(06): 979-985. doi:

Floating-point numbers are a finite-precision encoding of real numbers, so floating-point calculations may produce inexact or exceptional results, which makes effective floating-point exception detection important. Existing exception detection methods are not oriented to floating-point mathematical functions. To address this gap, an exception detection method oriented to floating-point mathematical functions is proposed. The method is based on the five types of exceptions defined in the IEEE-754 standard (overflow, underflow, division by zero, invalid operation, and inexact), combined with the exception generation conditions defined by the IEEE-754 standard and by the floating-point control register (FPCR) used in the Sunway high-performance mathematical function library. By classifying the exception types and the floating-point operation instructions, instrumentation (pile insertion) is performed at compile time to detect exceptions in the floating-point mathematical functions while recording code coverage. In the experiment, the method is applied to a mathematical function library, and more than 100 floating-point mathematical functions in the library are tested. Experimental results show that the proposed method can effectively detect the various types of exceptions.
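To make the five IEEE-754 exception types concrete, the following is a minimal sketch that labels a single double-precision division. It is not the paper's compile-time instrumentation: it is a run-time check written for illustration, the function name `classify_division` is hypothetical, NaN operands are excluded by assumption, and exact rational arithmetic is used to decide whether rounding occurred.

```python
import math
import sys
from fractions import Fraction


def classify_division(a: float, b: float) -> str:
    """Label a / b with the IEEE-754 exception it would signal, or "none".

    Hypothetical helper; when several flags apply (e.g. overflow also
    implies inexact), only the more specific label is returned.
    """
    if math.isnan(a) or math.isnan(b):
        raise ValueError("NaN operands are outside the scope of this sketch")
    if math.isinf(a) and math.isinf(b):
        return "invalid"                      # inf / inf
    if b == 0.0:
        if a == 0.0:
            return "invalid"                  # 0 / 0
        # inf / 0 is an exact infinity; finite nonzero / 0 signals the flag.
        return "none" if math.isinf(a) else "division_by_zero"
    if math.isinf(a) or math.isinf(b):
        return "none"                         # inf / x and x / inf are exact
    exact = Fraction(a) / Fraction(b)         # exact rational quotient
    result = a / b                            # rounded double quotient
    if math.isinf(result):
        return "overflow"
    rounded = Fraction(result) != exact
    # Tiny (below the smallest normal double) and inexact: underflow.
    if rounded and abs(result) < sys.float_info.min:
        return "underflow"
    return "inexact" if rounded else "none"
```

A usage example under these assumptions: `classify_division(1.0, 3.0)` is labeled `"inexact"` because one third has no finite binary expansion, while `classify_division(1.0, 2.0)` is labeled `"none"`.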

A congestion-aware Hamilton shortest path routing algorithm for network on chip

KANG Zi-yang, PENG Ling-hui, ZHOU Gan, LIN Bo, WANG Lei

2022, 44(06): 986-993. doi:

Spiking neural networks (SNNs) can be deployed on neuromorphic processors to complete various tasks. A network on chip (NoC) can solve complex interconnection and communication problems with fewer resources and lower power consumption, and NoCs are widely adopted in neuromorphic processors to support communication between neurons. The instantaneous burst communication pattern of SNNs generates a large number of spikes at each time step. The NoC then saturates rapidly, causing network congestion, and non-congestion-aware routing algorithms further aggravate the congestion. The problem to be solved is how to process these spikes effectively at each time step, reduce network delay, and increase throughput. The paper first analyzes the instantaneous burst communication characteristics of SNNs. Then, a congestion-aware Hamilton path routing algorithm with the shortest path length is proposed to reduce the average latency and increase the throughput of the NoC. Finally, the routing algorithm is implemented in Verilog HDL, and its performance is evaluated by simulation. The results show that, compared with non-congestion-aware routing algorithms in a 16×16 2D mesh NoC, the proposed algorithm reduces the average delay by 13.9% and 159% and increases the throughput by 21.6% and 16.8%, respectively, under the two experimental scenarios (different packet counts and different packet injection rates).

Webshell detection based on deep learning

CHE Sheng-bing, ZHANG Guang-lin

2022, 44(06): 994-1002. doi:

For Webshell detection in AWD offensive and defensive competitions, fuzzy C-means clustering is used to analyze Webshells in hyperspace, revealing that attack vectors are globally sparse and locally closely related. Two deep learning models are proposed for Webshell detection. Because most Webshells collected from GitHub are obtained randomly and are not well targeted, the length of the training data is limited and only a limited number of relevant samples are retained. Because an attack is closely related to the adjacent 2 to 4 operations, the attack vector shows obvious correlation in the vertical direction while remaining relatively stable in the horizontal direction. Since the scale of the feature vector shrinks as it passes through the network, the zero padding of the convolutional layers is increased. To address the sawtooth oscillation of the deep learning training curve, a fast calculation formula for the Adam optimization algorithm is proved and the learning parameters are corrected. This progressively eliminates the sawtooth in the training loss curve, makes the curve fall smoothly following an exponential law, and yields training results quickly. Experiments compare the two deep learning models with existing similar detection models. The results show that the proposed models detect Webshell attacks in AWD more effectively.

Privacy-preserving broadcast encryption in smart city

NIU Shu-fen, FANG Li-zhi, SONG Mi, WANG Cai-fen, DU Xiao-ni

2022, 44(06): 1003-1012. doi:

Public city departments and citizens in a modern city generate a large amount of data, and modern information, communication, and network technologies are used to process and exploit it. To protect user privacy and data security, the data are encrypted during transmission. Broadcast encryption is the most effective method in a multi-user environment. In traditional identity-based broadcast encryption, the ciphertext can be broadcast to a group of receivers, and the identities of the receivers are contained in the ciphertext. When multiple receivers decrypt the ciphertext, the identity information of the other users is exposed. To protect identity privacy among receivers, an identity-based privacy-preserving broadcast encryption algorithm is proposed, which achieves anonymity among receivers. In addition, the algorithm addresses how to revoke specified target receivers from the anonymous broadcast ciphertext and determines each user's data access authority according to the data access control policy, thereby providing ciphertext revocation for users. The revocation process reveals neither the plaintext nor the identity information of the receivers. The security of the algorithm is proved in the random oracle model based on the hardness of the BDH problem, and its effectiveness and security are verified by simulation experiments on a real data set.

A hybrid particle swarm-butterfly algorithm for WSN node deployment

ZHANG Meng-jian, WANG Min, WANG Xiao, QIN Tao, YANG Jing

2022, 44(06): 1013-1022. doi:

To address the uneven distribution and low coverage that result when nodes are randomly deployed in a wireless sensor network (WSN), a hybrid particle swarm-butterfly algorithm (HPSBA) is proposed for node deployment optimization. Firstly, logistic mapping and adaptive adjustment strategies are designed to control parameter values, improving the optimization speed, convergence accuracy, and global search capability of HPSBA. Then, four benchmark functions are used to analyze the performance of HPSBA. The simulation results show that HPSBA has higher optimization accuracy, faster optimization speed, and better stability. Finally, HPSBA is applied to WSN node deployment optimization and compared with six other typical algorithms, including PSO, BOA, and IGWO. The results show that HPSBA achieves a higher coverage rate, which can effectively reduce node redundancy and prolong the lifetime of the WSN.
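The logistic mapping mentioned in this abstract is the recurrence x_{n+1} = μ·x_n·(1 − x_n). The sketch below only generates the sequence; how HPSBA uses it to control parameter values is not stated in the abstract, and the function name `logistic_sequence` and the example setting μ = 4 are assumptions made for illustration.

```python
def logistic_sequence(x0: float, mu: float, n: int) -> list[float]:
    """Return n iterates of the logistic map x -> mu * x * (1 - x).

    Hypothetical helper: for 0 < mu <= 4 and 0 < x0 < 1 the iterates
    stay inside the unit interval.
    """
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    if not 0.0 < mu <= 4.0:
        raise ValueError("mu must lie in (0, 4]")
    values = []
    x = x0
    for _ in range(n):
        x = mu * x * (1.0 - x)  # one step of the logistic map
        values.append(x)
    return values
```

With μ = 4 the map is chaotic for most starting points, which is one common reason it is used to diversify parameter values in metaheuristics; that motivation is general background rather than a claim about this paper.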

Capacity analysis of UAV communication system based on FD-NOMA

NIU Chun-yu, JIA Xiang-dong, CAO Sheng-nan, WAN Ni-ni

2022, 44(06): 1023-1029. doi:

To improve the communication quality between unmanned aerial vehicles (UAVs) and ground users, a UAV communication system model based on full-duplex non-orthogonal multiple access (FD-NOMA) is proposed, and the capacity of the system model in urban and suburban scenarios is analyzed. Firstly, the exact capacity expression of the system model is given. Secondly, the exponential integral function in the expression is evaluated by introducing the Q function and a truncation method, yielding an approximate closed-form expression for the capacity. Thirdly, for urban scenarios, a coefficient factor is used to obtain a more accurate approximate closed-form expression. Finally, simulation and numerical results show that the Rice factor has a certain influence on the system capacity, and that increasing the number of UAVs or the NOMA power vector achieves better capacity performance.

An image camouflage encryption method based on vector quantization in cloud storage environment

ZHENG Si-fei, FENG Zi-jing, LIU Cheng-yu, CHEN Ri-qing, LIU Xiao-long

2022, 44(06): 1030-1036. doi:

Traditional image encryption methods usually encrypt an original image into a ciphertext image that resembles texture or noise. Such a ciphertext image easily attracts the attention of attackers, inviting many different types of malicious attack and analysis. To improve the security of image storage in cloud environments, this paper proposes a new image camouflage encryption method. Based on vector quantization and the discrete wavelet transform, the method adopts a "plaintext-to-plaintext" camouflage encryption approach, which protects cloud images as traditional encryption does while also providing visual camouflage. Experimental results show that the method effectively improves the storage efficiency of cloud images and also offers better visual quality and camouflage characteristics.

Security communication path planning of UAV sensor network under QoS guarantee

GAO Hang, WU Jia-xin, CHEN Long, WU Ji-gang

2022, 44(06): 1037-1045. doi:

In wireless sensor networks, unmanned aerial vehicles (UAVs) regularly cruise over the sensor-covered area to collect the data sensed by the sensors. Owing to the broadcast nature of wireless channels, information is easily eavesdropped by illegitimate nodes on the ground, which makes securing wireless communication challenging. Through UAV trajectory planning and sensor power control, the security of wireless communication can be guaranteed at the physical layer. However, existing research on trajectory planning for UAV-assisted wireless communication does not consider the minimum communication time that sensors require to ensure quality of service. To solve this problem, a minimum communication time constraint is added. The problem of maximizing the average secrecy rate is formulated by jointly optimizing the UAV trajectory, the transmission power of the sensors, and the order in which the UAV serves the sensors, and this problem is proved to be non-convex. To solve it, the original problem is decomposed into three sub-problems, and a fast-converging iterative algorithm, TPA, is proposed, which uses block coordinate descent, successive convex optimization, and iterative rounding. The experimental results show that TPA improves the average secrecy rate by 15.7% on average compared with the baseline scheme without trajectory optimization, and by 159.8% on average compared with the baseline scheme without power control. Compared with the baseline scheme without the minimum communication time constraint, TPA improves the task completion rate by 44.6% and 27.1% on average under two different task distributions when the UAV flight period is greater than 70 s.

Overview of parallel fuzzing

GU Tao-tao, LU Shuai-bing, LI Xiang, KUANG Xiao-hui, ZHAO Gang

2022, 44(06): 1046-1055. doi:

Software vulnerabilities have become the main threat to Internet security, so software vulnerability analysis technology has become increasingly important. As one of the most active techniques in vulnerability analysis, fuzzing triggers program exceptions by continuously generating test cases, dynamically monitoring the execution of the target code, and adjusting its mutation strategies according to feedback. Fuzzing has the advantages of convenient deployment, wide applicability, and intuitive results. However, its dynamic execution, mutation, and feedback mechanisms are time-consuming, which reduces the efficiency of vulnerability analysis. Parallel fuzzing improves the efficiency of vulnerability detection through parallel execution, task decomposition, and information sharing. Firstly, the main challenges of coverage-guided fuzzing are analyzed. Then, the ideas and solutions of parallel fuzzing are discussed. In addition, the system structure, task division, corpus sharing, crash de-duplication, and other aspects of parallel fuzzing are summarized. Finally, the advantages and disadvantages of existing parallel fuzzing approaches are summarized, and future development directions are discussed.

An ARINC429 decoding library based on tree structure

FAN Zhi-yong, WEI Shi-hao, CUI Hai-qing

2022, 44(06): 1056-1062. doi:

During bus testing, engineers need to look through the interface control file and convert the received binary numbers to decimal, which makes fault location slow. To solve this problem, an ARINC429 bus data description method based on a tree-structured file format is proposed, and an interface control file decoding library is built in XML using this description method. The decoding library contains the transmission speed, parameter name, parameter unit, and other information for ARINC429 data words, and it can decode the BNR, BCD, and DIS formats in the ARINC429 specification. The decoding library is applied to the communication and navigation hardware-in-the-loop simulation platform of a domestic aircraft avionics system. The results show that the simulation platform can identify and decode the data received by the acquisition board. The approach separates the simulation test software from the decoding library, improves maintainability, reduces the total amount of decoding code in the simulation test software, and is highly portable.

Formal derivation of the sequence dimidiate partition problem

ZUO Zheng-kang, LIANG Zan-yang, SU Wei, HUANG Qing, WANG Yuan, WANG Chang-jing

2022, 44(06): 1063-1071. doi:

Formal derivation is program development guided by the theory of program correctness proof, and it ultimately yields a completely correct algorithmic program. For the sequence dimidiate partition problem, existing formal derivation methods alternate between derivation and proof. The derivation process is cumbersome, and most methods cannot directly obtain an executable program. To solve these problems, this paper proposes a new formal derivation method for the sequence dimidiate partition problem. Based on the core idea of partition and recursion, the method applies specification transformation techniques to transform the problem specification while strictly guaranteeing its consistency, so that no proof is needed during derivation. The recurrence relations are then derived and a highly reliable Apla program is obtained. Finally, a conversion tool is used to automatically generate executable programs. The method thus covers the complete process of program refinement from the program specification to a concrete executable program. Two algorithms are taken as examples to verify the effectiveness and feasibility of the method, which offers guidance for the formal derivation of related problems.

An extended VIFB for infrared and visible image fusion

LI Yi, LI Yang, MIAO Zhuang, WANG Jia-bao, ZHANG Rui

2022, 44(06): 1072-1082. doi:

Infrared and visible image fusion is an important area of computer vision because of its numerous applications. In recent years, much progress has been made in developing image fusion algorithms. However, there is a lack of code libraries and benchmarks that can be used to evaluate state-of-the-art methods. In this paper, after briefly reviewing recent advances in infrared and visible image fusion methods, an extended VIFB for infrared and visible image fusion is proposed, which includes 56 image pairs, a code library of 32 fusion algorithms, and 16 evaluation metrics. Extensive experiments are conducted on the benchmark to evaluate the performance of these fusion algorithms. In addition, the fusion results are analyzed qualitatively and quantitatively to identify the algorithms with excellent performance. Finally, the future prospects of the infrared and visible image fusion field are discussed.

A text-to-image model based on the two-phase stacked generative confrontation network with spectral normalization

WANG Xia, XU Hui-ying, ZHU Xin-zhong

2022, 44(06): 1083-1089. doi:

Generating images from text is a challenging task in the machine learning community. Although significant success has been achieved so far, problems such as unstable network training and vanishing gradients still exist. To address these shortcomings, this paper proposes a text-to-image generation method that builds on the stacked generative adversarial network model (StackGAN, rendered as "stacked generative confrontation network" in the title) and combines spectral normalization with a perceptual loss function. Firstly, the network model applies spectral normalization to the discriminator, which restricts the gradient of each layer of the network to a fixed range, slows the convergence of the discriminator, and hence improves the stability of network training. Secondly, the perceptual loss function is added to the generator network to enhance the consistency between the text content and the generated image. Inception scores are used to evaluate the quality of the generated images. The experimental results show that, compared with the original StackGAN, the network model is more stable and generates clearer images.
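Spectral normalization, as referred to in this abstract, divides a weight matrix by its largest singular value, which is commonly estimated by power iteration. The following is a minimal pure-Python sketch of that general technique on a small dense matrix; the function names are hypothetical, the matrix is assumed to be nonzero, and nothing here reflects the paper's actual network code.

```python
import math


def _matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def _unit(vector):
    """Scale a nonzero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def spectral_norm(weights, iterations=50):
    """Estimate the largest singular value of `weights` by power iteration."""
    transposed = [list(col) for col in zip(*weights)]
    v = _unit([1.0] * len(weights[0]))          # arbitrary starting direction
    for _ in range(iterations):
        u = _unit(_matvec(weights, v))          # left singular direction
        v = _unit(_matvec(transposed, u))       # right singular direction
    return sum(a * b for a, b in zip(u, _matvec(weights, v)))


def spectral_normalize(weights, iterations=50):
    """Return weights / sigma so that the largest singular value is about 1."""
    sigma = spectral_norm(weights, iterations)
    return [[w / sigma for w in row] for row in weights]
```

Because the normalized matrix has spectral norm close to 1, the layer it defines is approximately 1-Lipschitz, which is the usual argument for why this constraint bounds the discriminator's gradients.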

An improved M2Det algorithm for ground arrow marking line detection

HUO Ai-qing, LI Yi

2022, 44(06): 1090-1096. doi:

An improved M2Det detection algorithm is proposed to solve the problems of low accuracy and a large number of parameters in the detection of ground arrow marking lines. The algorithm uses an improved backbone feature extraction network and a multi-level pyramid network for feature extraction, and it uses non-maximum suppression to filter the generated dense bounding boxes and class scores to obtain the detection results. The lightweight MobileNet v1 network replaces the VGG network to reduce the number of parameters, and the Mish activation function replaces the ReLU activation function. In addition, a BasicRFB module is added to the MobileNet v1 network to increase detection accuracy, and Mosaic data augmentation is introduced. Self-labeled ground arrow marking lines are used as the experimental dataset. The experimental results show that the mAP of the improved M2Det algorithm reaches 88.72%, which is about 3.9% higher than that of the original M2Det algorithm and significantly higher than that of the other algorithms compared.

Aspect identification of microblog cases based on the interactive attention of contents and comments

DUAN Ling, GUO Jun-jun, YU Zheng-tao, XIANG Yan

2022, 44(06): 1097-1104. doi:

The automatic identification of aspects involved in microblog cases is an important means of understanding public opinion on Internet social media news. However, the text format and content of microblogs are flexible and changeable, and traditional aspect identification methods usually use only a single text or comment, which makes microblog semantics difficult to understand. This paper studies the identification of aspects in case-related microblog text. It proposes an aspect identification method for microblog cases based on the interactive attention of contents and comments, which identifies aspects by integrating the contextual information of social media. The method first encodes the contents and the comments separately using the Transformer framework, then fuses the content information and the comment information through an interactive attention mechanism, and finally identifies the aspects of the comment text from the fused features. Experiments were conducted on a microblog dataset containing 12 cases. The results show that using interactive attention to fuse microblog content information significantly improves the accuracy of aspect identification, which demonstrates the effectiveness of the proposed method.

Robust speech recognition based on adaptive deep neural network in complex environment

ZHANG Kai-sheng, ZHAO Xiao-fen

2022, 44(06): 1105-1113. doi:

In continuous speech recognition systems operating in complex environments (with variability in both speakers and environmental noise), the training data do not match the test data, which results in a low speech recognition rate. To address this, a speech recognition method based on an adaptive deep neural network is studied. An improved regularized adaptation criterion is combined with feature-space adaptation of the deep neural network to improve data matching. Speaker identity vectors (i-vectors) are fused with noise-aware training to cope with changes in speaker and environmental noise, and the classification function of the output layer of the traditional deep neural network is improved to ensure compactness within classes and separation between classes. Experiments were carried out by superimposing various background noises on the TIMIT English speech data set and the Microsoft Chinese speech data set. The results show that, compared with the currently popular GMM-HMM and traditional DNN acoustic models, the proposed method reduces the word error rate by 5.151% and 3.113%, respectively, which improves the generalization performance and robustness of the model to a certain extent.

Application of particle swarm optimization with heuristic information in low-carbon TSP

SHEN Xiao-ning, PAN Hong-li, CHEN Qing-zhou, YOU Xuan, HUANG Yao

2022, 44(06): 1114-1125. doi:

A mathematical model of the low-carbon traveling salesman problem (LCTSP) is established and its validity is verified. A discrete particle swarm optimization algorithm based on heuristic information is proposed. Firstly, according to the distance and load information, a novel discrete individual generation operator is designed, which applies a multi-mutation strategy to the individual itself to maintain its "inertia" and a greedy crossover strategy to exchange information with the personal best and the global best. Secondly, the personal best is searched locally based on priority unloading information, and the object tracked by the population is adjusted so that the search escapes local optima quickly. Thirdly, according to the degree of population assimilation, the point insertion method and the 2-Opt operator are used to refine the search around the global best, in order to enhance exploitation ability, improve search accuracy, and reduce the rate of population assimilation. Experimental results on a group of low-carbon traveling salesman problems of different scales show that the proposed algorithm is more accurate than six state-of-the-art algorithms.
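The 2-Opt operator named in this abstract is a standard local search move: reverse a segment of the tour and keep the change if the tour gets shorter. The sketch below is a generic version using plain Euclidean distance as the cost, which is an assumption; the paper's LCTSP objective involves load and carbon emission terms that the abstract does not specify, and the function names are hypothetical.

```python
import math


def tour_length(tour, points):
    """Total length of the closed tour visiting `points` in `tour` order."""
    n = len(tour)
    return sum(
        math.dist(points[tour[k]], points[tour[(k + 1) % n]]) for k in range(n)
    )


def two_opt(tour, points):
    """Repeatedly reverse tour segments while doing so shortens the tour.

    The first city stays fixed; the result is a local optimum with respect
    to segment reversal, not necessarily the global optimum.
    """
    best = list(tour)
    best_len = tour_length(best, points)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reverse the segment best[i..j] to form a candidate tour.
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = tour_length(candidate, points)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best
```

As a usage example, for the unit square `[(0, 0), (1, 0), (1, 1), (0, 1)]` the self-crossing tour `[0, 2, 1, 3]` is uncrossed into a tour around the perimeter.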
