High Performance Computing
-
A low BER cooperative-adaptive-equalizer for 112 Gbps PAM4 wireline receivers
- LAI Ming-che, Lv Fang-xu, ZHANG Geng, XU Chao-long
-
2023, 45(06):
951-960.
doi:
-
Abstract
-
The high-speed serial interface is a key intellectual property (IP) block for inter-chip interconnection in high-performance computers and data centers. As the single-channel rate of the serial interface evolves from 56 Gbps to 112 Gbps, the interface faces a sharp increase in bit error rate (BER), which seriously affects interconnection performance and system stability. To address the high BER of 112 Gbps PAM4 receivers, this paper proposes a cooperative adaptive equalizer. First, an adaptive cooperative equalization algorithm for three kinds of equalizers is proposed to achieve a low BER under large insertion loss. Then, a blind adaptive equalization algorithm based on the decision feedback equalizer is proposed to shorten the link training time and reduce the hardware overhead. The receiver with the cooperative adaptive equalizer is implemented as a circuit in 12 nm CMOS technology. Simulation results show that the receiver can stably receive a 112 Gbps PAM4 signal under a 36.5 dB condition with a BER below 1e-12, with a convergence time of about 400 ns and a power consumption increase of only about 2.3%.
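To make the idea of blind adaptation around a decision feedback equalizer concrete, here is a minimal sketch, not the algorithm of the paper. It assumes a noiseless toy channel with a single post-cursor tap (`post_cursor=0.3`), a one-tap DFE, and a decision-directed LMS update; the names `slice_pam4`, `adapt_dfe_tap`, `simulate_channel`, and the step size `mu` are all hypothetical.

```python
import random

PAM4_LEVELS = (-3.0, -1.0, 1.0, 3.0)

def slice_pam4(z):
    """Return the PAM4 level nearest to the equalized sample z."""
    return min(PAM4_LEVELS, key=lambda level: abs(z - level))

def adapt_dfe_tap(received, mu=0.01):
    """Adapt one DFE tap from the receiver's own decisions (no training sequence)."""
    tap = 0.0
    prev_decision = 0.0
    decisions = []
    for y in received:
        z = y - tap * prev_decision      # cancel the estimated post-cursor ISI
        d = slice_pam4(z)                # blind decision on the current symbol
        err = z - d                      # residual error relative to that decision
        tap += mu * err * prev_decision  # decision-directed LMS update
        decisions.append(d)
        prev_decision = d
    return tap, decisions

def simulate_channel(symbols, post_cursor=0.3):
    """Toy channel (assumption): each symbol leaks into the next one."""
    out, prev = [], 0.0
    for s in symbols:
        out.append(s + post_cursor * prev)
        prev = s
    return out

rng = random.Random(0)
tx = [rng.choice(PAM4_LEVELS) for _ in range(5000)]
rx = simulate_channel(tx)
tap, decisions = adapt_dfe_tap(rx)
```

In this toy setting the tap settles at the channel's post-cursor value without any known training pattern, which is the property that lets a blind scheme shorten link training.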
-
A visual analysis method of curved surface streamline based on CAD geometric model
- WU Fu-kun, CAO Yi, DONG Qing-li,
-
2023, 45(06):
961-969.
doi:
-
Abstract
-
To address the problem of generating surface streamlines on complex geometry in large-scale vector fields, we design and implement a high-precision visual analysis method for streamlines on complex geometric surfaces. The method first performs intersection calculations on the two-dimensional surface structure with complex topology, and introduces a high-precision interpolation algorithm to extract the vectors on the geometric surface effectively. Then, an improved Runge-Kutta streamline integration technique for vector fields is applied, in which an adaptive integration step strategy and a streamline-surface intersection acceleration structure are introduced to generate continuous and consistent surface streamlines. Finally, the method adds the ball feature to the visualization pipeline and is integrated into a general visualization analysis platform. Experimental results show that the method can generate continuous and consistent geometric surface streamlines with high precision, and effectively characterizes the physical properties of the surface flow field of complex devices.
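The combination of Runge-Kutta integration and an adaptive step can be sketched as follows. This is a simplified illustration under stated assumptions: it traces a streamline in a planar analytic field rather than on a CAD surface, it uses classical RK4 with step doubling as the adaptivity rule, and the names `rk4_step`, `trace_streamline`, `rotation`, and the tolerance `tol` are hypothetical. The improved scheme of the paper is not specified in the abstract.

```python
import math

def rk4_step(field, p, h):
    """One classical fourth-order Runge-Kutta step of size h from point p."""
    def shifted(q, k, s):
        return (q[0] + s * k[0], q[1] + s * k[1])
    k1 = field(p)
    k2 = field(shifted(p, k1, h / 2))
    k3 = field(shifted(p, k2, h / 2))
    k4 = field(shifted(p, k3, h))
    return (p[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            p[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

def trace_streamline(field, seed, length, h=0.2, tol=1e-6):
    """Integrate over `length` units of the parameter, adapting h by step doubling."""
    points, p, t = [seed], seed, 0.0
    while t < length:
        h = min(h, length - t)
        full = rk4_step(field, p, h)
        half = rk4_step(field, rk4_step(field, p, h / 2), h / 2)
        err = math.dist(full, half)   # local error estimate
        if err > tol:
            h /= 2                    # too inaccurate: retry with a smaller step
            continue
        p, t = half, t + h            # accept the more accurate two-half-step result
        points.append(p)
        if err < tol / 32:
            h *= 2                    # very accurate: try a larger step next time
    return points

def rotation(p):
    """Test field whose streamlines are circles around the origin."""
    return (-p[1], p[0])

line = trace_streamline(rotation, (1.0, 0.0), 2 * math.pi)
```

On the rotation field the traced points stay on the unit circle and the curve closes after one revolution, which is a quick sanity check that the step control keeps the streamline continuous and consistent.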
-
Design and implementation of a customized network-on-chip design exploration algorithm
- GE Yi-xuan, LI Chen, CHEN Xiao-wen, LU Jian-zhuang, GUO Yang
-
2023, 45(06):
970-978.
doi:
-
Abstract
-
Designing customized networks-on-chip to meet specific application needs has become a trend in network-on-chip design. Such systems consist of a large number of devices of various types, and mapping these devices onto traditional regular network topologies may result in a lower performance/overhead ratio. Customized on-chip networks are therefore a better choice for domain-specific architectures because of their fine-tuned design. However, fine-tuned design is time-consuming and places a heavy burden on designers, so exploring the optimal custom network topology in an agile yet fine-tuned way is an important challenge for application-specific networks-on-chip. To explore the optimal topology of a customized network-on-chip, an agile and automatic exploration algorithm is designed. To reduce complexity, a heuristic linear programming algorithm is proposed to accelerate traversal between multiple network layers. Compared with the traditional Mesh topology, the generated topology achieves about 20% performance improvement and reduces the average hop count by about 30% within a reasonable time. The design exploration algorithm also has low time complexity and can automatically generate a customized network-on-chip architecture in linear time. It is highly scalable and can be applied to large-scale systems-on-chip.
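Average hop count, the metric quoted above, can be computed with a breadth-first search from every node. The sketch below is only an illustration of the metric: the 4 x 4 mesh size, the two extra links, and the function names `mesh_links` and `average_hop_count` are assumptions, and the extra links are hand-picked rather than produced by the exploration algorithm of the paper.

```python
from collections import deque

def mesh_links(rows, cols):
    """Bidirectional links of a rows x cols 2D mesh; nodes are numbered row-major."""
    links = set()
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:
                links.add((node, node + 1))
            if r + 1 < rows:
                links.add((node, node + cols))
    return links

def average_hop_count(num_nodes, links):
    """Mean shortest-path hop count over all ordered pairs of distinct nodes.

    Assumes the topology is connected.
    """
    neighbors = {n: [] for n in range(num_nodes)}
    for a, b in links:
        neighbors[a].append(b)
        neighbors[b].append(a)
    total = 0
    for src in range(num_nodes):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            for nxt in neighbors[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
    return total / (num_nodes * (num_nodes - 1))

mesh = mesh_links(4, 4)
baseline = average_hop_count(16, mesh)
# Hypothetical customization: add two diagonal corner-to-corner links.
customized = average_hop_count(16, mesh | {(0, 15), (3, 12)})
```

Comparing `baseline` with `customized` shows how a few application-specific links lower the average hop count relative to the regular mesh.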
-
Design and implementation of UPF power supply state analyzer based on WCOJ
- SHI Ming-chuan, ZOU Hong-ji, QIN Zhi-kai, LI Tun
-
2023, 45(06):
979-986.
doi:
-
Abstract
-
As IC technology advances, the number of functional units that can be integrated on a single chip keeps increasing and the total power consumption of the circuit keeps rising, so that power consumption has become an unavoidable problem in VLSI design. To solve this problem, a low-power design process based on the Unified Power Format (UPF) is proposed. Because UPF analysis consists mainly of data table operations, an algorithm based on WCOJ (Worst-Case Optimal Join) is proposed to check and merge the design rules of the power supply state tables of each hierarchy level in the voltage domain, and a power supply state analysis tool for a low-power design analyzer is designed and implemented. Experimental results show that the proposed algorithm has lower space and time complexity than the binary merge algorithm and is highly portable, which gives it both theoretical and practical significance.
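The difference between a worst-case optimal join and a chain of binary merges can be illustrated on the classic triangle query. The sketch below is an assumption-laden toy, not the analyzer of the paper: three hypothetical state tables `r_ab`, `s_bc`, and `t_ac` each list the legal state pairs of two supplies, and the join binds one attribute at a time while intersecting the candidates from every table that mentions it.

```python
from collections import defaultdict

def index(pairs):
    """Map each first-column value to the set of second-column values paired with it."""
    table = defaultdict(set)
    for left, right in pairs:
        table[left].add(right)
    return table

def triangle_join(r_ab, s_bc, t_ac):
    """Join R(a,b), S(b,c), T(a,c) attribute by attribute (generic-join style)."""
    r, s, t = index(r_ab), index(s_bc), index(t_ac)
    result = []
    for a in sorted(r.keys() & t.keys()):      # values of a present in both R and T
        for b in sorted(s.keys() & r[a]):      # values of b consistent with a and S
            for c in sorted(s[b] & t[a]):      # values of c consistent with both
                result.append((a, b, c))
    return result

# Hypothetical power state tables: legal (supply_x, supply_y) state pairs.
r_ab = [("ON", "ON"), ("ON", "OFF"), ("OFF", "OFF")]
s_bc = [("ON", "ON"), ("OFF", "OFF"), ("OFF", "ON")]
t_ac = [("ON", "ON"), ("OFF", "OFF")]
merged = triangle_join(r_ab, s_bc, t_ac)
```

Because every level intersects before extending a partial tuple, no intermediate result larger than the final answer needs to be materialized, which is where the space advantage over pairwise merging comes from.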
-
Experimental research on heat dissipation performance of aluminous vapor chamber based on 6U VPX motherboard
- LI Yi
-
2023, 45(06):
987-994.
doi:
-
Abstract
-
To address the heat dissipation problem of a 6U VPX high-performance motherboard with many heat-generating devices and a large total thermal power consumption, an engineering application experiment on an aluminous vapor chamber is carried out. The test results show that the aluminous vapor chamber cannot meet the heat dissipation requirements of the motherboard under natural heat dissipation at a normal temperature of 28 ℃. Under air-cooled heat dissipation, the maximum temperature difference on the aluminous vapor chamber is 7.2 ℃, and the maximum junction temperatures of the CPU and DSP chips are 45 ℃ and 50 ℃, respectively. In a 60 ℃ high-temperature environment, the maximum temperature difference on the aluminous vapor chamber is 6.7 ℃, and the maximum junction temperatures of the CPU and DSP chips are 85 ℃ and 83 ℃, respectively, both below the allowable junction temperature of 105 ℃. It is concluded that the aluminous vapor chamber has remarkable heat dissipation performance and can meet the heat dissipation requirements of the standard 6U VPX high-performance, high-power motherboard under air cooling, making it an effective way to solve the heat dissipation difficulty of the motherboard.
Computer Network and Information Security
-
Double matching optimization of LoRa parameters based on matching theory
- YANG Mao-heng, ZHANG Hui, ZHOU Chao
-
2023, 45(06):
995-1002.
doi:
-
Abstract
-
Resource allocation in LoRaWAN is formulated as an optimization problem of spreading factor allocation and channel allocation, with the aim of ensuring throughput fairness among LoRa users under limited spectrum resources, especially when a large number of devices are connected. First, matching theory is introduced: LoRa users and channels, and LoRa users and spreading factors, act as matching parties that each seek to maximize their utility. On this basis, a matching-based channel and spreading factor assignment algorithm is proposed. With the goal of maximizing utility, it optimizes the channel and spreading factor allocation so that the minimum channel capacity achieved in LoRaWAN is maximized. A fair airtime initialization algorithm is also proposed to ensure throughput fairness for each group of parameters. Simulation results show that the fair airtime initialization algorithm obtains better initial allocation results than other allocation schemes, and that the matching-based channel and spreading factor assignment algorithm significantly increases the data extraction rate of the LoRa network and greatly reduces network energy consumption.
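A many-to-one matching between users and channels can be sketched with the standard deferred-acceptance procedure. This is a generic textbook illustration, not the assignment algorithm of the paper: the preference lists below are hypothetical stand-ins for the utilities the paper would compute, and `deferred_acceptance` and `quota` are assumed names.

```python
def deferred_acceptance(user_prefs, channel_rank, quota):
    """Users propose in preference order; each channel keeps its best `quota` proposers."""
    matched = {ch: [] for ch in channel_rank}
    next_choice = {u: 0 for u in user_prefs}
    free = list(user_prefs)
    while free:
        user = free.pop()
        if next_choice[user] >= len(user_prefs[user]):
            continue  # user has exhausted its list and stays unmatched
        ch = user_prefs[user][next_choice[user]]
        next_choice[user] += 1
        matched[ch].append(user)
        matched[ch].sort(key=channel_rank[ch].index)  # best-ranked users first
        if len(matched[ch]) > quota:
            free.append(matched[ch].pop())            # reject the least preferred user
    return matched

# Hypothetical preferences: every user prefers channel c1 over c2.
user_prefs = {"u1": ["c1", "c2"], "u2": ["c1", "c2"], "u3": ["c1", "c2"]}
channel_rank = {"c1": ["u3", "u1", "u2"], "c2": ["u1", "u2", "u3"]}
assignment = deferred_acceptance(user_prefs, channel_rank, quota=2)
```

The same routine could be run a second time with spreading factors in place of channels to mirror the double matching described above.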
-
Measurement and analysis of public recursive DNS service systems
- LI Zhong, DING Wei, MO Song-yuan
-
2023, 45(06):
1003-1010.
doi:
-
Abstract
-
Currently, research on the domain name system mainly focuses on domain name classification, domain name security, domain name server performance, and other aspects, while research on the relationships between domain name servers is relatively rare. Through a measurement experiment lasting 31 days on over 8 million DNS servers, basic information on 250 million recursive query events, including the relationships between over 1.3 million recursive servers, was obtained. Statistical analysis of recursive query events was carried out based on capacity, response specification level, and other measurements. Statistical analysis was also carried out on servers that support recursive queries using two classification methods: function and response status. A server stability measurement is proposed to describe the stability of server responses. The relationships between DNS servers are displayed using an undirected graph. Finally, attempts are made to explain and analyze unreasonable phenomena that appeared in the statistical results of server classification.
-
Double-Bagging based feature dimension reduction heterogenous integrated intrusion detection
-
2023, 45(06):
1011-1019.
doi:
-
Abstract
-
Intrusion detection is a challenging and important task in the field of network security. A single classifier may introduce classification bias, whereas ensemble learning offers stronger generalization ability and higher accuracy. Although ensemble algorithms have good classification performance, adjusting the weights between the base classifiers requires a lot of time. To address this issue, a heterogeneous ensemble intrusion detection model with feature dimension reduction (Double-Bagging) is proposed, which combines Bagging-based feature dimension reduction with a Bagging-based heterogeneous ensemble classification algorithm. The algorithm integrates five feature selection algorithms and adopts a Bagging voting mechanism to select the optimal feature subset, in order to achieve efficient and accurate feature dimensionality reduction. At the same time, the pairwise diversity measure from ensemble learning is introduced to choose the optimal heterogeneous ensemble among different combinations of base classifiers. For the weighting function, accuracy and AOC value are used as weights to integrate the classifiers. Experiments show that the model's accuracy reaches 99.94%, and the system error rate and positive judgment rate reach 0.03% and 99.55%, respectively, which is superior to existing mainstream intrusion detection algorithms.
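The voting step over several feature selectors can be sketched in a few lines. This is a minimal illustration under assumptions: the five selections below are made-up outputs rather than results from real selectors, the feature names are hypothetical, and the majority threshold `min_votes=3` is a choice the abstract does not specify.

```python
from collections import Counter

def vote_features(selections, min_votes):
    """Keep features chosen by at least `min_votes` of the feature selectors."""
    votes = Counter(f for chosen in selections for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)

# Hypothetical outputs of five feature selection algorithms.
selections = [
    ["duration", "src_bytes", "flag"],
    ["src_bytes", "dst_bytes", "flag"],
    ["src_bytes", "flag", "count"],
    ["duration", "src_bytes", "service"],
    ["flag", "src_bytes", "dst_bytes"],
]
subset = vote_features(selections, min_votes=3)
```

Only features backed by a majority of selectors survive, which shrinks the feature space before the ensemble of classifiers is trained.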
-
Research on a step-by-step adversarial defense method for image recognition
- XU Ru-zhi, WANG Shuo, LONG Yan, ZONG Qi-zhuo
-
2023, 45(06):
1020-1029.
doi:
-
Abstract
-
With the continuous development of deep learning technology, its application in the field of image recognition has achieved great breakthroughs. However, the existence of adversarial samples seriously threatens the security of the model itself, so studying effective adversarial defense methods and improving model robustness is of profound practical significance. Based on the game between quickly generating adversarial samples and maintaining the similarity of sample prediction results, a step-by-step adversarial defense method is proposed. The method first performs random data enhancement on the common samples to improve sample diversity. Second, it generates difference adversarial samples and similarity adversarial samples, so as to improve the variety and quality of the adversarial samples used in adversarial training. Third, the loss function is redefined for adversarial training. Experimental verification shows that the algorithm has better mobility and robustness in the face of multiple adversarial attacks.
-
An image tampering detection model based on improved Faster R-CNN
- TIAN Xiu-xia, LIU Zheng, LIU Qiu-xu, LI Hao-ran
-
2023, 45(06):
1030-1039.
doi:
-
Abstract
-
With the development of artificial intelligence, digital images have been widely used in various fields. However, with the spread of image editing software, a large number of images have been maliciously tampered with, which seriously affects the authenticity of image content. Unlike general object detection, image tampering detection needs to pay more attention to the tamper information of the image itself, which is often only weakly manifested. The detection model therefore needs to focus on learning richer tamper features. This paper proposes a dual-stream Faster R-CNN model that combines gradient edge information with an attention mechanism and can detect and locate regions with different tampering types. The first stream is the color stream, which uses the attention mechanism to extract surface features of the image, such as brightness contrast and visual differences at tampered boundaries. The second stream is the gradient stream, in which a gradient high-pass filter enhances the anomalous edge features between the real area and the tampered area, making it easier for the model to find faint tampering traces. Finally, the features of the color stream and the gradient stream are fused by compact bilinear pooling. Because publicly available image tampering data sets are relatively small, Pascal VOC 2012 is used to create an image tampering detection data set containing 10,010 images for model pre-training. Experimental results on the COVER, Columbia, and CASIA data sets show that the proposed model improves detection accuracy by 7.1% to 9.6% compared with the latest models, and exhibits higher robustness under JPEG compression and image blur attacks.
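How a gradient filter exposes the seam between an original and a pasted region can be shown on a tiny grayscale grid. The abstract does not give the exact filter, so the sketch assumes simple central-difference kernels; `gradient_magnitude` and the 6 x 6 test images are hypothetical.

```python
def gradient_magnitude(image):
    """Return |gx| + |gy| from central differences for interior pixels (borders stay 0)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = image[r][c + 1] - image[r][c - 1]  # horizontal intensity change
            gy = image[r + 1][c] - image[r - 1][c]  # vertical intensity change
            out[r][c] = abs(gx) + abs(gy)
    return out

# A uniform image, and one whose right half was "pasted in" at another brightness.
flat = [[5.0] * 6 for _ in range(6)]
spliced = [[5.0] * 3 + [9.0] * 3 for _ in range(6)]
edges = gradient_magnitude(spliced)
```

The response is zero inside both smooth regions and non-zero only along the splice boundary, which is the kind of edge cue a gradient stream can feed to the detector.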
-
A survey of pedestrian trajectory prediction based on graph neural network
- CAO Jian, CHEN Yi-mei, LI Hai-sheng, CAI Qiang,
-
2023, 45(06):
1040-1053.
doi:
-
Abstract
-
With the rapid development of the technology of computer vision and autonomous driving, the ability to sense, understand and predict human behavior is becoming more and more important. The popularity of various sensors has generated a large amount of position data of moving objects in society. Predicting the movement trajectory of pedestrians based on these data has great value in social prediction and other fields. To gain insight into the development in this area, a literature review is conducted on graph neural network-based pedestrian trajectory prediction methods. The graph neural network algorithms for pedestrian trajectory prediction are compared, analyzed and summarized from multiple perspectives, and the research and development of different algorithms in this field are discussed. The comparison and analysis are carried out on the current public data sets, an overview of the corresponding performance indicators is provided, and the performance comparison results of different algorithms are given. At the same time, this paper puts forward the research problems that still exist and looks forward to the possible research directions in the future.
-
YOLOv5s algorithm optimization based on multi-scale feature extraction
- LI Xiao-lin, WANG Fu-gang, ZHANG Peng-fei, ZHANG Lin-yu,
-
2023, 45(06):
1054-1062.
doi:
-
Abstract
-
Object detection algorithms are widely used in unmanned driving, robot vision, industrial automation and other fields, and have important research value. Among the many object detection algorithms, YOLOv5s has the advantages of fast detection speed and a small parameter scale, but suffers from low detection accuracy. To address the weak feature extraction capability and feature redundancy of the standard convolution module in YOLOv5s, two convolution modules based on multi-scale feature extraction are proposed. First, a multi-receptive field convolution module is proposed to improve the feature extraction ability of the model. It obtains semantic information of different granularities through convolution kernels of multiple sizes. Second, a feature map convolution module is proposed to improve the diversity of feature maps. It uses a small number of standard convolution kernels and grouped convolutions to reduce the mutual constraints between feature channels. Finally, some standard convolution modules of YOLOv5s are replaced by the multi-receptive field convolution module and the feature map convolution module to obtain the improved algorithm. Experimental results on the Pascal VOC data set show that the improved algorithm not only improves detection accuracy but also maintains the real-time detection ability of YOLOv5s. mAP_0.5 and mAP_0.5:0.95 increase by 2.4% and 4.9%, respectively, which demonstrates the effectiveness of the improved algorithm. Further verification on the DOTA data set shows that the improved algorithm generalizes well to different environments.
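Why multiple kernel sizes can be combined with grouped convolution without inflating the model can be seen from the weight-count formula of a convolution, (C_in / groups) x C_out x k x k. The channel counts, kernel sizes, and group number below are hypothetical and are not the configuration used in the paper.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Weight count of a 2D convolution layer without bias."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * kernel * kernel

# One standard 3x3 convolution mapping 128 channels to 128 channels.
standard = conv_params(128, 128, 3)

# Hypothetical multi-branch block: four grouped branches with different kernel
# sizes, each producing 32 channels, concatenated back to 128 channels.
multi_branch = sum(conv_params(128, 32, k, groups=4) for k in (1, 3, 5, 7))
```

Under these assumed settings the four-branch block covers receptive fields from 1x1 to 7x7 with fewer weights than the single standard 3x3 layer.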
-
A small target detection algorithm based on improved YOLOv5 in aerial image
- YANG Hui-jian, MENG Liang
-
2023, 45(06):
1063-1070.
doi:
-
Abstract
-
At present, target detection based on UAV aerial photography is widely used in military and civil fields, but detection accuracy is low because of the long imaging distance, the blur of images taken at high altitudes, and the small proportion of target information. To solve this problem, an improved algorithm based on YOLOv5 is proposed. First, fog is added to the original images to improve the robustness of the model on foggy days. Second, CBAM modules are integrated to emphasize the important channels and spatial locations. Third, the SPP in the original algorithm is replaced by ASPP to reduce the influence of the pooling operation on feature information. Finally, a detection head is added to the FPN structure to detect targets at a finer granularity. With YOLOv5s as the baseline, experiments show that the improved algorithm increases mAP_0.5 by 6.9% over the original algorithm and can be effectively applied to the detection of small targets in aerial photography.
-
Facial expression recognition based on improved MobileNetV2
- YAN Chun-man, ZHANG Xiang, WANG Qing-peng
-
2023, 45(06):
1071-1078.
doi:
-
Abstract
-
Existing deep convolutional neural networks have a large number of parameters, which limits the scenarios in which facial expression recognition can be deployed. To address this problem, this paper proposes a facial expression recognition model based on an improved lightweight convolutional neural network. The model takes the MobileNetV2 lightweight feature extraction network as its main framework and reduces the number of network parameters and the amount of computation by compressing the network width factor and the global dimension. The SandGlass block is introduced to improve the inverted residual module of this network and reduce the loss of feature information during network transmission. At the same time, an efficient channel attention mechanism is embedded to improve the network's ability to extract feature information. Experiments were carried out on the facial expression data sets FER2013 and CK+. The recognition accuracy of the proposed network reaches 68.96% and 95.96%, which is 1.06% and 6.14% higher than that of MobileNetV2, respectively, and the number of parameters is decreased by 82.28%. The experimental results verify the effectiveness of the improved network model.
Artificial Intelligence and Data Mining
-
App usage prediction with session-based embedding
- YU Ze-peng, AN Ye-teng, ZHANG Shuo, YANG Zi-xing, LU Ji-xiang, CAO Rong-rong, CHEN Yi-zhou, LI Wen-zhong, LU Sang-lu
-
2023, 45(06):
1079-1086.
doi:
-
Abstract
-
Nowadays, smartphone users install dozens or even hundreds of Apps on their phones. Predicting App usage not only helps the mobile phone system speed up App launching but also reduces the time users spend searching for Apps. This paper focuses on a novel session-based App usage prediction problem, which aims to predict the sequence of Apps to be used in a period. A session-based embedding framework called SEM is proposed to solve the problem. To handle the varying length of App sessions and the heterogeneity of session semantics, a session embedding method is proposed to form a uniform feature representation, which alleviates the problem of user sparsity and yields vector representations of sessions. Based on the session embedding, a two-layer GRU-based recurrent neural network model is trained for App usage session prediction. Extensive experiments on real datasets show that the proposed framework outperforms conventional App recommendation approaches.
-
Multi-modal false information detection via multi-layer CNN-based feature fusion and multi-classifier hybrid prediction
- LIANG Yi, Turdi Tohti, Askar Hamdulla,
-
2023, 45(06):
1087-1096.
doi:
-
Abstract
-
Existing multi-modal false information detection methods rarely fuse multi-modal features at the feature level and ignore the effect of late fusion of multi-modal features. To address this problem, a false information detection method based on multi-layer CNN multi-modal feature fusion and multi-classifier hybrid prediction is proposed. This method applies a multi-layer CNN to multi-modal feature fusion for the first time. The model first uses BERT and Swin Transformer to extract text and image features, and then uses a multi-layer CNN to fuse the multi-modal features at the feature level. Modal features are also fused at the sentence level. Finally, the two fused features are fed into different classifiers to obtain two probability distributions, which are added proportionally to obtain the final prediction result. Compared with the attention-based multi-modal factorization bilinear model (AMFB), the accuracy of this model is improved by 6.1% and 4.3% on the Weibo and Twitter datasets, respectively. The experimental results show that the proposed model can effectively improve the accuracy of false information detection.
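The final step, adding two probability distributions proportionally, can be sketched as a convex combination. The weight `alpha`, the example probabilities, and the class ordering are assumptions for illustration only; the abstract does not state the proportion used.

```python
def fuse(p_a, p_b, alpha):
    """Weighted sum of two class-probability distributions; alpha weights the first."""
    if len(p_a) != len(p_b):
        raise ValueError("distributions must cover the same classes")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p_a, p_b)]

# Hypothetical outputs of the two classifiers; class 0 = false, class 1 = real.
feature_level = [0.40, 0.60]
sentence_level = [0.80, 0.20]
fused = fuse(feature_level, sentence_level, alpha=0.5)
label = max(range(len(fused)), key=fused.__getitem__)
```

Because the weights sum to one, the fused vector is still a valid probability distribution, and the predicted class is simply its largest entry.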
-
A sound event localization and detection algorithm based on feature fusion and Transformer model
- PU Zi-jun, ZHANG Shou-ming
-
2023, 45(06):
1097-1105.
doi:
-
Abstract
-
To address the problem of multi-channel environmental sound detection, a feature fusion network model, TBCF-MTNN, is proposed, which introduces the Transformer structure. The network takes the logarithmic Mel-spectrum and the generalized cross-correlation spectrum as input. First, the local features of the spectra and the temporal context features are obtained through CNN and GRU, and the two feature maps are then merged through the Cross-stitch module, which effectively solves the traditional problem that multi-feature information cannot be shared within the network. Second, the fused feature map is sent to the Transformer to re-extract features. Finally, the classification and localization results are output through the fully connected layer. Verification on the TAU-NIGENS 2020 data set shows that, compared with the Baseline model, the TBCF-MTNN network reduces the classification error rate to 0.26 in the sound detection task and reduces the localization error to 4.7° in the sound source localization task. Compared with Baseline, FPN, EIN and other models, the proposed model achieves better recognition performance.
-
A multi-scale community search method based on spectral wavelet
- YAN Cai-rui, MA Hui-fang, LI Qing-qing
-
2023, 45(06):
1106-1115.
doi:
-
Abstract
-
As a network analysis task that can capture users' personalized information, community search aims to mine the community of the query nodes that satisfies a cohesion requirement. Most existing community search methods can only locate a single-scale community containing the query nodes. A Multi-Scale Community Search method based on Spectral Wavelet (MSCS_SW) is proposed, which can mine the multi-scale communities of query nodes by using spectral wavelets and local modularity. Specifically, the modularity matrix and the Laplacian are first constructed and decomposed to obtain the relevant eigenvectors. Second, based on spectral theory and the graph wavelet, a scale-dependent local modularity is designed. Third, based on the normalized Laplacian matrix and the feature space of local modularity, a linear programming problem is designed to solve for the sparse indicator vectors related to the query at a given scale. Finally, a community boundary truncation strategy is used to add nodes so as to maximize the local modularity. Experimental results on synthetic and real-world network datasets demonstrate the efficiency and effectiveness of the proposed method.
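As background for the modularity-based steps, the sketch below computes the standard Newman modularity of a partition, Q = sum over communities of [L_c / m - (d_c / (2m))^2], where L_c is the number of internal edges, d_c the total degree of the community, and m the number of edges. It is the global, single-scale quantity, not the scale-dependent local modularity designed in the paper; the six-node graph and the function name `modularity` are hypothetical.

```python
def modularity(edges, community_of):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}    # community -> number of edges with both ends inside it
    degree_sum = {}  # community -> sum of degrees of its nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_communities = modularity(edges, {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"})
one_community = modularity(edges, {n: "all" for n in range(6)})
```

Splitting the graph at the bridge scores higher than lumping all nodes together, which is the signal a community search method exploits when deciding where a community boundary lies.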
-
Medical text classification based on neural network
- XU Lang, LI Dai-wei, ZHANG Hai-qing, TANG Dan, HE Lei, YU Xi
-
2023, 45(06):
1116-1122.
doi:
-
Abstract
-
Traditional medical text classification methods ignore the context of the text: each word is treated independently and cannot represent semantic information. As a result, text description and classification performance are poor, and feature engineering requires manual intervention, so the generalization ability is weak. To address the low efficiency and low accuracy of medical text classification, this paper proposes a medical text classification model, CMNN, based on bidirectional encoder representations from Transformers (BERT), a convolutional neural network (CNN), and a bidirectional long short-term memory (BiLSTM) neural network. The model uses BERT to train word vectors and combines CNN and BiLSTM to capture local latent features and contextual information. Finally, the proposed model is compared with the traditional deep learning models TextCNN and TextRNN in terms of accuracy, precision, recall and F1 score. The experimental results show that the CMNN model outperforms the other models on all evaluation metrics, and the accuracy is improved by 1.69%~5.91%.
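The four evaluation metrics named above follow directly from the confusion counts. The sketch below is a binary simplification with made-up labels; a multi-class medical text task would average these per class, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Made-up labels for eight documents.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_metrics(y_true, y_pred)
```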
-
An adaptive mutation butterfly optimization algorithm
- HUANG Xue-yu, LUO Hua
-
2023, 45(06):
1123-1133.
doi:
-
Abstract
-
To address the problems of the basic butterfly optimization algorithm, such as slow convergence, low solution accuracy and a tendency to fall into local optima, an adaptive mutation butterfly optimization algorithm is proposed. First, improved tent map barycenter reverse learning is applied to the population to obtain a better initial solution. Second, a nonlinear inertia weight is introduced in the location update to balance the global and local search capabilities of the algorithm. Finally, the variance of population fitness and the size of the current optimal solution determine whether to carry out Gaussian mutation quadratic optimization on the current optimal solution and the worst solution, in order to enhance the ability of the algorithm to jump out of local optima. Multi-dimensional simulation results on 12 benchmark functions show that the proposed algorithm is clearly better than the comparison algorithms in convergence speed, solution accuracy and optimization stability.
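The initialization idea, a chaotic sequence followed by reverse learning about the population barycenter, can be sketched as follows. The abstract does not give the formulas, so everything here is an assumption: a skew tent map with breakpoint `a=0.7` stands in for the improved tent map, reflection through the centroid stands in for barycenter reverse learning, and `sphere` is a stand-in benchmark function.

```python
def tent_sequence(x0, n, a=0.7):
    """Skew tent map iterates: x -> x / a if x < a, else (1 - x) / (1 - a)."""
    xs, x = [], x0
    for _ in range(n):
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        xs.append(x)
    return xs

def sphere(v):
    """Benchmark function with its minimum 0 at the origin."""
    return sum(c * c for c in v)

def init_population(size, dim, lower, upper, fitness, x0=0.37):
    """Chaotic initialization plus centroid reflection; keep the best `size` individuals."""
    chaos = tent_sequence(x0, size * dim)
    pop = [[lower + (upper - lower) * chaos[i * dim + j] for j in range(dim)]
           for i in range(size)]
    centroid = [sum(ind[j] for ind in pop) / size for j in range(dim)]
    # Reflect each individual through the centroid and clamp it to the bounds.
    mirrored = [[min(upper, max(lower, 2 * centroid[j] - ind[j])) for j in range(dim)]
                for ind in pop]
    return sorted(pop + mirrored, key=fitness)[:size]

population = init_population(10, 2, -5.0, 5.0, sphere)
```

Selecting from the union of the chaotic points and their reflections can never give a worse best individual than the chaotic points alone, which is the motivation for pairing the two steps at initialization.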