High Performance Computing
-
A machine learning-based fast calculation method of multi-voltage, multi-temperature and multi-parameter standard cell delay
- ZHAO Zhen-yu, YANG Tian-hao, JIANG Wen-cheng, ZHANG Shu-zheng
-
2023, 45(08): 1331-1338.
-
The standard cell library is the foundation of chip design, analysis, and verification, and its generation requires substantial time and server resources. Therefore, vendors often provide standard cell libraries for only a few corners. However, the design of chip performance, power consumption, and reliability requires delay information for standard cells under multiple voltages, temperatures, and parameters (such as drive strength, channel length, and threshold voltage). To quickly and accurately calculate the delay of standard cells under multiple corners, this paper proposes a machine learning-based method for multi-voltage, multi-temperature, and multi-parameter standard cell delay calculation. By studying in depth the factors that affect standard cell delay, data sets are extracted from a 28 nm process standard cell library and timing reports. Machine learning algorithms are used to train and calibrate the standard cell delay calculation model. Building the model takes only a few minutes, far less than the time consumed by simulation methods (usually hundreds of hours). The average calculation error of the model is 1.542 ps for cell delay at unknown voltages, 1.814 ps at unknown temperatures, and 2.202 ps under different parameters. The prediction error of cell delay in static timing analysis is less than 3%. This method can quickly and accurately calculate standard cell delay in real time and can be applied to fast timing analysis under multiple scenarios before sign-off.
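The abstract does not specify the learning model; purely as an illustration of the general approach (regressing cell delay on corner and cell parameters), here is a minimal, hypothetical sketch using scikit-learn. The feature set and synthetic data are assumptions, not the paper's 28 nm library or its calibrated model.

```python
# Hypothetical sketch: learn standard-cell delay as a function of corner/cell
# parameters (voltage, temperature, drive strength, channel length, Vth type).
# The data below is synthetic and only makes the example runnable.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.uniform(0.6, 1.0, n),    # supply voltage (V)
    rng.uniform(-40, 125, n),    # temperature (degC)
    rng.integers(1, 9, n),       # drive strength (X1..X8)
    rng.uniform(28, 40, n),      # channel length (nm)
    rng.integers(0, 3, n),       # threshold-voltage flavor (e.g. LVT/SVT/HVT)
])
# Toy delay model in picoseconds, standing in for characterized library data.
y = 20 / X[:, 0] + 0.02 * X[:, 1] + 5 / X[:, 2] + 0.3 * X[:, 3] + 2 * X[:, 4]
y += rng.normal(0, 0.5, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
err = np.abs(model.predict(X_te) - y_te).mean()
print(f"mean absolute delay error: {err:.3f} ps")
```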
-
Design and implementation of an efficient transmission protocol for Chiplet interconnection
- XIONG Guo-jie, ZHANG Jin-ming, HE Guang-hui
-
2023, 45(08): 1339-1346.
-
Efficient, high-bandwidth, and high-reliability transmission protocols are crucial for Chiplet heterogeneous integration technology. Therefore, a parallel transmission interface protocol for Chiplet interconnection is proposed. A new hierarchical architecture is adopted to improve the flexibility and compatibility of the protocol. The fault tolerance to physical link failures is improved by using redundant channels based on the multi-path selection chain, and cyclic redundancy check is implemented in hardware to enhance the transmission reliability of the protocol. To verify the proposed transmission protocol, the protocol transmission path is implemented on two VC709 FPGAs. The experimental results show that compared with PCIe, the protocol has the advantages of high bandwidth, small interface area, and high reliability.
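The protocol's hardware CRC is not detailed in the abstract; purely as background on the error-detection idea, the sketch below appends and checks a CRC-32 over a flit payload in Python. The CRC-32 polynomial and framing are illustrative assumptions, not the protocol's actual parameters.

```python
# Illustrative CRC check for a transmitted payload; the real protocol may use
# a different polynomial, width, and frame layout.
import zlib

def append_crc(payload: bytes) -> bytes:
    """Sender side: append a 4-byte CRC to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def check_crc(frame: bytes) -> bool:
    """Receiver side: recompute the CRC and compare with the trailing bytes."""
    payload, received = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == received

frame = append_crc(b"chiplet flit payload")
assert check_crc(frame)                      # clean transmission passes
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]
assert not check_crc(corrupted)              # a single bit flip is detected
```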
-
A hybrid ISA processor compatible with RISC-V at application level
- SUN Cai-xia, SUI Bing-cai, DENG Quan, ZHENG Zhong, NI Xiao-qiang, WANG Yong-wen
-
2023, 45(08): 1347-1353.
-
Changes in the instruction set architecture result in changes to the processor hardware platform, and binary applications compiled for the old hardware platform can no longer run on the new one. In this paper, a hybrid instruction set architecture compatible with multiple instruction sets at the application level is proposed. A processor based on the hybrid instruction set architecture can natively run applications from multiple instruction sets, which effectively avoids both the repetitive work of program development and porting and the performance loss of binary translation. Based on a self-developed processor, a hybrid instruction set processor compatible with RISC-V at the application level is implemented. Compared with a single instruction set, the hardware overhead of supporting two instruction sets at the application level increases by only 0.45%. The FPGA prototype system successfully boots the operating system ported to the hybrid instruction set architecture and correctly runs applications based on each instruction set, which verifies the feasibility of the hybrid instruction set architecture. Under the RISC-V instruction set, the processor achieves 5.58/MHz for CoreMark, 8.44/GHz for SPECint2006, and 10.75/GHz for SPECfp2006.
-
Missing value filling for multi-variable urban air quality data based on attention mechanism
- MA Si-yuan, JIAO Jia-hui, REN Sheng-qi, SONG Wei
-
2023, 45(08): 1354-1364.
-
Air pollution seriously affects human health and sustainable social development. However, the multi-variable air quality data obtained by sensors often have missing values, which makes data analysis and processing difficult. Currently, many methods that analyze changes in a given air component rely only on the temporal and spatial data of that attribute, ignoring the influence of other air components on its trend within the same time interval. In addition, it is difficult to achieve ideal results when filling discrete missing data. This paper proposes a deep learning-based Time Attention Model (TAM), which uses an attention mechanism to focus on the correlation between different timestamps and the correlation between different feature time series, and combines short-term historical data to fill missing values in multi-variable air quality data. The proposed model is evaluated on air quality data from Beijing, and the experimental results show that TAM has advantages over ten baseline models.
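TAM's exact architecture is not given in the abstract; the numpy sketch below only illustrates the underlying idea of scaled dot-product attention over timestamps, weighting recent multi-variable observations when estimating a missing record. All shapes, projection matrices, and the weighting scheme are assumptions.

```python
# Minimal sketch: use scaled dot-product attention over a short history window
# to impute a missing reading from multi-variable observations.
import numpy as np

def attention_impute(history, query, w_q, w_k, w_v):
    """history: (T, F) past multi-variable readings; query: (F,) current
    partially observed record (missing entries pre-filled with the mean).
    Returns an attention-weighted estimate of the current record."""
    q = query @ w_q                       # (d,)
    k = history @ w_k                     # (T, d)
    v = history @ w_v                     # (T, F)
    scores = k @ q / np.sqrt(q.shape[0])  # similarity of each timestamp to now
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()              # softmax over timestamps
    return weights @ v                    # weighted combination of history

rng = np.random.default_rng(1)
T, F, d = 24, 6, 8                        # 24 timestamps, 6 air-quality variables
history = rng.normal(size=(T, F))
query = rng.normal(size=F)
w_q, w_k, w_v = rng.normal(size=(F, d)), rng.normal(size=(F, d)), np.eye(F)
estimate = attention_impute(history, query, w_q, w_k, w_v)
print(estimate.round(3))                  # candidate values to fill the gaps
```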
-
A surrogate model-based assertion coverage improvement technique
- SHI Ming-chuan, LONG Qiao-zhou, ZOU Hong-ji, LI Tun
-
2023, 45(08): 1365-1375.
-
As the scale of integrated circuit designs continues to grow, verification has become one of the bottlenecks in the design process. Simulation is still one of the dominant methods for integrated circuit design verification, and its completeness is usually measured by various coverage metrics. Functional coverage is a higher-level form of coverage, and in practical engineering, functions are often expressed as SystemVerilog assertions. It is difficult to generate a large number of test vectors that activate assertions using common random test vector generation methods. When constraint-solving strategies are used, if the coverage condition involves non-initial input signals (internal or output signals), constraint solving becomes extremely inefficient, and the target assertion remains difficult to cover. To address the coverage problem for assertions containing non-initial input signals, this paper proposes a surrogate model-based assertion coverage improvement method: it builds a surrogate model that reflects the relationship between non-initial input signals and initial input signals while referring only to initial input signals, and then uses this surrogate model as the object of constraint solving, thus reducing the complexity of constraint solving. Experimental results show that this method significantly improves assertion coverage compared with random test vector generation.
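As a toy illustration of the surrogate idea (not the paper's actual flow), the sketch below trains a decision tree that maps initial (primary) inputs to an internal-signal value observed in "simulation", then uses the tree to propose input vectors likely to activate an assertion on that internal signal. The stand-in design function, sample counts, and classifier choice are all assumptions.

```python
# Toy surrogate-model sketch: the "design" below is a stand-in combinational
# function; in practice the input/internal-signal pairs would come from RTL
# simulation traces.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)

def internal_signal(inputs):
    """Hypothetical internal signal driven by primary inputs a..d."""
    a, b, c, d = inputs.T
    return ((a & b) | (c & ~d & 1)).astype(int)

# 1. Collect simulation data: random primary-input vectors and the internal
#    signal they produce.
X = rng.integers(0, 2, size=(512, 4))
y = internal_signal(X)

# 2. Train a surrogate that only refers to primary inputs.
surrogate = DecisionTreeClassifier(random_state=0).fit(X, y)

# 3. Use the surrogate instead of the design when searching for stimuli that
#    drive the internal signal high (i.e. that could activate the assertion).
candidates = rng.integers(0, 2, size=(64, 4))
proposed = candidates[surrogate.predict(candidates) == 1]
hit_rate = internal_signal(proposed).mean() if len(proposed) else 0.0
print(f"{len(proposed)} proposed vectors, {hit_rate:.0%} activate the condition")
```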
-
A hybrid-hardening soft error tolerant NoC router
- GAO Wen-cai, CHEN Xiao-wen
-
2023, 45(08): 1376-1382.
-
The Network-on-Chip (NoC) has become the standard paradigm for interconnection networks in multi-core processors. However, as supply voltages gradually decrease and feature sizes shrink, the probability of soft errors in the NoC increases. Error correction codes are commonly used in NoC router designs to tolerate soft errors. However, traditional router designs often use only Hamming codes for error correction, which, despite their simple structure, provide insufficient error correction capability. This paper proposes a hybrid-hardening NoC router design based on error correction codes. The core idea is to adopt different fault-tolerant codes according to the importance of the information bits, thus achieving a balance between router reliability and fault-tolerance overhead. Experimental results show that the design improves system reliability compared with the baseline design under synthetic traffic and the PARSEC benchmarks, and hardware synthesis results show that it also shortens the critical path delay by 4%.
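The paper's hybrid scheme is not reproduced here; as background on the Hamming-code baseline it compares against, this is a small Hamming(7,4) single-error-correcting encode/decode sketch.

```python
# Hamming(7,4): 4 data bits -> 7-bit codeword with 3 parity bits; any single
# bit flip can be located and corrected. Bit positions follow the classic
# layout p1 p2 d1 p3 d2 d3 d4 (1-indexed).
def hamming74_encode(d):
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(c):
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3          # 0 means no error detected
    if syndrome:
        c[syndrome - 1] ^= 1                  # flip the faulty bit
    return [c[2], c[4], c[5], c[6]]           # recovered data bits

data = [1, 0, 1, 1]
code = hamming74_encode(data)
code[5] ^= 1                                  # inject a single soft error
assert hamming74_correct(code) == data        # the error is corrected
```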
-
A massive subway passenger trajectory similarity connection method: A case study of Shenzhen metro
- WANG Xing-su, XIONG Wen, ZHANG Rui
-
2023, 45(08): 1383-1392.
-
The current mainstream trajectory similarity connection methods are based on GPS trajectories, and optimization methods for GPS trajectories cannot be directly applied to the problem of connecting subway passenger trajectories. By fully exploiting the spatiotemporal characteristics of subway passenger trajectories and leveraging the trajectory's repetitiveness and symmetry, each trajectory is transformed from a point sequence into an OD sequence to reduce the trajectory length and save storage space. This paper focuses on the design and implementation of a trajectory connection algorithm based on PPJoin+ on the Spark platform. The method is validated on a 13-node Spark cluster and a large-scale dataset containing 5 million passenger trajectories (560 million tap records collected over two consecutive months). The experimental results show that the PPJoin+ algorithm based on OD sequences takes only 14.0 minutes, saving 62.5% of the execution time compared with the default point-sequence trajectory connection algorithm and 78.2% compared with the Dima connection algorithm, and exhibits good scalability.
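As a simplified single-machine illustration of the OD-sequence idea (not the paper's Spark/PPJoin+ implementation), the sketch below turns tap records into a set of (origin, destination) trips per passenger and applies a PPJoin-style prefix filter before computing Jaccard similarity. The station names, threshold, and filtering details are assumptions.

```python
# Simplified sketch of set-similarity join on OD sequences.
# Each passenger's trips are reduced to a set of (origin, destination) pairs;
# a prefix filter prunes pairs that cannot reach the similarity threshold.
from itertools import combinations

def to_od_set(tap_records):
    """tap_records: list of (station, kind) with kind in {'in', 'out'},
    ordered by time. Pairs each entry with the following exit."""
    od, origin = set(), None
    for station, kind in tap_records:
        if kind == "in":
            origin = station
        elif kind == "out" and origin is not None:
            od.add((origin, station))
            origin = None
    return od

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def similar_pairs(passengers, threshold=0.6):
    """passengers: dict id -> OD set. Two sets can only reach the threshold
    if their canonically ordered prefixes share at least one element."""
    universe = sorted({od for s in passengers.values() for od in s})
    order = {od: i for i, od in enumerate(universe)}
    prefixes = {}
    for pid, s in passengers.items():
        tokens = sorted(s, key=order.get)
        plen = len(tokens) - int(threshold * len(tokens)) + 1
        prefixes[pid] = set(tokens[:plen])
    results = []
    for (p1, s1), (p2, s2) in combinations(passengers.items(), 2):
        if prefixes[p1] & prefixes[p2] and jaccard(s1, s2) >= threshold:
            results.append((p1, p2, jaccard(s1, s2)))
    return results

trips = {
    "A": to_od_set([("S1", "in"), ("S5", "out"), ("S5", "in"), ("S1", "out")]),
    "B": to_od_set([("S1", "in"), ("S5", "out"), ("S5", "in"), ("S1", "out")]),
    "C": to_od_set([("S2", "in"), ("S9", "out")]),
}
print(similar_pairs(trips))   # A and B share identical OD sets
```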
-
Trust problems in edge computing: Challenges and reviews
- XIA Ge-ming, YU Chao-dong, CHEN Jian
-
2023, 45(08): 1393-1404.
-
Edge computing is one of the key technologies of the future Internet. Because the edge network contains massive numbers of dynamic and heterogeneous edge devices, some of these devices may be malicious, and malicious edge devices can disrupt data processing and transmission within the edge network. With the increasing number of edge devices, edge computing faces serious trust challenges. Therefore, this paper conducts an in-depth investigation of trust issues in edge computing. Firstly, we review and classify current research from the perspectives of edge trust management and edge trust evaluation. Then, we discuss and analyze blockchain-based trust management systems and trust evaluation methods. Furthermore, we analyze and discuss path planning for trusted data collection, trust incentive mechanisms, collaborative trust evaluation, and applications of trust in edge computing. Finally, based on this review, we examine several open challenges of trust in edge computing and suggest directions worthy of future research.
Computer Network and Information Security
-
A semi-supervised log anomaly detection method based on attention mechanism
- YIN Chun-yong, FENG Meng-xue
-
2023, 45(08): 1405-1415.
-
Logs record important information about system operation, and log anomaly detection can quickly and accurately identify the causes of system failures. However, log sequences suffer from problems such as data instability and interdependence between data. Therefore, a new semi-supervised log sequence anomaly detection method is proposed. This method uses the Bidirectional Encoder Representations from Transformers (BERT) model and a multi-layer convolutional network to extract log information, capturing the contextual relevance between log sequences and the local relevance within log sequences. Finally, an attention-based Bi-GRU network is used for log sequence anomaly detection. The performance of the model is verified on three datasets. Compared with six benchmark models, the model achieves the best F1 score and the highest AUC (0.9813), and the experimental results show that it can effectively handle data instability and interdependence between data in log sequences.
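The final detection stage can be pictured with a generic attention-based bidirectional GRU classifier over a sequence of log-event embeddings; the BERT and convolutional feature extraction of the paper is omitted, and all dimensions below are illustrative assumptions.

```python
# Generic sketch of an attention-based Bi-GRU classifier for log sequences.
import torch
import torch.nn as nn

class AttnBiGRU(nn.Module):
    def __init__(self, emb_dim=128, hidden=64):
        super().__init__()
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)     # scores each time step
        self.head = nn.Linear(2 * hidden, 2)     # normal / anomalous

    def forward(self, x):                        # x: (batch, seq_len, emb_dim)
        h, _ = self.gru(x)                       # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)   # attention over time
        context = (weights * h).sum(dim=1)       # weighted sequence summary
        return self.head(context)                # class logits

model = AttnBiGRU()
logits = model(torch.randn(4, 20, 128))          # 4 sequences of 20 log events
print(logits.shape)                              # torch.Size([4, 2])
```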
-
A DGA domain name detection method based on Transformer and multi-feature fusion
- YU Zi-cheng, LING Jie
-
2023, 45(08): 1416-1423.
-
To address the high concealment of malicious domain names generated by domain generation algorithms (DGAs) and the low accuracy of existing methods in multi-class classification of malicious domain names, a DGA domain name detection method based on Transformer and multi-feature fusion is proposed. The method uses the Transformer encoder to capture global information about domain name characters and obtains long-distance contextual features at different granularities through a parallel deep convolutional neural network (DCNN). At the same time, BiLSTM and a self-attention mechanism are combined with a shallow CNN to obtain shallow spatiotemporal features. Finally, the long-distance contextual features and shallow spatiotemporal features are combined for domain name detection. The experimental results show that the proposed method performs better in malicious domain name detection. Compared with CNN, LSTM, L-PCAL, and SW-DRN, the proposed method improves accuracy by 1.72%, 1.10%, 0.75%, and 0.34% in the binary classification experiment and by 1.75%, 1.29%, 0.88%, and 0.83% in the multi-class classification experiment.
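A character-level Transformer encoder over domain strings, as used in the first branch, can be sketched as follows; the parallel DCNN/BiLSTM branches and the feature fusion are not reproduced, and the vocabulary, dimensions, and pooling are assumptions.

```python
# Illustrative character-level Transformer encoder for domain names.
import torch
import torch.nn as nn

CHARS = "abcdefghijklmnopqrstuvwxyz0123456789-."
VOCAB = {c: i + 1 for i, c in enumerate(CHARS)}          # 0 is padding

def encode(domain, max_len=40):
    ids = [VOCAB.get(c, 0) for c in domain.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))

class DomainEncoder(nn.Module):
    def __init__(self, d_model=64, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(len(VOCAB) + 1, d_model, padding_idx=0)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                 # x: (batch, max_len) of char ids
        h = self.encoder(self.emb(x))     # (batch, max_len, d_model)
        return self.head(h.mean(dim=1))   # mean-pool characters, then classify

batch = torch.tensor([encode("example.com"), encode("xj3k9qzt1m.biz")])
print(DomainEncoder()(batch).shape)       # torch.Size([2, 2])
```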
-
A spatial crowdsourcing privacy preservation task allocation algorithm for road network
- HOU Zhan-wei, LI Xin, WANG Hui, SHEN Zi-hao, LIU Kun, LIU Pei-qian
-
2023, 45(08): 1424-1432.
-
Privacy protection and task allocation are two core issues in spatial crowdsourcing. Most existing research, based on Euclidean space, uses geo-indistinguishability (GeoI) to protect location privacy while ignoring the underlying road network information, which leads to privacy leakage and utility loss for crowdsourcing workers. In order to protect workers' location privacy with little utility loss, a privacy-preserving batch task allocation algorithm framework for road networks is proposed. Firstly, the graph-exponential mechanism optimization problem is formulated, and a greedy algorithm is designed to find an approximately optimal solution; at the same time, an edge server is introduced as the privacy protection agent for workers. Then, the task allocation problem is transformed into a bipartite graph maximum flow problem weighted by workers' travel distance, and the optimal solution is obtained using the Kuhn-Munkres (KM) algorithm. Finally, experimental results show that the proposed algorithm significantly improves both privacy protection and utility.
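The KM assignment step can be illustrated with SciPy's Hungarian-algorithm solver, which addresses the same weighted bipartite matching problem; the travel distances below are synthetic stand-ins for distances computed from obfuscated worker locations.

```python
# Illustrative worker-task assignment: minimize total (perturbed) travel
# distance with the Hungarian algorithm, the same matching problem the KM
# algorithm solves in the paper's framework.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(3)
n_workers, n_tasks = 5, 4
# Distances as seen by the platform, i.e. computed from privacy-protected
# worker locations rather than true ones.
distance = rng.uniform(0.5, 5.0, size=(n_workers, n_tasks))

rows, cols = linear_sum_assignment(distance)     # minimum-cost matching
for w, t in zip(rows, cols):
    print(f"worker {w} -> task {t} ({distance[w, t]:.2f} km)")
print(f"total travel distance: {distance[rows, cols].sum():.2f} km")
```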
-
An Android malware detection method based on pre-trained language model
- YIN Jie, HUANG Xiao-yu, LIU Jia-yin, NIU Bo-wei, XIE Wen-wei,
-
2023, 45(08): 1433-1442.
-
In recent years, supervised machine learning-based Android malware detection methods have made progress. However, because malware samples are difficult to collect, labeled datasets are generally small, which limits the generalization ability of the trained supervised models. To address this problem, a malware detection method combining unsupervised and supervised learning is proposed. Firstly, a language model is pre-trained on a large number of unlabeled APK samples in an unsupervised manner to learn the rich and complex semantic relationships between different operators. Then, the pre-trained language model is fine-tuned on labeled malware samples to acquire malware detection capability. Experiments on datasets such as Drebin demonstrate that the proposed method has better generalization ability and detection performance than the baseline methods, achieving a maximum accuracy of 98.7%.
-
Research on bimodal SAXS image structure characterization technique
-
2023, 45(08): 1443-1452.
-
The continuous upgrading and development of small-angle X-ray scattering (SAXS) equipment generate ever more high-dimensional scattering data, which makes it challenging for researchers to obtain experimental results quickly. Researchers urgently need effective automated classification methods to speed up data characterization and achieve higher accuracy. However, many existing models mainly learn features of illumination images, ignoring the characteristics of scattering images, which results in lower classification accuracy. Therefore, based on the characteristics of scattering patterns, this paper proposes a bimodal fine-grained feature extraction model called BRTNet. The model adopts a bimodal input mode: the first branch is the feature learning network PRS, which uses a multi-scale convolution architecture to learn the micro-information of scattering images; the second branch is ConvTransformer, a multi-head attention mechanism fused with local information, which learns the correlation information of scattering sequences. The model then combines image and sequence information, fuses the dual-branch features, and classifies the scattering data. Experimental results on a biological solution scattering dataset show that the model's classification accuracy exceeds 89%, a significant advantage over the baseline models.
-
An image dehazing method based on multi-scale convolution with attention mechanism
- TANG Jian, CHE Wen-gang, GAO Sheng-xiang
-
2023, 45(08): 1453-1462.
-
Image dehazing is a challenging visual task. Previous image dehazing methods often depend too much on the physical model of fog-degraded images, while current image dehazing models based on convolutional neural networks tend to be complex. Therefore, a lightweight dehazing network, MADNet, that does not depend on a physical model is proposed. The network is mainly composed of multi-scale convolution modules with an attention mechanism. By viewing a foggy image as the combination of a clear image and a fog residue image, MADNet directly learns the fog residue between the target clear image and the input foggy image, and finally achieves end-to-end image dehazing. The experimental results show that the structural similarity and peak signal-to-noise ratio of the proposed method are better than those of the comparison methods on the SOTS and NH-HAZE datasets, and it also achieves better fog removal in real scenes.
-
An electronic component defect detection method based on lightweight YOLOX
- WU Dong-liang, LIU Zhi-gui,
-
2023, 45(08): 1463-1471.
-
Aiming at the large number of parameters and low detection efficiency of traditional target detection methods in defect detection of electronic components, this paper proposes a target detection method based on a lightweight YOLOX detection network. Firstly, the backbone network is lightened using depthwise separable convolution to reduce the number of parameters and improve detection speed. Secondly, a spatial pyramid-based channel attention module is constructed to filter and fuse features of different scales and enhance the feature weights of small-size defects. In the upsampling stage of feature fusion, efficient channel attention is added to improve detection accuracy with only a slight increase in parameters. Finally, the EIoU loss function is used in place of the IoU loss function, and the cosine annealing algorithm is used to help the model achieve the best detection effect. The model is tested on a self-made dataset of electronic component appearance defects, achieving an average detection accuracy of 98.96% with a detection time of approximately 0.09 seconds per image. Compared with the original model, the detection speed is doubled and the model size is reduced by about 60%. The model is also validated on the public PCB defect dataset, achieving fast detection of target defects.
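The EIoU loss mentioned above can be written out concretely; the sketch below follows the commonly cited EIoU formulation (IoU term plus enclosing-box-normalized center, width, and height terms) for a single box pair, as an assumption of the standard definition rather than the paper's exact implementation.

```python
# Sketch of the EIoU loss for axis-aligned boxes in (x1, y1, x2, y2) format.
def eiou_loss(box_a, box_b, eps=1e-7):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection over union
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / (union + eps)
    # smallest enclosing box
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    # squared center distance and width/height differences
    center = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
           + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    dw = ((ax2 - ax1) - (bx2 - bx1)) ** 2
    dh = ((ay2 - ay1) - (by2 - by1)) ** 2
    return (1 - iou
            + center / (cw ** 2 + ch ** 2 + eps)
            + dw / (cw ** 2 + eps)
            + dh / (ch ** 2 + eps))

print(round(eiou_loss((0, 0, 10, 10), (1, 1, 11, 12)), 4))
```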
Artificial Intelligence and Data Mining
-
Survey on graph convolutional neural network
- LIU Jun-qi, TU Wen-xuan, ZHU En
-
2023, 45(08): 1472-1481.
-
With the widespread existence of graph data, graph convolutional neural networks (GCNNs) are developing rapidly. According to how the convolution operator is defined, GCNNs can be roughly divided into two categories: those based on spectral methods and those based on spatial methods. Firstly, representative models of these two categories and the connections between them are discussed in detail, and graph pooling operations are comprehensively summarized. Furthermore, the extensive applications of GCNNs in various fields are introduced, and several possible development directions for GCNNs are proposed. Finally, conclusions are drawn.
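As a concrete reference point for the spectral-method family surveyed here, this is a minimal numpy sketch of the widely used GCN propagation rule H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W); it is background material, not a model proposed in the survey, and the small graph and feature sizes are arbitrary.

```python
# One graph-convolution layer (Kipf & Welling style) in numpy.
import numpy as np

def gcn_layer(A, H, W):
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))         # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

A = np.array([[0, 1, 0, 0],                        # a small 4-node graph
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
rng = np.random.default_rng(4)
H = rng.normal(size=(4, 3))                        # 3 input features per node
W = rng.normal(size=(3, 2))                        # project to 2 hidden features
print(gcn_layer(A, H, W))
```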
-
A spiking neuron noise-resistant learning algorithm with high and low thresholds
- YANG Jing, XU Yan, JIANG Ying
-
2023, 45(08): 1483-1489.
-
Dynamic-threshold learning algorithms for spiking neurons can change the threshold during training, which effectively improves the noise resistance of neurons. However, when combined with gradient-based learning algorithms, dynamic thresholds can reduce learning accuracy and easily cause neuron silence. To address this issue, an improved gradient-based noise-resistant learning algorithm with high and low thresholds is proposed. This algorithm uses high and low thresholds to avoid the loss of learning accuracy and uses virtual excitation pulses to continue the learning process when neurons are silent. At the same time, a dynamic learning rate is used to reduce the impact of the high and low thresholds on the learning cycle. The experimental results show that the algorithm significantly improves the noise resistance of neurons while ensuring learning accuracy and convergence speed, and it is well suited to spiking neuron learning algorithms based on gradient descent.
-
A Chinese event detection method based on nugget proposal network with part-of-speech attention mechanism
- HU Qing-meng, WANG Hong-bin, WANG Jun-zhong
-
2023, 45(08): 1490-1497.
-
Event detection mainly involves trigger word detection and event type recognition. At present, most deep learning-based models focus on semantic role information, syntactic dependency tree information, and pre-trained models, but ignore the importance of parts of speech. To address this problem, this paper proposes a Chinese event detection method based on a nugget proposal network with a part-of-speech attention mechanism. The method first obtains the part-of-speech sequence using an NLP part-of-speech tagging tool, then uses the CBOW algorithm to obtain part-of-speech embeddings, and finally uses these embeddings in the model to compute part-of-speech attention for event detection. Experiments on ACE2005 show that the F1 scores of the model with part-of-speech attention improve by 3.8% and 2.4%, respectively, on the event detection tasks, which demonstrates the effectiveness of the method.
-
Entity recognition of support policy text based on RoBERTa-wwm-BiLSTM-CRF
- YU Jin-ping, ZHU Wei-feng, LIAO Lie-fa
-
2023, 45(08): 1498-1507.
-
Support policies can help enterprises obtain government support in funding subsidies, tax reductions, and other aspects, helping enterprises develop better. To address the facts that entity boundaries in support policy texts are difficult to define and that traditional word vectors cannot handle polysemy, a named entity recognition model for support policy texts based on RoBERTa-wwm-BiLSTM-CRF is proposed. Firstly, the model uses the pre-trained language model RoBERTa-wwm to obtain dynamic word vectors that can represent the polysemy of words. Secondly, a BiLSTM network is used to further extract the contextual information and semantic features of support policy texts. Finally, the best prediction sequence is obtained through a conditional random field. The proposed model achieves an F1 score of 91.7% on a self-built support policy dataset of 5,512 sentences. The results show that the model can effectively recognize named entities in support policy texts, thereby improving the efficiency of enterprise policy screening.
-
A multi-strategy fused equilibrium optimization algorithm and its application
- LUO Shi-hang, HE Qing
-
2023, 45(08): 1508-1520.
-
A multi-strategy fused equilibrium optimization algorithm (MEO) is proposed to address the weak global exploration and local exploitation ability, low optimization accuracy, and tendency to fall into local optima of the equilibrium optimizer (EO) during optimization. Firstly, a highly disruptive polynomial mutation strategy is used to initialize the population, improving the quality of the initial solutions and laying a foundation for global optimization. Secondly, a differential mutation-based strategy for reconstructing the equilibrium pool is proposed to enrich population diversity during iteration and enhance the algorithm's ability to escape local optima; at the same time, an S-shaped transformation factor is used to balance the algorithm's global exploration and local exploitation. Finally, a dynamic spiral search strategy is introduced to expand the search range of the algorithm and improve its convergence accuracy and speed. Simulation experiments compare the MEO algorithm with the standard EO algorithm and other metaheuristic algorithms on eight benchmark test functions. The experimental results and Wilcoxon rank-sum tests both show that the proposed improvement strategies raise the optimization accuracy, global exploration and local exploitation abilities, and the ability to escape local optima of the EO algorithm. In addition, the MEO algorithm is applied to wireless sensor network (WSN) coverage optimization, and the experimental results show that it significantly improves WSN coverage, reduces node redundancy, and makes node distribution more uniform, demonstrating that the MEO algorithm can be applied to practical problems and has practical value.
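As a rough illustration of the WSN coverage objective that MEO is applied to, the sketch below computes a grid-based coverage rate for a candidate node layout, i.e. the kind of fitness function a metaheuristic such as EO/MEO would maximize. The region size, sensing radius, node count, and grid resolution are assumptions, not the paper's experimental settings.

```python
# Grid-based WSN coverage rate: the fraction of sample points in a square
# region that lie within sensing radius of at least one node. A candidate
# solution is a flat vector of node coordinates.
import numpy as np

def coverage_rate(solution, area=100.0, radius=10.0, grid=50):
    nodes = solution.reshape(-1, 2)                       # (n_nodes, 2)
    xs = np.linspace(0, area, grid)
    gx, gy = np.meshgrid(xs, xs)
    points = np.stack([gx.ravel(), gy.ravel()], axis=1)   # (grid*grid, 2)
    dists = np.linalg.norm(points[:, None, :] - nodes[None, :, :], axis=2)
    covered = (dists <= radius).any(axis=1)               # seen by any node?
    return covered.mean()

rng = np.random.default_rng(5)
random_layout = rng.uniform(0, 100, size=24 * 2)          # 24 sensor nodes
print(f"coverage rate of a random layout: {coverage_rate(random_layout):.2%}")
```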