High Performance Computing
-
MVSim: A fast, scalable and accurate architecture simulator for VLIW multi-core vector processors
- LIU Zhong, LI Cheng, TIAN Xi, LIU Sheng, DENG Rang-yu, QIAN Cheng-dong
-
2024, 46(02):
191-199.
-
This paper designs a fast, scalable and accurate architecture simulator (MVSim) for VLIW multi-core vector processors. The scalable VLIW multi-core vector processor model, multi-level storage architecture model and multi-core performance model are designed in MVSim. It implements cycle-accurate simulation of instruction set architectures, efficient functional simulation of Cache, DMA and multi-core synchronization, and uses multi-threading to achieve efficient and scalable simulation of multi-core processors. The experimental results show that MVSim can accurately simulate the program execution of the target multi-core processor, the simulation results are completely correct, and it has good scalability. The average simulation speed of MVSim is 227 times and 5 times faster than RTL simulator and CCS, respectively, and the average performance error is about 2.9%.
-
Design and implementation of agile switching chip for equipment platform
- LIU Ru-lin, YANG Hui, LI Tao, L Gao-feng, SUN Zhi-gang
-
2024, 46(02):
200-208.
-
When building network systems for equipment platforms such as vehicles, ships, and aerospace systems, constraints such as function, performance, volume, power consumption, and ease of use must be considered together. Traditional high performance commercial network switching chips have high power consumption and are complex to configure and use, which makes it difficult for them to meet these needs. Therefore, a programmable and low power agile switching chip architecture is proposed for equipment platforms. This architecture introduces protocol-independent ultra-long content matching-action pipelines to provide programmable forwarding configurations, flow rate limiting, packet modification and other functions, and supports various existing network protocols as well as user-defined protocols. The architecture provides flexible management and configuration modes to meet various application requirements such as remote configuration and lightweight configuration without management CPUs. In addition, through the expansion function of the agile switching protocol, the lookup results, action processing records, user configuration information, etc. can be sent out of the chip along with the original packet to achieve fine-grained processing and flexible function expansion at the individual packet level. Based on the agile switching architecture, the YHHX-DS160 agile switching chip has been taped out. This chip provides a full-interface line-rate switching capability of 160 Gbps with a maximum power consumption of only 6.6 W, achieving an energy efficiency ratio of 24 Gbps/W.
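The energy efficiency figure quoted in the abstract follows directly from its two stated numbers (line-rate throughput divided by maximum power); the rounding to 24 is the abstract's own:

```latex
\[
\frac{160\ \text{Gbps}}{6.6\ \text{W}} \approx 24.2\ \text{Gbps/W}
\]
```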
-
Analysis and evaluation of congestion control in interconnection networks for high performance computing
- SUN Yan, ZHANG Jian-min, LI Yuan, SUN Shun-yu
-
2024, 46(02):
209-216.
-
With the development of high performance computing technology, the number of network nodes in high performance computing systems is continuously growing, and the requirements of high performance computing applications for network performance are becoming increasingly stringent. Therefore, congestion control for high performance interconnection networks faces great pressure and challenges. To address the characteristics of high performance computing interconnection networks, researching efficient and low-overhead congestion control methods is crucial to ensuring the performance and stability of high performance interconnection networks. This study focuses on the core issues of interconnection communication in high performance computing systems. It analyzes and compares the mainstream congestion control methods. Based on the structural characteristics and communication properties of high performance computing systems, it designs a data flow model and a flow file generation tool for large-scale simulation, and proposes a comprehensive evaluation index for congestion control. Using the proposed data flow model, different congestion control methods are simulated on a large-scale network, and their performance is analyzed and evaluated based on the proposed evaluation index. The analysis and evaluation techniques proposed in this study can provide more objective and accurate analysis and evaluation of congestion control methods for high performance interconnection networks.
-
Design and implementation of a baseboard management controller on ZYNQ chip
- MA Ke-fan, LI Bao-feng, ZHOU Yue-jin, WU Yuan-yuan, YU Yong-lan, DUO Rui-hua
-
2024, 46(02):
217-223.
-
With the large-scale development of data centers for supercomputing and cloud computing, motherboard architectures have become increasingly complex and cost control has become more stringent. Commonly used BMC design solutions have limited scalability, so a low-cost and efficient BMC solution is urgently needed. An integrated development platform based on Vivado and the Yocto Project is proposed. A Xilinx ZYNQ series FPGA chip is selected, and its internal ARM hard core is used to run OpenBMC. The design expands peripherals through the AXI bus, which offers strong extensibility and high flexibility, thus realizing dual management of the BMC and FPGA and saving motherboard space and cost.
-
Efficient analysis of coherent hub interface protocol mixturing hardware and software
- ZHAO Zhi-qiao, ZHOU Li, XUN Chang-qing, PAN Guo-teng, TIE Jun-bo, WANG Wei-zheng
-
2024, 46(02):
224-231.
-
In the development process of SoC, how to efficiently and accurately perform functional verification and performance analysis is an urgent problem to be solved. Aiming at the current limited monitoring means of Network-on-Chip protocols on FPGA prototype platforms, this paper proposes an efficient monitoring and analysis method with hardware-software mixture for CHI protocols. By connecting C code through the DPI of SystemVerilog, the synthesizable hardware part provides a shared function body, while the non-synthesizable software part captures CHI messages in the SoC under test from various channels of the Network-on-Chip protocol through the shared function body for offline storage or online inspection. Experimental results show that this method has the advantages of low hardware resource occupation and high reusability. The offline mode has little impact on simulation speed, while the online mode can detect problems while the SoC under test is running, enabling efficient monitoring of CHI protocol messages on the prototype platform and effectively accelerating the localization and performance analysis of SoC problems.
Computer Network and Information Security
-
A revocable traceable access control scheme in autonomous vehicle platoon
- LI Yu-xin, WANG Zheng, WANG Hui, SUN Jian-wei
-
2024, 46(02):
232-243.
-
The autonomous vehicle platoon (AVP) in intelligent transportation systems is an excellent solution for improving fuel efficiency and driving safety. Because they rely on Internet of Things and 5G communication technology, autonomous vehicle platoons are susceptible to various attacks during intelligent teaming and behavior communication, which can lead to safety accidents. To this end, an attribute-based revocable traceable access control scheme is proposed in combination with edge computing technology. Firstly, fine-grained access control is achieved using attribute-based encryption. Secondly, dynamic updating of the platoon key is implemented using CRT, allowing for revocable storage of messages and reducing the computational overhead on the vehicle side with the assistance of roadside infrastructure. Furthermore, user anonymity and traceability of malicious vehicles are achieved using an elliptic curve encryption mechanism. Finally, security analysis and simulation experiments show that the scheme is feasible in terms of security and efficiency.
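To make the key-update idea more concrete, here is a minimal sketch of a generic group-key broadcast, assuming that "CRT" in the abstract refers to the Chinese remainder theorem. It is not the paper's scheme: the function names, the per-vehicle `(modulus, secret)` pairs, and the toy parameters are all hypothetical, and the masking step is a textbook construction chosen only for illustration.

```python
from math import prod


def crt(residues, moduli):
    """Return the unique x in [0, prod(moduli)) with x % m == r for each pair.

    Assumes the moduli are pairwise coprime.
    """
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        x += r * partial * pow(partial, -1, m)
    return x % total


def broadcast_value(group_key, members):
    """Pack the group key for every (modulus, secret) pair in `members`."""
    moduli = [m for m, _ in members]
    residues = [(group_key + s) % m for m, s in members]
    return crt(residues, moduli)


def recover_key(value, modulus, secret):
    """Unmask the group key from the broadcast value."""
    return (value % modulus - secret) % modulus


# Hypothetical toy parameters: (modulus, secret) per vehicle.
vehicle_a = (101, 17)
vehicle_b = (103, 29)
vehicle_c = (107, 5)  # revoked, so it is left out of the update

new_key = 42  # must be smaller than every modulus
value = broadcast_value(new_key, [vehicle_a, vehicle_b])

key_a = recover_key(value, *vehicle_a)
key_b = recover_key(value, *vehicle_b)
key_c = recover_key(value, *vehicle_c)  # does not yield new_key
```

Revocation in this sketch amounts to omitting the revoked vehicle's pair when the broadcast value is computed. The moduli here are far too small for any real use.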
-
Research on differential privacy protection for Stacking algorithm
- DONG Yan-ling, ZHANG Shu-fen, XU Jing-cheng, WANG Hao-shi
-
2024, 46(02):
244-252.
-
To address the problem that homogeneous ensemble learning algorithms are sensitive to noise and struggle to balance predictive performance with effective privacy protection, a DPStacking algorithm based on differential privacy is proposed. This algorithm combines the heterogeneous Stacking algorithm with differential privacy technology to optimize both privacy protection and predictive performance. However, since both the low-level and high-level models of the Stacking algorithm can be composed of different learners, a privacy budget allocation scheme designed for one particular learner is often not applicable to Stacking algorithms composed of arbitrary base learners and meta-learners. Based on this, a privacy budget allocation scheme based on meta-learners is designed, which allocates different privacy budgets to different components of the meta-learner according to the Pearson correlation coefficient and the parallel composition property of differential privacy. Theoretical and experimental verification shows that the DPStacking algorithm satisfies ε-differential privacy. Compared with the differential privacy random forest algorithm (DiffRFs), AdaBoost algorithm (DP-AdaBoost), and XGBoost algorithm (DPXGB), it effectively guarantees data privacy while achieving better predictive performance, and better addresses the noise sensitivity of single homogeneous ensemble learning algorithms.
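The abstract does not give the allocation rule, so the following is only a sketch under a stated assumption: each component receives a share of the total budget proportional to the absolute Pearson correlation of its input with the label. The function names are hypothetical. The two composition rules and the Laplace mechanism are standard differential privacy facts rather than details from the paper.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def allocate_budget(total_epsilon, correlations):
    """Assumed rule: split the budget in proportion to |correlation|."""
    weights = [abs(r) for r in correlations]
    norm = sum(weights)
    return [total_epsilon * w / norm for w in weights]


def sequential_cost(epsilons):
    """Mechanisms applied to the same data: the budgets add up."""
    return sum(epsilons)


def parallel_cost(epsilons):
    """Mechanisms applied to disjoint data: the largest budget dominates."""
    return max(epsilons)


def laplace_noise(sensitivity, epsilon, rng):
    """Laplace(0, sensitivity / epsilon) sample as a difference of exponentials."""
    scale = sensitivity / epsilon
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


# Hypothetical correlations between three meta-learner inputs and the label.
shares = allocate_budget(1.0, [0.5, -0.25, 0.25])
rng = random.Random(0)
noisy_values = [1.0 + laplace_noise(1.0, eps, rng) for eps in shares]
```

Under this assumed rule, components fed by more strongly correlated inputs receive a larger budget and therefore less noise.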
-
A multi-branch fine-grained recognition method based on dynamic localization and feature fusion
- YANG Xiao-qiang, HUANG Jia-cheng
-
2024, 46(02):
253-263.
-
To address the difficulty posed by small inter-class differences and large intra-class differences in fine-grained classification, an improved end-to-end fine-grained classification model (TBformer) is proposed based on Swin Transformer. To counter the interference of complex backgrounds with network recognition, a dynamic location module (DLModule) combining ECA, ResNet50 and SCDA is used to capture key objects, and a three-branch feature extraction module based on DLModule is designed to improve the extraction of discriminative target features. To fully exploit the rich fine-grained information contained in the three-branch features, a feature fusion method based on ECA is proposed to enhance the comprehensiveness and accuracy of the features and improve the robustness of the network for fine-grained classification. The experimental results show that, compared with the baseline method, the accuracy of TBformer is improved by 3.19% on CUB-200-2011, 3.47% on Stanford Dogs and 1.09% on NABirds.
-
Underwater vehicle target detection and experiment based on improved RetinaNet network
- HUANG Zhen-wei, CHEN Wei, WANG Wen-jie, LU Jin-tong
-
2024, 46(02):
264-271.
-
Aiming at the problems of serious image degradation and low target recognition rate in current underwater vehicle target detection methods, an underwater target detection method combining an improved RetinaNet with an attention mechanism is proposed. Firstly, the RetinaNet backbone network is replaced with DenseNet, which retains more target features and reduces the number of parameters. Secondly, to increase the operation speed of the network model, the initial convolution is replaced by a depthwise separable deformable convolution, greatly reducing the parameters of the model. Finally, the CBAM attention module is introduced to enhance features in the spatial and channel dimensions, reducing the interference of the complex underwater environment with target detection. The results of underwater robot grasping experiments show that, compared with the original RetinaNet method, the improved method reaches an mAP of 81.9%, with 56.8 MB of parameters and a detection speed of 16.8 frames. The improved method performs well in underwater target detection.
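The parameter saving attributed to the replaced convolution can be illustrated with the standard weight counts for an ordinary convolution and for a depthwise separable one. This sketch ignores bias terms and the offset branch of a deformable convolution, and the layer sizes are hypothetical rather than taken from the paper.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in an ordinary kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise kernel x kernel conv followed by a 1x1 pointwise conv."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
full = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
ratio = separable / full  # equals 1/c_out + 1/kernel**2
```

For this example layer the separable form needs 8768 weights instead of 73728, roughly 12% of the original.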
Artificial Intelligence and Data Mining
-
A sliding window voting strategy based on hidden Markov model for morphology detection of QRS complex
- SONG Xin-hai, HAN Jing-yu, LANG Hang, MAO Yi
-
2024, 46(02):
272-281.
-
The morphological identification of the QRS complex is key to detecting abnormal ECGs, which in turn forms the basis for disease diagnosis. Existing QRS morphology recognition methods either identify only a few morphologies or are sensitive to parameter settings, and their performance is not ideal. Based on this, a sliding window voting strategy based on a hidden Markov model (SWVHMM) is proposed to automatically identify QRS morphologies. Firstly, each QRS complex is divided into four phases, and a sliding window is set for each phase to extract samples. Secondly, the waveform of each phase is regarded as a state, and the cluster centers of the window samples act as the observations to construct a state-constrained hidden Markov model. Finally, a vote is taken over the results of different combinations of phase windows to identify the most probable target morphology pattern. On a real data set labelled by professional doctors, the method improves the F1 measure by 5.97%, 5.49% and 2.27%, respectively, compared with existing methods. The results show that SWVHMM can identify a variety of morphology patterns with improved accuracy.
-
Feature selection algorithm based on feature weights and improved particle swarm optimization
- LIU Zhen-chao, YUAN Ying-chun, WANG Ke-jian, HE Chen
-
2024, 46(02):
282-291.
-
With the development of educational informatization, educational data presents characteristics such as high feature counts and high redundancy, resulting in the classification accuracy of current classification algorithms not being ideal on educational data. Therefore, this paper proposes a hybrid feature selection algorithm (RF-ATPSO) that integrates feature weighting algorithm with improved particle swarm optimization algorithm. The algorithm first uses the RELIEF-F algorithm to calculate the weights of each feature, removes redundant features, and then uses the improved particle swarm optimization algorithm to search for the optimal feature subset in the filtered feature set. Experimental results show that on 6 UCI public datasets, after feature selection using the RF-ATPSO algorithm, the average accuracy is improved by 10.04%, and the average feature subset size is the smallest and the convergence speed is the fastest. In the student academic performance portrait feature dataset, the algorithm achieves high classification accuracy with a smaller feature subset size, with an average accuracy of 94.77%, which is significantly better than other feature selection algorithms. The experiment fully demonstrates the practical application significance of this algorithm.
-
A model-based non-convex clustering algorithm
- ZHONG Zhuo-hui, CHEN Li-fei
-
2024, 46(02):
292-302.
-
Data may be distributed on an irregular manifold, where the underlying clusters often exhibit non-convex shapes and structures; clustering problems for such data are collectively referred to as non-convex clustering. However, existing mainstream non-convex clustering methods, including clustering in the original space and clustering based on space transformation, ignore the explicit description of non-convex data patterns and fail to understand and describe the underlying mechanisms that produce such structures. Therefore, a descriptive clustering model is proposed for non-convex clustering. Firstly, a feature-weighted kernel density model with a hybrid form is defined based on the kernel density method, which does not need to assume any probability distribution model in advance and does not restrict the shape of clusters, something that traditional model-based clustering methods cannot achieve. Secondly, the clustering objective function is derived from the proposed model, and an optimization algorithm for finding the local maxima of the density function is proposed based on the expectation maximization algorithm. Samples that ascend to the same local maximum of the density function are assigned to the same cluster. Finally, a model-based non-convex clustering algorithm is defined. The algorithm does not require the number of clusters to be specified manually, and can assign an explicit probability density function to each cluster, which helps to characterize clusters more robustly and accurately. In addition, the algorithm not only performs adaptive bandwidth selection but also gives the feature weights of the sample space, enabling automatic embedded feature selection during the clustering process.
-
A hybrid multi-strategy improved sparrow search algorithm
- LI Jiang-hua, WANG Peng-hui, LI Wei
-
2024, 46(02):
303-315.
-
Aiming at the problems that the Sparrow Search Algorithm (SSA) still has premature convergence when solving the optimal solution of the objective function, it is easy to fall into local optimum under multi-peak conditions, and the solution accuracy is insufficient under high-dimensional conditions, a hybrid multi-strategy improved Sparrow Search Algorithm (MISSA) is proposed. Considering that the quality of the initial solution of the algorithm will greatly affect the convergence speed and accuracy of the entire algorithm, an elite reverse learning strategy is introduced to expand the search area of the algorithm and improve the quality and diversity of the initial population; the step size is controlled in stages, in order to improve the solution accuracy of the algorithm. By adding the Circle mapping parameter and cosine factor to the position of the follower, the ergodicity and search ability of the algorithm are improved. The adaptive selection mechanism is used to update the individual position of the sparrow and add Lévy flight to enhance the algorithm optimization and the ability to jump out of local optima. The improved algorithm is compared with Sparrow Search Algorithm and other algorithms in 13 test functions, and the Friedman test is carried out. The experimental comparison results show that the improved sparrow search algorithm can effectively improve the optimization accuracy and convergence speed, and it can be used in high-dimensional problems. It also has high stability.
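As an illustration of the Lévy flight component, the sketch below draws heavy-tailed step lengths with Mantegna's algorithm, a common way to generate such steps in metaheuristics. The abstract does not say which generator or update rule MISSA uses, so the exponent `beta`, the `step_scale` factor, and the perturbation formula in `levy_perturb` are assumptions.

```python
import math
import random


def mantegna_sigma(beta):
    """Standard deviation of the numerator Gaussian in Mantegna's algorithm."""
    numerator = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    denominator = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (numerator / denominator) ** (1 / beta)


def levy_step(beta, rng):
    """Draw one heavy-tailed step length."""
    u = rng.gauss(0, mantegna_sigma(beta))
    v = rng.gauss(0, 1)
    while v == 0:  # guard against division by zero
        v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)


def levy_perturb(position, best, beta, step_scale, rng):
    """Assumed update: move each coordinate relative to the best-known position."""
    return [
        x + step_scale * levy_step(beta, rng) * (x - b)
        for x, b in zip(position, best)
    ]


rng = random.Random(0)
steps = [levy_step(1.5, rng) for _ in range(1000)]
candidate = levy_perturb([0.5, -1.0], [0.0, 0.0], 1.5, 0.01, rng)
```

Most steps are small, but occasional very large ones occur, which is what lets a search individual jump away from a local optimum.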
-
Distant supervision relation extraction based on type attention and GCN
- ZHANG Huan, LI Wei-jiang
-
2024, 46(02):
316-324.
-
Distant supervision relation extraction uses the automatic alignment of natural language texts and knowledge bases to generate labeled training datasets, solving the problem of manual sample labeling. In current research, most distant supervision methods pay little attention to long-tail data, so most of the sentence bags obtained by distant supervision contain too few sentences and cannot truly and comprehensively represent the data. This paper proposes a distant supervision relation extraction model (PG+PTATT) based on a position-type attention mechanism and a graph convolutional network (GCN). According to the similarity between sentence bags, the GCN aggregates the implicit high-level features of similar sentence bags to optimize the sentence bags and obtain richer and more comprehensive feature information. At the same time, an attention mechanism, Position-Type Attention (PTATT), is constructed to address the problem of wrong labels in distant supervision relation extraction: the position relationships and type relationships between entity words and non-entity words are modeled to reduce the impact of noisy words. The proposed model is experimentally verified on the New York Times (NYT) dataset, and the results show that it effectively addresses these problems in distant supervision relation extraction and improves the accuracy of relation extraction.
-
Research progress on information extraction methods of Chinese electronic medical records
- JI Xu-rui, WEI De-jian, ZHANG Jun-zhong, ZHANG Shuai, CAO Hui
-
2024, 46(02):
325-337.
-
The large amount of medical information carried in electronic medical records (EMRs) can help doctors better understand patients' conditions and assist in clinical diagnosis. As the two core tasks of Chinese EMR information extraction, named entity recognition and entity relationship extraction have become the main research directions. Their main goal is to identify the medical entities in EMR text and extract the medical relationships between them. This paper systematically reviews the research status of Chinese EMRs, points out the important role of named entity recognition and entity relationship extraction in Chinese EMR information extraction, introduces the latest research results on named entity recognition and relationship extraction algorithms for Chinese EMR information extraction, and analyzes the advantages and disadvantages of each model at each stage. In addition, current problems with Chinese EMRs are discussed, and future research trends are outlined.
-
Review of personalized recommendation research based on meta-learning
- WU Guo-dong, LIU Xu-xu, BI Hai-jiao, FAN Wei-cheng, TU Li-jing
-
2024, 46(02):
338-352.
-
As a tool to alleviate “information overload”, recommendation system provides personalized recommendation services for users to filter redundant information, and has been widely used in recent years. However, in actual recommendation scenarios, there are often issues such as cold start and difficulty in adaptively selecting different recommendation algorithms based on the actual environment. Meta-learning, which has the advantage of quickly learning new knowledge and skills from a small number of training samples, is increasingly being applied in research related to recommendation systems. This paper discusses the main research on using meta-learning techniques to alleviate cold start problems and adaptive recommendation issues in recommendation systems. Firstly, it analyzes the relevant research progress made in meta-learning-based recommendations in these two areas. Then, it points out the challenges faced by existing meta-learning recommendation research, such as difficulty in adapting to complex task distributions, high computational costs, and a tendency to fall into local optima. Finally, it provides an outlook on some of the latest research directions in meta-learning for recommendation systems.
-
An improved Harris hawks optimization algorithm based on elite guidance
- LI Yu-heng, GAO Shang, MENG Xiang-yu
-
2024, 46(02):
363-373.
-
To address the problems that Harris Hawks Optimization (HHO) easily falls into local optima and converges slowly, an improved Harris Hawks Optimization algorithm based on elite guidance (EHHO) is proposed. Firstly, elite opposite learning is introduced, and the elite center is used as the center of symmetry for opposite learning to optimize the population structure and enhance the ability of the algorithm to escape local optima. Secondly, an elite evolution strategy is introduced, in which evolution based on Gaussian random mutation is carried out with elite individuals as the main body to improve the quality of the population and the convergence speed of the algorithm. Finally, an adaptive mechanism is introduced to dynamically adjust the selection probability of the two evolution modes in the elite evolution strategy to improve the stability of the algorithm. To verify the effectiveness of the improved algorithm, 15 benchmark functions are selected for simulation experiments. The experimental results show that the improved algorithm clearly improves optimization performance and robustness, and is competitive with other optimization algorithms.
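The first step, opposite learning around an elite center, can be sketched as a reflection of each individual through the mean of the best individuals. The abstract does not specify how elites are chosen or how candidates are kept, so the `n_elite` parameter, the clipping to bounds, and the greedy "keep the better point" rule below are assumptions, and all function names are hypothetical.

```python
def elite_center(population, fitness, n_elite):
    """Mean position of the n_elite individuals with the lowest fitness."""
    elites = sorted(population, key=fitness)[:n_elite]
    dim = len(elites[0])
    return [sum(e[d] for e in elites) / n_elite for d in range(dim)]


def opposite(x, center, lower, upper):
    """Reflect x through the center, then clip to the search bounds."""
    return [min(max(2 * c - xi, lower), upper) for xi, c in zip(x, center)]


def elite_opposition_step(population, fitness, n_elite, lower, upper):
    """Replace each individual by its opposite point if that point is better."""
    center = elite_center(population, fitness, n_elite)
    new_population = []
    for x in population:
        x_opp = opposite(x, center, lower, upper)
        new_population.append(min(x, x_opp, key=fitness))
    return new_population


def sphere(x):
    """Toy objective to minimize."""
    return sum(v * v for v in x)


population = [[4, 4], [1, 1], [-1, 1], [3, -3]]
updated = elite_opposition_step(population, sphere, 2, -10, 10)
```

In this toy run the elite center is `[0.0, 1.0]`, so the worst individual `[4, 4]` is reflected to `[-4.0, -2.0]` and replaced, while the others are kept because their reflections are not better.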
-
Treelet-based graph neural network for premise selection in first-order logic
- MA Xue, HE Xing-xing, LAN Yong-qi, LI Ying-fang
-
2024, 46(02):
374-380.
-
Premise selection is an efficient method to address the performance degradation of automated theorem provers when facing large-scale problems. Currently, the mainstream graph neural networks for premise selection in first-order logic ignore the node order information inside logic formula graphs. To solve this problem, an order-preserving method for higher-order logical formulas is extended to first-order logic, and a treelet-based graph neural network model is proposed. The model aggregates the information of neighbor nodes in two parts: one part aggregates the parent and child node information of the central node, and the other aggregates the order information of the central node. Experimental analysis shows that, compared with the best mainstream graph neural network model, the treelet-based graph neural network model improves the classification accuracy by about 2% in the premise selection task.