  • Official journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Current Issue

    • High Performance Computing
      Configurable CPU performance analysis method for specific applications
      DENG Quan, LIN Rongzhen, LUO Li, LU Jianzhuang, WANG Yongwen
      2025, 47(11): 1901-1911. doi:
      Abstract ( 64 )   PDF (1901KB) ( 55 )     
      With the development of integrated circuits and the continuous expansion of chip applications, configurable CPUs facilitate the exploration of the chip design space. Configurable CPUs not only meet the demands of agile design but also cater to users’ needs for tuning based on target applications. However, at present, the performance tuning of application-specific configurable CPUs still relies primarily on experienced architecture engineers and lacks a scientific methodology for guidance. Therefore, this paper proposes a configurable CPU performance analysis method for specific applications. At the software level, the Perf tool is used to quickly identify hot code blocks in applications during hardware execution. At the hardware level, by analyzing two counting modes (counting of cycles and counting of slots) within the analysis framework, the hot execution conditions of each execution unit are pinpointed, enabling designers to swiftly locate hot behaviors in hardware execution. This paper conducts agile design for a configurable DMR (dual-module redundancy) architecture supporting the RISC-V instruction set using the typical fluid dynamics program NPB (NAS parallel benchmark). The experimental results indicate a 13.2% improvement in single-core performance of the configurable CPU, with a 12.2% increase in area overhead.


      A hybrid matrix-vector processor with dynamically reconfigurable dataflow
      AI Chenyang, ZHAO Lechuan, HUA Tao, WANG Xin’an, WANG Ying
      2025, 47(11): 1912-1921. doi:
      Abstract ( 37 )   PDF (2498KB) ( 27 )     
      Systolic arrays, as energy-efficient accelerators for general matrix multiplication (GEMM) operators, have garnered widespread attention from both academia and industry. However, they often occupy a substantial amount of area and typically require collaboration with VPU (vector processing unit) components, a combination frequently seen in neural network accelerators. Additionally, they suffer from issues such as low temporal and spatial utilization rates and limited performance in end-to-end scenarios. To address these challenges, a hybrid vector systolic array (HVSA) is proposed by integrating systolic arrays with vector processors. By reusing the storage, broadcasting, and inter-channel communication units within the VPU, this architecture enables reconfigurable capabilities in terms of array shape and data flow, allowing for more efficient support of GEMM and vector operations within an acceptable hardware area overhead. Furthermore, an end-to-end compilation framework tailored for HVSA is introduced, encompassing an MLIR-based compilation frontend, data flow scheduling, and a programming model compatible with the RISC-V vector extension. Experimental data demonstrates that HVSA achieves a 30.30-fold increase in computational speed compared to a systolic array of equivalent area. In end-to-end applications, the average operating time of HVSA is reduced to around 4.7% of the original compared to the "VPU+SA" of the same area, and energy consumption is reduced by approximately 58.7%.
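      The temporal utilization issue mentioned in this abstract can be illustrated with a small simulation. The sketch below is not the HVSA design; it assumes a plain output-stationary systolic array with skewed operand injection, and the function name `systolic_matmul` is hypothetical.

```python
def systolic_matmul(a, b):
    """Simulate an output-stationary systolic array computing C = A @ B.

    PE (i, j) accumulates C[i][j]. Rows of A enter from the left and columns
    of B from the top, each skewed by one cycle per row/column, so PE (i, j)
    sees the operand pair with inner index k at cycle t = i + j + k.
    """
    m, k_dim, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    # The last operand pair reaches PE (m-1, n-1) at t = m + n + k_dim - 3.
    total_cycles = m + n + k_dim - 2
    busy = 0  # number of (PE, cycle) pairs that perform a multiply-accumulate
    for t in range(total_cycles):
        for i in range(m):
            for j in range(n):
                k = t - i - j
                if 0 <= k < k_dim:
                    c[i][j] += a[i][k] * b[k][j]
                    busy += 1
    utilization = busy / (total_cycles * m * n)
    return c, total_cycles, utilization


product, cycles, utilization = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

      For this 2x2 example the array needs 4 cycles and each PE is busy for only 2 of them, so the skew alone caps utilization at 50% for small matrices.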


      Survey on offline autonomy technologies for cloud-native edge computing systems
      LI Mo, XIA Guoao, REN Yi, LI Bao, WANG Qingkun, ZHANG Jianfeng, TAN Yusong
      2025, 47(11): 1922-1931. doi:
      Abstract ( 44 )   PDF (1204KB) ( 26 )     
      Cloud-native edge computing systems support unified container cluster management across cloud and edge environments. The unstable cloud-edge network makes offline autonomy a critical research focus in this field; however, the academic community currently lacks an analysis of the technical challenges it faces and a systematic review of the current technological landscape. Firstly, this paper elaborates on the typical architecture and components of cloud-native edge computing systems, and then identifies three key issues that need to be addressed in offline autonomy scenarios: offline node status determination, recovery after offline node restarts, and cloud-edge synchronization after node reconnection. Subsequently, it analyzes different offline autonomy technologies employed by typical cloud-edge systems to tackle these issues and designs experiments to verify their essential functionalities and performance advantages. Finally, it discusses future research directions for offline autonomy technologies.

      An efficient method for RISC-V memory consistency testing based on loop unrolling
      HU Jintao, XU Xuezheng, YANG Deheng, HUANG Anwen, KOU Guang, LI Qiong
      2025, 47(11): 1932-1944. doi:
      Abstract ( 50 )   PDF (1339KB) ( 20 )     
      The memory consistency model, commonly referred to as the memory model, defines the observation rules for memory access in multi-core systems. As an architectural specification that both hardware and software must adhere to, it is characterized by difficulties of design, description, implementation, and testing, and has long been a research focus in both academic and industrial communities. Due to the uncertainty in the execution order of parallel programs, testing of memory models typically requires repeatedly running specific programs on a large scale. The presence of illegal memory access orders is determined based on the final program states. This process is particularly time-consuming during the pre-silicon simulation phase, posing significant challenges to chip verification. In recent years, RISC-V has gained widespread popularity due to its open-source nature, simplicity, modularity, and high customizability. Leveraging its open-source advantage, RISC-V chips offer an extremely high degree of flexibility in instruction set extension and micro-architecture design. Its memory model also allows customization on the basis of compliance with specifications, and this high customizability introduces additional challenges to chip verification. To address this issue, this paper proposes an efficient memory consistency testing method based on loop unrolling for the RISC-V architecture. By analyzing the performance bottlenecks of existing testing methods and drawing on the loop unrolling technique from traditional compilation, the method merges repeatedly executed test programs. This not only significantly reduces thread synchronization overhead but also increases the probability of interleaved memory access execution between threads, thereby improving testing efficiency. Experimental results show that, compared with existing memory consistency testing methods, the proposed method achieves a testing efficiency improvement ranging from 1.5 times to 184 times across different platforms, including RISC-V boards and simulators.
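      The idea of judging illegal memory orderings from final program states can be sketched with the classic store-buffering litmus test. The code below is only an illustration and does not reproduce the paper's loop-unrolling method. It assumes sequential consistency (SC) as the reference model for brevity; RVWMO is weaker and permits the (0, 0) outcome for this test unless fences are inserted. All names (`THREAD0`, `sc_outcomes`, `check`) are hypothetical.

```python
from itertools import combinations

# Store-buffering (SB) litmus test. Each op is (kind, location, value_or_register).
THREAD0 = [("store", "x", 1), ("load", "y", "r0")]
THREAD1 = [("store", "y", 1), ("load", "x", "r1")]


def sc_outcomes(t0, t1):
    """Enumerate final (r0, r1) over all sequentially consistent interleavings."""
    total = len(t0) + len(t1)
    outcomes = set()
    # Choose which global steps belong to thread 0; program order is preserved.
    for slots in combinations(range(total), len(t0)):
        slot_set = set(slots)
        ops0, ops1 = iter(t0), iter(t1)
        memory = {"x": 0, "y": 0}
        regs = {}
        for step in range(total):
            kind, loc, arg = next(ops0) if step in slot_set else next(ops1)
            if kind == "store":
                memory[loc] = arg
            else:
                regs[arg] = memory[loc]
        outcomes.add((regs["r0"], regs["r1"]))
    return outcomes


ALLOWED_SC = sc_outcomes(THREAD0, THREAD1)


def check(observed):
    """Return the observed final states that the reference model forbids."""
    return set(observed) - ALLOWED_SC
```

      A test harness would run the two threads many times on hardware or a simulator, collect the observed (r0, r1) pairs, and pass them to `check`; a non-empty result flags a violation of the chosen reference model.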


      Does the ISA really matter?—A survey of simulations based on Gem5
      LI Hua, WANG Yongwen
      2025, 47(11): 1945-1952. doi:
      Abstract ( 54 )   PDF (3358KB) ( 22 )     
      The instruction set architecture (ISA) serves as the foundational framework of a chip, yet existing research on its performance impact often relies on real hardware implementations. However, varying hardware setups pose challenges for direct comparison and analysis of ISAs. To address this issue, simulations of ARM, RISC-V, and x86 ISAs were conducted in the Gem5 simulator with identical hardware configurations and the same compiler version, thereby enabling a controlled comparative analysis. CoreMark, Dhrystone, and Whetstone are adopted as benchmark programs, while McPAT assesses power consumption. Results from the simulations reveal that the ARM ISA exhibits superior performance and lower power consumption compared to RISC-V and x86 ISAs. Although differences between ARM and RISC-V are marginal, the performance gap between ARM and x86 may stem from the relatively modest hardware configuration utilized, which could be mitigated or reversed through the adoption of more aggressive hardware techniques. This research underscores that while an ISA plays a pivotal role, solely relying on it cannot fundamentally enhance efficiency.

      Computer Network and Information Security
      Network traffic anomaly detection based on gated fusion and multi-scale convolution
      YIN Chunyong, LI Rongbiao
      2025, 47(11): 1953-1963. doi:
      Abstract ( 48 )   PDF (909KB) ( 22 )     
      In the current field of network traffic anomaly detection, problems such as complex model structures and high computational resource requirements are widespread, making it difficult to deploy and perform detection on resource-constrained devices. To address these problems, a network traffic anomaly detection model based on gated feature fusion and multi-scale convolution, named GFMCAD, is proposed. Firstly, principal component analysis is combined with a clustering method to reduce the complexity of network traffic data. Secondly, parallel multi-scale convolution blocks composed of one-dimensional convolutional neural networks and multi-layer long short-term memory networks are used to extract spatial and temporal features of network traffic at different scales, respectively. Then, the extracted spatial and temporal features are adaptively fused through a gated feature fusion module. Finally, residual fully connected layers and the Softmax function are used to identify abnormal traffic. According to the experimental results on three benchmark datasets, GFMCAD achieves accuracies of 0.9716, 0.9658, and 0.9875, respectively. Experimental results show that GFMCAD reduces the consumption of computational resources while improving the detection capability of the model.
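      The adaptive fusion step can be pictured with a common gating formulation. The abstract does not give GFMCAD's exact equations, so the sketch below assumes the widely used form gate = sigmoid(W·[s; t] + b) followed by a convex combination of the two feature vectors; `gated_fusion` and its parameters are hypothetical names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(spatial, temporal, weights, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    gate[i]  = sigmoid(weights[i] . concat(spatial, temporal) + bias[i])
    fused[i] = gate[i] * spatial[i] + (1 - gate[i]) * temporal[i]
    """
    joint = list(spatial) + list(temporal)
    fused = []
    for i, (s, t) in enumerate(zip(spatial, temporal)):
        z = sum(w * x for w, x in zip(weights[i], joint)) + bias[i]
        g = sigmoid(z)
        fused.append(g * s + (1.0 - g) * t)
    return fused


# With all-zero weights and bias the gate is 0.5, so the result is the plain average.
zero_w = [[0.0] * 4 for _ in range(2)]
averaged = gated_fusion([1.0, 2.0], [3.0, 6.0], zero_w, [0.0, 0.0])
```

      In a trained model the weights and bias would be learned, letting the gate favor spatial or temporal features per dimension and per sample.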


      Multi-stage detection and multimodal localization for audio deletion tampering
      ZHANG Guofu, WANG Ru, SU Zhaopin, YUE Feng, LIAN Chensi, YANG Bo
      2025, 47(11): 1964-1973. doi:
      Abstract ( 52 )   PDF (1301KB) ( 16 )     
      Audio deletion tampering detection faces severe challenges in the field of digital audio authentication, particularly under anti-forensic attacks. To address the difficulties in detecting and locating deletion tampering, a multi-stage detection and multimodal localization method for audio deletion tampering is proposed. Firstly, a header information analysis method is designed to screen out audio files suspected of undergoing header/footer deletion tampering. Subsequently, a column-average-based constant Q spectral sketch feature is introduced, along with a middle deletion tampering classification network that leverages a deep residual shrinkage network and an attention mechanism. Next, by integrating the results from header information analysis and the classification network, a comprehensive judgment is made on whether the audio deletion tampering has occurred. Finally, for detected middle deletion tampering, a localization method combining wavelet packet analysis with multimodal features is proposed. Comparative experimental results demonstrate that the proposed method can effectively detect header/footer deletion tampering and accurately locate middle deletion tampering. Specifically, the accuracy, precision, recall, and F1 score for middle deletion classification all exceed 98%, and the method exhibits enhanced robustness and localization accuracy when faced with conventional signal processing attacks.



      MinRS: A defense method for both model availability and model privacy
      REN Zhiqiang, CHEN Xuebin, ZHANG Hongyang
      2025, 47(11): 1974-1983. doi:
      Abstract ( 34 )   PDF (1273KB) ( 23 )     
      Federated learning is a technology that addresses the challenges of data sharing and privacy protection in machine learning. However, federated learning systems face security risks in two aspects: those targeting model availability and those targeting model privacy. Moreover, the current defense methods against these two types of security risks are not mutually compatible. To tackle these problems, from the perspective of balancing model availability and model privacy, a defense method named MinRS is proposed. This method consists of a secure access scheme and a selection algorithm, which can defend against malicious model attacks without compromising model privacy, thereby achieving secure model aggregation. Experimental results show that, on the premise of protecting model privacy, MinRS successfully defends against malicious models generated by three different attack strategies, and has almost no negative impact on the performance of the models.


      A federated learning secure aggregation algorithm based on one-class support vector machine
      ZHU Hai, MIAO Xianghua, GUO Shifan, QING Yegui, SHANG You
      2025, 47(11): 1984-1995. doi:
      Abstract ( 30 )   PDF (1734KB) ( 22 )     
      Federated learning has garnered significant attention in academia as it enables users to participate in model training without uploading their data. However, federated learning also faces various security challenges from malicious participants, such as Byzantine attacks and label flipping attacks. Existing defense methods exhibit diminished effectiveness under unevenly distributed data. To address these issues, this paper proposes a secure aggregation algorithm in federated learning based on the one-class support vector machine (OC-SVM). This algorithm extracts appropriate feature parameters using OC-SVM and determines a threshold to separate normal data from anomalous data. Owing to its ability to construct an optimal hyperplane, the algorithm can effectively distinguish between normal and anomalous data. Moreover, it can select a more suitable threshold under different data conditions, demonstrating strong generalization capability and robustness. Through a series of experiments comparing the proposed algorithm with four different defense algorithms, the results show that, in environments with varying proportions of malicious clients and regardless of whether the data distribution is uniform or not, the proposed algorithm can effectively defend against attacks.


      Artificial Intelligence and Data Mining
      Recent advances in mathematical inference based on artificial intelligence models
      YANG Kaixi, CHEN Xinyi, KAN Zhigang, HAN Xu, ZHAO Baokang, QIAO Linbo
      2025, 47(11): 1996-2007. doi:
      Abstract ( 57 )   PDF (844KB) ( 25 )     
      Artificial intelligence (AI) provides mathematicians with new tools and methods, accelerating the exploration of mathematical problem-solving and proofs. This paper first introduces mainstream mathematical reasoning tools and reasoning datasets. Mathematical reasoning tools not only improve the efficiency of solving complex mathematical problems but also provide structured inputs for AI models to facilitate the integration of machine and mathematical thinking. Reasoning datasets, including mathematical knowledge bases and public data resources, supply AI models with abundant mathematical knowledge, supporting model training and automated reasoning. Secondly, this paper explores various approaches to AI-assisted mathematical reasoning, such as automated reasoning and model-accelerated problem-solving. Thirdly, it discusses the importance of collaboration between AI and mathematicians: human-machine collaboration enhances the efficiency and accuracy of mathematical research. Finally, this paper concludes with a summary of the multiple approaches and mechanisms through which AI empowers mathematical research, and looks forward to a new era of intelligent mathematical research.

      Research on judicial text summarization based on large language model
      PEI Bingsen, LI Xin, FAN Zhijie, JIANG Zhangtao, SUN Haoyang, LIU Zirui
      2025, 47(11): 2008-2018. doi:
      Abstract ( 38 )   PDF (1975KB) ( 26 )     
      With the continuous development of science and technology, artificial general intelligence (AGI) technology has demonstrated its powerful capabilities in language understanding and generation. In the judicial field, artificial intelligence also plays an increasingly important role, gradually transitioning from judicial informatization to judicial intellectualization and smart judicial services. In this transition process, the summarization of judicial texts is a key task. Generating summaries based on judicial texts can achieve the goal of “dimensionality reduction”, help quickly grasp case details and obtain case elements, and provide support for practitioners to efficiently acquire information. However, current judicial text summarization technologies still have problems, such as generated summaries that lack legal provisions as the basis for judgment, and grammatical errors and incoherent sentences that lead to poor readability. To solve these problems, this paper leverages the excellent language understanding and generation capabilities of large language models (LLMs), combines different fine-tuning technologies, and designs different prompt templates to construct a domain-specific large model for judicial text summarization. Verification on various datasets proves the feasibility of this model, providing a potential approach for the integration of large language models and the judicial field.

      A multi-view 3D perception network based on spatio-temporal fusion
      LI He, CHEN Pintong, YU Rong, TAN Beihai
      2025, 47(11): 2019-2028. doi:
      Abstract ( 33 )   PDF (1252KB) ( 27 )     
      As a critical component of autonomous driving, the perception system directly influences a vehicle’s comprehension of its surrounding environment and serves as the foundation for achieving safe and reliable autonomous driving. Traditional 2D image detection perception techniques can only provide limited information. While 3D perception offers richer perceptual data, it faces key challenges, including insufficient fusion of spatial information and inadequate utilization of temporal information. This paper proposes a multi-view 3D perception network that integrates spatio-temporal information. It comprises a multi-view surround 3D perception network and a spatio-temporal fusion network, MVSPNet. The multi-view surround perception network efficiently fuses multi-camera image data through precise spatial perspective transformation, constructing a unified bird’s eye view spatial representation. This achieves spatial alignment and fusion of data from multiple cameras. Compared to the current advanced monocular baseline model FCOS3D, it achieves a mean average precision (mAP) of 0.343, representing a performance improvement of 14.7%. The spatio-temporal fusion network MVSPNet enables the temporal fusion of multi-view images by integrating multi-frame data. This further enhances the network’s performance: fusing 2 frames of temporal data results in an additional mAP improvement of 10.2%. The experimental results demonstrate the advancement of the designed network in effectively fusing multi-view spatial information and temporal information. This study provides an effective solution for enhancing 3D perception of autonomous driving systems in dynamic and complex scenarios, holding significant implications for advancing the development of safe and reliable autonomous driving technology.


      Graph prompting for few-shot node classification based on maintaining node cluster distribution
      XIE Qiuyuan, LI Qiuyao, CHAI Bianfang
      2025, 47(11): 2029-2037. doi:
      Abstract ( 36 )   PDF (805KB) ( 16 )     
      In graph mining tasks, prototype-based graph prompt learning has been widely recognized as an effective method to enhance the performance of graph data analysis. However, in few-shot node classification scenarios, existing methods suffer from two key limitations: insufficient utilization of unlabeled data, which leads to inaccurate class prototype construction, and inadequate exploitation of graph topological structure information. These shortcomings restrict the effectiveness of graph prompt learning methods in downstream tasks. To address these issues, this paper proposes a graph prompt learning method that integrates the distribution of all node clusters, named PNCD-GP (prototype with node cluster distribution-graph prompt). This method aims to improve the performance and accuracy of graph data analysis by more effectively leveraging the cluster distribution of unlabeled data and topological structure information. In the pre-training phase, two optimization strategies, predicting masks and preserving graph node clustering, are adopted to learn discriminative node representations and narrow the gap between upstream and downstream tasks. In the graph prompting phase, class prototype virtual nodes are introduced into the original graph as prompts, and high-order information is incorporated to enhance the graph’s topological structure, thereby improving the model’s ability to understand and utilize graph structures. Additionally, prompts are learned by maintaining the cluster distribution between unlabeled samples and labeled nodes. This method enables the construction of more accurate prototype vectors and performs node classification by leveraging the similarity between class prototypes and node representations. Experimental results on multiple public graph datasets demonstrate that the PNCD-GP method exhibits significant advantages in both efficiency and accuracy, verifying its effectiveness and potential in the field of graph prompt learning.
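      The final classification step described in this abstract, comparing node representations with class prototypes, can be sketched as follows. This is a generic prototype classifier and not the PNCD-GP method: it assumes prototypes are plain means of labeled node embeddings and that similarity is cosine similarity, and every function name and data value is hypothetical.

```python
import math


def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_prototypes(embeddings, labels):
    """Average the embeddings of the few labeled nodes in each class."""
    by_class = {}
    for node, label in labels.items():
        by_class.setdefault(label, []).append(embeddings[node])
    return {label: mean_vector(vecs) for label, vecs in by_class.items()}


def classify(embedding, prototypes):
    """Assign the class whose prototype is most similar to the embedding."""
    return max(prototypes, key=lambda label: cosine(embedding, prototypes[label]))


# Hypothetical 2-D embeddings: four labeled nodes (two per class) and two unlabeled ones.
embeddings = {
    "a": [1.0, 0.1], "b": [0.9, 0.0],
    "c": [0.0, 1.0], "d": [0.1, 0.9],
    "u": [0.8, 0.2], "v": [0.2, 0.7],
}
labels = {"a": 0, "b": 0, "c": 1, "d": 1}
prototypes = build_prototypes(embeddings, labels)
predictions = {node: classify(embeddings[node], prototypes) for node in ("u", "v")}
```

      With only a few labeled nodes per class, the mean can be a poor estimate of the class center, which is the limitation the abstract attributes to ignoring unlabeled data.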

      Environmental sound classification based on spatial attention mechanism and multi-feature data enhancement
      LIU Xiang, LI Chuankun, GUO Jinming, LIU Yu
      2025, 47(11): 2038-2044. doi:
      Abstract ( 42 )   PDF (2200KB) ( 15 )     
      To address the issues of low signal-to-noise ratio (SNR) in dataset samples and insufficient feature representation capability of Log-Mel spectrograms in environmental sound classification (ESC) tasks, this paper proposes an improved model for environmental sound classification based on high- and low-frequency separation. Firstly, the phase spectrum is incorporated as a supplement to Log-Mel spectrograms in the input features, constructing a multi-feature input comprising the phase spectrum, the Log-Mel spectrogram, and the spectrogram, thereby enhancing the expressive power of the input features. Secondly, an attention mechanism is added to the input section of the neural network to improve its resistance to noise interference, enhancing the network’s robustness and generalization capability. Experiments demonstrate that the proposed model effectively improves the recognition accuracy of environmental sounds, achieving classification accuracies of 97.25%, 89.00%, and 83.45% on the ESC10, ESC50, and UrbanSound8K datasets, respectively. Compared to the original model, the accuracy improvements are 2.25%, 10.50%, and 2.22%, respectively.

      An echo state network parameter optimization model based on multi-strategy whale optimization algorithm
      GUO Wei, HAO Siqi, REN Zhizhong, MINAWAER·Mutila
      2025, 47(11): 2045-2055. doi:
      Abstract ( 56 )   PDF (889KB) ( 18 )     
      To address the poor prediction performance caused by the random selection of reservoir parameters in traditional echo state networks (ESNs), this paper proposes an ESN parameter optimization model based on a multi-strategy whale optimization algorithm (MWOA-ESN). The key parameters of the ESN reservoir are optimized by MWOA. By introducing a pooling mechanism, a migration strategy, and a priority selection strategy, MWOA effectively overcomes the low population diversity and the tendency to fall into local optima of the whale optimization algorithm, and improves optimization efficiency. Simulation experiments on multiple time series datasets and short-term power load datasets show that the proposed MWOA-ESN model is broadly applicable and outperforms existing classical models in terms of prediction accuracy and fitting. Compared with existing results, the MWOA-ESN parameter optimization model is feasible and effective.
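      For readers unfamiliar with the baseline that MWOA extends, the following is a minimal sketch of the standard whale optimization algorithm in its common scalar-coefficient form. It does not include the pooling, migration, or priority selection strategies from the paper, and it minimizes a stand-in sphere function rather than an ESN prediction error. The function name, parameters, and defaults are assumptions made for illustration.

```python
import math
import random


def whale_optimize(objective, dim, bounds, n_whales=20, n_iter=100, seed=0):
    """Baseline whale optimization algorithm (minimization)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(v):
        return max(lo, min(hi, v))

    whales = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_whales)]
    best = min(whales, key=objective)[:]
    best_f = objective(best)
    history = [best_f]  # best objective value after each iteration
    for it in range(n_iter):
        a = 2.0 * (1.0 - it / n_iter)  # decreases linearly from 2 toward 0
        for w in whales:
            coef_a = 2.0 * a * rng.random() - a
            coef_c = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best whale (|A| < 1) or explore around a random one.
                ref = best if abs(coef_a) < 1.0 else rng.choice(whales)
                new = [r - coef_a * abs(coef_c * r - x) for r, x in zip(ref, w)]
            else:
                # Spiral update around the best whale (shape constant b = 1).
                l = rng.uniform(-1.0, 1.0)
                new = [abs(r - x) * math.exp(l) * math.cos(2 * math.pi * l) + r
                       for r, x in zip(best, w)]
            w[:] = [clip(v) for v in new]
            f = objective(w)
            if f < best_f:  # greedy update as soon as a better whale appears
                best, best_f = w[:], f
        history.append(best_f)
    return best, best_f, history


def sphere(x):
    return sum(v * v for v in x)


best, best_f, history = whale_optimize(sphere, dim=3, bounds=(-5.0, 5.0), n_iter=50)
```

      To tune an ESN in this style, `objective` would map a candidate parameter vector (for example spectral radius, input scaling, and leak rate) to a validation error; that mapping is outside this sketch.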

      Image-text emotion classification based on visual feature enhancement and bidirectional interaction fusion
      WANG Luyao, HU Huijun, LIU Maofu
      2025, 47(11): 2056-2066. doi:
      Abstract ( 46 )   PDF (1485KB) ( 21 )     
      Multimodal sentiment analysis is receiving increasingly widespread attention, with the aim of utilizing multimodal information such as text and images to achieve emotion prediction. Compared to text, the visual modality, as an auxiliary modality, may contain more redundant or confounding information unrelated to emotions, and existing research does not fully consider the interaction and complementarity between multiple perceptual modalities. To address these issues, an image-text emotion classification model based on visual feature enhancement and bidirectional interactive fusion (VFEBIF) is proposed. In this approach, the fine-grained visual feature enhancement module utilizes structured knowledge from scene graphs and filtering techniques based on CLIP to extract keywords from the text related to visual semantics, thereby enhancing local visual features. Additionally, the bidirectional interactive fusion module implements inter-modal interaction in parallel, and fuses multimodal features to thoroughly explore complementary information between text and image, thus achieving emotion classification. Experiments on two public datasets, TumEmo and MVSA-Single, demonstrate that the VFEBIF method outperforms most existing approaches and can effectively improve the performance of sentiment classification.

      Short-term power load forecasting based on PCA-PSO_KFCM clustering and BiLSTM-Attention
      DENG Mingliang, ZHANG Zhao, ZHOU Hongyan, CHEN Xuebo
      2025, 47(11): 2067-2081. doi:
      Abstract ( 36 )   PDF (1835KB) ( 21 )     
      Accurate and reliable short-term power load forecasting can optimize power dispatching, improve the utilization of power resources, and provide valuable references for the actual production of the power sector. With the diversification of power-using terminals and the influence of short-term factors such as weather and date, the load sequence shows obvious uncertainty and randomness. To address this, a novel two-stage short-term power load forecasting method based on improved kernel fuzzy C-means (KFCM) clustering and a bidirectional long short-term memory network with attention (BiLSTM-Attention) is proposed. In the first stage, the KFCM clustering method, jointly improved by principal component analysis and particle swarm optimization, is used to group load data points with similar electricity consumption characteristics into one class, which makes the model training more targeted. In the second stage, meteorological and temporal features with high correlation are selected as inputs through the Pearson correlation coefficient. Meanwhile, to improve the prediction performance of the model, a temporal attention mechanism and a multi-head self-attention mechanism are introduced into the BiLSTM model. Finally, the proposed method is applied to the real power load dataset provided by Chongqing Electric Power Company in China. The experimental results show that the prediction accuracy is significantly improved compared with many different prediction methods.
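      The feature selection step in the second stage can be sketched with a plain Pearson correlation filter. The threshold value, feature names, and data below are hypothetical and serve only to show the mechanics; the abstract does not state the cutoff the authors used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(features, load, threshold=0.5):
    """Keep features whose absolute correlation with the load meets the threshold."""
    scores = {name: pearson(values, load) for name, values in features.items()}
    return {name: r for name, r in scores.items() if abs(r) >= threshold}


# Hypothetical toy data: four load readings and three candidate features.
load = [10.0, 20.0, 30.0, 40.0]
features = {
    "temperature": [1.0, 2.0, 3.0, 4.0],   # rises with load
    "holiday_index": [4.0, 3.0, 2.0, 1.0],  # falls as load rises
    "noise": [1.0, -1.0, -1.0, 1.0],        # uncorrelated with load
}
selected = select_features(features, load)
```

      Using the absolute value keeps strongly negative correlations as well, since a feature that moves opposite to the load is just as informative to the forecasting model.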


      A self-supervised signed social relationship prediction model fusing balance and status theories
      TANG Yuechen, MA Huifang, SHU Ke
      2025, 47(11): 2082-2090. doi:
      Abstract ( 38 )   PDF (1305KB) ( 19 )     
      Signed social relationship prediction aims to predict whether positive or negative directed interactions exist between social entities in a social network. Existing social relation prediction models often overlook the signed and directional characteristics in social networks. Balance theory provides guidance for modeling signed social relations, while status theory offers guidance for modeling social relations that are both signed and directed. Moreover, self-supervised techniques can effectively provide mutual assistance in learning node features from these two perspectives. Accordingly, this paper proposes a self-supervised signed social relationship prediction model fusing balance and status theories, which aims to make full use of balance and status theories to model friendships and hierarchical relationships, and to capture the effects of edge signs and directions on different relationships, respectively. To improve prediction performance, a contrastive learning mechanism is used to explore the complementary information of signs and directions in the network during training. Experiments on real datasets validate the effectiveness of the proposed model.
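      The two social theories named in this abstract have simple combinatorial statements, sketched below. This is not the paper's model: it only checks the textbook definitions, namely that a triangle is balanced when the product of its edge signs is positive, and that a positive edge u→v implies v has higher status than u (a negative edge implies lower). Function names and example data are hypothetical.

```python
from itertools import combinations


def count_balanced_triangles(signed_edges):
    """signed_edges maps frozenset({u, v}) to +1 or -1 (direction ignored).

    Returns (balanced, total) over all fully connected triples.
    """
    nodes = sorted({n for edge in signed_edges for n in edge})
    balanced = total = 0
    for u, v, w in combinations(nodes, 3):
        pairs = [frozenset((u, v)), frozenset((v, w)), frozenset((u, w))]
        if all(p in signed_edges for p in pairs):
            total += 1
            product = 1
            for p in pairs:
                product *= signed_edges[p]
            balanced += product > 0
    return balanced, total


def status_violations(directed_edges, status):
    """Return edges (u, v, sign) that contradict status theory for given scores."""
    return [(u, v, s) for u, v, s in directed_edges
            if (status[v] - status[u]) * s <= 0]


# "The enemy of my friend is my enemy": one positive and two negative edges.
friends_vs_rival = {
    frozenset(("a", "b")): 1,
    frozenset(("b", "c")): -1,
    frozenset(("a", "c")): -1,
}
balanced, total = count_balanced_triangles(friends_vs_rival)
```

      A learned model would replace the fixed `status` scores with node embeddings, but these checks show what structural signal each theory contributes.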