Computer Engineering & Science

Rubyphi: Automated model checking for cache coherence protocols in gem5

XU Xuezheng, FANG Jian, LIANG Shaojie, WANG Lu, HUANG Anwen, SUI Jinggao, LI Qiong

2025, 47(7): 1141-1151. doi:

Cache coherence protocols are the cornerstone of data consistency in multi-core systems and directly affect the performance of the memory subsystem, making them a longstanding focal point in microprocessor design and verification. The design and optimization of coherence protocols typically rely on software simulators such as gem5 for rapid implementation. Additionally, errors in protocols are difficult to trigger, locate, and repair during simulation, necessitating the use of model checking tools such as Murphi for formal verification. However, simulator-based protocol design and optimization differ significantly from model-checking-based protocol verification in programming language and level of abstraction. Designers must separately implement simulator code and construct model-checking frameworks, which not only increases time cost but also risks discrepancies between the two implementations. To address this challenge, this paper designs and implements Rubyphi, an automated model checking method for cache coherence protocols targeting the gem5 simulator. By extracting and translating the protocol descriptions and implementations from gem5, Rubyphi automatically generates a Murphi-based model checking framework to conduct formal verification. Experimental results demonstrate that Rubyphi effectively accomplishes the modeling and verification of coherence protocols in gem5, uncovering two existing bugs in gem5's protocols. The related issues and patches have been confirmed by the community.

Design of CNFET-based ternary cell library for multi-valued-logic computing

WANG Lei1, WANG Hong2, WANG Yao3, ZHU Xiaozhang2, YANG Zhijie1, TANG Yuhua3

2025, 47(7): 1152-1161. doi:

Compared with binary logic, ternary logic offers more logic states, giving ternary-logic circuits advantages including smaller area, higher utilization, improved transmission efficiency, and enhanced security. This paper implements fundamental ternary logic gates using commonly available carbon nanotube field-effect transistors (CNFETs), establishes a functionally complete ternary logic library, and proposes a method to reduce switching delay by minimizing the CNFET's physical channel length and source/drain length. Based on the developed ternary logic library, a 1-bit multiplier circuit was designed and implemented. HSPICE simulations verified both the circuit performance and the effectiveness of the proposed delay reduction method, demonstrating an average 47 ps reduction in switching delay compared to prior ternary 1-bit multipliers. For practical circuit applications, this paper builds a ternary logic cell library that can be used for higher-order circuit synthesis and physical design. The proposed method for lowering ternary circuit switching delay lays a good foundation for future very-large-scale integration, represented by high-performance microprocessors and artificial intelligence.

A lookahead sliding decision feedback equalizer for 112 Gbit/s SerDes receiver

YANG Zhouhao, Lv Fangxu, XU Weixia, LI Shijie, XU Chaolong, HU Xiaoyue

2025, 47(7): 1162-1169. doi:

With the continuous advancement of information technology, wireline data rates have experienced a significant leap from 112 Gbit/s to 224 Gbit/s. The increase in data rates has raised the complexity requirements for SerDes receiver equalizers. To address timing constraints and other issues brought about by complex equalizer structures, a sliding block decision feedback equalizer based on a lookahead structure is proposed. The design incorporates a 6-tap feedforward equalizer (FFE) and a 9-tap decision feedback equalizer (DFE) for digital signal processing. Functional validation is conducted through MATLAB simulation modeling. The results show that at a data rate of 112 Gbit/s, under channel attenuation ranging from 8 dB to 35 dB, this digital signal processing design, which utilizes a least mean squares (LMS) adaptive equalization algorithm, can effectively reduce the bit error rate (BER). The BER performance meets the discrimination requirements of KP4 forward error correction (FEC) and demonstrates superior performance compared to traditional equalizer structures.
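The LMS adaptation mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's design: it adapts only a 6-tap feedforward filter (no DFE, no lookahead or sliding-block structure), trains against known symbols, and uses a hypothetical three-tap channel, PAM4 alphabet, and step size chosen purely for illustration.

```python
import random


def lms_equalize(received, desired, num_taps=6, step=0.005):
    """Adapt feedforward tap weights with the LMS rule.

    Returns the final taps and the per-sample squared error.
    """
    taps = [0.0] * num_taps
    taps[0] = 1.0  # start as a pass-through filter
    squared_errors = []
    for n in range(num_taps - 1, len(received)):
        # Most recent sample first: received[n], received[n-1], ...
        window = [received[n - k] for k in range(num_taps)]
        output = sum(w * x for w, x in zip(taps, window))
        error = desired[n] - output
        # LMS update: move each tap along the instantaneous gradient.
        taps = [w + step * error * x for w, x in zip(taps, window)]
        squared_errors.append(error * error)
    return taps, squared_errors


random.seed(0)
# Assumed PAM4 symbol alphabet and an assumed channel with post-cursor ISI.
symbols = [random.choice([-3.0, -1.0, 1.0, 3.0]) for _ in range(4000)]
channel = [1.0, 0.4, 0.2]
received = [
    sum(channel[k] * symbols[n - k] for k in range(len(channel)) if n - k >= 0)
    for n in range(len(symbols))
]

taps, errors = lms_equalize(received, symbols)
early = sum(errors[:200]) / 200   # mean squared error while still adapting
late = sum(errors[-200:]) / 200   # mean squared error after convergence
```

Comparing `early` and `late` shows the squared error shrinking as the taps adapt, which is the behavior the abstract relies on to lower the BER.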

High performance Cholesky factorization on emerging GPU architectures using Tensor Cores

SHI Lu, ZOU Gaoyuan, WU Siqi, ZHANG Shaoshuai

2025, 47(7): 1170-1180. doi:

General matrix-matrix multiplications (GEMMs) can achieve highly optimized performance on Tensor Cores. However, because of its limited parallelism, existing implementations of Cholesky factorization fail to reach most of the peak performance of Tensor Cores. This paper studies a recursive Cholesky factorization algorithm that recursively subdivides diagonal blocks, generating a large number of GEMM operations between non-diagonal blocks. This algorithm enables the extraction of a higher proportion of the peak performance of Tensor Cores for the internal symmetric rank-k update (SYRK) and triangular solve matrix (TRSM) operations. Experimental results show that the recursive Cholesky factorization algorithm proposed in this paper achieves speedups of 1.72× and 1.62× compared to the MAGMA/cuSOLVER algorithms on FP32 and FP16, respectively.
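The recursive structure described above can be sketched on the CPU in plain Python. This is only an illustration of the block recursion (factor the leading diagonal block, TRSM, SYRK, recurse on the trailing block); it is not the paper's GPU implementation, it uses no Tensor Cores, and the function names, the `base` cutoff, and the test matrix are assumptions.

```python
import math


def cholesky_unblocked(a):
    """Textbook Cholesky for a small SPD matrix given as a list of rows."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        diag = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        l[j][j] = math.sqrt(diag)
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l


def solve_row(l11, row):
    """Solve x * L11^T = row for x by forward substitution (one TRSM row)."""
    x = []
    for j in range(len(row)):
        x.append((row[j] - sum(l11[j][k] * x[k] for k in range(j))) / l11[j][j])
    return x


def cholesky_recursive(a, base=2):
    """Return lower-triangular L with L * L^T == a, splitting diagonal blocks."""
    n = len(a)
    if n <= base:
        return cholesky_unblocked(a)
    h = n // 2
    m = n - h
    a11 = [row[:h] for row in a[:h]]
    a21 = [row[:h] for row in a[h:]]
    a22 = [row[h:] for row in a[h:]]
    l11 = cholesky_recursive(a11, base)
    # TRSM: L21 = A21 * L11^{-T}
    l21 = [solve_row(l11, row) for row in a21]
    # SYRK: trailing update A22 - L21 * L21^T (matrix-multiply shaped work)
    trailing = [
        [a22[i][j] - sum(l21[i][k] * l21[j][k] for k in range(h)) for j in range(m)]
        for i in range(m)
    ]
    l22 = cholesky_recursive(trailing, base)
    top = [l11[i] + [0.0] * m for i in range(h)]
    bottom = [l21[i] + l22[i] for i in range(m)]
    return top + bottom


# Assumed test input: B * B^T + n * I is symmetric positive definite.
n = 5
b = [[float((3 * i + 5 * k) % 7) for k in range(n)] for i in range(n)]
a = [
    [sum(b[i][k] * b[j][k] for k in range(n)) + (n if i == j else 0.0) for j in range(n)]
    for i in range(n)
]
l = cholesky_recursive(a)
```

In a GPU setting the TRSM and SYRK steps would themselves be subdivided so that most of the arithmetic lands in GEMM calls; the sketch leaves them as direct loops for readability.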

A virtual backbone construction algorithm in wireless sensor networks

HUANG Jinhe1, 2, LIANG Jiarong1, 2, LI Changzhen3

2025, 47(7): 1181-1192. doi:

The virtual backbone of a wireless sensor network consists of nodes responsible for computation and routing tasks, and its energy efficiency is a critical factor in determining the overall network lifespan. The problem of constructing a long-lived fault-tolerant virtual backbone in wireless sensor networks can be abstracted as the weighted connected domatic partition problem in weighted undirected graphs, which is an NP-hard problem. This paper proposes a lifespan-aware fault-tolerant virtual backbone construction algorithm, which comprises two sub-algorithms. Sub-algorithm 1 adopts a greedy strategy to select nodes with higher energy levels to construct multiple disjoint connected dominating sets, maximizing the battery utilization efficiency of virtual backbone nodes based on a sleep-wake mechanism. Sub-algorithm 2 employs a pseudo-disjoint connected dominating set technique to select nodes with longer lifespans to update the virtual backbone obtained from sub-algorithm 1, thereby constructing a new long-lived fault-tolerant virtual backbone. Simulation results demonstrate that the proposed algorithm outperforms the compared algorithms in terms of virtual backbone lifespan and the number of connected dominating sets.
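As a rough illustration of the greedy idea in sub-algorithm 1, the sketch below grows a single connected dominating set while preferring higher-energy nodes. It is an assumption-laden simplification: the paper builds multiple disjoint sets and then refines them, whereas this code builds one set, and the graph, energy values, and function names are hypothetical.

```python
def greedy_cds(adj, energy):
    """Grow one connected dominating set, preferring higher-energy nodes."""
    start = max(adj, key=lambda v: energy[v])
    cds = {start}
    covered = {start} | set(adj[start])
    while len(covered) < len(adj):
        # Covered non-members are adjacent to the set, so adding one keeps
        # the set connected; require that it covers at least one new node.
        frontier = [v for v in covered - cds if set(adj[v]) - covered]
        if not frontier:
            raise ValueError("graph is not connected")
        best = max(frontier, key=lambda v: (energy[v], len(set(adj[v]) - covered)))
        cds.add(best)
        covered |= set(adj[best])
    return cds


def is_connected_dominating_set(adj, nodes):
    """Check domination and connectivity of the induced subgraph."""
    nodes = set(nodes)
    dominated = all(v in nodes or bool(nodes & set(adj[v])) for v in adj)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(u for u in adj[v] if u in nodes)
    return dominated and seen == nodes


# Hypothetical six-node sensor network with residual energy per node.
adj = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}
energy = {"a": 5, "b": 9, "c": 3, "d": 7, "e": 8, "f": 2}
cds = greedy_cds(adj, energy)
```

On this example the set starts at the highest-energy node `b` and extends through `d` and `e`, after which every remaining node has a neighbor in the backbone.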

A routing optimization algorithm for software-defined optical transport network based on multi-view graph attention mechanism

CHEN Junyan1, LI Xinmei1, ZHU Changhong2, XIAO Wei3

2025, 47(7): 1193-1204. doi:

To address issues such as poor convergence performance and weak generalization capability in traditional deep reinforcement learning (DRL) applications for routing optimization in software-defined optical networks (SDONs), this paper proposes a multi-view graph attention mechanism-based deep Q-network (MGATDQN) algorithm to optimize routing decisions in SDONs. First, a DRL-based routing decision model is designed to identify the optimal routing strategy for each source-destination traffic demand in the optical network. Second, considering the sparse connectivity characteristics of nodes in optical networks, a multi-view attention network is employed as the network model for the deep Q-network (DQN). By computing attention weights for neighboring edges, the reinforcement learning agent can selectively aggregate critical network information, thereby enhancing the model's generalization capability. Additionally, the integration of multi-view learning improves the convergence speed and stability of the graph attention network model. Finally, simulation-based routing experiments are conducted using the Gym framework, and the algorithm's load-balancing capability and generalization performance are evaluated across different network topologies. Experimental results demonstrate that the MGATDQN algorithm exhibits superior convergence performance and load-balancing ability in SDON routing optimization. Moreover, it generalizes well to unseen network structures and maintains robust decision-making capabilities even when certain network nodes fail.

Research on task offloading scheduling and resource allocation mechanism of vehicle-edge-cloud collaboration

ZHAO Peng, KUANG Zhufang

2025, 47(7): 1205-1214. doi:

On the basis of vehicular edge computing, vehicle-edge-cloud collaboration can further enable coordination between vehicles and the cloud, providing vehicles with additional computing and storage resources to achieve a smarter, safer, and more reliable driving experience. In traditional research, the computational tasks of vehicle users are assumed to be independent and indivisible, with no dependencies between tasks. However, in real-world applications, with the advancement of artificial intelligence, many applications consist of multiple interdependent components, making the consideration of such dependency-based computational demands essential. Therefore, this paper focuses on a multi-vehicle, multi-task edge computing scenario under vehicle-edge-cloud collaboration, constructing a model that accounts for vehicle-edge-cloud coordination, task dependencies, and task priorities to address task offloading decisions, task scheduling decisions, and resource allocation. With the goal of minimizing system energy consumption, a joint optimization algorithm, JPDDO, based on a priority algorithm and a double deep Q-network (DDQN) is proposed. First, multiple sets of dependent tasks are prioritized. Second, the offloading decisions, scheduling decisions, computing frequency, and transmission power for the resulting task queue are solved using the DDQN algorithm. Simulation results validate the effectiveness of the proposed method, demonstrating consistently low energy consumption under different network environments and parameter settings.

A malicious code variant families tracing method based on generative adversarial network

LI Li, ZHANG Qing, KONG Youran, SU Renjia, ZHAO Xin

2025, 47(7): 1215-1225. doi:

To address the rapid mutation and difficult traceability of malicious code, this paper proposes a classification method that enhances familial traceability by creating a dataset of malicious code variants. The method visualizes malicious code, employs an improved generative adversarial network (GAN) for classification, and utilizes Ghost modules and Dropout layers to balance the adversarial capabilities of the generator and discriminator. An efficient channel attention mechanism is introduced to help the model focus on critical features, while a combined structure of convolution and upsampling avoids checkerboard artifacts in generated images. During testing, the model's familial traceability for malicious code variants is validated using both a malicious code variant dataset and datasets with distinct categorical features. The proposed method achieves stronger feature extraction, lower resource consumption, and faster inference speed, meeting the demands that modern, rapidly evolving malicious code places on anti-obfuscation capability and generalization. Additionally, it is suitable for deployment on mobile and embedded devices, ensuring real-time detection of malicious code.

BotChecker: A Transformer-based GitHub bot detection model

ZHANG Jin1, 3, WU Xingjin1, ZHANG Yang2, XU Shunyu1

2025, 47(7): 1226-1236. doi:

In open-source software, accurately distinguishing software development assistant robots (bots) from human contributors is crucial for understanding and evaluating contribution activities. Given the outstanding performance of deep learning models in NLP and software engineering-related fields, this paper proposes BotChecker, a Transformer-based automated bot detection model. By incorporating enhanced fully connected layers and a dedicated binary classifier structure into the Transformer, the model can effectively learn from comment text data of bot and human accounts to detect bots. Experiments validate the effectiveness of BotChecker in bot detection tasks, achieving accuracy, recall, and F1-score of 0.941, 0.894, and 0.938, respectively. Furthermore, this paper analyzes the impact of model hyperparameters and zero-shot learning on BotChecker's performance. The proposed model can provide technical support for bot account identification in open-source communities and serve as a methodological benchmark for future research.
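The abstract reports recall and F1-score but not precision. Assuming the F1-score is the usual harmonic mean of precision and recall computed for the same class (an assumption, since the averaging scheme is not stated here), the implied precision follows from rearranging F1 = 2PR / (P + R) into P = F1 · R / (2R − F1). The short sketch below does the arithmetic; because the reported figures are rounded, the result is only approximate.

```python
def implied_precision(recall, f1):
    """Solve F1 = 2*P*R / (P + R) for the precision P."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("no valid precision: F1 must be below 2 * recall")
    return f1 * recall / denominator


# Values as reported in the abstract (rounded to three decimals).
precision = implied_precision(recall=0.894, f1=0.938)
```

Under that assumption the implied precision comes out at roughly 0.99, which would mean the model rarely labels a human account as a bot while missing about one bot in ten.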

Analysis of influence of surface grid construction method on numerical simulation of high-speed flow aerodynamic/aeroheating

ZHANG Jianwei, WEN Hao, ZHAO Yang

2025, 47(7): 1237-1243. doi:

High-speed flow fields are highly complex and difficult to characterize in detail. Optimizing grid layout can enhance the depiction of such flows, thereby improving the accuracy of aerodynamic/aeroheating numerical simulations. This paper employs a blunt-body model as the research subject, constructing surface meshes for high-speed flows using different mesh generation methods. By comparing the numerical simulation accuracy of aerodynamic/aeroheating across these grids, the influence of surface grid construction methods on numerical predictions is analyzed. Through multiple case validations, an optimized surface grid generation method is proposed for high-precision computation of high-speed flow fields. The recommended method provides a reliable foundation for high-fidelity numerical simulations of aerodynamic/aeroheating in high-speed flows.

Survey of fuzzing test case generation techniques

LIU Hui1, 2, HOU Tongding1, 2, ZHAO Bo3, 4, GUO Hanbin1, 2

2025, 47(7): 1244-1261. doi:

Fuzz testing is one of the mainstream software vulnerability detection technologies and has been widely applied across various fields. In recent years, significant progress has been made in the research of fuzzing test case generation techniques. Firstly, this paper reviews the development of fuzzing test case generation technology, classifying and summarizing relevant research while providing a comprehensive comparison. Secondly, based on an in-depth study of fuzzing test case generation techniques, this paper establishes a framework for constructing test cases through both generation-based and mutation-based approaches. Subsequently, this paper categorizes fuzzing test case construction techniques, delving into the process by which fuzzers extract features from program structure and semantics and combine feedback information to generate test cases. Furthermore, this paper classifies and elaborates on the challenges and tasks faced by existing fuzzing test case generation techniques in four key areas: browsers, network protocols, compilers, and operating systems, followed by a systematic summary and comparative analysis. Finally, this paper discusses the limitations and potential solutions of current fuzzing test case generation techniques from multiple perspectives and outlines promising future research directions in this field.

A pyramid feature decoupling extraction fusion network for pansharpening

LIN Yi1, 2, 3, SONG Huihui1, 2, 3

2025, 47(7): 1262-1273. doi:

The objective of pansharpening is to fuse low-resolution multispectral images (LRMS) and their corresponding high-resolution panchromatic images (PAN) acquired by the same remote sensing satellite to generate high-resolution multispectral images (HRMS). Existing networks rely too heavily on the feature extraction and fusion capabilities of deep learning, failing to focus on the advantageous features of each modality and neglecting the distinct representations inherent in multimodal data, which leads to excessive redundant features in the final output. To extract features that independently express the desired representations, reduce redundant information, and better integrate the complementary information from both modalities, this paper proposes a novel pyramid feature decoupling extraction and fusion network for pansharpening, effectively enhancing the clear representation of spectral and texture details in images. First, inspired by the divide-and-conquer concept, the network decouples and separately extracts spectral and texture information, employing different attention mechanisms to capture the unique details of each modality. Then, a cross-modal feature fusion module strengthens the interaction between features of different modalities, enabling the network to acquire complementary information while eliminating redundancy. Finally, based on a pyramid structure, the network performs feature extraction and fusion operations at multiple spatial scales. Extensive experiments conducted on the GaoFen-2 and WorldView-3 satellite datasets demonstrate that the proposed network significantly outperforms state-of-the-art approaches, providing substantial improvements for the pansharpening task.

A high-precision contact angle measurement method based on improved LESRCNN super-resolution model

ZHANG Maodi1, WANG Jun1, 2, SUN Xiaohong1

2025, 47(7): 1274-1284. doi:

To address the current issues of low accuracy and stability in contact angle measurement, this paper proposes a contact angle measurement method based on an improved lightweight enhanced super-resolution CNN (LESRCNN). The improved network replaces the original information extraction module with a feature extractor composed of ConvNeXt blocks, which build on ResNet50 by borrowing structural ideas from Swin-T, to boost super-resolution performance. Additionally, an enhanced spatial attention (ESA) mechanism is introduced to improve the network's ability to capture fine-grained image details, and the Gaussian error linear unit (GELU) is adopted instead of the rectified linear unit (ReLU) as the activation function to accelerate convergence. Furthermore, to enhance the robustness of the fitting results, this paper employs the Huber function as the weighting function and utilizes an ellipse fitting method based on iteratively reweighted least squares (IRLS) to simultaneously fit the droplet contours from both left and right sides for contact angle calculation. Experimental results demonstrate that, at a scaling factor of 3, the improved network achieves a peak signal-to-noise ratio (PSNR) increase of 0.8 dB and a structural similarity index (SSIM) improvement of 0.0026 compared to the original network when trained on the same droplet dataset. For standard samples with contact angles below 90°, the accuracy and stability of measurements improved by 34.3% and 7.4%, respectively, while for samples with angles of 90° or above, the improvements were 18.2% and 29.4%, respectively.
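To make the Huber-weighted IRLS step concrete, the sketch below applies it to a deliberately simpler problem: fitting a straight line instead of an ellipse. The swap is an assumption made to keep the example short; the data, the threshold `delta`, and all function names are hypothetical, and the paper's ellipse model and contour extraction are not reproduced.

```python
def huber_weight(residual, delta):
    """Huber weight: 1 inside the threshold, delta/|r| outside it."""
    magnitude = abs(residual)
    return 1.0 if magnitude <= delta else delta / magnitude


def weighted_line_fit(xs, ys, weights):
    """Closed-form weighted least squares for y = a + b * x."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def irls_line_fit(xs, ys, delta=1.0, iterations=50):
    """Alternate weighted fits and Huber reweighting of the residuals."""
    weights = [1.0] * len(xs)
    intercept, slope = weighted_line_fit(xs, ys, weights)
    for _ in range(iterations):
        weights = [
            huber_weight(y - (intercept + slope * x), delta) for x, y in zip(xs, ys)
        ]
        intercept, slope = weighted_line_fit(xs, ys, weights)
    return intercept, slope


# Hypothetical data on the line y = 1 + 2x with one gross outlier at x = 9.
xs = [float(x) for x in range(10)]
ys = [1.0 + 2.0 * x for x in xs]
ys[9] += 30.0

a_ols, b_ols = weighted_line_fit(xs, ys, [1.0] * len(xs))
a_huber, b_huber = irls_line_fit(xs, ys)
```

The ordinary least-squares slope is pulled strongly toward the outlier, while the reweighted fit stays close to the true slope of 2, which is the robustness property the abstract attributes to the Huber weighting.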

A category-aware semi-supervised knowledge distillation model for long-tailed classification

JI Lei, LI Xi, XU Dahong, LIU Hong, GUO Jianping

2025, 47(7): 1285-1294. doi:

In classification-based pattern recognition tasks, the training process requires handling a large number of category samples. In practice, these samples exhibit a significant long-tailed distribution, posing substantial challenges for such tasks. The challenges brought by long-tailed distribution mainly manifest in two aspects: an imbalanced feature space, and difficulty in focusing on hard samples in the tail regions. To address these issues, a category-aware semi-supervised knowledge distillation model is proposed, which comprises two core components: balanced semi-supervised knowledge distillation and balanced category-aware learning. The former employs semi-supervised knowledge distillation to achieve a more balanced feature space. The latter integrates a category-aware margin loss function with a delayed hard sample learning activation loss function, improving classifier performance and enhancing focus on hard samples. Experiments were conducted on five benchmark datasets: CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, iNaturalist2018, and Places-LT. Notably, on ImageNet-LT, the proposed model achieved a Top-1 accuracy of 57.5%, outperforming other models.

A digital resource integration method based on agent model

WANG Binfeng, CAI Libing, KONG Longxing, HUANG Shaohua, HAN Wenbin

2025, 47(7): 1294-1302. doi:

Digital technologies continue to empower various industries and have achieved remarkable results. The development and application of digital resources are key to digital empowerment. However, the forms of digital resources required to build simulation systems for specific objectives vary significantly, leading to increased complexity in system development and reduced operational efficiency. This paper begins by analyzing the fundamental concepts of digital resources and summarizes their associated application requirements. Subsequently, it proposes a digital resource integration method based on an agent model, which demonstrates significant advantages in mitigating the heterogeneity among different digital models within a system. The method is elaborated from two perspectives: 1) an agent model-based integration framework, which supports the integration of both agent model-based digital resources and those not employing agent models; 2) the design of agent models for digital resources, which satisfies data transmission requirements between agent models (both externally and internally) and facilitates data conversion across different external interfaces. Finally, the effectiveness and flexibility of agent model-based digital resource integration are validated through a real-time satellite trajectory visualization case study.

A tumor disease prediction model based on PKUSEG-Text-GCN

GAO Zhiling1, ZHAO Xinyu1, 2

2025, 47(7): 1303-1311. doi:

Current disease prediction models primarily focus on local and contextual information within medical records, lacking the incorporation of global information, which results in suboptimal prediction accuracy. Leveraging the capability of graph neural networks to capture global information, this study proposes the use of graph convolutional networks (GCN) for tumor disease prediction based on Chinese electronic medical records (EMRs). Firstly, the PKUSEG medical domain-specific word segmentation model is employed to tokenize Chinese EMRs. Then, a text graph is constructed by analyzing the co-occurrence relationships between medical records and words, as well as the relationships between words within the medical text. Finally, the text graph convolutional network (Text-GCN) is applied to learn the features of this medical text graph, and the trained model is utilized for tumor disease prediction. Experimental results demonstrate that the proposed model achieves a 6% improvement in accuracy compared to the best-performing baseline model. Moreover, the accuracy does not significantly decline when the dataset is small, indicating that the method exhibits strong robustness even with limited electronic medical records.

A butterfly optimization algorithm with multi-strategy improvement

ZHANG Qi1, GU Tengda1, REN Yuchen1, JI Jinqi2, CHEN Haitao1

2025, 47(7): 1312-1320. doi:

To address the shortcomings of the butterfly optimization algorithm (BOA), such as poor search accuracy, imbalanced global exploration and local exploitation capabilities, and susceptibility to local optima, this paper proposes a multi-strategy improved butterfly optimization algorithm (MSIBOA) to enhance its robustness and optimization performance. The improved algorithm adopts random consistency to initialize the butterfly population, ensuring a more uniform distribution of individuals across all dimensions of the search space and broader coverage of the solution space. A dynamic inertia weight strategy is introduced to balance global and local search, while an elite differential mutation strategy is incorporated to boost the algorithm's global search capability. Experimental comparisons between the improved algorithm and seven other optimization algorithms on 17 benchmark functions demonstrate that MSIBOA outperforms the original BOA in convergence speed, solution accuracy, global optimization capability, and robustness.

Multimodal aspect-based sentiment analysis based on dual channel graph convolutional network

ZHANG Feng1, SHAO Yubin1, DU Qingzhi1, LONG Hua1, MA Dinan2

2025, 47(7): 1321-1330. doi:

In the task of multimodal aspect-based sentiment analysis, traditional methods primarily focus on deep-level cross-modal interactions between images and texts while paying less attention to the aspect-related shallow information within images and texts. This oversight leads to the introduction of aspect-irrelevant noise, thereby limiting the model's ability to capture the complex relationships between aspects and sentiments. To address this issue, a dual-channel graph convolutional network (DCGCN) model is proposed. Based on the architecture of the BART model, the proposed approach employs an attention mechanism to enhance aspect semantics, leverages graph convolutional networks (GCN) to extract aspect-enhanced multimodal features, and aggregates syntactic dependencies, aspect-based positional dependencies, and aspect-augmented image-text correlation information into the GCN adjacency weight matrix to obtain multi-information-aware multimodal features. Experiments demonstrate that the proposed model achieves F1 scores of 67.4% and 67.9% on two Twitter datasets, respectively, and can improve the performance of multimodal aspect-based sentiment analysis.
