Computer Engineering & Science

Evolution and future prospects of fundamental software

LIAO Xiangke, TAN Yusong, JIA Zhouyang, WANG Shangwen, JIAN Songlei, LI Bao

2026, 48(5): 761-769. doi: 10.3969/j.issn.1007-130X.2026.05.001

Fundamental software is the vital link between hardware platforms and software applications. It is essential for building computing architectures and sustaining a strong technology ecosystem. This paper first reviews the history of fundamental software, explains how its technical architecture has expanded and evolved into a platform-based model, and traces its business evolution from a simple hardware accessory to the core of the ecosystem. Information technology is now entering a four-element integrated space that combines the physical world, the virtual world, human society, and intelligent machines. Against this trend, the paper examines the paradigm restructuring that the new generation of fundamental software platforms faces in terms of software form, system architecture, human-computer interaction, and engineering methods. It then analyzes the key technical challenges that intelligent operating systems encounter in open environments when addressing real-time tasks, exceptional resilience, continuous evolution, and physical security. Finally, the paper discusses a novel intelligent fundamental software stack jointly constituted by intelligent operating systems and intelligent compilers, the Internet of intelligent agents, compilation optimization and intelligent development environments, foundation models and world models, and causal data lakes.

A current-mode bandgap voltage reference with low temperature coefficient and high PSRR using subthreshold MOSFET in 28 nm CMOS

ZHAO Chengzhuo, Lv Fangxu, XU Weixia, HUANG Heng, LUO Zhang, XIN Kewei, WANG Wenchen, LI Meng, LAI Mingche, PANG Zhengbin

2026, 48(5): 770-778. doi: 10.3969/j.issn.1007-130X.2026.05.002

As the size of core devices in integrated circuits shrinks, large-scale processes gradually fail to meet the performance requirements of circuits under advanced technologies. As a fundamental unit in analog circuits, the bandgap voltage reference needs to adapt to process variations. Using a 28 nm CMOS process, this paper proposes a current-mode bandgap voltage reference with a low temperature coefficient and an excellent power supply rejection ratio (PSRR) over a wide bandwidth. Unlike conventional circuits that employ bipolar transistors to generate positive and negative temperature coefficient voltages, this bandgap voltage reference generates its reference voltage by leveraging the temperature characteristics of MOSFET voltages in the subthreshold region. Subthreshold devices have a low turn-on voltage, which significantly reduces the common-mode voltage of the amplifier and increases the vertical voltage margin. A current mirror structure generates a secondary power supply and a common-gate transistor increases the output resistance, which together enhance the PSRR at high frequencies. The circuit operates from a 1.8 V power supply. Under the TT process corner, it provides a stable reference voltage of 943 mV over the temperature range of -40 to 125 ℃, with a temperature coefficient of 6.1 ppm/℃. Within an input voltage range of 1.5 to 5 V, the line regulation is 0.33%. The PSRR is -64.3 dB at DC, remains below -64.1 dB from 0 to 16 kHz, and still reaches -58 dB at 100 kHz. The layout area is 0.0038 mm², and the quiescent power consumption is 16.56 μW.
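
For readers unfamiliar with the figure of merit, a temperature coefficient in ppm/℃ is often computed with the box method: the spread of the reference voltage over the temperature range, divided by the nominal voltage and the temperature span. The Python sketch below assumes that definition (the abstract does not state which one the authors use), and the function name and arguments are illustrative only.

```python
def temperature_coefficient_ppm(v_max, v_min, v_nom, t_min, t_max):
    """Box-method temperature coefficient in ppm per degree Celsius.

    Assumed definition (not taken from the paper):
        (v_max - v_min) / (v_nom * (t_max - t_min)) * 1e6
    Voltages are in volts, temperatures in degrees Celsius.
    """
    if t_max <= t_min:
        raise ValueError("t_max must be greater than t_min")
    if v_nom == 0:
        raise ValueError("v_nom must be non-zero")
    return (v_max - v_min) / (v_nom * (t_max - t_min)) * 1e6
```

Under this assumed definition, a 943 mV reference with 6.1 ppm/℃ over -40 to 125 ℃ would correspond to a total drift of roughly 0.95 mV.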

A survey on metadata prefetching strategies for distributed file system

WANG Zhenfei, DUN Longxiang, BAO Ziliang, YANG Ruijia, LI Guiqiu

2026, 48(5): 779-792. doi: 10.3969/j.issn.1007-130X.2026.05.003

When a distributed file system (DFS) handles large-scale data, metadata management poses a critical challenge. Metadata operations account for the majority of file system operations, so improving the performance of metadata services is of utmost importance. Traditional metadata access methods suffer from network latency and server load, which makes them inefficient. To address these problems, researchers have studied metadata prefetching strategies for DFS, including prefetching based on access patterns, caching mechanisms, and prediction models. These strategies reduce latency and improve I/O efficiency by proactively caching metadata that is about to be used. However, prefetching strategies face challenges related to prediction accuracy, cache management, data consistency, and security. Future development directions include prefetching strategies based on deep learning and intelligent algorithms, as well as adaptive and dynamically adjusted prefetching strategies. These strategies will help improve the efficiency and accuracy of metadata management, thereby meeting the ever-increasing storage demands of the big data era.

Parallel optimization for satisfiability problem solving

LI Ji, ZHOU Lei, GONG Chunye, MA Di, SHEN Yulin, ZHANG Xiang

2026, 48(5): 793-802. doi: 10.3969/j.issn.1007-130X.2026.05.004

Satisfiability (SAT) solvers are widely applied in fields such as hardware and software verification, information security, and computational biology. Current optimizations of SAT solvers primarily focus on reducing the solution space of formulas and simplifying the formula being solved. However, reducing the solution space faces challenges such as slow space reduction and insufficient parallel granularity, while formula simplification performs poorly when combined with existing parallel strategies for solving small-scale problems. This paper introduces kissat++, developed based on kissat, the fastest serial SAT solver to date. Specifically, we propose a fine-grained parallel algorithm for unit propagation using observation list-based dynamic blocking techniques, and introduce guided paths to achieve coarse-grained parallel optimization during search space partitioning. To further improve partitioning efficiency, factors such as decision levels are considered when constructing guided paths so that key variables are selected early, thereby reducing the search space on each process. Experimental results on the Tianhe supercomputer demonstrate that kissat++ achieves more than a 2× speedup compared to the original kissat. Additionally, it solves 49 more instances within the time limit on the SAT benchmark set and ranks ninth among the 16 solvers submitted to the parallel track of the 2023 competition.

A group-remapping encoding method for low-power GPU data transmission

ZHANG Tiefei, XING Jianguo

2026, 48(5): 803-809. doi: 10.3969/j.issn.1007-130X.2026.05.005

The high-performance computing capabilities of modern graphics processing units (GPUs) rely on high-bandwidth graphics double data rate (GDDR) interfaces. The high data transfer rates result in significant energy consumption, particularly due to the asymmetric power consumption associated with transmitting logic 1 values in GDDR’s pseudo open drain (POD) I/O interfaces. Reducing the number of energy-costly logic 1 values during data transfer can therefore alleviate this problem. This paper proposes a group-remapping encoding method based on the quantity of logic 1 values. First, the data to be transmitted is divided into basic units of 4 bits each, which are then grouped according to the number of logic 1 values they contain. Then, groups with more logic 1 values are mapped and encoded into groups with fewer logic 1 values, aiming to minimize the global count of logic 1 values. Evaluation on modern GPU architectures shows that the group-remapping encoding strategy reduces the number of logic 1 values during data transmission for various applications by 26% on average, demonstrating the effectiveness of the proposed method.
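
To make the grouping idea concrete, the Python sketch below splits data into 4-bit units, groups them by their count of logic 1 values, and remaps the high-count groups onto low-count ones. The remapping used here is plain nibble inversion with one flag bit per unit, which is a stand-in chosen for illustration and not the authors' encoding; all function names are hypothetical. Like the abstract, it treats logic 1 as the costly value.

```python
def popcount(value: int) -> int:
    """Number of logic 1 bits in a non-negative integer."""
    return bin(value).count("1")


def split_nibbles(data: bytes) -> list[int]:
    """Divide the data into basic units of 4 bits each (high nibble first)."""
    units = []
    for byte in data:
        units.append(byte >> 4)
        units.append(byte & 0xF)
    return units


def encode_unit(unit: int) -> tuple[int, int]:
    """Map units from the 3-ones and 4-ones groups to the 1-one and 0-ones groups.

    Returns (code, flag); flag is 1 when the unit was inverted.
    """
    if popcount(unit) > 2:
        return (~unit) & 0xF, 1
    return unit, 0


def decode_unit(code: int, flag: int) -> int:
    """Undo encode_unit."""
    return (~code) & 0xF if flag else code


def ones_before(data: bytes) -> int:
    """Logic 1 values transmitted without encoding."""
    return sum(popcount(u) for u in split_nibbles(data))


def ones_after(data: bytes) -> int:
    """Logic 1 values transmitted with encoding, counting the flag bits."""
    return sum(popcount(code) + flag
               for code, flag in map(encode_unit, split_nibbles(data)))
```

Because the flag bit is counted, this stand-in never increases the number of logic 1 values for a unit: a unit with three ones becomes one data bit plus one flag bit. A scheme that remaps whole groups by frequency, as the abstract suggests, could avoid the per-unit flag.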

CoTree: Borderless and decentralized server cooperation in edge computing

YUAN Xin, LI Ning, GAO Mingfeng, FANG Shutong, ZHANG Zhaoxin, YU Changli

2026, 48(5): 810-827. doi: 10.3969/j.issn.1007-130X.2026.05.006

In edge computing (EC), offloading tasks to edge servers or the cloud can significantly enhance system performance. However, because traffic distribution in edge computing is heterogeneous and dynamic, it is challenging for a single edge server to provide satisfactory computing services anytime and anywhere. This issue has spurred research on collaboration among edge servers. Previous server cooperation algorithms have been limited to a one-hop cooperation area. Although some studies have extended the cooperation area to multiple hops, they still fail to address the core issue in edge computing, namely task offloading. This paper proposes a novel distributed, borderless server cooperation algorithm called CoTree, which incorporates task offloading strategies. Its cooperation domain is unrestricted: each server forms its own basic cooperation unit (BCU) and calculates its declared capacity based on the BCU. CoTree also considers factors such as server computing capacity, processing delay, and the forwarding delay of tasks and computation results. Simulation results demonstrate that CoTree outperforms previous approaches.

A blockchain query system based on grouped skip list

LI Xudong, HUANG Yuhao, CHENG Ziguo, LI Zelin

2026, 48(5): 828-843. doi: 10.3969/j.issn.1007-130X.2026.05.007

As the amount of stored data grows, blockchains face the challenge of querying large-scale data efficiently. Users want to understand trends in blockchain data through complex queries, but the query cost of existing blockchain systems is high and the results cannot be reused. To address this, a grouped skip list and a blockchain query system based on it are proposed. The system is built incrementally on an existing blockchain system and provides users with equality, range, and top-k queries. First, a grouped skip list (Gskiplist) data structure is proposed, which aggregates traditional skip list nodes to reduce the number of nodes and thereby improve query efficiency. Users can construct grouped skip list indexes for different fields of blockchain data, and the index node height is adjusted dynamically according to a conditional probability algorithm and a heat value. Index nodes are persisted in the storage engine at the bottom of the blockchain to retain the decentralized and tamper-proof characteristics of blockchain data. Second, a query system with index and query modules is introduced so that users can query efficiently. Finally, the system is compared experimentally with mainstream solutions on real block datasets, using space cost, insertion efficiency, and query efficiency as metrics. Experimental results show that the system supports complex queries on the blockchain with good performance at a small space cost, demonstrating its feasibility.

A fraud detection method based on enhanced graph contrastive learning

GAO Yihui, LI Yuanqing, ZHANG Sanfeng, YANG Wang

2026, 48(5): 844-853. doi: 10.3969/j.issn.1007-130X.2026.05.008

Graph contrastive learning, as an effective pre-training strategy, can address the scarcity of high-quality labeled data in graph-based fraud detection. However, in current approaches of this kind, malicious behavioral features are either weakened by the aggregation mechanism of graph neural networks or compromised during data augmentation. To tackle this, this paper proposes an optimized graph contrastive learning method that integrates graph reconstruction and dynamic data augmentation to enhance the effectiveness of fraud detection. The method reduces conflicts arising from neighbor feature aggregation by adjusting edge weights in the graph, thereby improving detection accuracy. It also dynamically adjusts the data augmentation process using label invariance and distribution diversity metrics, so that the augmented data retains critical fraud features while remaining sufficiently diverse. Experimental results on multiple graph fraud detection datasets demonstrate the effectiveness of this method, with detection performance improvements ranging from 2% to 5% compared to state-of-the-art methods.

A H.265/HEVC video steganography method based on directional mapping and multi-region embedding

XIE Jiachen, ZHANG Xiang, FU Daoyong, HE Ziwen, LI Ziqiang

2026, 48(5): 854-864. doi: 10.3969/j.issn.1007-130X.2026.05.009

In recent years, video steganography for H.265/HEVC has attracted considerable attention. In particular, methods that employ prediction units (PUs) as embedding carriers have become a research focus in this field. Such methods hide information by designing mapping rules between secret messages and PUs. However, because embedding secret information substantially modifies the prediction units, their resistance to steganalysis is relatively weak. To address this issue, this paper proposes an H.265/HEVC video steganography method based on PU directional mapping and multi-region embedding. The proposed method first extracts the partition mode combinations of two adjacent PUs from 8×8 and 16×16 coding units (CUs), and maps them to a two-dimensional coordinate through directional mapping. Then, for different PU partition mode combinations, a multi-region embedding strategy is designed to achieve high-capacity adaptive steganography with minimal modification distortion, thereby enhancing the method’s resistance to steganalysis. Experimental results demonstrate that the proposed method outperforms existing PU-based video steganography methods in terms of visual quality, bitrate control, and resistance to steganalysis. This method holds significant potential for applications in covert video communication.

An optimized ultrasonic C-scan imaging method for diffusion bonding based on echo denoising

LI Yu, JIANG Yinying, CHANG Qing

2026, 48(5): 865-875. doi: 10.3969/j.issn.1007-130X.2026.05.010

To address the challenge of detecting defects at titanium alloy diffusion bonding interfaces in nondestructive testing, an optimized ultrasonic C-scan imaging method for diffusion bonding based on echo denoising is proposed to enhance the detection capability for minute defects. The method reduces noise by using ensemble empirical mode decomposition (EEMD) combined with wavelet soft-threshold denoising to reconstruct ultrasonic echo signals. The imaging method is optimized according to the characteristics of interface waves from minute defects: the peak-to-valley difference replaces the gate amplitude as the feature value for C-scan imaging, which highlights minute defects. Image enhancement and denoising techniques are also integrated to further improve imaging quality and defect detection capability. Practical testing on artificial defect samples demonstrates that, compared to conventional ultrasonic C-scan imaging and other imaging methods, the proposed method yields smaller errors between the detected defect lengths and the actual metallographic dimensions, and effectively detects minute defects within the specimens.
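
Two building blocks named in this abstract have simple standard forms, sketched below in Python: the soft-threshold rule applied to wavelet coefficients, and the two per-echo feature values being compared (gate amplitude and peak-to-valley difference). This is a minimal illustration, not the authors' pipeline: the EEMD and wavelet transform steps are omitted, and the function names, the gate convention (a half-open sample index range), and the use of the largest absolute sample as the gate amplitude are assumptions.

```python
def soft_threshold(coefficient: float, threshold: float) -> float:
    """Standard soft-threshold rule: shrink toward zero by `threshold`."""
    if coefficient > threshold:
        return coefficient - threshold
    if coefficient < -threshold:
        return coefficient + threshold
    return 0.0


def gate_amplitude(echo: list[float], start: int, end: int) -> float:
    """Largest absolute sample inside the gate [start, end) (assumed convention)."""
    return max(abs(sample) for sample in echo[start:end])


def peak_to_valley(echo: list[float], start: int, end: int) -> float:
    """Difference between the highest peak and the lowest valley inside the gate."""
    window = echo[start:end]
    return max(window) - min(window)
```

For an echo such as `[0.0, 0.5, -0.25, 0.125]`, the gate amplitude over the whole signal is 0.5 while the peak-to-valley difference is 0.75, because the latter also accounts for the negative swing of the echo.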

A real-time object detection method for crowded and occluded scenes

SHENG Wei, LIU Mingjian, LIU Dianchen

2026, 48(5): 876-887. doi: 10.3969/j.issn.1007-130X.2026.05.011

Object detection in crowded scenarios is crucial in real-time systems, but it faces challenges such as limited hardware resources and occlusion, which lead to detection delays and reduced accuracy. This paper proposes an occlusion-aware lightweight object detection method (OLODN) comprising three parts: a backbone, feature fusion, and output prediction. The method employs fast network blocks for feature extraction and uses a positional attention mechanism to focus on occlusion boundaries. The spatial pyramid pooling feature concatenation module in the backbone reduces information loss and enhances the ability to recognize individuals at varying scales and degrees of occlusion. The feature fusion section adopts grouped shuffle convolution to optimize feature flow without increasing computational overhead. The output prediction section employs a task-aligned single-stage object detection method to improve recognition accuracy under occlusion. Experimental results show that the method achieves 66.8% recall on the WiderPerson dataset, which is 2.0 percentage points higher than that of YOLOv8-n, with only 1.8×10⁶ model parameters and superior operational efficiency compared to other models. On the Up-Down dataset, the classification error rate and undetected object error rate are 2.6% and 1.3%, respectively, which are 0.4 and 0.7 percentage points lower than those of YOLOv8-n. The experiments validate the method’s efficiency on resource-constrained devices.

A semi-supervised fuzzy clustering algorithm for image segmentation based on multiple clustering and adaptive parameter selection

CHEN Haoran, WANG Xiaopeng, WANG Haizhou

2026, 48(5): 888-897. doi: 10.3969/j.issn.1007-130X.2026.05.012

Existing semi-supervised fuzzy C-means (FCM) clustering algorithms fail to fully utilize semi-supervised information and make parameter selection difficult. To address these issues, this paper proposes a semi-supervised FCM image segmentation algorithm that leverages supervised information to refine clusters and employs adaptive parameter selection. By pre-clustering the supervised information to refine the clusters, the algorithm determines the optimal number of clusters. It also uses the color differences between image pixels and supervised pixels for label propagation, enabling the supervised information to fully guide the clustering process. Finally, the algorithm adaptively selects the supervision term parameters based on the spatial information of labeled pixels and completes image segmentation using the CIE Lab color system. Experimental results on various datasets demonstrate that this algorithm can effectively segment complex color images, outperforming several other FCM algorithms in terms of segmentation accuracy and mean intersection over union (mIoU). On the Berkeley dataset, the average segmentation accuracy and mIoU reach 96.40% and 89.66%, respectively.

A feature fusion semantic segmentation model based on attention mechanism

MA Dongmei, ZHU Qirong, Lv Xuelong

2026, 48(5): 898-905. doi: 10.3969/j.issn.1007-130X.2026.05.013

To address the mis-segmentation, low segmentation accuracy, and severe loss of detailed information commonly encountered in the existing DeepLabV3+ semantic segmentation model, a feature-fusion semantic segmentation model based on an attention mechanism is proposed. First, a switchable atrous convolution is cascaded within the dilated convolution branch of the model, enabling it to adapt more flexibly to features at different scales and thereby reducing mis-segmentation. Second, an RFEM module is introduced to capture multi-scale information from shallow features and dependencies across different ranges, enhancing the model’s performance. Third, intermediate-layer features of the model are extracted and fused with its deep features using the ELAFF module, enabling the model to recover detailed information lost during downsampling. Finally, an efficient local attention mechanism is added to make the model focus more on image information and reduce background interference. Experimental results on the PASCAL VOC 2012 dataset demonstrate that, compared to the original model, the proposed model achieves a 2.36 percentage point increase in mean intersection-over-union (mIoU) and a 1.60 percentage point improvement in mean pixel accuracy (MPA), effectively enhancing segmentation performance.
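
The two metrics reported here are conventionally derived from a per-class confusion matrix. The Python sketch below shows the usual definitions: per-class IoU is TP / (TP + FP + FN), per-class pixel accuracy is TP / (TP + FN), and each is averaged over classes. The function name is illustrative, and the choice to skip classes that never occur is an assumption, since evaluation toolkits differ on that detail.

```python
def miou_and_mpa(confusion: list[list[int]]) -> tuple[float, float]:
    """Mean IoU and mean pixel accuracy from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    Classes with no pixels in the relevant denominator are skipped (assumption).
    """
    num_classes = len(confusion)
    ious, accuracies = [], []
    for c in range(num_classes):
        true_positive = confusion[c][c]
        false_negative = sum(confusion[c]) - true_positive
        false_positive = sum(confusion[r][c] for r in range(num_classes)) - true_positive
        union = true_positive + false_positive + false_negative
        if union > 0:
            ious.append(true_positive / union)
        if true_positive + false_negative > 0:
            accuracies.append(true_positive / (true_positive + false_negative))
    if not ious or not accuracies:
        raise ValueError("confusion matrix contains no labeled pixels")
    return sum(ious) / len(ious), sum(accuracies) / len(accuracies)
```

For the two-class matrix `[[3, 1], [1, 3]]`, each class has IoU 3/5 and pixel accuracy 3/4, which shows why mIoU is typically the stricter of the two numbers.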

An unbiased offensive text detection method based on BERT and sentiment analysis

YUAN Liang, GUO Weibin

2026, 48(5): 906-913. doi: 10.3969/j.issn.1007-130X.2026.05.014

Offensive information on the internet poses severe harm to individuals and society. Existing offensive text detection methods tend to misjudge non-offensive texts that contain profanity and to be biased against special groups. To address the former issue, this paper proposes a sentiment analysis-based offensive text detection (SAOD) model, which uses sentiment features to help predict whether a text is offensive. To tackle the latter issue, a debiasing data augmentation method called special groups mask (SGM) is proposed. This method masks special groups during training so that they are not directly involved in model training, thereby reducing the model’s bias towards these groups. Using BERT+LSTM as the base model, experiments were conducted on the publicly available datasets ToxiCN and COLD. The results show that the former method improves the base model’s F1-score from 80.18% to 82.67%. On top of this, the latter method reduces the false positive rate (FPR) from 18.27% to 12.77%.
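
The masking step behind SGM can be pictured with a few lines of Python. The sketch below replaces any token found in a lexicon of group terms with a mask token before the text reaches the model. It is an illustration under stated assumptions rather than the authors' implementation: the function name, the token-level matching, the case-insensitive comparison, and the placeholder lexicon are all hypothetical, and `[MASK]` is used because it is BERT's mask token.

```python
MASK_TOKEN = "[MASK]"  # BERT's mask token


def mask_special_groups(tokens, group_terms, mask_token=MASK_TOKEN):
    """Replace every token that names a special group with the mask token.

    tokens:      list of already tokenized words
    group_terms: iterable of group names to hide (hypothetical lexicon)
    """
    lowered_terms = {term.lower() for term in group_terms}
    return [mask_token if token.lower() in lowered_terms else token
            for token in tokens]
```

With a placeholder lexicon `{"groupa"}`, the tokens `["people", "from", "GroupA", "are", "here"]` become `["people", "from", "[MASK]", "are", "here"]`, so the classifier cannot associate the group name itself with the offensive label.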

Time series anomaly detection based on variational Transformer

XUE Anrong, CHEN Jie

2026, 48(5): 914-924. doi: 10.3969/j.issn.1007-130X.2026.05.015

Time series anomaly detection can identify anomalies in monitoring systems, allowing for timely measures to reduce failures and ensure system security. However, existing time series anomaly detection models struggle to handle the nonlinear associations between time series data. To address this issue, a dual-branch learning model based on a variational Transformer and a Gaussian kernel is proposed, which constructs sequence associations and local associations separately. The difference metric between reconstruction errors and the two associations is used as the anomaly score, and the k-means algorithm is employed to automatically determine the anomaly threshold. Additionally, the position encoding in the Transformer is calibrated to reduce reconstruction errors. Comparative experiments against nine baseline models on five public datasets indicate that the proposed model outperforms the baseline models in most cases and is the only model that achieves an F1-score exceeding 90% across all five datasets, with an average F1-score 2.27 percentage points higher than that of the best baseline model. This indicates that the proposed model has significant advantages in correctness and can effectively improve the reliability and precision of time series anomaly detection.
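
One plausible way to let k-means pick an anomaly threshold is to cluster the one-dimensional anomaly scores into two groups and place the threshold midway between the two centroids. The Python sketch below does exactly that. The abstract does not say how the threshold is derived from the clusters, so the two-cluster setup, the midpoint rule, the min/max initialization, and the function name are all assumptions.

```python
def kmeans_threshold(scores, max_iterations=100):
    """Two-cluster 1-D k-means over anomaly scores; returns the centroid midpoint.

    Centroids start at the smallest and largest score (assumed initialization).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    low_center, high_center = min(scores), max(scores)
    for _ in range(max_iterations):
        # Assign each score to its nearest centroid (ties go to the low cluster).
        low = [s for s in scores if abs(s - low_center) <= abs(s - high_center)]
        high = [s for s in scores if abs(s - low_center) > abs(s - high_center)]
        new_low = sum(low) / len(low)  # the minimum score is always in `low`
        new_high = sum(high) / len(high) if high else high_center
        if new_low == low_center and new_high == high_center:
            break  # converged
        low_center, high_center = new_low, new_high
    return (low_center + high_center) / 2
```

Scores above the returned value would then be flagged as anomalous. For scores `[0.1, 0.2, 0.15, 0.9, 1.0]` the centroids settle near 0.15 and 0.95, giving a threshold of about 0.55.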

Multi-scale information fusion and layered attention aggregation for subgraph federated learning algorithm

WANG Ruoyu, DING Shifei, GUO Lili

2026, 48(5): 925-935. doi: 10.3969/j.issn.1007-130X.2026.05.016

Subgraph federated learning trains graph convolutional networks on subgraphs of a global graph held by local clients and updates the model parameters on the server, thereby safeguarding user privacy. Existing methods pay insufficient attention to nodes that are important for specific tasks or graph structures, which may reduce the efficiency of node embedding. This paper proposes a novel graph federated learning algorithm, called FedMFG, which introduces multi-scale information fusion convolution to integrate node features with neighbor information, thereby enhancing node feature representation capabilities. The algorithm passes parameters during the pre-training phase to reduce communication costs and applies an attention mechanism at the server to dynamically adjust weights for better aggregation of global parameters. Experimental results on standard benchmark datasets demonstrate that FedMFG achieves higher accuracy, higher stability, and lower communication costs than previous algorithms.

An improved aquila optimization algorithm integrating stochastic opposition-based learning and a mutated teaching-learning-based optimization strategy

SONG Yijia, ZHANG Xiaoqing, SUN Minmin, ZHANG Li, LI Na, ZENG Junzhe

2026, 48(5): 936-950. doi: 10.3969/j.issn.1007-130X.2026.05.017

To address the slow convergence and susceptibility to local optima of the standard aquila optimization (AO) algorithm, an improved aquila optimization algorithm called TAO is proposed, which integrates stochastic opposition-based learning and a mutated teaching-learning-based optimization (TLBO) strategy. First, an expanded search strategy is introduced in the initial phase to enhance the diversity of initial space exploration, and differential mutation is employed to improve the quality of optimization. Second, a stochastic opposition-based learning strategy is adopted to increase the number of elite individuals, thereby enhancing the algorithm’s search quality. Third, individual positions are updated through t-distribution mutation perturbations to boost the diversity of the search space. Meanwhile, the TLBO strategy is integrated, leveraging the synergy between teaching and learning to accelerate convergence. Simulation experiments were carried out on 23 functions with diverse characteristics (unimodal, multimodal, and fixed-dimension multimodal) selected from the CEC 2005 benchmark test suite. The results demonstrate that, compared to the AO algorithm and several other heuristic intelligent optimization algorithms, TAO exhibits superior optimization accuracy, convergence, and stability. The Wilcoxon rank-sum test further verifies that the search performance of TAO differs significantly from that of the comparison algorithms and that TAO outperforms them. Finally, three engineering design optimization cases are introduced to further validate the feasibility of TAO in solving practical problems.
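
Opposition-based learning has a compact standard form: for a position x inside the box [lower, upper], the opposite point is lower + upper - x, and the better of the two is kept. The Python sketch below shows that rule alongside one common randomized variant that scales x by a uniform random number. Whether TAO uses this exact randomized form is an assumption, as are the function names and the clamping of candidates back into the bounds.

```python
import random


def opposite(position, lower, upper):
    """Classic opposition-based learning: mirror each coordinate in its range."""
    return [lo + hi - x for x, lo, hi in zip(position, lower, upper)]


def stochastic_opposite(position, lower, upper, rng=random):
    """Randomized variant (assumed form): lo + hi - r * x with r drawn from U[0, 1).

    Candidates are clamped to [lo, hi] so they stay inside the search space.
    """
    candidate = []
    for x, lo, hi in zip(position, lower, upper):
        value = lo + hi - rng.random() * x
        candidate.append(min(max(value, lo), hi))
    return candidate
```

For example, `opposite([1.0, -2.0], [0.0, -5.0], [10.0, 5.0])` gives `[9.0, 2.0]`. In an optimizer, the opposite candidate would be evaluated with the objective function and would replace the original individual only if it scores better.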
