  • Official journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Current Issue

    • High Performance Computing
      A low-power keyword spotting system with SRAM buffer and computing-in-memory
      HUANG Zhi-rui, JIA Xin-ru, ZHU Hao-zhe, CHEN Chi-xiao
      2024, 46(08): 1331-1339. doi:
      This paper proposes a low-power keyword spotting (KWS) system to overcome the high power consumption caused by deploying KWS algorithms on edge computing hardware, which can significantly shorten the battery life of mobile devices. The proposed KWS system is based on computing-in-memory (CIM) technology and software-hardware co-design. At the algorithm level, a ternary quantized MFCC-CNN joint algorithm based on the standard MFCC algorithm topology is proposed, and all general matrix multiplications (GEMMs) in the MFCC are mapped onto the neural network accelerator. At the circuit level, the system uses an SRAM-based CIM core to overcome the power and memory walls of traditional von Neumann architecture accelerators. Additionally, an SRAM buffer circuit based on a look-up table is proposed to replace the register delay chain, multiplexing the memory array in the CIM core. Both the SRAM-based CIM core and the buffer are implemented with custom circuit units. At the system level, a low-power KWS system is built from the two customized circuits above. The system is implemented using ASIC and custom circuit design methods and synthesized with a 28 nm process library. It achieves a processing delay of 64 ms on a 10-class task, with a total power consumption of 645.28 μW. The dynamic power consumption of the MFCC pipeline accounts for 5.9% of the total dynamic power consumption, and its overall power consumption accounts for only 1.3% of the system total.
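
      As a rough illustration of the ternary quantization step in the MFCC-CNN joint algorithm, the sketch below quantizes a weight tensor to {-1, 0, +1} with a per-tensor scale. The threshold heuristic (0.7 of the mean absolute weight, as in ternary weight networks) and the scale rule are assumptions for illustration; the abstract does not give the paper's exact quantizer.

      import numpy as np

      def ternarize(weights: np.ndarray):
          """Threshold-based ternary quantization to {-1, 0, +1} with a per-tensor scale."""
          delta = 0.7 * np.mean(np.abs(weights))            # sparsity threshold (assumed heuristic)
          ternary = np.where(weights > delta, 1.0,
                     np.where(weights < -delta, -1.0, 0.0))
          mask = ternary != 0
          # scale chosen as the mean magnitude of the retained weights
          alpha = np.abs(weights[mask]).mean() if mask.any() else 0.0
          return ternary, alpha

      if __name__ == "__main__":
          w = np.random.randn(4, 4)
          t, a = ternarize(w)
          print(t)
          print("scale:", a)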

      Optimization of sparse matrix-vector multiplication based on FPGA and row folding
      ZHOU Zhi, GAO Jian-hua, JI Wei-xing
      2024, 46(08): 1340-1348. doi:
      Sparse matrix-vector multiplication (SpMV) is a key kernel in scientific and engineering computing. Due to the irregular data distribution in sparse matrices and the irregular memory accesses in SpMV, the performance of SpMV on multicore CPUs and GPUs still lags far behind the theoretical peak performance of these devices. Existing CPUs and GPUs are constrained by their architectures and cannot effectively exploit the special structure of sparse matrices to accelerate SpMV. Field-programmable gate arrays (FPGAs), however, can implement efficient parallel computing through customized circuits and are better suited to the computation and storage patterns of sparse matrices. An FPGA-based SpMV optimization method is proposed, which builds a streaming processing engine with high-level synthesis and employs an adaptive multi-row folding strategy. Row folding reduces the ineffective storage and computation of zero elements in the processing engine, thereby improving the performance of FPGA-based SpMV. Experimental results show that, compared with existing FPGA implementations, the proposed row-folding dataflow engine achieves a maximum speedup of 1.78 times and an average speedup of 1.15 times.
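
      A minimal software illustration of the row-folding idea follows: short CSR rows are packed into one lane-sized work item so that a hypothetical fixed-width processing engine spends fewer cycles on zero padding. The lane width and packing policy are assumptions, and the result is numerically identical to ordinary SpMV.

      import numpy as np
      from scipy.sparse import random as sprand

      def spmv_folded(A, x, lane_width=8):
          """CSR SpMV with a simple row-folding schedule (software illustration only)."""
          y = np.zeros(A.shape[0])
          indptr, indices, data = A.indptr, A.indices, A.data

          def flush(rows):
              for r in rows:
                  s, e = indptr[r], indptr[r + 1]
                  y[r] = data[s:e] @ x[indices[s:e]]

          lane_rows, lane_fill = [], 0
          for r in range(A.shape[0]):
              nnz = indptr[r + 1] - indptr[r]
              if lane_fill + nnz > lane_width and lane_rows:
                  flush(lane_rows)                  # emit one folded lane
                  lane_rows, lane_fill = [], 0
              lane_rows.append(r)
              lane_fill += nnz
          flush(lane_rows)                          # tail lane
          return y

      if __name__ == "__main__":
          A = sprand(64, 64, density=0.05, format="csr", random_state=0)
          x = np.random.rand(64)
          assert np.allclose(spmv_folded(A, x), A @ x)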


      A parallel fast neighbor searching algorithm for particle-based methods on CPU and GPU architectures in multi-scale simulation
      DAI Chang-wei, KONG Rui-lin, JI Zhe
      2024, 46(08): 1349-1360. doi:
      Particle-based methods are widely applied to resolve complex multi-scale physical phenomena in various science and engineering areas. To handle the increasing computational complexity and declining concurrency of the pair-wise particle searching procedure in massive multi-scale particle-based simulations, a new parallel fast neighbor searching algorithm featuring high concurrency and a low memory footprint is developed and demonstrated on both many-core CPU and GPU architectures. An inter-level interaction strategy based on a hierarchical nested data structure is proposed to resolve race conditions in cross-level particle search. An asymmetric mapping method is developed to eliminate the full mapping of particles on each level, which reduces memory consumption. A set of numerical experiments shows that the proposed algorithm can handle multi-scale problems with particle volume ratios up to 10^8. Compared with the traditional algorithm, the proposed algorithm achieves 2x~8x speedups and lower memory consumption. The GPU-based implementation of the algorithm achieves state-of-the-art computational efficiency.
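
      For reference, the sketch below shows the basic single-level cell-list search that such neighbor-search algorithms build on: particles are binned into a uniform grid with cell size equal to the cutoff, and only the 27 surrounding cells are scanned per particle. The paper's hierarchically nested multi-resolution structure and its asymmetric cross-level mapping are not reproduced here.

      import numpy as np
      from collections import defaultdict

      def neighbor_pairs(points, cutoff):
          """Single-level cell-list neighbor search on a uniform grid."""
          cells = defaultdict(list)
          keys = np.floor(points / cutoff).astype(int)
          for i, k in enumerate(map(tuple, keys)):
              cells[k].append(i)
          offsets = [(dx, dy, dz) for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)]
          pairs = []
          for i, k in enumerate(map(tuple, keys)):
              for off in offsets:                      # scan the 27 surrounding cells
                  for j in cells.get(tuple(np.add(k, off)), []):
                      if j > i and np.linalg.norm(points[i] - points[j]) <= cutoff:
                          pairs.append((i, j))
          return pairs

      if __name__ == "__main__":
          pts = np.random.rand(500, 3)
          print(len(neighbor_pairs(pts, 0.1)), "neighbor pairs")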

      Block-grained domain adaptation for neural networks at edge
      XIN Gao-feng, LIU Yu-xiao, ZHANG Qing-long, HAN Rui, LIU Chi
      2024, 46(08): 1361-1371. doi:
      Running deep neural networks on edge devices faces two challenges: model scaling and domain adaptation. Existing model scaling techniques and unsupervised online domain adaptation techniques suffer from coarse scaling granularity, limited scaling space, and long online adaptation time. To address these two challenges, this paper proposes a block-grained model scaling and domain adaptation training method called EdgeScaler, which consists of offline and online phases. For the model scaling challenge, blocks are detected and extracted from various DNNs in the offline phase and converted into multiple derived blocks; in the online phase, combinations of blocks and the connections between them provide a large-scale scaling space that solves the model scaling problem. For the domain adaptation challenge, a block-specific residual adapter is designed and inserted into the blocks in the offline phase; in the online phase, when a new target domain arrives, all adapters are trained, solving the domain adaptation problem for every option in the block-grained scaling space. Test results on a real edge device, Jetson TX2, show that EdgeScaler reduces domain adaptation training time by an average of 85.14% and training energy consumption by an average of 84.1%, while providing a large space of scaling options.
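
      A minimal numpy sketch of a bottleneck residual adapter of the kind described above: y = x + W_up · relu(W_down · x), zero-initialized so it starts as an identity mapping, and only its two small matrices would be trained when a new target domain arrives. The dimensions and the placement inside a block are assumptions; the abstract does not give the exact adapter structure.

      import numpy as np

      def relu(x):
          return np.maximum(x, 0.0)

      class ResidualAdapter:
          """Bottleneck residual adapter: y = x + W_up @ relu(W_down @ x)."""
          def __init__(self, dim, bottleneck, rng=np.random.default_rng(0)):
              self.w_down = rng.standard_normal((bottleneck, dim)) * 0.01
              self.w_up = np.zeros((dim, bottleneck))   # zero-init => identity at start

          def __call__(self, x):
              return x + self.w_up @ relu(self.w_down @ x)

      if __name__ == "__main__":
          adapter = ResidualAdapter(dim=64, bottleneck=8)
          feat = np.random.rand(64)
          print(np.allclose(adapter(feat), feat))       # True before any adaptation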

      Sequence alignment software migration and performance evaluation based on DPCT
      LI Pei-zhen, ZHANG Yang, CHEN Wen-bo
      2024, 46(08): 1372-1380. doi:
      This paper explores the process of migrating CUDA programs to DPC++ using the GASAL2 sequence alignment software. The DPCT tool is used during migration to automatically convert CUDA APIs to DPC++ APIs; however, the migrated code still requires adaptation and modification before it compiles and runs correctly. This paper evaluates the effectiveness of the DPCT tool in migrating CUDA programs to DPC++ and demonstrates the efficiency of DPC++ across different architectures. Experiments show that the migrated program preserves the accuracy of the original and can run on heterogeneous devices with the Intel GPU architecture without code modification. Meanwhile, the migrated DPC++-based GASAL2 reaches approximately 90%~95% of the computing performance of the original CUDA-based GASAL2, fully demonstrating the feasibility of DPC++ heterogeneous programming. The results provide a promising solution for cross-platform heterogeneous programming that fully utilizes a wider range of hardware.

      A distributed metadata load balancing algorithm based on dynamic space partitioning and compressed Bloom filter
      XUE Mei-ting, YU Wan-gang, ZHANG Ji-lin, ZENG Yan, YUAN Jun-feng, ZHOU Li
      2024, 46(08): 1381-1389. doi:
      A distributed metadata management system uses multiple metadata servers (MDSs) to store and manage massive amounts of metadata. The system reduces the load on each individual MDS by employing different mapping strategies to distribute metadata across multiple MDSs, thereby minimizing disk access frequency and improving the overall performance of the metadata management system. Typically, a hash function is used to map metadata keys to different MDSs. However, when the feature values of the data are similar, the one-way nature of the hash function can result in an imbalanced data distribution, degrading MDS performance. To address the performance degradation caused by uneven data distribution, this paper proposes a metadata load balancing algorithm based on dynamic space partitioning and a compressed Bloom filter. The algorithm first constructs hash buckets to organize metadata keys, mapping keys to different buckets with a hash algorithm. During mapping, the target hash bucket is dynamically adjusted according to the load condition of the MDSs, and the mapping information of the metadata keys is stored in order within the corresponding bucket. When accessing metadata, the algorithm pre-screens the metadata keys with a compressed Bloom filter and then performs a binary search within the specified hash bucket to retrieve the mapping information. Compared with recent metadata management algorithms, the proposed algorithm maintains MDS load balance even when key skew occurs. Experimental results show that the algorithm achieves a 20% improvement in search performance over the best of the compared metadata management algorithms, with only a 2% increase in memory consumption.
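
      The lookup path described above (Bloom-filter pre-screen, then a binary search inside the target hash bucket) can be sketched as follows. The bucket count, hash choices, and the plain, uncompressed Bloom filter are simplifications, and the load-aware dynamic choice of the target bucket is omitted.

      import bisect
      import hashlib

      class BloomFilter:
          """Small Bloom filter used to pre-screen metadata keys before bucket lookup."""
          def __init__(self, size=1 << 16, hashes=3):
              self.size, self.hashes = size, hashes
              self.bits = bytearray(size // 8 + 1)

          def _positions(self, key):
              for i in range(self.hashes):
                  yield int(hashlib.md5(f"{i}:{key}".encode()).hexdigest(), 16) % self.size

          def add(self, key):
              for p in self._positions(key):
                  self.bits[p // 8] |= 1 << (p % 8)

          def might_contain(self, key):
              return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

      class MetadataIndex:
          """Hash buckets of sorted (key, mds_id) entries, screened by a Bloom filter."""
          def __init__(self, n_buckets=16):
              self.buckets = [[] for _ in range(n_buckets)]
              self.bloom = BloomFilter()

          def insert(self, key, mds_id):
              b = self.buckets[hash(key) % len(self.buckets)]
              bisect.insort(b, (key, mds_id))           # keep the bucket ordered
              self.bloom.add(key)

          def lookup(self, key):
              if not self.bloom.might_contain(key):     # cheap negative answer
                  return None
              b = self.buckets[hash(key) % len(self.buckets)]
              i = bisect.bisect_left(b, (key,))
              return b[i][1] if i < len(b) and b[i][0] == key else None

      if __name__ == "__main__":
          idx = MetadataIndex()
          idx.insert("/home/user/a.txt", 2)
          print(idx.lookup("/home/user/a.txt"), idx.lookup("/home/user/missing"))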

      Optimization of memory access logic in BOOM processor
      ZHOU Lin-ning, LIU Jie, LI Hong-kui, FU Hao-dong, LIU Hong-hai, XIAO Hao
      2024, 46(08): 1390-1394. doi:
      Although the Store-instruction backtracking strategy adopted by BOOM processors solves the data conflicts caused by out-of-order execution of memory access instructions, it can also lead to a large amount of pipeline flushing and reduce processor performance. To address this, a dependence prediction method for memory access instructions is proposed. The method removes the query operation before a Load instruction accesses memory and adds a Load-instruction dependence prediction table; only Load instructions predicted to be independent may be executed out of order. This avoids a large amount of pipeline flushing while preserving the correctness of program logic. The evaluation uses seven benchmarks from SPEC CPU 2006, and the experimental results show that the improved processor achieves an average performance gain of 3.5%.
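
      A toy software model of such a prediction table is sketched below: a Load indexed by its PC is allowed to issue out of order only while that PC has never triggered a memory-ordering flush; after one violation the entry turns conservative. This is only an illustration of the idea, not the BOOM implementation.

      class LoadDependencePredictor:
          """Per-PC table predicting whether a Load depends on an older Store."""
          def __init__(self):
              self.table = {}                     # pc -> True if predicted dependent

          def may_issue_out_of_order(self, load_pc):
              return not self.table.get(load_pc, False)

          def record_violation(self, load_pc):
              self.table[load_pc] = True          # be conservative from now on

      if __name__ == "__main__":
          pred = LoadDependencePredictor()
          print(pred.may_issue_out_of_order(0x400A10))   # True: speculate
          pred.record_violation(0x400A10)                # an ordering flush occurred
          print(pred.may_issue_out_of_order(0x400A10))   # False: execute in order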

      Computer Network and Information Security
      S-JSMA: A fast JSMA adversarial example generation method with low disturbance redundancy
      LIU Qiang, LI Mu-chun, WU Xiao-jie, WANG Yu-heng
      2024, 46(08): 1395-1402. doi:
      Techniques based on deep neural network models are widely used in computer vision, natural language processing, and other fields. However, researchers have found that neural network models carry significant security risks, such as vulnerability to adversarial example attacks. Studying adversarial example techniques for image classification helps people recognize the vulnerability of neural network models, which in turn promotes research on security hardening mechanisms for these models. To overcome the high time overhead and perturbation redundancy of the JSMA method, a fast JSMA adversarial example generation method with low disturbance redundancy, called S-JSMA, is proposed. S-JSMA replaces the iterative operation with a single-step one to simplify the workflow of the JSMA algorithm, and adopts a simple perturbation rule rather than the saliency-map-based perturbation used in JSMA. Consequently, S-JSMA significantly reduces both the time overhead and the disturbance redundancy of generating adversarial examples. Experimental results on the MNIST dataset demonstrate that, compared with the JSMA and FGSM methods, S-JSMA achieves a considerable attack effect in significantly less time.
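
      A hedged sketch of what a single-step, rule-based perturbation can look like: the k pixels with the largest target-class gradient are pushed toward a bound in one shot, instead of rebuilding a saliency map at every iteration. The selection rule, k, and theta are illustrative assumptions rather than the paper's exact rule, and the random gradient below is a stand-in for the victim model's gradient.

      import numpy as np

      def single_step_perturb(x, grad_target, k=20, theta=1.0):
          """One-shot pixel perturbation in the spirit of a single-step JSMA variant."""
          flat = grad_target.ravel()
          top = np.argsort(-flat)[:k]                  # most influential pixels
          adv = x.copy().ravel()
          adv[top] = np.clip(adv[top] + theta * np.sign(flat[top]), 0.0, 1.0)
          return adv.reshape(x.shape)

      if __name__ == "__main__":
          rng = np.random.default_rng(0)
          image = rng.random((28, 28))
          grad = rng.standard_normal((28, 28))         # stand-in for d(target score)/d(pixel)
          adv = single_step_perturb(image, grad)
          print("changed pixels:", int((adv != image).sum()))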

      A multi-dimensional ciphertext cross-domain aggregation scheme supporting fault tolerance in intelligent medical systems
      ZHANG Xiao-jun, LI Xing-peng, ZHANG Jing-wei, TANG Wei
      2024, 46(08): 1403-1413. doi:
      To address the problem of data islands in intelligent medical systems, achieve secure convergence of medical data, and simultaneously ensure the confidentiality, integrity, and availability of medical data during transmission and storage, this paper proposes a verifiable multi-dimensional medical ciphertext cross-domain aggregation scheme supporting transmission fault tolerance. The scheme integrates edge computing servers into the traditional cloud computing framework. By designing a homomorphic encryption algorithm and combining it with Shamir secret sharing, the scheme realizes two-layer aggregation of multi-dimensional encrypted data with transmission fault tolerance. A digital signature algorithm based on elliptic curves is designed to ensure the integrity of medical ciphertexts during transmission and storage. In particular, the medical data analysis center can flexibly select target areas from the cloud server for cross-domain aggregation and exploit a cloud storage audit mechanism to perform lightweight integrity verification of the aggregation results. Using Horner's rule and its private key, the medical data analysis center can obtain the aggregation results of each dimension of the medical data from end users in the corresponding region. Security analysis and performance comparison demonstrate that the scheme can be securely and efficiently deployed in intelligent medical systems.
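
      The role Horner's rule plays in recovering per-dimension aggregates can be illustrated without any cryptography: each user's multi-dimensional reading is packed into one number in a large base, the packed values are summed (in the real scheme this summation happens under encryption), and repeated divmod recovers the per-dimension totals. The base and dimension count below are arbitrary examples.

      def pack(values, base):
          """Pack a multi-dimensional reading into one integer via Horner's rule."""
          acc = 0
          for v in values:
              acc = acc * base + v
          return acc

      def unpack(acc, dims, base):
          """Recover per-dimension sums from the aggregated packed value."""
          out = []
          for _ in range(dims):
              acc, r = divmod(acc, base)
              out.append(r)
          return out[::-1]

      if __name__ == "__main__":
          users = [[12, 5, 30], [7, 9, 21], [3, 14, 8]]   # three 3-dimensional readings
          base = 10 ** 6          # must exceed any possible per-dimension aggregate
          aggregated = sum(pack(u, base) for u in users)
          print(unpack(aggregated, dims=3, base=base))    # [22, 28, 59]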

      A privacy-preserving region-sensitive crowdsensing task allocation mechanism
      WANG Yong-jun, LIU Han-yang, WANG Hui, SHEN Zi-hao, LIU Kun, LIU Pei-qian
      2024, 46(08): 1414-1424. doi:
      To address the efficiency and privacy issues caused by the geographical insensitivity of existing mobile crowdsensing task allocation mechanisms, a task allocation mechanism based on regional heat (HTPM) is designed. The mechanism realizes personalized task publishing through analysis of historical data, improving the success rate of worker applications and reducing the number of location privacy exposures. Firstly, an adaptive grid partitioning algorithm based on the Geohash algorithm (GAGM) divides the task area according to historical data. Then, HTPM assigns task-matching prefixes corresponding to the task locations based on the partitioning results and dynamically updates these prefixes according to the recruitment end time to complete task publishing. Finally, the least probable cost winner selection mechanism (LPC-WSM) is adopted to select winners. Simulation experiments based on the Kaggle taxi route dataset show that the average number of applications per worker under HTPM is reduced by 30.3%, achieving the goal of preserving location privacy while improving task allocation efficiency.

      Graphics and Images
      An abnormal sound detection method based on weighted non-negative matrix factorization
      PAN Yu-qing, YU Hao, LI Feng
      2024, 46(08): 1425-1432. doi:
      Existing abnormal sound detection methods often rely on strongly labeled data for training, but high-quality strongly labeled audio data is difficult to annotate and costly to collect. To address the poor training results and low accuracy caused by non-stationary, time-varying noise when current abnormal sound detection methods use weakly labeled data, a weighted non-negative matrix factorization (WNMF) method based on the audio spectrum is proposed. The method uses WNMF to label weakly labeled and unlabeled data and to separate target sound events from background noise. With appropriate weights, WNMF adjusts the importance of audio information in different frequency bands during labeling, suppressing noise and improving separation quality, approaching the effect of fully supervised training. A convolutional neural network is then used to generate frame-level predictions and audio tag predictions. Simulation experiments show that this method improves accuracy by 4.8% compared with traditional NMF methods for processing weakly labeled data.
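
      A compact numpy sketch of weighted NMF with multiplicative updates follows: a weight matrix lets some spectrogram bands count more than others during factorization, which is the mechanism used to down-weight noisy bands. The rank, iteration count, weighting, and this particular update rule are illustrative choices, not the paper's exact configuration.

      import numpy as np

      def weighted_nmf(V, M, rank=8, iters=200, eps=1e-9, seed=0):
          """Weighted NMF via multiplicative updates: V ~= W @ H under weight matrix M."""
          rng = np.random.default_rng(seed)
          W = rng.random((V.shape[0], rank))
          H = rng.random((rank, V.shape[1]))
          for _ in range(iters):
              WH = W @ H
              H *= (W.T @ (M * V)) / (W.T @ (M * WH) + eps)
              WH = W @ H
              W *= ((M * V) @ H.T) / ((M * WH) @ H.T + eps)
          return W, H

      if __name__ == "__main__":
          spec = np.abs(np.random.default_rng(1).standard_normal((64, 128)))
          weights = np.ones_like(spec)
          weights[:16, :] = 2.0                 # emphasize low-frequency bands, as an example
          W, H = weighted_nmf(spec, weights)
          print("reconstruction error:", np.linalg.norm(spec - W @ H))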

      A multi-stage feature distillation-weighted lightweight image super-resolution network
      YANG Sheng-rong, CHE Wen-gang, GAO Sheng-xiang, ZHAO Yun-lai
      2024, 46(08): 1433-1443. doi:
      To address the insufficient receptive field when extracting low-level features and the lack of reinforcement of local key features in lightweight networks, this paper proposes a multi-stage feature distillation-weighted lightweight image super-resolution network, LMSWN. Firstly, a pyramid-like module is employed to expand the receptive field during shallow feature extraction, integrate feature information of different scales, and enrich the information flow of the network. Secondly, a multi-stage residual distillation-weighted module is designed to enhance the ability of square convolution to extract local key features, recover more detailed information, and improve reconstruction performance. At the same time, the combination of channel splitting and 1×1 convolution realizes gradual distillation of features, reducing the number of network parameters. Finally, two adaptive parameters are introduced to jointly learn the features of the two branches of the multi-stage residual distillation-weighted module, strengthening attention to different levels of feature information and further enhancing the representation ability of the network. Experimental results on five benchmark datasets (Set5, Set14, BSDS100, Urban100, and Manga109) show that the proposed network outperforms current mainstream lightweight networks.

      Bi-YOLO: An improved lightweight object detection algorithm based on YOLOv8n
      2024, 46(08): 1444-1454. doi:
      The single-stage object detection technology represented by YOLOv8 has significant optimizations in the backbone network but fails to efficiently integrate contextual information in the neck network, leading to missed and false detections of small objects. In addition, its large number of parameters and high computational complexity make it unsuitable for end-to-end industrial deployment. To address these issues, this paper introduces the BiFormer attention mechanism, based on the Transformer structure, to enhance small-object detection and improve the algorithm's accuracy, and introduces the GSConv module to reduce the model size without harming performance, offsetting the computational and parameter overhead brought by BiFormer. The resulting object detection algorithm, Bi-YOLO, strikes a balance between lightweight design and detection performance. Experimental results show that, compared with YOLOv8n, Bi-YOLO improves accuracy by 4.6%, increases small-object detection accuracy on the DOTA dataset by 2.3%, and reduces the number of parameters by 12.5%. Bi-YOLO effectively balances lightweight design and performance, providing a new approach for end-to-end industrial deployment.

      FDW-YOLO: An improved indoor pedestrian fall detection algorithm based on YOLOv8
      CHEN Chen, XU Hui-ying, ZHU Xin-zhong, HUANG Xiao, SONG Jie, CAO Yu-qi, ZHOU Si-yu, SHENG Ke
      2024, 46(08): 1455-1465. doi:
      To address the low fall detection accuracy and poor real-time performance in indoor scenes caused by illumination changes, occlusion of the human body, and posture changes under unusual viewpoints, a lightweight improved fall detection algorithm based on YOLOv8, named FDW-YOLO, is proposed. The C2f module in the backbone network is replaced by the FasterNext module, which reduces computational complexity while retaining strong feature extraction capability. In view of the large posture changes during human falls, three network structures with dynamically deformable convolution modules added at different positions in the neck layer are designed and compared on a self-built fall dataset, and the YOLOv8-C scheme is finally selected on the basis of network performance. The bounding box regression loss function WIoU is introduced into the improved network to replace the original CIoU. Experimental results show that, compared with YOLOv8n, FDW-YOLO increases mAP@0.5 from 96.5% to 97.9% and mAP@0.5:0.95 from 72.5% to 74.3%, while its parameter count and computation are only 4.1×10^6 and 7.3×10^9, respectively, meeting the requirements for deployment in low-compute industrial scenarios.

      Artificial Intelligence and Data Mining
      An epidemic trajectory description model based on health code punch-in data
      WAN Ze-yu, ZHANG Fei-zhou
      2024, 46(08): 1466-1472. doi:
      The epidemic has profoundly changed the world's landscape. Current modeling and analysis of the epidemic's spatiotemporal dynamics lack accurate descriptions of individual and collective trajectories, making it difficult to meet the demands of precision epidemic prevention. To address this issue, building on an analysis of existing spatiotemporal epidemic analysis methods and trajectory description models, and using health code punch-in data, a spatiotemporal three-dimensional coordinate system is established with latitude and longitude as the x and y axes and time as the z axis. Health code punch-in records serve as trajectory nodes to present the spatiotemporal trajectories of carriers and close contacts. Individual, paired, and group trajectories are described accurately in sequence, constructing a "mountain-shaped" trajectory description model that integrates spatiotemporal topological relationships. The model precisely locates the spatiotemporal range that needs to be controlled within the three-dimensional coordinate system, thereby achieving precise epidemic prevention. Simulation experiments on the Foursquare dataset demonstrate that the "mountain" model effectively reduces the scope of investigation and the number of people to be screened, and it has broad application scenarios.
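
      As a small illustration of treating punch-in records as points in a (longitude, latitude, time) space, the sketch below flags punch-ins that fall inside a carrier node's spatiotemporal neighborhood. The fixed distance and time thresholds are placeholder assumptions; the paper's "mountain" model derives the controlled region from trajectory topology rather than fixed radii.

      from dataclasses import dataclass
      from math import radians, sin, cos, asin, sqrt

      @dataclass
      class PunchIn:
          person: str
          lon: float
          lat: float
          t: float            # hours since a chosen reference time

      def haversine_km(a, b):
          dlon, dlat = radians(b.lon - a.lon), radians(b.lat - a.lat)
          h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
          return 2 * 6371 * asin(sqrt(h))

      def close_contacts(carrier_nodes, others, dist_km=0.5, dt_hours=2.0):
          """People whose punch-ins fall near a carrier node in both space and time."""
          hits = set()
          for c in carrier_nodes:
              for o in others:
                  if abs(o.t - c.t) <= dt_hours and haversine_km(c, o) <= dist_km:
                      hits.add(o.person)
          return hits

      if __name__ == "__main__":
          carrier = [PunchIn("carrier", 116.320, 39.990, 10.0)]
          crowd = [PunchIn("alice", 116.321, 39.990, 10.5),
                   PunchIn("bob", 116.500, 39.800, 10.2)]
          print(close_contacts(carrier, crowd))       # {'alice'}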

      A Chinese named entity recognition model based on multi-feature fusion embedding
      LIU Xiao-hua, XU Ru-zhi, YANG Cheng-yue
      2024, 46(08): 1473-1481. doi:
      To address the differences among Chinese glyphs and the blurred boundaries of Chinese words, a Chinese named entity recognition model based on multi-feature fusion embedding is proposed. On the basis of extracting semantic features, glyph features are captured by a convolutional neural network with a multi-head self-attention mechanism, word features are obtained from a word embedding lookup table, and a bidirectional long short-term memory network learns long-range contextual representations. Finally, a conditional random field learns the constraints among sequence labels to realize Chinese named entity recognition. The F1 scores on the Resume, Weibo, and People's Daily datasets reach 96.66%, 70.84%, and 96.15%, respectively, demonstrating that the proposed model effectively improves the performance of Chinese named entity recognition.


      Improved beluga whale optimization algorithms based on Fuch mapping and applications
      CHEN Xin-yi, ZHANG Meng-jian, WANG De-guang
      2024, 46(08): 1482-1492. doi:
      To address the drawbacks of beluga whale optimization (BWO), such as low convergence accuracy, limited adaptive ability, and weak anti-stagnation ability, two improved BWO algorithms based on Fuch mapping and dynamic opposition-based learning, namely CIOEBWO and CPOEBWO, are proposed from the perspectives of chaotic initialization, chaotic parameters, and nonlinear control strategy. Fuch chaotic initialization increases the ergodicity of the initial BWO population, which enhances the optimization accuracy and convergence speed of the algorithm. In the exploitation phase, the Fuch chaotic map is introduced to dynamically adjust the parameter C1 and coordinate global and local search, which effectively improves the adaptive ability of BWO. On top of these two strategies, dynamic opposition-based learning is introduced to increase the number of high-quality individuals and enhance the overall anti-stagnation ability of the algorithm. Experimental results on 8 benchmark functions and the Friedman rank test indicate that the convergence accuracy, adaptive ability, and anti-stagnation ability of the improved algorithms are effectively enhanced, and CPOEBWO outperforms both BWO and CIOEBWO. In addition, comparisons between CPOEBWO and six other algorithms show that CPOEBWO has stronger optimization ability and robustness. Finally, CPOEBWO is applied to engineering optimization problems to demonstrate its applicability and effectiveness.
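
      A small sketch of the dynamic opposition-based learning step mentioned above: each candidate gets an opposite point drawn at random between itself and its reflection lower + upper - x, and the better of the pair is kept. The dynamic coefficient and the greedy selection shown here are assumptions, the paper's exact formulation may differ, and the Fuch-map components are not reproduced.

      import numpy as np

      def dynamic_opposition(population, lower, upper, rng):
          """Opposite candidates drawn between each individual and its reflection."""
          reflected = lower + upper - population
          opposite = population + rng.random(population.shape) * (reflected - population)
          return np.clip(opposite, lower, upper)

      def sphere(x):
          return np.sum(x ** 2, axis=1)          # toy fitness function (minimization)

      if __name__ == "__main__":
          rng = np.random.default_rng(1)
          pop = rng.uniform(-10, 10, size=(20, 5))
          opp = dynamic_opposition(pop, -10.0, 10.0, rng)
          keep = np.where(sphere(opp)[:, None] < sphere(pop)[:, None], opp, pop)
          print("best fitness:", sphere(keep).min())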



      A probabilistic forecasting method with fuzzy time series
      DONG Wen-chao, GUO Qiang, ZHANG Cai-ming
      2024, 46(08): 1493-1502. doi:
      In time series prediction tasks, the uncertainty of historical observations makes forecasting difficult, and forecasting methods based on fuzzy time series have unique advantages in dealing with data uncertainty. Probabilistic forecasting, in turn, can provide the distribution of the predicted target and quantify the uncertainty of the prediction results. Therefore, a fuzzy time series probabilistic forecasting method based on a probability weighting strategy is proposed to reduce the impact of uncertainty on the forecasting task. The proposed method builds a probability-weighted fuzzy time series prediction model from historical observations of the target variable and refines the model's fuzzy rule base by introducing additional observations. Specifically, two operators with low computational cost are used to reconstruct the fuzzy logic relationships: the intersection operator excludes interfering information, while the union operator merges all information, yielding two different sets of fuzzy logic relationship groups. The relationship group corresponding to the current observation in the two sets gives the prediction of the fuzzy set at the next moment, and the probability distribution for the next moment is finally output by defuzzification. Experimental results on publicly available time series datasets verify the accuracy and validity of the method, and its prediction accuracy is notably improved compared with the recently proposed PWFTS method.
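
      A minimal first-order sketch of probability-weighted fuzzy time series forecasting: observations are assigned to fuzzy sets over a partitioned universe of discourse, transition counts between sets become empirical probabilities, and the forecast is a distribution over fuzzy sets plus a probability-weighted point value. The partitioning, order, and defuzzification here are simplifications; the intersection/union refinement of the rule base described above is not reproduced.

      import numpy as np
      from collections import Counter, defaultdict

      def fuzzify(series, n_sets=7):
          """Assign each observation to the fuzzy set whose center is nearest."""
          centers = np.linspace(series.min(), series.max(), n_sets)
          labels = np.array([int(np.argmin(np.abs(centers - v))) for v in series])
          return labels, centers

      def build_weighted_rules(labels):
          """Fuzzy logic relationship groups with empirical transition probabilities."""
          counts = defaultdict(Counter)
          for a, b in zip(labels[:-1], labels[1:]):
              counts[a][b] += 1
          return {a: {b: c / sum(cnt.values()) for b, c in cnt.items()}
                  for a, cnt in counts.items()}

      def forecast(rules, centers, current_label):
          """Distribution over next-step fuzzy sets and its probability-weighted mean."""
          probs = rules.get(current_label, {current_label: 1.0})
          return probs, sum(p * centers[b] for b, p in probs.items())

      if __name__ == "__main__":
          rng = np.random.default_rng(0)
          data = np.sin(np.linspace(0, 12, 200)) * 10 + rng.normal(0, 0.5, 200)
          labels, centers = fuzzify(data)
          rules = build_weighted_rules(labels)
          dist, point = forecast(rules, centers, labels[-1])
          print("next-step distribution:", dist)
          print("point forecast:", point)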

      A lightweight semantic segmentation model based on attention mechanism
      MA Dong-mei, WANG Peng-yu, GUO Zhi-hao
      2024, 46(08): 1503-1512. doi:
      Semantic segmentation is a computer vision technique that extracts the information of interest from large numbers of images and transforms it into a clearer, easier-to-understand representation by means of a mask. Researchers try to keep models as small as possible while preserving accuracy, which is currently a hot topic in designing lightweight network models. Image semantic segmentation still faces many challenges, such as segmentation discontinuity, incorrect segmentation, and high model complexity. To solve these problems, a lightweight semantic segmentation model based on an attention mechanism is proposed. It uses freeze-thaw training, with MobileNetV2 as the feature extraction network. To recover clearer target boundaries, a lightweight convolutional block attention module (CBAM) is introduced at the output of the atrous spatial pyramid pooling (ASPP) module, or efficient channel attention (ECA-Net) is introduced in the decoding part. To address sample imbalance, the focal loss function is introduced. Mixed-precision training is used, and the standard convolution in the output part is replaced with DO-Conv convolution. Experiments on the PASCAL VOC2012 and Cityscapes datasets show that the model size is 23.6 MB, with mean intersection over union (mIoU) of 73.91% and 74.89% and class-wise pixel accuracy of 82.88% and 84.87%, respectively, successfully balancing segmentation accuracy and computational efficiency.
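
      For reference, a binary focal loss sketch of the kind used to counter sample imbalance: easy examples are down-weighted by the (1 - p_t)^gamma factor so that the rare class dominates the gradient. gamma = 2 and alpha = 0.25 are the values common in the literature, not necessarily those used here, and the per-pixel multi-class form is reduced to a binary one.

      import numpy as np

      def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
          """Binary focal loss: mean of -alpha_t * (1 - p_t)**gamma * log(p_t)."""
          probs = np.clip(probs, eps, 1.0 - eps)
          p_t = np.where(targets == 1, probs, 1.0 - probs)
          a_t = np.where(targets == 1, alpha, 1.0 - alpha)
          return float(np.mean(-a_t * (1.0 - p_t) ** gamma * np.log(p_t)))

      if __name__ == "__main__":
          p = np.array([0.95, 0.10, 0.60, 0.30])   # predicted foreground probability
          y = np.array([1, 0, 1, 0])               # ground-truth mask pixels
          print("focal loss:", focal_loss(p, y))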


      Long text semantic similarity calculation combining hybrid feature extraction and deep learning
      XU Jie, SHAO Yu-bin, DU Qing-zhi, LONG Hua, MA Di-nan
      2024, 46(08): 1513-1520. doi:
      Text semantic similarity calculation is a crucial task in natural language processing, but current research on similarity mostly focuses on short texts rather than long texts. Compared with short texts, long texts are semantically rich, but their semantic information tends to be scattered. To address this, a feature extraction method is proposed to extract the main semantic information from long texts. The extracted semantic information is fed into a BERT pre-training model using overlapping sliding windows to obtain text vector representations. A bidirectional long short-term memory network then models the contextual semantic relationships of the long text, mapping it into a semantic space, and a linear layer further enhances the model's representation ability. Finally, fine-tuning is performed by maximizing the inner product of similar semantic vectors and minimizing the cross-entropy loss. Experimental results show that this method achieves F1 scores of 0.84 and 0.91 on the CNSE and CNSS datasets, outperforming the baseline models.
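
      A small sketch of the overlapping sliding-window step: a long token sequence is cut into fixed-size windows that share an overlap, so context is not lost at chunk boundaries and each chunk is short enough for a BERT-style encoder. The window and overlap sizes are illustrative; the exact values used in the paper are not stated.

      def sliding_windows(tokens, window=510, overlap=128):
          """Split a long token sequence into overlapping windows (510 leaves room for [CLS]/[SEP])."""
          if window <= overlap:
              raise ValueError("window must be larger than overlap")
          step = window - overlap
          return [tokens[start:start + window]
                  for start in range(0, max(len(tokens) - overlap, 1), step)]

      if __name__ == "__main__":
          doc = list(range(1500))                  # stand-in for the token ids of a long text
          parts = sliding_windows(doc)
          print(len(parts), [len(p) for p in parts])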