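• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)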


Missing Data Detection Method Based on the GAIN-ARN Generative Adversarial Imputation Network Model

YANG Qinhui (杨钦慧), TONG Yinhua (童英华)

(1. School of Computer, Qinghai Normal University, Xining 810008, Qinghai, China;
 2. The State Key Laboratory of Tibetan Intelligence, Qinghai Normal University, Xining 810008, Qinghai, China)

Abstract: Water quality monitoring, air quality monitoring, and soil monitoring are key components of environmental monitoring. In practice, however, monitoring equipment is subject to disruptions such as equipment failures, sensor malfunctions, and data transmission problems, all of which lead to missing monitoring data. Missing data can severely compromise the assessment and analysis of environmental conditions. To address this problem, a missing data detection method based on the GAIN-ARN (Generative Adversarial Imputation Network with Attention and Residual Network) model is proposed. The method introduces a self-attention mechanism and residual connections to strengthen the model's ability to capture the internal structure of the data, thereby improving the stability and quality of imputation. Experiments on five public datasets show that GAIN-ARN achieves the lowest RMSE in almost all cases, with the largest improvement on the Water Quality Testing dataset. At a 20% missing rate, GAIN-ARN improves on GAIN by approximately 10.61% and on WSGAIN-GP by approximately 33.22%. The results further indicate that introducing the self-attention layer significantly improves the imputation performance of the generator.

Keywords: generative adversarial imputation network, missing data, self-attention mechanism, residual connection
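
The abstract reports RMSE values and relative improvements without stating how either is computed. The following is a minimal sketch of one common convention, not the authors' evaluation code. It assumes that RMSE is measured only over entries that were missing (mask value 0), and that the improvement is the percentage reduction in RMSE relative to a baseline. The function names `masked_rmse` and `relative_improvement` and the toy values are hypothetical.

```python
import math


def masked_rmse(truth, imputed, mask):
    """RMSE over missing entries only.

    Assumption: mask == 1 marks an observed value and mask == 0 a missing
    one, so only positions with mask == 0 contribute to the error.
    """
    squared_error = 0.0
    n_missing = 0
    for t_row, x_row, m_row in zip(truth, imputed, mask):
        for t, x, m in zip(t_row, x_row, m_row):
            if m == 0:
                squared_error += (t - x) ** 2
                n_missing += 1
    if n_missing == 0:
        raise ValueError("mask marks no missing entries")
    return math.sqrt(squared_error / n_missing)


def relative_improvement(baseline_rmse, new_rmse):
    """Percentage reduction in RMSE relative to a baseline (assumed formula)."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# Toy values for illustration only; they are not results from the paper.
truth = [[1.0, 2.0], [3.0, 4.0]]
mask = [[1, 0], [0, 1]]                  # two entries are missing
baseline = [[1.0, 2.5], [3.5, 4.0]]      # hypothetical baseline imputation
candidate = [[1.0, 2.25], [3.25, 4.0]]   # hypothetical improved imputation

rmse_baseline = masked_rmse(truth, baseline, mask)
rmse_candidate = masked_rmse(truth, candidate, mask)
improvement = relative_improvement(rmse_baseline, rmse_candidate)
```

Under this convention, a reported improvement of about 10.61% over GAIN would mean that the RMSE of GAIN-ARN is about 89.39% of the RMSE of GAIN at the same missing rate.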