• A journal of the China Computer Federation (CCF)
  • A China science and technology core journal
  • A Chinese core journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (3): 534-547.

• Artificial Intelligence and Data Mining •

Node classification with graph structure prompt in low-resource scenarios

CHEN Yuling, LI Xiang

  1. (School of Data Science & Engineering, East China Normal University, Shanghai 200062, China)
  • Received: 2024-07-19  Revised: 2024-09-10  Online: 2025-03-25  Published: 2025-04-02

Abstract: Text-attributed graphs have increasingly become a hotspot in graph research. In traditional graph neural network (GNN) research, node features are typically shallow features derived from text or manually designed features, such as those produced by the skip-gram and continuous bag-of-words (CBOW) models. In recent years, the advent of large language models (LLMs) has brought profound changes to natural language processing (NLP). These changes have not only reshaped NLP tasks but have also begun to permeate GNNs. Consequently, recent graph-related work has started to use language representation models and LLMs to generate new node representations, aiming to mine richer semantic information. Most existing models still adopt traditional GNN architectures or contrastive learning approaches. Among contrastive learning methods, because traditional node features and LLM-generated node representations are not produced by a single unified model, the two vectors lie in different vector spaces, which poses a challenge. Based on these challenges and considerations, a model named GRASS is proposed. Specifically, in the pre-training task, the model introduces text expanded by LLMs and contrasts it with textual information processed by graph convolution. In downstream tasks, to reduce the cost of fine-tuning, GRASS aligns the format of downstream tasks with that of the pre-training task. As a result, GRASS performs well on node classification without fine-tuning, especially in low-shot scenarios: in the 1-shot setting, compared with the best baseline, GRASS improves by 6.10%, 6.22%, and 5.21% on the Cora, Pubmed, and ogbn-arxiv datasets, respectively.

Key words: graph neural network, text-attributed graph, large language model, contrastive learning, pre-training, prompt learning
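
The pre-training idea described in the abstract, contrasting LLM-expanded node text representations with graph-convolved text representations, can be illustrated with a minimal sketch. The PyTorch snippet below is an illustrative assumption rather than the paper's actual GRASS implementation: the graph convolution is a single symmetric-normalized aggregation step, the contrastive objective is a standard InfoNCE loss, and all tensor names, shapes, and the temperature value are hypothetical.

```python
import torch
import torch.nn.functional as F

def graph_convolve(x, adj):
    """One GCN-style step: symmetric-normalized neighborhood aggregation."""
    deg = adj.sum(dim=1).clamp(min=1)
    d_inv_sqrt = deg.pow(-0.5)
    norm_adj = d_inv_sqrt.unsqueeze(1) * adj * d_inv_sqrt.unsqueeze(0)
    return norm_adj @ x

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss: node i in view 1 and node i in view 2 form the positive pair."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature        # [N, N] cross-view similarity matrix
    labels = torch.arange(z1.size(0))         # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

# Illustrative pre-training step (shapes and names are assumptions):
#   h_llm  - embeddings of LLM-expanded node texts from some text encoder
#   h_text - embeddings of the original node texts from the same encoder
#   adj    - dense adjacency matrix of the text-attributed graph
N, d = 8, 32
adj = (torch.rand(N, N) > 0.7).float()
adj = ((adj + adj.t()) > 0).float()
h_llm = torch.randn(N, d, requires_grad=True)
h_text = torch.randn(N, d, requires_grad=True)

h_graph = graph_convolve(h_text, adj)   # view 1: graph-convolved text embeddings
loss = info_nce(h_graph, h_llm)         # view 2: LLM-expanded text embeddings
loss.backward()
print(float(loss))
```

In this sketch the downstream prompt-style reuse described in the abstract would amount to formatting node classification as the same matching problem used in pre-training, so no fine-tuning of the encoders is needed; the details of that alignment are specific to the paper and are not reproduced here.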