• A journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


A new proof of the equivalence between random walk
sequences and sentences in representation learning

SUN Yan1,3,4,5, SUN Mao-song1,2, ZHAO Hai-xing1,3,4, YE Zhong-lin1,3,4

  (1. School of Computer, Qinghai Normal University, Xining, Qinghai 810016;
    2. Department of Computer Science and Technology, Tsinghua University, Beijing 100084;
    3. Key Laboratory of Tibetan Information Processing and Machine Translation in Qinghai, Xining 810008;
    4. Key Laboratory of the Education Ministry for Tibetan Information Processing, Xining 810008;
    5. School of Computer, Qinghai Nationalities University, Xining 810007, China)

     
  • Received:2019-07-11 Revised:2019-09-16 Online:2020-02-25 Published:2020-02-25

Abstract:

In machine learning, representation learning maps information with association relationships into a low-dimensional vector space through a shallow neural network. The goal of word representation learning is to map the relationships between words and their context words into low-dimensional vectors, while the goal of network representation learning is to map the relationships between network nodes and their context nodes into low-dimensional vectors. Word vectors are the output of word representation learning, and node representation vectors are the output of network representation learning. DeepWalk obtains walk sequences in a network through a random walk strategy and treats them as the sentences of the word2vec model; node pairs are then trained in the neural network through a sliding window. word2vec and DeepWalk share the same underlying model and optimization method: the Skip-Gram model with negative sampling, referred to as SGNS. Existing research shows that both word representation learning and network representation learning algorithms based on the SGNS model implicitly factorize a target feature matrix. Perozzi et al. observed that word frequency obeys Zipf's law and that node degree in a network obeys a power-law distribution, and on this basis considered the random walk sequences of a network to be equivalent to the sentences of a language model. However, judging the equivalence between sentences and random walk sequences from the power-law distribution alone is not sufficient. Therefore, based on the theory that SGNS implicitly factorizes the target feature matrix, this paper designs two comparative experiments. The experiments use singular value decomposition and a matrix completion method to perform node classification tasks on three public datasets, and confirm the equivalence between sentences and random walk sequences.
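As a minimal sketch of the pipeline the abstract describes: random walk sequences are generated and treated as sentences, the shifted positive PMI (SPPMI) matrix that SGNS implicitly factorizes is built from their co-occurrences, and node embeddings are obtained by truncated SVD. The toy graph, window size, and shift k below are illustrative assumptions, not the paper's actual datasets or settings.

```python
import numpy as np

# Toy undirected graph as an adjacency list (an assumption for
# illustration; the paper's experiments use three public datasets).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def random_walks(graph, walks_per_node=20, walk_len=10, seed=0):
    """DeepWalk-style random walk sequences, the 'sentences' fed to SGNS."""
    rng = np.random.default_rng(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            for _ in range(walk_len - 1):
                walk.append(int(rng.choice(graph[walk[-1]])))
            walks.append(walk)
    return walks

def sppmi_matrix(walks, n_nodes, window=2, k=1):
    """Shifted positive PMI matrix: SGNS with k negative samples
    implicitly factorizes max(PMI - log k, 0)."""
    counts = np.zeros((n_nodes, n_nodes))
    for walk in walks:
        for i, u in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:                      # sliding-window co-occurrence
                    counts[u, walk[j]] += 1.0
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)     # node marginals
    col = counts.sum(axis=0, keepdims=True)     # context marginals
    with np.errstate(divide="ignore"):
        pmi = np.log(counts * total / (row * col))  # -inf where counts == 0
    return np.maximum(pmi - np.log(k), 0.0)         # shift and clip at zero

def svd_embed(m, dim=2):
    """Embed nodes by truncated SVD of the target feature matrix."""
    u, s, _ = np.linalg.svd(m)
    return u[:, :dim] * np.sqrt(s[:dim])

walks = random_walks(graph)
m = sppmi_matrix(walks, n_nodes=4)
emb = svd_embed(m)    # one low-dimensional vector per node
```

In the paper's setting, embeddings recovered this way from sentences and from random walk sequences are then compared on node classification tasks; comparable performance is what supports the claimed equivalence.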


Key words: word vector, shifted positive pointwise mutual information, sentence, random walk sequence