• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science, 2021, Vol. 43, Issue (03): 407-415.


Identity calibration of E-commerce big data based on long short-term memory network

LIU Ya-bo, WU Qiu-xuan

  1. (School of Automation Engineering, Hangzhou Dianzi University, Hangzhou 310018, China)
  • Received: 2020-03-26 Revised: 2020-05-08 Accepted: 2021-03-25 Online: 2021-03-25 Published: 2021-03-26

Abstract: Because products are highly varied and lack a uniform description format, traditional models that mark identical products in the e-commerce big data of government procurement platforms suffer from low accuracy, slow speed, low sample utilization, and insufficient generalization ability. An identity calibration model based on the Long Short-Term Memory (LSTM) network is proposed. It consists of three sub-models in series: word segmentation, importance ranking, and similarity calculation. First, the word segmentation sub-model preprocesses the e-commerce big data to obtain a sequence of distinguishing keywords. Next, the LSTM importance ranking sub-model selects the most important keyword sequences that characterize the product information. Finally, the LSTM similarity calculation sub-model accurately calibrates identical commodities in the given big data. In addition, binary search, GloVe word vectorization, and word sequence semantic verification are introduced to improve calibration speed, training sample utilization, and generalization ability, respectively. Experimental results show that the model achieves high accuracy in calibrating the identity of easily confused samples across different types of government procurement e-commerce big data.


Key words: E-commerce big data, long short-term memory, importance ranking, similarity calculation
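
The three-stage structure described in the abstract (segmentation, importance ranking, similarity calculation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two LSTM sub-models are replaced by simple stand-ins, a lookup table of hypothetical importance scores and a Jaccard overlap. The names `segment`, `rank_keywords`, `similarity`, `same_product`, the `importance` table, `top_k`, and `threshold` are all assumptions made for illustration.

```python
def segment(title):
    """Stand-in for the word segmentation sub-model: lowercase, split on whitespace."""
    return title.lower().split()


def rank_keywords(tokens, importance, top_k=4):
    """Stand-in for the LSTM importance ranking sub-model.

    `importance` maps a token to a score; in the paper's setting such scores
    would come from a trained model. Here they are hypothetical values.
    Unknown tokens score 0.0.
    """
    unique = list(dict.fromkeys(tokens))  # deduplicate, keep first-seen order
    ranked = sorted(unique, key=lambda t: -importance.get(t, 0.0))
    return ranked[:top_k]


def similarity(keywords_a, keywords_b):
    """Stand-in for the LSTM similarity sub-model: Jaccard overlap of keyword sets."""
    set_a, set_b = set(keywords_a), set(keywords_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def same_product(title_a, title_b, importance, threshold=0.5):
    """Run the three stages in series and compare against a hypothetical threshold."""
    keys_a = rank_keywords(segment(title_a), importance)
    keys_b = rank_keywords(segment(title_b), importance)
    return similarity(keys_a, keys_b) >= threshold


# Hypothetical importance scores and product titles, for illustration only.
importance = {
    "lenovo": 0.6, "thinkpad": 0.8, "x1": 0.9, "carbon": 0.9, "t14": 0.9,
    "laptop": 0.3, "black": 0.1, "14": 0.2, "inch": 0.1,
}
a = "Lenovo ThinkPad X1 Carbon laptop black"
b = "laptop Lenovo ThinkPad X1 Carbon 14 inch"
c = "Lenovo ThinkPad T14 laptop black"

print(same_product(a, b, importance))  # same model, different wording
print(same_product(a, c, importance))  # similar wording, different model
```

Ranking before comparison is the point of the middle stage: low-value words such as "laptop" or "black" are dropped, so two differently worded titles for the same item can match while a confusable sibling product does not.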