Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (03): 407-415.
LIU Ya-bo,WU Qiu-xuan
Abstract: Product listings in the e-commerce big data of government procurement platforms span many product types and follow no uniform writing format, so traditional models that mark identical products suffer from low accuracy, slow speed, low sample utilization, and insufficient generalization ability. An identity calibration model based on the long short-term memory (LSTM) network is proposed, consisting of three sub-models in series: word segmentation, importance ranking, and similarity calculation. First, the word segmentation sub-model preprocesses the e-commerce big data to obtain a distinguishing keyword sequence. Next, the LSTM importance ranking sub-model selects the keyword sequences that best characterize the product information. Finally, the LSTM similarity calculation sub-model identifies identical products in the given data. In addition, binary search, GloVe word vectorization, and word-sequence semantic verification are introduced to improve calibration speed, training sample utilization, and generalization ability, respectively. The experimental results show that the model calibrates the identity of easily confused samples with high accuracy across different types of government procurement e-commerce big data.
Key words: E-commerce big data, long short-term memory, importance ranking, similarity calculation
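The three-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the LSTM importance ranking is replaced here by a simple inverse-document-frequency score, and the LSTM similarity sub-model with GloVe vectors is replaced by cosine similarity over keyword counts. All function names, the `top_k` and `threshold` parameters, and the sample titles are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def segment(title):
    """Stage 1 stand-in: lowercase and split on non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^0-9a-z]+", title.lower()) if tok]


def rank_keywords(tokens, doc_freq, n_docs, top_k=4):
    """Stage 2 stand-in: keep the top_k rarest tokens (smoothed IDF, not an LSTM)."""
    def idf(tok):
        return math.log((1 + n_docs) / (1 + doc_freq[tok]))

    # Sort alphabetically first so ties are broken deterministically.
    return sorted(sorted(set(tokens)), key=idf, reverse=True)[:top_k]


def similarity(keywords_a, keywords_b):
    """Stage 3 stand-in: cosine similarity of keyword count vectors."""
    ca, cb = Counter(keywords_a), Counter(keywords_b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def same_product(title_a, title_b, corpus, threshold=0.5):
    """Run the three stages in series and compare the score with a threshold."""
    docs = [segment(t) for t in corpus]
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    ka = rank_keywords(segment(title_a), doc_freq, len(docs))
    kb = rank_keywords(segment(title_b), doc_freq, len(docs))
    return similarity(ka, kb) >= threshold


# Hypothetical listings, used only to exercise the sketch.
corpus = [
    "Lenovo ThinkPad X1 Carbon laptop 14 inch",
    "HP LaserJet Pro M404 printer",
    "Deli A4 copy paper 70g",
]
print(same_product(corpus[0], corpus[1], corpus))
```

In this sketch the stages are deliberately decoupled, so either stand-in could be swapped for a trained model without changing `same_product`. The binary search and word-sequence semantic verification steps mentioned in the abstract are not modeled, because the abstract does not say how they are applied.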
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2021/V43/I03/407