• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (10): 1873-1879.


Biomedical named entity recognition based on BERT and BiLSTM-CRF

XU Li, LI Jian-hua

  1. (School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China)
  • Received: 2020-04-12 Revised: 2020-09-14 Accepted: 2021-10-25 Online: 2021-10-25 Published: 2021-10-22

Abstract: In the biomedical field, named entity recognition methods based on static word vectors achieve low precision. To address this problem, a method that combines the pre-trained model BERT with BiLSTM-CRF is proposed for biomedical named entity recognition. First, BERT is used to extract semantic information and generate dynamic word vectors, and part-of-speech and chunking features are added to improve precision. Second, the word vectors are fed into the BiLSTM model for further training to obtain contextual features. Finally, the CRF decodes the sequence and outputs the tag sequence with the maximum probability. The model reaches an average F1 score of 89.45% on the BC4CHEMD, BC5CDR-chem, and NCBI-disease datasets. Experimental results show that the proposed model effectively improves precision on the biomedical named entity recognition task.
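
The final decoding step can be illustrated with a minimal sketch of Viterbi decoding, the standard way to find the highest-scoring tag sequence in a linear-chain CRF. This is not the authors' implementation: the tag set, the emission scores (standing in for BiLSTM outputs), and the transition scores below are hypothetical toy values chosen only to show how transition constraints change the result.

```python
from typing import List, Sequence

# Hypothetical BIO tag set for illustration; the paper's label scheme is not given here.
TAGS = ["O", "B-Chemical", "I-Chemical"]


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the tag indices of the highest-scoring path.

    emissions[t][j]   : score of tag j at position t (e.g. from a BiLSTM layer)
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] holds the best score of any path ending in tag j so far.
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(num_tags):
            # Pick the previous tag that maximizes the score of reaching tag j.
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final tag to recover the full path.
    last = max(range(num_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Toy scores (assumed values): "O -> I-Chemical" is heavily penalized.
transitions = [
    [0.0, 0.0, -1e4],  # from O
    [0.0, 0.0, 1.0],   # from B-Chemical
    [0.0, 0.0, 0.5],   # from I-Chemical
]
emissions = [
    [2.0, 0.5, 0.1],   # token 1
    [0.2, 1.0, 1.5],   # token 2: I-Chemical has the highest emission score
    [0.1, 0.3, 2.0],   # token 3
]

decoded = viterbi_decode(emissions, transitions)
print([TAGS[i] for i in decoded])  # ['O', 'B-Chemical', 'I-Chemical']
```

Picking the best tag independently at each position would label the second token I-Chemical directly after O; scoring whole sequences with transition terms avoids that invalid pattern, which is the motivation for placing a CRF on top of the BiLSTM.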


Key words: biomedicine, named entity recognition, pre-training language model, part of speech, chunking