[1] Zhang H L,Xiao L Q,Chen W Q,et al.Multi-task label embedding for text classification[J].arXiv:1710.07210,2017.
[2] Zhang Y,Chen H S,Zhao Y H,et al.Learning tag dependencies for sequence tagging[C]∥Proc of the 27th International Joint Conference on Artificial Intelligence,2018:4581-4587.
[3] Hochreiter S,Schmidhuber J.Long short-term memory[J].Neural Computation,1997,9(8):1735-1780.
[4] Lafferty J,McCallum A,Pereira F C N.Conditional random fields:Probabilistic models for segmenting and labeling sequence data[C]∥Proc of the 18th International Conference on Machine Learning,2001:282-289.
[5] Baum L E,Petrie T.Statistical inference for probabilistic functions of finite state Markov chains[J].The Annals of Mathematical Statistics,1966,37(6):1554-1563.
[6] Kudo T,Matsumoto Y.Use of support vector learning for chunk identification[C]∥Proc of the 4th Conference on Computational Natural Language Learning and the 2nd Learning Language in Logic Workshop,2000:142-144.
[7] Collobert R,Weston J,Bottou L,et al.Natural language processing (almost) from scratch[J].Journal of Machine Learning Research,2011,12:2493-2537.
[8] Huang Z H,Xu W,Yu K.Bidirectional LSTM-CRF models for sequence tagging[J].arXiv:1508.01991,2015.
[9] Lample G,Ballesteros M,Subramanian S,et al.Neural architectures for named entity recognition[J].arXiv:1603.01360,2016.
[10] Ma X Z,Hovy E.End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF[J].arXiv:1603.01354,2016.
[11] Liu L Y,Shang J B,Xu F F,et al.Empower sequence labeling with task-aware neural language model[J].arXiv:1709.04109,2017.
[12] Yamada I,Asai A,Shindo H,et al.LUKE:Deep contextualized entity representations with entity-aware self-attention[J].arXiv:2010.01057,2020.
[13] Jiang Y F,Hu C,Xiao T,et al.Improved differentiable architecture search for language modeling and named entity recognition[C]∥Proc of 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing,2019:3576-3581.
[14] Vaswani A,Shazeer N,Parmar N,et al.Attention is all you need[C]∥Proc of the 31st International Conference on Neural Information Processing Systems,2017:5998-6008.
[15] Yu A W,Dohan D,Luong M T,et al.QANet:Combining local convolution with global self-attention for reading comprehension[J].arXiv:1804.09541,2018.
[16] Lin Z H,Feng M W,Santos C N,et al.A structured self-attentive sentence embedding[J].arXiv:1703.03130,2017.
[17] Tang J,Qu M,Mei Q Z.PTE:Predictive text embedding through large-scale heterogeneous text networks[C]∥Proc of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,2015:1165-1174.
[18] Nam J,Mencía E L,Fürnkranz J.All-in text:Learning document,label,and word representations jointly[C]∥Proc of the 30th AAAI Conference on Artificial Intelligence,2016:1948-1954.
[19] Wang G Y,Li C Y,Wang W L,et al.Joint embedding of words and labels for text classification[J].arXiv:1805.04174,2018.
[20] Cui L Y,Zhang Y.Hierarchically-refined label attention network for sequence labeling[J].arXiv:1908.08676,2019.
[21] Pennington J,Socher R,Manning C D.GloVe:Global vectors for word representation[C]∥Proc of 2014 Conference on Empirical Methods in Natural Language Processing,2014:1532-1543.