Research Area:  Machine Learning
Named entity recognition (NER) is a fundamental task in natural language processing. Existing Korean NER methods use Korean morphemes, syllable sequences, and part-of-speech tags as features, and treat the problem as sequence labeling. In Korean, on the one hand, the morpheme itself carries strong indicative information about the named entity (especially for time and person entities). On the other hand, the context of the target morpheme plays an important role in recognizing the named entity (NE) tag of that morpheme. To make full use of these two kinds of features, we propose two auxiliary tasks. The first is a morpheme-level NE tagging task, which captures the NE features of the syllable sequence that composes a morpheme. The second is a context-based NE tagging task, which captures the context features of the target morpheme through a masked self-attention network. These two tasks are jointly trained with a Bi-LSTM-CRF NER tagger. Experimental results on the Klpexpo 2016 corpus and the Naver NLP Challenge 2018 corpus show that our model outperforms strong baseline systems and achieves state-of-the-art results.
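The context-based auxiliary task described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the mask, the number of heads, or how the losses are combined, so everything here is an assumption: the diagonal mask (each position is blocked from attending to itself, so its representation is built only from its context), the single attention head, the function names `masked_self_attention` and `joint_loss`, and the weights `alpha` and `beta`.

```python
import numpy as np

def masked_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention where position i cannot attend to itself.

    Assumption: a diagonal mask stands in for the paper's masking scheme.
    x has shape (seq_len, d_model) with seq_len >= 2.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    np.fill_diagonal(scores, -np.inf)             # hide the target position
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

def joint_loss(ner_loss, morpheme_loss, context_loss, alpha=1.0, beta=1.0):
    """Hypothetical joint objective: main NER loss plus weighted auxiliary losses."""
    return ner_loss + alpha * morpheme_loss + beta * context_loss

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                       # 5 morphemes, 8-dim embeddings
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
context_repr, attn = masked_self_attention(x, w_q, w_k, w_v)
```

Each row of `context_repr` depends only on the other positions in the sequence, which is one plausible way to force an auxiliary tagger to predict the NE tag of a morpheme from its context alone.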
Keywords:  
Korean Named Entity Recognition
Bi-LSTM-CRF
Masked Self-Attention
Deep Learning
Machine Learning
Author(s) Name:  Guozhe Jin, Zhezhou Yu
Journal name:  Computer Speech & Language
Conference name:  
Publisher name:  Elsevier
DOI:  10.1016/j.csl.2020.101134
Volume Information:  Volume 65, January 2021, 101134
Paper Link:   https://www.sciencedirect.com/science/article/abs/pii/S088523082030067X