The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Copyrights notice
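As social platforms have grown in popularity, new words are appearing at an increasing rate. These new words make NLP tasks such as word segmentation more difficult, so new word detection remains an important and challenging problem in NLP. This paper aims to extract new words with a BiLSTM+CRF model augmented with several hand-selected features: word length, part of speech (POS), contextual entropy, and degree of word coagulation. Unlike traditional new word detection methods, our method can use both the features extracted by the model and the features we select to find new words. Experimental results show that our model outperforms the benchmark models.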
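Two of the features named in the abstract, contextual entropy and degree of word coagulation, can be illustrated with a small sketch. The abstract does not give the paper's exact formulas, so the code below assumes formulations that are common in the new word detection literature: contextual entropy as the Shannon entropy of the characters adjacent to a candidate string, and coagulation as the minimum pointwise mutual information over all two-way splits of the candidate. All function names are hypothetical, and probabilities are approximated by dividing raw counts by the text length. Word length is simply `len(candidate)`; POS would come from an external tagger and is omitted here.

```python
import math
from collections import Counter


def ngram_counts(text, max_len):
    """Count every character n-gram of length 1..max_len in text."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def entropy(counter):
    """Shannon entropy (in nats) of the distribution described by a Counter."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counter.values())


def contextual_entropy(text, candidate):
    """Return (left, right) entropy of the characters adjacent to candidate.

    Assumed definition: a candidate that appears next to many different
    characters (high entropy on both sides) behaves like a standalone word.
    """
    left, right = Counter(), Counter()
    start = text.find(candidate)
    while start != -1:
        end = start + len(candidate)
        if start > 0:
            left[text[start - 1]] += 1
        if end < len(text):
            right[text[end]] += 1
        start = text.find(candidate, start + 1)  # allow overlapping matches
    return entropy(left), entropy(right)


def coagulation(text, candidate, counts=None):
    """Minimum PMI over all two-way splits of candidate (assumed definition).

    A high value means the parts co-occur far more often than chance,
    i.e. the characters "stick together" as a unit.
    """
    if len(candidate) < 2:
        raise ValueError("candidate must contain at least two characters")
    counts = counts or ngram_counts(text, len(candidate))
    total = len(text)  # rough normaliser shared by all n-gram lengths
    p_whole = counts[candidate] / total
    if p_whole == 0:
        return float("-inf")  # candidate never occurs in the text
    scores = []
    for k in range(1, len(candidate)):
        p_left = counts[candidate[:k]] / total
        p_right = counts[candidate[k:]] / total
        scores.append(math.log(p_whole / (p_left * p_right)))
    return min(scores)


def candidate_features(text, candidate):
    """Bundle the statistical features for one candidate string."""
    left_h, right_h = contextual_entropy(text, candidate)
    return {
        "length": len(candidate),
        "left_entropy": left_h,
        "right_entropy": right_h,
        "coagulation": coagulation(text, candidate),
    }
```

In a pipeline like the one the abstract describes, values of this kind would be attached to each token as extra inputs alongside the representations the BiLSTM learns. How the paper discretises or embeds them is not stated here, so that step is left out.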
Jianyong DUAN
North China University of Technology,CNONIX National Standard Application and Promotion Lab
Zheng TAN
North China University of Technology,CNONIX National Standard Application and Promotion Lab
Mei ZHANG
North China University of Technology
Hao WANG
North China University of Technology,CNONIX National Standard Application and Promotion Lab
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Jianyong DUAN, Zheng TAN, Mei ZHANG, Hao WANG, "New Word Detection Using BiLSTM+CRF Model with Features" in IEICE TRANSACTIONS on Information,
vol. E103-D, no. 10, pp. 2228-2236, October 2020, doi: 10.1587/transinf.2019EDP7330.
Abstract: With the widespread popularity of a large number of social platforms, an increasing number of new words gradually appear. However, such new words have made some NLP tasks like word segmentation more challenging. Therefore, new word detection is always an important and tough task in NLP. This paper aims to extract new words using the BiLSTM+CRF model which added some features selected by us. These features include word length, part of speech (POS), contextual entropy and degree of word coagulation. Comparing to the traditional new word detection methods, our method can use both the features extracted by the model and the features we select to find new words. Experimental results demonstrate that our model can perform better compared to the benchmark models.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7330/_p
@ARTICLE{e103-d_10_2228,
author={Jianyong DUAN and Zheng TAN and Mei ZHANG and Hao WANG},
journal={IEICE TRANSACTIONS on Information},
title={New Word Detection Using BiLSTM+CRF Model with Features},
year={2020},
volume={E103-D},
number={10},
pages={2228-2236},
abstract={With the widespread popularity of a large number of social platforms, an increasing number of new words gradually appear. However, such new words have made some NLP tasks like word segmentation more challenging. Therefore, new word detection is always an important and tough task in NLP. This paper aims to extract new words using the BiLSTM+CRF model which added some features selected by us. These features include word length, part of speech (POS), contextual entropy and degree of word coagulation. Comparing to the traditional new word detection methods, our method can use both the features extracted by the model and the features we select to find new words. Experimental results demonstrate that our model can perform better compared to the benchmark models.},
keywords={},
doi={10.1587/transinf.2019EDP7330},
ISSN={1745-1361},
month={October},}
TY  - JOUR
TI  - New Word Detection Using BiLSTM+CRF Model with Features
T2  - IEICE TRANSACTIONS on Information
SP  - 2228
EP  - 2236
AU  - Jianyong DUAN
AU  - Zheng TAN
AU  - Mei ZHANG
AU  - Hao WANG
PY  - 2020
DO  - 10.1587/transinf.2019EDP7330
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E103-D
IS  - 10
JA  - IEICE TRANSACTIONS on Information
Y1  - October 2020
AB  - With the widespread popularity of a large number of social platforms, an increasing number of new words gradually appear. However, such new words have made some NLP tasks like word segmentation more challenging. Therefore, new word detection is always an important and tough task in NLP. This paper aims to extract new words using the BiLSTM+CRF model which added some features selected by us. These features include word length, part of speech (POS), contextual entropy and degree of word coagulation. Comparing to the traditional new word detection methods, our method can use both the features extracted by the model and the features we select to find new words. Experimental results demonstrate that our model can perform better compared to the benchmark models.
ER  -