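This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collected a Japanese corpus of up to 100 billion words and constructed case frames from corpora of six different sizes. We then applied these case frames to syntactic and case structure analysis and to zero anaphora resolution, in order to investigate the relationship between the corpus size used for case frame acquisition and the performance of predicate-argument structure analysis. Case frames constructed from larger corpora yielded better analyses, and performance had not saturated even at a corpus size of 100 billion words.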
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ryohei SASANO, Daisuke KAWAHARA, Sadao KUROHASHI, "The Effect of Corpus Size on Case Frame Acquisition for Predicate-Argument Structure Analysis" in IEICE TRANSACTIONS on Information,
vol. E93-D, no. 6, pp. 1361-1368, June 2010, doi: 10.1587/transinf.E93.D.1361.
Abstract: This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.1361/_p
@ARTICLE{e93-d_6_1361,
author={Ryohei SASANO and Daisuke KAWAHARA and Sadao KUROHASHI},
journal={IEICE TRANSACTIONS on Information},
title={The Effect of Corpus Size on Case Frame Acquisition for Predicate-Argument Structure Analysis},
year={2010},
volume={E93-D},
number={6},
pages={1361-1368},
abstract={This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.},
keywords={},
doi={10.1587/transinf.E93.D.1361},
ISSN={1745-1361},
month={June},
}
TY - JOUR
TI - The Effect of Corpus Size on Case Frame Acquisition for Predicate-Argument Structure Analysis
T2 - IEICE TRANSACTIONS on Information
SP - 1361
EP - 1368
AU - Ryohei SASANO
AU - Daisuke KAWAHARA
AU - Sadao KUROHASHI
PY - 2010
DO - 10.1587/transinf.E93.D.1361
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E93-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 2010
AB - This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.
ER -