The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
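Learning semantic representations of translation context benefits statistical machine translation (SMT). Previous work has focused on implicitly encoding syntactic and semantic knowledge of the translation context with neural networks, which are weak at capturing explicit structural syntax information. This paper proposes a new neural network with a tree-based convolutional architecture that explicitly learns structural syntax information from the translation context and thereby improves translation prediction. Specifically, parallel sentences with source parse trees are first converted into syntax-based linear sequences using a minimum syntax subtree algorithm; a tree-based convolutional network is then defined over these linear sequences to jointly learn the syntax-based context representation and translation prediction. To verify its effectiveness, the proposed model is integrated into phrase-based SMT. Experiments on large-scale Chinese-to-English and German-to-English translation tasks show that the proposed approach achieves substantial and significant improvements over several baseline systems.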
Kehai CHEN
Harbin Institute of Technology
Tiejun ZHAO
Harbin Institute of Technology
Muyun YANG
Harbin Institute of Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Kehai CHEN, Tiejun ZHAO, Muyun YANG, "Syntax-Based Context Representation for Statistical Machine Translation" in IEICE TRANSACTIONS on Information, vol. E101-D, no. 12, pp. 3226-3237, December 2018, doi: 10.1587/transinf.2018EDP7209.
Abstract: Learning semantic representation for translation context is beneficial to statistical machine translation (SMT). Previous efforts have focused on implicitly encoding syntactic and semantic knowledge in translation context by neural networks, which are weak in capturing explicit structural syntax information. In this paper, we propose a new neural network with a tree-based convolutional architecture to explicitly learn structural syntax information in translation context, thus improving translation prediction. Specifically, we first convert parallel sentences with source parse trees into syntax-based linear sequences based on a minimum syntax subtree algorithm, and then define a tree-based convolutional network over the linear sequences to learn syntax-based context representation and translation prediction jointly. To verify the effectiveness, the proposed model is integrated into phrase-based SMT. Experiments on large-scale Chinese-to-English and German-to-English translation tasks show that the proposed approach can achieve a substantial and significant improvement over several baseline systems.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018EDP7209/_p
@ARTICLE{e101-d_12_3226,
author={Kehai CHEN and Tiejun ZHAO and Muyun YANG},
journal={IEICE TRANSACTIONS on Information},
title={Syntax-Based Context Representation for Statistical Machine Translation},
year={2018},
volume={E101-D},
number={12},
pages={3226-3237},
abstract={Learning semantic representation for translation context is beneficial to statistical machine translation (SMT). Previous efforts have focused on implicitly encoding syntactic and semantic knowledge in translation context by neural networks, which are weak in capturing explicit structural syntax information. In this paper, we propose a new neural network with a tree-based convolutional architecture to explicitly learn structural syntax information in translation context, thus improving translation prediction. Specifically, we first convert parallel sentences with source parse trees into syntax-based linear sequences based on a minimum syntax subtree algorithm, and then define a tree-based convolutional network over the linear sequences to learn syntax-based context representation and translation prediction jointly. To verify the effectiveness, the proposed model is integrated into phrase-based SMT. Experiments on large-scale Chinese-to-English and German-to-English translation tasks show that the proposed approach can achieve a substantial and significant improvement over several baseline systems.},
keywords={},
doi={10.1587/transinf.2018EDP7209},
ISSN={1745-1361},
month={December},}
TY - JOUR
TI - Syntax-Based Context Representation for Statistical Machine Translation
T2 - IEICE TRANSACTIONS on Information
SP - 3226
EP - 3237
AU - Kehai CHEN
AU - Tiejun ZHAO
AU - Muyun YANG
PY - 2018
DO - 10.1587/transinf.2018EDP7209
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E101-D
IS - 12
JA - IEICE TRANSACTIONS on Information
Y1 - December 2018
AB - Learning semantic representation for translation context is beneficial to statistical machine translation (SMT). Previous efforts have focused on implicitly encoding syntactic and semantic knowledge in translation context by neural networks, which are weak in capturing explicit structural syntax information. In this paper, we propose a new neural network with a tree-based convolutional architecture to explicitly learn structural syntax information in translation context, thus improving translation prediction. Specifically, we first convert parallel sentences with source parse trees into syntax-based linear sequences based on a minimum syntax subtree algorithm, and then define a tree-based convolutional network over the linear sequences to learn syntax-based context representation and translation prediction jointly. To verify the effectiveness, the proposed model is integrated into phrase-based SMT. Experiments on large-scale Chinese-to-English and German-to-English translation tasks show that the proposed approach can achieve a substantial and significant improvement over several baseline systems.
ER -