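In this paper we introduce a new framework of audio processing, which is essential to achieve a trigger-free speech interface for home appliances. If the speech interface works continually in real environments, it must extract occasional voice commands and reject everything else. It is extremely important to reduce the number of false alarms because the number of irrelevant inputs is much larger than the number of voice commands even for heavy users of appliances. The framework, called Intentional Voice Command Detection, is based on voice activity detection, but enhanced by various speech/audio processing techniques such as emotion recognition. The effectiveness of the proposed framework is evaluated using a newly-collected large-scale corpus. The advantages of combining various features were tested and confirmed, and the simple LDA-based classifier demonstrated acceptable performance. The effectiveness of various methods of user adaptation is also discussed.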
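As a rough illustration of what an LDA-based accept/reject decision over combined features can look like, the sketch below fits a two-class Fisher linear discriminant on two-dimensional feature vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the two features, the toy data, the function names, and the `bias` parameter are all hypothetical, and the paper's actual feature set and training procedure are not given in the abstract.

```python
from statistics import mean

# Hypothetical toy data: each row is (feature_1, feature_2) for one detected
# audio segment. The features stand in for whatever scores a real system
# would combine; they are not taken from the paper.
COMMANDS = [(2.0, 2.0), (3.0, 2.5), (2.5, 3.5), (3.5, 3.0)]
OTHERS = [(0.0, 0.5), (1.0, 0.0), (0.5, 1.0), (1.0, 1.5)]


def dot(u, v):
    """Inner product of two 2-D vectors."""
    return u[0] * v[0] + u[1] * v[1]


def centroid(rows):
    """Per-dimension mean of a list of 2-D samples."""
    return [mean(r[0] for r in rows), mean(r[1] for r in rows)]


def scatter(rows, mu):
    """Entries (sxx, sxy, syy) of the symmetric 2x2 scatter matrix."""
    sxx = sum((r[0] - mu[0]) ** 2 for r in rows)
    sxy = sum((r[0] - mu[0]) * (r[1] - mu[1]) for r in rows)
    syy = sum((r[1] - mu[1]) ** 2 for r in rows)
    return sxx, sxy, syy


def fit_lda_2d(commands, others):
    """Fit a two-class Fisher LDA; return (w, threshold)."""
    mu_c, mu_o = centroid(commands), centroid(others)
    a1, b1, d1 = scatter(commands, mu_c)
    a2, b2, d2 = scatter(others, mu_o)
    a, b, d = a1 + a2, b1 + b2, d1 + d2  # pooled within-class scatter S_w

    det = a * d - b * b
    if det == 0:
        raise ValueError("within-class scatter is singular")

    diff = [mu_c[0] - mu_o[0], mu_c[1] - mu_o[1]]
    # w = S_w^{-1} (mu_c - mu_o), using the closed-form 2x2 inverse.
    w = [(d * diff[0] - b * diff[1]) / det,
         (-b * diff[0] + a * diff[1]) / det]
    # Midpoint between the projected centroids (assumes equal priors/costs).
    threshold = 0.5 * (dot(w, mu_c) + dot(w, mu_o))
    return w, threshold


def is_command(x, w, threshold, bias=0.0):
    """Accept x as a command when its projection exceeds threshold + bias."""
    return dot(w, x) > threshold + bias


if __name__ == "__main__":
    w, threshold = fit_lda_2d(COMMANDS, OTHERS)
    for x in COMMANDS + OTHERS:
        print(x, is_command(x, w, threshold))
```

Raising the hypothetical `bias` makes the classifier stricter, trading missed commands for fewer false alarms. That trade-off is the one that matters when irrelevant inputs far outnumber voice commands.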
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yasunari OBUCHI, Takashi SUMIYOSHI, "Intentional Voice Command Detection for Trigger-Free Speech Interface" in IEICE TRANSACTIONS on Information, vol. E93-D, no. 9, pp. 2440-2450, September 2010, doi: 10.1587/transinf.E93.D.2440.
Abstract: In this paper we introduce a new framework of audio processing, which is essential to achieve a trigger-free speech interface for home appliances. If the speech interface works continually in real environments, it must extract occasional voice commands and reject everything else. It is extremely important to reduce the number of false alarms because the number of irrelevant inputs is much larger than the number of voice commands even for heavy users of appliances. The framework, called Intentional Voice Command Detection, is based on voice activity detection, but enhanced by various speech/audio processing techniques such as emotion recognition. The effectiveness of the proposed framework is evaluated using a newly-collected large-scale corpus. The advantages of combining various features were tested and confirmed, and the simple LDA-based classifier demonstrated acceptable performance. The effectiveness of various methods of user adaptation is also discussed.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.2440/_p
@ARTICLE{e93-d_9_2440,
author={Yasunari OBUCHI and Takashi SUMIYOSHI},
journal={IEICE TRANSACTIONS on Information},
title={Intentional Voice Command Detection for Trigger-Free Speech Interface},
year={2010},
volume={E93-D},
number={9},
pages={2440-2450},
abstract={In this paper we introduce a new framework of audio processing, which is essential to achieve a trigger-free speech interface for home appliances. If the speech interface works continually in real environments, it must extract occasional voice commands and reject everything else. It is extremely important to reduce the number of false alarms because the number of irrelevant inputs is much larger than the number of voice commands even for heavy users of appliances. The framework, called Intentional Voice Command Detection, is based on voice activity detection, but enhanced by various speech/audio processing techniques such as emotion recognition. The effectiveness of the proposed framework is evaluated using a newly-collected large-scale corpus. The advantages of combining various features were tested and confirmed, and the simple LDA-based classifier demonstrated acceptable performance. The effectiveness of various methods of user adaptation is also discussed.},
keywords={},
doi={10.1587/transinf.E93.D.2440},
ISSN={1745-1361},
month={September}
}
TY - JOUR
TI - Intentional Voice Command Detection for Trigger-Free Speech Interface
T2 - IEICE TRANSACTIONS on Information
SP - 2440
EP - 2450
AU - Yasunari OBUCHI
AU - Takashi SUMIYOSHI
PY - 2010
DO - 10.1587/transinf.E93.D.2440
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E93-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2010
AB - In this paper we introduce a new framework of audio processing, which is essential to achieve a trigger-free speech interface for home appliances. If the speech interface works continually in real environments, it must extract occasional voice commands and reject everything else. It is extremely important to reduce the number of false alarms because the number of irrelevant inputs is much larger than the number of voice commands even for heavy users of appliances. The framework, called Intentional Voice Command Detection, is based on voice activity detection, but enhanced by various speech/audio processing techniques such as emotion recognition. The effectiveness of the proposed framework is evaluated using a newly-collected large-scale corpus. The advantages of combining various features were tested and confirmed, and the simple LDA-based classifier demonstrated acceptable performance. The effectiveness of various methods of user adaptation is also discussed.
ER -