The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
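This paper presents a deep learning-based non-intrusive speech intelligibility estimation method that uses the bottleneck features of an autoencoder. P.563, the conventional standard for non-intrusive speech intelligibility estimation, performs poorly in various noise environments. We propose a more accurate method based on a long short-term memory (LSTM) neural network whose input is the autoencoder bottleneck features and whose output is a short-time objective intelligibility (STOI) score. STOI is a standard intrusive measure that computes speech intelligibility using a reference speech signal. For speech signals in various noise environments, the proposed method outperformed both the conventional standard P.563 and an intelligibility estimation method based on mel-frequency cepstral coefficient (MFCC) features.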
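The data flow described in the abstract (frame, then autoencoder bottleneck, then LSTM, then a single intelligibility score) can be illustrated with a minimal, untrained sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the dimensions `FRAME_DIM`, `BOTTLENECK`, and `HIDDEN`, the single-layer `tanh` encoder, the random weights, and the sigmoid output, which assumes the target score is scaled to the 0-to-1 range. In a real system the encoder would come from a trained autoencoder, and the LSTM would be trained against STOI scores computed with clean reference signals, so that only the degraded signal is needed at inference time.

```python
import math
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rand_matrix(rows, cols, rng, scale=0.1):
    """Random placeholder weights; a real model would load trained ones."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def encode(frame, W_enc):
    """Encoder half of a (hypothetical) autoencoder: frame -> bottleneck vector."""
    return [math.tanh(z) for z in matvec(W_enc, frame)]

def lstm_step(x, h, c, W, hidden):
    """One LSTM step. W maps [x; h; 1] to four stacked gate pre-activations."""
    z = matvec(W, x + h + [1.0])
    i = [sigmoid(v) for v in z[0:hidden]]                  # input gate
    f = [sigmoid(v) for v in z[hidden:2 * hidden]]         # forget gate
    o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]     # output gate
    g = [math.tanh(v) for v in z[3 * hidden:4 * hidden]]   # candidate cell state
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

def estimate_intelligibility(frames, W_enc, W_lstm, w_out, hidden):
    """Run bottleneck features through the LSTM and map the final state to (0, 1)."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for frame in frames:
        h, c = lstm_step(encode(frame, W_enc), h, c, W_lstm, hidden)
    return sigmoid(sum(w * v for w, v in zip(w_out, h)))

# Illustrative sizes only (assumptions, not values from the paper).
FRAME_DIM, BOTTLENECK, HIDDEN = 16, 4, 8
rng = random.Random(0)
W_enc = rand_matrix(BOTTLENECK, FRAME_DIM, rng)
W_lstm = rand_matrix(4 * HIDDEN, BOTTLENECK + HIDDEN + 1, rng)
w_out = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)]

# Stand-in for 20 feature frames of a degraded utterance; no reference signal is used.
frames = [[rng.gauss(0.0, 1.0) for _ in range(FRAME_DIM)] for _ in range(20)]
score = estimate_intelligibility(frames, W_enc, W_lstm, w_out, HIDDEN)
print(f"estimated intelligibility score: {score:.3f}")
```

Because the weights are random, the printed score carries no meaning; the sketch only shows that the estimator consumes the degraded signal alone, which is what makes the approach non-intrusive.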
Yoonhee KIM
Seoul National University of Science and Technology
Deokgyu YUN
Seoul National University of Science and Technology
Hannah LEE
Seoul National University of Science and Technology
Seung Ho CHOI
Seoul National University of Science and Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yoonhee KIM, Deokgyu YUN, Hannah LEE, Seung Ho CHOI, "A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features" in IEICE TRANSACTIONS on Information, vol. E103-D, no. 3, pp. 714-715, March 2020, doi: 10.1587/transinf.2019EDL8150.
Abstract: This paper presents a deep learning-based non-intrusive speech intelligibility estimation method using the bottleneck features of an autoencoder. The conventional standard non-intrusive speech intelligibility estimation method, P.563, performs poorly in various noise environments. We propose a more accurate speech intelligibility estimation method based on a long short-term memory (LSTM) neural network whose input and output are autoencoder bottleneck features and a short-time objective intelligibility (STOI) score, respectively, where STOI is a standard intrusive tool that measures speech intelligibility using reference speech signals. We show that the proposed method outperforms the conventional standard P.563 and mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation methods for speech signals in various noise environments.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDL8150/_p
@ARTICLE{e103-d_3_714,
author={Yoonhee KIM and Deokgyu YUN and Hannah LEE and Seung Ho CHOI},
journal={IEICE TRANSACTIONS on Information},
title={A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features},
year={2020},
volume={E103-D},
number={3},
pages={714-715},
abstract={This paper presents a deep learning-based non-intrusive speech intelligibility estimation method using the bottleneck features of an autoencoder. The conventional standard non-intrusive speech intelligibility estimation method, P.563, performs poorly in various noise environments. We propose a more accurate speech intelligibility estimation method based on a long short-term memory (LSTM) neural network whose input and output are autoencoder bottleneck features and a short-time objective intelligibility (STOI) score, respectively, where STOI is a standard intrusive tool that measures speech intelligibility using reference speech signals. We show that the proposed method outperforms the conventional standard P.563 and mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation methods for speech signals in various noise environments.},
keywords={},
doi={10.1587/transinf.2019EDL8150},
ISSN={1745-1361},
month={March},}
TY - JOUR
TI - A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features
T2 - IEICE TRANSACTIONS on Information
SP - 714
EP - 715
AU - Yoonhee KIM
AU - Deokgyu YUN
AU - Hannah LEE
AU - Seung Ho CHOI
PY - 2020
DO - 10.1587/transinf.2019EDL8150
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2020
AB - This paper presents a deep learning-based non-intrusive speech intelligibility estimation method using the bottleneck features of an autoencoder. The conventional standard non-intrusive speech intelligibility estimation method, P.563, performs poorly in various noise environments. We propose a more accurate speech intelligibility estimation method based on a long short-term memory (LSTM) neural network whose input and output are autoencoder bottleneck features and a short-time objective intelligibility (STOI) score, respectively, where STOI is a standard intrusive tool that measures speech intelligibility using reference speech signals. We show that the proposed method outperforms the conventional standard P.563 and mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation methods for speech signals in various noise environments.
ER -