The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. For example, some numerals may appear as "XNUMX".
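To improve speech recognition performance, feature transformation based on discriminant analysis has been widely used to reduce the redundant dimensions of acoustic features. Linear discriminant analysis (LDA) and heteroscedastic discriminant analysis (HDA) are often used for this purpose, and a generalization method for LDA and HDA, called power LDA (PLDA), has been proposed. However, these methods may result in an unexpected dimensionality reduction for multimodal data. It is important to preserve the local structure of the data when reducing the dimensionality of multimodal data. In this paper we introduce two methods, locality-preserving HDA and locality-preserving PLDA, to reduce dimensionality of multimodal data appropriately. We also propose an approximate calculation scheme to calculate sub-optimal projections rapidly. Experimental results show that the locality-preserving methods yield better performance than the traditional ones in speech recognition.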
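For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of classical two-class Fisher LDA in two dimensions. It is not the locality-preserving HDA or PLDA proposed in the paper, whose details are not given on this page. The function names (`fisher_direction`, `project`) and the toy data are illustrative assumptions.

```python
# Minimal sketch of two-class Fisher LDA on 2-D points (illustrative only;
# this is the classical baseline, not the paper's locality-preserving method).

def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]

def scatter(points, m):
    """2x2 scatter matrix of the points around the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    """Unit vector w proportional to S_w^{-1} (m_b - m_a)."""
    m_a, m_b = mean(class_a), mean(class_b)
    s_a, s_b = scatter(class_a, m_a), scatter(class_b, m_b)
    # Within-class scatter is the sum of the per-class scatter matrices.
    s_w = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    inv = [[s_w[1][1] / det, -s_w[0][1] / det],
           [-s_w[1][0] / det, s_w[0][0] / det]]
    diff = [m_b[0] - m_a[0], m_b[1] - m_a[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
    return [w[0] / norm, w[1] / norm]

def project(points, w):
    """Project 2-D points onto the 1-D direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]

# Hypothetical toy data: two well-separated classes.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (7.0, 6.0)]

w = fisher_direction(class_a, class_b)
z_a = project(class_a, w)
z_b = project(class_b, w)
```

Because this criterion summarizes each class by a single mean and a pooled scatter matrix, a class made of several distinct clusters is poorly described by it, which is the multimodal situation the abstract points to as a motivation for preserving local structure.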
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Makoto SAKAI, Norihide KITAOKA, Kazuya TAKEDA, "Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition" in IEICE TRANSACTIONS on Information, vol. E93-D, no. 5, pp. 1244-1252, May 2010, doi: 10.1587/transinf.E93.D.1244.
Abstract: To improve speech recognition performance, feature transformation based on discriminant analysis has been widely used to reduce the redundant dimensions of acoustic features. Linear discriminant analysis (LDA) and heteroscedastic discriminant analysis (HDA) are often used for this purpose, and a generalization method for LDA and HDA, called power LDA (PLDA), has been proposed. However, these methods may result in an unexpected dimensionality reduction for multimodal data. It is important to preserve the local structure of the data when reducing the dimensionality of multimodal data. In this paper we introduce two methods, locality-preserving HDA and locality-preserving PLDA, to reduce dimensionality of multimodal data appropriately. We also propose an approximate calculation scheme to calculate sub-optimal projections rapidly. Experimental results show that the locality-preserving methods yield better performance than the traditional ones in speech recognition.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.1244/_p
@ARTICLE{e93-d_5_1244,
author={Makoto SAKAI and Norihide KITAOKA and Kazuya TAKEDA},
journal={IEICE TRANSACTIONS on Information},
title={Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition},
year={2010},
volume={E93-D},
number={5},
pages={1244-1252},
abstract={To improve speech recognition performance, feature transformation based on discriminant analysis has been widely used to reduce the redundant dimensions of acoustic features. Linear discriminant analysis (LDA) and heteroscedastic discriminant analysis (HDA) are often used for this purpose, and a generalization method for LDA and HDA, called power LDA (PLDA), has been proposed. However, these methods may result in an unexpected dimensionality reduction for multimodal data. It is important to preserve the local structure of the data when reducing the dimensionality of multimodal data. In this paper we introduce two methods, locality-preserving HDA and locality-preserving PLDA, to reduce dimensionality of multimodal data appropriately. We also propose an approximate calculation scheme to calculate sub-optimal projections rapidly. Experimental results show that the locality-preserving methods yield better performance than the traditional ones in speech recognition.},
keywords={},
doi={10.1587/transinf.E93.D.1244},
ISSN={1745-1361},
month={May},}
TY  - JOUR
TI  - Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition
T2  - IEICE TRANSACTIONS on Information
SP  - 1244
EP  - 1252
AU  - Makoto SAKAI
AU  - Norihide KITAOKA
AU  - Kazuya TAKEDA
PY  - 2010
DO  - 10.1587/transinf.E93.D.1244
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E93-D
IS  - 5
JA  - IEICE TRANSACTIONS on Information
Y1  - May 2010
AB  - To improve speech recognition performance, feature transformation based on discriminant analysis has been widely used to reduce the redundant dimensions of acoustic features. Linear discriminant analysis (LDA) and heteroscedastic discriminant analysis (HDA) are often used for this purpose, and a generalization method for LDA and HDA, called power LDA (PLDA), has been proposed. However, these methods may result in an unexpected dimensionality reduction for multimodal data. It is important to preserve the local structure of the data when reducing the dimensionality of multimodal data. In this paper we introduce two methods, locality-preserving HDA and locality-preserving PLDA, to reduce dimensionality of multimodal data appropriately. We also propose an approximate calculation scheme to calculate sub-optimal projections rapidly. Experimental results show that the locality-preserving methods yield better performance than the traditional ones in speech recognition.
ER  -