Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.
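The abstract describes a common semi-supervised strategy for learning with noisy labels: labels judged unreliable are discarded and replaced with pseudo-labels taken from the model's confident predictions, with a correction for bias in the resulting pseudo-label distribution. The snippet below is a minimal, hypothetical PyTorch sketch of that general idea (confidence-thresholded pseudo-labeling combined with a simple distribution-alignment style correction); it is not the authors' causal-inference method, and the function name, threshold, and momentum values are illustrative assumptions.

```python
# Hypothetical sketch of confidence-thresholded pseudo-labeling with a simple
# correction for pseudo-label distribution bias. Illustrates the general
# semi-supervised strategy described in the abstract, NOT the authors' method.
import torch
import torch.nn.functional as F

def debiased_pseudo_labels(logits, running_marginal, threshold=0.95, momentum=0.99):
    """Assign pseudo-labels from confident predictions, down-weighting classes
    that dominate the running pseudo-label distribution."""
    probs = F.softmax(logits, dim=1)                    # per-sample class probabilities
    batch_marginal = probs.mean(dim=0)                  # class distribution in this batch
    running_marginal = momentum * running_marginal + (1 - momentum) * batch_marginal

    # Divide by the running marginal so over-represented classes are penalized,
    # then renormalize to a valid probability distribution.
    adjusted = probs / (running_marginal + 1e-8)
    adjusted = adjusted / adjusted.sum(dim=1, keepdim=True)

    conf, pseudo = adjusted.max(dim=1)                  # most likely class per sample
    mask = conf >= threshold                            # keep only confident predictions
    return pseudo, mask, running_marginal
```

In this sketch, running_marginal would be initialized to a uniform distribution, e.g. torch.full((num_classes,), 1.0 / num_classes), and carried across batches so the correction reflects the accumulated pseudo-label statistics rather than a single batch.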
Ryota HIGASHIMOTO
Kansai University
Soh YOSHIDA
Kansai University
Takashi HORIHATA
Kansai University
Mitsuji MUNEYASU
Kansai University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ryota HIGASHIMOTO, Soh YOSHIDA, Takashi HORIHATA, Mitsuji MUNEYASU, "Unbiased Pseudo-Labeling for Learning with Noisy Labels" in IEICE TRANSACTIONS on Information, vol. E107-D, no. 1, pp. 44-48, January 2024, doi: 10.1587/transinf.2023MUL0002.
Abstract: Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2023MUL0002/_p
@ARTICLE{e107-d_1_44,
author={Ryota HIGASHIMOTO and Soh YOSHIDA and Takashi HORIHATA and Mitsuji MUNEYASU},
journal={IEICE TRANSACTIONS on Information},
title={Unbiased Pseudo-Labeling for Learning with Noisy Labels},
year={2024},
volume={E107-D},
number={1},
pages={44-48},
abstract={Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.},
keywords={},
doi={10.1587/transinf.2023MUL0002},
ISSN={1745-1361},
month={January},}
TY - JOUR
TI - Unbiased Pseudo-Labeling for Learning with Noisy Labels
T2 - IEICE TRANSACTIONS on Information
SP - 44
EP - 48
AU - Ryota HIGASHIMOTO
AU - Soh YOSHIDA
AU - Takashi HORIHATA
AU - Mitsuji MUNEYASU
PY - 2024
DO - 10.1587/transinf.2023MUL0002
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E107-D
IS - 1
JA - IEICE TRANSACTIONS on Information
Y1 - January 2024
AB - Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.
ER -