The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations (e.g., some numerals are rendered as "XNUMX").
Copyright notice
Wujian YE
Guangdong University of Technology
Run TAN
Guangdong University of Technology
Yijun LIU
Guangdong University of Technology
Chin-Chen CHANG
Feng Chia University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Wujian YE, Run TAN, Yijun LIU, Chin-Chen CHANG, "The Comparison of Attention Mechanisms with Different Embedding Modes for Performance Improvement of Fine-Grained Classification" in IEICE TRANSACTIONS on Information,
vol. E106-D, no. 5, pp. 590-600, May 2023, doi: 10.1587/transinf.2022DLP0006.
Abstract: Fine-grained image classification is one of the key basic tasks of computer vision. Traditional deep convolutional neural networks (DCNNs) combined with attention mechanisms can focus on partial and local features of fine-grained images, but the way different attention modules are embedded in the network has received little consideration, leading to unsatisfactory classification models. To solve this problem, three attention mechanisms, namely the SE, CBAM and ECA modules, are introduced into DCNN networks (such as ResNet and VGGNet) so that the DCNN can better focus on the key local features of salient regions in the image. At the same time, we adopt three embedding modes for the attention modules, namely serial, residual and parallel modes, to further improve the performance of the classification model. The experimental results show that the three attention modules, combined with the three embedding modes, can effectively improve the performance of DCNN networks. Moreover, compared with SE and ECA, CBAM has a stronger feature extraction capability. Among them, the parallelly embedded CBAM makes the local information attended to by the DCNN richer and more accurate and brings the best improvement, with accuracy 1.98% and 1.57% higher than that of the original VGG16 and ResNet34 on the CUB-200-2011 dataset, respectively. The visualization analysis also indicates that the attention modules can be easily embedded into DCNN networks, especially in the parallel mode, with stronger generality and universality.
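As an illustrative sketch only (not the authors' implementation), the SE attention module and the three embedding modes named in the abstract can be expressed roughly as follows in plain NumPy. The weights `w1`/`w2`, the identity stand-in for the convolutional block, and the exact combination rule for the parallel mode are all assumptions made for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_attention(x, w1, w2):
    """Squeeze-and-Excitation gate over the channels of x (shape C x H x W)."""
    s = x.mean(axis=(1, 2))                    # squeeze: global average pool -> (C,)
    g = sigmoid(w2 @ np.maximum(w1 @ s, 0.0))  # excitation: FC -> ReLU -> FC -> sigmoid
    return x * g[:, None, None]                # scale each channel by its gate in (0, 1)

def embed(mode, x, conv_block, attn):
    """Three hypothetical embedding modes for an attention module."""
    if mode == "serial":    # attention applied to the conv block's output
        return attn(conv_block(x))
    if mode == "residual":  # block input added back to the attended output
        return x + attn(conv_block(x))
    if mode == "parallel":  # conv branch and attention branch run side by side
        return conv_block(x) + attn(x)
    raise ValueError(mode)
```

Under this reading, all three modes preserve the feature-map shape, so an attention module can be dropped into an existing backbone without changing the layers around it.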
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022DLP0006/_p
@ARTICLE{e106-d_5_590,
author={Wujian YE and Run TAN and Yijun LIU and Chin-Chen CHANG},
journal={IEICE TRANSACTIONS on Information},
title={The Comparison of Attention Mechanisms with Different Embedding Modes for Performance Improvement of Fine-Grained Classification},
year={2023},
volume={E106-D},
number={5},
pages={590-600},
abstract={Fine-grained image classification is one of the key basic tasks of computer vision. The appearance of traditional deep convolutional neural network (DCNN) combined with attention mechanism can focus on partial and local features of fine-grained images, but it still lacks the consideration of the embedding mode of different attention modules in the network, leading to the unsatisfactory result of classification model. To solve the above problems, three different attention mechanisms are introduced into the DCNN network (like ResNet, VGGNet, etc.), including SE, CBAM and ECA modules, so that DCNN could better focus on the key local features of salient regions in the image. At the same time, we adopt three different embedding modes of attention modules, including serial, residual and parallel modes, to further improve the performance of the classification model. The experimental results show that the three attention modules combined with three different embedding modes can improve the performance of DCNN network effectively. Moreover, compared with SE and ECA, CBAM has stronger feature extraction capability. Among them, the parallelly embedded CBAM can make the local information paid attention to by DCNN richer and more accurate, and bring the optimal effect for DCNN, which is 1.98% and 1.57% higher than that of original VGG16 and Resnet34 in CUB-200-2011 dataset, respectively. The visualization analysis also indicates that the attention modules can be easily embedded into DCNN networks, especially in the parallel mode, with stronger generality and universality.},
keywords={},
doi={10.1587/transinf.2022DLP0006},
ISSN={1745-1361},
month={May},}
TY - JOUR
TI - The Comparison of Attention Mechanisms with Different Embedding Modes for Performance Improvement of Fine-Grained Classification
T2 - IEICE TRANSACTIONS on Information
SP - 590
EP - 600
AU - Wujian YE
AU - Run TAN
AU - Yijun LIU
AU - Chin-Chen CHANG
PY - 2023
DO - 10.1587/transinf.2022DLP0006
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2023
AB - Fine-grained image classification is one of the key basic tasks of computer vision. The appearance of traditional deep convolutional neural network (DCNN) combined with attention mechanism can focus on partial and local features of fine-grained images, but it still lacks the consideration of the embedding mode of different attention modules in the network, leading to the unsatisfactory result of classification model. To solve the above problems, three different attention mechanisms are introduced into the DCNN network (like ResNet, VGGNet, etc.), including SE, CBAM and ECA modules, so that DCNN could better focus on the key local features of salient regions in the image. At the same time, we adopt three different embedding modes of attention modules, including serial, residual and parallel modes, to further improve the performance of the classification model. The experimental results show that the three attention modules combined with three different embedding modes can improve the performance of DCNN network effectively. Moreover, compared with SE and ECA, CBAM has stronger feature extraction capability. Among them, the parallelly embedded CBAM can make the local information paid attention to by DCNN richer and more accurate, and bring the optimal effect for DCNN, which is 1.98% and 1.57% higher than that of original VGG16 and Resnet34 in CUB-200-2011 dataset, respectively. The visualization analysis also indicates that the attention modules can be easily embedded into DCNN networks, especially in the parallel mode, with stronger generality and universality.
ER -