The aim of this paper is to propose effective attentional regions for fine-grained visual recognition. Based on the Spatial Transformers' capability of spatial manipulation within networks, we propose an extension model, the Attention-Guided Spatial Transformer Networks (AG-STNs). This model can guide the Spatial Transformers with hard-coded attentional regions at first. Then such guidance can be turned off, and the network model will adjust the region learning in terms of location and scale. Such adjustment is conditioned on the classification loss so that it is actually optimized for better recognition results. With this model, we are able to successfully capture detailed attentional information. Also, the AG-STNs are able to capture attentional information at multiple levels, and different levels of attentional information are complementary to each other in our experiments. A fusion of them brings better results.
Dichao LIU
Nagoya University
Yu WANG
Ritsumeikan University
Jien KATO
Ritsumeikan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Dichao LIU, Yu WANG, Jien KATO, "Attention-Guided Spatial Transformer Networks for Fine-Grained Visual Recognition" in IEICE TRANSACTIONS on Information and Systems, vol. E102-D, no. 12, pp. 2577-2586, December 2019, doi: 10.1587/transinf.2019EDP7045.
Abstract: The aim of this paper is to propose effective attentional regions for fine-grained visual recognition. Based on the Spatial Transformers' capability of spatial manipulation within networks, we propose an extension model, the Attention-Guided Spatial Transformer Networks (AG-STNs). This model can guide the Spatial Transformers with hard-coded attentional regions at first. Then such guidance can be turned off, and the network model will adjust the region learning in terms of the location and scale. Such adjustment is conditioned to the classification loss so that it is actually optimized for better recognition results. With this model, we are able to successfully capture detailed attentional information. Also, the AG-STNs are able to capture attentional information in multiple levels, and different levels of attentional information are complementary to each other in our experiments. A fusion of them brings better results.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7045/_p
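For readers who want a concrete picture of the mechanism described in the abstract, below is a minimal PyTorch sketch of the guidance idea: a Spatial Transformer (built from PyTorch's affine_grid / grid_sample primitives) whose predicted region parameters are pulled toward a hard-coded attentional region early in training, after which the guidance term is switched off and the classification loss alone adjusts location and scale. All module names, the placeholder backbone, the guidance target, and the loss weighting are illustrative assumptions, not the authors' implementation; the paper's multi-level attention fusion is also omitted here.

# Minimal sketch of the attention-guidance idea, assuming a zoom-and-shift
# affine transform (scale s, translation tx, ty). Not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGuidedSTN(nn.Module):
    def __init__(self, num_classes=200):
        super().__init__()
        # Localization network: predicts (s, tx, ty) for the attended region.
        self.loc = nn.Sequential(
            nn.Conv2d(3, 8, 7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 4 * 4, 3),
        )
        # Classifier over the attended crop (placeholder backbone).
        self.classifier = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, num_classes),
        )

    def forward(self, x):
        s, tx, ty = self.loc(x).unbind(dim=1)
        # Affine matrix [[s, 0, tx], [0, s, ty]]: uniform zoom plus shift.
        theta = torch.zeros(x.size(0), 2, 3, device=x.device)
        theta[:, 0, 0] = s
        theta[:, 1, 1] = s
        theta[:, 0, 2] = tx
        theta[:, 1, 2] = ty
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        crop = F.grid_sample(x, grid, align_corners=False)
        return self.classifier(crop), torch.stack([s, tx, ty], dim=1)

def loss_fn(logits, labels, params, guide_params, guidance_on):
    # The classification loss always drives the region learning.
    loss = F.cross_entropy(logits, labels)
    if guidance_on:
        # Early-stage guidance: pull (s, tx, ty) toward a hard-coded
        # attentional region; later this term is turned off so the
        # region is adjusted by the classification loss alone.
        loss = loss + F.mse_loss(params, guide_params.expand_as(params))
    return loss

# Illustrative usage: guide for the first few epochs, then set
# guidance_on=False and let the classification loss refine the region.
model = AttentionGuidedSTN()
x = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, 200, (4,))
guide = torch.tensor([0.5, 0.0, 0.0])  # assumed target: half-scale, centered
logits, params = model(x)
loss_fn(logits, labels, params, guide, guidance_on=True).backward()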
@ARTICLE{e102-d_12_2577,
author={Dichao LIU and Yu WANG and Jien KATO},
journal={IEICE TRANSACTIONS on Information and Systems},
title={Attention-Guided Spatial Transformer Networks for Fine-Grained Visual Recognition},
year={2019},
volume={E102-D},
number={12},
pages={2577-2586},
abstract={The aim of this paper is to propose effective attentional regions for fine-grained visual recognition. Based on the Spatial Transformers' capability of spatial manipulation within networks, we propose an extension model, the Attention-Guided Spatial Transformer Networks (AG-STNs). This model can guide the Spatial Transformers with hard-coded attentional regions at first. Then such guidance can be turned off, and the network model will adjust the region learning in terms of the location and scale. Such adjustment is conditioned to the classification loss so that it is actually optimized for better recognition results. With this model, we are able to successfully capture detailed attentional information. Also, the AG-STNs are able to capture attentional information in multiple levels, and different levels of attentional information are complementary to each other in our experiments. A fusion of them brings better results.},
keywords={},
doi={10.1587/transinf.2019EDP7045},
ISSN={1745-1361},
month={December},}
TY - JOUR
TI - Attention-Guided Spatial Transformer Networks for Fine-Grained Visual Recognition
T2 - IEICE TRANSACTIONS on Information and Systems
SP - 2577
EP - 2586
AU - Dichao LIU
AU - Yu WANG
AU - Jien KATO
PY - 2019
DO - 10.1587/transinf.2019EDP7045
JO - IEICE TRANSACTIONS on Information and Systems
SN - 1745-1361
VL - E102-D
IS - 12
JA - IEICE TRANSACTIONS on Information and Systems
Y1 - 2019/12//
AB - The aim of this paper is to propose effective attentional regions for fine-grained visual recognition. Based on the Spatial Transformers' capability of spatial manipulation within networks, we propose an extension model, the Attention-Guided Spatial Transformer Networks (AG-STNs). This model can guide the Spatial Transformers with hard-coded attentional regions at first. Then such guidance can be turned off, and the network model will adjust the region learning in terms of the location and scale. Such adjustment is conditioned to the classification loss so that it is actually optimized for better recognition results. With this model, we are able to successfully capture detailed attentional information. Also, the AG-STNs are able to capture attentional information in multiple levels, and different levels of attentional information are complementary to each other in our experiments. A fusion of them brings better results.
ER -