The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Copyrights notice
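Convolutional Neural Networks (CNNs) have recently shown outstanding performance in image retrieval tasks. In particular, local convolutional features extracted by CNNs are highly discriminative. Recent research in this field has focused on pooling methods, which aggregate local features into a global feature and assess the global similarity of two images. However, pooling sacrifices an image's local region information and spatial relationships, which are known to be key to robustness against occlusion and viewpoint changes. In this paper, instead of pooling, we propose an alternative method based on local similarity, computed directly from local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account both information about local regions and the spatial relationships between them. We then construct a similarity CNN model (SCNN) on top of the LSTs to assess the similarity between query and gallery images. The best configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and the spatial relationships between local regions. Experimental results on a modified open dataset (in which query images are limited to occluded ones) confirm that the proposed method outperforms pooling methods owing to its improved robustness. Furthermore, tests on three public retrieval datasets show that combining LSTs with conventional pooling methods achieves the best results.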
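The contrast between pooled global similarity and local similarity can be illustrated with a small sketch. The abstract does not define the three LST forms or the SCNN architecture, so everything below is an assumption made for illustration only: the function names (`cosine`, `global_similarity`, `local_similarity_tensor`), the use of average pooling as the baseline, the choice of cosine similarity, and the toy 2x2 feature maps are all hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def global_similarity(fmap_q, fmap_g):
    """Pooling baseline (assumed: average pooling), then one global comparison."""
    def avg_pool(fmap):
        cells = [vec for row in fmap for vec in row]
        return [sum(channel) / len(cells) for channel in zip(*cells)]
    return cosine(avg_pool(fmap_q), avg_pool(fmap_g))

def local_similarity_tensor(fmap_q, fmap_g):
    """Hypothetical LST: entry [i][j][k][l] compares query cell (i, j)
    with gallery cell (k, l), so spatial positions are kept explicit."""
    return [[[[cosine(q, g) for g in g_row] for g_row in fmap_g]
             for q in q_row] for q_row in fmap_q]

# Toy 2x2 feature maps with 2 channels; the gallery is the query mirrored left-right.
query = [[[1.0, 0.0], [0.0, 1.0]],
         [[1.0, 0.0], [0.0, 1.0]]]
gallery = [[[0.0, 1.0], [1.0, 0.0]],
           [[0.0, 1.0], [1.0, 0.0]]]

pooled = global_similarity(query, gallery)     # approximately 1.0: pooling hides the mirroring
lst = local_similarity_tensor(query, gallery)
same_position = lst[0][0][0][0]                # 0.0: cells at the same position disagree
mirrored_position = lst[0][0][0][1]            # 1.0: the match sits at the mirrored position
```

In this toy case the pooled vectors are identical, so the global score cannot distinguish the two layouts, while the four-dimensional tensor still records where the matching regions are. A learned model such as the SCNN described above would take a tensor of this kind as input; how the paper actually builds and consumes its LSTs is not stated in the abstract.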
Longjiao ZHAO
Nagoya University
Yu WANG
Hitotsubashi University
Jien KATO
Ritsumeikan University
Yoshiharu ISHIKAWA
Nagoya University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Longjiao ZHAO, Yu WANG, Jien KATO, Yoshiharu ISHIKAWA, "Learning Local Similarity with Spatial Interrelations on Content-Based Image Retrieval" in IEICE TRANSACTIONS on Information,
vol. E106-D, no. 5, pp. 1069-1080, May 2023, doi: 10.1587/transinf.2022EDP7163.
Abstract: Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional capability in discrimination. Recent research in this field has concentrated on pooling methods that incorporate local features into global features and assess the global similarity of two images. However, the pooling methods sacrifice the image's local region information and spatial relationships, which are precisely known as the keys to the robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined by directly using local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between the query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms the pooling methods because of robustness enhancement. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022EDP7163/_p
@ARTICLE{e106-d_5_1069,
author={Longjiao ZHAO and Yu WANG and Jien KATO and Yoshiharu ISHIKAWA},
journal={IEICE TRANSACTIONS on Information},
title={Learning Local Similarity with Spatial Interrelations on Content-Based Image Retrieval},
year={2023},
volume={E106-D},
number={5},
pages={1069-1080},
abstract={Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional capability in discrimination. Recent research in this field has concentrated on pooling methods that incorporate local features into global features and assess the global similarity of two images. However, the pooling methods sacrifice the image's local region information and spatial relationships, which are precisely known as the keys to the robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined by directly using local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between the query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms the pooling methods because of robustness enhancement. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.},
keywords={},
doi={10.1587/transinf.2022EDP7163},
ISSN={1745-1361},
month={May},
}
TY - JOUR
TI - Learning Local Similarity with Spatial Interrelations on Content-Based Image Retrieval
T2 - IEICE TRANSACTIONS on Information
SP - 1069
EP - 1080
AU - Longjiao ZHAO
AU - Yu WANG
AU - Jien KATO
AU - Yoshiharu ISHIKAWA
PY - 2023
DO - 10.1587/transinf.2022EDP7163
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2023
AB - Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional capability in discrimination. Recent research in this field has concentrated on pooling methods that incorporate local features into global features and assess the global similarity of two images. However, the pooling methods sacrifice the image's local region information and spatial relationships, which are precisely known as the keys to the robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined by directly using local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between the query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms the pooling methods because of robustness enhancement. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.
ER -