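Content-based image retrieval has been a hot topic among computer vision researchers for a long time. There have been many advances over the years, one of the recent ones being deep metric learning, inspired by the success of deep neural networks in many machine learning tasks. The goal of metric learning is to extract good high-level features from image pixel data using neural networks. These features provide useful abstractions, which can enable algorithms to perform visual comparison between images with human-like accuracy. To learn these features, supervised information of image similarity or relative similarity is often used. One important issue in deep metric learning is how to define similarity for multi-label or multi-object scenes in images. Traditionally, pairwise similarity is defined based on the presence of a single common label between two images. However, this definition is very coarse and not suitable for multi-label or multi-object data. Another common mistake is to completely ignore the multiplicity of objects in images, hence ignoring the multi-object facet of certain types of datasets. In our work, we propose an approach for learning deep image representations based on the relative similarity of both multi-label and multi-object image data. We introduce an intuitive and effective similarity metric based on the Jaccard similarity coefficient, which is equivalent to the intersection over union of two label sets. Hence we treat similarity as a continuous, as opposed to discrete, quantity. We incorporate this similarity metric into a triplet loss with an adaptive margin, and achieve good mean average precision on image retrieval tasks. We further show, using a recently proposed quantization method, that the resulting deep feature can be quantized whilst preserving similarity. We also show that our proposed similarity metric performs better for multi-object images than a previously proposed cosine similarity-based metric. Our proposed method outperforms several state-of-the-art methods on two benchmark datasets.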
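The similarity measure and loss described above can be illustrated with a minimal Python sketch. Only the Jaccard coefficient (intersection over union of two label sets) is taken directly from the abstract. Everything else is an assumption made for illustration: the function names, the multiset variant used to account for repeated objects, the Euclidean distance, the `base_margin` parameter, and in particular the rule that scales the margin by the difference in Jaccard similarity. The abstract does not state the paper's exact margin formula, so this should not be read as the authors' implementation.

```python
import math
from collections import Counter
from typing import Hashable, Iterable, Sequence


def jaccard_similarity(labels_a: Iterable[Hashable],
                       labels_b: Iterable[Hashable]) -> float:
    """Intersection over union of two label sets (a value in [0, 1])."""
    a, b = set(labels_a), set(labels_b)
    union = a | b
    if not union:
        # Assumption: two images with no labels are treated as dissimilar.
        return 0.0
    return len(a & b) / len(union)


def multiset_jaccard_similarity(objects_a: Iterable[Hashable],
                                objects_b: Iterable[Hashable]) -> float:
    """Assumed multi-object variant: counts repeated objects.

    Uses the multiset form sum(min counts) / sum(max counts).
    """
    ca, cb = Counter(objects_a), Counter(objects_b)
    union_size = sum((ca | cb).values())   # Counter | keeps the max count
    if union_size == 0:
        return 0.0
    return sum((ca & cb).values()) / union_size  # Counter & keeps the min count


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def adaptive_margin_triplet_loss(anchor: Sequence[float],
                                 positive: Sequence[float],
                                 negative: Sequence[float],
                                 sim_pos: float,
                                 sim_neg: float,
                                 base_margin: float = 1.0) -> float:
    """Triplet hinge loss with a similarity-dependent margin.

    Hypothetical margin rule: the margin grows with the gap between the
    anchor-positive and anchor-negative label similarities.
    """
    margin = base_margin * (sim_pos - sim_neg)
    d_pos = euclidean_distance(anchor, positive)
    d_neg = euclidean_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    # Two images sharing one of three distinct labels.
    print(jaccard_similarity({"dog", "person"}, {"dog", "car"}))
    # Repeated objects change the score in the multiset variant.
    print(multiset_jaccard_similarity(["dog", "dog", "person"],
                                      ["dog", "person", "person"]))  # 0.5
    # The negative sits closer to the anchor than the positive, so the loss is nonzero.
    print(adaptive_margin_triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0],
                                       sim_pos=0.75, sim_neg=0.25))  # 4.5
```

Because the similarity is continuous, triplets can be formed from any three images whose label sets overlap to different degrees, rather than only from pairs that do or do not share a single label.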
Jonathan MOJOO
Hiroshima University
Takio KURITA
Hiroshima University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Jonathan MOJOO, Takio KURITA, "Deep Metric Learning for Multi-Label and Multi-Object Image Retrieval" in IEICE TRANSACTIONS on Information, vol. E104-D, no. 6, pp. 873-880, June 2021, doi: 10.1587/transinf.2020EDP7226.
Abstract: Content-based image retrieval has been a hot topic among computer vision researchers for a long time. There have been many advances over the years, one of the recent ones being deep metric learning, inspired by the success of deep neural networks in many machine learning tasks. The goal of metric learning is to extract good high-level features from image pixel data using neural networks. These features provide useful abstractions, which can enable algorithms to perform visual comparison between images with human-like accuracy. To learn these features, supervised information of image similarity or relative similarity is often used. One important issue in deep metric learning is how to define similarity for multi-label or multi-object scenes in images. Traditionally, pairwise similarity is defined based on the presence of a single common label between two images. However, this definition is very coarse and not suitable for multi-label or multi-object data. Another common mistake is to completely ignore the multiplicity of objects in images, hence ignoring the multi-object facet of certain types of datasets. In our work, we propose an approach for learning deep image representations based on the relative similarity of both multi-label and multi-object image data. We introduce an intuitive and effective similarity metric based on the Jaccard similarity coefficient, which is equivalent to the intersection over union of two label sets. Hence we treat similarity as a continuous, as opposed to discrete quantity. We incorporate this similarity metric into a triplet loss with an adaptive margin, and achieve good mean average precision on image retrieval tasks. We further show, using a recently proposed quantization method, that the resulting deep feature can be quantized whilst preserving similarity. We also show that our proposed similarity metric performs better for multi-object images than a previously proposed cosine similarity-based metric. Our proposed method outperforms several state-of-the-art methods on two benchmark datasets.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020EDP7226/_p
@ARTICLE{e104-d_6_873,
author={Jonathan MOJOO and Takio KURITA},
journal={IEICE TRANSACTIONS on Information},
title={Deep Metric Learning for Multi-Label and Multi-Object Image Retrieval},
year={2021},
volume={E104-D},
number={6},
pages={873-880},
abstract={Content-based image retrieval has been a hot topic among computer vision researchers for a long time. There have been many advances over the years, one of the recent ones being deep metric learning, inspired by the success of deep neural networks in many machine learning tasks. The goal of metric learning is to extract good high-level features from image pixel data using neural networks. These features provide useful abstractions, which can enable algorithms to perform visual comparison between images with human-like accuracy. To learn these features, supervised information of image similarity or relative similarity is often used. One important issue in deep metric learning is how to define similarity for multi-label or multi-object scenes in images. Traditionally, pairwise similarity is defined based on the presence of a single common label between two images. However, this definition is very coarse and not suitable for multi-label or multi-object data. Another common mistake is to completely ignore the multiplicity of objects in images, hence ignoring the multi-object facet of certain types of datasets. In our work, we propose an approach for learning deep image representations based on the relative similarity of both multi-label and multi-object image data. We introduce an intuitive and effective similarity metric based on the Jaccard similarity coefficient, which is equivalent to the intersection over union of two label sets. Hence we treat similarity as a continuous, as opposed to discrete quantity. We incorporate this similarity metric into a triplet loss with an adaptive margin, and achieve good mean average precision on image retrieval tasks. We further show, using a recently proposed quantization method, that the resulting deep feature can be quantized whilst preserving similarity. We also show that our proposed similarity metric performs better for multi-object images than a previously proposed cosine similarity-based metric. 
Our proposed method outperforms several state-of-the-art methods on two benchmark datasets.},
keywords={},
doi={10.1587/transinf.2020EDP7226},
ISSN={1745-1361},
month={June},}
TY - JOUR
TI - Deep Metric Learning for Multi-Label and Multi-Object Image Retrieval
T2 - IEICE TRANSACTIONS on Information
SP - 873
EP - 880
AU - Jonathan MOJOO
AU - Takio KURITA
PY - 2021
DO - 10.1587/transinf.2020EDP7226
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E104-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 2021
AB - Content-based image retrieval has been a hot topic among computer vision researchers for a long time. There have been many advances over the years, one of the recent ones being deep metric learning, inspired by the success of deep neural networks in many machine learning tasks. The goal of metric learning is to extract good high-level features from image pixel data using neural networks. These features provide useful abstractions, which can enable algorithms to perform visual comparison between images with human-like accuracy. To learn these features, supervised information of image similarity or relative similarity is often used. One important issue in deep metric learning is how to define similarity for multi-label or multi-object scenes in images. Traditionally, pairwise similarity is defined based on the presence of a single common label between two images. However, this definition is very coarse and not suitable for multi-label or multi-object data. Another common mistake is to completely ignore the multiplicity of objects in images, hence ignoring the multi-object facet of certain types of datasets. In our work, we propose an approach for learning deep image representations based on the relative similarity of both multi-label and multi-object image data. We introduce an intuitive and effective similarity metric based on the Jaccard similarity coefficient, which is equivalent to the intersection over union of two label sets. Hence we treat similarity as a continuous, as opposed to discrete quantity. We incorporate this similarity metric into a triplet loss with an adaptive margin, and achieve good mean average precision on image retrieval tasks. We further show, using a recently proposed quantization method, that the resulting deep feature can be quantized whilst preserving similarity. We also show that our proposed similarity metric performs better for multi-object images than a previously proposed cosine similarity-based metric. Our proposed method outperforms several state-of-the-art methods on two benchmark datasets.
ER -