The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
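In the neck of a two-stage object detection network, feature fusion is generally carried out in either a top-down or bottom-up manner. However, two types of imbalance may exist: feature imbalance in the neck of the model, and gradient imbalance in the region of interest extraction layer caused by changes in object scale. The deeper the network, the more abstract the learned features, that is, the more semantic information can be extracted; however, less information about image background, spatial location, and other resolution-dependent detail is retained. In contrast, the shallow part of the network learns little semantic information but a great deal of spatial location information. We propose the Both Ends to Centre to Multiple Layers (BEtM) feature fusion method to solve the feature imbalance problem in the neck, and a Multi-level Region of Interest Feature Extraction (MRoIE) layer to solve the gradient imbalance problem. In combination with the Region-based Convolutional Neural Network (R-CNN) framework, our Balanced Feature Fusion (BFF) method offers significantly improved network performance compared with the Faster R-CNN architecture. On the MS COCO 2017 dataset, it achieves an average precision (AP) that is 1.9 points and 3.2 points higher than those of the Feature Pyramid Network (FPN) Faster R-CNN framework and the Generic Region of Interest Extractor (GRoIE) framework, respectively.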
Hongzhe LIU
Beijing Union University
Ningwei WANG
Beijing Union University
Xuewei LI
Beijing Union University
Cheng XU
Beijing Union University
Yaze LI
Beijing Union University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Hongzhe LIU, Ningwei WANG, Xuewei LI, Cheng XU, Yaze LI, "BFF R-CNN: Balanced Feature Fusion for Object Detection" in IEICE TRANSACTIONS on Information,
vol. E105-D, no. 8, pp. 1472-1480, August 2022, doi: 10.1587/transinf.2021EDP7261.
Abstract: In the neck part of a two-stage object detection network, feature fusion is generally carried out in either a top-down or bottom-up manner. However, two types of imbalance may exist: feature imbalance in the neck of the model and gradient imbalance in the region of interest extraction layer due to the scale changes of objects. The deeper the network is, the more abstract the learned features are, that is to say, more semantic information can be extracted. However, the extracted image background, spatial location, and other resolution information are less. In contrast, the shallow part can learn little semantic information, but a lot of spatial location information. We propose the Both Ends to Centre to Multiple Layers (BEtM) feature fusion method to solve the feature imbalance problem in the neck and a Multi-level Region of Interest Feature Extraction (MRoIE) layer to solve the gradient imbalance problem. In combination with the Region-based Convolutional Neural Network (R-CNN) framework, our Balanced Feature Fusion (BFF) method offers significantly improved network performance compared with the Faster R-CNN architecture. On the MS COCO 2017 dataset, it achieves an average precision (AP) that is 1.9 points and 3.2 points higher than those of the Feature Pyramid Network (FPN) Faster R-CNN framework and the Generic Region of Interest Extractor (GRoIE) framework, respectively.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021EDP7261/_p
@ARTICLE{e105-d_8_1472,
author={Hongzhe LIU and Ningwei WANG and Xuewei LI and Cheng XU and Yaze LI},
journal={IEICE TRANSACTIONS on Information},
title={BFF R-CNN: Balanced Feature Fusion for Object Detection},
year={2022},
volume={E105-D},
number={8},
pages={1472-1480},
abstract={In the neck part of a two-stage object detection network, feature fusion is generally carried out in either a top-down or bottom-up manner. However, two types of imbalance may exist: feature imbalance in the neck of the model and gradient imbalance in the region of interest extraction layer due to the scale changes of objects. The deeper the network is, the more abstract the learned features are, that is to say, more semantic information can be extracted. However, the extracted image background, spatial location, and other resolution information are less. In contrast, the shallow part can learn little semantic information, but a lot of spatial location information. We propose the Both Ends to Centre to Multiple Layers (BEtM) feature fusion method to solve the feature imbalance problem in the neck and a Multi-level Region of Interest Feature Extraction (MRoIE) layer to solve the gradient imbalance problem. In combination with the Region-based Convolutional Neural Network (R-CNN) framework, our Balanced Feature Fusion (BFF) method offers significantly improved network performance compared with the Faster R-CNN architecture. On the MS COCO 2017 dataset, it achieves an average precision (AP) that is 1.9 points and 3.2 points higher than those of the Feature Pyramid Network (FPN) Faster R-CNN framework and the Generic Region of Interest Extractor (GRoIE) framework, respectively.},
keywords={},
doi={10.1587/transinf.2021EDP7261},
ISSN={1745-1361},
month={August},}
TY - JOUR
TI - BFF R-CNN: Balanced Feature Fusion for Object Detection
T2 - IEICE TRANSACTIONS on Information
SP - 1472
EP - 1480
AU - Hongzhe LIU
AU - Ningwei WANG
AU - Xuewei LI
AU - Cheng XU
AU - Yaze LI
PY - 2022
DO - 10.1587/transinf.2021EDP7261
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E105-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - August 2022
AB - In the neck part of a two-stage object detection network, feature fusion is generally carried out in either a top-down or bottom-up manner. However, two types of imbalance may exist: feature imbalance in the neck of the model and gradient imbalance in the region of interest extraction layer due to the scale changes of objects. The deeper the network is, the more abstract the learned features are, that is to say, more semantic information can be extracted. However, the extracted image background, spatial location, and other resolution information are less. In contrast, the shallow part can learn little semantic information, but a lot of spatial location information. We propose the Both Ends to Centre to Multiple Layers (BEtM) feature fusion method to solve the feature imbalance problem in the neck and a Multi-level Region of Interest Feature Extraction (MRoIE) layer to solve the gradient imbalance problem. In combination with the Region-based Convolutional Neural Network (R-CNN) framework, our Balanced Feature Fusion (BFF) method offers significantly improved network performance compared with the Faster R-CNN architecture. On the MS COCO 2017 dataset, it achieves an average precision (AP) that is 1.9 points and 3.2 points higher than those of the Feature Pyramid Network (FPN) Faster R-CNN framework and the Generic Region of Interest Extractor (GRoIE) framework, respectively.
ER -