CN113052185A - Small sample target detection method based on Faster R-CNN - Google Patents

Small sample target detection method based on Faster R-CNN

Info

Publication number
CN113052185A
Authority
CN
China
Prior art keywords
network
image
attention
features
branch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110270154.2A
Other languages
Chinese (zh)
Inventor
贾海涛
鲜维富
莫超杰
许文波
任利
周焕来
贾宇明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202110270154.2A
Publication of CN113052185A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a small sample target detection method based on Faster R-CNN. By combining a conventional target detection algorithm with a small sample learning algorithm, the Faster R-CNN network is substantially improved and optimized so that it is suitable for small sample target detection. The invention provides an attention-based RPN module, which uses a channel attention mechanism to assign different weights to different channel features, then performs depth-wise cross-correlation between the support set features and the query set features to generate an attention feature map, and feeds the attention feature map to the RPN to generate candidate boxes. Based on metric learning, an improved weighted prototype network replaces the Faster R-CNN classifier head, improving the classification accuracy of candidate regions under small samples. The invention also introduces a multi-scale FPN module with two branches: one branch, as in a general detection network, is applied to the RPN layer, and the other is applied to the support set images to extract multi-scale feature maps, addressing the scale sparsity of small sample data sets and the scale difference between the query image and the support set images.

Description

Small sample target detection method based on Faster R-CNN
Technical Field
The invention relates to the field of small sample learning and target detection in deep learning, in particular to a target detection technology under a small sample condition.
Background
In recent years, with the development of massively parallel computing devices, deep learning has achieved great success in practical computer vision applications. For example, image recognition has been widely applied in face recognition, autonomous driving, biomedicine and other fields, and the core task in these applications is to detect and recognize targets in a scene with a neural network model. However, image algorithms based on deep neural networks usually need a large amount of labeled data and end-to-end supervised training, and achieve good results only after a large number of iterations. Because of the constraints and particularities of some practical applications, it is often difficult to obtain large-scale image data sets, for example pictures of rare species, rare remote sensing images, valuable medical diagnosis images, and pictures of special military targets. Moreover, even when enough sample pictures are available, labeling large-scale sample data requires enormous manpower and material resources. Therefore, how to learn from a small number of samples and generalize to new tasks when data is scarce has become a widely discussed problem in computer vision and other fields.
With the continuous progress of deep-learning-based target detection, excellent detection frameworks such as Faster R-CNN, YOLO and SSD have appeared, but small sample target detection remains a difficult problem in the field. The invention addresses the problem of detecting targets when only a few samples are available. It combines a small sample learning algorithm with a conventional target detection algorithm, namely Faster R-CNN, and designs a method capable of target detection under small sample conditions.
Disclosure of Invention
To solve the target detection problem under small sample conditions, the invention provides a small sample target detection technique based on Faster R-CNN. The technique builds on Faster R-CNN, a general two-stage target detection algorithm in deep learning, and further improves it with small sample learning techniques to handle the case of insufficient samples.
The technical scheme adopted by the invention is as follows:
Step 1: inputting an image to be detected as the query set image and a small number of images containing targets as the support set images;
Step 2: extracting query image features through a feature extraction network, and extracting support set image features as support features;
Step 3: sending the support image features and the query image features into an FPN network simultaneously to generate multi-scale feature maps;
Step 4: passing the feature maps through a channel attention mechanism and spatial attention to generate an attention feature map, sending the attention feature map into the RPN to generate candidate boxes, and generating RoI feature maps through RoI Pooling;
Step 5: sending the support features and the RoI features to the metric branch and the regression branch for classification and localization, respectively, thereby detecting the targets.
Compared with the prior art, the invention has the beneficial effects that:
(1) compared with conventional target detection algorithms, the method generalizes better;
(2) for small sample targets with insufficient samples, the method recognizes and detects the targets more effectively.
Drawings
FIG. 1 is a schematic diagram of the FPN.
FIG. 2 is a diagram of the channel attention mechanism.
FIG. 3 shows the multi-scale feature map extraction process.
FIG. 4 shows some detection results on the PASCAL VOC data set.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
In this embodiment, the method for detecting a small sample target includes the following processing steps:
Step 1: Image input
Unlike a conventional target detection algorithm, which takes only a single image to be detected as input, small sample target detection takes the image to be detected as the query image and a small number of images containing the targets as the support set images. The method therefore comprises a query image branch and a support image branch, which run in parallel.
Step 2: multi-scale feature map extraction
The invention improves the Faster R-CNN detection algorithm by introducing a multi-scale FPN module that extracts multi-scale feature maps from the query image and the support set images simultaneously, addressing the detection of targets at different scales and the scale difference between the query image and the support set images. A schematic diagram of the FPN is shown in FIG. 1. The query image branch is similar to a general detection network, with the FPN applied to the RPN layer; the other branch applies the FPN to the support set images to extract multi-scale feature maps and obtain a feature pyramid for each support image, enriching the scale space of the support set. After the support image feature pyramids are obtained, a multi-scale prototype of each class is generated by a weighted prototype network, as shown in FIG. 3.
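The top-down fusion that an FPN performs can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the implementation described in the patent: it assumes the backbone levels already share a channel count (so the usual 1 × 1 lateral convolutions are omitted) and that each level is exactly half the spatial size of the previous one. The function names `upsample2x` and `fpn_top_down` are hypothetical. The same routine would be applied to the query image and to each support image.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fpn_top_down(levels):
    """Fuse backbone levels, ordered fine to coarse, into a feature pyramid.

    Assumes every level has the same channel count and half the spatial
    size of the previous level (a simplification made for this sketch).
    """
    pyramid = [levels[-1]]                      # start from the coarsest level
    for feat in reversed(levels[:-1]):
        # Add the upsampled coarser map to the current finer map.
        pyramid.append(feat + upsample2x(pyramid[-1]))
    return pyramid[::-1]                        # return fine to coarse again

# Toy backbone outputs with 4 channels at 16x16, 8x8 and 4x4 resolution.
rng = np.random.default_rng(0)
levels = [rng.standard_normal((4, s, s)) for s in (16, 8, 4)]
query_pyramid = fpn_top_down(levels)
```

Each pyramid level keeps the resolution of its backbone level while mixing in coarser, more semantic information, which is what lets prototypes be built at several scales.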
Step 3: Candidate region extraction
Faster R-CNN uses an RPN (Region Proposal Network) to generate potential candidate regions: softmax decides whether each anchor box belongs to the foreground or the background, and bounding-box regression then corrects the anchor boxes to obtain more accurate candidate boxes. In small sample target detection, the target object to be detected has only a few training samples, and an RPN trained on the base classes is likely to generate many candidate boxes irrelevant to the object when detecting a new class, which requires the candidate classification network to have good discriminative ability. On the other hand, the RPN should filter out candidate regions that do not belong to the support set categories, reducing the number of candidate boxes that need to be distinguished and thereby further improving network accuracy. The invention therefore proposes an RPN based on a multi-attention mechanism.
The idea of the attention mechanism stems from the human visual system, which selectively focuses on certain important regions while observing and ignores others. The multi-attention RPN proposed by the invention takes both support set and query set samples as input, so that the RPN can generate candidate boxes for small sample targets more effectively. After feature extraction, the query image and the support images are first sent to a channel attention module, which learns global information in order to selectively strengthen certain channel features and suppress less useful ones. As shown in the channel attention mechanism of FIG. 2, the input feature maps first undergo global average pooling to compress global information: all pixels of each feature map are averaged, compressing it to a size of 1 × 1. Then, to learn the nonlinear correlations between channels, the pooled values pass through two fully connected layers with a ReLU activation function, and a sigmoid function normalizes the result to produce the attention weight of each channel. Multiplying the feature maps by these weights gives the channel-attention-weighted feature maps.
After channel attention, depth-wise cross-correlation is performed between the support set features and the query set features to generate an attention feature map, which is then sent to the RPN to generate candidate boxes. In the spatial attention module, the query set features pass through a convolutional layer that, unlike a conventional convolution, is a depth-wise convolution. Meanwhile, the support set features undergo pooling and depth-wise convolution to form a 1 × 1 × C vector, which is used as a kernel in a depth-wise cross-correlation with the query set features, generating an attention feature map that represents the correlation between the query set and the support set.
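The two attention stages described in this step can be illustrated with a short NumPy sketch. It is only a sketch under assumptions: the fully connected weights `w1` and `w2` are random placeholders rather than trained parameters, the depth-wise convolutions applied before the correlation are omitted, the support features are pooled with a plain global average, and all function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(feat, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    w1 has shape (C_hidden, C) and w2 has shape (C, C_hidden); both are
    placeholder weights standing in for the two fully connected layers.
    """
    squeezed = feat.mean(axis=(1, 2))            # global average pooling -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)      # first FC layer + ReLU
    weights = sigmoid(w2 @ hidden)               # second FC layer + sigmoid -> (C,)
    return feat * weights[:, None, None], weights

def depthwise_cross_correlation(query_feat, support_feat):
    """Correlate a (C, H, W) query map with a pooled 1x1xC support kernel.

    With a 1x1 kernel, depth-wise cross-correlation reduces to scaling each
    query channel by the matching support channel value.
    """
    kernel = support_feat.mean(axis=(1, 2))      # pool support features -> (C,)
    return query_feat * kernel[:, None, None]

rng = np.random.default_rng(0)
C, hidden_dim = 8, 2
w1 = rng.standard_normal((hidden_dim, C))
w2 = rng.standard_normal((C, hidden_dim))
query = rng.standard_normal((C, 10, 10))
support = rng.standard_normal((C, 5, 5))

query_att, q_weights = channel_attention(query, w1, w2)
support_att, _ = channel_attention(support, w1, w2)
attention_map = depthwise_cross_correlation(query_att, support_att)
```

The resulting `attention_map` has the same shape as the query features, so it can be passed to an RPN in place of the raw query feature map.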
And step 3: candidate region classification and regression
After candidate frames are extracted through the RPN, the fast RCNN synthesizes a Roi feature map with the feature map, and the Roi feature map is subjected to final classification and positioning judgment to screen out a final target. The original fast RCNN network directly adopts the traditional softmax function to output the class of the target, but for the case of a small sample, the traditional classification method does not have enough generalization capability to detect the target of a new class. The invention proposes an improved weighted prototype network to replace the fast RCNN classifier head based on a metric learning approach. By adopting a metric learning mode and combining a meta learning training strategy, a model with generalization capability can be trained, and a new target class can be determined according to a small amount of samples, so as to realize target detection under a small sample.
The prototype network extracts the embedded features of the image through the embedded network learning, and uses the mean vector of the features of each category of the images of the support set as the prototype of the class, as shown in formula 1. And then, judging the distance between the query image feature and each prototype through a non-parametric measurement mode such as Euclidean distance and the like to classify.
[Formula 1: equation image not reproduced in this text (BDA0002973996010000041)]
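The mean-vector prototype and nearest-prototype classification described above can be sketched in a few lines of NumPy. The embeddings here are hand-written toy vectors rather than outputs of a learned embedding network, and the names `class_prototypes` and `classify` are hypothetical.

```python
import numpy as np

def class_prototypes(support_feats):
    """Mean embedding per class; support_feats maps label -> (K, D) array."""
    return {label: feats.mean(axis=0) for label, feats in support_feats.items()}

def classify(query_feat, prototypes):
    """Return the label whose prototype is nearest in Euclidean distance."""
    return min(
        prototypes,
        key=lambda label: np.linalg.norm(query_feat - prototypes[label]),
    )

# Two classes with two support embeddings each (toy 2-D features).
support_feats = {
    "cat": np.array([[1.0, 0.0], [0.8, 0.2]]),
    "dog": np.array([[0.0, 1.0], [0.2, 0.8]]),
}
prototypes = class_prototypes(support_feats)
predicted = classify(np.array([0.9, 0.3]), prototypes)
```

In this toy case the query lies much closer to the "cat" prototype than to the "dog" prototype, so it is assigned to "cat".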
However, this approach has a problem: when the support set samples are distributed very differently or include bad samples, the computed mean vector does not represent the class well. Taking the mean makes every sample feature contribute equally to the representative vector, whereas different sample features should contribute to different degrees.
The invention therefore calculates the prototype of each class in a weighted manner. Specifically, a one-dimensional Gaussian kernel function is first used to calculate a weighting coefficient for each support set sample feature, as shown in formula 2.
[Formula 2: equation image not reproduced in this text (BDA0002973996010000042)]
where x_ij denotes the j-th support sample of the i-th class, x_q denotes the query sample of class i, and σ_i denotes the width of the Gaussian function, which is set to 0.1.
After obtaining the weighting coefficients for each support set feature, the present invention calculates the prototype of the class in a weighting manner, and the specific calculation is shown in formula 3.
[Formula 3: equation image not reproduced in this text (BDA0002973996010000043)]
where the quantity defined by formula 3 is the weighted prototype of the i-th class.
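Because the images for formulas 2 and 3 are not available in this text, the following NumPy sketch shows only one plausible reading of the weighting scheme. It assumes the weight of each support feature is proportional to exp(-||x_ij - x_q||^2 / (2 σ^2)) and that the weights are normalized to sum to 1; the exact form used in the patent may differ. The function name `weighted_prototype` is hypothetical.

```python
import numpy as np

def weighted_prototype(support_feats, query_feat, sigma=0.1):
    """Gaussian-kernel weighted mean of (K, D) support features.

    Assumed form: w_j is proportional to exp(-||x_j - x_q||^2 / (2 sigma^2)),
    with the weights normalised to sum to 1.
    """
    sq_dist = np.sum((support_feats - query_feat) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max()                 # stabilise before exponentiating
    weights = np.exp(logits)
    weights /= weights.sum()
    return weights @ support_feats, weights

# Three support features of one class; the last row is an outlier.
support = np.array([[1.0, 0.0], [0.9, 0.1], [5.0, 5.0]])
query = np.array([0.95, 0.05])
prototype, weights = weighted_prototype(support, query)
```

With a narrow kernel such as σ = 0.1 the outlier receives a weight close to zero, so the weighted prototype stays near the two consistent samples, whereas a plain mean would be pulled toward the outlier.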
A query sample is expected to move toward the weighted prototype of its own class and away from the weighted prototypes of other classes, which yields the loss function shown in formula 4.
[Formula 4: equation image not reproduced in this text (BDA0002973996010000045)]
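The image for formula 4 is not available in this text, so its exact form is unknown. One common loss with the stated behaviour, used here purely as an assumption, is cross-entropy over a softmax of negative squared Euclidean distances to the class prototypes, as in standard prototypical networks. The function name `prototype_loss` is hypothetical.

```python
import numpy as np

def prototype_loss(query_feat, prototypes, true_index):
    """Cross-entropy over a softmax of negative squared distances.

    prototypes is an (N, D) array of weighted class prototypes and
    true_index is the row belonging to the query's own class.
    """
    logits = -np.sum((prototypes - query_feat) ** 2, axis=1)
    logits = logits - logits.max()                     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())  # log-softmax
    return -log_probs[true_index]

prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])
query = np.array([0.9, 0.1])
loss_correct = prototype_loss(query, prototypes, true_index=0)
loss_wrong = prototype_loss(query, prototypes, true_index=1)
```

Minimizing this quantity lowers the distance to the query's own prototype relative to the distances to the other prototypes, which matches the behaviour described for the loss.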
FIG. 4 shows detection results of the method of the invention on the PASCAL VOC test set. The method detects the targets well and can also detect some small targets.
While the invention has been described with reference to specific embodiments, any feature disclosed in this specification may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise; all of the disclosed features, or all of the method or process steps, may be combined in any combination, except combinations where mutually exclusive features or/and steps are present.

Claims (4)

1. A small sample target detection method based on Faster R-CNN, characterized by comprising the following steps:
Step 1: inputting an image to be detected as the query set image and a small number of images containing targets as the support set images;
Step 2: extracting query image features through a feature extraction network, and extracting support set image features as support features;
Step 3: sending the support image features and the query image features into an FPN network simultaneously to generate multi-scale feature maps;
Step 4: passing the feature maps through a channel attention mechanism and spatial attention to generate an attention feature map, sending the attention feature map into the RPN to generate candidate boxes, and generating RoI feature maps through RoI Pooling;
Step 5: sending the support features and the RoI features to the metric branch and the regression branch for classification and localization, respectively, thereby detecting the targets.
2. The method of claim 1, wherein in step 3 the FPN is applied to both a query image branch and a support set image branch; in the query image branch, the feature maps of different scales output by the FPN fusion are input into the RPN to generate candidate regions; in the support set image branch, the support set image features are input into the FPN to obtain a feature pyramid for each support image.
3. The method of claim 1, wherein in step 4 the channel attention mechanism and the spatial attention are applied in series, i.e., after channel attention, depth-wise cross-correlation is performed between the support set features and the query set features to generate the attention feature map.
4. The method of claim 1, wherein step 5 uses an improved weighted prototype network as the metric branch.
CN202110270154.2A 2021-03-12 2021-03-12 Small sample target detection method based on Faster R-CNN Pending CN113052185A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110270154.2A CN113052185A (en) 2021-03-12 2021-03-12 Small sample target detection method based on Faster R-CNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110270154.2A CN113052185A (en) 2021-03-12 2021-03-12 Small sample target detection method based on Faster R-CNN

Publications (1)

Publication Number Publication Date
CN113052185A true CN113052185A (en) 2021-06-29

Family

ID=76513149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110270154.2A Pending CN113052185A (en) 2021-03-12 2021-03-12 Small sample target detection method based on Faster R-CNN

Country Status (1)

Country Link
CN (1) CN113052185A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110555475A (en) * 2019-08-29 2019-12-10 华南理工大学 few-sample target detection method based on semantic information fusion
US20200334490A1 (en) * 2019-04-16 2020-10-22 Fujitsu Limited Image processing apparatus, training method and training apparatus for the same
CN112434721A (en) * 2020-10-23 2021-03-02 特斯联科技集团有限公司 Image classification method, system, storage medium and terminal based on small sample learning
CN112465752A (en) * 2020-11-16 2021-03-09 电子科技大学 Improved Faster R-CNN-based small target detection method

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
QI FAN et al.: "Few-Shot Object Detection With Attention-RPN and Multi-Relation Detector", 《HTTPS://ARXIV.ORG/ABS/1908.01998》 *
S CHEN et al.: "R2FA-Det: Delving into high-quality rotatable boxes for ship detection in SAR image", 《REMOTE SENSING》 *
XUAN NIE et al.: "Attention Mask R-CNN for Ship Detection and Segmentation From Remote Sensing Images", 《IEEE ACCESS》 *
张婷婷 et al.: "A survey of image target detection algorithms based on deep learning", 《Telecommunications Science》 *
李红艳 et al.: "Remote sensing image target detection with a convolutional neural network improved by an attention mechanism", 《Journal of Image and Graphics》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191359A (en) * 2021-06-30 2021-07-30 之江实验室 Small sample target detection method and system based on support and query samples
CN113705570A (en) * 2021-08-31 2021-11-26 长沙理工大学 Few-sample target detection method based on deep learning
CN113705570B (en) * 2021-08-31 2023-12-08 长沙理工大学 Deep learning-based few-sample target detection method
CN113723558A (en) * 2021-09-08 2021-11-30 北京航空航天大学 Remote sensing image small sample ship detection method based on attention mechanism
CN114612702A (en) * 2022-01-24 2022-06-10 珠高智能科技(深圳)有限公司 Image data annotation system and method based on deep learning
CN114494728A (en) * 2022-02-10 2022-05-13 北京工业大学 Small target detection method based on deep learning
CN114494728B (en) * 2022-02-10 2024-06-07 北京工业大学 Small target detection method based on deep learning
CN114663707A (en) * 2022-03-28 2022-06-24 中国科学院光电技术研究所 Improved few-sample target detection method based on fast RCNN
CN114818963A (en) * 2022-05-10 2022-07-29 电子科技大学 Small sample detection algorithm based on cross-image feature fusion
CN114818963B (en) * 2022-05-10 2023-05-09 电子科技大学 Small sample detection method based on cross-image feature fusion
CN115100432A (en) * 2022-08-23 2022-09-23 浙江大华技术股份有限公司 Small sample target detection method and device and computer readable storage medium
CN115100432B (en) * 2022-08-23 2022-11-18 浙江大华技术股份有限公司 Small sample target detection method and device and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN113052185A (en) Small sample target detection method based on Faster R-CNN
CN111783576B (en) Pedestrian re-identification method based on improved YOLOv3 network and feature fusion
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
CN111767882A (en) Multi-mode pedestrian detection method based on improved YOLO model
CN110782420A (en) Small target feature representation enhancement method based on deep learning
CN111461083A (en) Rapid vehicle detection method based on deep learning
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN111709311A (en) Pedestrian re-identification method based on multi-scale convolution feature fusion
CN108520203B (en) Multi-target feature extraction method based on fusion of self-adaptive multi-peripheral frame and cross pooling feature
CN112949572A (en) Slim-YOLOv 3-based mask wearing condition detection method
CN111709313B (en) Pedestrian re-identification method based on local and channel combination characteristics
CN111738344A (en) Rapid target detection method based on multi-scale fusion
CN114463677B (en) Safety helmet wearing detection method based on global attention
CN109635726B (en) Landslide identification method based on combination of symmetric deep network and multi-scale pooling
CN111339975A (en) Target detection, identification and tracking method based on central scale prediction and twin neural network
CN108734200B (en) Human target visual detection method and device based on BING (building information network) features
CN116342894B (en) GIS infrared feature recognition system and method based on improved YOLOv5
CN114663707A (en) Improved few-sample target detection method based on fast RCNN
CN116091946A (en) Yolov 5-based unmanned aerial vehicle aerial image target detection method
CN114429577B (en) Flag detection method, system and equipment based on high confidence labeling strategy
CN111553337A (en) Hyperspectral multi-target detection method based on improved anchor frame
CN116912670A (en) Deep sea fish identification method based on improved YOLO model
CN107679528A (en) A kind of pedestrian detection method based on AdaBoost SVM Ensemble Learning Algorithms
CN114565918A (en) Face silence living body detection method and system based on multi-feature extraction module
CN111046861B (en) Method for identifying infrared image, method for constructing identification model and application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210629
