CN113052185A - Small sample target detection method based on Faster R-CNN - Google Patents
Small sample target detection method based on Faster R-CNN
- Publication number
- CN113052185A (application CN202110270154.2A)
- Authority
- CN
- China
- Prior art keywords
- network
- image
- attention
- features
- branch
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a small sample target detection method based on Faster R-CNN. The Faster R-CNN network is substantially improved and optimized by combining a traditional target detection algorithm with a small sample learning algorithm, making it suitable for small sample target detection. The invention provides an attention-based RPN module, which assigns different weights to different channel features using a channel attention mechanism, then performs depth-wise cross-correlation between the support set features and the query set features to generate an attention feature map, which is sent to the RPN network to generate candidate boxes. Based on metric learning, an improved weighted prototype network replaces the Faster R-CNN classifier head, improving the classification accuracy of candidate regions under small-sample conditions. The invention also introduces a multi-scale FPN module comprising two branches: one branch, like a general detection network, is applied at the RPN layer; the other is applied to the support set images to extract multi-scale feature maps, addressing the scale sparsity of small sample data sets and the scale difference between query images and support set images.
Description
Technical Field
The invention relates to the fields of small sample learning and target detection in deep learning, and in particular to a target detection technique under small-sample conditions.
Background
In recent years, with the development of massively parallel computing devices, deep learning has achieved great success in practical computer vision applications. For example, image recognition has been widely applied in face recognition, autonomous driving, biomedicine, and other fields, where the core task is to detect and recognize targets in a scene with a neural network model. However, image algorithms based on deep neural networks usually require large amounts of labeled data and end-to-end supervised training over many iterations to perform well. Due to limitations and particularities of some practical applications, it is often difficult to obtain large-scale image sample sets — for example, pictures of rare species, rare remote sensing images, precious medical diagnosis images, and special military targets. On the other hand, even when enough sample pictures are available, annotating large-scale sample data requires enormous manpower and material resources. Therefore, under data scarcity, how to learn from a small number of samples and generalize to new tasks has become an active research topic in computer vision and related fields.
With the continuous progress of deep-learning-based target detection, excellent detection frameworks such as Faster R-CNN, YOLO, and SSD have appeared, but small sample target detection remains a difficult problem in the field. The invention aims to detect targets when only a few samples are available. It combines a small sample learning algorithm with a traditional target detection algorithm, Faster R-CNN, to design a method capable of target detection under small-sample conditions.
Disclosure of Invention
In order to solve the target detection problem under small-sample conditions, the invention provides a small sample target detection technique based on Faster R-CNN. The technique builds on the general two-stage target detection algorithm Faster R-CNN in deep learning and, for the case of insufficient samples, further improves Faster R-CNN by combining it with small sample learning techniques.
The technical scheme adopted by the invention is as follows:
Step 1: input an image to be detected as the query set image and a small number of images containing the target as support set images;
Step 2: extract query image features and support set image features ("support features") through a feature extraction network;
Step 3: send the support features and query features simultaneously into an FPN network to generate multi-scale feature maps;
Step 4: generate an attention feature map from the feature maps through a channel attention mechanism and spatial attention, send it into an RPN network to generate candidate boxes, and produce RoI feature maps through RoI Pooling;
Step 5: send the support features and RoI features into the metric branch and the regression branch respectively for classification and localization, detecting the target.
Compared with the prior art, the invention has the beneficial effects that:
(1) compared with traditional target detection algorithms, the method generalizes better;
(2) for small sample targets with insufficient samples, recognition and detection are performed better.
Drawings
FIG. 1: schematic diagram of the FPN.
FIG. 2: diagram of the channel attention mechanism.
FIG. 3: the multi-scale feature map extraction process.
FIG. 4: partial detection results on the PASCAL VOC data set.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
In this embodiment, the method for detecting a small sample target includes the following processing steps:
step 1: image input
Unlike a traditional target detection algorithm, which takes only a single image to be detected as input, small sample target detection takes the image to be detected as the query image and a few images containing the target as support set images. The method therefore comprises a query image branch and a support image branch, and the two branches run concurrently.
Step 2: multi-scale feature map extraction
The invention improves the Faster R-CNN detection algorithm by introducing a multi-scale FPN module that extracts multi-scale feature maps of the query image and the support set images simultaneously, addressing the detection of targets at different scales and the scale difference between the query image and the support set images; the FPN is shown schematically in FIG. 1. The query image branch, like a common detection network, applies the FPN at the RPN layer; the other branch is applied to the support set images to extract multi-scale feature maps, yielding a feature pyramid for each support image and thereby enriching the support set scale space. Further, after the support image feature pyramids are obtained, a multi-scale prototype of each class is generated through the weighted prototype network; the process is shown in FIG. 3.
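As a sketch of the top-down fusion shown schematically in FIG. 1 — the channel counts, the 1 × 1 lateral projection, and the nearest-neighbour upsampling below are illustrative assumptions, with random untrained weights, not the patent's exact configuration:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a C x H x W map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fpn_merge(c_levels, out_ch=4):
    """Minimal FPN top-down pathway: project each backbone level with a
    (random, untrained) 1x1 conv, then add the upsampled coarser level."""
    rng = np.random.default_rng(0)
    laterals = []
    for c in c_levels:                                   # finest -> coarsest
        w = rng.standard_normal((out_ch, c.shape[0]))    # 1x1 conv weights
        laterals.append(np.einsum('oc,chw->ohw', w, c))
    p = [None] * len(laterals)
    p[-1] = laterals[-1]                                 # start from the coarsest map
    for i in range(len(laterals) - 2, -1, -1):
        p[i] = laterals[i] + upsample2x(p[i + 1])
    return p

# Backbone maps at strides 1, 2, 4 (channels double as resolution halves).
c2 = np.ones((8, 16, 16)); c3 = np.ones((16, 8, 8)); c4 = np.ones((32, 4, 4))
p2, p3, p4 = fpn_merge([c2, c3, c4])
print([m.shape for m in (p2, p3, p4)])  # [(4, 16, 16), (4, 8, 8), (4, 4, 4)]
```

The same `fpn_merge` can be run on each support image to produce the per-image feature pyramids the description mentions.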
And step 3: candidate region extraction
Faster R-CNN generates potential candidate regions with an RPN (Region Proposal Network): softmax first judges whether each anchor box belongs to the foreground or the background, and bounding-box regression then corrects the anchor boxes to obtain more accurate candidate boxes. In small sample target detection the target object to be detected has only a few training samples, and an RPN trained on the base classes is likely to generate many candidate boxes irrelevant to the object when detecting a new class, so the candidate classification network must have good discrimination capability. On the other hand, the RPN should filter out candidate regions that do not belong to the support set categories, reducing the number of candidate boxes to be distinguished and helping to further improve accuracy. The invention therefore proposes an RPN based on a multiple attention mechanism. The idea of the attention mechanism stems from the human visual system, which selectively focuses on certain areas of emphasis while ignoring others. The proposed multi-attention RPN takes the support set and query set samples as input, so that the RPN can generate candidate boxes for small sample targets more effectively. After feature extraction, the query image and the support image are first sent to a channel attention module, which selectively strengthens certain channel features and suppresses less useful ones by learning global information. As shown in the channel attention mechanism of FIG. 2, an input set of feature maps first undergoes global average pooling to compress global information: all pixels of each feature map are averaged, compressing it to a size of 1 × 1.
Then, to learn the nonlinear correlations between channels, the pooled vector passes through two fully connected layers with a ReLU activation and is normalized by a sigmoid function, finally producing an attention weight for each channel. Multiplying these weights with the feature maps gives the channel-attention-weighted feature maps. After channel attention, the support set features and the query set features undergo depth-wise cross-correlation to generate an attention feature map, which is then sent to the RPN network to generate candidate boxes. In the spatial attention module, the query set features pass through a convolution layer which, unlike conventional convolution, uses depth-wise convolution. Meanwhile, the support set features are pooled and depth-wise convolved into a 1 × 1 × C vector, which serves as a kernel for the depth-wise cross-correlation with the query set features, generating an attention feature map that represents the correlation between the query set and the support set.
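A minimal NumPy sketch of the two operations just described — squeeze-and-excitation-style channel attention followed by depth-wise cross-correlation of a pooled 1 × 1 × C support kernel with the query features. The weights, shapes, and the mean-pooling used to form the kernel are stand-in assumptions; a real implementation would learn the FC and convolution layers in a deep-learning framework:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Channel attention: GAP -> FC -> ReLU -> FC -> sigmoid -> rescale channels."""
    squeeze = feat.mean(axis=(1, 2))                         # global average pool to C
    excite = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0.0))     # per-channel weight in (0,1)
    return feat * excite[:, None, None]

def depthwise_xcorr(query, kernel):
    """Depth-wise cross-correlation with a 1x1xC kernel: each channel of the
    query map is scaled by its own kernel value, giving a same-shape map."""
    return query * kernel[:, None, None]

rng = np.random.default_rng(0)
C, H, W = 8, 6, 6
q = rng.standard_normal((C, H, W))               # query features
s = rng.standard_normal((C, H, W))               # support features
w1 = rng.standard_normal((C // 2, C))            # FC layer 1 (channel reduction)
w2 = rng.standard_normal((C, C // 2))            # FC layer 2 (channel restore)

q_att = channel_attention(q, w1, w2)
kernel = s.mean(axis=(1, 2))                     # pooled support -> 1x1xC vector
att_map = depthwise_xcorr(q_att, kernel)
print(att_map.shape)  # (8, 6, 6): the attention map fed to the RPN
```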
And step 3: candidate region classification and regression
After candidate boxes are extracted by the RPN, Faster R-CNN combines them with the feature map into RoI feature maps, which undergo final classification and localization to screen out the final targets. The original Faster R-CNN network directly outputs the target class with the traditional softmax function, but in the small-sample case this classification method lacks the generalization capability to detect targets of new classes. Based on metric learning, the invention proposes an improved weighted prototype network to replace the Faster R-CNN classifier head. By adopting metric learning combined with a meta-learning training strategy, a model with generalization capability can be trained that determines new target classes from only a few samples, realizing target detection under small-sample conditions.
The prototype network learns an embedding network to extract embedded features of images, and uses the mean vector of the support set features of each category as the prototype of that class, as shown in formula 1. The query image feature is then classified by its distance to each prototype under a non-parametric metric such as the Euclidean distance.
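Formula 1 is not reproduced in this text. In the standard prototypical-network formulation, which the description appears to follow, the class prototype and the distance-based classification would take the following form (the symbols f_phi for the embedding network, S_i for the support set of class i, and d for the Euclidean distance are assumptions here, not taken from the patent):

```latex
% Presumed form of formula 1: the class prototype is the mean embedded support feature
c_i = \frac{1}{|S_i|} \sum_{x_{ij} \in S_i} f_\phi(x_{ij})
% Classification by softmax over negative distances to each prototype:
p(y = i \mid x_q) =
  \frac{\exp\bigl(-d(f_\phi(x_q),\, c_i)\bigr)}
       {\sum_{k} \exp\bigl(-d(f_\phi(x_q),\, c_k)\bigr)}
```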
However, this approach has a problem: when the support set samples are distributed very differently or contain bad samples, the computed mean vector does not represent the class well. Taking the mean makes every sample feature contribute equally to the representative vector, whereas different sample features should contribute to different degrees.
The invention instead computes the class prototype in a weighted manner. Specifically, a one-dimensional Gaussian kernel function first computes a weighting coefficient for each support set sample feature, as shown in formula 2.
where x_ij denotes the j-th support sample of the i-th class, x_q denotes the query sample, and σ_i denotes the width of the Gaussian function, taken as 0.1.
After obtaining the weighting coefficients for each support set feature, the present invention calculates the prototype of the class in a weighting manner, and the specific calculation is shown in formula 3.
A query sample is expected to be close to the weighted prototype of its own class and far from the weighted prototypes of other classes, from which a loss function is obtained, as shown in formula 4.
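Since formulas 2–4 are likewise not reproduced in this text, the following NumPy sketch shows one plausible reading of them: a Gaussian-kernel weight per support feature, a weighted prototype, and a cross-entropy loss over softmax of negative squared distances. The exact functional forms are assumptions; the demo uses sigma = 1.0 to avoid numerical underflow that sigma = 0.1 would cause with these random features:

```python
import numpy as np

def gaussian_weights(support_feats, query_feat, sigma=0.1):
    """Formula 2 (as read here): weight each support feature by a Gaussian
    kernel of its squared distance to the query feature."""
    d2 = ((support_feats - query_feat) ** 2).sum(axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def weighted_prototype(support_feats, weights):
    """Formula 3 (as read here): normalized weighted mean of support features."""
    w = weights / weights.sum()
    return (w[:, None] * support_feats).sum(axis=0)

def proto_loss(query_feat, prototypes, true_class):
    """Formula 4 (as read here): cross-entropy over softmax of negative
    squared Euclidean distances to each class prototype."""
    d2 = ((prototypes - query_feat) ** 2).sum(axis=1)
    logp = -d2 - np.log(np.exp(-d2).sum())
    return -logp[true_class]

rng = np.random.default_rng(0)
support = rng.standard_normal((5, 4))      # 5 support features of dimension 4
query = support.mean(axis=0)               # a query near the class centre
w = gaussian_weights(support, query, sigma=1.0)
proto = weighted_prototype(support, w)
protos = np.stack([proto, proto + 5.0])    # this class vs. a far-away class
loss = proto_loss(query, protos, 0)
print(float(loss) < 1e-6)  # True: the query sits essentially at its own prototype
```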
FIG. 4 shows detection results of the method on the PASCAL VOC test set; the method achieves good detection results and can also detect some small targets.
While the invention has been described with reference to specific embodiments, any feature disclosed in this specification may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise; all of the disclosed features, or all of the method or process steps, may be combined in any combination, except combinations where mutually exclusive features or/and steps are present.
Claims (4)
1. A small sample target detection method based on Faster R-CNN, characterized by comprising the following steps:
Step 1: input an image to be detected as the query set image and a small number of images containing the target as support set images;
Step 2: extract query image features and support set image features ("support features") through a feature extraction network;
Step 3: send the support features and query features simultaneously into an FPN network to generate multi-scale feature maps;
Step 4: generate an attention feature map from the feature maps through a channel attention mechanism and spatial attention, send it into an RPN network to generate candidate boxes, and produce RoI feature maps through RoI Pooling;
Step 5: send the support features and RoI features into the metric branch and the regression branch respectively for classification and localization, detecting the target.
2. The method of claim 1, wherein in step 3 the FPN simultaneously handles a query image branch and a support set image branch: in the query image branch, feature maps of different scales output by FPN fusion are input into the RPN network to generate candidate regions; in the support set image branch, the support set image features are input into the FPN network to obtain a feature pyramid for each support image.
3. The method of claim 1, wherein in step 4 the channel attention mechanism and the spatial attention are applied in series, i.e., after channel attention, the support set features and the query set features are depth-wise cross-correlated to generate the attention feature map.
4. The method of claim 1, wherein step 5 uses an improved weighted prototype network as the metric branch.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110270154.2A CN113052185A (en) | 2021-03-12 | 2021-03-12 | Small sample target detection method based on fast R-CNN |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113052185A true CN113052185A (en) | 2021-06-29 |
Family
ID=76513149
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110270154.2A Pending CN113052185A (en) | 2021-03-12 | 2021-03-12 | Small sample target detection method based on fast R-CNN |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113052185A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110555475A (en) * | 2019-08-29 | 2019-12-10 | 华南理工大学 | few-sample target detection method based on semantic information fusion |
US20200334490A1 (en) * | 2019-04-16 | 2020-10-22 | Fujitsu Limited | Image processing apparatus, training method and training apparatus for the same |
CN112434721A (en) * | 2020-10-23 | 2021-03-02 | 特斯联科技集团有限公司 | Image classification method, system, storage medium and terminal based on small sample learning |
CN112465752A (en) * | 2020-11-16 | 2021-03-09 | 电子科技大学 | Improved Faster R-CNN-based small target detection method |
Non-Patent Citations (5)
Title |
---|
QI FAN et al.: "Few-Shot Object Detection With Attention-RPN and Multi-Relation Detector", arXiv:1908.01998 *
S. CHEN et al.: "R2FA-Det: Delving into high-quality rotatable boxes for ship detection in SAR image", Remote Sensing *
XUAN NIE et al.: "Attention Mask R-CNN for Ship Detection and Segmentation From Remote Sensing Images", IEEE Access *
ZHANG Tingting et al.: "A survey of deep-learning-based image object detection algorithms", Telecommunications Science *
LI Hongyan et al.: "Object detection in remote sensing images via convolutional neural networks improved by an attention mechanism", Journal of Image and Graphics *
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113191359A (en) * | 2021-06-30 | 2021-07-30 | 之江实验室 | Small sample target detection method and system based on support and query samples |
CN113705570A (en) * | 2021-08-31 | 2021-11-26 | 长沙理工大学 | Few-sample target detection method based on deep learning |
CN113705570B (en) * | 2021-08-31 | 2023-12-08 | 长沙理工大学 | Deep learning-based few-sample target detection method |
CN113723558A (en) * | 2021-09-08 | 2021-11-30 | 北京航空航天大学 | Remote sensing image small sample ship detection method based on attention mechanism |
CN114612702A (en) * | 2022-01-24 | 2022-06-10 | 珠高智能科技(深圳)有限公司 | Image data annotation system and method based on deep learning |
CN114494728A (en) * | 2022-02-10 | 2022-05-13 | 北京工业大学 | Small target detection method based on deep learning |
CN114494728B (en) * | 2022-02-10 | 2024-06-07 | 北京工业大学 | Small target detection method based on deep learning |
CN114663707A (en) * | 2022-03-28 | 2022-06-24 | 中国科学院光电技术研究所 | Improved few-sample target detection method based on fast RCNN |
CN114818963A (en) * | 2022-05-10 | 2022-07-29 | 电子科技大学 | Small sample detection algorithm based on cross-image feature fusion |
CN114818963B (en) * | 2022-05-10 | 2023-05-09 | 电子科技大学 | Small sample detection method based on cross-image feature fusion |
CN115100432A (en) * | 2022-08-23 | 2022-09-23 | 浙江大华技术股份有限公司 | Small sample target detection method and device and computer readable storage medium |
CN115100432B (en) * | 2022-08-23 | 2022-11-18 | 浙江大华技术股份有限公司 | Small sample target detection method and device and computer readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113052185A (en) | Small sample target detection method based on Faster R-CNN | |
CN111783576B (en) | Pedestrian re-identification method based on improved YOLOv3 network and feature fusion | |
CN113065558A (en) | Lightweight small target detection method combined with attention mechanism | |
CN111767882A (en) | Multi-mode pedestrian detection method based on improved YOLO model | |
CN110782420A (en) | Small target feature representation enhancement method based on deep learning | |
CN111461083A (en) | Rapid vehicle detection method based on deep learning | |
CN105160310A (en) | 3D (three-dimensional) convolutional neural network based human body behavior recognition method | |
CN111709311A (en) | Pedestrian re-identification method based on multi-scale convolution feature fusion | |
CN108520203B (en) | Multi-target feature extraction method based on fusion of self-adaptive multi-peripheral frame and cross pooling feature | |
CN112949572A (en) | Slim-YOLOv 3-based mask wearing condition detection method | |
CN111709313B (en) | Pedestrian re-identification method based on local and channel combination characteristics | |
CN111738344A (en) | Rapid target detection method based on multi-scale fusion | |
CN114463677B (en) | Safety helmet wearing detection method based on global attention | |
CN109635726B (en) | Landslide identification method based on combination of symmetric deep network and multi-scale pooling | |
CN111339975A (en) | Target detection, identification and tracking method based on central scale prediction and twin neural network | |
CN108734200B (en) | Human target visual detection method and device based on BING (building information network) features | |
CN116342894B (en) | GIS infrared feature recognition system and method based on improved YOLOv5 | |
CN114663707A (en) | Improved few-sample target detection method based on fast RCNN | |
CN116091946A (en) | Yolov 5-based unmanned aerial vehicle aerial image target detection method | |
CN114429577B (en) | Flag detection method, system and equipment based on high confidence labeling strategy | |
CN111553337A (en) | Hyperspectral multi-target detection method based on improved anchor frame | |
CN116912670A (en) | Deep sea fish identification method based on improved YOLO model | |
CN107679528A (en) | A kind of pedestrian detection method based on AdaBoost SVM Ensemble Learning Algorithms | |
CN114565918A (en) | Face silence living body detection method and system based on multi-feature extraction module | |
CN111046861B (en) | Method for identifying infrared image, method for constructing identification model and application |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20210629 |