CN112183558A - Target detection and feature extraction integrated network based on YOLOv3 - Google Patents
- Publication number
- CN112183558A (application number CN202011066312.4A)
- Authority
- CN
- China
- Prior art keywords
- target
- feature
- feature extraction
- information
- target detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a YOLOv3-based integrated network for target detection and feature extraction. A target detection module runs the YOLOv3 algorithm and outputs detection results from 3 branches; the data of the 3 branches are integrated by a non-maximum suppression algorithm to obtain the output result of target detection. The target detection result and the original image information in the network are input into a decoding module, which extracts the image of each target area according to the input information and sends it to a feature extraction module. The feature extraction module runs a convolutional neural network to extract feature information from the image and projects the feature information of each target onto a hypersphere. The target information output by the target detection module corresponds one to one with the feature information output by the feature extraction module, and finally the output of target detection and the output of feature extraction are combined to obtain the final output of the network. The integrated network provided by the invention supplies target detection information and, at the same time, feature information of the target for a tracking network, and can effectively improve the performance of the tracking algorithm.
Description
Technical Field
The invention relates to the technical field of computer vision, and in particular to a target detection and feature extraction integrated network based on YOLOv3.
Background
Vision-based target detection and tracking is an important research topic in computer vision, with research and practical value in video monitoring, virtual reality, human-computer interaction, autonomous navigation and other fields. The target detection task provides the category, position and size of each target, and when targets are tracked across a sequence of consecutive frames, the performance of target detection directly affects the performance of the subsequent tracking task.
Efficient target tracking requires more accurate information about a target, including its category, position and size as well as representation features such as color, texture and edges. Rich target features are the key to robust target tracking.
Current target detection algorithms output the category, position and size of a target but not its representation features, and this limited output constrains the performance of the tracking algorithm.
Disclosure of Invention
In order to solve the limitations and defects of the prior art, the invention provides a target detection and feature extraction integrated network based on YOLOv3, which comprises a target detection module, a decoding module and a feature extraction module, wherein the target detection module runs a YOLOv3 algorithm, outputs 3 branch detection results, and integrates 3 branch data by using a non-maximum suppression algorithm to obtain an output result of target detection;
the decoding module receives the output result of the target detection and the original image information, extracts the image of the target area through decoding, and sends the image of the target area to the feature extraction module;
the feature extraction module runs a convolutional neural network to extract feature information of the target area image and projects the image features onto a hypersphere, the coordinates on the hypersphere are the feature information of the target, and the feature information output by the feature extraction module corresponds one to one with the target information output by the target detection module;
and the feature information output by the feature extraction module is combined with the target information output by the target detection module to obtain the final output of the target detection and feature extraction integrated network, wherein the final output comprises the category, the position, the size and the image feature information of each target.
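The integration of the 3 branch results can be illustrated with a minimal sketch. It assumes each detection is a tuple `(x1, y1, x2, y2, score, class_id)` and uses greedy per-class non-maximum suppression with an IoU threshold of 0.5; the tuple layout, the threshold and the function names are illustrative assumptions, not details given in the source.

```python
from typing import List, Tuple

# Assumed detection layout: (x1, y1, x2, y2, score, class_id).
Detection = Tuple[float, float, float, float, float, int]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_branches(
    d1: List[Detection],
    d2: List[Detection],
    d3: List[Detection],
    iou_threshold: float = 0.5,  # assumed value
) -> List[Detection]:
    """Pool the three branch outputs and apply greedy per-class NMS."""
    # Visit candidates from highest to lowest confidence score.
    candidates = sorted(d1 + d2 + d3, key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for det in candidates:
        # Keep a box unless it overlaps an already kept box of the same class.
        if all(det[5] != k[5] or iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept
```

In this sketch the number of rows returned by `merge_branches` plays the role of M, the number of targets in the image.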
Optionally, the method further includes:
the target detection module runs the YOLOv3 algorithm and outputs 3 branch results which are D1, D2 and D3 respectively, wherein
Integrating the 3 branch data through a non-maximum suppression algorithm to obtain an output result D of target detection, wherein the maximum value M is the number of targets in the image;
the decoding module extracts an image of a target area and sends the image to the feature extraction module, the feature extraction module operates a convolutional neural network to extract feature information of the image to obtain a feature map, and a calculation formula is as follows:
converting the extracted feature map, through a fully connected layer, into a feature vector f of dimension M × 10, wherein the calculation formula is as follows:
and operating on the feature vector f to project the features of each target onto a hypersphere, obtaining the final output F of the feature extraction network, wherein the calculation formula is as follows:
combining the results of the target detection module and the feature extraction module to obtain a final output, wherein the final output contains the category, position, size and image feature information of each target, and the calculation formula is as follows:
Optionally, the hypersphere is a 10-dimensional hypersphere.
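The projection and combination steps can be sketched as follows. The sketch assumes that projecting onto the hypersphere means L2-normalizing each 10-dimensional feature vector to unit length, and that each detection row has the layout `[category, x, y, width, height]`; both are assumptions made for illustration, since the source formulas are not reproduced in this text.

```python
import math
from typing import List, Sequence


def project_to_hypersphere(f: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale a feature vector to unit length (assumed meaning of the projection)."""
    norm = math.sqrt(sum(v * v for v in f))
    # eps avoids division by zero; an all-zero vector stays all-zero.
    return [v / max(norm, eps) for v in f]


def combine(
    detections: Sequence[Sequence[float]],
    features: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate each detection row with its projected feature row."""
    if len(detections) != len(features):
        raise ValueError("detections and features must correspond one to one")
    return [
        list(d) + project_to_hypersphere(f)
        for d, f in zip(detections, features)
    ]
```

With M detections of 5 values each and M feature vectors of 10 values each, `combine` returns M rows of 15 values, one row per target.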
The invention has the following beneficial effects:
The invention provides a YOLOv3-based target detection and feature extraction integrated network, wherein a target detection module runs the YOLOv3 algorithm to output 3 branch detection results, and the 3 branch data are integrated by a non-maximum suppression algorithm to obtain the output result of target detection. The target detection result and the original image information are input into a decoding module, the image of the target area extracted by the decoding module is sent to a feature extraction module, and the feature extraction module runs a convolutional neural network to extract feature information of the image and projects the feature information of each target onto a hypersphere, the coordinates on the hypersphere being the feature information of the target. The target information output by the target detection module corresponds one to one with the feature information output by the feature extraction module, and the output of target detection and the output of feature extraction are combined to obtain the final output of the network.
The integrated network provided by the invention supplies target detection information and, at the same time, feature information of the target for a tracking network, and can effectively improve the performance of the tracking algorithm. Its output comprises not only the target detection result but also the image feature information of the target, and this richer output lays a foundation for higher-level tasks. Feature extraction depends on the detection result, so features are extracted only for valid targets, which reduces the amount of calculation and improves the efficiency of the algorithm. The network extracts target features with a convolutional neural network, which can capture more image detail than traditional methods. It maps the feature information onto a 10-dimensional hypersphere, which makes it convenient to associate targets. The network has a modular design, which facilitates deployment and implementation, and it can perform transfer learning using the original network data of YOLOv3, thereby reducing the difficulty and cost of training.
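To illustrate why unit-length features are convenient for associating targets, the sketch below matches targets between two frames by cosine similarity, which for unit vectors is just the dot product. The source does not specify an association method; the greedy matching, the `min_sim` threshold and the function names are assumptions chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(a * b for a, b in zip(u, v))


def associate(
    prev: Sequence[Sequence[float]],
    curr: Sequence[Sequence[float]],
    min_sim: float = 0.5,  # assumed similarity threshold
) -> List[Tuple[int, int]]:
    """Greedily pair previous-frame and current-frame targets by similarity."""
    pairs = sorted(
        ((dot(p, c), i, j) for i, p in enumerate(prev) for j, c in enumerate(curr)),
        reverse=True,
    )
    used_prev, used_curr = set(), set()
    matches: List[Tuple[int, int]] = []
    for sim, i, j in pairs:
        if sim < min_sim:
            break  # remaining pairs are even less similar
        if i in used_prev or j in used_curr:
            continue  # each target is matched at most once
        used_prev.add(i)
        used_curr.add(j)
        matches.append((i, j))
    return matches
```

Each returned pair `(i, j)` links target `i` of the previous frame to target `j` of the current frame; unmatched targets are simply absent from the list.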
Drawings
Fig. 1 is a schematic structural diagram of an integrated network for target detection and feature extraction based on YOLOv3 according to an embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the target detection and feature extraction integrated network based on YOLOv3 provided in the present invention is described in detail below with reference to the accompanying drawings.
Example one
Fig. 1 is a schematic structural diagram of an integrated network for target detection and feature extraction based on YOLOv3 according to an embodiment of the present invention. As shown in Fig. 1, the target detection and feature extraction integrated network based on YOLOv3 provided in this embodiment mainly includes three functional modules: a target detection module, a decoding module and a feature extraction module.
In the working process, the target detection module runs the YOLOv3 algorithm to output 3 branch detection results, and the 3 branch data are then integrated by a Non-Maximum Suppression (NMS) algorithm to obtain the output result of target detection. The target detection result and the original image information are input into the decoding module, which extracts the image of each target area and sends it to the feature extraction module. The feature extraction module runs a convolutional neural network to extract feature information of the image and projects the feature information of each target onto a 10-dimensional hypersphere, the coordinates on the hypersphere being the feature information of the target. Owing to the decoding module, the target information output by the target detection module corresponds one to one with the feature information output by the feature extraction module. Finally, the target detection output and the feature extraction output are combined to obtain the final output of the network, which contains the category, position, size and image feature information of each target.
The integrated network provided by this embodiment supplies target detection information and, at the same time, feature information of the target for a tracking network, and can effectively improve the performance of the tracking algorithm. The output of the integrated network comprises not only the target detection result but also the image feature information of the target, and this richer output lays a foundation for higher-level tasks. Feature extraction depends on the detection result, so features are extracted only for valid targets, which reduces the amount of calculation and improves the efficiency of the algorithm. The integrated network extracts target features with a convolutional neural network, which can capture more image detail than traditional methods. The integrated network maps the feature information onto a 10-dimensional hypersphere, which makes it convenient to associate targets. The integrated network has a modular design, which facilitates deployment and implementation, and transfer learning can be performed using the original network data of YOLOv3, reducing the difficulty and cost of training.
This embodiment provides a YOLOv3-based target detection and feature extraction integrated network, wherein a target detection module runs the YOLOv3 algorithm to output 3 branch detection results, and the 3 branch data are integrated by a non-maximum suppression algorithm to obtain the output result of target detection. The target detection result and the original image information are input into a decoding module, the image of the target area extracted by the decoding module is sent to a feature extraction module, and the feature extraction module runs a convolutional neural network to extract feature information of the image and projects the feature information of each target onto a hypersphere, the coordinates on the hypersphere being the feature information of the target. The target information output by the target detection module corresponds one to one with the feature information output by the feature extraction module, and the output of target detection and the output of feature extraction are combined to obtain the final output of the network. Compared with the prior art, the integrated network provided by this embodiment integrates target detection and feature extraction into a novel network structure. The network function is pioneering: the network output comprises not only the target detection result but also the image feature information of the target, which enriches the output information and lays a foundation for higher-level tasks. The integrated network provided by this embodiment uses a convolutional neural network to extract the feature information of the target, which can capture more image detail than traditional methods.
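The role of the decoding module in keeping detections and features in one-to-one correspondence can be sketched as below. The sketch assumes the image is a row-major nested list of pixels and each box is `(x1, y1, x2, y2)` in pixel coordinates; the representation and the name `decode_regions` are hypothetical. Crops are returned in the same order as the boxes, so crop `i` always belongs to detection `i`.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def clamp(value: float, low: int, high: int) -> int:
    """Truncate to an integer pixel index and restrict it to [low, high]."""
    return min(max(int(value), low), high)


def decode_regions(image: Sequence[Sequence[int]], boxes: Sequence[Box]) -> List[List[List[int]]]:
    """Cut one target-area image per box, preserving the order of the boxes."""
    height = len(image)
    width = len(image[0]) if height else 0
    crops = []
    for x1, y1, x2, y2 in boxes:
        # Boxes that extend past the image border are clipped to the image.
        left, right = clamp(x1, 0, width), clamp(x2, 0, width)
        top, bottom = clamp(y1, 0, height), clamp(y2, 0, height)
        crops.append([list(row[left:right]) for row in image[top:bottom]])
    return crops
```

Because only the regions named by the detector are cropped, the feature extraction network in such a design runs once per detected target rather than over the whole image.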
It will be understood that the above embodiments are merely exemplary embodiments taken to illustrate the principles of the present invention, which is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.
Claims (3)
1. A target detection and feature extraction integrated network based on YOLOv3 is characterized by comprising a target detection module, a decoding module and a feature extraction module, wherein the target detection module runs a YOLOv3 algorithm, outputs 3 branch detection results, and integrates 3 branch data by using a non-maximum suppression algorithm to obtain an output result of target detection;
the decoding module receives the output result of the target detection and the original image information, extracts the image of the target area through decoding, and sends the image of the target area to the feature extraction module;
the feature extraction module runs a convolutional neural network to extract feature information of the target area image and projects the image features onto a hypersphere, the coordinates on the hypersphere are the feature information of the target, and the feature information output by the feature extraction module corresponds one to one with the target information output by the target detection module;
and the feature information output by the feature extraction module is combined with the target information output by the target detection module to obtain the final output of the target detection and feature extraction integrated network, wherein the final output comprises the category, the position, the size and the image feature information of each target.
2. The YOLOv3-based integrated network for target detection and feature extraction as claimed in claim 1, further comprising:
the target detection module runs the YOLOv3 algorithm and outputs 3 branch results which are D1, D2 and D3 respectively, wherein
Integrating the 3 branch data through a non-maximum suppression algorithm to obtain an output result D of target detection, wherein the maximum value M is the number of targets in the image;
the decoding module extracts an image of a target area and sends the image to the feature extraction module, the feature extraction module operates a convolutional neural network to extract feature information of the image to obtain a feature map, and a calculation formula is as follows:
converting the extracted feature map, through a fully connected layer, into a feature vector f of dimension M × 10, wherein the calculation formula is as follows:
and operating on the feature vector f to project the features of each target onto a hypersphere, obtaining the final output F of the feature extraction network, wherein the calculation formula is as follows:
combining the results of the target detection module and the feature extraction module to obtain a final output, wherein the final output contains the category, position, size and image feature information of each target, and the calculation formula is as follows:
3. The YOLOv3-based integrated network for target detection and feature extraction as claimed in claim 1, wherein the hypersphere is a 10-dimensional hypersphere.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011066312.4A CN112183558A (en) | 2020-09-30 | 2020-09-30 | Target detection and feature extraction integrated network based on YOLOv3 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011066312.4A CN112183558A (en) | 2020-09-30 | 2020-09-30 | Target detection and feature extraction integrated network based on YOLOv3 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112183558A (en) | 2021-01-05 |
Family
ID=73948857
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011066312.4A Pending CN112183558A (en) | 2020-09-30 | 2020-09-30 | Target detection and feature extraction integrated network based on YOLOv3 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112183558A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109711322A (en) * | 2018-12-24 | 2019-05-03 | 天津天地伟业信息***集成有限公司 | A kind of people's vehicle separation method based on RFCN |
CN109886998A (en) * | 2019-01-23 | 2019-06-14 | 平安科技(深圳)有限公司 | Multi-object tracking method, device, computer installation and computer storage medium |
CN110232350A (en) * | 2019-06-10 | 2019-09-13 | 哈尔滨工程大学 | A kind of real-time water surface multiple mobile object detecting and tracking method based on on-line study |
CN110909794A (en) * | 2019-11-22 | 2020-03-24 | 乐鑫信息科技(上海)股份有限公司 | Target detection system suitable for embedded equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210105 |