CN112183558A - Target detection and feature extraction integrated network based on YOLOv3 - Google Patents

Target detection and feature extraction integrated network based on YOLOv3

Publication number
CN112183558A
Authority
CN
China
Prior art keywords
target
feature
feature extraction
information
target detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011066312.4A
Other languages
Chinese (zh)
Inventor
李利华
韩勇强
刘泳庆
张路成
魏晨晨
余清鲜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN202011066312.4A
Publication of CN112183558A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a YOLOv3-based integrated network for target detection and feature extraction. A target detection module runs the YOLOv3 algorithm and outputs detection results on 3 branches, which are merged by a non-maximum suppression algorithm to obtain the target detection output. The target detection result and the original image information are input into a decoding module, which extracts the image of each target area and sends it to a feature extraction module. The feature extraction module runs a convolutional neural network to extract feature information from the image and projects the feature information of each target onto a hypersphere. The target information output by the target detection module corresponds one to one with the feature information output by the feature extraction module, and the two outputs are combined to obtain the final output of the network. The integrated network thus provides target detection information and, at the same time, feature information of each target for a tracking network, which can effectively improve the performance of the tracking algorithm.

Description

Target detection and feature extraction integrated network based on YOLOv3
Technical Field
The invention relates to the technical field of computer vision, and in particular to an integrated target detection and feature extraction network based on YOLOv3.
Background
Vision-based target detection and tracking is an important research topic in computer vision, with research and practical value in video surveillance, virtual reality, human-computer interaction, autonomous navigation and other fields. The target detection task provides the category, position and size of a target, and when targets are tracked through a sequence of consecutive frames, detection performance directly affects the performance of the subsequent tracking task.
Efficient target tracking calls for more precise information about a target, including its category, position and size as well as appearance features such as color, texture and edges. Rich target features are the key to robust target tracking.
Current target detection algorithms output the category, position and size of a target but not its appearance features, and this limited output constrains the performance of the tracking algorithm.
Disclosure of Invention
To address the limitations and defects of the prior art, the invention provides an integrated target detection and feature extraction network based on YOLOv3, which comprises a target detection module, a decoding module and a feature extraction module, wherein the target detection module runs the YOLOv3 algorithm, outputs detection results on 3 branches, and merges the 3 branches of data using a non-maximum suppression algorithm to obtain the target detection output;
the decoding module receives the target detection output and the original image information, extracts the image of each target area through decoding, and sends the target area image to the feature extraction module;
the feature extraction module runs a convolutional neural network to extract feature information from the target area image and projects the image features onto a hypersphere, the coordinates on the hypersphere being the feature information of the target, and the feature information output by the feature extraction module corresponds one to one with the target information output by the target detection module;
and the feature information output by the feature extraction module is combined with the target information output by the target detection module to obtain the final output of the integrated network, which comprises the category, position, size and image feature information of each target.
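The merging of the three YOLOv3 branch outputs by non-maximum suppression can be illustrated with a minimal sketch. The detection layout `(x1, y1, x2, y2, score, class_id)`, the function names and the IoU threshold of 0.5 are assumptions made for illustration; the patent gives its own definitions only as formula images.

```python
from typing import List, Tuple

# Hypothetical detection layout: (x1, y1, x2, y2, score, class_id).
Detection = Tuple[float, float, float, float, float, int]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_branches(d1: List[Detection], d2: List[Detection],
                   d3: List[Detection],
                   iou_threshold: float = 0.5) -> List[Detection]:
    """Pool the three branch outputs and apply greedy per-class NMS."""
    # Visit candidates from highest to lowest confidence.
    candidates = sorted([*d1, *d2, *d3], key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for det in candidates:
        # Keep a box unless it overlaps a kept box of the same class too much.
        if all(det[5] != k[5] or iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept
```

The length of the returned list plays the role of M, the number of targets in the image.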
Optionally, the method further includes:
the target detection module runs the YOLOv3 algorithm and outputs 3 branch results which are D1, D2 and D3 respectively, wherein
Figure BDA0002713855870000025
Integrating the 3 branch data through a non-maximum suppression algorithm to obtain an output result D of target detection, wherein
Figure BDA0002713855870000026
where M, the maximum index, is the number of targets in the image;
the decoding module extracts an image of a target area and sends the image to the feature extraction module, the feature extraction module operates a convolutional neural network to extract feature information of the image to obtain a feature map, and a calculation formula is as follows:
Figure BDA0002713855870000021
a fully connected layer converts the extracted feature map into a feature vector f of dimension (M × 10), calculated as follows:
Figure BDA0002713855870000022
Figure BDA0002713855870000023
the feature vector f is then processed so that the features of each target are projected onto a hypersphere, giving the final output F of the feature extraction network, calculated as follows:
Figure BDA0002713855870000024
Figure BDA0002713855870000031
wherein
Figure BDA0002713855870000034
combining the results of the target detection module and the feature extraction module to obtain a final output, wherein the final output contains the category, position, size and image feature information of each target, and the calculation formula is as follows:
Figure BDA0002713855870000032
wherein
Figure BDA0002713855870000033
Optionally, the hypersphere is a 10-dimensional hypersphere.
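The fully connected layer and the hypersphere projection can be sketched as follows. Because the patent's formulas survive only as image placeholders, this sketch assumes the projection is L2 normalization onto the unit hypersphere, which is a common choice but not confirmed by the source. The function names, the weight layout and the `eps` guard are hypothetical.

```python
import math
from typing import List


def fully_connected(x: List[float], weights: List[List[float]],
                    bias: List[float]) -> List[float]:
    """Map a flattened feature map x to one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def project_to_hypersphere(f: List[float], eps: float = 1e-12) -> List[float]:
    """Assumed projection: divide by the L2 norm so the result has unit length."""
    norm = math.sqrt(sum(v * v for v in f))
    return [v / max(norm, eps) for v in f]
```

With 10 weight rows, each target yields a 10-dimensional unit vector, so M targets give an (M × 10) output.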
The invention has the following beneficial effects:
The invention provides a YOLOv3-based integrated network for target detection and feature extraction. A target detection module runs the YOLOv3 algorithm and outputs detection results on 3 branches, which are merged by a non-maximum suppression algorithm to obtain the target detection output. The target detection result and the original image information are input into a decoding module, and the target area image extracted by the decoding module is sent to a feature extraction module. The feature extraction module runs a convolutional neural network to extract feature information from the image and projects the feature information of each target onto a hypersphere, the coordinates on the hypersphere being the feature information of the target. The target information output by the target detection module corresponds one to one with the feature information output by the feature extraction module, and the target detection output and the feature extraction output are combined to obtain the final output of the network.
The integrated network provided by the invention supplies target detection information and, at the same time, feature information of each target for the tracking network, which can effectively improve the performance of the tracking algorithm. Its output includes not only the target detection result but also the image feature information of each target, and this richer output lays a foundation for higher-level tasks. Feature extraction depends on the detection result and is performed only on valid targets, which reduces the amount of computation and improves efficiency. Target features are extracted with a convolutional neural network, which can capture more image detail than traditional methods. The feature information is mapped onto a 10-dimensional hypersphere, which makes it convenient to associate targets. The network has a modular design, which eases deployment and implementation, and it can use the original YOLOv3 network data for transfer learning, reducing the difficulty and cost of training.
Drawings
Fig. 1 is a schematic structural diagram of an integrated network for target detection and feature extraction based on YOLOv3 according to an embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the target detection and feature extraction integrated network based on YOLOv3 provided in the present invention is described in detail below with reference to the accompanying drawings.
Example one
Fig. 1 is a schematic structural diagram of an integrated network for target detection and feature extraction based on YOLOv3 according to an embodiment of the present invention. As shown in Fig. 1, the integrated network provided in this embodiment mainly includes three functional modules: a target detection module, a decoding module and a feature extraction module.
In operation, the target detection module runs the YOLOv3 algorithm and outputs detection results on 3 branches, which are then merged by a Non-Maximum Suppression (NMS) algorithm to obtain the target detection output. The target detection result and the original image information are input into the decoding module, which extracts the image of each target area and sends it to the feature extraction module. The feature extraction module runs a convolutional neural network to extract feature information from the image and projects the feature information of each target onto a 10-dimensional hypersphere, the coordinates on the hypersphere being the feature information of the target. Because of the decoding module, the target information output by the target detection module corresponds one to one with the feature information output by the feature extraction module. Finally, the target detection output and the feature extraction output are combined to obtain the final output of the network, which contains the category, position, size and image feature information of each target.
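The role of the decoding module and the final combination step can be sketched in a few lines. This is an illustration under assumptions: the image is a plain nested list of pixel rows, detections use the hypothetical layout `(x1, y1, x2, y2, score, class_id)`, and the output record fields are invented for the example. Keeping the crops in detection order is what preserves the one-to-one correspondence between detections and features.

```python
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical detection layout: (x1, y1, x2, y2, score, class_id).
Detection = Tuple[float, float, float, float, float, int]


def decode_regions(image: Sequence[Sequence[Any]],
                   detections: List[Detection]) -> List[List[List[Any]]]:
    """Crop one target region per detection, preserving detection order."""
    height, width = len(image), len(image[0])
    crops = []
    for x1, y1, x2, y2, _score, _class_id in detections:
        # Clamp the box to the image bounds before slicing.
        left, top = max(0, int(x1)), max(0, int(y1))
        right, bottom = min(width, int(x2)), min(height, int(y2))
        crops.append([list(row[left:right]) for row in image[top:bottom]])
    return crops


def combine(detections: List[Detection],
            features: List[List[float]]) -> List[Dict[str, Any]]:
    """Pair the i-th detection with the i-th feature vector."""
    if len(detections) != len(features):
        raise ValueError("detections and features must correspond one to one")
    return [
        {"class_id": d[5], "box": d[:4], "score": d[4], "feature": f}
        for d, f in zip(detections, features)
    ]
```

In a full pipeline, each crop would pass through the feature extraction network before `combine` is called.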
The integrated network provided by this embodiment supplies target detection information and, at the same time, feature information of each target for the tracking network, which can effectively improve the performance of the tracking algorithm. Its output includes not only the target detection result but also the image feature information of each target, and this richer output lays a foundation for higher-level tasks. Feature extraction depends on the detection result and is performed only on valid targets, which reduces the amount of computation and improves efficiency. Target features are extracted with a convolutional neural network, which can capture more image detail than traditional methods. The feature information is mapped onto a 10-dimensional hypersphere, which makes it convenient to associate targets. The network has a modular design, which eases deployment and implementation, and it can use the original YOLOv3 network data for transfer learning, reducing training difficulty and cost.
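One way to see why hypersphere features are convenient for association: for unit-length vectors, the dot product equals the cosine similarity, so comparing targets across frames reduces to a dot product. The greedy matcher below is a hypothetical illustration of such an association step, assuming unit-length inputs; the patent does not specify the tracking network, and the similarity threshold is an assumed parameter.

```python
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def associate(prev: List[List[float]], curr: List[List[float]],
              min_similarity: float = 0.5) -> List[Tuple[int, int]]:
    """Greedily match previous-frame targets to current-frame targets."""
    pairs = sorted(
        ((cosine_similarity(p, c), i, j)
         for i, p in enumerate(prev) for j, c in enumerate(curr)),
        reverse=True,
    )
    used_prev, used_curr = set(), set()
    matches: List[Tuple[int, int]] = []
    for similarity, i, j in pairs:
        if similarity < min_similarity:
            break  # remaining pairs are even less similar
        if i in used_prev or j in used_curr:
            continue  # each target is matched at most once
        used_prev.add(i)
        used_curr.add(j)
        matches.append((i, j))
    return matches
```

A tracker could replace the greedy loop with an optimal assignment method; the greedy form is used here only to keep the sketch short.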
This embodiment provides a YOLOv3-based integrated network for target detection and feature extraction. A target detection module runs the YOLOv3 algorithm and outputs detection results on 3 branches, which are merged by a non-maximum suppression algorithm to obtain the target detection output. The target detection result and the original image information are input into a decoding module, and the target area image extracted by the decoding module is sent to a feature extraction module. The feature extraction module runs a convolutional neural network to extract feature information from the image and projects the feature information of each target onto a hypersphere, the coordinates on the hypersphere being the feature information of the target. The target information output by the target detection module corresponds one to one with the feature information output by the feature extraction module, and the target detection output and the feature extraction output are combined to obtain the final output of the network. Compared with the prior art, the integrated network provided by this embodiment combines target detection and feature extraction in a new network structure. Its output includes not only the target detection result but also the image feature information of each target, which enriches the output information and lays a foundation for higher-level tasks. The network uses a convolutional neural network to extract the feature information of each target, which can capture more image detail than traditional methods.
It will be understood that the above embodiments are merely exemplary embodiments taken to illustrate the principles of the present invention, which is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims (3)

1. A target detection and feature extraction integrated network based on YOLOv3, characterized by comprising a target detection module, a decoding module and a feature extraction module, wherein the target detection module runs the YOLOv3 algorithm, outputs detection results on 3 branches, and merges the 3 branches of data using a non-maximum suppression algorithm to obtain the target detection output;
the decoding module receives the target detection output and the original image information, extracts the image of each target area through decoding, and sends the target area image to the feature extraction module;
the feature extraction module runs a convolutional neural network to extract feature information from the target area image and projects the image features onto a hypersphere, the coordinates on the hypersphere being the feature information of the target, and the feature information output by the feature extraction module corresponds one to one with the target information output by the target detection module;
and the feature information output by the feature extraction module is combined with the target information output by the target detection module to obtain the final output of the integrated network, which comprises the category, position, size and image feature information of each target.
2. The YOLOv3-based target detection and feature extraction integrated network as claimed in claim 1, further comprising:
the target detection module runs the YOLOv3 algorithm and outputs 3 branch results which are D1, D2 and D3 respectively, wherein
Figure FDA0002713855860000012
Integrating the 3 branch data through a non-maximum suppression algorithm to obtain an output result D of target detection, wherein
Figure FDA0002713855860000013
where M, the maximum index, is the number of targets in the image;
the decoding module extracts an image of a target area and sends the image to the feature extraction module, the feature extraction module operates a convolutional neural network to extract feature information of the image to obtain a feature map, and a calculation formula is as follows:
Figure FDA0002713855860000011
a fully connected layer converts the extracted feature map into a feature vector f of dimension (M × 10), calculated as follows:
Figure FDA0002713855860000021
Figure FDA0002713855860000022
the feature vector f is then processed so that the features of each target are projected onto a hypersphere, giving the final output F of the feature extraction network, calculated as follows:
Figure FDA0002713855860000023
Figure FDA0002713855860000024
wherein
Figure FDA0002713855860000026
combining the results of the target detection module and the feature extraction module to obtain a final output, wherein the final output contains the category, position, size and image feature information of each target, and the calculation formula is as follows:
Figure FDA0002713855860000025
wherein
Figure FDA0002713855860000027
3. The YOLOv3-based target detection and feature extraction integrated network of claim 1, wherein the hypersphere is a 10-dimensional hypersphere.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011066312.4A CN112183558A (en) 2020-09-30 2020-09-30 Target detection and feature extraction integrated network based on YOLOv3

Publications (1)

Publication Number Publication Date
CN112183558A (en) 2021-01-05

Family ID: 73948857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011066312.4A Pending CN112183558A (en) 2020-09-30 2020-09-30 Target detection and feature extraction integrated network based on YOLOv3

Country Status (1)

Country Link
CN (1) CN112183558A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109711322A (en) * 2018-12-24 2019-05-03 天津天地伟业信息***集成有限公司 A kind of people's vehicle separation method based on RFCN
CN109886998A (en) * 2019-01-23 2019-06-14 平安科技(深圳)有限公司 Multi-object tracking method, device, computer installation and computer storage medium
CN110232350A (en) * 2019-06-10 2019-09-13 哈尔滨工程大学 A kind of real-time water surface multiple mobile object detecting and tracking method based on on-line study
CN110909794A (en) * 2019-11-22 2020-03-24 乐鑫信息科技(上海)股份有限公司 Target detection system suitable for embedded equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210105)