CN114511793B - Unmanned aerial vehicle ground detection method and system based on synchronous detection tracking - Google Patents

Unmanned aerial vehicle ground detection method and system based on synchronous detection tracking

Info

Publication number
CN114511793B
CN114511793B (application CN202011285803.8A)
Authority
CN
China
Prior art keywords
target position
video frame
frame
target
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011285803.8A
Other languages
Chinese (zh)
Other versions
CN114511793A (en)
Inventor
苏龙飞
王之元
凡遵林
管乃洋
张天昊
王浩
沈天龙
黄强娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Defense Technology Innovation Institute PLA Academy of Military Science
Original Assignee
National Defense Technology Innovation Institute PLA Academy of Military Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Defense Technology Innovation Institute PLA Academy of Military Science
Priority to CN202011285803.8A
Publication of CN114511793A
Application granted
Publication of CN114511793B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method and a system for unmanned aerial vehicle ground detection based on synchronous detection and tracking. The method comprises: counting the video frames acquired by an unmanned aerial vehicle; based on the trained model file and weight file of a target detection deep neural network, performing forward inference on the video frame whose count is 1, acquiring the target position area, and initializing a target tracker; for each subsequently acquired video frame, simultaneously acquiring the detected target position area and the tracked target position area; and, if tracking succeeds and the image size of the target position area conforms to the preset image size, comparing whether the detected and tracked target position areas overlap to determine the final output target position area. By adopting a synchronous detection-tracking method, the technical scheme of the invention reduces the interference caused by false detections, reduces the amount of computation in the unmanned aerial vehicle detection process, and improves accuracy.

Description

Unmanned aerial vehicle ground detection method and system based on synchronous detection tracking
Technical Field
The invention relates to the technical field of computer vision, in particular to a ground detection method and system of an unmanned aerial vehicle based on synchronous detection and tracking.
Background
Deep neural networks are developing rapidly and are applied ever more widely. Methods that use deep neural networks for target detection or search in video or images mainly comprise two-step methods represented by Faster R-CNN, R-CNN and the like, and one-step methods represented by YOLO, SSD and the like. Although Faster R-CNN is an excellent two-step algorithm, it reaches only about 5 FPS even with the powerful computing capability of a K40 GPU, which cannot meet real-time requirements. The one-step detectors YOLO and SSD can reach 15 FPS or more and thereby meet real-time requirements, but the computing power of a Titan X or M40 GPU is necessary to support them. The better-performing and faster target tracking algorithms are represented by correlation-filtering algorithms, which track stably at high speed and can reach 172 FPS even under limited computing capability.
An unmanned aerial vehicle is a reusable, unmanned aircraft controlled by radio remote control or an autonomous program; it has a simple structure, low cost, strong survivability and good maneuverability, and can complete a variety of tasks. However, because its payload capacity is low, an unmanned aerial vehicle cannot carry computing devices with strong computing performance, so deploying a target detection algorithm based on a deep neural network is difficult; small onboard computers such as the Raspberry Pi or ODROID are light but have limited computing capability. Even if Tiny-YOLO or MobileNets-SSD, the fast one-step methods, are deployed on an ODROID onboard computer, the target detection speed does not exceed 3 FPS and cannot meet real-time requirements. The retired Predator unmanned aerial vehicle mainly acquires data through its onboard sensors and returns the data to the ground, where the data are interpreted manually. The improved Global Hawk, with its portable signal sensors and ground-moving-target-detection radar, has preliminary onboard target detection and monitoring capability (distinguishing dynamic from static, detecting moving targets), but the detection technology is not yet mature. The Rainbow (CH-series) unmanned aerial vehicle likewise acquires data through its onboard sensors and returns them to the ground for manual interpretation and further back-end processing. An artificial intelligence algorithm was tested on the ScanEagle: after only a few days of testing, the computer's recognition accuracy for objects such as personnel, vehicles and buildings reached 60%, and after one week it improved to 80%; however, this processing is still completed on the ground. From this point of view, current technology still cannot, in real time, track and detect targets in the data acquired by the unmanned aerial vehicle's onboard camera and carry out the processing for the next instruction.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an unmanned aerial vehicle ground detection method and system based on synchronous detection tracking. By combining a deep-neural-network target detection algorithm with a tracking algorithm, video frames from the data acquired by the onboard camera during flight are counted in real time, and specific targets in the video frames are detected and tracked synchronously, so that a tactical unmanned aerial vehicle can monitor and search ground targets, directionally track moving targets, and synchronously detect and track air targets.
The invention aims at adopting the following technical scheme:
the invention provides an unmanned aerial vehicle ground detection method based on synchronous detection and tracking, which is characterized by comprising the following steps:
training a target detection depth neural network model to obtain a model file and a weight file;
step (2) collecting real-time video data frame by frame;
step (3) initializing a frame number counter h=0;
step (4) let h=h+1, and simultaneously executing step (5) and step (8);
step (5) based on the trained model file and weight file of the target detection deep neural network, performing forward inference on the real-time video data collected frame by frame, and obtaining the target position area detected in the h-th video frame;
step (6) judging whether h is 1; if so, executing step (7); otherwise, passing the target position area detected in the h-th video frame to step (12);
step (7) initializing a target tracker according to the target position area detected in the h-th video frame;
step (8) judging whether h is 1; if so, executing step (4); if not, executing step (9);
step (9) obtaining the candidate region corresponding to the target position area detected in the (h-1)-th video frame, finding the region in the h-th video frame that matches that candidate region, taking it as the candidate region corresponding to the detected target position area in the h-th video frame, and obtaining the target position area tracked in the h-th video frame from the candidate region of the (h-1)-th video frame;
step (10) judging whether target tracking in the h-th video frame succeeded; if so, executing step (11); otherwise, executing step (3);
step (11) judging whether the pixel coordinates of the target position area image tracked in the h-th video frame exceed the coordinate range of the preset video frame image; if so, executing step (3); if not, executing step (12);
step (12) judging whether the detected and tracked target position areas in the h-th video frame overlap; if so, outputting the detected target position area and executing step (4); if not, outputting the tracked target position area and executing step (4).
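The overlap test in step (12) is not specified further in the text; a common way to decide it is an intersection-over-union (IoU) comparison. The sketch below is an illustrative interpretation, not the patent's mandated formula; the box layout (x1, y1, x2, y2) and the 0.5 threshold are assumptions.

```python
def iou(box_a, box_b):
    # Boxes as (x1, y1, x2, y2) pixel coordinates, x2 > x1 and y2 > y1.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((box_a[2] - box_a[0]) * (box_a[3] - box_a[1]) +
             (box_b[2] - box_b[0]) * (box_b[3] - box_b[1]) - inter)
    return inter / union if union else 0.0

def choose_output(detected, tracked, iou_threshold=0.5):
    # Step (12): if the detected and tracked regions overlap, prefer the
    # detector's box; otherwise fall back to the tracker's box.
    if detected is not None and iou(detected, tracked) > iou_threshold:
        return detected
    return tracked
```

In this reading, the detector's box wins when the two sources agree, which lets the single-scale tracker inherit the detector's scale updates frame by frame.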
Preferably, the step (1) includes:
marking each type of target in the historical video data collected frame by frame;
constructing training data by utilizing historical video data marked frame by frame, and training a target detection depth neural network model by utilizing the training data;
and obtaining a model file and a weight file of the trained target detection depth neural network.
Preferably, the step (5) includes:
using a forward inference framework, sequentially reading the label corresponding to the target, the trained model file, the weight file, and the real-time video data acquired frame by frame, and obtaining the target position output by the forward inference framework.
Preferably, obtaining the candidate region corresponding to the target position area detected in the (h-1)-th video frame comprises:
expanding the target position area detected in the (h-1)-th video frame by a preset multiple.
Further, the value range of the preset multiple is [1.5,3].
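As a sketch of that candidate-region construction, the box detected in frame h-1 can be enlarged about its center by a factor in the stated [1.5, 3] range and clipped to the frame. The function below is illustrative; the (x1, y1, x2, y2) pixel-box convention and the clipping behavior are assumptions, not taken from the patent text.

```python
def expand_region(box, frame_w, frame_h, factor=2.0):
    # box = (x1, y1, x2, y2); factor assumed to lie in [1.5, 3]
    # as stated above for the preset multiple.
    cx, cy = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
    half_w = (box[2] - box[0]) * factor / 2.0
    half_h = (box[3] - box[1]) * factor / 2.0
    # Clip so the candidate region stays inside the video frame.
    return (max(0.0, cx - half_w), max(0.0, cy - half_h),
            min(float(frame_w), cx + half_w), min(float(frame_h), cy + half_h))
```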
Preferably, the step (10) includes:
analyzing the candidate region corresponding to the detected target position area in the h-th video frame with the classifier of the (h-1)-th video frame, and obtaining the score of that candidate region;
if the score of the candidate region corresponding to the target position area detected in the h-th video frame is larger than the preset classifier-score threshold, target tracking succeeds; otherwise, target tracking fails.
Further, the training process of the classifier of the (h-1)-th video frame comprises:
taking candidate regions in the (h-1)-th video frame that contain the detected target position area as positive sample data for training the binary classifier;
taking candidate regions in the (h-1)-th video frame that do not contain the detected target position area as negative sample data for training the binary classifier;
constructing the sample data for training the binary classifier from the positive and negative sample data;
and running the classifier algorithm on that sample data to obtain the trained classifier of the (h-1)-th video frame.
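The text leaves the choice of binary classifier open. As a minimal stand-in (not the patent's actual classifier), a nearest-centroid scorer over feature vectors illustrates the fit/score interface the per-frame tracking check needs: fit on positive and negative samples from frame h-1, then score candidates from frame h against a threshold.

```python
class CentroidClassifier:
    """Toy stand-in for the per-frame binary classifier: scores a
    candidate feature vector by how much closer it lies to the mean of
    the positive samples than to the mean of the negatives.
    A positive score means "more target-like"."""

    def fit(self, positives, negatives):
        # positives / negatives: lists of equal-length feature vectors.
        dim = len(positives[0])
        self.pos = [sum(v[i] for v in positives) / len(positives) for i in range(dim)]
        self.neg = [sum(v[i] for v in negatives) / len(negatives) for i in range(dim)]
        return self

    def score(self, vec):
        d_pos = sum((a - b) ** 2 for a, b in zip(vec, self.pos)) ** 0.5
        d_neg = sum((a - b) ** 2 for a, b in zip(vec, self.neg)) ** 0.5
        return d_neg - d_pos
```

Tracking success in step (10) then amounts to `clf.score(candidate_features) > threshold`, with the threshold being the preset classifier-score value mentioned above.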
The invention provides an unmanned aerial vehicle ground detection system based on synchronous detection and tracking, which is characterized by comprising the following components:
the training module is used for training the target detection deep neural network model and obtaining a model file and a weight file;
the acquisition module is used for acquiring real-time video data frame by frame;
an initialization module i, configured to initialize a frame number counter h=0;
the assignment module is used for enabling h=h+1 and executing the detection module and the judgment module b at the same time;
the detection module is used for performing forward inference on the real-time video data collected frame by frame, based on the trained model file and weight file of the target detection deep neural network, and obtaining the target position area detected in the h-th video frame;
the judging module a is used for judging whether h is 1; if so, it executes initialization module II; if not, it passes the target position area detected in the h-th video frame to judging module e;
the initialization module II is used for initializing a target tracker according to the target position area detected in the h-th video frame;
the judging module b is used for judging whether h is 1; if so, it executes the assignment module; if not, it executes the tracking module;
the tracking module is used for obtaining the candidate region corresponding to the target position area detected in the (h-1)-th video frame, finding the region in the h-th video frame that matches that candidate region, taking it as the candidate region corresponding to the detected target position area in the h-th video frame, and obtaining the target position area tracked in the h-th video frame from the candidate region of the (h-1)-th video frame;
the judging module c is used for judging whether target tracking in the h-th video frame succeeded; if so, it executes judging module d; if not, it executes initialization module I;
the judging module d is used for judging whether the pixel coordinates of the target position area image tracked in the h-th video frame exceed the coordinate range of the preset video frame image; if so, it executes initialization module I; if not, it executes judging module e;
the judging module e judges whether the detected and tracked target position areas in the h-th video frame overlap; if so, it outputs the detected target position area and executes the assignment module; if not, it outputs the tracked target position area and executes the assignment module.
Preferably, the training module is specifically configured to:
marking each type of target in the historical video data collected frame by frame;
constructing training data by utilizing historical video data marked frame by frame, and training a target detection depth neural network model by utilizing the training data;
and obtaining a model file and a weight file of the trained target detection depth neural network.
Preferably, the detection module is specifically configured to:
using a forward inference framework, sequentially reading the label corresponding to the target, the trained model file, the weight file, and the real-time video data acquired frame by frame, and obtaining the target position output by the forward inference framework.
Preferably, obtaining the candidate region corresponding to the target position area detected in the (h-1)-th video frame comprises:
expanding the target position area detected in the (h-1)-th video frame by a preset multiple.
Further, the value range of the preset multiple is [1.5,3].
Preferably, the judging module c is specifically configured to:
analyzing the candidate region corresponding to the detected target position area in the h-th video frame with the classifier of the (h-1)-th video frame, and obtaining the score of that candidate region;
if the score of the candidate region corresponding to the target position area detected in the h-th video frame is larger than the preset classifier-score threshold, target tracking succeeds; otherwise, target tracking fails.
Further, the training process of the classifier of the (h-1)-th video frame comprises:
taking candidate regions in the (h-1)-th video frame that contain the detected target position area as positive sample data for training the binary classifier;
taking candidate regions in the (h-1)-th video frame that do not contain the detected target position area as negative sample data for training the binary classifier;
constructing the sample data for training the binary classifier from the positive and negative sample data;
and running the classifier algorithm on that sample data to obtain the trained classifier of the (h-1)-th video frame.
Compared with the closest prior art, the invention has the following beneficial effects:
according to the technical scheme, based on a trained model file and a weight file of a target detection depth neural network, counting video frames acquired by an unmanned aerial vehicle, performing forward reasoning on the video frames with the count of 1, acquiring a target position area, initializing a target tracker, simultaneously acquiring a detected target position area and a tracked target position area for each video frame acquired subsequently, and if tracking is successful and the image size of the target position area accords with the preset image size, comparing whether the detected target position area and the tracked target position area are overlapped or not, and determining a final output target position area; the scheme can reduce the interference caused by false detection of target detection; by combining the advantages of the accuracy of the deep learning target detection algorithm and the stability of the target tracking algorithm, the instability caused by jump of the target detection position can be avoided on the premise of keeping the advantage of high accuracy of the target detection algorithm of the deep neural network; meanwhile, the target tracking algorithm with a single scale can adapt to the multi-scale change of the target by means of the target detection algorithm; the technical scheme provided by the invention has small calculated amount, can directly utilize the calculation performance of the onboard GPU of the unmanned aerial vehicle, and has high application value.
Drawings
FIG. 1 is a flow chart of a method of unmanned aerial vehicle ground detection based on synchronous detection tracking;
FIG. 2 is a training flow diagram of a target detection model based on synchronous detection tracking in an embodiment of the invention;
FIG. 3 is a flow chart of real-time target detection based on synchronous detection tracking in an embodiment of the invention;
Fig. 4 is a block diagram of a ground detection system of an unmanned aerial vehicle based on synchronous detection tracking.
Detailed Description
The following describes the embodiments of the present invention in further detail with reference to the drawings.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides an unmanned aerial vehicle ground detection method based on synchronous detection tracking, which is shown in figure 1 and comprises the following steps:
training a target detection depth neural network model to obtain a model file and a weight file;
step (2) collecting real-time video data frame by frame;
step (3) initializing a frame number counter h=0;
step (4) let h=h+1, and simultaneously executing step (5) and step (8);
step (5) based on the trained model file and weight file of the target detection deep neural network, performing forward inference on the real-time video data collected frame by frame, and obtaining the target position area detected in the h-th video frame;
step (6) judging whether h is 1; if so, executing step (7); otherwise, passing the target position area detected in the h-th video frame to step (12);
step (7) initializing a target tracker according to the target position area detected in the h-th video frame;
step (8) judging whether h is 1; if so, executing step (4); if not, executing step (9);
step (9) obtaining the candidate region corresponding to the target position area detected in the (h-1)-th video frame, finding the region in the h-th video frame that matches that candidate region, taking it as the candidate region corresponding to the detected target position area in the h-th video frame, and obtaining the target position area tracked in the h-th video frame from the candidate region of the (h-1)-th video frame;
step (10) judging whether target tracking in the h-th video frame succeeded; if so, executing step (11); otherwise, executing step (3);
step (11) judging whether the pixel coordinates of the target position area image tracked in the h-th video frame exceed the coordinate range of the preset video frame image; if so, executing step (3); if not, executing step (12);
step (12) judging whether the detected and tracked target position areas in the h-th video frame overlap; if so, outputting the detected target position area and executing step (4); if not, outputting the tracked target position area and executing step (4).
Preferably, step (1) includes:
marking each type of target in the historical video data collected frame by frame;
constructing training data by utilizing historical video data marked frame by frame, and training a target detection depth neural network model by utilizing the training data;
and obtaining a model file and a weight file of the trained target detection depth neural network.
Preferably, step (5) comprises:
using a forward inference framework, sequentially reading the label corresponding to the target, the trained model file, the weight file, and the real-time video data acquired frame by frame, and obtaining the target position output by the forward inference framework.
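The raw output format of the forward inference framework is not fixed by the text. Assuming an SSD-style detection layout of [image_id, class_id, confidence, x1, y1, x2, y2] rows with coordinates normalized to [0, 1] (an assumption, not stated in the patent), post-processing into pixel boxes might look like:

```python
def parse_ssd_output(detections, frame_w, frame_h, conf_threshold=0.5):
    """Convert SSD-style detection rows into (class_id, confidence,
    pixel_box) tuples, keeping only rows above the confidence threshold.
    The row layout and the 0.5 threshold are illustrative assumptions."""
    boxes = []
    for _, cls, conf, x1, y1, x2, y2 in detections:
        if conf >= conf_threshold:
            boxes.append((int(cls), conf,
                          (int(x1 * frame_w), int(y1 * frame_h),
                           int(x2 * frame_w), int(y2 * frame_h))))
    return boxes
```

The highest-confidence surviving box would then serve as the detected target position area for the current frame.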
In an embodiment of the present invention, offline training of the target detection depth neural network includes:
Step A-1, for the specific target that the unmanned aerial vehicle is to detect and track, label video data of the same type, and train the deep neural network offline with the labeled data on a GPU server or another computer with strong performance;
Step A-2, decompose video data of the same type acquired by the unmanned aerial vehicle into images; the number of images should be as large as possible, usually not less than 10,000, to avoid overfitting and improve generalization. Label the targets (automobiles, people, tanks, unmanned aerial vehicles and the like) in each image; specifically, frame the target with a rectangle, and record, in a fixed format, either the pixel coordinates of the top-left and bottom-right vertices of the rectangle, or the top-left vertex plus the rectangle's width and height, together with the corresponding target label;
Step A-3, set up a deep neural network training platform (TensorFlow, Darknet, Caffe and the like), set parameters such as batch size and learning rate, read a deep neural network model such as MobileNets-SSD, and update the parameters of the deep neural network model of the specific target detection algorithm on the labeled data;
Step A-4, after training a specific number of rounds (more than 10,000), save the trained deep neural network model and obtain its model file and weight file.
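Step A-2's annotation record (rectangle coordinates plus class label in a fixed format) could be serialized along the following lines; the field names and the JSON-lines choice are illustrative, since the patent fixes no concrete schema.

```python
import json

def annotation_record(image_path, x1, y1, x2, y2, label):
    """One labeled target per record, as described in step A-2: top-left
    pixel coordinates plus width/height of the bounding rectangle and
    the class label. Field names are hypothetical."""
    return {"image": image_path,
            "bbox": {"x1": x1, "y1": y1,
                     "width": x2 - x1, "height": y2 - y1},
            "label": label}

def save_annotations(records, path):
    # One JSON object per line keeps the "fixed format" requirement simple.
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```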
Secondly, detecting a target:
step B-1, loading video data and reading video frames;
b-2, initializing a frame number counter to 0;
step B-3, adding 1 to the frame counter, and simultaneously executing step B-4 and step B-7;
step B-4, load the pre-trained model of the deep learning algorithm and detect the specific target in the read video frame with a deep learning forward inference mechanism: read the target category labels, the pre-trained parameter model file, the weight file and the video frame to be detected, and perform forward inference on the new video frame to acquire the target position information and confidence;
step B-5, judge whether h is 1; if so, execute step B-6; if not, pass the detected target position area to step C-4;
step B-6, initialize the target tracker with the target position detected by the target detector as the tracking starting point;
step B-7, judge whether h is 1; if so, execute step B-3; if not, execute step C-1;
finally, tracking the target:
step C-1, track the target with the tracking algorithm and update the target position on the new video frame: determine the position of the candidate region from the previous frame and extract its features; search the current video frame for the region that best matches the features of the candidate region, take it as the tracked object, and obtain the target position area tracked in the video frame;
c-2, judging whether target tracking is successful or not through a preset threshold value, executing the step B-2 when the target tracking is unsuccessful, and executing the next step when the target tracking is successful;
c-3, judging whether the pixel coordinates of the image of the position area of the output target exceed the coordinate range of the preset video frame image, if so, executing the step B-2, and if not, outputting the position of the target and executing the step C-4;
and C-4, judging whether the detected target position area and the tracked target position area in the current video frame are overlapped, if so, outputting the detected target position area, executing the step B-3, and if not, outputting the tracked target position area, and executing the step B-3.
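Step C-1's search for the best-matching region inside the candidate area can be illustrated with a brute-force sum-of-squared-differences template match. A real implementation would use the correlation-filtering features mentioned in the background; this is only a toy stand-in to show the shape of the search.

```python
def match_in_candidate(template, candidate):
    """Slide the previous frame's target template over the candidate
    region and return the top-left (x, y) offset with the smallest
    sum of squared differences. Inputs are 2-D lists of grayscale
    values; exhaustive search is used purely for clarity."""
    th, tw = len(template), len(template[0])
    ch, cw = len(candidate), len(candidate[0])
    best, best_pos = None, (0, 0)
    for y in range(ch - th + 1):
        for x in range(cw - tw + 1):
            ssd = sum((candidate[y + i][x + j] - template[i][j]) ** 2
                      for i in range(th) for j in range(tw))
            if best is None or ssd < best:
                best, best_pos = ssd, (x, y)
    return best_pos
```

Adding the returned offset to the candidate region's origin gives the tracked target position area in the current frame.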
Preferably, the obtaining a candidate region corresponding to the detected target position region in the h-1 video frame includes:
and expanding the detected target position area in the h-1 video frame by a preset multiple.
Further, the value range of the preset multiple is [1.5,3].
Preferably, the training process of the classifier of the h-1 th video frame comprises:
taking candidate areas corresponding to the detected target position areas contained in the h-1 video frame as positive sample data for training the second classifier;
taking candidate areas corresponding to the target position areas which do not contain detection in the h-1 video frame as negative sample data for training the second classifier;
constructing sample data for training a two-classifier by utilizing the positive sample data and the negative sample data;
and executing a classifier algorithm on the sample data of the training second classifier to obtain the trained classifier of the h-1 video frame.
In the embodiment of the invention, in step C-2, the candidate region of the current frame serves as a template, and whether it contains the real target frame determines whether it is a positive sample for training the classification algorithm, which yields a classifier. A prediction template on the next frame's image is obtained from the template of the current frame's real target frame, and a number of alternative templates are generated with a circulant matrix. The classifier generated from the current frame is run on the next frame's image with the alternative templates as samples, obtaining a label for every sample; the alternative frame whose label indicates that it contains the real target position is taken as the target prediction template in the next frame. Comparing the relative positions of the next frame's prediction template and the enlarged template of the current frame's real target yields the change in the target position, and thus the new target position in the next frame. The classification value obtained by the classifier is compared with a preset value M: if it is larger than M, tracking succeeds; if it is smaller than M, tracking fails.
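The circulant-matrix trick mentioned above amounts to treating every cyclic shift of the base patch as one alternative template, so an h x w patch yields h*w candidates without explicitly cropping the image. A literal (unoptimized) enumeration of those shifts, for illustration only — practical correlation-filter trackers evaluate all shifts at once in the Fourier domain rather than materializing them:

```python
def cyclic_shifts(patch):
    """All 2-D cyclic shifts of the base patch (a 2-D list), enumerated
    row-shift-major: index dy * width + dx holds the patch shifted down
    by dy rows and right by dx columns."""
    h, w = len(patch), len(patch[0])

    def shift(dy, dx):
        return [[patch[(i - dy) % h][(j - dx) % w] for j in range(w)]
                for i in range(h)]

    return [shift(dy, dx) for dy in range(h) for dx in range(w)]
```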
Based on the technical scheme provided by the invention, the embodiment of the invention also provides a training flow chart of the confidence-based target detection model, as shown in fig. 2:
S1, offline training of the target detection model:
S11, acquiring videos or images of the specific area to be monitored, where the acquired scenes should be as similar as possible to the actual monitoring area of the unmanned aerial vehicle;
S12, marking multiple types of targets (vehicles, personnel, trees and the like) frame by frame in the acquired videos or images. The marking frame is preferably a rectangle, positioned either by its top-left and bottom-right corner points or by its top-left corner point together with its width and height; the marked coordinates and category labels are saved as xml or txt files in a fixed format, and an index file is established to map image paths and file names one-to-one to xml or txt file paths and file names;
S13, selecting a training platform for the deep neural network; the platform may be Caffe, TensorFlow, PyTorch or Darknet, but is not limited to these;
S14, selecting a target detection deep neural network, including but not limited to the MobileNet-SSD target detection neural network, setting parameters such as the training batch size and learning rate, reading the training images and the corresponding xml or txt files according to the index file, and training on the platform selected in S13 with the marked data;
S15, performing N rounds of training on the acquired data in the training process of S14, where N is usually not less than 10000, and saving the obtained model file for later use in the real-time target detection process.
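The S12 bookkeeping might be sketched as below: one txt annotation file per image holding one rectangle per line, plus an index file mapping each image path to its annotation path. The exact "label x1 y1 x2 y2" line format is an assumption; the patent only requires a fixed format and a one-to-one index.

```python
# Sketch: per-image txt annotations and the image-to-annotation index file.

def save_annotation(txt_path, boxes):
    # boxes: iterable of (class_label, x1, y1, x2, y2) rectangles
    with open(txt_path, "w") as f:
        for cls, x1, y1, x2, y2 in boxes:
            f.write(f"{cls} {x1} {y1} {x2} {y2}\n")

def build_index(index_path, pairs):
    # pairs: iterable of (image_path, annotation_path), one line each
    with open(index_path, "w") as f:
        for img_path, ann_path in pairs:
            f.write(f"{img_path} {ann_path}\n")
```

The training platform of S13–S14 would read the index file, then load each image together with its annotation file.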
Based on the technical scheme provided by the invention, the embodiment of the invention also provides a confidence-based real-time target detection flow chart, as shown in fig. 3:
S2, online real-time target detection:
S21, reading camera video or image data frame by frame on the unmanned aerial vehicle in real time;
S22, running a lightweight forward inference framework that is convenient to deploy on a mobile platform, including but not limited to the OpenCV DNN module and the TensorRT and Tengine forward inference modules;
S23, reading the model weight file trained and saved in S15, detecting the selected targets on the video or images read frame by frame, and obtaining and outputting the corresponding target position rectangle, confidence, category label and other information.
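The S21–S23 loop can be sketched as follows, with a stand-in `detect` callable in place of the deployed inference framework; the (box, confidence, label) output format and the threshold value are assumptions:

```python
# Sketch of the online loop: read frames, run forward inference, and keep
# only detections whose confidence exceeds a threshold.

def filter_detections(raw_detections, conf_threshold=0.5):
    """Keep detections above the confidence threshold."""
    return [(box, conf, label) for box, conf, label in raw_detections
            if conf > conf_threshold]

def run_online(frames, detect, conf_threshold=0.5):
    results = []
    for frame in frames:                       # S21: frame-by-frame input
        raw = detect(frame)                    # S22/S23: forward inference
        results.append(filter_detections(raw, conf_threshold))
    return results
```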
the invention provides an unmanned aerial vehicle ground detection system based on synchronous detection tracking, as shown in fig. 4, comprising:
the training module is used for training the target detection deep neural network model and obtaining a model file and a weight file;
the acquisition module is used for acquiring real-time video data frame by frame;
an initialization module I, configured to initialize a frame number counter h=0;
the assignment module is used for setting h=h+1 and simultaneously executing the detection module and the judging module b;
the detection module is used for carrying out forward inference on the real-time video data collected frame by frame, based on the trained model file and weight file of the target detection deep neural network, and obtaining the target position region detected in the h-th video frame;
the judging module a is used for judging whether h is 1, if so, executing the initialization module II, and if not, passing the target position region detected in the h-th video frame to the judging module e;
the initialization module II is used for initializing a target tracker according to the target position region detected in the h-th video frame;
the judging module b is used for judging whether h is 1, if so, executing the assignment module, and if not, executing the tracking module;
the tracking module is used for obtaining the candidate region corresponding to the detected target position region in the (h-1)-th video frame and the region in the h-th video frame that coincides with that candidate region, taking the latter as the candidate region corresponding to the detected target position region in the h-th video frame, and obtaining the tracked target position region in the h-th video frame according to the candidate region corresponding to the detected target position region in the (h-1)-th video frame;
the judging module c is used for judging whether the target tracking in the h-th video frame is successful, if so, executing the judging module d, and if not, executing the initialization module I;
the judging module d is used for judging whether the pixel coordinates of the tracked target position region image in the h-th video frame exceed the preset coordinate range of the video frame image, if so, executing the initialization module I, and if not, executing the judging module e;
and the judging module e judges whether the detected target position region and the tracked target position region in the h-th video frame overlap; if so, the detected target position region is output and the assignment module is executed, and if not, the tracked target position region is output and the assignment module is executed.
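The interaction of the modules above (detect each frame, initialize or re-initialize the tracker, and prefer the detected region when it overlaps the tracked one) might be sketched as follows. The `detect` and `track` callables, the box format, and the simplified reset behavior (the sketch does not restart the frame counter) are all assumptions:

```python
# Sketch of the synchronous detect-and-track loop.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def detect_track_loop(frames, detect, track, img_w, img_h):
    outputs, tracker_box = [], None
    for frame in frames:
        det = detect(frame)                      # detection branch (every frame)
        if tracker_box is None:                  # first frame or after a reset
            tracker_box = det                    # initialize the tracker
            outputs.append(det)
            continue
        trk = track(frame, tracker_box)          # tracking branch
        in_bounds = (trk is not None and trk[0] >= 0 and trk[1] >= 0
                     and trk[2] <= img_w and trk[3] <= img_h)
        if trk is None or not in_bounds:         # tracking failed or left image
            tracker_box = None                   # re-initialize next frame
            outputs.append(det)
            continue
        # prefer the detected region when it overlaps the tracked one
        chosen = det if iou(det, trk) > 0 else trk
        outputs.append(chosen)
        tracker_box = chosen
    return outputs
```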
Preferably, the training module is specifically configured to:
marking each type of target in the historical video data collected frame by frame;
constructing training data from the historical video data marked frame by frame, and training the target detection deep neural network model with the training data;
and obtaining the model file and weight file of the trained target detection deep neural network.
Preferably, the detection module is specifically configured to:
sequentially read, using the forward inference framework, the label corresponding to the target, the trained model file, the weight file and the real-time video data acquired frame by frame, and obtain the target position output by the forward inference framework.
Preferably, the obtaining a candidate region corresponding to the detected target position region in the h-1 video frame includes:
and expanding the detected target position area in the h-1 video frame by a preset multiple.
Further, the value range of the preset multiple is [1.5,3].
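A possible sketch of this candidate-region expansion: the detected box is enlarged about its centre by a preset multiple (the patent gives the range [1.5, 3]) and clamped to the image bounds. The (x1, y1, x2, y2) box representation is an assumption:

```python
# Sketch: expand a detected target box by a preset multiple, clamped to the image.

def expand_region(box, multiple, img_w, img_h):
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * multiple / 2.0
    half_h = (y2 - y1) * multiple / 2.0
    return (max(0.0, cx - half_w), max(0.0, cy - half_h),
            min(float(img_w), cx + half_w), min(float(img_h), cy + half_h))
```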
Preferably, the judging module c is specifically configured to:
analyze the candidate region corresponding to the detected target position region in the h-th video frame with the classifier of the (h-1)-th video frame, and obtain the score of that candidate region;
if the score of the candidate region corresponding to the target position region detected in the h-th video frame is greater than the preset classifier score threshold, the target tracking is successful; otherwise, the target tracking fails.
Further, the training process of the classifier of the (h-1)-th video frame comprises:
taking candidate regions in the (h-1)-th video frame that contain a detected target position region as positive sample data for training the binary classifier;
taking candidate regions in the (h-1)-th video frame that do not contain a detected target position region as negative sample data for training the binary classifier;
constructing sample data for training the binary classifier from the positive sample data and the negative sample data;
and running a classification algorithm on the sample data for training the binary classifier to obtain the trained classifier of the (h-1)-th video frame.
The unmanned aerial vehicle ground detection system or the electronic equipment loaded with the unmanned aerial vehicle ground detection method can be deployed on the unmanned aerial vehicle so as to monitor and track the target.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical aspects of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: modifications and equivalents may be made to the specific embodiments of the invention without departing from the spirit and scope of the invention, which is intended to be covered by the claims.

Claims (14)

1. An unmanned aerial vehicle ground detection method based on synchronous detection tracking is characterized by comprising the following steps:
step (1) training a target detection deep neural network model to obtain a model file and a weight file;
step (2) collecting real-time video data frame by frame;
step (3) initializing a frame number counter h=0;
step (4) let h=h+1, and simultaneously executing step (5) and step (8);
step (5) based on the trained model file and weight file of the target detection deep neural network, forward inference is carried out on real-time video data collected frame by frame, and the target position area detected in the h video frame is obtained;
step (6) judging whether h is 1, if so, executing step (7), otherwise, storing the target position area detected in the h video frame to step (12);
initializing a target tracker according to the target position area detected in the h video frame;
step (8) judging whether h is 1, if so, executing step (4), and if not, executing step (9);
step (9) obtaining a candidate region corresponding to the detected target position region in the h-1 video frame and a region in the h video frame, which is consistent with the candidate region corresponding to the h-1 video frame, and taking the region as the candidate region corresponding to the detected target position region in the h video frame, and obtaining the tracked target position region in the h video frame according to the candidate region corresponding to the detected target position region in the h-1 video frame;
step (10) judging whether the target tracking in the h video frame is successful, if so, executing the step (11), otherwise, executing the step (3);
step (11), judging whether the pixel coordinates of the target position area image tracked in the h video frame exceed the coordinate range of the preset video frame image, if so, executing the step (3), and if not, executing the step (12);
and (12) judging whether the detected target position area and the tracked target position area in the h video frame are overlapped or not, if so, outputting the detected target position area and executing the step (4), and if not, outputting the tracked target position area and executing the step (4).
2. The method of claim 1, wherein the step (1) comprises:
marking each type of target in the historical video data collected frame by frame;
constructing training data by utilizing historical video data marked frame by frame, and training a target detection deep neural network model by utilizing the training data;
and obtaining a model file and a weight file of the trained target detection deep neural network.
3. The method of claim 1, wherein the step (5) comprises:
and sequentially reading the label corresponding to the target, the trained model file, the weight file and the real-time video data acquired frame by frame by using the forward inference framework, and acquiring the detected target position area output by the forward inference framework.
4. The method of claim 1, wherein the acquiring the candidate region corresponding to the detected target location region in the h-1 th video frame comprises:
and expanding the detected target position area in the h-1 video frame by a preset multiple.
5. The method of claim 4, wherein the predetermined multiple has a value in the range of [1.5,3].
6. The method of claim 1, wherein the step (10) comprises:
analyzing candidate areas corresponding to the detected target position areas in the h video frame by using a classifier of the h-1 video frame, and obtaining scores of the candidate areas corresponding to the detected target position areas in the h video frame;
if the score of the candidate region corresponding to the target position region detected in the h video frame is larger than the preset value of the classifier score, the target tracking is successful, otherwise, the target tracking fails.
7. The method of claim 6, wherein the training process of the classifier of the h-1 th video frame comprises:
taking candidate areas corresponding to the detected target position areas contained in the h-1 video frame as positive sample data for training the binary classifier;
taking candidate areas in the h-1 video frame that do not contain a detected target position area as negative sample data for training the binary classifier;
constructing sample data for training the binary classifier by utilizing the positive sample data and the negative sample data;
and executing a classification algorithm on the sample data for training the binary classifier to obtain the trained classifier of the h-1 video frame.
8. Unmanned aerial vehicle ground detection system based on synchronous detection tracking, characterized in that the system includes:
the training module is used for training the target detection deep neural network model and obtaining a model file and a weight file;
the acquisition module is used for acquiring real-time video data frame by frame;
an initialization module i, configured to initialize a frame number counter h=0;
the assignment module is used for enabling h=h+1 and executing the detection module and the judgment module b at the same time;
the detection module is used for carrying out forward inference on real-time video data collected frame by frame based on a trained model file and a weight file of the target detection deep neural network, and obtaining a target position area detected in an h video frame;
the judging module a is used for judging whether h is 1, if so, executing an initializing module II, and if not, storing the target position area detected in the h video frame into a judging module e;
the initialization module II is used for initializing a target tracker according to the target position area detected in the h video frame;
the judging module b is used for judging whether h is 1, if yes, executing the assignment module, and if no, executing the tracking module;
the tracking module is used for acquiring a candidate region corresponding to the detected target position region in the h-1 video frame and a region, corresponding to the candidate region corresponding to the h-1 video frame, in the h video frame, and taking the region as the candidate region corresponding to the detected target position region in the h video frame, and acquiring the tracked target position region in the h video frame according to the candidate region corresponding to the detected target position region in the h-1 video frame;
the judging module c is used for judging whether the target tracking in the h video frame is successful, if so, executing the judging module d, and if not, executing the initializing module I;
the judging module d is used for judging whether the pixel coordinates of the target position area image tracked in the h video frame exceed the coordinate range of the preset video frame image, if so, executing the initializing module I, and if not, executing the judging module e;
and the judging module e judges whether the detected target position area and the tracked target position area in the h video frame are overlapped or not, if so, the detected target position area is output, the assignment module is executed, and if not, the tracked target position area is output, and the assignment module is executed.
9. The system of claim 8, wherein the training module is specifically configured to:
marking each type of target in the historical video data collected frame by frame;
constructing training data by utilizing historical video data marked frame by frame, and training a target detection deep neural network model by utilizing the training data;
and obtaining a model file and a weight file of the trained target detection deep neural network.
10. The system of claim 8, wherein the detection module is specifically configured to:
and sequentially reading the label corresponding to the target, the trained model file, the weight file and the video data acquired frame by frame by utilizing the forward inference framework, and acquiring the detected target position area output by the forward inference framework.
11. The system of claim 8, wherein the acquiring the candidate region corresponding to the detected target location region in the h-1 th video frame comprises:
and expanding the detected target position area in the h-1 video frame by a preset multiple.
12. The system of claim 11, wherein the predetermined multiple has a value in the range of [1.5,3].
13. The system of claim 8, wherein the judging module c is specifically configured to:
analyzing candidate areas corresponding to the detected target position areas in the h video frame by using a classifier of the h-1 video frame, and obtaining scores of the candidate areas corresponding to the detected target position areas in the h video frame;
if the score of the candidate region corresponding to the target position region detected in the h video frame is larger than the preset value of the classifier score, the target tracking is successful, otherwise, the target tracking fails.
14. The system of claim 13, wherein the training process of the classifier of the h-1 st video frame comprises:
taking candidate areas corresponding to the detected target position areas contained in the h-1 video frame as positive sample data for training the binary classifier;
taking candidate areas in the h-1 video frame that do not contain a detected target position area as negative sample data for training the binary classifier;
constructing sample data for training the binary classifier by utilizing the positive sample data and the negative sample data;
and executing a classification algorithm on the sample data for training the binary classifier to obtain the trained classifier of the h-1 video frame.
CN202011285803.8A 2020-11-17 2020-11-17 Unmanned aerial vehicle ground detection method and system based on synchronous detection tracking Active CN114511793B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011285803.8A CN114511793B (en) 2020-11-17 2020-11-17 Unmanned aerial vehicle ground detection method and system based on synchronous detection tracking

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011285803.8A CN114511793B (en) 2020-11-17 2020-11-17 Unmanned aerial vehicle ground detection method and system based on synchronous detection tracking

Publications (2)

Publication Number Publication Date
CN114511793A CN114511793A (en) 2022-05-17
CN114511793B true CN114511793B (en) 2024-04-05

Family

ID=81547239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011285803.8A Active CN114511793B (en) 2020-11-17 2020-11-17 Unmanned aerial vehicle ground detection method and system based on synchronous detection tracking

Country Status (1)

Country Link
CN (1) CN114511793B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107862705A (en) * 2017-11-21 2018-03-30 重庆邮电大学 A kind of unmanned plane small target detecting method based on motion feature and deep learning feature
WO2019041519A1 (en) * 2017-08-29 2019-03-07 平安科技(深圳)有限公司 Target tracking device and method, and computer-readable storage medium
WO2019101220A1 (en) * 2017-12-11 2019-05-31 珠海大横琴科技发展有限公司 Deep learning network and average drift-based automatic vessel tracking method and system
CN110399808A (en) * 2019-07-05 2019-11-01 桂林安维科技有限公司 A kind of Human bodys' response method and system based on multiple target tracking
CN111784737A (en) * 2020-06-10 2020-10-16 中国人民解放军军事科学院国防科技创新研究院 Automatic target tracking method and system based on unmanned aerial vehicle platform

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11216954B2 (en) * 2018-04-18 2022-01-04 Tg-17, Inc. Systems and methods for real-time adjustment of neural networks for autonomous tracking and localization of moving subject

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019041519A1 (en) * 2017-08-29 2019-03-07 平安科技(深圳)有限公司 Target tracking device and method, and computer-readable storage medium
CN107862705A (en) * 2017-11-21 2018-03-30 重庆邮电大学 A kind of unmanned plane small target detecting method based on motion feature and deep learning feature
WO2019101220A1 (en) * 2017-12-11 2019-05-31 珠海大横琴科技发展有限公司 Deep learning network and average drift-based automatic vessel tracking method and system
CN110399808A (en) * 2019-07-05 2019-11-01 桂林安维科技有限公司 A kind of Human bodys' response method and system based on multiple target tracking
CN111784737A (en) * 2020-06-10 2020-10-16 中国人民解放军军事科学院国防科技创新研究院 Automatic target tracking method and system based on unmanned aerial vehicle platform

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Single-shot multi-object tracking algorithm based on convolutional neural network detection; Min Zhaoyang; Zhao Wenjie; Ship Electronic Engineering; 2017-12-20 (12); full text *
Fast TLD visual object tracking using kernelized correlation filtering; Wang Jiaoyao; Hou Zhiqiang; Yu Wangsheng; Liao Xiufeng; Chen Chuanhua; Journal of Image and Graphics; 2018-11-16 (11); full text *

Also Published As

Publication number Publication date
CN114511793A (en) 2022-05-17

Similar Documents

Publication Publication Date Title
US11205274B2 (en) High-performance visual object tracking for embedded vision systems
CN109166094B (en) Insulator fault positioning and identifying method based on deep learning
CN106874854B (en) Unmanned aerial vehicle tracking method based on embedded platform
WO2020186678A1 (en) Three-dimensional map constructing method and apparatus for unmanned aerial vehicle, computer device, and storage medium
Alexandrov et al. Analysis of machine learning methods for wildfire security monitoring with an unmanned aerial vehicles
CN103208008B (en) Based on the quick adaptive method of traffic video monitoring target detection of machine vision
US20150138310A1 (en) Automatic scene parsing
CN106709475B (en) Obstacle recognition method and device, computer equipment and readable storage medium
CN110021033A (en) A kind of method for tracking target based on the twin network of pyramid
CN111784737B (en) Automatic target tracking method and system based on unmanned aerial vehicle platform
CN110555420B (en) Fusion model network and method based on pedestrian regional feature extraction and re-identification
CN111461209A (en) Model training device and method
CN114511792B (en) Unmanned aerial vehicle ground detection method and system based on frame counting
CN109919223B (en) Target detection method and device based on deep neural network
Wu et al. Multivehicle object tracking in satellite video enhanced by slow features and motion features
CN113936340A (en) AI model training method and device based on training data acquisition
CN111831010A (en) Unmanned aerial vehicle obstacle avoidance flight method based on digital space slice
CN112487892B (en) Unmanned aerial vehicle ground detection method and system based on confidence
CN114194180A (en) Method, device, equipment and medium for determining auxiliary parking information
CN114511793B (en) Unmanned aerial vehicle ground detection method and system based on synchronous detection tracking
CN112487889A (en) Unmanned aerial vehicle ground detection method and system based on deep neural network
CN116434150A (en) Multi-target detection tracking method, system and storage medium for congestion scene
CN116453109A (en) 3D target detection method, device, equipment and storage medium
CN113139985B (en) Tracking target framing method for eliminating communication delay influence of unmanned aerial vehicle and ground station
CN114494355A (en) Trajectory analysis method and device based on artificial intelligence, terminal equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant