Disclosure of Invention
In view of the above, there is a need to provide an unmanned aerial vehicle detection method, apparatus, computer device and storage medium that can utilize already-deployed cameras to realize intelligent security.
A drone detection method, the method comprising:
in the training stage, a training data set comprising an unmanned aerial vehicle target is obtained, and an input image is obtained after labeling and preprocessing;
inputting the input image into an improved YoloV5 network for unmanned aerial vehicle detection; the improved YoloV5 network reduces the number of down-sampling layers of the backbone network, removes the third detection head branch of the original YoloV5 network, and further up-samples the output of the second detection head branch of the original YoloV5 and then concatenates it with the output of the fourth layer of the backbone network to obtain the output of the first detection head branch;
training the improved YoloV5 network through the training data set to obtain a trained unmanned aerial vehicle detection network;
in the application stage, acquiring real-time video data of a city monitoring camera, extracting a frame image containing a suspicious target from the real-time video data through a moving target detection algorithm, and determining position information of the suspicious target;
splitting the frame image containing the suspicious target into a plurality of candidate regions, and determining the candidate regions containing the suspicious target in the plurality of candidate regions according to the position information of the suspicious target;
and inputting the candidate region image containing the suspicious target into the trained unmanned aerial vehicle detection network for unmanned aerial vehicle detection, and if the suspicious target is determined by detection to be an unmanned aerial vehicle, outputting a detection result marked with a target frame.
In one embodiment, the method further comprises the following steps: acquiring a training data set comprising unmanned aerial vehicle targets;
performing target labeling on the images in the training data set, and performing a manual recheck;
and performing at least one of scaling, random cropping and flipping operations on the labeled images to obtain preprocessed input images.
In one embodiment, the method further comprises the following steps: and equally dividing the frame image containing the suspicious target into four candidate regions of upper left, upper right, lower left and lower right.
In one embodiment, the method further comprises the following steps: taking the frame image containing the suspicious target as an original image, and equally dividing the original image into upper-left, upper-right, lower-left and lower-right regions to obtain four candidate regions;
selecting, from the image formed by each pair of adjacent candidate regions among the four candidate regions, a region one quarter the size of the original image, to obtain four further candidate regions;
selecting a region one quarter the size of the original image at the center of the original image to obtain one further candidate region;
and obtaining nine candidate regions in total through picture splitting.
In one embodiment, before inputting the candidate region image containing the suspicious target into the trained unmanned aerial vehicle detection network, the method further includes: performing super-resolution reconstruction on the candidate region image containing the suspicious target.
In one embodiment, the method further comprises the following steps: inputting the candidate area image containing the suspicious target into the trained unmanned aerial vehicle detection network for unmanned aerial vehicle detection;
if the suspicious target is determined by detection to be an unmanned aerial vehicle, mapping the target frame detected in the candidate area image back to the original image, calculating the position of the target relative to the original image, and marking it in the original image;
and outputting the picture or the video marked with the target frame.
In one embodiment, after outputting the picture or video marked with the target frame, the method further includes: and storing the video image frame of the unmanned aerial vehicle target.
An unmanned aerial vehicle detection apparatus, the apparatus comprising:
the training set acquisition module is used for acquiring a training data set comprising an unmanned aerial vehicle target in a training stage, and obtaining an input image after labeling and preprocessing;
the training module is used for inputting the input image into an improved YoloV5 network for unmanned aerial vehicle detection; the improved YoloV5 network reduces the number of down-sampling layers of the backbone network, removes the third detection head branch of the original YoloV5 network, and further up-samples the output of the second detection head branch of the original YoloV5 and then concatenates it with the output of the fourth layer of the backbone network to obtain the output of the first detection head branch; and for training the improved YoloV5 network with the training data set to obtain a trained unmanned aerial vehicle detection network;
the suspicious target determining module is used for acquiring real-time video data of the urban monitoring camera in an application stage, extracting a frame image containing a suspicious target from the real-time video data through a moving target detection algorithm, and determining the position information of the suspicious target;
the candidate region determining module is used for splitting the frame image containing the suspicious target into a plurality of candidate regions, and determining the candidate regions containing the suspicious target in the plurality of candidate regions according to the position information of the suspicious target;
and the unmanned aerial vehicle detection module is used for inputting the candidate area image containing the suspicious target into the trained unmanned aerial vehicle detection network to detect the unmanned aerial vehicle, and if the suspicious target is detected and judged to be the unmanned aerial vehicle, outputting a detection result marked with a target frame.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
in the training stage, a training data set comprising an unmanned aerial vehicle target is obtained, and an input image is obtained after labeling and preprocessing;
inputting the input image into an improved YoloV5 network for unmanned aerial vehicle detection; the improved YoloV5 network reduces the number of down-sampling layers of the backbone network, removes the third detection head branch of the original YoloV5 network, and further up-samples the output of the second detection head branch of the original YoloV5 and then concatenates it with the output of the fourth layer of the backbone network to obtain the output of the first detection head branch;
training the improved YoloV5 network through the training data set to obtain a trained unmanned aerial vehicle detection network;
in the application stage, acquiring real-time video data of a city monitoring camera, extracting a frame image containing a suspicious target from the real-time video data through a moving target detection algorithm, and determining position information of the suspicious target;
splitting the frame image containing the suspicious target into a plurality of candidate regions, and determining the candidate regions containing the suspicious target in the plurality of candidate regions according to the position information of the suspicious target;
and inputting the candidate region image containing the suspicious target into the trained unmanned aerial vehicle detection network for unmanned aerial vehicle detection, and if the suspicious target is determined by detection to be an unmanned aerial vehicle, outputting a detection result marked with a target frame.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
in the training stage, a training data set comprising an unmanned aerial vehicle target is obtained, and an input image is obtained after labeling and preprocessing;
inputting the input image into an improved YoloV5 network for unmanned aerial vehicle detection; the improved YoloV5 network reduces the number of down-sampling layers of the backbone network, removes the third detection head branch of the original YoloV5 network, and further up-samples the output of the second detection head branch of the original YoloV5 and then concatenates it with the output of the fourth layer of the backbone network to obtain the output of the first detection head branch;
training the improved YoloV5 network through the training data set to obtain a trained unmanned aerial vehicle detection network;
in the application stage, acquiring real-time video data of a city monitoring camera, extracting a frame image containing a suspicious target from the real-time video data through a moving target detection algorithm, and determining position information of the suspicious target;
splitting the frame image containing the suspicious target into a plurality of candidate regions, and determining the candidate regions containing the suspicious target in the plurality of candidate regions according to the position information of the suspicious target;
and inputting the candidate region image containing the suspicious target into the trained unmanned aerial vehicle detection network for unmanned aerial vehicle detection, and if the suspicious target is determined by detection to be an unmanned aerial vehicle, outputting a detection result marked with a target frame.
According to the unmanned aerial vehicle detection method, apparatus, computer device and storage medium, the YoloV5 network is improved: the number of down-sampling layers of the backbone network is reduced, the third detection head branch of the original YoloV5 network is removed, and the output of the second detection head branch of the original YoloV5 is further up-sampled and then concatenated with the output of the fourth layer of the backbone network to obtain the output of the first detection head branch; the improved YoloV5 network is trained with a training data set to obtain a trained unmanned aerial vehicle detection network; in the application stage, real-time video data of city monitoring cameras is acquired, a frame image containing a suspicious target is extracted from the real-time video data through a moving target detection algorithm, and the position information of the suspicious target is determined; the frame image containing the suspicious target is split into a plurality of candidate regions, and the candidate regions containing the suspicious target are determined among the plurality of candidate regions according to the position information of the suspicious target; and the candidate region image containing the suspicious target is input into the trained unmanned aerial vehicle detection network for unmanned aerial vehicle detection, and if the suspicious target is determined by detection to be an unmanned aerial vehicle, a detection result marked with a target frame is output. The invention reduces the number of network layers and the down-sampling rate, so that small targets still retain effective information in later network layers, which improves the detection rate of the network for unmanned aerial vehicles; through the operations of moving target detection and image splitting, the computational cost is greatly reduced and the efficiency of the algorithm and the real-time performance of detection are improved.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, there is provided a drone detection method, including the steps of:
Step 102, in the training stage, acquiring a training data set including an unmanned aerial vehicle target, and labeling and preprocessing the training data set to obtain an input image.
Firstly, a training data set is collected; the acquired data are then manually labeled. A manual recheck is needed after labeling to prevent poorly labeled samples from interfering with subsequent training. The data then need to be preprocessed, including scaling, random cropping, flipping and other operations, to ensure the diversity of the features contained in the data.
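The exact preprocessing pipeline is not fixed by this description; as an illustration only, the operations named above (scaling, random cropping, flipping) can be sketched in plain NumPy on an H×W×C image array. The function name `preprocess` and the output size of 640 are assumptions for the sketch, not part of the method.

```python
import numpy as np

def preprocess(img, out_size=640, rng=None):
    """Illustrative preprocessing sketch: nearest-neighbour scaling,
    random crop and random horizontal flip on an HxWxC uint8 array."""
    rng = rng or np.random.default_rng(0)
    h, w = img.shape[:2]
    # Scale to slightly larger than the target size (nearest neighbour).
    s = int(out_size * 1.1)
    ys = np.arange(s) * h // s
    xs = np.arange(s) * w // s
    img = img[ys][:, xs]
    # Random crop back to out_size x out_size.
    y0 = int(rng.integers(0, s - out_size + 1))
    x0 = int(rng.integers(0, s - out_size + 1))
    img = img[y0:y0 + out_size, x0:x0 + out_size]
    # Random horizontal flip with probability 0.5.
    if rng.random() < 0.5:
        img = img[:, ::-1]
    return img
```

In practice a training framework's own augmentation utilities would be used; the sketch only shows the order of operations.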
Step 104, inputting the input image into the improved YoloV5 network for unmanned aerial vehicle detection.
The improved YoloV5 network reduces the number of down-sampling layers of the backbone network, removes the third detection head branch of the original YoloV5 network, and further up-samples the output of the second detection head branch of the original YoloV5 and then concatenates it with the output of the fourth layer of the backbone network to obtain the output of the first detection head branch.
As shown in fig. 2, modules 0 to 7 form the backbone network. In the original YoloV5 network there is a down-sampling layer after module 6, but in the YoloV5 network improved by the present invention the number of down-sampling layers of the backbone network is reduced, so that the later layers of the network can still learn the features of small targets.
When the network is relatively deep, the contribution of deep-level features to detection is weak; the third detection head branch of the original YoloV5 network is therefore removed, which improves the operating efficiency of the algorithm.
The output of the second detection head branch of the original YoloV5 is further up-sampled and then concatenated with the output of the fourth layer of the backbone network, namely module 3, which increases the capture rate of the network for the small unmanned aerial vehicle target.
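A minimal sketch of this splicing operation, using NumPy arrays in place of real network tensors: the feature-map shapes and function names below are illustrative assumptions, and an actual YoloV5 implementation would use framework tensors, learned convolutions and its own layer indexing.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x spatial up-sampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse_head(head2_out, backbone_l4_out):
    """Up-sample the second detection-head feature map and concatenate it,
    channel-wise, with the backbone layer-4 feature map, yielding the
    input of the first detection head branch."""
    up = upsample2x(head2_out)
    assert up.shape[1:] == backbone_l4_out.shape[1:], "spatial sizes must match"
    return np.concatenate([up, backbone_l4_out], axis=0)
```

The channel-wise concatenation preserves both the semantically richer up-sampled features and the higher-resolution shallow features, which is what benefits small targets.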
And step 106, training the improved YoloV5 network through a training data set to obtain a trained unmanned aerial vehicle detection network.
In consideration of edge computing, the parameter count of the model is reduced as much as possible while the accuracy of the original model is approached.
During training, whether the expected effect is achieved is repeatedly verified; if not, the data are reprocessed and the network model parameters are iterated, so that the result approaches the expected effect more and more closely.
And step 108, in the application stage, acquiring real-time video data of the urban monitoring camera, extracting a frame image containing the suspicious target from the real-time video data through a moving target detection algorithm, and determining the position information of the suspicious target.
Firstly, a suspicious target is extracted by a moving target detection method. In practical application scenarios the background is mostly a static, pure background, and performing deep-learning target detection on every frame would incur a large computational cost. The invention therefore performs deep-learning-based unmanned aerial vehicle detection only on the extracted frame images containing a suspicious target, saving computational cost and improving algorithm efficiency.
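The description does not fix a particular moving target detection algorithm. As one common background-difference approach, a minimal frame-differencing sketch might look as follows; the function name and the threshold value are assumptions for illustration.

```python
import numpy as np

def moving_region(prev_gray, cur_gray, thresh=25):
    """Background-difference sketch: flag pixels whose absolute grey-level
    change exceeds `thresh` and return the bounding box (x0, y0, x1, y1)
    of the changed area, or None if nothing moved."""
    diff = np.abs(cur_gray.astype(np.int16) - prev_gray.astype(np.int16))
    mask = diff > thresh
    if not mask.any():
        return None
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

A production system would typically maintain a running background model rather than differencing adjacent frames, but the interface is the same: frames in, suspicious-target position out.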
The method performs unmanned aerial vehicle detection using video images acquired by common cameras already deployed in the city. By optimizing the YoloV5 model, interfering factors such as the small size of the unmanned aerial vehicle target and blurred pixels are overcome, and the requirement of detecting small-target unmanned aerial vehicles can be met while saving a large amount of manpower and material resources.
Step 110, splitting the frame image containing the suspicious target into a plurality of candidate regions, and determining the candidate regions containing the suspicious target in the plurality of candidate regions according to the position information of the suspicious target.
In consideration of the computational cost of the deep-learning target detection part, the original image is divided into a plurality of candidate regions, and the candidate region where the small-target unmanned aerial vehicle is located is determined through the motion characteristics of the video target; invalid detection of regions without targets is thereby avoided, achieving the purpose of reducing the computational cost.
And 112, inputting the candidate area image containing the suspicious target into the trained unmanned aerial vehicle detection network, performing unmanned aerial vehicle detection, and outputting a detection result marked with a target frame if the suspicious target is determined to be an unmanned aerial vehicle by detection.
If a plurality of candidate areas contain suspicious targets and the positions of the targets in the original image do not overlap, each candidate area image containing a suspicious target is input into the unmanned aerial vehicle detection network; if a plurality of candidate areas contain suspicious targets that correspond to the same position in the original image, one candidate area image is selected for each target and input into the unmanned aerial vehicle detection network.
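The selection rule above can be sketched as follows; the representations of targets (centre coordinates) and regions (corner coordinates) are illustrative assumptions.

```python
def regions_to_detect(targets, regions):
    """For each suspicious target centre (cx, cy), keep one containing
    candidate region (x0, y0, x1, y1); a region already chosen for a
    target at the same position is not submitted twice."""
    chosen = []
    for cx, cy in targets:
        for i, (x0, y0, x1, y1) in enumerate(regions):
            if x0 <= cx < x1 and y0 <= cy < y1:
                if i not in chosen:
                    chosen.append(i)
                break  # one region per target
    return chosen
```

This keeps the number of deep-learning inferences bounded by the number of distinct targets rather than the number of candidate regions.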
In the unmanned aerial vehicle detection method, the YoloV5 network is improved: the number of down-sampling layers of the backbone network is reduced, the third detection head branch of the original YoloV5 network is removed, and the output of the second detection head branch of the original YoloV5 is further up-sampled and then concatenated with the output of the fourth layer of the backbone network to obtain the output of the first detection head branch; the improved YoloV5 network is trained with a training data set to obtain a trained unmanned aerial vehicle detection network; in the application stage, real-time video data of city monitoring cameras is acquired, a frame image containing a suspicious target is extracted from the real-time video data through a moving target detection algorithm, and the position information of the suspicious target is determined; the frame image containing the suspicious target is split into a plurality of candidate regions, and the candidate regions containing the suspicious target are determined among the plurality of candidate regions according to the position information of the suspicious target; and the candidate region image containing the suspicious target is input into the trained unmanned aerial vehicle detection network for unmanned aerial vehicle detection, and if the suspicious target is determined by detection to be an unmanned aerial vehicle, a detection result marked with a target frame is output. The invention reduces the number of network layers and the down-sampling rate, so that small targets still retain effective information in later network layers, which improves the detection rate of the network for unmanned aerial vehicles; through the operations of moving target detection and image splitting, the computational cost is greatly reduced and the efficiency of the algorithm and the real-time performance of detection are improved.
In one embodiment, the method further comprises the following steps: acquiring a training data set comprising unmanned aerial vehicle targets; performing target labeling on the images in the training data set, and performing manual recheck; and performing at least one of scaling, random cutting and turning operation on the marked image to obtain a preprocessed input image.
In one embodiment, the method further comprises the following steps: and equally dividing the frame image containing the suspicious object into four candidate regions of upper left, upper right, lower left and lower right.
The frame image is divided according to the dividing method of fig. 3(a); a target located on a region boundary may, however, be split across regions and missed.
In one embodiment, the method further comprises the following steps: taking a frame image containing a suspicious target as an original image, equally dividing the original image into four regions of an upper left region, an upper right region, a lower left region and a lower right region to obtain four candidate regions; selecting an area with the size of one fourth of the original image from the image formed by every two adjacent candidate areas of the four candidate areas, and obtaining the four candidate areas; selecting an area with the size of one fourth of the original image from the center of the original image to obtain another candidate area; and obtaining nine candidate areas through picture splitting.
Fig. 3 shows the method of dividing into nine candidate regions: fig. 3(a) shows the original picture divided equally into 4 parts; fig. 3(b) adds, on the basis of fig. 3(a), quarter-size regions of the original image straddling each internal boundary; and fig. 3(c) adds, on the basis of fig. 3(b), a quarter-size region of the original image at the junction of the four regions. This is mainly done to prevent a target located in a boundary region from being segmented and thus going undetected.
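Assuming quarter-size regions expressed as (x0, y0, x1, y1) corner coordinates, the nine-region split of fig. 3 can be sketched as follows (the function name is an assumption for illustration):

```python
def nine_regions(w, h):
    """Nine quarter-size candidate regions of a w x h image:
    4 equal quadrants (fig. 3(a)), 4 regions straddling the internal
    boundaries (fig. 3(b)), and 1 central region (fig. 3(c))."""
    qw, qh = w // 2, h // 2
    regions = [(x, y, x + qw, y + qh)
               for y in (0, qh) for x in (0, qw)]            # fig. 3(a)
    regions += [(qw // 2, 0, qw // 2 + qw, qh),              # top boundary
                (qw // 2, qh, qw // 2 + qw, h),              # bottom boundary
                (0, qh // 2, qw, qh // 2 + qh),              # left boundary
                (qw, qh // 2, w, qh // 2 + qh)]              # right boundary
    regions.append((qw // 2, qh // 2, qw // 2 + qw, qh // 2 + qh))  # centre
    return regions
```

Every quarter-size window overlaps its neighbours by half, so any target smaller than a quarter of the frame lies fully inside at least one candidate region.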
In one embodiment, before inputting the candidate region image containing the suspicious target into the trained unmanned aerial vehicle detection network, the method further comprises: performing super-resolution reconstruction on the candidate region image containing the suspicious target.
In consideration of the fact that the unmanned aerial vehicle target in the air is small, the features of the unmanned aerial vehicle can be enlarged by a super-resolution reconstruction method.
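The description does not specify a particular super-resolution model. As a stand-in that shows only the interface such a step would have, a nearest-neighbour enlargement can be sketched as below; in practice a learned super-resolution network would replace the body of this function.

```python
import numpy as np

def upscale(img, factor=2):
    """Placeholder for super-resolution reconstruction: nearest-neighbour
    enlargement of an image array by `factor` along both spatial axes.
    A learned SR model would replace this in a real deployment."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)
```

Whatever model is used, the detection boxes it produces must later be divided by the same factor when mapped back to the original image.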
In one embodiment, the method further comprises the following steps: inputting the candidate area image containing the suspicious target into the trained unmanned aerial vehicle detection network for unmanned aerial vehicle detection; if the suspicious target is determined by detection to be an unmanned aerial vehicle, mapping the target frame detected in the candidate area image back to the original image, calculating the position of the target relative to the original image, and marking it in the original image; and outputting the picture or video marked with the target frame.
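Mapping a detected target frame from a candidate-region image back to original-image coordinates is simple arithmetic; a sketch follows, in which the parameter names are assumptions for illustration.

```python
def map_box_to_original(box, region_origin, scale=1.0):
    """Map a detection box (x0, y0, x1, y1) from a (possibly up-scaled)
    candidate-region image back to original-image coordinates.
    `region_origin` is the region's top-left corner in the original image;
    `scale` is the up-scaling factor applied before detection."""
    ox, oy = region_origin
    x0, y0, x1, y1 = box
    return (int(x0 / scale) + ox, int(y0 / scale) + oy,
            int(x1 / scale) + ox, int(y1 / scale) + oy)
```

For example, a box detected in a 2x super-resolved region whose top-left corner sits at (400, 300) in the original frame is first halved, then offset by (400, 300).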
In one embodiment, after outputting the picture or video marked with the target frame, the method further includes: and storing the video image frame of the unmanned aerial vehicle target.
The video frames in which the unmanned aerial vehicle target appears are recorded and provided for subsequent review.
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited in order and may be performed in other orders. Moreover, at least a portion of the steps in fig. 1 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times; the order of performance of these sub-steps or stages is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In another embodiment, as shown in fig. 4, a drone detection method is provided that includes a training phase and an application phase. The training phase performs network training after labeling and preprocessing the acquired data, and stores the model once the application requirements are met. In the application phase, preliminary moving target detection is carried out through a background-difference algorithm; if a target exists, the picture is split, candidate regions are extracted and super-resolution reconstruction is performed, target detection is then carried out with the trained unmanned aerial vehicle detection model, and an early warning is issued when an unmanned aerial vehicle is detected.
In one embodiment, as shown in fig. 5, an unmanned aerial vehicle detection device is provided, including: a training set acquisition module 502, a training module 504, a suspicious target determination module 506, a candidate region determination module 508, and an unmanned aerial vehicle detection module 510, wherein:
a training set obtaining module 502, configured to obtain a training data set including an unmanned aerial vehicle target in a training phase, and obtain an input image after labeling and preprocessing;
a training module 504, configured to input the input image into an improved YoloV5 network for unmanned aerial vehicle detection; the improved YoloV5 network reduces the number of down-sampling layers of the backbone network, removes the third detection head branch of the original YoloV5 network, and further up-samples the output of the second detection head branch of the original YoloV5 and then concatenates it with the output of the fourth layer of the backbone network to obtain the output of the first detection head branch; and to train the improved YoloV5 network with a training data set to obtain a trained unmanned aerial vehicle detection network;
a suspicious target determining module 506, configured to, in an application stage, obtain real-time video data of the city monitoring camera, extract a frame image including a suspicious target from the real-time video data through a moving target detection algorithm, and determine position information of the suspicious target;
a candidate region determining module 508, configured to split the frame image containing the suspicious target into a plurality of candidate regions, and determine, according to the position information of the suspicious target, a candidate region containing the suspicious target in the plurality of candidate regions;
and the unmanned aerial vehicle detection module 510 is configured to input the candidate area image containing the suspicious target into a trained unmanned aerial vehicle detection network, perform unmanned aerial vehicle detection, and output a detection result labeled with a target frame if the suspicious target is determined to be an unmanned aerial vehicle by detection.
The training set acquisition module 502 is further configured to acquire a training data set including the drone target; performing target labeling on the images in the training data set, and performing manual recheck; and performing at least one of scaling, random cutting and turning operation on the marked image to obtain a preprocessed input image.
The candidate region determination module 508 is further configured to divide the frame image containing the suspicious object into four candidate regions, i.e., upper left, upper right, lower left, and lower right.
The candidate region determining module 508 is further configured to divide the original image into four regions, i.e., an upper left region, an upper right region, a lower left region, and a lower right region, by using the frame image containing the suspicious object as the original image, so as to obtain four candidate regions; selecting an area with the size of one fourth of the original image from the image formed by every two adjacent candidate areas of the four candidate areas, and obtaining the four candidate areas; selecting an area of one fourth of the size of the original image from the center of the original image to obtain another candidate area; and obtaining nine candidate areas through picture splitting.
The unmanned aerial vehicle detection module 510 is further configured to perform super-resolution reconstruction on the candidate region image containing the suspicious target before inputting it into the trained unmanned aerial vehicle detection network.
The unmanned aerial vehicle detection module 510 is further configured to input the candidate area image containing the suspicious target into a trained unmanned aerial vehicle detection network for unmanned aerial vehicle detection; if the suspicious target is detected and judged to be the unmanned aerial vehicle, a target frame detected in the candidate area image is mapped back to an original image, the position of the target relative to the original image is calculated, and the target is marked in the original image; and outputting the picture or the video marked with the target frame.
The unmanned aerial vehicle detection module 510 is further configured to store the video image frames in which the unmanned aerial vehicle target appears, after outputting the picture or video marked with the target frame.
For specific limitations of the drone detecting device, reference may be made to the above limitations of the drone detecting method, which are not described herein again. Each module in the unmanned aerial vehicle detection device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a drone detection method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply; a particular computing device may include more or fewer components than those shown, or combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent application shall be subject to the appended claims.