CN117576597B - Visual identification method and system based on unmanned aerial vehicle driving - Google Patents
- Publication number: CN117576597B
- Application number: CN202410051695.XA
- Authority: CN (China)
- Legal status: Active
Classifications
- G06V20/17 — Terrestrial scenes taken from planes or by drones
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06V10/20 — Image preprocessing
- G06V10/30 — Noise filtering
- G06V10/40 — Extraction of image or video features
- G06V10/757 — Matching configurations of points or features
- G06V10/764 — Recognition using classification, e.g. of video objects
- G06V10/82 — Recognition using neural networks
- G06V2201/07 — Target detection
- Y02T10/40 — Engine management systems
Abstract
The invention relates to the technical field of visual recognition and discloses a visual recognition method and system based on unmanned aerial vehicle driving. The method comprises: collecting sample image data to establish a sample set and performing feature extraction through a neural network; setting an unmanned aerial vehicle recognition area and operating the unmanned aerial vehicle to collect image data in the recognition area in real time; preprocessing the image data collected in real time; performing feature extraction on the collected image data through the neural network; comparing the feature data of the real-time image data with the feature data of each classified image in the sample set and completing recognition of the real-time image data based on the comparison result; and setting an evaluation index used to adjust the visual recognition accuracy of unmanned aerial vehicle driving. The system can recognize, in real time, the image data the unmanned aerial vehicle collects in the recognition area.
Description
Technical Field
The invention relates to the technical field of visual identification, in particular to a visual identification method and system based on unmanned aerial vehicle driving.
Background
As target detection technology has matured, it has been widely applied in traditional fixed monitoring equipment. Such equipment, however, suffers from poor flexibility and a small monitoring range: it struggles to recognize scattered, moving multi-target scenes and is generally limited to closed indoor settings such as supermarkets and shopping malls. Unmanned aerial vehicles, by contrast, are easy to operate, flexible, highly maneuverable, and cover a wide monitoring range, and can therefore compensate for these shortcomings. Yet in tasks where an unmanned aerial vehicle must recognize multiple moving targets, beyond the classic detection difficulties of occlusion, viewpoint change, and lighting change, target scale variation and the maneuvering of both the unmanned aerial vehicle and the targets degrade image quality. Moreover, because the vehicle's high agility limits its payload, onboard computing resources are scarce; detection models deployed on such a platform struggle to run in real time, which hinders progress toward unmanned aerial vehicle autonomy.
Prior art CN115880589A recognizes and evaluates the targets acquired by the unmanned aerial vehicle only through neural-network training, ignoring the adverse effect of unmanned aerial vehicle and target maneuvering on image quality; the method is therefore significantly limited.
Disclosure of Invention
(one) solving the technical problems
Aiming at the defects of the prior art, the invention provides a visual recognition system based on unmanned aerial vehicle driving, which offers accurate recognition and solves the problem of the adverse effect of unmanned aerial vehicle and target maneuvering on image quality.
To solve this technical problem, the invention provides the following technical scheme:
the embodiment discloses a visual identification method based on unmanned aerial vehicle driving, which specifically comprises the following steps:
s1, establishing a sample set based on collected sample image data, and inputting the sample set into a neural network for feature extraction;
s2, setting an unmanned aerial vehicle identification area, and operating the unmanned aerial vehicle to collect image data in the identification area in real time;
s3, preprocessing image data acquired by the unmanned aerial vehicle in real time;
s4, inputting the preprocessed image data into a neural network, and extracting features of the image data acquired in real time based on the neural network;
s5, comparing the characteristic data in the real-time collected image data with the characteristic data in each classified image data in the sample set, and completing the identification of the real-time collected image data based on the comparison result;
s6, setting an evaluation index, and adjusting the visual recognition accuracy of unmanned aerial vehicle driving based on the evaluation index;
preferably, the establishing a sample set based on the collected sample image data, and inputting the sample set into the neural network for feature extraction includes:
s11, classifying the image data in the sample set according to the content;
s12, sequentially inputting the classified image data into a neural network for feature extraction, and extracting feature data in each classified image data;
and S13, storing the feature data of the extracted classified image data.
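As a minimal illustration of steps S11–S13, the classified sample features can be held in a mapping from class label to a list of feature vectors. The class names and the `extract_features` stand-in below are hypothetical placeholders for the neural-network extractor of S12, not part of the patent:

```python
import numpy as np

def extract_features(image: np.ndarray) -> np.ndarray:
    # Placeholder for the neural-network feature extractor of S12:
    # a coarse, normalized gray-level histogram stands in for learned features.
    hist, _ = np.histogram(image, bins=16, range=(0, 256))
    return hist / max(hist.sum(), 1)

def build_sample_set(labeled_images):
    # S11: group images by class label; S12-S13: extract and store features.
    store = {}
    for label, image in labeled_images:
        store.setdefault(label, []).append(extract_features(image))
    return store

rng = np.random.default_rng(0)
samples = [("vehicle", rng.integers(0, 256, (8, 8))),
           ("person", rng.integers(0, 256, (8, 8)))]
sample_set = build_sample_set(samples)
```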
Preferably, the preprocessing the image data collected by the unmanned aerial vehicle in real time includes:
s31, filtering processing is carried out on image data acquired in real time;
s32, performing contrast enhancement on the filtered image data.
Preferably, the filtering processing of the image data acquired in real time includes:
Set p as the central pixel based on the picture center, and let S be the neighborhood set of the central pixel, with q belonging to the neighborhood set S;

The filtered gray value at p is

$$BF_p = \frac{1}{W_p}\sum_{q \in S} G_s\big(\lVert p - q \rVert\big)\, G_r\big(\lvert I_p - I_q \rvert\big)\, I_q$$

with the Gaussian weights and normalization

$$G_s = e^{-\frac{\lVert p - q \rVert^2}{2\sigma_s^2}}, \qquad G_r = e^{-\frac{(I_p - I_q)^2}{2\sigma_r^2}}, \qquad W_p = \sum_{q \in S} G_s\, G_r$$

wherein e is the natural constant; I_p is the gray value of point p = (x₁, y₁) in image I, with (x₁, y₁) the coordinates of the pixel point, and I_q is the gray value of point q = (u₁, v₁); σ_s represents the standard deviation of the spatial distance in the Gaussian function and σ_r the standard deviation of the pixel value; G_s denotes the spatial-distance weight and G_r the pixel-value weight; BF_p represents the gray value of point p in the filtered image, and W_p is the sum of the weights over the pixel values of the q points.
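The bilateral filter above can be sketched directly in NumPy. This is a minimal, unoptimized reference; the window radius and the σ_s, σ_r defaults are illustrative assumptions, not values from the patent:

```python
import numpy as np

def bilateral_filter(img, radius=2, sigma_s=2.0, sigma_r=25.0):
    """Bilateral filtering of a grayscale image, following the BF_p formula."""
    img = img.astype(np.float64)
    h, w = img.shape
    pad = np.pad(img, radius, mode="edge")
    # Precompute the spatial Gaussian weights G_s over the (2r+1)^2 window.
    ax = np.arange(-radius, radius + 1)
    dx, dy = np.meshgrid(ax, ax)
    Gs = np.exp(-(dx**2 + dy**2) / (2 * sigma_s**2))
    out = np.empty_like(img)
    for y in range(h):
        for x in range(w):
            window = pad[y:y + 2*radius + 1, x:x + 2*radius + 1]
            # Range weights G_r depend on the gray-value difference I_p - I_q.
            Gr = np.exp(-((window - img[y, x])**2) / (2 * sigma_r**2))
            wts = Gs * Gr
            out[y, x] = (wts * window).sum() / wts.sum()  # BF_p / W_p
    return out
```

On a constant image the filter acts as the identity, which is a quick sanity check for the normalization term W_p.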
Preferably, the contrast enhancement of the filtered image data includes:
S321, establishing a linear mapping relation of the image data based on brightness in the filtered image data:

$$L_{out1}(x, y) = Q \cdot L_{in1}(x, y)$$

wherein L_in1(x, y) represents the pixel value of a local pixel block in the image data before contrast enhancement, L_out1(x, y) represents the pixel value of the local pixel block after contrast enhancement, and Q is a contrast gain parameter;

Setting a brightness threshold in the image data, and computing the local luminance average

$$L_{avg1} = \frac{1}{N}\sum_{(x, y)} L_{in1}(x, y)$$

wherein L_avg1 represents the luminance average over the N pixels of a local pixel block in the image data;

When the brightness in the image is lower than the set threshold, 0 < Q < 1 is satisfied; when the brightness in the image is higher than or equal to the set threshold, Q > 1 is satisfied;
Setting a sliding window and carrying out contrast enhancement on pixels based on the average value and variance of brightness in the window: an n × n sliding window slides over the image, and the mean and variance of the brightness within the window are computed to enhance the contrast of the pixels:

$$L_{out2}(x, y) = L_{avg2} + \eta\,\big(L_{in2}(x, y) - L_{avg2}\big)$$

wherein L_avg2 represents the luminance average of the local pixel block within the sliding window, L_in2(x, y) represents the pixel value of the local pixel block in the sliding window before contrast enhancement, and L_out2(x, y) represents the pixel value after contrast enhancement;

The mean and variance of the brightness in the sliding window are respectively:

$$L_{avg2} = \frac{1}{n^2}\sum_{(x, y) \in W} L_{in2}(x, y), \qquad \sigma^2 = \frac{1}{n^2}\sum_{(x, y) \in W} \big(L_{in2}(x, y) - L_{avg2}\big)^2$$

The enhancement coefficient η is:

$$\eta = \frac{k}{\sigma}$$

where k is a set constant and η is the enhancement coefficient.
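The sliding-window enhancement can be sketched as below. The output form L_out2 = L_avg2 + η·(L_in2 − L_avg2) with η = k/σ is one plausible reading of the garbled patent formulas (it is standard adaptive contrast enhancement); the window size n and constant k are illustrative:

```python
import numpy as np

def local_contrast_enhance(img, n=3, k=10.0, eps=1e-6):
    """Sliding-window contrast enhancement: L_out = L_avg + eta * (L_in - L_avg),
    with eta = k / sigma computed per n x n window."""
    img = img.astype(np.float64)
    r = n // 2
    pad = np.pad(img, r, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            win = pad[y:y + n, x:x + n]
            avg = win.mean()            # L_avg2: window luminance mean
            sigma = win.std()           # square root of the window variance
            eta = k / (sigma + eps)     # enhancement coefficient
            out[y, x] = avg + eta * (img[y, x] - avg)
    return np.clip(out, 0, 255)
```

A flat region has zero deviation from its local mean, so it passes through unchanged regardless of η.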
Preferably, the inputting the preprocessed image data into the neural network, and performing feature extraction on the real-time acquired image data based on the neural network includes:
s41, dividing an input image into L small-area image data blocks;
s42, after receiving an input image data block, the convolution layer moves on the input image data block through a convolution kernel according to a set step length, and performs multiply accumulation on a corresponding area of each step and a characteristic value of the area, so that characteristic extraction of each image data block is realized;
The convolution calculation formula is as follows:

$$f = \sum_{i} x_i\, w_i + b$$

wherein x_i represents the input features, w_i represents the weights of the corresponding convolution kernel, b indicates the offset value, and f represents the output feature;
s43, in the neural network, the output of the upper layer is used as the input of the lower layer, and the convolutional neural network is formed by continuously stacking;
the data must be subjected to the process of activating the function during the process of inputting the data to the lower layer;
Setting the upper-layer output as the input of the lower layer, with input values x_λ (λ = 1, 2, …, L), each input value x_λ having the corresponding input weight w_λ and b being an offset, the output result obtained after the input values are fed into the neural network is:

$$y = \varphi\Big(\sum_{\lambda=1}^{L} x_\lambda\, w_\lambda + b\Big)$$

wherein φ is the corresponding activation function and y is the output result;
s44, convolving the characteristics of the image data blocks in the continuously stacked convolution and pooling processes;
s45, transmitting the characteristics of the convolved image data block into a full connection layer;
s46, unfolding and combining the feature data through the full connection layer to obtain a feature array, and storing the feature array.
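Steps S42–S46 can be sketched as a tiny NumPy forward pass: a strided multiply-accumulate convolution, a ReLU activation between layers, max pooling, and a flatten into the feature array. The kernel values are illustrative, not the patent's trained weights:

```python
import numpy as np

def conv2d(x, w, b, stride=1):
    """f = sum(x_i * w_i) + b over each strided window (step S42)."""
    kh, kw = w.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = (patch * w).sum() + b
    return out

def relu(x):
    # Activation applied when passing data to the lower layer (step S43).
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    # Pooling over non-overlapping size x size blocks (step S44).
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

# Stacked layers, then flatten into the feature array of S46.
img = np.arange(36, dtype=np.float64).reshape(6, 6)
kernel = np.array([[0.0, -1.0], [1.0, 0.0]])
features = max_pool(relu(conv2d(img, kernel, b=0.0))).ravel()
```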
Preferably, the comparing the feature data in the real-time collected image data with the feature data in each classified image data in the sample set, and the identifying the real-time collected image data based on the comparison result includes:
s51, comparing based on the photographed multi-frame images;
Transmitting the shot images to a neural network for feature extraction, identifying objects, storing feature points, and obtaining the coordinate position of the target object in each frame image; then extracting the features of another frame image to obtain a new coordinate position, comparing it with the feature points of the previous frame, and judging the two frames to contain the same object when the coincidence of feature points is higher than 80%;
s52, calculating the similarity between the detection target and the target to be detected after the feature extraction;
judging the similarity between the detection target and the target to be detected by calculating the distance between the features;
Cosine distance:

$$\cos\theta(\vec{a}, \vec{b}) = \frac{\sum_{j=1}^{J} x_j\, y_j}{\sqrt{\sum_{j=1}^{J} x_j^2}\,\sqrt{\sum_{j=1}^{J} y_j^2}}, \qquad D_{cos} = 1 - \cos\theta(\vec{a}, \vec{b})$$

wherein cos θ(a⃗, b⃗) is the cosine similarity between vector a⃗ and vector b⃗, D_cos is the cosine distance, x_j and y_j denote the j-th components of the two feature vectors, and J represents the number of components;
Inquiring h targets to be detected among d detection targets, the feature of each target is set as a Θ-dimensional vector;
forming a detection feature matrix G by using feature vectors of all detection targets, forming a feature matrix H to be detected by using the feature vectors of all targets to be detected, and multiplying the detection feature matrix G by the feature matrix H to be detected to obtain a cosine similarity matrix C;
For a target to be detected, the detection targets are arranged in descending order of the cosine similarity between the target's features and each detection target's features, and the c top-ranked detection targets are retained; let z_α denote the feature similarity at rank α, and t_β denote the β-th detection target;
Setting an upper limit and a lower limit of the similarity threshold for a successful match;
When the first-ranked feature similarity z₁ is less than the lower limit of the similarity threshold, the match fails and the target to be detected is set as a newly appearing target;
When the first-ranked feature similarity z₁ is greater than the upper limit of the similarity threshold, the match succeeds: the similarity between detection target t₁ and the target to be detected is high enough, and the target to be detected is directly judged to be detection target t₁;
When z₁ lies between the lower and upper limits of the similarity threshold, the match succeeds as follows: count the detection targets in the ranking results whose similarity is above the lower limit, set them as valid matches, compute the proportion of each detection-target category among the valid matches, and judge the target to be detected as the category with the highest proportion.
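The matching procedure above can be sketched as follows: a cosine-similarity matrix from L2-normalized feature matrices G and H, then the three-way threshold decision. The upper/lower threshold values and labels are illustrative assumptions:

```python
import numpy as np

def cosine_matrix(G, H):
    """Cosine similarity between every detection feature (rows of G)
    and every feature to be detected (rows of H)."""
    Gn = G / np.linalg.norm(G, axis=1, keepdims=True)
    Hn = H / np.linalg.norm(H, axis=1, keepdims=True)
    return Gn @ Hn.T

def match_target(sims, labels, upper=0.9, lower=0.5, top_c=3):
    """Decide the class of one target from its similarity column `sims`."""
    order = np.argsort(sims)[::-1][:top_c]   # descending order, keep top c
    ranked = sims[order]
    if ranked[0] < lower:
        return "new-target"                  # z1 below the lower limit: match fails
    if ranked[0] > upper:
        return labels[order[0]]              # z1 above the upper limit: direct match
    # In-between: majority class among valid matches (similarity >= lower limit).
    valid = [labels[i] for i, s in zip(order, ranked) if s >= lower]
    return max(set(valid), key=valid.count)

G = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])   # detection features
labels = ["car", "person", "car"]
h = np.array([[1.0, 0.05]])                          # one target to be detected
C = cosine_matrix(G, h)
result = match_target(C[:, 0], labels)
```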
Preferably, the setting the evaluation index, and adjusting the visual recognition accuracy of the unmanned aerial vehicle driving based on the evaluation index includes:
Calculating the accuracy of visual recognition under the current resolution based on the photographed multi-frame images:

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}$$

wherein TP represents the number of recognitions consistent with the actual result, FP represents the number of recognitions inconsistent with the actual result, and FN represents the number of missed samples; P is the precision rate and R is the recall rate;
Further, for a single category, the curve relating the precision rate to the recall rate is the P-R curve; the area enclosed by the curve and the horizontal axis is the average precision AP, with the calculation formula:

$$AP = \int_0^1 P(R)\, dR$$

The mean average precision mAP is set as the mean of the APs of the several categories, with the calculation formula:

$$mAP = \frac{1}{m}\sum_{i=1}^{m} AP_i$$

wherein m is the number of target categories and mAP is the mean of the APs over the m categories;
setting an average accuracy average threshold value of visual identification; and when the calculated average accuracy average value is lower than a set average accuracy average value threshold value of visual identification, improving the resolution of the image shot by the unmanned aerial vehicle.
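The evaluation index of S6 can be sketched as below: precision and recall from the confusion counts, AP as the area under the P-R curve (trapezoidal approximation), and mAP as the per-class mean. The sample counts are illustrative:

```python
import numpy as np

def precision_recall(tp, fp, fn):
    """P = TP/(TP+FP), R = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(recalls, precisions):
    """AP = area under the P-R curve, by trapezoidal integration over recall."""
    order = np.argsort(recalls)
    rs = np.asarray(recalls, dtype=float)[order]
    ps = np.asarray(precisions, dtype=float)[order]
    return float(np.sum(np.diff(rs) * (ps[1:] + ps[:-1]) / 2))

def mean_average_precision(aps):
    """mAP = mean of the per-category APs."""
    return float(np.mean(aps))

p, r = precision_recall(tp=8, fp=2, fn=2)                  # P = 0.8, R = 0.8
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 1.0])   # full rectangle
m = mean_average_precision([ap, 0.5])
```

When the computed mAP falls below the set threshold, the system would raise the resolution of the images shot by the unmanned aerial vehicle, as described above.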
The embodiment also discloses a visual identification system based on unmanned aerial vehicle driving, specifically includes: the device comprises an image acquisition module, an image processing module, a display module and an image recognition module;
the image acquisition module is used for shooting image data in the current planning area in real time and transmitting the image data to the image processing module in real time;
the image processing module is used for processing the image data transmitted by the image acquisition module and transmitting the processed image to the image recognition module;
the image recognition module is used for recognizing the processed image data;
The display module is used for displaying the recognition result output by the image recognition module.
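The four modules form a simple pipeline: acquisition, processing, recognition, display. A minimal sketch, with hypothetical class and stage names (the callables stand in for the real camera, preprocessing, and recognition components):

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class VisionSystem:
    """Pipeline mirroring the four modules: acquisition -> processing
    -> recognition -> display. Each stage is a pluggable callable."""
    acquire: Callable[[], object]
    process: Callable[[object], object]
    recognize: Callable[[object], str]
    displayed: List[str] = field(default_factory=list)

    def step(self) -> str:
        frame = self.acquire()            # image acquisition module
        cleaned = self.process(frame)     # image processing module
        label = self.recognize(cleaned)   # image recognition module
        self.displayed.append(label)      # display module shows the result
        return label

system = VisionSystem(
    acquire=lambda: [3, 1, 2],
    process=lambda f: sorted(f),
    recognize=lambda f: "target" if f[0] == 1 else "none",
)
result = system.step()
```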
Compared with the prior art, the invention provides a visual recognition system based on unmanned aerial vehicle driving, which has the following beneficial effects:
1. according to the invention, the sample set is established by collecting sample image data, and the identification of the sample set is completed in a mode of extracting characteristics through a neural network, so that the identification efficiency of the image data is improved.
2. The invention sets the unmanned aerial vehicle identification area and preprocesses the image data collected in real time within it: filtering reduces the influence of image noise on recognition efficiency, while contrast enhancement highlights the difference between object and background, improving recognition efficiency while guaranteeing recognition accuracy.
3. According to the invention, whether the image data collected in real time is matched with the image data in the sample set or not is judged by carrying out feature extraction on the image data collected in real time and comparing the image data with the feature data in the sample set extracted by the neural network, so that the accuracy of image identification is ensured.
4. According to the invention, through a multi-frame shooting mode, the feature extraction is carried out on a plurality of pieces of continuously shot image data, and a feature point comparison mode is carried out to judge whether the images are the same identification target, so that the accuracy and the reliability of identification are improved.
5. The invention comprehensively judges whether the detection target is matched with the target to be detected or not by judging the similarity between the detection target and the target to be detected, thereby improving the image recognition efficiency.
Drawings
Fig. 1 is a schematic diagram of a visual identification flow of unmanned aerial vehicle driving.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment discloses a visual identification method based on unmanned aerial vehicle driving, which specifically comprises the following steps:
s1, establishing a sample set based on collected sample image data, and inputting the sample set into a neural network for feature extraction;
s11, classifying the image data in the sample set according to the content;
s12, sequentially inputting the classified image data into a neural network for feature extraction, and extracting feature data in each classified image data;
s13, storing the feature data of the extracted classified image data;
s2, setting an unmanned aerial vehicle identification area, and operating the unmanned aerial vehicle to collect image data in the identification area in real time;
s3, preprocessing image data acquired by the unmanned aerial vehicle in real time;
preprocessing image data acquired by the unmanned aerial vehicle in real time comprises the following steps:
s31, filtering processing is carried out on image data acquired in real time;
Set p as the central pixel based on the picture center, and let S be the neighborhood set of the central pixel, with q belonging to the neighborhood set S;

The filtered gray value at p is

$$BF_p = \frac{1}{W_p}\sum_{q \in S} G_s\big(\lVert p - q \rVert\big)\, G_r\big(\lvert I_p - I_q \rvert\big)\, I_q$$

with the Gaussian weights and normalization

$$G_s = e^{-\frac{\lVert p - q \rVert^2}{2\sigma_s^2}}, \qquad G_r = e^{-\frac{(I_p - I_q)^2}{2\sigma_r^2}}, \qquad W_p = \sum_{q \in S} G_s\, G_r$$

wherein e is the natural constant; I_p is the gray value of point p = (x₁, y₁) in image I, with (x₁, y₁) the coordinates of the pixel point, and I_q is the gray value of point q = (u₁, v₁); σ_s represents the standard deviation of the spatial distance in the Gaussian function and σ_r the standard deviation of the pixel value; G_s denotes the spatial-distance weight and G_r the pixel-value weight; BF_p represents the gray value of point p in the filtered image, and W_p is the sum of the weights over the pixel values of the q points;
s32, carrying out contrast enhancement on the filtered image data;
S321, establishing a linear mapping relation of the image data based on brightness in the filtered image data:

$$L_{out1}(x, y) = Q \cdot L_{in1}(x, y)$$

wherein L_in1(x, y) represents the pixel value of a local pixel block in the image data before contrast enhancement, L_out1(x, y) represents the pixel value of the local pixel block after contrast enhancement, and Q is a contrast gain parameter;

Setting a brightness threshold in the image data, and computing the local luminance average

$$L_{avg1} = \frac{1}{N}\sum_{(x, y)} L_{in1}(x, y)$$

wherein L_avg1 represents the luminance average over the N pixels of a local pixel block in the image data;

When the brightness in the image is lower than the set threshold, 0 < Q < 1 is satisfied; when the brightness in the image is higher than or equal to the set threshold, Q > 1 is satisfied;
Setting a sliding window and carrying out contrast enhancement on pixels based on the average value and variance of brightness in the window: an n × n sliding window slides over the image, and the mean and variance of the brightness within the window are computed to enhance the contrast of the pixels:

$$L_{out2}(x, y) = L_{avg2} + \eta\,\big(L_{in2}(x, y) - L_{avg2}\big)$$

wherein L_avg2 represents the luminance average of the local pixel block within the sliding window, L_in2(x, y) represents the pixel value of the local pixel block in the sliding window before contrast enhancement, and L_out2(x, y) represents the pixel value after contrast enhancement;

The mean and variance of the brightness in the sliding window are respectively:

$$L_{avg2} = \frac{1}{n^2}\sum_{(x, y) \in W} L_{in2}(x, y), \qquad \sigma^2 = \frac{1}{n^2}\sum_{(x, y) \in W} \big(L_{in2}(x, y) - L_{avg2}\big)^2$$

The enhancement coefficient η is:

$$\eta = \frac{k}{\sigma}$$

where k is a set constant and η is the enhancement coefficient.
S4, inputting the preprocessed image data into a neural network, and extracting features of the image data acquired in real time based on the neural network;
s41, dividing an input image into L small-area image data blocks;
s42, after receiving an input image data block, the convolution layer moves on the input image data block through a convolution kernel according to a set step length, and performs multiply accumulation on a corresponding area of each step and a characteristic value of the area, so that characteristic extraction of each image data block is realized;
The convolution calculation formula is as follows:

$$f = \sum_{i} x_i\, w_i + b$$

wherein x_i represents the input features, w_i represents the weights of the corresponding convolution kernel, b indicates the offset value, and f represents the output feature;
s43, in the neural network, the output of the upper layer is used as the input of the lower layer, and the convolutional neural network is formed by continuously stacking;
the data must be subjected to the process of activating the function during the process of inputting the data to the lower layer;
Setting the upper-layer output as the input of the lower layer, with input values x_λ (λ = 1, 2, …, L), each input value x_λ having the corresponding input weight w_λ and b being an offset, the output result obtained after the input values are fed into the neural network is:

$$y = \varphi\Big(\sum_{\lambda=1}^{L} x_\lambda\, w_\lambda + b\Big)$$

wherein φ is the corresponding activation function and y is the output result;
s44, convolving the characteristics of the image data blocks in the continuously stacked convolution and pooling processes;
s45, transmitting the characteristics of the convolved image data block into a full connection layer;
s46, unfolding and combining the characteristic data through the full connection layer to obtain a characteristic array, and storing the characteristic array;
s5, comparing the characteristic data in the real-time collected image data with the characteristic data in each classified image data in the sample set, and completing the identification of the real-time collected image data based on the comparison result;
s51, comparing based on the photographed multi-frame images;
Transmitting the shot image to a neural network for feature extraction, identifying the object, storing the feature points, and obtaining the coordinate position of the target object in one frame image; then extracting the features of another frame image to obtain a new coordinate position, comparing it with the feature points of the previous frame, and judging the two frames to contain the same object when the coincidence of feature points is higher than 80%;
s52, calculating the similarity between the detection target and the target to be detected after the feature extraction;
further, the similarity between the detection target and the target to be detected is judged by calculating the distance between the features;
Cosine distance:

$$\cos\theta(\vec{a}, \vec{b}) = \frac{\sum_{j=1}^{J} x_j\, y_j}{\sqrt{\sum_{j=1}^{J} x_j^2}\,\sqrt{\sum_{j=1}^{J} y_j^2}}, \qquad D_{cos} = 1 - \cos\theta(\vec{a}, \vec{b})$$

wherein cos θ(a⃗, b⃗) is the cosine similarity between vector a⃗ and vector b⃗, D_cos is the cosine distance, x_j and y_j denote the j-th components of the two feature vectors, and J represents the number of components;
Further, inquiring h targets to be detected among d detection targets, the feature of each target is set as a Θ-dimensional vector;
forming a detection feature matrix G by using feature vectors of all detection targets, forming a feature matrix H to be detected by using the feature vectors of all targets to be detected, and multiplying the detection feature matrix G by the feature matrix H to be detected to obtain a cosine similarity matrix C;
further, for each target to be detected, the detection targets are arranged in descending order of the cosine similarity between the features of the target to be detected and the features of each detection target, and the top c detection targets are retained; z_α denotes the feature similarity at rank α, and t_β denotes the β-th detection target;
further, setting an upper limit and a lower limit of the similarity threshold for successful matching;

further, when the first-ranked feature similarity z_1 is less than the lower limit of the similarity threshold, matching fails, and the target to be detected is set as a newly appearing target;

when the first-ranked feature similarity z_1 is greater than the upper limit of the similarity threshold, matching succeeds, indicating that detection target t_1 is sufficiently similar to the target to be detected, and the target to be detected is directly judged to be detection target t_1;

when z_1 is less than the upper limit and greater than the lower limit of the similarity threshold, matching succeeds; the detection targets whose similarity is above the lower limit of the similarity threshold are counted in the ranking result and set as valid matches, the proportion of each detection-target category among the valid matches is calculated, and the target to be detected is judged to be the detection-target category with the highest proportion;
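The two-threshold matching rule above can be sketched as follows (a hypothetical helper, assuming the ranked list already holds (similarity, category) pairs in descending order):

```python
from collections import Counter

def match_target(ranked, lower, upper):
    """Decide a match from detection targets ranked by descending similarity.

    ranked: list of (similarity, category) pairs, highest similarity first.
    Returns the matched category, or None for a newly appearing target.
    """
    z1, label1 = ranked[0]
    if z1 < lower:            # below the lower limit: matching fails
        return None
    if z1 > upper:            # above the upper limit: direct, confident match
        return label1
    # ambiguous band: vote among valid matches above the lower limit
    valid = [label for sim, label in ranked if sim >= lower]
    return Counter(valid).most_common(1)[0][0]
```

In the ambiguous band the decision falls to the category holding the highest proportion of the valid matches, as the text describes.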
s6, setting an evaluation index, and adjusting the visual recognition accuracy of unmanned aerial vehicle driving based on the evaluation index;
calculating the accuracy of visual recognition under the current resolution based on the photographed multi-frame images;

P = TP / (TP + FP);

R = TP / (TP + FN);

wherein TP represents the number of identifications consistent with the actual result; FP represents the number of identifications inconsistent with the actual result; FN represents the number of missed samples; P is the precision rate, R is the recall rate;
further, for a single category, the curve of precision rate against recall rate is set as the P-R curve; the area enclosed by the curve and the horizontal axis is the average precision AP, and the calculation formula is:

AP = ∫₀¹ P(R) dR;
further, setting the mean average precision mAP as the mean of the average precisions AP of the m categories, with calculation formula:

mAP = (1/m) · Σ_{i=1}^{m} AP_i;

wherein m is the number of target categories and mAP is the mean of the average precisions AP of the m categories;
setting a mean-average-precision (mAP) threshold for visual recognition; when the calculated mAP is lower than the set threshold, the resolution of the images shot by the unmanned aerial vehicle is increased;
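A minimal sketch of the evaluation indices of step S6 (precision, recall, AP as area under the P-R curve, and mAP), with hypothetical function names; the P-R curve is approximated here by trapezoidal integration over sampled (recall, precision) points:

```python
def precision_recall(tp, fp, fn):
    """P = TP/(TP+FP), R = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(pr_points):
    """Trapezoidal area under a P-R curve given as (recall, precision) points."""
    pts = sorted(pr_points)          # sort by increasing recall
    ap = 0.0
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        ap += (r1 - r0) * (p0 + p1) / 2.0
    return ap

def mean_average_precision(aps):
    """mAP = mean of the per-category AP values."""
    return sum(aps) / len(aps)
```

A controller could then compare `mean_average_precision(...)` against the set threshold and raise the camera resolution when it falls short, as the text prescribes.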
the embodiment also discloses a visual identification system based on unmanned aerial vehicle driving, specifically includes: the device comprises an image acquisition module, an image processing module, a display module and an image recognition module;
the image acquisition module is used for shooting image data in the current planning area in real time and transmitting the image data to the image processing module in real time;
the image processing module is used for processing the image data transmitted by the image acquisition module and transmitting the processed image to the image recognition module;
the image recognition module is used for recognizing the processed image data;
the display module is used for displaying the recognition result output by the image recognition module.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (7)
1. The visual identification method based on unmanned aerial vehicle driving is characterized by comprising the following steps of:
s1, establishing a sample set based on collected sample image data, and inputting the sample set into a neural network for feature extraction;
s2, setting an unmanned aerial vehicle identification area, and operating the unmanned aerial vehicle to collect image data in the identification area in real time;
s3, preprocessing image data acquired by the unmanned aerial vehicle in real time;
s4, inputting the preprocessed image data into a neural network, and extracting features of the image data acquired in real time based on the neural network;
s5, comparing the characteristic data in the real-time collected image data with the characteristic data in each classified image data in the sample set, and completing the identification of the real-time collected image data based on the comparison result;
s6, setting an evaluation index, and adjusting the visual recognition accuracy of unmanned aerial vehicle driving based on the evaluation index;
the preprocessing of the image data acquired by the unmanned aerial vehicle in real time comprises the following steps:
s31, filtering processing is carried out on image data acquired in real time;
s32, carrying out contrast enhancement on the filtered image data;
the contrast enhancement of the filtered image data includes:
s321, establishing a linear mapping relation of the image data based on brightness in the filtered image data;
L_out1(x, y) = Q · L_in1(x, y);

wherein L_in1(x, y) represents the pixel value of a local pixel block in the image data before contrast enhancement, L_out1(x, y) represents the pixel value of the local pixel block in the image data after contrast enhancement, and Q is a contrast gain parameter;
setting a brightness threshold in the image data;
setting T = L_avg1;

wherein T is the set brightness threshold and L_avg1 represents the luminance average of a local pixel block in the image data;
when the brightness in the image is lower than the set threshold, 0 < Q < 1 is satisfied; when the brightness in the image is higher than or equal to the set threshold, Q > 1 is satisfied;
setting a sliding window and performing contrast enhancement on pixels based on the mean and variance of brightness within the window;

an n × n sliding window is set to slide over the image, and the mean and variance of brightness within the window are calculated to enhance the contrast of the pixels;
L_out2(x, y) = L_avg2 + η · (L_in2(x, y) − L_avg2);

wherein L_avg2 represents the luminance average of the local pixel block within the sliding window, L_in2(x, y) represents the pixel value of the local pixel block in the sliding window before contrast enhancement, and L_out2(x, y) represents the pixel value of the local pixel block in the sliding window after contrast enhancement;
the mean and variance of the brightness in the sliding window are respectively:

L_avg2 = (1/n²) · Σ_{(x,y)∈W} L_in2(x, y);

σ² = (1/n²) · Σ_{(x,y)∈W} (L_in2(x, y) − L_avg2)²;

the enhancement coefficient η is:

η = k / σ;

wherein W is the set of pixels covered by the n × n sliding window, σ is the standard deviation of brightness in the window, k is a set constant, and η is the enhancement coefficient.
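A sketch of the sliding-window contrast enhancement of claim 1, under the assumptions reflected above that the enhancement coefficient takes the form η = k/σ and, in this sketch, that border pixels are left unchanged (function name and defaults are illustrative):

```python
def adaptive_contrast(image, n=3, k=10.0):
    """Slide an n-by-n window over the image; stretch each centre pixel about
    the local mean with a gain inversely proportional to the local standard
    deviation. Border pixels are left unchanged in this sketch."""
    h, w = len(image), len(image[0])
    out = [list(row) for row in image]
    r = n // 2
    for y in range(r, h - r):
        for x in range(r, w - r):
            win = [image[y + dy][x + dx]
                   for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
            mean = sum(win) / len(win)
            var = sum((v - mean) ** 2 for v in win) / len(win)
            std = var ** 0.5
            eta = k / std if std > 0 else 1.0   # enhancement coefficient
            out[y][x] = min(255.0, max(0.0, mean + eta * (image[y][x] - mean)))
    return out
```

On a flat region the local deviation is zero, so the sketch falls back to a unit gain and the pixel is unchanged.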
2. The unmanned aerial vehicle driving-based visual recognition method of claim 1, wherein the establishing a sample set based on the collected sample image data and inputting a neural network based on the sample set for feature extraction comprises:
s11, classifying the image data in the sample set according to the content;
s12, sequentially inputting the classified image data into a neural network for feature extraction, and extracting feature data in each classified image data;
and S13, storing the feature data of the extracted classified image data.
3. The unmanned aerial vehicle driving-based visual recognition method of claim 1, wherein the filtering the image data acquired in real time comprises:
setting p as the central pixel based on the picture center, and setting a neighborhood set S of the central pixel, with q belonging to the neighborhood set S;

G_s(p, q) = e^( − ((x_1 − u_1)² + (y_1 − v_1)²) / (2σ_s²) );

G_r(p, q) = e^( − (I_p − I_q)² / (2σ_r²) );

BF_p = (1 / W_q) · Σ_{q∈S} G_s(p, q) · G_r(p, q) · I_q, where W_q = Σ_{q∈S} G_s(p, q) · G_r(p, q);

wherein e is a natural constant; I_p is the gray value of point p = (x_1, y_1) in image I, (x_1, y_1) represents the coordinates of the pixel point, and I_q is the gray value of point q = (u_1, v_1) in image I; σ_s represents the standard deviation of the spatial distance in the Gaussian function, and σ_r represents the standard deviation of the pixel values in the Gaussian function; G_s represents the spatial-distance weight, and G_r represents the pixel-value weight; BF_p represents the gray value of point p in the filtered image, and W_q represents the sum of the weights of the pixel values at the points q.
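A sketch of the bilateral filtering of claim 3 for a single center pixel p, assuming a square neighborhood set S of a given radius (function name and defaults are illustrative):

```python
import math

def bilateral_filter_pixel(image, p, sigma_s, sigma_r, radius=1):
    """Bilateral-filtered gray value at centre pixel p = (x, y): each
    neighbour q is weighted by a spatial-distance Gaussian G_s and a
    pixel-value-difference Gaussian G_r, then normalised by the weight sum."""
    x1, y1 = p
    Ip = image[y1][x1]
    num = den = 0.0
    for v in range(y1 - radius, y1 + radius + 1):
        for u in range(x1 - radius, x1 + radius + 1):
            if 0 <= v < len(image) and 0 <= u < len(image[0]):
                Iq = image[v][u]
                Gs = math.exp(-((u - x1) ** 2 + (v - y1) ** 2) / (2 * sigma_s ** 2))
                Gr = math.exp(-((Iq - Ip) ** 2) / (2 * sigma_r ** 2))
                num += Gs * Gr * Iq
                den += Gs * Gr
    return num / den
```

On a uniform patch every weight multiplies the same gray value, so the filter returns that value unchanged, which is a quick sanity check on the normalisation.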
4. The unmanned aerial vehicle driving-based visual recognition method of claim 1, wherein inputting the preprocessed image data into the neural network and performing feature extraction on the real-time collected image data based on the neural network comprises:
s41, dividing an input image into L small-area image data blocks;
s42, after receiving an input image data block, the convolution layer moves a convolution kernel over the block with a set step length, and multiplies and accumulates each covered area with the corresponding kernel weights, thereby extracting the features of each image data block;
the convolution calculation formula is as follows:

f = Σ_i Σ_j x_{i,j} · w_{i,j} + b;

wherein x_{i,j} represents the input features, w_{i,j} represents the weights of the corresponding convolution kernel, b represents the offset value, and f represents the output feature;
s43, in the neural network, the output of the upper layer is used as the input of the lower layer, and the convolutional neural network is formed by continuously stacking;
in the process of inputting data to the lower layer, the data must pass through an activation function;

setting the outputs of the upper layer as the inputs of the lower layer, with input values x_λ (λ = 1, 2, …, L); each input value x_λ has a corresponding weight w_λ, and b is an offset; the output result obtained after the input values are fed into the neural network is:

y = φ( Σ_{λ=1}^{L} w_λ · x_λ + b );

wherein φ is the corresponding activation function and y is the output result;
s44, convolving the characteristics of the image data blocks in the continuously stacked convolution and pooling processes;
s45, transmitting the characteristics of the convolved image data block into a full connection layer;
s46, unfolding and combining the feature data through the full connection layer to obtain a feature array, and storing the feature array.
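A sketch of the multiply-accumulate convolution of steps S42–S44 (plain Python, no pooling or activation; the function name is illustrative):

```python
def conv2d(block, kernel, bias=0.0, stride=1):
    """Slide the kernel over the image block with the given stride and
    multiply-accumulate at each position: f = sum(x * w) + b."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(block), len(block[0])
    out = []
    for y in range(0, h - kh + 1, stride):
        row = []
        for x in range(0, w - kw + 1, stride):
            acc = bias
            for i in range(kh):
                for j in range(kw):
                    acc += block[y + i][x + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out
```

Stacking such layers (each output feeding the next, with pooling and an activation between them) gives the convolutional network structure described in S43.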
5. The visual recognition method based on unmanned aerial vehicle driving according to claim 1, wherein the comparing the feature data in the real-time collected image data with the feature data in each classified image data in the sample set, and the recognizing the real-time collected image data based on the comparison result comprises:
s51, comparing based on the photographed multi-frame images;
transmitting the shot image to the neural network for feature extraction, identifying the object, storing its feature points, and obtaining the coordinate position of the target object in one frame of image; extracting features from another frame of image to obtain a new coordinate position, comparing it with the feature points of the previous frame, and judging the two frames to show the same object when the overlap ratio is higher than 80%;
s52, calculating the similarity between the detection target and the target to be detected after the feature extraction;
the detection target is image data in a sample set;
judging the similarity between the detection target and the target to be detected by calculating the distance between the features;
cosine distance:

cos θ = ( Σ_{j=1}^{J} x_j · y_j ) / ( √(Σ_{j=1}^{J} x_j²) · √(Σ_{j=1}^{J} y_j²) );

D_cos = 1 − cos θ;

wherein cos θ is the cosine similarity between vector X = (x_1, …, x_J) and vector Y = (y_1, …, y_J), D_cos is the cosine distance, (x_j, y_j) represents the coordinates of the j-th point, and J represents the number of coordinate points;
querying h targets to be detected among d detection targets, and setting the feature of each target as a θ-dimensional vector;
forming a detection feature matrix G by using feature vectors of all detection targets, forming a feature matrix H to be detected by using the feature vectors of all targets to be detected, and multiplying the detection feature matrix G by the feature matrix H to be detected to obtain a cosine similarity matrix C;
for each target to be detected, the detection targets are arranged in descending order of the cosine similarity between the features of the target to be detected and the features of each detection target, and the top c detection targets are retained; z_α denotes the feature similarity at rank α, and t_β denotes the β-th detection target;
setting an upper limit and a lower limit of the similarity threshold for successful matching;

when the first-ranked feature similarity z_1 is less than the lower limit of the similarity threshold, matching fails, and the target to be detected is set as a newly appearing target;

when the first-ranked feature similarity z_1 is greater than the upper limit of the similarity threshold, matching succeeds, indicating that detection target t_1 is sufficiently similar to the target to be detected, and the target to be detected is directly judged to be detection target t_1;

when z_1 is less than the upper limit and greater than the lower limit of the similarity threshold, matching succeeds; the detection targets whose similarity is above the lower limit of the similarity threshold are counted in the ranking result and set as valid matches, the proportion of each detection-target category among the valid matches is calculated, and the target to be detected is judged to be the detection-target category with the highest proportion.
6. The method for visual recognition based on unmanned aerial vehicle driving according to claim 1, wherein the setting the evaluation index, and adjusting the visual recognition accuracy of unmanned aerial vehicle driving based on the evaluation index comprises:
calculating the accuracy of visual recognition under the current resolution based on the photographed multi-frame images;
P = TP / (TP + FP);

R = TP / (TP + FN);

wherein TP represents the number of identifications consistent with the actual result; FP represents the number of identifications inconsistent with the actual result; FN represents the number of missed samples; P is the precision rate, R is the recall rate;
for a single category, the curve of precision rate against recall rate is set as the P-R curve; the area enclosed by the curve and the horizontal axis is the average precision AP, and the calculation formula is:

AP = ∫₀¹ P(R) dR;
setting the mean average precision mAP as the mean of the average precisions AP of the m categories, with calculation formula:

mAP = (1/m) · Σ_{i=1}^{m} AP_i;

wherein m is the number of target categories and mAP is the mean of the average precisions AP of the m categories;
setting a mean-average-precision (mAP) threshold for visual recognition; and when the calculated mAP is lower than the set threshold, increasing the resolution of the images shot by the unmanned aerial vehicle.
7. An unmanned aerial vehicle driving-based visual recognition system for implementing the unmanned aerial vehicle driving-based visual recognition method of any one of claims 1 to 6, comprising an image acquisition module, an image processing module, a display module, and an image recognition module;
the image acquisition module is used for shooting image data in the current planning area in real time and transmitting the image data to the image processing module in real time;
the image processing module is used for processing the image data transmitted by the image acquisition module and transmitting the processed image to the image recognition module;
the image recognition module is used for recognizing the processed image data;
the display module is used for displaying the recognition result output by the image recognition module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410051695.XA CN117576597B (en) | 2024-01-15 | 2024-01-15 | Visual identification method and system based on unmanned aerial vehicle driving |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117576597A CN117576597A (en) | 2024-02-20 |
CN117576597B true CN117576597B (en) | 2024-04-12 |
Family
ID=89864611
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410051695.XA Active CN117576597B (en) | 2024-01-15 | 2024-01-15 | Visual identification method and system based on unmanned aerial vehicle driving |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117576597B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117746272A (en) * | 2024-02-21 | 2024-03-22 | 西安迈远科技有限公司 | Unmanned aerial vehicle-based water resource data acquisition and processing method and system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10062154B1 (en) * | 2015-02-11 | 2018-08-28 | Synaptics Incorporated | System and method for adaptive contrast enhancement |
CN111310703A (en) * | 2020-02-26 | 2020-06-19 | 深圳市巨星网络技术有限公司 | Identity recognition method, device, equipment and medium based on convolutional neural network |
CN111506759A (en) * | 2020-03-04 | 2020-08-07 | 中国人民解放军战略支援部队信息工程大学 | Image matching method and device based on depth features |
CN111639558A (en) * | 2020-05-15 | 2020-09-08 | 圣点世纪科技股份有限公司 | Finger vein identity verification method based on ArcFace Loss and improved residual error network |
CN112362756A (en) * | 2020-11-24 | 2021-02-12 | 长沙理工大学 | Concrete structure damage monitoring method and system based on deep learning |
CN114298944A (en) * | 2021-12-30 | 2022-04-08 | 上海闻泰信息技术有限公司 | Image enhancement method, device, equipment and storage medium |
CN114818766A (en) * | 2022-04-14 | 2022-07-29 | 重庆亲禾智千科技有限公司 | Self-adaptive bar code contrast enhancement method based on opencv |
CN114862837A (en) * | 2022-06-02 | 2022-08-05 | 西京学院 | Human body security check image detection method and system based on improved YOLOv5s |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
SG139602A1 (en) * | 2006-08-08 | 2008-02-29 | St Microelectronics Asia | Automatic contrast enhancement |
KR101303665B1 (en) * | 2007-06-13 | 2013-09-09 | 삼성전자주식회사 | Method and apparatus for contrast enhancement |
Non-Patent Citations (2)
Title |
---|
Sliding window adaptive histogram equalization of intraoral radiographs: effect on image quality; T. Sund et al.; Dentomaxillofacial Radiology; May 2006; Vol. 35, No. 3; pp. 133-138 *
Research on visual recognition and tracking technology of UAVs for multiple ground moving targets; Luo Xiaolan; China Master's Theses Full-text Database, Engineering Science and Technology II; 2023-01-15; No. 1; p. C031-18 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104282020B (en) | A kind of vehicle speed detection method based on target trajectory | |
CN117576597B (en) | Visual identification method and system based on unmanned aerial vehicle driving | |
Zhou et al. | Robust vehicle detection in aerial images using bag-of-words and orientation aware scanning | |
CN109903331B (en) | Convolutional neural network target detection method based on RGB-D camera | |
CN113065558A (en) | Lightweight small target detection method combined with attention mechanism | |
CN114973002A (en) | Improved YOLOv 5-based ear detection method | |
CN111738114B (en) | Vehicle target detection method based on anchor-free accurate sampling remote sensing image | |
CN112818905B (en) | Finite pixel vehicle target detection method based on attention and spatio-temporal information | |
CN114627447A (en) | Road vehicle tracking method and system based on attention mechanism and multi-target tracking | |
CN113901874A (en) | Tea tender shoot identification and picking point positioning method based on improved R3Det rotating target detection algorithm | |
CN111915558B (en) | Pin state detection method for high-voltage transmission line | |
CN113205026A (en) | Improved vehicle type recognition method based on fast RCNN deep learning network | |
CN112927264A (en) | Unmanned aerial vehicle tracking shooting system and RGBD tracking method thereof | |
CN101320477B (en) | Human body tracing method and equipment thereof | |
CN106529441A (en) | Fuzzy boundary fragmentation-based depth motion map human body action recognition method | |
CN115841633A (en) | Power tower and power line associated correction power tower and power line detection method | |
CN115019201A (en) | Weak and small target detection method based on feature refined depth network | |
CN109215059B (en) | Local data association method for tracking moving vehicle in aerial video | |
CN110647813A (en) | Human face real-time detection and identification method based on unmanned aerial vehicle aerial photography | |
CN113021355B (en) | Agricultural robot operation method for predicting sheltered crop picking point | |
CN112560799B (en) | Unmanned aerial vehicle intelligent vehicle target detection method based on adaptive target area search and game and application | |
CN103927517B (en) | Motion detection method based on human body global feature histogram entropies | |
CN110930436B (en) | Target tracking method and device | |
CN113129336A (en) | End-to-end multi-vehicle tracking method, system and computer readable medium | |
CN115830514B (en) | Whole river reach surface flow velocity calculation method and system suitable for curved river channel |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||