CN117576597B - Visual identification method and system based on unmanned aerial vehicle driving - Google Patents

Visual identification method and system based on unmanned aerial vehicle driving

Info

Publication number
CN117576597B
CN117576597B
Authority
CN
China
Prior art keywords
image data
image
aerial vehicle
unmanned aerial
setting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410051695.XA
Other languages
Chinese (zh)
Other versions
CN117576597A (en)
Inventor
崔飞易
刘清秀
李建华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Jinfeijie Information Technology Service Co ltd
Original Assignee
Shenzhen Jinfeijie Information Technology Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jinfeijie Information Technology Service Co ltd filed Critical Shenzhen Jinfeijie Information Technology Service Co ltd
Priority to CN202410051695.XA priority Critical patent/CN117576597B/en
Publication of CN117576597A publication Critical patent/CN117576597A/en
Application granted granted Critical
Publication of CN117576597B publication Critical patent/CN117576597B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/17Terrestrial scenes taken from planes or by drones
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/757Matching configurations of points or features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of visual identification, and discloses a visual identification method and system based on unmanned aerial vehicle driving. The method comprises: collecting sample image data to establish a sample set and performing feature extraction on it through a neural network; setting an unmanned aerial vehicle identification area and operating the unmanned aerial vehicle to collect image data in the identification area in real time; preprocessing the image data collected by the unmanned aerial vehicle in real time and performing feature extraction on the real-time collected image data through the neural network; comparing the feature data of the real-time collected image data with the feature data of each classified image data in the sample set, and completing the identification of the real-time collected image data based on the comparison result; and setting an evaluation index and adjusting the visual identification precision of unmanned aerial vehicle driving based on the evaluation index. The visual recognition system based on unmanned aerial vehicle driving can identify, in real time, the image data the unmanned aerial vehicle collects in the identification area.

Description

Visual identification method and system based on unmanned aerial vehicle driving
Technical Field
The invention relates to the technical field of visual identification, in particular to a visual identification method and system based on unmanned aerial vehicle driving.
Background
As target detection technology has matured, it has been widely applied to traditional fixed monitoring equipment. Such equipment, however, is limited by poor flexibility and a small monitoring range: it can hardly identify scattered, moving multiple targets and is generally used only in closed indoor scenes such as supermarkets and shopping malls. Unmanned aerial vehicles, being easy to operate, flexible, highly maneuverable and wide in monitoring range, can make up for these shortcomings. Yet in task scenes where an unmanned aerial vehicle identifies multiple moving targets, besides the traditional target detection difficulties such as occlusion, viewing-angle change and lighting change, image quality also suffers from target scale changes and from the maneuvering of both the unmanned aerial vehicle and the targets. Moreover, the high mobility of the unmanned aerial vehicle dictates a very limited payload, so onboard computing resources are scarce; a detection model deployed on such a platform can hardly run in real time, which hinders the advance of autonomy on unmanned aerial vehicles.
In the prior art CN115880589A, the target acquired by the unmanned aerial vehicle is recognized and evaluated only through neural network training, ignoring the adverse effect of the maneuvering of the unmanned aerial vehicle and the target on image quality, which greatly limits the method.
Disclosure of Invention
(1) Technical problems to be solved
Aiming at the shortcomings of the prior art, the invention provides a visual recognition system based on unmanned aerial vehicle driving, which has the advantage of accurate recognition and solves the problem of the adverse effect of unmanned aerial vehicle and target maneuvering on image quality.
In order to solve the technical problem of the adverse effect of unmanned aerial vehicle and target maneuvering on image quality, the invention provides the following technical scheme:
the embodiment discloses a visual identification method based on unmanned aerial vehicle driving, which specifically comprises the following steps:
s1, establishing a sample set based on collected sample image data, and inputting the sample set into a neural network for feature extraction;
s2, setting an unmanned aerial vehicle identification area, and operating the unmanned aerial vehicle to collect image data in the identification area in real time;
s3, preprocessing image data acquired by the unmanned aerial vehicle in real time;
s4, inputting the preprocessed image data into a neural network, and extracting features of the image data acquired in real time based on the neural network;
s5, comparing the characteristic data in the real-time collected image data with the characteristic data in each classified image data in the sample set, and completing the identification of the real-time collected image data based on the comparison result;
s6, setting an evaluation index, and adjusting the visual recognition accuracy of unmanned aerial vehicle driving based on the evaluation index;
preferably, the establishing a sample set based on the collected sample image data, and inputting the sample set into the neural network for feature extraction includes:
s11, classifying the image data in the sample set according to the content;
s12, sequentially inputting the classified image data into a neural network for feature extraction, and extracting feature data in each classified image data;
and S13, storing the feature data of the extracted classified image data.
Preferably, the preprocessing the image data collected by the unmanned aerial vehicle in real time includes:
s31, filtering processing is carried out on image data acquired in real time;
s32, performing contrast enhancement on the filtered image data.
Preferably, the filtering processing of the image data acquired in real time includes:
setting p as the central pixel based on the picture center, and setting a neighborhood set S of the central pixel, wherein q belongs to the neighborhood set S of the central pixel; the gray value of p in the filtered image is

BF_p = (1/W_q) · Σ_{q∈S} G_s(‖p − q‖) · G_r(|I_p − I_q|) · I_q,

with G_s(‖p − q‖) = e^(−‖p − q‖² / (2σ_s²)), G_r(|I_p − I_q|) = e^(−(I_p − I_q)² / (2σ_r²)) and W_q = Σ_{q∈S} G_s(‖p − q‖) · G_r(|I_p − I_q|);

wherein e is a natural constant; I_p is the gray value of point p = (x_1, y_1) in image I, with (x_1, y_1) denoting the pixel coordinates; I_q is the gray value of point q = (u_1, v_1) in image I; σ_s represents the standard deviation of the spatial distance in the Gaussian function, and σ_r the standard deviation of the pixel values in the Gaussian function; G_s denotes the spatial distance weight and G_r the pixel value weight; BF_p represents the gray value of point p in the filtered image, and W_q the sum of the weights of the pixel values of the points q.
Preferably, the contrast enhancement of the filtered image data includes:
S321, establishing a linear mapping relation of the image data based on brightness in the filtered image data:

L_out1(x, y) = Q · L_in1(x, y),

wherein L_in1(x, y) represents the pixel value of a local pixel block in the image data before contrast enhancement, L_out1(x, y) represents the pixel value of the local pixel block in the image data after contrast enhancement, and Q is a contrast gain parameter;
setting a brightness threshold in the image data and judging the luminance average L_avg1 against it, wherein L_avg1 represents the luminance average of a local pixel block in the image data: when the brightness in the image is lower than the set threshold, 0 < Q < 1 is satisfied; when the brightness in the image is higher than or equal to the set threshold, Q > 1 is satisfied;
setting an n × n sliding window to slide on the image, and calculating the mean and variance of the brightness in the window to carry out contrast enhancement on the pixels;
wherein L_avg2 represents the luminance average of the local pixel block within the sliding window, L_in2(x, y) represents the pixel value of the local pixel block in the sliding window before contrast enhancement, and L_out2(x, y) represents the pixel value of the local pixel block in the sliding window after contrast enhancement;
the mean and variance of the brightness in the sliding window are, respectively,

L_avg2 = (1/n²) Σ_{(x,y)∈window} L_in2(x, y) and σ_w² = (1/n²) Σ_{(x,y)∈window} (L_in2(x, y) − L_avg2)²;

the enhancement coefficient η is

η = k / σ_w,

where k is a set constant and η is the enhancement coefficient, and the enhanced pixel value is obtained as L_out2(x, y) = L_avg2 + η · (L_in2(x, y) − L_avg2).
Preferably, the inputting the preprocessed image data into the neural network, and performing feature extraction on the real-time acquired image data based on the neural network includes:
s41, dividing an input image into L small-area image data blocks;
s42, after receiving an input image data block, the convolution layer moves on the input image data block through a convolution kernel according to a set step length, and performs multiply accumulation on a corresponding area of each step and a characteristic value of the area, so that characteristic extraction of each image data block is realized;
the convolution calculation formula is as follows:

f = Σ_λ x_λ · w_λ + b,

wherein x_λ represents the input feature, w_λ represents the weight of the corresponding convolution kernel, b represents the bias value, and f represents the output feature;
s43, in the neural network, the output of the upper layer is used as the input of the lower layer, and the convolutional neural network is formed by continuously stacking;
the data must be subjected to the process of activating the function during the process of inputting the data to the lower layer;
setting the output of the upper layer as the input of the lower layer, with input values x_λ (λ = 1, 2, …, L), each input value x_λ having a corresponding input weight w_λ, and b being the offset, the output result obtained after the input values are fed into the neural network is:

y = f( Σ_{λ=1}^{L} w_λ · x_λ + b ),

wherein f is the corresponding activation function and y is the output result;
s44, convolving the characteristics of the image data blocks in the continuously stacked convolution and pooling processes;
s45, transmitting the characteristics of the convolved image data block into a full connection layer;
s46, unfolding and combining the feature data through the full connection layer to obtain a feature array, and storing the feature array.
Preferably, the comparing the feature data in the real-time collected image data with the feature data in each classified image data in the sample set, and the identifying the real-time collected image data based on the comparison result includes:
s51, comparing based on the photographed multi-frame images;
transmitting the shot images to the neural network for feature extraction, identifying the object, storing the feature points, and obtaining the coordinate position of the target object in one frame of image; extracting the features of another frame of image, obtaining the new coordinate position, and comparing it with the feature points of the previous frame; when the coincidence of the feature points is higher than 80%, the two are judged to be the same object;
s52, calculating the similarity between the detection target and the target to be detected after the feature extraction;
judging the similarity between the detection target and the target to be detected by calculating the distance between the features;
cosine distance:

cos(X, Y) = ( Σ_{j=1}^{J} x_j · y_j ) / ( √(Σ_{j=1}^{J} x_j²) · √(Σ_{j=1}^{J} y_j²) ), D_cos = 1 − cos(X, Y),

wherein cos(X, Y) is the cosine similarity between vector X and vector Y, D_cos is the cosine distance, (x_j, y_j) represents the j-th pair of coordinates, and J represents the number of coordinate points;
querying h targets to be detected among d detection targets, and setting the feature form of each target as a θ-dimensional vector;
forming a detection feature matrix G by using feature vectors of all detection targets, forming a feature matrix H to be detected by using the feature vectors of all targets to be detected, and multiplying the detection feature matrix G by the feature matrix H to be detected to obtain a cosine similarity matrix C;
for a target to be detected, the detection targets are arranged in descending order of the cosine similarity between the target-to-be-detected features and each detection target's features, and the top c detection targets are retained; z_α is set to represent the feature similarity of rank α, and t_β the β-th detection target;
setting an upper limit and a lower limit of the similarity threshold for a successful match;
when the first-ranked feature similarity z_1 is less than the lower limit of the similarity threshold, the matching fails, and the target to be detected is set as a newly appearing target;
when the first-ranked feature similarity z_1 is greater than the upper limit of the similarity threshold, the matching succeeds, indicating that detection target t_1 is sufficiently similar to the target to be detected, which is directly judged to be detection target t_1;
when z_1 is smaller than the upper limit of the similarity threshold and larger than the lower limit of the similarity threshold, the matching succeeds; the detection targets in the ranking result whose similarity is above the lower limit of the similarity threshold are counted and set as valid matches, the proportion of each detection-target category among the valid matches is calculated, and the target to be detected is judged to be the detection-target category with the highest proportion.
Preferably, the setting the evaluation index, and adjusting the visual recognition accuracy of the unmanned aerial vehicle driving based on the evaluation index includes:
calculating the accuracy of visual recognition under the current resolution based on the photographed multi-frame images;
P = TP / (TP + FP), R = TP / (TP + FN),

wherein TP represents the number of identifications consistent with the actual result, FP represents the number of identifications inconsistent with the actual result, and FN represents the number of missed samples; P is the precision rate and R is the recall rate;
further, for a single category, the curve of precision rate against recall rate is set as the P-R curve; the area enclosed by the curve and the horizontal axis is the average precision AP, calculated as

AP = ∫₀¹ P(R) dR;

the mean average precision mAP is set as the mean of the average precisions AP of the several categories, calculated as

mAP = (1/m) Σ_{i=1}^{m} AP_i,

wherein m is the number of target categories and mAP is the mean of the m categories' average precisions AP;
setting a mean average precision threshold for visual identification; when the calculated mean average precision is lower than the set threshold, the resolution of the images shot by the unmanned aerial vehicle is increased.
The embodiment also discloses a visual identification system based on unmanned aerial vehicle driving, specifically includes: the device comprises an image acquisition module, an image processing module, a display module and an image recognition module;
the image acquisition module is used for shooting image data in the current planning area in real time and transmitting the image data to the image processing module in real time;
the image processing module is used for processing the image data transmitted by the image acquisition module and transmitting the processed image to the image recognition module;
the image recognition module is used for recognizing the processed image data;
the display module is used for displaying the recognition result output by the image recognition module.
Compared with the prior art, the invention provides a visual recognition system based on unmanned aerial vehicle driving, which has the following beneficial effects:
1. The invention establishes a sample set by collecting sample image data and completes its identification through neural-network feature extraction, improving the identification efficiency of the image data.
2. The invention sets an unmanned aerial vehicle identification area and preprocesses the image data collected in real time in that area, reducing the influence of image noise on identification efficiency; contrast enhancement highlights the difference between object and background, improving identification efficiency while guaranteeing identification accuracy.
3. The invention extracts features from the image data collected in real time and compares them with the sample-set feature data extracted by the neural network to judge whether the real-time collected image data matches the image data in the sample set, guaranteeing the accuracy of image identification.
4. The invention extracts features from several continuously shot frames and compares their feature points to judge whether they show the same identification target, improving the accuracy and reliability of identification.
5. The invention judges the similarity between the detection target and the target to be detected and comprehensively decides whether they match, improving image recognition efficiency.
Drawings
Fig. 1 is a schematic diagram of a visual identification flow of unmanned aerial vehicle driving.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment discloses a visual identification method based on unmanned aerial vehicle driving, which specifically comprises the following steps:
s1, establishing a sample set based on collected sample image data, and inputting the sample set into a neural network for feature extraction;
s11, classifying the image data in the sample set according to the content;
s12, sequentially inputting the classified image data into a neural network for feature extraction, and extracting feature data in each classified image data;
s13, storing the feature data of the extracted classified image data;
s2, setting an unmanned aerial vehicle identification area, and operating the unmanned aerial vehicle to collect image data in the identification area in real time;
s3, preprocessing image data acquired by the unmanned aerial vehicle in real time;
preprocessing image data acquired by the unmanned aerial vehicle in real time comprises the following steps:
s31, filtering processing is carried out on image data acquired in real time;
setting p as the central pixel based on the picture center, and setting a neighborhood set S of the central pixel, wherein q belongs to the neighborhood set S of the central pixel; the gray value of p in the filtered image is

BF_p = (1/W_q) · Σ_{q∈S} G_s(‖p − q‖) · G_r(|I_p − I_q|) · I_q,

with G_s(‖p − q‖) = e^(−‖p − q‖² / (2σ_s²)), G_r(|I_p − I_q|) = e^(−(I_p − I_q)² / (2σ_r²)) and W_q = Σ_{q∈S} G_s(‖p − q‖) · G_r(|I_p − I_q|);

wherein e is a natural constant; I_p is the gray value of point p = (x_1, y_1) in image I, with (x_1, y_1) denoting the pixel coordinates; I_q is the gray value of point q = (u_1, v_1) in image I; σ_s represents the standard deviation of the spatial distance in the Gaussian function, and σ_r the standard deviation of the pixel values in the Gaussian function; G_s denotes the spatial distance weight and G_r the pixel value weight; BF_p represents the gray value of point p in the filtered image, and W_q the sum of the weights of the pixel values of the points q;
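By way of illustration, a minimal Python/NumPy sketch of this bilateral filtering on a single-channel image follows; the window radius and the values of σ_s and σ_r are illustrative assumptions, not values fixed by this embodiment.

```python
import numpy as np

def bilateral_filter(image, radius=2, sigma_s=2.0, sigma_r=25.0):
    """BF_p = (1/W) * sum_{q in S} Gs(||p-q||) * Gr(|Ip-Iq|) * Iq, for a 2-D gray image."""
    img = image.astype(np.float64)
    h, w = img.shape
    out = np.zeros_like(img)
    # Spatial weights Gs over the (2r+1) x (2r+1) neighborhood S, precomputed once.
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    gs = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma_s ** 2))
    padded = np.pad(img, radius, mode="edge")
    for y in range(h):
        for x in range(w):
            region = padded[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            # Range weights Gr from the gray-value differences |Ip - Iq|.
            gr = np.exp(-(region - img[y, x]) ** 2 / (2.0 * sigma_r ** 2))
            weights = gs * gr
            out[y, x] = np.sum(weights * region) / np.sum(weights)  # normalize by the weight sum
    return out
```

An equivalent result can be obtained with OpenCV's cv2.bilateralFilter; the explicit loops above simply mirror the per-pixel weighting of the formula.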
s32, carrying out contrast enhancement on the filtered image data;
s321, establishing a linear mapping relation of the image data based on brightness in the filtered image data;
wherein L is in1 (x, y) represents the pixel value before contrast enhancement of the local pixel block in the image data, L out1 (x, y) represents pixel values of a local pixel block in the contrast-enhanced image data,Qis a contrast gain parameter;
setting a brightness threshold in the image data and judging the luminance average L_avg1 against it, wherein L_avg1 represents the luminance average of a local pixel block in the image data: when the brightness in the image is lower than the set threshold, 0 < Q < 1 is satisfied; when the brightness in the image is higher than or equal to the set threshold, Q > 1 is satisfied;
setting an n × n sliding window to slide on the image, and calculating the mean and variance of the brightness in the window to carry out contrast enhancement on the pixels;
wherein L_avg2 represents the luminance average of the local pixel block within the sliding window, L_in2(x, y) represents the pixel value of the local pixel block in the sliding window before contrast enhancement, and L_out2(x, y) represents the pixel value of the local pixel block in the sliding window after contrast enhancement;
the mean and variance of the brightness in the sliding window are, respectively,

L_avg2 = (1/n²) Σ_{(x,y)∈window} L_in2(x, y) and σ_w² = (1/n²) Σ_{(x,y)∈window} (L_in2(x, y) − L_avg2)²;

the enhancement coefficient η is

η = k / σ_w,

where k is a set constant and η is the enhancement coefficient, and the enhanced pixel value is obtained as L_out2(x, y) = L_avg2 + η · (L_in2(x, y) − L_avg2).
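A compact sketch of this sliding-window enhancement follows; the gain form η = k/σ_w and the update L_out2 = L_avg2 + η·(L_in2 − L_avg2) follow the reconstruction above, while the window size, the constant k and the gain cap are illustrative assumptions.

```python
import numpy as np

def local_contrast_enhance(image, n=7, k=40.0, eta_max=5.0, eps=1e-6):
    """Slide an n x n window; enhance each pixel from the window mean and variance."""
    img = image.astype(np.float64)
    r = n // 2
    padded = np.pad(img, r, mode="edge")
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            win = padded[y:y + n, x:x + n]
            l_avg = win.mean()                     # window luminance mean L_avg2
            sigma = win.std()                      # window luminance standard deviation
            eta = min(k / (sigma + eps), eta_max)  # enhancement coefficient, capped in flat regions
            out[y, x] = l_avg + eta * (img[y, x] - l_avg)
    return np.clip(out, 0.0, 255.0)
```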
S4, inputting the preprocessed image data into a neural network, and extracting features of the image data acquired in real time based on the neural network;
s41, dividing an input image into L small-area image data blocks;
s42, after receiving an input image data block, the convolution layer moves on the input image data block through a convolution kernel according to a set step length, and performs multiply accumulation on a corresponding area of each step and a characteristic value of the area, so that characteristic extraction of each image data block is realized;
the convolution calculation formula is as follows:

f = Σ_λ x_λ · w_λ + b,

wherein x_λ represents the input feature, w_λ represents the weight of the corresponding convolution kernel, b represents the bias value, and f represents the output feature;
s43, in the neural network, the output of the upper layer is used as the input of the lower layer, and the convolutional neural network is formed by continuously stacking;
the data must be subjected to the process of activating the function during the process of inputting the data to the lower layer;
setting the output of the upper layer as the input of the lower layer, with input values x_λ (λ = 1, 2, …, L), each input value x_λ having a corresponding input weight w_λ, and b being the offset, the output result obtained after the input values are fed into the neural network is:

y = f( Σ_{λ=1}^{L} w_λ · x_λ + b ),

wherein f is the corresponding activation function and y is the output result;
s44, convolving the characteristics of the image data blocks in the continuously stacked convolution and pooling processes;
s45, transmitting the characteristics of the convolved image data block into a full connection layer;
s46, unfolding and combining the characteristic data through the full connection layer to obtain a characteristic array, and storing the characteristic array;
s5, comparing the characteristic data in the real-time collected image data with the characteristic data in each classified image data in the sample set, and completing the identification of the real-time collected image data based on the comparison result;
s51, comparing based on the photographed multi-frame images;
transmitting the shot images to the neural network for feature extraction, identifying the object, storing the feature points, and obtaining the coordinate position of the target object in one frame of image; extracting the features of another frame of image, obtaining the new coordinate position, and comparing it with the feature points of the previous frame; when the coincidence of the feature points is higher than 80%, the two are judged to be the same object;
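The 80% coincidence rule of S51 can be sketched as below; the pixel tolerance used to decide that two feature points coincide is an illustrative assumption.

```python
import numpy as np

def same_object(points_prev, points_curr, tol=3.0, threshold=0.80):
    """Judge two frames' detections to be the same object when the share of
    current-frame feature points (arrays of shape (N, 2)) lying within `tol`
    pixels of some previous-frame feature point reaches the 80% threshold."""
    hits = sum(
        1 for p in points_curr
        if np.linalg.norm(points_prev - p, axis=1).min() <= tol
    )
    return hits / len(points_curr) >= threshold
```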
s52, calculating the similarity between the detection target and the target to be detected after the feature extraction;
further, the similarity between the detection target and the target to be detected is judged by calculating the distance between the features;
cosine distance:

cos(X, Y) = ( Σ_{j=1}^{J} x_j · y_j ) / ( √(Σ_{j=1}^{J} x_j²) · √(Σ_{j=1}^{J} y_j²) ), D_cos = 1 − cos(X, Y),

wherein cos(X, Y) is the cosine similarity between vector X and vector Y, D_cos is the cosine distance, (x_j, y_j) represents the j-th pair of coordinates, and J represents the number of coordinate points;
further, for querying h targets to be detected among d detection targets, the feature form of each target is set as a θ-dimensional vector;
forming a detection feature matrix G by using feature vectors of all detection targets, forming a feature matrix H to be detected by using the feature vectors of all targets to be detected, and multiplying the detection feature matrix G by the feature matrix H to be detected to obtain a cosine similarity matrix C;
further, for a target to be detected, the detection targets are arranged in descending order of the cosine similarity between the target-to-be-detected features and each detection target's features, and the top c detection targets are retained; z_α is set to represent the feature similarity of rank α, and t_β the β-th detection target;
further, setting an upper limit and a lower limit of the similarity threshold for a successful match;
further, when the first-ranked feature similarity z_1 is less than the lower limit of the similarity threshold, the matching fails, and the target to be detected is set as a newly appearing target;
when the first-ranked feature similarity z_1 is greater than the upper limit of the similarity threshold, the matching succeeds, indicating that detection target t_1 is sufficiently similar to the target to be detected, which is directly judged to be detection target t_1;
when z_1 is smaller than the upper limit of the similarity threshold and larger than the lower limit of the similarity threshold, the matching succeeds; the detection targets in the ranking result whose similarity is above the lower limit of the similarity threshold are counted and set as valid matches, the proportion of each detection-target category among the valid matches is calculated, and the target to be detected is judged to be the detection-target category with the highest proportion;
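A sketch of the cosine-similarity matching of S52 follows; the threshold values, the value of c and the category labels are illustrative assumptions.

```python
import numpy as np

def cosine_matrix(G, H):
    """Row-normalize the d x theta detection matrix G and the h x theta
    query matrix H; C[i, j] is then the cosine similarity of query i
    and detection j."""
    Gn = G / np.linalg.norm(G, axis=1, keepdims=True)
    Hn = H / np.linalg.norm(H, axis=1, keepdims=True)
    return Hn @ Gn.T

def match(sims, labels, upper=0.8, lower=0.5, c=10):
    """Apply the upper/lower similarity thresholds to one query's row of C."""
    order = np.argsort(sims)[::-1][:c]          # top-c detections, descending
    z1 = sims[order[0]]
    if z1 < lower:
        return None                             # match failed: newly appearing target
    if z1 > upper:
        return labels[order[0]]                 # direct match to the rank-1 detection
    valid = [labels[i] for i in order if sims[i] >= lower]   # valid matches
    cats, counts = np.unique(valid, return_counts=True)
    return cats[np.argmax(counts)]              # category with the highest proportion
```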
s6, setting an evaluation index, and adjusting the visual recognition accuracy of unmanned aerial vehicle driving based on the evaluation index;
calculating the accuracy of visual recognition under the current resolution based on the photographed multi-frame images;
P = TP / (TP + FP), R = TP / (TP + FN),

wherein TP represents the number of identifications consistent with the actual result, FP represents the number of identifications inconsistent with the actual result, and FN represents the number of missed samples; P is the precision rate and R is the recall rate;
further, for a single category, the curve of precision rate against recall rate is set as the P-R curve; the area enclosed by the curve and the horizontal axis is the average precision AP, calculated as

AP = ∫₀¹ P(R) dR;

the mean average precision mAP is set as the mean of the average precisions AP of the several categories, calculated as

mAP = (1/m) Σ_{i=1}^{m} AP_i,

wherein m is the number of target categories and mAP is the mean of the m categories' average precisions AP;
setting a mean average precision threshold for visual identification; when the calculated mean average precision is lower than the set threshold, the resolution of the images shot by the unmanned aerial vehicle is increased;
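The evaluation of S6 can be sketched as follows; the trapezoidal integration of the P-R curve and the example threshold value are illustrative assumptions.

```python
import numpy as np

def precision_recall(tp, fp, fn):
    """P = TP / (TP + FP), R = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(precisions, recalls):
    """Area under the P-R curve, integrated over recall."""
    order = np.argsort(recalls)
    return np.trapz(np.asarray(precisions)[order], np.asarray(recalls)[order])

def mean_average_precision(aps):
    """mAP: mean of the per-category average precisions over m categories."""
    return float(np.mean(aps))

# If mAP falls below the set threshold, the capture resolution is raised.
MAP_THRESHOLD = 0.6                              # illustrative value
if mean_average_precision([0.55, 0.62, 0.48]) < MAP_THRESHOLD:
    print("increase the resolution of the UAV camera")
```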
the embodiment also discloses a visual identification system based on unmanned aerial vehicle driving, specifically includes: the device comprises an image acquisition module, an image processing module, a display module and an image recognition module;
the image acquisition module is used for shooting image data in the current planning area in real time and transmitting the image data to the image processing module in real time;
the image processing module is used for processing the image data transmitted by the image acquisition module and transmitting the processed image to the image recognition module;
the image recognition module is used for recognizing the processed image data;
the display module is used for displaying the recognition result output by the image recognition module.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (7)

1. The visual identification method based on unmanned aerial vehicle driving is characterized by comprising the following steps of:
s1, establishing a sample set based on collected sample image data, and inputting the sample set into a neural network for feature extraction;
s2, setting an unmanned aerial vehicle identification area, and operating the unmanned aerial vehicle to collect image data in the identification area in real time;
s3, preprocessing image data acquired by the unmanned aerial vehicle in real time;
s4, inputting the preprocessed image data into a neural network, and extracting features of the image data acquired in real time based on the neural network;
s5, comparing the characteristic data in the real-time collected image data with the characteristic data in each classified image data in the sample set, and completing the identification of the real-time collected image data based on the comparison result;
s6, setting an evaluation index, and adjusting the visual recognition accuracy of unmanned aerial vehicle driving based on the evaluation index;
the preprocessing of the image data acquired by the unmanned aerial vehicle in real time comprises the following steps:
s31, filtering processing is carried out on image data acquired in real time;
s32, carrying out contrast enhancement on the filtered image data;
the contrast enhancement of the filtered image data includes:
S321, establishing a linear mapping relation of the image data based on brightness in the filtered image data:

L_out1(x, y) = Q · L_in1(x, y),

wherein L_in1(x, y) represents the pixel value of a local pixel block in the image data before contrast enhancement, L_out1(x, y) represents the pixel value of the local pixel block in the image data after contrast enhancement, and Q is a contrast gain parameter;
setting a brightness threshold in the image data and judging the luminance average L_avg1 against it, wherein L_avg1 represents the luminance average of a local pixel block in the image data: when the brightness in the image is lower than the set threshold, 0 < Q < 1 is satisfied; when the brightness in the image is higher than or equal to the set threshold, Q > 1 is satisfied;
setting an n × n sliding window to slide on the image, and calculating the mean and variance of the brightness in the window to carry out contrast enhancement on the pixels;
wherein L_avg2 represents the luminance average of the local pixel block within the sliding window, L_in2(x, y) represents the pixel value of the local pixel block in the sliding window before contrast enhancement, and L_out2(x, y) represents the pixel value of the local pixel block in the sliding window after contrast enhancement;
the mean and variance of the brightness in the sliding window are, respectively,

L_avg2 = (1/n²) Σ_{(x,y)∈window} L_in2(x, y) and σ_w² = (1/n²) Σ_{(x,y)∈window} (L_in2(x, y) − L_avg2)²;

the enhancement coefficient η is

η = k / σ_w,

where k is a set constant and η is the enhancement coefficient, and the enhanced pixel value is obtained as L_out2(x, y) = L_avg2 + η · (L_in2(x, y) − L_avg2).
2. The unmanned aerial vehicle driving-based visual recognition method of claim 1, wherein establishing a sample set based on the collected sample image data and inputting the sample set into a neural network for feature extraction comprises:
s11, classifying the image data in the sample set according to the content;
s12, sequentially inputting the classified image data into a neural network for feature extraction, and extracting feature data in each classified image data;
and S13, storing the feature data of the extracted classified image data.
3. The unmanned aerial vehicle driving-based visual recognition method of claim 1, wherein the filtering the image data acquired in real time comprises:
setting p as the central pixel based on the picture center, and setting a neighborhood set S of the central pixel, wherein q belongs to the neighborhood set S of the central pixel; the gray value of p in the filtered image is

BF_p = (1/W_q) · Σ_{q∈S} G_s(‖p − q‖) · G_r(|I_p − I_q|) · I_q,

with G_s(‖p − q‖) = e^(−‖p − q‖² / (2σ_s²)), G_r(|I_p − I_q|) = e^(−(I_p − I_q)² / (2σ_r²)) and W_q = Σ_{q∈S} G_s(‖p − q‖) · G_r(|I_p − I_q|);

wherein e is a natural constant; I_p is the gray value of point p = (x_1, y_1) in image I, with (x_1, y_1) denoting the pixel coordinates; I_q is the gray value of point q = (u_1, v_1) in image I; σ_s represents the standard deviation of the spatial distance in the Gaussian function, and σ_r the standard deviation of the pixel values in the Gaussian function; G_s denotes the spatial distance weight and G_r the pixel value weight; BF_p represents the gray value of point p in the filtered image, and W_q the sum of the weights of the pixel values of the points q.
4. The unmanned aerial vehicle driving-based visual recognition method of claim 1, wherein inputting the preprocessed image data into the neural network and performing feature extraction on the real-time collected image data based on the neural network comprises:
s41, dividing an input image into L small-area image data blocks;
s42, after receiving an input image data block, the convolution layer moves on the input image data block through a convolution kernel according to a set step length, and performs multiply accumulation on a corresponding area of each step and a characteristic value of the area, so that characteristic extraction of each image data block is realized;
the convolution calculation formula is as follows:

f = Σ_λ x_λ · w_λ + b,

wherein x_λ represents the input feature, w_λ represents the weight of the corresponding convolution kernel, b represents the bias value, and f represents the output feature;
s43, in the neural network, the output of the upper layer is used as the input of the lower layer, and the convolutional neural network is formed by continuously stacking;
the data must be subjected to the process of activating the function during the process of inputting the data to the lower layer;
setting the output of the upper layer as the input of the lower layer, with input values x_λ (λ = 1, 2, …, L), each input value x_λ having a corresponding input weight w_λ, and b being the offset, the output result obtained after the input values are fed into the neural network is:

y = f( Σ_{λ=1}^{L} w_λ · x_λ + b ),

wherein f is the corresponding activation function and y is the output result;
s44, convolving the characteristics of the image data blocks in the continuously stacked convolution and pooling processes;
s45, transmitting the characteristics of the convolved image data block into a full connection layer;
s46, unfolding and combining the feature data through the full connection layer to obtain a feature array, and storing the feature array.
5. The visual recognition method based on unmanned aerial vehicle driving according to claim 1, wherein the comparing the feature data in the real-time collected image data with the feature data in each classified image data in the sample set, and the recognizing the real-time collected image data based on the comparison result comprises:
s51, comparing based on the photographed multi-frame images;
transmitting the shot images to the neural network for feature extraction, identifying the object, storing the feature points, and obtaining the coordinate position of the target object in one frame of image; extracting the features of another frame of image, obtaining the new coordinate position, and comparing it with the feature points of the previous frame; when the coincidence of the feature points is higher than 80%, the two are judged to be the same object;
s52, calculating the similarity between the detection target and the target to be detected after the feature extraction;
the detection target is image data in a sample set;
judging the similarity between the detection target and the target to be detected by calculating the distance between the features;
cosine distance:

cos(X, Y) = ( Σ_{j=1}^{J} x_j · y_j ) / ( √(Σ_{j=1}^{J} x_j²) · √(Σ_{j=1}^{J} y_j²) ), D_cos = 1 − cos(X, Y),

wherein cos(X, Y) is the cosine similarity between vector X and vector Y, D_cos is the cosine distance, (x_j, y_j) represents the j-th pair of coordinates, and J represents the number of coordinate points;
querying h targets to be detected among d detection targets, and setting the feature form of each target as a θ-dimensional vector;
forming a detection feature matrix G by using feature vectors of all detection targets, forming a feature matrix H to be detected by using the feature vectors of all targets to be detected, and multiplying the detection feature matrix G by the feature matrix H to be detected to obtain a cosine similarity matrix C;
for a target to be detected, the detection targets are arranged in descending order of the cosine similarity between the target-to-be-detected features and each detection target's features, and the top c detection targets are retained; z_α is set to represent the feature similarity of rank α, and t_β the β-th detection target;
setting an upper limit and a lower limit of the similarity threshold for a successful match;
when the first-ranked feature similarity z_1 is less than the lower limit of the similarity threshold, the matching fails, and the target to be detected is set as a newly appearing target;
when the first-ranked feature similarity z_1 is greater than the upper limit of the similarity threshold, the matching succeeds, indicating that detection target t_1 is sufficiently similar to the target to be detected, which is directly judged to be detection target t_1;
when z_1 is smaller than the upper limit of the similarity threshold and larger than the lower limit of the similarity threshold, the matching succeeds; the detection targets in the ranking result whose similarity is above the lower limit of the similarity threshold are counted and set as valid matches, the proportion of each detection-target category among the valid matches is calculated, and the target to be detected is judged to be the detection-target category with the highest proportion.
6. The method for visual recognition based on unmanned aerial vehicle driving according to claim 1, wherein the setting the evaluation index, and adjusting the visual recognition accuracy of unmanned aerial vehicle driving based on the evaluation index comprises:
calculating the accuracy of visual recognition under the current resolution based on the photographed multi-frame images;
P = TP / (TP + FP), R = TP / (TP + FN),

wherein TP represents the number of identifications consistent with the actual result, FP represents the number of identifications inconsistent with the actual result, and FN represents the number of missed samples; P is the precision rate and R is the recall rate;
for a single category, the curve of precision rate against recall rate is set as the P-R curve; the area enclosed by the curve and the horizontal axis is the average precision AP, calculated as

AP = ∫₀¹ P(R) dR;

the mean average precision mAP is set as the mean of the average precisions AP of the several categories, calculated as

mAP = (1/m) Σ_{i=1}^{m} AP_i,

wherein m is the number of target categories and mAP is the mean of the m categories' average precisions AP;
setting a mean average precision threshold for visual identification; when the calculated mean average precision is lower than the set threshold, the resolution of the images shot by the unmanned aerial vehicle is increased.
7. An unmanned aerial vehicle driving-based visual recognition system for implementing the unmanned aerial vehicle driving-based visual recognition method of any one of claims 1 to 6, comprising an image acquisition module, an image processing module, a display module, and an image recognition module;
the image acquisition module is used for shooting image data in the current planning area in real time and transmitting the image data to the image processing module in real time;
the image processing module is used for processing the image data transmitted by the image acquisition module and transmitting the processed image to the image recognition module;
the image recognition module is used for recognizing the processed image data;
the display module is used for displaying the recognition result output by the image recognition module.
CN202410051695.XA 2024-01-15 2024-01-15 Visual identification method and system based on unmanned aerial vehicle driving Active CN117576597B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410051695.XA CN117576597B (en) 2024-01-15 2024-01-15 Visual identification method and system based on unmanned aerial vehicle driving

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410051695.XA CN117576597B (en) 2024-01-15 2024-01-15 Visual identification method and system based on unmanned aerial vehicle driving

Publications (2)

Publication Number Publication Date
CN117576597A CN117576597A (en) 2024-02-20
CN117576597B (en) 2024-04-12

Family

ID=89864611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410051695.XA Active CN117576597B (en) 2024-01-15 2024-01-15 Visual identification method and system based on unmanned aerial vehicle driving

Country Status (1)

Country Link
CN (1) CN117576597B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117746272A (en) * 2024-02-21 2024-03-22 西安迈远科技有限公司 Unmanned aerial vehicle-based water resource data acquisition and processing method and system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10062154B1 (en) * 2015-02-11 2018-08-28 Synaptics Incorporated System and method for adaptive contrast enhancement
CN111310703A (en) * 2020-02-26 2020-06-19 深圳市巨星网络技术有限公司 Identity recognition method, device, equipment and medium based on convolutional neural network
CN111506759A (en) * 2020-03-04 2020-08-07 中国人民解放军战略支援部队信息工程大学 Image matching method and device based on depth features
CN111639558A (en) * 2020-05-15 2020-09-08 圣点世纪科技股份有限公司 Finger vein identity verification method based on ArcFace Loss and improved residual error network
CN112362756A (en) * 2020-11-24 2021-02-12 长沙理工大学 Concrete structure damage monitoring method and system based on deep learning
CN114298944A (en) * 2021-12-30 2022-04-08 上海闻泰信息技术有限公司 Image enhancement method, device, equipment and storage medium
CN114818766A (en) * 2022-04-14 2022-07-29 重庆亲禾智千科技有限公司 Self-adaptive bar code contrast enhancement method based on opencv
CN114862837A (en) * 2022-06-02 2022-08-05 西京学院 Human body security check image detection method and system based on improved YOLOv5s

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG139602A1 (en) * 2006-08-08 2008-02-29 St Microelectronics Asia Automatic contrast enhancement
KR101303665B1 (en) * 2007-06-13 2013-09-09 삼성전자주식회사 Method and apparatus for contrast enhancement

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10062154B1 (en) * 2015-02-11 2018-08-28 Synaptics Incorporated System and method for adaptive contrast enhancement
CN111310703A (en) * 2020-02-26 2020-06-19 深圳市巨星网络技术有限公司 Identity recognition method, device, equipment and medium based on convolutional neural network
CN111506759A (en) * 2020-03-04 2020-08-07 中国人民解放军战略支援部队信息工程大学 Image matching method and device based on depth features
CN111639558A (en) * 2020-05-15 2020-09-08 圣点世纪科技股份有限公司 Finger vein identity verification method based on ArcFace Loss and improved residual error network
CN112362756A (en) * 2020-11-24 2021-02-12 长沙理工大学 Concrete structure damage monitoring method and system based on deep learning
CN114298944A (en) * 2021-12-30 2022-04-08 上海闻泰信息技术有限公司 Image enhancement method, device, equipment and storage medium
CN114818766A (en) * 2022-04-14 2022-07-29 重庆亲禾智千科技有限公司 Self-adaptive bar code contrast enhancement method based on opencv
CN114862837A (en) * 2022-06-02 2022-08-05 西京学院 Human body security check image detection method and system based on improved YOLOv5s

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Sliding window adaptive histogram equalization of intraoral radiographs: effect on image quality; T Sund et al.; Dentomaxillofacial Radiology; May 2006; Vol. 35, No. 3; pp. 133-138 *
Research on visual recognition and tracking technology of UAVs for multiple moving ground targets; Luo Xiaolan; China Master's Theses Full-text Database, Engineering Science and Technology II; 2023-01-15; No. 1; pp. C031-18 *

Also Published As

Publication number Publication date
CN117576597A (en) 2024-02-20

Similar Documents

Publication Publication Date Title
CN104282020B (en) A kind of vehicle speed detection method based on target trajectory
CN117576597B (en) Visual identification method and system based on unmanned aerial vehicle driving
Zhou et al. Robust vehicle detection in aerial images using bag-of-words and orientation aware scanning
CN109903331B (en) Convolutional neural network target detection method based on RGB-D camera
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
CN114973002A (en) Improved YOLOv 5-based ear detection method
CN111738114B (en) Vehicle target detection method based on anchor-free accurate sampling remote sensing image
CN112818905B (en) Finite pixel vehicle target detection method based on attention and spatio-temporal information
CN114627447A (en) Road vehicle tracking method and system based on attention mechanism and multi-target tracking
CN113901874A (en) Tea tender shoot identification and picking point positioning method based on improved R3Det rotating target detection algorithm
CN111915558B (en) Pin state detection method for high-voltage transmission line
CN113205026A (en) Improved vehicle type recognition method based on fast RCNN deep learning network
CN112927264A (en) Unmanned aerial vehicle tracking shooting system and RGBD tracking method thereof
CN101320477B (en) Human body tracing method and equipment thereof
CN106529441A (en) Fuzzy boundary fragmentation-based depth motion map human body action recognition method
CN115841633A (en) Power tower and power line associated correction power tower and power line detection method
CN115019201A (en) Weak and small target detection method based on feature refined depth network
CN109215059B (en) Local data association method for tracking moving vehicle in aerial video
CN110647813A (en) Human face real-time detection and identification method based on unmanned aerial vehicle aerial photography
CN113021355B (en) Agricultural robot operation method for predicting sheltered crop picking point
CN112560799B (en) Unmanned aerial vehicle intelligent vehicle target detection method based on adaptive target area search and game and application
CN103927517B (en) Motion detection method based on human body global feature histogram entropies
CN110930436B (en) Target tracking method and device
CN113129336A (en) End-to-end multi-vehicle tracking method, system and computer readable medium
CN115830514B (en) Whole river reach surface flow velocity calculation method and system suitable for curved river channel

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant