CN116543171A - Target detection method and device and electronic equipment


Info

Publication number
CN116543171A
CN116543171A
Authority
CN
China
Prior art keywords
target
detected
image
target sub
determining
Prior art date
Legal status
Pending
Application number
CN202310299475.4A
Other languages
Chinese (zh)
Inventor
郑可尧
张栋
孙玉泉
郑红
Current Assignee
Weisen Vision Danyang Co ltd Beijing Branch
Original Assignee
Weisen Vision Danyang Co ltd Beijing Branch
Priority date
Filing date
Publication date
Application filed by Weisen Vision Danyang Co ltd Beijing Branch
Priority to CN202310299475.4A
Publication of CN116543171A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/762: Recognition or understanding using clustering, e.g. of similar faces in social networks
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA], independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07: Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a target detection method, a target detection device and electronic equipment, and relates to the technical field of image processing, wherein the method comprises the following steps: acquiring a target image to be detected; positioning each target to be detected in the target image to be detected based on the corner features corresponding to the target image to be detected to obtain an initial target sub-image set to be detected; determining a target sub-image set to be detected containing at least one complete target to be detected based on contour constraint features corresponding to the target image to be detected and the initial target sub-image set to be detected; and scoring each target sub-image to be detected in the target sub-image set to be detected, and determining a target detection result based on the scoring result. The invention can realize efficient sampling of the target and improve sampling precision, detection efficiency and detection accuracy.

Description

Target detection method and device and electronic equipment
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a target detection method, a target detection device, and an electronic device.
Background
In image processing, target detection methods generally search for and locate targets in a sliding-window manner: sliding-window sampling is first performed with window frames of different fixed scales at a fixed step length, and it is then judged, based on the sampling result, whether a target exists in the image block corresponding to each window frame.
However, for small-scale targets such as birds against airport, sky, or sea backgrounds, the region of the image in which the birds are located cannot be determined in advance, and the distribution density may differ across regions. The image blocks acquired by the sliding-window method are therefore highly redundant: most of them contain no target, or only part of a target, which results in low sampling accuracy and, in turn, low detection efficiency of target detection.
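To make the redundancy concrete, the following is a short sketch (an illustrative helper, not part of the patent) counting the image blocks produced by dense sliding-window sampling:

```python
def sliding_window_count(width, height, windows, stride):
    """Count the image blocks produced by dense sliding-window sampling,
    e.g. windows=[(16, 16), (32, 8), (8, 32)] and stride=2 as in the
    airport example discussed in the detailed description below."""
    total = 0
    for w, h in windows:
        total += ((width - w) // stride + 1) * ((height - h) // stride + 1)
    return total
```

Because the count grows with the image area divided by the squared stride, dense sampling produces hundreds of thousands of blocks even for moderate image sizes, while only a tiny fraction contain a complete target.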
Disclosure of Invention
The invention provides a target detection method, a target detection device and electronic equipment, which are used for solving the defects of low sampling precision and detection efficiency in the prior art, realizing efficient sampling of a target and improving the sampling precision, the detection efficiency and the detection accuracy.
The invention provides a target detection method, which comprises the following steps:
acquiring a target image to be detected;
positioning each target to be detected in the target image to be detected based on the corner features corresponding to the target image to be detected to obtain an initial target sub-image set to be detected;
determining a target sub-image set to be detected containing at least one complete target to be detected based on contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected;
And scoring each target sub-image to be detected in the target sub-image set to be detected, and determining a target detection result based on the scoring result.
According to the target detection method provided by the invention, each target to be detected in the target image to be detected is positioned based on the corner feature corresponding to the target image to be detected, so as to obtain an initial target sub-image set to be detected, which comprises the following steps:
performing angular point positioning detection on the target image to be detected, and determining angular point characteristics corresponding to the target image to be detected;
and determining an initial target sub-image set to be detected based on the corner features and a predetermined anchor frame, wherein the anchor frame is used for determining the area size of each initial target sub-image in the initial target sub-image set to be detected.
According to the target detection method provided by the invention, the anchor frame includes: target scale information and target size information;
the anchor frame is determined based on the following steps:
obtaining a sample target anchor frame training dataset comprising: sample scale information and sample size information corresponding to each sample target anchor frame, wherein the sample scale information and the sample size information are determined based on the number of pixels corresponding to the sample target anchor frame;
Clustering the sample target anchor frame training data set, and determining the association relation between the clustering category corresponding to the sample scale information and the average coincidence degree corresponding to each sample target anchor frame based on a clustering result;
determining a cluster center value corresponding to each cluster category based on the association relation;
and determining each cluster center value as target size information of the anchor frame, and determining sample scale information corresponding to each cluster center value as target scale information of the anchor frame.
According to the target detection method provided by the invention, the determining of the initial target sub-image set to be detected based on the corner features and the predetermined anchor frame comprises the following steps:
determining the position relation between the corner features and the anchor frame, wherein the position relation includes: the pixel points corresponding to the corner points are coincident with the vertexes of the anchor frame, the pixel points corresponding to the corner points are positioned on the edge line of the anchor frame, and the pixel points corresponding to the corner points are coincident with the center point of the anchor frame;
traversing each target pixel point corresponding to the corner feature, and determining at least two initial target sub-images to be detected corresponding to each target pixel point based on the position relation and the target images to be detected;
And determining an initial target sub-image set to be detected based on each initial target sub-image to be detected.
According to the target detection method provided by the invention, the contour constraint features corresponding to the initial target sub-image set to be detected are determined based on the following steps:
inputting the target image to be detected into a target contour detection model, and outputting contour features corresponding to the target image to be detected, wherein the target contour detection model is obtained by performing supervision training based on sample target contour training data and labels corresponding to the sample target contour training data;
and determining contour constraint features corresponding to the initial target sub-images to be detected based on the contour features.
According to the target detection method provided by the invention, the determining a target sub-image set to be detected including at least one complete target to be detected based on the contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected includes:
traversing the initial target sub-image set to be detected, and comparing the contour constraint features with each initial target sub-image set to be detected;
under the condition that the comparison result shows that the initial target sub-image to be detected contains the contour constraint features, determining the initial target sub-image to be detected as a target sub-image to be detected;
And determining the target sub-image set to be detected based on each target sub-image to be detected.
According to the target detection method provided by the invention, the determination of the scoring result comprises the following steps:
traversing the target sub-image set to be detected, and determining target size information and edge pixel information of each target sub-image to be detected, wherein the target size information comprises: the length value and the width value of the target sub-image to be detected, and the edge pixel information comprises: the weight value of the adjacent pixel difference value and the absolute value of the adjacent pixel difference value in the target sub-image to be detected;
and calling a scoring function, and determining a scoring result corresponding to each target sub-image to be detected based on the target size information and the edge pixel information, wherein the scoring function is used for representing the probability that each target sub-image to be detected in the target sub-image set to be detected belongs to a target class.
According to the target detection method provided by the invention, the scoring function comprises perimeter sensitivity parameters, wherein the perimeter sensitivity parameters are used for representing the influence of the perimeter corresponding to the target sub-image to be detected on the scoring result;
and the perimeter sensitive parameters are determined based on the target scale information corresponding to each target sub-image to be detected.
The present invention also provides an object detection apparatus including:
the acquisition module is used for acquiring an image of a target to be detected;
the first determining module is used for positioning each target to be detected in the target image to be detected based on the corner characteristics corresponding to the target image to be detected, so as to obtain an initial target sub-image set to be detected;
the second determining module is used for determining a target sub-image set to be detected, which contains at least one complete target to be detected, based on the contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected;
and the third determining module is used for scoring each target sub-image to be detected in the target sub-image set to be detected and determining a target detection result based on the scoring result.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing any one of the above mentioned object detection methods when executing the program.
According to the target detection method, the target detection device and the electronic equipment provided by the invention, corner positioning is performed on the targets in the target image to be detected based on the corner sensitivity of the targets; determining the contour constraint features corresponding to the corner features ensures that each target sub-image to be detected in the target sub-image set contains at least one complete target to be detected; and the target detection result is then determined by scoring each target sub-image to be detected. Sampling accuracy is thereby ensured while sampling redundancy is reduced, which improves the detection efficiency for each target sub-image to be detected.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is an exemplary schematic diagram of an application scenario provided by the present invention;
FIG. 2 is a schematic flow chart of the target detection method provided by the invention;
FIG. 3 is one example schematic of a small scale clustering result provided by the present invention;
FIG. 4 is a second exemplary schematic of a small scale clustering result provided by the present invention;
FIG. 5 is one of exemplary schematic diagrams of a mesoscale clustering result provided by the present invention;
FIG. 6 is a second exemplary schematic of a mesoscale clustering result provided by the present invention;
FIG. 7 is one of exemplary schematic diagrams of a large scale clustering result provided by the present invention;
FIG. 8 is a second exemplary schematic of a large scale clustering result provided by the present invention;
FIG. 9 is a schematic diagram of the sampling direction of the anchor frame provided by the present invention;
FIG. 10 is an exemplary schematic of a profile feature provided by the present invention;
FIG. 11 is a plot of PR for small scale bird detection provided by the present invention;
FIG. 12 is an exemplary schematic diagram of PR curves corresponding to different perimeter sensitivity parameters provided by the present invention;
fig. 13 is an exemplary schematic diagram of PR curves corresponding to different second preset thresholds corresponding to corner response values provided by the present invention;
FIG. 14 is a schematic diagram showing a comparison of performance of the target detection method according to the present invention;
FIG. 15 is a second schematic diagram showing a comparison of the performance of the target detection method according to the present invention;
FIG. 16 is a third comparative schematic diagram of the performance of the target detection method according to the present invention;
FIG. 17 is a fourth schematic diagram showing a comparison of performance of the target detection method according to the present invention;
FIG. 18 is a fifth comparative schematic diagram of the performance of the target detection method provided by the present invention;
FIG. 19 is a schematic diagram of the structure of the object detection device according to the present invention;
fig. 20 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the prior art, when target detection is performed on an image, a sliding-window manner is generally used to search for and locate targets: windows of different scales and aspect ratios are set to sample similar physical objects, and the image blocks framed by the windows are analyzed to judge whether an object is present. If this manner is still adopted for small-scale birds, higher positioning precision can only be obtained by increasing the number of sampling windows, and screening the resulting image blocks one by one at a later stage inevitably increases the judging time and reduces the real-time performance of the algorithm.
For example, fig. 1 is an exemplary schematic diagram of an application scenario provided by the present invention. As shown in fig. 1, taking an airport as the background of the image to be detected and birds as the detection targets, 265 small-scale birds are distributed at one wing of the airplane in the image, in a region of about 320×240 pixels. If sliding-window sampling is adopted, with sampling frames of three scales (16×16, 32×8 and 8×32) sliding over the image to be detected at a step length of 2, 332100 image blocks are generated over the whole image. However, only 718 of these image blocks contain at least one complete bird to be detected, about 0.22% of the total number of samples; the remaining 99.78% of the image blocks contain only part of a bird or no bird at all. This not only wastes computational resources and results in low sampling accuracy, but also greatly prolongs the computation time when target detection is performed on the image blocks, resulting in low detection efficiency.
In view of the above problems, an embodiment of the present invention provides a target detection method, taking birds as the targets to be detected. Fig. 2 is a schematic flow chart of the target detection method provided by the present invention; as shown in fig. 2, the method includes:
Step 210, acquiring an image of a target to be detected.
Optionally, the target image to be detected may include at least one target, and when the number of targets is greater than 1, the scale information of each target may differ. In addition, it should be noted that in the embodiment of the present invention, scale information refers only to the imaging scale of the target; the influence of the target's physical size and imaging distance on the imaging scale is not considered. Taking bird detection as an example, the scale of a bird in the image is related to the physical size of the bird and its distance from the imaging device: for a fixed physical size, the farther the bird is from the imaging device, the smaller its imaging scale; for a fixed imaging distance, the smaller the bird's physical size, the smaller its imaging scale. In the embodiment of the present invention, however, only the imaging scale of the bird is of concern, and the absolute pixel area occupied in the image is used as the criterion for judging the bird's scale, without considering the bird's physical size or imaging distance.
And 220, positioning each target to be detected in the target image to be detected based on the corner features corresponding to the target image to be detected, so as to obtain an initial target sub-image set to be detected.
Specifically, most of the image blocks acquired through a sliding window contain no target or only part of a target, so the sampling accuracy is low and the subsequent judging efficiency for the image blocks is affected. Locating each target to be detected through the corner features instead concentrates sampling on the positions where targets actually appear.
Step 230, determining a target sub-image set to be detected including at least one complete target to be detected based on the contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected.
Specifically, after determining the target sub-images to be detected corresponding to the corner features, each target sub-image to be detected is further constrained to contain at least one complete target to be detected through the contour constraint features, so that the detection efficiency and accuracy of the subsequent target sub-images to be detected are improved.
And 240, scoring each target sub-image to be detected in the target sub-image set to be detected, and determining a target detection result based on the scoring result.
Specifically, after the target sub-image set to be detected is determined, each target sub-image to be detected in the target sub-image set to be detected needs to be traversed, the target sub-images to be detected are scored, the probability that each target sub-image to be detected belongs to the target category to be detected is calculated based on the scoring result, so that the final target detection result is determined, missed detection caused by scale reasons is avoided, and further the accuracy of target detection is improved.
Optionally, the positioning each target to be detected in the target image to be detected based on the corner feature corresponding to the target image to be detected to obtain an initial target sub-image set to be detected includes:
performing angular point positioning detection on the target image to be detected, and determining angular point characteristics corresponding to the target image to be detected;
and determining an initial target sub-image set to be detected based on the corner features and a predetermined anchor frame, wherein the anchor frame is used for determining the area size of each initial target sub-image in the initial target sub-image set to be detected.
Specifically, in order to improve sampling accuracy, in the embodiment of the present invention, corner positioning detection is performed on the target image to be detected; after the plurality of corners corresponding to the corner features are determined, the region of each initial target sub-image to be detected containing a corner is determined through an anchor frame, so as to ensure that the anchor frame contains the corner and that an intersection exists between the anchor frame and the target to be detected corresponding to that corner.
Optionally, taking the target to be detected as a bird as an example: since the beak, wing tips and tail of a bird have sharp shapes, a bird shows visually prominent edge corner points regardless of viewing-angle changes and scale in the image. Therefore, in the embodiment of the present invention, the Harris corner detection algorithm is adopted to perform corner positioning detection on birds of different scales and different flight attitudes in the target image to be detected. The positions of all birds in the image can thus be located even when their distribution regions are unknown, with each sample containing only one bird as far as possible, so that accurate bird detection is achieved and the miss rate is reduced. The method for extracting the corner features using the Harris corner detection algorithm includes the following steps:
1) Convert the target image to be detected into a two-dimensional gray-level image, take an arbitrary image block W, and translate it by Δx and Δy. The sum of squared differences $S_w(\Delta x, \Delta y)$ between the gray values of the image block W and the translated image is shown in formula (1):

$$S_w(\Delta x, \Delta y) = \sum_i \sum_j \left( f(x_i, y_j) - f(x_i - \Delta x,\, y_j - \Delta y) \right)^2 \tag{1}$$

wherein $f(x_i, y_j)$ denotes the gray value of the image block, $f(x_i - \Delta x, y_j - \Delta y)$ denotes the gray value of the image after the block is translated, i denotes the row number of a pixel in the image block W, and j denotes its column number.

2) Approximate the translated image $f(x_i - \Delta x, y_j - \Delta y)$ by its first-order Taylor expansion, so that the minimum of the sum of squares $S_w(\Delta x, \Delta y)$ has an analytical solution. The translated image is then represented by formula (2):

$$f(x_i - \Delta x,\, y_j - \Delta y) \approx f(x_i, y_j) - f_x(x_i, y_j)\,\Delta x - f_y(x_i, y_j)\,\Delta y \tag{2}$$

3) Substituting formula (2) into formula (1) yields the Harris matrix A(x, y), which represents one half of the second derivative of $S_w$ over the image block W at the point $(\Delta x, \Delta y) = (0, 0)$. The Harris matrix A is a positive semi-definite symmetric matrix; its principal variation modes correspond to partial derivatives in orthogonal directions and are reflected in its eigenvalues $\lambda_1$ and $\lambda_2$. When $\lambda_1$ and $\lambda_2$ are both greater than a second preset threshold, the pixel corresponding to the center point of the image block W is determined to be a corner, and the coordinates of the corner in the target image to be detected are determined. The Harris matrix A(x, y) is shown in formula (3):

$$A(x, y) = \sum_{W} \begin{bmatrix} f_x^2 & f_x f_y \\ f_x f_y & f_y^2 \end{bmatrix} \tag{3}$$

4) Based on the two eigenvalues $\lambda_1$ and $\lambda_2$ of the Harris matrix A(x, y), the corner response value R′ is determined as shown in formula (4):

$$R' = \det A - \upsilon\,(\operatorname{trace} A)^2 \tag{4}$$

wherein $\det A = \lambda_1 \lambda_2$ and $\operatorname{trace} A = \lambda_1 + \lambda_2$. The larger the corner response value R′, the more significant the corner feature; that is, if the corner response value of a pixel in the target image to be detected is greater than the second preset threshold on the corner response value, the pixel is a corner.
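As a concrete illustration, the following is a minimal sketch of Harris corner localization using OpenCV. It is not the patent's implementation: the block size, Sobel aperture, k, and the relative response threshold (the patent's "second preset threshold" is not disclosed in this excerpt) are illustrative assumptions.

```python
import cv2
import numpy as np

def harris_corner_locations(image_bgr, k=0.04, rel_thresh=0.01):
    """Locate candidate corners via the Harris response R' = det A - k (trace A)^2.

    rel_thresh is a fraction of the maximum response used as the acceptance
    threshold; the patent's actual threshold value is not given here.
    """
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    # blockSize is the local window W; ksize is the Sobel aperture for f_x, f_y
    response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=k)
    ys, xs = np.where(response > rel_thresh * response.max())
    return list(zip(xs.tolist(), ys.tolist()))  # (x, y) pixel coordinates
```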
Optionally, other methods for performing corner positioning detection on the target image to be detected include the scale-invariant feature transform algorithm (SIFT), the speeded-up robust features algorithm (SURF), and the FAST corner detection algorithm (Features from Accelerated Segment Test). However, compared with these three algorithms, the Harris corner detection algorithm is best in recall ratio, precision ratio, stability and robustness for corner detection. In the embodiment of the present invention, the Harris corner detection algorithm, the SIFT algorithm, the SURF algorithm and the FAST algorithm are each run on the MS COCO bird dataset; the corner detection results under different backgrounds, scales and viewing angles are statistically analyzed; and the bird positioning performance of the four algorithms is evaluated by recall ratio and precision ratio.
1) The background includes scenes such as sky and sea, and artificial flying objects such as airplanes and kites. Wherein:
(1) When the background is sea and sky, the image contains 6 birds. A comparison of the results of the four algorithms for locating birds against the sea-and-sky background is shown in Table 1. As can be seen from Table 1, the recall ratio of the Harris corner detection algorithm is 17% higher than those of the SIFT and SURF algorithms, and 33% higher than that of the FAST algorithm. Meanwhile, the precision ratio of the Harris corner detection algorithm is 44%, 55% and 79% higher than those of the SIFT, SURF and FAST algorithms respectively, which shows that when the background is sea and sky, the stability and robustness of the Harris corner detection algorithm for bird detection are superior to those of the other three algorithms.
Table 1 comparison table of results of four algorithms for locating a bird in the sea and sky background
Feature operator | Total number of birds | Birds located | False alarms | Recall ratio | Precision ratio
Harris           | 6                     | 6             | 0            | 100%         | 100%
SIFT             | 6                     | 5             | 4            | 83%          | 56%
SURF             | 6                     | 5             | 6            | 83%          | 45%
FAST             | 6                     | 4             | 15           | 67%          | 21%
Table 2 first comparison results of four algorithms for locating a bird in aircraft and sky context
Feature operator | Total number of birds | Birds located | False alarms | Recall ratio | Precision ratio
Harris           | 1                     | 1             | 1            | 100%         | 50%
SIFT             | 1                     | 0             | 1            | 0%           | 0%
SURF             | 1                     | 0             | 2            | 0%           | 0%
FAST             | 1                     | 0             | 1            | 0%           | 0%
(2) When the background is an airplane and sky, the image contains 1 bird; the bird region contains 63 pixels and therefore belongs to the small-scale birds, and the background also contains a gyroplane and a helicopter. The first comparison results of the four algorithms for locating birds against the airplane-and-sky background are shown in Table 2. As can be seen from Table 2, only the Harris corner detection algorithm detects the bird; the other three algorithms miss it. Meanwhile, the precision ratio of the Harris corner detection algorithm is 50% higher than those of the SIFT, SURF and FAST algorithms, which shows that when the background is an airplane and sky, the stability and robustness of the Harris corner detection algorithm for detecting small-scale birds are superior to those of the other three algorithms.
(3) When the background is an airplane and sky and the image contains 14 birds, the second comparison results of the four algorithms are shown in Table 3. As can be seen from Table 3, the recall ratio of the Harris corner detection algorithm is 100% and all birds are detected, while the other three algorithms miss detections to different degrees, which indicates that the detection performance of the Harris corner detection algorithm is higher than that of the other three algorithms. Meanwhile, the false alarms of the Harris corner detection algorithm occur only on the airplane and not in the sky, whereas the other three algorithms produce false alarms both on the airplane and in the sky; this shows that, against this background, the precision ratio of the Harris corner detection algorithm is higher than those of the other three algorithms. None of the four algorithms, however, can avoid the false alarms caused by the interference of the airplane, so the airplane is an unavoidable source of interference for corner positioning. The stability and robustness of the Harris corner detection algorithm in detecting small-scale birds are thus superior to those of the other three algorithms.
Table 3 second comparison of four algorithms to locate a bird in aircraft and sky background
Feature operator | Total number of birds | Birds located | False alarms | Recall ratio | Precision ratio
Harris           | 14                    | 14            | 1            | 100%         | 93%
SIFT             | 14                    | 11            | 4            | 79%          | 73%
SURF             | 14                    | 9             | 4            | 64%          | 69%
FAST             | 14                    | 8             | 5            | 57%          | 62%
2) Taking small-scale birds as an example, small-scale birds are detected at 4 different image resolutions from the side view, top view and bottom view respectively; from large to small, the 4 resolutions are 32×32, 16×16, 8×8 and 4×4 pixels.
(1) The Harris corner detection algorithm can stably detect the beak of the bird as the image resolution decreases in the side and top views, and can stably detect the tail of the bird in the bottom view, indicating that the corner features of the beak and tail are the most prominent. Meanwhile, the lower the image resolution, the fewer Harris corners appear on the bird; however, the miss rate of the Harris corner detection algorithm is 0, which shows that the algorithm has strong stability.
(2) For the SIFT and SURF algorithms, both may fail to detect the corner features of a bird when the image resolution is less than 8×8 in the side and top views, and when the image resolution is less than 4×4 in the bottom view. Meanwhile, the miss rate of both algorithms is 41%, indicating that both produce missed detections when the image resolution is less than 8×8.
(3) For the FAST algorithm, in the detection results at all three viewing angles, the algorithm cannot detect the corner features of a bird when the image resolution is less than 8×8; its miss rate is 50%, indicating that missed detections occur when the image resolution is less than 8×8.
Optionally, in the embodiment of the present invention, 10806 birds in the MS COCO bird dataset are detected by the Harris corner detection algorithm, the SIFT algorithm, the SURF algorithm, and the FAST algorithm, and the bird positioning performance of the four algorithms is shown in table 4.
Table 4 comparison table of bird positioning performance for four algorithms
As can be seen from Table 4, the Harris corner detection algorithm achieves a recall ratio of 100% and a precision ratio of 91% in locating birds of different scales, and is superior to the SIFT, SURF and FAST algorithms.
For the SIFT algorithm, the recall ratio for large-scale and medium-scale birds is 100%, with 2322 false alarms and a precision ratio of 79%, while the recall ratio for small-scale birds is only 70%. This is because SIFT builds its scale space by convolving Gaussian kernels with the image and extracts corners through difference operations on the Gaussian pyramid; since the bird scale is small, the downsampling operations in the scale space work against it.
For the SURF algorithm, the recall ratio for small-scale birds is 69%, with 2372 false alarms in total and a precision ratio of 78%. Although SURF improves on SIFT in detection speed, it still suffers from the same problem. The positioning effect of the SIFT and SURF algorithms for small-scale birds is therefore weaker than that of the Harris corner detection algorithm.
For the FAST algorithm, the recall ratio for large-scale and medium-scale birds is 100%, with 2853 false alarms in total and a precision ratio of 74%, while the recall ratio for small-scale birds is 63%. FAST judges whether a pixel is a corner by computing its difference from the pixels in the surrounding neighborhood; since the neighborhood used by FAST covers 32 pixels, larger than a low-resolution small-scale bird region, the background around a small-scale bird influences corner generation, so FAST is weaker than the Harris corner detection algorithm in small-scale bird detection.
For the Harris corner detection algorithm, the recall ratio for birds of all three scales is 100%; the recall ratio for small-scale birds is 30%, 31% and 37% higher than those of the SIFT, SURF and FAST algorithms respectively, which shows strong stability. Its precision ratio of 91% is 12%, 13% and 17% higher than those of the SIFT, SURF and FAST algorithms respectively, which shows that it is less disturbed by the background and more robust. The detection performance of the Harris corner detection algorithm is therefore better than that of the other three algorithms.
In addition, since the flight attitude of a bird produces different imaging effects when observed from different viewing angles, and actual size and distance also produce different scale changes, in the embodiment of the present invention the Harris corner detection algorithm is run on the MS COCO bird dataset, performing corner positioning detection on birds of different scales from three viewing angles: side view, bottom view and top view. The detection results show that for large-scale birds, Harris corners are mainly distributed at sharp edges such as the beak, wings and tail feathers, and are not affected by viewing angle; for medium-scale birds, Harris corners are mainly distributed at the sharp edges of the tail, with no missed detection; and for small-scale birds, at least one corner is detected on each bird, with Harris corners mainly distributed at the sharp points of the bird contour, with no missed detection and no influence of viewing angle.
Optionally, the anchor frame includes: target scale information and target size information;
the anchor frame is determined based on the following steps:
obtaining a sample target anchor frame training dataset comprising: sample scale information and sample size information corresponding to each sample target anchor frame, wherein the sample scale information and the sample size information are determined based on the number of pixels corresponding to the sample target anchor frame;
Clustering the sample target anchor frame training data set, and determining the association relation between the clustering category corresponding to the sample scale information and the average coincidence degree corresponding to each sample target anchor frame based on a clustering result;
determining a cluster center value corresponding to each cluster category based on the association relation;
and determining each cluster center value as target size information of the anchor frame, and determining sample scale information corresponding to each cluster center value as target scale information of the anchor frame.
Specifically, after the corner features are determined, in order to make the image block containing a corner also contain the bird region as far as possible, the target image to be detected needs to be sampled at each corner position corresponding to the corner features, according to anchor frames with predefined target scale information and target size information. Therefore, in the embodiment of the present invention, a clustering operation is performed on the sample target anchor frame training dataset of targets with the same attribute in the target image to be detected, so that the anchor frames corresponding to the cluster centers have a larger overlap ratio IoU with the training data; a higher average overlap ratio of the anchor frames means higher target detection precision. After clustering, the association relation between cluster category and average overlap ratio is obtained; the cluster category and its cluster center value can then be determined with sampling efficiency in mind, and the target size information of the anchor frames corresponding to different overlap ratios at the corresponding scale is determined from the cluster center values.
Taking a bird as an example, sampling is performed at each corner point according to a predefined scale and an aspect ratio to obtain a bird sample set, wherein the rectangular frame of each sample is the anchor frame of the bird.
Optionally, a K-means clustering method may be used to determine the target size information of the anchor frames at different target scales, where the number of cluster center values obtained by clustering equals the number of anchor frames at that target scale. When performing K-means clustering, the target region corresponding to an initial cluster center is determined, and the overlap ratio between the sample target anchor frame training data and the target region is computed; based on the overlap ratio, whether a training sample and the target region belong to the same category is judged by taking the distance between the two anchor frames as the distance measure. After all training samples have been processed, the category center values are updated, iterative computation proceeds with the updated center values, and after a set number of iterations the final clustering result is output. Here K denotes the number of cluster categories, an integer greater than or equal to 1 and less than the total number of samples in the dataset. The overlap ratio IoU(T, C) is shown in formula (5):
$$\mathrm{IoU}(T, C) = \frac{\lvert T \cap C \rvert}{\lvert T \cup C \rvert} \tag{5}$$

wherein T represents a truth box in the sample target anchor frame training data, and C represents the target area corresponding to a cluster center generated by clustering.
Further, the distance metric may be D(T, C) = 1 − IoU(T, C), and the sample target anchor frame training dataset may be the MS COCO bird dataset.
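The following is a minimal sketch of this anchor clustering using the distance measure D(T, C) = 1 − IoU(T, C). Following the common YOLOv2-style convention, boxes are compared by width and height only with a shared top-left corner; this is a simplifying assumption, since the patent does not reproduce its exact clustering code.

```python
import numpy as np

def iou_wh(boxes, centers):
    """IoU between (w, h) box sizes and (w, h) cluster centers, assuming a
    shared top-left corner (the usual convention for anchor clustering)."""
    inter = (np.minimum(boxes[:, None, 0], centers[None, :, 0]) *
             np.minimum(boxes[:, None, 1], centers[None, :, 1]))
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (centers[:, 0] * centers[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """K-means over ground-truth box sizes with distance D(T, C) = 1 - IoU(T, C)."""
    boxes = np.asarray(boxes, dtype=np.float64)
    rng = np.random.default_rng(seed)
    centers = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, centers), axis=1)  # min D == max IoU
        new_centers = np.array([boxes[assign == i].mean(axis=0)
                                if np.any(assign == i) else centers[i]
                                for i in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers  # one (w, h) anchor size per cluster category
```

Running this separately on the small-, medium- and large-scale subsets with k = 3, 4 and 5 mirrors the per-scale cluster counts selected in the examples below.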
Illustratively, the clustering results at different scales are as follows:
1) Fig. 3 is one example schematic diagram of a small-scale clustering result provided by the present invention. Taking the small-scale bird clustering result as an example, the horizontal axis in fig. 3 is the number of cluster categories, the vertical axis is the average overlap ratio of the generated anchor frames, the broken line represents the association between cluster category and average overlap ratio, and the number of cluster categories is rounded to an integer. As can be seen from fig. 3, when the cluster category K is greater than or equal to 3, the average overlap ratio is greater than 70%, and each increase of K by 1 raises the average overlap ratio by less than 0.1. Since the K value equals the number of anchor frames set for the small-scale information, a larger K value means more initial target sub-images to be detected and a longer sample computation time; therefore, to balance detection accuracy and detection efficiency, the cluster category K is set to 3 when the target to be detected is small-scale.
When the K value of the small-scale bird cluster category is 3, fig. 4 is a second exemplary schematic diagram of the small-scale clustering result provided by the present invention; the horizontal axis in fig. 4 is the length value of the anchor frame and the vertical axis is the width value. As shown in fig. 4, the cluster center values of the three cluster categories are (13, 14), (28, 16) and (18, 30). When these cluster center values are used as the length and width values of the small-scale anchor frames, the average overlap ratio of the sample target anchor frame training dataset with the three anchor frames is about 70.13%; compared with the 9 anchor frames determined by the Faster R-CNN method, the average overlap ratio is improved by 9.23%, and compared with the YOLOv2 method it is improved by 3%.
2) Fig. 5 is one example schematic diagram of a mesoscale clustering result provided by the present invention. As shown in fig. 5, taking the mesoscale bird clustering result as an example, when the cluster category K is greater than or equal to 4, the average overlap ratio is greater than 70%, and each increase of K by 1 raises the average overlap ratio by less than 0.1. To balance detection accuracy and detection efficiency, the cluster category K is set to 4 when the target to be detected is mesoscale. When the K value of the mesoscale bird cluster category is 4, fig. 6 is a second exemplary schematic diagram of the mesoscale clustering result provided by the present invention; as shown in fig. 6, the cluster center values of the four cluster categories are (32, 62), (62, 40), (58, 79) and (60, 105), i.e., the size information of the mesoscale anchor frames is (32, 62), (62, 40), (58, 79) and (60, 105) respectively.
3) Fig. 7 is one example schematic diagram of a large-scale clustering result provided by the present invention, and as shown in fig. 7, taking the large-scale bird clustering result as an example, when the clustering class K is greater than or equal to 5, the average overlap ratio is greater than 70%, and each time the K value increases by 1, the average overlap ratio value increases by less than 0.1. In order to balance the balance between the detection accuracy and the detection efficiency, when the target to be detected is large-scale, the cluster class K is set to 5. When the K value of the large-scale bird cluster category is 5, fig. 8 is a second schematic diagram of an example of the large-scale cluster result provided by the present invention, as shown in fig. 8, the cluster center values of the five cluster categories are (111, 180), (126, 173), (144, 146), (160, 131), (183, 106), that is, the size information of the large-scale anchor frame is (111, 180), (126, 173), (144, 146), (160, 131), (183, 106), respectively.
Optionally, the determining the initial target sub-image set to be detected based on the corner feature and a predetermined anchor frame includes:
determining the position relation between the corner features and the anchor frame, wherein the position relation includes: the pixel points corresponding to the corner points are coincident with the vertexes of the anchor frame, the pixel points corresponding to the corner points are positioned on the edge line of the anchor frame, and the pixel points corresponding to the corner points are coincident with the center point of the anchor frame;
Traversing each target pixel point corresponding to the corner feature, and determining at least two initial target sub-images to be detected corresponding to each target pixel point based on the position relation and the target images to be detected;
and determining an initial target sub-image set to be detected based on each initial target sub-image to be detected.
Specifically, after the above 12 anchor frames are determined, since the exact position of a corner on its target cannot be guaranteed, and in order that each acquired target sub-image to be detected can contain at least one complete target, the initial target sub-images corresponding to each corner are further determined based on the positional relationship between the corner and the anchor frame. A corner and an anchor frame can be related in three ways: the corner coincides with a vertex of the anchor frame, the corner lies on an edge line of the anchor frame, or the corner coincides with the center point of the anchor frame. Fig. 9 is a schematic diagram of the sampling directions of the anchor frame provided by the present invention. As shown in fig. 9, based on the above considerations, the sampling directions are: the corner placed on the center point of each of the four edges of the anchor frame, the corner coincident with each of the four vertices of the anchor frame, and the corner coincident with the center point of the anchor frame, giving 9 sampling directions in total. With 9 sampling directions for each of the 12 anchor frames, each corner corresponds to 12 × 9 = 108 initial target sub-images to be detected.
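A sketch of this placement enumeration follows, assuming axis-aligned anchor frames in pixel coordinates; the helper name and box representation are illustrative.

```python
def candidate_boxes(corner, anchors):
    """Enumerate the 9 corner/anchor placements for each (w, h) anchor size:
    the corner on each of the 4 box vertices, on each of the 4 edge
    midpoints, and on the box center. With 12 anchors this yields the
    9 x 12 = 108 candidate sub-images per corner described above."""
    cx, cy = corner
    boxes = []
    for w, h in anchors:
        # offsets of the box's top-left point relative to the corner point
        offsets = [(0, 0), (-w, 0), (0, -h), (-w, -h),                    # 4 vertices
                   (-w / 2, 0), (-w / 2, -h), (0, -h / 2), (-w, -h / 2),  # 4 edge midpoints
                   (-w / 2, -h / 2)]                                      # center
        boxes += [(cx + dx, cy + dy, w, h) for dx, dy in offsets]
    return boxes  # each box as (x_top_left, y_top_left, w, h)
```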
Optionally, the contour constraint features corresponding to the initial target sub-image set to be detected are determined based on the following steps:
inputting the target image to be detected into a target contour detection model, and outputting contour features corresponding to the target image to be detected, wherein the target contour detection model is obtained by performing supervision training based on sample target contour training data and labels corresponding to the sample target contour training data;
and determining contour constraint features corresponding to the initial target sub-images to be detected based on the contour features.
Specifically, after the corner features are detected by the Harris corner detection algorithm and the initial target sub-image set to be detected is determined, the number of initial target sub-images is large, and it cannot be guaranteed that each of them contains at least one complete target to be detected. Meanwhile, in the target image to be detected, the background outside the targets may also produce object corners; that is, when the Harris corner detection algorithm is used to extract corners, there is some background interference, with a false alarm rate of about 9%. Therefore, in order to further improve sampling accuracy and detection accuracy, in the embodiment of the present invention the target image to be detected is input into the target contour detection model, which outputs the contour features of all objects contained in the image; these contour features are closed edge features. Since not all objects in the target image to be detected are targets, in order to eliminate corners unrelated to the targets and reduce background interference, the contour features related to each initial target sub-image to be detected are determined as contour constraint features; for example, the contour features that intersect the initial target sub-images to be detected, i.e., the contour features that intersect the anchor frames generated from the corners, are determined as contour constraint features.
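For reference, a structured-forest edge map can be obtained with OpenCV's contrib module; the sketch below assumes a pretrained model file is available (the patent trains its own model on MS COCO bird data, which is not reproduced here, and the file name is an assumption).

```python
import cv2
import numpy as np

def detect_edge_map(image_bgr, model_path="model.yml"):
    """Run structured-forest edge detection (requires opencv-contrib-python)."""
    detector = cv2.ximgproc.createStructuredEdgeDetection(model_path)
    rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    edges = detector.detectEdges(rgb)      # per-pixel edge strength in [0, 1]
    return (edges * 255).astype(np.uint8)  # scaled to [0, 255] as in formula (6)
```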
Optionally, the determining a target sub-image set to be detected including at least one complete target to be detected based on the contour constraint feature corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected includes:
traversing the initial target sub-image set to be detected, and comparing the contour constraint features with each initial target sub-image set to be detected;
under the condition that the comparison result shows that the initial target sub-image to be detected contains the contour constraint features, determining the initial target sub-image to be detected as a target sub-image to be detected;
and determining the target sub-image set to be detected based on each target sub-image to be detected.
Specifically, after the initial target sub-image set to be detected is determined, in the embodiment of the present invention, to further improve sampling accuracy and detection accuracy, each initial target sub-image to be detected is compared with the contour constraint features output by the target contour detection model; if an initial target sub-image contains the complete contour of a target to be detected, it is added to the target sub-image set to be detected. That is, for each corner, among the 108 corresponding initial target sub-images, most sub-images that contain only part of a target can be removed through the contour constraint, and only those containing a complete target are retained, which facilitates subsequent target detection. Meanwhile, corners that do not belong to targets are removed based on the contour constraint features, which overcomes background interference and reduces the false alarm rate. The amount of acquired data is thus greatly reduced, improving the precision and judging efficiency of subsequent target detection.
Optionally, before the target sub-image set to be detected is determined, in the embodiment of the present invention an initial target contour detection model can be constructed through a structured forest edge detection algorithm and supervised-trained with constructed sample target contour training data and their labels, yielding the trained target contour detection model. Taking birds as the targets to be detected as an example, the sample target contour training data and the corresponding contour labels can be constructed based on the MS COCO bird dataset. The structured forest edge detection algorithm selects D decision trees to train on the bird contours and weights the outputs of the D decision trees to obtain the final output of the structured forest, as shown in formula (6):

$$G = \sum_{i=1}^{D} w_i\, d_i \tag{6}$$

wherein $d_i$ denotes the output data of the i-th decision tree, $w_i$ denotes the weight of the i-th decision tree, D denotes the number of decision trees, and G denotes the final output of the structured forest, used to represent the computed contour constraint feature, with value range [0, 255].
Optionally, based on the final output of the structured forest, the contour constraint condition is shown in formula (7):

$$S = \left\{ s_{n,y} \;\middle|\; h_n \in s_{n,y} \ \text{and}\ \exists\, k:\ C_k \subseteq s_{n,y} \right\} \tag{7}$$

wherein $h_n$ denotes a corner in the corner sequence set, with $\{h_n \mid 1 \le n \le N\}$ and N the total number of corners; $s_{n,y}$ denotes an initial target sub-image to be detected obtained through corner positioning detection, with Y the total number of initial target sub-images corresponding to a single corner; $C_k$ denotes a contour in the final output of the structured forest, with $\{C_k \mid 1 \le k \le \Phi\}$ and Φ the total number of contours; and S denotes the target sub-image set to be detected, whose members simultaneously contain a corner and a complete contour.
It should be noted that the number of contours finally output by the structured forest method is large; the contours that intersect the anchor frames corresponding to the corner points can be determined as the contour constraint features.
Fig. 10 is an exemplary schematic diagram of contour features provided in the present invention. After a target image to be detected containing birds against a sky background is processed by the structured forest method, 367 closed contours as shown in fig. 10 are generated, of which 265 are bird contours, with no missed detection, and 102 are background contours. Many contour features in fig. 10 are unrelated to the birds: they do not correspond to the corner points of the birds and do not intersect the anchor frames generated from the bird corner points, so the 102 background contours can be eliminated and only the 265 bird contours are retained as contour constraint features. On this basis, the 265 bird contours correspond to 34128 anchor frames. An anchor frame is selected as a sample if it contains a corner point and a complete contour constraint feature, and anchor frames containing incomplete contour constraint features are discarded. After the 34128 anchor frames are subjected to the contour constraint, 5460 anchor frames remain, accounting for about 16% of the original total, while the discarded anchor frames account for 84% of the total: 10% of the anchor frames arise from false alarms generated by the Harris corner detection algorithm, and the remaining 74% are anchor frames containing incomplete contour constraint features. Therefore, the contour constraint can reduce the false alarm rate introduced by the Harris corner detection algorithm, reduce incomplete bird anchor frames, and improve the detection efficiency by reducing the number of samples to be discriminated.
Optionally, the initial target contour detection model may also be constructed based on the Canny operator, DeepNet, SCG, or the like, which is not limited by the embodiment of the present invention.
Optionally, taking a bird as the target to be detected as an example, the contour constraint features may include a resting bird contour and a flying bird contour.
Optionally, determining the scoring result includes:
traversing the target sub-image set to be detected, and determining target size information and edge pixel information of each target sub-image to be detected, wherein the target size information comprises: the length value and the width value of the target sub-image to be detected, and the edge pixel information comprises: the weight value of the adjacent pixel difference value and the absolute value of the adjacent pixel difference value in the target sub-image to be detected;
and calling a scoring function, and determining a scoring result corresponding to each target sub-image to be detected based on the target size information and the edge pixel information, wherein the scoring function is used for representing the probability that each target sub-image to be detected in the target sub-image set to be detected belongs to a target class.
Specifically, taking the detection of a bird target as an example, the objectness sampling methods in the prior art calculate, after collecting samples, the probability that each sample belongs to the bird class by using a scoring function, and take this probability as the detection result. The scoring function is an expression determined according to the target features: the stronger the target features in a sample, the higher its probability of belonging to the target, and the weaker the target features, the lower the probability. After the corner features corresponding to the target image to be detected are determined, an object whose contour is similar to that of a bird, such as a bird-shaped kite, also presents corner points and a bird-like contour, so when the contour constraint is performed, the initial target sub-image to be detected corresponding to the bird-shaped kite is also determined as a target sub-image to be detected. In other words, when scoring uses contour features alone, the more similar the contour contained in an anchor frame is to a bird contour, the higher the score, that is, the higher the estimated probability that the anchor frame contains a bird; and for two anchor frames containing the same bird contour, the smaller anchor frame scores higher. Thus, in the target sub-image set to be detected, there exist target sub-images that contain a complete contour but do not belong to the target class. Therefore, in the embodiment of the invention, each target sub-image to be detected is traversed, a scoring function is called, a scoring result corresponding to each target sub-image to be detected is determined based on its target size information and edge pixel information, the probability that the target sub-image belongs to the target class is determined through the scoring result, and the final target detection result is determined by comparing this probability with a third preset threshold.
Optionally, the contour scoring function p_b in the existing EdgeBoxes method is shown in formula (8), where formula (8) is:

p_b = \frac{\sum_i w_i m_i - \sum_{p \in b^{in}} m_p}{(2(b_w + b_h))^{\delta}}  (8)

wherein b represents an image block, b_w represents the width of the image block, and b_h represents the height of the image block; w_i represents the weight of the edge pixel group e_i inside the anchor frame, an edge pixel group being characterized by the difference of two adjacent pixels, whose direction vector comprises two parameters, magnitude and angle; the magnitude is the absolute value of the adjacent-pixel difference and characterizes the light-intensity change between adjacent pixels, and the larger the magnitude, the larger the light-intensity change and the more obvious the edge; m_i represents the sum of edge magnitudes of the edge pixel group e_i; δ represents the perimeter sensitivity parameter, which controls how sensitive the scoring result is to the perimeter of the target sub-image to be detected; b^{in} represents a rectangular image block in the central region of the image block, sharing the center point of the image block b, with width b_w/2 and height b_h/2; p indexes the edge pixel groups of the rectangular image block b^{in}, and m_p represents the sum of edge magnitudes of edge pixel group e_p.
In the embodiment of the present invention, the scoring function h_b is shown in formula (9), where formula (9) is:

h_b = \frac{\sum_i w_i m_i}{(2(b_w + b_h))^{\delta}}  (9)

wherein b represents a target sub-image to be detected, b_w represents its width value, b_h represents its height value, and the remaining symbols are as defined in formula (8).
Optionally, FIG. 11 is an exemplary PR-curve diagram for small-scale bird detection provided by the present invention. The contour-constrained small-scale bird anchor frames are scored both with the contour scoring function p_b of the EdgeBoxes method and with the scoring function h_b provided by the embodiment of the invention; detection results are obtained according to the score ranking of the anchor frames, the overlap between a detection result and the truth result is used as the criterion deciding whether a target counts as detected, and the precision-recall curves shown in fig. 11 are determined, with recall on the horizontal axis and precision on the vertical axis. As can be seen from fig. 11, the curve corresponding to the scoring function h_b provided by the embodiment of the invention lies above the curve corresponding to the contour scoring function p_b of the EdgeBoxes method, indicating that the detection results output under h_b are superior overall to those output under p_b. This is because p_b in the EdgeBoxes method equals the h_b score minus the score inside the rectangular image block b^{in}, whose area is one quarter of the whole anchor frame, which lowers the score of a small-scale bird. The scoring function in the embodiment of the invention retains the score of the central area of the target sub-image to be detected, so the scoring results and rankings corresponding to small-scale targets are improved and the precision of small-scale target detection is increased.
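For concreteness, the two scoring functions can be sketched as below, assuming the edge pixel groups inside an anchor frame (and inside its central block b_in) have already been extracted; the function names, argument layout, and the default δ = 1.7 are illustrative assumptions consistent with the formulas above:

import numpy as np

def edgeboxes_score(w, m, m_center, bw, bh, delta=1.7):
    # Formula (8): subtracts the edge magnitudes inside the central block b_in,
    # which penalizes small-scale targets.
    # w, m: weights and magnitude sums of the edge pixel groups in the anchor frame;
    # m_center: magnitude sums of the edge pixel groups inside b_in.
    denom = (2.0 * (bw + bh)) ** delta
    return (np.dot(w, m) - np.sum(m_center)) / denom

def ccop_score(w, m, bw, bh, delta=1.7):
    # Formula (9): keeps the central-area contribution, improving the ranking
    # of small-scale targets.
    denom = (2.0 * (bw + bh)) ** delta
    return float(np.dot(w, m)) / denom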
Optionally, the scoring function is designed based on target features of the target to be detected, and the stronger the selected target feature is, the higher the probability of belonging to the target to be detected is.
Optionally, the scoring function includes perimeter sensitivity parameters, where the perimeter sensitivity parameters are used to characterize an influence of a perimeter corresponding to a target sub-image to be detected on the scoring result;
and the perimeter sensitive parameters are determined based on the target scale information corresponding to each target sub-image to be detected.
Specifically, since the contour perimeter and edge magnitude of a small-scale target are smaller than those of a large-scale target, the perimeter of the target sub-image to be detected corresponding to a small-scale target is also smaller, and when the scoring result is calculated by the scoring function, the score corresponding to a large-scale target is higher than that of a small-scale target, which affects the ranking of small-scale targets and their detection results. For example, taking birds as an example, since a small-scale bird carries little spatial information, its texture, color, and edge information in the sampling window are blurred, while the information of a kite with a bird-like shape in the window is strong. In this case, the class probability of the small-scale bird may be smaller than that of the kite, its sample may be ranked after the kite sample, and its probability of belonging to the bird class may fall below the given threshold, causing missed detection. Therefore, in order to reduce the scoring imbalance caused by different scales, in the embodiment of the invention, a value range of the perimeter sensitivity parameter can be set based on experience or heuristics, several perimeter sensitivity parameters are taken within the range, and the optimal perimeter sensitivity parameter is selected by determining the precision-recall curves corresponding to the different values, so as to improve the detection performance of target detection.
As shown in fig. 12, the perimeter sensitivity parameter may be set to [1.6, 1.65, 1.7, 1.75, 1.8] in turn, with recall on the horizontal axis and precision on the vertical axis. As can be seen from fig. 12, when the perimeter sensitivity parameter δ = 1.7, the corresponding PR curve is higher overall than the other curves, that is, the target detection result is optimal, so the perimeter sensitivity parameter δ in the scoring function is set to 1.7.
In addition, the number of corner points directly influences the number of sampled target sub-images to be detected and therefore directly influences the detection performance of target detection. The setting of the second preset threshold directly affects the number of corner points: the larger the second preset threshold, the fewer corner points are generated, and the smaller the threshold, the more corner points are generated. Therefore, in the embodiment of the invention, a value range of the second preset threshold is set based on experience or heuristics, several second preset thresholds are taken within the range, and the optimal second preset threshold is selected by determining the precision-recall curves corresponding to the different thresholds, so as to ensure the detection performance. Fig. 13 is an exemplary schematic diagram of the PR curves corresponding to different second preset thresholds on the corner response value provided by the present invention. As shown in fig. 13, the second preset threshold R may be set to [0.01, 0.015, 0.02, 0.025, 0.03] in turn, with recall on the horizontal axis and precision on the vertical axis. As can be seen from fig. 13, when the second preset threshold R = 0.02, the corresponding PR curve is higher overall than the other curves, that is, the target detection result is optimal, so the second preset threshold R is set to 0.02.
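Both selections above amount to a small grid search scored by the PR curve; a minimal sketch follows, in which the evaluate callback (returning precision and recall arrays for one pipeline run) and the area-under-curve summary are assumptions for illustration, not part of the patent:

import numpy as np

def area_under_pr(precision, recall):
    # Summarize a PR curve by the area under it (larger is better).
    order = np.argsort(recall)
    return np.trapz(np.asarray(precision)[order], np.asarray(recall)[order])

def select_hyperparameters(evaluate):
    # evaluate(delta, r) -> (precision, recall) arrays for one run of the pipeline.
    best = None
    for delta in [1.6, 1.65, 1.7, 1.75, 1.8]:        # perimeter sensitivity parameter
        for r in [0.01, 0.015, 0.02, 0.025, 0.03]:   # second preset threshold R
            score = area_under_pr(*evaluate(delta, r))
            if best is None or score > best[0]:
                best = (score, delta, r)
    return best  # the experiments above select delta = 1.7 and R = 0.02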
Optionally, since the EdgeBoxes method performs best among the window-scoring objectness sampling methods and the MCG method performs best among the grouping objectness sampling methods, in the embodiment of the invention, the target detection method provided by the embodiment of the invention is compared with the EdgeBoxes and MCG methods of the prior art, wherein:
1) Fig. 14 is a first performance comparison schematic diagram of the target detection method provided by the present invention. When the number of samples is 1000, the recall-overlap performance curves shown in fig. 14 are obtained, with the overlap ratio on the horizontal axis and the recall on the vertical axis. As can be seen from fig. 14, the recall of all three curves decreases as the overlap ratio increases, and the recall of the target detection method CCOP provided by the embodiment of the invention is higher than those of the EdgeBoxes and MCG methods; when the overlap ratio is 0.5, the recall of CCOP is 0.81. This is because CCOP uses corner features for initial target screening and samples at the corner locations, and the sampled anchor frames are obtained by K-means clustering whose distance metric is a function of the overlap ratio, so the more cluster categories there are, the higher the overlap between the generated anchor frames and the truth frames; meanwhile, the average overlap of the anchor frames and the truth frame set in CCOP exceeds 0.7, which improves the recall.
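The clustering step can be sketched as the K-means variant below, where the distance between a truth frame and a cluster center is d = 1 − IoU of their (width, height) pairs anchored at a common corner; the function names and the shape-only IoU are illustrative assumptions in the spirit of the description above:

import numpy as np

def iou_wh(box, clusters):
    # IoU between one (w, h) pair and each cluster (w, h), boxes sharing one corner.
    inter = np.minimum(box[0], clusters[:, 0]) * np.minimum(box[1], clusters[:, 1])
    union = box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - inter
    return inter / union

def kmeans_anchors(truth_wh, k, iters=100, seed=0):
    # truth_wh: (N, 2) array of truth-frame widths and heights; returns k anchor sizes.
    rng = np.random.default_rng(seed)
    clusters = truth_wh[rng.choice(len(truth_wh), k, replace=False)].astype(float)
    for _ in range(iters):
        dists = np.stack([1.0 - iou_wh(b, clusters) for b in truth_wh])  # (N, k)
        assign = dists.argmin(axis=1)
        new = np.array([truth_wh[assign == j].mean(axis=0) if np.any(assign == j)
                        else clusters[j] for j in range(k)])
        if np.allclose(new, clusters):
            break
        clusters = new
    return clusters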
2) Fig. 15 is a second performance comparison schematic diagram of the target detection method provided by the present invention. When the number of samples is 1000, the PR curves of the three objectness sampling methods shown in fig. 15 can be obtained, with recall on the horizontal axis and precision on the vertical axis. As can be seen from fig. 15, the precision of all three curves decreases as the recall increases. Taking bird detection as an example, the bird detection performance of the target detection method CCOP provided by the embodiment of the invention is overall better than those of the EdgeBoxes and MCG methods, because on the basis of samples with a higher overlap ratio, CCOP uses the improved scoring function, which balances the scores of large-scale, medium-scale, and small-scale birds by retaining the score of the central area of a bird sample and by setting the perimeter sensitivity parameter; this solves the problem of low scores for small-scale birds and improves their ranking, thereby improving the detection performance.
3) Taking bird detection as an example, the sampling performance of the three methods is compared in table 5. As can be seen from table 5, the recall, precision, and detection speed of the target detection method CCOP provided by the embodiment of the invention are higher than those of the EdgeBoxes and MCG methods at all scales; in particular, in small-scale bird detection, the recall of CCOP is 13% higher than that of the EdgeBoxes method and 11% higher than that of the MCG method, and the precision of CCOP is 14% higher than that of the EdgeBoxes method and 10% higher than that of the MCG method. Meanwhile, CCOP detects one picture in 0.043 s, which is 0.2 s faster than the EdgeBoxes method and 11.4 s faster than the MCG method, showing that compared with the EdgeBoxes and MCG methods, CCOP is more suitable for detecting birds, especially small-scale birds.
Table 5 Sampling performance comparison of the three methods
Optionally, fig. 16 is a third performance comparison schematic diagram of the target detection method provided by the present invention, fig. 17 is a fourth, and fig. 18 is a fifth. Further, as an example, for detecting birds of different scales, the target detection method CCOP provided by the embodiment of the invention is compared with Faster R-CNN and YOLOv2, yielding the PR curve for small-scale birds shown in fig. 16, the PR curve for medium-scale birds shown in fig. 17, and the PR curve for large-scale birds shown in fig. 18, with recall on the horizontal axis and precision on the vertical axis. Faster R-CNN first proposed sampling with anchor frames: when the number of anchor frames is set to 9, the average overlap between the anchor frames and the truth frames is 60.9%, and the target detection accuracy is improved by 2.0% compared with Faster R-CNN without anchor frames. YOLOv2 first proposed generating anchor frames by K-means clustering: when K is 9, the average overlap between the anchor frames and the truth frames is 67.2%, and the target detection accuracy is improved by 4.8% compared with anchor frames not generated by K-means clustering.
As can be seen from FIGS. 16-18, the detection performance of the target detection method CCOP provided by the embodiment of the invention is better than those of Faster R-CNN and YOLOv2, especially for the small-scale birds shown in fig. 16. On the one hand, this is because CCOP performs K-means clustering separately on the truth frames of small-scale, medium-scale, and large-scale birds to generate anchor frames, and the clustering distance metric is a function of the overlap ratio, which improves the average overlap between anchor frames and truth frames: the average overlap in CCOP is 70.2%, which is 9.3% higher than that of Faster R-CNN and 3% higher than that of YOLOv2. Another factor is that Faster R-CNN and YOLOv2 extract small-scale bird features through deep convolutional neural networks, in which the target features undergo 4 max-pooling operations; each pooling layer downsamples the features once, and the feature map resolution after pooling becomes 1/4 of that before pooling. Because small-scale birds carry little feature information to begin with, the target feature information is further reduced after pooling, and with the finally learned target features the network finds it difficult to distinguish all the small-scale birds in the data set. Hence the target detection method CCOP provided by the embodiment of the invention detects small-scale birds better than the two deep learning methods.
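The resolution loss named above is easy to quantify; the sketch below assumes 2×2 max pooling with stride 2 (so each pooling quarters the feature-map area) and an illustrative 24×24-pixel small-scale bird:

def feature_side(target_side_px, n_pool=4, stride=2):
    # Each 2x2/stride-2 max pooling halves each spatial dimension,
    # i.e. the feature-map area becomes 1/4 of that before pooling.
    side = target_side_px
    for _ in range(n_pool):
        side //= stride
    return side

# A 24x24-pixel small-scale bird occupies only about one feature cell
# after four poolings, leaving the network very little to learn from:
print(feature_side(24))  # -> 1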
According to the target detection method provided by the invention, the targets in the target image to be detected are positioned based on the corner features; by determining the contour constraint features corresponding to the corner features, it is ensured that each target sub-image to be detected in the target sub-image set to be detected contains at least one complete target to be detected; and the target detection result is determined by scoring each target sub-image to be detected in the set. Sampling accuracy is thus ensured while sampling redundancy is reduced, which improves the detection efficiency for each target sub-image to be detected, thereby improving the accuracy, sampling stability, and robustness of target detection.
The object detection device provided by the present invention will be described below, and the object detection device described below and the object detection method described above may be referred to correspondingly to each other.
The present invention also provides an object detection device, fig. 19 is a schematic structural diagram of the object detection device provided by the present invention, as shown in fig. 19, the object detection device 1900 includes: an acquisition module 1901, a first determination module 1902, a second determination module 1903, and a third determination module 1904, wherein:
an acquisition module 1901 for acquiring an image of a target to be detected;
The first determining module 1902 is configured to locate each target to be detected in the target image to be detected based on the corner feature corresponding to the target image to be detected, so as to obtain an initial target sub-image set to be detected;
a second determining module 1903, configured to determine a target sub-image set to be detected including at least one complete target to be detected based on the contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected;
and a third determining module 1904, configured to score each target sub-image to be detected in the target sub-image set to be detected, and determine a target detection result based on the scoring result.
According to the target detection device provided by the invention, the targets in the target image to be detected are positioned based on the corner features; by determining the contour constraint features corresponding to the corner features, it is ensured that each target sub-image to be detected in the target sub-image set to be detected contains at least one complete target to be detected; and the target detection result is determined by scoring each target sub-image to be detected in the set. Sampling accuracy is thus ensured while sampling redundancy is reduced, which improves the detection efficiency for each target sub-image to be detected, thereby improving the accuracy, sampling stability, and robustness of target detection.
Optionally, the first determining module 1902 is specifically configured to:
performing angular point positioning detection on the target image to be detected, and determining angular point characteristics corresponding to the target image to be detected;
and determining an initial target sub-image set to be detected based on the corner features and a predetermined anchor frame, wherein the anchor frame is used for determining the area size of each initial target sub-image in the initial target sub-image set to be detected.
Optionally, the anchor frame includes: target scale information and target size information.
Optionally, the first determining module 1902 is specifically configured to:
the anchor frame is determined based on the following steps:
obtaining a sample target anchor frame training dataset comprising: sample scale information and sample size information corresponding to each sample target anchor frame, wherein the sample scale information and the sample size information are determined based on the number of pixels corresponding to the sample target anchor frame;
clustering the sample target anchor frame training data set, and determining the association relation between the clustering category corresponding to the sample scale information and the average coincidence degree corresponding to each sample target anchor frame based on a clustering result;
Determining a cluster center value corresponding to each cluster category based on the association relation;
and determining each cluster center value as target size information of the anchor frame, and determining sample scale information corresponding to each cluster center value as target scale information of the anchor frame.
Optionally, the first determining module 1902 is specifically configured to:
determining the position relation between the corner features and the anchor frame, wherein the position relation includes: the pixel point corresponding to a corner coincides with a vertex of the anchor frame, the pixel point corresponding to a corner is located on an edge line of the anchor frame, or the pixel point corresponding to a corner coincides with the center point of the anchor frame;
traversing each target pixel point corresponding to the corner feature, and determining at least two initial target sub-images to be detected corresponding to each target pixel point based on the position relation and the target images to be detected;
and determining an initial target sub-image set to be detected based on each initial target sub-image to be detected.
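As a minimal sketch of these position relations, the helper below enumerates, for one corner point and one anchor size, the candidate boxes in which the corner coincides with a vertex, with an edge midpoint (taken as a representative point on each edge line), or with the center; with 12 anchor sizes this yields 9 × 12 = 108 candidates per corner, consistent with the count mentioned in the description, though the exact placements used are an assumption for illustration:

def candidate_boxes(corner, w, h):
    # Returns boxes (x, y, w, h) for one corner (cx, cy) and one anchor size (w, h).
    cx, cy = corner
    placements = [
        (cx, cy), (cx - w, cy), (cx, cy - h), (cx - w, cy - h),  # corner on a vertex
        (cx - w / 2, cy), (cx - w / 2, cy - h),                  # corner on top/bottom edge
        (cx, cy - h / 2), (cx - w, cy - h / 2),                  # corner on left/right edge
        (cx - w / 2, cy - h / 2),                                # corner at the center
    ]
    return [(x, y, w, h) for (x, y) in placements]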
Optionally, the second determining module 1903 is specifically configured to:
inputting the target image to be detected into a target contour detection model, and outputting contour features corresponding to the target image to be detected, wherein the target contour detection model is obtained by performing supervision training based on sample target contour training data and labels corresponding to the sample target contour training data;
And determining contour constraint features corresponding to the initial target sub-images to be detected based on the contour features.
Optionally, the second determining module 1903 is specifically configured to:
traversing the initial target sub-image set to be detected, and comparing the contour constraint features with each initial target sub-image to be detected in the set;
under the condition that the comparison result shows that the initial target sub-image to be detected contains the contour constraint features, determining the initial target sub-image to be detected as a target sub-image to be detected;
and determining the target sub-image set to be detected based on each target sub-image to be detected.
Optionally, a third determining module 1904 is specifically configured to:
traversing the target sub-image set to be detected, and determining target size information and edge pixel information of each target sub-image to be detected, wherein the target size information comprises: the length value and the width value of the target sub-image to be detected, and the edge pixel information comprises: the weight value of the adjacent pixel difference value and the absolute value of the adjacent pixel difference value in the target sub-image to be detected;
and calling a scoring function, and determining a scoring result corresponding to each target sub-image to be detected based on the target size information and the edge pixel information, wherein the scoring function is used for representing the probability that each target sub-image to be detected in the target sub-image set to be detected belongs to a target class.
Optionally, the scoring function includes perimeter sensitivity parameters, where the perimeter sensitivity parameters are used to characterize an influence of a perimeter corresponding to a target sub-image to be detected on the scoring result;
and the perimeter sensitive parameters are determined based on the target scale information corresponding to each target sub-image to be detected.
Fig. 20 is a schematic structural diagram of an electronic device according to the present invention, and as shown in fig. 20, the electronic device may include: a processor 2010, a communication interface (Communications Interface) 2020, a memory 2030 and a communication bus 2040, wherein the processor 2010, the communication interface 2020, and the memory 2030 communicate with each other over the communication bus 2040. Processor 2010 may invoke logic instructions in memory 2030 to perform a target detection method comprising:
acquiring a target image to be detected;
positioning each target to be detected in the target image to be detected based on the corner features corresponding to the target image to be detected to obtain an initial target sub-image set to be detected;
determining a target sub-image set to be detected containing at least one complete target to be detected based on contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected;
And scoring each target sub-image to be detected in the target sub-image set to be detected, and determining a target detection result based on the scoring result.
Further, the logic instructions in the memory 2030 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially, or in the part contributing to the prior art, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of performing the object detection method provided by the methods described above, the method comprising:
Acquiring a target image to be detected;
positioning each target to be detected in the target image to be detected based on the corner features corresponding to the target image to be detected to obtain an initial target sub-image set to be detected;
determining a target sub-image set to be detected containing at least one complete target to be detected based on contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected;
and scoring each target sub-image to be detected in the target sub-image set to be detected, and determining a target detection result based on the scoring result.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the object detection method provided by the above methods, the method comprising:
acquiring a target image to be detected;
positioning each target to be detected in the target image to be detected based on the corner features corresponding to the target image to be detected to obtain an initial target sub-image set to be detected;
determining a target sub-image set to be detected containing at least one complete target to be detected based on contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected;
And scoring each target sub-image to be detected in the target sub-image set to be detected, and determining a target detection result based on the scoring result.
The apparatus embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method of detecting an object, comprising:
acquiring a target image to be detected;
positioning each target to be detected in the target image to be detected based on the corner features corresponding to the target image to be detected to obtain an initial target sub-image set to be detected;
determining a target sub-image set to be detected containing at least one complete target to be detected based on contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected;
and scoring each target sub-image to be detected in the target sub-image set to be detected, and determining a target detection result based on the scoring result.
2. The method for detecting a target according to claim 1, wherein positioning each target to be detected in the target image to be detected based on the corner feature corresponding to the target image to be detected to obtain an initial target sub-image set to be detected includes:
performing angular point positioning detection on the target image to be detected, and determining angular point characteristics corresponding to the target image to be detected;
and determining an initial target sub-image set to be detected based on the corner features and a predetermined anchor frame, wherein the anchor frame is used for determining the area size of each initial target sub-image in the initial target sub-image set to be detected.
3. The target detection method according to claim 2, wherein the anchor frame includes: target scale information and target size information;
the anchor frame is determined based on the following steps:
obtaining a sample target anchor frame training dataset comprising: sample scale information and sample size information corresponding to each sample target anchor frame, wherein the sample scale information and the sample size information are determined based on the number of pixels corresponding to the sample target anchor frame;
Clustering the sample target anchor frame training data set, and determining the association relation between the clustering category corresponding to the sample scale information and the average coincidence degree corresponding to each sample target anchor frame based on a clustering result;
determining a cluster center value corresponding to each cluster category based on the association relation;
and determining each cluster center value as target size information of the anchor frame, and determining sample scale information corresponding to each cluster center value as target scale information of the anchor frame.
4. The object detection method according to claim 2, wherein the determining an initial set of object sub-images to be detected based on the corner features and a predetermined anchor frame comprises:
determining the position relation between the corner features and the anchor frame, wherein the position relation includes: the pixel point corresponding to a corner coincides with a vertex of the anchor frame, the pixel point corresponding to a corner is located on an edge line of the anchor frame, or the pixel point corresponding to a corner coincides with the center point of the anchor frame;
traversing each target pixel point corresponding to the corner feature, and determining at least two initial target sub-images to be detected corresponding to each target pixel point based on the position relation and the target images to be detected;
And determining an initial target sub-image set to be detected based on each initial target sub-image to be detected.
5. The object detection method according to any one of claims 1 to 4, wherein contour constraint features corresponding to the initial set of object sub-images to be detected are determined based on the steps of:
inputting the target image to be detected into a target contour detection model, and outputting contour features corresponding to the target image to be detected, wherein the target contour detection model is obtained by performing supervision training based on sample target contour training data and labels corresponding to the sample target contour training data;
and determining contour constraint features corresponding to the initial target sub-images to be detected based on the contour features.
6. The method according to claim 5, wherein the determining a target sub-image set to be detected including at least one complete target to be detected based on the contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected includes:
traversing the initial target sub-image set to be detected, and comparing the contour constraint features with each initial target sub-image to be detected in the set;
Under the condition that the comparison result shows that the initial target sub-image to be detected contains the contour constraint features, determining the initial target sub-image to be detected as a target sub-image to be detected;
and determining the target sub-image set to be detected based on each target sub-image to be detected.
7. The method according to any one of claims 1 to 4, wherein determining the scoring result includes:
traversing the target sub-image set to be detected, and determining target size information and edge pixel information of each target sub-image to be detected, wherein the target size information comprises: the length value and the width value of the target sub-image to be detected, and the edge pixel information comprises: the weight value of the adjacent pixel difference value and the absolute value of the adjacent pixel difference value in the target sub-image to be detected;
and calling a scoring function, and determining a scoring result corresponding to each target sub-image to be detected based on the target size information and the edge pixel information, wherein the scoring function is used for representing the probability that each target sub-image to be detected in the target sub-image set to be detected belongs to a target class.
8. The target detection method according to claim 7, wherein the scoring function includes perimeter sensitivity parameters for characterizing the influence of the perimeter corresponding to the target sub-image to be detected on the scoring result;
And the perimeter sensitive parameters are determined based on the target scale information corresponding to each target sub-image to be detected.
9. An object detection apparatus, comprising:
the acquisition module is used for acquiring an image of a target to be detected;
the first determining module is used for positioning each target to be detected in the target image to be detected based on the corner characteristics corresponding to the target image to be detected, so as to obtain an initial target sub-image set to be detected;
the second determining module is used for determining a target sub-image set to be detected, which contains at least one complete target to be detected, based on the contour constraint features corresponding to the initial target sub-image set to be detected and the initial target sub-image set to be detected;
and the third determining module is used for scoring each target sub-image to be detected in the target sub-image set to be detected and determining a target detection result based on the scoring result.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the object detection method according to any one of claims 1 to 8 when executing the program.
CN202310299475.4A 2023-03-24 2023-03-24 Target detection method and device and electronic equipment Pending CN116543171A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310299475.4A CN116543171A (en) 2023-03-24 2023-03-24 Target detection method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310299475.4A CN116543171A (en) 2023-03-24 2023-03-24 Target detection method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN116543171A true CN116543171A (en) 2023-08-04

Family

ID=87449521

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310299475.4A Pending CN116543171A (en) 2023-03-24 2023-03-24 Target detection method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN116543171A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116778532A (en) * 2023-08-24 2023-09-19 汶上义桥煤矿有限责任公司 Underground coal mine personnel target tracking method
CN116778532B (en) * 2023-08-24 2023-11-07 汶上义桥煤矿有限责任公司 Underground coal mine personnel target tracking method
CN117295157A (en) * 2023-11-24 2023-12-26 北京国旺盛源智能终端科技有限公司 Positioning method and system for wearable back splint terminal
CN117295157B (en) * 2023-11-24 2024-02-13 北京国旺盛源智能终端科技有限公司 Positioning method and system for wearable back splint terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination