CN110503095B - Positioning quality evaluation method, positioning method and device of target detection model

Positioning quality evaluation method, positioning method and device of target detection model

Info

Publication number
CN110503095B
CN110503095B (application CN201910794302.3A; publication CN110503095A)
Authority
CN
China
Prior art keywords
frame
relative position
intersection ratio
correction value
positioning
Prior art date
Legal status
Active
Application number
CN201910794302.3A
Other languages
Chinese (zh)
Other versions
CN110503095A (en)
Inventor
丁建伟
王蓉
李锦泽
Current Assignee
PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Original Assignee
PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Priority date
Filing date
Publication date
Application filed by PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA filed Critical PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Priority to CN201910794302.3A
Publication of CN110503095A
Application granted
Publication of CN110503095B
Status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 - Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a positioning quality evaluation method, a positioning method and a positioning device for a target detection model. The intersection ratio of a prediction frame and the corresponding real frame is calculated together with the center distance between the two frames and the diagonal length of their minimum covering frame; a relative position parameter is computed from the center distance and the diagonal length, and the intersection ratio is corrected with this parameter. The corrected metric therefore reflects the distance relationship between the compared objects in addition to their intersection area, characterizes the intersection state more accurately, and improves positioning precision. When extended to target detection positioning, it effectively improves the sensitivity and accuracy of detection and positioning in scenes where many detection targets are densely distributed.

Description

Target detection model positioning quality evaluation method, positioning method and device
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a positioning quality evaluation method, a positioning method and positioning equipment of a target detection model.
Background
The intersection over union (IoU), also known as the Jaccard index, is the most common metric for comparing the similarity of two arbitrary shapes. IoU encodes the shape attributes of the compared objects (e.g., the widths, heights and positions of two bounding boxes) as a region attribute and then computes a normalized measure of that region's area (or volume). This area property makes the IoU metric independent of the scale of the target. It is because of this property that IoU is used in the computer vision field to evaluate object segmentation, and the performance metrics of tasks such as object tracking and object detection all depend on it.
Unlike other computer vision tasks, object detection focuses on only two sub-tasks: classification and localization. Within the target detection field, future work will pay increasing attention to improving target positioning; moreover, judged by subjective human visual evaluation, the requirements on object positioning are very strict.
Existing target detection evaluation indexes are essentially based on the standard intersection ratio, but the IoU index does not truly reflect how the human eye measures the regression accuracy of target positioning. The 0.5 threshold of the standard IoU is considered too loose a criterion, while an overly high IoU threshold runs into a learning bottleneck caused by ambiguous or wrong labels in the data set. Beyond the threshold selection problem, the IoU metric has some fatal drawbacks: if two objects do not overlap, the IoU value is zero and the positional relationship between the two objects cannot be recovered. Furthermore, when two objects overlap in different directions but with the same intersection area, their IoU values are exactly equal, so the value of the IoU function does not reflect how the overlap between the two objects occurs. It is therefore necessary to provide an IoU metric that is more consistent with the subjective perceptual assessment of the human eye.
Disclosure of Invention
The invention aims to overcome the defect that the intersection ratio IoU cannot reflect the positional relationship of the compared objects, by modifying the value of the intersection ratio IoU, so as to improve the accuracy of computer vision target positioning training.
The technical scheme for solving the problems is as follows:
in one aspect, a method for evaluating the positioning quality of a target detection model is provided, which includes:
positioning a target to be detected by using a target detection model to obtain a corresponding prediction frame;
calculating the intersection ratio, the center distance and the diagonal length of the minimum covering frame of the prediction frame and the corresponding real frame;
calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame;
correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value;
and evaluating the positioning quality of the target detection model according to the intersection ratio correction value.
In some embodiments, calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame comprises:
and calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain the relative position parameter.
In some embodiments, the correcting the corresponding intersection ratio by using the relative position parameter to obtain a corrected value of the intersection ratio includes:
and calculating the difference between the intersection ratio and the relative position parameter to obtain the intersection ratio correction value.
In some embodiments, evaluating the positioning quality of the target detection model according to the intersection ratio correction value includes:
and judging the positioning result of a prediction frame whose intersection ratio correction value is larger than a first set value to be correct.
In another aspect, the present invention also provides a target positioning method, including:
respectively obtaining a plurality of prediction boxes and corresponding classification probability values of a plurality of targets to be detected through target detection positioning operation;
calculating the intersection ratio, the center distance and the diagonal length of the minimum covering frame among the prediction frames;
calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame;
correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value;
and screening the optimal prediction frame of each target to be detected by taking the intersection ratio correction value as a standard.
In some embodiments, calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame comprises:
and calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain the relative position parameter.
In some embodiments, modifying the corresponding intersection ratio using the relative position parameter to obtain an intersection ratio modification value includes:
and calculating the difference between the intersection ratio and the relative position parameter to obtain the intersection ratio correction value.
In some embodiments, screening the optimal prediction frame of the object to be detected with the cross-over ratio correction value as a standard includes:
and screening out the optimal prediction frame by non-maximum suppression iteration, using the intersection ratio correction value as the criterion.
In another aspect, the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the steps of the method are implemented.
In another aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above method.
The invention has the beneficial effects that:
according to the positioning quality evaluation method of the target detection model, the intersection ratio is corrected by adopting the center distance between the candidate frame and the real frame and the diagonal length of the minimum covering frame of the candidate frame and the real frame, so that the distance relation between comparison objects can be further reflected on the basis of reflecting the intersection area of the comparison objects, the intersection state can be more accurately reflected, and the positioning accuracy is improved. According to the target positioning method, the intersection ratio correction value is used as a parameter to replace the intersection ratio in the prior art, and the sensitivity can be improved when the optimal prediction frame is screened through maximum value inhibition.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic diagram illustrating the positions of the centers of a candidate frame and a real frame according to an example of the present invention;
FIG. 2 is a schematic view of a position of the candidate frame in FIG. 1 under a state of center shift;
FIG. 3 is a diagram illustrating the alignment positions of the lower right corner of two frames in the inclusion state according to an example of the present invention;
FIG. 4 is a schematic view of the right long sides of two frames connected and aligned with each other according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating the alignment of the centers of two frames in the inclusion state according to an exemplary embodiment of the present invention;
FIG. 6 is a diagram illustrating the alignment positions of the lower right corner of two frames in the inclusion state according to another example of the present invention;
FIG. 7 is a schematic diagram illustrating the overlapping positions of the long sides of two frames in a partially overlapped state according to an exemplary embodiment of the present invention;
FIG. 8 is a schematic diagram illustrating the overlapping positions of two frames along the diagonal in a partially overlapped state according to an exemplary embodiment of the present invention;
FIG. 9 is a schematic diagram illustrating the position of the two frames in a diagonal line alignment when the two frames are not aligned according to an example of the present invention;
FIG. 10 is a schematic view of the positions of the bottom of two frames aligned and the long sides connected in a misaligned state according to an example of the present invention;
FIG. 11 is a schematic view of the right long sides of two frames connected together at an axisymmetric position in a misaligned state according to an example of the present invention;
FIG. 12 is a schematic view of the right broadsides of two frames being connected and in an axisymmetric position in a misaligned state according to an example of the present invention;
fig. 13 is a schematic flow chart illustrating a positioning quality evaluation method of a target detection model according to an embodiment of the present invention;
fig. 14 is a flowchart illustrating a target positioning method according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
In the prior art, the evaluation of target positioning quality in the target detection field mainly calculates the intersection ratio between a prediction frame generated by the network model and the real frame corresponding to the ground truth, and the size of the intersection ratio reflects the positioning quality of the prediction frame: the closer the intersection ratio is to 1, the better the positioning quality of the prediction frame.
The area intersection ratio IoU is the default evaluation metric currently used for object localization in the object detection field, and it is used to identify true and false positives in a set of predictions. When using the intersection ratio IoU as an evaluation index, an exact metric threshold must be selected. For example, in the PASCAL VOC challenge, the well-known detection accuracy measurement (i.e., mean average precision, mAP) is calculated based on a fixed IoU threshold of 0.5. However, an arbitrarily selected IoU threshold does not fully reflect the positioning performance of different algorithms, since any positioning accuracy above the threshold is treated equally. Many algorithms whose positioning quality differs visually can therefore achieve comparably high mAP on the VOC data set. Although the VOC data set later raised the IoU threshold to 0.75, the essential problem remains unsolved.
To make the target detection performance measure less sensitive to the choice of IoU threshold, the MS COCO benchmark evaluation averages mAP over multiple IoU thresholds: one threshold every 0.05 between 0.5 and 0.95, one mAP computed per threshold, and the 10 values finally averaged into the final AP. Under this evaluation the AP of existing first-tier algorithms falls below 0.5, which makes it appear that there is still much room for improving accurate target positioning; in fact, the development of target detection algorithms has reached a bottleneck because large-scale data sets contain a large number of wrong labels and a large number of irresolvably ambiguous labels.
The prior art also includes GIoU (Generalized Intersection over Union), which extends and generalizes the concept of IoU to the non-overlapping case by introducing the minimum covering rectangle, thereby addressing the weakness that IoU cannot account for the area of the non-overlapping part. Its calculation formula is GIoU = IoU - (C - U)/C, where IoU is the intersection ratio, C is the area of the smallest covering rectangle, and U is the area of the union of the two boxes. The intersection ratio IoU cannot reflect how two objects overlap and cannot distinguish distances when two object frames do not intersect (its value ranges from 0 to 1). GIoU does attend to the area of the non-overlapping part that IoU ignores, but its evaluation of how well a frame regresses is still imperfect: it considers only the area attribute, whereas frame regression has four dimensions, (x, y, w, h) or (x1, y1, x2, y2). Clearly, representing only the area attribute is a proxy solution that loses part of the positioning information, and on large-scale data sets the accuracy ceiling is still reached quickly.
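For concreteness, the GIoU computation just described can be sketched as follows for axis-aligned boxes given as corner tuples (x1, y1, x2, y2). This is a minimal illustrative sketch, not code from any cited work; the helper names are our own.

    def box_area(b):
        # area of an axis-aligned box (x1, y1, x2, y2)
        return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

    def iou_and_union(a, b):
        # intersection rectangle of the two boxes (empty if they do not overlap)
        iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = iw * ih
        union = box_area(a) + box_area(b) - inter
        return inter / union, union

    def giou(a, b):
        # GIoU = IoU - (C - U)/C, with C the area of the minimum covering rectangle
        iou, union = iou_and_union(a, b)
        c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
        return iou - (c_area - union) / c_area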
Therefore, on the basis of the prior art, as shown in fig. 13, the present invention provides a method for evaluating the positioning quality of an object detection model, including:
step 101: positioning a target to be detected by using a target detection model to obtain a corresponding prediction frame;
step 102: calculating the intersection ratio of the prediction frame and the corresponding real frame, the center distance and the diagonal length of the minimum covering frame;
step 103: calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame;
step 104: correcting the corresponding cross-over ratio by using the relative position parameter to obtain a cross-over ratio correction value;
step 105: and evaluating the positioning quality of the target detection model according to the intersection ratio correction value.
In step 101, the target detection model may be a program module or functional module obtained by any of various existing target detection methods. The target to be detected may be specific content to be detected in the image (for example, a human face or an animal face); an image may contain only that specific content, or it may contain the specific content together with other non-target content. The target detection model is used to detect the position of the target to be detected in the image, and the position detected by the model can be represented as a prediction frame. Since the positioning accuracy of different target detection models may differ, the subsequent steps can be used to evaluate the positioning performance of a target detection model.
In step 102, the intersection ratio can be calculated by the existing method, i.e., IoU. The center distance refers to the distance between the geometric centers of the two frames; for example, when both the prediction frame and the real frame are rectangular, the center distance is the distance between the diagonal intersection point of the rectangular prediction frame and that of the rectangular real frame. The minimum covering frame is the smallest rectangular frame covering both the prediction frame and the real frame; in this case, the diagonal length can be the length of either diagonal.
In this embodiment, beyond calculating the intersection ratio, the intersection ratio is further corrected using the center distance between the prediction frame and the corresponding real frame and the diagonal length of the minimum covering frame. The corrected value thus reflects the relative relationship between the center positions of the prediction frame and the real frame in addition to the overlap of their areas, which improves the sensitivity of threshold evaluation.
In some embodiments, the step 103 of calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame may further include the steps of:
1031, calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain a relative position parameter.
In the embodiment, in order to further reflect the relative position relationship between the prediction frame and the real frame in the cross-over ratio, the sensitivity of threshold evaluation is improved; the center distance represents the actual position difference, the minimum cover frame diagonal length represents the farthest distance in the range of the prediction frame and the real frame, and the quotient of the center distance and the minimum cover frame diagonal length is used as a relative position parameter, so that the relative position relation between the prediction frame and the real frame can be more effectively reflected.
Further, the larger the relative position parameter, the farther apart the prediction frame and the real frame are and the worse the coincidence degree; the smaller the relative position parameter, the closer the prediction frame and the real frame are and the better the coincidence degree. Thus, the value of the relative position parameter is inversely related to the degree of coincidence.
In other embodiments, the relative position parameter may be calculated in other ways, for example, a factor may be added to the quotient of the center distance and the diagonal length of the minimum coverage box; alternatively, the center distance and the minimum footprint diagonal length may be subtracted and the resulting difference divided by the center distance or divided by the minimum footprint diagonal length.
In some embodiments, the step 104 of correcting the corresponding cross-over ratio by using the relative position parameter to obtain a cross-over ratio correction value may include the steps of:
1041, calculating the difference between the cross-over ratio and the relative position parameter to obtain a cross-over ratio correction value. The correction value may be an absolute value of a difference between the cross-over ratio and the relative position parameter.
In the present embodiment, the intersection ratio reflects the degree of area overlap between the prediction frame and the real frame but cannot further reflect their relative positional relationship. Since the intersection ratio reflects coincidence through area overlap, a larger intersection ratio indicates a larger overlapping part and better coincidence, and a smaller intersection ratio indicates a smaller overlapping part and worse coincidence; the value of the intersection ratio is therefore positively correlated with the degree of coincidence.
On this basis, the difference between the intersection ratio and the relative position parameter is calculated as the intersection ratio correction value, which unifies the correlation between the intersection ratio and the degree of coincidence, so that the coincidence of the prediction frame and the real frame, and hence the positioning quality of the prediction frame, can be evaluated accurately. Specifically, the positioning quality can be judged according to whether the intersection ratio correction value falls within a set threshold range (determined empirically or from the positioning accuracy requirement), or the intersection ratio correction values of different target detection models can be compared to judge their relative quality.
In this embodiment, the intersection ratio correction value may be directly used to evaluate the positioning quality of the prediction frame, that is, the larger the intersection ratio correction value is, the better the positioning quality of the prediction frame is.
For example, the calculation method of the cross-over ratio correction value may include:
S1, calculating the center distance d between the prediction frame and the real frame;
S2, generating the minimum covering frame of the prediction frame and the real frame, and obtaining the diagonal length c of the minimum covering frame;
and S3, taking the quotient d/c of the center distance d and the diagonal length c as the relative position parameter, and subtracting the relative position parameter from the intersection ratio to obtain the intersection ratio correction value CIoU; specifically, CIoU = IoU - d/c.
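Steps S1 to S3 map directly onto code. A minimal sketch, reusing the iou_and_union helper from the GIoU sketch above and the same illustrative corner-tuple box convention:

    import math

    def ciou_metric(a, b):
        # S1: distance d between the geometric centers of the two frames
        d = math.hypot((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2,
                       (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2)
        # S2: diagonal length c of the minimum covering frame of the two frames
        c = math.hypot(max(a[2], b[2]) - min(a[0], b[0]),
                       max(a[3], b[3]) - min(a[1], b[1]))
        # S3: relative position parameter d/c subtracted from the intersection ratio
        iou, _ = iou_and_union(a, b)
        return iou - d / c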
Illustratively, as shown in FIG. 1, when the prediction frame and the real frame are aligned at the center, let the real frame have length w1 and width h1 and the prediction frame have length w2 and width h2, with w1 > w2 and h1 < h2 in the figure. By the definition of the intersection ratio IoU, the IoU of the two boxes is (w2 x h1)/(w1 x h1 + w2 x h2 - w2 x h1). By the calculation definition of GIoU, the GIoU of the two boxes is (w2 x h1)/(w1 x h1 + w2 x h2 - w2 x h1) - (w1 x h2 - w1 x h1 - w2 x h2 + w2 x h1)/(w1 x h2).
Similarly, in FIG. 2 the prediction box is shifted away from the center position of the real box on the basis of FIG. 1, yet its IoU value and GIoU value are exactly the same as in FIG. 1. The coincidence of the two boxes in FIG. 1 and FIG. 2 is clearly not the same, so using only IoU or GIoU as the measure of frame coincidence is plainly deficient. Consider instead the intersection ratio correction value CIoU: in FIG. 1 the center points of the two frames coincide and their distance is 0, so the term d/c in CIoU is 0 and CIoU = IoU; in FIG. 2 the distance between the center points of the two frames is (w1 - w2)/2 and the diagonal length of the minimum covering rectangle is sqrt(w1^2 + h2^2), so CIoU = IoU - ((w1 - w2)/2)/sqrt(w1^2 + h2^2). This value is obviously smaller than in the center-aligned case of FIG. 1, so the registration can be recognized as worse than in FIG. 1, which is consistent with the actual situation.
Further, to illustrate that the cases of fig. 1 and fig. 2 are not special cases, the discussion continues here for cases where the two frames perfectly coincide, contain, intersect, do not intersect, infinity, and so on.
As shown in fig. 3, 4 and 5, when two frames are in an inclusive relationship, it is assumed that one of the large frames is 100 long and 60 wide; the other small box has a length of 50 and a width of 30.
In fig. 3, when the small frame is included in the lower right corner of the large frame, IoU is 0.25, GIoU is 0.25, and CIoU is 0.
In fig. 4, when the small frames are connected at one long side of the large frame and are axisymmetrical, IoU is 0.25, GIoU is 0.25, and CIoU is 0.1214.
In fig. 5, when the small frame and the large frame are aligned at the center, IoU is 0.25, GIoU is 0.25, and CIoU is 0.25.
Comparing the parameters of fig. 3, fig. 4 and fig. 5, it can be seen that, relative to IoU and GIoU, CIoU further reflects the crossing relationship between boxes, in particular the degree of center alignment of the two boxes: the higher the CIoU value, the higher the degree of center alignment of the two boxes.
As also shown in FIG. 6, assume one large box of length 100 and width 60 and another small box of length 25 and width 15; when the small box is contained in the lower right corner of the large box, IoU is 0.0625, GIoU is 0.0625, and CIoU is -0.3125.
As shown in fig. 7 and 8, the two frames are 100 long and 60 wide, and partially overlap each other.
In fig. 7, when the overlapped portion of the two frames is 15 wide, IoU is 0.1429, GIoU is 0.1429, and CIoU is -0.1675.
In fig. 8, when the two frames overlap by 1/4 of their area along the diagonal, IoU is 0.1429, GIoU is -0.0794, and CIoU is -0.1905.
As shown in fig. 9, 10, 11 and 12, it is assumed that the large frame has a length of 100 and a width of 60; the small frame is 50 in length and 30 in width; the two frames do not coincide.
In fig. 9, when the large frame and the small frame are connected diagonally, IoU is 0, GIoU is -0.4444, and CIoU is -0.5.
In fig. 10, when the large frame and the small frame are aligned at the bottom and connected at the long side, IoU is 0, GIoU is -0.3333, and CIoU is -0.3826.
In fig. 11, when the long sides of the large frame and the small frame are connected and are axisymmetrical, IoU is 0, GIoU is -0.3333, and CIoU is -0.3345.
In fig. 12, when the large frame and the small frame are connected at one broadside and are axisymmetrical, IoU is 0, GIoU is -0.3333, and CIoU is -0.4642.
When the two frames completely coincide, IoU = GIoU = CIoU = 1; when the two frames are infinitely far apart, IoU = 0 and GIoU = CIoU = -1.
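Under the assumed coordinate convention of the sketches above, the FIG. 3 configuration (a 100 x 60 frame containing a 50 x 30 frame in its lower right corner) reproduces the values reported above:

    big = (0.0, 0.0, 100.0, 60.0)
    small = (50.0, 30.0, 100.0, 60.0)      # lower right corner of the big frame
    iou, _ = iou_and_union(big, small)
    print(iou)                             # 0.25
    print(giou(big, small))                # 0.25 (covering rectangle equals the big frame)
    print(ciou_metric(big, small))         # 0.0  (d/c = sqrt(850)/sqrt(13600) = 0.25)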
In summary, CIoU inherits the scale invariance of GIoU; like GIoU, CIoU can be regarded as a lower bound of IoU and is always less than or equal to IoU. When computing the loss, the regression loss 1 - CIoU can be used, and a gradient is still back-propagated even when the two compared frames do not intersect. CIoU has the same value range as GIoU but pays more attention to hard regression boxes, which borrows the idea of the Focal Loss used in classification.
In other embodiments, for the evaluation of the prediction frame after the detection and positioning are completed, the intersection ratio correction value may be calculated according to the real frame, and the prediction frame whose intersection ratio correction value is greater than the first setting value is determined as the positioning result is correct. For example, the first set value is 0.35.
In other embodiments, the intersection-to-parallel ratio and the intersection-to-parallel ratio correction value may be used as the determination criterion at the same time, that is, when the intersection-to-parallel ratio correction value of the prediction frame and the real frame is greater than the first setting value and the intersection-to-parallel ratio is greater than the second setting value, the prediction frame is determined to be correct in positioning result. For example, the first set value is 0.35, and the second set value is 0.5.
In the present embodiment, since the factors of the intersection ratio correction value include both the intersection ratio and the relative position parameter, different intersection ratios and relative position parameters may in practice yield the same correction value, which prevents effective distinction; for example, in one case CIoU = IoU - d/c = 0.5 - 0.05 = 0.45, while in another CIoU = IoU - d/c = 0.55 - 0.1 = 0.45. Therefore, further constraining the intersection ratio correction value on top of a constraint on the intersection ratio itself improves the sensitivity of threshold evaluation.
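A minimal sketch of this combined decision rule, reusing the earlier helpers; the defaults 0.35 and 0.5 are the example first and second set values mentioned above, not prescribed constants:

    def positioning_is_correct(pred, gt, first_set_value=0.35, second_set_value=0.5):
        # correct only if both the correction value and the plain IoU clear their thresholds
        iou, _ = iou_and_union(pred, gt)
        return ciou_metric(pred, gt) > first_set_value and iou > second_set_value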
Further, for the target detection positioning result, the positioning accuracy of a specified class of objects in a single image, the average precision of the specified class over multiple images, and the mean average precision over multiple classes and multiple images may be calculated to reflect the positioning quality of the target detection model. The method includes: for the detection result of objects of a specified category in a single target detection image, counting the number of correct prediction frames, and calculating the ratio of that number to the total number of objects of the specified category to obtain the accuracy rate of the positioning result for that category.
And respectively calculating the accuracy rate of the object of the specified category in each target detection image and calculating the average value based on the plurality of target detection images to obtain the average accuracy of the positioning of the object of the specified category.
And respectively calculating corresponding average precision and averaging aiming at a plurality of objects of different types on the basis of a plurality of target detection images to obtain an average precision average value.
The higher the values of the accuracy rate, the average accuracy and the average accuracy mean value are, the better the positioning quality of the target detection model is.
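The three levels of aggregation described above can be sketched as follows; how a prediction frame is judged correct (e.g., via the combined threshold rule sketched earlier) is assumed to have been decided per image, so the inputs here are simple counts, and all function names are illustrative:

    def accuracy_rate(num_correct_frames, num_objects):
        # accuracy of one specified category in a single target detection image
        return num_correct_frames / num_objects

    def average_precision(per_image_counts):
        # per_image_counts: list of (num_correct_frames, num_objects) for one
        # category over several target detection images
        rates = [accuracy_rate(c, n) for c, n in per_image_counts if n > 0]
        return sum(rates) / len(rates)

    def mean_average_precision(per_category_counts):
        # per_category_counts: {category: [(num_correct_frames, num_objects), ...]}
        aps = [average_precision(counts) for counts in per_category_counts.values()]
        return sum(aps) / len(aps)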
On the other hand, based on the same inventive concept as the positioning quality evaluation method of the target detection model shown in fig. 13, the embodiment of the present invention further provides a target positioning method, and repeated details are not repeated. As shown in fig. 14, the target location method of some embodiments may include:
step 201: respectively obtaining a plurality of prediction boxes and corresponding classification probability values of a plurality of targets to be detected through target detection positioning operation;
step 202: calculating the intersection ratio and the center distance between the prediction frames and the diagonal length of the minimum covering frame;
step 203: calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame;
step 204: correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value;
step 205: and screening the optimal prediction frame of each target to be detected by taking the cross-over ratio correction value as a standard.
In this embodiment, the target detection positioning operation may adopt a target detection model formed by training a deep learning neural network, and a plurality of prediction frames with a probability reaching a specified threshold may be output for a plurality of targets to be detected through the operation of the target detection model.
In practical applications, multiple targets to be detected of the same type may be sparsely or densely distributed. Specifically, when the target detection model outputs multiple prediction frames for multiple targets to be detected in a dense scene, the prediction frames of different targets overlap one another. In this case, selecting the optimal frame using IoU alone as the screening criterion reduces the effectiveness of the judgment: different detection targets cannot be distinguished efficiently, and the output either loses sensitivity or merges excessively. To effectively improve the ability to output the optimal prediction frame for each detection target in a dense scene, this embodiment adopts as the comparison criterion the intersection ratio correction value, corrected by the center distance between two frames and the diagonal length of the minimum covering frame, which improves sensitivity.
In some embodiments, the step 203 of calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum coverage frame may further include the steps of: 2031, calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain the relative position parameter.
In some embodiments, the step 204 of correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value may further include the steps of: 2041, calculating the difference between the cross-over ratio and the relative position parameter to obtain the cross-over ratio correction value.
Specifically, the intersection ratio correction value between prediction frames is calculated; when the correction value between two prediction frames is larger than a first set threshold, the prediction frame with the higher classification probability value is kept and the one with the lower value is discarded, and when the correction value is smaller than the first set threshold, both prediction frames are kept. The calculation repeats according to these steps, and finally the corresponding optimal prediction frame is output for each detection target. The first set threshold is determined by the actual application scenario and may, for example, be 0.5.
In some embodiments, step 205, namely, screening the optimal prediction frame of the object to be detected by using the cross-over ratio correction value as a criterion, may further include the steps of: 2051, iteratively screening the optimal prediction frame by using Non-maximum suppression (Nms) with the cross-over ratio correction value as a standard.
In this embodiment, in order to reduce the calculation amount, the optimal prediction box may be screened out by non-maximum inhibition Nms iteration; compared with the traditional method for eliminating the redundant frames by adopting the cross-over ratio IoU as the standard, the method for eliminating the redundant frames by adopting the cross-over ratio correction value CIoU as the evaluation standard can improve the detection positioning sensitivity and accuracy in the scene of dense distribution of a plurality of detection targets.
Illustratively, a plurality of prediction frames obtained by the target detection positioning operation on an image are arranged by their classification probability values from small to large, for example: A, B, C, D, E, F, G.
Starting from prediction frame G, which has the largest classification probability value, it is judged for each of prediction frames A, B, C, D, E and F whether its intersection ratio correction value CIoU with prediction frame G is larger than a second set threshold; prediction frames above the threshold are discarded and those below it are kept. Suppose the calculation shows that the CIoU of prediction frames B and D with prediction frame G exceeds the second set threshold: then B and D are discarded and G is kept.
Further, the prediction frame E with the highest classification probability value among the remaining prediction frames A, C, E and F is selected, and it is judged for each of A, C and F whether its CIoU with prediction frame E is larger than the second set threshold, discarding frames above the threshold and keeping those below it. Suppose the CIoU of prediction frame A with prediction frame E exceeds the threshold: then A is discarded and E is kept. The iteration repeats according to the above steps, and finally all the retained prediction frames, G, E and C, are found.
The prediction boxes G, E, C are the optimal prediction boxes for the three detection targets in the image, respectively.
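The iteration just walked through can be sketched as a CIoU-based non-maximum suppression routine; ciou_metric is the illustrative sketch from earlier, and the second set threshold default of 0.5 is an assumption for the example:

    def nms_ciou(boxes, scores, second_set_threshold=0.5):
        # sort prediction frame indices by classification probability, highest first
        order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
        keep = []
        while order:
            best = order.pop(0)              # highest-scoring remaining frame
            keep.append(best)
            # discard frames whose CIoU with the kept frame exceeds the threshold
            order = [i for i in order
                     if ciou_metric(boxes[best], boxes[i]) <= second_set_threshold]
        return keep                          # indices of the optimal prediction frames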
In other embodiments, the deep neural network may be used to train the target detection model to locate the target to be detected. The process of training the target detection model by adopting the deep neural network can comprise the following steps:
S301, generating a plurality of candidate frames by a sliding window method or a selective search method;
S302, calculating the intersection ratio of each candidate frame and the corresponding real frame, together with the center distance and the diagonal length of the minimum covering frame;
S303, calculating the quotient of the center distance and the corresponding diagonal length of the minimum covering frame as the relative position parameter, and subtracting the relative position parameter from the intersection ratio to obtain the intersection ratio correction value;
and S304, classifying candidate frames whose intersection ratio correction value is larger than a third threshold as object frames for input to the model for training, and classifying candidate frames whose intersection ratio correction value is smaller than the third threshold as background frames.
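A sketch of the S301 to S304 labelling rule used to build the training set, again reusing the earlier ciou_metric sketch; the third threshold default is a placeholder assumption, not a value given by this disclosure:

    def label_candidate_frames(candidates, real_frame, third_threshold=0.5):
        # split candidate frames into object frames (positive samples) and
        # background frames (negative samples) by their intersection ratio
        # correction value against the real frame
        object_frames, background_frames = [], []
        for cand in candidates:
            if ciou_metric(cand, real_frame) > third_threshold:
                object_frames.append(cand)
            else:
                background_frames.append(cand)
        return object_frames, background_frames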
In another aspect, the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the steps of the method are implemented.
In another aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above method.
In the description herein, reference to the description of the terms "one embodiment," "a particular embodiment," "some embodiments," "for example," "an example," "a particular example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. The sequence of steps involved in the various embodiments is provided to schematically illustrate the practice of the invention, and the sequence of steps is not limited and can be suitably adjusted as desired.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for evaluating the positioning quality of a target detection model is characterized by comprising the following steps:
positioning a target to be detected by using a target detection model to obtain a corresponding prediction frame;
calculating the intersection ratio, the center distance and the diagonal length of the minimum covering frame of the prediction frame and the corresponding real frame;
calculating relative position parameters of the center distance and the corresponding diagonal length of the minimum coverage frame;
correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value;
and evaluating the positioning quality of the target detection model according to the intersection ratio correction value.
2. The method for evaluating the positioning quality of the object detection model according to claim 1, wherein calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum coverage frame comprises:
and calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain the relative position parameter.
3. The method for evaluating the positioning quality of the target detection model according to claim 2, wherein the step of correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value comprises:
and calculating the difference between the intersection ratio and the relative position parameter to obtain the intersection ratio correction value.
4. The method according to claim 3, wherein evaluating the positioning quality of the target detection model based on the cross-over ratio correction value includes:
and judging the positioning result of a prediction frame whose intersection ratio correction value is larger than a first set value to be correct.
5. A method of locating an object, comprising:
respectively obtaining a plurality of prediction boxes and corresponding classification probability values of a plurality of targets to be detected through target detection positioning operation;
calculating the intersection ratio, the center distance and the diagonal length of the minimum covering frame among the prediction frames;
calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame;
correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value;
and screening the optimal prediction frame of each target to be detected by taking the intersection ratio correction value as a standard.
6. The method of claim 5, wherein calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum coverage frame comprises:
and calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain the relative position parameter.
7. The method of claim 6, wherein the correcting the corresponding cross-over ratio by using the relative position parameter to obtain a cross-over ratio correction value comprises:
and calculating the difference between the intersection ratio and the relative position parameter to obtain the intersection ratio correction value.
8. The method for positioning the target according to claim 7, wherein the step of screening the optimal prediction frame of the target to be detected by using the cross-over ratio correction value as a standard comprises the following steps:
and screening out the optimal prediction frame by non-maximum suppression iteration, using the intersection ratio correction value as the criterion.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 8 are implemented when the program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
CN201910794302.3A 2019-08-27 2019-08-27 Positioning quality evaluation method, positioning method and device of target detection model Active CN110503095B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910794302.3A CN110503095B (en) 2019-08-27 2019-08-27 Positioning quality evaluation method, positioning method and device of target detection model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910794302.3A CN110503095B (en) 2019-08-27 2019-08-27 Positioning quality evaluation method, positioning method and device of target detection model

Publications (2)

Publication Number Publication Date
CN110503095A CN110503095A (en) 2019-11-26
CN110503095B true CN110503095B (en) 2022-06-03

Family

ID=68589590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910794302.3A Active CN110503095B (en) 2019-08-27 2019-08-27 Positioning quality evaluation method, positioning method and device of target detection model

Country Status (1)

Country Link
CN (1) CN110503095B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738072A (en) * 2020-05-15 2020-10-02 北京百度网讯科技有限公司 Training method and device of target detection model and electronic equipment
CN111797993B (en) * 2020-06-16 2024-02-27 东软睿驰汽车技术(沈阳)有限公司 Evaluation method and device of deep learning model, electronic equipment and storage medium
CN111814850A (en) * 2020-06-22 2020-10-23 浙江大华技术股份有限公司 Defect detection model training method, defect detection method and related device
CN112002131A (en) * 2020-07-16 2020-11-27 深圳云游四海信息科技有限公司 In-road parking behavior detection method and device
CN112001247A (en) * 2020-07-17 2020-11-27 浙江大华技术股份有限公司 Multi-target detection method, equipment and storage device
CN112001453B (en) * 2020-08-31 2024-03-08 北京易华录信息技术股份有限公司 Method and device for calculating accuracy of video event detection algorithm
CN112200217B (en) * 2020-09-09 2023-06-09 天津津航技术物理研究所 Identification algorithm evaluation method and system based on infrared image big data
CN112287898B (en) * 2020-11-26 2024-07-05 深源恒际科技有限公司 Method and system for evaluating text detection quality of image
CN113408342B (en) * 2021-05-11 2023-01-03 深圳大学 Target detection method for determining intersection ratio threshold based on features
CN113095301B (en) * 2021-05-21 2021-08-31 南京甄视智能科技有限公司 Road occupation operation monitoring method, system and server
CN115035186B (en) * 2021-12-03 2023-04-11 荣耀终端有限公司 Target object marking method and terminal equipment
CN116309696B (en) * 2022-12-23 2023-12-01 苏州驾驶宝智能科技有限公司 Multi-category multi-target tracking method and device based on improved generalized cross-over ratio

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875425A (en) * 2017-01-22 2017-06-20 北京飞搜科技有限公司 A kind of multi-target tracking system and implementation method based on deep learning
CN107862437A (en) * 2017-10-16 2018-03-30 中国人民公安大学 The public domain crowd massing method for early warning and system assessed based on risk probability
CN108197628A (en) * 2017-12-07 2018-06-22 维森软件技术(上海)有限公司 The joint judgment method of characteristics of image based on deep neural network
CN109145756A (en) * 2018-07-24 2019-01-04 湖南万为智能机器人技术有限公司 Object detection method based on machine vision and deep learning
CN110084829A (en) * 2019-03-12 2019-08-02 上海阅面网络科技有限公司 Method for tracking target, device, electronic equipment and computer readable storage medium
CN110070074A (en) * 2019-05-07 2019-07-30 安徽工业大学 A method of building pedestrian detection model

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression; Hamid Rezatofighi et al.; arXiv; 2019-04-15; pp. 1-9 *
The region of interest localization for glaucoma analysis from retinal fundus image using deep learning; Anirban Mitra et al.; Computer Methods and Programs in Biomedicine; October 2018; pp. 25-35 *
UnitBox: An Advanced Object Detection Network; Jiahui Yu et al.; arXiv; 2016-08-04; pp. 1-5 *
Forward vehicle detection based on ensemble learning and position information constraints; Geng Lei et al.; Computer Engineering and Science; October 2018; Vol. 40, No. 10; pp. 1844-1850 *

Also Published As

Publication number Publication date
CN110503095A (en) 2019-11-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant