CN110503095B - Positioning quality evaluation method, positioning method and device of target detection model

Positioning quality evaluation method, positioning method and device of target detection model

Info

Publication number
CN110503095B
CN110503095B (application CN201910794302.3A; publication CN110503095A)
Authority
CN
China
Prior art keywords
frame
relative position
intersection ratio
correction value
positioning
Prior art date
Legal status
Active
Application number
CN201910794302.3A
Other languages
Chinese (zh)
Other versions
CN110503095A (en)
Inventor
丁建伟
王蓉
李锦泽
Current Assignee
PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Original Assignee
PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Priority date
Filing date
Publication date
Application filed by PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA filed Critical PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Priority to CN201910794302.3A
Publication of CN110503095A
Application granted
Publication of CN110503095B
Status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 - Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a positioning quality evaluation method, a positioning method and a positioning device for a target detection model. The intersection ratio of a prediction frame and the corresponding real frame is calculated together with the center distance between the two frames and the diagonal length of their minimum covering frame; a relative position parameter is computed from the center distance and the diagonal length, and the intersection ratio is corrected with this parameter. The corrected metric therefore reflects the distance relationship between the compared objects in addition to their intersection area, characterizes the intersection state more accurately, and improves positioning precision. When extended to target detection positioning, it effectively improves the sensitivity and accuracy of detection and positioning in scenes where many detection targets are densely distributed.

Description

Target detection model positioning quality evaluation method, positioning method and device
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a positioning quality evaluation method, a positioning method and positioning equipment of a target detection model.
Background
The intersection over union (IoU), also known as the Jaccard index, is the most common metric for comparing the similarity of two arbitrary shapes. IoU encodes the shape attributes of the compared objects (e.g., the widths, heights and positions of two bounding boxes) as a region attribute and then computes a normalized measure of that region's area (or volume). This area property makes the IoU metric independent of the scale of the target. It is because of this property that IoU is used in the computer vision field to evaluate object segmentation, and the performance metrics of tasks such as object tracking and object detection all depend on it.
Unlike other computer vision tasks, object detection focuses on only two sub-tasks: classification and localization. Within the target detection field, future work will pay increasing attention to improving target positioning; moreover, judged by subjective human visual evaluation, the requirements on object positioning are very strict.
Existing target detection evaluation indexes are essentially based on the standard intersection ratio, but the IoU index does not truly reflect how the human eye measures the regression accuracy of target positioning. The 0.5 threshold of the standard IoU is considered too loose a criterion, while an overly high IoU threshold runs into a learning bottleneck caused by ambiguous or wrong labels in the data set. Beyond the threshold selection problem, the IoU metric has some fatal drawbacks: if two objects do not overlap, the IoU value is zero and the positional relationship between the two objects cannot be recovered. Furthermore, when two objects overlap in different directions but with the same intersection area, their IoU values are exactly equal, so the value of the IoU function does not reflect how the overlap between the two objects occurs. It is therefore necessary to provide an IoU metric that is more consistent with the subjective perceptual assessment of the human eye.
Disclosure of Invention
The invention aims to overcome the defect that the intersection ratio IoU cannot reflect the positional relationship of the compared objects, by modifying the value of the intersection ratio IoU, so as to improve the accuracy of computer vision target positioning training.
The technical scheme for solving the problems is as follows:
in one aspect, a method for evaluating the positioning quality of a target detection model is provided, which includes:
positioning a target to be detected by using a target detection model to obtain a corresponding prediction frame;
calculating the intersection ratio, the center distance and the diagonal length of the minimum covering frame of the prediction frame and the corresponding real frame;
calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame;
correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value;
and evaluating the positioning quality of the target detection model according to the intersection ratio correction value.
In some embodiments, calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame comprises:
and calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain the relative position parameter.
In some embodiments, the correcting the corresponding intersection ratio by using the relative position parameter to obtain a corrected value of the intersection ratio includes:
and calculating the difference between the intersection ratio and the relative position parameter to obtain the intersection ratio correction value.
In some embodiments, evaluating the positioning quality of the target detection model according to the intersection ratio correction value includes:
and judging the positioning result of a prediction frame whose intersection ratio correction value is larger than a first set value to be correct.
In another aspect, the present invention also provides a target positioning method, including:
respectively obtaining a plurality of prediction boxes and corresponding classification probability values of a plurality of targets to be detected through target detection positioning operation;
calculating the intersection ratio, the center distance and the diagonal length of the minimum covering frame among the prediction frames;
calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame;
correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value;
and screening the optimal prediction frame of each target to be detected by taking the intersection ratio correction value as a standard.
In some embodiments, calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame comprises:
and calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain the relative position parameter.
In some embodiments, modifying the corresponding intersection ratio using the relative position parameter to obtain an intersection ratio modification value includes:
and calculating the difference between the intersection ratio and the relative position parameter to obtain the intersection ratio correction value.
In some embodiments, screening the optimal prediction frame of the object to be detected with the cross-over ratio correction value as a standard includes:
and screening out the optimal prediction frame by non-maximum suppression iteration, using the intersection ratio correction value as the criterion.
In another aspect, the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the steps of the method are implemented.
In another aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above method.
The invention has the beneficial effects that:
according to the positioning quality evaluation method of the target detection model, the intersection ratio is corrected by adopting the center distance between the candidate frame and the real frame and the diagonal length of the minimum covering frame of the candidate frame and the real frame, so that the distance relation between comparison objects can be further reflected on the basis of reflecting the intersection area of the comparison objects, the intersection state can be more accurately reflected, and the positioning accuracy is improved. According to the target positioning method, the intersection ratio correction value is used as a parameter to replace the intersection ratio in the prior art, and the sensitivity can be improved when the optimal prediction frame is screened through maximum value inhibition.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic diagram illustrating the positions of the centers of a candidate frame and a real frame according to an example of the present invention;
FIG. 2 is a schematic view of a position of the candidate frame in FIG. 1 under a state of center shift;
FIG. 3 is a diagram illustrating the alignment positions of the lower right corner of two frames in the inclusion state according to an example of the present invention;
FIG. 4 is a schematic view of the right long sides of two frames connected and aligned with each other according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating the alignment of the centers of two frames in the inclusion state according to an exemplary embodiment of the present invention;
FIG. 6 is a diagram illustrating the alignment positions of the lower right corner of two frames in the inclusion state according to another example of the present invention;
FIG. 7 is a schematic diagram illustrating the overlapping positions of the long sides of two frames in a partially overlapped state according to an exemplary embodiment of the present invention;
FIG. 8 is a schematic diagram illustrating the overlapping positions of two frames along the diagonal in a partially overlapped state according to an exemplary embodiment of the present invention;
FIG. 9 is a schematic diagram illustrating the position of the two frames in a diagonal line alignment when the two frames are not aligned according to an example of the present invention;
FIG. 10 is a schematic view of the positions of the bottom of two frames aligned and the long sides connected in a misaligned state according to an example of the present invention;
FIG. 11 is a schematic view of the right long sides of two frames connected together at an axisymmetric position in a misaligned state according to an example of the present invention;
FIG. 12 is a schematic view of the right broadsides of two frames being connected and in an axisymmetric position in a misaligned state according to an example of the present invention;
fig. 13 is a schematic flow chart illustrating a positioning quality evaluation method of a target detection model according to an embodiment of the present invention;
fig. 14 is a flowchart illustrating a target positioning method according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
In the prior art, the evaluation of target positioning quality in the target detection field mainly calculates the intersection ratio between a prediction frame generated by the network model and the real frame corresponding to the ground truth, and the size of the intersection ratio reflects the positioning quality of the prediction frame: the closer the intersection ratio is to 1, the better the positioning quality of the prediction frame.
The area intersection ratio IoU is the default evaluation metric currently used for object localization in the object detection field, and it is used to identify true and false positives in a set of predictions. When using the intersection ratio IoU as an evaluation index, an exact metric threshold must be selected. For example, in the PASCAL VOC challenge, the well-known detection accuracy measurement (i.e., mean average precision, mAP) is calculated based on a fixed IoU threshold of 0.5. However, an arbitrarily selected IoU threshold does not fully reflect the positioning performance of different algorithms, since any positioning accuracy above the threshold is treated equally. Many algorithms whose positioning quality differs visually can therefore achieve comparably high mAP on the VOC data set. Although the VOC data set later raised the IoU threshold to 0.75, the essential problem remains unsolved.
To make the target detection performance measure less sensitive to the choice of IoU threshold, the MS COCO benchmark evaluation averages mAP over multiple IoU thresholds: one threshold every 0.05 between 0.5 and 0.95, one mAP computed per threshold, and the 10 values finally averaged into the final AP. Under this evaluation the AP of existing first-tier algorithms falls below 0.5, which makes it appear that there is still much room for improving accurate target positioning; in fact, the development of target detection algorithms has reached a bottleneck because large-scale data sets contain a large number of wrong labels and a large number of irresolvably ambiguous labels.
The prior art also includes GIoU (Generalized Intersection over Union), which extends and generalizes the concept of IoU to the non-overlapping case by introducing the minimum covering rectangle, thereby addressing the weakness that IoU cannot account for the area of the non-overlapping part. Its calculation formula is GIoU = IoU - (C - U)/C, where IoU is the intersection ratio, C is the area of the smallest covering rectangle, and U is the area of the union of the two boxes. The intersection ratio IoU cannot reflect how two objects overlap and cannot distinguish distances when two object frames do not intersect (its value ranges from 0 to 1). GIoU does attend to the area of the non-overlapping part that IoU ignores, but its evaluation of how well a frame regresses is still imperfect: it considers only the area attribute, whereas frame regression has four dimensions, (x, y, w, h) or (x1, y1, x2, y2). Clearly, representing only the area attribute is a proxy solution that loses part of the positioning information, and on large-scale data sets the accuracy ceiling is still reached quickly.
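For concreteness, the GIoU computation just described can be sketched as follows for axis-aligned boxes given as corner tuples (x1, y1, x2, y2). This is a minimal illustrative sketch, not code from any cited work; the helper names are our own.

    def box_area(b):
        # area of an axis-aligned box (x1, y1, x2, y2)
        return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

    def iou_and_union(a, b):
        # intersection rectangle of the two boxes (empty if they do not overlap)
        iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = iw * ih
        union = box_area(a) + box_area(b) - inter
        return inter / union, union

    def giou(a, b):
        # GIoU = IoU - (C - U)/C, with C the area of the minimum covering rectangle
        iou, union = iou_and_union(a, b)
        c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
        return iou - (c_area - union) / c_area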
Therefore, on the basis of the prior art, as shown in fig. 13, the present invention provides a method for evaluating the positioning quality of an object detection model, including:
step 101: positioning a target to be detected by using a target detection model to obtain a corresponding prediction frame;
step 102: calculating the intersection ratio of the prediction frame and the corresponding real frame, the center distance and the diagonal length of the minimum covering frame;
step 103: calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame;
step 104: correcting the corresponding cross-over ratio by using the relative position parameter to obtain a cross-over ratio correction value;
step 105: and evaluating the positioning quality of the target detection model according to the intersection ratio correction value.
In step 101, the target detection model may be a program module or functional module obtained by any of various existing target detection methods. The target to be detected may be specific content to be detected in the image (for example, a human face or an animal face); an image may contain only that specific content, or it may contain the specific content together with other non-target content. The target detection model is used to detect the position of the target to be detected in the image, and the position detected by the model can be represented as a prediction frame. Since the positioning accuracy of different target detection models may differ, the subsequent steps can be used to evaluate the positioning performance of a target detection model.
In step 102, the intersection ratio can be calculated by the existing method, i.e., IoU. The center distance refers to the distance between the geometric centers of the two frames; for example, when both the prediction frame and the real frame are rectangular, the center distance is the distance between the diagonal intersection point of the rectangular prediction frame and that of the rectangular real frame. The minimum covering frame is the smallest rectangular frame covering both the prediction frame and the real frame; in this case, the diagonal length can be the length of either diagonal.
In this embodiment, beyond calculating the intersection ratio, the intersection ratio is further corrected using the center distance between the prediction frame and the corresponding real frame and the diagonal length of the minimum covering frame. The corrected value thus reflects the relative relationship between the center positions of the prediction frame and the real frame in addition to the overlap of their areas, which improves the sensitivity of threshold evaluation.
In some embodiments, the step 103 of calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame may further include the steps of:
1031, calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain a relative position parameter.
In the embodiment, in order to further reflect the relative position relationship between the prediction frame and the real frame in the cross-over ratio, the sensitivity of threshold evaluation is improved; the center distance represents the actual position difference, the minimum cover frame diagonal length represents the farthest distance in the range of the prediction frame and the real frame, and the quotient of the center distance and the minimum cover frame diagonal length is used as a relative position parameter, so that the relative position relation between the prediction frame and the real frame can be more effectively reflected.
Further, the larger the relative position parameter, the farther apart the prediction frame and the real frame are and the worse the coincidence degree; the smaller the relative position parameter, the closer the prediction frame and the real frame are and the better the coincidence degree. Thus, the value of the relative position parameter is inversely related to the degree of coincidence.
In other embodiments, the relative position parameter may be calculated in other ways, for example, a factor may be added to the quotient of the center distance and the diagonal length of the minimum coverage box; alternatively, the center distance and the minimum footprint diagonal length may be subtracted and the resulting difference divided by the center distance or divided by the minimum footprint diagonal length.
In some embodiments, the step 104 of correcting the corresponding cross-over ratio by using the relative position parameter to obtain a cross-over ratio correction value may include the steps of:
1041, calculating the difference between the cross-over ratio and the relative position parameter to obtain a cross-over ratio correction value. The correction value may be an absolute value of a difference between the cross-over ratio and the relative position parameter.
In the present embodiment, the intersection ratio reflects the degree of area overlap between the prediction frame and the real frame but cannot further reflect their relative positional relationship. Since the intersection ratio reflects coincidence through area overlap, a larger intersection ratio indicates a larger overlapping part and better coincidence, and a smaller intersection ratio indicates a smaller overlapping part and worse coincidence; the value of the intersection ratio is therefore positively correlated with the degree of coincidence.
On this basis, the difference between the intersection ratio and the relative position parameter is calculated as the intersection ratio correction value, which unifies the correlation between the intersection ratio and the degree of coincidence, so that the coincidence of the prediction frame and the real frame, and hence the positioning quality of the prediction frame, can be evaluated accurately. Specifically, the positioning quality can be judged according to whether the intersection ratio correction value falls within a set threshold range (determined empirically or from the positioning accuracy requirement), or the intersection ratio correction values of different target detection models can be compared to judge their relative quality.
In this embodiment, the intersection ratio correction value may be directly used to evaluate the positioning quality of the prediction frame, that is, the larger the intersection ratio correction value is, the better the positioning quality of the prediction frame is.
For example, the calculation method of the cross-over ratio correction value may include:
S1, calculating the center distance d between the prediction frame and the real frame;
S2, generating the minimum covering frame of the prediction frame and the real frame, and obtaining the diagonal length c of the minimum covering frame;
and S3, taking the quotient d/c of the center distance d and the diagonal length c as the relative position parameter, and subtracting the relative position parameter from the intersection ratio to obtain the intersection ratio correction value CIoU; specifically, CIoU = IoU - d/c.
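Steps S1 to S3 map directly onto code. A minimal sketch, reusing the iou_and_union helper from the GIoU sketch above and the same illustrative corner-tuple box convention:

    import math

    def ciou_metric(a, b):
        # S1: distance d between the geometric centers of the two frames
        d = math.hypot((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2,
                       (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2)
        # S2: diagonal length c of the minimum covering frame of the two frames
        c = math.hypot(max(a[2], b[2]) - min(a[0], b[0]),
                       max(a[3], b[3]) - min(a[1], b[1]))
        # S3: relative position parameter d/c subtracted from the intersection ratio
        iou, _ = iou_and_union(a, b)
        return iou - d / c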
Illustratively, as shown in FIG. 1, when the prediction frame and the real frame are aligned at the center, let the real frame have length w1 and width h1 and the prediction frame have length w2 and width h2, with w1 > w2 and h1 < h2 in the figure. By the definition of the intersection ratio IoU, the IoU of the two boxes is (w2 x h1)/(w1 x h1 + w2 x h2 - w2 x h1). By the calculation definition of GIoU, the GIoU of the two boxes is (w2 x h1)/(w1 x h1 + w2 x h2 - w2 x h1) - (w1 x h2 - w1 x h1 - w2 x h2 + w2 x h1)/(w1 x h2).
Similarly, in FIG. 2 the prediction box is shifted away from the center position of the real box on the basis of FIG. 1, yet its IoU value and GIoU value are exactly the same as in FIG. 1. The coincidence of the two boxes in FIG. 1 and FIG. 2 is clearly not the same, so using only IoU or GIoU as the measure of frame coincidence is plainly deficient. Consider instead the intersection ratio correction value CIoU: in FIG. 1 the center points of the two frames coincide and their distance is 0, so the term d/c in CIoU is 0 and CIoU = IoU; in FIG. 2 the distance between the center points of the two frames is (w1 - w2)/2 and the diagonal length of the minimum covering rectangle is sqrt(w1^2 + h2^2), so CIoU = IoU - ((w1 - w2)/2)/sqrt(w1^2 + h2^2). This value is obviously smaller than in the center-aligned case of FIG. 1, so the registration can be recognized as worse than in FIG. 1, which is consistent with the actual situation.
Further, to illustrate that the cases of fig. 1 and fig. 2 are not special cases, the discussion continues here for cases where the two frames perfectly coincide, contain, intersect, do not intersect, infinity, and so on.
As shown in fig. 3, 4 and 5, when two frames are in an inclusive relationship, it is assumed that one of the large frames is 100 long and 60 wide; the other small box has a length of 50 and a width of 30.
In fig. 3, when the small frame is included in the lower right corner of the large frame, IoU is 0.25, GIoU is 0.25, and CIoU is 0.
In fig. 4, when the small frames are connected at one long side of the large frame and are axisymmetrical, IoU is 0.25, GIoU is 0.25, and CIoU is 0.1214.
In fig. 5, when the small frame and the large frame are aligned at the center, IoU is 0.25, GIoU is 0.25, and CIoU is 0.25.
Comparing the parameters of fig. 3, fig. 4 and fig. 5, it can be seen that, relative to IoU and GIoU, CIoU further reflects the crossing relationship between boxes, in particular the degree of center alignment of the two boxes: the higher the CIoU value, the higher the degree of center alignment of the two boxes.
As also shown in FIG. 6, assume one large box of length 100 and width 60 and another small box of length 25 and width 15; when the small box is contained in the lower right corner of the large box, IoU is 0.0625, GIoU is 0.0625, and CIoU is -0.3125.
As shown in fig. 7 and 8, the two frames are 100 long and 60 wide, and partially overlap each other.
In fig. 7, when the overlapped portion of the two frames is 15 wide, IoU is 0.1429, GIoU is 0.1429, and CIoU is -0.1675.
In fig. 8, when the two frames overlap by 1/4 of their area along the diagonal, IoU is 0.1429, GIoU is -0.0794, and CIoU is -0.1905.
As shown in fig. 9, 10, 11 and 12, it is assumed that the large frame has a length of 100 and a width of 60; the small frame is 50 in length and 30 in width; the two frames do not coincide.
In fig. 9, when the large frame and the small frame are connected diagonally, IoU is 0, GIoU is -0.4444, and CIoU is -0.5.
In fig. 10, when the large frame and the small frame are aligned at the bottom and connected at the long side, IoU is 0, GIoU is -0.3333, and CIoU is -0.3826.
In fig. 11, when the long sides of the large frame and the small frame are connected and are axisymmetrical, IoU is 0, GIoU is -0.3333, and CIoU is -0.3345.
In fig. 12, when the large frame and the small frame are connected at one broadside and are axisymmetrical, IoU is 0, GIoU is -0.3333, and CIoU is -0.4642.
When the two frames completely coincide, IoU = GIoU = CIoU = 1; when the two frames are infinitely far apart, IoU = 0 and GIoU = CIoU = -1.
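Under the assumed coordinate convention of the sketches above, the FIG. 3 configuration (a 100 x 60 frame containing a 50 x 30 frame in its lower right corner) reproduces the values reported above:

    big = (0.0, 0.0, 100.0, 60.0)
    small = (50.0, 30.0, 100.0, 60.0)      # lower right corner of the big frame
    iou, _ = iou_and_union(big, small)
    print(iou)                             # 0.25
    print(giou(big, small))                # 0.25 (covering rectangle equals the big frame)
    print(ciou_metric(big, small))         # 0.0  (d/c = sqrt(850)/sqrt(13600) = 0.25)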
In summary, CIoU inherits the scale invariance of GIoU; like GIoU, CIoU can be regarded as a lower bound of IoU and is always less than or equal to IoU. When computing the loss, the regression loss 1 - CIoU can be used, and a gradient is still back-propagated even when the two compared frames do not intersect. CIoU has the same value range as GIoU but pays more attention to hard regression boxes, which borrows the idea of the Focal Loss used in classification.
In other embodiments, for the evaluation of the prediction frame after the detection and positioning are completed, the intersection ratio correction value may be calculated according to the real frame, and the prediction frame whose intersection ratio correction value is greater than the first setting value is determined as the positioning result is correct. For example, the first set value is 0.35.
In other embodiments, the intersection-to-parallel ratio and the intersection-to-parallel ratio correction value may be used as the determination criterion at the same time, that is, when the intersection-to-parallel ratio correction value of the prediction frame and the real frame is greater than the first setting value and the intersection-to-parallel ratio is greater than the second setting value, the prediction frame is determined to be correct in positioning result. For example, the first set value is 0.35, and the second set value is 0.5.
In the present embodiment, since the factors of the intersection ratio correction value include both the intersection ratio and the relative position parameter, different intersection ratios and relative position parameters may in practice yield the same correction value, which prevents effective distinction; for example, in one case CIoU = IoU - d/c = 0.5 - 0.05 = 0.45, while in another CIoU = IoU - d/c = 0.55 - 0.1 = 0.45. Therefore, further constraining the intersection ratio correction value on top of a constraint on the intersection ratio itself improves the sensitivity of threshold evaluation.
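A minimal sketch of this combined decision rule, reusing the earlier helpers; the defaults 0.35 and 0.5 are the example first and second set values mentioned above, not prescribed constants:

    def positioning_is_correct(pred, gt, first_set_value=0.35, second_set_value=0.5):
        # correct only if both the correction value and the plain IoU clear their thresholds
        iou, _ = iou_and_union(pred, gt)
        return ciou_metric(pred, gt) > first_set_value and iou > second_set_value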
Further, for the target detection positioning result, the positioning accuracy of a specified class of objects in a single image, the average precision of the specified class over multiple images, and the mean average precision over multiple classes and multiple images may be calculated to reflect the positioning quality of the target detection model. The method includes: for the detection result of objects of a specified category in a single target detection image, counting the number of correct prediction frames, and calculating the ratio of that number to the total number of objects of the specified category to obtain the accuracy rate of the positioning result for that category.
And respectively calculating the accuracy rate of the object of the specified category in each target detection image and calculating the average value based on the plurality of target detection images to obtain the average accuracy of the positioning of the object of the specified category.
And respectively calculating corresponding average precision and averaging aiming at a plurality of objects of different types on the basis of a plurality of target detection images to obtain an average precision average value.
The higher the values of the accuracy rate, the average accuracy and the average accuracy mean value are, the better the positioning quality of the target detection model is.
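The three levels of aggregation described above can be sketched as follows; how a prediction frame is judged correct (e.g., via the combined threshold rule sketched earlier) is assumed to have been decided per image, so the inputs here are simple counts, and all function names are illustrative:

    def accuracy_rate(num_correct_frames, num_objects):
        # accuracy of one specified category in a single target detection image
        return num_correct_frames / num_objects

    def average_precision(per_image_counts):
        # per_image_counts: list of (num_correct_frames, num_objects) for one
        # category over several target detection images
        rates = [accuracy_rate(c, n) for c, n in per_image_counts if n > 0]
        return sum(rates) / len(rates)

    def mean_average_precision(per_category_counts):
        # per_category_counts: {category: [(num_correct_frames, num_objects), ...]}
        aps = [average_precision(counts) for counts in per_category_counts.values()]
        return sum(aps) / len(aps)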
On the other hand, based on the same inventive concept as the positioning quality evaluation method of the target detection model shown in fig. 13, the embodiment of the present invention further provides a target positioning method, and repeated details are not repeated. As shown in fig. 14, the target location method of some embodiments may include:
step 201: respectively obtaining a plurality of prediction boxes and corresponding classification probability values of a plurality of targets to be detected through target detection positioning operation;
step 202: calculating the intersection ratio and the center distance between the prediction frames and the diagonal length of the minimum covering frame;
step 203: calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame;
step 204: correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value;
step 205: and screening the optimal prediction frame of each target to be detected by taking the cross-over ratio correction value as a standard.
In this embodiment, the target detection positioning operation may adopt a target detection model formed by training a deep learning neural network, and a plurality of prediction frames with a probability reaching a specified threshold may be output for a plurality of targets to be detected through the operation of the target detection model.
In practical applications, multiple targets to be detected of the same type may be sparsely or densely distributed. Specifically, when the target detection model outputs multiple prediction frames for multiple targets to be detected in a dense scene, the prediction frames of different targets overlap one another. In this case, selecting the optimal frame using IoU alone as the screening criterion reduces the effectiveness of the judgment: different detection targets cannot be distinguished efficiently, and the output either loses sensitivity or merges excessively. To effectively improve the ability to output the optimal prediction frame for each detection target in a dense scene, this embodiment adopts as the comparison criterion the intersection ratio correction value, corrected by the center distance between two frames and the diagonal length of the minimum covering frame, which improves sensitivity.
In some embodiments, the step 203 of calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum coverage frame may further include the steps of: 2031, calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain the relative position parameter.
In some embodiments, the step 204 of correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value may further include the steps of: 2041, calculating the difference between the cross-over ratio and the relative position parameter to obtain the cross-over ratio correction value.
Specifically, the intersection ratio correction value between prediction frames is calculated; when the correction value between two prediction frames is larger than a first set threshold, the prediction frame with the higher classification probability value is kept and the one with the lower value is discarded, and when the correction value is smaller than the first set threshold, both prediction frames are kept. The calculation repeats according to these steps, and finally the corresponding optimal prediction frame is output for each detection target. The first set threshold is determined by the actual application scenario and may, for example, be 0.5.
In some embodiments, step 205, namely, screening the optimal prediction frame of the object to be detected by using the cross-over ratio correction value as a criterion, may further include the steps of: 2051, iteratively screening the optimal prediction frame by using Non-maximum suppression (Nms) with the cross-over ratio correction value as a standard.
In this embodiment, in order to reduce the calculation amount, the optimal prediction box may be screened out by non-maximum inhibition Nms iteration; compared with the traditional method for eliminating the redundant frames by adopting the cross-over ratio IoU as the standard, the method for eliminating the redundant frames by adopting the cross-over ratio correction value CIoU as the evaluation standard can improve the detection positioning sensitivity and accuracy in the scene of dense distribution of a plurality of detection targets.
Illustratively, a plurality of prediction frames obtained by the target detection positioning operation on an image are arranged by their classification probability values from small to large, for example: A, B, C, D, E, F, G.
Starting from prediction frame G, which has the largest classification probability value, it is judged for each of prediction frames A, B, C, D, E and F whether its intersection ratio correction value CIoU with prediction frame G is larger than a second set threshold; prediction frames above the threshold are discarded and those below it are kept. Suppose the calculation shows that the CIoU of prediction frames B and D with prediction frame G exceeds the second set threshold: then B and D are discarded and G is kept.
Further, the prediction frame E with the highest classification probability value among the remaining prediction frames A, C, E and F is selected, and it is judged for each of A, C and F whether its CIoU with prediction frame E is larger than the second set threshold, discarding frames above the threshold and keeping those below it. Suppose the CIoU of prediction frame A with prediction frame E exceeds the threshold: then A is discarded and E is kept. The iteration repeats according to the above steps, and finally all the retained prediction frames, G, E and C, are found.
The prediction boxes G, E, C are the optimal prediction boxes for the three detection targets in the image, respectively.
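The iteration just walked through can be sketched as a CIoU-based non-maximum suppression routine; ciou_metric is the illustrative sketch from earlier, and the second set threshold default of 0.5 is an assumption for the example:

    def nms_ciou(boxes, scores, second_set_threshold=0.5):
        # sort prediction frame indices by classification probability, highest first
        order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
        keep = []
        while order:
            best = order.pop(0)              # highest-scoring remaining frame
            keep.append(best)
            # discard frames whose CIoU with the kept frame exceeds the threshold
            order = [i for i in order
                     if ciou_metric(boxes[best], boxes[i]) <= second_set_threshold]
        return keep                          # indices of the optimal prediction frames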
In other embodiments, the deep neural network may be used to train the target detection model to locate the target to be detected. The process of training the target detection model by adopting the deep neural network can comprise the following steps:
S301, generating a plurality of candidate frames by a sliding window method or a selective search method;
S302, calculating the intersection ratio of each candidate frame and the corresponding real frame, together with the center distance and the diagonal length of the minimum covering frame;
S303, calculating the quotient of the center distance and the corresponding diagonal length of the minimum covering frame as the relative position parameter, and subtracting the relative position parameter from the intersection ratio to obtain the intersection ratio correction value;
and S304, classifying candidate frames whose intersection ratio correction value is larger than a third threshold as object frames for input to the model for training, and classifying candidate frames whose intersection ratio correction value is smaller than the third threshold as background frames.
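A sketch of the S301 to S304 labelling rule used to build the training set, again reusing the earlier ciou_metric sketch; the third threshold default is a placeholder assumption, not a value given by this disclosure:

    def label_candidate_frames(candidates, real_frame, third_threshold=0.5):
        # split candidate frames into object frames (positive samples) and
        # background frames (negative samples) by their intersection ratio
        # correction value against the real frame
        object_frames, background_frames = [], []
        for cand in candidates:
            if ciou_metric(cand, real_frame) > third_threshold:
                object_frames.append(cand)
            else:
                background_frames.append(cand)
        return object_frames, background_frames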
In another aspect, the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the steps of the method are implemented.
In another aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above method.
In the description herein, reference to the description of the terms "one embodiment," "a particular embodiment," "some embodiments," "for example," "an example," "a particular example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. The sequence of steps involved in the various embodiments is provided to schematically illustrate the practice of the invention, and the sequence of steps is not limited and can be suitably adjusted as desired.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for evaluating the positioning quality of a target detection model is characterized by comprising the following steps:
positioning a target to be detected by using a target detection model to obtain a corresponding prediction frame;
calculating the intersection ratio, the center distance and the diagonal length of the minimum covering frame of the prediction frame and the corresponding real frame;
calculating relative position parameters of the center distance and the corresponding diagonal length of the minimum coverage frame;
correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value;
and evaluating the positioning quality of the target detection model according to the intersection ratio correction value.
2. The method for evaluating the positioning quality of the object detection model according to claim 1, wherein calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum coverage frame comprises:
and calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain the relative position parameter.
3. The method for evaluating the positioning quality of the target detection model according to claim 2, wherein the step of correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value comprises:
and calculating the difference between the intersection ratio and the relative position parameter to obtain the intersection ratio correction value.
4. The method according to claim 3, wherein evaluating the positioning quality of the target detection model based on the cross-over ratio correction value includes:
and judging the positioning result of a prediction frame whose intersection ratio correction value is larger than a first set value to be correct.
5. A method of locating an object, comprising:
respectively obtaining a plurality of prediction boxes and corresponding classification probability values of a plurality of targets to be detected through target detection positioning operation;
calculating the intersection ratio, the center distance and the diagonal length of the minimum covering frame among the prediction frames;
calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum covering frame;
correcting the corresponding intersection ratio by using the relative position parameter to obtain an intersection ratio correction value;
and screening the optimal prediction frame of each target to be detected by taking the intersection ratio correction value as a standard.
6. The method of claim 5, wherein calculating the relative position parameters of the center distance and the corresponding diagonal length of the minimum coverage frame comprises:
and calculating the quotient of the center distance and the diagonal length of the minimum covering frame to obtain the relative position parameter.
7. The method of claim 6, wherein the correcting the corresponding cross-over ratio by using the relative position parameter to obtain a cross-over ratio correction value comprises:
and calculating the difference between the intersection ratio and the relative position parameter to obtain the intersection ratio correction value.
8. The method for positioning the target according to claim 7, wherein the step of screening the optimal prediction frame of the target to be detected by using the cross-over ratio correction value as a standard comprises the following steps:
and screening out the optimal prediction frame by non-maximum suppression iteration, using the intersection ratio correction value as the criterion.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 8 are implemented when the program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
CN201910794302.3A 2019-08-27 2019-08-27 Positioning quality evaluation method, positioning method and device of target detection model Active CN110503095B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910794302.3A CN110503095B (en) 2019-08-27 2019-08-27 Positioning quality evaluation method, positioning method and device of target detection model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910794302.3A CN110503095B (en) 2019-08-27 2019-08-27 Positioning quality evaluation method, positioning method and device of target detection model

Publications (2)

Publication Number Publication Date
CN110503095A CN110503095A (en) 2019-11-26
CN110503095B true CN110503095B (en) 2022-06-03

Family

ID=68589590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910794302.3A Active CN110503095B (en) 2019-08-27 2019-08-27 Positioning quality evaluation method, positioning method and device of target detection model

Country Status (1)

Country Link
CN (1) CN110503095B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738072A (en) * 2020-05-15 2020-10-02 北京百度网讯科技有限公司 Training method and device of target detection model and electronic equipment
CN111797993B (en) * 2020-06-16 2024-02-27 东软睿驰汽车技术(沈阳)有限公司 Evaluation method and device of deep learning model, electronic equipment and storage medium
CN111814850A (en) * 2020-06-22 2020-10-23 浙江大华技术股份有限公司 Defect detection model training method, defect detection method and related device
CN112002131A (en) * 2020-07-16 2020-11-27 深圳云游四海信息科技有限公司 In-road parking behavior detection method and device
CN112001247A (en) * 2020-07-17 2020-11-27 浙江大华技术股份有限公司 Multi-target detection method, equipment and storage device
CN112001453B (en) * 2020-08-31 2024-03-08 北京易华录信息技术股份有限公司 Method and device for calculating accuracy of video event detection algorithm
CN112200217B (en) * 2020-09-09 2023-06-09 天津津航技术物理研究所 Identification algorithm evaluation method and system based on infrared image big data
CN112287898B (en) * 2020-11-26 2024-07-05 深源恒际科技有限公司 Method and system for evaluating text detection quality of image
CN113408342B (en) * 2021-05-11 2023-01-03 深圳大学 Target detection method for determining intersection ratio threshold based on features
CN113095301B (en) * 2021-05-21 2021-08-31 南京甄视智能科技有限公司 Road occupation operation monitoring method, system and server
CN115035186B (en) * 2021-12-03 2023-04-11 荣耀终端有限公司 Target object marking method and terminal equipment
CN116309696B (en) * 2022-12-23 2023-12-01 苏州驾驶宝智能科技有限公司 Multi-category multi-target tracking method and device based on improved generalized cross-over ratio

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875425A (en) * 2017-01-22 2017-06-20 北京飞搜科技有限公司 A kind of multi-target tracking system and implementation method based on deep learning
CN107862437A (en) * 2017-10-16 2018-03-30 中国人民公安大学 The public domain crowd massing method for early warning and system assessed based on risk probability
CN108197628A (en) * 2017-12-07 2018-06-22 维森软件技术(上海)有限公司 The joint judgment method of characteristics of image based on deep neural network
CN109145756A (en) * 2018-07-24 2019-01-04 湖南万为智能机器人技术有限公司 Object detection method based on machine vision and deep learning
CN110084829A (en) * 2019-03-12 2019-08-02 上海阅面网络科技有限公司 Method for tracking target, device, electronic equipment and computer readable storage medium
CN110070074A (en) * 2019-05-07 2019-07-30 安徽工业大学 A method of building pedestrian detection model

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression; Hamid Rezatofighi et al.; arXiv; 2019-04-15; pp. 1-9 *
The region of interest localization for glaucoma analysis from retinal fundus image using deep learning; Anirban Mitra et al.; Computer Methods and Programs in Biomedicine; October 2018; pp. 25-35 *
UnitBox: An Advanced Object Detection Network; Jiahui Yu et al.; arXiv; 2016-08-04; pp. 1-5 *
Forward vehicle detection based on ensemble learning and position information constraints; Geng Lei et al.; Computer Engineering and Science; October 2018; Vol. 40, No. 10; pp. 1844-1850 *

Also Published As

Publication number Publication date
CN110503095A (en) 2019-11-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant