WO2021077868A1 - Target detection method and device - Google Patents


Info

Publication number
WO2021077868A1
Authority
WO
WIPO (PCT)
Prior art keywords
bounding box
bounding
target
grid
boxes
Prior art date
Application number
PCT/CN2020/108964
Other languages
French (fr)
Chinese (zh)
Inventor
陈廉政
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021077868A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Definitions

  • This application relates to the field of image processing, and in particular to a target detection method and device.
  • Object detection is an important part of image processing. Its task is to find all the objects of interest in an image and determine their positions and sizes. It is one of the core problems in the field of machine vision.
  • a bounding box (bbox) and a bounding box score are generated on the target image.
  • Non-Maximum Suppression (NMS)
  • each selected high-scoring box in the NMS algorithm needs an IOU calculation with every box whose score is lower than its own.
  • the algorithm complexity is O(N^2), so as the number of bbox boxes increases, the time consumed grows quadratically; the added NMS time lowers the detection frame rate, which seriously affects the detection performance of the algorithm.
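  • To make the quadratic growth concrete, the following minimal Python sketch counts the IOU evaluations in the worst case, where no box suppresses any other, so every higher-scoring box is compared with every lower-scoring one. The function name worst_case_iou_count is a hypothetical helper, not something defined in this application.

```python
def worst_case_iou_count(n):
    """Count IOU evaluations for n score-sorted boxes when nothing is suppressed."""
    count = 0
    for i in range(n):              # each box in turn acts as the high-scoring box
        for _ in range(i + 1, n):   # it is compared with every lower-scoring box
            count += 1
    return count


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, worst_case_iou_count(n))
```

  • The count equals N(N-1)/2, so doubling the number of boxes roughly quadruples the work.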
  • the embodiments of this application provide a target detection method and device.
  • the target image can be meshed to determine the grid to which each bounding box belongs, and then the adjacency relationship between bounding boxes can be determined according to the adjacency relationship between the grids.
  • with this adjacency relationship, only the overlap between the reference bounding box and its adjacent bounding boxes needs to be calculated when performing the NMS algorithm operation, which effectively reduces the time complexity of the NMS and improves the efficiency of target detection.
  • an embodiment of the present application provides a target detection method, the method includes: acquiring multiple bounding boxes on a target image and scores of the multiple bounding boxes, where the scores are used to characterize the confidence that a bounding box contains the target object; dividing the target image to obtain multiple grids, and determining the grid to which each of the multiple bounding boxes belongs; traversing the multiple bounding boxes, calculating the overlap degree between the reference bounding box and the adjacent bounding boxes, and obtaining the target bounding box according to the overlap degree;
  • the reference bounding box is any one of the multiple bounding boxes, and the adjacent bounding boxes include the bounding boxes belonging to the target grid and the bounding boxes belonging to the neighboring grids of the target grid, where the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not include the reference bounding box; the target detection result is determined according to the score of the target bounding box.
  • the target image is divided to obtain multiple grids; the adjacent bounding boxes corresponding to the multiple bounding boxes are determined according to the grid division result; then the multiple bounding boxes are traversed, and the target bounding boxes, that is, the bounding boxes not suppressed by their adjacent bounding boxes, are obtained by calculation; finally the target detection result is determined according to the score of the target bounding box.
  • the method before traversing multiple bounding boxes, the method further includes: sorting the bounding boxes according to the score size of the bounding boxes, and obtaining the sorting number corresponding to the bounding box.
  • the multiple bounding boxes are sorted according to the score size to obtain the sorting numbers corresponding to the bounding boxes, so that the bounding boxes can subsequently be traversed in order of their sorting numbers.
  • sorting the bounding boxes according to the score size of the bounding box specifically includes: sorting the bounding boxes in descending order according to the score size, and the smaller the score is, the larger the sorting number corresponding to the bounding box is.
  • traversing multiple bounding boxes and calculating the overlap between the reference bounding box and the adjacent bounding boxes to obtain the target bounding box includes: obtaining the bounding box with the sorting number i among the multiple bounding boxes as the reference bounding box, and obtaining its identification bit at the same time; when the identification bit of the reference bounding box is the first identification value, obtaining an adjacent bounding box of the reference bounding box, and determining whether the sorting number of the adjacent bounding box is greater than i; when the sorting number of the adjacent bounding box is greater than i, calculating the intersection over union (IOU) of the reference bounding box and the adjacent bounding box; when the IOU is greater than the preset threshold, setting the identification bit of the adjacent bounding box to the second identification value; and acquiring the bounding boxes whose identification bit is the first identification value as the target bounding boxes.
  • the sizes of the multiple bounding boxes are different, and dividing the target image to obtain multiple grids includes: dividing the target image according to the largest size among the sizes of the multiple bounding boxes to obtain multiple grids.
  • the target image is divided into multiple grids according to the largest size among the sizes of the multiple bounding boxes, so that when the adjacent bounding boxes of the reference bounding box are obtained according to the grid to which each bounding box belongs, a grid size that is too small cannot cause the acquired adjacent bounding boxes to miss bounding boxes whose overlap with the reference bounding box is greater than the preset threshold. This improves the accuracy of acquiring the target bounding box, and further improves the accuracy of the target detection result.
  • dividing the target image to obtain multiple grids includes: dividing the target image to obtain multiple grids according to the sizes of the multiple bounding boxes.
  • the same size of the multiple bounding boxes includes: the multiple bounding boxes have the same width and the multiple bounding boxes have the same height.
  • determining the grid to which the multiple bounding boxes belong includes: determining a target coordinate point, where the target coordinate point is any coordinate point on the bounding box or within the bounding box; and determining the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
  • the target coordinate point includes any one of: the upper-right coordinate point of the bounding box, the upper-left coordinate point of the bounding box, the lower-right coordinate point of the bounding box, the lower-left coordinate point of the bounding box, or the center coordinate point of the bounding box.
  • the method further includes: establishing an index queue corresponding to the grid according to the grid to which the bounding box belongs, where the index queue includes the sorting number of at least one bounding box;
  • Obtaining the neighboring bounding boxes of the reference bounding box includes: acquiring the neighboring bounding boxes from the target grid and the neighboring grids of the target grid according to the index queue.
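  • The index queue can be sketched in Python as a mapping from a grid to the sorting numbers of the bounding boxes it contains. This is an illustrative sketch only: the (row, col) grid keys, the eight-neighborhood lookup, and the names build_index_queues and adjacent_boxes are assumptions made for the example.

```python
from collections import defaultdict


def build_index_queues(cells):
    """cells[k] is the (row, col) grid of the box with sorting number k."""
    queues = defaultdict(list)
    for sort_number, cell in enumerate(cells):
        queues[cell].append(sort_number)
    return queues


def adjacent_boxes(queues, sort_number, cell):
    """Sorting numbers found in the target grid and its eight neighbors."""
    row, col = cell
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            for k in queues.get((row + dr, col + dc), ()):
                if k != sort_number:   # the reference box is not its own neighbor
                    found.append(k)
    return found


if __name__ == "__main__":
    cells = [(0, 0), (0, 1), (3, 3), (0, 0)]
    queues = build_index_queues(cells)
    print(sorted(adjacent_boxes(queues, 0, cells[0])))
```

  • Because each queue only holds sorting numbers, the lookup touches at most nine queues per reference bounding box, regardless of how many boxes exist elsewhere in the image.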
  • the method before sorting the bounding boxes according to their corresponding score sizes, the method further includes: determining that the multiple bounding boxes are multiple bounding boxes with a score greater than a preset score.
  • the bounding boxes with a score greater than the preset score are screened out, and then these bounding boxes are sorted and traversed. Because a bounding box with a low score has little or no impact on the target detection result, omitting the sorting and traversal of low-scoring bounding boxes can improve the efficiency of target detection while ensuring the reliability of the target detection results.
  • determining the target detection result according to the score of the target bounding box includes: determining the target detection result according to the target bounding box with a score greater than a preset score.
  • an embodiment of the present application provides a target detection device.
  • the device includes: an acquiring unit configured to acquire multiple bounding boxes on a target image and scores of the multiple bounding boxes, where the scores are used to characterize the confidence that a bounding box contains the target object; a division unit configured to divide the target image to obtain multiple grids and determine the grid to which each of the multiple bounding boxes belongs; a traversal unit configured to traverse the multiple bounding boxes, calculate the overlap degree of the reference bounding box and the adjacent bounding boxes, and obtain the target bounding box according to the overlap degree, where the reference bounding box is any one of the multiple bounding boxes, the adjacent bounding boxes include the bounding boxes belonging to the target grid and the bounding boxes belonging to the neighboring grids of the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not include the reference bounding box; and a determination unit configured to determine the target detection result according to the score of the target bounding box.
  • the device further includes a sorting unit, specifically configured to: sort the bounding boxes according to the score size of the bounding box, and obtain the sorting number corresponding to the bounding box.
  • the sorting unit is specifically used to: sort the bounding boxes in descending order according to the score size, and the smaller the score, the larger the sorting number corresponding to the bounding box.
  • the traversal unit is specifically configured to: obtain the bounding box with the sorting number i among the multiple bounding boxes as the reference bounding box, and at the same time obtain its identification bit; when the identification bit of the reference bounding box is the first identification value, obtain an adjacent bounding box of the reference bounding box, and determine whether the sorting number of the adjacent bounding box is greater than i; when the sorting number of the adjacent bounding box is greater than i, calculate the intersection over union (IOU) of the reference bounding box and the adjacent bounding box; when the IOU is greater than the preset threshold, set the identification bit of the adjacent bounding box to the second identification value; and acquire the bounding boxes whose identification bit is the first identification value as the target bounding boxes.
  • the sizes of the multiple bounding boxes are different, and the dividing unit is specifically used to divide the target image to obtain multiple grids according to the largest size among the sizes of the multiple bounding boxes.
  • the sizes of the multiple bounding boxes are the same, and the dividing unit is specifically used to divide the target image to obtain multiple grids according to the sizes of the multiple bounding boxes.
  • in determining the grid to which the multiple bounding boxes belong, the dividing unit is specifically used to: determine a target coordinate point, where the target coordinate point is any coordinate point on or within the bounding box; and determine the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
  • the target coordinate point includes any one of: the upper-right coordinate point of the bounding box, the upper-left coordinate point of the bounding box, the lower-right coordinate point of the bounding box, the lower-left coordinate point of the bounding box, or the center coordinate point of the bounding box.
  • the dividing unit is further used to: establish an index queue corresponding to the grid according to the grid to which the bounding box belongs, where the index queue includes the sorting number of at least one bounding box;
  • the traversal unit is further used to obtain the adjacent bounding boxes from the target grid and the adjacent grids of the target grid according to the index queue.
  • the sorting unit is further configured to: before sorting the bounding boxes according to their corresponding score sizes, determine that the multiple bounding boxes are multiple bounding boxes with scores greater than a preset score.
  • the determining unit is specifically configured to determine the target detection result according to the target bounding box with a score greater than a preset score.
  • an embodiment of the present application provides a device, which includes:
  • the processor calls the executable program code stored in the memory, so that the device executes any method of the first aspect.
  • the device further includes: the memory, coupled with the processor.
  • the device further includes: an image sensor for acquiring the target image.
  • an embodiment of the present invention provides a computer-readable storage medium.
  • the computer storage medium includes program instructions that, when run on a computer, cause the computer to execute any method described in the first aspect.
  • the embodiments of the present application provide a computer program product containing instructions which, when run on a computer or processor, cause the computer or processor to execute the method in the first aspect or any of its possible implementations.
  • FIG. 1A is a schematic diagram of generating multiple bounding boxes according to an embodiment of this application.
  • FIG. 1B is another schematic diagram of generating multiple bounding boxes provided in an embodiment of this application.
  • FIG. 2 is a schematic flowchart of a target detection method provided by an embodiment of the application
  • FIG. 3 is a schematic diagram of a grid division provided by an embodiment of the application.
  • FIG. 4A is a schematic diagram of a grid division process provided by an embodiment of this application.
  • FIG. 4B is a schematic diagram of determining a grid to which a bounding box belongs according to an embodiment of this application;
  • FIG. 4C is a schematic diagram of a grid neighbor relationship provided by an embodiment of this application.
  • FIG. 5 is a schematic flowchart of a method for traversing multiple bounding boxes and obtaining a target bounding box according to an embodiment of the application;
  • FIG. 6 is a schematic diagram of an index queue in a grid provided by an embodiment of this application.
  • FIG. 7 is a schematic structural diagram of a target detection device provided by an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of an apparatus provided by an embodiment of the present invention.
  • “At least one (item)” refers to one or more, and “multiple” refers to two or more.
  • “And/or” is used to describe the association relationship of associated objects, indicating that there can be three types of relationships; for example, “A and/or B” can mean: only A, only B, or both A and B, where A and B can be singular or plural.
  • the character “/” generally indicates that the associated objects before and after are in an “or” relationship.
  • “At least one of the following items” or similar expressions refers to any combination of these items, including any combination of a single item or a plurality of items.
  • “At least one of a, b, or c” can mean: a, b, c, “a and b”, “a and c”, “b and c”, or “a and b and c”, where a, b, and c can be single or multiple.
  • FIG. 1A is a schematic diagram of generating multiple bounding boxes provided by an embodiment of the present application. As shown in (a) of FIG. 1A, three bbox boxes are generated on a face image, and each bbox box corresponds to a different score.
  • the NMS algorithm is introduced to suppress the low-scoring boxes and select the bbox with the highest score on the target object as the optimal box, as shown in (b) of FIG. 1A. Finally, the position of the target object is determined according to the optimal box, and the confidence that the target object is a face image is 0.98.
  • FIG. 1B is another schematic diagram of generating multiple bounding boxes provided in an embodiment of the application. As shown in (c) in FIG. 1B, multiple bbox boxes are generated on the image; after the low-scoring boxes are suppressed according to the NMS algorithm, 2 optimal boxes are selected. Finally, the positions of the two target objects are determined according to the two optimal boxes, and the confidence that the two target objects are face images is 0.93 and 0.80, respectively.
  • the IOU calculation formula is as follows: IOU = (A1 ∩ A2) / (A1 ∪ A2)
  • A1 and A2 respectively represent the areas of the two boxes involved in the IOU calculation, A1 ∩ A2 is the area of their overlap, and A1 ∪ A2 is the area of their union. If the IOU is greater than the set threshold T, the box is suppressed by the S_bbox and is deleted from the sorting queue.
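  • A minimal Python sketch of the IOU calculation, assuming each box is an axis-aligned tuple (x_min, y_min, x_max, y_max). The tuple layout and the function name iou are illustrative assumptions and are not taken from this application.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0                      # the boxes do not overlap
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


if __name__ == "__main__":
    # Two 2x2 boxes overlapping in a 1x1 square: 1 / (4 + 4 - 1)
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

  • The union is computed as the sum of the two areas minus the intersection, which avoids constructing the union region explicitly.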
  • FIG. 2 is a schematic flowchart of a target detection method provided by an embodiment of the application. As shown in FIG. 2, the method includes the following steps:
  • the target image refers to an image that requires target detection. It can be an image directly input by an input device as the target image, or a part of the image input by the input device that is determined according to the size or range set by the user as the target image.
  • the process is the same as the traditional NMS algorithm. First, multiple bounding boxes generated on the target image and their scores are obtained.
  • the bounding box scores indicate the confidence that the bounding box contains the target object, and can be any value between 0 and 1.
  • the target image is divided to obtain multiple grids, and each grid corresponds to a part of the target image.
  • the grid can be divided randomly, according to the shape and size of the target image, or according to the position or size of the bounding box.
  • Fig. 3 is a schematic diagram of a grid division provided by an embodiment of the application. As shown in Fig. 3(a), multiple grids are divided according to the shape and size of the target image, so that the generated grids exactly cover the complete target image; or, as shown in Fig. 3(b), the grids are divided according to the positions of the bounding boxes in the target image, so that the generated grids completely cover the bounding boxes and positions where no bounding box is generated are not meshed. After the grids are divided, the grid to which each bounding box belongs can be determined according to the positions of the multiple bounding boxes.
  • the grids can be divided according to a non-fixed size, that is, multiple grids divided on the target image have different widths and heights.
  • the grids can also be divided according to a fixed size, that is, multiple grids divided on the target image have the same width and height.
  • the fixed size of the grid can be determined when the target image size and the number of grids are known; alternatively, the number of grids can be determined when the target image size and the fixed size of the grid are known.
  • for the former case, for example, if the known target image size is 3 pixels × 2 pixels and the number of grids is 3 × 2, the fixed size of the grid is 1 pixel × 1 pixel; for the latter case, for example, if the known target image size is 8 pixels × 6 pixels and the grid size is 2 pixels × 2 pixels, the number of grids is ceil(8/2) × ceil(6/2) = 4 × 3.
  • when calculating the number of grids, the ceil function indicates rounding up, that is, a grid is also generated for the remaining part at the edge of the target image that is smaller than one grid size.
  • the floor function can also be used to round down when calculating the number of grids, and the round function can also be used to round to the nearest integer; because a bounding box is unlikely to be generated in an image region smaller than one grid size, and any bounding box generated there tends to have a low score, the overlap calculation for bounding boxes in this part of the target image can be omitted.
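  • The grid count under a fixed grid size can be sketched as follows. The function name grid_count and its parameter names are assumptions made for illustration; only math.ceil and math.floor from the Python standard library are used.

```python
import math


def grid_count(image_w, image_h, cell_w, cell_h, rounding=math.ceil):
    """Return (columns, rows) of a fixed-size grid laid over the image."""
    return rounding(image_w / cell_w), rounding(image_h / cell_h)


if __name__ == "__main__":
    print(grid_count(8, 6, 2, 2))              # image divides evenly
    print(grid_count(9, 6, 2, 2))              # ceil keeps the 1-pixel remainder
    print(grid_count(9, 6, 2, 2, math.floor))  # floor drops the remainder
```

  • With ceil, the leftover strip at the image edge gets its own grid; with floor, boxes falling in that strip would have no grid, which matches the trade-off described above.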
  • the sizes of the multiple bounding boxes are different, and dividing the target image to obtain multiple grids includes: dividing the target image according to the largest size of the multiple bounding boxes to obtain multiple grids.
  • the size of multiple bounding boxes is different, which means that multiple bounding boxes have different widths and heights.
  • when the grid is divided according to a fixed size and the sizes of the multiple bounding boxes are different, the largest size among the sizes of the multiple bounding boxes can be used as the fixed size of the grid. In this way, when the adjacent bounding boxes of the reference bounding box are obtained, all bounding boxes that may overlap the reference bounding box are covered.
  • the size of the multiple bounding boxes is the same, and dividing the target image to obtain multiple grids includes: dividing the target image according to the sizes of the multiple bounding boxes to obtain multiple grids.
  • the size of multiple bounding boxes is the same, which means that multiple bounding boxes have the same width and height.
  • the sizes of multiple bounding boxes can be used as the fixed size of the grid.
  • FIG. 4A is a schematic diagram of a grid division process provided by an embodiment of this application.
  • the height H and width W of the target image are obtained, and then the height Hb and width Wb of the bounding box are used as the fixed size of the grid.
  • each small grid on the right in Figure 4A is the divided grid.
  • determining the grid to which the multiple bounding boxes belong includes: determining a target coordinate point, which is any coordinate point on or within the bounding box; and determining the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
  • the target coordinate points include: the upper-right coordinate point of the bounding box, the upper-left coordinate point of the bounding box, the lower-right coordinate point of the bounding box, the lower-left coordinate point of the bounding box, or the center coordinate point of the bounding box.
  • the grid to which the bounding box belongs is determined according to the positions of the multiple bounding boxes.
  • the bounding box has a certain area, and different positions of the bounding box may be located in different grids.
  • the grid to which the bounding box belongs can be determined by the grid to which any coordinate point on the bounding box or inside the bounding box belongs. Or, determine the grid to which the multiple coordinate points on or in the bounding box belong, and then determine the grid to which the bounding box belongs according to the grid to which the coordinate point with the largest number of the multiple coordinate points belongs.
  • FIG. 4B is a schematic diagram of determining the grid to which the bounding box belongs according to an embodiment of the application. As shown in FIG. 4B, the generated grids are first numbered, and then the grid number to which each bounding box belongs is determined. Taking a bounding box with a score of 0.84 as an example, assuming that the coordinate point of the upper left corner of the bounding box is (Xmin, Ymax), the following formula can be used to determine the grid number to which the bounding box belongs:
  • Iw represents the grid number of the bounding box along the width direction
  • Ih represents the grid number of the bounding box along the height direction; that is, the grid number to which the bounding box belongs is determined to be R_Ih_Iw according to the grid to which the upper-left coordinate point belongs.
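  • The formula itself is not reproduced in this text. As a hedged illustration only, the sketch below assumes the common choice of dividing the target coordinate by the fixed grid size and rounding down, with the coordinate origin at a corner of the image. The function name grid_of_point, the parameters, and the floor-division rule are all assumptions and may differ from the formula in the application.

```python
def grid_of_point(x, y, cell_w, cell_h):
    """Assumed rule: grid indices by floor division of the target coordinate."""
    i_w = int(x // cell_w)   # index along the width direction (Iw)
    i_h = int(y // cell_h)   # index along the height direction (Ih)
    return i_h, i_w


def grid_name(i_h, i_w):
    """Format the grid number in the R_Ih_Iw style used in the text."""
    return f"R_{i_h}_{i_w}"


if __name__ == "__main__":
    i_h, i_w = grid_of_point(5, 3, 2, 2)
    print(grid_name(i_h, i_w))
```

  • Any of the other target coordinate points listed earlier (another corner or the center) could be passed in place of the upper-left corner without changing the rule.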
  • the adjacent bounding box includes a bounding box belonging to a target grid and a bounding box belonging to an adjacent grid of the target grid, the target grid is a grid to which the reference bounding box belongs, and the adjacent The bounding box does not include the reference bounding box.
  • the relationship between the bounding boxes can be determined according to the grid to which the bounding box belongs and the relationship between the grids.
  • the relationship between the bounding boxes mainly refers to the relationship between adjacent or non-adjacent.
  • the neighboring grid of the target grid can be determined according to the eight-neighbor grid of the target grid.
  • the eight-neighborhood grid of the grid refers to all the grids that have edges or vertices coincident with the grid.
  • FIG. 4C is a schematic diagram of a grid neighbor relationship provided by an embodiment of the application. As shown in (a) in FIG. 4C, the grid R_1_1 is used as the target grid, and the filled grids are the eight-neighborhood grids of the target grid, that is, the adjacent grids of the target grid.
  • the neighboring grids of the target grid may be determined according to the four-neighbor grids of the target grid.
  • the four-neighborhood grid of the grid refers to all the grids that coincide with the edges of the grid.
  • the filled grid is the four-neighbor grid of the target grid R_1_1, that is, the adjacent grid of the target grid.
  • the neighboring grids of the target grid can be determined according to the distance from the center of the target grid.
  • a grid whose center is not more than a preset distance from the center of the target grid can be regarded as the adjacent grid of the target grid.
  • the preset distance can be the side length of the target grid, a multiple of the side length of the target grid, or another value.
  • for example, when the preset distance is the side length of the target grid R_1_1, the distance between the center of each of the grids R_0_1, R_1_0, R_1_2, and R_2_1 and the center of the target grid is equal to the preset distance, which satisfies the condition that the distance from the center of the target grid is not greater than the preset distance, so these four grids are adjacent grids of the target grid.
  • the neighboring bounding boxes of the reference bounding box can be determined.
  • the bounding box located on the same grid as the reference bounding box and the bounding box located on the adjacent grid of the reference bounding box are both adjacent bounding boxes of the reference bounding box. Among them, the adjacent bounding box of the reference bounding box does not include itself.
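  • The three ways of choosing adjacent grids (eight-neighborhood, four-neighborhood, and center distance) can be sketched as follows. This is an illustrative sketch under stated assumptions: grids are addressed as (row, col), grid cells are square so that center distances can be measured in units of the side length, and all function and constant names are hypothetical.

```python
import math

EIGHT = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
FOUR = [(-1, 0), (0, -1), (0, 1), (1, 0)]


def neighbor_grids(row, col, rows, cols, offsets):
    """Adjacent grids given by fixed offsets, clipped to the grid bounds."""
    return [(row + dr, col + dc) for dr, dc in offsets
            if 0 <= row + dr < rows and 0 <= col + dc < cols]


def neighbors_by_distance(row, col, rows, cols, max_dist):
    """Grids whose center is within max_dist side lengths of the target center."""
    return [(r, c) for r in range(rows) for c in range(cols)
            if (r, c) != (row, col) and math.hypot(r - row, c - col) <= max_dist]


if __name__ == "__main__":
    print(neighbor_grids(1, 1, 3, 3, EIGHT))
    print(neighbor_grids(1, 1, 3, 3, FOUR))
    print(neighbors_by_distance(1, 1, 3, 3, 1.0))
```

  • With a preset distance of one side length, the distance rule selects the same grids as the four-neighborhood, consistent with the R_1_1 example in the text.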
  • the method before traversing multiple bounding boxes, the method further includes: sorting the bounding boxes according to their scores to obtain the sorting numbers corresponding to the bounding boxes.
  • each bounding box has its corresponding score.
  • the score of the bounding box may be any value between 0 and 1, and indicates the confidence that the bounding box contains the target object to be detected. Because a high-scoring bounding box suppresses low-scoring bounding boxes, the bounding boxes are sorted according to the score size to obtain the sorting number corresponding to each bounding box. When the bounding boxes are then traversed according to the sorting numbers, the respective sorting numbers of the reference bounding box and an adjacent bounding box determine whether the overlap between them needs to be calculated, which can effectively improve the traversal efficiency.
  • the method before sorting the bounding boxes according to their corresponding score sizes, the method further includes: determining that the multiple bounding boxes are multiple bounding boxes with a score greater than a preset score.
  • the bounding boxes and their corresponding scores are generated on the target image. These scores are any value between 0 and 1. However, when the score of a bounding box is less than the preset score, the confidence that the bounding box contains the target object to be detected is very low, so even if such a bounding box is not suppressed, it cannot be used to determine that the bounding box contains the target object. Therefore, the bounding boxes with a score lower than the preset score can be directly filtered out. For example, if the preset score is 0.8, the bounding boxes with scores of 0.61, 0.51, and 0.31 in Fig. 1B are filtered out, and only the bounding boxes with scores of 0.93, 0.84, and 0.80 are retained. The corresponding sorting numbers are shown in Table 2:
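  • The filtering and sorting step can be sketched in Python as follows, using the six scores quoted for Fig. 1B. The function name filter_and_sort is hypothetical, and two choices are assumptions made for the example: sorting numbers start at 0, and a score equal to the preset score is retained, since the example above keeps the 0.80 box with a preset score of 0.8.

```python
def filter_and_sort(scores, preset_score):
    """Return original box indices ordered by descending score.

    The position of an index in the returned list is its sorting number.
    """
    kept = [k for k, s in enumerate(scores) if s >= preset_score]
    kept.sort(key=lambda k: scores[k], reverse=True)
    return kept


if __name__ == "__main__":
    scores = [0.61, 0.93, 0.31, 0.80, 0.51, 0.84]
    order = filter_and_sort(scores, 0.8)
    for sort_number, k in enumerate(order):
        print(sort_number, scores[k])
```

  • Keeping the original indices makes it possible to map the surviving target bounding boxes back to the detector output after traversal.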
  • FIG. 5 is a schematic flowchart of a method for traversing multiple bounding boxes and obtaining a target bounding box according to an embodiment of this application, including the following steps:
  • the multiple bounding boxes are N bounding boxes, and the identification bits of the N bounding boxes are initialized to 0;
  • Increment i by 1, and determine whether i is less than N; when it is determined that i is less than N, perform step 503; when it is determined that i is not less than N, perform step 511;
  • step 506. Determine whether the sequence number of adjacent bounding boxes is greater than i, if yes, go to step 507, if not, go to step 509;
  • N is the total number of multiple bounding boxes obtained from the target image.
  • the identification bit of the bounding box is used to indicate whether the bounding box is suppressed, wherein the first identification value is used to indicate that the bounding box is not suppressed, and the second identification value is used to indicate that the bounding box is suppressed.
  • the identification bits of all bounding boxes are set to the first identification value. In the example shown in FIG. 5, the first identification value is 0, which means that all the bounding boxes are not suppressed. It should be understood that the initial identification bit of the bounding box may also be other values or characters. Further, the reference bounding boxes are obtained in the order of the sorting numbers of the bounding boxes.
  • if the bounding boxes are sorted and numbered from 0, first set i to 0, and then obtain the bounding box with the sorting number 0 as the first reference bounding box. If the bounding boxes are numbered starting from 1, first set i to 1, and then obtain the bounding box with the sorting number 1 as the first reference bounding box. In other words, the initial value of i is determined by the first sorting number of the bounding boxes.
  • the adjacent bounding box of the reference bounding box is obtained, and the overlap degree between the reference bounding box and the adjacent bounding box is calculated, where the overlap degree represents the proportion of the overlapping area between the reference bounding box and the adjacent bounding box.
  • the overlap degree can be the ratio of the overlapping area to the area of the reference bounding box, the ratio of the overlapping area to the area of the adjacent bounding box, or the ratio of the overlapping area to the combined area of the reference bounding box and the adjacent bounding box.
  • the degree of overlap can be expressed by the intersection over union (IOU), that is, calculated with the IOU formula.
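The three overlap measures listed above can be compared side by side. The following is a minimal sketch, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` corner coordinates; the `Box` type and the function names are illustrative and do not come from the application.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box; the (x1, y1, x2, y2) layout is an assumption."""
    x1: float
    y1: float
    x2: float
    y2: float


def area(b: Box) -> float:
    return max(0.0, b.x2 - b.x1) * max(0.0, b.y2 - b.y1)


def intersection(a: Box, b: Box) -> float:
    # Width and height of the overlapping rectangle, clamped at zero.
    w = min(a.x2, b.x2) - max(a.x1, b.x1)
    h = min(a.y2, b.y2) - max(a.y1, b.y1)
    return max(0.0, w) * max(0.0, h)


def overlap_degrees(ref: Box, adj: Box) -> dict:
    """Return the three candidate overlap measures for one pair of boxes."""
    inter = intersection(ref, adj)
    union = area(ref) + area(adj) - inter
    return {
        "over_reference": inter / area(ref) if area(ref) else 0.0,
        "over_adjacent": inter / area(adj) if area(adj) else 0.0,
        "iou": inter / union if union else 0.0,
    }
```

Only the last measure is symmetric in the two boxes, which may be one reason the IOU is the variant used in the traversal steps.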
  • if the value of the IOU is greater than the preset threshold, the adjacent bounding box is suppressed by the reference bounding box, and the identification bit of the adjacent bounding box is set to the second identification value; in the example shown in FIG. 5, the second identification value is 1. The next adjacent bounding box of the reference bounding box is then obtained. When the sorting number of that adjacent bounding box is greater than the sorting number of the reference bounding box and its identification bit is the first identification value, the IOU of the reference bounding box and that adjacent bounding box is calculated in the same way, until all adjacent bounding boxes of the reference bounding box have undergone IOU calculation and the corresponding identification bits have been modified.
  • the next bounding box is selected as the reference bounding box to continue the calculation of the overlap between the reference bounding box and the adjacent bounding box, until the traversal of the bounding box of the Nth sorting number is completed.
  • the traversal of the bounding box with the Nth sorting number can be omitted, so the traversal can also end after the (N-1)th bounding box has been traversed.
  • the bounding boxes whose identification bit is 0 are used as the target bounding boxes; that is, the target bounding boxes are the bounding boxes, among the multiple bounding boxes, that are not suppressed.
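The traversal with identification bits can be sketched as follows. This is an illustrative reading of the steps, not the application's own code: boxes are assumed to be `(x1, y1, x2, y2)` tuples already sorted by descending score so that the list index is the sorting number (starting at 0), and `neighbors_of` is a hypothetical callback that returns the sorting numbers of the adjacent bounding boxes.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    inter = max(0.0, iw) * max(0.0, ih)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_with_flags(boxes, threshold, neighbors_of=None):
    """Return the sorting numbers of boxes whose identification bit stays 0."""
    n = len(boxes)
    flags = [0] * n  # 0: not suppressed (first value), 1: suppressed (second value)
    for i in range(n):
        if flags[i] != 0:
            continue  # a suppressed box is never used as a reference box
        # Without a neighbor lookup, fall back to comparing against every box.
        candidates = range(n) if neighbors_of is None else neighbors_of(i)
        for j in candidates:
            if j <= i or flags[j] != 0:
                continue  # only lower-scored, still-unsuppressed boxes are tested
            if iou(boxes[i], boxes[j]) > threshold:
                flags[j] = 1
    return [k for k in range(n) if flags[k] == 0]
```

Passing a grid-based lookup as `neighbors_of` restricts the inner loop to boxes in the target grid and its adjacent grids; leaving it unset reproduces the conventional all-pairs comparison.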
  • the method further includes: establishing an index queue corresponding to the grid according to the grid to which the bounding box belongs, where the index queue includes the sorting number of at least one bounding box;
  • obtaining the adjacent bounding boxes of the reference bounding box includes: obtaining adjacent bounding boxes from the target grid and the adjacent grids of the target grid according to the index queue.
  • FIG. 6 is a schematic diagram of an index queue in a grid provided by an embodiment of the application. As shown in FIG. 6, the grid to which each bounding box belongs is determined in turn according to the sorting number of the bounding box. After the grid to which a bounding box belongs is determined, the sorting number of the bounding box is added to the index queue corresponding to that grid.
  • the index queues of grids R_1_0, R_1_2 and R_2_2 are generated.
  • the index queue includes the sorting numbers of one or more bounding boxes.
  • the sorting numbers are arranged in order.
  • the index queue can also include the score of the bounding box corresponding to each sorting number. After the index queues of the bounding boxes contained in the grids are generated, when the multiple bounding boxes are traversed and the adjacent bounding boxes of a reference bounding box are obtained, the adjacent bounding boxes can be obtained in order from the index queues of the target grid to which the reference bounding box belongs and of the grids adjacent to the target grid.
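One possible shape for the index queues is a mapping from a grid position to the list of sorting numbers it holds. The sketch below assumes that each grid is identified by a `(row, col)` pair and that the adjacent grids are the eight surrounding cells; both are assumptions, and the function names are hypothetical.

```python
from collections import defaultdict


def build_index_queues(cells):
    """cells[k] is the (row, col) grid of the box with sorting number k."""
    queues = defaultdict(list)
    for number, cell in enumerate(cells):
        # Boxes are visited in sorting-number order, so each queue stays ordered.
        queues[cell].append(number)
    return queues


def adjacent_numbers(queues, cells, i):
    """Sorting numbers found in the target grid and its eight neighbors, excluding i."""
    row, col = cells[i]
    result = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            # .get avoids creating empty queues for grids that hold no box.
            for number in queues.get((row + dr, col + dc), ()):
                if number != i:
                    result.append(number)
    return result
```

A lookup such as `adjacent_numbers(queues, cells, i)` touches at most nine queues per reference box, regardless of how many boxes the image holds in total.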
  • the target bounding box obtained according to the above steps is not suppressed by other bounding boxes, and can be used to determine the target detection result.
  • the bounding box corresponding to 0.93 and 0.80 is the target bounding box
  • the target detection result is determined according to the score of the target bounding box; that is, the score of the target bounding box indicates the confidence that the target object is included within the range enclosed by the bounding box. The greater the confidence, the greater the probability that the target object is included.
  • the target object corresponding to target detection in the embodiment of the application is a face image
  • the target bounding box 1 score is 0.80, which means that the confidence level of the face image included in the target bounding box 1 is 0.80
  • the target detection result obtained may be: the probability that target bounding box 1 includes the face image is 80%. If it is preset that a bounding box is determined to include the target object when its score is greater than 0.7, the obtained target detection result may be: target bounding box 1 includes a face image.
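The two forms of detection result described above (a probability statement, or a yes/no decision against a preset value) can be sketched as a single helper. The function name, parameter names, and the wording of the negative case are illustrative assumptions.

```python
def detection_result(box_name, score, decision_threshold=None, target="face image"):
    """Turn a target bounding box score into a human-readable detection result."""
    if decision_threshold is None:
        # No decision rule preset: report the confidence as a probability.
        return f"The probability that {box_name} includes the {target} is {score:.0%}."
    if score > decision_threshold:
        return f"{box_name} includes a {target}."
    # The source does not say what is reported in this case; this wording is assumed.
    return f"{box_name} is not determined to include a {target}."
```

For example, `detection_result("target bounding box 1", 0.80, decision_threshold=0.7)` yields the yes/no form of the result.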
  • determining the target detection result according to the score of the target bounding box includes: determining the target detection result according to the target bounding box with a score greater than a preset score.
  • if the target bounding boxes obtained after traversing the multiple bounding boxes include target bounding boxes whose score is less than the preset score, then for those target bounding boxes the probability that the bounding box includes the target object is lower than the required probability. These target bounding boxes can be filtered out directly, and the target detection result can be determined only from the target bounding boxes with a score greater than the preset score, which improves the efficiency of generating the target detection result.
  • the target detection result can also be determined according to the target bounding box whose score is equal to the preset score.
  • the target image is divided to obtain multiple grids; the adjacent bounding boxes corresponding to the multiple bounding boxes are determined according to the grid division result; the multiple bounding boxes are then traversed to obtain, by calculation, the target bounding boxes among them that are not suppressed by adjacent bounding boxes; finally, the target detection result is determined according to the score of the target bounding box.
  • the device 700 includes:
  • the acquiring unit 701 is configured to acquire multiple bounding boxes on a target image and scores of the multiple bounding boxes, where the scores are used to represent the confidence that the bounding box contains the target object;
  • the dividing unit 702 is configured to divide the target image to obtain multiple grids, and determine the grid to which the multiple bounding boxes belong;
  • the traversal unit 703 is configured to traverse the multiple bounding boxes, calculate the degree of overlap between the reference bounding box and the adjacent bounding boxes, and obtain the target bounding box according to the degree of overlap;
  • the reference bounding boxes are the multiple bounding boxes
  • the adjacent bounding box includes a bounding box belonging to a target grid and a bounding box belonging to an adjacent grid of the target grid, and the target grid is a grid to which the reference bounding box belongs, The adjacent bounding box does not include the reference bounding box;
  • the determining unit 704 is configured to determine the target detection result according to the score of the target bounding box.
  • the target image is divided to obtain multiple grids; the adjacent bounding boxes corresponding to the multiple bounding boxes are determined according to the grid division result; the multiple bounding boxes are then traversed to obtain, by calculation, the target bounding boxes among them that are not suppressed by adjacent bounding boxes; finally, the target detection result is determined according to the score of the target bounding box.
  • the device further includes a sorting unit 705, specifically configured to:
  • the bounding boxes are sorted in descending order according to the size of the score, and the smaller the score is, the larger the sorting number corresponding to the bounding box is.
  • the traversal unit 703 is specifically configured to:
  • the identification bit of the reference bounding box is the first identification value, acquiring adjacent bounding boxes of the reference bounding box, and determining whether the sequence number of the adjacent bounding box is greater than i;
  • the sizes of the multiple bounding boxes are different, and the dividing unit 702 is specifically configured to:
  • the target image is divided according to the largest size among the sizes of the multiple bounding boxes to obtain multiple grids.
  • the sizes of the multiple bounding boxes are the same, and the dividing unit 702 is specifically configured to:
  • the target image is divided into multiple grids according to the sizes of the multiple bounding boxes.
  • the dividing unit 702 is specifically configured to:
  • the target coordinate point is any coordinate point on the bounding box or within the bounding box
  • the grid to which the target coordinate point belongs is determined as the grid to which the bounding box belongs.
  • the target coordinate point includes:
  • the dividing unit 702 is further configured to:
  • the index queue includes at least one sorting number of the bounding box
  • the traversal unit 703 is further configured to:
  • the sorting unit 705 is further configured to:
  • the multiple bounding boxes are multiple bounding boxes with scores greater than a preset score.
  • the determining unit 704 is specifically configured to:
  • the device 800 includes at least one processor 801, at least one memory 802 and at least one communication interface 803, and also includes an image sensor 804 and a display 805.
  • the processor 801, the memory 802, the communication interface 803, the image sensor 804, and the display 805 are connected through the communication bus and complete mutual communication.
  • the device 800 can be used in smart devices such as smart access control and smart security.
  • the device 800 can collect images through the image sensor 804, or the communication interface 803 can connect to other communication devices or readable memory to obtain image data, and transmit it to the processor 801 for target detection.
  • the detection results will generally be post-processed (for example, result classification, scoring, screening, identification, etc.), and the final result is then output to the display 805 for display or stored in the memory 802.
  • the processor 801 may be a general central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits used to control execution of the above program.
  • CPU central processing unit
  • GPU graphics processing unit
  • ASIC application-specific integrated circuit
  • the communication interface 803 is used for optical fiber communication with other devices or communication networks.
  • the memory 802 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory can exist independently and is connected to the processor through a bus.
  • the memory can also be integrated with the processor.
  • the memory 802 is used to store application program codes and program execution results for executing the above solutions, and the processor 801 controls the execution.
  • the processor 801 is configured to execute application program codes stored in the memory 802.
  • the code stored in the memory 802 can execute the target detection method provided above, for example:
  • the adjacent bounding box includes a bounding box belonging to a target grid and a bounding box belonging to an adjacent grid of the target grid, the target grid is a grid to which the reference bounding box belongs, and the adjacent bounding box Not including the reference bounding box;
  • the target detection result is determined according to the score of the target bounding box.
  • the device 800 in the embodiment of the present application can also be implemented by a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or the like, which is not limited in this embodiment of the present application.
  • CPLD complex programmable logic device
  • FPGA Field-Programmable Gate Array
  • the embodiment of the present application also provides a computer-readable storage medium that stores instructions which, when run on a computer or a processor, cause the computer or the processor to execute one or more steps of any one of the above methods. If the component modules of the above-mentioned signal processing device are implemented in the form of software functional units and sold or used as an independent product, they can be stored in the computer-readable storage medium.
  • the embodiments of the present application also provide a computer program product containing instructions, which when run on a computer or a processor, cause the computer or the processor to execute any of the methods provided in the embodiments of the present application.
  • the technical solution of this application in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that enable a computer device or a processor therein to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the disclosed device may be implemented in other ways.
  • the device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable memory.
  • the technical solution of the present application in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product, and the computer software product is stored in a memory.
  • the software product includes a number of instructions that enable a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A target detection method and device. The method comprises: obtaining a plurality of bounding boxes on a target image and scores of the plurality of bounding boxes, the scores being used for representing the confidence that the bounding box comprises a target object (201); dividing the target image to obtain a plurality of grids, and determining the grids to which the plurality of bounding boxes belong (202); traversing the plurality of bounding boxes, calculating the overlapping degree between a reference bounding box and an adjacent bounding box, and obtaining a target bounding box according to the overlapping degree (203); and determining the target detection result according to the score of the target bounding box (204). According to the method, by performing grid division on a target image, determining grids to which bounding boxes belong, and determining the neighboring relation of the bounding boxes according to the neighboring relation between the grids, only the overlapping degree between the reference bounding box and the adjacent bounding box needs to be calculated when an NMS algorithm operation is performed, effectively reducing the time complexity of NMS and improving the target detection efficiency.

Description

Target detection method and device
This application claims priority to Chinese Patent Application No. 201911026844.2, filed with the China Intellectual Property Office on October 26, 2019 and entitled "Target detection method and device", which is incorporated herein by reference in its entirety.
Technical field
This application relates to the field of image processing, and in particular to a target detection method and device.
Background
With the rapid development of deep learning and computer vision, related technologies have been widely applied in many fields. Object detection is an important part of image processing. Its task is to find all objects of interest in an image and determine their positions and sizes, and it is one of the core problems in the field of machine vision.
When a deep learning method is used for target detection, two steps can be included. First, bounding boxes (bbox) and bounding box scores are generated on the target image, where the higher the score of a bounding box, the greater the probability that the object in the bounding box is the target object. Then, the high-scoring bounding boxes whose scores meet a preset condition are selected, and the objects in those bounding boxes are determined to be the target objects to be recognized.
In this process, to avoid missed detections, the bounding boxes are generated in a dense arrangement, so multiple boxes appear on the same target object. To select the optimal box from the many overlapping boxes, the non-maximum suppression (NMS) algorithm is introduced: low-scoring boxes that overlap a high-scoring bounding box are filtered out through intersection over union (IOU) calculation, yielding the optimal box on the target object.
However, in the NMS algorithm each selected high-scoring box must undergo IOU calculation with all boxes whose scores are lower than its own, so the worst-case complexity of the algorithm is O(N^2). As the number of bounding boxes increases, the time consumed increases as a power of that number; the increased NMS time lowers the detection frame rate and seriously affects the detection performance of the algorithm.
Summary of the invention
Embodiments of this application provide a target detection method and device. With the solution of the embodiments, the target image is divided into grids, the grid to which each bounding box belongs is determined, and the adjacency between bounding boxes is then determined from the adjacency between grids, so that when the NMS algorithm is run, only the overlap degree between a reference bounding box and its adjacent bounding boxes needs to be calculated. This effectively reduces the time complexity of NMS and improves target detection efficiency.
In a first aspect, an embodiment of this application provides a target detection method. The method includes: acquiring multiple bounding boxes on a target image and scores of the multiple bounding boxes, where a score represents the confidence that the bounding box contains the target object; dividing the target image to obtain multiple grids, and determining the grids to which the multiple bounding boxes belong; traversing the multiple bounding boxes, calculating the overlap degree between a reference bounding box and an adjacent bounding box, and obtaining a target bounding box according to the overlap degree, where the reference bounding box is any one of the multiple bounding boxes, the adjacent bounding box includes a bounding box belonging to a target grid and a bounding box belonging to an adjacent grid of the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding box does not include the reference bounding box; and determining the target detection result according to the score of the target bounding box.
In the embodiments of this application, the target image is divided to obtain multiple grids; the adjacent bounding boxes corresponding to the multiple bounding boxes are determined according to the grid division result; the multiple bounding boxes are then traversed to obtain, by calculation, the target bounding boxes among them that are not suppressed by adjacent bounding boxes; finally, the target detection result is determined according to the score of the target bounding box. In this process, determining the grids to which the bounding boxes belong, and thereby the adjacent bounding boxes of each bounding box, means that only the overlap degree between a bounding box and its adjacent bounding boxes needs to be calculated during traversal. This greatly reduces the amount of data processing, improves data processing efficiency, and in turn improves target detection efficiency.
In an optional example, before traversing the multiple bounding boxes, the method further includes: sorting the bounding boxes by score to obtain the sorting number corresponding to each bounding box.
In the embodiments of this application, the multiple bounding boxes are sorted by score to obtain their sorting numbers, so that the subsequent traversal proceeds in order of sorting number. When the overlap degree between the reference bounding box and an adjacent bounding box is calculated, only adjacent bounding boxes whose sorting numbers are greater than (or less than) that of the reference bounding box need to be considered, which reduces the amount of data processing and further improves target detection efficiency.
In an optional example, sorting the bounding boxes by score specifically includes: sorting the bounding boxes in descending order of score, so that the lower the score of a bounding box, the larger its sorting number.
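The descending sort and the resulting sorting numbers can be sketched as follows. The function name and the `first_number` parameter (numbering from 0 or from 1) are illustrative assumptions; ties are broken by original position, which the source does not specify.

```python
def assign_sorting_numbers(scores, first_number=0):
    """Return, for each box, its sorting number under a descending sort by score."""
    # Indices of the boxes ordered from highest to lowest score (stable for ties).
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    numbers = [0] * len(scores)
    for rank, k in enumerate(order):
        numbers[k] = first_number + rank  # lower score -> larger sorting number
    return numbers
```

The returned list is indexed by the original box position, so a box's sorting number can be looked up without reordering the box data itself.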
In an optional example, traversing the multiple bounding boxes, calculating the overlap degree between the reference bounding box and the adjacent bounding box, and obtaining the target bounding box includes: acquiring the bounding box whose sorting number is i among the multiple bounding boxes as the reference bounding box, together with its identification bit; when the identification bit of the reference bounding box is the first identification value, acquiring an adjacent bounding box of the reference bounding box and determining whether the sorting number of the adjacent bounding box is greater than i; when the sorting number of the adjacent bounding box is greater than i, calculating the intersection over union of the reference bounding box and the adjacent bounding box; when the intersection over union is greater than a preset threshold, setting the identification bit of the adjacent bounding box to the second identification value; and acquiring the bounding boxes whose identification bit is the first identification value as the target bounding boxes.
In an optional example, the multiple bounding boxes differ in size, and dividing the target image to obtain multiple grids includes: dividing the target image according to the largest of the sizes of the multiple bounding boxes to obtain multiple grids.
In the embodiments of this application, the target image is divided into multiple grids according to the largest of the bounding box sizes. When the adjacent bounding boxes of a reference bounding box are obtained according to the grids to which the bounding boxes belong, this avoids the case where, because the grid is too small, the obtained adjacent bounding boxes fail to include every bounding box whose overlap with the reference bounding box exceeds the preset threshold. This improves the accuracy of the target bounding boxes obtained and, in turn, the accuracy of the target detection result.
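A minimal sketch of the grid division follows. It assumes that "largest size" means the largest width and the largest height taken separately, that box sizes are given as `(width, height)` pairs, and that partial grids at the image edge are kept; none of these details is stated in the source, and the function name is hypothetical.

```python
import math


def divide_image(image_w, image_h, box_sizes):
    """Return (cell_w, cell_h, rows, cols) for a grid sized to the largest box."""
    cell_w = max(w for w, _ in box_sizes)  # assumed: largest width over all boxes
    cell_h = max(h for _, h in box_sizes)  # assumed: largest height over all boxes
    cols = math.ceil(image_w / cell_w)     # a partial column at the edge still counts
    rows = math.ceil(image_h / cell_h)
    return cell_w, cell_h, rows, cols
```

Under these assumptions, two overlapping boxes have centers that differ by less than one grid width horizontally and one grid height vertically, so if each box is assigned by its center they fall in the same grid or in adjacent grids.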
In an optional example, the multiple bounding boxes are the same size, and dividing the target image to obtain multiple grids includes: dividing the target image according to the size of the multiple bounding boxes to obtain multiple grids.
In an optional example, the multiple bounding boxes being the same size means that the multiple bounding boxes have the same width and the same height.
In an optional example, determining the grids to which the multiple bounding boxes belong includes: determining a target coordinate point, where the target coordinate point is any coordinate point on the bounding box or within the bounding box; and determining the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
In an optional example, the target coordinate point is any one of: the upper-right corner coordinate point of the bounding box, the upper-left corner coordinate point of the bounding box, the lower-right corner coordinate point of the bounding box, the lower-left corner coordinate point of the bounding box, or the center coordinate point of the bounding box.
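Assigning a bounding box to a grid through its target coordinate point can be sketched as below. The sketch assumes image coordinates with the origin at the top-left and y increasing downward, boxes given as `(x1, y1, x2, y2)`, and grids indexed as `(row, col)`; the function and mode names are illustrative.

```python
def target_point(box, mode="center"):
    """Return the chosen target coordinate point of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    points = {
        "top_left": (x1, y1),
        "top_right": (x2, y1),
        "bottom_left": (x1, y2),
        "bottom_right": (x2, y2),
        "center": ((x1 + x2) / 2, (y1 + y2) / 2),
    }
    return points[mode]


def grid_of(box, cell_w, cell_h, mode="center"):
    """Return the (row, col) of the grid that contains the target coordinate point."""
    px, py = target_point(box, mode)
    return int(py // cell_h), int(px // cell_w)
```

Whichever point is chosen, using the same `mode` for every box keeps the grid assignment consistent across the traversal.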
In an optional example, after the grids to which the multiple bounding boxes belong are determined, the method further includes: establishing, according to the grid to which a bounding box belongs, an index queue corresponding to that grid, where the index queue includes the sorting number of at least one bounding box;
obtaining the adjacent bounding boxes of the reference bounding box includes: obtaining the adjacent bounding boxes from the target grid and the adjacent grids of the target grid according to the index queue.
In an optional example, before the bounding boxes are sorted by their scores, the method further includes: determining that the multiple bounding boxes are bounding boxes whose scores are greater than a preset score.
In the embodiments of this application, the bounding boxes with a score greater than the preset score are selected first, and only those bounding boxes are sorted and traversed. Because low-scoring bounding boxes have little or no effect on the target detection result, omitting their sorting and traversal improves target detection efficiency while preserving the reliability of the target detection result.
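The pre-filtering step can be sketched as a one-line selection. The function name and the `include_equal` switch are assumptions; the switch reflects that the text speaks of scores greater than the preset score while its worked example with a preset score of 0.8 retains the box scored 0.80.

```python
def filter_by_preset_score(scores, preset_score, include_equal=True):
    """Return the positions of the boxes that survive the preset-score filter."""
    if include_equal:
        # Keeps a box whose score equals the preset score, as in the 0.8 example.
        return [k for k, s in enumerate(scores) if s >= preset_score]
    return [k for k, s in enumerate(scores) if s > preset_score]
```

Applied to the scores quoted for Fig. 1B, `filter_by_preset_score([0.93, 0.84, 0.80, 0.61, 0.51, 0.31], 0.8)` keeps the first three boxes, so only those are sorted and traversed.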
在一个可选的示例中,根据目标边界框的得分确定目标检测结果,包括:根据得分大于预设得分的目标边界框确定目标检测结果。In an optional example, determining the target detection result according to the score of the target bounding box includes: determining the target detection result according to the target bounding box with a score greater than a preset score.
In a second aspect, an embodiment of this application provides a target detection device. The device includes: an acquiring unit, configured to acquire multiple bounding boxes on a target image and the scores of the multiple bounding boxes, where a score represents the confidence that the bounding box contains a target object; a dividing unit, configured to divide the target image into multiple grids and determine the grids to which the multiple bounding boxes belong; a traversal unit, configured to traverse the multiple bounding boxes, calculate the degree of overlap between a reference bounding box and its adjacent bounding boxes, and obtain target bounding boxes according to the degree of overlap, where the reference bounding box is any one of the multiple bounding boxes, the adjacent bounding boxes include the bounding boxes belonging to a target grid and the bounding boxes belonging to the adjacent grids of the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not include the reference bounding box; and a determining unit, configured to determine a target detection result according to the scores of the target bounding boxes.
In an optional example, the device further includes a sorting unit, specifically configured to: sort the bounding boxes by score to obtain the sort number corresponding to each bounding box.
In an optional example, the sorting unit is specifically configured to: sort the bounding boxes in descending order of score, so that a bounding box with a lower score has a larger sort number.
In an optional example, the traversal unit is specifically configured to: take the bounding box whose sort number is i among the multiple bounding boxes as the reference bounding box and obtain its flag bit; when the flag bit of the reference bounding box is a first flag value, obtain the adjacent bounding boxes of the reference bounding box and determine whether the sort number of each adjacent bounding box is greater than i; when the sort number of an adjacent bounding box is greater than i, calculate the intersection over union (IOU) of the reference bounding box and that adjacent bounding box; when the IOU is greater than a preset threshold, set the flag bit of that adjacent bounding box to a second flag value; and take the bounding boxes whose flag bit is the first flag value as the target bounding boxes.
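The traversal with flag bits described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of this application: the names `iou` and `grid_nms`, the (xmin, ymin, xmax, ymax) box layout, the use of the center point to assign grids, the eight-neighborhood, 0-based sort numbers, and a boolean flag (True for the first flag value, False for the second) are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def grid_nms(boxes, scores, cell_w, cell_h, threshold):
    """Return the original indices of the boxes that keep the first flag value."""
    # Descending sort: position i in `ranked` is the (0-based) sort number.
    order = sorted(range(len(boxes)), key=lambda k: scores[k], reverse=True)
    ranked = [boxes[k] for k in order]

    # One index queue per grid, holding the sort numbers of the boxes in it.
    # Assumption: a box belongs to the grid that contains its center point.
    queues = defaultdict(list)
    cells = []
    for i, (x0, y0, x1, y1) in enumerate(ranked):
        cell = (math.floor((y0 + y1) / 2 / cell_h), math.floor((x0 + x1) / 2 / cell_w))
        cells.append(cell)
        queues[cell].append(i)

    keep = [True] * len(ranked)  # True = first flag value, False = second
    for i, ref in enumerate(ranked):
        if not keep[i]:
            continue  # a suppressed box does not act as a reference box
        ch, cw = cells[i]
        # Visit the target grid and its eight neighboring grids only.
        for dh in (-1, 0, 1):
            for dw in (-1, 0, 1):
                for j in queues.get((ch + dh, cw + dw), ()):
                    # Only boxes ranked after the reference box are compared.
                    if j > i and iou(ref, ranked[j]) > threshold:
                        keep[j] = False
    return [order[i] for i in range(len(ranked)) if keep[i]]
```

Because each reference box is compared only with boxes in its own grid and the neighboring grids, the number of IOU calculations depends on local box density rather than on the total number of boxes.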
In an optional example, the multiple bounding boxes have different sizes, and the dividing unit is specifically configured to: divide the target image into multiple grids according to the largest of the sizes of the multiple bounding boxes.
In an optional example, the multiple bounding boxes have the same size, and the dividing unit is specifically configured to: divide the target image into multiple grids according to the size of the multiple bounding boxes.
In an optional example, in determining the grids to which the multiple bounding boxes belong, the dividing unit is specifically configured to: determine a target coordinate point, where the target coordinate point is any coordinate point on or within the bounding box; and determine the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
In an optional example, the target coordinate point is any one of: the upper-right corner point, the upper-left corner point, the lower-right corner point, the lower-left corner point, or the center point of the bounding box.
In an optional example, after determining the grids to which the multiple bounding boxes belong, the dividing unit is further configured to: establish, according to the grid to which each bounding box belongs, an index queue corresponding to that grid, where the index queue includes the sort number of at least one bounding box.
In obtaining the adjacent bounding boxes of the reference bounding box, the traversal unit is further configured to: obtain the adjacent bounding boxes from the target grid and the adjacent grids of the target grid according to the index queues.
In an optional example, the sorting unit is further configured to: before the bounding boxes are sorted by score, determine that the multiple bounding boxes are bounding boxes whose scores are greater than a preset score.
In an optional example, the determining unit is specifically configured to: determine the target detection result according to the target bounding boxes whose scores are greater than a preset score.
In a third aspect, an embodiment of this application provides a device, which includes:
a processor and a transmission interface;
the processor invokes executable program code stored in a memory, so that the device performs any method of the first aspect.
In an optional example, the device further includes the memory, which is coupled to the processor.
In an optional example, the device further includes an image sensor, configured to acquire the target image.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium. The computer-readable storage medium includes program instructions that, when run on a computer, cause the computer to perform any method of the first aspect.
In a fifth aspect, an embodiment of this application provides a computer program product containing instructions that, when run on a computer or processor, cause the computer or processor to perform the method of the first aspect or any of its possible implementations.
These and other aspects of this application will be clearer and easier to understand from the description of the following embodiments.
Description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings used in the embodiments are briefly introduced below.
FIG. 1A is a schematic diagram of generating multiple bounding boxes according to an embodiment of this application;
FIG. 1B is another schematic diagram of generating multiple bounding boxes according to an embodiment of this application;
FIG. 2 is a schematic flowchart of a target detection method according to an embodiment of this application;
FIG. 3 is a schematic diagram of grid division according to an embodiment of this application;
FIG. 4A is a schematic diagram of a grid division process according to an embodiment of this application;
FIG. 4B is a schematic diagram of determining the grid to which a bounding box belongs according to an embodiment of this application;
FIG. 4C is a schematic diagram of grid adjacency relationships according to an embodiment of this application;
FIG. 5 is a schematic flowchart of a method for traversing multiple bounding boxes and obtaining target bounding boxes according to an embodiment of this application;
FIG. 6 is a schematic diagram of index queues in grids according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of a target detection device according to an embodiment of this application;
FIG. 8 is a schematic structural diagram of a device according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solutions of this application, the technical solutions in the embodiments of this application are described below with reference to the drawings in the embodiments.
The terms "first", "second", and the like in the specification, the claims, and the above drawings of this application are used to distinguish similar objects and do not necessarily describe a specific order or sequence. In addition, the terms "including" and "having" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be understood that in this application, "at least one (item)" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B can each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression means any combination of these items, including any combination of a single item or multiple items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can each be single or multiple.
When a deep learning detection network is used for target detection, the confidence and position of a target object are determined from the scores and positions of the bounding boxes (bboxes) generated on the target image. To avoid missing target objects, the bboxes are generated densely, so multiple bounding boxes often appear on the same target object. Refer to FIG. 1A, a schematic diagram of generating multiple bounding boxes according to an embodiment of this application. As shown in (a) of FIG. 1A, three bboxes are generated on a face image, each with a different score. To select the optimal box from many overlapping boxes, the NMS algorithm is introduced to suppress the low-scoring boxes and keep the highest-scoring bbox on the target object as the optimal box, as shown in (b) of FIG. 1A. Finally, the position of the target object is determined from the optimal box, and the confidence that the target object is a face is determined to be 0.98.
Alternatively, refer to FIG. 1B, another schematic diagram of generating multiple bounding boxes according to an embodiment of this application. As shown in (c) of FIG. 1B, three bboxes are generated on each of two faces. After the NMS algorithm suppresses the low-scoring boxes, two optimal boxes remain. Finally, the positions of the two target objects are determined from the two optimal boxes, and the confidences that the two target objects are faces are 0.93 and 0.80, respectively.
In the traditional method, when the NMS algorithm is used to suppress low-scoring bounding boxes and obtain the optimal boxes, all bboxes are first sorted in descending order of score, and the optimal boxes are then selected through repeated IOU calculations.
The repeated IOU calculations proceed as follows:
(1) Select the bbox with the highest score on the target image and denote it S_bbox.
(2) Calculate the IOU between S_bbox and each box ranked after it in turn. The IOU is calculated as follows:
IOU = (A1∩A2)/(A1∪A2)
where A1 and A2 denote the regions covered by the two boxes involved in the IOU calculation, so the IOU is the area of their intersection divided by the area of their union. If the IOU is greater than the set threshold T, the box is suppressed by S_bbox and is deleted from the sorted queue.
(3) If the selected S_bbox is not the last bbox in the sorted queue, select the box immediately after it as the new S_bbox and repeat step (2).
As the principle of the NMS algorithm shows, each selected S_bbox must undergo an IOU calculation with every box ranked after it, so the worst-case complexity is O(N^2). As the number of bboxes grows, the running time therefore grows quadratically. The longer NMS time lowers the detection frame rate and seriously degrades detection performance.
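Steps (1) to (3) of the traditional procedure can be sketched in Python as follows. This is a minimal illustration, assuming boxes are given as (xmin, ymin, xmax, ymax) tuples; the names `iou` and `classic_nms` are hypothetical and do not come from this application.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def classic_nms(boxes, scores, threshold):
    """Return the indices of the boxes that survive suppression."""
    # Sorted queue of box indices in descending order of score.
    queue = sorted(range(len(boxes)), key=lambda k: scores[k], reverse=True)
    kept = []
    while queue:
        s_bbox = queue.pop(0)  # steps (1) and (3): next highest-scoring box
        kept.append(s_bbox)
        # Step (2): delete every later box whose IOU with S_bbox exceeds T.
        queue = [k for k in queue if iou(boxes[s_bbox], boxes[k]) <= threshold]
    return kept
```

When few boxes overlap, almost nothing is deleted from the queue, so every box is compared with nearly every box ranked after it. This is the quadratic worst case described above.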
To solve the above problem, refer to FIG. 2, a schematic flowchart of a target detection method according to an embodiment of this application. As shown in FIG. 2, the method includes the following steps:
201. Acquire multiple bounding boxes on a target image and the scores of the multiple bounding boxes, where a score represents the confidence that the bounding box contains a target object.
The target image is the image on which target detection is to be performed. It can be an image input directly by an input device, or a part of such an image determined by a size or range set by the user. As in the traditional NMS algorithm, the multiple bounding boxes generated on the target image and their scores are acquired first. The score of a bounding box represents the confidence that the bounding box contains the target object and can be any value between 0 and 1.
202. Divide the target image into multiple grids, and determine the grids to which the multiple bounding boxes belong.
The target image is divided into multiple grids, each of which corresponds to a part of the target image. The grids can be divided randomly, according to the shape and size of the target image, or according to the positions or sizes of the bounding boxes. Refer to FIG. 3, a schematic diagram of grid division according to an embodiment of this application. As shown in (a) of FIG. 3, multiple grids are divided according to the shape and size of the target image so that the generated grids exactly cover the whole target image. Alternatively, as shown in (b) of FIG. 3, the grids are divided according to the positions of the bounding boxes in the target image so that the generated grids completely cover the bounding boxes, and regions without bounding boxes are not divided into grids. After the grid division is complete, the grid to which each bounding box belongs can be determined from the position of the bounding box.
When dividing grids, a non-fixed size can be used, in which case the grids on the target image have different widths and heights. A fixed size can also be used, in which case all grids on the target image have the same width and height. With a fixed size, either the fixed grid size is determined from the target image size and the number of grids, or the number of grids is determined from the target image size and the fixed grid size. In the former case, for example, if the target image size is 3 pixels * 2 pixels and the number of grids is 3*2, the fixed grid size is 1 pixel * 1 pixel. In the latter case, for example, if the target image size is 8 pixels * 6 pixels and the grid size is 2 pixels * 2 pixels, the number of horizontal grids is Nw = ceil(8 pixels / 2 pixels) = 4, the number of vertical grids is Nh = ceil(6 pixels / 2 pixels) = 3, and the total number of grids is 4*3 = 12. Here the ceil function rounds up, so a grid is also generated for the remaining strip of the target image that is smaller than one grid. In some cases, the floor function can instead be used to round down, or the round function to round to the nearest integer, when calculating the number of grids: a bounding box is unlikely to be generated in an image region smaller than one grid, or any bounding box generated there has a low score, so the overlap calculation for bounding boxes in that part of the target image can be omitted.
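The grid-count arithmetic above can be checked with a short Python sketch. The function name `grid_counts` and its parameters are hypothetical and chosen only for illustration; the rounding function is passed in so that the ceil and floor variants mentioned above can be compared.

```python
import math


def grid_counts(image_w, image_h, cell_w, cell_h, rounding=math.ceil):
    """Return (Nw, Nh), the number of grids along the width and the height."""
    return rounding(image_w / cell_w), rounding(image_h / cell_h)


# The 8 * 6 pixel image with 2 * 2 pixel grids from the text: 4 * 3 = 12 grids.
nw, nh = grid_counts(8, 6, 2, 2)
total = nw * nh
```

With a 9 * 6 pixel image and the same grid size, `math.ceil` gives 5 columns (the last one covers the leftover 1-pixel strip), while `math.floor` gives 4 and leaves that strip without a grid.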
Optionally, the multiple bounding boxes have different sizes, and dividing the target image into multiple grids includes: dividing the target image into multiple grids according to the largest of the sizes of the multiple bounding boxes.
Bounding boxes of different sizes have different widths and heights. When grids are divided with a fixed size and the bounding boxes differ in size, the largest bounding box size can be used as the fixed grid size. This ensures that all adjacent bounding boxes are covered when the adjacent bounding boxes of a reference bounding box are obtained.
Optionally, the multiple bounding boxes have the same size, and dividing the target image into multiple grids includes: dividing the target image into multiple grids according to the size of the multiple bounding boxes.
Bounding boxes of the same size have the same width and height. When grids are divided with a fixed size and the bounding boxes all have the same size, that size can be used as the fixed grid size.
Specifically, refer to FIG. 4A, a schematic diagram of a grid division process according to an embodiment of this application. As shown in FIG. 4A, the height H and width W of the target image are obtained, and the height Hb and width Wb of the bounding box are used as the fixed grid size. For example, the number of grids in the width direction is Nw = ceil(W/Wb), and the number of grids in the height direction is Nh = ceil(H/Hb), where ceil denotes rounding up. Each small cell on the right of FIG. 4A is one of the divided grids.
Optionally, determining the grids to which the multiple bounding boxes belong includes: determining a target coordinate point, where the target coordinate point is any coordinate point on or within the bounding box; and determining the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
Optionally, the target coordinate point is any one of: the upper-right corner point, the upper-left corner point, the lower-right corner point, the lower-left corner point, or the center point of the bounding box.
After the target image is divided into grids, the grid to which each bounding box belongs is determined from the position of the bounding box. A bounding box has a certain area, and different parts of it may lie in different grids. The grid to which a bounding box belongs can be determined as the grid containing any one coordinate point on or inside the bounding box. Alternatively, the grids containing multiple coordinate points on or within the bounding box are determined, and the grid that contains the largest number of these coordinate points is taken as the grid to which the bounding box belongs.
Take the case in which the grid of a bounding box is determined by the grid containing one coordinate point on or within the bounding box. Refer to FIG. 4B, a schematic diagram of determining the grid to which a bounding box belongs according to an embodiment of this application. As shown in FIG. 4B, the generated grids are numbered first, and the grid number of each bounding box is then determined. Taking the bounding box with a score of 0.84 as an example, and assuming that its upper-left corner point is (Xmin, Ymax), the grid number of the bounding box can be determined with the following formulas:
Iw = floor(Xmin/Wb)
Ih = floor(Ymax/Hb)
where Iw is the grid index of the bounding box in the width direction and Ih is the grid index in the height direction. That is, based on the grid containing the upper-left corner point, the grid number of the bounding box is determined to be R_Ih_Iw.
The grids in FIG. 4B are numbered from 0, so the formulas above use the floor function to round down. When the grids are numbered from 1, the formulas can use the ceil function to round up instead.
Similarly, the upper-right corner point (Xmax, Ymax), the lower-right corner point (Xmax, Ymin), or the center point (Xmed, Ymed) of the bounding box can be obtained, the grid containing that point calculated with the formulas above, and the grid to which the bounding box belongs determined accordingly.
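The grid-number formulas can be illustrated with the following Python sketch. It assumes 0-based grid numbering, boxes given as (xmin, ymin, xmax, ymax), and the corner naming used in the text, where the upper-left corner is (Xmin, Ymax). The function name `grid_label` and the `anchor` parameter are hypothetical.

```python
import math


def grid_label(box, cell_w, cell_h, anchor="upper_left"):
    """Return the label R_Ih_Iw of the grid that contains the chosen point."""
    xmin, ymin, xmax, ymax = box
    points = {
        "upper_left": (xmin, ymax),
        "upper_right": (xmax, ymax),
        "lower_right": (xmax, ymin),
        "lower_left": (xmin, ymin),
        "center": ((xmin + xmax) / 2, (ymin + ymax) / 2),
    }
    x, y = points[anchor]
    iw = math.floor(x / cell_w)  # grid index in the width direction
    ih = math.floor(y / cell_h)  # grid index in the height direction
    return f"R_{ih}_{iw}"
```

Different target coordinate points can place the same box in different grids, so one choice should be applied consistently to all bounding boxes of an image.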
203. Traverse the multiple bounding boxes, calculate the degree of overlap between a reference bounding box and its adjacent bounding boxes, and obtain target bounding boxes according to the degree of overlap. The reference bounding box is any one of the multiple bounding boxes, the adjacent bounding boxes include the bounding boxes belonging to a target grid and the bounding boxes belonging to the adjacent grids of the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not include the reference bounding box.
After the grid to which each bounding box belongs is determined, the relationship between bounding boxes can be determined from the grids to which they belong and the relationship between those grids. In the embodiments of this application, the relationship between bounding boxes mainly refers to whether they are adjacent or not.
Optionally, the adjacent grids of the target grid can be determined as its eight-neighborhood grids, that is, all grids that share an edge or a vertex with the target grid. Refer to FIG. 4C, a schematic diagram of grid adjacency relationships according to an embodiment of this application. As shown in (a) of FIG. 4C, with grid R_1_1 as the target grid, the filled grids are its eight-neighborhood grids, that is, its adjacent grids.
Optionally, the adjacent grids of the target grid can be determined as its four-neighborhood grids, that is, all grids that share an edge with the target grid. As shown in (b) of FIG. 4C, the filled grids are the four-neighborhood grids of the target grid R_1_1, that is, its adjacent grids.
Optionally, the adjacent grids of the target grid can be determined according to the distance from the center of the target grid. For example, a grid whose center is no farther than a preset distance from the center of the target grid can be taken as an adjacent grid. The preset distance can be the side length of the target grid, a multiple of that side length, or another value. For example, in (c) of FIG. 4C, assuming that the preset distance is the side length of the target grid R_1_1, the distance between the centers of grids R_0_1, R_1_0, R_1_2, and R_2_1 and the center of the target grid equals the preset distance. These four grids satisfy the condition that the center distance is not greater than the preset distance, so they are the adjacent grids of the target grid.
After the adjacent grids of the target grid are determined, the adjacent bounding boxes of the reference bounding box can be determined. The bounding boxes located in the same grid as the reference bounding box and the bounding boxes located in the adjacent grids of that grid are all adjacent bounding boxes of the reference bounding box. The adjacent bounding boxes of the reference bounding box do not include the reference bounding box itself.
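The eight-neighborhood and four-neighborhood rules can be sketched in Python as follows. The function name `adjacent_grids`, the (Ih, Iw) tuple representation of a grid, and the clipping of neighbors at the image border are assumptions made for illustration.

```python
def adjacent_grids(ih, iw, nh, nw, connectivity=8):
    """List the (Ih, Iw) indices of the grids adjacent to grid R_ih_iw.

    nh and nw are the numbers of grids in the height and width directions;
    neighbors that would fall outside the image are dropped.
    """
    if connectivity == 8:
        # Grids that share an edge or a vertex with the target grid.
        offsets = [(dh, dw) for dh in (-1, 0, 1) for dw in (-1, 0, 1)
                   if (dh, dw) != (0, 0)]
    elif connectivity == 4:
        # Grids that share an edge with the target grid.
        offsets = [(-1, 0), (0, -1), (0, 1), (1, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")
    return [(ih + dh, iw + dw) for dh, dw in offsets
            if 0 <= ih + dh < nh and 0 <= iw + dw < nw]
```

For the target grid R_1_1 in a 3 * 3 layout, the four-neighborhood yields R_0_1, R_1_0, R_1_2, and R_2_1, the same grids that the center-distance rule selects when the preset distance equals the grid side length.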
Optionally, before the multiple bounding boxes are traversed, the method further includes: sorting the bounding boxes by score to obtain the sort number corresponding to each bounding box.
After multiple bounding boxes are generated on the target image, each bounding box has a corresponding score, which can be any value between 0 and 1 and represents the confidence that the bounding box contains the target object to be detected. Because a high-scoring bounding box suppresses low-scoring ones, the bounding boxes are sorted by score to obtain their sort numbers. When the bounding boxes are then traversed in sort-number order, the sort numbers of the reference bounding box and an adjacent bounding box determine whether the overlap between them needs to be calculated, which effectively improves traversal efficiency.
The bounding boxes can be sorted in descending order of score, in which case a higher score corresponds to a smaller sort number, or in ascending order of score, in which case a higher score corresponds to a larger sort number. Taking descending order as an example, the sort numbers of the bounding boxes in FIG. 1B are shown in Table 1:
Table 1
Sorting number    Bounding box score
1                 0.93
2                 0.84
3                 0.80
4                 0.62
5                 0.51
6                 0.31
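The descending sort can be sketched in a few lines of Python. The order in which the scores arrive and the function name are assumptions made for illustration; the score values are those of Table 1, and sorting numbers start at 1 as in the table.

```python
def assign_sorting_numbers(scores, first_number=1):
    """Return {box index: sorting number}, with the highest score numbered first."""
    # Box indices ordered from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return {box: number for number, box in enumerate(order, start=first_number)}


scores = [0.51, 0.93, 0.31, 0.80, 0.62, 0.84]  # hypothetical detection order
numbers = assign_sorting_numbers(scores)

# Print the boxes by ascending sorting number, reproducing the rows of Table 1.
for box in sorted(numbers, key=numbers.get):
    print(numbers[box], scores[box])
```

Passing `first_number=0` gives sorting numbers that start from 0, which is the convention used in the traversal of Figure 5.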
Optionally, before sorting the bounding boxes by score, the method further includes: determining that the multiple bounding boxes are the bounding boxes whose scores are greater than a preset score.
In the above process, bounding boxes and their corresponding scores are generated on the target image. The scores can take any value between 0 and 1. When the score of a bounding box is lower than the preset score, the confidence that it contains the target object is very low, so even if the bounding box is not suppressed, it cannot be used to conclude that it contains the target object. Bounding boxes whose scores are lower than the preset score can therefore be filtered out directly. For example, if the preset score is 0.8, the bounding boxes in Figure 1B with scores of 0.62, 0.51, and 0.31 are filtered out, and only the bounding boxes with scores of 0.93, 0.84, and 0.80 are retained. The corresponding sorting numbers are shown in Table 2:
Table 2
Sorting number    Bounding box score
1                 0.93
2                 0.84
3                 0.80
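The score filter that produces Table 2 can be sketched as follows. The function name and the box labels are hypothetical. The sketch keeps a box whose score equals the preset score, because the example retains the 0.80 box with a preset score of 0.8; an implementation that requires a strictly greater score would use `>` instead.

```python
def keep_high_scoring(boxes, scores, preset_score):
    """Drop boxes whose score is lower than preset_score; keep the rest in order."""
    kept = [(b, s) for b, s in zip(boxes, scores) if s >= preset_score]
    return [b for b, _ in kept], [s for _, s in kept]


boxes = ["a", "b", "c", "d", "e", "f"]          # placeholder box labels
scores = [0.93, 0.84, 0.80, 0.62, 0.51, 0.31]   # scores from Table 1
kept_boxes, kept_scores = keep_high_scoring(boxes, scores, preset_score=0.8)
print(kept_boxes, kept_scores)
```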
Filtering out low-scoring bounding boxes in this way effectively reduces the amount of data in the subsequent overlap calculations and improves target detection efficiency. After all or some of the above steps are completed, the multiple bounding boxes can be traversed to obtain the target bounding boxes. Refer to Figure 5, which is a schematic flowchart of a method for traversing multiple bounding boxes and obtaining target bounding boxes according to an embodiment of this application. The method includes the following steps:
501. The multiple bounding boxes are N bounding boxes; initialize the identification bits of the N bounding boxes to 0.
502. Set i to 0.
503. Obtain the bounding box with sorting number i as the reference bounding box, and obtain its identification bit. If the identification bit of the reference bounding box is 1, perform step 504; if it is 0, perform step 505.
504. Increment i by 1 and determine whether i is less than N. If i is less than N, perform step 503; if i is not less than N, perform step 511.
505. Obtain the adjacent bounding boxes of the reference bounding box.
506. Determine whether the sorting number of the adjacent bounding box is greater than i. If yes, perform step 507; if no, perform step 509.
507. Calculate the intersection over union (IOU) of the reference bounding box and the adjacent bounding box, and determine whether it is greater than a preset threshold. If no, perform step 509; if yes, perform step 508.
508. Set the identification bit of the adjacent bounding box to 1, then perform step 509.
509. Determine whether the adjacent bounding box is the last adjacent bounding box of the reference bounding box. If yes, perform step 504; if no, perform step 510.
510. Select the next adjacent bounding box and perform step 506.
511. End the traversal and take the bounding boxes whose identification bit is 0 as the target bounding boxes.
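Steps 501 to 511 can be sketched in Python as below. This is an illustration under stated assumptions rather than the implementation of this application: boxes are assumed to be (x1, y1, x2, y2) tuples already listed in sorting-number order starting from 0, and `neighbors_of` is a hypothetical callable that returns the sorting numbers of the adjacent bounding boxes of a given box.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0  # the boxes do not overlap
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def traverse(boxes, neighbors_of, threshold):
    """Return the sorting numbers of the boxes that are not suppressed."""
    n = len(boxes)
    flags = [0] * n                                  # step 501
    for i in range(n):                               # steps 502 and 504
        if flags[i] == 1:                            # step 503: suppressed, skip
            continue
        for j in neighbors_of(i):                    # steps 505, 509, 510
            if j <= i:                               # step 506
                continue
            if iou(boxes[i], boxes[j]) > threshold:  # step 507
                flags[j] = 1                         # step 508
    return [i for i in range(n) if flags[i] == 0]    # step 511


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
all_others = lambda i: [j for j in range(len(boxes)) if j != i]
print(traverse(boxes, all_others, threshold=0.5))
```

In this usage example every other box is treated as adjacent, which reduces to ordinary non-maximum suppression; the grid-based method would pass a `neighbors_of` that only returns boxes from the target grid and its adjacent grids.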
In this process, N is the total number of bounding boxes obtained from the target image. The identification bit of a bounding box indicates whether the bounding box is suppressed: the first identification value indicates that it is not suppressed, and the second identification value indicates that it is suppressed. In the initial state, the identification bits of all bounding boxes are set to the first identification value. In the example of Figure 5, the first identification value is 0, meaning that no bounding box is suppressed. It should be understood that the initial identification bit can also be another value or character. The reference bounding boxes are then obtained in the order of their sorting numbers. In the example of Figure 5, the sorting numbers start from 0, so i is first set to 0 and the bounding box with sorting number 0 is taken as the first reference bounding box. If the sorting numbers start from 1, i is first set to 1 and the bounding box with sorting number 1 is taken as the first reference bounding box. In other words, the initial value of i is determined by the first sorting number.
Next, the identification bit of the reference bounding box is obtained. The identification bit of the first reference bounding box is 0, so its adjacent bounding boxes are obtained and the overlap between the reference bounding box and each adjacent bounding box is calculated. The overlap is the proportion taken up by the overlapping area of the two boxes: it can be the ratio of the overlapping area to the area of the reference bounding box, to the area of the adjacent bounding box, or to the merged area of the reference and adjacent bounding boxes. The overlap can be expressed by the intersection over union, that is, calculated with the IOU formula. If the IOU is greater than the preset threshold, the adjacent bounding box is suppressed by the reference bounding box, and the identification bit of the adjacent bounding box is set to the second identification value, which is 1 in the example of Figure 5. The next adjacent bounding box of the reference bounding box is then obtained; if its sorting number is greater than that of the reference bounding box and its identification bit is the first identification value, the IOU of the reference bounding box and this adjacent bounding box is calculated in the same way. This continues until all adjacent bounding boxes of the reference bounding box have been through the IOU calculation and the corresponding identification bits have been modified.
The next bounding box is then selected as the reference bounding box according to the sorting numbers, and the overlap calculation between the reference bounding box and its adjacent bounding boxes continues until the bounding box with the Nth sorting number has been traversed. Alternatively, because the bounding box with the Nth sorting number has no adjacent bounding box with a larger sorting number, its traversal can be omitted, so the traversal can also end after the (N-1)th bounding box has been traversed.
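The three overlap measures named above (overlap relative to the reference box, relative to the adjacent box, and relative to the merged area) can be compared with a short sketch. Boxes are assumed to be (x1, y1, x2, y2) tuples with non-zero area, and the function name is hypothetical.

```python
def overlap_ratios(ref, adj):
    """Return (overlap / ref area, overlap / adj area, overlap / merged area)."""
    def area(box):
        return (box[2] - box[0]) * (box[3] - box[1])

    # Width and height of the overlapping rectangle, clamped at zero.
    iw = max(0, min(ref[2], adj[2]) - max(ref[0], adj[0]))
    ih = max(0, min(ref[3], adj[3]) - max(ref[1], adj[1]))
    inter = iw * ih
    merged = area(ref) + area(adj) - inter  # area covered by either box
    return inter / area(ref), inter / area(adj), inter / merged


# Two 10x10 boxes that overlap on half of their width.
print(overlap_ratios((0, 0, 10, 10), (5, 0, 15, 10)))
```

The third value is the intersection over union. It is never larger than the other two, so a given preset threshold is stricter when applied to the IOU than when applied to either single-box ratio.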
After the multiple bounding boxes have been traversed, the main change is to their identification bits: the identification bits of the suppressed bounding boxes have been changed to 1, while those of the unsuppressed bounding boxes remain 0. The bounding boxes whose identification bit is 0 are taken as the target bounding boxes; that is, the target bounding boxes are the bounding boxes that are not suppressed among the multiple bounding boxes.
Optionally, after determining the grids to which the multiple bounding boxes belong, the method further includes: establishing, according to the grid to which each bounding box belongs, an index queue corresponding to the grid, where the index queue includes the sorting number of at least one bounding box. Obtaining the adjacent bounding boxes of the reference bounding box includes: obtaining the adjacent bounding boxes from the target grid and the adjacent grids of the target grid according to the index queues.
In the above process, after the grids to which the bounding boxes belong are determined and before the multiple bounding boxes are traversed, an index queue can be established for the bounding boxes belonging to each grid. The order of the index queue can be determined by the sorting numbers of the bounding boxes. Refer to Figure 6, which is a schematic diagram of index queues in grids according to an embodiment of this application. As shown in Figure 6, the grid to which each bounding box belongs is determined in sorting-number order, and once a bounding box is determined to belong to a grid, its sorting number is added to the index queue of that grid. Figure 6 shows the index queues generated for grids R_1_0, R_1_2, and R_2_2. Each index queue includes the sorting numbers of one or more bounding boxes, arranged in order, and can also include the score of the bounding box corresponding to each sorting number. Once the index queues are generated, when the multiple bounding boxes are traversed and the adjacent bounding boxes of a reference bounding box are obtained, they can be obtained in order from the index queues of the target grid to which the reference bounding box belongs and of the adjacent grids of the target grid.
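The index queues can be sketched with a dictionary that maps each grid to a list of sorting numbers. The sketch assumes that boxes are (x1, y1, x2, y2) tuples listed in sorting-number order starting from 0, that a box belongs to the grid containing its center point, and that grids are square and indexed as (row, column); all names are hypothetical.

```python
from collections import defaultdict


def build_index_queues(boxes, cell_size):
    """Map each grid (row, col) to the sorting numbers of the boxes it contains."""
    queues = defaultdict(list)
    # Boxes are visited in sorting-number order, so every queue stays ordered.
    for sorting_number, (x1, y1, x2, y2) in enumerate(boxes):
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        cell = (int(cy // cell_size), int(cx // cell_size))
        queues[cell].append(sorting_number)
    return queues


def adjacent_boxes(i, target_cell, adjacent_cells, queues):
    """Sorting numbers of boxes in the target grid and its adjacent grids, excluding i."""
    result = []
    for cell in [target_cell, *adjacent_cells]:
        result.extend(j for j in queues.get(cell, []) if j != i)
    return result


boxes = [(0, 0, 8, 8), (2, 2, 10, 10), (12, 0, 20, 8), (40, 40, 48, 48)]
queues = build_index_queues(boxes, cell_size=10)
print(dict(queues))
print(adjacent_boxes(0, (0, 0), [(0, 1), (1, 0)], queues))
```

Because each queue is already ordered by sorting number, a traversal could also stop scanning a queue early or skip entries whose sorting number is not greater than that of the reference box; the sketch leaves that check to the caller.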
205. Determine the target detection result according to the score of the target bounding box.
The target bounding boxes obtained through the above steps are not suppressed by other bounding boxes and can be used to determine the target detection result. As shown in (d) of Figure 1B, the bounding boxes with scores 0.93 and 0.80 are the target bounding boxes. The target detection result is determined according to the score of a target bounding box: the score indicates the confidence that the area enclosed by the bounding box includes the target object, and the higher the confidence, the more likely the target object is included. For example, suppose the target object in this embodiment is a face image and target bounding box 1 has a score of 0.80. The confidence that target bounding box 1 includes a face image is then 0.80, and the target detection result may be: the probability that target bounding box 1 includes a face image is 80%. If it is preset that a bounding box is determined to include the target object when its score is greater than 0.7, the target detection result may be: target bounding box 1 includes a face image.
Optionally, determining the target detection result according to the score of the target bounding box includes: determining the target detection result according to the target bounding boxes whose scores are greater than a preset score.
If multiple target bounding boxes are obtained after the traversal and some of them have scores lower than the preset score, the probability that such a bounding box includes the target object is lower than the required probability. These target bounding boxes can be filtered out directly, and the target detection result determined only from the target bounding boxes whose scores are greater than the preset score, which makes generating the target detection result more efficient. Alternatively, target bounding boxes whose scores are equal to the preset score can also be used to determine the target detection result.
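Turning the scores of the target bounding boxes into a detection result can be sketched as follows. The function name, the box labels, the third score, and the wording of the result strings are all hypothetical; the preset score of 0.7 and the scores 0.93 and 0.80 follow the example in the text.

```python
def detection_results(target_scores, preset_score=0.7):
    """Report the target boxes whose score is greater than preset_score."""
    results = {}
    for name, score in target_scores.items():
        if score > preset_score:
            results[name] = f"includes the target object (confidence {score:.2f})"
    return results


target_scores = {"box1": 0.80, "box2": 0.93, "box3": 0.55}
for name, result in detection_results(target_scores).items():
    print(name, result)
```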
It can be seen that, in the embodiments of this application, the target image is divided into multiple grids; the adjacent bounding boxes of the multiple bounding boxes are determined according to the grid division result; the multiple bounding boxes are then traversed to obtain the target bounding boxes that are not suppressed by adjacent bounding boxes; and finally the target detection result is determined according to the scores of the target bounding boxes. In this process, the grids to which the bounding boxes belong are determined first and the adjacent bounding boxes are derived from them, so that during traversal only the overlap between a bounding box and its adjacent bounding boxes needs to be calculated. This greatly reduces the amount of data processed, improves data processing efficiency, and thereby improves target detection efficiency.
Refer to Figure 7, which shows a target detection device according to an embodiment of this application. As shown in Figure 7, the device 700 includes:
an acquiring unit 701, configured to acquire multiple bounding boxes on a target image and the scores of the multiple bounding boxes, where a score represents the confidence that the bounding box contains the target object;
a dividing unit 702, configured to divide the target image into multiple grids and determine the grids to which the multiple bounding boxes belong;
a traversal unit 703, configured to traverse the multiple bounding boxes, calculate the overlap between a reference bounding box and its adjacent bounding boxes, and obtain target bounding boxes according to the overlap, where the reference bounding box is any one of the multiple bounding boxes, the adjacent bounding boxes include the bounding boxes belonging to a target grid and the bounding boxes belonging to the adjacent grids of the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not include the reference bounding box;
a determining unit 704, configured to determine the target detection result according to the scores of the target bounding boxes.
With the device provided in the embodiments of this application, the target image is divided into multiple grids; the adjacent bounding boxes of the multiple bounding boxes are determined according to the grid division result; the multiple bounding boxes are then traversed to obtain the target bounding boxes that are not suppressed by adjacent bounding boxes; and finally the target detection result is determined according to the scores of the target bounding boxes. In this process, the grids to which the bounding boxes belong are determined first and the adjacent bounding boxes are derived from them, so that during traversal only the overlap between a bounding box and its adjacent bounding boxes needs to be calculated. This greatly reduces the amount of data processed, improves data processing efficiency, and thereby improves target detection efficiency.
In an optional example, the device further includes a sorting unit 705, specifically configured to:
sort the bounding boxes by score to obtain the sorting number corresponding to each bounding box.
In an optional example, the sorting unit 705 is specifically configured to:
sort the bounding boxes in descending order of score, where a bounding box with a smaller score has a larger sorting number.
In an optional example, the traversal unit 703 is specifically configured to:
obtain the bounding box with sorting number i among the multiple bounding boxes as the reference bounding box, and obtain its identification bit;
when the identification bit of the reference bounding box is the first identification value, obtain the adjacent bounding boxes of the reference bounding box and determine whether the sorting number of an adjacent bounding box is greater than i;
when the sorting number of the adjacent bounding box is greater than i, calculate the intersection over union of the reference bounding box and the adjacent bounding box;
when the intersection over union is greater than a preset threshold, set the identification bit of the adjacent bounding box to the second identification value;
obtain the bounding boxes whose identification bit is the first identification value as the target bounding boxes.
In an optional example, the multiple bounding boxes have different sizes, and the dividing unit 702 is specifically configured to:
divide the target image into multiple grids according to the largest of the sizes of the multiple bounding boxes.
In an optional example, the multiple bounding boxes have the same size, and the dividing unit 702 is specifically configured to:
divide the target image into multiple grids according to the size of the multiple bounding boxes.
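Choosing the grid size from the bounding box sizes can be sketched as below. The sketch makes an assumption that the text does not state: the grids are square, and their side equals the longest side of any bounding box. Boxes are (x1, y1, x2, y2) tuples and the function name is hypothetical.

```python
import math


def grid_shape(image_width, image_height, boxes):
    """Return (cell_size, n_rows, n_cols) for a grid sized by the largest box."""
    # Longest side over all boxes; with equally sized boxes this is just that size.
    cell_size = max(max(x2 - x1, y2 - y1) for x1, y1, x2, y2 in boxes)
    n_rows = math.ceil(image_height / cell_size)
    n_cols = math.ceil(image_width / cell_size)
    return cell_size, n_rows, n_cols


print(grid_shape(100, 60, [(0, 0, 10, 20), (5, 5, 30, 25)]))
```

Sizing the grids by the largest box is what makes the adjacency search safe under this assumption: two boxes that overlap cannot have their reference points more than one grid apart in either direction.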
In an optional example, in terms of determining the grids to which the multiple bounding boxes belong, the dividing unit 702 is specifically configured to:
determine a target coordinate point, where the target coordinate point is any coordinate point on or within the bounding box;
determine the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
In an optional example, the target coordinate point includes:
any one of the upper-right corner point, the upper-left corner point, the lower-right corner point, the lower-left corner point, or the center point of the bounding box.
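Assigning a bounding box to a grid through its target coordinate point can be sketched as follows. The sketch assumes image coordinates with the origin at the upper-left corner and y increasing downward, boxes given as (x1, y1, x2, y2), and square grids indexed as (row, column); the function names and anchor labels are hypothetical.

```python
def target_point(box, anchor="center"):
    """Return the chosen target coordinate point (x, y) of a box."""
    x1, y1, x2, y2 = box
    points = {
        "upper_left": (x1, y1),
        "upper_right": (x2, y1),
        "lower_left": (x1, y2),
        "lower_right": (x2, y2),
        "center": ((x1 + x2) / 2, (y1 + y2) / 2),
    }
    return points[anchor]


def cell_of(box, cell_size, anchor="center"):
    """Return the (row, col) of the grid that contains the box's target point."""
    x, y = target_point(box, anchor)
    return int(y // cell_size), int(x // cell_size)


box = (8, 8, 24, 24)
print(cell_of(box, 10, "center"), cell_of(box, 10, "upper_left"))
```

Whichever point is chosen, the same choice would need to be used for every bounding box so that distances between grids remain comparable.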
In an optional example, after determining the grids to which the multiple bounding boxes belong, the dividing unit 702 is further configured to:
establish, according to the grid to which each bounding box belongs, an index queue corresponding to the grid, where the index queue includes the sorting number of at least one bounding box;
in terms of obtaining the adjacent bounding boxes of the reference bounding box, the traversal unit 703 is further configured to:
obtain the adjacent bounding boxes from the target grid and the adjacent grids of the target grid according to the index queues.
In an optional example, the sorting unit 705 is further configured to:
before sorting the bounding boxes by score, determine that the multiple bounding boxes are the bounding boxes whose scores are greater than a preset score.
In an optional example, the determining unit 704 is specifically configured to:
determine the target detection result according to the target bounding boxes whose scores are greater than a preset score.
In another embodiment of this application, referring to Figure 8, the device 800 includes at least one processor 801, at least one memory 802, and at least one communication interface 803, and further includes an image sensor 804 and a display 805. The processor 801, the memory 802, the communication interface 803, the image sensor 804, and the display 805 are connected through a communication bus and communicate with each other.
The device 800 can be used in smart devices such as smart access control and smart security systems. The device 800 can capture images through the image sensor 804, or obtain image data through the communication interface 803 from another communication device or a readable memory, and pass the data to the processor 801 for target detection. The detection results are generally post-processed (for example, classified, scored, filtered, or recognized), and the final result is then output to the display 805 for display or stored in the memory 802.
The processor 801 may be a general-purpose central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits used to control the execution of the programs of the above solutions.
The communication interface 803 is used for optical fiber communication with other devices or communication networks.
The memory 802 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these. The memory may exist independently and be connected to the processor through a bus, or it may be integrated with the processor.
The memory 802 is used to store the application program code for executing the above solutions and the program execution results, and execution is controlled by the processor 801. The processor 801 is configured to execute the application program code stored in the memory 802.
The code stored in the memory 802 can execute the target detection method provided above, for example:
acquiring multiple bounding boxes on a target image and the scores of the multiple bounding boxes, where a score represents the confidence that the bounding box contains the target object;
dividing the target image into multiple grids, and determining the grids to which the multiple bounding boxes belong;
traversing the multiple bounding boxes, calculating the overlap between a reference bounding box and its adjacent bounding boxes, and obtaining target bounding boxes according to the overlap, where the reference bounding box is any one of the multiple bounding boxes, the adjacent bounding boxes include the bounding boxes belonging to a target grid and the bounding boxes belonging to the adjacent grids of the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not include the reference bounding box;
determining the target detection result according to the scores of the target bounding boxes.
The device 800 in the embodiments of this application may also be implemented by a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or the like, which is not limited in the embodiments of this application.
本申请实施例还提供了一种计算机可读存储介质,该计算机可读存储介质中存储有指令,当其在计算机或处理器上运行时,使得计算机或处理器执行上述任一个方法中的一个或多个步骤。上述信号处理装置的各组成模块如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在所述计算机可读取存储介质中。The embodiment of the present application also provides a computer-readable storage medium that stores instructions in the computer-readable storage medium, and when it runs on a computer or a processor, the computer or the processor executes any one of the above methods. Or multiple steps. If each component module of the above-mentioned signal processing device is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in the computer readable storage medium.
基于这样的理解,本申请实施例还提供一种包含指令的计算机程序产品,当其在计算机或处理器上运行时,使得计算机或处理器执行本申请实施例提供的任一个方法。本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备或其中的处理器执行本申请各个实施例所述方法的全部或部分步骤。Based on this understanding, the embodiments of the present application also provide a computer program product containing instructions, which when run on a computer or a processor, cause the computer or the processor to execute any of the methods provided in the embodiments of the present application. The technical solution of this application is essentially or the part that contributes to the existing technology, or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium, including several instructions. This allows a computer device or a processor therein to execute all or part of the steps of the methods described in the various embodiments of the present application.
需要说明的是,对于前述的各方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请并不受所描述的动作顺序的限制,因为依据本申请,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作和模块并不一定是本申请所必须的。It should be noted that for the foregoing method embodiments, for the sake of simple description, they are all expressed as a series of action combinations, but those skilled in the art should know that this application is not limited by the described sequence of actions. Because according to this application, some steps can be performed in other order or at the same time. Secondly, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by this application.
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。In the above-mentioned embodiments, the description of each embodiment has its own focus. For parts that are not described in detail in an embodiment, reference may be made to related descriptions of other embodiments.
在本申请所提供的几个实施例中,应该理解到,所揭露的装置,可通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如所述单元的划分,仅仅为一种 逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个***,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative, for example, the division of the units is only a logical function division, and there may be other divisions in actual implementation, for example, multiple units or components may be combined or may be Integrate into another system, or some features can be ignored or not implemented. In addition, the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical or other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. Those of ordinary skill in the art can understand that all or part of the steps in the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable memory, such as the aforementioned memory 802, which is not described again here.
The embodiments of this application have been described in detail above, and specific examples have been used herein to explain the principles and implementations of this application. The description of the above embodiments is only intended to help in understanding the method of this application and its core idea. At the same time, persons of ordinary skill in the art may, based on the idea of this application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as a limitation on this application.

Claims (23)

  1. A target detection method, characterized in that the method comprises:
    acquiring multiple bounding boxes on a target image and scores of the multiple bounding boxes, where the score characterizes the confidence that the bounding box contains a target object;
    dividing the target image to obtain multiple grids, and determining the grids to which the multiple bounding boxes belong;
    traversing the multiple bounding boxes, calculating the degree of overlap between a reference bounding box and adjacent bounding boxes, and obtaining a target bounding box according to the degree of overlap; where the reference bounding box is any one of the multiple bounding boxes, the adjacent bounding boxes include bounding boxes belonging to a target grid and bounding boxes belonging to grids adjacent to the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not include the reference bounding box;
    determining a target detection result according to the score of the target bounding box.
  2. The method according to claim 1, characterized in that, before traversing the multiple bounding boxes, the method further comprises:
    sorting the bounding boxes according to their scores, and obtaining the sorting number corresponding to each bounding box.
  3. The method according to claim 2, characterized in that the sorting of the bounding boxes according to their scores specifically comprises:
    sorting the bounding boxes in descending order of score, where the smaller the score of a bounding box, the larger its corresponding sorting number.
  4. The method according to claim 3, characterized in that the traversing of the multiple bounding boxes, calculating the degree of overlap between the reference bounding box and the adjacent bounding boxes, and obtaining the target bounding box according to the degree of overlap comprises:
    acquiring the bounding box with sorting number i among the multiple bounding boxes as the reference bounding box, and at the same time acquiring its identification bit;
    in the case that the identification bit of the reference bounding box is a first identification value, acquiring the adjacent bounding boxes of the reference bounding box, and determining whether the sorting number of an adjacent bounding box is greater than i;
    in the case that the sorting number of the adjacent bounding box is determined to be greater than i, calculating the intersection over union (IoU) of the reference bounding box and the adjacent bounding box;
    in the case that the intersection over union is greater than a preset threshold, setting the identification bit of the adjacent bounding box to a second identification value;
    acquiring the bounding boxes whose identification bit is the first identification value as target bounding boxes.
  5. The method according to any one of claims 1-4, characterized in that the multiple bounding boxes have different sizes, and the dividing of the target image to obtain multiple grids comprises:
    dividing the target image into multiple grids according to the largest size among the sizes of the multiple bounding boxes.
  6. The method according to any one of claims 1-4, characterized in that the multiple bounding boxes have the same size, and the dividing of the target image to obtain multiple grids comprises:
    dividing the target image into multiple grids according to the size of the multiple bounding boxes.
  7. The method according to any one of claims 1-6, characterized in that the determining of the grids to which the multiple bounding boxes belong comprises:
    determining a target coordinate point, where the target coordinate point is any coordinate point on the bounding box or within the bounding box;
    determining the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
  8. The method according to claim 7, characterized in that the target coordinate point comprises:
    any one of the upper-right corner coordinate point of the bounding box, the upper-left corner coordinate point of the bounding box, the lower-right corner coordinate point of the bounding box, the lower-left corner coordinate point of the bounding box, or the center coordinate point of the bounding box.
  9. The method according to any one of claims 2-8, characterized in that, after determining the grids to which the multiple bounding boxes belong, the method further comprises:
    establishing, according to the grid to which each bounding box belongs, an index queue corresponding to the grid, where the index queue includes the sorting number of at least one bounding box;
    the acquiring of the adjacent bounding boxes of the reference bounding box comprises:
    acquiring the adjacent bounding boxes from the target grid and the grids adjacent to the target grid according to the index queues.
  10. The method according to any one of claims 2-9, characterized in that, before sorting the bounding boxes according to their scores, the method further comprises:
    determining that the multiple bounding boxes are multiple bounding boxes whose scores are greater than a preset score.
  11. The method according to any one of claims 1-9, characterized in that determining the target detection result according to the score of the target bounding box comprises:
    determining the target detection result according to the target bounding boxes whose scores are greater than a preset score.
  12. A target detection device, characterized in that the device comprises:
    an acquiring unit, configured to acquire multiple bounding boxes on a target image and scores of the multiple bounding boxes, where the score characterizes the confidence that the bounding box contains a target object;
    a dividing unit, configured to divide the target image to obtain multiple grids, and to determine the grids to which the multiple bounding boxes belong;
    a traversal unit, configured to traverse the multiple bounding boxes, calculate the degree of overlap between a reference bounding box and adjacent bounding boxes, and obtain a target bounding box according to the degree of overlap; where the reference bounding box is any one of the multiple bounding boxes, the adjacent bounding boxes include bounding boxes belonging to a target grid and bounding boxes belonging to grids adjacent to the target grid, the target grid is the grid to which the reference bounding box belongs, and the adjacent bounding boxes do not include the reference bounding box;
    a determining unit, configured to determine a target detection result according to the score of the target bounding box.
  13. The device according to claim 12, characterized in that the device further comprises a sorting unit, specifically configured to:
    sort the bounding boxes according to their scores, and obtain the sorting number corresponding to each bounding box.
  14. The device according to claim 13, characterized in that the sorting unit is specifically configured to:
    sort the bounding boxes in descending order of score, where the smaller the score of a bounding box, the larger its corresponding sorting number.
  15. The device according to claim 14, characterized in that the traversal unit is specifically configured to:
    acquire the bounding box with sorting number i among the multiple bounding boxes as the reference bounding box, and at the same time acquire its identification bit;
    in the case that the identification bit of the reference bounding box is a first identification value, acquire the adjacent bounding boxes of the reference bounding box, and determine whether the sorting number of an adjacent bounding box is greater than i;
    in the case that the sorting number of the adjacent bounding box is determined to be greater than i, calculate the intersection over union (IoU) of the reference bounding box and the adjacent bounding box;
    in the case that the intersection over union is greater than a preset threshold, set the identification bit of the adjacent bounding box to a second identification value;
    acquire the bounding boxes whose identification bit is the first identification value as target bounding boxes.
  16. The device according to any one of claims 12-15, characterized in that the multiple bounding boxes have different sizes, and the dividing unit is specifically configured to:
    divide the target image into multiple grids according to the largest size among the sizes of the multiple bounding boxes.
  17. The device according to any one of claims 12-15, characterized in that the multiple bounding boxes have the same size, and the dividing unit is specifically configured to:
    divide the target image into multiple grids according to the size of the multiple bounding boxes.
  18. The device according to any one of claims 12-17, characterized in that the dividing unit is specifically configured to:
    determine a target coordinate point, where the target coordinate point is any coordinate point on the bounding box or within the bounding box;
    determine the grid to which the target coordinate point belongs as the grid to which the bounding box belongs.
  19. The device according to claim 18, characterized in that the target coordinate point comprises:
    any one of the upper-right corner coordinate point of the bounding box, the upper-left corner coordinate point of the bounding box, the lower-right corner coordinate point of the bounding box, the lower-left corner coordinate point of the bounding box, or the center coordinate point of the bounding box.
  20. The device according to any one of claims 13-19, characterized in that, after determining the grids to which the multiple bounding boxes belong, the dividing unit is further configured to:
    establish, according to the grid to which each bounding box belongs, an index queue corresponding to the grid, where the index queue includes the sorting number of at least one bounding box;
    in terms of acquiring the adjacent bounding boxes of the reference bounding box, the traversal unit is further configured to:
    acquire the adjacent bounding boxes from the target grid and the grids adjacent to the target grid according to the index queues.
  21. The device according to any one of claims 12-20, characterized in that the sorting unit is further configured to:
    before the bounding boxes are sorted according to their corresponding scores, determine that the multiple bounding boxes are multiple bounding boxes whose scores are greater than a preset score.
  22. The device according to claims 11-20, characterized in that the determining unit is specifically configured to:
    determine the target detection result according to the target bounding boxes whose scores are greater than a preset score.
  23. A device, characterized by comprising: a processor and a transmission interface;
    the processor calls executable program code stored in a memory to execute the method according to any one of claims 1-11.
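
For illustration only, the flow recited in method claims 1-10 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation and not part of the claims. The names `iou` and `grid_nms`, the `(x1, y1, x2, y2)` corner format, the default thresholds, the use of the box center as the target coordinate point, and the reading of "adjacent grids" as the eight surrounding cells are all assumptions made for the example.

```python
from collections import defaultdict
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def grid_nms(boxes: List[Box], scores: List[float],
             iou_threshold: float = 0.5,
             min_score: float = 0.0) -> List[Tuple[Box, float]]:
    # Claim 10: keep only boxes whose score exceeds a preset score.
    kept = [(s, b) for s, b in zip(scores, boxes) if s > min_score]
    if not kept:
        return []
    # Claims 2-3: sort in descending order of score; the list position
    # serves as the sorting number (lower score -> larger number).
    kept.sort(key=lambda sb: sb[0], reverse=True)

    # Claim 5: the cell side is the largest box dimension (assumed: the
    # larger of width and height), so overlapping boxes can only lie in
    # the same cell or in a neighbouring cell.
    cell = max(max(b[2] - b[0], b[3] - b[1]) for _, b in kept) or 1.0

    def cell_of(b: Box) -> Tuple[int, int]:
        # Claims 7-8: a box belongs to the cell containing its center.
        return (int(((b[0] + b[2]) / 2) // cell),
                int(((b[1] + b[3]) / 2) // cell))

    # Claim 9: one index queue of sorting numbers per grid cell.
    queues = defaultdict(list)
    for idx, (_, b) in enumerate(kept):
        queues[cell_of(b)].append(idx)

    # Claim 4: False plays the role of the first identification value
    # (kept), True the second identification value (suppressed).
    suppressed = [False] * len(kept)
    for i, (_, ref) in enumerate(kept):
        if suppressed[i]:
            continue
        cx, cy = cell_of(ref)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in queues.get((cx + dx, cy + dy), ()):
                    # Only lower-scored neighbours (sorting number > i).
                    if j > i and iou(ref, kept[j][1]) > iou_threshold:
                        suppressed[j] = True

    return [(b, s) for k, (s, b) in enumerate(kept) if not suppressed[k]]
```

Under these assumptions, each reference box is compared only with boxes indexed in its own cell and the surrounding cells rather than with every other box, which is the saving the grid division is meant to provide. Calling `grid_nms([(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)], [0.9, 0.8, 0.7])` keeps the first and third boxes, because the second overlaps the higher-scored first box with an IoU of about 0.68.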
PCT/CN2020/108964 2019-10-26 2020-08-13 Target detection method and device WO2021077868A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911026844.2 2019-10-26
CN201911026844.2A CN112711972B (en) 2019-10-26 2019-10-26 Target detection method and device

Publications (1)

Publication Number Publication Date
WO2021077868A1 true WO2021077868A1 (en) 2021-04-29

Family

ID=75541559

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/108964 WO2021077868A1 (en) 2019-10-26 2020-08-13 Target detection method and device

Country Status (2)

Country Link
CN (1) CN112711972B (en)
WO (1) WO2021077868A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2593462B (en) * 2020-03-20 2022-03-30 Imagination Tech Ltd Apparatus and method for processing detection boxes

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11238314B2 (en) * 2019-11-15 2022-02-01 Salesforce.Com, Inc. Image augmentation and object detection
CN113705643B (en) * 2021-08-17 2022-10-28 荣耀终端有限公司 Target detection method and device and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160343146A1 (en) * 2015-05-22 2016-11-24 International Business Machines Corporation Real-time object analysis with occlusion handling
CN109117794A (en) * 2018-08-16 2019-01-01 广东工业大学 A kind of moving target behavior tracking method, apparatus, equipment and readable storage medium storing program for executing
CN109145756A (en) * 2018-07-24 2019-01-04 湖南万为智能机器人技术有限公司 Object detection method based on machine vision and deep learning
CN110084173A (en) * 2019-04-23 2019-08-02 精伦电子股份有限公司 Number of people detection method and device
US20190295262A1 (en) * 2018-03-22 2019-09-26 Texas Instruments Incorporated Video object detection

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108022238B (en) * 2017-08-09 2020-07-03 深圳科亚医疗科技有限公司 Method, computer storage medium, and system for detecting object in 3D image
CN109883400B (en) * 2018-12-27 2021-12-10 南京国图信息产业有限公司 Automatic target detection and space positioning method for fixed station based on YOLO-SITCOL
CN110059548B (en) * 2019-03-08 2022-12-06 北京旷视科技有限公司 Target detection method and device

Also Published As

Publication number Publication date
CN112711972A (en) 2021-04-27
CN112711972B (en) 2024-06-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20879469

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20879469

Country of ref document: EP

Kind code of ref document: A1