WO2021143865A1 - Positioning method and apparatus, electronic device, and computer-readable storage medium - Google Patents
- Publication number
- WO2021143865A1 (PCT/CN2021/072210)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target
- feature point
- distance
- feature map
- image
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2210/00—Indexing scheme for image generation or computer graphics
- G06T2210/12—Bounding box
Definitions
- the present disclosure relates to the fields of computer technology and image processing, and in particular to a positioning method and device, electronic equipment, and computer-readable storage media.
- Object detection or object positioning is an important basic technology in computer vision, which can be applied to scenes such as instance segmentation, object tracking, person recognition, and face recognition.
- Object detection or object positioning usually uses anchor frames. However, if the number of anchor frames used is large and the expressive ability of each anchor frame is weak, object positioning will suffer from defects such as a heavy computational load and inaccurate positioning.
- the present disclosure provides at least one positioning method and device.
- the present disclosure provides a positioning method, including:
- the positioning information of the object in the target image is determined.
- based on the image feature map of the target image, only one anchor frame is determined for each feature point in the image feature map, namely the object frame corresponding to the object frame information, which reduces the number of anchor frames used in the object positioning process, reduces the amount of calculation, and improves the efficiency of object positioning.
- based on the image feature map of the target image, the object type information of the object to which each feature point in the image feature map belongs, the confidence level of the object frame information, and the confidence level of the object type information can also be determined, and the final confidence level of the object frame information is then determined from these two confidence levels.
- this effectively enhances the information expression ability of the object frame or object frame information: it can express not only the positioning information and object type information of the object frame corresponding to the object frame information, but also the confidence information of the object frame information, which helps to improve the accuracy of object positioning based on the object frame.
- the image feature map includes a classification feature map used to classify the object to which each feature point in the image feature map belongs, and a positioning feature map used to position the object to which each feature point in the image feature map belongs.
- the second confidence level of the object frame information includes:
- the object frame information of the object to which the feature point belongs and the second confidence level of the object frame information are determined.
- not only is the object frame information of the object to which each feature point in the image feature map belongs determined, but the object type information of the object to which each feature point belongs is also determined,
- together with the respective confidence levels of the object type information and the object frame information, which improves the information expression ability of the object frame and thereby helps to improve the accuracy of object positioning based on the object frame.
- determining the object frame information of the object to which the feature point belongs includes:
- for each feature point in the image feature map, based on the positioning feature map, respectively determine the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs falls;
- the object frame information of the object to which the feature point belongs is determined.
- the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs falls is determined first, and then the target distance between the feature point and each boundary is determined based on that target distance range. This two-step processing improves the accuracy of the determined target distance; afterwards, an accurately positioned object frame can be determined for the feature point based on the precise target distance, which improves the accuracy of the determined object frame.
- determining the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs falls includes:
- the distance range corresponding to the maximum probability value can be selected as the target distance range in which the distance between the feature point and a certain boundary falls, which improves the accuracy of the determined target distance range and thereby helps to improve the accuracy of the distance between the feature point and that boundary determined based on the target distance range.
- selecting, from the multiple distance ranges, the target distance range in which the distance between the feature point and the boundary falls includes:
- the distance range corresponding to the largest first probability value is used as the target distance range.
- selecting, from the multiple distance ranges, the target distance range in which the distance between the feature point and the boundary falls includes:
- the distance range corresponding to the maximum target probability value is used as the target distance range in which the distance between the feature point and the boundary falls.
- an uncertainty parameter value is also determined, and the first probability values can be corrected based on this uncertainty parameter value to obtain the target probability value that the distance between the feature point and a certain boundary falls within each distance range. This improves the accuracy of the probability values that the distance between the feature point and the boundary falls within each distance range, and thereby helps to improve the accuracy of the target distance range determined based on these probability values.
- determining the second confidence level of the object frame information includes:
- the second confidence level of the object frame information of the object to which the feature point belongs is determined.
- determining the second confidence level of the object frame information of the object to which the feature point belongs includes:
- the first probability value corresponding to the distance range in which the distance between the feature point and each boundary falls can be used to determine the confidence level of the object frame information of the object to which the feature point belongs, which enhances the information expression ability of the object frame.
- determining the object type information of the object to which the feature point belongs based on the classification feature map includes:
- the object type information of the object to which the feature point belongs is determined.
- the preset object type corresponding to the largest second probability value is selected as the object type information of the object to which the feature point belongs, which improves the accuracy of the determined object type information.
- determining the positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information includes:
- Multiple target feature points are filtered from the image feature map, wherein the distance between the multiple target feature points is less than a preset threshold, and the object type information of the object to which each target feature point belongs is the same;
- the positioning information of the object in the target image is determined.
- the object frame information with the highest target confidence is selected from among closely spaced feature points with the same object type information to position the object, which effectively reduces the amount of object frame information used for object positioning and helps to improve the timeliness of object positioning.
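The filtering step above resembles a simplified non-maximum suppression over feature points: feature points closer than a threshold and sharing the same object type are grouped, and only the box with the highest target confidence is kept per group. A minimal sketch follows; the field names (`xy`, `cls`, `box`, `conf`) and the greedy, confidence-ordered grouping are illustrative assumptions, not fixed by the text.

```python
import math

def select_boxes(points, dist_thresh):
    """points: list of dicts with hypothetical keys 'xy' (feature-point
    coordinates), 'cls' (object type), 'box' (object frame info), and
    'conf' (target confidence). Keeps the highest-confidence box among
    nearby points of the same type."""
    kept = []
    used = [False] * len(points)
    # Visit points in descending target confidence so each group is
    # represented by its best-scoring object frame.
    order = sorted(range(len(points)), key=lambda i: -points[i]['conf'])
    for i in order:
        if used[i]:
            continue
        used[i] = True
        kept.append(points[i])
        for j in order:
            if used[j]:
                continue
            same_type = points[j]['cls'] == points[i]['cls']
            close = math.dist(points[i]['xy'], points[j]['xy']) < dist_thresh
            if same_type and close:
                used[j] = True  # suppressed by a higher-confidence box
    return kept
```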
- the present disclosure provides a positioning device, including:
- An image acquisition module for acquiring a target image, wherein the target image includes at least one object to be located;
- An image processing module, configured to determine, based on the image feature map of the target image, the object type information of the object to which each feature point in the image feature map belongs, the object frame information of the object to which each feature point belongs, the first confidence level of the object type information, and the second confidence level of the object frame information;
- a confidence processing module configured to determine the target confidence of the object frame information of the object to which each feature point belongs based on the first confidence and the second confidence;
- the positioning module is configured to determine the positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
- the image feature map includes a classification feature map used to classify the object to which each feature point in the image feature map belongs, and a positioning feature map used to position the object to which each feature point in the image feature map belongs.
- the image processing module is used for:
- the object frame information of the object to which the feature point belongs and the second confidence level of the object frame information are determined.
- when the image processing module determines the object frame information of the object to which the feature point belongs based on the positioning feature map, it is configured to:
- for each feature point in the image feature map, based on the positioning feature map, respectively determine the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs falls;
- the object frame information of the object to which the feature point belongs is determined.
- when the image processing module determines the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs falls, it is configured to:
- when the image processing module selects, from the plurality of distance ranges based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary falls, it is configured to:
- the distance range corresponding to the largest first probability value is used as the target distance range.
- when the image processing module selects, from the plurality of distance ranges based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary falls, it is configured to:
- the distance range corresponding to the maximum target probability value is used as the target distance range in which the distance between the feature point and the boundary falls.
- when the image processing module determines the second confidence level of the object frame information, it is configured to:
- the second confidence level of the object frame information of the object to which the feature point belongs is determined.
- when the image processing module determines the second confidence level of the object frame information of the object to which the feature point belongs, it is configured to:
- when the image processing module determines, for each feature point, the object type information of the object to which the feature point belongs based on the classification feature map, it is configured to:
- the object type information of the object to which the feature point belongs is determined.
- the positioning module is used to:
- Multiple target feature points are filtered from the image feature map, wherein the distance between the multiple target feature points is less than a preset threshold, and the object type information of the object to which each target feature point belongs is the same;
- the positioning information of the object in the target image is determined.
- the present disclosure provides an electronic device including a processor, a memory, and a bus.
- the memory stores machine-readable instructions executable by the processor.
- the processor and the memory communicate through the bus, and when the machine-readable instructions are executed by the processor, the steps of the positioning method described above are executed.
- the present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, it executes the steps of the positioning method described above.
- the apparatus, electronic equipment, and computer-readable storage medium of the present disclosure contain at least technical features that are substantially the same as or similar to the technical features of any aspect of the method, or of any implementation of any aspect, of the present disclosure.
- for the effects of the device, electronic equipment, and computer-readable storage medium, please refer to the description of the effects of the method, which is not repeated here.
- FIG. 1 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
- FIG. 2 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
- FIG. 3 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
- FIG. 4 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
- FIG. 5 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
- FIG. 6 shows a schematic structural diagram of a positioning device provided by an embodiment of the present disclosure
- FIG. 7 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
- the present disclosure provides a positioning method and device, an electronic device, and a computer-readable storage medium. In the present disclosure, based on the image feature map of the target image, only one anchor frame is determined for each feature point in the image feature map, namely the object frame corresponding to the object frame information, which reduces the number of anchor frames used in the object positioning process and the amount of calculation.
- based on the image feature map of the target image, the object type information of the object to which each feature point in the image feature map belongs, the confidence level of the object frame information, and the confidence level of the object type information can also be determined, and the final confidence level of the object frame information is then determined from these two confidence levels.
- this effectively enhances the information expression ability of the object frame, and helps to improve the accuracy of object positioning based on the object frame.
- the embodiments of the present disclosure provide a positioning method, which is applied to a terminal device for positioning an object in an image.
- the terminal device may be a camera, a mobile phone, a wearable device, a personal computer, etc., which is not limited in the embodiment of the present disclosure.
- the positioning method provided by an embodiment of the present disclosure includes steps S110 to S140.
- the target image may be an image including a target object captured during the object tracking process, or an image including a human face captured during face detection.
- the purpose of the target image is not limited in the present disclosure.
- the target image includes at least one object to be positioned.
- the objects here can be objects, people, animals, etc.
- the target image may be captured by the terminal device that executes the positioning method of this embodiment, or it may be captured by another device and transmitted to the terminal device that executes the positioning method of this embodiment.
- the method of obtaining the target image is not limited in the present disclosure.
- before performing this step, the target image needs to be processed first to obtain the image feature map of the target image.
- a convolutional neural network can be used to extract image features of the target image to obtain an image feature map.
- the image feature map of the target image is determined.
- the object type information of the object to which the feature point belongs, the object frame information of the object to which the feature point belongs, the first confidence level of the object type information, and the second confidence level of the object frame information can be determined.
- a convolutional neural network may be used to perform further image feature extraction on the image feature map to obtain the object type information, the object frame information, the first confidence level and the second confidence level.
- the object type information includes the object category of the object to which the feature point belongs.
- the object frame information includes the distance between the feature point and each boundary of the object frame corresponding to the object frame information.
- the object frame may also be referred to as an anchor frame.
- the first confidence is used to characterize the accuracy or credibility of the object type information determined based on the image feature map.
- the second confidence is used to characterize the accuracy or credibility of the object frame information determined based on the image feature map.
- S130 Based on the first confidence and the second confidence, respectively determine the target confidence of the object frame information of the object to which each feature point belongs.
- the product of the first confidence level and the second confidence level may be used as the target confidence level of the target frame information.
- the target confidence is used to comprehensively characterize the positioning accuracy and classification accuracy of the object frame corresponding to the object frame information.
- the preset weight of the first confidence, the preset weight of the second confidence, the first confidence and the second confidence can be combined to determine the target confidence.
- the disclosure does not limit the specific implementation scheme for determining the target confidence based on the first confidence and the second confidence.
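The two combination schemes mentioned above (the product of the two confidences, or a weighted combination of them) can be sketched as follows; the weighted form's example weights `w1` and `w2` are illustrative assumptions, since the text leaves the specific scheme open.

```python
def target_confidence(first_conf, second_conf, w1=0.5, w2=0.5,
                      use_product=True):
    """Combine the first confidence (object type information) and the
    second confidence (object frame information) into the target
    confidence. The product form is stated in the text; the weighted
    form and its weights are assumptions for illustration."""
    if use_product:
        return first_conf * second_conf
    return w1 * first_conf + w2 * second_conf
```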
- S140 Determine positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
- the object frame information of the object to which the feature point belongs and the target confidence of that object frame information can be used as the positioning information, in the target image, of the object to which the feature point belongs. Then, based on the positioning information of the object to which each feature point belongs, the positioning information of each object in the target image is determined.
- the target confidence level of the object frame information is determined, which effectively enhances the information expression ability of the object frame or object frame information: not only can the positioning information and object type information of the object frame corresponding to the object frame information be expressed, but the confidence information of the object frame information can also be expressed, thereby helping to improve the accuracy of object positioning based on the object frame.
- the above embodiment can determine one anchor frame for each feature point in the image feature map based on the image feature map of the target image, namely the object frame corresponding to the object frame information, which reduces the number of anchor frames used in the object positioning process, reduces the amount of calculation, and improves the efficiency of object positioning.
- the image feature map includes a classification feature map used to classify the object to which each feature point in the image feature map belongs, and a positioning feature map used to position the object to which each feature point in the image feature map belongs.
- a convolutional neural network can be used to extract the image features of the target image to obtain an initial feature map; then four 3×3 convolutional layers, each with 256 input channels and 256 output channels, process the initial feature map to obtain the classification feature map and the positioning feature map.
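A minimal single-channel sketch of the branch structure just described: each branch applies a stack of 3×3 "same" convolutions, so the spatial resolution of the feature map is preserved. Real branches would use 256-channel tensors with learned weights; the single channel and the explicit nested-list layout are simplifications for illustration.

```python
def conv3x3(feat, weight, bias):
    """feat: H x W grid (nested lists, single channel); weight: 3x3
    kernel; zero-padded 'same' convolution with stride 1, so the
    output has the same H x W shape as the input."""
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = bias
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += weight[dy + 1][dx + 1] * feat[yy][xx]
            out[y][x] = acc
    return out

def build_branch(feat, layers):
    """Apply a stack of 3x3 conv layers in sequence; the text uses
    four such layers for each of the classification and positioning
    branches."""
    for weight, bias in layers:
        feat = conv3x3(feat, weight, bias)
    return feat
```

With an identity kernel (center weight 1, all others 0), a four-layer branch returns the input unchanged, confirming that the branch preserves the feature-map resolution.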
- determining the first confidence level of the object type information and the second confidence level of the object frame information can be implemented by using the following steps:
- determine, based on the classification feature map, the object type information of the object to which each feature point in the image feature map belongs and the first confidence level of the object type information;
- determine, based on the positioning feature map, the object frame information of the object to which each feature point in the image feature map belongs and the second confidence level of the object frame information.
- a convolutional neural network or a convolutional layer may be used to perform image feature extraction on the classification feature map to obtain the object type information of the object to which each feature point belongs, and the first confidence level of the object type information.
- a convolutional neural network or a convolutional layer is used to perform image feature extraction on the positioning feature map, obtaining the object frame information of the object to which each feature point belongs and the second confidence level of the object frame information.
- not only is the object frame information of the object to which each feature point in the image feature map belongs determined, but the object type information of the object to which each feature point belongs, together with the respective confidence levels corresponding to the object type information and the object frame information, is also determined, which improves the information expression ability of the object frame and thereby helps to improve the accuracy of object positioning based on the object frame.
- determining the object frame information of the object to which each feature point in the image feature map belongs can be implemented by using steps S310 to S330.
- each boundary in the object frame may be a boundary of the object frame in various directions, for example, the upper boundary, the lower boundary, the left boundary, and the right boundary in the object frame.
- a convolutional neural network or a convolutional layer may be used to perform image feature extraction on the positioning feature map to determine the target distance range where the distance between the feature point and each boundary in the object frame of the object to which the feature point belongs.
- a convolutional layer performs image feature extraction on the positioning feature map to determine the first probability value that the distance between the feature point and the boundary falls within each distance range; finally, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary falls is selected from the multiple distance ranges. Specifically, the distance range corresponding to the largest first probability value may be used as the target distance range.
- the object frame may include, for example, an upper boundary, a lower boundary, a left boundary, and a right boundary.
- five first probability values, such as a, b, c, d, and e, corresponding to the five distance ranges of the left boundary are determined.
- the distance range corresponding to the maximum probability value is selected as the target distance range in which the distance between the feature point and the boundary falls, which improves the accuracy of the determined target distance range and thereby helps to improve the accuracy of the distance between the feature point and the boundary determined based on the target distance range.
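The range-selection step above is a simple argmax over the first probability values. A minimal sketch, representing each distance range as a hypothetical `(low, high)` interval:

```python
def pick_target_range(ranges, probs):
    """ranges: list of (low, high) distance intervals; probs: the
    first probability values, one per range. Returns the range with
    the largest first probability value as the target distance range."""
    best = max(range(len(probs)), key=lambda n: probs[n])
    return ranges[best]
```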
- S320 Based on the target distance range and the positioning feature map, respectively determine the target distance between the feature point and each boundary in the object frame of the object to which the feature point belongs.
- use a regression network matching the target distance range, such as a convolutional neural network, to perform image feature extraction on the positioning feature map to obtain the target distance between the feature point and each boundary of the object frame of the object to which the feature point belongs.
- a convolutional neural network is further used to determine an accurate distance, which can effectively improve the accuracy of the determined distance.
- a preset or trained parameter or weight N can be used to correct the determined target distance to obtain the final target distance.
- the precise target distance between the feature point and the left boundary is determined using this step.
- the target distance is marked in Figure 2 and denoted by f.
- the determined target distance is within the determined target distance range.
- S330 Determine the object frame information of the object to which the feature point belongs based on the position information of the feature point in the image feature map and the target distance between the feature point and each boundary.
- the location information of the feature point in the image feature map and the target distance between the feature point and each boundary can be used to determine the location information of each boundary in the object frame corresponding to the object frame information in the image feature map.
- the position information of all boundaries in the object frame in the image feature map can be used as the object frame information of the object to which the feature point belongs.
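The decoding step above turns a feature point's position plus its four target distances into the boundary positions of the object frame. A minimal sketch, assuming corner coordinates as the box representation and the upper/lower/left/right boundary names used in the text:

```python
def decode_box(px, py, left, top, right, bottom):
    """Feature point (px, py) and its target distances to the left,
    upper, right, and lower boundaries give the object frame's corner
    coordinates; the (x1, y1, x2, y2) output layout is an assumption."""
    x1, y1 = px - left, py - top
    x2, y2 = px + right, py + bottom
    return x1, y1, x2, y2
```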
- the accuracy of the determined target distance can be improved. After that, based on the determined precise target distance, an accurately positioned object frame can be determined for the feature point, which improves the accuracy of the determined object frame.
- this can be implemented by steps S410 to S430.
- a convolutional neural network can be used to determine the first probability value where the distance between the feature point and a certain boundary is within each distance range, and at the same time determine the distance uncertainty parameter value of the distance between the feature point and the boundary.
- the distance uncertainty parameter value here can be used to characterize the credibility of the determined first probability values.
- S420 Based on the distance uncertainty parameter value and each first probability value, determine the target probability value that the distance between the feature point and the boundary falls within each distance range.
- each first probability value is corrected by using the distance uncertainty parameter value to obtain the corresponding target probability value.
- p_{x,n} represents the target probability value that the distance between the feature point and the boundary x falls within the n-th distance range;
- N represents the number of distance ranges;
- σ_x represents the distance uncertainty parameter value corresponding to the boundary x;
- s_{x,n} represents the first probability value that the distance between the feature point and the boundary x falls within the n-th distance range;
- s_{x,m} represents the first probability value that the distance between the feature point and the boundary x falls within the m-th distance range.
- S430 Based on the determined target probability values, select, from the multiple distance ranges, the target distance range in which the distance between the feature point and the boundary falls.
- the distance range corresponding to the maximum target probability value can be selected as the target distance range.
- a distance uncertainty parameter value is also determined, and the first probability values can be corrected based on this parameter value to obtain the target probability value that the distance between the feature point and a certain boundary falls within each distance range. This improves the accuracy of the probability values that the distance between the feature point and the boundary falls within each distance range, which helps to improve the accuracy of the target distance range determined based on these probability values.
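The correction formula itself is not reproduced in this text; given the symbol definitions (p_{x,n}, s_{x,n}, σ_x, and a sum over m), a softmax over the first probability values with σ_x acting as a temperature is one consistent reading, used here purely as an assumption for illustration:

```python
import math

def corrected_probs(scores, sigma):
    """scores: the first probability values s_{x,n} for one boundary x;
    sigma: the distance uncertainty parameter value σ_x. Returns the
    target probability values p_{x,n}. The temperature-softmax form is
    an assumption, not the patent's stated formula."""
    scaled = [s / sigma for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Under this reading, a large σ_x flattens the distribution (low credibility of the first probability values), while a small σ_x sharpens it.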
- The following steps can be used to determine the confidence of the corresponding object frame information, that is, the second confidence: based on the first probability values corresponding to the target distance ranges in which the distances between a feature point in the image feature map and each boundary of the object frame of the object to which the feature point belongs lie, determine the second confidence of the object frame information of the object to which the feature point belongs.
- The mean of the first probability values corresponding to the target distance ranges in which the distances between the feature point and all boundaries of the object frame of the object to which the feature point belongs lie may be used as the second confidence.
- In this way, the first probability values corresponding to the distance ranges in which the distances between the feature point and each boundary lie are used to determine the confidence of the object frame information of the object to which the feature point belongs, that is, the second confidence, which enhances the information expression capability of the object frame.
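The mean-based second confidence described above can be sketched directly; the four-boundary left/top/right/bottom layout is an assumption for illustration.

```python
def second_confidence(first_probs):
    # first_probs: the first probability value of the selected target distance
    # range for each boundary of the object frame (e.g. left, top, right, bottom).
    # The second confidence is simply their mean.
    return sum(first_probs) / len(first_probs)
```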
- Determining the object type information of the object to which each feature point in the image feature map belongs can be achieved as follows: based on the classification feature map, determine the second probability value that the object to which each feature point belongs is of each preset object type; then determine the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
- A convolutional neural network or a convolutional layer may be used to perform feature extraction on the classification feature map to obtain, for each preset object type, the second probability value that the object to which the feature point belongs is of that type. The preset object type corresponding to the largest second probability value is then selected to determine the object type information of the object to which the feature point belongs. As shown in FIG. 2, the second probability value determined in this embodiment for the preset object type "cat" is the largest, so the object type information is determined to correspond to a cat. Note that, throughout this document, different operations may use different parts of the same convolutional neural network.
- Selecting the preset object type corresponding to the largest second probability value as the object type information of the object to which the feature point belongs improves the accuracy of the determined object type information.
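As a hedged illustration of this step, the sketch below applies a softmax to the per-type scores and takes the largest second probability value; treating that maximum probability as the first confidence is an assumption of this sketch, not stated in this excerpt.

```python
import numpy as np

def classify_feature_point(class_scores, class_names):
    # Softmax over per-type scores from the classification feature map
    # gives the second probability values; pick the largest one.
    z = np.asarray(class_scores, dtype=float)
    e = np.exp(z - z.max())
    probs = e / e.sum()               # second probability values
    k = int(probs.argmax())
    # Returning the max probability as the first confidence is an assumption.
    return class_names[k], float(probs[k])
```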
- Steps S510 to S530 implement determining the positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
- The multiple target feature points obtained by the screening are feature points belonging to the same object.
- S520: From the object frame information of the objects to which the target feature points belong, select the object frame information with the highest target confidence as the target frame information.
- The object frame information corresponding to the highest target confidence is selected to locate the object, and the remaining object frame information with lower target confidence is discarded, which reduces the amount of computation in the object positioning process.
- S530: Determine the positioning information of the object in the target image based on the selected target frame information and the target confidence of the target frame information.
- In the above manner, the object frame information with the highest target confidence is selected, from among the object frames of closely spaced feature points that share the same object type information, to locate the object. This effectively reduces the amount of object frame information used for object positioning, which improves the timeliness of object positioning.
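The S510–S530 selection can be sketched as a greedy, NMS-like procedure over feature points: keep the frame with the highest target confidence, suppress nearby feature points with the same object type information, and repeat. The dictionary keys and the greedy ordering are illustrative assumptions.

```python
import math

def locate_objects(points, dist_thresh):
    # points: list of dicts with illustrative keys 'xy', 'type', 'frame', 'conf'
    # (feature-point position, object type info, object frame, target confidence).
    remaining = sorted(points, key=lambda p: p["conf"], reverse=True)
    results = []
    while remaining:
        best = remaining.pop(0)                      # highest target confidence
        results.append((best["frame"], best["conf"]))
        # S510: drop nearby feature points that share the same object type info
        remaining = [
            p for p in remaining
            if p["type"] != best["type"]
            or math.dist(p["xy"], best["xy"]) >= dist_thresh
        ]
    return results
```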
- The embodiments of the present disclosure also provide a positioning device that locates an object in an image on a terminal device. The device and its modules can perform the same steps as the positioning method described above and achieve the same or similar beneficial effects, so the repeated parts are not described again.
- the positioning device provided by the present disclosure includes:
- the image acquisition module 610 is configured to acquire a target image, where the target image includes at least one object to be located.
- The image processing module 620 is configured to determine, based on the image feature map of the target image, the object type information of the object to which each feature point in the image feature map belongs, the object frame information of the object to which each feature point belongs, the first confidence of the object type information, and the second confidence of the object frame information.
- the confidence processing module 630 is configured to determine the target confidence of the object frame information of the object to which each feature point belongs based on the first confidence and the second confidence.
- the positioning module 640 is configured to determine the positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
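How the confidence processing module combines the first and second confidences into the target confidence is not specified in this excerpt; a common choice, assumed here purely for illustration, is their product.

```python
def target_confidence(first_conf, second_conf):
    # Assumed fusion rule (product); the excerpt does not give the formula.
    # A high target confidence then requires both a confident classification
    # (first) and a confident object frame (second).
    return first_conf * second_conf
```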
- The image feature map includes a classification feature map used to classify the objects to which the feature points in the image feature map belong, and a positioning feature map used to locate the objects to which those feature points belong.
- When the image processing module 620 determines the object frame information of the object to which each feature point in the image feature map belongs based on the positioning feature map, it is configured to:
- for each feature point in the image feature map, determine, based on the positioning feature map, the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs lies; and
- determine, based on the target distance ranges and the positioning feature map, the object frame information of the object to which the feature point belongs.
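Recovering the object frame from a feature point's position and its target distances to the boundaries can be sketched as follows; the left/top/right/bottom layout and the `stride` mapping from feature-map to image coordinates are assumptions of this sketch.

```python
def frame_from_distances(px, py, d_left, d_top, d_right, d_bottom, stride=1):
    # (px, py): feature-point position in the image feature map;
    # d_*: target distances to the four boundaries (layout assumed);
    # stride: assumed mapping from feature-map cells back to image pixels.
    cx, cy = px * stride, py * stride
    return (cx - d_left, cy - d_top, cx + d_right, cy + d_bottom)
```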
- When determining the target distance range in which the distance between a feature point and each boundary of the object frame of the object to which the feature point belongs lies, the image processing module 620 determines the first probability value that the distance lies within each of multiple distance ranges.
- When selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the multiple distance ranges, the image processing module is configured to:
- take the distance range corresponding to the largest first probability value as the target distance range.
- Alternatively, when selecting the target distance range from the multiple distance ranges based on the determined first probability values, the image processing module 620 is configured to:
- take the distance range corresponding to the largest target probability value as the target distance range in which the distance between the feature point and the boundary lies.
- When the image processing module 620 determines the second confidence of the object frame information, it is configured to:
- determine the second confidence of the object frame information of the object to which the feature point belongs based on the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary lie.
- When the image processing module 620 determines, based on the classification feature map, the object type information of the object to which each feature point in the image feature map belongs, it is configured to:
- determine the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
- The positioning module 640 is configured to:
- filter multiple target feature points from the image feature map, where the distances between the multiple target feature points are less than a preset threshold and the object type information of the objects to which the target feature points belong is the same; and
- determine the positioning information of the object in the target image.
- An embodiment of the present disclosure also provides an electronic device. As shown in FIG. 7, it comprises a processor 701, a memory 702, and a bus 703.
- The memory 702 stores machine-readable instructions executable by the processor 701. When the device runs, the processor 701 and the memory 702 communicate through the bus 703, and the processor 701 executes the machine-readable instructions to perform the steps of the positioning method described above.
- An embodiment of the present disclosure also provides a computer program product corresponding to the above method and device, comprising a computer-readable storage medium storing program code.
- The instructions included in the program code can be used to execute the method in the foregoing method embodiments; for implementation details, refer to the method embodiments, which are not repeated here.
- the modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional units in the various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a non-volatile computer-readable storage medium executable by a processor.
- Based on this understanding, the technical solution of the present disclosure, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present disclosure.
- The aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Image Analysis (AREA)
Claims (22)
- A positioning method, comprising: acquiring a target image, wherein the target image includes at least one object to be located; determining, based on an image feature map of the target image, object type information of the object to which each feature point in the image feature map belongs, object frame information of the object to which each feature point belongs, a first confidence of the object type information, and a second confidence of the object frame information; determining, based on the first confidence and the second confidence, a target confidence of the object frame information of the object to which each feature point belongs; and determining positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
- The positioning method according to claim 1, wherein the image feature map includes a classification feature map for classifying the objects to which the feature points in the image feature map belong and a positioning feature map for locating the objects to which the feature points in the image feature map belong, and determining, based on the image feature map of the target image, the object type information of the object to which each feature point belongs, the object frame information of the object to which each feature point belongs, the first confidence of the object type information, and the second confidence of the object frame information comprises: for each feature point in the image feature map, determining, based on the classification feature map, the object type information of the object to which the feature point belongs and the first confidence of the object type information; and determining, based on the positioning feature map, the object frame information of the object to which the feature point belongs and the second confidence of the object frame information.
- The positioning method according to claim 2, wherein, for each feature point in the image feature map, determining the object frame information of the object to which the feature point belongs based on the positioning feature map comprises: for each feature point in the image feature map, determining, based on the positioning feature map, the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs lies; determining, based on the target distance ranges and the positioning feature map, the target distance between the feature point and each boundary of the object frame of the object to which the feature point belongs; and determining the object frame information of the object to which the feature point belongs based on the position information of the feature point in the image feature map and the target distances between the feature point and each boundary.
- The positioning method according to claim 3, wherein determining the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs lies comprises: for each boundary of the object frame of the object to which the feature point belongs, determining, based on the positioning feature map, a maximum distance between the feature point and the boundary; segmenting the maximum distance to obtain multiple distance ranges; determining, based on the positioning feature map, a first probability value that the distance between the feature point and the boundary lies within each distance range; and selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the multiple distance ranges.
- The positioning method according to claim 4, wherein selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the multiple distance ranges comprises: taking the distance range corresponding to the largest first probability value as the target distance range.
- The positioning method according to claim 4, wherein selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the multiple distance ranges comprises: determining, based on the positioning feature map, a distance uncertainty parameter value for the distance between the feature point and the boundary; determining, based on the distance uncertainty parameter value and each first probability value, a target probability value that the distance between the feature point and the boundary lies within each distance range; and taking the distance range corresponding to the largest target probability value as the target distance range in which the distance between the feature point and the boundary lies.
- The positioning method according to claim 4, wherein determining the second confidence of the object frame information comprises: determining the second confidence of the object frame information of the object to which the feature point belongs based on the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary of the object frame of the object to which the feature point belongs lie.
- The positioning method according to claim 7, wherein determining the second confidence of the object frame information of the object to which the feature point belongs comprises: obtaining the mean of the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary of the object frame of the object to which the feature point belongs lie; and determining the mean as the second confidence.
- The positioning method according to any one of claims 2 to 8, wherein, for each feature point in the image feature map, determining the object type information of the object to which the feature point belongs based on the classification feature map comprises: for each feature point in the image feature map, determining, based on the classification feature map, a second probability value that the object to which the feature point belongs is of each preset object type; and determining the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
- The positioning method according to any one of claims 1 to 9, wherein determining the positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information comprises: filtering multiple target feature points from the image feature map, wherein the distances between the multiple target feature points are less than a preset threshold and the object type information of the objects to which the target feature points belong is the same; selecting, from the object frame information of the objects to which the target feature points belong, the object frame information with the highest target confidence as target frame information; and determining the positioning information of the object in the target image based on the selected target frame information and the target confidence of the target frame information.
- A positioning device, comprising: an image acquisition module configured to acquire a target image, wherein the target image includes at least one object to be located; an image processing module configured to determine, based on an image feature map of the target image, object type information of the object to which each feature point in the image feature map belongs, object frame information of the object to which each feature point belongs, a first confidence of the object type information, and a second confidence of the object frame information; a confidence processing module configured to determine, based on the first confidence and the second confidence, a target confidence of the object frame information of the object to which each feature point belongs; and a positioning module configured to determine positioning information of the object in the target image based on the object frame information of the object to which each feature point belongs and the target confidence of the object frame information.
- The positioning device according to claim 11, wherein the image feature map includes a classification feature map for classifying the objects to which the feature points in the image feature map belong and a positioning feature map for locating the objects to which the feature points in the image feature map belong, and the image processing module is configured to: for each feature point in the image feature map, determine, based on the classification feature map, the object type information of the object to which the feature point belongs and the first confidence of the object type information; and determine, based on the positioning feature map, the object frame information of the object to which the feature point belongs and the second confidence of the object frame information.
- The positioning device according to claim 12, wherein, when determining the object frame information of the object to which each feature point in the image feature map belongs based on the positioning feature map, the image processing module is configured to: for each feature point in the image feature map, determine, based on the positioning feature map, the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs lies; determine, based on the target distance ranges and the positioning feature map, the target distance between the feature point and each boundary of the object frame of the object to which the feature point belongs; and determine the object frame information of the object to which the feature point belongs based on the position information of the feature point in the image feature map and the target distances between the feature point and each boundary.
- The positioning device according to claim 13, wherein, when determining the target distance range in which the distance between the feature point and each boundary of the object frame of the object to which the feature point belongs lies, the image processing module is configured to: for each boundary of the object frame of the object to which the feature point belongs, determine, based on the positioning feature map, a maximum distance between the feature point and the boundary; segment the maximum distance to obtain multiple distance ranges; determine, based on the positioning feature map, a first probability value that the distance between the feature point and the boundary lies within each distance range; and select, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the multiple distance ranges.
- The positioning device according to claim 14, wherein, when selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the multiple distance ranges, the image processing module is configured to: take the distance range corresponding to the largest first probability value as the target distance range.
- The positioning device according to claim 14, wherein, when selecting, based on the determined first probability values, the target distance range in which the distance between the feature point and the boundary lies from the multiple distance ranges, the image processing module is configured to: determine, based on the positioning feature map, a distance uncertainty parameter value for the distance between the feature point and the boundary; determine, based on the distance uncertainty parameter value and each first probability value, a target probability value that the distance between the feature point and the boundary lies within each distance range; and take the distance range corresponding to the largest target probability value as the target distance range in which the distance between the feature point and the boundary lies.
- The positioning device according to claim 14, wherein, when determining the second confidence of the object frame information, the image processing module is configured to: determine the second confidence of the object frame information of the object to which the feature point belongs based on the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary of the object frame of the object to which the feature point belongs lie.
- The positioning device according to claim 17, wherein, when determining the second confidence of the object frame information of the object to which the feature point belongs, the image processing module is configured to: obtain the mean of the first probability values corresponding to the target distance ranges in which the distances between the feature point and each boundary of the object frame of the object to which the feature point belongs lie; and determine the mean as the second confidence.
- The positioning device according to any one of claims 12 to 18, wherein, when determining, for each feature point in the image feature map, the object type information of the object to which the feature point belongs based on the classification feature map, the image processing module is configured to: for each feature point in the image feature map, determine, based on the classification feature map, a second probability value that the object to which the feature point belongs is of each preset object type; and determine the object type information of the object to which the feature point belongs based on the preset object type corresponding to the largest second probability value.
- The positioning device according to any one of claims 11 to 19, wherein the positioning module is configured to: filter multiple target feature points from the image feature map, wherein the distances between the multiple target feature points are less than a preset threshold and the object type information of the objects to which the target feature points belong is the same; select, from the object frame information of the objects to which the target feature points belong, the object frame information with the highest target confidence as target frame information; and determine the positioning information of the object in the target image based on the selected target frame information and the target confidence of the target frame information.
- An electronic device, comprising a processor, a storage medium, and a bus, the storage medium storing machine-readable instructions executable by the processor, wherein, when the electronic device runs, the processor communicates with the storage medium through the bus, and the processor executes the machine-readable instructions to perform the positioning method according to any one of claims 1 to 10.
- A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when run by a processor, performs the positioning method according to any one of claims 1 to 10.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2022500616A JP2022540101A (ja) | 2020-01-18 | 2021-01-15 | Positioning method and apparatus, electronic device, computer-readable storage medium |
KR1020227018711A KR20220093187A (ko) | 2020-01-18 | 2021-01-15 | Positioning method and apparatus, electronic device, computer-readable storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010058788.7A CN111275040B (zh) | 2020-01-18 | 2020-01-18 | Positioning method and apparatus, electronic device, computer-readable storage medium |
CN202010058788.7 | 2020-01-18 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021143865A1 true WO2021143865A1 (zh) | 2021-07-22 |
Family
ID=70998770
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/072210 WO2021143865A1 (zh) | 2020-01-18 | 2021-01-15 | 定位方法及装置、电子设备、计算机可读存储介质 |
Country Status (4)
Country | Link |
---|---|
JP (1) | JP2022540101A (zh) |
KR (1) | KR20220093187A (zh) |
CN (1) | CN111275040B (zh) |
WO (1) | WO2021143865A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113762109A (zh) * | 2021-08-23 | 2021-12-07 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Training method for a text positioning model, and text positioning method
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111275040B (zh) | 2020-01-18 | 2023-07-25 | Beijing SenseTime Technology Development Co., Ltd. | Positioning method and apparatus, electronic device, computer-readable storage medium
CN111931723B (zh) | 2020-09-23 | 2021-01-05 | Beijing Yizhen Xuesi Education Technology Co., Ltd. | Target detection and image recognition method and device, computer-readable medium
CN114613147B (zh) | 2020-11-25 | 2023-08-04 | Zhejiang Uniview Technologies Co., Ltd. | Vehicle violation recognition method, apparatus, medium, and electronic device
CN112819003B (zh) | 2021-04-19 | 2021-08-27 | Beijing Miaoyijia Health Technology Group Co., Ltd. | Method and apparatus for improving the OCR recognition accuracy of physical examination reports
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108764292A (zh) * | 2018-04-27 | 2018-11-06 | Peking University | Deep-learning image target mapping and positioning method based on weakly supervised information
US20190035101A1 (en) * | 2017-07-27 | 2019-01-31 | Here Global B.V. | Method, apparatus, and system for real-time object detection using a cursor recurrent neural network |
CN109426803A (zh) * | 2017-09-04 | 2019-03-05 | Samsung Electronics Co., Ltd. | Method and device for recognizing an object
CN109522938A (zh) * | 2018-10-26 | 2019-03-26 | South China University of Technology | Deep-learning-based method for recognizing targets in images
CN111275040A (zh) * | 2020-01-18 | 2020-06-12 | Beijing SenseTime Technology Development Co., Ltd. | Positioning method and apparatus, electronic device, computer-readable storage medium
-
2020
- 2020-01-18 CN CN202010058788.7A patent/CN111275040B/zh active Active
-
2021
- 2021-01-15 WO PCT/CN2021/072210 patent/WO2021143865A1/zh active Application Filing
- 2021-01-15 JP JP2022500616A patent/JP2022540101A/ja not_active Withdrawn
- 2021-01-15 KR KR1020227018711A patent/KR20220093187A/ko not_active Application Discontinuation
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190035101A1 (en) * | 2017-07-27 | 2019-01-31 | Here Global B.V. | Method, apparatus, and system for real-time object detection using a cursor recurrent neural network |
CN109426803A (zh) * | 2017-09-04 | 2019-03-05 | Samsung Electronics Co., Ltd. | Method and device for recognizing an object
CN108764292A (zh) * | 2018-04-27 | 2018-11-06 | Peking University | Deep-learning image target mapping and positioning method based on weakly supervised information
CN109522938A (zh) * | 2018-10-26 | 2019-03-26 | South China University of Technology | Deep-learning-based method for recognizing targets in images
CN111275040A (zh) * | 2020-01-18 | 2020-06-12 | Beijing SenseTime Technology Development Co., Ltd. | Positioning method and apparatus, electronic device, computer-readable storage medium
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113762109A (zh) * | 2021-08-23 | 2021-12-07 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Training method for a text positioning model, and text positioning method
CN113762109B (zh) * | 2021-08-23 | 2023-11-07 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Training method for a text positioning model, and text positioning method
Also Published As
Publication number | Publication date |
---|---|
KR20220093187A (ko) | 2022-07-05 |
CN111275040A (zh) | 2020-06-12 |
JP2022540101A (ja) | 2022-09-14 |
CN111275040B (zh) | 2023-07-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021143865A1 (zh) | Positioning method and apparatus, electronic device, computer-readable storage medium | |
WO2020252917A1 (zh) | Blurred face image recognition method and apparatus, terminal device, and medium | |
WO2018121018A1 (zh) | Picture recognition method and apparatus, server, and storage medium | |
WO2019033574A1 (zh) | Electronic device, dynamic-video face recognition method and system, and storage medium | |
WO2019041519A1 (zh) | Target tracking apparatus and method, and computer-readable storage medium | |
CN110858394A (zh) | Image quality assessment method and apparatus, electronic device, and computer-readable storage medium | |
CN111339979B (zh) | Image recognition method and image recognition apparatus based on feature extraction | |
JP2011134114A (ja) | Pattern recognition method and pattern recognition apparatus | |
CN113343826A (zh) | Training method for a face liveness detection model, and face liveness detection method and apparatus | |
CN112200056B (zh) | Face liveness detection method and apparatus, electronic device, and storage medium | |
CN113435408A (zh) | Face liveness detection method and apparatus, electronic device, and storage medium | |
CN112836625A (zh) | Face liveness detection method and apparatus, and electronic device | |
CN111241928A (zh) | Face recognition base library optimization method, system, device, and readable storage medium | |
CN114494751A (zh) | Certificate information recognition method, apparatus, device, and medium | |
CN116091781B (zh) | Data processing method and apparatus for image recognition | |
CN115273184A (zh) | Face liveness detection model training method and apparatus | |
CN106469437B (zh) | Image processing method and image processing apparatus | |
CN114220045A (zh) | Object recognition method and apparatus, and computer-readable storage medium | |
CN114399432A (zh) | Target recognition method, apparatus, device, medium, and product | |
CN116433939B (zh) | Sample image generation method, training method, recognition method, and apparatus | |
CN111967579A (zh) | Method and apparatus for performing convolution computation on an image using a convolutional neural network | |
CN114663965B (zh) | Face-to-ID comparison method and apparatus based on dual-stage alternating learning | |
CN113221920B (zh) | Image recognition method, apparatus, device, storage medium, and computer program product | |
CN116416671B (zh) | Face image rectification method and apparatus, electronic device, and storage medium | |
CN113191195B (zh) | Face detection method and system based on deep learning
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21740651 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022500616 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 20227018711 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21740651 Country of ref document: EP Kind code of ref document: A1 |