Image processing method and device, computer equipment and storage medium

Info

Publication number
CN113313112A
CN113313112A
Authority
CN
China
Prior art keywords
image
processed
point
projection
target
Prior art date
Legal status
Granted
Application number
CN202110598079.2A
Other languages
Chinese (zh)
Other versions
CN113313112B (en)
Inventor
章国锋
鲍虎军
袁谨
刘浩敏
Current Assignee
Zhejiang Shangtang Technology Development Co Ltd
Original Assignee
Zhejiang Shangtang Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Shangtang Technology Development Co Ltd filed Critical Zhejiang Shangtang Technology Development Co Ltd
Priority to CN202110598079.2A priority Critical patent/CN113313112B/en
Publication of CN113313112A publication Critical patent/CN113313112A/en
Application granted granted Critical
Publication of CN113313112B publication Critical patent/CN113313112B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211Selection of the most significant subset of features
    • G06F18/2113Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides an image processing method, apparatus, computer device, and storage medium, wherein the method comprises: acquiring a first feature point in an image to be processed and a second feature point, in a target reference image corresponding to the image to be processed, that matches the first feature point; projecting the first feature point onto a preset plane to obtain a first projection point of the first feature point in the preset plane, and projecting the second feature point onto the preset plane to obtain a second projection point of the second feature point in the preset plane; determining target feature points belonging to a dynamic object from the first feature points based on the relative positional relationship between the first projection points and the second projection points; and removing the target feature points from the image to be processed to obtain an image processing result. According to the embodiments of the present disclosure, the target feature points belonging to the dynamic object are removed from the first feature points, so that positioning accuracy and precision are improved when positioning is performed based on the first feature points from which the target feature points have been removed.

Description

Image processing method and device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, a computer device, and a storage medium.
Background
At present, when positioning is performed using Simultaneous Localization and Mapping (SLAM), positioning is mainly achieved based on feature points extracted from images that a camera captures of a target scene in different processing cycles. However, when a dynamic object exists in the target scene, the feature points extracted from those images include feature points corresponding to the dynamic object, and these feature points degrade the precision and accuracy of positioning.
Disclosure of Invention
The embodiment of the disclosure at least provides an image processing method, an image processing device, computer equipment and a storage medium.
In a first aspect, an embodiment of the present disclosure provides an image processing method, including:
acquiring a first feature point in an image to be processed and a second feature point matched with the first feature point in a target reference image corresponding to the image to be processed;
projecting the first feature point to a preset plane to obtain a first projection point of the first feature point in the preset plane, and projecting the second feature point to the preset plane to obtain a second projection point of the second feature point in the preset plane;
determining a target feature point belonging to a dynamic object from the first feature points based on the relative positional relationship between the first projection point and the second projection point;
and removing the target feature points from the image to be processed to obtain an image processing result.
In this way, the target feature points belonging to the dynamic object can be determined among the first feature points by using the projection vectors that point from the second projection points to the first projection points, and these target feature points can be eliminated from the image to be processed. When positioning is performed based on the first feature points from which the target feature points have been eliminated, the interference of dynamic-object feature points with positioning is reduced, and positioning accuracy and precision are improved.
In a possible implementation manner, before the obtaining a first feature point in an image to be processed and a second feature point matching the first feature point in a target reference image corresponding to the image to be processed, the method further includes:
and determining the target reference image for the image to be processed based on a preset screening condition.
In a possible implementation manner, the determining the target reference image for the image to be processed based on a preset screening condition includes:
detecting whether the current key frame image meets the preset screening condition;
taking the key frame image as the target reference image under the condition that the key frame image meets the preset screening condition;
determining a first video frame image as the target reference image under the condition that the key frame image does not meet the preset screening condition;
wherein the first video frame image comprises: an image whose time stamp is earlier than that of the image to be processed and whose time stamp is closest to that of the image to be processed.
In one possible embodiment, the method further comprises: determining the image to be processed as a new key frame image under the condition that the key frame image does not meet the preset screening condition;
and the new key frame image is used for carrying out image processing on the next frame of image to be processed.
In this way, the information between the target reference image and the image to be processed is kept as dispersed as possible while the screening conditions are still satisfied, which improves the precision with which the dynamic object is eliminated.
In one possible embodiment, the preset screening conditions include at least one of:
the difference between the frame numbers of the image to be processed and the key frame image is smaller than a preset frame number difference threshold;
the number of second feature points in the key frame image matched with the first feature points reaches a preset number threshold;
and the view angle difference between the first shooting view angle corresponding to the image to be processed and the second shooting view angle corresponding to the key frame image is smaller than a preset view angle difference threshold.
When the difference between the frame numbers of the image to be processed and the target reference image is smaller than the preset frame number difference threshold, the two images are guaranteed to contain a sufficient number of matchable first and second feature points, so that the target feature points belonging to the dynamic object can be better screened from the first feature points. Likewise, using as the target reference image an image in which the number of second feature points matched with the first feature points reaches the preset number threshold allows the target feature points belonging to the dynamic object to be better screened from the first feature points. When the view angle difference between the first shooting view angle corresponding to the image to be processed and the second shooting view angle corresponding to the target reference image is smaller than the preset view angle difference threshold, the two images are guaranteed to contain mostly the same target objects, so that a sufficient number of first feature points can be determined from the image to be processed.
In a possible implementation manner, the projecting the first feature point to a preset plane to obtain a first projection point of the first feature point in the preset plane includes:
determining rough pose information of a camera when the image to be processed is acquired, based on three-dimensional position information of the camera in a target scene when the target reference image is acquired and first orientation information of the camera in the target scene when the image to be processed is acquired;
and projecting the first feature point to the preset plane based on the rough pose information to obtain a first projection point of the first feature point in the preset plane.
In this way, it is assumed that, relative to when the camera captured the target reference image, only the camera's orientation changes when it captures the image to be processed, while its position in the target scene does not. The rough pose information of the camera at the time the image to be processed is acquired can therefore be determined from the three-dimensional position information of the camera in the target scene when the target reference image was acquired and the first orientation information of the camera in the target scene when the image to be processed was acquired, and the first feature point in the image to be processed is projected onto the preset plane under this assumption.
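As a minimal sketch of how such rough pose information could be composed, assuming the pose is represented as a 4x4 camera-to-world matrix, the position can be copied from the reference pose while the rotation comes from the current orientation reading; all names below are illustrative and not taken from the disclosure:

```python
import numpy as np

def compose_rough_pose(ref_position, current_rotation):
    """Compose rough pose information for the image to be processed.

    Per the assumption above: the 3-vector position is taken from the
    target reference image, and the 3x3 rotation from the current frame's
    orientation reading. Representing the pose as a 4x4 camera-to-world
    matrix is an assumption of this sketch.
    """
    pose = np.eye(4)
    pose[:3, :3] = current_rotation  # first orientation information (current frame)
    pose[:3, 3] = ref_position       # camera position when the reference was acquired
    return pose
```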
In a possible implementation manner, the projecting the second feature point to the preset plane to obtain a second projection point of the second feature point in the preset plane includes:
and projecting the second feature point to the preset plane based on second position and posture information of the camera when the target reference image is acquired, so as to obtain a second projection point of the second feature point in the preset plane.
In a possible implementation manner, the determining, from the first feature points, target feature points belonging to a dynamic object based on a relative positional relationship between the first projection points and the second projection points includes:
determining a projection vector pointing from the second projection point to the first projection point based on a relative positional relationship between the first projection point and the second projection point;
and determining target feature points belonging to the dynamic object from the first feature points based on the projection vectors.
In this way, the projection vectors can be used to characterize whether the motions of the feature points of different target objects tend to be consistent, and the target feature points can then be screened from the first feature points.
In a possible implementation, the determining, from the first feature points, target feature points belonging to the dynamic object based on the projection vector includes:
in the 1st iteration cycle, sorting the projection vectors respectively corresponding to the first feature points according to their module lengths;
determining the target projection vectors of the 1st iteration cycle from the projection vectors based on the sorting result and a preset rejection ratio;
judging, in the nth iteration cycle, whether a preset iteration stop condition is satisfied;
if yes, determining target feature points belonging to the dynamic object from the first feature points based on the target projection vectors determined in the (n-1)th iteration cycle, where n is an integer greater than 1;
the iteration stop condition includes at least one of:
the number of iteration cycles is greater than or equal to a preset number;
and the difference value between the module length mean value determined in the current iteration period and the module length mean value determined in the previous iteration period is smaller than a preset difference value threshold value.
In one possible embodiment, the method further comprises:
in the nth iteration period, if the iteration stop condition is judged not to be met, determining a mode length mean value based on a target projection vector determined in the (n-1) th iteration period; determining a target projection vector of the nth iteration period based on the error between the modular length of each projection vector and the modular length mean value and the preset rejection ratio;
judging whether the iteration stop condition is met or not in the (n +1) th iteration cycle;
and if so, determining target characteristic points belonging to the dynamic object from the first characteristic points based on the target projection vector determined in the nth iteration cycle.
In this way, over multiple iteration cycles, the module length mean value gradually converges to a value close to the module lengths of the majority of the projection vectors, so that the target feature points belonging to the dynamic object can be screened from the first feature points more accurately.
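To make the iteration concrete, the following sketch filters candidate dynamic-object points by projection-vector module length. It assumes one plausible reading of the description, namely that the target projection vectors of each cycle are the retained, motion-consistent vectors and that the dynamic candidates are their complement; the function names, default values, and that reading are assumptions, not the disclosure's exact procedure:

```python
import numpy as np

def filter_dynamic_points(vectors, reject_ratio=0.1,
                          max_cycles=10, diff_threshold=1e-3):
    """Iteratively separate dynamic-object candidates from an (N, 2) array
    of projection vectors, following the cycle structure described above."""
    norms = np.linalg.norm(vectors, axis=1)            # module length of each vector
    n_keep = max(1, int(len(norms) * (1.0 - reject_ratio)))

    # Cycle 1: sort by module length, drop the preset rejection ratio
    # (keeping the smallest module lengths is one plausible reading)
    kept = np.argsort(norms)[:n_keep]
    prev_mean = norms[kept].mean()

    for _ in range(max_cycles):                        # stop condition: preset number of cycles
        # Retain the vectors whose module length is closest to the current mean
        errors = np.abs(norms - prev_mean)
        kept = np.argsort(errors)[:n_keep]
        mean = norms[kept].mean()
        if abs(mean - prev_mean) < diff_threshold:     # stop condition: mean has converged
            break
        prev_mean = mean

    dynamic = np.ones(len(norms), dtype=bool)
    dynamic[kept] = False                              # complement = dynamic candidates
    return dynamic
```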
In one possible embodiment, the method further comprises:
determining accurate pose information of the camera when the camera acquires the image to be processed based on non-target feature points except the target feature points in the first feature points, third feature points matched with the non-target feature points in the target reference image and second pose information of the camera when the camera acquires the target reference image;
wherein the second feature point includes the third feature point.
In this way, the pose of the camera at the time the image to be processed was acquired can be optimized, so that the resulting accurate pose information is more accurate and camera positioning is improved.
In one possible embodiment, the method further comprises:
according to the accurate pose information, the third feature point is re-projected into the image to be processed, and a third projection point of the third feature point in the image to be processed is obtained;
determining a reprojection error based on the position information of the third projection point in the image to be processed and the position information of the non-target feature point in the image to be processed;
determining a new preset rejection ratio based on the reprojection error; and the new preset rejection ratio is used for carrying out image processing on the next frame of image to be processed.
In a possible implementation manner, the re-projecting the third feature point to the image to be processed according to the accurate pose information to obtain a third projection point of the third feature point in the image to be processed includes:
determining a conversion relation between a first image coordinate system corresponding to the image to be processed and a second image coordinate system corresponding to the preset plane according to the accurate pose information;
and projecting a second projection point of the third feature point in the preset plane to the image to be processed based on the conversion relation to obtain a third projection point of the third feature point in the image to be processed.
In this way, the reprojection error is computed for the point rejection result of the image to be processed in the current cycle and is used to determine the rejection ratio for the next frame to be processed. By adjusting the preset rejection ratio of the next frame in this manner, the target feature points corresponding to dynamic objects can be screened more thoroughly from subsequent images, which reduces the accumulation of positioning errors caused by dynamic objects present across multiple frames to be processed.
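A small sketch of this feedback step follows. The description only states that the new rejection ratio is determined based on the reprojection error; the specific error-to-ratio mapping below, and all names and defaults, are assumptions for illustration. The third projection points could come from a reprojection like the one sketched above:

```python
import numpy as np

def update_reject_ratio(third_proj_points, non_target_points,
                        base_ratio, gain=0.01, max_ratio=0.5):
    """Compute the mean reprojection error between the third projection
    points and the matching non-target feature points (both (N, 2) arrays
    of pixel coordinates), and derive a rejection ratio for the next frame.
    """
    errors = np.linalg.norm(third_proj_points - non_target_points, axis=1)
    mean_error = float(errors.mean())
    # Larger residual error suggests remaining dynamic points, so reject more
    new_ratio = min(base_ratio + gain * mean_error, max_ratio)
    return new_ratio, mean_error
```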
In a second aspect, an embodiment of the present disclosure further provides an image processing apparatus, including:
the acquisition module is used for acquiring a first feature point in an image to be processed and a second feature point matched with the first feature point in a target reference image corresponding to the image to be processed;
the first processing module is used for projecting the first feature point to a preset plane to obtain a first projection point of the first feature point in the preset plane, and projecting the second feature point to the preset plane to obtain a second projection point of the second feature point in the preset plane;
the second processing module is used for determining target feature points belonging to the dynamic object from the first feature points based on the relative positional relationship between the first projection points and the second projection points;
and the third processing module is used for removing the target feature points from the image to be processed to obtain an image processing result.
In one possible implementation, the image processing apparatus further includes: a fourth processing module; and the fourth processing module is used for determining the target reference image for the image to be processed based on a preset screening condition.
In a possible implementation manner, the fourth processing module, when determining the target reference image for the image to be processed based on a preset screening condition, is configured to: detect whether the current key frame image meets the preset screening condition;
taking the key frame image as the target reference image under the condition that the key frame image meets the preset screening condition;
determining a first video frame image as the target reference image under the condition that the key frame image does not meet the preset screening condition;
wherein the first video frame image comprises: an image whose time stamp is earlier than that of the image to be processed and whose time stamp is closest to that of the image to be processed.
The fourth processing module is further configured to: determining the image to be processed as a new key frame image under the condition that the key frame image does not meet the preset screening condition;
and the new key frame image is used for carrying out image processing on the next frame of image to be processed.
In one possible embodiment, the preset screening conditions include at least one of: the difference between the frame numbers of the image to be processed and the key frame image is smaller than a preset frame number difference threshold;
the number of second feature points in the key frame image matched with the first feature points reaches a preset number threshold;
and the view angle difference between the first shooting view angle corresponding to the image to be processed and the second shooting view angle corresponding to the key frame image is smaller than a preset view angle difference threshold.
In a possible implementation manner, when the first feature point is projected onto a preset plane to obtain a first projection point of the first feature point in the preset plane, the first processing module is specifically configured to determine the rough pose information of the camera when the image to be processed is acquired based on three-dimensional position information of the camera in a target scene when the target reference image is acquired and first orientation information of the camera in the target scene when the image to be processed is acquired;
and projecting the first characteristic point to the preset plane based on the rough pose information to obtain a first projection point of the first characteristic point in the preset plane. .
In a possible implementation manner, when the second feature point is projected onto the preset plane to obtain a second projection point of the second feature point in the preset plane, the first processing module is specifically configured to project the second feature point onto the preset plane based on second pose information of the camera when the target reference image is acquired, to obtain a second projection point of the second feature point in the preset plane.
In a possible implementation manner, when determining a target feature point belonging to a dynamic object from the first feature points based on a relative positional relationship between the first projection point and the second projection point, the second processing module is specifically configured to determine a projection vector pointing from the second projection point to the first projection point based on a relative positional relationship between the first projection point and the second projection point;
and determining target characteristic points belonging to the dynamic object from the first characteristic points based on the projection vector.
In a possible implementation manner, when determining, based on the projection vectors, target feature points belonging to the dynamic object from the first feature points, the second processing module is specifically configured to, in the 1st iteration cycle, sort the projection vectors respectively corresponding to the first feature points according to their module lengths;
determine the target projection vectors of the 1st iteration cycle from the projection vectors based on the sorting result and a preset rejection ratio;
judge, in the nth iteration cycle, whether a preset iteration stop condition is satisfied;
and if yes, determine target feature points belonging to the dynamic object from the first feature points based on the target projection vectors determined in the (n-1)th iteration cycle, where n is an integer greater than 1;
the iteration stop condition includes at least one of:
the number of iteration cycles is greater than or equal to a preset number;
and the difference value between the module length mean value determined in the current iteration period and the module length mean value determined in the previous iteration period is smaller than a preset difference value threshold value.
In a possible implementation manner, the second processing module is further configured to, in the nth iteration cycle, if it is judged that the iteration stop condition is not satisfied, determine a module length mean value based on the target projection vectors determined in the (n-1)th iteration cycle, and determine the target projection vectors of the nth iteration cycle based on the error between the module length of each projection vector and the module length mean value, together with the preset rejection ratio;
judge whether the iteration stop condition is satisfied in the (n+1)th iteration cycle;
and if so, determine target feature points belonging to the dynamic object from the first feature points based on the target projection vectors determined in the nth iteration cycle.
In one possible implementation, the image processing apparatus further includes: a determination module; the determining module is configured to determine, based on a non-target feature point of the first feature points except the target feature point, a third feature point of the target reference image, which is matched with the non-target feature point, and second pose information of the camera when the camera acquires the target reference image, accurate pose information of the camera when the camera acquires the image to be processed;
wherein the second feature point includes the third feature point.
In one possible implementation, the image processing apparatus further includes: a fifth processing module; the fifth processing module is configured to re-project the third feature point to the image to be processed according to the accurate pose information to obtain a third projection point of the third feature point in the image to be processed;
determining a reprojection error based on the position information of the third projection point in the image to be processed and the position information of the non-target feature point in the image to be processed;
determining a new preset rejection ratio based on the reprojection error; and the new preset rejection ratio is used for carrying out image processing on the next frame of image to be processed.
In a possible implementation manner, when the third feature point is re-projected into the image to be processed according to the accurate pose information to obtain a third projection point of the third feature point in the image to be processed, the fifth processing module is specifically configured to determine, according to the accurate pose information, a conversion relationship between a first image coordinate system corresponding to the image to be processed and a second image coordinate system corresponding to the preset plane;
and projecting a second projection point of the third feature point in the preset plane to the image to be processed based on the conversion relation to obtain a third projection point of the third feature point in the image to be processed.
In a third aspect, the present disclosure further provides a computer device, including a processor and a memory, where the memory stores machine-readable instructions executable by the processor, and the processor is configured to execute the machine-readable instructions stored in the memory; when the machine-readable instructions are executed by the processor, the steps in the first aspect, or in any one of the possible implementations of the first aspect, are performed.
In a fourth aspect, the present disclosure further provides a computer-readable storage medium having a computer program stored thereon, where the computer program, when executed, performs the steps in the first aspect or any one of the possible implementations of the first aspect.
For the description of the effects of the image processing apparatus, the computer device, and the computer-readable storage medium, reference is made to the description of the image processing method, which is not repeated here.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required in the embodiments are briefly described below. The drawings, which are incorporated in and form a part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its technical solutions. It should be appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those skilled in the art may derive additional related drawings from them without inventive effort.
Fig. 1 shows a flowchart of an image processing method provided by an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating a specific method for determining a target reference image according to an embodiment of the present disclosure;
fig. 3 illustrates an exemplary diagram of a projection vector of a first projection point and a second projection point of a preset plane provided by an embodiment of the present disclosure;
fig. 4 shows a schematic diagram of an image processing apparatus provided by an embodiment of the present disclosure;
fig. 5 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of embodiments of the present disclosure, as generally described and illustrated herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure is not intended to limit the scope of the disclosure, as claimed, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
Research shows that when Simultaneous Localization and Mapping (SLAM) is used for positioning, an image acquisition device periodically captures a target scene to obtain a sampled image for each acquisition cycle, and feature points are extracted from the sampled images of adjacent processing cycles. The positions, in the different sampled images, of the feature points that can be matched between them, together with the camera position corresponding to the sampled image acquired in the earlier of the two adjacent processing cycles, are used to determine the camera position corresponding to the sampled image acquired in the later cycle, and mapping is performed based on these camera positions. In this process, a dynamic object may exist in the target scene; when SLAM performs positioning based on the feature points of objects, the feature points of the dynamic object may interfere with the positioning result, adversely affecting the accuracy and precision of positioning.
Based on the above research, the present disclosure provides an image processing method and apparatus, a computer device, and a storage medium that remove target feature points belonging to a dynamic object from the first feature points based on the inconsistency, on a preset plane, between the motion of feature points corresponding to dynamic objects and that of feature points corresponding to static objects, thereby reducing the interference of dynamic-object feature points with positioning and improving positioning accuracy and precision.
The above-mentioned drawbacks were identified by the inventors only after practical and careful study; therefore, the process of discovering the above problems, as well as the solutions that the present disclosure proposes for them, should be regarded as the inventors' contribution to the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
To facilitate understanding of the present embodiments, an image processing method disclosed in the embodiments of the present disclosure is first described in detail. The execution subject of the image processing method provided in the embodiments of the present disclosure is generally a computer device with certain computing capability, for example a terminal device, a server, or another processing device; the terminal device may be User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, or a wearable device. In some possible implementations, the image processing method may be implemented by a processor calling computer-readable instructions stored in a memory.
The following describes an image processing method provided by the embodiment of the present disclosure.
Referring to fig. 1, which shows a flowchart of an image processing method provided by an embodiment of the present disclosure, the method includes steps S101 to S104:
S101: acquiring a first feature point in an image to be processed and a second feature point matched with the first feature point in a target reference image corresponding to the image to be processed;
S102: projecting the first feature point to a preset plane to obtain a first projection point of the first feature point in the preset plane, and projecting the second feature point to the preset plane to obtain a second projection point of the second feature point in the preset plane;
S103: determining a target feature point belonging to a dynamic object from the first feature points based on the relative positional relationship between the first projection point and the second projection point;
S104: and removing the target feature points from the image to be processed to obtain an image processing result.
The first feature point in the image to be processed and the second feature point corresponding to it in the target reference image are each projected onto a preset plane, yielding a first projection point of the first feature point in the preset plane and a second projection point of the second feature point in the preset plane. Target feature points belonging to the dynamic object are then determined from the first feature points based on the relative positional relationship between the second projection points and the first projection points, and are removed from the first feature points. Because the feature points of dynamic objects and static objects move inconsistently on the preset plane, this removes the target feature points belonging to the dynamic object from the first feature points, reducing the interference of dynamic-object feature points with positioning and improving positioning accuracy and precision.
The following describes the details of S101 to S104.
For the above S101, in a specific implementation, the image to be processed is, for example, any frame of video frame image acquired when performing positioning based on SLAM, or any frame of image determined in a video acquired by scanning a target scene.
For example, when the image to be processed is acquired, a camera may be used to capture a video of the target scene, and the image to be processed may be determined from video frame images included in the video.
In particular implementations, the target scene may be determined according to a particular scene of the image processing application; illustratively, the target scenario may include: any one of a stadium, a block, and an office.
In the case of video capture of a target scene, for example, a camera installed in the target scene may be used, or a camera installed in a terminal device that can move in the target scene may be used to acquire a video of the target scene.
After the video of the target scene is obtained, each video frame image may in turn be used as the image to be processed, and a target reference image may be determined for it in the video; the target reference image determined for an image to be processed usually has an earlier time stamp than the image to be processed itself.
In addition, the process of acquiring the video of the target scene and the process of determining the image to be processed and processing the image to be processed can be synchronous or asynchronous; if the two processes are performed synchronously, the image to be processed can be determined from the acquired video frame image while the video of the target scene is acquired. If the two processes are carried out asynchronously, the video can be acquired first, and then the image to be processed is determined from the acquired video after the video is acquired.
The target reference image corresponding to the image to be processed may be obtained, for example, by the following method:
and determining the target reference image for the image to be processed based on a preset screening condition.
Referring to fig. 2, an embodiment of the present disclosure provides a specific method for determining the target reference image for the image to be processed based on a preset screening condition, including:
S201: detecting whether the current key frame image meets the preset screening condition;
S202: taking the key frame image as the target reference image under the condition that the key frame image meets the preset screening condition;
S203: determining a first video frame image as the target reference image under the condition that the key frame image does not meet the preset screening condition;
wherein the first video frame image comprises: an image whose time stamp is earlier than that of the image to be processed and whose time stamp is closest to that of the image to be processed.
In addition, when the key frame image does not satisfy the preset screening condition, the method further includes:
S204: determining the image to be processed as a new key frame image;
and the new key frame image is used for carrying out image processing on the next frame of image to be processed.
In a specific implementation, when processing each frame image in a video, for example, the 1st video frame image in the video may be determined as the current key frame image. Then, for the 2nd video frame image in the video, it is detected whether the current key frame image (the 1st video frame image) satisfies the preset screening condition; if it is determined that the condition is satisfied, the current key frame image is used as the target reference image of the 2nd video frame image, and the 2nd video frame image is processed using this target reference image.
For the 3rd video frame image in the video, it is detected whether the current key frame image (the 1st video frame image) satisfies the preset screening condition; if it is determined that the condition is satisfied, the current key frame image is used as the target reference image of the 3rd video frame image, and the 3rd video frame image is processed using this target reference image.
……
The 4th to 8th video frame images are processed in sequence; since the current key frame image (the 1st video frame image) still satisfies the preset screening condition, the 4th to 8th video frame images are processed using the current key frame image (the 1st video frame image).
For the 9th video frame image in the video, it is detected whether the current key frame image (the 1st video frame image) satisfies the preset screening condition; if the current key frame image (the 1st video frame image) does not satisfy the condition, the 8th video frame image is used as the target reference image of the 9th video frame image, and the 9th video frame image is processed using this target reference image (the 8th video frame image).
The 9th video frame image is then taken as the new key frame image.
For the 10th video frame image in the video, it is detected whether the current key frame image (the 9th video frame image) satisfies the preset screening condition; if it is determined that the condition is satisfied, the current key frame image is used as the target reference image of the 10th video frame image, and the 10th video frame image is processed using this target reference image.
The above process continues until all video frame images in the video to be processed have been processed, as sketched in the code below.
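A compact sketch of this key frame bookkeeping might look as follows; `satisfies_condition` stands in for the preset screening condition check, and all names are illustrative:

```python
def assign_reference_frames(n_frames, satisfies_condition):
    """Assign a target reference image (by index) to each video frame,
    mirroring the walkthrough above. satisfies_condition(key_idx, frame_idx)
    stands in for the preset screening condition.
    """
    key = 0                    # the 1st video frame image is the initial key frame
    refs = [None] * n_frames   # refs[i] = index of frame i's target reference image
    for i in range(1, n_frames):
        if satisfies_condition(key, i):
            refs[i] = key      # key frame image still serves as the target reference
        else:
            refs[i] = i - 1    # fall back to the closest earlier frame
            key = i            # the image to be processed becomes the new key frame
    return refs
```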
In the above process, when detecting whether the current key frame image meets the preset screening condition, whether the condition is satisfied is judged over the time-ordered image sequence from three dimensions, time, space, and tracking quality, according to the result of feature point optical flow tracking.
The preset screening conditions include, but are not limited to, at least one of the following (1), (2) and (3):
(1): and the difference value between the frame numbers of the image to be processed and the key frame image is smaller than a preset frame number difference value threshold value.
The video frame images of the target scene captured in each cycle are numbered according to the time order of the video frame images in the video acquired by the camera capturing the target scene, giving the frame number of each video frame image. Illustratively, the camera captures the target scene once every 0.03 seconds, and a frame number is set for the video frame image of each cycle starting from the first processing cycle: the frame number of the video frame image of the first processing cycle is "1", that of the second processing cycle is "2", that of the third processing cycle is "3", and so on, up to the frame number "20" for the video frame image of the twentieth processing cycle. The difference between the frame number of the video frame image of the twentieth processing cycle and that of the first processing cycle is then 19 (20 - 1 = 19).
If the difference between the frame numbers of the image to be processed and the current key frame image is smaller than the preset frame number difference threshold, the current key frame image is used as the target reference image. This ensures that the image to be processed and the target reference image contain a sufficient number of matchable first and second feature points, so that the target feature points belonging to the dynamic object can be better screened from the first feature points; after the target feature points have been screened out, the remaining first feature points can be better used for subsequent processing of the image to be processed, such as depth estimation and positioning.
(2): and the number of second feature points matched with the first feature points in the key frame image reaches a preset number threshold.
For example, after the feature points in the image to be processed and the feature points in the target reference image have been extracted, the feature points of the two images are matched, and the first feature points in the image to be processed and the second feature points in the target reference image that can be successfully matched with them are determined. A first feature point and a second feature point that match successfully represent the same key point on the same object. If the number of second feature points in the current key frame image matched with the first feature points reaches the preset number threshold, the current key frame image is used as the target reference image, so that the target feature points belonging to the dynamic object can be better screened from the first feature points.
For example, the feature points extracted from the image to be processed include A1 to A100, and the feature points extracted from the current key frame image include B1 to B200; suppose the preset number threshold is 80. If matching feature points can be determined from A1 to A100 for at least 80 of the feature points B1 to B200, the current key frame image is used as the target reference image of the image to be processed; if matching feature points cannot be determined from A1 to A100 for at least 80 of the feature points B1 to B200, the current key frame image cannot be used as the target reference image of the image to be processed.
(3): and the visual angle difference value between the first shooting visual angle corresponding to the image to be processed and the second shooting visual angle corresponding to the key frame image is smaller than a preset visual angle difference value threshold value.
Under the condition that the visual angle difference between the first shooting visual angle corresponding to the image to be processed and the second shooting visual angle corresponding to the target reference image is smaller than the preset visual angle difference threshold value, most of the same target objects can be ensured to be contained in the image to be processed and the target reference image, and the first feature points with enough quantity can be determined from the image to be processed.
For example, when determining the target reference image, the first frame image may be directly set as the first reference frame image. If the time interval between the image to be processed and the first reference frame image is greater than a preset threshold, that is, the number of images between them exceeds the threshold, the image to be processed is set as a new reference frame image. If the number of second feature points successfully matched between the first feature points of the image to be processed and the first reference frame image is smaller than the number threshold, the image to be processed is set as a new reference frame image. And if the view angle difference between the first shooting view angle corresponding to the image to be processed and the second shooting view angle corresponding to the first reference frame image is greater than or equal to the preset view angle difference threshold, the image to be processed is set as a new reference frame image.
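The three screening conditions can be summarized in a short sketch; the FrameInfo structure and all threshold values below are illustrative assumptions, not values given in the disclosure:

```python
from dataclasses import dataclass

@dataclass
class FrameInfo:
    number: int           # frame number in the video sequence
    view_angle: float     # shooting view angle, in degrees
    matched_points: int   # second feature points matched with the current frame

def meets_screening_conditions(key_frame: FrameInfo, frame: FrameInfo,
                               max_frame_gap: int = 30,
                               min_matches: int = 80,
                               max_angle_diff: float = 15.0) -> bool:
    """Check the three preset screening conditions described above:
    frame number difference, matched feature point count, view angle difference."""
    return ((frame.number - key_frame.number) < max_frame_gap
            and key_frame.matched_points >= min_matches
            and abs(frame.view_angle - key_frame.view_angle) < max_angle_diff)
```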
In the above example, the first feature points are feature points in the image to be processed, obtained by performing feature point extraction processing on the image to be processed. The first feature points include feature points of key points of static objects and/or feature points of dynamic objects in the target scene, and may be extracted from the image to be processed using, for example, a feature point extraction algorithm such as Harris corner detection or Scale-Invariant Feature Transform (SIFT).
The target reference image is an image obtained by the camera capturing the target scene in a historical processing cycle, and the second feature points are feature points in the target reference image, obtained by feature point extraction, that match the first feature points. The extraction of the second feature points is similar to that of the first feature points and is not repeated here. For example, the second feature points of the target reference image may be obtained by corner detection, and the first feature points may be obtained by tracking from the current key frame image of the image to be processed using the Lucas-Kanade (LK) optical flow tracking algorithm.
Illustratively, suppose there is one static object, an "apple", and one dynamic object, a "pedestrian", in the target scene; three key points A, B, and C exist on the "apple", and two key points D and E exist on the "pedestrian". In the image to be processed there are a first feature point A′ corresponding to key point A of the apple, a first feature point B′ corresponding to key point B, a first feature point C′ corresponding to key point C, a first feature point D′ corresponding to key point D of the pedestrian, and a first feature point E′ corresponding to key point E. In the target reference image there are a second feature point A″ corresponding to key point A of the apple, a second feature point B″ corresponding to key point B, a second feature point C″ corresponding to key point C, a second feature point D″ corresponding to key point D of the pedestrian, and a second feature point E″ corresponding to key point E. A′ and A″ both represent key point A of the apple in the target scene, B′ and B″ both represent key point B of the apple, C′ and C″ both represent key point C of the apple, D′ and D″ both represent key point D of the pedestrian, and E′ and E″ both represent key point E of the pedestrian.
In addition, if the target reference image was already subjected to image processing before the current image to be processed, that is, it was itself processed as an image to be processed in a historical processing cycle, the feature points obtained by performing feature extraction on it in that historical cycle can be stored. In the current processing cycle, feature extraction then only needs to be performed on the current image to be processed; the feature points of the target reference image are read from the pre-stored data, the feature points of the image to be processed are matched against them, and the first feature points and second feature points are determined.
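For illustration, the corner-detection-plus-optical-flow step described above might be sketched with OpenCV as follows, assuming grayscale input images; all parameter values are assumptions:

```python
import cv2
import numpy as np

def match_feature_points(reference_gray, current_gray):
    """Detect second feature points in the target reference image by corner
    detection, then track them into the image to be processed with the
    Lucas-Kanade optical flow algorithm, keeping only matched pairs.
    """
    # Corner detection on the target reference image (second feature points)
    pts2 = cv2.goodFeaturesToTrack(reference_gray, maxCorners=500,
                                   qualityLevel=0.01, minDistance=7)
    # LK optical flow tracking into the image to be processed (first feature points)
    pts1, status, _ = cv2.calcOpticalFlowPyrLK(reference_gray, current_gray,
                                               pts2, None)
    ok = status.ravel() == 1
    # Keep only successfully tracked pairs of first / second feature points
    return pts1[ok].reshape(-1, 2), pts2[ok].reshape(-1, 2)
```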
For the above S102, the preset plane may be a plane determined in advance in the scene coordinate system corresponding to the target scene. For example, if the target scene contains at least one object with a large planar surface, such as the ground, a wall, or a ceiling, the plane in which the surface of such an object lies may be used as the preset plane.
When image processing is performed on a given frame to be processed, usually only one preset plane is determined, but different images to be processed may correspond to different preset planes. For example, in one area of the target scene the ceiling may be relatively flat while a large number of objects are placed on the ground; in this case, the plane of the ceiling surface may be used as the preset plane. In another area of the target scene, there may be many decorations on the ceiling while the ground is open; in this case, the plane of the ground surface may be used as the preset plane. The choice can be made according to the actual situation of the target scene.
After the preset plane is determined in the target scene, the specific pose of the preset plane in the scene coordinate system corresponding to the target scene is determined, that is, the transformation relationship between the scene coordinate system and the preset plane can be determined. Once the rough pose information of the camera in the target scene when acquiring the image to be processed and the second pose information of the camera in the target scene when acquiring the target reference image have been determined, the first feature point can be projected onto the preset plane to obtain its first projection point in the preset plane, and the second feature point can be projected onto the preset plane to obtain its second projection point in the preset plane.
Specifically, the rough pose information determined for the image to be processed when projecting the first feature point onto the preset plane is an estimated pose. When determining the rough pose information, it is assumed that only the orientation of the camera changed when capturing the image to be processed, while its position in the target scene did not change relative to when the target reference image was captured. The rough pose information of the camera when acquiring the image to be processed may therefore be determined based on the three-dimensional position information of the camera in the target scene when acquiring the target reference image and the first orientation information of the camera in the target scene when acquiring the image to be processed. The first feature point is then projected onto the preset plane based on the rough pose information, yielding a first projection point of the first feature point in the preset plane.
Here, the first orientation information of the camera in the target scene when acquiring the image to be processed may be obtained, for example, from an orientation sensor connected to the camera. For instance, when the image to be processed is acquired by the camera of a mobile phone, an orientation sensor is deployed in the phone; the orientation information determined by the sensor at the moment the image to be processed is acquired can be read as the first orientation information by calling the interface exposed by the sensor.
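As an illustrative aside (not part of the disclosed embodiments), the rough-pose assumption described above can be sketched in a few lines of Python; the function name, the 4×4 matrix convention and the argument shapes are assumptions made purely for illustration.

```python
import numpy as np

# Hedged sketch: the camera position is borrowed from the target reference
# image, and only the rotation is refreshed from the orientation sensor,
# matching the assumption that the camera changed orientation but not position.
def coarse_pose(t_reference: np.ndarray, R_orientation: np.ndarray) -> np.ndarray:
    """Assemble a 4x4 camera-to-scene pose from the reference image's 3-D
    position (t_reference, shape (3,)) and the orientation read when the
    image to be processed was acquired (R_orientation, shape (3, 3))."""
    T = np.eye(4)
    T[:3, :3] = R_orientation  # rotation from the orientation sensor
    T[:3, 3] = t_reference     # position assumed unchanged since the reference image
    return T
```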
In addition, since the pose information of other images whose time stamps are earlier than that of the image to be processed has already been determined in historical processing periods, when such an image is taken as the target reference image of the image to be processed, the second pose information of the camera when acquiring the target reference image is known. The second feature point may then be projected onto the preset plane based on this second pose information, yielding a second projection point of the second feature point in the preset plane. Here, the second pose information is the pose information obtained by performing accurate pose estimation on the target reference image; for example, if the current image to be processed is later taken as the target reference image in a subsequent processing period, its second pose information is the accurate pose information of the current image to be processed. The determination of the accurate pose information is described in the embodiments below and is not repeated here.
After the second pose information of the camera when acquiring the target reference image is determined, the second feature point can be projected onto the preset plane based on the second pose information to obtain a second projection point of the second feature point in the preset plane.
For example, the embodiment of the present disclosure takes projecting the first feature point onto the preset plane based on the rough pose information as an example: relative pose information between the camera and the preset plane is determined based on the pose information of the preset plane in the scene coordinate system and the rough pose information; then, based on the relative pose information and the imaging model of the camera, the conversion relationship between the preset plane and the image coordinate system of the camera when the image to be processed was acquired is determined, and the first feature point is projected into the preset plane according to this conversion relationship.
The specific process of projecting the second feature point to the preset plane is similar to the specific process of projecting the first feature point to the preset plane, and is not described herein again.
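The disclosure does not fix a concrete formulation for this conversion relationship. One standard way to realize such a projection, shown as a hedged sketch below, is to back-project each feature point as a ray through the camera center and intersect the ray with the preset plane; the pinhole model, the camera-to-scene pose convention and all names are assumptions for illustration.

```python
import numpy as np

def project_to_plane(pixel, K, R_cw, t_cw, n, d):
    """Back-project an image feature point onto the preset plane {x : n.x = d}.

    pixel      : (u, v) coordinates of the feature point
    K          : 3x3 pinhole intrinsic matrix of the camera
    R_cw, t_cw : camera-to-scene rotation (3x3) and translation (3,)
    n, d       : plane normal (3,) and offset in the scene coordinate system
    """
    # Ray direction through the pixel, expressed in scene coordinates.
    ray = R_cw @ np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    # Ray-plane intersection: t_cw + s * ray lies on the plane n.x = d.
    s = (d - n @ t_cw) / (n @ ray)
    return t_cw + s * ray  # the projection point in the preset plane
```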
For the above S103, when the target feature point belonging to the dynamic object is determined from the first feature point, for example, the following manner may be adopted:
determining a projection vector pointing from the second projection point to the first projection point based on a relative positional relationship between the first projection point and the second projection point; and determining target characteristic points belonging to the dynamic object from the first characteristic points based on the projection vectors.
The relative positional relationship arises from the change in the camera's orientation between acquiring the target reference image and acquiring the image to be processed. For static feature points, when the camera orientation changes, the displacements of the projection points of different static feature points on the preset plane tend to be consistent; for dynamic feature points, except in special cases, the displacement of a dynamic feature point's projection point on the preset plane is inconsistent with that of the static feature points' projection points. The target feature points can therefore be determined from the first feature points by exploiting this motion inconsistency of different feature points in the preset plane.
Illustratively, the displacement of a projection point corresponding to a static object is the reverse of the camera's displacement, whereas the displacement of a projection point corresponding to a dynamic object is the reverse of the camera's displacement superposed with the dynamic object's own displacement during the change in camera orientation.
Fig. 3 is a diagram illustrating exemplary projection vectors of first and second projection points in the preset plane according to an embodiment of the present disclosure, where the first projection point A has a matched second projection point A′, the first projection point B has a matched second projection point B′, the first projection point C has a matched second projection point C′, the first projection point D has a matched second projection point D′, and the first projection point E has a matched second projection point E′; each projection vector is the vector pointing from a second projection point to its matched first projection point, and has both a magnitude and a direction.
The first projection points A, B, C and D are projection points corresponding to the static object, while the first projection point E is a projection point of the dynamic object. As can be seen from Fig. 3, the projection vectors from A′ to A, from B′ to B, from C′ to C and from D′ to D all tend to be consistent in magnitude and direction, whereas the projection vector from the second projection point E′ to the first projection point E differs from the others in both magnitude and direction.
When determining the target feature point belonging to the dynamic object from the first feature points based on the projection vector, for example, the following manner may be adopted:
in the 1st iteration cycle, sorting the projection vectors respectively corresponding to the plurality of first feature points according to their modular lengths;
determining the target projection vectors of the 1st iteration cycle from the projection vectors based on the sorting result and a preset rejection ratio;
in the nth iteration cycle, judging whether a preset iteration stop condition is met;
if yes, determining the target feature points belonging to the dynamic object from the first feature points based on the target projection vectors determined in the (n-1)th iteration cycle, n being an integer greater than 1;
wherein the iteration stop condition comprises at least one of:
the number of iteration cycles is greater than or equal to a preset number;
and the difference between the modular length mean value determined in the current iteration cycle and the modular length mean value determined in the previous iteration cycle is smaller than a preset difference threshold.
In addition, in the nth iteration cycle, if it is judged that the iteration stop condition is not met, a modular length mean value is determined based on the target projection vectors determined in the (n-1)th iteration cycle; the target projection vectors of the nth iteration cycle are then determined based on the error between the modular length of each projection vector and the modular length mean value, together with the preset rejection ratio;
judging whether the iteration stop condition is met in the (n+1)th iteration cycle;
and if so, determining the target feature points belonging to the dynamic object from the first feature points based on the target projection vectors determined in the nth iteration cycle.
In a specific implementation, a projection vector belonging to a static object is produced solely by the motion of the camera, so such vectors share the same direction and modular length on the preset plane; a projection vector belonging to a dynamic object superposes the motion of the camera and the motion of the dynamic object, so its direction and modular length on the preset plane differ markedly from those of the static projection vectors.
In the embodiment of the present disclosure, in the 1st iteration cycle, when determining the target projection vectors of the 1st iteration cycle from the projection vectors based on the sorting result and the preset rejection ratio, the projection vectors may, for example, be sorted in descending order of modular length. Then, according to the preset rejection ratio, the projection vectors with the largest modular lengths are removed in that order, and the remaining projection vectors are the target projection vectors determined in the 1st iteration cycle.
In the nth iteration cycle, if it is judged that the iteration stop condition is met, the target feature points belonging to the dynamic object may be determined from the first feature points based on the target projection vectors determined in the (n-1)th iteration cycle as follows: the first feature points corresponding to the target projection vectors are identified among the plurality of first feature points in the image to be processed and removed from that set; the remaining first feature points are the target feature points.
In the nth iteration cycle, if it is determined that the iteration stop condition is not satisfied, when determining the target projection vector of the nth iteration cycle based on the error between the modular length of each projection vector and the modular length mean value and the preset rejection ratio, for example, the following manner may be adopted:
and sequencing the projection vectors based on the sequence of the error between the modular length of each projection vector and the mean value of the modular length from large to small.
Then, according to a preset elimination proportion, according to the sequence of the errors from large to small, a plurality of projection vectors with large errors are eliminated from the plurality of projection vectors, and the rest other projection vectors are the target projection vectors determined in the nth iteration cycle.
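As a hedged illustration of the iterative culling described above, the following Python sketch implements a close, simplified variant: cycle 1 culls by raw modular length, and each later cycle culls by the error against the previous cycle's mean until the stop conditions are met. The shapes, default values and names are assumptions; the disclosure itself prescribes only the logic.

```python
import numpy as np

def cull_dynamic_vectors(vectors, ratio=0.2, max_cycles=10, diff_threshold=1e-3):
    """vectors: (N, 2) projection vectors in the preset plane.
    Returns a boolean mask: True marks target projection vectors (static
    candidates); first feature points whose mask is False are the target
    feature points belonging to the dynamic object."""
    lengths = np.linalg.norm(vectors, axis=1)        # modular length of each vector
    n_keep = len(vectors) - int(len(vectors) * ratio)
    # Cycle 1: cull the vectors with the largest modular lengths.
    keep = np.argsort(lengths)[:n_keep]
    prev_mean = lengths[keep].mean()
    for _ in range(max_cycles - 1):                  # stop condition: cycle count
        # Cull the vectors whose modular length deviates most from the
        # mean determined in the previous cycle.
        errors = np.abs(lengths - prev_mean)
        keep = np.argsort(errors)[:n_keep]
        mean = lengths[keep].mean()
        if abs(mean - prev_mean) < diff_threshold:   # stop condition: mean converged
            break
        prev_mean = mean
    mask = np.zeros(len(vectors), dtype=bool)
    mask[keep] = True
    return mask
```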
In addition, the preset rejection ratio may be predetermined, or may be determined during the image processing of other images to be processed in the previous processing period. For the latter case, reference may be made to the embodiments below, which are not repeated here.
For example, there are 100 first feature points, and the total number of formed projection vectors is 100; the preset rejection ratio is 20%. The following iteration cycle is performed:
a 1: in the 1 st iteration cycle: sequencing the 100 projection vectors according to the modular length of the 100 projection vectors;
according to the preset rejection ratio, the 20 projection vectors with the largest modular lengths are removed from the 100 projection vectors, and the remaining 80 projection vectors are the target projection vectors determined in the 1st iteration cycle.
a 2: in the 2 nd iteration cycle: and judging that the preset iteration stop condition is not met.
Based on the 80 target projection vectors determined for the 1 st iteration cycle, a mean of the mode lengths is determined, and the error between the mode lengths of the 100 projection vectors and the mean of the mode lengths is determined.
And sorting the 100 projection vectors based on the sequence of the errors between the modular lengths of the 100 projection vectors and the mean of the modular lengths from large to small.
Then, according to the sequence of the errors from large to small, 20 projection vectors with large errors are removed from 100 projection vectors, and the remaining 80 projection vectors are the target projection vectors determined in the 2 nd iteration cycle.
a 3: in the 3 rd iteration cycle: and judging that the preset iteration stop condition is not met.
Based on the 80 target projection vectors determined for the 2 nd iteration cycle, a mean of the mode lengths is determined, and the error between the mode lengths of the 100 projection vectors and the mean of the mode lengths is determined.
And sorting the 100 projection vectors based on the sequence of the errors between the modular lengths of the 100 projection vectors and the mean of the modular lengths from large to small.
Then, according to the sequence of the errors from large to small, 20 projection vectors with large errors are removed from 100 projection vectors, and the remaining 80 projection vectors are the target projection vectors determined in the 3 rd iteration cycle.
……
ai: in the ith iteration cycle: and judging that a preset iteration stop condition is met.
Based on the 80 target projection vectors determined in the (i-1)th iteration cycle, the 80 first feature points corresponding to these target projection vectors are identified among the 100 first feature points in the image to be processed and deleted; the remaining 20 first feature points are the target feature points.
Through these repeated iteration cycles, the modular length mean value converges toward a value close to the modular lengths of the majority of the projection vectors, namely those of the static feature points, so the target feature points belonging to the dynamic object can be screened out from the first feature points more accurately.
In an image processing method provided in another embodiment of the present disclosure, the method further includes: determining accurate pose information of the camera when the camera acquires the image to be processed based on non-target feature points except the target feature points in the first feature points, third feature points matched with the non-target feature points in the target reference image and second pose information of the camera when the camera acquires the target reference image;
wherein the second feature point includes the third feature point.
In this way, the pose of the camera when acquiring the image to be processed is optimized, so that the obtained accurate pose information has higher accuracy and the accuracy of camera positioning is improved.
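The disclosure does not name a solver for this accurate pose estimation. As one hedged possibility, if the third feature points are available as three-dimensional scene points (for example via their second projection points in the preset plane), the refined pose can be obtained by a standard perspective-n-point solve over their correspondences with the non-target feature points; the OpenCV-based sketch below is an assumption for illustration, not the patented method.

```python
import cv2
import numpy as np

def accurate_pose(scene_points, image_points, K):
    """scene_points: (N, 3) 3-D points of the third feature points in the
    scene coordinate system; image_points: (N, 2) matching non-target
    feature-point pixels in the image to be processed; K: 3x3 intrinsics.
    Returns the scene-to-camera rotation and translation of the refined pose."""
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(scene_points, dtype=np.float64),
        np.asarray(image_points, dtype=np.float64),
        np.asarray(K, dtype=np.float64),
        distCoeffs=None,
    )
    if not ok:
        raise RuntimeError("PnP solve failed")
    R_wc, _ = cv2.Rodrigues(rvec)  # rotation vector -> rotation matrix
    return R_wc, tvec.ravel()
```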
In another embodiment of the present disclosure, special cases of dynamic-object motion exist: for example, when a dynamic object moves directly away from or toward the camera, the projection vectors determined for the dynamic object and those determined for static objects may still tend to coincide in magnitude and direction. To reduce the influence of dynamic objects on the positioning accuracy of images to be processed in subsequent processing periods, and to reduce the accumulation of errors across multiple processing periods, the embodiment of the present disclosure may further compute a reprojection error on the point-rejection result of the current image to be processed and use it to determine the preset rejection ratio for the next frame of image to be processed, so that the target feature points corresponding to dynamic objects can be screened out more thoroughly in subsequent image processing.
Specifically, after the accurate pose information of the image to be processed is obtained, the method further includes: re-projecting the third feature point into the image to be processed according to the accurate pose information, to obtain a third projection point of the third feature point in the image to be processed;
determining a reprojection error based on the position information of the third projection point in the image to be processed and the position information of the non-target feature point in the image to be processed;
determining a new preset rejection ratio based on the reprojection error; and the new preset rejection ratio is used for carrying out image processing on the next frame of image to be processed.
For example, if the reprojection error is smaller than a preset error threshold, indicating a low probability that a dynamic object is present in the current image to be processed, the preset rejection ratio for the next image to be processed may be reduced accordingly or kept unchanged. If the reprojection error is greater than or equal to the preset error threshold, indicating a high probability that a dynamic object is present in the current image to be processed, the preset rejection ratio for the next image to be processed may be increased accordingly, so that feature points belonging to dynamic objects are rejected more thoroughly when the next image to be processed is processed.
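A minimal sketch of one such adjustment policy is given below; the disclosure prescribes only the direction of the adjustment, so the threshold, step size and bounds used here are illustrative assumptions.

```python
def update_rejection_ratio(reproj_error, ratio, error_threshold=2.0,
                           step=0.05, min_ratio=0.05, max_ratio=0.5):
    """Adapt the preset rejection ratio for the next frame from the
    reprojection error of the current frame (all constants illustrative)."""
    if reproj_error >= error_threshold:
        # Dynamic object likely present: reject more feature points next frame.
        return min(max_ratio, ratio + step)
    # Dynamic object unlikely: relax (or simply keep) the rejection ratio.
    return max(min_ratio, ratio - step)
```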
In a specific embodiment, when the third feature point is re-projected into the image to be processed according to the accurate pose information to obtain a third projection point of the third feature point in the image to be processed, the conversion relationship between a first image coordinate system corresponding to the image to be processed and a second image coordinate system corresponding to the preset plane may, for example, be determined according to the accurate pose information;
and projecting a second projection point of the third feature point in the preset plane to the image to be processed based on the conversion relation to obtain a third projection point of the third feature point in the image to be processed.
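Combining the two steps above, the reprojection error can be computed as sketched below: the second projection points of the third feature points are projected from the preset plane into the image using the accurate pose, and the mean pixel deviation from the non-target feature points is taken as the error. The pinhole projection and all names are assumptions for illustration.

```python
import numpy as np

def reprojection_error(plane_points, feature_pixels, K, R_wc, t_wc):
    """plane_points  : (N, 3) second projection points of the third feature
                       points on the preset plane, in scene coordinates
    feature_pixels : (N, 2) matching non-target feature points in the image
    K              : 3x3 intrinsics; R_wc, t_wc: scene-to-camera accurate pose
    """
    plane_points = np.asarray(plane_points, dtype=float)
    feature_pixels = np.asarray(feature_pixels, dtype=float)
    cam = (R_wc @ plane_points.T).T + t_wc  # scene -> camera frame
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]             # perspective division to pixels
    # Mean pixel distance between third projection points and feature points.
    return float(np.mean(np.linalg.norm(uv - feature_pixels, axis=1)))
```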
It will be understood by those skilled in the art that, in the above method of the present disclosure, the order in which the steps are written does not imply a strict execution order or impose any limitation on the implementation; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, an image processing apparatus corresponding to the image processing method is also provided in the embodiments of the present disclosure, and since the principle of the apparatus in the embodiments of the present disclosure for solving the problem is similar to the image processing method described above in the embodiments of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not described again.
Referring to Fig. 4, a schematic diagram of an image processing apparatus provided by an embodiment of the present disclosure is shown. The apparatus includes: an acquisition module 401, a first processing module 402, a second processing module 403 and a third processing module 404; wherein:
an obtaining module 401, configured to obtain a first feature point in an image to be processed and a second feature point, which is matched with the first feature point, in a target reference image corresponding to the image to be processed;
a first processing module 402, configured to project the first feature point to a preset plane to obtain a first projection point of the first feature point in the preset plane, and project the second feature point to the preset plane to obtain a second projection point of the second feature point in the preset plane;
a second processing module 403, configured to determine, based on a relative position relationship between the first projection point and the second projection point, a target feature point belonging to a dynamic object from the first feature point;
and a third processing module 404, configured to remove the target feature point from the image to be processed, so as to obtain an image processing result.
In one possible implementation, the image processing apparatus further includes: a fourth processing module 405; the fourth processing module 405 is configured to determine the target reference image for the image to be processed based on a preset screening condition.
In a possible implementation manner, when determining the target reference image for the image to be processed based on a preset screening condition, the fourth processing module 405 is configured to: detect whether the current key frame image meets the preset screening condition;
taking the key frame image as the target reference image under the condition that the key frame image meets the preset screening condition;
determining a first video frame image as the target reference image under the condition that the key frame image does not meet the preset screening condition;
wherein the first video frame image comprises: an image whose time stamp is earlier than that of the image to be processed and whose time-stamp interval from the image to be processed is the shortest.
The fourth processing module 405 is further configured to: determining the image to be processed as a new key frame image under the condition that the key frame image does not meet the preset screening condition;
and the new key frame image is used for carrying out image processing on the next frame of image to be processed.
In one possible embodiment, the preset screening conditions include at least one of: the difference value between the frame numbers of the image to be processed and the key frame image is smaller than a preset frame number difference value threshold value;
the number of second feature points matched with the first feature points in the key frame image reaches a preset number threshold;
and the visual angle difference value between the first shooting visual angle corresponding to the image to be processed and the second shooting visual angle corresponding to the key frame image is smaller than a preset visual angle difference value threshold value.
In a possible implementation manner, when the first feature point is projected onto a preset plane to obtain a first projection point of the first feature point in the preset plane, the first processing module 402 is specifically configured to determine the rough pose information of the camera when acquiring the image to be processed based on three-dimensional position information of the camera in a target scene when acquiring the target reference image and first orientation information of the camera in the target scene when acquiring the image to be processed;
and projecting the first feature point onto the preset plane based on the rough pose information to obtain a first projection point of the first feature point in the preset plane.
In a possible implementation manner, when the second feature point is projected onto the preset plane to obtain a second projection point of the second feature point in the preset plane, the first processing module 402 is specifically configured to project the second feature point onto the preset plane based on second pose information of the camera when the target reference image is acquired to obtain a second projection point of the second feature point in the preset plane.
In a possible implementation manner, when determining a target feature point belonging to a dynamic object from the first feature points based on a relative positional relationship between the first projection point and the second projection point, the second processing module 403 is specifically configured to determine a projection vector pointing from the second projection point to the first projection point based on a relative positional relationship between the first projection point and the second projection point;
and determining target characteristic points belonging to the dynamic object from the first characteristic points based on the projection vector.
In a possible implementation manner, when determining the target feature points belonging to the dynamic object from the first feature points based on the projection vectors, the second processing module 403 is specifically configured to, in the 1st iteration cycle, sort the projection vectors respectively corresponding to the plurality of first feature points according to their modular lengths;
determining the target projection vectors of the 1st iteration cycle from the projection vectors based on the sorting result and a preset rejection ratio;
in the nth iteration cycle, judging whether a preset iteration stop condition is met;
if yes, determining the target feature points belonging to the dynamic object from the first feature points based on the target projection vectors determined in the (n-1)th iteration cycle, n being an integer greater than 1;
the iteration stop condition includes at least one of:
the number of iteration cycles is greater than or equal to a preset number;
and the difference between the modular length mean value determined in the current iteration cycle and the modular length mean value determined in the previous iteration cycle is smaller than a preset difference threshold.
In a possible implementation manner, the second processing module 403 is further configured to: in the nth iteration cycle, if it is judged that the iteration stop condition is not met, determine a modular length mean value based on the target projection vectors determined in the (n-1)th iteration cycle, and determine the target projection vectors of the nth iteration cycle based on the error between the modular length of each projection vector and the modular length mean value, together with the preset rejection ratio;
judge whether the iteration stop condition is met in the (n+1)th iteration cycle;
and if so, determine the target feature points belonging to the dynamic object from the first feature points based on the target projection vectors determined in the nth iteration cycle.
In one possible implementation, the image processing apparatus further includes: a determination module 406; the determining module 406 is configured to determine, based on a non-target feature point of the first feature points except the target feature point, a third feature point of the target reference image, which is matched with the non-target feature point, and second pose information of the camera when the target reference image is acquired, accurate pose information of the camera when the image to be processed is acquired;
wherein the second feature point includes the third feature point.
In one possible implementation, the image processing apparatus further includes: a fifth processing module 407; the fifth processing module 407 is configured to re-project the third feature point to the image to be processed according to the accurate pose information, so as to obtain a third projection point of the third feature point in the image to be processed;
determining a reprojection error based on the position information of the third projection point in the image to be processed and the position information of the non-target feature point in the image to be processed;
determining a new preset rejection ratio based on the reprojection error; and the new preset rejection ratio is used for carrying out image processing on the next frame of image to be processed.
In a possible implementation manner, when the third feature point is re-projected into the image to be processed according to the accurate pose information to obtain a third projection point of the third feature point in the image to be processed, the fifth processing module 407 is specifically configured to determine, according to the accurate pose information, a conversion relationship between a first image coordinate system corresponding to the image to be processed and a second image coordinate system corresponding to the preset plane;
and projecting a second projection point of the third feature point in the preset plane to the image to be processed based on the conversion relation to obtain a third projection point of the third feature point in the image to be processed.
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
An embodiment of the present disclosure further provides a computer device, as shown in fig. 5, which is a schematic structural diagram of the computer device provided in the embodiment of the present disclosure, and includes:
a processor 51 and a memory 52; the memory 52 stores machine-readable instructions executable by the processor 51, and the processor 51 is configured to execute the machine-readable instructions stored in the memory 52; when the machine-readable instructions are executed by the processor 51, the processor 51 performs the following steps:
acquiring a first feature point in an image to be processed and a second feature point matched with the first feature point in a target reference image corresponding to the image to be processed;
projecting the first characteristic point to a preset plane to obtain a first projection point of the first characteristic point in the preset plane, and projecting the second characteristic point to the preset plane to obtain a second projection point of the second characteristic point in the preset plane;
determining a target characteristic point belonging to a dynamic object from the first characteristic points based on the relative position relationship between the first projection point and the second projection point;
and removing the target characteristic points from the image to be processed to obtain an image processing result.
The storage 52 includes a memory 521 and an external storage 522; the memory 521, also referred to as an internal memory, temporarily stores operation data of the processor 51 and data exchanged with the external storage 522 such as a hard disk, and the processor 51 exchanges data with the external storage 522 through the memory 521.
For the specific execution process of the instruction, reference may be made to the steps of the image processing method described in the embodiments of the present disclosure, and details are not described here.
The embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the image processing method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure also provide a computer program product, where the computer program product carries a program code, and instructions included in the program code may be used to execute the steps of the image processing method in the foregoing method embodiments, which may be referred to specifically in the foregoing method embodiments, and are not described herein again.
The computer program product may be implemented by hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium; in another alternative embodiment, the computer program product is embodied in a software product, such as a software development kit (SDK).
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above-mentioned embodiments are merely specific embodiments of the present disclosure, used to illustrate rather than limit its technical solutions, and the scope of protection of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art can still modify the technical solutions described in the foregoing embodiments, or readily conceive of changes, or substitute equivalents for some of the technical features, within the technical scope of the present disclosure; such modifications, changes or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present disclosure, and shall all be covered by the scope of protection of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (16)

1. An image processing method, comprising:
acquiring a first feature point in an image to be processed and a second feature point matched with the first feature point in a target reference image corresponding to the image to be processed;
projecting the first characteristic point to a preset plane to obtain a first projection point of the first characteristic point in the preset plane, and projecting the second characteristic point to the preset plane to obtain a second projection point of the second characteristic point in the preset plane;
determining a target characteristic point belonging to a dynamic object from the first characteristic points based on the relative position relationship between the first projection point and the second projection point;
and removing the target characteristic points from the image to be processed to obtain an image processing result.
2. The image processing method according to claim 1, wherein before the obtaining of the first feature point in the image to be processed and the second feature point matching with the first feature point in the target reference image corresponding to the image to be processed, the method further comprises:
and determining the target reference image for the image to be processed based on a preset screening condition.
3. The image processing method according to claim 2, wherein the determining the target reference image for the image to be processed based on a preset filtering condition comprises:
detecting whether the current key frame image meets the preset screening condition or not;
taking the key frame image as the target reference image under the condition that the key frame image meets the preset screening condition;
determining a first video frame image as the target reference image under the condition that the key frame image does not meet the preset screening condition;
wherein the first video frame image comprises: an image whose time stamp is earlier than that of the image to be processed and whose time-stamp interval from the image to be processed is the shortest.
4. The image processing method according to claim 3, further comprising: determining the image to be processed as a new key frame image under the condition that the key frame image does not meet the preset screening condition;
and the new key frame image is used for carrying out image processing on the next frame of image to be processed.
5. The image processing method according to any one of claims 2 to 4, wherein the preset screening condition includes at least one of:
the difference value between the frame numbers of the image to be processed and the key frame image is smaller than a preset frame number difference value threshold value;
the number of second feature points matched with the first feature points in the key frame image reaches a preset number threshold;
and the visual angle difference value between the first shooting visual angle corresponding to the image to be processed and the second shooting visual angle corresponding to the key frame image is smaller than a preset visual angle difference value threshold value.
6. The image processing method according to any one of claims 1 to 5, wherein the projecting the first feature point to a preset plane to obtain a first projection point of the first feature point in the preset plane comprises:
determining rough pose information of a camera when the to-be-processed image is acquired based on three-dimensional position information of the camera in a target scene when the target reference image is acquired and first orientation information of the camera in the target scene when the to-be-processed image is acquired;
and projecting the first characteristic point to the preset plane based on the rough pose information to obtain a first projection point of the first characteristic point in the preset plane.
7. The image processing method according to any one of claims 1 to 6, wherein the projecting the second feature point to the preset plane to obtain a second projection point of the second feature point in the preset plane includes:
and projecting the second feature point to the preset plane based on second pose information of the camera when the target reference image is acquired, so as to obtain a second projection point of the second feature point in the preset plane.
8. The image processing method according to any one of claims 1 to 6, wherein the determining a target feature point belonging to a dynamic object from the first feature points based on a relative positional relationship between the first projection point and the second projection point includes:
determining a projection vector pointing from the second projection point to the first projection point based on a relative positional relationship between the first projection point and the second projection point;
and determining target characteristic points belonging to the dynamic object from the first characteristic points based on the projection vector.
9. The image processing method according to claim 8, wherein the determining, from the first feature points, target feature points belonging to the dynamic object based on the projection vector comprises:
in the 1st iteration cycle, sorting the projection vectors respectively corresponding to the plurality of first feature points according to their modular lengths;
determining the target projection vectors of the 1st iteration cycle from the projection vectors based on the sorting result and a preset rejection ratio;
in the nth iteration cycle, judging whether a preset iteration stop condition is met;
if yes, determining the target feature points belonging to the dynamic object from the first feature points based on the target projection vectors determined in the (n-1)th iteration cycle, n being an integer greater than 1;
the iteration stop condition includes at least one of:
the number of iteration cycles is greater than or equal to a preset number;
and the difference between the modular length mean value determined in the current iteration cycle and the modular length mean value determined in the previous iteration cycle is smaller than a preset difference threshold.
10. The image processing method according to claim 9, further comprising:
in the nth iteration cycle, if it is judged that the iteration stop condition is not met, determining a modular length mean value based on the target projection vectors determined in the (n-1)th iteration cycle; and determining the target projection vectors of the nth iteration cycle based on the error between the modular length of each projection vector and the modular length mean value, together with the preset rejection ratio;
judging whether the iteration stop condition is met in the (n+1)th iteration cycle;
and if so, determining target characteristic points belonging to the dynamic object from the first characteristic points based on the target projection vector determined in the nth iteration cycle.
11. The image processing method according to any one of claims 1 to 10, further comprising:
determining accurate pose information of the camera when the camera acquires the image to be processed based on non-target feature points except the target feature points in the first feature points, third feature points matched with the non-target feature points in the target reference image and second pose information of the camera when the camera acquires the target reference image;
wherein the second feature point includes the third feature point.
12. The image processing method according to claim 11, further comprising:
according to the accurate pose information, the third feature point is re-projected into the image to be processed, and a third projection point of the third feature point in the image to be processed is obtained;
determining a reprojection error based on the position information of the third projection point in the image to be processed and the position information of the non-target feature point in the image to be processed;
determining a new preset rejection ratio based on the reprojection error; and the new preset rejection ratio is used for carrying out image processing on the next frame of image to be processed.
13. The image processing method according to claim 12, wherein the re-projecting the third feature point into the image to be processed according to the accurate pose information to obtain a third projection point of the third feature point in the image to be processed, includes:
determining a conversion relation between a first image coordinate system corresponding to the image to be processed and a second image coordinate system corresponding to the preset plane according to the accurate pose information;
and projecting a second projection point of the third feature point in the preset plane to the image to be processed based on the conversion relation to obtain a third projection point of the third feature point in the image to be processed.
14. An image processing apparatus characterized by comprising:
the acquisition module is used for acquiring a first feature point in an image to be processed and a second feature point matched with the first feature point in a target reference image corresponding to the image to be processed;
the first processing module is used for projecting the first characteristic point to a preset plane to obtain a first projection point of the first characteristic point in the preset plane, and projecting the second characteristic point to the preset plane to obtain a second projection point of the second characteristic point in the preset plane;
the second processing module is used for determining target characteristic points belonging to the dynamic object from the first characteristic points based on the relative position relation between the first projection points and the second projection points;
and the third processing module is used for removing the target characteristic points from the image to be processed to obtain an image processing result.
15. A computer device, comprising: a processor and a memory storing machine-readable instructions executable by the processor; the processor is configured to execute the machine-readable instructions stored in the memory, and when the machine-readable instructions are executed by the processor, the processor performs the image processing method of any one of claims 1 to 13.
16. A computer-readable storage medium, characterized in that a computer program is stored thereon, and when the computer program is executed by a computer device, the computer device performs the image processing method according to any one of claims 1 to 13.
CN202110598079.2A 2021-05-31 2021-05-31 Image processing method and device, computer equipment and storage medium Active CN113313112B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110598079.2A CN113313112B (en) 2021-05-31 2021-05-31 Image processing method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113313112A true CN113313112A (en) 2021-08-27
CN113313112B CN113313112B (en) 2023-02-07

Family

ID=77376229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110598079.2A Active CN113313112B (en) 2021-05-31 2021-05-31 Image processing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113313112B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110119189A (en) * 2018-02-05 2019-08-13 浙江商汤科技开发有限公司 The initialization of SLAM system, AR control method, device and system
WO2020119140A1 (en) * 2018-12-13 2020-06-18 歌尔股份有限公司 Method, apparatus and smart device for extracting keyframe in simultaneous localization and mapping
CN110673607A (en) * 2019-09-25 2020-01-10 优地网络有限公司 Feature point extraction method and device in dynamic scene and terminal equipment
CN111950370A (en) * 2020-07-10 2020-11-17 重庆邮电大学 Dynamic environment offline visual milemeter expansion method
CN112734839A (en) * 2020-12-31 2021-04-30 浙江大学 Monocular vision SLAM initialization method for improving robustness

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YU H, ET AL.: "Learning bipartite graph matching for robust visual localization", 《2020 IEEE INTERNATIONAL SYMPOSIUM ON MIXED AND AUGMENTED REALITY》 *
张峻宁 等: "一种自适应特征地图匹配的改进VSLAM算法", 《自动化学报》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115115822A (en) * 2022-06-30 2022-09-27 小米汽车科技有限公司 Vehicle-end image processing method and device, vehicle, storage medium and chip
CN115115822B (en) * 2022-06-30 2023-10-31 小米汽车科技有限公司 Vehicle-end image processing method and device, vehicle, storage medium and chip

Also Published As

Publication number Publication date
CN113313112B (en) 2023-02-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant