WO2019007258A1 - Method, apparatus, device, and storage medium for determining camera pose information
- Publication number: WO2019007258A1 (Application PCT/CN2018/093418)
- Authority: WO (WIPO PCT)
- Prior art keywords: image, feature point, matrix, homography matrix, optical flow
Classifications
- G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
- G06F18/22: Matching criteria, e.g. proximity measures
- G06T3/02: Affine transformations
- G06T3/04: Context-preserving transformations, e.g. by using an importance map
- G06T5/20: Image enhancement or restoration using local operators
- G06T7/00: Image analysis
- G06T7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/248: Analysis of motion using feature-based methods involving reference images or patches
- G06T7/269: Analysis of motion using gradient-based methods
- G06T7/70: Determining position or orientation of objects or cameras
- G06T7/74: Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
- G06V10/10: Image acquisition
- G06V10/17: Image acquisition using hand-held instruments
- G06V10/225: Image preprocessing by selection of a specific region, based on a marking or identifier characterising the area
- G06V10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region
- G06V10/40: Extraction of image or video features
- G06V10/443: Local feature extraction by analysis of parts of the pattern, by matching or filtering
- G06V10/462: Salient features, e.g. scale-invariant feature transforms [SIFT]
- G06V10/75: Organisation of the matching processes; coarse-fine approaches, e.g. multi-scale approaches
- G06V10/757: Matching configurations of points or features
- G06V20/10: Terrestrial scenes
- G06T2207/10016: Video; image sequence
- G06T2207/20016: Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; pyramid transform
- G06T2207/20076: Probabilistic image processing
- G06T2207/30204: Marker
- G06T2207/30208: Marker matrix
- G06T2207/30244: Camera pose
- G06V20/20: Scenes; scene-specific elements in augmented reality scenes
Definitions
- the present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for determining camera attitude information.
- Augmented Reality (AR) technology calculates the position and orientation of the camera image in real time and overlays corresponding images, videos, or 3D models.
- the goal of this technology is to place the virtual world on the screen and let it interact with the real world.
- a natural image can be used as a template image for matching (Marker) and corresponding camera pose information is obtained.
- the natural image is a normal captured image
- the Marker image can be a natural image or a regular image.
- the Marker image needs to be detected first, and after the Marker image is detected, the camera pose is obtained by relying on the tracking of the feature points thereof, that is, the camera pose information is obtained.
- in the related art, however, the change of the feature points is not considered.
- when the affine transformation is obvious, matching the feature points of only one image layer of the Marker image with the feature points in the current image yields camera pose information of low precision; matching the feature points of every image layer of the Marker image with the feature points in the current image requires excessive matching overhead, which is not conducive to operational efficiency.
- the embodiment of the present application provides a method, an apparatus, a device, and a storage medium for determining camera posture information.
- One aspect of the present application provides a method for determining camera pose information, the method comprising:
- the first image is a previous frame image of the second image
- the first image and the second image are images acquired by the camera
- the template image is a reference image for matching
- the second target homography matrix is a homography matrix of the template image to the first image
- a device for determining camera pose information comprising:
- a first acquiring module configured to acquire a first image, a second image, and a template image, where the first image is a previous frame image of the second image, and the first image and the second image are images captured by the camera,
- the template image is a reference image for matching
- a detecting module configured to perform feature point detection on the first feature point in the template image and the second feature point in the second image to obtain a first homography matrix
- a tracking module configured to perform feature point tracking on the first optical flow feature point in the first image and the second optical flow feature point in the second image to obtain a first target homography matrix, and to determine a second homography matrix according to the first target homography matrix and the second target homography matrix, the second target homography matrix being a homography matrix of the template image to the first image;
- a complementary filtering module configured to perform complementary filtering processing on the first homography matrix and the second homography matrix to obtain camera pose information of the camera.
- a camera attitude information determining apparatus comprising: a memory, a transceiver, a processor, and a bus system;
- the memory is used to store a program
- the processor is configured to execute a program in the memory to implement the following steps:
- the first image is a previous frame image of the second image
- the first image and the second image are images acquired by the camera
- the template image is a reference image for matching
- a computer readable storage medium having stored therein instructions that, when executed on a computer, cause the computer to perform the method for determining camera pose information described in the above aspects.
- the embodiments of the present application have at least the following advantages:
- the second homography matrix is estimated from the optical flow tracking result between the first image and the second image and the optical flow tracking result between the template image and the first image; optical flow is fast and its precision is higher, so the output result is more stable and smooth, but its error accumulates over time. The method provided in this embodiment complements the filtering characteristics of the two homography matrices through complementary filtering, yielding more accurate camera pose information.
- FIG. 1 is a structural block diagram of a terminal in an embodiment of the present application.
- FIG. 2 is a schematic diagram of an AR application scenario according to an exemplary embodiment of the present disclosure
- FIG. 3 is a schematic diagram of a Marker image in an embodiment of the present application.
- FIG. 4 is a schematic diagram of detecting a Marker image in a current image in an embodiment of the present application
- FIG. 5 is a schematic diagram of an embodiment of a method for determining camera pose information according to an embodiment of the present application.
- FIG. 6 is a schematic flowchart of a method for determining camera pose information in an embodiment of the present application
- FIG. 7 is a schematic diagram of an embodiment of a method for determining camera pose information according to an embodiment of the present application.
- FIG. 8 is a schematic diagram of a template image in an embodiment of the present application.
- FIG. 9 is a schematic diagram of an embodiment of determining a target feature point on an original image layer in an embodiment of the present application.
- FIG. 10 is a schematic diagram of an embodiment of filtering processing a first rotation translation matrix and a second rotation translation matrix in the embodiment of the present application;
- FIG. 11 is a schematic diagram of another embodiment of a device for determining camera posture information according to an embodiment of the present application.
- FIG. 12 is a schematic diagram of another embodiment of a device for determining camera posture information according to an embodiment of the present application.
- FIG. 13 is a schematic diagram of another embodiment of a device for determining camera posture information according to an embodiment of the present application.
- FIG. 14 is a schematic diagram of another embodiment of a device for determining camera pose information according to an embodiment of the present application.
- FIG. 15 is a schematic diagram of another embodiment of a device for determining camera posture information according to an embodiment of the present application.
- FIG. 16 is a schematic diagram of another embodiment of a device for determining camera pose information in an embodiment of the present application.
- FIG. 17 is a schematic structural diagram of an apparatus for determining camera posture information in an embodiment of the present application.
- the embodiment of the present application provides a method for determining camera pose information and a related device, which divides the template image into a plurality of equal grids and extracts at most one target feature point per grid, so that the target feature points are uniformly distributed and have a high degree of matching and fusion; while operational efficiency is ensured, the target feature points can be used to obtain higher-precision camera pose information.
- FIG. 1 is a structural block diagram of a terminal provided by an exemplary embodiment of the present application.
- the terminal includes a processor 120, a memory 140, and a camera 160.
- Processor 120 includes one or more processing cores, such as a 1-core processor or an 8-core processor.
- the processor 120 is configured to execute at least one of instructions, code, code segments, and programs stored in the memory 140.
- the processor 120 is electrically connected to the memory 140.
- processor 120 is coupled to memory 140 via a bus.
- Memory 140 stores one or more instructions, codes, code segments and/or programs. The instructions, code, code segments and/or programs, when executed by the processor 120, are used to implement the method of determining camera pose information as provided in the various embodiments below.
- the processor 120 is also electrically coupled to the camera 160.
- processor 120 is coupled to camera 160 via a bus.
- the camera 160 is a sensor device with image capture capability. The camera 160 may also be referred to as a camera head, a photosensitive device, or the like. The camera 160 can acquire images continuously or multiple times.
- camera 160 is disposed inside or outside the device. In the embodiment of the present application, the camera 160 may continuously acquire a multi-frame image, where the i-th frame image in the multi-frame image is the first image, and the i+1-th frame image in the multi-frame image is the second image.
- FIG. 2 is a schematic diagram of a scenario of an AR application scenario provided by an exemplary embodiment of the present application.
- the figure shows a desktop 220 with a picture 222; the content of the picture 222 can be regarded as a Marker image, which is a reference image for matching.
- the mobile terminal 240, which has a camera, continuously photographs the desktop 220 to obtain frame images such as images 1 to 6 shown in the drawing.
- the continuously captured frame images are sequentially input to the processor for processing.
- the first image is used to refer to the ith frame image captured by the camera
- the second image is used to refer to the i+1th frame image captured by the camera.
- the mobile terminal computes the homography matrix between the Marker image and the second image through a detector (Detector), and computes the homography matrix between the first image and the second image through a tracker (Tracker); after the two homography matrices undergo complementary filtering processing, the camera pose information of the mobile terminal itself is calculated, and this camera pose information represents the spatial position of the mobile terminal in the real world when the second image was captured.
- a homography matrix generally describes the transformation relationship of points lying on a common plane between two images.
- the homography matrix describes the mapping relationship between two planes; if the feature points in the real environment fall on the same physical plane, motion estimation can be performed between two frames through the homography matrix.
- the mobile terminal decomposes the homography matrix by RANSAC (Random Sample Consensus) to obtain a rotation-translation matrix [R|T].
- R is the rotation matrix describing the rotation of the camera from the first pose, in which image A is captured, to the second pose, in which image B is captured.
- T is the translation vector describing the displacement of the camera from the first pose, in which image A is captured, to the second pose, in which image B is captured.
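- as a concrete illustration (a minimal sketch, not the patent's implementation), the following Python/OpenCV code estimates a homography from point pairs with RANSAC and decomposes it into candidate rotation-translation pairs; the synthetic point pairs and the intrinsic matrix K are placeholder assumptions.

```python
import cv2
import numpy as np

# Synthesize point pairs from a known homography so the sketch is self-contained;
# in practice the pairs come from feature matching between two images.
H_true = np.array([[1.0, 0.02, 5.0],
                   [-0.01, 1.0, -3.0],
                   [1e-4, 0.0, 1.0]])
src = (np.random.rand(30, 2) * 200).astype(np.float64)
src_h = np.hstack([src, np.ones((30, 1))])   # homogeneous coordinates
dst_h = (H_true @ src_h.T).T
dst = dst_h[:, :2] / dst_h[:, 2:]            # back to 2-D points

# Estimate the homography with RANSAC; at least 4 point pairs are required.
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

# K is an assumed pinhole intrinsic matrix (focal lengths and principal point
# are placeholders); the decomposition yields candidate [R|T] solutions.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
num_solutions, rotations, translations, normals = cv2.decomposeHomographyMat(H, K)
```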
- the present application addresses the low computing power of mobile devices.
- the solution uses a complementary filtering algorithm to accurately combine the detection result of the natural image stored by the user with the inter-frame tracking result, thereby realizing a stable, fast, and robust method for determining camera pose information that can be applied to AR scenarios, such as AR game scenarios, AR education scenarios, and AR conference scenarios.
- the method can be applied to an application based on Marker image for camera positioning and posture correction.
- the template image in this application is the Marker image.
- the Marker image may also be referred to as an Anchor image.
- the Marker image may be a regular image or a normally captured natural image.
- a natural image refers to a normal captured image
- a regular image is an image with very distinct geometric features, such as a black rectangular frame, a checkerboard, and the like.
- the Marker image also appears in the real world; for example, it may appear on a desktop or on a book, that is, in the scene that the mobile terminal needs to capture, so that a real-world three-dimensional coordinate system can be established based on the Marker image.
- FIG. 3 is a schematic diagram of a Marker image according to an embodiment of the present application.
- a user can use a given natural image, or an image taken with a mobile phone, as the Marker image.
- the smartphone then detects the Marker part in the current image and draws a virtual object in the Marker coordinate system, as shown in FIG. 4, which illustrates a Marker image detected in the current image in an embodiment of the present application.
- the Marker part refers to the image area where the Marker image is located in the current image
- the Marker coordinate system refers to the coordinate system established in the current image based on the Marker part for the real world.
- the cover image on the book in FIG. 4 is the same as the Marker image shown in FIG. 3.
- a three-dimensional animated character is added to the book in FIG. 4 to interact with the user.
- the method of determining the camera pose information in the present application will be described below from the perspective of a mobile terminal having a camera. Referring to FIG. 5, a flowchart of a method of determining camera pose information provided in an exemplary embodiment of the present application is shown. The method includes:
- Step 501 Acquire a first image, a second image, and a template image; the first image is a previous frame image of the second image, the first image and the second image are images acquired by the camera, and the template image is a reference image used for matching;
- the terminal acquires a template image.
- the terminal acquires a template image selected or uploaded by the user, or the terminal acquires a certain frame image collected by the user control camera as a template image.
- the template image is a reference image for matching a plurality of frames of images acquired by the camera during the movement.
- the template image is a reference image for matching the second image
- the second image is a certain frame image of the multi-frame image acquired by the camera during the movement.
- the terminal further acquires the multi-frame images captured by the camera during movement; the terminal uses the i-th frame image as the first image, which is also referred to as the previous frame image.
- the terminal uses the (i+1)-th frame image as the second image, which is also referred to as the current image.
- the acquiring process of the template image is independent of the acquiring process of the first image/second image, and the embodiment does not limit the timing relationship of the two acquiring processes.
- Step 502 Perform feature point detection on the first feature point in the template image and the second feature point in the second image to obtain a first homography matrix;
- the terminal performs feature point detection on the first feature point in the template image and the second feature point in the second image, obtaining at least four pairs of matched feature points between the template image and the Marker part in the second image.
- the first homography matrix is calculated from the at least four pairs of feature points.
- the first homography matrix is used to characterize camera pose changes from the template image to the second image.
- Step 503 Perform feature point tracking on the first optical flow feature point in the first image and the second optical flow feature point of the second image to obtain a second homography matrix
- the terminal also performs optical flow tracking on the second image with respect to the first image to obtain an optical flow matching result of the second optical flow feature point of the second image with respect to the first optical flow feature point in the first image.
- the optical flow matching result includes at least four pairs of feature points, from which the first target homography matrix is calculated; the cached second target homography matrix, from the template image to the first image, is then obtained, and the second homography matrix is determined according to the first target homography matrix and the second target homography matrix.
- Optical flow is a method of describing the movement of pixels between images over time. As time goes by, the same pixel moves within the image, and we want to track its motion. Computing the motion of a subset of pixels is called sparse optical flow; computing the motion of all pixels is called dense optical flow.
- the present application is described by taking the Lucas-Kanade optical flow algorithm for calculating the sparse optical flow as an example, which is referred to as LK optical flow.
- the second homography matrix is also used to characterize camera pose changes from the template image to the second image. Although both the first homography matrix and the second homography matrix are used to characterize the camera pose change of the template image to the second image, the first homography matrix and the second homography matrix are calculated by different calculation methods.
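- for illustration, the following is a minimal Python/OpenCV sketch of sparse Lucas-Kanade optical flow between two frames; the synthetic frames and parameter values are assumptions rather than the patent's implementation.

```python
import cv2
import numpy as np

# Placeholder frames standing in for the first image (previous frame) and the
# second image (current frame); real frames come from the camera.
prev_gray = np.random.randint(0, 256, (480, 640)).astype(np.uint8)
curr_gray = np.roll(prev_gray, 2, axis=1)  # simulate a small horizontal shift

# Shi-Tomasi corners in the previous frame serve as optical flow feature points.
prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                                   qualityLevel=0.01, minDistance=7)

# Sparse LK optical flow from the previous frame into the current frame.
curr_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                 prev_pts, None)

# Keep only successfully tracked pairs; at least 4 pairs allow estimating the
# first target homography matrix between the two frames.
good_prev = prev_pts[status.ravel() == 1]
good_curr = curr_pts[status.ravel() == 1]
if len(good_prev) >= 4:
    H_target1, _ = cv2.findHomography(good_prev, good_curr, cv2.RANSAC, 3.0)
```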
- Step 504 Perform complementary filtering processing on the first homography matrix and the second homography matrix to obtain camera attitude information of the camera.
- the complementary filtering process refers to a processing method of filtering and combining the first homography matrix and the second homography matrix.
- the complementary filtering process is implemented using a Kalman filter or a complementary filter.
- because the first homography matrix is obtained by detecting the template image directly in the second image, its output is slow and its precision is low, but, unlike tracking, it does not accumulate error over time.
- the second homography matrix is estimated from the optical flow tracking result between the first image and the second image and the optical flow tracking result between the template image and the first image, so optical flow is fast, its precision is higher, and its output is more stable and smooth, but its error accumulates over time.
- by performing complementary filtering processing on the first homography matrix and the second homography matrix, the method provided by the embodiment lets the filtering characteristics of the two homography matrices complement each other, thereby obtaining more accurate camera pose information.
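- as a sketch of the filtering principle only (the patented procedure fuses decomposed rotation-translation matrices, not raw matrices), a simple complementary blend of the two homographies could look like the following; the gain alpha and the elementwise blend are simplifying assumptions.

```python
import numpy as np

def complementary_filter(H_detect, H_track, alpha=0.1):
    """Blend the detection homography (slow but drift-free) with the tracking
    homography (fast and smooth but drifting). alpha is an assumed gain; the
    patent does not specify a value."""
    H = alpha * H_detect + (1.0 - alpha) * H_track
    return H / H[2, 2]  # renormalize the projective scale
```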
- the performing complementary filtering processing on the first homography matrix and the second homography matrix to obtain camera attitude information of the camera includes:
- the performing complementary filtering processing on the first rotation translation matrix and the second rotation translation matrix to obtain the camera pose information includes:
- the camera pose information is determined according to the first filtering result and the second filtering result.
- the determining the first rotation translation matrix according to the first homography matrix, and determining the second rotation translation matrix according to the second homography matrix includes:
- the template image corresponds to a plurality of grids arranged in an array
- Performing feature point detection on the first feature point in the template image and the second feature point in the second image to obtain a first homography matrix includes:
- the feature point pair includes: a first feature point located in the target grid, and a feature point of the second feature point that has the greatest matching degree with the first feature point;
- the method further includes:
- the original image layer is an image layer in the template image, and the plurality of grids are included in the original image layer.
- the extracting the first feature point from each image layer of the template image and determining the first feature point in the original image layer includes:
- the method before the matching the first feature point in the template image with the second feature point in the second image, the method further includes:
- Matching a first feature point in the template image with a second feature point in the second image, and determining a set of feature point pairs in each of the plurality of grids include:
- performing feature point tracking on the first optical flow feature point in the first image and the second optical flow feature point in the second image to obtain a second homography matrix, including:
- determining a first target homography matrix, including:
- the determining the second homography matrix according to the first target homography matrix and the second target homography matrix comprises:
- the method further includes:
- the q is a positive integer.
- FIG. 6 is a schematic flowchart of a method for determining camera pose information according to an embodiment of the present application.
- in module 101, the camera continuously captures frame images of the real world.
- a new image is loaded as the current image, and the detector of module 103 detects the first homography matrix from the template image to the current image (that is, the second image); module 105 then determines whether the first homography matrix has been obtained.
- if it has, the flow reaches module 107; the tracker is used to track the second image relative to the first image; if not, the flow jumps back to module 101.
- the template image is tracked in the tracker of module 104, and the second homography matrix is updated.
- the first homography matrix and the second homography matrix may then be fused in module 107 by complementary filtering processing, and the camera pose information obtained after fusion is output in module 108. If module 105 determines that the detection has a result and module 109 determines that the tracker is not initialized, the tracker is initialized and starts working from the next frame.
- the detector and the tracker belong to the camera attitude information determining device.
- the method for determining the camera posture information in the present application is introduced from the perspective of the camera attitude information determining device.
- FIG. 7 a flowchart of a method for determining camera posture information provided in an exemplary embodiment of the present application is shown. The method includes:
- the camera attitude information determining apparatus acquires the first image, the second image, and the template image, where the first image is the previous frame image of the second image, and the second image may be understood as the currently captured image or the image currently being processed.
- the template image is an image to be matched, and may also be referred to as a Marker image or an Anchor image.
- the template image contains multiple image layers.
- the camera attitude information determining device may be a terminal device such as a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sales (POS), or an in-vehicle computer.
- the original image layer is an image layer in the template image, and the original image layer includes multiple grids;
- the template image contains a plurality of image layers. Usually, the sizes of these image layers are inconsistent, and the template image at its original size is called the original image layer.
- the terminal downsamples the original size template image to generate a pyramid image, and the pyramid image includes an image obtained by scaling the original size template image according to a preset ratio. Taking the pyramid image including the four-layer image as an example, the template image is scaled according to the scaling ratios of 1.0, 0.8, 0.6, and 0.4, and four image layers of different sizes of the template image are obtained.
- first feature points are extracted from each image layer, so that multi-scale feature descriptors (that is, the first feature points) of the template image are obtained; the first feature points are then scaled so that the position of every first feature point is mapped back to the corresponding position on the original image layer, and a mask of the original image layer's size is created (that is, the mask has the same size as the original image layer) and evenly divided into a plurality of small grids for later use.
- the terminal extracts feature points for each layer of pyramid image and calculates an ORB feature descriptor.
- the two-dimensional coordinates of each feature point are recorded on the original-scale pyramid image (i.e., the original image layer).
- the feature points on these pyramid images, together with their two-dimensional coordinates, may be referred to as first feature points. In one example, there are up to 500 first feature points on each pyramid image.
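- a minimal sketch of this multi-scale extraction follows; the scale ratios and the 500-point cap come from the examples above, while the placeholder image and remaining parameters are assumptions.

```python
import cv2
import numpy as np

template = np.random.randint(0, 256, (480, 640)).astype(np.uint8)  # placeholder
scales = [1.0, 0.8, 0.6, 0.4]          # scaling ratios from the embodiment
orb = cv2.ORB_create(nfeatures=500)    # up to 500 first feature points per layer

first_features = []  # (position on the original image layer, ORB descriptor)
for s in scales:
    layer = cv2.resize(template, None, fx=s, fy=s)
    keypoints, descriptors = orb.detectAndCompute(layer, None)
    if descriptors is None:
        continue
    for kp, desc in zip(keypoints, descriptors):
        # Map the keypoint position back onto the original image layer.
        first_features.append(((kp.pt[0] / s, kp.pt[1] / s), desc))
```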
- the first feature point may be a scale-invariant feature transform (SIFT) feature point, a speeded-up robust feature (SURF) feature point, an oriented FAST and rotated BRIEF (ORB) feature point, a histogram of oriented gradients (HOG) feature, or a local binary pattern (LBP) feature; to ensure real-time performance, ORB feature points are used in this solution.
- the FAST corner point refers to the location of the ORB feature point in the image.
- the FAST corner point mainly detects the obvious change of the local pixel gray scale, and is known for its fast speed.
- the idea of the FAST corner is that if a pixel differs greatly from the neighborhood's pixels (too bright or too dark), the pixel may be a corner.
- the BRIEF descriptor is a binary description vector that records, in a hand-designed way, information about the pixels around the keypoint.
- the description vector of the BRIEF descriptor consists of a number of 0's and 1's, where 0's and 1's encode the size relationship of two pixels near the FAST corner.
- the ORB feature point pairs the accelerated FAST detector with the BRIEF descriptor, adds rotation invariance, and is fast; it is therefore suitable for implementation on mobile devices.
- the target feature point is the feature point with the largest matching degree between the first feature point and the second feature point; the target feature point is used to determine the first homography matrix, and the second feature point is a feature point extracted from the second image.
- since the first feature point has no scale invariance, and the scale of the template image changes noticeably in our application (the user may photograph the template image at different scales), the scale problem must be solved; therefore, a pyramid image is generated for the template image, a first feature point is extracted from each layer of the template image, and the first feature points are then matched against the second feature points in the second image.
- the camera attitude information determining means detects whether the template image exists in the currently captured second image by matching the second feature points extracted from the second image with the first feature points on the original image layer. For the first feature points in each target grid of the original image layer, if multiple second feature points on the second image match first feature points belonging to that target grid, at most one candidate feature point is selected as the target feature point of that grid, and the first homography matrix is then calculated using the target feature points.
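- a minimal sketch of this per-grid selection follows; the grid cell size and the data layout are illustrative assumptions.

```python
CELL = 20  # assumed grid cell size in pixels; the patent does not fix a value

def grid_filter(matches, cell=CELL):
    """Keep, per grid cell of the original image layer, only the candidate
    match with the highest score. `matches` holds (x, y, score, pair) tuples,
    where (x, y) is the first feature point's position on the original layer."""
    best = {}
    for x, y, score, pair in matches:
        key = (int(x // cell), int(y // cell))
        if key not in best or score > best[key][0]:
            best[key] = (score, pair)
    return [pair for score, pair in best.values()]
```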
- steps 702 to 703 complete the detection of the template image
- the camera attitude information determining apparatus needs to use the optical flow tracking method to track the image.
- the Lucas-Kanade optical flow algorithm is mainly used: the first optical flow feature points extracted from the previous frame image (i.e., the first image) are tracked into the new image, thereby finding matching points between the two frames and calculating the first target homography matrix from the first image to the second image; the second target homography matrix, from the template image to the first image, cached during the historical optical flow process, is then acquired, thereby obtaining the second homography matrix from the template image to the second image.
- the camera attitude information determining apparatus may calculate camera attitude information according to the first homography matrix and the second homography matrix.
- the camera attitude information determining apparatus performs complementary filtering processing on the first homography matrix and the second homography matrix to obtain camera attitude information of the camera.
- the complementary filtering process refers to a processing method of filtering and combining the first homography matrix and the second homography matrix.
- the complementary filtering process is implemented using a Kalman filter or a complementary filter.
- a method for determining camera attitude information is provided.
- the camera attitude information determining apparatus first acquires a first image, a second image, and a template image, and then extracts a first feature point from each image layer of the template image.
- the original image layer is an image layer in the template image
- the original image layer includes a plurality of grids
- the camera pose information determining device further matches the first feature points with the second feature points to determine a target feature point in each grid of the original image layer, where the target feature point is the feature point with the largest matching degree between the first feature points and the second feature points
- the target feature point is used to determine the first homography matrix
- the second feature point is the feature point extracted from the second image.
- the camera pose information determining device may further obtain the second homography matrix based on the first optical flow feature points in the first image and the second optical flow feature points in the second image.
- the camera pose information is then determined according to the first homography matrix and the second homography matrix.
- the template image is divided into multiple equal grids, and at most one target feature point exists in each grid, so the target feature points are evenly distributed and have a high degree of matching and fusion, which ensures operational efficiency.
- at the same time, camera pose information with higher precision can be obtained by using the target feature points.
- the first feature points in the first image layer and the first feature points in the second image layer are scaled and projected onto the original image layer.
- the manner of determining the first feature points in the original image layer is described below.
- multi-layer images are extracted in the template image (or reduced in different scales to obtain multi-layer images), and the sizes of these images are pyramid-shaped, that is, the size of the images is sorted from small to large. Assuming that the first image layer is above the second image layer, then the first image layer can be considered to be smaller than the second image layer.
- the camera attitude information determining means extracts the first feature points from the first image layer and the second image layer, respectively, and then scales all the first feature points to be projected onto the original image layer.
- the size of the original image layer can be designed according to actual conditions, which is not limited herein.
- the original image layer of the template image is scaled by scaling factors of 1.0, 0.8, 0.6, and 0.4 to obtain a first image layer, a second image layer, a third image layer, and a fourth image layer of the template image.
- the camera attitude information determining apparatus first extracts first feature points from a first image layer of the template image and from a second image layer of the template image, where the first image layer and the second image layer have different sizes; the first feature points in the first image layer and the first feature points in the second image layer are then scaled and projected onto the original image layer.
- all the first feature points extracted from each image layer may be merged on the original image layer corresponding to the template image, thereby obtaining as many first feature points as possible to facilitate further screening of the first feature points, which improves the accuracy of the screening.
- the matching performance is enhanced to ensure that objects can be detected even under large-scale changes.
- the template image corresponds to a plurality of grids arranged in an array;
- Performing feature point detection on the first feature point in the template image and the second feature point in the second image to obtain a first homography matrix includes:
- the feature point pair includes: a first feature point located in the target grid, and a feature point of the second feature point that has the greatest matching degree with the first feature point;
- the target grid is a subset of the plurality of grids of the template image; that is, a first feature point in a target grid has a matching target feature point in the second image, and each target grid corresponds to only one matched feature point pair. Because the homography matrix between two images can be calculated from a minimum of four pairs of feature points, fewer feature point pairs are needed, but the quality requirement is higher; feature point pairs within the same grid are highly similar, so the terminal selects feature point pairs belonging to different target grids as far as possible for subsequent calculation.
- the cover image of the book in FIG. 8 is a schematic diagram of the template image in the embodiment of the present application.
- in FIG. 8, the scale of the left half is small and the scale of the right half is large, and a single-layer pyramid space cannot accurately describe this situation. Therefore, such features can be described in the following manner.
- FIG. 9 is a schematic diagram of an embodiment of determining a target feature point on an original image layer according to an embodiment of the present application.
- the original template image is downsampled to generate a pyramid, and the first feature point is extracted for each layer. Therefore, a plurality of first feature points of the template image at a plurality of scales are obtained.
- the feature points are scaled so that the positions of all feature points are reduced to the original image layer size, and a mask of the original image layer size is created and evenly divided into multiple small grids for later use.
- for the second image, the second feature points are extracted once and then matched against the first feature points on the original image layer, yielding feature matches at multiple scales.
- since each point on the template image has multiple scales and all are scaled down to the original image layer size, multiple matching feature points gather in the same grid area.
- for each grid, only the point with the highest matching score is selected as the representative.
- the matches from the template image to the second image after grid filtering are thereby obtained.
- the template image to the first homography matrix of the second image is calculated according to the at least four sets of feature point pairs.
- within a single grid of the template image, more than one feature point may be merged by the grid filter.
- Our grid filtering method is equivalent to smoothing two adjacent feature points, and the matching information of the two layers is utilized proportionally, so that the number of pyramid layers required can be greatly reduced.
- the grid filter automatically selects the corresponding scale, selects the low scale in the left half, and selects the high scale in the right half, so that the matching can be better.
- the camera attitude information determining device first extracts the second feature point from the second image, and then matches the first feature point with the second feature point in each of the original image layers. And obtaining at least one feature point to be selected, wherein each feature point to be selected corresponds to a matching score, and finally, in each grid of the original image layer, selecting a feature with the highest matching score from at least one feature point to be selected The point is the target feature point.
- the grid is used to limit the maximum number of matches, which ensures the stability of the calculated first homography matrix; the second feature points are extracted only once from the second image during operation, and the added feature matching takes little time and does not affect the running speed, thus improving matching efficiency.
- the camera attitude information determining apparatus needs to acquire the first optical flow feature point in the preset area of the first image in the process of determining the first target homography matrix, and the preset area may include four vertices.
- the four vertices initialize an image area, which is the area where the template image is located in the first image, on which some Shi-Tomasi corner points are extracted as the first optical flow feature points.
- the previously tracked optical flow feature points may degrade; in particular, optical flow points from before a rotation or perspective change may no longer be observable in the current image, so the tracked optical flow feature points need to be updated every few frames.
- using the four vertices of the preset area calculated in the previous frame (refer to the four vertices of the book at the bottom right of FIG. 9), the area is shrunk inward to obtain a mask, and the optical flow algorithm determines the second optical flow feature points within the mask. Understandably, this entire process runs in a background thread and does not affect the speed of the main thread.
- the tracked optical flow feature points are thus automatically updated, ensuring the stability of the optical flow algorithm. The first target homography matrix from the first image to the second image is calculated according to the first optical flow feature points and the second optical flow feature points, and serves as the recursive basis from the first image back to the template image.
- the number of pixels in the mask is smaller than the number of pixels in the preset area. This is because the optical flow feature points must lie on the template image, and feature points at the edges are prone to detection errors, so the mask obtained by shrinking the area inward by one circle contains fewer pixels.
- in summary, the camera attitude information determining apparatus acquires the first optical flow feature points in the preset area of the first image, and obtains the second optical flow feature points within the mask of the second image according to the optical flow algorithm and the first optical flow feature points. Finally, the first target homography matrix from the first image to the second image is calculated according to the first optical flow feature points and the second optical flow feature points. In the above manner, the preset area calculated from the previous frame image is shrunk inward by one circle from its vertices to obtain the mask, so that detection of edge feature points is reduced, thereby lowering the detection error rate.
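- a minimal sketch of building the shrunken mask and extracting Shi-Tomasi corners inside it follows; the quadrilateral, the shrink factor, and the detector parameters are illustrative assumptions.

```python
import cv2
import numpy as np

# Assumed four vertices of the template area computed from the previous frame.
quad = np.array([[100, 80], [420, 90], [430, 360], [90, 350]], np.float32)

# Shrink the quadrilateral toward its centroid so edge points are excluded.
center = quad.mean(axis=0)
shrunk = (center + 0.9 * (quad - center)).astype(np.int32)

frame = np.random.randint(0, 256, (480, 640)).astype(np.uint8)  # placeholder
mask = np.zeros(frame.shape, np.uint8)
cv2.fillConvexPoly(mask, shrunk, 255)  # mask covers only the shrunken area

# Extract Shi-Tomasi corners inside the mask as optical flow feature points.
pts = cv2.goodFeaturesToTrack(frame, maxCorners=100, qualityLevel=0.01,
                              minDistance=7, mask=mask)
```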
- on the basis of the third embodiment described above, determining the second homography matrix according to the first target homography matrix and the second target homography matrix may include:
- in the process of determining the second homography matrix, or before determining it, the camera attitude information determining apparatus needs to acquire the third optical flow feature points in the template image, and then find matching points between the template image and the first image according to the third optical flow feature points and the first optical flow feature points, thereby calculating the second target homography matrix; the second target homography matrix is multiplied by the first target homography matrix to obtain the second homography matrix from the template image to the second image.
- the current image is the second image
- the previous frame image is the first image
- this introduces the manner in which the camera pose information device determines the second homography matrix: the second target homography matrix, from the template image to the first image, is first acquired, and the second homography matrix, from the template image to the second image, is then calculated according to the first target homography matrix and the second target homography matrix.
- the second homography matrix can be obtained by using the optical flow feature points, thereby improving the feasibility and practicability of the scheme.
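- in sketch form (identity 3x3 arrays standing in for the real matrices), the chaining is a single matrix product.

```python
import numpy as np

# Assumed 3x3 homographies: H_t1 maps the template image to the first image
# (the second target homography matrix); H_12 maps the first image to the
# second image (the first target homography matrix).
H_t1 = np.eye(3)
H_12 = np.eye(3)

H_t2 = H_12 @ H_t1        # second homography matrix: template -> second image
H_t2 = H_t2 / H_t2[2, 2]  # optional renormalization of the projective scale
```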
- the method may further include:
- the q optical flow feature points are acquired as the second optical flow feature points, so that the number of the second optical flow feature points reaches a preset threshold, and q is a positive integer.
- when too many tracked points are lost, the remaining second optical flow feature points are too few to represent the characteristics of the mask. Therefore, q optical flow feature points need to be extracted again within the mask as second optical flow feature points, so that the number of second optical flow feature points reaches the preset threshold.
- the preset threshold may be 50, or 100, and may be other values, which are not limited herein.
- the camera attitude information determining device acquires q optical flow feature points from the mask as second optical flow feature points, so that the number of second optical flow feature points reaches a preset threshold.
- the new optical flow feature points can be re-extracted to compensate, which is equivalent to automatically updating the tracked feature points, thereby improving the stability of the optical flow algorithm.
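- a minimal sketch of this replenishment follows; the threshold value and the detector parameters are illustrative assumptions.

```python
import cv2
import numpy as np

PRESET_THRESHOLD = 50  # the embodiment cites 50 or 100 as example values

def replenish_points(tracked_pts, frame, mask):
    """If tracking left fewer points than the threshold, extract q fresh
    Shi-Tomasi corners inside the mask to top the set back up."""
    count = 0 if tracked_pts is None else len(tracked_pts)
    if count >= PRESET_THRESHOLD:
        return tracked_pts
    q = PRESET_THRESHOLD - count
    fresh = cv2.goodFeaturesToTrack(frame, maxCorners=q, qualityLevel=0.01,
                                    minDistance=7, mask=mask)
    if tracked_pts is None:
        return fresh
    if fresh is None:
        return tracked_pts
    return np.vstack([tracked_pts, fresh])
```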
- on the basis of the foregoing embodiment, determining camera pose information according to the first homography matrix and the second homography matrix may include:
- Complementary filtering processing is performed on the first rotation translation matrix and the second rotation translation matrix to acquire camera posture information.
- the camera attitude information determining apparatus needs to be divided into two steps when determining the camera attitude information, and the first step is mainly to determine two rotation translation matrices.
- the second step is mainly to perform complementary filtering on the two rotation translation matrices, and finally obtain camera pose information.
- The process of converting a homography matrix into a rotation-translation matrix can be understood as converting two-dimensional coordinates into three-dimensional coordinates; the three-dimensional coordinates are used to determine the camera's position in the real world relative to the pose at which the template image was captured.
- In practice, the time-consuming part lies mainly in detection: the tracker tracks one frame in no more than about 10 milliseconds, while the detector takes approximately 30 milliseconds per frame. Therefore, as an alternative, the fusion of the first rotation-translation matrix and the second rotation-translation matrix is not performed on every frame; instead, detection and fusion are placed on a back-end thread, and the correction increments obtained by the fusion are used to correct subsequent camera poses. The main thread then only needs to perform tracking, detection and fusion never block it, and the overall calculation speed is improved.
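- The threading arrangement described above might be sketched as follows. The patent describes this only at the architecture level, so `detect` and `fuse` are injected placeholders standing in for the detector and the fusion step:

```python
import queue
import threading

class BackendFusion:
    """Run detection and fusion on a back-end thread so the main (tracking)
    thread is never blocked. `detect` and `fuse` are illustrative callables,
    not APIs defined by the patent."""

    def __init__(self, detect, fuse):
        self.detect = detect
        self.fuse = fuse
        self.jobs = queue.Queue()
        self.corrections = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, frame, tracked_pose):
        self.jobs.put((frame, tracked_pose))   # non-blocking for the main thread

    def latest_correction(self):
        try:
            return self.corrections.get_nowait()
        except queue.Empty:
            return None                        # no new correction increment yet

    def _worker(self):
        while True:
            frame, tracked_pose = self.jobs.get()
            detected_pose = self.detect(frame)  # ~30 ms, runs off the main thread
            self.corrections.put(self.fuse(detected_pose, tracked_pose))
```

- The main loop keeps tracking every frame and simply applies the most recent correction increment whenever one becomes available.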
- In summary, the process of determining the camera pose information is divided into two parts: one part determines the first rotation-translation matrix according to the first homography matrix and the second rotation-translation matrix according to the second homography matrix; the other part performs complementary filtering processing on the first rotation-translation matrix and the second rotation-translation matrix to obtain the camera pose information of the camera.
- That is, the two-dimensional homography matrix can be decomposed into a three-dimensional rotation-translation matrix. Since all the first feature points on the template image are fused onto the original image layer, only one rotation-translation matrix is obtained, which improves the operability of the solution. In addition, the use of complementary filtering results in smoother camera pose information.
- Optionally, on the basis of the sixth embodiment corresponding to the foregoing figure, determining the first rotation-translation matrix according to the first homography matrix and determining the second rotation-translation matrix according to the second homography matrix may include the following.
- The first rotation-translation matrix is calculated from the first homography matrix, the perspective projection matrix of the second image, and the perspective projection matrix of the template image. The rotation part of the first rotation-translation matrix represents the spatial rotation of the camera from the first pose, in which the template image was acquired, to the second pose, in which the second image was acquired; the translation part of the first rotation-translation matrix represents the corresponding spatial displacement.
- The second rotation-translation matrix is calculated analogously from the second homography matrix, the perspective projection matrix of the second image, and the perspective projection matrix of the template image. Its rotation part likewise represents the spatial rotation of the camera from the first pose to the second pose, and its translation part represents the corresponding spatial displacement.
- Specifically, once the first homography matrix is available, the first rotation-translation matrix can be decomposed from it by combining it with the camera parameters; the second rotation-translation matrix can be decomposed in the same way. The following takes the decomposition of the first rotation-translation matrix as an example; the manner of decomposing the second rotation-translation matrix is similar and is not described again here.
- Let x_c denote the homogeneous representation of two-dimensional coordinates on the second image, x_m the homogeneous representation of the corresponding two-dimensional coordinates on the template image, H the first homography matrix, (R, T) the first rotation-translation matrix, and P a perspective projection matrix, where the homogeneous coordinate of a 2D point [x, y]^T is [x, y, 1]^T and the homogeneous coordinate of a 3D point [x, y, z]^T is [x, y, z, 1]^T. From x_c = H·x_m it can therefore be derived that x_c = P_c·[R|T]·P_m^(-1)·x_m, where P_c is the perspective projection matrix of the second image, P_m is the perspective projection matrix of the template image, and x_m is back-projected into three-dimensional space through the camera parameters of the template image.
- Comparing the two expressions for x_c entry by entry yields the scaled values sR_00, sR_10, sR_20, ..., sT_0, sT_1, sT_2. Since R is a rotation matrix whose column vectors are unit vectors, the scale factor s can be solved for; the third column R_2 is then computed from R_0 and R_1 (as their cross product), and T is computed from s and the third column of the left-hand side. The scale factor s is determined only up to sign: by computing the position of the template image in the second image from the candidate (R, T) and requiring that it lie in front of the camera (that is, from the sign of T_z), the sign of s can be inferred, which yields a determinate rotation-translation matrix and hence determinate camera pose information.
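- A sketch of a standard plane-induced homography decomposition in this spirit, assuming a single intrinsic matrix K in place of the patent's perspective projection matrices P_c and P_m, and assuming the template plane is z = 0 (both assumptions are simplifications for illustration):

```python
import numpy as np

def decompose_homography(H, K):
    """Decompose a template-to-image homography into (R, T).

    An illustrative sketch of the classic decomposition, not the patent's
    exact formulation. K is the 3x3 pinhole intrinsic matrix.
    """
    A = np.linalg.inv(K) @ H                 # A ~ s * [R0 | R1 | T]
    # Columns of R are unit vectors, so the scale s is their (average) norm.
    s = (np.linalg.norm(A[:, 0]) + np.linalg.norm(A[:, 1])) / 2.0
    # Resolve the sign of s by requiring the plane to lie in front of the
    # camera; the T_z > 0 convention used here is an assumption.
    if A[2, 2] < 0:
        s = -s
    R0 = A[:, 0] / s
    R1 = A[:, 1] / s
    R2 = np.cross(R0, R1)                    # third column from the first two
    T = A[:, 2] / s
    R = np.stack([R0, R1, R2], axis=1)
    # Re-orthonormalize R (nearest rotation matrix via SVD).
    U, _, Vt = np.linalg.svd(R)
    return U @ Vt, T
```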
- Optionally, performing complementary filtering processing on the first rotation-translation matrix and the second rotation-translation matrix to obtain the camera pose information may include: inputting the first rotation-translation matrix to a low-pass filter to obtain a first filtering result; inputting the second rotation-translation matrix to a high-pass filter to obtain a second filtering result; and determining the camera pose information based on the first filtering result and the second filtering result.
- In this embodiment, acquiring the first rotation-translation matrix is slow and of lower precision: its output fluctuates around the correct value with high-frequency error, but its average is relatively stable. Acquiring the second rotation-translation matrix is faster and of higher precision: its output is stable and smooth without high-frequency jitter, but its error accumulates over time and causes drift. The two are therefore complementary, and performing complementary filtering on them yields a smooth output result.
- A low-pass filter and a high-pass filter can together form a complementary filter; a Kalman filter can also implement the functions of the low-pass and high-pass filters. The Kalman filter and the complementary filter perform almost identically, but the complementary filter is more compact and better matches the characteristics of this application scenario, so a complementary-filtering idea is used here to implement a visual complementary filter.
- FIG. 10 is a schematic diagram of an embodiment of performing filtering processing on the first rotation-translation matrix and the second rotation-translation matrix in an embodiment of the present application. The first homography matrix is the homography matrix obtained from the detector and represents the transformation from the template image to the current camera image; it can be directly decomposed into the first rotation-translation matrix (R1, T1). The second homography matrix is the homography matrix obtained from the tracker and likewise represents the transformation from the template image to the current image (that is, the second image); it can be decomposed into the second rotation-translation matrix (R2, T2). (R1, T1) is passed through a low-pass filter to filter out high-frequency noise, yielding the first filtering result (Rf1, Tf1).
- The true coordinates of the four vertices of the template image in the template image coordinate system are known. Using the filtered rotation-translation matrix, the three-dimensional coordinates of these four vertices in the current camera coordinate system are calculated, and the corresponding two-dimensional coordinates are then obtained through perspective projection. The four resulting matches from the template image to two-dimensional points on the current camera image are used to calculate an updated homography matrix, which in turn updates the integrator, thereby eliminating the accumulated error of the tracker.
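- This vertex-reprojection update might be sketched as follows, assuming a pinhole model with intrinsic matrix K (an illustrative simplification); cv2.getPerspectiveTransform then recovers the updated homography from the four correspondences:

```python
import cv2
import numpy as np

def refreshed_homography(R, T, K, template_corners_3d, template_corners_2d):
    """Re-derive a template-to-image homography from a filtered pose.

    R, T                : filtered rotation (3x3) and translation (3,)
    K                   : 3x3 camera intrinsic matrix (pinhole assumption)
    template_corners_3d : (4, 3) template vertices in the template coordinate system
    template_corners_2d : (4, 2) the same vertices in template image coordinates
    """
    # Project the four 3D vertices into the current camera image.
    cam_pts = (R @ template_corners_3d.T).T + T       # into camera coordinates
    img_pts = (K @ cam_pts.T).T
    img_pts = img_pts[:, :2] / img_pts[:, 2:3]        # perspective division
    # Four point correspondences fully determine the updated homography,
    # which can then be used to reset the tracker's integrator.
    return cv2.getPerspectiveTransform(template_corners_2d.astype(np.float32),
                                       img_pts.astype(np.float32))
```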
- It should be noted that the complementary filtering provided in this embodiment is a framework: it is not restricted to filtering the detection result obtained by the detector together with the optical flow tracking result obtained by the tracker. It can operate on template image tracking results from any two or more different sources, and even on data transmitted by external sensors (such as data measured by an inertial measurement unit); a Kalman filter can also be used for the corresponding processing.
- Again, in this embodiment of the present application, complementary filtering processing is performed on the first rotation-translation matrix and the second rotation-translation matrix: the first rotation-translation matrix is input to the low-pass filter to obtain the first filtering result, the second rotation-translation matrix is input to the high-pass filter to obtain the second filtering result, and the camera pose information is finally determined according to the first filtering result and the second filtering result.
- In this way, the low precision and high-frequency error of the first rotation-translation matrix are compensated, and the drift of the second rotation-translation matrix caused by error accumulating over time is compensated as well; the complementary filtering method thus yields a smooth output result and improves the feasibility of the solution.
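- As a minimal sketch of the complementary idea (a first-order blend with coefficient alpha; the patent specifies the filter only at the block-diagram level, and the rotation part is omitted here for brevity, since it would properly be filtered on SO(3) rather than componentwise):

```python
import numpy as np

class ComplementaryPoseFilter:
    """Blend a slow, drift-free estimate (detector) with a fast, drifting
    estimate (tracker). alpha close to 1 trusts the tracker at high
    frequency; the detector pulls the estimate back at low frequency."""

    def __init__(self, alpha=0.95):
        self.alpha = alpha
        self.state = None  # fused translation, a simple stand-in for the pose

    def update(self, t_detector, t_tracker_delta):
        # t_detector      : absolute translation from the detection path
        # t_tracker_delta : incremental translation from the tracking path
        t_detector = np.asarray(t_detector, dtype=float)
        if self.state is None:
            self.state = t_detector
            return self.state
        # High-pass path: integrate the tracker increment.
        predicted = self.state + np.asarray(t_tracker_delta, dtype=float)
        # Low-pass path: pull toward the detector's absolute estimate.
        self.state = self.alpha * predicted + (1.0 - self.alpha) * t_detector
        return self.state
```

- A full implementation would filter the rotation with quaternion or SO(3) interpolation; the scalar blend above only illustrates how the two error characteristics cancel.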
- FIG. 11 is a schematic diagram of an apparatus for determining camera pose information in an embodiment of the present application. The camera pose information determining apparatus 30 includes:
- a first acquiring module 301, configured to acquire a first image, a second image, and a template image, where the first image is the previous frame image of the second image, the first image and the second image are images acquired by the camera, and the template image is a reference image used for matching;
- a detecting module 302, configured to perform feature point detection on first feature points in the template image and second feature points in the second image to obtain a first homography matrix;
- a tracking module 303, configured to determine a first target homography matrix according to first optical flow feature points in the first image and second optical flow feature points in the second image, and to determine a second homography matrix according to the first target homography matrix and a second target homography matrix, where the second target homography matrix is the homography matrix from the template image to the first image;
- a complementary filtering module 304, configured to perform complementary filtering processing on the first homography matrix and the second homography matrix to obtain the camera pose information of the camera.
- the complementary filtering module 304 includes:
- a determining unit 3041, configured to determine a first rotation-translation matrix according to the first homography matrix and a second rotation-translation matrix according to the second homography matrix, where the first homography matrix and the second homography matrix are two-dimensional information, and the first rotation-translation matrix and the second rotation-translation matrix are three-dimensional information;
- a processing unit 3042, configured to perform complementary filtering processing on the first rotation-translation matrix and the second rotation-translation matrix to obtain the camera pose information.
- Optionally, the processing unit 3042 includes:
- a first input subunit 30421, configured to input the first rotation-translation matrix to the low-pass filter to obtain a first filtering result;
- a second input subunit 30422, configured to input the second rotation-translation matrix to the high-pass filter to obtain a second filtering result;
- a determining subunit 30423, configured to determine the camera pose information according to the first filtering result and the second filtering result.
- Optionally, the determining unit 3041 includes:
- a first calculation subunit 30411, configured to calculate the first rotation-translation matrix according to the first homography matrix, the perspective projection matrix of the second image, and the perspective projection matrix of the template image;
- a second calculation subunit 30412, configured to calculate the second rotation-translation matrix according to the second homography matrix, the perspective projection matrix of the second image, and the perspective projection matrix of the template image.
- Optionally, the template image corresponds to a plurality of grids arranged in an array, and the detecting module 302 includes:
- a matching module 3021, configured to match the first feature points in the template image with the second feature points in the second image and determine one feature point pair in each target grid of the plurality of grids, the feature point pair including a first feature point located in the target grid and the second feature point having the largest matching degree with that first feature point;
- a first determining module 3022, configured to calculate the first homography matrix between the template image and the second image according to the feature point pairs in the target grids.
- Optionally, the apparatus further includes:
- a first extraction module, configured to extract the first feature points from each image layer of the template image and determine the first feature points in an original image layer, where the original image layer is one image layer of the template image and contains the plurality of grids.
- Optionally, the first extraction module includes:
- a first extracting unit, configured to extract the first feature points from a first image layer of the template image;
- a second extracting unit, configured to extract the first feature points from a second image layer of the template image, where the first image layer and the second image layer have different sizes;
- a projection unit, configured to scale the first feature points in the first image layer and the first feature points in the second image layer and project them onto the original image layer.
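- A sketch of this multi-layer extraction, assuming an ORB detector and a pyrDown pyramid (both illustrative choices; the patent does not fix the detector or the scaling scheme):

```python
import cv2
import numpy as np

def extract_multiscale_features(template_gray, num_layers=3):
    """Extract feature points from several scaled image layers of the
    template image and project them back onto the original layer."""
    orb = cv2.ORB_create()
    points_on_original = []
    layer = template_gray
    scale = 1.0
    for _ in range(num_layers):
        keypoints = orb.detect(layer, None)
        # Project each keypoint back to original-layer coordinates.
        points_on_original += [(kp.pt[0] / scale, kp.pt[1] / scale)
                               for kp in keypoints]
        layer = cv2.pyrDown(layer)   # next layer is half the size
        scale *= 0.5
    return np.array(points_on_original, dtype=np.float32)
```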
- Optionally, the apparatus further includes:
- a second extraction module, configured to extract the second feature points from the second image.
- The matching module includes:
- a matching unit, configured to match, for the first feature points in each target grid of the original image layer, the first feature points with the second feature points, obtaining at least one pair of mutually matching candidate feature point pairs, each candidate feature point pair corresponding to one matching score;
- a selecting unit, configured to select, from the at least one candidate feature point pair, the feature point pair with the highest matching score as the feature point pair determined in the target grid.
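- The per-grid screening just described might be sketched as follows; all names are illustrative assumptions:

```python
import numpy as np

def best_match_per_grid(matches, first_points, grid_size, grid_shape):
    """For each grid cell of the original image layer, keep only the
    candidate match with the highest score.

    matches      : list of (first_idx, second_idx, score)
    first_points : (N, 2) coordinates of first feature points on the original layer
    grid_size    : (cell_w, cell_h) in pixels
    grid_shape   : (cols, rows) of the grid array
    """
    best = {}
    for i, j, score in matches:
        x, y = first_points[i]
        cell = (int(x // grid_size[0]), int(y // grid_size[1]))
        if cell[0] >= grid_shape[0] or cell[1] >= grid_shape[1]:
            continue  # point falls outside the grid array
        if cell not in best or score > best[cell][2]:
            best[cell] = (i, j, score)
    return list(best.values())
```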
- Optionally, the tracking module 303 includes:
- a first acquiring unit 3031, configured to acquire the first optical flow feature points in a preset area of the first image, where the preset area is the area corresponding to the template image;
- a second acquiring unit 3033, configured to acquire the second optical flow feature points according to the first optical flow feature points;
- a first calculating unit 3032, configured to calculate the first target homography matrix from the first image to the second image according to the first optical flow feature points and the second optical flow feature points.
- Optionally, the apparatus further includes:
- a second acquiring module, configured to acquire, when the number of the second optical flow feature points is less than a preset threshold, q optical flow feature points as second optical flow feature points, so that the number of second optical flow feature points reaches the preset threshold, where q is a positive integer.
- An embodiment of the present application further provides another camera pose information determining apparatus, as shown in FIG. 17. For convenience of description, only the parts related to this embodiment of the present application are shown; for details not disclosed, refer to the foregoing method embodiments. The terminal may be any terminal device including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sale (POS) terminal, or an in-vehicle computer; the following takes a mobile phone as an example.
- FIG. 17 is a block diagram of a partial structure of the mobile phone related to the terminal provided in an embodiment of the present application.
- Referring to FIG. 17, the mobile phone includes components such as a radio frequency (RF) circuit 410, a memory 420, an input unit 430, a display unit 440, a sensor 450, an audio circuit 460, a wireless fidelity (WiFi) module 470, a processor 480, and a power supply 490.
- The RF circuit 410 may be used to receive and send signals during information transmission and reception or during a call. In particular, after downlink information from the base station is received, it is handed to the processor 480 for processing; in addition, uplink data is sent to the base station.
- RF circuit 410 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
- In addition, the RF circuit 410 may communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), and the like.
- The memory 420 may be used to store software programs and modules. The processor 480 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 420. The memory 420 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application required by at least one function (such as a sound playing function or an image playing function), and the like, and the data storage area may store data created according to use of the mobile phone (such as audio data or a phone book). In addition, the memory 420 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
- the input unit 430 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function controls of the handset.
- the input unit 430 may include a touch panel 431 and other input devices 432.
- The touch panel 431, also referred to as a touch screen, can collect a touch operation performed by the user on or near it (such as an operation performed by the user on or near the touch panel 431 with a finger, a stylus, or any other suitable object or accessory) and drive a corresponding connecting device according to a preset program. Optionally, the touch panel 431 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch orientation of the user, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 480, and receives and executes commands sent by the processor 480. In addition, the touch panel 431 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave types.
- the input unit 430 may also include other input devices 432.
- other input devices 432 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
- the display unit 440 can be used to display information input by the user or information provided to the user as well as various menus of the mobile phone.
- the display unit 440 can include a display panel 441.
- the display panel 441 can be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
- Further, the touch panel 431 may cover the display panel 441. When the touch panel 431 detects a touch operation on or near it, the operation is transmitted to the processor 480 to determine the type of the touch event, and the processor 480 then provides a corresponding visual output on the display panel 441 according to the type of the touch event. Although the touch panel 431 and the display panel 441 are shown in FIG. 17 as two independent components implementing the input and output functions of the mobile phone, in some embodiments the touch panel 431 and the display panel 441 may be integrated to implement the input and output functions of the mobile phone.
- The mobile phone may also include at least one type of sensor 450, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor may adjust the brightness of the display panel 441 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 441 and/or the backlight when the mobile phone moves to the ear. As one type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in all directions (usually three axes), and can detect the magnitude and direction of gravity when stationary; it may be used in applications that recognize the attitude of the mobile phone (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and in vibration-recognition related functions (such as a pedometer or tapping). Other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor may also be configured on the mobile phone; details are not described herein again.
- The audio circuit 460, a speaker 461, and a microphone 462 provide an audio interface between the user and the mobile phone. The audio circuit 460 may transmit an electrical signal converted from received audio data to the speaker 461, which converts it into a sound signal for output; on the other hand, the microphone 462 converts a collected sound signal into an electrical signal, which the audio circuit 460 receives and converts into audio data. The audio data is then processed by the processor 480 and sent through the RF circuit 410 to, for example, another mobile phone, or output to the memory 420 for further processing.
- WiFi is a short-range wireless transmission technology. Through the WiFi module 470, the mobile phone can help the user send and receive e-mails, browse web pages, access streaming media, and so on; it provides the user with wireless broadband Internet access. Although FIG. 17 shows the WiFi module 470, it can be understood that the module is not an essential part of the mobile phone and may be omitted as needed without changing the essence of the invention.
- The processor 480 is the control center of the mobile phone. It connects all parts of the entire mobile phone using various interfaces and lines, and performs the various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 420 and invoking data stored in the memory 420, thereby monitoring the mobile phone as a whole. Optionally, the processor 480 may include one or more processing units. Optionally, the processor 480 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interfaces, applications, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 480.
- The mobile phone also includes a power supply 490 (such as a battery) that supplies power to the various components. Optionally, the power supply may be logically connected to the processor 480 through a power management system, so that functions such as charging, discharging, and power consumption management are implemented through the power management system.
- the mobile phone may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
- In this embodiment of the present application, the memory 420 included in the terminal is used to store a program, and the processor 480 is configured to execute the program in the memory 420 to implement the method for determining camera pose information described in the foregoing embodiments.
- An embodiment of the present application further provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to perform the method for determining camera pose information described in the foregoing embodiments.
- In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
- If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present application, or the part of it contributing to the related art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Claims (22)
- 1. A method for determining camera pose information, applied to a mobile terminal having a camera, the method comprising: acquiring a first image, a second image, and a template image, wherein the first image is the previous frame image of the second image, the first image and the second image are images acquired by the camera, and the template image is a reference image used for matching; performing feature point detection on first feature points in the template image and second feature points in the second image to obtain a first homography matrix; determining a first target homography matrix according to first optical flow feature points in the first image and second optical flow feature points in the second image, and determining a second homography matrix according to the first target homography matrix and a second target homography matrix, the second target homography matrix being the homography matrix from the template image to the first image; and performing complementary filtering processing on the first homography matrix and the second homography matrix to obtain camera pose information of the camera.
- 2. The method according to claim 1, wherein performing complementary filtering processing on the first homography matrix and the second homography matrix to obtain the camera pose information of the camera comprises: determining a first rotation-translation matrix according to the first homography matrix and a second rotation-translation matrix according to the second homography matrix, wherein the first homography matrix and the second homography matrix are two-dimensional information, and the first rotation-translation matrix and the second rotation-translation matrix are three-dimensional information; and performing complementary filtering processing on the first rotation-translation matrix and the second rotation-translation matrix to obtain the camera pose information.
- 3. The method according to claim 2, wherein performing complementary filtering processing on the first rotation-translation matrix and the second rotation-translation matrix to obtain the camera pose information comprises: inputting the first rotation-translation matrix to a low-pass filter to obtain a first filtering result; inputting the second rotation-translation matrix to a high-pass filter to obtain a second filtering result; and determining the camera pose information according to the first filtering result and the second filtering result.
- 4. The method according to claim 2, wherein determining the first rotation-translation matrix according to the first homography matrix and the second rotation-translation matrix according to the second homography matrix comprises: calculating the first rotation-translation matrix according to the first homography matrix, a perspective projection matrix of the second image, and a perspective projection matrix of the template image; and calculating the second rotation-translation matrix according to the second homography matrix, the perspective projection matrix of the second image, and the perspective projection matrix of the template image.
- 5. The method according to any one of claims 1 to 4, wherein the template image corresponds to a plurality of grids arranged in an array, and performing feature point detection on the first feature points in the template image and the second feature points in the second image to obtain the first homography matrix comprises: matching the first feature points in the template image with the second feature points in the second image, and determining one feature point pair in each target grid of the plurality of grids, the feature point pair comprising a first feature point located in the target grid and the second feature point having the largest matching degree with that first feature point; and calculating the first homography matrix between the template image and the second image according to the feature point pairs in the target grids.
- 6. The method according to claim 5, further comprising: extracting the first feature points from each image layer of the template image, and determining the first feature points in an original image layer, wherein the original image layer is one image layer of the template image and contains the plurality of grids.
- 7. The method according to claim 6, wherein extracting the first feature points from each image layer of the template image and determining the first feature points in the original image layer comprises: extracting the first feature points from a first image layer of the template image; extracting the first feature points from a second image layer of the template image, wherein the first image layer and the second image layer have different sizes; and scaling the first feature points in the first image layer and the first feature points in the second image layer and projecting them onto the original image layer.
- 8. The method according to claim 5, wherein before matching the first feature points in the template image with the second feature points in the second image, the method further comprises: extracting the second feature points from the second image; and matching the first feature points in the template image with the second feature points in the second image and determining one feature point pair in each target grid of the plurality of grids comprises: for the first feature points in each target grid of the original image layer, matching the first feature points with the second feature points to obtain at least one pair of mutually matching candidate feature point pairs, each candidate feature point pair corresponding to one matching score; and selecting, from the at least one candidate feature point pair, the feature point pair with the highest matching score as the feature point pair determined in the target grid.
- 9. The method according to any one of claims 1 to 4, wherein determining the first target homography matrix according to the first optical flow feature points in the first image and the second optical flow feature points in the second image comprises: acquiring the first optical flow feature points in a preset area of the first image, the preset area being the area corresponding to the template image; acquiring the second optical flow feature points according to the first optical flow feature points; and calculating the first target homography matrix from the first image to the second image according to the first optical flow feature points and the second optical flow feature points.
- 10. The method according to claim 9, further comprising: if the number of the second optical flow feature points is less than a preset threshold, acquiring q optical flow feature points as the second optical flow feature points so that the number of the second optical flow feature points reaches the preset threshold, q being a positive integer.
- 11. An apparatus for determining camera pose information, the apparatus having a camera and comprising: a first acquiring module, configured to acquire a first image, a second image, and a template image, wherein the first image is the previous frame image of the second image, the first image and the second image are images acquired by the camera, and the template image is a reference image used for matching; a detecting module, configured to perform feature point detection on first feature points in the template image and second feature points in the second image to obtain a first homography matrix; a tracking module, configured to determine a first target homography matrix according to first optical flow feature points in the first image and second optical flow feature points in the second image, and determine a second homography matrix according to the first target homography matrix and a second target homography matrix, the second target homography matrix being the homography matrix from the template image to the first image; and a complementary filtering module, configured to perform complementary filtering processing on the first homography matrix and the second homography matrix to obtain camera pose information of the camera.
- 12. The apparatus according to claim 11, wherein the complementary filtering module comprises: a determining unit, configured to determine a first rotation-translation matrix according to the first homography matrix and a second rotation-translation matrix according to the second homography matrix, wherein the first homography matrix and the second homography matrix are two-dimensional information, and the first rotation-translation matrix and the second rotation-translation matrix are three-dimensional information; and a processing unit, configured to perform complementary filtering processing on the first rotation-translation matrix and the second rotation-translation matrix to obtain the camera pose information.
- 13. The apparatus according to claim 12, wherein the processing unit comprises: a first input subunit, configured to input the first rotation-translation matrix to a low-pass filter to obtain a first filtering result; a second input subunit, configured to input the second rotation-translation matrix to a high-pass filter to obtain a second filtering result; and a determining subunit, configured to determine the camera pose information according to the first filtering result and the second filtering result.
- 14. The apparatus according to claim 12, wherein the determining unit comprises: a first calculation subunit, configured to calculate the first rotation-translation matrix according to the first homography matrix, a perspective projection matrix of the second image, and a perspective projection matrix of the template image; and a second calculation subunit, configured to calculate the second rotation-translation matrix according to the second homography matrix, the perspective projection matrix of the second image, and the perspective projection matrix of the template image.
- 15. The apparatus according to any one of claims 11 to 14, wherein the template image corresponds to a plurality of grids arranged in an array, and the detecting module comprises: a matching module, configured to match the first feature points in the template image with the second feature points in the second image and determine one feature point pair in each target grid of the plurality of grids, the feature point pair comprising a first feature point located in the target grid and the second feature point having the largest matching degree with that first feature point; and a first determining module, configured to calculate the first homography matrix between the template image and the second image according to the feature point pairs in the target grids.
- 16. The apparatus according to claim 15, further comprising: a first extraction module, configured to extract the first feature points from each image layer of the template image and determine the first feature points in an original image layer, wherein the original image layer is one image layer of the template image and contains the plurality of grids.
- 17. The apparatus according to claim 16, wherein the first extraction module comprises: a first extracting unit, configured to extract the first feature points from a first image layer of the template image; a second extracting unit, configured to extract the first feature points from a second image layer of the template image, wherein the first image layer and the second image layer have different sizes; and a projection unit, configured to scale the first feature points in the first image layer and the first feature points in the second image layer and project them onto the original image layer.
- 18. The apparatus according to claim 15, further comprising: a second extraction module, configured to extract the second feature points from the second image, wherein the matching module comprises: a matching unit, configured to match, for the first feature points in each target grid of the original image layer, the first feature points with the second feature points to obtain at least one pair of mutually matching candidate feature point pairs, each candidate feature point pair corresponding to one matching score; and a selecting unit, configured to select, from the at least one candidate feature point pair, the feature point pair with the highest matching score as the feature point pair determined in the target grid.
- 19. The apparatus according to any one of claims 11 to 14, wherein the tracking module comprises: a first acquiring unit, configured to acquire the first optical flow feature points in a preset area of the first image, the preset area being the area corresponding to the template image; a second acquiring unit, configured to acquire the second optical flow feature points according to the first optical flow feature points; and a first calculating unit, configured to calculate the first target homography matrix from the first image to the second image according to the first optical flow feature points and the second optical flow feature points.
- 20. The apparatus according to claim 19, further comprising: a second acquiring module, configured to acquire, if the number of the second optical flow feature points is less than a preset threshold, q optical flow feature points as the second optical flow feature points so that the number of the second optical flow feature points reaches the preset threshold, q being a positive integer.
- 21. A mobile terminal, comprising a processor and a memory, wherein the memory is configured to store a program, and the processor is configured to execute the program in the memory to implement the method for determining camera pose information according to any one of claims 1 to 12.
- 22. A computer-readable storage medium comprising instructions that, when run on a computer, cause the computer to perform the method for determining camera pose information according to any one of claims 1 to 12.