CN114782547A - Three-dimensional coordinate determination method and device


Info

Publication number
CN114782547A
CN114782547A (application number CN202210384618.7A)
Authority
CN
China
Prior art keywords
coordinate system
joint point
standard coordinate
standard
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210384618.7A
Other languages
Chinese (zh)
Inventor
詹渝
李锋海
翁仁亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Aibee Technology Co Ltd
Original Assignee
Beijing Aibee Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Aibee Technology Co Ltd filed Critical Beijing Aibee Technology Co Ltd
Priority to CN202210384618.7A priority Critical patent/CN114782547A/en
Publication of CN114782547A publication Critical patent/CN114782547A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T 7/85 Stereo camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30244 Camera pose

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a three-dimensional coordinate determination method: acquiring an image to be processed that includes a target object, where the image to be processed is a two-dimensional image, and then obtaining a ray vector of each joint point of the target object in a camera coordinate system based on the coordinates of that joint point in the image coordinate system of the image to be processed. The ray vector of each joint point in the camera coordinate system is then converted into a ray vector in a standard coordinate system, and the three-dimensional coordinates of each joint point in the standard coordinate system are obtained based on the ray vectors in the standard coordinate system; these three-dimensional coordinates can be used to determine the three-dimensional coordinates of the target object. Because both the ray vectors and the three-dimensional coordinates of the joint points are expressed in the standard coordinate system, the three-dimensional coordinates of each joint point in the standard coordinate system can be obtained more accurately and conveniently from the ray vectors in the standard coordinate system, so the 3D coordinates of the target object in the 2D image can be determined accurately.

Description

Three-dimensional coordinate determination method and device
Technical Field
The present application relates to the field of image processing, and in particular, to a method and an apparatus for determining three-dimensional coordinates.
Background
Monocular cameras may capture two-dimensional (2D) images. In some scenarios, it is desirable to determine a three-dimensional (3D) pose of a target object in a 2D image. The premise for determining the three-dimensional pose is to accurately determine the 3D coordinates of the target object in the 2D image.
How to accurately determine the 3D coordinates of the target object in the 2D image is a problem that needs to be solved at present.
Disclosure of Invention
The technical problem to be solved by the present application is how to accurately determine the 3D coordinates of a target object in a 2D image; to this end, the application provides a method and an apparatus for determining three-dimensional coordinates.
In a first aspect, an embodiment of the present application provides a method for determining three-dimensional coordinates, where the method includes:
acquiring an image to be processed including a target object, wherein the image to be processed is a two-dimensional image;
obtaining a ray vector of each joint point of the target object in a camera coordinate system based on the coordinates of each joint point in the image coordinate system in the image to be processed;
converting the ray vector of each joint point under a camera coordinate system into a ray vector under a standard coordinate system;
and obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the ray vector in the standard coordinate system, wherein the three-dimensional coordinates of each joint point in the standard coordinate system can be used for determining the three-dimensional coordinates of the target object.
Optionally, three-dimensional coordinates of each joint point in the standard coordinate system may be used to determine a three-dimensional posture of the target object.
Optionally, the obtaining three-dimensional coordinates of each joint point in the standard coordinate system based on the ray vector in the standard coordinate system includes:
determining coordinates of a reference joint point in the standard coordinate system and offsets of the joint points relative to the reference joint point in the standard coordinate system based on ray vectors in the standard coordinate system;
and obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the coordinates of the reference joint point in the standard coordinate system and the offset.
Optionally, the determining, based on the ray vector in the standard coordinate system, the coordinates of the reference joint point in the standard coordinate system includes:
and determining the coordinates of the reference joint point in the standard coordinate system based on the ray vector and the camera external reference feature vector in the standard coordinate system.
Optionally, the determining, based on the ray vector in the standard coordinate system, the coordinates of the reference joint point in the standard coordinate system includes:
and inputting the ray vector under the standard coordinate system into a first machine learning model to obtain the coordinate of the reference joint point under the standard coordinate system. The first machine learning model is used for obtaining coordinates of a reference joint point of a first object in a standard coordinate system based on ray vectors of a plurality of joint points of the first object in the standard coordinate system.
Optionally, the determining, based on the ray vector in the standard coordinate system, an offset of each joint point relative to the reference joint point in the standard coordinate system includes:
and inputting the ray vector under the standard coordinate system into a second machine learning model to obtain the offset of each joint point relative to the reference joint point under the standard coordinate system. Wherein:
the second machine learning model is configured to obtain offsets of the plurality of joint points of the second object relative to the reference joint point of the second object in a standard coordinate system based on the ray vectors of the plurality of joint points of the second object in the standard coordinate system.
Optionally, each joint point includes a first joint point, and a ray vector of the first joint point in the camera coordinate system is determined by:
obtaining a three-dimensional coordinate corresponding to the first joint point according to the coordinate of the first joint point in an image coordinate system and a preset depth value;
subtracting the coordinate of the central point of a camera for shooting the image to be processed from the three-dimensional coordinate corresponding to the first joint point to obtain an intermediate coordinate;
and dividing the intermediate coordinate by the focal length of the camera to obtain a ray vector of the first joint point in a camera coordinate system, wherein the origin of a ray corresponding to the ray vector is the central point of the camera.
Optionally, the standard coordinate system is obtained by:
rotating the camera coordinate system along an x-axis such that an x-z plane of the camera coordinate system is parallel to a ground plane; and translating the camera coordinate system along the y-axis direction such that the x-z plane of the camera coordinate system is coplanar with the ground plane.
In a second aspect, an embodiment of the present application provides a three-dimensional coordinate determination apparatus, including:
a first acquisition unit, used for acquiring an image to be processed comprising a target object, wherein the image to be processed is a two-dimensional image;
the first determining unit is used for obtaining ray vectors of all joint points of the target object in a camera coordinate system based on the coordinates of all joint points in an image coordinate system in the image to be processed;
the conversion unit is used for converting the ray vector of each joint point under a camera coordinate system into a ray vector under a standard coordinate system;
and the second determining unit is used for obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the ray vector in the standard coordinate system.
Optionally, the second determining unit is configured to:
determining coordinates of reference joint points in the standard coordinate system and offset of each joint point relative to the reference joint points in the standard coordinate system based on ray vectors in the standard coordinate system;
and obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the coordinates of the reference joint points in the standard coordinate system and the offset.
Optionally, the determining, based on the ray vector in the standard coordinate system, the coordinates of the reference joint point in the standard coordinate system includes:
and determining the coordinates of the reference joint point in the standard coordinate system based on the ray vector and the camera external reference feature vector in the standard coordinate system.
Optionally, the determining, based on the ray vector in the standard coordinate system, the coordinates of the reference joint point in the standard coordinate system includes:
and inputting the ray vector under the standard coordinate system into a first machine learning model to obtain the coordinate of the reference joint point under the standard coordinate system. The first machine learning model is used for obtaining coordinates of a reference joint point of a first object in a standard coordinate system based on ray vectors of a plurality of joint points of the first object in the standard coordinate system.
Optionally, the determining, based on the ray vector in the standard coordinate system, an offset of each joint point relative to the reference joint point in the standard coordinate system includes:
and inputting the ray vector under the standard coordinate system into a second machine learning model to obtain the offset of each joint point relative to the reference joint point under the standard coordinate system. Wherein:
the second machine learning model is configured to obtain offsets of the plurality of joint points of the second object relative to the reference joint point of the second object in a standard coordinate system based on the ray vectors of the plurality of joint points of the second object in the standard coordinate system.
Optionally, each joint point includes a first joint point, and a ray vector of the first joint point in the camera coordinate system is determined by:
obtaining a three-dimensional coordinate corresponding to the first joint point according to the coordinate of the first joint point in an image coordinate system and a preset depth value;
subtracting the coordinate of the central point of a camera for shooting the image to be processed from the three-dimensional coordinate corresponding to the first joint point to obtain an intermediate coordinate;
and dividing the intermediate coordinate by the focal length of the camera to obtain a ray vector of the first joint point in a camera coordinate system, wherein the origin of a ray corresponding to the ray vector is the central point of the camera.
Optionally, the standard coordinate system is obtained by:
rotating the camera coordinate system along an x-axis such that an x-z plane of the camera coordinate system is parallel to a ground plane; and translating the camera coordinate system along the y-axis direction such that the x-z plane of the camera coordinate system is coplanar with the ground plane.
In a third aspect, an embodiment of the present application provides an apparatus, where the apparatus includes: a processor, a memory, and a system bus; the processor and the memory are connected through the system bus; the memory is configured to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform the method of any one of the above first aspects.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored therein instructions that, when executed on a terminal device, cause the terminal device to perform the method of any one of the above first aspects.
Compared with the prior art, the embodiment of the application has the following advantages:
An embodiment of the present application provides a three-dimensional coordinate determination method. In one example, an image to be processed including a target object may be obtained, where the image to be processed is a two-dimensional image, and then, based on the coordinates of each joint point of the target object in the image coordinate system in the image to be processed, a ray vector of each joint point in a camera coordinate system is obtained. The ray vector can eliminate the influence of the camera internal parameters on the image to be processed. Further, the ray vector of each joint point in the camera coordinate system is converted into a ray vector in a standard coordinate system, so as to eliminate the influence of the camera external parameters on the ray vector. After the ray vectors in the standard coordinate system are obtained, the three-dimensional coordinates of each joint point in the standard coordinate system may be obtained based on these ray vectors, and those three-dimensional coordinates may be used to determine the three-dimensional pose of the target object. Because both the ray vectors and the three-dimensional coordinates of the joint points are expressed in the standard coordinate system, the three-dimensional coordinates of each joint point in the standard coordinate system can be obtained more accurately and conveniently from the ray vectors in the standard coordinate system. Therefore, in application scenarios that require the 3D pose of a target object, the method and the apparatus help to accurately determine the 3D pose of the target object in the 2D image.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic flowchart of a three-dimensional coordinate determination method according to an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart illustrating a method for determining three-dimensional coordinates of a joint point in a standard coordinate system according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a three-dimensional coordinate determination apparatus according to an embodiment of the present application.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
Various non-limiting embodiments of the present application are described in detail below with reference to the accompanying drawings.
Exemplary method
Referring to fig. 1, the figure is a schematic flowchart of a three-dimensional coordinate determination method provided in an embodiment of the present application. In this embodiment, the method may be executed by a server or a terminal device; this is not specifically limited in the embodiments of the present application.
In one example, the method shown in fig. 1 may include, for example, the steps of: S101-S104.
S101, acquiring an image to be processed comprising a target object, wherein the image to be processed is a two-dimensional image.
In one example, the image to be processed may be an image taken by a monocular camera.
The embodiment of the present application does not specifically limit a specific implementation manner for acquiring the to-be-processed image. In one example, the image to be processed may be read from a memory of a monocular camera.
The target object in the embodiments of the present application may be a human or an animal, and is not limited herein.
S102, obtaining ray vectors of all joint points of the target object in a camera coordinate system based on the coordinates of all joint points in an image coordinate system in the image to be processed.
When the target object is a human, the joint point of the target object is a human body joint point. When the target object is an animal, the joint point of the target object is an animal joint point.
In the embodiment of the present application, considering the influence of the camera parameters, the 2-dimensional images captured by a monocular camera may be the same for target objects with different 3-dimensional coordinates. In other words, one 2-dimensional image may correspond to a plurality of 3-dimensional coordinates. In view of this, in the embodiment of the present application, a ray vector of each joint point of the target object in the camera coordinate system may be obtained based on the coordinates of the joint point in the image coordinate system in the image to be processed. The ray vector removes the effect of the camera internal parameters.
In one example, the coordinates of the respective joint points of the target object in the image coordinate system may be determined first. Wherein the coordinates of each joint point in the image coordinate system are two-dimensional coordinates. And then, obtaining the ray vector of each joint point of the target object in the camera coordinate system based on the coordinate of each joint point in the image coordinate system in the image to be processed. Wherein the camera coordinate system is a three-dimensional coordinate system.
For convenience of description, any one of the respective joint points will be referred to as a "first joint point". Next, taking the first joint as an example, a specific implementation of S102 will be described.
As described above, the image coordinate system is a two-dimensional coordinate system, and the camera coordinate system is a three-dimensional coordinate system, so that the ray vector corresponding to the first joint point is a three-dimensional vector. And the coordinate of the first joint point in the image coordinate system is a two-dimensional coordinate without corresponding depth information. In view of this, in the embodiment of the present application, the depth coordinate of the first joint point may be determined in advance, and then, based on the depth coordinate of the first joint point and the two-dimensional coordinate of the first joint point in the image coordinate system, the ray vector corresponding to the first joint point may be determined.
As an example, the depth coordinate of the first joint point may be set to a preset depth value, e.g., to 1. And then, determining a ray vector corresponding to the first joint point based on the coordinate of the first joint point in the image coordinate system and a preset depth value.
In a specific example, the three-dimensional coordinates corresponding to the first joint point may be obtained according to the coordinates of the first joint point in the image coordinate system and a preset depth value. And then, subtracting the coordinate of the central point of the camera for shooting the image to be processed from the three-dimensional coordinate corresponding to the first joint point to obtain an intermediate coordinate. The coordinate of the central point of the camera is a three-dimensional coordinate in a camera coordinate system, and the middle coordinate is also a three-dimensional coordinate. Further, dividing the intermediate coordinate by the focal length of the camera to obtain a ray vector of the first joint point in a camera coordinate system, wherein an origin of a ray corresponding to the ray vector is a central point of the camera.
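As an illustrative sketch of this calculation in Python (the principal point (cx, cy) is assumed here to play the role of the camera's central point and f to be the focal length in pixels; these symbols and the example numbers are not taken from the application):

```python
import numpy as np

def joint_ray_vector(u, v, cx, cy, f, preset_depth=1.0):
    """Ray vector of a joint point in the camera coordinate system.

    (u, v): coordinates of the joint point in the image coordinate system.
    (cx, cy): coordinates of the camera's central point (assumed here to be
    the principal point from the camera intrinsics).
    f: focal length of the camera, in pixels.
    preset_depth: the preset depth value (for example, 1).
    """
    # Lift the 2D joint point to a provisional 3D coordinate with the preset depth.
    point_3d = np.array([u, v, preset_depth], dtype=float)
    # Subtract the camera central point to obtain the intermediate coordinate.
    intermediate = point_3d - np.array([cx, cy, 0.0])
    # Divide by the focal length; the origin of the resulting ray is the camera central point.
    return intermediate / f

# Example with illustrative values only.
ray_cam = joint_ray_vector(u=812.0, v=455.0, cx=960.0, cy=540.0, f=1080.0)
```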
And S103, converting the ray vector of each joint point in the camera coordinate system into the ray vector in the standard coordinate system.
For a ray vector of the first joint point in the camera coordinate system, the ray vector has a certain correlation with the external parameters of the camera. The external parameters of the camera may include, for example, the height of the camera, the pitch angle of the camera, and the like.
In order to remove the influence of the camera external parameters on the ray vector of the first joint point in the camera coordinate system, and thus determine the 3D coordinates of the target object based on the camera-independent features, in the embodiment of the present application, the ray vectors of the joint points in the camera coordinate system may be converted into the ray vectors in the standard coordinate system.
With regard to the standard coordinate system, it should be noted that the standard coordinate system is another three-dimensional coordinate system different from the camera coordinate system. The standard coordinate system is not particularly limited in the embodiments of the present application. In one example, the standard coordinate system may be derived by transforming the camera coordinate system, e.g., the camera coordinate system may be rotated along an x-axis such that an x-z plane of the camera coordinate system is parallel to a ground plane; and translating the camera coordinate system along the y-axis direction to make the x-z plane of the camera coordinate system coplanar with the ground plane, thereby obtaining the standard coordinate system.
In yet another example, the standard coordinate system may be a world coordinate system.
After the standard coordinate system is determined, the ray vectors of the joint points in the camera coordinate system may be converted into the ray vectors in the standard coordinate system based on the relationship between the camera coordinate system and the standard coordinate system.
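A minimal sketch of this conversion, under the assumption that the camera external parameters reduce to a pitch angle about the x-axis and a camera height along the y-axis (matching the rotation-plus-translation construction above; axis and sign conventions depend on the actual camera setup), is:

```python
import numpy as np

def camera_ray_to_standard(ray_cam, pitch_rad, camera_height):
    """Express a camera-frame ray in the standard coordinate system.

    pitch_rad: rotation about the x-axis that makes the camera's x-z plane
    parallel to the ground plane (from the camera external parameters).
    camera_height: translation along the y-axis that places the x-z plane
    on the ground plane.
    Returns the ray direction and the ray origin (the camera central point)
    expressed in the standard coordinate system.
    """
    c, s = np.cos(pitch_rad), np.sin(pitch_rad)
    rot_x = np.array([[1.0, 0.0, 0.0],   # rotation about the x-axis
                      [0.0,   c,  -s],
                      [0.0,   s,   c]])
    direction_std = rot_x @ np.asarray(ray_cam, dtype=float)  # directions only rotate
    origin_std = np.array([0.0, camera_height, 0.0])          # camera center after the y translation
    return direction_std, origin_std
```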
And S104, obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the ray vector in the standard coordinate system.
After the ray vector of each joint point in the standard coordinate system is determined, the three-dimensional coordinates of each joint point in the standard coordinate system may be obtained based on the ray vector in the standard coordinate system, and in an example, the three-dimensional posture of the target object may also be determined based on the three-dimensional coordinates of each joint point in the standard coordinate system.
In the embodiment of the application, because the ray vector and the three-dimensional coordinates of each joint point are coordinates in a standard coordinate system, the three-dimensional coordinates of each joint point in the standard coordinate system can be more accurately and conveniently obtained based on the ray vector in the standard coordinate system.
After determining the three-dimensional coordinates of each joint point in the standard coordinate system, in one example, if the standard coordinate system is different from the world coordinate system, the three-dimensional coordinates of each joint point in the standard coordinate system may be converted into the three-dimensional coordinates of each joint point in the world coordinate system. After the three-dimensional coordinates of each joint point in the world coordinate system are obtained, adjacent coordinate points are connected to obtain the three-dimensional pose of the target object. In another example, if the standard coordinate system is the world coordinate system, after the three-dimensional coordinates of each joint point in the standard coordinate system are obtained, the three-dimensional pose of the target object may be obtained by connecting adjacent coordinate points in the standard coordinate system.
In S104, "obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the ray vector in the standard coordinate system" may be implemented in various ways.
In one example, a model may be trained to determine the three-dimensional coordinates of each joint point in the standard coordinate system from the ray vectors of the joint points in the standard coordinate system, and the ray vectors of the joint points in the standard coordinate system may then be input into the model to obtain those three-dimensional coordinates. However, in practice such a model is difficult to converge, so it is difficult to train a model with a good effect in this way, and it is therefore also difficult to accurately and conveniently obtain the three-dimensional coordinates of each joint point in the standard coordinate system in this way.
To avoid this problem, in another example, S104, when implemented in detail, may include S201-S202 shown in fig. 2. Fig. 2 is a flowchart illustrating a method for determining three-dimensional coordinates of a joint point in a standard coordinate system according to an embodiment of the present disclosure.
S201: and determining the coordinates of the reference joint point in the standard coordinate system and the offset of each joint point relative to the reference joint point in the standard coordinate system based on the ray vector in the standard coordinate system.
In the embodiment of the present application, the three-dimensional coordinates of the joint points under the standard coordinate system can be divided into two parts, one part is the coordinates of the reference joint points under the standard coordinate system, and the other part is the offset of each joint point relative to the reference joint points under the standard coordinate system. Compared with the method for directly determining the three-dimensional coordinates of the joint point in the standard coordinate system, the determination process of the two parts is simpler, and the determination result is relatively more accurate.
Regarding the reference joint point, it should be noted that it may be any one of the joint points, and the embodiments of the present application are not particularly limited. The reference joint point may be predetermined, for example, the joint point located at the pelvic position.
Next, a specific implementation of "determining coordinates of a reference joint point in the standard coordinate system based on a ray vector in the standard coordinate system" is described.
In one example, coordinates of a reference joint point in the standard coordinate system may be determined based directly on a ray vector in the standard coordinate system. For example, a first machine learning model is trained in advance, and the first machine learning model is used for obtaining coordinates of a reference joint point of a first object in a standard coordinate system based on ray vectors of a plurality of joint points of the first object in the standard coordinate system. The first object mentioned herein may include any object belonging to the same category as the target object. For example, if the target object is a person, the first object may include any person. In other words, the first object may comprise the target object. Therefore, the trained first machine learning model may be used to determine coordinates of the reference joint point of the target object in a standard coordinate system based on the ray vectors of the joint points of the target object in the standard coordinate system. For this case, ray vectors of the plurality of joint points of the target object in a standard coordinate system may be directly input to the first machine learning model to obtain coordinates of the reference joint point in the standard coordinate system.
With respect to the first machine learning model, it should be noted that, in one example, the first machine learning model may be trained based on a training ray vector and a training reference joint coordinate. The training ray vector may include a ray vector of a plurality of joint points of a training object in a standard coordinate system, and the training reference joint point coordinates may include coordinates of a reference joint point of the training object in the standard coordinate system. The training object referred to herein may be an object included in a training image. In addition, the ray vectors of the multiple joint points of the training object in the standard coordinate system may be determined based on the coordinates of the respective joint points of the training object in the image coordinate system, and the specific determination process is similar to an implementation manner of determining the ray vectors of the respective joint points of the target object in the standard coordinate system based on the coordinates of the respective joint points of the target object in the image coordinate system in the image to be processed, so that the relevant contents may refer to S101-S102, and the description is not repeated here. It is easy to understand that, since the training ray vector is not influenced by the external parameters of the camera, nor the internal parameters of the camera, the generalization effect of the trained first machine learning model is better. Moreover, the training data of the model are all in a standard coordinate system, and the first machine learning model is easier to converge.
In yet another example, the coordinates of the reference joint point in the standard coordinate system may be determined in combination with the ray vector in the standard coordinate system and the camera external reference, taking into account that the camera external reference is of some help in determining the coordinates of the reference joint point in the standard coordinate system. In one specific example, the camera external reference feature vector may be determined based on the camera external reference, for example, the camera external reference may be processed by using a plurality of full connection layers to obtain the camera external reference feature vector, and then, based on the ray vector in the standard coordinate system and the camera external reference feature vector, the coordinates of the reference joint point in the standard coordinate system may be determined. As an example, a ray vector and a camera-external reference feature vector in the standard coordinate system may be input into a first machine learning model, so as to obtain coordinates of the reference joint point in the standard coordinate system. For this case, the first machine learning model may be trained based on training ray vectors, training camera external reference feature vectors, and training reference joint coordinates. As for the training camera extrinsic reference feature vector, it should be noted that it can be obtained from the camera extrinsic reference corresponding to the training image. For example, the camera external parameters corresponding to the training images are processed by a plurality of full connection layers.
The structure of the first machine learning model is not specifically limited in the embodiment of the present application, and the first machine learning model may be any Deep Neural Network (DNN) model that takes two-dimensional key points as inputs, which is not necessarily illustrated herein.
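For illustration only, such a model could be a small fully connected network; the layer widths, activations, and the way the camera external reference feature vector is concatenated are assumptions rather than details specified by the application:

```python
import torch
import torch.nn as nn

class ReferenceJointRegressor(nn.Module):
    """Hypothetical sketch of the first machine learning model.

    Input: ray vectors of the J joint points in the standard coordinate
    system (J x 3 values), optionally concatenated with a camera external
    reference feature vector. Output: the three-dimensional coordinates of
    the reference joint point in the standard coordinate system. The second
    machine learning model described below can share this structure, with an
    output of size J x 3 (per-joint offsets) instead of 3.
    """

    def __init__(self, num_joints: int, extrinsic_dim: int = 0, hidden: int = 256):
        super().__init__()
        in_dim = num_joints * 3 + extrinsic_dim
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # x, y, z of the reference joint point
        )

    def forward(self, rays: torch.Tensor, extrinsic_feat: torch.Tensor = None) -> torch.Tensor:
        x = rays.flatten(start_dim=1)                  # (batch, J * 3)
        if extrinsic_feat is not None:
            x = torch.cat([x, extrinsic_feat], dim=1)  # append camera external reference features
        return self.mlp(x)                             # (batch, 3)
```

For example, ReferenceJointRegressor(num_joints=17) would accept a (batch, 17, 3) tensor of ray vectors in the standard coordinate system.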
Next, a specific implementation of "determining the offset of each joint point relative to the reference joint point in the standard coordinate system based on the ray vector in the standard coordinate system" will be described.
In one example, the offset of each joint point relative to the reference joint point in the standard coordinate system may be determined based directly on a ray vector in the standard coordinate system. For example, a second machine learning model is trained in advance, and the second machine learning model is used for obtaining offsets of a plurality of joint points of a second object relative to a reference joint point of the second object in a standard coordinate system based on ray vectors of the plurality of joint points of the second object in the standard coordinate system. The second object mentioned here may include any object belonging to the same category as the target object. For example, if the target object is a person, the second object may include any person. In other words, the second object may comprise the target object. Therefore, the second machine learning model may be configured to determine, based on the ray vectors of the plurality of joint points of the target object in a standard coordinate system, the offset of each joint point of the target object relative to the reference joint point of the target object in the standard coordinate system. For this case, the ray vectors of the plurality of joint points of the target object in the standard coordinate system may be directly input into the second machine learning model to obtain the offset of each joint point of the target object in the standard coordinate system with respect to the reference joint point of the target object.
Regarding the second machine learning model, it should be noted that, in an example, the second machine learning model may be trained based on a training ray vector and a training offset. Wherein the training ray vector may comprise a ray vector of a plurality of joint points of a training object in a standard coordinate system, and the training offset may comprise an offset of each joint point of the training object relative to the reference joint point in the standard coordinate system. It is understood that, since the training ray vector is not influenced by the camera external parameters and the camera internal parameters, the generalization effect of the trained second machine learning model is better. Moreover, the training data of the model are all in a standard coordinate system, and the second machine learning model is easier to converge.
In yet another example, the offset of each joint point relative to the reference joint point in the standard coordinate system may be determined in conjunction with a ray vector in the standard coordinate system and a camera external reference, taking into account that the camera external reference assists in determining the offset of each joint point relative to the reference joint point in the standard coordinate system. In a specific example, the offset of each joint point relative to the reference joint point in the standard coordinate system may be determined based on a ray vector in the standard coordinate system and a camera-external reference feature vector. As an example, the ray vector and the camera-external reference feature vector in the standard coordinate system may be input into a second machine learning model, so as to obtain the offset of each joint point relative to the reference joint point in the standard coordinate system. For this case, the second machine learning model may be trained based on a training ray vector, a training camera extrinsic feature vector, and a training offset.
Regarding the external reference feature vector of the camera, reference may be made to the above related description section, and details are not described here.
The structure of the second machine learning model is not specifically limited in the embodiments of the present application, and the second machine learning model may be any DNN model that takes two-dimensional key points as inputs, which is not necessarily illustrated here.
S202: and obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the coordinates of the reference joint points in the standard coordinate system and the offset.
After the coordinates and the offset of the reference joint point in the standard coordinate system are obtained, the coordinates and the offset of the reference joint point in the standard coordinate system may be added to obtain the three-dimensional coordinates of each joint point in the standard coordinate system. Taking the first joint point as an example, the offset of the first joint point relative to the reference joint point in the standard coordinate system may be added to the coordinate of the reference joint point in the standard coordinate system to obtain the three-dimensional coordinate of the first joint point in the standard coordinate system.
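With purely illustrative numbers (not taken from the application), S202 reduces to a single element-wise addition:

```python
import numpy as np

# Coordinates of the reference joint point in the standard coordinate system.
root_xyz = np.array([0.20, 0.90, 3.50])

# Offsets of each joint point relative to the reference joint point in the
# standard coordinate system (one row per joint point; values are illustrative).
offsets = np.array([
    [ 0.00, 0.00, 0.00],   # the reference joint point itself
    [ 0.05, 0.45, 0.02],   # e.g. a shoulder joint point
    [-0.05, 0.45, 0.02],
])

# Three-dimensional coordinates of each joint point in the standard coordinate system.
joints_xyz = root_xyz + offsets
```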
Exemplary device
Based on the method provided by the above embodiment, the embodiment of the present application further provides an apparatus, which is described below with reference to the accompanying drawings.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a three-dimensional coordinate determination apparatus according to an embodiment of the present application. The apparatus 300 may specifically include, for example: a first acquisition unit 301, a first determination unit 302, a conversion unit 303, and a second determination unit 304.
A first acquiring unit 301, configured to acquire an image to be processed including a target object, where the image to be processed is a two-dimensional image;
a first determining unit 302, configured to obtain, based on coordinates of each joint point of the target object in the image coordinate system in the image to be processed, a ray vector of each joint point in a camera coordinate system;
a converting unit 303, configured to convert the ray vector of each joint point in the camera coordinate system into a ray vector in a standard coordinate system;
a second determining unit 304, configured to obtain three-dimensional coordinates of each joint point in the standard coordinate system based on the ray vector in the standard coordinate system.
Optionally, the second determining unit 304 is configured to:
determining coordinates of reference joint points in the standard coordinate system and offset of each joint point relative to the reference joint points in the standard coordinate system based on ray vectors in the standard coordinate system;
and obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the coordinates of the reference joint points in the standard coordinate system and the offset.
Optionally, the determining, based on the ray vector in the standard coordinate system, the coordinates of the reference joint point in the standard coordinate system includes:
and determining the coordinates of the reference joint point in the standard coordinate system based on the ray vector and the camera external reference feature vector in the standard coordinate system.
Optionally, the determining, based on the ray vector in the standard coordinate system, the coordinates of the reference joint point in the standard coordinate system includes:
and inputting the ray vector under the standard coordinate system into a first machine learning model to obtain the coordinate of the reference joint point under the standard coordinate system. Wherein:
the first machine learning model is used for obtaining coordinates of a reference joint point of a first object in a standard coordinate system based on ray vectors of a plurality of joint points of the first object in the standard coordinate system.
Optionally, the determining, based on the ray vector in the standard coordinate system, the offset of each joint point relative to the reference joint point in the standard coordinate system includes:
and inputting the ray vector under the standard coordinate system into a second machine learning model to obtain the offset of each joint point relative to the reference joint point under the standard coordinate system. Wherein:
the second machine learning model is configured to obtain offsets of the plurality of joint points of the second object relative to the reference joint point of the second object in a standard coordinate system based on ray vectors of the plurality of joint points of the second object in the standard coordinate system.
Optionally, each joint point includes a first joint point, and a ray vector of the first joint point in the camera coordinate system is determined by:
obtaining a three-dimensional coordinate corresponding to the first joint point according to the coordinate of the first joint point in an image coordinate system and a preset depth value;
subtracting the coordinate of the central point of a camera for shooting the image to be processed from the three-dimensional coordinate corresponding to the first joint point to obtain an intermediate coordinate;
and dividing the intermediate coordinate by the focal length of the camera to obtain a ray vector of the first joint point in a camera coordinate system, wherein the origin of a ray corresponding to the ray vector is the central point of the camera.
Optionally, the standard coordinate system is obtained by:
rotating the camera coordinate system along an x-axis such that an x-z plane of the camera coordinate system is parallel to a ground plane; and translating the camera coordinate system along the y-axis direction such that the x-z plane of the camera coordinate system is coplanar with the ground plane.
Since the apparatus 300 is an apparatus corresponding to the method provided in the above method embodiment, and the specific implementation of each unit of the apparatus 300 is the same as that of the above method embodiment, for the specific implementation of each unit of the apparatus 300, reference may be made to the description part of the above method embodiment, and details are not repeated here.
An embodiment of the present application further provides an apparatus, including: a processor, a memory, and a system bus; the processor and the memory are connected through the system bus; the memory is configured to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform the method described in any one of the above method embodiments.
The embodiment of the present application provides a computer-readable storage medium, in which instructions are stored, and when the instructions are executed on a terminal device, the terminal device is caused to execute the method described in any one of the above method embodiments.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements that have been described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A method of three-dimensional coordinate determination, the method comprising:
acquiring an image to be processed comprising a target object, wherein the image to be processed is a two-dimensional image;
obtaining ray vectors of all joint points of the target object in a camera coordinate system based on the coordinates of all joint points in an image coordinate system in the image to be processed;
converting the ray vector of each joint point under a camera coordinate system into a ray vector under a standard coordinate system;
and obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the ray vector in the standard coordinate system, wherein the three-dimensional coordinates of each joint point in the standard coordinate system are used for determining the three-dimensional coordinates of the target object.
2. The method of claim 1, wherein obtaining three-dimensional coordinates of the respective joint point in the standard coordinate system based on the ray vector in the standard coordinate system comprises:
determining coordinates of a reference joint point in the standard coordinate system and offsets of the joint points relative to the reference joint point in the standard coordinate system based on ray vectors in the standard coordinate system;
and obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the coordinates of the reference joint points in the standard coordinate system and the offset.
3. The method of claim 2, wherein determining coordinates of a reference joint point in the standard coordinate system based on the ray vectors in the standard coordinate system comprises:
and determining the coordinates of the reference joint point in the standard coordinate system based on the ray vector and the camera external reference feature vector in the standard coordinate system.
4. The method of claim 2, wherein determining coordinates of a reference joint point in the standard coordinate system based on the ray vectors in the standard coordinate system comprises:
inputting the ray vector under the standard coordinate system into a first machine learning model to obtain the coordinate of the reference joint point under the standard coordinate system; wherein:
the first machine learning model is used for obtaining coordinates of a reference joint point of a first object in a standard coordinate system based on ray vectors of a plurality of joint points of the first object in the standard coordinate system.
5. The method of claim 2, wherein said determining an offset of said respective joint point relative to said reference joint point in said standard coordinate system based on a ray vector in said standard coordinate system comprises:
inputting the ray vector under the standard coordinate system into a second machine learning model to obtain the offset of each joint point relative to the reference joint point under the standard coordinate system; wherein:
the second machine learning model is configured to obtain offsets of the plurality of joint points of the second object relative to the reference joint point of the second object in a standard coordinate system based on the ray vectors of the plurality of joint points of the second object in the standard coordinate system.
6. The method of claim 1, wherein each joint point comprises a first joint point, and wherein a ray vector of the first joint point in a camera coordinate system is determined by:
obtaining a three-dimensional coordinate corresponding to the first joint point according to the coordinate of the first joint point in an image coordinate system and a preset depth value;
subtracting the coordinate of the central point of a camera for shooting the image to be processed from the three-dimensional coordinate corresponding to the first joint point to obtain an intermediate coordinate;
and dividing the intermediate coordinate by the focal length of the camera to obtain a ray vector of the first joint point in a camera coordinate system, wherein the origin of a ray corresponding to the ray vector is the central point of the camera.
7. The method of claim 1, wherein the standard coordinate system is obtained by:
rotating the camera coordinate system along an x-axis such that an x-z plane of the camera coordinate system is parallel to a ground plane; and translating the camera coordinate system along the y-axis direction such that the x-z plane of the camera coordinate system is coplanar with the ground plane.
8. A three-dimensional coordinate determination apparatus, characterized in that the apparatus comprises:
a first acquisition unit, used for acquiring an image to be processed comprising a target object, wherein the image to be processed is a two-dimensional image;
the first determining unit is used for obtaining a ray vector of each joint point of the target object in a camera coordinate system based on the coordinates of each joint point in the image coordinate system in the image to be processed;
the conversion unit is used for converting the ray vector of each joint point under a camera coordinate system into a ray vector under a standard coordinate system;
and the second determining unit is used for obtaining the three-dimensional coordinates of each joint point in the standard coordinate system based on the ray vector in the standard coordinate system, wherein the three-dimensional coordinates of each joint point in the standard coordinate system are used for determining the three-dimensional coordinates of the target object.
9. An apparatus, characterized in that the apparatus comprises: a processor, a memory, and a system bus; the processor and the memory are connected through the system bus; the memory is used to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform the method of any one of claims 1 to 7.
10. A computer-readable storage medium having stored therein instructions which, when run on a terminal device, cause the terminal device to perform the method of any one of claims 1 to 7.
CN202210384618.7A 2022-04-13 2022-04-13 Three-dimensional coordinate determination method and device Pending CN114782547A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210384618.7A CN114782547A (en) 2022-04-13 2022-04-13 Three-dimensional coordinate determination method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210384618.7A CN114782547A (en) 2022-04-13 2022-04-13 Three-dimensional coordinate determination method and device

Publications (1)

Publication Number Publication Date
CN114782547A true CN114782547A (en) 2022-07-22

Family

ID=82429685

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210384618.7A Pending CN114782547A (en) 2022-04-13 2022-04-13 Three-dimensional coordinate determination method and device

Country Status (1)

Country Link
CN (1) CN114782547A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8723789B1 (en) * 2011-02-11 2014-05-13 Imimtek, Inc. Two-dimensional method and system enabling three-dimensional user interaction with a device
US20220020174A1 (en) * 2019-03-12 2022-01-20 Amdt Holdings, Inc. Monoscopic radiographic image and three-dimensional model registration methods and systems
WO2021103648A1 (en) * 2019-11-29 2021-06-03 百果园技术(新加坡)有限公司 Hand key point detection method, gesture recognition method, and related devices
WO2021227694A1 (en) * 2020-05-13 2021-11-18 北京市商汤科技开发有限公司 Image processing method and apparatus, electronic device, and storage medium
CN112258567A (en) * 2020-10-10 2021-01-22 达闼机器人有限公司 Visual positioning method and device for object grabbing point, storage medium and electronic equipment

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination