CN113470112A - Image processing method, image processing device, storage medium and terminal


Info

Publication number
CN113470112A
Authority
CN
China
Prior art keywords
position information
key point
target object
dimensional position
target
Prior art date
Legal status
Pending
Application number
CN202110748267.9A
Other languages
Chinese (zh)
Inventor
潘睿 (Pan Rui)
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202110748267.9A
Publication of CN113470112A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image processing method, an image processing device, a storage medium and a terminal, and relates to the technical field of image processing. The method includes: determining two-dimensional position information of each object key point in a target object in an image two-dimensional coordinate system, and the relative depth of each object key point relative to a reference point in the target object; determining first three-dimensional position information of each object key point in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth; and further determining second three-dimensional position information of each model key point in the target model corresponding to the target object. Since the first three-dimensional position information of each object key point is determined first, and the second three-dimensional position information of each model key point in the corresponding target model is then determined according to the two-dimensional position information and the first three-dimensional position information, the three-dimensional position information of the target model corresponding to the target object can be determined from a single image, which reduces the complexity of determining the three-dimensional position.

Description

Image processing method, image processing device, storage medium and terminal
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method, an image processing apparatus, a storage medium, and a terminal.
Background
With the development of science and technology, images are used more and more widely; for example, three-dimensional human body postures can be recognized from images, which is widely applied in fields such as security, games and entertainment.
In the related art, a three-dimensional human body posture detection method generally obtains a plurality of two-dimensional position information from a plurality of images at different angles, and then converts the two-dimensional position information into three-dimensional position information according to a predetermined relationship between the two-dimensional position information and the three-dimensional position information.
In the related art, the current three-dimensional human body posture detection method needs a plurality of images, and the detection method is complex.
Disclosure of Invention
The application provides an image processing method, an image processing device, a storage medium and a terminal, which can solve the technical problem that three-dimensional human body posture detection in the related technology is complex.
In a first aspect, an embodiment of the present application provides an image processing method, including:
identifying a target object in a target image, and determining two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and relative depth of each object key point in the target object relative to a reference point in the target object;
determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth;
and determining second three-dimensional position information of each model key point in the target model corresponding to the target object in the world three-dimensional coordinate system according to the two-dimensional position information and the first three-dimensional position information.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
the object two-dimensional position determining module is used for identifying a target object in a target image, and determining two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and relative depth of each object key point relative to a reference point in the target object;
the object three-dimensional position determining module is used for determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth;
and the model three-dimensional position determining module is used for determining second three-dimensional position information of each model key point in the target model corresponding to the target object in the world three-dimensional coordinate system according to the two-dimensional position information and the first three-dimensional position information.
In a third aspect, embodiments of the present application provide a computer storage medium storing a plurality of instructions, the instructions being adapted to be loaded by a processor to perform the steps of the above-mentioned method.
In a fourth aspect, embodiments of the present application provide a terminal including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the above-mentioned method when executing the computer program.
The beneficial effects brought by the technical scheme provided by some embodiments of the application at least comprise:
the application provides an image processing method, which comprises the steps of identifying a target object in a target image, and determining two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and relative depth of each object key point relative to a reference point in the target object; determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth; and determining second three-dimensional position information of each model key point in the target model corresponding to the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the first three-dimensional position information. The two-dimensional position information and the relative depth of each object key point in the target object can be determined according to one target image, so that the first three-dimensional position information of each object key point in the target object can be determined firstly, and then the second three-dimensional position information of each model key point in the target model corresponding to the target object can be determined according to the two-dimensional position information and the first three-dimensional position information of each object key point in the target object, so that the three-dimensional position information of the target model corresponding to the target object can be determined through one image, the complexity of determining the three-dimensional position is reduced, and the accuracy of determining the three-dimensional position is increased.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is an exemplary system architecture diagram of an image processing method provided in an embodiment of the present application;
fig. 2 is a system interaction diagram of an image processing method according to an embodiment of the present application;
fig. 3 is a schematic flowchart of an image processing method according to another embodiment of the present application;
fig. 4 is a schematic flowchart of an image processing method according to another embodiment of the present application;
FIG. 5 is a schematic diagram of a coordinate system reference provided in another embodiment of the present application;
FIG. 6 is a schematic diagram of obtaining coordinate probability distributions according to another embodiment of the present application;
FIG. 7 is a schematic diagram of a target object in a standard posture according to another embodiment of the present application;
FIG. 8 is a schematic diagram of a target model driver according to another embodiment of the present application;
FIG. 9 is a schematic diagram of a target model driver according to another embodiment of the present application;
FIG. 10 is a schematic diagram of a target model driver provided in accordance with another embodiment of the present application;
FIG. 11 is a schematic diagram of target model driving according to another embodiment of the present application;
Fig. 12 is a schematic structural diagram of an image processing apparatus according to another embodiment of the present application;
fig. 13 is a schematic structural diagram of an image processing apparatus according to another embodiment of the present application;
fig. 14 is a schematic data flow diagram of an image processing apparatus according to another embodiment of the present application;
fig. 15 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
In order to make the features and advantages of the present application more obvious and understandable, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims.
Fig. 1 is an exemplary system architecture diagram of an image processing method according to an embodiment of the present application.
As shown in fig. 1, the system architecture may include a terminal 101, a network 102, and a server 103. Network 102 is the medium used to provide communication links between terminals 101 and servers 103. Network 102 may include various types of wired or wireless communication links, such as: the wired communication link includes an optical fiber, a twisted pair wire or a coaxial cable, and the Wireless communication link includes a bluetooth communication link, a Wireless-Fidelity (Wi-Fi) communication link, a microwave communication link, or the like.
The terminal 101 may interact with the server 103 through the network 102 to receive messages from the server 103 or to send messages to the server 103. The terminal 101 may be hardware or software. When the terminal 101 is hardware, it can be a variety of electronic devices including, but not limited to, smart watches, smart phones, tablet computers, laptop portable computers, desktop computers, and the like. When the terminal 101 is software, it may be installed in the electronic devices listed above, and it may be implemented as multiple software or software modules (for example, for providing distributed services), or as a single software or software module, and is not limited in this respect.
The server 103 may be a business server providing various services. The server 103 may be hardware or software. When the server 103 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 103 is software, it may be implemented as a plurality of software or software modules (for example, to provide distributed services), or may be implemented as a single software or software module, and is not limited in particular herein.
It should be understood that the number of terminals, networks, and servers in fig. 1 is merely illustrative, and that any number of terminals, networks, and servers may be used, as desired for an implementation.
Referring to fig. 2, fig. 2 is a system interaction diagram of an image processing method according to an embodiment of the present application, it can be understood that, in the embodiment of the present application, an execution subject may be a terminal or a processor in the terminal, or may also be a service related to the execution of the image processing method in the terminal, and for convenience of description, a system interaction process in the image processing method is described below with reference to fig. 1 and fig. 2 by taking the execution subject as the processor in the terminal as an example.
S201, identifying a target object in the target image by the processor, and determining two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and relative depth of each object key point in the target object relative to a reference point in the target object.
Optionally, determining the two-dimensional position information of each object key point in the target object in the image two-dimensional coordinate system and the relative depth relative to the reference point in the target object includes: acquiring a two-dimensional coordinate probability distribution map of each object key point in the target object in the image two-dimensional coordinate system, and acquiring a one-dimensional coordinate probability distribution map of each object key point in the target object in a one-dimensional coordinate system perpendicular to the image two-dimensional coordinate system; and determining the two-dimensional position information of each object key point in the image two-dimensional coordinate system according to the two-dimensional coordinate probability distribution map, and determining the relative depth of each object key point relative to the reference point in the target object according to the one-dimensional coordinate probability distribution map.
S202, the processor determines first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth.
Optionally, determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth, including: determining three-dimensional position information of each object key point in the target object in a camera three-dimensional coordinate system according to the two-dimensional position information, the relative depth and preset camera internal parameters; and determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and preset camera external parameters.
S203, the processor determines second three-dimensional position information of each model key point in the target model corresponding to the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the first three-dimensional position information.
Optionally, determining, according to the two-dimensional position information and the first three-dimensional position information, second three-dimensional position information of each model key point in the target model corresponding to the target object in the world three-dimensional coordinate system, includes: determining a driving rotation angle of each object key point in the target object relative to the initial rotation angle according to the first three-dimensional position information; driving each model key point in a target model corresponding to the target object to move according to the driving rotation angle so as to enable the posture of the target model to be the same as the posture of the target object in the target image; obtaining model three-dimensional position information of each model key point in the target model after movement in a world three-dimensional coordinate system, and determining second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system according to the two-dimensional position information and the model three-dimensional position information.
Optionally, determining second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system according to the two-dimensional position information and the model three-dimensional position information, including: determining real camera external parameters corresponding to the target model according to the two-dimensional position information and the model three-dimensional position information; and determining second three-dimensional position information of each model key point in the target model in a world three-dimensional coordinate system according to the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and the real camera external parameters.
Optionally, determining a driving rotation angle of each object key point in the target object relative to the initial rotation angle includes: determining a normal vector corresponding to each object key point in a target object; determining an initial relationship between a father node and a child node in each object key point in the target object according to the position information of the father node in each object key point in the target object, the position information of the child node and a normal vector; and determining the driving rotation angle of each object key point in the target object relative to the initial rotation angle according to the position information of the father node in each object key point in the target object, the position information of the child node and the initial relationship between the father node and the child node.
Optionally, driving, according to the driving rotation angle, each model key point in the target model corresponding to the target object to move, includes: and driving each model key point in the target model corresponding to the target object to move by a forward driving method based on the driving rotation angle.
In the embodiment of the application, firstly, a target object in a target image is identified, and two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and relative depth of each object key point relative to a reference point in the target object are determined; determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth; and determining second three-dimensional position information of each model key point in the target model corresponding to the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the first three-dimensional position information. The two-dimensional position information and the relative depth of each object key point in the target object can be determined according to one target image, so that the first three-dimensional position information of each object key point in the target object can be determined firstly, and then the second three-dimensional position information of each model key point in the target model corresponding to the target object can be determined according to the two-dimensional position information and the first three-dimensional position information of each object key point in the target object, so that the three-dimensional position information of the target model corresponding to the target object can be determined through one image, the complexity of determining the three-dimensional position is reduced, and the accuracy of determining the three-dimensional position is increased.
Referring to fig. 3, fig. 3 is a schematic flowchart illustrating an image processing method according to another embodiment of the present application.
As shown in fig. 3, the method includes:
s301, identifying a target object in the target image, and determining two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and relative depth of each object key point in the target object relative to a reference point in the target object.
It can be understood that an image processing method provided in the embodiment of the present application is mainly used for identifying or detecting an object in an image and determining three-dimensional position information of a model corresponding to the object, where the image may be an image of any type or source. It is therefore possible to first acquire an image to be recognized or detected and determine the image as a target image.
After the target image is acquired, the objects in the target image may be identified to determine the target object, for example by an object frame detection method. When the target image includes more than one target object, the categories of the different target objects may be the same or different; for example, the target objects may all be humans, or may all be vehicles. As another example, the target objects in the target image may include humans and animals, or humans and vehicles; the target object category is determined according to the requirements of the actual application scene. For ease of description, the following takes a single target object in a target image as an example.
After the target object in the target image is determined, object key points in the target object may be determined, where an object key point refers to a core point used for referring to or constituting the target object. For example, when the target object in the target image is a person, the object key points may be determined as the joint points of the person. The object key points may be determined by a key point detection method; for example, when the target object is a person, the key point detection method may be implemented based on a Cascaded Pyramid Network (CPN), or based on a Simple Baselines network, or the like.
After determining the object key points of the target object, a two-dimensional coordinate system may be established based on the target image, and the two-dimensional coordinate system may be determined as an image two-dimensional coordinate system, where the specific process of establishing the image two-dimensional coordinate system may be to first determine a region corresponding to the target image, where the region includes the target object, then select a point in the region as an origin, and finally determine the two-dimensional coordinate system based on the origin and a plane where the region is located.
After the object key points of the target object and the image two-dimensional coordinate system are determined, because both are determined based on the target image, that is, the object key points and the image two-dimensional coordinate system lie in the same plane, the two-dimensional position information of each object key point in the target object in the image two-dimensional coordinate system can be determined.
Further, a reference point may be determined based on each object key point in the target object, and the reference point coincides with the origin of the two-dimensional coordinate system of the image. The purpose of determining the reference point in the target object is to determine the relative distance of each object keypoint in the target object with respect to the reference point in a direction perpendicular to the two-dimensional coordinate system of the image, which may also be considered as the relative depth of each object keypoint in the target image with respect to the reference point in the target object.
S302, determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth.
Alternatively, on the basis of the above-mentioned establishment of the two-dimensional coordinate system of the image, a world three-dimensional coordinate system of the target object in the real world may also be established, and then the position of the target object in the world three-dimensional coordinate system is also the position of the target object in the real world.
After the two-dimensional position information of each object key point in the target object in the image two-dimensional coordinate system and the relative depth of each object key point in the target object relative to the reference point in the target object are determined, the two-dimensional position information of each object key point in the target object represents the position information of the target object on a certain two-dimensional plane, and the relative depth of each object key point in the target object represents the position information of the target object on a certain one-dimensional plane, so that the three-dimensional position information of each object key point in the target object in the world three-dimensional coordinate system can be converted based on the two-dimensional position information and the relative depth of each object key point in the target object, and the three-dimensional position information is determined as the first three-dimensional position information.
And S303, determining second three-dimensional position information of each model key point in the target model corresponding to the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the first three-dimensional position information.
After the first three-dimensional position information of each object key point in the target object in the world three-dimensional coordinate system is determined, in order to obtain the three-dimensional position information of the target model corresponding to the target object in the real world, the position of the target model can be described from the angle of the camera by combining the two-dimensional position information and the first three-dimensional position information of each object key point. The distance from each model key point in the target model to the camera is determined; this distance can be regarded as the absolute depth from each model key point to the camera. The absolute coordinate of each model key point in the world three-dimensional coordinate system, namely the second three-dimensional position information, is then determined according to the absolute depth and used as the final three-dimensional position information of the target model in the real world.
In the embodiment of the application, firstly, a target object in a target image is identified, and two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and relative depth of each object key point relative to a reference point in the target object are determined; determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth; and determining second three-dimensional position information of each model key point in the target model corresponding to the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the first three-dimensional position information. The two-dimensional position information and the relative depth of each object key point in the target object can be determined according to one target image, so that the first three-dimensional position information of each object key point in the target object can be determined firstly, and then the second three-dimensional position information of each model key point in the target model corresponding to the target object can be determined according to the two-dimensional position information and the first three-dimensional position information of each object key point in the target object, so that the three-dimensional position information of the target model corresponding to the target object can be determined through one image, the complexity of determining the three-dimensional position is reduced, and the accuracy of determining the three-dimensional position is increased.
Referring to fig. 4, fig. 4 is a schematic flowchart of an image processing method according to another embodiment of the present application.
As shown in fig. 4, the method includes:
s401, identifying a target object in the target image, acquiring a two-dimensional coordinate probability distribution map of each object key point in the target object in an image two-dimensional coordinate system, and acquiring a one-dimensional coordinate probability distribution map of each object key point in the target object in a one-dimensional coordinate system perpendicular to the image two-dimensional coordinate system.
In the embodiment of the application, a pixel coordinate system, an image two-dimensional coordinate system, a one-dimensional coordinate system, a camera three-dimensional coordinate system and a world three-dimensional coordinate system can be respectively established based on the target image. Referring to fig. 5, fig. 5 is a schematic diagram of a coordinate system reference according to another embodiment of the present application. As shown in fig. 5, the coordinate system uv is the pixel coordinate system, a two-dimensional coordinate system whose origin is a vertex of the rectangular region corresponding to the target image and whose u and v coordinate axes are the two frame edges adjacent to that origin. The coordinate system xy is the image two-dimensional coordinate system, formed on the basis of the plane where the target image is located; its origin O is the position of the reference point in the target object. The coordinate system Zc is the one-dimensional coordinate system, whose single coordinate axis is perpendicular to the image two-dimensional coordinate system xy and passes through its origin O. The coordinate system XcYcZc is the camera three-dimensional coordinate system, established from the angle of the camera corresponding to the target image: the plane formed by its Xc and Yc coordinate axes is parallel to the plane where the image two-dimensional coordinate system xy is located, its origin coincides with the origin Oc of the one-dimensional coordinate system, and its Zc coordinate axis is parallel to the coordinate axis of the one-dimensional coordinate system. The coordinate system XwYwZw is the world three-dimensional coordinate system, which can be regarded as the coordinate system corresponding to the target object in the real world; the positions of its coordinate axes are not limited, and in binocular vision the origin of the world three-dimensional coordinate system is generally located at the left camera, the right camera, or the midpoint between the two cameras along the X-axis direction.
Further, in fig. 5 the pixel coordinate system may be used to represent the position information of pixels in the target image; the image two-dimensional coordinate system may be used to represent the position information of image points in the target image; the one-dimensional coordinate system may be used to represent the relative depth of an image point in the target image with respect to a reference image point; the camera three-dimensional coordinate system may be used to represent the three-dimensional position information, under the camera view angle, of an object in the target image or of the model corresponding to that object; and the world three-dimensional coordinate system is used to represent the three-dimensional position information of an object in the target image, or of the model corresponding to that object, in the real world. Here, P(Xw, Yw, Zw) is a coordinate point in the world three-dimensional coordinate system, P(x, y) is a coordinate point in the image two-dimensional coordinate system, the pixel coordinate of P(x, y) in the pixel coordinate system is (u, v), and f is the camera focal length, equal to the distance between Oc and O.
After the coordinate system is determined, a two-dimensional coordinate probability distribution map of each object keypoint in the target object in the image two-dimensional coordinate system can be obtained, and a one-dimensional coordinate probability distribution map of each object keypoint in the target object in a one-dimensional coordinate system perpendicular to the image two-dimensional coordinate system can be obtained.
Specifically, please refer to fig. 6, which is a schematic diagram of obtaining coordinate probability distributions according to another embodiment of the present disclosure. As shown in fig. 6, fig. 6 may be considered a network structure for posture recognition. After the target object in the target image 610 is determined, features in the target image 610 may be extracted by an EfficientNet backbone 620 and then upsampled by at least one upsampling (Upsample) block 630, where the upsampling 630 can be implemented with a convolution (Conv2D) 631 followed by a pixel shuffle (PixelShuffle) 632. The sampled data is then divided into two processing branches corresponding to the image two-dimensional coordinate system and the one-dimensional coordinate system. One branch obtains, through a convolution (Conv2D) 631, the two-dimensional coordinate probability distribution map 660 of each object key point in the target object in the image two-dimensional coordinate system, where the map 660 represents the distribution of the two-dimensional coordinate points (formed by x and y coordinates) of each object key point. The other branch obtains, through average pooling (AvgPool) 640 and a fully connected (FC) layer 650, the one-dimensional coordinate probability distribution map 670 of each object key point in the one-dimensional coordinate system, where the map 670 represents the distribution of the coordinate points (formed by Zc coordinates) of each object key point.
Optionally, the network structure for posture recognition includes but is not limited to the structure shown in fig. 6; any network that can learn two-dimensional posture points and relative depths may be used.
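As a concrete illustration, the following is a minimal PyTorch sketch of the two-headed structure outlined above. The backbone stand-in, channel sizes, key-point count and depth-bin count are all illustrative assumptions; the text only specifies an EfficientNet-style feature extractor, Conv2D plus PixelShuffle upsampling, a Conv2D head for the two-dimensional maps, and an AvgPool plus fully connected head for the one-dimensional distribution.

```python
import torch
import torch.nn as nn

class PoseNet(nn.Module):
    def __init__(self, num_keypoints=17, depth_bins=64, feat_ch=256):
        super().__init__()
        # Stand-in backbone; a real EfficientNet could be swapped in here.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Upsample block: Conv2D followed by PixelShuffle, as in fig. 6.
        self.upsample = nn.Sequential(
            nn.Conv2d(feat_ch, feat_ch * 4, 3, padding=1),
            nn.PixelShuffle(2),  # rearranges channels into 2x spatial size
        )
        # Branch 1: per-keypoint 2D coordinate probability maps.
        self.heatmap_head = nn.Conv2d(feat_ch, num_keypoints, 1)
        # Branch 2: per-keypoint 1D (relative depth) probability distribution.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.depth_head = nn.Linear(feat_ch, num_keypoints * depth_bins)
        self.num_keypoints, self.depth_bins = num_keypoints, depth_bins

    def forward(self, x):
        f = self.upsample(self.backbone(x))
        heatmaps = self.heatmap_head(f)                  # (B, K, H, W)
        pooled = self.pool(f).flatten(1)                 # (B, C)
        depth = self.depth_head(pooled)                  # (B, K * D)
        return heatmaps, depth.view(-1, self.num_keypoints, self.depth_bins)
```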
S402, determining two-dimensional position information of each object key point in the target object in the image two-dimensional coordinate system according to the two-dimensional coordinate probability distribution map, and determining the relative depth of each object key point in the target object relative to the reference point in the target object according to the one-dimensional coordinate probability distribution map.
After the two-dimensional coordinate probability distribution map of each object key point in the image two-dimensional coordinate system and the one-dimensional coordinate probability distribution map in the one-dimensional coordinate system perpendicular to it are determined, the two-dimensional coordinate probability distribution map may be converted into the two-dimensional position information of each object key point in the image two-dimensional coordinate system. The specific conversion method is not limited; for example, the data in the two-dimensional coordinate probability distribution map may be averaged to obtain the x coordinate and the y coordinate of each object key point in the image two-dimensional coordinate system, or the coordinates with the highest probability in the two-dimensional coordinate probability distribution map may be taken as the x coordinate and the y coordinate of each object key point.
Similarly, the one-dimensional coordinate probability distribution map of each object key point in the one-dimensional coordinate system may be converted into the one-dimensional position information of each object key point in the one-dimensional coordinate system. Because the one-dimensional coordinate system has only one coordinate axis, and that axis passes through the position of the reference point in the target object, the coordinate value of each object key point in the one-dimensional coordinate system is the relative distance from that object key point to the reference point, which may also be regarded as the relative depth of each object key point with respect to the reference point in the target object.
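A minimal decoding sketch under the averaging option above, using a soft-argmax (probability-weighted expectation) over both distributions; the mapping from depth bins to metric depths, and its centering so that the reference point sits at zero, is an assumption not specified in the text.

```python
import torch

def decode_positions(heatmaps, depth_logits):
    """Convert probability distributions into coordinates by taking the
    expectation over each distribution (soft-argmax); taking the argmax
    would be the other conversion option mentioned above."""
    B, K, H, W = heatmaps.shape
    probs = heatmaps.flatten(2).softmax(-1).view(B, K, H, W)
    xs = torch.arange(W, dtype=torch.float32)
    ys = torch.arange(H, dtype=torch.float32)
    x = (probs.sum(2) * xs).sum(-1)   # (B, K) expected x coordinate
    y = (probs.sum(3) * ys).sum(-1)   # (B, K) expected y coordinate
    d = depth_logits.softmax(-1)
    bins = torch.arange(d.shape[-1], dtype=torch.float32)
    # Expected bin index; a real pipeline would map bin indices to metric
    # depths and offset them so the reference key point is at zero.
    zc = (d * bins).sum(-1)           # (B, K) expected relative depth
    return x, y, zc
```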
And S403, determining three-dimensional position information of each object key point in the target object in a camera three-dimensional coordinate system according to the two-dimensional position information, the relative depth and preset camera internal parameters.
Determining the first three-dimensional position information of each object key point in the target object in the world three-dimensional coordinate system according to the two-dimensional position information and the relative depth involves two steps. The first step is to convert the two-dimensional position information and the relative depth of each object key point into three-dimensional position information in the camera three-dimensional coordinate system. Specifically, this can be calculated by the following formula:
$$ Z_c \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u \\ 0 & f_y & v \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} $$
the camera comprises a camera body, a camera key point, a camera body and a camera body, wherein Zc is a coordinate value corresponding to the relative depth of the object key point, x and y are coordinate values corresponding to two-dimensional position information of the object key point, Xc, Yc and Zc are coordinate values corresponding to three-dimensional position information of the object key point in a camera body, fx, fy, u and v are preset camera body parameters, wherein the preset camera body parameters are standard data, and can be obtained through Zhang-Zhengyou calibration.
S404, determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and preset camera external parameters.
In the process of determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth of each object key point in the target object, the second step is to convert the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system into the first three-dimensional position information of each object key point in the target object in the world three-dimensional coordinate system. Specifically, it can be calculated by the following formula:
$$ \begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix} = R \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} + T $$
Here, Xc, Yc and Zc are the coordinate values corresponding to the three-dimensional position information of the object key point in the camera three-dimensional coordinate system, and Xw, Yw and Zw are the coordinate values corresponding to the first three-dimensional position information of the object key point in the world three-dimensional coordinate system. R and T are the preset camera extrinsic parameters, a rotation matrix and a translation matrix respectively. The R matrix is obtained by multiplying three rotation matrices: if the three rotation matrices are R1, R2 and R3, then R1, R2 and R3 are the rotation matrices around the Xc, Yc and Zc coordinate axes, respectively. The T matrix is obtained according to a set default depth, which is the distance from the reference point in the target object to the origin of the camera three-dimensional coordinate system. The unit corresponding to the default depth can be set as required, such as meters, centimeters or millimeters. For example, when the set default depth is 2 and the unit is meters, the T matrix can be obtained as
$$ T = \begin{bmatrix} 0 \\ 0 \\ 2 \end{bmatrix} $$
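A hedged sketch of this second step, assuming the camera-to-world convention of the reconstructed formula above; the function name and the identity rotation are illustrative:

```python
import numpy as np

def camera_to_world(p_cam, R, T):
    """Map a camera-frame key point into the world three-dimensional
    coordinate system with the preset extrinsics R (rotation) and T
    (translation), following the formula above."""
    return R @ p_cam + T

# Preset extrinsics as described in the text: R composed of rotations
# around the three axes (identity here for simplicity), and T set from
# the default depth of 2 meters along the camera Z axis.
R = np.eye(3)
T = np.array([0.0, 0.0, 2.0])
```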
S405, determining the driving rotation angle of each object key point in the target object relative to the initial rotation angle according to the first three-dimensional position information.
After the first three-dimensional position information of each object key point in the target object in the world three-dimensional coordinate system is determined, the three-dimensional position information of a target model corresponding to the target object in the world three-dimensional coordinate system can also be determined. Specifically, the driving rotation angle of each object key point in the target object relative to the initial rotation angle may be determined according to the first three-dimensional position information of each object key point in the target object in the world three-dimensional coordinate system obtained in the above step.
Referring to fig. 7, fig. 7 is a schematic diagram of a target object in a standard posture according to another embodiment of the present application. The initial rotation angle may be considered as the rotation angle of each object key point when the target object is in a standard posture. In fig. 7, when the target object 710 is a human, the standard posture is a T-shaped posture in which both legs are put together and both arms are spread, and the angle of each object key point of the target object 710 in this standard posture is the initial angle; the driving rotation angle is therefore related to the current posture of the target object in the target image.
The driving rotation angle of each object key point in the target object relative to the initial rotation angle may be determined as follows. First, the normal vector corresponding to each object key point in the target object is determined. Specifically, at least three object key points are determined; for example, when the target object is a person, the pelvic bone point, the left crotch point and the right crotch point may be used as the object key points for determining the normal vector, and the corresponding normal vector is then calculated through a normal vector calculation formula:
the normal vector is equal to Trianglenmal (pelvic point position point, left crotch position point, right crotch position point), wherein Trianglenmal standard triangular function, based on the pelvic point position point, left crotch position point, right crotch position point standard triangular function, can determine standard triangular function, based on the pelvic point position point, left crotch position point corresponding normal vector.
Further, according to the position information of the father node in each object key point in the target object, the position information of the child node and the normal vector, the initial relationship between the father node and the child node in each object key point in the target object is determined, and the initial relationship is also the initial inverse matrix of the node.
The node initial inverse matrix is a quaternion; it can be obtained by applying a quaternion inverse function to the rotation that orients the parent node toward its child node along the normal vector.
And finally, determining the driving rotation angle corresponding to each object key point in the target object according to the position information of the father node in each object key point in the target object, the position information of the child node and the initial relationship between the father node and the child node.
The parent node's skeleton driving rotation angle is then also a quaternion, obtained by composing the current rotation from the parent node toward its child node with the node initial inverse matrix.
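The two quaternion formulas above arrive only in fragmentary form, so the following sketch is a reconstruction under stated assumptions: TriangleNormal is taken to be an ordinary triangle-normal computation, and the parent-to-child orientation is modeled with a LookRotation-style helper (`look_rotation` itself is hypothetical and not named in the text):

```python
import numpy as np
from scipy.spatial.transform import Rotation as Rot

def triangle_normal(pelvis, left_crotch, right_crotch):
    """TriangleNormal: unit normal of the triangle spanned by the three
    key points, used to fix the body's facing direction."""
    n = np.cross(left_crotch - pelvis, right_crotch - pelvis)
    return n / np.linalg.norm(n)

def look_rotation(forward, up):
    """Hypothetical LookRotation-style helper: builds the rotation whose
    local Z axis points along `forward` with `up` as the reference."""
    z = forward / np.linalg.norm(forward)
    x = np.cross(up, z); x /= np.linalg.norm(x)
    y = np.cross(z, x)
    return Rot.from_matrix(np.stack([x, y, z], axis=1))

def initial_inverse(parent0, child0, normal0):
    # Node initial inverse matrix, computed once in the standard posture.
    return look_rotation(child0 - parent0, normal0).inv()

def driving_rotation(parent, child, normal, init_inv):
    # Compose the current parent-to-child orientation with the initial
    # inverse to obtain the driving rotation for the current frame.
    return look_rotation(child - parent, normal) * init_inv
```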
And S406, driving each model key point in the target model corresponding to the target object to move according to the driving rotation angle so that the posture of the target model is the same as that of the target object in the target image.
Optionally, a target model corresponding to the target object may be preset in advance, where the shape of the target model is the same as or similar to that of the target object, but the target model is in the standard posture (not in any motion state). Therefore, after the driving rotation angle corresponding to each object key point in the target object is determined, each model key point in the target model corresponding to the target object may be driven to move until the posture of the target model is the same as the posture of the target object in the target image. A model key point is similar to an object key point: it is a core point referring to or constituting the target model.
The method for driving each model key point in the target model corresponding to the target object to move is not limited; for example, each model key point may be driven to move by a forward driving (Forward Kinematics, FK) method based on the driving rotation angle. Specifically, when the forward driving method is used, the orientation of the target object is first calculated according to the positions of the spinal column point and the left and right hip joint points, then the driving rotation angle of each child node relative to its parent node is calculated through the skeleton relationship, and finally the target model is driven to move by setting the rotation angles of the model components.
Furthermore, on the basis of driving each model key point in the target model with the forward driving (FK) method based on the driving rotation angle, each model key point may continue to be driven with a backward driving method, so that the movement of the target model as a whole is driven more accurately. When the backward driving method is used based on the driving rotation angle, the driving rotation angles of the child nodes are first transformed based on the driving rotation angle of each child node, the target object orientation calculated from the positions of the spinal column point and the left and right hip joint points, and the skeleton relationship; the driving rotation angle of each parent node relative to its child node is then calculated in reverse, level by level; and finally the target model is driven to move by setting the rotation angles of the model components.
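For the forward pass, a minimal forward-kinematics sketch follows. The skeleton representation (a parent index plus a rest-pose offset per joint, with joints numbered so that parents precede children) is an assumed format, not one given in the text:

```python
import numpy as np
from scipy.spatial.transform import Rotation as Rot

def forward_drive(skeleton, local_rotations, root_pos):
    """Forward-kinematics pass: starting from the root (joint 0), apply
    each parent's accumulated rotation to its child's rest-pose offset.
    `skeleton` maps joint index -> (parent index, rest-pose offset);
    `local_rotations` maps joint index -> driving rotation (a Rotation)."""
    world_rot = {0: Rot.identity()}
    world_pos = {0: np.asarray(root_pos, dtype=float)}
    for joint, (parent, rest_offset) in skeleton.items():
        if joint == 0:
            continue  # the root has no parent to accumulate from
        world_rot[joint] = world_rot[parent] * local_rotations[joint]
        world_pos[joint] = world_pos[parent] + world_rot[parent].apply(rest_offset)
    return world_pos
```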
Referring to fig. 8 and 9, fig. 8 is a schematic diagram of a target model driver according to another embodiment of the present application, and fig. 9 is a schematic diagram of a target model driver according to another embodiment of the present application.
As shown in fig. 8, a target object 810 exists in the target image in fig. 8. Based on the target object 810, a target model 820 corresponding to the target object can be determined in fig. 9, and the target model 820 is driven to move based on the position information of each object key point in the target object, so that the posture of the target model 820 is the same as the posture of the target object 810 in the target image.
S407, obtaining model three-dimensional position information of each model key point in the target model after movement in a world three-dimensional coordinate system, and determining second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system according to the two-dimensional position information and the model three-dimensional position information.
Because the posture of the target model is the same as the posture of the target object in the target image, the position information of each model key point in the target model can be regarded as the position information of each object key point in the target object in the real world, so that the model three-dimensional position information of each model key point in the moving target model in the world three-dimensional coordinate system can be obtained, and the second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system can be determined according to the two-dimensional position information of the target object in the two-dimensional coordinate system and the model three-dimensional position information.
The method for determining the second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system may be as follows. First, the real camera extrinsic parameters corresponding to the target model are determined according to the two-dimensional position information of the target object in the image two-dimensional coordinate system and the model three-dimensional position information. In the process of calculating the first three-dimensional position information of each object key point in the world three-dimensional coordinate system in this embodiment, the preset camera extrinsic parameters used were the camera extrinsic parameters set by the user or the default camera extrinsic parameters; therefore, in order to obtain more realistic three-dimensional position information of the target model in the world three-dimensional coordinate system, the real camera extrinsic parameters of the target model can be calculated. Optionally, the real camera extrinsic parameters may be determined according to the two-dimensional position information of each object key point in the target object and the model three-dimensional position information. Specifically, a Perspective-n-Point (PnP) problem may be solved based on a minimum reprojection error method, matching the two-dimensional position information of each object key point with the model three-dimensional position information of the target model to obtain the real camera extrinsic parameters of the target model; the real camera extrinsic parameters may also be solved through the EPnP (Efficient Perspective-n-Point camera pose estimation) algorithm. Optionally, the solved real camera extrinsic parameters may be further modified according to the model three-dimensional position information of the target model, so that they are more accurate.
Further, the real camera external parameters of the target model mainly comprise a rotation matrix and a translation vector. In the embodiment of the application, it is mainly the translation vector in the real camera external parameters that is of interest: the translation vector represents the distance from the reference point in the object to the origin of the camera three-dimensional coordinate system, so the translation vector in the real camera external parameters represents the real distance from the reference point in the target model to the origin of the camera three-dimensional coordinate system, namely the absolute depth of the reference point in the target model relative to the origin of the camera three-dimensional coordinate system.
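For illustration only, the following minimal sketch shows how such real camera external parameters could be recovered with OpenCV, which exposes both the EPnP solver and a reprojection-error refinement; the function name, array shapes and the undistorted-image assumption are the editor's, not the patent's reference implementation.

```python
import cv2
import numpy as np

def estimate_real_extrinsics(kp_2d, kp_3d_model, camera_matrix):
    """Match the 2D object key points (N, 2) against the 3D model key
    points (N, 3) to recover the real camera external parameters."""
    dist_coeffs = np.zeros((4, 1))  # assume an undistorted image
    # EPnP yields an initial closed-form estimate of the extrinsics.
    ok, rvec, tvec = cv2.solvePnP(
        kp_3d_model.astype(np.float64), kp_2d.astype(np.float64),
        camera_matrix, dist_coeffs, flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("PnP solve failed")
    # Levenberg-Marquardt refinement minimizes the reprojection error,
    # corresponding to the minimum-reprojection-error solution above.
    rvec, tvec = cv2.solvePnPRefineLM(
        kp_3d_model.astype(np.float64), kp_2d.astype(np.float64),
        camera_matrix, dist_coeffs, rvec, tvec)
    R, _ = cv2.Rodrigues(rvec)  # rotation matrix
    # tvec is the translation vector; it encodes the absolute depth of
    # the model reference point from the camera origin.
    return R, tvec
```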
Further, the second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system is determined according to the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and the real camera external parameters. After the real camera external parameters of the target model are calculated, this step is similar to determining the first three-dimensional position information of each object key point in the target object in the world three-dimensional coordinate system from the three-dimensional position information in the camera three-dimensional coordinate system and the preset camera external parameters; the second three-dimensional position information can therefore be determined directly from the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and the real camera external parameters, which is not described again herein.
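The camera-to-world back projection itself is one line of linear algebra. The sketch below assumes the common pinhole convention X_cam = R·X_world + t, where R and t are the rotation matrix and translation vector of the camera external parameters; inverting it gives X_world = Rᵀ·(X_cam − t).

```python
import numpy as np

def camera_to_world(points_cam, R, t):
    """Back-project (N, 3) key points from the camera three-dimensional
    coordinate system into the world three-dimensional coordinate system,
    assuming X_cam = R @ X_world + t."""
    # Row-wise form of X_world = R.T @ (X_cam - t)
    return (points_cam - t.reshape(1, 3)) @ R
```

With the preset camera external parameters this yields the first three-dimensional position information; with the real camera external parameters solved above it yields the second.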
The embodiment of the application provides an image processing method that addresses two problems of current three-dimensional human body posture detection: the need for several complicated image detection methods, and the inaccuracy of directly predicting the depth of each model key point in the target model. By solving the problem with graphics knowledge, the real position in space of the target model corresponding to the target object can be obtained.
Referring to fig. 10 and 11, fig. 10 and fig. 11 are schematic diagrams of driving a target model according to other embodiments of the present application.
As shown in fig. 10, when the image processing method in the embodiment of the present application is applied to virtual fitting, after the real three-dimensional position information of each model key point in the target model 1010 corresponding to the target object in the world three-dimensional coordinate system is determined, the target model 1010 can be driven to move forward and backward accurately in space so that the target model 1010 is aligned with the target image. As shown in fig. 11, the virtual clothes 1020 can also be controlled, based on the target model 1010, to fit the target object in the target image, which realizes virtual fitting from the user's perspective.
In the embodiment of the application, because the two-dimensional position information and the relative depth of each object key point in the target object can be determined from a single target image, the first three-dimensional position information of each object key point in the target object can be determined first, and then the second three-dimensional position information of each model key point in the target model corresponding to the target object can be determined according to the two-dimensional position information and the first three-dimensional position information of each object key point in the target object. In this way, the three-dimensional position information of the target model can be determined from one image, which reduces the complexity of determining the three-dimensional position and increases its accuracy.
Referring to fig. 12, fig. 12 is a schematic structural diagram of an image processing apparatus according to another embodiment of the present application.
As shown in fig. 12, the image processing apparatus 1200 includes:
the object two-dimensional position determining module 1210 is configured to identify a target object in a target image, and determine two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and a relative depth of each object key point relative to a reference point in the target object.
And the object three-dimensional position determining module 1220 is configured to determine, according to the two-dimensional position information and the relative depth, first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system.
And the model three-dimensional position determining module 1230 is configured to determine, according to the two-dimensional position information and the first three-dimensional position information, second three-dimensional position information of each model key point in the target model corresponding to the target object in the world three-dimensional coordinate system.
Referring to fig. 13, fig. 13 is a schematic structural diagram of an image processing apparatus according to another embodiment of the present application.
As shown in fig. 13, the image processing apparatus 1300 includes:
the probability distribution map obtaining module 1310 is configured to obtain a two-dimensional coordinate probability distribution map of each object key point in the target object in the two-dimensional image coordinate system, and obtain a one-dimensional coordinate probability distribution map of each object key point in the target object in a one-dimensional coordinate system perpendicular to the two-dimensional image coordinate system.
A two-dimensional and depth calculating module 1320, configured to determine, according to the two-dimensional coordinate probability distribution map, two-dimensional position information of each object key point in the target object in the image two-dimensional coordinate system, and determine, according to the one-dimensional coordinate probability distribution map, a relative depth of each object key point in the target object with respect to a reference point in the target object.
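One standard way to turn such probability distribution maps into coordinates is a soft-argmax, i.e. the expectation of the coordinate under the distribution. The sketch below is a hypothetical illustration of that idea; the depth_bins input, which maps the one-dimensional bins to relative depths, is an assumed convention rather than anything specified here.

```python
import numpy as np

def soft_argmax_2d(prob_map):
    """Expected (x, y) pixel position of one key point from its
    two-dimensional coordinate probability distribution map (H, W)."""
    p = prob_map / prob_map.sum()
    xs = np.arange(prob_map.shape[1])
    ys = np.arange(prob_map.shape[0])
    return (p.sum(axis=0) * xs).sum(), (p.sum(axis=1) * ys).sum()

def soft_argmax_1d(prob_vec, depth_bins):
    """Expected relative depth of one key point from its one-dimensional
    coordinate probability distribution along the axis perpendicular to
    the image plane."""
    p = prob_vec / prob_vec.sum()
    return (p * depth_bins).sum()
```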
The camera position calculating module 1330 is configured to determine three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system according to the two-dimensional position information, the relative depth, and the preset camera internal parameters.
The first three-dimensional position determining module 1340 is configured to determine, according to the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and the preset camera external parameters, first three-dimensional position information of each object key point in the target object in the world three-dimensional coordinate system.
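For illustration, lifting a pixel plus depth into the camera three-dimensional coordinate system with the intrinsics K follows X_cam = Z·K⁻¹·[u, v, 1]ᵀ. The sketch below assumes the absolute depth of each key point is already available (the relative depth plus the reference point's depth, as the modules above imply); the subsequent camera-to-world step is the camera_to_world sketch shown earlier, applied with the preset camera external parameters.

```python
import numpy as np

def pixels_to_camera(kp_2d, depth_abs, camera_matrix):
    """Lift (N, 2) pixel coordinates with (N,) absolute depths into the
    camera three-dimensional coordinate system:
    X_cam = Z * K^-1 @ [u, v, 1]^T."""
    n = kp_2d.shape[0]
    uv1 = np.concatenate([kp_2d, np.ones((n, 1))], axis=1)  # homogeneous pixels
    rays = uv1 @ np.linalg.inv(camera_matrix).T             # K^-1 [u, v, 1]^T per row
    return rays * depth_abs.reshape(-1, 1)
```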
A driving rotation angle determining module 1350, configured to determine, according to the first three-dimensional position information, a driving rotation angle of each object key point in the target object with respect to the initial rotation angle.
The method for determining the driving rotation angle of each object key point in the target object relative to the initial rotation angle comprises the following steps: determining a normal vector corresponding to each object key point in the target object; determining an initial relationship between a parent node and a child node in each object key point in the target object according to the position information of the parent node in each object key point in the target object, the position information of the child node and the normal vector; and determining the driving rotation angle of each object key point in the target object relative to the initial rotation angle according to the position information of the parent node in each object key point in the target object, the position information of the child node and the initial relationship between the parent node and the child node.
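A simplified sketch of the idea follows: it derives an axis-angle rotation from the initial and observed parent-to-child bone directions. The normal vector described above, which would additionally fix the twist about the bone axis, is deliberately omitted, so this is an illustrative reduction rather than the full computation.

```python
import numpy as np

def driving_rotation(parent_init, child_init, parent_now, child_now):
    """Axis-angle rotation turning the initial parent->child bone
    direction into the observed one (twist about the bone ignored)."""
    a = child_init - parent_init
    b = child_now - parent_now
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    axis = np.cross(a, b)
    s = np.linalg.norm(axis)
    c = np.clip(np.dot(a, b), -1.0, 1.0)
    angle = np.arctan2(s, c)                 # driving rotation angle
    axis = axis / s if s > 1e-8 else np.array([0.0, 0.0, 1.0])
    return axis, angle
```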
And a driving module 1360, configured to drive each model key point in the target model corresponding to the target object to move according to the driving rotation angle, so that the pose of the target model is the same as the pose of the target object in the target image.
Driving each model key point in the target model corresponding to the target object to move according to the driving rotation angle, wherein the driving method comprises the following steps: and driving each model key point in the target model corresponding to the target object to move by a forward driving method based on the driving rotation angle.
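The forward driving method is, in effect, forward kinematics: each joint's driving rotation is accumulated from the root down the kinematic chain to place every model key point. The sketch below assumes the rest pose is given as per-joint offsets from the parent and the driving rotations as 3×3 matrices in topological order; both representations are illustrative assumptions.

```python
import numpy as np

def forward_drive(rest_offsets, parents, local_rotations):
    """Place model key points by accumulating driving rotations along
    the chain: parents[i] is the parent index (-1 for the root), and
    parents must precede children in the arrays."""
    n = len(parents)
    global_rot = [None] * n
    positions = np.zeros((n, 3))
    for i in range(n):
        if parents[i] == -1:
            global_rot[i] = local_rotations[i]
            positions[i] = rest_offsets[i]          # root world position
        else:
            p = parents[i]
            global_rot[i] = global_rot[p] @ local_rotations[i]
            positions[i] = positions[p] + global_rot[p] @ rest_offsets[i]
    return positions
```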
And the second three-dimensional position determining module 1370 is used for obtaining model three-dimensional position information of each model key point in the target model after movement in a world three-dimensional coordinate system, and determining second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system according to the two-dimensional position information and the model three-dimensional position information.
The method for determining the second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system according to the two-dimensional position information and the model three-dimensional position information comprises the following steps: determining real camera external parameters corresponding to the target model according to the two-dimensional position information and the model three-dimensional position information; and determining second three-dimensional position information of each model key point in the target model in a world three-dimensional coordinate system according to the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and the real camera external parameters.
Referring to fig. 14, fig. 14 is a schematic data flow diagram of an image processing apparatus according to another embodiment of the present application.
As shown in fig. 14, taking the image processing apparatus 1200 as an example, in the object two-dimensional position determining module, object recognition is performed on the target image to obtain the target object, and posture estimation is then performed on the target object to obtain the two-dimensional position information of each object key point in the target object in the image two-dimensional coordinate system and the relative depth to the reference point in the target object; together, the two-dimensional position information of each object key point and the relative depth to the reference point constitute the 2.5D posture of the target object.
In the object three-dimensional position determining module, by setting camera parameters, first three-dimensional position information of each object key point of the target object in a world three-dimensional coordinate system, namely the 3D posture of the target object, can be determined based on a back projection method.
In the model three-dimensional position determining module, the model three-dimensional position information of the target model corresponding to the target object, namely the 3D posture of the target model, is first determined based on forward driving; then the real camera external parameters corresponding to the target model are determined based on the two-dimensional position information of the target object in the image two-dimensional coordinate system, the model three-dimensional position information and a solving algorithm; finally, the second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system, namely the final 3D posture of the target model, is determined based on these camera parameters and the back projection method.
Embodiments of the present application also provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the steps of the method according to any of the above embodiments.
Further, please refer to fig. 15, where fig. 15 is a schematic structural diagram of a terminal according to an embodiment of the present application. As shown in fig. 15, the terminal 1500 may include: at least one central processor 1501, at least one network interface 1504, a user interface 1503, memory 1505, at least one communication bus 1502.
The communication bus 1502 is used to realize connection communication among these components.
The user interface 1503 may include a Display (Display) and a Camera (Camera), and the optional user interface 1503 may also include a standard wired interface and a standard wireless interface.
The network interface 1504 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
The central processor 1501 may include one or more processing cores, among others. The central processor 1501 connects various parts within the entire terminal 1500 using various interfaces and lines, and performs various functions of the terminal 1500 and processes data by operating or executing instructions, programs, code sets, or instruction sets stored in the memory 1505 and calling data stored in the memory 1505. Optionally, the central Processing unit 1501 may be implemented in at least one hardware form of a Digital Signal Processing (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The CPU 1501 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. Wherein, the CPU mainly processes an operating system, a user interface, an application program and the like; the GPU is used for rendering and drawing the content required to be displayed by the display screen; the modem is used to handle wireless communications. It is to be understood that the modem may be implemented by a single chip without being integrated into the central processor 1501.
The Memory 1505 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 1505 includes a non-transitory computer-readable storage medium. The memory 1505 may be used to store instructions, programs, code sets, or instruction sets. The memory 1505 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like; the stored data area may store the data and the like referred to in the above method embodiments. The memory 1505 may optionally be at least one storage device located remotely from the central processor 1501. As shown in fig. 15, the memory 1505, which is a kind of computer storage medium, may include an operating system, a network communication module, a user interface module, and an image processing program.
In the terminal 1500 shown in fig. 15, the user interface 1503 is mainly used as an interface for providing input for a user, and acquiring data input by the user; the central processing unit 1501 may be configured to call the image processing program stored in the memory 1505, and specifically perform the following operations:
identifying a target object in a target image, and determining two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and relative depth of each object key point in the target object relative to a reference point in the target object; determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth; and determining second three-dimensional position information of each model key point in the target model corresponding to the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the first three-dimensional position information.
Optionally, determining two-dimensional position information of each object key point in the target object in the image two-dimensional coordinate system and a relative depth to the reference point in the target object includes: acquiring a two-dimensional coordinate probability distribution map of each object key point in the target object in the image two-dimensional coordinate system, and acquiring a one-dimensional coordinate probability distribution map of each object key point in the target object in a one-dimensional coordinate system perpendicular to the image two-dimensional coordinate system; and determining two-dimensional position information of each object key point in the target object in the image two-dimensional coordinate system according to the two-dimensional coordinate probability distribution map, and determining the relative depth of each object key point in the target object relative to the reference point in the target object according to the one-dimensional coordinate probability distribution map.
Optionally, determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth, including: determining three-dimensional position information of each object key point in the target object in a camera three-dimensional coordinate system according to the two-dimensional position information, the relative depth and preset camera internal parameters; and determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and preset camera external parameters.
Optionally, determining, according to the two-dimensional position information and the first three-dimensional position information, second three-dimensional position information of each model key point in the target model corresponding to the target object in the world three-dimensional coordinate system, includes: determining a driving rotation angle of each object key point in the target object relative to the initial rotation angle according to the first three-dimensional position information; driving each model key point in a target model corresponding to the target object to move according to the driving rotation angle so as to enable the posture of the target model to be the same as the posture of the target object in the target image; obtaining model three-dimensional position information of each model key point in the target model after movement in a world three-dimensional coordinate system, and determining second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system according to the two-dimensional position information and the model three-dimensional position information.
Optionally, determining second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system according to the two-dimensional position information and the model three-dimensional position information, including: determining real camera external parameters corresponding to the target model according to the two-dimensional position information and the model three-dimensional position information; and determining second three-dimensional position information of each model key point in the target model in a world three-dimensional coordinate system according to the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and the real camera external parameters.
Optionally, determining a driving rotation angle of each object key point in the target object relative to the initial rotation angle includes: determining a normal vector corresponding to each object key point in the target object; determining an initial relationship between a parent node and a child node in each object key point in the target object according to the position information of the parent node in each object key point in the target object, the position information of the child node and the normal vector; and determining the driving rotation angle of each object key point in the target object relative to the initial rotation angle according to the position information of the parent node in each object key point in the target object, the position information of the child node and the initial relationship between the parent node and the child node.
Optionally, driving, according to the driving rotation angle, each model key point in the target model corresponding to the target object to move, includes: and driving each model key point in the target model corresponding to the target object to move by a forward driving method based on the driving rotation angle.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of modules is merely a division of logical functions, and an actual implementation may have another division, for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
Modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It should be noted that, for the sake of simplicity, the above-mentioned method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present application is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The image processing method, apparatus, storage medium and terminal provided by the present application have been described in detail above; the scope of protection of the present application is defined by the following claims.

Claims (10)

1. An image processing method, characterized in that the method comprises:
identifying a target object in a target image, and determining two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and relative depth of each object key point in the target object relative to a reference point in the target object;
determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth;
and determining second three-dimensional position information of each model key point in the target model corresponding to the target object in the world three-dimensional coordinate system according to the two-dimensional position information and the first three-dimensional position information.
2. The method of claim 1, wherein the determining two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and a relative depth to a reference point in the target object comprises:
acquiring a two-dimensional coordinate probability distribution map of each object key point in the target object in an image two-dimensional coordinate system, and acquiring a one-dimensional coordinate probability distribution map of each object key point in the target object in a one-dimensional coordinate system perpendicular to the image two-dimensional coordinate system;
and determining two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system according to the two-dimensional coordinate probability distribution map, and determining the relative depth of each object key point in the target object relative to a reference point in the target object according to the one-dimensional coordinate probability distribution map.
3. The method according to claim 1 or 2, wherein the determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth comprises:
determining three-dimensional position information of each object key point in the target object in a camera three-dimensional coordinate system according to the two-dimensional position information, the relative depth and preset camera internal parameters;
and determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and preset camera external parameters.
4. The method according to claim 3, wherein the determining, according to the two-dimensional position information and the first three-dimensional position information, second three-dimensional position information of each model key point in the target model corresponding to the target object in the world three-dimensional coordinate system includes:
determining a driving rotation angle of each object key point in the target object relative to an initial rotation angle according to the first three-dimensional position information;
driving each model key point in a target model corresponding to the target object to move according to the driving rotation angle, so that the posture of the target model is the same as the posture of the target object in the target image;
obtaining model three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system after movement, and determining second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system according to the two-dimensional position information and the model three-dimensional position information.
5. The method of claim 4, wherein determining second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system according to the two-dimensional position information and the model three-dimensional position information comprises:
determining real camera external parameters corresponding to the target model according to the two-dimensional position information and the model three-dimensional position information;
and determining second three-dimensional position information of each model key point in the target model in the world three-dimensional coordinate system according to the three-dimensional position information of each object key point in the target object in the camera three-dimensional coordinate system and the real camera external parameters.
6. The method of claim 4, wherein determining the driving rotation angle of each object key point in the target object relative to an initial rotation angle comprises:
determining a normal vector corresponding to each object key point in the target object;
determining an initial relationship between a parent node and a child node in each object key point in the target object according to the position information of the parent node in each object key point in the target object, the position information of the child node and the normal vector;
and determining the driving rotation angle of each object key point in the target object relative to the initial rotation angle according to the position information of the parent node in each object key point in the target object, the position information of the child node and the initial relationship between the parent node and the child node.
7. The method according to claim 4, wherein the driving each model key point in the target model corresponding to the target object to move according to the driving rotation angle comprises:
and driving each model key point in the target model corresponding to the target object to move by a forward driving method based on the driving rotation angle.
8. An image processing apparatus, characterized in that the apparatus comprises:
the object two-dimensional position determining module is used for identifying a target object in a target image, and determining two-dimensional position information of each object key point in the target object in an image two-dimensional coordinate system and relative depth of each object key point relative to a reference point in the target object;
the object three-dimensional position determining module is used for determining first three-dimensional position information of each object key point in the target object in a world three-dimensional coordinate system according to the two-dimensional position information and the relative depth;
and the model three-dimensional position determining module is used for determining second three-dimensional position information of each model key point in the target model corresponding to the target object in the world three-dimensional coordinate system according to the two-dimensional position information and the first three-dimensional position information.
9. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of the method according to any of claims 1 to 7.
10. A terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor when executing the program implementing the steps of the method according to any of claims 1 to 7.
CN202110748267.9A 2021-06-30 2021-06-30 Image processing method, image processing device, storage medium and terminal Pending CN113470112A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110748267.9A CN113470112A (en) 2021-06-30 2021-06-30 Image processing method, image processing device, storage medium and terminal


Publications (1)

Publication Number Publication Date
CN113470112A true CN113470112A (en) 2021-10-01

Family

ID=77877326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110748267.9A Pending CN113470112A (en) 2021-06-30 2021-06-30 Image processing method, image processing device, storage medium and terminal

Country Status (1)

Country Link
CN (1) CN113470112A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110826357A (en) * 2018-08-07 2020-02-21 北京市商汤科技开发有限公司 Method, device, medium and equipment for three-dimensional detection and intelligent driving control of object
CN109448090A (en) * 2018-11-01 2019-03-08 北京旷视科技有限公司 Image processing method, device, electronic equipment and storage medium
CN111582207A (en) * 2020-05-13 2020-08-25 北京市商汤科技开发有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN111862299A (en) * 2020-06-15 2020-10-30 上海非夕机器人科技有限公司 Human body three-dimensional model construction method and device, robot and storage medium
CN112184914A (en) * 2020-10-27 2021-01-05 北京百度网讯科技有限公司 Method and device for determining three-dimensional position of target object and road side equipment
CN112509123A (en) * 2020-12-09 2021-03-16 北京达佳互联信息技术有限公司 Three-dimensional reconstruction method and device, electronic equipment and storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114495421A (en) * 2021-12-30 2022-05-13 山东奥邦交通设施工程有限公司 Intelligent open type road construction operation monitoring and early warning method and system
CN114495421B (en) * 2021-12-30 2022-09-06 山东奥邦交通设施工程有限公司 Intelligent open type road construction operation monitoring and early warning method and system
CN115018918A (en) * 2022-08-04 2022-09-06 南昌虚拟现实研究院股份有限公司 Three-dimensional coordinate determination method and device, electronic equipment and storage medium
CN115018918B (en) * 2022-08-04 2022-11-04 南昌虚拟现实研究院股份有限公司 Three-dimensional coordinate determination method and device, electronic equipment and storage medium
CN116386016A (en) * 2023-05-22 2023-07-04 杭州睿影科技有限公司 Foreign matter treatment method and device, electronic equipment and storage medium
CN116386016B (en) * 2023-05-22 2023-10-10 杭州睿影科技有限公司 Foreign matter treatment method and device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination