CN116029948A - Image processing method, apparatus, electronic device, and computer-readable storage medium - Google Patents

Image processing method, apparatus, electronic device, and computer-readable storage medium

Info

Publication number
CN116029948A
CN116029948A CN202111240970.5A
Authority
CN
China
Prior art keywords
image
virtual
sub
virtual sub
feature points
Prior art date
Legal status
Pending
Application number
CN202111240970.5A
Other languages
Chinese (zh)
Inventor
曾伟宏
王旭
刘晶
桑燊
刘海珊
Current Assignee
Lemon Inc Cayman Island
Original Assignee
Lemon Inc Cayman Island
Priority date
Filing date
Publication date
Application filed by Lemon Inc Cayman Island
Priority to CN202111240970.5A
Priority to PCT/SG2022/050749
Publication of CN116029948A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 - 2D [Two Dimensional] image generation
    • G06T11/40 - Filling a planar surface by adding surface attributes, e.g. colour or texture
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/20 - Image enhancement or restoration using local operators
    • G06T5/30 - Erosion or dilatation, e.g. thinning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

An image processing method, an image processing apparatus, an electronic device, and a computer-readable storage medium. The image processing method comprises the following steps: acquiring at least one first virtual sub-image; processing a selected virtual sub-image of the at least one first virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image; in response to detecting the detection object, current feature information of the detection object is acquired, wherein the current feature information is used for indicating the current state of the detection object, and the detection object comprises a plurality of target features; determining movement information of a plurality of feature points in an initial virtual image based on the current feature information, wherein the initial virtual image is obtained by superposing a first virtual sub-image and a second virtual sub-image; and driving the plurality of feature points to move in the initial virtual image according to the movement information so as to generate a current virtual image corresponding to the current state. The method can reduce the design difficulty and the driving difficulty of the virtual image and improve the design efficiency.

Description

Image processing method, apparatus, electronic device, and computer-readable storage medium
Technical Field
Embodiments of the present disclosure relate to an image processing method, apparatus, electronic device, and computer-readable storage medium.
Background
With the rapid development of the internet, avatars are widely used in emerging fields such as live streaming, short videos, and games. The use of avatars not only makes human-computer interaction more engaging but also brings convenience to users. For example, on a live-streaming platform, a host may stream using an avatar without having to appear on camera in person.
Disclosure of Invention
At least one embodiment of the present disclosure provides an image processing method including: acquiring at least one first virtual sub-image, each of the at least one first virtual sub-image corresponding to one of a plurality of target features; processing a selected virtual sub-image of the at least one first virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image; in response to detecting the detection object, current feature information of the detection object is acquired, wherein the current feature information is used for indicating the current state of the detection object, and the detection object comprises a plurality of target features; determining movement information of a plurality of feature points in an initial virtual image based on the current feature information, wherein the initial virtual image is obtained by superposing at least one first virtual sub-image and at least one second virtual sub-image; and driving the plurality of feature points to move in the initial virtual image according to the movement information so as to generate a current virtual image corresponding to the current state.
For example, the image processing method provided in an embodiment of the present disclosure further includes: acquiring a filling image; and superimposing the at least one first virtual sub-image, the at least one second virtual sub-image, and the filling image to generate the initial virtual image.
For example, the image processing method provided in an embodiment of the present disclosure further includes: acquiring depth information of each first virtual sub-image and each second virtual sub-image in the initial virtual image; and superimposing the at least one first virtual sub-image and the at least one second virtual sub-image according to the depth information of each first virtual sub-image and each second virtual sub-image to generate the initial virtual image.
For example, in an image processing method provided in an embodiment of the present disclosure, processing a selected virtual sub-image of at least one first virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image includes: the selected virtual sub-image is morphologically processed to obtain at least one second virtual sub-image associated with the selected virtual sub-image.
For example, in an image processing method provided in an embodiment of the present disclosure, the selected virtual sub-image includes an eye white image of the detection object, and the method further includes:
splitting the eye white image into a first eye white sub-image and a second eye white sub-image along the central axis of the eye white image, wherein the direction of the central axis is parallel to the length direction of the eyes of the detection object, the first eye white sub-image is located on the side of the central axis far away from the mouth of the detection object, and the second eye white sub-image is located on the side of the central axis close to the mouth; and wherein performing morphological processing on the selected virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image includes: performing morphological processing on the first eye white sub-image and the second eye white sub-image, respectively, to obtain a first sub-image and a second sub-image; filling the first sub-image according to the color and texture of the face of the detection object to obtain an upper eyelid image of the detection object; and filling the second sub-image to obtain a lower eyelid image of the detection object.
For example, in an image processing method provided in an embodiment of the present disclosure, processing a selected virtual sub-image of at least one first virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image includes: splitting the selected virtual sub-image to obtain a plurality of second virtual sub-images.
For example, in an image processing method provided in an embodiment of the present disclosure, the selected virtual sub-image includes a mouth image of the detection object, and splitting the selected virtual sub-image to obtain a plurality of second virtual sub-images includes: splitting the mouth image into two second virtual sub-images at the position where the mouth of the detection object is widest, one of the two second virtual sub-images being an upper lip image and the other being a lower lip image.
For example, the image processing method provided in an embodiment of the present disclosure further includes: calculating maximum deformation values of the plurality of feature points; and determining movement information of the plurality of feature points in the initial virtual image based on the current feature information includes: determining the movement information of the plurality of feature points in the initial virtual image based on the current feature information and the maximum deformation values.
For example, in an image processing method provided in an embodiment of the present disclosure, calculating a maximum deformation value of a plurality of feature points includes: determining a maximum deformation curve which is met by a first part of characteristic points in the plurality of characteristic points; determining the position coordinates of each feature point in the first part of feature points; substituting the position coordinates of each feature point in the first part of feature points into the maximum deformation curve to obtain the maximum deformation value of the first part of feature points.
For example, in an image processing method provided in an embodiment of the present disclosure, calculating a maximum deformation value of a plurality of feature points includes: calculating a difference value between the extreme position coordinates and the reference position coordinates of the second part of feature points in the plurality of feature points, wherein the extreme position coordinates are the coordinates of the second part of feature points in the initial virtual image; and taking the difference value as the maximum deformation value of each of the second partial characteristic points.
For example, in an image processing method provided in an embodiment of the present disclosure, determining movement information of a plurality of feature points in an initial virtual image based on current feature information and a maximum deformation value includes: determining a current state value of the current state relative to a reference state according to the current characteristic information; and calculating the current state value and the maximum deformation value, and determining movement information of a plurality of feature points in the initial virtual image.
For example, in an image processing method provided in an embodiment of the present disclosure, calculating a current state value and a maximum deformation value, determining movement information of a plurality of feature points in an initial virtual image includes: and multiplying the current state value and the maximum deformation value to obtain movement information of a plurality of feature points in the initial virtual image.
For example, in an image processing method provided in an embodiment of the present disclosure, determining a current state value of a current state relative to a reference state according to current feature information includes: obtaining a mapping relation between the characteristic information and the state value; and determining a current state value of the current state relative to the reference state according to the mapping relation and the current characteristic information.
For example, in an image processing method provided in an embodiment of the present disclosure, obtaining a mapping relationship between feature information and a state value includes: acquiring a plurality of samples, wherein each sample comprises a corresponding relation between sample characteristic information and a sample state value; and constructing a mapping function based on the corresponding relation included in each sample in the plurality of samples, wherein the mapping function represents the mapping relation between the characteristic information and the state value.
For example, in an image processing method provided in an embodiment of the present disclosure, sample feature information includes target feature information and second feature information, a sample state value includes a first value corresponding to the target feature information and a second value corresponding to the second feature information, and constructing a mapping function based on a correspondence relationship includes: constructing a linear equation set; and solving the linear equation set according to the target characteristic information, the first value, the second characteristic information and the second value to obtain a mapping function.
For example, in an image processing method provided in an embodiment of the present disclosure, the second virtual sub-image includes an upper lip image and a lower lip image, the first partial feature points include feature points in the upper lip image and feature points in the lower lip image, and determining a maximum deformation curve that the first partial feature points conform to from the plurality of feature points includes: and respectively determining a first maximum deformation curve which is met by the characteristic points in the upper lip image and a second maximum deformation curve which is met by the characteristic points in the lower lip image.
For example, in the image processing method provided in an embodiment of the present disclosure, the feature points in the upper lip image and the feature points in the lower lip image are in one-to-one correspondence and comprise n columns, the first maximum deformation curve is y1 = (x′ - n)², and the second maximum deformation curve is y2 = c - (x′ - n)², where x′ is the x′-axis coordinate of a feature point in the upper lip image or the lower lip image, the x′ axis is taken along the widest position of the mouth, y1 is the distance by which a feature point in the upper lip image moves away from the lower lip, and y2 is the distance by which a feature point in the lower lip image moves away from the upper lip.
For example, in an image processing method provided in an embodiment of the present disclosure, the selected virtual sub-image includes an eye white image of the detection object, the second virtual sub-image includes an upper eyelid image and a lower eyelid image, the reference position coordinates are the coordinates of reference points on the central axis of the eye white image corresponding to the second partial feature points, the direction of the central axis is parallel to the length direction of the eyes of the detection object, and the second partial feature points include feature points of the upper eyelid image and feature points of the lower eyelid image; calculating the difference between the limit position coordinates and the reference position coordinates of the second partial feature points among the plurality of feature points includes: calculating the difference between the limit position coordinates of the feature points in the upper eyelid image and the reference points on the central axis corresponding to the feature points in the upper eyelid image; and calculating the difference between the limit position coordinates of the feature points in the lower eyelid image and the reference points on the central axis corresponding to the feature points in the lower eyelid image.
For example, in an image processing method provided in an embodiment of the present disclosure, movement information includes a movement distance, and driving a plurality of feature points to move in an initial virtual image according to the movement information includes: the plurality of feature points are driven to move towards a preset reference position by a moving distance.
For example, in an image processing method provided in an embodiment of the present disclosure, a filling image is used to fill a gap in a superimposed sub-image, where the superimposed sub-image is obtained by superimposing two second virtual sub-images, and the gap refers to a position between the two second virtual sub-images in the superimposed sub-image, and driving a plurality of feature points to move in an initial virtual image according to movement information, including: and driving the plurality of feature points to move in the initial virtual image according to the movement information, and changing the shape of the filling image according to the size of the gap so that the filling image is matched with the gap to generate the current virtual image.
For example, in an image processing method provided in an embodiment of the present disclosure, the filling image includes an oral image, and the two second virtual sub-images include an upper lip image and a lower lip image.
At least one embodiment of the present disclosure provides an image processing apparatus including: an acquisition unit configured to acquire at least one first virtual sub-image, each of the at least one first virtual sub-image corresponding to one of a plurality of target features; a processing unit configured to process a selected virtual sub-image of the at least one first virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image; a detection unit configured to acquire current feature information of a detection object in response to detection of the detection object, the current feature information being used to indicate a current state of the detection object, the detection object including a plurality of target features; a determining unit configured to determine movement information of a plurality of feature points in an initial virtual image based on current feature information, the initial virtual image being superimposed by at least one first virtual sub-image and at least one second virtual sub-image; and a driving unit configured to drive the plurality of feature points to move in the initial virtual image according to the movement information to generate a current virtual image corresponding to the current state.
At least one embodiment of the present disclosure provides an electronic device comprising a processor; a memory including one or more computer program modules; one or more computer program modules are stored in the memory and configured to be executed by the processor, the one or more computer program modules comprising instructions for implementing the image processing methods provided by any of the embodiments of the present disclosure.
At least one embodiment of the present disclosure provides a computer-readable storage medium storing non-transitory computer-readable instructions that, when executed by a computer, may implement the image processing method provided by any of the embodiments of the present disclosure.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings of the embodiments will be briefly described below. It is apparent that the drawings in the following description relate only to some embodiments of the present disclosure and are not intended to limit the present disclosure.
FIG. 1A illustrates a flow chart of an image processing method provided by at least one embodiment of the present disclosure;
FIG. 1B schematically illustrates a plurality of first virtual sub-images provided by some embodiments of the present disclosure;
FIG. 1C schematically illustrates processing the mouth image 110 of FIG. 1B to obtain a second virtual sub-image, provided by some embodiments of the present disclosure;
FIG. 1D illustrates a schematic diagram of processing a selected virtual sub-image to obtain a second virtual sub-image provided by some embodiments of the present disclosure;
FIG. 1E illustrates a schematic diagram of a full mouth image using a fill image provided in accordance with at least one embodiment of the present disclosure;
FIG. 2A is a flow chart illustrating a method of calculating maximum deformation values for a plurality of feature points according to at least one embodiment of the present disclosure;
FIG. 2B is a schematic diagram illustrating a maximum deformation curve provided by at least one embodiment of the present disclosure;
FIG. 2C is a flow chart illustrating another method of calculating maximum deformation values for a plurality of feature points provided in accordance with at least one embodiment of the present disclosure;
FIG. 2D illustrates a schematic diagram of calculating a difference between extreme position coordinates and reference position coordinates of a feature point in an upper eyelid, provided by at least one embodiment of the present disclosure;
FIG. 3 is a flow chart illustrating a method for determining movement information for a plurality of feature points in an initial virtual image based on current feature information and a maximum deformation value according to at least one embodiment of the present disclosure;
FIG. 4 illustrates an effect schematic of an image processing method provided by at least one embodiment of the present disclosure;
FIG. 5 illustrates a schematic block diagram of an image processing apparatus provided by at least one embodiment of the present disclosure;
FIG. 6 illustrates a schematic block diagram of an electronic device provided by at least one embodiment of the present disclosure;
FIG. 7 illustrates a schematic block diagram of another electronic device provided by at least one embodiment of the present disclosure; and
FIG. 8 illustrates a schematic diagram of a computer-readable storage medium provided by at least one embodiment of the present disclosure.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present disclosure. It will be apparent that the described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments, which can be made by one of ordinary skill in the art without the need for inventive faculty, are within the scope of the present disclosure, based on the described embodiments of the present disclosure.
Unless defined otherwise, technical or scientific terms used in this disclosure should be given the ordinary meaning as understood by one of ordinary skill in the art to which this disclosure belongs. The terms "first," "second," and the like, as used in this disclosure, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. Likewise, the terms "a," "an," or "the" and similar terms do not denote a limitation of quantity, but rather denote the presence of at least one. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
Both the motion and the expression of an avatar can be driven in real time according to the motion and expression of an object (e.g., a user) detected by an electronic device. However, designing and driving an avatar are both difficult. Producing a three-dimensional avatar involves visual art design, character model production, and animation skeleton binding, and driving and presenting it further involves motion capture technology, holographic hardware technology, Augmented Reality (AR) technology, Virtual Reality (VR) technology, and the development of a driving program, so a three-dimensional avatar has a long production cycle, high implementation and driving difficulty, and high cost. Producing a two-dimensional avatar requires professional original-artwork design according to the requirements of different design platforms, drawing every animation frame one by one to form the actions and expressions used during driving, and performing material conversion in specific drawing software such as Live2D, so its implementation and driving difficulty are also high.
At least one embodiment of the present disclosure provides an image processing method, an image processing apparatus, an electronic device, and a computer-readable storage medium. The image processing method comprises the following steps: acquiring at least one first virtual sub-image, each of the at least one first virtual sub-image corresponding to one of a plurality of target features; processing a selected virtual sub-image of the at least one first virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image; in response to detecting the detection object, current feature information of the detection object is acquired, wherein the current feature information is used for indicating the current state of the detection object, and the detection object comprises a plurality of target features; determining movement information of a plurality of feature points in an initial virtual image based on the current feature information; and driving the plurality of feature points to move in an initial virtual image according to the movement information so as to generate a current virtual image corresponding to the current state, wherein the initial virtual image is obtained by superposing the at least one first virtual sub-image and the at least one second virtual sub-image. The image processing method can reduce the design difficulty of the virtual image, improve the design efficiency, and reduce the driving difficulty of the virtual image, so that the virtual image is easier to realize and drive.
Fig. 1A shows a flowchart of an image processing method according to at least one embodiment of the present disclosure.
As shown in fig. 1A, the method may include steps S10 to S50.
Step S10: at least one first virtual sub-image is acquired, each of the at least one first virtual sub-image corresponding to one of a plurality of target features.
Step S20: a selected virtual sub-image of the at least one first virtual sub-image is processed to obtain at least one second virtual sub-image associated with the selected virtual sub-image.
Step S30: in response to detecting the detection object, current feature information of the detection object is acquired, the current feature information being used to indicate a current state of the detection object, the detection object including a plurality of target features.
Step S40: and determining movement information of a plurality of feature points in an initial virtual image based on the current feature information, wherein the initial virtual image is obtained by superposing at least one first virtual sub-image and at least one second virtual sub-image.
Step S50: and driving the plurality of feature points to move in the initial virtual image according to the movement information so as to generate a current virtual image corresponding to the current state.
According to this embodiment, the first virtual sub-image can be processed to obtain the second virtual sub-image, so that only some of the virtual sub-images required by the initial virtual image need to be drawn rather than all of them, which improves the efficiency of designing the avatar and reduces the design difficulty.
In the embodiments of the present disclosure, the avatar can be generated and driven by acquiring the current state of the detection object in real time and driving the feature points in the initial virtual image to move according to the current feature information of the current state. The avatar can therefore be generated and driven without animation skeleton binding, holographic hardware, or similar technologies, and without specific drawing software (such as Live2D), which reduces the implementation difficulty and driving difficulty of the avatar.
For step S10, for example, the first virtual sub-images are drawn in advance by a designer using a drawing tool such as Photoshop and stored in a storage unit. For example, each image is drawn in a layer whose name is preset by the designer, so that the layer can later be retrieved and driven by its layer name. For example, reading the .psd file produced by the Photoshop drawing tool from the storage unit yields the at least one first virtual sub-image.
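As one possible implementation (an assumption, not something required by the disclosure), the pre-drawn layers could be read from the .psd file with the third-party psd-tools package and indexed by the layer names agreed with the designer; the file name and layer names below are placeholders:

```python
from psd_tools import PSDImage

def load_first_sub_images(psd_path="avatar.psd"):
    """Return a dict mapping preset layer names to RGBA PIL images."""
    psd = PSDImage.open(psd_path)
    layers = {}
    for layer in psd.descendants():
        if layer.is_group():
            continue
        # composite() renders this single layer, preserving its alpha channel
        layers[layer.name] = layer.composite()
    return layers

# e.g. layers["mouth"], layers["upper_eyelash"], layers["eye_white"], ...
```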
For example, the first virtual sub-image is an image of a target feature. For example, the target features are facial features such as the upper eyelashes, lower eyelashes, eye whites, mouth, pupils, and the like; accordingly, the first virtual sub-image may be an upper eyelash image, a lower eyelash image, an eye white image, a mouth image, a pupil image, and the like.
In some embodiments of the present disclosure, the first virtual sub-image is, for example, an image of the target feature drawn by the designer in one of its extreme states. For example, the upper eyelash image is an image of the upper eyelashes when the eyes are open to the maximum extent. For example, the mouth image is an image of the mouth when the mouth is opened to the maximum extent.
Fig. 1B schematically illustrates a schematic view of a plurality of first virtual sub-images provided by some embodiments of the present disclosure.
As shown in fig. 1B, the plurality of first virtual sub-images includes a mouth image 110, an upper eyelash image 120, a pupil image 130, and an eye white image 140.
It should be understood that fig. 1B only shows a portion of the first virtual sub-image, and not all of the first virtual sub-image. For example, the plurality of first virtual sub-images also includes an image of hair, an image of nose, and so forth. The designer can draw the required first virtual sub-image according to the actual need.
For step S20, in some embodiments of the present disclosure, the second virtual sub-image associated with the selected virtual sub-image may be a partial image of the target feature to which the selected virtual sub-image corresponds. In this embodiment, step S20 may include splitting the selected virtual sub-image to obtain a plurality of second virtual sub-images.
For example, the selected virtual sub-image is the mouth image 110, and the second virtual sub-images associated with the mouth image 110 may be an upper lip image and a lower lip image. In this embodiment, step S20 may include splitting the mouth image into two second virtual sub-images at the position where the mouth of the detection object is widest, one of the two second virtual sub-images being an upper lip image and the other being a lower lip image.
Fig. 1C schematically illustrates a schematic view of processing the image 110 of the mouth in fig. 1B to obtain a second virtual sub-image provided by some embodiments of the present disclosure.
As shown in fig. 1C, the image 110 of the mouth is split into a second virtual sub-image 111 and a second virtual sub-image 112 along the widest position of the mouth, i.e., along the position where the dashed line AA' is located.
The second virtual sub-image 111 is an upper lip image and the second virtual sub-image 112 is a lower lip image.
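A minimal sketch of this splitting step, assuming the mouth layer is an RGBA array and taking the row with the most opaque pixels as the widest position AA′ (an illustrative heuristic, not a requirement of the disclosure):

```python
import numpy as np

def split_mouth(mouth_rgba: np.ndarray):
    """Split an RGBA mouth layer of shape (H, W, 4) along its widest row AA'."""
    alpha = mouth_rgba[..., 3]
    # Row with the largest number of non-transparent pixels ~ widest position of the mouth
    widest_row = int(np.argmax((alpha > 0).sum(axis=1)))
    upper_lip = mouth_rgba.copy()
    lower_lip = mouth_rgba.copy()
    upper_lip[widest_row:, :, 3] = 0   # keep only the pixels above AA' (upper lip image 111)
    lower_lip[:widest_row, :, 3] = 0   # keep only the pixels below AA' (lower lip image 112)
    return upper_lip, lower_lip
```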
In some embodiments of the present disclosure, the second virtual sub-image associated with the selected virtual sub-image may be an image of other target features located around the target feature to which the selected virtual sub-image corresponds.
In some embodiments of the present disclosure, step S20 may include morphologically processing the selected virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image.
For example, the selected virtual sub-image is the eye white image 140, the upper eyelid and the lower eyelid are target features located around the eye white, and the eye white image 140 is processed to obtain an upper eyelid image and a lower eyelid image.
Fig. 1D illustrates a schematic diagram of processing a selected virtual sub-image to obtain a second virtual sub-image provided by some embodiments of the present disclosure.
In the embodiment shown in fig. 1D, the selected virtual sub-image includes an eye white image of the detection object. In this embodiment, the image processing method further includes: splitting the eye white image into a first eye white sub-image and a second eye white sub-image along the central axis of the eye white image, wherein the direction of the central axis is parallel to the length direction of the eyes of the detection object, the first eye white sub-image is located on the side of the central axis far away from the mouth of the detection object, and the second eye white sub-image is located on the side of the central axis close to the mouth.
As shown in fig. 1D, the eye white image is split into a first eye white sub-image 141 and a second eye white sub-image 142 along the central axis BB' of the alpha channel of the eye white image.
In step S20, morphological processing is performed on the first eye white sub-image 141 and the second eye white sub-image 142, respectively, to obtain a first sub-image and a second sub-image, and the first sub-image is filled according to the color and texture of the face of the detection object to obtain an upper eyelid image 143 of the detection object, and the second sub-image is filled to obtain a lower eyelid image 144 of the detection object.
For example, inversion processing is performed on the first eye white sub-image 141 and the second eye white sub-image 142, respectively, and small holes are filtered out (for example, by morphological opening, closing, or hole filling) to obtain the first sub-image and the second sub-image. The first sub-image and the second sub-image are the layer of the upper eyelid and the layer of the lower eyelid, respectively. Then, the corresponding positions of the face layer are sampled to obtain the textures and colors of the upper eyelid and the lower eyelid, and the upper eyelid layer and the lower eyelid layer are filled with them to obtain the upper eyelid image 143 and the lower eyelid image 144, respectively.
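A rough sketch of this processing under the assumption that OpenCV morphology is used; the kernel size, the choice of a closing operation, and the handling of the eye region are illustrative choices rather than details fixed by the disclosure:

```python
import cv2
import numpy as np

def eyelid_from_eye_white(eye_white_alpha: np.ndarray, face_rgba: np.ndarray):
    """eye_white_alpha: (H, W) alpha of one eye white sub-image; face_rgba: (H, W, 4) face layer.
    In practice the mask would also be clipped to a region around the eye."""
    # Invert the sub-image and filter out small holes (here: a morphological closing)
    inverted = cv2.bitwise_not(eye_white_alpha)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(inverted, cv2.MORPH_CLOSE, kernel)
    # Fill the eyelid layer with the colour/texture of the face at the corresponding positions
    eyelid = np.zeros_like(face_rgba)
    eyelid[mask > 0] = face_rgba[mask > 0]
    return eyelid
```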
For step S30, for example, the detection object may be detected by a detection device (a camera, an infrared device, or the like). For example, the detection object may be an object to be virtualized, and the object to be virtualized may be a living body such as a person or a pet. For example, the detection object may be a user who needs avatar driving, for example, a host on a live-streaming platform who selects the virtualization function provided by the platform.
In some embodiments of the present disclosure, the detection object includes features to be virtualized, and the features to be virtualized include the target features. A feature to be virtualized is a feature of the detection object that needs to be virtualized; a target feature is a key part of the detection object, namely a part that needs to be driven in the avatar obtained after the features are virtualized. For example, the features to be virtualized of the detection object may include, but are not limited to, the cheeks, shoulders, hair, facial features, and the like. Among the features to be virtualized, for example, the cheeks and shoulders are parts that do not need to be driven in the virtual image, whereas the eyebrows, upper eyelashes, lower eyelashes, mouth, upper eyelids, lower eyelids, and the like need to be driven; therefore, the target features may include, but are not limited to, the eyebrows, upper eyelashes, lower eyelashes, mouth, upper eyelids, lower eyelids, and the like.
For example, in response to the camera detecting the detection object, the current feature information of the detection object is acquired. For example, the detection object is located and detected using a plurality of face key points to obtain the current feature information of the detection object. For example, a 106-point or 280-point face key point detection algorithm may be employed, as may other applicable algorithms; embodiments of the present disclosure are not limited in this respect.
In some embodiments of the present disclosure, for example, the current feature information includes facial motion information, body posture information, and the like of the detection object, which are used to indicate the current state of the detection object. For example, the current feature information can indicate the degree of opening of the eyes of the subject, the degree of opening of the mouth, and the like.
In some embodiments of the present disclosure, the current feature information includes contrast information between a target feature of the current state and a reference feature of the detection object, the contrast information being unchanged when a distance of the detection object with respect to an image capturing device for detecting the detection object changes.
For example, when the detection object is far from or near to the camera, as long as the motion or expression of the detection object is unchanged, the current feature information is not changed, so that the shake of the avatar caused by the change of the distance between the detection object and the camera can be relieved.
For example, the reference feature may be the face, the eyes, etc. For example, the current feature information may include the ratio h1/h0 of the eyebrow-to-eye height h1 to the face height h0, the ratio h2/h0 of the height h2 between the upper and lower eyelid key points to the face height h0, the ratio h3/k0 of the pupil-to-outer-eye-corner distance h3 to the eye width k0, and the ratio s1/s0 of the mouth area s1 to the face area s0. That is, the eyebrow height is represented by the ratio h1/h0, the degree of eye opening by the ratio h2/h0, the pupil position by the ratio h3/k0, and the degree of mouth opening by the ratio s1/s0.
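A minimal sketch of assembling this ratio-based current feature information; the `lm` landmark accessor and its attribute names are assumptions standing in for whatever face key point detector is used:

```python
def current_feature_info(lm):
    """lm is a hypothetical object exposing distances/areas measured from face key points."""
    return {
        "brow_raise": lm.brow_to_eye_height / lm.face_height,      # h1 / h0
        "eye_open":   lm.eyelid_keypoint_height / lm.face_height,  # h2 / h0
        "pupil_pos":  lm.pupil_to_outer_corner / lm.eye_width,     # h3 / k0
        "mouth_open": lm.mouth_area / lm.face_area,                # s1 / s0
    }
```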
For step S40, the initial virtual image may be generated by superimposing the at least one first virtual sub-image and the at least one second virtual sub-image in response to detecting the detection object. For example, each first virtual sub-image and each second virtual sub-image is used as a layer of the initial virtual image, and the at least one first virtual sub-image and the at least one second virtual sub-image are superimposed to obtain the initial virtual image. The plurality of feature points in the initial virtual image are the feature points in each virtual sub-image (including the first virtual sub-images and the second virtual sub-images).
As shown in fig. 1A, the image processing method may include step S60 and step S70 in addition to step S10 to step S50. For example, step S60 and step S70 may be performed between step S20 and step S30.
Step S60: a fill image is acquired.
Step S70: at least one first virtual sub-image, at least one second virtual sub-image, and a filler image are superimposed to generate an initial virtual image.
In this embodiment, the filling image is used to fill in the first virtual sub-image and the second virtual sub-image, so that the initial virtual image generated from the first virtual sub-image and the second virtual sub-image is more lifelike, and the current virtual image is accordingly more lifelike.
For step S60, for example, the filling image is an oral image, and the oral image and the image 111 of the upper lip and the image 112 of the lower lip are superimposed to obtain a complete mouth image of the initial virtual image.
For step S70, a gap between the first virtual sub-image and the second virtual sub-image is filled, for example with a filling image, to generate an initial virtual image.
In other embodiments of the present disclosure, step S70 may be to superimpose at least one first virtual sub-image, at least one second virtual sub-image, and a fill image to generate an initial virtual image in response to detecting the detection object.
Fig. 1E illustrates a schematic diagram of a full mouth image using a fill image provided in at least one embodiment of the present disclosure.
As shown in fig. 1E, the fill image is an oral image 113, and the oral image 113 is superimposed with the image 111 of the upper lip and the image 112 of the lower lip to obtain a complete mouth image 150 of the initial virtual image.
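A sketch of this superposition using PIL alpha compositing (one possible implementation; the layers are assumed to be RGBA images of the same size):

```python
from PIL import Image

def compose_mouth(oral_cavity: Image.Image, lower_lip: Image.Image, upper_lip: Image.Image):
    """Stack the layers back to front: oral cavity image 113 behind, lip images 112/111 in front."""
    mouth = Image.new("RGBA", oral_cavity.size, (0, 0, 0, 0))
    mouth = Image.alpha_composite(mouth, oral_cavity)  # the filling image sits behind the lips
    mouth = Image.alpha_composite(mouth, lower_lip)
    mouth = Image.alpha_composite(mouth, upper_lip)
    return mouth
```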
In some embodiments of the present disclosure, the image processing method may further include acquiring depth information of each first virtual sub-image and each second virtual sub-image in the initial virtual image on the basis of the foregoing steps, and superimposing at least one first virtual sub-image and at least one second virtual sub-image according to the depth information of each first virtual sub-image and each second virtual sub-image to generate the initial virtual image.
The embodiment sets depth information for each virtual sub-image (including the first virtual sub-image and the second virtual sub-image) so that an initial virtual image obtained by superimposing a plurality of virtual sub-images has a three-dimensional effect, and an avatar obtained by driving a feature point in the initial virtual image to move also has a three-dimensional effect.
In some embodiments of the present disclosure, depth information of each first virtual sub-image and each second virtual sub-image may be preset by a designer and stored in a storage unit. For example, depth information is read from the storage unit, and in response to detection of the detection object, at least one first virtual sub-image and at least one second virtual sub-image are superimposed in accordance with the depth information of each first virtual sub-image and each second virtual sub-image to generate an initial virtual image.
For example, the depth information of each virtual sub-image includes a depth value of each virtual sub-image and a depth value of a feature point in each virtual sub-image.
In some embodiments of the present disclosure, each virtual sub-image corresponds to one of a plurality of features to be virtualized of the detection object, the plurality of features to be virtualized including a target feature, a depth value of each virtual sub-image is proportional to a first distance in a direction perpendicular to a face of the detection object, a depth value of a feature point in each virtual sub-image is proportional to the first distance, and the first distance is a distance between the feature to be virtualized corresponding to the virtual sub-image and an eye of the detection object. That is, in the direction perpendicular to the face of the detection object, the depth value of each virtual sub-image and the depth value of the feature point in each virtual sub-image are proportional to the distance between the virtual sub-image and the eyes of the detection object.
It is to be understood that the distance in this disclosure may be a vector, either positive or negative.
For example, the plurality of virtual sub-images includes a nose virtual sub-image corresponding to the nose of the detection object and a shoulder virtual sub-image corresponding to the shoulders of the detection object. In the direction perpendicular to the face of the detection object, the distance between the nose (a feature to be virtualized) and the eyes of the detection object is -f1, and the distance between the shoulders and the eyes of the detection object is f2, where both f1 and f2 are greater than 0. Thus, the depth value of the nose virtual sub-image is smaller than the depth value of the eye virtual sub-image, which in turn is smaller than the depth value of the shoulder virtual sub-image. This allows the nose of the avatar to be visually positioned in front of the eyes and the shoulders to be positioned behind the eyes.
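A sketch of superimposing the layers in depth order so that a sub-image with a smaller depth value is drawn in front (nose in front of the eyes, shoulders behind them); painter-style back-to-front compositing is one possible rendering choice, not mandated by the disclosure:

```python
from PIL import Image

def superimpose_by_depth(layers):
    """layers: list of (depth_value, rgba_image) pairs; all images share the same size.
    A larger depth value means farther from the viewer, so it is drawn first."""
    canvas = Image.new("RGBA", layers[0][1].size, (0, 0, 0, 0))
    for _, image in sorted(layers, key=lambda item: item[0], reverse=True):
        canvas = Image.alpha_composite(canvas, image)
    return canvas
```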
For example, for each virtual sub-image, the virtual sub-image is divided into a plurality of rectangular frames, the frame coordinates (bounding box) of each rectangular frame are extracted, and depth values are set for the upper left corner vertex and the lower right corner vertex in each rectangular frame according to the first distance, and each vertex of the rectangular frame may represent one feature point. The depth value of the vertex of the rectangular frame is proportional to the distance between the vertex and the eyes of the detection object.
For example, for the target feature eyebrow, the virtual sub-image of the eyebrow is equally divided into 3 rectangles, and the frame coordinates (X, Y) of the upper-left vertex and the lower-right vertex of each of the 3 rectangles in the image coordinate system are extracted, where X is the abscissa and Y is the ordinate. A depth value Z is set for each vertex to obtain the coordinates (X, Y, Z) of each vertex, and a W coordinate and four texture coordinates (s, t, r, q) are added. For example, each vertex has 8 dimensions: the vertex data is arranged in two rows and four columns (2, 4), the first row being (X, Y, Z, W) and the second row being (s, t, r, q). The image coordinate system, for example, takes the width direction of the virtual sub-image as the X axis, the height direction of the virtual sub-image as the Y axis, and the lower-left corner of the virtual sub-image as the origin of coordinates.
The virtual sub-image of the back hair (the portion of the hair far from the eyes of the detection object) is divided into more rectangles than the virtual sub-image of the eyebrow. In the back-hair virtual sub-image, the depth value (i.e., the Z coordinate) of the middle region is smaller than the depth values of the two sides, so as to form the curved shape of the back hair.
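A sketch of building the 2 x 4 per-vertex data described above for one rectangle of a sub-image; the proportionality constant k and the texture coordinate values are assumptions that only preserve the stated structure:

```python
import numpy as np

def rectangle_vertices(x_min, y_min, x_max, y_max, first_distance, k=1.0):
    """Return (2, 4) arrays for the upper-left and lower-right vertices of one rectangle:
    row 0 = (X, Y, Z, W), row 1 = (s, t, r, q)."""
    z = k * first_distance  # depth value proportional to the distance from the eyes
    upper_left  = np.array([[x_min, y_max, z, 1.0],    # X, Y, Z, W
                            [0.0,   1.0,   0.0, 1.0]])  # s, t, r, q
    lower_right = np.array([[x_max, y_min, z, 1.0],
                            [1.0,   0.0,   0.0, 1.0]])
    return upper_left, lower_right
```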
In some embodiments of the present disclosure, the image processing method may further include calculating a maximum deformation value of the plurality of feature points in addition to the steps S10 to S50, so that the step S40 includes determining movement information of the plurality of feature points in the initial virtual image based on the current feature information and the maximum deformation value. For example, calculation of the maximum deformation values of the plurality of feature points is performed before step S40.
At least one embodiment of step S40 is described below in conjunction with fig. 2A, 2B, 2C, 2D and fig. 3.
Fig. 2A is a flowchart illustrating a method for calculating maximum deformation values of a plurality of feature points according to at least one embodiment of the present disclosure.
As shown in fig. 2A, the method may include steps S210 to S230.
Step S210: and determining the maximum deformation curve which is met by the first part of characteristic points in the plurality of characteristic points.
Step S220: position coordinates of each of the first partial feature points are determined.
Step S230: substituting the position coordinates of each feature point in the first part of feature points into the maximum deformation curve to obtain the maximum deformation value of the first part of feature points.
In this embodiment, the maximum deformation value is obtained through the maximum deformation curve, which prevents feature points at the edge portions of the virtual sub-image from being missed, so that the current virtual image is more complete and closer to the actual situation. For example, if feature points at the edge portions (for example, the mouth corners) of the upper lip image and the lower lip image were missed and therefore could not be driven, the mouth corners of the current virtual image would not close when the mouth of the detection object is closed, making the current virtual image incomplete and unrealistic.
For step S210, in some embodiments of the present disclosure, the first partial feature points are a subset of the plurality of feature points, namely feature points that can be fitted to a curve when the target feature is at its maximum deformation. The curve fitted to the first partial feature points when the target feature is at its maximum deformation is the maximum deformation curve.
For example, the second virtual sub-image includes an upper lip image in which feature points can be fitted to a curve when the mouth is opened to the maximum, and a lower lip image in which feature points can be fitted to a curve when the mouth is opened to the maximum. Thus, the first partial feature points may include feature points in the upper lip image and feature points in the lower lip image. In this embodiment, step S210 includes: and respectively determining a first maximum deformation curve which is met by the characteristic points in the upper lip image and a second maximum deformation curve which is met by the characteristic points in the lower lip image.
Fig. 2B illustrates a schematic diagram of a maximum deformation curve provided by at least one embodiment of the present disclosure.
As shown in fig. 2B, an upper lip image 111 and a lower lip image 112 are included in the scene. The upper lip image 111 and the lower lip image 112 are images corresponding when the mouth is opened to the maximum extent. The upper lip image 111 includes a plurality of feature points, and the lower lip image 112 includes a plurality of feature points.
The feature points in the upper lip image and the feature points in the lower lip image are in one-to-one correspondence, and the feature points in the upper lip image and the feature points in the lower lip image comprise n columns. The first maximum deformation curve is:
y1 = (x′ - n)²
The second maximum deformation curve is:
y2 = c - (x′ - n)²
where x′ is the x′-axis coordinate of a feature point in the upper lip image or the lower lip image. In the mouth image shown in fig. 2B, the widest position of the mouth is taken as the x′ axis; y1 is the distance by which a feature point in the upper lip image moves away from the lower lip, and y2 is the distance by which a feature point in the lower lip image moves away from the upper lip.
In some embodiments of the present disclosure, n is an odd number.
In some embodiments of the present disclosure, the maximum deformation curve that the first partial feature points conform to may be fitted in advance by a designer from a plurality of samples.
For step S220, the position coordinates of the plurality of feature points in the upper lip image and the position coordinates of the plurality of feature points in the lower lip image are determined. For example, the X coordinates of the plurality of feature points in the upper lip image are extracted from the coordinates (X, Y, Z, W) described above, and these X coordinates are converted into x′ coordinates in the mouth image shown in fig. 2B.
For step S230, for example, the x′-axis coordinates of the plurality of feature points on the upper lip include x′_upper,1 to x′_upper,n; substituting x′_upper,1 to x′_upper,n into y1 = (x′ - n)² gives the maximum deformation value y1 of each feature point on the upper lip.
For example, the x′-axis coordinates of the feature points on the lower lip include x′_lower,1 to x′_lower,n; substituting x′_lower,1 to x′_lower,n into y2 = c - (x′ - n)² gives the maximum deformation value y2 of each feature point on the lower lip.
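A short sketch of steps S210-S230 for the lip feature points, evaluating the two maximum deformation curves at each column coordinate x′ (n and c as defined above):

```python
def lip_max_deformation(upper_x, lower_x, n, c):
    """upper_x / lower_x: x'-axis coordinates of the lip feature point columns."""
    y1 = [(x - n) ** 2 for x in upper_x]      # max distance an upper-lip point moves away from the lower lip
    y2 = [c - (x - n) ** 2 for x in lower_x]  # max distance a lower-lip point moves away from the upper lip
    return y1, y2
```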
Fig. 2C is a flowchart illustrating another method for calculating maximum deformation values of a plurality of feature points according to at least one embodiment of the present disclosure.
As shown in fig. 2C, the method may include steps S240 to S250.
Step S240: and calculating the difference value between the extreme position coordinates and the reference position coordinates of the second part of feature points in the plurality of feature points, wherein the extreme position coordinates are the coordinates of the second part of feature points in the initial virtual image.
Step S250: and taking the difference value as the maximum deformation value of each of the second partial characteristic points.
The embodiment directly determines the maximum deformation value according to the limit position coordinate and the reference position coordinate, and has simple calculation and easy realization.
For step S240, the second partial feature point is, for example, other feature points than the first partial feature point among the plurality of feature points. For example, feature points in the upper eyelash image, the upper eyelid image, and the lower eyelid image are second partial feature points. The maximum deformation value may be determined according to steps S240 to S250 for the feature points in the upper eyelash image, the upper eyelid image, and the lower eyelid image.
For example, the limit position coordinate of a second partial feature point refers to the coordinate of that feature point in its moving direction, and the reference position coordinate refers to the coordinate of the corresponding reference point in that moving direction. For example, the moving direction of the upper eyelashes is perpendicular to the width direction of the eyes, so the limit position coordinates of the feature points on the upper eyelashes may be their coordinates in the direction perpendicular to the width direction of the eyes. For example, if the width direction of the eyes is a first direction, the moving direction of the upper eyelashes is a second direction. The first direction coincides, for example, with the x′-axis direction in fig. 2B above, i.e., with the X-axis direction in the image coordinate system.
For example, the reference point of the feature point in the second partial feature point is a point on the reference line having the same X-axis coordinate as the feature point.
For example, for the feature points on the upper eyelash image, the upper eyelid image, and the lower eyelid image, the reference line is the central axis of the eye white layer (for example, the dashed line BB' in fig. 1D), and the reference position coordinates of the feature points are the coordinates in the second direction of the reference points corresponding to the feature points on the central axis of the eye white layer.
For example, the selected virtual sub-image includes an eye white image of the detection object, the second virtual sub-image includes an upper eyelid image and a lower eyelid image, the reference position coordinates are the coordinates of reference points on the central axis of the eye white image corresponding to the second partial feature points, and the second partial feature points include the feature points of the upper eyelid image and the feature points of the lower eyelid image. Step S240 in this embodiment includes: calculating the difference between the limit position coordinates of the feature points in the upper eyelid image and the reference points on the central axis corresponding to the feature points in the upper eyelid image; and calculating the difference between the limit position coordinates of the feature points in the lower eyelid image and the reference points on the central axis corresponding to the feature points in the lower eyelid image.
Fig. 2D illustrates a schematic diagram of calculating a difference between extreme position coordinates and reference position coordinates of a feature point in an eyelid in accordance with at least one embodiment of the present disclosure.
As shown in fig. 2D, a feature point Q0 in the upper eyelid image has coordinates (x0, yQ0), where x0 is the coordinate value of Q0 on the X axis (the eye width direction) shown in fig. 2D and yQ0 is the coordinate value of Q0 on the Y axis (the moving direction) shown in fig. 2D. The reference point of Q0 is the point n1 on the central axis of the eye white layer, whose coordinates are (x0, yn0). The difference between the limit position coordinates and the reference position coordinates of Q0 is yQ0 − yn0.
Similarly, for example, a feature point Q'0 in the lower eyelid image has coordinates (x0, yQ'0), its reference point is the point n1 on the central axis of the eye white layer with coordinates (x0, yn0), and the difference between the limit position coordinates and the reference position coordinates of Q'0 is yn0 − yQ'0.
For example, a feature point P1 in the upper eyelash image has coordinates (x1, yp1), its reference point is the point m1 on the central axis of the eye white layer with coordinates (x1, ym1), and the difference between the limit position coordinates and the reference position coordinates of P1 is yp1 − ym1.
The above yQ0, yQ'0, yn0, yp1 and ym1 are all greater than or equal to 0.
For step S250, for example, yp1 − ym1 is taken as the maximum deformation value of the feature point P1.
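As a concrete illustration of steps S240 to S250, the following Python sketch computes maximum deformation values as the distance, along the moving direction, between each feature point's limit position and its reference point on the central axis of the eye white layer. The array layout and the names max_deformation_values and reference_line_y are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def max_deformation_values(feature_pts, reference_line_y):
    """Maximum deformation per feature point (steps S240-S250): the distance
    between the limit position y-coordinate and the reference position
    y-coordinate along the moving direction (Y axis).

    feature_pts      : (N, 2) array of (x, y) limit-position coordinates in the
                       initial virtual image (e.g. upper-eyelid feature points).
    reference_line_y : callable mapping an x coordinate to the y coordinate of
                       the corresponding reference point on the central axis.
    """
    x, y_limit = feature_pts[:, 0], feature_pts[:, 1]
    y_ref = np.array([reference_line_y(xi) for xi in x])
    # Upper eyelid/eyelash points use y_limit - y_ref; lower eyelid points use
    # y_ref - y_limit. The absolute value covers both cases.
    return np.abs(y_limit - y_ref)
```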
Fig. 3 is a flowchart illustrating a method for determining movement information of a plurality of feature points in an initial virtual image based on current feature information and a maximum deformation value according to at least one embodiment of the present disclosure.
As shown in fig. 3, the method may include steps S310 to S320.
Step S310: determining the current state value of the current state relative to a reference state according to the current feature information.
Step S320: calculating with the current state value and the maximum deformation value to determine the movement information of the plurality of feature points in the initial virtual image.
The current state value may be a parameter for embodying a relationship between the current feature information and the reference feature information when the detection object is in the reference state.
For example, the reference state is the state in which the eye is opened to the maximum extent, the target feature is the eyelashes, and the current state value may be a parameter representing the relationship between the current position of the eyelashes and their position when the eye is opened to the maximum extent.
In some embodiments of the present disclosure, step S310 may include obtaining a mapping relationship between feature information and state values, and determining a current state value of the target feature with respect to the reference state according to the mapping relationship and the current feature information.
For example, obtaining the mapping between the feature information and the state value may include: acquiring a plurality of samples, wherein each sample comprises a corresponding relation between sample characteristic information of a target characteristic and a sample state value; based on the corresponding relation, a mapping function is constructed, and the mapping function represents the mapping relation between the characteristic information and the state value.
For example, the sample feature information includes target feature information and second feature information, and the sample state value includes a first value corresponding to the target feature information and a second value corresponding to the second feature information. The construction of the mapping function comprises the steps of constructing a linear equation set, substituting the target characteristic information and the first value, the second characteristic information and the second value into the linear equation set respectively, and solving the linear equation set to obtain the mapping function.
For example, the linear equation set is a system of two linear equations in two unknowns. The target feature information is the position coordinate Y0, in the moving direction, of a feature point on the eyelash image when the eye in the sample is opened to the maximum extent, and the first value is 0; the second feature information is the position coordinate Y1, in the moving direction, of a feature point on the eyelash image when the eye in the sample is closed, and the second value is 1. The target feature information and the second feature information may be the results of statistical calculation over a plurality of samples. The system of linear equations is constructed from (Y0, 0) and (Y1, 1) and solved to obtain the mapping function u = a·v + b, where a and b are the values obtained by solving the system, v is the current feature information, and u is the current state value. For example, the current position coordinate v (i.e., the coordinate on the Y axis) of each feature point on the eyelashes is substituted into the mapping function u = a·v + b to obtain the current state value u corresponding to the current feature information v.
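A minimal sketch of this fitting step in Python, assuming the two sample statistics are already available as (Y0, 0) and (Y1, 1); the function name build_mapping and the numeric values in the usage example are hypothetical.

```python
import numpy as np

def build_mapping(y_open_max, y_closed):
    """Fit u = a*v + b through (Y0, 0) and (Y1, 1), where Y0 is the eyelash
    y-coordinate with the eye opened to the maximum (state value 0) and Y1 is
    the eyelash y-coordinate with the eye closed (state value 1)."""
    A = np.array([[y_open_max, 1.0],
                  [y_closed,   1.0]])
    u = np.array([0.0, 1.0])
    a, b = np.linalg.solve(A, u)   # solve the two linear equations for a and b
    return lambda v: a * v + b

# Usage (illustrative numbers): a point halfway between Y0 and Y1 maps to 0.5.
state_of = build_mapping(y_open_max=120.0, y_closed=80.0)
current_state = state_of(100.0)    # -> 0.5
```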
In other embodiments of the present disclosure, the mapping relationship between the feature information and the state value may also be a mapping relationship table, and the present disclosure is not limited to the representation form of the mapping relationship.
In some embodiments of the present disclosure, step S320 includes multiplying the current state value and the maximum deformation value to obtain movement information of a plurality of feature points in the initial virtual image.
For example, for the feature point P1 in the upper eyelashes, if the current state value is 0.5, the movement information of the feature point P1 is the product of 0.5 and yp1 − ym1.
Referring back to fig. 1A, in some embodiments of the present disclosure, the movement information includes a movement distance. Step S50 in this embodiment includes: the plurality of feature points are driven to move towards a preset reference position by a moving distance.
For example, for the mouth image, the reference position is the position where the mouth is widest, and the plurality of feature points of the upper lip are driven to move by the movement distance from their initial positions toward the position where the mouth is widest; the initial positions are the positions of the feature points in the upper lip image when the mouth is opened to the maximum extent, and the mouth is opened to the maximum extent in the initial virtual image.
For another example, the reference position is the central axis of the eye white, and the plurality of feature points of the upper eyelid are driven to move by the movement distance from their initial positions toward the central axis of the eye white; the initial positions are the positions of the feature points of the upper eyelid in the upper eyelid image when the eyes are opened to the maximum extent, and the eyes are opened to the maximum extent in the initial virtual image.
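Putting step S320 and the driving of step S50 together, a sketch under the same assumptions as the previous snippets: the movement distance is the product of the current state value and the maximum deformation value, and each feature point is then moved by that distance from its initial position toward the reference position. The function name drive_feature_points is hypothetical.

```python
import numpy as np

def drive_feature_points(initial_y, reference_y, max_deform, state_value):
    """Move feature points toward the reference position (step S50).

    initial_y   : (N,) y-coordinates of the feature points in the initial virtual image
    reference_y : (N,) y-coordinates of the corresponding reference points
    max_deform  : (N,) maximum deformation values from steps S240-S250
    state_value : current state value from step S310
                  (0 = no movement, 1 = moved fully to the reference position)
    """
    move_dist = state_value * max_deform           # step S320: product of state value and max deformation
    direction = np.sign(reference_y - initial_y)   # move toward the preset reference position
    return initial_y + direction * move_dist
```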
In some embodiments of the present disclosure, the filling image is used to fill a gap in a superimposed sub-image, where the superimposed sub-image is superimposed by two second virtual sub-images, and the gap refers to a position between the two second virtual sub-images in the superimposed sub-image, in which embodiment step S50 includes: and driving the plurality of feature points to move in the initial virtual image according to the movement information, and changing the shape of the filling image according to the size of the gap so that the filling image is matched with the gap to generate the current virtual image.
According to the movement information, driving the plurality of feature points to move in the initial virtual image may be performed according to the method described above, which is not described herein.
As shown in fig. 1C and 1E, the filling image is, for example, the oral image 113, and the two second virtual sub-images are the upper lip image 111 and the lower lip image 112, respectively. As the feature points in the upper lip image 111 and the lower lip image 112 move, the shape of the gap between the upper lip image 111 and the lower lip image 112 changes. The oral image 113 is stretched or compressed so that its shape fills the gap between the upper lip image 111 and the lower lip image 112, and a complete mouth image is presented.
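One way to realize this stretching is sketched below: the filling layer is simply resized to the current size of the gap between the two lip layers before compositing. The use of OpenCV's cv2.resize and the function name fit_fill_image are assumptions for illustration; the disclosure does not prescribe a particular resampling routine.

```python
import cv2  # assumed helper for resizing only

def fit_fill_image(fill_img, gap_width, gap_height):
    """Stretch or compress the filling (oral) layer so it matches the current
    gap between the upper-lip and lower-lip layers."""
    w = max(int(round(gap_width)), 1)
    h = max(int(round(gap_height)), 1)
    # cv2.resize takes (width, height); linear interpolation keeps the texture smooth.
    return cv2.resize(fill_img, (w, h), interpolation=cv2.INTER_LINEAR)
```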
In the above embodiments of the present disclosure, the movement information of the target feature is movement information relative to the reference position. The feature points of the target feature are therefore driven to move toward the reference position, and there is no need to additionally draw the final limit positions to which the feature points move, which improves the efficiency of designing the virtual image.
In other embodiments of the present disclosure, the method shown above with reference to fig. 1A may further include, in addition to steps S10 to S70: according to the current characteristic information, a pitch angle, a yaw angle and a roll angle are calculated, a rotation matrix is calculated according to the pitch angle, the yaw angle and the roll angle, characteristic points in the initial virtual image are driven to move according to the movement information, and the rotation of the initial virtual image is controlled according to the rotation matrix, so that a current virtual image corresponding to the current state is generated.
For example, the detection object is a user, and the pitch angle (pitch), the yaw angle (yaw) and the roll angle (roll) are calculated according to the 280 key points of the face of the user. The pitch angle is the angle by which the face rotates about a first axis, the yaw angle is the angle by which the face rotates about a second axis, and the roll angle is the angle by which the face rotates about a third axis. The first axis is perpendicular to the height direction of the face, the second axis is parallel to the height direction of the face, and the third axis is perpendicular to both the first axis and the second axis. The pitch angle, the yaw angle and the roll angle are calculated according to the current feature information, and the rotation matrix is calculated from these angles with reference to related algorithms in the related art, which will not be repeated in the present disclosure.
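For reference, one common way to compose the rotation matrix from the three angles is shown below; the axis conventions and the multiplication order are assumptions, since the disclosure defers to algorithms in the related art.

```python
import numpy as np

def rotation_matrix(pitch, yaw, roll):
    """Compose a 3x3 rotation matrix from pitch, yaw and roll given in radians."""
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])   # rotation about the first axis
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])   # rotation about the second axis
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])   # rotation about the third axis
    return Rz @ Ry @ Rx   # one possible composition order
```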
In this embodiment, the initial virtual image is controlled to rotate according to the rotation matrix while driving the movement of the feature points in the initial virtual image according to the movement information to generate the current virtual image corresponding to the current state.
For example, in response to the head rotation of the detection object, the current virtual image presented by the electronic device is changed from a virtual image of the front face of the detection object to a virtual image of the side face of the detection object after the head rotation.
In this embodiment, the rotation of the initial virtual image may be controlled according to the rotation matrix, making the current virtual image more realistic and vivid.
Fig. 4 illustrates an effect schematic diagram of an image processing method provided in at least one embodiment of the present disclosure.
As shown in fig. 4, the effect diagram includes a state 401 of the detection object at the first time and a state 402 at the second time, respectively.
Also included in the effect diagram is a current virtual image 403 of the detection object presented in the electronic device at a first moment in time and a current virtual image 404 of the detection object presented in the electronic device at a second moment in time.
As shown in fig. 4, at the first moment the eyes are open and the mouth is closed; accordingly, the eyes in the current virtual image 403 of the detection object shown in the electronic device are also open and the mouth is also closed.
As shown in fig. 4, at the second moment, the eyes are closed and the mouth is open, and accordingly, the eyes in the current virtual image 404 of the detection object shown in the electronic device are also closed and the mouth is also open.
Fig. 5 shows a schematic block diagram of an image processing apparatus 500 provided by at least one embodiment of the present disclosure.
For example, as shown in fig. 5, the image processing apparatus 500 includes an acquisition unit 510, a processing unit 520, a detection unit 530, a determination unit 540, and a driving unit 550.
The acquisition unit 510 is configured to acquire at least one first virtual sub-image, wherein the at least one first virtual sub-image each corresponds to one of a plurality of target features. The acquisition unit 510 may perform step S10 described in fig. 1A, for example.
The processing unit 520 is configured to process a selected virtual sub-image of the at least one first virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image. The processing unit 520 may perform step S20 described in fig. 1A, for example.
The detection unit 530 is configured to acquire current feature information of a detection object including a plurality of target features in response to detecting the detection object, the current feature information indicating a current state of the detection object. The detection unit 530 may perform step S30 described in fig. 1A, for example.
The determining unit 540 is configured to determine movement information of a plurality of feature points in an initial virtual image based on the current feature information, the initial virtual image being superimposed by at least one first virtual sub-image and at least one second virtual sub-image. The determination unit 540 may perform, for example, step S40 described in fig. 1A.
The driving unit 550 is configured to drive the plurality of feature points to move in the initial virtual image according to the movement information to generate a current virtual image corresponding to the current state. The driving unit 550 may perform step S50 described in fig. 1A, for example.
For example, the acquisition unit 510, the processing unit 520, the detection unit 530, the determination unit 540, and the driving unit 550 may be hardware, software, firmware, or any feasible combination thereof. For example, they may be dedicated or general-purpose circuits, chips, or devices, or a combination of a processor and a memory. The embodiments of the present disclosure do not limit the specific implementation forms of the above units.
It should be noted that, in the embodiment of the present disclosure, each unit of the image processing apparatus 500 corresponds to each step of the foregoing image processing method, and the specific function of the image processing apparatus 500 may refer to the related description of the image processing method, which is not repeated herein. The components and structures of the image processing apparatus 500 shown in fig. 5 are merely exemplary and not limiting, and the image processing apparatus 500 may further include other components and structures as desired.
At least one embodiment of the present disclosure also provides an electronic device comprising a processor and a memory including one or more computer program modules. One or more computer program modules are stored in the memory and configured to be executed by the processor, the one or more computer program modules comprising instructions for implementing the image processing methods described above. The electronic equipment can reduce the design difficulty of the virtual image, improve the design efficiency and reduce the driving difficulty.
Fig. 6 is a schematic block diagram of an electronic device provided by some embodiments of the present disclosure. As shown in fig. 6, the electronic device 800 includes a processor 810 and a memory 820. Memory 820 is used to store non-transitory computer-readable instructions (e.g., one or more computer program modules). The processor 810 is configured to execute non-transitory computer readable instructions that, when executed by the processor 810, may perform one or more of the steps of the image processing method described above. The memory 820 and the processor 810 may be interconnected by a bus system and/or other forms of connection mechanisms (not shown).
For example, processor 810 may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or other form of processing unit having data processing capabilities and/or program execution capabilities. For example, the Central Processing Unit (CPU) may be an X86 or ARM architecture, or the like. The processor 810 may be a general purpose processor or a special purpose processor that may control other components in the electronic device 800 to perform the desired functions.
For example, memory 820 may comprise any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory can include, for example, random Access Memory (RAM) and/or cache memory (cache) and the like. The non-volatile memory may include, for example, read-only memory (ROM), hard disk, erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, flash memory, and the like. One or more computer program modules may be stored on the computer readable storage medium and executed by the processor 810 to implement the various functions of the electronic device 800. Various applications and various data, as well as various data used and/or generated by the applications, etc., may also be stored in the computer readable storage medium.
It should be noted that, in the embodiments of the present disclosure, specific functions and technical effects of the electronic device 800 may refer to the above description about the image processing method, which is not repeated herein.
Fig. 7 is a schematic block diagram of another electronic device provided by some embodiments of the present disclosure. The electronic device 900 is suitable for use, for example, to implement the image processing methods provided by the embodiments of the present disclosure. The electronic device 900 may be a terminal device or the like. It should be noted that the electronic device 900 illustrated in fig. 7 is merely an example, and does not impose any limitation on the functionality and scope of use of the embodiments of the present disclosure.
As shown in fig. 7, the electronic device 900 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 910 that may perform various suitable actions and processes in accordance with programs stored in a Read Only Memory (ROM) 920 or loaded from a storage 980 into a Random Access Memory (RAM) 930. In the RAM 930, various programs and data required for the operation of the electronic device 900 are also stored. The processing device 910, the ROM 920, and the RAM 930 are connected to each other by a bus 940. An input/output (I/O) interface 950 is also connected to bus 940.
In general, the following devices may be connected to the I/O interface 950: input devices 960 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 970 including, for example, a Liquid Crystal Display (LCD), speaker, vibrator, etc.; a storage device 980 including, for example, magnetic tape, hard disk, etc.; communication device 990. Communication device 990 may allow electronic device 900 to communicate wirelessly or by wire with other electronic devices to exchange data. While fig. 7 shows an electronic device 900 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided, and that electronic device 900 may alternatively be implemented or provided with more or fewer means.
For example, according to an embodiment of the present disclosure, the above-described image processing method may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program, carried on a non-transitory computer readable medium, the computer program comprising program code for performing the above-described image processing method. In such embodiments, the computer program may be downloaded and installed from a network via communication device 990, or from storage device 980, or from ROM 920. The functions defined in the image processing method provided in the embodiment of the present disclosure may be implemented when the computer program is executed by the processing apparatus 910.
At least one embodiment of the present disclosure also provides a computer-readable storage medium for storing non-transitory computer-readable instructions that, when executed by a computer, can implement the above-described image processing method. By using the computer readable storage medium, the design difficulty of the virtual image can be reduced, the design efficiency can be improved, and the driving difficulty can be reduced.
Fig. 8 is a schematic diagram of a storage medium according to some embodiments of the present disclosure. As shown in fig. 8, storage medium 1000 is used to store non-transitory computer readable instructions 1010. For example, non-transitory computer readable instructions 1010, when executed by a computer, may perform one or more steps in accordance with the image processing methods described above.
For example, the storage medium 1000 may be applied to the electronic device 800 described above. For example, the storage medium 1000 may be the memory 820 in the electronic device 800 shown in fig. 6. For example, the relevant description of the storage medium 1000 may refer to a corresponding description of the memory 820 in the electronic device 800 shown in fig. 6, which is not repeated here.
The following points need to be described:
(1) The drawings of the embodiments of the present disclosure relate only to the structures to which the embodiments of the present disclosure relate, and reference may be made to the general design for other structures.
(2) The embodiments of the present disclosure and features in the embodiments may be combined with each other to arrive at a new embodiment without conflict.
The foregoing is merely specific embodiments of the disclosure, but the scope of the disclosure is not limited thereto, and the scope of the disclosure should be determined by the claims.

Claims (24)

1. An image processing method, comprising:
obtaining at least one first virtual sub-image, wherein the at least one first virtual sub-image each corresponds to one of a plurality of target features;
processing a selected virtual sub-image of the at least one first virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image;
In response to detecting a detection object, acquiring current feature information of the detection object, wherein the current feature information is used for indicating the current state of the detection object, and the detection object comprises the plurality of target features;
determining movement information of a plurality of feature points in an initial virtual image based on the current feature information, wherein the initial virtual image is obtained by superposing at least one first virtual sub-image and at least one second virtual sub-image; and
and driving the plurality of feature points to move in the initial virtual image according to the movement information so as to generate a current virtual image corresponding to the current state.
2. The method of claim 1, further comprising:
acquiring a filling image; and
the at least one first virtual sub-image, the at least one second virtual sub-image, and the fill image are superimposed to generate the initial virtual image.
3. The method of claim 1, further comprising:
depth information of each first virtual sub-image and each second virtual sub-image in the initial virtual image is acquired, and the at least one first virtual sub-image and the at least one second virtual sub-image are overlapped according to the depth information of each first virtual sub-image and each second virtual sub-image so as to generate the initial virtual image.
4. A method according to any one of claims 1 to 3, wherein processing a selected virtual sub-image of the at least one first virtual sub-image to obtain the at least one second virtual sub-image associated with the selected virtual sub-image comprises:
morphological processing is performed on the selected virtual sub-image to obtain the at least one second virtual sub-image associated with the selected virtual sub-image.
5. The method of claim 4, wherein the selected virtual sub-image comprises an eye white image of the detection object, the method further comprising:
splitting the eye white image into a first eye white sub-image and a second eye white sub-image along the central axis of the eye white image, wherein the direction of the central axis is parallel to the length direction of the eyes of the detection object, the first eye white sub-image is positioned at one side of the central axis far away from the mouth of the detection object, the second eye white sub-image is positioned at one side of the central axis close to the mouth,
morphological processing of the selected virtual sub-image to obtain the at least one second virtual sub-image associated with the selected virtual sub-image, comprising:
Respectively carrying out morphological processing on the first eye white sub-image and the second eye white sub-image to obtain a first sub-image and a second sub-image; and
and filling the first sub-image according to the color and texture of the face of the detection object to obtain an upper eyelid image of the detection object, and filling the second sub-image to obtain a lower eyelid image of the detection object.
6. A method according to any one of claims 1 to 3, wherein processing a selected virtual sub-image of the at least one first virtual sub-image to obtain the at least one second virtual sub-image associated with the selected virtual sub-image comprises:
splitting the selected virtual sub-image to obtain a plurality of second virtual sub-images.
7. The method of claim 6, wherein the selected virtual sub-image comprises a mouth image of the detection object,
splitting the selected virtual sub-image to obtain a plurality of second virtual sub-images, including:
splitting the mouth image into two second virtual sub-images from the position of the widest mouth of the detection object, wherein one of the two second virtual sub-images is an upper lip image, and the other second virtual sub-image is a lower lip image.
8. A method according to any one of claims 1 to 3, further comprising:
calculating the maximum deformation value of the plurality of characteristic points;
wherein determining movement information of a plurality of feature points in the initial virtual image based on the current feature information includes:
and determining movement information of a plurality of feature points in the initial virtual image based on the current feature information and the maximum deformation value.
9. The method of claim 8, wherein calculating a maximum deformation value for the plurality of feature points comprises:
determining a maximum deformation curve which is met by a first part of characteristic points in the plurality of characteristic points;
determining the position coordinates of each feature point in the first part of feature points; and
substituting the position coordinates of each feature point in the first part of feature points into the maximum deformation curve to obtain the maximum deformation value of the first part of feature points.
10. The method of claim 8, wherein calculating a maximum deformation value for the plurality of feature points comprises:
calculating a difference value between a limit position coordinate and a reference position coordinate of a second part of feature points in the plurality of feature points, wherein the limit position coordinate is a coordinate of the second part of feature points in the initial virtual image; and
And taking the difference value as the maximum deformation value of each of the second partial characteristic points.
11. The method of claim 8, wherein determining the movement information of the plurality of feature points in the initial virtual image based on the current feature information and the maximum deformation value comprises:
determining a current state value of the current state relative to a reference state according to the current characteristic information; and
and calculating the current state value and the maximum deformation value, and determining the movement information of the feature points in the initial virtual image.
12. The method of claim 11, wherein calculating with the current state value and the maximum deformation value to determine the movement information of the plurality of feature points in the initial virtual image comprises:
and multiplying the current state value and the maximum deformation value to obtain the movement information of the feature points in the initial virtual image.
13. The method of claim 11, wherein determining a current state value of the current state relative to a reference state based on the current characteristic information comprises:
obtaining a mapping relation between the characteristic information and the state value; and
And determining the current state value of the current state relative to the reference state according to the mapping relation and the current characteristic information.
14. The method of claim 13, wherein obtaining the mapping between the characteristic information and the state value comprises:
obtaining a plurality of samples, wherein each sample comprises a corresponding relation between sample characteristic information and a sample state value; and
and constructing a mapping function based on the corresponding relation included in each sample in the plurality of samples, wherein the mapping function represents the mapping relation between the characteristic information and the state value.
15. The method of claim 14, wherein the sample characteristic information comprises target characteristic information and second characteristic information, the sample state value comprises a first value corresponding to the target characteristic information and a second value corresponding to the second characteristic information,
based on the correspondence, constructing the mapping function, including:
constructing a linear equation set; and
and solving the linear equation system according to the target characteristic information, the first value, the second characteristic information and the second value to obtain the mapping function.
16. The method of claim 9, wherein the second virtual sub-image comprises an upper lip image and a lower lip image, the first portion of feature points comprise feature points in the upper lip image and feature points in the lower lip image,
determining a maximum deformation curve which is met by a first part of characteristic points in the plurality of characteristic points, wherein the maximum deformation curve comprises the following steps:
and respectively determining a first maximum deformation curve which accords with the characteristic points in the upper lip image and a second maximum deformation curve which accords with the characteristic points in the lower lip image.
17. The method of claim 16, wherein the feature points in the upper lip image and the feature points in the lower lip image are in one-to-one correspondence, the feature points in the upper lip image and the feature points in the lower lip image comprising n columns,
the first maximum deformation curve is y 1 =(x′-n) 2 The second maximum deformation curve is y 2 =c-(x′-n) 2
Wherein said x'The widest position of the mouth is the x' axis, y, of the coordinates of the characteristic points in the upper lip image and the characteristic points in the lower lip image 1 Y is the distance that the characteristic point in the upper lip image moves away from the lower lip 2 And the characteristic point in the lower lip image is moved away from the upper lip.
18. The method according to claim 10, wherein the selected sub-image includes an eye white image of the detection object, the second virtual sub-image includes an upper eyelid image and a lower eyelid image, the reference position coordinates are coordinates of a reference point corresponding to the second partial feature point on a central axis of the eye white image, the direction of the central axis is parallel to a length direction of an eye of the detection object,
the second partial feature points include feature points of the upper eyelid image and feature points of the lower eyelid image,
calculating a difference between the extreme position coordinates and the reference position coordinates of a second partial feature point of the plurality of feature points, comprising:
calculating the difference value between the limit position coordinates of the characteristic points in the upper eyelid image and the reference points corresponding to the characteristic points in the upper eyelid image on the central axis; and
and calculating the difference value between the limit position coordinates of the characteristic points in the lower eyelid image and the reference points corresponding to the characteristic points in the lower eyelid image on the central axis.
19. The method of claim 1, wherein the movement information comprises a movement distance,
Driving the plurality of feature points to move in the initial virtual image according to the movement information, including:
and driving the plurality of characteristic points to move towards a preset reference position by the moving distance.
20. The method of claim 3, wherein the filling image is used to fill a gap in a superimposed sub-image, the superimposed sub-image being formed by superimposing two second virtual sub-images, and the gap being a position between the two second virtual sub-images in the superimposed sub-image,
driving the plurality of feature points to move in the initial virtual image according to the movement information, including:
and driving the plurality of feature points to move in the initial virtual image according to the movement information, and changing the shape of the filling image according to the gap size so that the filling image is matched with the gap to generate the current virtual image.
21. The method of claim 3 or 20, wherein the filling image comprises an oral image and the two second virtual sub-images comprise an upper lip image and a lower lip image.
22. An image processing apparatus comprising:
an acquisition unit configured to acquire at least one first virtual sub-image, wherein the at least one first virtual sub-image each corresponds to one of a plurality of target features;
A processing unit configured to process a selected virtual sub-image of the at least one first virtual sub-image to obtain at least one second virtual sub-image associated with the selected virtual sub-image;
a detection unit configured to acquire current feature information of a detection object in response to detection of the detection object, wherein the current feature information is used for indicating a current state of the detection object, and the detection object comprises the plurality of target features;
a determining unit configured to determine movement information of a plurality of feature points in the initial virtual image based on the current feature information, wherein the initial virtual image is obtained by superimposing the at least one first virtual sub-image and the at least one second virtual sub-image; and
and a driving unit configured to drive the plurality of feature points to move in the initial virtual image according to the movement information so as to generate a current virtual image corresponding to the current state.
23. An electronic device, comprising:
a processor;
a memory comprising one or more computer program instructions;
wherein the one or more computer program instructions are stored in the memory and when executed by the processor implement the image processing method of any of claims 1-21.
24. A computer readable storage medium having computer readable instructions stored non-transitory, wherein the computer readable instructions, when executed by a processor, implement the image processing method of any one of claims 1-21.
CN202111240970.5A 2021-10-25 2021-10-25 Image processing method, apparatus, electronic device, and computer-readable storage medium Pending CN116029948A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111240970.5A CN116029948A (en) 2021-10-25 2021-10-25 Image processing method, apparatus, electronic device, and computer-readable storage medium
PCT/SG2022/050749 WO2023075681A2 (en) 2021-10-25 2022-10-21 Image processing method and apparatus, and electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111240970.5A CN116029948A (en) 2021-10-25 2021-10-25 Image processing method, apparatus, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN116029948A true CN116029948A (en) 2023-04-28

Family

ID=86072799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111240970.5A Pending CN116029948A (en) 2021-10-25 2021-10-25 Image processing method, apparatus, electronic device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN116029948A (en)
WO (1) WO2023075681A2 (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101847268B (en) * 2010-04-29 2015-03-04 北京中星微电子有限公司 Cartoon human face image generation method and device based on human face images
CN105139438B (en) * 2014-09-19 2018-01-12 电子科技大学 video human face cartoon generation method
CN107705341B (en) * 2016-08-08 2023-05-12 创奇思科研有限公司 Method and device for generating user expression head portrait
CN108717719A (en) * 2018-05-23 2018-10-30 腾讯科技(深圳)有限公司 Generation method, device and the computer storage media of cartoon human face image
CN110363135B (en) * 2019-07-10 2022-09-27 广州市百果园信息技术有限公司 Method for determining degree of closing of human eyes, and method and device for controlling eyes
CN110557625A (en) * 2019-09-17 2019-12-10 北京达佳互联信息技术有限公司 live virtual image broadcasting method, terminal, computer equipment and storage medium
CN112135160A (en) * 2020-09-24 2020-12-25 广州博冠信息科技有限公司 Virtual object control method and device in live broadcast, storage medium and electronic equipment

Also Published As

Publication number Publication date
WO2023075681A2 (en) 2023-05-04
WO2023075681A3 (en) 2023-08-24

Similar Documents

Publication Publication Date Title
WO2021093453A1 (en) Method for generating 3d expression base, voice interactive method, apparatus and medium
CN106575445B (en) Fur avatar animation
EP3992919B1 (en) Three-dimensional facial model generation method and apparatus, device, and medium
WO2019154013A1 (en) Expression animation data processing method, computer device and storage medium
KR102103939B1 (en) Avatar facial expression animations with head rotation
CN110490896B (en) Video frame image processing method and device
Sharma et al. 3d face reconstruction in deep learning era: A survey
KR101145260B1 (en) Apparatus and method for mapping textures to object model
CN113272870A (en) System and method for realistic real-time portrait animation
JP2024522287A (en) 3D human body reconstruction method, apparatus, device and storage medium
WO2011075082A1 (en) Method and system for single view image 3 d face synthesis
JP2011159329A (en) Automatic 3d modeling system and method
CN116977522A (en) Rendering method and device of three-dimensional model, computer equipment and storage medium
JP4842242B2 (en) Method and apparatus for real-time expression of skin wrinkles during character animation
CN115330980A (en) Expression migration method and device, electronic equipment and storage medium
CN113313631B (en) Image rendering method and device
JP2010211732A (en) Object recognition device and method
CN113144613B (en) Model-based method for generating volume cloud
JP2002304638A (en) Device and method for generating expression animation
US8009171B2 (en) Image processing apparatus and method, and program
CN116030509A (en) Image processing method, apparatus, electronic device, and computer-readable storage medium
CN116029948A (en) Image processing method, apparatus, electronic device, and computer-readable storage medium
Yang et al. Realistic Real-time Facial Expressions Animation via 3D Morphing Target.
US20230079478A1 (en) Face mesh deformation with detailed wrinkles
CN114677476A (en) Face processing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination