CN116258799A - Digital person animation generation method and device, electronic equipment and storage medium

Info

Publication number
CN116258799A
CN116258799A
Authority
CN
China
Prior art keywords
digital person
edited
animation
webrtc
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211446978.1A
Other languages
Chinese (zh)
Inventor
沈晓彬
吴松城
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Black Mirror Technology Co ltd
Original Assignee
Xiamen Black Mirror Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Black Mirror Technology Co ltd
Priority to CN202211446978.1A
Publication of CN116258799A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/005 General purpose rendering architectures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols
    • H04N21/6437 Real-time Transport Protocol [RTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Graphics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a method, an apparatus, an electronic device, and a storage medium for generating digital person animation. The method includes: acquiring a digital person to be edited, receiving a user's editing request for the digital person to be edited, and establishing a WebRTC link with a WebRTC server, where the editing request includes editing parameters; sending the editing parameters and the digital person to be edited to the WebRTC server over the WebRTC link, so that the WebRTC server forwards them to a real-time rendering service node; and obtaining a target digital person animation from a video stream returned by the WebRTC server and displaying it at the front end, where the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameters. Because the digital person to be edited is processed, and the digital person animation generated, at the real-time rendering service node, the processing pressure on the server is reduced and the generation efficiency and real-time performance of the digital person animation are improved.

Description

Digital person animation generation method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular to a method and apparatus for generating digital person animation, an electronic device, and a storage medium.
Background
Digital persons are a product of the fusion of information science and life science: the methods of information science are used to build virtual simulations of the human body's form and function at different levels.
In the prior art, when animation is generated based on a digital person, it is typically edited and generated directly on a dedicated server, but such a server suffers from high processing pressure, low efficiency, and poor real-time performance.
Therefore, how to generate animation based on a digital person more efficiently is a technical problem that currently needs to be solved.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The embodiments of the present application provide a method, an apparatus, an electronic device, and a storage medium for generating digital person animation, so as to generate animation based on a digital person more efficiently.
In a first aspect, a method for generating a digital person animation is provided, including:
acquiring a digital person to be edited, receiving an editing request of a user for the digital person to be edited, and establishing a WebRTC link with a WebRTC server, wherein the editing request comprises editing parameters;
transmitting the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link so that the WebRTC server forwards the editing parameters and the digital person to be edited to a real-time rendering service node;
acquiring a target digital person animation according to a video stream returned by the WebRTC server, and displaying the target digital person animation at the front end, wherein the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameters;
the WebRTC server and the real-time rendering service node are connected in a socket mode.
In some embodiments, after obtaining the target digital person animation according to the video stream returned by the WebRTC server and presenting the target digital person animation at the front end, the method further includes:
receiving an export request of the user for the target digital person animation, and sending the export request and the video stream to a specified topic in a Kafka cluster;
calling an export service interface that listens on the specified topic, to forward the export request and the video stream to an offline rendering service node;
and storing a target video file in a designated storage location corresponding to the export request based on the export service interface, wherein the target video file is generated after the offline rendering service node performs offline rendering on the video stream according to the export request.
In some embodiments, the voice and mouth shape control data in the video stream is generated by the real-time rendering service node by invoking a text-to-speech service interface, and the voice and mouth shape control data in the target video file is generated by the offline rendering service node by invoking the text-to-speech service interface.
In some embodiments, the real-time rendering service node is determined by the WebRTC server from a pre-registered plurality of preset real-time rendering service nodes based on load balancing.
In some embodiments, before sending the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link, the method further includes:
receiving setting parameters of a virtual camera input by a user, and adding the setting parameters to the editing parameters, wherein the virtual camera is used for carrying out virtual shooting on the digital person to be edited.
In some embodiments, prior to obtaining the digital person to be edited, the method further comprises:
receiving grooming data input by a user, and judging whether mutually exclusive grooming data exist in the grooming data;
if mutually exclusive grooming data exist, eliminating the mutually exclusive data, and grooming a preset initial digital person based on the grooming data remaining after the elimination;
and generating the digital person to be edited based on the result of the grooming.
In some embodiments, prior to receiving the user-entered grooming data, the method further comprises:
receiving a photo, gender data, and painting-style data input by a user;
and if the photo satisfies a preset condition, generating the preset initial digital person from the photo, the gender data, and the painting-style data by calling a face reconstruction interface.
In a second aspect, there is provided an editing apparatus for a digital person, the apparatus comprising:
the establishing module is configured to acquire a digital person to be edited, receive an editing request of a user for the digital person to be edited, and establish a WebRTC link with a WebRTC server, wherein the editing request comprises editing parameters;
the sending module is configured to send the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link, so that the WebRTC server forwards the editing parameters and the digital person to be edited to a real-time rendering service node;
the acquisition module is configured to acquire a target digital person animation from a video stream returned by the WebRTC server and display the target digital person animation at the front end, wherein the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameters;
the WebRTC server and the real-time rendering service node are connected in a socket mode.
In a third aspect, there is provided an electronic device comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of generating a digital person animation of the first aspect via execution of the executable instructions.
In a fourth aspect, a computer readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, implements the method for generating a digital person animation according to the first aspect.
By applying this technical solution, a digital person to be edited is acquired, a user's editing request for the digital person to be edited is received, and a WebRTC link with a WebRTC server is established, where the editing request includes editing parameters; the editing parameters and the digital person to be edited are sent to the WebRTC server over the WebRTC link, so that the WebRTC server forwards them to a real-time rendering service node; a target digital person animation is obtained from a video stream returned by the WebRTC server and displayed at the front end, where the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameters; and the WebRTC server and the real-time rendering service node are connected by a socket. Because the digital person to be edited is processed, and the digital person animation generated, at the real-time rendering service node, the processing pressure on the server is reduced and the generation efficiency and real-time performance of the digital person animation are improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for generating digital person animation according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a method for generating digital person animation according to another embodiment of the present invention;
FIG. 3 is a schematic flow chart of generating a digital person to be edited in an embodiment of the invention;
FIG. 4 is a schematic diagram showing the effects of different painting styles in an embodiment of the present invention;
FIG. 5 is a schematic diagram showing the structure of a digital person animation generating device according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
It is noted that other embodiments of the present application will be readily apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise construction set forth herein below and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
A method of generating a digital human animation according to an exemplary embodiment of the present application is described below with reference to fig. 1 to 3. It should be noted that the following application scenario is only shown for the convenience of understanding the spirit and principles of the present application, and embodiments of the present application are not limited in any way in this respect. Rather, embodiments of the present application may be applied to any scenario where applicable.
The embodiment of the application provides a method for generating digital person animation. As shown in fig. 1, the method may be applied to a server that is connected to a front-end client through a service interface, and includes the following steps:
step S101, a digital person to be edited is obtained, an editing request of a user for the digital person to be edited is received, and a WebRTC link with a WebRTC server is established, wherein the editing request comprises editing parameters.
In this embodiment, the digital person to be edited may be uploaded by a user, generated in real time, selected by the user from a plurality of pre-created digital persons, or received from another server. The user can input an editing request for the digital person to be edited through the front-end client, where the editing request includes editing parameters. Editing parameters may include, but are not limited to, parameters that edit scenes, actions, items, subtitles, dubbing, text, material, filters, music, transitions, and the like.
The scene is the setting in which the digital person is placed; its editing parameters can select different scenes and adjust the scene's position, its angle relative to the person, its size, and so on. Action parameters make the digital person perform corresponding actions, such as lecture broadcasting, daily interaction, standing or sitting poses, social expressions, emotional expressions, improvised performance, sports, and the like. An item may be something the digital person carries or an object in the surrounding environment. Subtitles may be the text corresponding to the dubbing. Dubbing may be synthesized automatically by TTS (Text To Speech) or be voice and music recorded by the user. Text may be text other than subtitles. Material may be preset pictures or video, or content uploaded by the user; its editing parameters can adjust the material's position, arrangement, scale, transparency, and so on. A filter is a presented shooting effect; music is the presented background music; and a transition moves the digital person into a different scene.
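For illustration, the editing parameters enumerated above might be carried in a structure like the following TypeScript sketch. Every field name here is an assumption chosen for readability; the patent does not define a concrete payload format.

```typescript
// Illustrative shape of an editing request (all field names are assumptions,
// not structures defined by the patent).
interface EditingParameters {
  scene?: { id: string; position: [number, number, number]; angleDeg: number; scale: number };
  action?: string;                                   // e.g. "lecture", "dance", "sitting-pose"
  subtitles?: { text: string; startMs: number; endMs: number }[];
  dubbing?: { source: "tts" | "recording"; text?: string; audioUrl?: string };
  text?: string[];                                   // text other than subtitles
  materials?: { url: string; position: [number, number]; scale: number; opacity: number }[];
  filter?: string;                                   // named shooting effect
  music?: string;                                    // background-music asset id
  transitions?: { toScene: string; atMs: number }[]; // move the digital person between scenes
}

// The editing request pairs the parameters with the digital person to edit.
interface EditRequest {
  digitalPersonId: string;
  parameters: EditingParameters;
}
```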
After the editing request is received, a WebRTC link is established with a WebRTC (Web Real-Time Communication) server, and the WebRTC server is connected to a real-time rendering service node by a socket. The real-time rendering service node implements the rendering function on top of a 3D engine such as UE4 or Unity.
WebRTC is a real-time communication technology that allows web applications or sites to establish point-to-point connections between browsers without an intermediary, enabling the transmission of video and/or audio streams or any other data. By establishing the WebRTC link, interaction with the real-time rendering service node can be achieved through forwarding by the WebRTC server.
A socket connection allows the two parties to transmit data directly once the connection is established, and allows either side to actively push information over the open connection. Using a socket to connect the WebRTC server and the real-time rendering service node enables real-time interaction between them and improves data transmission efficiency.
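As a concrete illustration of steps S101 and S102, the front end could establish the WebRTC link with the browser's standard RTCPeerConnection API and push the editing request over a data channel. This is a minimal sketch under assumed names: the /webrtc/signal endpoint, the channel label, and the message format are not specified by the patent.

```typescript
// Minimal front-end sketch: establish a WebRTC link with the WebRTC server
// and send the editing parameters plus the digital person to be edited.
interface EditRequest {
  digitalPersonId: string;                    // the digital person to be edited
  parameters: Record<string, unknown>;        // editing parameters (see above)
}

async function connectAndSendEdit(request: EditRequest): Promise<RTCPeerConnection> {
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
  });

  // Data channel carrying the editing request; the WebRTC server forwards it
  // to a real-time rendering service node over its socket connection.
  const channel = pc.createDataChannel("edit");
  channel.onopen = () => channel.send(JSON.stringify(request));

  // Standard offer/answer exchange through an assumed signaling endpoint.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  const resp = await fetch("/webrtc/signal", {       // hypothetical URL
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ sdp: pc.localDescription }),
  });
  const { sdp: answer } = await resp.json();
  await pc.setRemoteDescription(answer);
  return pc;
}
```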
Step S102, sending the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link, so that the WebRTC server forwards the editing parameters and the digital person to be edited to a real-time rendering service node.
In this embodiment, after the WebRTC link is established, the editing parameters and the digital person to be edited are sent to the WebRTC server through the WebRTC link, and the WebRTC server may send the editing parameters and the digital person to be edited to the real-time rendering service node.
In some embodiments of the present application, the real-time rendering service node is determined by the WebRTC server from a plurality of preset real-time rendering service nodes registered in advance based on load balancing.
To handle multiple editing requests simultaneously, a plurality of real-time rendering service nodes are preset and registered with the WebRTC server in advance. Because the load capacity of these preset nodes may differ, after receiving the editing parameters and the digital person to be edited, the WebRTC server selects the real-time rendering service node from among them according to load balancing, so that data processing can be performed more efficiently.
Optionally, the server may implement load balancing by polling the preset real-time rendering service nodes in turn, by randomly selecting one of them, or by selecting the preset node with the smallest load, as in the sketch below.
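The following sketch shows the three optional strategies side by side. The node registry and the load metric are assumptions; the patent only states that the WebRTC server chooses among pre-registered preset nodes.

```typescript
// Server-side sketch of the load-balancing choice among pre-registered
// real-time rendering service nodes (registry and load metric are assumed).
interface RenderNode {
  id: string;
  address: string;
  currentLoad: number;   // e.g. number of active rendering sessions
}

const registeredNodes: RenderNode[] = [];   // filled when nodes register
let rrIndex = 0;                            // round-robin cursor

function pickNode(strategy: "round-robin" | "random" | "least-load"): RenderNode {
  if (registeredNodes.length === 0) throw new Error("no rendering node registered");
  switch (strategy) {
    case "round-robin":                     // poll each node in turn
      return registeredNodes[rrIndex++ % registeredNodes.length];
    case "random":                          // pick any node uniformly
      return registeredNodes[Math.floor(Math.random() * registeredNodes.length)];
    case "least-load":                      // pick the node with the smallest load
      return registeredNodes.reduce((a, b) => (a.currentLoad <= b.currentLoad ? a : b));
  }
}
```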
In some embodiments of the present application, before sending the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link, the method further includes:
receiving setting parameters of a virtual camera input by a user, and adding the setting parameters to the editing parameters, wherein the virtual camera is used for carrying out virtual shooting on the digital person to be edited.
In this embodiment, the virtual camera is a "camera" set up by the animation software; in representing the viewpoint it plays the same role as a traditional camera in animation. A virtual camera differs completely from a physical camera in what it shoots, but its function is very similar: a physical camera shoots real people or physically built scenes, while a virtual camera shoots models built in three-dimensional software. A virtual camera has parameters such as lens, focal length, focus, aperture, and depth of field, and can perform movements such as pushing, pulling, panning, tracking, following, whipping, and craning.
There may be one or more virtual cameras. The user's setting parameters for a virtual camera may include the camera type, the virtual camera's position, orientation, field of view, and focal length, and the correspondence between camera types and animation frames. Camera types may include core virtual cameras and general virtual cameras, and core virtual cameras may include dedicated virtual cameras and close-up virtual cameras.
A dedicated virtual camera may be a virtual camera configured to follow the digital person being shot, i.e. a shot animation in which the lens intelligently changes and moves with the digital person's animation. For example, if a segment of animation shows a digital person dancing and a dedicated virtual camera is configured, the lens is adjusted in real time according to the dance and always follows the digital person.
The general virtual camera may include at least one of: a close-range virtual camera, a far-range virtual camera, and a panoramic virtual camera. For example, panoramic virtual cameras may be used when panoramic animation presentations are desired; when it is desired to show a certain item in the animation, or a digital person's local action, or some specified text, then a close-up virtual camera may be used.
The correspondence between camera types is used to switch the currently active virtual camera to another camera type. For example, if a correspondence between the close-range virtual camera and the long-range virtual camera is preset, and the currently active camera type is the close-range virtual camera, the camera type can be switched to the long-range virtual camera in response to instruction information sent by the user. A time interval for switching camera types can also be preset; for example, the active camera type may change every 5 seconds as the animation is displayed.
By setting a correspondence between camera types and animation frames, the corresponding type of virtual camera can be activated as the frame number changes. An animation consists of multiple frames, and its content is expressed through those frames. As the scenario expressed in the animation changes, the shots representing the frames must change accordingly with the playing progress. For example, if the animation shows a digital person picking up a cup of water from a table, drinking it, and putting the cup back, then to better convey the content, the frames in which the digital person picks up the cup and puts it back can be bound to a long-range virtual camera, while the frames in which the digital person drinks are bound to a close-range virtual camera to highlight the drinking action.
By adding the setting parameters of the virtual camera to the editing parameters, the rendering effect of the digital person animation can be further optimized.
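A sketch of how such camera settings might be attached to the editing parameters is shown below; the field names, the frame-binding structure, and the 5-second switch interval are illustrative assumptions, not values fixed by the patent.

```typescript
// Illustrative virtual-camera settings added to the editing parameters
// before step S102 (all field names are assumptions).
type CameraType = "dedicated" | "close-up" | "close-range" | "long-range" | "panoramic";

interface VirtualCameraSettings {
  type: CameraType;
  position: [number, number, number];
  orientation: [number, number, number];   // e.g. Euler angles in degrees
  fieldOfViewDeg: number;
  focalLengthMm: number;
  // Bind frame ranges to camera types, as in the drinking-water example:
  // long-range while picking up / putting back the cup, close-range while drinking.
  frameBindings?: { fromFrame: number; toFrame: number; type: CameraType }[];
  switchIntervalMs?: number;               // e.g. 5000 to switch every 5 seconds
}

function addCameraSettings(
  params: { cameras?: VirtualCameraSettings[] },
  camera: VirtualCameraSettings,
): void {
  (params.cameras ??= []).push(camera);    // append to the editing parameters
}
```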
Step S103, obtaining a target digital person animation according to a video stream returned by the WebRTC server, and displaying the target digital person animation at the front end, wherein the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameters.
In this embodiment, after receiving the editing parameters and the digital person to be edited from the WebRTC server, the real-time rendering service node renders the digital person to be edited according to the editing parameters, generates a video stream, and sends the video stream back to the WebRTC server, which returns it to the server over the WebRTC link. The video stream is the target digital person animation corresponding to the digital person to be edited, and finally the server sends the target digital person animation to the front end through the service interface for display.
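On the front end, the returned stream could be attached to a video element through the peer connection's track callback, as in this sketch (the element id is an assumption):

```typescript
// Sketch of step S103 on the front end: show the target digital person
// animation carried by the video stream returned over the WebRTC link.
function showTargetAnimation(pc: RTCPeerConnection): void {
  pc.ontrack = (event: RTCTrackEvent) => {
    const video = document.getElementById("digital-person-preview") as HTMLVideoElement;
    video.srcObject = event.streams[0];    // rendered target animation
    void video.play();
  };
}
```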
By applying this technical solution, a digital person to be edited is acquired, a user's editing request for the digital person to be edited is received, and a WebRTC link with a WebRTC server is established, where the editing request includes editing parameters; the editing parameters and the digital person to be edited are sent to the WebRTC server over the WebRTC link, so that the WebRTC server forwards them to a real-time rendering service node; a target digital person animation is obtained from a video stream returned by the WebRTC server and displayed at the front end, where the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameters; and the WebRTC server and the real-time rendering service node are connected by a socket. Because the digital person to be edited is processed, and the digital person animation generated, at the real-time rendering service node, the processing pressure on the server is reduced and the generation efficiency and real-time performance of the digital person animation are improved.
The embodiment of the application also provides a method for generating the digital human animation, which is shown in fig. 2 and comprises the following steps:
step S201, a digital person to be edited is obtained, an editing request of a user for the digital person to be edited is received, and a WebRTC link with a WebRTC server is established.
Step S202, transmitting the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link, so that the WebRTC server forwards the editing parameters and the digital person to be edited to a real-time rendering service node.
Step S203, obtaining a target digital person animation according to a video stream returned by the WebRTC server, and displaying the target digital person animation at the front end, where the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameter.
Step S204, receiving an export request of the user for the target digital person animation, and sending the export request and the video stream to a specified topic in the Kafka cluster.
In this embodiment, when the user needs to export the target digital person animation displayed at the front end, a corresponding export request may be input; after receiving it, the server sends the export request and the video stream to the specified topic in the Kafka cluster.
Optionally, the export request may include a target format and an export manner. The target format may be any format, including wmv, asf, asx, rm, rmvb, mpg, mpeg, mpe, 3gp, mov, mp4, m4v, avi, dat, mkv, flv, and vob, and the export manner may include exporting to local storage or to the cloud.
Step S205, call the export service interface that listens on the specified topic to forward the export request and the video stream to an offline rendering service node.
An export service interface is preset; it listens on the specified topic in real time, and on detecting the export request and the video stream, forwards them to an offline rendering service node for processing.
Step S206, based on the export service interface, store a target video file in a designated storage location corresponding to the export request, where the target video file is generated after the offline rendering service node performs offline rendering on the video stream according to the export request.
In this embodiment, the offline rendering service node performs offline rendering on the received video stream according to the export request and generates the corresponding target video file, which can then be stored through the export service interface in the designated storage location corresponding to the export request. When the export manner is local export, the designated storage location may be a designated local storage path; when the export manner is cloud export, it may be a cloud URL or a designated network mailbox address. Using the offline rendering service node to convert the video stream into the target video file reduces the server's processing pressure when exporting digital person animation and improves data processing efficiency.
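The export path could be wired up with a standard Kafka client such as kafkajs, as in the sketch below. The topic name, the message shape, and the forwardToOfflineRenderer helper are assumptions; the patent only requires a specified topic and an export service interface listening on it.

```typescript
// Sketch of the export path (steps S204 to S206) using the kafkajs client.
import { Kafka } from "kafkajs";

const kafka = new Kafka({ clientId: "animation-export", brokers: ["kafka:9092"] });
const EXPORT_TOPIC = "digital-person-export";   // the "specified topic" (assumed name)

interface ExportRequest {
  targetFormat: string;          // e.g. "mp4", "avi", "mkv"
  exportManner: "local" | "cloud";
  storageLocation: string;       // local path, cloud URL, or mailbox address
}

// Server side: publish the export request plus a reference to the video stream.
async function publishExport(req: ExportRequest, videoStreamRef: string): Promise<void> {
  const producer = kafka.producer();
  await producer.connect();
  await producer.send({
    topic: EXPORT_TOPIC,
    messages: [{ value: JSON.stringify({ req, videoStreamRef }) }],
  });
  await producer.disconnect();
}

// Export service: listen on the topic and hand the work to an offline
// rendering service node (forwardToOfflineRenderer is hypothetical).
async function runExportService(
  forwardToOfflineRenderer: (msg: string) => Promise<void>,
): Promise<void> {
  const consumer = kafka.consumer({ groupId: "export-service" });
  await consumer.connect();
  await consumer.subscribe({ topic: EXPORT_TOPIC, fromBeginning: false });
  await consumer.run({
    eachMessage: async ({ message }) => {
      if (message.value) await forwardToOfflineRenderer(message.value.toString());
    },
  });
}
```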
In some embodiments of the present application, the voice and mouth shape control data in the video stream are generated by the real-time rendering service node by calling a text-to-speech service interface, and the voice and mouth shape control data in the target video file are generated by the offline rendering service node by calling the same interface.
In this embodiment, a text-to-speech service interface is preset so that both the real-time rendering service node and the offline rendering service node can generate voice and mouth shape control data by calling it during rendering.
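A minimal sketch of such a shared interface follows; the endpoint path and the response shape (audio plus timed mouth-shape weights) are assumptions, not an API defined by the patent.

```typescript
// Sketch of the text-to-speech service interface shared by the real-time
// and offline rendering nodes.
interface TtsResult {
  audioUrl: string;                                            // synthesized speech
  mouthShapes: { timeMs: number; visemeWeights: number[] }[];  // drives lip sync
}

async function synthesize(text: string, voice: string): Promise<TtsResult> {
  const resp = await fetch("/api/tts", {         // assumed service endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text, voice }),
  });
  if (!resp.ok) throw new Error(`TTS service failed: ${resp.status}`);
  return resp.json();
}
```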
For the specific implementation of steps S201 to S203, refer to steps S101 to S103; details are not repeated here.
In some embodiments of the present application, before acquiring the digital person to be edited, as shown in fig. 3, the method further includes the steps of:
step S301, receiving a photo, gender data and wind pattern data input by a user
In this embodiment, a digital person needs to be built based on a face photo provided, a user can set the sex of the digital person by inputting sex data, and the user can set the painting type of the digital person by inputting painting type data, and various painting types, such as reality, beauty, delicacy, lovely and the like, are preset, each painting type can be previewed through a sample photo, and the user can conveniently select according to own needs, as shown in an effect schematic diagram of different painting types in fig. 4.
Step S302, if the photo satisfies the preset condition, step S303 is executed, otherwise step S304 is executed.
In this embodiment, in order for the digital person to match the person in the photo and thereby ensure a good visual effect, the photo must satisfy certain preset conditions. These may include: a frontal face; uniform, sufficient illumination; a natural, relaxed expression; a preset format (such as JPG or PNG); and a size not exceeding a preset limit (such as 10 MB). None of the following may occur: head tilt or rotation, laughing, an open mouth, visible teeth, facial shadows, or facial occlusion.
Step S303, a face reconstruction interface is called according to the photo, the gender data and the wind pattern data to generate a preset initial digital person.
In this embodiment, a face reconstruction interface is preset, which may be based on a 3DMM (3D Morphable Face Model, a statistical model of 3D face deformation) or a DECA (Detailed Expression Capture and Animation) model. If the photo satisfies the preset conditions, the face reconstruction interface is called with the photo, the gender data, and the painting-style data to generate the preset initial digital person. The user only needs to input a photo that satisfies the preset conditions together with gender and painting-style data, so the digital person can be generated efficiently and accurately.
Optionally, a default hairstyle is preset for the initial digital person and is used as its hairstyle after face reconstruction, and the user can then modify the hairstyle as needed.
Step S304, prompt the user that the photo does not meet the requirements.
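A front-end sketch of the check in step S302 is given below. Only the format and size conditions can be verified locally; the pose, lighting, and expression conditions are delegated to an assumed face-analysis endpoint.

```typescript
// Sketch of the photo check in step S302 (the /api/face/validate endpoint
// is hypothetical; the 10 MB and JPG/PNG limits follow the preset conditions).
const MAX_PHOTO_BYTES = 10 * 1024 * 1024;
const ALLOWED_TYPES = ["image/jpeg", "image/png"];

async function photoMeetsPresetConditions(photo: File): Promise<boolean> {
  if (!ALLOWED_TYPES.includes(photo.type) || photo.size > MAX_PHOTO_BYTES) {
    return false;                                    // step S304: prompt the user
  }
  // Hypothetical server-side analysis for frontal pose, even illumination,
  // natural expression, and absence of occlusion or shadows.
  const form = new FormData();
  form.append("photo", photo);
  const resp = await fetch("/api/face/validate", { method: "POST", body: form });
  const { valid } = await resp.json();
  return valid === true;
}
```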
In step S305, the grooming data input by the user is received.
In this embodiment, grooming data is data for rendering visual elements on top of skeletal data. Grooming data may comprise a grooming picture, a vertex mesh, and mapping data, where the mapping data maps the grooming picture onto the vertex mesh and maps the vertices in the mesh to the corresponding bones in the skeletal data. The grooming picture, a texture image that can be stored in PNG (Portable Network Graphics) format, can be embedded in a slot included in the grooming data. A vertex mesh is a region defined by a series of vertices. The mapping data maps the grooming picture onto the vertex mesh and binds bones to the vertices with weights, so that bone motion drives the corresponding vertices and the vertex motion in turn deforms the grooming picture. Information such as the digital person's hairstyle, hair color, skin tone, clothing, and trousers can be generated from the grooming data.
Optionally, appropriate grooming data may be recommended to the user based on user data they enter, which may include, but is not limited to, information about the user's face, stature, occupation, and age.
Step S306, if mutually exclusive grooming data exist, step S307 is executed; otherwise step S308 is executed.
In this embodiment, it is determined whether mutually exclusive grooming data exist. Mutually exclusive grooming data are incompatible in attribute or category and cannot be applied at the same time; for example, wearing sneakers and wearing slippers simultaneously is mutually exclusive.
Step S307, eliminate the mutually exclusive grooming data, and groom a preset initial digital person based on the grooming data that remain after the elimination.
Because the mutually exclusive grooming data have been removed, the preset initial digital person can be groomed more accurately.
Step S308, groom the preset initial digital person based on the grooming data.
In this embodiment, Unity may be invoked to render the grooming data onto the preset initial digital person.
Step S309, generate the digital person to be edited based on the result of the grooming.
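As a sketch of steps S306 to S308, mutual exclusion can be modeled by assigning each grooming item to a slot (shoes, hairstyle, and so on) and keeping at most one item per slot. The slot model and the rule itself are assumptions generalized from the sneakers-versus-slippers example above.

```typescript
// Sketch: detect mutually exclusive grooming items and keep a conflict-free
// set before grooming the preset initial digital person.
interface GroomingItem {
  id: string;
  slot: string;          // e.g. "shoes", "hairstyle", "top"
}

function removeMutuallyExclusive(items: GroomingItem[]): GroomingItem[] {
  const seenSlots = new Set<string>();
  const kept: GroomingItem[] = [];
  for (const item of items) {
    if (seenSlots.has(item.slot)) continue;   // conflicting item: eliminate it
    seenSlots.add(item.slot);
    kept.push(item);
  }
  return kept;
}

// Usage: groom the initial digital person with the conflict-free set.
// const safeItems = removeMutuallyExclusive(userItems);
// applyGrooming(presetInitialDigitalPerson, safeItems);  // hypothetical helper
```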
By applying this technical solution, a digital person to be edited is acquired, a user's editing request for the digital person to be edited is received, and a WebRTC link with a WebRTC server is established, where the editing request includes editing parameters; the editing parameters and the digital person to be edited are sent to the WebRTC server over the WebRTC link, so that the WebRTC server forwards them to a real-time rendering service node; a target digital person animation is obtained from the video stream returned by the WebRTC server and displayed at the front end; a user's export request for the target digital person animation is received, and the export request and the video stream are sent to a specified topic in the Kafka cluster; an export service interface listening on the specified topic is called to forward the export request and the video stream to an offline rendering service node; and a target video file is stored, based on the export service interface, in a designated storage location corresponding to the export request. The digital person to be edited is processed, and the digital person animation generated, at the real-time rendering service node, and the corresponding target video file is generated by the offline rendering service node, thereby reducing the server's processing pressure and improving the generation efficiency and real-time performance of the digital person animation.
The embodiment of the application also provides an editing device for a digital person, as shown in fig. 5, the device comprises:
the establishing module 501 is configured to acquire a digital person to be edited, receive an editing request of a user for the digital person to be edited, and establish a WebRTC link with a WebRTC server, wherein the editing request comprises editing parameters;
the sending module 502 is configured to send the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link, so that the WebRTC server forwards the editing parameters and the digital person to be edited to a real-time rendering service node;
an obtaining module 503, configured to obtain a target digital person animation according to a video stream returned by the WebRTC server, and display the target digital person animation at a front end, where the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameter;
the WebRTC server and the real-time rendering service node are connected in a socket mode.
In a specific application scenario, the apparatus further includes a deriving module configured to:
receiving an export request of the user for the target digital person animation, and sending the export request and the video stream to a specified topic in a Kafka cluster;
calling an export service interface that listens on the specified topic, to forward the export request and the video stream to an offline rendering service node;
and storing a target video file in a designated storage position corresponding to the export request based on the export service interface, wherein the target video file is generated after the offline rendering service node performs offline rendering on the video stream according to the export request.
In a specific application scenario, the voice and mouth shape control data in the video stream are generated by the real-time rendering service node by calling a text-to-voice service interface, and the voice and mouth shape control data in the target video file are generated by the offline rendering service node by calling the text-to-voice service interface.
In a specific application scenario, the real-time rendering service node is determined by the WebRTC server from a plurality of preset real-time rendering service nodes registered in advance based on load balancing.
In a specific application scenario, the apparatus further includes an adding module, configured to:
receiving setting parameters of a virtual camera input by a user, and adding the setting parameters to the editing parameters, wherein the virtual camera is used for carrying out virtual shooting on the digital person to be edited.
In a specific application scenario, the apparatus further includes a first generating module configured to:
receiving the decoration data input by a user, and judging whether mutually exclusive decoration data exist in the decoration data;
if the mutual exclusion data exist, eliminating the mutual exclusion data, and carrying out decoration on a preset initial digital person based on the decoration data after eliminating the mutual exclusion data;
and generating the digital person to be edited based on the result of the decoration.
In a specific application scenario, the apparatus further includes a second generating module configured to:
receiving photos, gender data and wind pattern data input by a user;
and if the photo meets the preset condition, generating the preset initial digital person according to the photo, the gender data and the wind pattern data by calling a face reconstruction interface.
By applying the above technical solution, the digital person editing apparatus includes: an establishing module configured to acquire a digital person to be edited, receive a user's editing request for the digital person to be edited, and establish a WebRTC link with a WebRTC server, where the editing request includes editing parameters; a sending module configured to send the editing parameters and the digital person to be edited to the WebRTC server over the WebRTC link, so that the WebRTC server forwards them to a real-time rendering service node; and an acquisition module configured to obtain a target digital person animation from a video stream returned by the WebRTC server and display it at the front end, where the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameters. The WebRTC server and the real-time rendering service node are connected by a socket, and the digital person to be edited is processed, and the digital person animation generated, at the real-time rendering service node, thereby reducing the server's processing pressure and improving the generation efficiency and real-time performance of the digital person animation.
The embodiment of the invention also provides an electronic device. As shown in fig. 6, it includes a processor 601, a communication interface 602, a memory 603, and a communication bus 604, where the processor 601, the communication interface 602, and the memory 603 communicate with each other through the communication bus 604;
the memory 603 is configured to store executable instructions of the processor;
the processor 601 is configured, via execution of the executable instructions, to:
acquiring a digital person to be edited, receiving an editing request of a user for the digital person to be edited, and establishing a WebRTC link with a WebRTC server, wherein the editing request comprises editing parameters;
transmitting the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link so that the WebRTC server forwards the editing parameters and the digital person to be edited to a real-time rendering service node;
acquiring a target digital person animation according to a video stream returned by the WebRTC server, and displaying the target digital person animation at the front end, wherein the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameters;
the WebRTC server and the real-time rendering service node are connected in a socket mode.
The communication bus may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one bold line is shown in the figure, but this does not mean there is only one bus or one type of bus.
The communication interface is used for communication between the terminal and other devices.
The memory may include RAM (Random Access Memory) or non-volatile memory, such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), or the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application-Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In yet another embodiment of the present invention, there is also provided a computer-readable storage medium having stored therein a computer program which, when executed by a processor, implements the method for generating digital person animation described above.
In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the method for generating digital person animation described above.
In the above embodiments, implementation may be in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may take the form, in whole or in part, of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wire (e.g., coaxial cable, optical fiber, digital subscriber line) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), a semiconductor medium (e.g., solid state disk), or the like.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between the entities or actions. Moreover, the terms "comprises," "comprising," and any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may include other elements not expressly listed or inherent to it. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises it.
In this specification, the embodiments are described in an interrelated manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims (10)

1. A method for generating digital person animation, comprising:
acquiring a digital person to be edited, receiving an editing request of a user for the digital person to be edited, and establishing a WebRTC link with a WebRTC server, wherein the editing request comprises editing parameters;
transmitting the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link so that the WebRTC server forwards the editing parameters and the digital person to be edited to a real-time rendering service node;
acquiring a target digital person animation according to a video stream returned by the WebRTC server, and displaying the target digital person animation at the front end, wherein the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameters;
the WebRTC server and the real-time rendering service node are connected in a socket mode.
2. The method of claim 1, wherein after obtaining a target digital person animation from a video stream returned by the WebRTC server and presenting the target digital person animation at a front end, the method further comprises:
receiving an export request of the user for the target digital person animation, and sending the export request and the video stream to a specified topic in a Kafka cluster;
calling an export service interface that listens on the specified topic, to forward the export request and the video stream to an offline rendering service node;
and storing a target video file in a designated storage position corresponding to the export request based on the export service interface, wherein the target video file is generated after the offline rendering service node performs offline rendering on the video stream according to the export request.
3. The method of claim 2, wherein the voice and mouth shape control data in the video stream is generated by the real-time rendering service node by invoking a text-to-speech service interface, and wherein the voice and mouth shape control data in the target video file is generated by the offline rendering service node by invoking the text-to-speech service interface.
4. The method of claim 1, wherein the real-time rendering service node is determined by the WebRTC server from a pre-registered plurality of preset real-time rendering service nodes based on load balancing.
5. The method of claim 1, wherein prior to sending the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link, the method further comprises:
receiving setting parameters of a virtual camera input by a user, and adding the setting parameters to the editing parameters, wherein the virtual camera is used for carrying out virtual shooting on the digital person to be edited.
6. The method of claim 1, wherein prior to obtaining the digital person to be edited, the method further comprises:
receiving grooming data input by a user, and judging whether mutually exclusive grooming data exist in the grooming data;
if mutually exclusive grooming data exist, eliminating the mutually exclusive data, and grooming a preset initial digital person based on the grooming data remaining after the elimination;
and generating the digital person to be edited based on the result of the grooming.
7. The method of claim 6, wherein prior to receiving the user-entered grooming data, the method further comprises:
receiving a photo, gender data, and painting-style data input by a user;
and if the photo satisfies a preset condition, generating the preset initial digital person from the photo, the gender data, and the painting-style data by calling a face reconstruction interface.
8. An editing apparatus for a digital person, the apparatus comprising:
the establishing module is configured to acquire a digital person to be edited, receive an editing request of a user for the digital person to be edited, and establish a WebRTC link with a WebRTC server, wherein the editing request comprises editing parameters;
the sending module is configured to send the editing parameters and the digital person to be edited to the WebRTC server based on the WebRTC link, so that the WebRTC server forwards the editing parameters and the digital person to be edited to a real-time rendering service node;
the acquisition module is used for acquiring a target digital person animation according to a video stream returned by the WebRTC server and displaying the target digital person animation at the front end, wherein the video stream is generated after the real-time rendering service node renders the digital person to be edited according to the editing parameters;
the WebRTC server and the real-time rendering service node are connected in a socket mode.
9. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method for generating digital person animation of any one of claims 1-7 via execution of the executable instructions.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method for generating digital person animation according to any one of claims 1-7.
CN202211446978.1A 2022-11-18 2022-11-18 Digital person animation generation method and device, electronic equipment and storage medium Pending CN116258799A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211446978.1A CN116258799A (en) 2022-11-18 2022-11-18 Digital person animation generation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211446978.1A CN116258799A (en) 2022-11-18 2022-11-18 Digital person animation generation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116258799A 2023-06-13

Family

ID=86681559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211446978.1A Pending CN116258799A (en) 2022-11-18 2022-11-18 Digital person animation generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116258799A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination