CN110035271B - Fidelity image generation method and device and electronic equipment

Info

Publication number
CN110035271B
Authority
CN
China
Prior art keywords
target object
images
fidelity
map
input information
Prior art date
Legal status
Active
Application number
CN201910216551.4A
Other languages
Chinese (zh)
Other versions
CN110035271A (en)
Inventor
郭冠军
Current Assignee
Douyin Vision Co Ltd
Douyin Vision Beijing Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN201910216551.4A
Publication of CN110035271A
Application granted
Publication of CN110035271B

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 - Image signal generators
    • H04N13/275 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 - Server components or server architectures
    • H04N21/218 - Source of audio or video content, e.g. local disk arrays
    • H04N21/21805 - Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The embodiments of the present disclosure provide a fidelity image generation method, a fidelity image generation apparatus and an electronic device, belonging to the technical field of data processing. The method comprises the following steps: acquiring a plurality of images containing a target object, one or more continuous actions of the target object being determinable based on the plurality of images; acquiring a texture map of a specific area on the target object and a shape constraint map of a specific element in the plurality of images; constructing a reconstruction model of the target object based on the texture map, the shape constraint map, and two-dimensional image information of the plurality of images; generating, using the reconstruction model, a fidelity image that matches input information of the target object, the fidelity image including one or more predicted actions that match the input information. The processing scheme of the present application improves the realism of the generated images.

Description

Fidelity image generation method and device and electronic equipment
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method and an apparatus for generating a fidelity image, and an electronic device.
Background
With the development of network technology, artificial intelligence is increasingly applied in network scenarios. As a specific application requirement, more and more network environments use virtual characters for interaction; for example, a virtual anchor is provided in live webcasting to deliver an anthropomorphic broadcast of the live content and to provide necessary guidance for the live session, which enhances the sense of presence and interactivity of the live broadcast and improves the live broadcast effect.
Expression simulation (e.g., mouth-shape motion simulation) is one of the artificial intelligence technologies. At present, expression simulation drives the facial expressions of characters mainly through text-driven, natural-speech-driven, and audio-video hybrid modeling methods. For example, a Text-to-Speech (TTS) engine typically converts input text into a corresponding phoneme sequence, phoneme durations and a corresponding speech waveform, then selects corresponding model units from a model library, and finally presents the speech and facial expression actions corresponding to the input text through smoothing and a corresponding synchronization algorithm.
Expression simulation in the prior art tends to be monotonous and even distorted; the performance appears robotic, and the fidelity of the expression actions falls far short of the expressions of a real person.
Disclosure of Invention
In view of the above, embodiments of the present disclosure provide a method and an apparatus for generating a fidelity image, and an electronic device, which at least partially solve the problems in the prior art.
In a first aspect, an embodiment of the present disclosure provides a fidelity image generation method, including:
acquiring a plurality of images containing a target object, one or more continuous actions of the target object being determinable based on the plurality of images;
acquiring a texture map of a specific area on the target object and a shape constraint map of a specific element in the plurality of images;
constructing a reconstructed model of the target object based on the texture map, the shape constraint map, and two-dimensional image information of the plurality of images;
generating, using the reconstruction model, a fidelity image that matches input information of the target object, the fidelity image including one or more predicted actions that match the input information.
According to a specific implementation manner of the embodiment of the present disclosure, the acquiring a plurality of images including a target object includes:
performing video capture of the target object with a camera device to obtain a video file containing a plurality of video frames;
and selecting part or all of the video frames from the video file to form a plurality of images containing the target object.
According to a specific implementation manner of the embodiment of the present disclosure, the acquiring a plurality of images including a target object includes:
setting broadcast samples of different styles for a target object;
acquiring sample videos of the target object for the broadcast samples of different styles;
a plurality of images including a target object are acquired from the sample video.
According to a specific implementation manner of the embodiment of the present disclosure, the obtaining a texture map of a specific region on the target object and a shape constraint map of a specific element in the plurality of images includes:
performing 3D reconstruction on a specific area of the target object to obtain a 3D area object;
acquiring a three-dimensional grid of the 3D area object, wherein the three-dimensional grid comprises a preset coordinate value;
determining a texture map for the particular region based on pixel values at different three-dimensional grid coordinates.
According to a specific implementation manner of the embodiment of the present disclosure, the obtaining a texture map of a specific region on the target object and a shape constraint map of a specific element in the plurality of images further includes:
performing keypoint detection for a specific element in the plurality of images, resulting in a plurality of keypoints related to the specific element;
forming a shape constraint graph describing the particular element based on the plurality of keypoints.
According to a specific implementation manner of the embodiment of the present disclosure, the constructing a reconstruction model of the target object based on the texture map, the shape constraint map, and the two-dimensional image information of the plurality of images includes:
setting a convolutional neural network for training the reconstruction model, and training images containing the target object by using the convolutional neural network, wherein the number of nodes of the last layer of the convolutional neural network is consistent with that of the input layer.
According to a specific implementation manner of the embodiment of the present disclosure, the training, by using the convolutional neural network, an image including the target object includes:
measuring a prediction error by using a mean square error function, wherein the prediction error describes the difference between an output image frame and a manually captured frame;
and reducing the prediction error by adopting a back propagation function.
According to a specific implementation manner of the embodiment of the present disclosure, the generating a fidelity image matched with the input information of the target object by using the reconstruction model includes:
acquiring input information aiming at the target object, and analyzing the input information to obtain a first analysis result;
performing model quantization on the first analysis result to obtain a target object motion quantization vector;
generating a plurality of fidelity images matched to the motion quantization vector.
According to a specific implementation manner of the embodiment of the present disclosure, the generating a plurality of fidelity images matched with the motion quantization vectors includes:
taking the texture map as a fixed input of the fidelity image;
determining a motion constraint value for the particular element based on element values in the motion quantization vector;
and predicting a plurality of fidelity images matched with the input information through continuous motion constraint values and the fixed texture maps.
In a second aspect, an embodiment of the present disclosure provides a fidelity image generating apparatus, including:
an acquisition module for acquiring a plurality of images containing a target object, one or more continuous actions of the target object being determinable based on the plurality of images;
an obtaining module, configured to determine, in the plurality of images, a texture map of a specific region on the target object and a shape constraint map of a specific element;
a construction module for constructing a reconstructed model of the target object based on the texture map, the shape constraint map, and two-dimensional image information of the plurality of images;
a generating module for generating a fidelity image matched with the input information of the target object by using the reconstruction model, wherein the fidelity image comprises one or more predicted actions matched with the input information.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of generating a fidelity image according to any of the first aspect or any implementation manner of the first aspect.
In a fourth aspect, the disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the fidelity image generation method in the first aspect or any implementation manner of the first aspect.
In a fifth aspect, the disclosed embodiments also provide a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform the fidelity image generation method in the first aspect or any of the implementations of the first aspect.
The fidelity image generation scheme in the disclosed embodiments comprises acquiring a plurality of images containing a target object, one or more continuous actions of the target object being determinable based on the plurality of images; acquiring a texture map of a specific area on the target object and a shape constraint map of a specific element in the plurality of images; constructing a reconstruction model of the target object based on the texture map, the shape constraint map, and two-dimensional image information of the plurality of images; and generating, using the reconstruction model, a fidelity image that matches input information of the target object, the fidelity image including one or more predicted actions that match the input information. With this processing scheme, an animated image matched with the input information can be realistically simulated, improving user experience.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present disclosure, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic diagram illustrating a process of generating a fidelity image according to an embodiment of the disclosure;
fig. 2 is a schematic diagram illustrating another process for generating a fidelity image according to an embodiment of the disclosure;
fig. 3 is a schematic diagram illustrating another process for generating a fidelity image according to an embodiment of the disclosure;
fig. 4 is a schematic view of another fidelity image generation process provided in the embodiments of the present disclosure;
fig. 5 is a schematic view of another fidelity image generation process provided in the embodiments of the present disclosure;
fig. 6 is a schematic structural diagram of a fidelity image generating apparatus provided in the embodiment of the disclosure;
fig. 7 is a schematic view of an electronic device provided in an embodiment of the disclosure.
Detailed Description
The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure in the specification. It is to be understood that the described embodiments are merely illustrative of some, and not restrictive, of the embodiments of the disclosure. The disclosure may be embodied or carried out in various other specific embodiments, and various modifications and changes may be made in the details within the description without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present disclosure, and the drawings only show the components related to the present disclosure rather than the number, shape and size of the components in actual implementation, and the type, amount and ratio of the components in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
The embodiment of the disclosure provides a fidelity image generation method. The fidelity image generation method provided by the embodiment can be executed by a computing device, the computing device can be implemented as software, or implemented as a combination of software and hardware, and the computing device can be integrated in a server, a terminal device and the like.
Referring to fig. 1, a fidelity image generation method provided by the embodiment of the present disclosure includes the following steps:
s101, a plurality of images containing a target object are acquired, and one or more continuous motions of the target object can be determined based on the plurality of images.
The action and expression of the target object are contents to be simulated and predicted by the scheme of the disclosure, and as an example, the target object may be a real person capable of performing network broadcasting, or may be another object having an information dissemination function, such as a television program host, a news broadcaster, a teacher giving lessons, and the like.
The target object is usually a person with a broadcasting function. Since such a person usually has a certain degree of public recognition, a large cost is usually incurred when there is a huge amount of content that requires the target object to broadcast, including voice and/or video actions. Meanwhile, for a live-type program, a target object generally cannot appear in multiple live rooms (or multiple live channels) at the same time. If an effect in which the anchor appears in multiple rooms simultaneously is desired, it is often difficult to achieve by live broadcasting with a real person.
For this reason, a video of the target object (e.g., an anchor) needs to be captured in advance by a video recording device such as a camera, and the target object's broadcast records for different contents are collected through the video. For example, the target object's hosting of a live room may be recorded, and the target object's broadcast of a news segment may also be recorded.
The video collected for the target object comprises a plurality of frame images, and a plurality of images containing one or more continuous motions of the target object can be selected from the frame images of the video to form an image set. By training on the image set, the actions and expressions of the target object for different input contents can be predicted and simulated.
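As an illustrative sketch (not part of the disclosure itself), selecting frames from a captured video might look like the following; the OpenCV calls are standard, while the sampling stride and file name are assumptions:

```python
import cv2  # OpenCV for video decoding

def select_frames(video_path, stride=2):
    """Select a subset of video frames containing the target object's continuous actions."""
    capture = cv2.VideoCapture(video_path)
    frames = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % stride == 0:  # keep every stride-th frame as a training image
            frames.append(frame)
        index += 1
    capture.release()
    return frames

images = select_frames("anchor_broadcast.mp4", stride=2)
```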
S102, acquiring a texture map of a specific area on the target object and a shape constraint map of a specific element from the plurality of images.
Having acquired a plurality of images (e.g., video frames) associated with the target object, constituent objects on the target object may be selected to model the target object. To improve the efficiency of the modeling, certain regions that are not too highly recognizable to the user (e.g., facial regions) and certain elements that are highly recognizable to the user (e.g., mouth, eyes, etc.) may be selected for modeling.
Specifically, a texture map (e.g., face texture) of a specific region of the target object and key points (e.g., eye, mouth, etc. key points of five sense organs) of specific elements in the plurality of images are acquired, so that a shape constraint map of the target object texture map and the specific elements is formed.
The texture of the specific region may be obtained by a 3D reconstruction method; for example, a three-dimensional face mesh is obtained by a 3D face reconstruction method, and the face pixel values corresponding to all three-dimensional mesh points constitute the facial texture of the target object (e.g., the anchor). The 3D face reconstruction itself can be implemented with existing techniques.
The shape constraint map of a specific element can be obtained by keypoint detection. Taking the eyes and mouth as an example, the eye and mouth keypoints are obtained with an existing face keypoint detection algorithm. A closed eye/mouth region is formed by connecting the keypoints around the eye/mouth, respectively. The pupil area of the eye is filled with blue, the rest of the eye is filled with white, and the closed mouth region is filled with red. The color-filled images of the closed regions formed by the keypoints of the specific elements constitute the shape constraint map of those elements.
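A hedged illustration of how such a shape constraint map could be rasterized from detected keypoints; the keypoint layout, canvas size and BGR color values are assumptions of this sketch rather than requirements of the disclosure:

```python
import numpy as np
import cv2

def shape_constraint_map(eye_pts, pupil_pts, mouth_pts, size=(256, 256)):
    """Rasterize closed eye/pupil/mouth regions from detected keypoints (pixel coordinates)."""
    canvas = np.zeros((size[1], size[0], 3), dtype=np.uint8)
    # OpenCV uses BGR channel order: eye region white, pupil blue, mouth red.
    cv2.fillPoly(canvas, [np.asarray(eye_pts, dtype=np.int32)], color=(255, 255, 255))
    cv2.fillPoly(canvas, [np.asarray(pupil_pts, dtype=np.int32)], color=(255, 0, 0))
    cv2.fillPoly(canvas, [np.asarray(mouth_pts, dtype=np.int32)], color=(0, 0, 255))
    return canvas
```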
S103, constructing a reconstruction model of the target object based on the texture map, the shape constraint map and the two-dimensional image information of the plurality of images.
After the texture map and the shape constraint map are obtained, a plurality of images for generating the texture map and the shape constraint map can be combined, and a reconstruction model for a target object can be trained and constructed through the set convolution neural network.
The convolutional neural network structure may contain several convolutional layers, pooling layers, fully connected layers, and a classifier. The number of nodes of the last (output) layer of the convolutional neural network is the same as that of the input layer, so that video frames of the generated target object image can be output directly.
In the process of training the convolutional neural network, a mean square error function is used to measure the prediction error, i.e., the difference between the predicted target object image frame and the manually captured target object image frame; this difference is reduced by back propagation.
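The disclosure does not fix a concrete network, so the following is only a minimal sketch, assuming a PyTorch encoder-decoder whose output resolution matches the input, with a single training step using mean square error and back propagation; the channel counts and layer sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ReconstructionNet(nn.Module):
    """Toy reconstruction model whose output has the same spatial size as its input."""
    def __init__(self, in_channels=6):  # texture map + shape constraint map stacked (assumption)
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # pooling layer halves the resolution
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 3, kernel_size=3, padding=1),  # back to a 3-channel image frame
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = ReconstructionNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()  # mean square error as the prediction error

inputs = torch.rand(4, 6, 256, 256)    # placeholder texture + constraint inputs
captured = torch.rand(4, 3, 256, 256)  # placeholder manually captured target object frames

loss = criterion(model(inputs), captured)  # difference between predicted and captured frames
loss.backward()                            # back propagation reduces the prediction error
optimizer.step()
```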
And S104, generating a fidelity image matched with the input information of the target object by using the reconstruction model, wherein the fidelity image comprises one or more predicted actions matched with the input information.
After the reconstruction model is built, various actions and expressions of the target object can be predicted in the form of video animation using the reconstruction model. Specifically, a video file containing the target object's motions and expressions may be generated by generating fidelity images; a fidelity image may be a full frame or a key frame of the video file, and the collection of such images contains one or more predicted motions that match the input information.
The input information may take a variety of forms; for example, it may be text or audio. After data analysis, the input information is converted into parameters matched with the texture map and the shape constraint map, and finally generation of the fidelity image is completed by calling the texture map and the shape constraint map with the reconstruction model obtained after training.
In the prediction stage, given the texture map of the specific region of the target object and the shape constraints of the specific elements, the image information of a two-dimensional anchor broadcast image is predicted with the trained reconstruction model, and continuous anchor broadcast images are predicted by taking the shape constraints of consecutive specific elements and the texture of the fixed specific region as input.
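A non-authoritative sketch of this prediction stage, reusing the hypothetical ReconstructionNet from the training sketch above and assuming the shape constraint maps have already been derived from the input information:

```python
import torch

def predict_broadcast_frames(model, texture_map, constraint_maps):
    """Predict a continuous sequence of fidelity frames from a fixed texture map (CxHxW)
    and a per-frame sequence of shape constraint maps (each CxHxW)."""
    model.eval()
    frames = []
    with torch.no_grad():
        for constraint in constraint_maps:
            # The fixed texture is stacked with the current shape constraint as model input.
            model_input = torch.cat([texture_map, constraint], dim=0).unsqueeze(0)
            frames.append(model(model_input).squeeze(0))
    return frames
```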
In the process of implementing step S101, referring to fig. 2, according to a specific implementation manner of the embodiment of the present disclosure, acquiring a plurality of images including a target object may include the following steps:
s201, video acquisition is carried out on the target object by adopting the camera equipment, and a video file containing a plurality of video frames is obtained.
The target object is usually a person with a broadcasting function. Since such a person usually has a certain degree of public recognition, a large cost is usually incurred when there is a huge amount of content that requires the target object to broadcast, including voice and/or video actions. Meanwhile, for live programs, a target object cannot appear in multiple live rooms (or multiple live channels) at the same time, and an effect in which the anchor appears in multiple rooms simultaneously is usually difficult to achieve by live broadcasting with a real person.
For this reason, it is necessary to capture a video of a target object (e.g., a main broadcast) by a video recording device such as a video camera in advance, and capture a broadcast record of the target object for different contents by the video. For example, a live room host of the target object may be recorded, and a broadcast record of the target object for a news segment may also be recorded.
S202, selecting partial or all video frames from the video file to form a plurality of images containing the target object.
The video collected for the target object comprises a plurality of frame images, and a plurality of images comprising one or more continuous motions of the target object can be selected from the video to form an image set. By training the image set, the action and expression of the target object aiming at different input contents can be predicted and simulated.
As another implementation manner of step S101, referring to fig. 3, according to a specific implementation manner of the embodiment of the present disclosure, the acquiring a plurality of images including a target object may further include steps S301 to S303:
s301, setting broadcast samples of different styles aiming at the target object.
In order to more comprehensively acquire various actions and expressions of the target object, different types of broadcast samples can be preset. For example, the broadcast sample may contain different emotions such as happy, sad, angry, etc., thereby obtaining a more comprehensive training sample.
And S302, acquiring a sample video of the target object aiming at the broadcast samples of different styles.
By carrying out video sampling on the target object, sample videos of the target object aiming at the different styles of broadcast samples can be obtained.
S303, acquiring a plurality of images including the target object from the sample video.
According to actual needs, a plurality of images containing the target object can be selected from a plurality of video frames in the sample video, the plurality of images can be part or all of the video frames in the sample video, and key frames can be selected from all of the sample video as the plurality of images.
In the process of implementing step S102, according to a specific implementation manner of the embodiment of the present disclosure, referring to fig. 4, acquiring a texture map of a specific region on the target object and a shape constraint map of a specific element in the plurality of images may include:
s401, performing 3D reconstruction on the specific area of the target object to obtain a 3D area object.
Having acquired a plurality of images (e.g., video frames) associated with the target object, constituent objects on the target object may be selected to model the target object. To improve the efficiency of the modeling, certain regions that are not too highly recognizable to the user (e.g., facial regions) and certain elements that are highly recognizable to the user (e.g., mouth, eyes, etc.) may be selected for modeling.
S402, obtaining a three-dimensional grid of the 3D area object, wherein the three-dimensional grid comprises a preset coordinate value.
The specific position of the 3D region object is described by a three-dimensional grid, and preset coordinate values are set for the three-dimensional grid; for example, the grid can be described by plane two-dimensional coordinates together with a spatial height coordinate.
And S403, determining a texture map of the specific area based on pixel values on different three-dimensional grid coordinates.
The pixel values at different three-dimensional grid coordinates may be connected together to form a grid plane that forms a texture map of the particular area.
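A rough illustration of this step, assuming a reconstructed mesh whose vertices have already been projected to pixel coordinates in the source image (the projection itself is outside the scope of this sketch):

```python
import numpy as np

def texture_from_mesh(image, projected_vertices):
    """Sample the source image at each projected mesh vertex (x, y) to build the texture
    of the specific region; vertices falling outside the frame are clamped to the border."""
    h, w = image.shape[:2]
    xs = np.clip(projected_vertices[:, 0].round().astype(int), 0, w - 1)
    ys = np.clip(projected_vertices[:, 1].round().astype(int), 0, h - 1)
    return image[ys, xs]  # one pixel value (e.g., BGR) per three-dimensional grid point
```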
Through the implementation of steps S401 to S403, the texture map of the specific area can be formed faster, and the efficiency of forming the texture map is improved.
In the process executed in step S104, as a specific implementation manner, referring to fig. 5, generating a fidelity image matched with the input information of the target object by using a reconstruction model may include the following steps:
s501, acquiring input information aiming at the target object, and analyzing the input information to obtain a first analysis result.
The input information may be in a variety of ways, for example, the input information may be in the form of text or audio. And converting the input information into a first analysis result after data analysis, wherein the first analysis result comprises parameters matched with the texture map and the shape constraint map, and finally completing generation of a guarantee image by calling the texture map and the shape constraint map by using the reconstruction model obtained after training.
And S502, performing model quantization on the first analysis result to obtain a target object motion quantization vector.
The first analysis result includes a motion amplitude parameter for a specific element on the target object, and taking the mouth as an example, the motion amplitude can be quantized to 1 when the mouth is fully opened, the motion amplitude can be quantized to 0 when the mouth is fully closed, and by quantizing a value between 0 and 1, an intermediate state of the mouth between full opening and full closing can be described.
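For instance, a mouth-opening amplitude in [0, 1] could be derived from mouth keypoints roughly as follows; the choice of keypoints and the normalization by mouth width are assumptions of this sketch:

```python
import numpy as np

def mouth_open_amplitude(upper_lip, lower_lip, mouth_left, mouth_right):
    """Quantize mouth opening to [0, 1]: 0 means fully closed, 1 means fully open.
    Each argument is an (x, y) keypoint; the lip gap is normalized by the mouth width."""
    gap = np.linalg.norm(np.subtract(lower_lip, upper_lip))
    width = np.linalg.norm(np.subtract(mouth_right, mouth_left))
    return float(np.clip(gap / (width + 1e-6), 0.0, 1.0))
```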
S503, a plurality of fidelity images matched with the motion quantization vectors are generated.
The motion quantization vector can describe the motion amplitude of a specific element on the target object through a sequence of fidelity images; fidelity images of the specific motion element with different motion amplitudes are spliced together in sequence to form a prediction result containing different motions of the target object.
Specifically, generating a plurality of fidelity images matched with the motion quantization vectors may include steps S5031 to S5033:
s5031, using the texture map as a fixed input of the fidelity image.
Since the user's perception is relatively insensitive to the texture map, the texture map can be used as a fixed input of the predicted target object while forming the fidelity images, that is, the texture map remains unchanged across the fidelity images.
S5032, determining a motion constraint value of the specific element based on the element value in the motion quantization vector.
The element values in the motion quantization vector describe the motion amplitude of the specific element in each fidelity image; fidelity images of the specific motion element with different motion amplitudes are spliced together in sequence to form a prediction result containing different motions of the target object.
S5033, predicting a plurality of fidelity images matching the input information by using the continuous motion constraint value and the fixed texture map.
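The predicted frames can then be spliced into the broadcast video file mentioned earlier. A hedged sketch, assuming OpenCV and frames already converted to 8-bit BGR arrays (for example from the hypothetical predict_broadcast_frames above):

```python
import cv2

def write_broadcast_video(frames, path="predicted_broadcast.mp4", fps=25):
    """Splice the predicted fidelity frames (HxWx3 uint8 BGR arrays) into a video file."""
    h, w = frames[0].shape[:2]
    writer = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for frame in frames:
        writer.write(frame)
    writer.release()
```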
Corresponding to the above method embodiment, and referring to fig. 6, the embodiment of the present disclosure further discloses a fidelity image generating apparatus 60, comprising:
an acquisition module 601 for acquiring a plurality of images containing a target object, based on which one or more consecutive movements of the target object can be determined.
The action and expression of the target object are contents to be simulated and predicted by the scheme of the disclosure, and as an example, the target object may be a real person capable of performing network broadcasting, or may be another object having an information dissemination function, such as a television program host, a news broadcaster, a teacher giving lessons, and the like.
The target object is usually a person with a broadcasting function. Since such a person usually has a certain degree of public recognition, a large cost is usually incurred when there is a huge amount of content that requires the target object to broadcast, including voice and/or video actions. Meanwhile, for live programs, a target object cannot appear in multiple live rooms (or multiple live channels) at the same time, and an effect in which the anchor appears in multiple rooms simultaneously is usually difficult to achieve by live broadcasting with a real person.
For this reason, it is necessary to capture a video of a target object (e.g., a main broadcast) by a video recording device such as a video camera in advance, and capture a broadcast record of the target object for different contents by the video. For example, a live room host of the target object may be recorded, and a broadcast record of the target object for a news segment may also be recorded.
The video collected for the target object comprises a plurality of frame images, and a plurality of images comprising one or more continuous motions of the target object can be selected from the video to form an image set. By training the image set, the action and expression of the target object aiming at different input contents can be predicted and simulated.
An obtaining module 602, configured to determine, in the plurality of images, a texture map of a specific region on the target object and a shape constraint map of a specific element.
Having acquired a plurality of images (e.g., video frames) associated with the target object, constituent objects on the target object may be selected to model the target object. To improve the efficiency of the modeling, certain regions that are not too highly recognizable to the user (e.g., facial regions) and certain elements that are highly recognizable to the user (e.g., mouth, eyes, etc.) may be selected for modeling.
Specifically, a texture map (e.g., face texture) of a specific region of the target object and key points (e.g., eye, mouth, etc. key points of five sense organs) of specific elements in the plurality of images are acquired, so that a shape constraint map of the target object texture map and the specific elements is formed.
The texture of the specific region may be obtained by a 3D reconstruction method, for example, a human face three-dimensional mesh is obtained by a 3D human face reconstruction method, and the human face pixel values corresponding to all three-dimensional mesh points constitute the facial texture of the target object (e.g., anchor). Wherein, the 3D face reconstruction can be realized by adopting the existing technology.
The shape constraint graph of the specific element can be realized by adopting a key point detection mode, taking eyes and mouth as an example, the eye and mouth key points are obtained by the existing face key point detection algorithm. The eye/mouth closure area is formed by connecting key points around the eye/mouth, respectively. The pupil area of the eye is filled in blue, the rest of the eye is filled in white, and the mouth-closing area is filled in red. The image of the closed region formed by the key points of the specific element after being filled with color forms the shape constraint graph of the specific element.
A building module 603 configured to build a reconstructed model of the target object based on the texture map, the shape constraint map, and the two-dimensional image information of the plurality of images.
After the texture map and the shape constraint map are obtained, a plurality of images for generating the texture map and the shape constraint map can be combined, and a reconstruction model for a target object can be trained and constructed through the set convolution neural network.
Specifically, the convolutional neural network structure may include several convolutional layers, pooling layers, fully-connected layers, and classifiers. The number of nodes of the output layer and the input layer of the last layer of the convolutional neural network structure is the same, so that the video frame generating the target object image can be directly output.
In the process of training the convolutional neural network, a mean square error function is used to measure the prediction error, i.e., the difference between the predicted target object image frame and the manually captured target object image frame; this difference is reduced by back propagation.
A generating module 604 for generating a fidelity image matched to the input information of the target object using the reconstruction model, the fidelity image comprising one or more predicted actions matched to the input information.
After the reconstruction model is set, various actions and expressions of the target object in the video can be predicted by utilizing the reconstruction model in a video animation mode. Specifically, a video file containing the target object motion and expression may be generated by generating a fidelity image, which may be a full frame or a key frame of the video file, containing a collection of multiple images of one or more predicted motions that match the input information.
The input information may take a variety of forms; for example, it may be text or audio. After data analysis, the input information is converted into parameters matched with the texture map and the shape constraint map, and finally generation of the fidelity image is completed by calling the texture map and the shape constraint map with the reconstruction model obtained after training.
In the prediction stage, a texture map of a specific area of a target object and shape constraints of specific elements can be given, image information of a two-dimensional anchor broadcast image is predicted by using a trained reconstruction model, and a continuous anchor broadcast image is predicted by taking the shape constraints of continuous specific elements and the textures of fixed specific areas as input.
The apparatus shown in fig. 6 may correspondingly execute the content in the above method embodiment, and details of the part not described in detail in this embodiment refer to the content described in the above method embodiment, which is not described again here.
Referring to fig. 7, an embodiment of the present disclosure also provides an electronic device 70, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of generating a fidelity image of the method embodiments described above.
The disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the fidelity image generation method of the foregoing method embodiments.
The disclosed embodiments also provide a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the fidelity image generation method of the aforementioned method embodiments.
Referring now to FIG. 7, a schematic diagram of an electronic device 70 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 7, the electronic device 70 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 701 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage means 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the electronic apparatus 70 are also stored. The processing device 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer, gyroscope, or the like; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 70 to communicate wirelessly or by wire with other devices to exchange data. While the figures illustrate an electronic device 70 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication means 709, or may be installed from the storage means 708, or may be installed from the ROM 702. The computer program, when executed by the processing device 701, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring at least two internet protocol addresses; sending a node evaluation request comprising the at least two internet protocol addresses to node evaluation equipment, wherein the node evaluation equipment selects the internet protocol addresses from the at least two internet protocol addresses and returns the internet protocol addresses; receiving an internet protocol address returned by the node evaluation equipment; wherein the obtained internet protocol address indicates an edge node in the content distribution network.
Alternatively, the computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receiving a node evaluation request comprising at least two internet protocol addresses; selecting an internet protocol address from the at least two internet protocol addresses; returning the selected internet protocol address; wherein the received internet protocol address indicates an edge node in the content distribution network.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Where the name of a unit does not in some cases constitute a limitation of the unit itself, for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof.
The above description is only for the specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present disclosure should be covered within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (11)

1. A fidelity image generation method, comprising:
acquiring a plurality of images containing a target object, one or more continuous actions of the target object being determinable based on the plurality of images;
acquiring a texture map of a specific area on the target object and a shape constraint map of a specific element in the plurality of images;
constructing a reconstructed model of the target object based on the texture map, the shape constraint map, and two-dimensional image information of the plurality of images;
generating a fidelity image matched with input information of the target object by using the reconstruction model, wherein the fidelity image comprises one or more predicted actions matched with the input information;
wherein the generating a fidelity image matched with the input information of the target object by using the reconstruction model comprises:
acquiring input information aiming at the target object, and analyzing the input information to obtain a first analysis result;
performing model quantization on the first analysis result to obtain a target object motion quantization vector;
generating a plurality of fidelity images matched to the motion quantization vector.
2. The method of claim 1, wherein said acquiring a plurality of images containing a target object comprises:
performing video capture of the target object with a camera device to obtain a video file containing a plurality of video frames;
and selecting part or all of the video frames from the video file to form a plurality of images containing the target object.
3. The method of claim 1, wherein said acquiring a plurality of images containing a target object comprises:
setting broadcast samples of different styles for a target object;
acquiring sample videos of the target object for the broadcast samples of different styles;
a plurality of images including a target object are acquired from the sample video.
4. The method according to claim 1, wherein the obtaining a texture map of a specific region and a shape constraint map of a specific element on the target object in the plurality of images comprises:
performing 3D reconstruction on the specific area of the target object to obtain a 3D area object;
acquiring a three-dimensional grid of the 3D area object, wherein the three-dimensional grid comprises a preset coordinate value;
determining a texture map for the particular region based on pixel values at different three-dimensional grid coordinates.
5. The method according to claim 4, wherein said obtaining a texture map of a specific region and a shape constraint map of a specific element on the target object in the plurality of images further comprises:
performing keypoint detection for a specific element in the plurality of images, resulting in a plurality of keypoints related to the specific element;
forming a shape constraint graph describing the particular element based on the plurality of keypoints.
6. The method of claim 1, wherein constructing the reconstructed model of the target object based on the texture map, the shape constraint map, and two-dimensional image information of the plurality of images comprises:
setting a convolutional neural network for training the reconstruction model, and training images containing the target object by using the convolutional neural network, wherein the number of nodes of the last layer of the convolutional neural network is consistent with that of the input layer.
7. The method of claim 6, wherein said training an image containing said target object with said convolutional neural network comprises:
measuring a prediction error by using a mean square error function, wherein the prediction error describes the difference between an output image frame and a manually captured frame;
and reducing the prediction error by adopting a back propagation function.
8. The method of claim 1, wherein the generating a plurality of fidelity images that match the motion quantization vector comprises:
taking the texture map as a fixed input of the fidelity image;
determining a motion constraint value for the particular element based on element values in the motion quantization vector;
and predicting a plurality of fidelity images matched with the input information through continuous motion constraint values and the fixed texture maps.
9. A fidelity image generation apparatus, comprising:
an acquisition module for acquiring a plurality of images containing a target object, one or more continuous actions of the target object being determinable based on the plurality of images;
an obtaining module, configured to determine, in the plurality of images, a texture map of a specific region on the target object and a shape constraint map of a specific element;
a construction module for constructing a reconstructed model of the target object based on the texture map, the shape constraint map, and two-dimensional image information of the plurality of images;
a generation module for generating a fidelity image matched with the input information of the target object by using the reconstruction model, wherein the fidelity image comprises one or more predicted actions matched with the input information;
wherein the generating a fidelity image matched with the input information of the target object by using the reconstruction model comprises:
acquiring input information aiming at the target object, and analyzing the input information to obtain a first analysis result;
performing model quantization on the first analysis result to obtain a target object motion quantization vector;
generating a plurality of fidelity images matched to the motion quantization vector.
10. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the fidelity image generation method of any of the preceding claims 1-8.
11. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the fidelity image generation method of any of the preceding claims 1-8.
CN201910216551.4A 2019-03-21 2019-03-21 Fidelity image generation method and device and electronic equipment Active CN110035271B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910216551.4A CN110035271B (en) 2019-03-21 2019-03-21 Fidelity image generation method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910216551.4A CN110035271B (en) 2019-03-21 2019-03-21 Fidelity image generation method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN110035271A (en) 2019-07-19
CN110035271B (en) 2020-06-02

Family

ID=67236469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910216551.4A Active CN110035271B (en) 2019-03-21 2019-03-21 Fidelity image generation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN110035271B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532891B (en) * 2019-08-05 2022-04-05 北京地平线机器人技术研发有限公司 Target object state identification method, device, medium and equipment
CN111368137A (en) * 2020-02-12 2020-07-03 百度在线网络技术(北京)有限公司 Video generation method and device, electronic equipment and readable storage medium
CN111294665B (en) * 2020-02-12 2021-07-20 百度在线网络技术(北京)有限公司 Video generation method and device, electronic equipment and readable storage medium
CN114125492B (en) * 2022-01-24 2022-07-15 阿里巴巴(中国)有限公司 Live content generation method and device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106652025A (en) * 2016-12-20 2017-05-10 五邑大学 Three-dimensional face modeling method and three-dimensional face modeling printing device based on video streaming and face multi-attribute matching
CN106651978A (en) * 2016-10-10 2017-05-10 讯飞智元信息科技有限公司 Face image prediction method and system
CN107463888A (en) * 2017-07-21 2017-12-12 竹间智能科技(上海)有限公司 Face mood analysis method and system based on multi-task learning and deep learning
CN107977511A (en) * 2017-11-30 2018-05-01 浙江传媒学院 A kind of industrial design material high-fidelity real-time emulation algorithm based on deep learning
CN108229239A (en) * 2016-12-09 2018-06-29 武汉斗鱼网络科技有限公司 A kind of method and device of image procossing
CN108280883A (en) * 2018-02-07 2018-07-13 北京市商汤科技开发有限公司 It deforms the generation of special efficacy program file packet and deforms special efficacy generation method and device
CN108961369A (en) * 2018-07-11 2018-12-07 厦门幻世网络科技有限公司 The method and apparatus for generating 3D animation
CN109255830A (en) * 2018-08-31 2019-01-22 百度在线网络技术(北京)有限公司 Three-dimensional facial reconstruction method and device
CN109325437A (en) * 2018-09-17 2019-02-12 北京旷视科技有限公司 Image processing method, device and system
CN109344693A (en) * 2018-08-13 2019-02-15 华南理工大学 A kind of face multizone fusion expression recognition method based on deep learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5786259B2 (en) * 2011-08-09 2015-09-30 インテル・コーポレーション Parameterized 3D face generation
GB2510200B (en) * 2013-01-29 2017-05-10 Toshiba Res Europe Ltd A computer generated head

Also Published As

Publication number Publication date
CN110035271A (en) 2019-07-19

Similar Documents

Publication Publication Date Title
CN110035271B (en) Fidelity image generation method and device and electronic equipment
CN110047121B (en) End-to-end animation generation method and device and electronic equipment
CN110189394B (en) Mouth shape generation method and device and electronic equipment
CN110047119B (en) Animation generation method and device comprising dynamic background and electronic equipment
WO2021004247A1 (en) Method and apparatus for generating video cover and electronic device
KR102346046B1 (en) 3d virtual figure mouth shape control method and device
CN112492380B (en) Sound effect adjusting method, device, equipment and storage medium
CN110072047B (en) Image deformation control method and device and hardware device
KR20220148915A (en) Audio processing methods, apparatus, readable media and electronic devices
CN110288532B (en) Method, apparatus, device and computer readable storage medium for generating whole body image
CN110060324B (en) Image rendering method and device and electronic equipment
CN114693876A (en) Digital human generation method, device, storage medium and electronic equipment
WO2023231918A1 (en) Image processing method and apparatus, and electronic device and storage medium
CN109977925B (en) Expression determination method and device and electronic equipment
CN112734631A (en) Video image face changing method, device, equipment and medium based on fine adjustment model
CN109816791B (en) Method and apparatus for generating information
WO2020077912A1 (en) Image processing method, device, and hardware device
CN112070888B (en) Image generation method, device, equipment and computer readable medium
CN114090817A (en) Dynamic construction method and device of face feature database and storage medium
CN114049403A (en) Multi-angle three-dimensional face reconstruction method and device and storage medium
CN114677738A (en) MV recording method, MV recording device, electronic equipment and computer readable storage medium
CN111696041B (en) Image processing method and device and electronic equipment
US11876843B2 (en) Method, apparatus, medium and electronic device for generating round-table video conference
CN111586261B (en) Target video processing method and device and electronic equipment
CN114339356B (en) Video recording method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Tiktok vision (Beijing) Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: Tiktok vision (Beijing) Co.,Ltd.