CN109740476A

CN109740476A - Instant communication method, device and server

Info

Publication number: CN109740476A
Application number: CN201811595491.3A
Authority: CN
Inventors: 单鑫
Original assignee: Beijing Linyun Information Technology Co Ltd
Current assignee: Beijing Xinyu Technology Co.,Ltd.
Priority date: 2018-12-25
Filing date: 2018-12-25
Publication date: 2019-05-10
Anticipated expiration: 2038-12-25
Also published as: CN109740476B

Abstract

This application discloses an instant communication method. The method includes: a first user terminal receives and transmits a video image of a first user in instant messaging; a personal image of the first user is extracted from the video image of the first user; virtual demand information is received and transmitted, where the virtual demand information describes a modification requirement for the personal image; a virtual portrait model is generated according to the personal image of the first user and the virtual demand information; an enhanced video image is generated, where the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model; and the enhanced video image is sent and is received by a second user terminal. The application also provides an instant communication device and a server for implementing the method. The application addresses the technical problems that a video call showing only real images becomes monotonous and dull, and that portrait privacy is difficult to protect.

Description

Instant communication method, device and server

Technical field

This application relates to the field of telecommunications, and in particular to an instant communication method, device and server.

Background art

At present, people carry out instant communication mainly through mobile phone voice calls, video chat, e-mail and the like. In video chat in the related art, the two or more parties to a video call can only see each other's real image; if the call lasts a long time, it becomes monotonous and dull. In addition, in some online video conferences, the two or more participating parties do not wish to show their real appearance and want to protect their personal portrait privacy.

No effective solution has yet been proposed for the problems in the related art that a video call becomes monotonous and dull because only the other party's real image can be seen, or that personal portrait privacy is difficult to protect in an online video conference because the real appearance cannot be hidden.

Summary of the invention

The main purpose of the application is to provide an instant communication method, device and server that solve at least one of the above problems.

To achieve the above purpose, according to one aspect of the application, an instant communication method applied to a user terminal is provided. The method includes:

A first user terminal receives and transmits a video image of a first user in instant messaging; a personal image of the first user is extracted from the video image of the first user; virtual demand information is received and transmitted, where the virtual demand information describes a modification requirement for the personal image; a virtual portrait model is generated according to the personal image of the first user and the virtual demand information; an enhanced video image is generated, where the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model; and the enhanced video image is sent, and a second user terminal receives the enhanced video image.

Further, in the method described above, extracting the personal image of the first user from the video image of the first user includes extracting action information of the first user from the video image of the first user, where the action information includes facial expression information of the first user. Generating the virtual portrait model according to the personal image of the first user and the virtual demand information includes: generating a virtual character skeleton according to the virtual demand information; and associating the facial expression information with the virtual character skeleton to generate a virtual portrait model synchronized with the facial expression of the first user.

Further, in the method described above, the action information also includes limb action information of the first user. Generating the virtual portrait model according to the personal image of the first user and the virtual demand information further includes: associating the limb action information with the virtual character skeleton to generate a virtual portrait model synchronized with the limb actions of the first user.

To achieve the above purpose, according to another aspect of the application, an instant communication method applied to a server is provided. The method includes:

A video image of a first user in instant messaging, transmitted by a first user terminal, is received; a personal image of the first user is extracted from the video image of the first user; virtual demand information is received, where the virtual demand information describes a modification requirement for the personal image; a virtual portrait model is generated according to the personal image of the first user and the virtual demand information; an enhanced video image is generated, where the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model; and the enhanced video image is sent to a second user terminal.

Further, in the method described above, extracting the personal image of the first user from the video image of the first user includes extracting action information of the first user from the video image of the first user, where the action information includes facial expression information and limb action information of the first user. Generating the virtual portrait model according to the personal image of the first user and the virtual demand information includes: generating a virtual character skeleton according to the virtual demand information; and associating the facial expression information and the limb action information with the virtual character skeleton to generate a virtual portrait model synchronized with the facial expression and limb actions of the first user.

To achieve the above purpose, according to another aspect of the application, an instant communication device is provided. The device includes:

An information transmission unit, an image extraction unit, a model generation unit and a video processing unit, where: the information transmission unit is used to receive and transmit a video image of a first user in instant messaging; the image extraction unit is used to extract a personal image of the first user from the video image of the first user; the information transmission unit is also used to receive and transmit virtual demand information, where the virtual demand information describes a modification requirement for the personal image; the model generation unit is used to generate a virtual portrait model according to the personal image of the first user and the virtual demand information; the video processing unit is used to generate an enhanced video image, where the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model; the video processing unit is also used to send the enhanced video image; and the information transmission unit is also used to receive the enhanced video image.

Further, in the device described above, the image extraction unit includes an action extraction unit, which is used to extract facial expression information of the first user from the video image of the first user. The model generation unit includes a skeleton generation unit and an action association unit, where: the skeleton generation unit is used to generate a virtual character skeleton according to the virtual demand information; and the action association unit is used to associate the facial expression information with the virtual character skeleton. The model generation unit is used to generate a virtual portrait model synchronized with the facial expression of the first user.

To achieve the above purpose, according to another aspect of the application, an instant communication server is provided. The server includes an information transmission module, an image extraction module, a model generation module and a video processing module, where:

The information transmission module is used to receive a video image of a first user in instant messaging transmitted by a first user terminal; the image extraction module is used to extract a personal image of the first user from the video image of the first user; the information transmission module is also used to receive virtual demand information, where the virtual demand information describes a modification requirement for the personal image; the model generation module is used to generate a virtual portrait model according to the personal image of the first user and the virtual demand information; the video processing module is used to generate an enhanced video image, where the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model; and the information transmission module is also used to send the enhanced video image to a second user terminal.

Further, in the server described above, the image extraction module includes an action extraction module, which is used to extract action information of the first user from the video image of the first user, where the action information includes facial expression information and limb action information of the first user. The model generation module includes a skeleton generation module and an action association module, where: the skeleton generation module is used to generate a virtual character skeleton according to the virtual demand information; and the action association module is used to associate the facial expression information and the limb action information with the virtual character skeleton. The video processing module is used to generate a virtual portrait model synchronized with the facial expression and limb actions of the first user.

In the embodiments of the application, the personal image is extracted from the video image, a virtual portrait model is generated according to the personal image and the virtual demand, and the personal image in the video image is replaced with the virtual portrait model. This achieves the purpose of video chatting through a virtual character and realizes the technical effect of hiding the real portrait during a video call, thereby solving the technical problems in the related art that a video call is monotonous and dull because only the other party's real image can be seen, or that personal portrait privacy is difficult to protect in an online video conference because the real appearance cannot be hidden.

Brief description of the drawings

The drawings, which form a part of this application, are provided for further understanding of the application, so that its other features, objects and advantages become more apparent. The drawings of the illustrative embodiments of the application and their descriptions are used to explain the application and do not constitute an improper limitation on it. In the drawings:

Fig. 1 is a flow diagram of an instant communication method for a user terminal provided by one embodiment of the application;

Fig. 2 is a flow diagram of an instant communication method for a user terminal provided by one embodiment of the application;

Fig. 3 is a flow diagram of an instant communication method for a server provided by one embodiment of the application;

Fig. 4 is a flow diagram of an instant communication method for a server provided by one embodiment of the application;

Fig. 5 is a structural schematic diagram of an instant communication device provided by one embodiment of the application; and

Fig. 6 is a structural schematic diagram of an instant communication server provided by one embodiment of the application.

Detailed description of the embodiments

To enable those skilled in the art to better understand the solution of the application, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the application without creative effort shall fall within the scope of protection of the application.

It should be noted that, where no conflict arises, the embodiments of the application and the features in the embodiments may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.

According to Embodiment 1 of the present invention, an instant communication method applied to a user terminal is provided. As shown in Fig. 1, the method includes the following steps A1 to A6:

Step A1. The first user terminal receives and transmits the video image of the first user in instant messaging. Specifically, after the first user connects a video call with the second user through a user terminal such as a mobile phone application or computer software, a camera device such as the built-in or external camera of the mobile phone or computer captures the video call image of the first user.

Step A2. The personal image of the first user is extracted from the video image of the first user. The extraction may be performed by the processing module of the user terminal; alternatively, the video image may be transmitted to the server, and the processor of the server extracts the personal image from the video image.

Step A3. Virtual demand information is received and transmitted, where the virtual demand information describes a modification requirement for the personal image. Specifically, the first user terminal may receive the first user's selection, made in a function menu of the mobile phone application or computer software, to replace the figure image of the first user in the video call with a virtual image. The replacement may cover the whole body or only the head. Virtual images include cartoon characters, cartoon figures, celebrities, film and television actors, and the like. Alternatively, the second user terminal may receive the second user's selection, made in a function menu of the mobile phone application or computer software, to replace the figure image of the first user in the video call with a virtual image.

Step A4. A virtual portrait model is generated according to the personal image of the first user and the virtual demand information. Specifically, for example, if the virtual demand information is "cartoon character", a filter technique is used to change the parameters of the personal image, turning the personal image into a virtual portrait model in the cartoon-character style.

Step A5. An enhanced video image is generated, where the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model. Specifically, augmented reality technology is used to replace the personal image in the original video image with the virtual portrait model. Augmented reality is a technology that seamlessly integrates real-world information with virtual-world information. Entity information that is difficult to experience within a certain range of time and space in the real world (visual information, sound, taste, touch and so on) is simulated by computer and other technologies and then superimposed, so that virtual information is applied to the real world and perceived by the human senses, achieving a sensory experience beyond reality. The real environment and virtual objects are superimposed in real time onto the same picture or space and exist simultaneously.

Step A6. The enhanced video image is sent, and the second user terminal receives the enhanced video image. Specifically, at the second user terminal, the second user can see an augmented reality video in the video call with the first user, that is, a video that shows the virtual figure image within the real background of the first user.

It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one given here.
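As a non-limiting illustration, steps A2, A4 and A5 can be sketched for a single frame as follows. The application does not specify how the personal image is segmented or how the model is rendered, so everything here is an assumption: the known-background comparison in `extract_personal_image`, the flat-colour `AVATAR_COLOURS` table, and all function names are hypothetical stand-ins.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Frame = List[List[Pixel]]      # rows of pixels
Mask = List[List[bool]]        # True where the person is

# Hypothetical flat colours standing in for a real avatar renderer.
AVATAR_COLOURS = {"cartoon": (255, 200, 0), "celebrity": (200, 160, 120)}


def extract_personal_image(frame: Frame, background: Pixel) -> Mask:
    """Step A2 (assumed technique): treat every non-background pixel as the person."""
    return [[pixel != background for pixel in row] for row in frame]


def generate_virtual_model(mask: Mask, demand: str) -> Frame:
    """Step A4 (placeholder): paint the person's silhouette in the requested style."""
    colour = AVATAR_COLOURS[demand]
    return [[colour if is_person else (0, 0, 0) for is_person in row] for row in mask]


def generate_enhanced_frame(frame: Frame, mask: Mask, model: Frame) -> Frame:
    """Step A5: replace person pixels with model pixels and keep the real background."""
    return [
        [m if is_person else pixel for pixel, is_person, m in zip(row, mask_row, model_row)]
        for row, mask_row, model_row in zip(frame, mask, model)
    ]


def process_frame(frame: Frame, background: Pixel, demand: str) -> Frame:
    """Apply steps A2, A4 and A5 to one captured frame; the result is what step A6 sends."""
    mask = extract_personal_image(frame, background)
    model = generate_virtual_model(mask, demand)
    return generate_enhanced_frame(frame, mask, model)


WALL = (90, 90, 90)
captured = [[WALL, (220, 180, 150)], [WALL, WALL]]   # one skin-toned pixel on a grey wall
enhanced = process_frame(captured, WALL, "cartoon")
```

In this sketch only the pixel covered by the person changes, while the background pixels pass through untouched, which matches the description of a virtual figure shown within the real background of the first user.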

According to Embodiment 1 of the present invention, an instant communication method applied to a server is provided. As shown in Fig. 3, the method includes the following steps B1 to B6:

Step B1. The video image of the first user in instant messaging, transmitted by the first user terminal, is received. Specifically, after the first user connects a video call with the second user through a user terminal such as a mobile phone application or computer software, the server receives the video call image of the first user captured by a camera device such as the built-in or external camera of the mobile phone or computer.

Step B2. The personal image of the first user is extracted from the video image of the first user.

Step B3. Virtual demand information is received, where the virtual demand information describes a modification requirement for the personal image. Specifically, the server may receive, through the first user terminal, the first user's selection, made in a function menu of the mobile phone application or computer software, to replace the figure image of the first user in the video call with a virtual image. The replacement may cover the whole body or only the head. Virtual images include cartoon characters, cartoon figures, celebrities, film and television actors, and the like. The server may also receive, through the second user terminal, the second user's selection, made in a function menu of the mobile phone application or computer software, to replace the figure image of the first user in the video call with a virtual image.

Step B4. A virtual portrait model is generated according to the personal image of the first user and the virtual demand information. Specifically, for example, if the virtual demand information is "cartoon character", a filter technique is used to change the parameters of the personal image, turning the personal image into a virtual portrait model in the cartoon-character style.

Step B5. An enhanced video image is generated, where the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model. Specifically, augmented reality technology is used to replace the personal image in the original video image with the virtual portrait model. Augmented reality is a technology that seamlessly integrates real-world information with virtual-world information. Entity information that is difficult to experience within a certain range of time and space in the real world (visual information, sound, taste, touch and so on) is simulated by computer and other technologies and then superimposed, so that virtual information is applied to the real world and perceived by the human senses, achieving a sensory experience beyond reality. The real environment and virtual objects are superimposed in real time onto the same picture or space and exist simultaneously.

Step B6. The enhanced video image is sent to the second user terminal. Specifically, the server sends the enhanced video image to the second user terminal after generating it. At the second user terminal, the second user can see an augmented reality video in the video call with the first user, that is, a video that shows the virtual figure image within the real background of the first user.
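Step B4 says only that a "filter technique" changes the parameters of the personal image to give it a cartoon-character style; it does not name the filter. One possible reading, offered purely as an assumption, is colour posterization restricted to the person's pixels. The function names and the `levels` parameter below are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def posterize_channel(value: int, levels: int) -> int:
    """Snap one 0-255 channel value to the closest of `levels` evenly spaced values."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    index = (value * (levels - 1) + 127) // 255   # index of the closest level
    return index * 255 // (levels - 1)


def cartoonize_person(frame: List[List[Pixel]],
                      mask: List[List[bool]],
                      levels: int = 4) -> List[List[Pixel]]:
    """Flatten colours inside the person mask only; background pixels are unchanged."""
    return [
        [tuple(posterize_channel(c, levels) for c in pixel) if is_person else pixel
         for pixel, is_person in zip(row, mask_row)]
        for row, mask_row in zip(frame, mask)
    ]


frame = [[(100, 200, 30), (100, 200, 30)]]
mask = [[True, False]]                      # only the first pixel belongs to the person
stylized = cartoonize_person(frame, mask, levels=2)
```

Fewer levels give flatter, more cartoon-like colour regions. A production system would likely combine this with edge enhancement or a learned style model, neither of which is described in the application.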

According to Embodiment 1 of the present invention, an instant communication device for implementing the above instant communication method applied to a user terminal is also provided. As shown in Fig. 5, the device includes an information transmission unit, an image extraction unit, a model generation unit and a video processing unit, where:

The information transmission unit is used to receive and transmit the video image of the first user in instant messaging. Specifically, after the first user connects a video call with the second user through the wireless communication unit of a user terminal such as a mobile phone application or computer software, a camera device such as the built-in or external camera of the mobile phone or computer captures the video call image of the first user.

The image extraction unit is used to extract the personal image of the first user from the video image of the first user. Specifically, the extraction may be performed by the processing module of the user terminal; alternatively, the video image may be transmitted to the server, and the processor of the server extracts the personal image from the video image.

The information transmission unit is also used to receive and transmit virtual demand information, where the virtual demand information describes a modification requirement for the personal image. Specifically, the first user terminal may receive the first user's selection, made in a function menu of the mobile phone application or computer software, to replace the figure image of the first user in the video call with a virtual image. The replacement may cover the whole body or only the head. Virtual images include cartoon characters, cartoon figures, celebrities, film and television actors, and the like. Alternatively, the second user terminal may receive the second user's selection, made in a function menu of the mobile phone application or computer software, to replace the figure image of the first user in the video call with a virtual image.

The model generation unit is used to generate a virtual portrait model according to the personal image of the first user and the virtual demand information. Specifically, for example, if the virtual demand information is "cartoon character", a filter technique is used to change the parameters of the personal image, turning the personal image into a virtual portrait model in the cartoon-character style.

The video processing unit is used to generate an enhanced video image, where the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model. Specifically, the video processing unit of the first user terminal or of the server uses augmented reality technology to replace the personal image in the original video image with the virtual portrait model. Augmented reality is a technology that seamlessly integrates real-world information with virtual-world information. Entity information that is difficult to experience within a certain range of time and space in the real world (visual information, sound, taste, touch and so on) is simulated by computer and other technologies and then superimposed, so that virtual information is applied to the real world and perceived by the human senses, achieving a sensory experience beyond reality. The real environment and virtual objects are superimposed in real time onto the same picture or space and exist simultaneously.

The video processing unit is also used to send the enhanced video image. Specifically, the first user terminal or the server sends the augmented reality video to the second user terminal that is in the video call with the first user terminal.

The information transmission unit is also used to receive the enhanced video image. Specifically, once the second user terminal has received the enhanced video image, the second user can see an augmented reality video at the second user terminal in the video call with the first user, that is, a video that shows the virtual figure image within the real background of the first user.
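The division of the device into units can be illustrated with a small composition sketch. This is an assumption about how the units might be wired together, not the structure shown in Fig. 5: each unit is modelled as a plain callable, strings stand in for images, and the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class InstantCommunicationDevice:
    extract_image: Callable[[str], str]         # image extraction unit
    generate_model: Callable[[str, str], str]   # model generation unit
    process_video: Callable[[str, str], str]    # video processing unit (generation)
    send: Callable[[str], None]                 # sending path for the enhanced image

    def handle(self, video_image: str, virtual_demand: str) -> str:
        """Run one video image through the units in the order described above."""
        personal_image = self.extract_image(video_image)
        model = self.generate_model(personal_image, virtual_demand)
        enhanced = self.process_video(video_image, model)
        self.send(enhanced)
        return enhanced


outbox: List[str] = []   # stands in for the link to the second user terminal
device = InstantCommunicationDevice(
    extract_image=lambda video: f"person({video})",
    generate_model=lambda person, demand: f"{demand}-model[{person}]",
    process_video=lambda video, model: f"{video}+{model}",
    send=outbox.append,
)
result = device.handle("frame0", "cartoon")
```

Keeping each unit behind a callable makes it straightforward to run the extraction on the terminal or on the server, both of which the description allows.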

According to Embodiment 1 of the present invention, an instant communication server for implementing the above instant communication method applied to a server is also provided. As shown in Fig. 6, the server includes an information transmission module, an image extraction module, a model generation module and a video processing module, where:

The information transmission module is used to receive the video image of the first user in instant messaging transmitted by the first user terminal. Specifically, after the first user connects a video call with the second user through a user terminal such as a mobile phone application or computer software, the information transmission module of the server receives the video call image of the first user captured by a camera device such as the built-in or external camera of the mobile phone or computer.

The image extraction module is used to extract the personal image of the first user from the video image of the first user.

The information transmission module is also used to receive virtual demand information, where the virtual demand information describes a modification requirement for the personal image. Specifically, the server is wirelessly connected to the first user terminal, and the server may receive, through the first user terminal, the first user's selection, made in a function menu of the mobile phone application or computer software, to replace the figure image of the first user in the video call with a virtual image. The replacement may cover the whole body or only the head. Virtual images include cartoon characters, cartoon figures, celebrities, film and television actors, and the like. The server may also receive, through the second user terminal, the second user's selection, made in a function menu of the mobile phone application or computer software, to replace the figure image of the first user in the video call with a virtual image.

The model generation module is used to generate a virtual portrait model according to the personal image of the first user and the virtual demand information. Specifically, for example, if the virtual demand information is "cartoon character", a filter technique is used to change the parameters of the personal image, turning the personal image into a virtual portrait model in the cartoon-character style.

The video processing module is used to generate an enhanced video image, where the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model. Specifically, augmented reality technology is used to replace the personal image in the original video image with the virtual portrait model. Augmented reality is a technology that seamlessly integrates real-world information with virtual-world information. Entity information that is difficult to experience within a certain range of time and space in the real world (visual information, sound, taste, touch and so on) is simulated by computer and other technologies and then superimposed, so that virtual information is applied to the real world and perceived by the human senses, achieving a sensory experience beyond reality. The real environment and virtual objects are superimposed in real time onto the same picture or space and exist simultaneously.

The information transmission module is also used to send the enhanced video image to the second user terminal. Specifically, the server sends the enhanced video image to the second user terminal after generating it. At the second user terminal, the second user can see an augmented reality video in the video call with the first user, that is, a video that shows the virtual figure image within the real background of the first user.
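The virtual demand information described above carries three things: who made the request (the first or the second user), whether the whole body or only the head is replaced, and which virtual image is wanted. The application does not define a message format, so the following is only a sketch of one possible encoding; the field names, the JSON transport and the `encode`/`decode` helpers are all assumptions.

```python
import json
from dataclasses import asdict, dataclass

REQUESTERS = {"first_user", "second_user"}
SCOPES = {"whole_body", "head_only"}


@dataclass(frozen=True)
class VirtualDemand:
    requested_by: str   # which user chose the replacement in the function menu
    scope: str          # whole-body replacement or head-only replacement
    style: str          # e.g. "cartoon character", "celebrity"

    def __post_init__(self) -> None:
        # Reject values the description does not allow for.
        if self.requested_by not in REQUESTERS:
            raise ValueError(f"unknown requester: {self.requested_by}")
        if self.scope not in SCOPES:
            raise ValueError(f"unknown scope: {self.scope}")


def encode(demand: VirtualDemand) -> str:
    """Serialize the demand for transmission from a user terminal to the server."""
    return json.dumps(asdict(demand), sort_keys=True)


def decode(payload: str) -> VirtualDemand:
    """Rebuild and validate the demand on the receiving side."""
    return VirtualDemand(**json.loads(payload))


demand = VirtualDemand("second_user", "head_only", "cartoon character")
payload = encode(demand)
```

Validating on construction means a malformed request fails at the receiving side's `decode` call rather than later, during model generation.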

According to Embodiment 2 of the present invention, an instant communication method applied to a user terminal is provided. As shown in Fig. 2, the method includes the following steps:

Step A21. The personal image of the first user is extracted from the video image of the first user, which includes extracting action information of the first user from the video image of the first user, where the action information includes facial expression information and limb action information of the first user. The extraction may be performed by the processing module of the user terminal; alternatively, the video image may be transmitted to the server, and the processor of the server extracts the personal image and the corresponding facial expression information and limb action information from the video image.

Step A4. A virtual portrait model is generated according to the personal image of the first user and the virtual demand information.

Further, step A41. A virtual character skeleton is generated according to the virtual demand information.

Step A42. The facial expression information and the limb action information are associated with the virtual character skeleton to generate a virtual portrait model synchronized with the facial expression and limb actions of the first user. Specifically, the skeleton of a virtual character preset in the application or software system (including cartoon characters, cartoon figures, celebrities, film and television actors, and the like) is used, and the facial expressions, limb actions and so on of the first user in the video call are captured and tracked, to generate a virtual portrait model synchronized with the facial expression and limb actions of the first user.
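Steps A41 and A42 can be illustrated with a minimal sketch in which a preset skeleton is a set of named bones and the tracked facial expression and limb action values are written onto the matching bones for each frame. The bone names, the numeric representation of the action information and the function names are hypothetical; the application does not specify how the skeleton or the tracking data is represented.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VirtualSkeleton:
    character: str
    bones: Dict[str, float] = field(default_factory=dict)   # bone name -> pose value


def generate_skeleton(style: str) -> VirtualSkeleton:
    """Step A41 (assumed form): load a preset skeleton in its neutral pose."""
    neutral = {"jaw": 0.0, "brow": 0.0, "left_arm": 0.0, "right_arm": 0.0}
    return VirtualSkeleton(character=style, bones=neutral)


def associate(skeleton: VirtualSkeleton,
              facial_expression: Dict[str, float],
              limb_action: Dict[str, float]) -> VirtualSkeleton:
    """Step A42 (assumed form): copy tracked values onto the bones the skeleton has."""
    posed = dict(skeleton.bones)
    for tracked in (facial_expression, limb_action):
        for bone, value in tracked.items():
            if bone in posed:            # ignore tracked parts the character lacks
                posed[bone] = value
    return VirtualSkeleton(skeleton.character, posed)


skeleton = generate_skeleton("cartoon character")
posed = associate(skeleton, {"jaw": 0.4}, {"left_arm": 90.0, "tail": 1.0})
```

Calling `associate` once per captured frame with the latest tracked values is what keeps the virtual portrait model synchronized with the first user; the neutral skeleton itself is never modified.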

According to Embodiment 2 of the present invention, an instant communication method applied to a server is provided. As shown in Fig. 4, the method includes the following steps:

Step B21. The personal image of the first user is extracted from the video image of the first user, which includes extracting action information of the first user from the video image of the first user, where the action information includes facial expression information and limb action information of the first user. Specifically, the extraction may be performed by the processing module of the user terminal; alternatively, the video image may be transmitted to the server, and the processor of the server extracts the personal image and the corresponding facial expression information and limb action information from the video image.

Step B4. A virtual portrait model is generated according to the personal image of the first user and the virtual demand information.

Further, step B41. A virtual character skeleton is generated according to the virtual demand information.

Step B42. The facial expression information and the limb action information are associated with the virtual character skeleton to generate a virtual portrait model synchronized with the facial expression and limb actions of the first user. Specifically, the skeleton of a virtual character preset in the application or software system (including cartoon characters, cartoon figures, celebrities, film and television actors, and the like) is used, and the facial expressions, limb actions and so on of the first user in the video call are captured and tracked, to generate a virtual portrait model synchronized with the facial expression and limb actions of the first user.

According to Embodiment 2 of the present invention, an instant communication device for implementing the above instant communication method applied to a user terminal is also provided. As shown in Fig. 5, the device includes an information transmission unit, an image extraction unit, a model generation unit and a video processing unit, where:

The image extraction unit is configured to extract the personal image of the first user from the video image of the first user. The image extraction unit includes a movement extraction unit, which is configured to extract action information of the first user from the video image of the first user, wherein the action information includes facial expression information and limb action information of the first user. The extraction may be performed by the processing module of the user terminal; alternatively, the video image may be transmitted to the server, and the processor of the server extracts the personal image and the corresponding facial expression information and limb action information from the video image.

The model generation unit includes a skeleton generation unit and a movement association unit. The skeleton generation unit is configured to generate a virtual portrait skeleton according to the virtual demand information;

The movement association unit is configured to associate the facial expression information with the virtual portrait skeleton;

The model generation unit is configured to generate a virtual portrait model synchronized with the facial expressions of the first user. Specifically, the skeleton of a virtual character preset in an application program or software system (including cartoon characters, cartoon figures, celebrities, film and television actors, and the like) is used, and the facial expressions, limb actions, and the like of the first user in the video call are captured and tracked to generate a virtual portrait model synchronized with the facial expressions and limb actions of the first user.
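The description of the image extraction unit allows the extraction to run either on the user terminal or on the server. The following Python sketch shows one way such a choice could be expressed. The enum `ExtractionSite`, the function `extract_action_info`, and the two injected callables are hypothetical names introduced for illustration; the text does not specify how the choice is made.

```python
from enum import Enum
from typing import Callable, Dict


class ExtractionSite(Enum):
    USER_TERMINAL = "user_terminal"  # processing module of the user terminal
    SERVER = "server"                # processor of the server


def extract_action_info(
    frame: bytes,
    site: ExtractionSite,
    local_extractor: Callable[[bytes], Dict[str, dict]],
    upload_to_server: Callable[[bytes], Dict[str, dict]],
) -> Dict[str, dict]:
    """Return facial expression and limb action information for one frame.

    Both callables are assumed stand-ins: `local_extractor` runs on the
    terminal, while `upload_to_server` transmits the video image and returns
    whatever the server extracted from it.
    """
    if site is ExtractionSite.USER_TERMINAL:
        return local_extractor(frame)
    return upload_to_server(frame)
```

Passing the two extractors in as parameters keeps the sketch independent of any particular vision library or network transport.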

According to Embodiment 2 of the present invention, an instant communication server for implementing the above instant communication method applied to a server is also provided. As shown in Figure 6, the server includes an information transmission module, an image extraction module, a model generation module, and a video processing module, wherein:

The image extraction module is configured to extract the personal image of the first user from the video image of the first user. The image extraction module includes a movement extraction module, which is configured to extract action information of the first user from the video image of the first user, wherein the action information includes facial expression information and limb action information of the first user;

The model generation module includes a skeleton generation module and a movement association module. The skeleton generation module is configured to generate a virtual portrait skeleton according to the virtual demand information;

The movement association module is configured to associate the facial expression information and the limb action information with the virtual portrait skeleton;

The video processing module is configured to generate a virtual portrait model synchronized with the facial expressions and limb actions of the first user. Specifically, the skeleton of a virtual character preset in an application program or software system (including cartoon characters, cartoon figures, celebrities, film and television actors, and the like) is used, and the facial expressions, limb actions, and the like of the first user in the video call are captured and tracked to generate a virtual portrait model synchronized with the facial expressions and limb actions of the first user.
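The cooperation of the four server modules can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the server of this application: a frame is modeled as a grid of single-character pixel labels, and the class `InstantCommunicationServer` and the stand-in functions `toy_extract`, `toy_generate_model`, and `toy_render` are hypothetical.

```python
from typing import Callable, List, Tuple

Frame = List[List[str]]  # toy frame: a grid of single-character pixel labels
Mask = List[List[bool]]  # True where a pixel belongs to the personal image


class InstantCommunicationServer:
    """Toy wiring of the four modules; every callable is a stand-in."""

    def __init__(self,
                 extract: Callable[[Frame], Tuple[Mask, dict]],
                 generate_model: Callable[[dict, dict], dict],
                 render: Callable[[dict], str],
                 send: Callable[[Frame], None]):
        self.extract = extract                # image extraction module
        self.generate_model = generate_model  # model generation module
        self.render = render                  # draws the model as one pixel label
        self.send = send                      # information transmission module

    def handle(self, frame: Frame, virtual_demand: dict) -> Frame:
        mask, action = self.extract(frame)
        model = self.generate_model(virtual_demand, action)
        avatar_pixel = self.render(model)
        # Video processing: replace only the personal-image pixels and keep
        # the real background of the video call.
        enhanced = [
            [avatar_pixel if mask[y][x] else pixel for x, pixel in enumerate(row)]
            for y, row in enumerate(frame)
        ]
        self.send(enhanced)  # forward to the second user terminal
        return enhanced


def toy_extract(frame: Frame) -> Tuple[Mask, dict]:
    # Assumption: pixels labeled "P" belong to the first user.
    mask = [[pixel == "P" for pixel in row] for row in frame]
    return mask, {"smile": 1.0}


def toy_generate_model(virtual_demand: dict, action: dict) -> dict:
    return {"character": virtual_demand["character"], "pose": action}


def toy_render(model: dict) -> str:
    return model["character"][0].upper()
```

With these stand-ins, a frame `[["B", "P"], ["P", "B"]]` and the demand `{"character": "cat"}` yield `[["B", "C"], ["C", "B"]]`: the person pixels are replaced while the background pixels pass through unchanged.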

It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings of this application are used to distinguish similar objects and are not used to describe a particular order or sequence. For example, it should be understood that the first user terminal and the second user terminal are interchangeable under appropriate circumstances, that is, both parties to the video call can convert their real portraits into virtual portraits that are incorporated into the real background of the video call, so as to apply the embodiments of this application described here. In addition, the terms "comprising" and "having" and any variations thereof are intended to cover a non-exclusive inclusion. For example, an instant communication device containing a series of units is not necessarily limited to the voice instant messaging units that are expressly listed, but may include voice instant messaging units that are not expressly listed or that are inherent to the instant communication device.

In addition, the terms "installed", "arranged", "provided with", "connection", "connected", and "socketed" shall be understood in a broad sense. For example, a connection may be a fixed connection, a detachable connection, or a one-piece construction; it may be a mechanical connection or an electrical connection; it may be a direct connection, an indirect connection through an intermediary, or an internal connection between two devices, elements, or components. For those of ordinary skill in the art, the specific meanings of the above terms in this application can be understood according to the specific circumstances.

Obviously, those skilled in the art should understand that each module or each step of the above invention can be implemented by a general-purpose computing device. The modules or steps can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be fabricated as an individual integrated circuit module, or multiple modules or steps among them can be fabricated as a single integrated circuit module. In this way, the present invention is not limited to any specific combination of hardware and software.

The foregoing is merely the preferred embodiments of this application and is not intended to limit this application. For those skilled in the art, various modifications and changes to this application are possible. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this application shall be included within the scope of protection of this application.

Claims

1. An instant communication method, characterized by comprising:

a first user terminal receiving and transmitting a video image of a first user in instant messaging;

extracting a personal image of the first user from the video image of the first user;

receiving and transmitting virtual demand information, wherein the virtual demand information is used to describe a modification demand for the personal image;

generating a virtual portrait model according to the personal image of the first user and the virtual demand information;

generating an enhanced video image, wherein the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model;

sending the enhanced video image, and a second user terminal receiving the enhanced video image.

2. The method according to claim 1, characterized in that extracting the personal image of the first user from the video image of the first user includes extracting action information of the first user from the video image of the first user, wherein the action information includes facial expression information of the first user;

generating the virtual portrait model according to the personal image of the first user and the virtual demand information includes:

generating a virtual portrait skeleton according to the virtual demand information;

associating the facial expression information with the virtual portrait skeleton to generate a virtual portrait model synchronized with the facial expressions of the first user.

3. The method according to claim 2, characterized in that the action information further includes limb action information of the first user;

generating the virtual portrait model according to the personal image of the first user and the virtual demand information further includes:

associating the limb action information with the virtual portrait skeleton to generate a virtual portrait model synchronized with the limb actions of the first user.

4. An instant communication method, characterized by comprising:

receiving a video image of a first user in instant messaging transmitted by a first user terminal;

receiving virtual demand information, wherein the virtual demand information is used to describe a modification demand for the personal image;

sending the enhanced video image to a second user terminal.

5. The method according to claim 4, characterized in that extracting the personal image of the first user from the video image of the first user includes extracting action information of the first user from the video image of the first user, wherein the action information includes facial expression information and limb action information of the first user;

the facial expression information and the limb action information are associated with the virtual portrait skeleton to generate a virtual portrait model synchronized with the facial expressions and limb actions of the first user.

6. An instant communication device, characterized by comprising an information transmission unit, an image extraction unit, a model generation unit, and a video processing unit, wherein:

the information transmission unit is configured to receive and transmit a video image of a first user in instant messaging;

the image extraction unit is configured to extract a personal image of the first user from the video image of the first user;

the information transmission unit is further configured to receive and transmit virtual demand information, wherein the virtual demand information is used to describe a modification demand for the personal image;

the model generation unit is configured to generate a virtual portrait model according to the personal image of the first user and the virtual demand information;

the video processing unit is configured to generate an enhanced video image, wherein the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model;

the video processing unit is further configured to send the enhanced video image;

the information transmission unit is further configured to receive the enhanced video image.

7. The device according to claim 6, characterized in that the image extraction unit includes a movement extraction unit, and the movement extraction unit is configured to extract facial expression information of the first user from the video image of the first user;

the model generation unit includes a skeleton generation unit and a movement association unit, wherein:

the skeleton generation unit is configured to generate a virtual portrait skeleton according to the virtual demand information;

the model generation unit is configured to generate a virtual portrait model synchronized with the facial expressions of the first user.

8. The device according to claim 7, characterized in that the movement extraction unit is further configured to extract limb action information of the first user from the video image of the first user;

the movement association unit is further configured to associate the limb action information with the virtual portrait skeleton;

the model generation unit is further configured to generate a virtual portrait model synchronized with the limb actions of the first user.

9. An instant communication server, characterized by comprising an information transmission module, an image extraction module, a model generation module, and a video processing module, wherein:

the information transmission module is configured to receive a video image of a first user in instant messaging transmitted by a first user terminal;

the image extraction module is configured to extract a personal image of the first user from the video image of the first user;

the information transmission module is further configured to receive virtual demand information, wherein the virtual demand information is used to describe a modification demand for the personal image;

the model generation module is configured to generate a virtual portrait model according to the personal image of the first user and the virtual demand information;

the video processing module is configured to generate an enhanced video image, wherein the enhanced video image is obtained by replacing the personal image in the video image with the virtual portrait model;

the information transmission module is further configured to send the enhanced video image to a second user terminal.

10. The server according to claim 9, characterized in that the image extraction module includes a movement extraction module, and the movement extraction module is configured to extract action information of the first user from the video image of the first user, wherein the action information includes facial expression information and limb action information of the first user;

the model generation module includes a skeleton generation module and a movement association module, wherein:

the skeleton generation module is configured to generate a virtual portrait skeleton according to the virtual demand information;

the movement association module is configured to associate the facial expression information and the limb action information with the virtual portrait skeleton;

the video processing module is configured to generate a virtual portrait model synchronized with the facial expressions and limb actions of the first user.