CN107679519A - Virtual-human-based multi-modal interaction processing method and system - Google Patents
Publication number: CN107679519A · Application: CN201711026544.5A · Authority: CN (China) · Legal status: Pending
Classifications
- G06V20/64 — Scenes; scene-specific elements; three-dimensional objects
- G06T17/00 — Three-dimensional [3D] modelling, e.g. data description of 3D objects
- G06T19/20 — Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
- G06V40/161 — Human faces: detection; localisation; normalisation
- G06V40/168 — Human faces: feature extraction; face representation
- G06V40/174 — Facial expression recognition
- G06T2219/2016 — Indexing scheme for editing of 3D models: rotation, translation, scaling
Abstract
The present application provides a virtual-human-based multi-modal interaction processing method and system. Multi-modal data of an imitated person is acquired, the imitated person's three-dimensional face image is extracted from the multi-modal data, the image is parsed to determine its key points, and those key points are bound to the corresponding nodes of the virtual human's three-dimensional face model. The imitated person's three-dimensional face image is then acquired and parsed in real time, the key points obtained from parsing are mapped onto the nodes of the virtual human's three-dimensional face model, and imitation data is generated and output. As a result, the virtual human's three-dimensional face model can present a lifelike, fluent human-computer interaction effect and improve the user experience.
Description
Technical field
The present application relates to the field of artificial intelligence, and in particular to a virtual-human-based multi-modal interaction processing method and system, a virtual human, and a storage medium.
Background art
With the continuous development of science and technology and the introduction of information technology, computer technology and artificial intelligence technology, robotics research has gradually moved beyond the industrial domain and extended into fields such as medical care, health care, the home, entertainment and the service industry. Accordingly, people's expectations of robots have risen from simple, repetitive mechanical actions to intelligent robots capable of anthropomorphic question answering, autonomy and interaction with other robots, and human-computer interaction has thus become a key factor in the development of intelligent robots.
Current robots include physical robots that possess a body and virtual robots installed on hardware devices. Virtual robots in the prior art cannot conduct multi-modal interaction: they always present a fixed, unchanging state and cannot achieve a lifelike, fluent, anthropomorphic interaction effect.
Improving the interaction and presentation capabilities of virtual robots is therefore a major problem that urgently needs to be solved.
Summary of the invention
In view of this, the present application provides a virtual-human-based multi-modal interaction processing method and system, a virtual human, and a storage medium, so as to solve the technical deficiencies in the prior art.
In one aspect, the present application provides a virtual-human-based multi-modal interaction processing method, wherein the virtual human runs on a smart device, the method comprising:
acquiring multi-modal data of an imitated person;
extracting the imitated person's three-dimensional face image from the multi-modal data;
parsing the three-dimensional face image and determining its key points;
binding the key points to the corresponding nodes of the virtual human's three-dimensional face model;
acquiring and parsing the imitated person's three-dimensional face image in real time;
mapping the key points of the imitated person's three-dimensional face image obtained from parsing onto the nodes of the virtual human's three-dimensional face model, and generating imitation data;
outputting the imitation data.
Optionally, before acquiring the multi-modal data of the imitated person, the method comprises: waking up the virtual human and displaying it in a preset display area.
Optionally, the virtual human is generated from a high-polygon 3D model and possesses a preset appearance and skills; the virtual human comprises an application program or executable file running on the smart device, or a hologram projected by the smart device.
Optionally, the operating system used by the smart device includes a WINDOWS system, a MAC OS system, or a system built into a holographic device.
Optionally, the preset display area includes the display interface of the smart device or the projection area of the smart device.
Optionally, outputting the imitation data comprises: causing the virtual human's three-dimensional face model to imitate the imitated person according to the received imitation data, while outputting multi-modal interaction data.
Optionally, extracting the imitated person's three-dimensional face image from the multi-modal data comprises:
parsing the multi-modal data of the imitated person to decide on and output multi-modal interaction data, where the parsing includes semantic understanding, visual recognition, affective computing and cognitive computing;
when the parsing result contains an intention to use the imitation skill, enabling the imitation skill and starting an acquisition device to acquire the imitated person's three-dimensional face image.
Optionally, when the parsing determines that the imitated person's head or face has rotated, mapping the key points of the imitated person's three-dimensional face image obtained from parsing onto the nodes of the virtual human's three-dimensional face model and generating imitation data comprises:
determining a rotation matrix from the differences between the key points of the three-dimensional face images before and after the imitated person's head rotation, as obtained from parsing;
determining the rotation angle of the imitated person's head rotation from the rotation matrix;
controlling the virtual human's three-dimensional face model according to the rotation angle so as to imitate the imitated person's head rotation.
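The rotation-matrix and rotation-angle determinations above can be sketched as follows. This is a minimal illustration, assuming the key points before and after the head rotation are available as corresponding 3-D point sets; the Kabsch algorithm used here is one standard way to recover a rigid rotation from such correspondences, not necessarily the exact procedure of the present application:

```python
import numpy as np

def rotation_from_keypoints(before, after):
    """Estimate the rigid rotation mapping keypoints `before` to `after`
    (Kabsch algorithm). Both arguments are (N, 3) arrays of 3-D face keypoints."""
    p = before - before.mean(axis=0)        # center both point clouds
    q = after - after.mean(axis=0)
    h = p.T @ q                             # cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against reflections
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T

def rotation_angle(r):
    """Overall rotation angle (radians) encoded by rotation matrix `r`."""
    return np.arccos(np.clip((np.trace(r) - 1.0) / 2.0, -1.0, 1.0))
```

The recovered matrix and angle would then drive the virtual human's head nodes by the same rotation.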
Optionally, the key points include: distribution points of the bones, muscles and/or facial features of the face.
Optionally, the nodes of the virtual human's three-dimensional face model include: the eyebrows, eyes, eyelids, mouth and/or corners of the mouth.
In another aspect, the present application further provides a virtual-human-based multi-modal interaction processing system comprising a smart device and a server, wherein the smart device comprises an acquisition module, a generation module and an output module, and the server comprises an extraction module, a determination module, a binding module and a parsing module, wherein:
the acquisition module is configured to acquire multi-modal data of an imitated person;
the extraction module is configured to extract the imitated person's three-dimensional face image from the multi-modal data;
the determination module is configured to parse the three-dimensional face image and determine its key points;
the binding module is configured to bind the key points to the corresponding nodes of the virtual human's three-dimensional face model;
the parsing module is configured to acquire and parse the imitated person's three-dimensional face image in real time;
the generation module is configured to map the key points of the imitated person's three-dimensional face image obtained from parsing onto the nodes of the virtual human's three-dimensional face model and generate imitation data;
the output module is configured to output the imitation data.
Optionally, the output module is configured to cause the virtual human's three-dimensional face model to imitate the imitated person according to the received imitation data, while outputting multi-modal interaction data.
Optionally, the server comprises:
a data parsing module configured to parse the multi-modal data of the imitated person so as to decide on and output multi-modal interaction data, where the parsing includes semantic understanding, visual recognition, affective computing and cognitive computing;
an image acquisition module configured to enable the imitation skill and start an acquisition device to acquire the imitated person's three-dimensional face image when the parsing result contains an intention to use the imitation skill.
Optionally, the generation module comprises:
a rotation matrix determination submodule configured, when the parsing determines that the imitated person's head or face has rotated, to determine a rotation matrix from the differences between the key points of the three-dimensional face images before and after the imitated person's head rotation, as obtained from parsing;
a rotation angle determination submodule configured to determine the rotation angle of the imitated person's head rotation from the rotation matrix;
a head rotation imitation submodule configured to control the virtual human's three-dimensional face model according to the rotation angle so as to imitate the imitated person's head rotation.
In another aspect, the present application further provides a virtual human that performs the above virtual-human-based multi-modal interaction processing method.
In another aspect, the present application further provides a storage medium storing computer instructions that, when executed, perform the above virtual-human-based multi-modal interaction processing method.
In the virtual-human-based multi-modal interaction processing method and system, virtual human and storage medium provided by the present application, multi-modal data of an imitated person is acquired; the imitated person's three-dimensional face image is extracted from the multi-modal data; the image is parsed to determine its key points; the key points are bound to the corresponding nodes of the virtual human's three-dimensional face model; the imitated person's three-dimensional face image is acquired and parsed in real time; the key points obtained from parsing are mapped onto the nodes of the virtual human's three-dimensional face model; and imitation data is generated and output, so that the virtual human's three-dimensional face model can present a lifelike, fluent human-computer interaction effect and improve the user experience.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of a virtual-human-based multi-modal interaction processing system provided by an embodiment of the present application;
Fig. 2 is a flowchart of a virtual-human-based multi-modal interaction processing method provided by an embodiment of the present application;
Fig. 3 is a flowchart of a virtual-human-based multi-modal interaction processing method provided by an embodiment of the present application;
Fig. 4 is a flowchart of a virtual-human-based multi-modal interaction processing method provided by an embodiment of the present application;
Fig. 5 is a flowchart of a virtual-human-based multi-modal interaction processing method provided by an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a virtual-human-based multi-modal interaction processing system provided by an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a virtual-human-based multi-modal interaction processing system provided by an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a virtual-human-based multi-modal interaction processing system provided by an embodiment of the present application.
Detailed description of the embodiments
Many specific details are set forth in the following description to facilitate a full understanding of the present application. However, the present application can be implemented in many ways other than those described here, and those skilled in the art can make similar extensions without departing from the spirit of the present application; the present application is therefore not limited by the specific implementations disclosed below.
The present application provides a virtual-human-based multi-modal interaction processing method and system, a virtual human, and a storage medium, which are described in detail one by one in the following embodiments.
In the present application, the virtual human runs on a smart device. The smart device may be an intelligent computing device such as a desktop computer, a notebook or palmtop computer, a mobile smart device, or an intelligent holographic projection device; mobile smart devices include smartphones, intelligent robots and the like.
The attributes possessed by the virtual human may include a virtual human identifier, social attributes, personality attributes, character skills and the like. Specifically, the social attributes may include attribute fields such as appearance, name, gender, birthplace, age, family relationships, occupation, position, religious belief, emotional state and educational background; the personality attributes may include attribute fields such as character and temperament; and the character skills may include professional skills such as singing, dancing, storytelling and training.
In the present application, the virtual human's attributes make the parsing of multi-modal interaction and the decision results more inclined toward, or better suited to, that virtual human. The system can invoke the attribute information to control the virtual human's wake-up, active, de-wake and log-out states; these belong to the additional attribute information that distinguishes a virtual human from a real person.
In the present application, the intelligent holographic projection device may use a system built into the holographic device, while other smart devices may use a WINDOWS system or a MAC OS system. Accordingly, the virtual human may be a hologram produced by intelligent holographic projection, or an application program or executable file running on the smart device.
Referring to Fig. 1, which is a schematic structural diagram of the virtual-human-based multi-modal interaction system of the embodiment of the present application, the system includes a smart device 120 and a server, and the server may be a cloud brain 110.
The smart device 120 may include a user interface 121, a communication module 122, a central processing unit 123 and a human-computer interaction input/output module 124. The user interface 121 displays the woken-up virtual human in a preset display area. The human-computer interaction input/output module 124 acquires multi-modal data and outputs the virtual human's execution parameters; the multi-modal data includes data from the surrounding environment and multi-modal input data from interaction with the user (at least including facial image information). The communication module 122 invokes the virtual human capability interfaces and receives the multi-modal output data decided from the multi-modal input data parsed by those interfaces. The central processing unit 123 uses the target-face and virtual-human relative position information in the multi-modal output data to calculate the execution parameters by which the virtual human's head rotates toward the direction of the target face.
The cloud brain 110 possesses a multi-modal data parsing module (also called the "virtual human capability interfaces"), which parses the multi-modal data sent by the smart device 120 and decides on multi-modal output data; the multi-modal output data includes the target-face and virtual-human relative position information.
As shown in Fig. 1, each capability interface invokes its corresponding logical processing during multi-modal data parsing. The interfaces are explained below.
The semantic understanding interface 111 receives the voice information forwarded from the communication module 122 and performs voice recognition and natural language processing on it based on a large corpus.
The visual recognition interface 112 can perform video content detection, recognition, tracking and the like on human bodies, faces, scenes and so on according to computer vision algorithms and deep learning algorithms; that is, images are recognized according to predetermined algorithms to give quantitative detection results. It possesses image preprocessing functions, feature extraction functions, decision functions and concrete application functions. Image preprocessing may be basic processing of the acquired visual data, including color space conversion, edge extraction, image transformation and image thresholding; feature extraction can extract feature information such as the skin color, color, texture, motion and coordinates of a target in the image; decision-making distributes the feature information, according to a certain decision strategy, to the concrete applications that need it; and the concrete application functions implement functions such as face detection, human limb recognition and motion detection.
The affective computing interface 114 receives the multi-modal data forwarded from the communication module 122 and calculates the user's current emotional state using affective computing logic (which may be emotion recognition technology). Emotion recognition technology is an important component of affective computing; its research covers facial expressions, voice, behavior, text and physiological signal recognition, through which the user's emotional state can be judged. Emotion recognition technology may monitor the user's emotional state through visual emotion recognition alone, or through a combination of visual emotion recognition and acoustic emotion recognition, and is not limited thereto. In this embodiment, the combination of the two is preferably used to monitor emotion.
When performing visual emotion recognition, the affective computing interface 114 collects images of human facial expressions using an image acquisition device, converts them into analyzable data, and then performs expression and emotion analysis using techniques such as image processing. Understanding facial expressions usually requires detecting subtle changes in expression, such as changes in the cheek muscles and mouth, or raised eyebrows.
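One such subtle change, a raised eyebrow, could be detected from two landmarks as in the toy sketch below; the landmark names and the 1.15 ratio are illustrative assumptions, not values given by the present application:

```python
def eyebrow_raised(landmarks, baseline, ratio=1.15):
    """Crude expression cue: the eyebrow is judged 'raised' when the current
    brow-to-eye distance exceeds the neutral `baseline` distance by `ratio`.
    `landmarks` maps names to (x, y) points; the names are illustrative."""
    bx, by = landmarks["left_brow"]
    ex, ey = landmarks["left_eye"]
    dist = ((bx - ex) ** 2 + (by - ey) ** 2) ** 0.5
    return dist > ratio * baseline
```

A production system would of course use many landmarks and a trained classifier rather than a single distance ratio.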
The cognitive computing interface 113 receives the multi-modal data forwarded from the communication module 122 and performs data acquisition, recognition and learning on it in order to obtain user profiles, knowledge graphs and the like, so as to make rational decisions on the multi-modal output data.
The above is a schematic technical solution of the virtual-human-based multi-modal interaction system of the embodiment of the present application. To help those skilled in the art understand the technical solution of the present application, the virtual-human-based multi-modal interaction processing method and system, the virtual human and the storage medium of the present application are described in further detail below through a number of embodiments.
Referring to Fig. 2, an embodiment of the present application provides a virtual-human-based multi-modal interaction processing method, wherein the virtual human runs on a smart device, comprising steps 201 to 207.
Step 201: acquire the multi-modal data of the imitated person.
In the embodiment of the present application, the imitated person is the user who communicates with the virtual human; when the virtual human carries the appearance of a celebrity, the imitated person may be a fan of that celebrity.
The multi-modal data may be collected from the imitated person's natural language, visual perception, tactile perception, spoken voice, emotional expressions, actions and the like.
Optionally, before acquiring the multi-modal data of the imitated person, the method comprises: waking up the virtual human and displaying it in a preset display area.
In the embodiment of the present application, the virtual human is generated from a high-polygon 3D model and possesses a preset appearance and skills; for example, the virtual human may have the appearance of a Chinese woman and possess the skill of imitating facial expressions.
The preset display area may include the display interface of a smart device or the projection area of an intelligent holographic projection device.
In the embodiment of the present application, the virtual human may be in standby, sleep or similar modes, and is woken up automatically or manually when face imitation is needed. For example, the virtual human may be an application program running on a smartphone that, once opened, displays the facial image of a famous Chinese movie star and performs imitation by acquiring the imitated person's facial expressions; when the application is not in use it enters a temporary resting state in the background, and when it is needed again it can be switched back manually from the background, whereupon the virtual human running in the application is woken up.
In addition, the virtual human may also be a hologram projected by an intelligent holographic projection device, in which case the projection area of the hologram is the display area of the virtual human.
Step 202: extract the imitated person's three-dimensional face image from the multi-modal data.
Referring to Fig. 3, in the embodiment of the present application, extracting the imitated person's three-dimensional face image from the multi-modal data comprises steps 301 to 302.
Step 301: parse the multi-modal data of the imitated person to decide on and output multi-modal interaction data.
The parsing includes semantic understanding, visual recognition, affective computing and cognitive computing. That is, the multi-modal data includes data from the surrounding environment and multi-modal data from interaction with the imitated person; the virtual human capability interfaces are invoked to parse the multi-modal data from interaction with the imitated person, and multi-modal interaction data is decided on and output.
In the embodiment of the present application, the imitated person's data is parsed by the server to generate the multi-modal data, which is then transmitted to the smart device on which the virtual human runs.
Step 302: when the parsing result contains an intention to use the imitation skill, enable the imitation skill and start the acquisition device to acquire the imitated person's three-dimensional face image.
In the embodiment of the present application, when the smart device receives the parsing result and recognizes that it contains an intention to use the imitation skill, it starts the acquisition device to acquire the imitated person's three-dimensional face image; the acquisition device may be a video camera, webcam or the like, either built into or external to the smart device.
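The intention check in step 302 might be sketched as below; the dictionary fields and keyword list are illustrative assumptions, since the present application leaves the parsing result's format to the capability interfaces:

```python
def has_imitate_intent(parsed_result):
    """Return True when the parsed multi-modal result carries an
    'imitate skill' intention. The keyword matching here is only a
    stand-in for the semantic understanding interface described above."""
    keywords = ("imitate", "copy my face", "mimic")
    text = parsed_result.get("utterance", "").lower()
    return parsed_result.get("intent") == "imitate" or any(k in text for k in keywords)
```

When this check passes, the device would enable the imitation skill and start the camera.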
Step 203: parse the three-dimensional face image and determine its key points.
In the embodiment of the present application, the key points may include distribution points of the bones, muscles and/or facial features of the face, and each distribution point is bound to a corresponding coordinate point.
Step 204: bind the key points to the corresponding nodes of the virtual human's three-dimensional face model.
In the embodiment of the present application, the nodes of the virtual human's three-dimensional face model may include the eyebrows, eyes, eyelids, mouth and/or corners of the mouth, each individually controlled by a node; the recognized node data is then bound to the corresponding key points of the imitated person's facial image.
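The binding in step 204 can be pictured as a simple name-based association; the key-point and node names here are illustrative assumptions:

```python
def bind_keypoints(face_keypoints, model_nodes):
    """Bind each detected face keypoint to the virtual human's model node
    of the same name; unmatched entries are skipped. Both arguments map
    names to positions, and the names are illustrative."""
    return {name: (face_keypoints[name], model_nodes[name])
            for name in face_keypoints if name in model_nodes}
```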
Step 205: acquire and parse the imitated person's three-dimensional face image in real time.
Step 206: map the key points of the imitated person's three-dimensional face image obtained from parsing onto the nodes of the virtual human's three-dimensional face model, and generate imitation data.
In the embodiment of the present application, when the imitated person's three-dimensional face image changes, the distribution points of the bones, muscles and/or facial features of the face also change with the movement of the corresponding coordinate points, and the nodes of the virtual human's three-dimensional face model bound to those distribution points change synchronously, generating a series of imitation data.
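The synchronous update described above can be sketched as propagating each key point's displacement to its bound node; the coordinate names are illustrative, and a real system would also rescale between the face-image and model coordinate spaces:

```python
def map_to_model(prev_kp, curr_kp, node_positions, bindings):
    """Propagate each keypoint's frame-to-frame displacement to its bound
    model node, producing one frame of 'imitation data'. All dicts map
    names to (x, y, z) tuples; `bindings` maps keypoint name -> node name."""
    frame = {}
    for kp_name, node_name in bindings.items():
        delta = tuple(c - p for c, p in zip(curr_kp[kp_name], prev_kp[kp_name]))
        frame[node_name] = tuple(n + d for n, d in zip(node_positions[node_name], delta))
    return frame
```

Run once per acquired frame, this yields the series of imitation data the output step consumes.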
Step 207: output the imitation data.
In the embodiment of the present application, the virtual human's three-dimensional face model imitates the imitated person according to the series of received imitation data; for example, the virtual human can imitate blinking, mouth shapes, head rotation and so on.
In the virtual-human-based multi-modal interaction processing method provided by the present application, the imitated person's three-dimensional face image is acquired and parsed in real time, the key points obtained from parsing are mapped onto the nodes of the virtual human's three-dimensional face model, imitation data is generated and finally output, so that the virtual human's three-dimensional face model can present a lifelike, fluent human-computer interaction effect and improve the user experience.
Referring to Fig. 4, an embodiment of the present application provides a virtual-human-based multi-modal interaction processing method comprising steps 401 to 409.
Step 401: wake up the virtual human and display it in a preset display area.
In the embodiment of the present application, the virtual human takes a highly realistic 3D virtual character image as its main user interface and possesses distinctive character features in appearance; it supports multi-modal human-computer interaction and possesses AI capabilities such as natural language understanding, visual perception, tactile perception, spoken voice output, and emotional expression and action output.
Step 402: acquire the multi-modal data of the imitated person.
Step 403: parse the multi-modal data of the imitated person to decide on and output multi-modal interaction data.
Step 404: when the parsing result contains an intention to use the imitation skill, enable the imitation skill and start the acquisition device to acquire the imitated person's three-dimensional face image.
Step 405: parse the three-dimensional face image and determine its key points.
Step 406: bind the key points to the corresponding nodes of the virtual human's three-dimensional face model.
Step 407: acquire and parse the imitated person's three-dimensional face image in real time.
Step 408: map the key points of the imitated person's three-dimensional face image obtained from parsing onto the nodes of the virtual human's three-dimensional face model, and generate imitation data.
Step 409: the virtual human's three-dimensional face model imitates the imitated person according to the received imitation data, while outputting multi-modal interaction data.
In the multi-modal interaction processing method provided by this embodiment, the virtual human is deployed on a smart device that supports perception, control, and other input/output modules, and its social attributes, personality attributes, character skills, and the like are configured as needed, giving the user an intelligent and personalized experience.
Referring to Fig. 5, taking as an example a virtual human that runs on a smartphone and imitates head rotation, an embodiment of the present application provides a multi-modal interaction processing method based on a virtual human, comprising steps 501 to 510.
Step 501: Wake the virtual human and display it in a preset display area on the smartphone.
In this embodiment, the virtual human may be generated from a high-poly 3D model and possess a preset appearance and skills, for example the likeness of a famous Chinese movie actress whose facial expressions can be imitated. An APP installed on the smartphone is opened, the virtual human runs inside the APP, the virtual human is woken, and it is displayed in the APP's preset display area, for example the center of the smartphone screen.
Step 502: Acquire multi-modal data of the imitated person.
In this embodiment, the imitated person may be a family member, Xiao A; the following description takes the virtual human modeling Xiao A's three-dimensional facial image as an example.
The multi-modal data may be data generated by collecting the imitated person's natural language, visual perception, touch perception, speech, emotional expressions, actions, and so on, for example Xiao A's speech, the things Xiao A sees, the feel of objects Xiao A touches, and the sounds, moods, and actions Xiao A produces.
Step 503: Parse the multi-modal data of the imitated person and output multi-modal interaction data according to the resulting decision.
In this embodiment, the parsing includes semantic understanding, visual recognition, affective computing, and cognitive computing, for example computing over the collected data described above: Xiao A's speech, the things seen, the feel of touched objects, and the sounds, moods, and actions produced.
Step 504: When the parsing result contains an intention to invoke the imitation skill, enable the imitation skill and start an acquisition device to capture the imitated person's three-dimensional facial image.
In this embodiment, when parsing the multi-modal data reveals Xiao A's intention to invoke the imitation skill, the virtual human's imitation skill is enabled and the acquisition device on the smartphone is started to capture Xiao A's three-dimensional facial image. The acquisition device in this embodiment may be the smartphone's camera.
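Steps 503 and 504 decide, from the parsed multi-modal data, whether the imitation skill should be enabled. The patent describes full semantic understanding; the sketch below stands in for that step with a hypothetical keyword rule — the function name and trigger phrases are illustrative assumptions, not part of the patent.

```python
def wants_imitation(utterance: str) -> bool:
    # Placeholder intent check standing in for the semantic-understanding step;
    # the trigger phrases are illustrative assumptions.
    triggers = ("imitate me", "copy my face", "mimic my expression")
    text = utterance.lower()
    return any(t in text for t in triggers)

# e.g. Xiao A says: "Can you imitate me?"
intent = wants_imitation("Can you imitate me?")
```

When the check succeeds, the virtual human would enable the imitation skill and start the camera, as described above.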
Step 505: Parse the three-dimensional facial image and determine its key points.
In this embodiment, Xiao A's facial image is parsed; the distribution points of the facial bones, muscles, features, and so on are taken as key points, and the initial coordinate position of each key point is recorded.
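One minimal way to hold the parsed key points, sketched below, is to record each point's initial 3D coordinates as a reference and compute per-point displacements against later frames. The point names and coordinates are illustrative assumptions, not values from the patent.

```python
import numpy as np

class FaceKeypoints:
    """Stores named face key points and their recorded initial coordinates."""

    def __init__(self, points):
        # points: {name: (x, y, z)} in the camera coordinate frame (assumed)
        self.initial = {k: np.asarray(v, dtype=float) for k, v in points.items()}

    def displacement(self, current):
        # Per-keypoint offset of the current frame from the recorded initial pose.
        return {k: np.asarray(current[k], dtype=float) - self.initial[k]
                for k in self.initial}

kp = FaceKeypoints({"left_eye": (-3, 1, 0), "right_eye": (3, 1, 0),
                    "mouth": (0, -2, 0)})
# A later frame in which the whole face has moved one unit along z:
d = kp.displacement({"left_eye": (-3, 1, 1), "right_eye": (3, 1, 1),
                     "mouth": (0, -2, 1)})
```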
Step 506: Bind the key points to the corresponding nodes of the virtual human's three-dimensional face model.
In this embodiment, the key points of Xiao A's facial image are bound to the corresponding nodes of the virtual human model's three-dimensional face model.
The nodes include the eyebrows, eyes, eyelids, mouth, and/or mouth corners. Each node is controlled independently, so isolated imitation actions such as blinking, furrowing the brows, and/or opening the mouth can be realized.
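The binding of step 506 can be sketched as a mapping from detected key points to named model nodes, so that driving one key point moves only its bound node (enabling an isolated blink, for example). The class, node names, and key-point names below are assumptions for illustration.

```python
class AvatarFaceModel:
    """Sketch of the virtual human's face model with independently driven nodes."""

    NODES = ("eyebrow", "eye", "eyelid", "mouth", "mouth_corner")

    def __init__(self):
        self.bindings = {}

    def bind(self, keypoint_name, node_name):
        # Bind a detected key point to one model node (step 506).
        if node_name not in self.NODES:
            raise ValueError(f"unknown node: {node_name}")
        self.bindings[keypoint_name] = node_name

    def drive(self, keypoint_name, offset):
        # Return the (node, offset) pair a renderer would apply; only the
        # bound node moves, so isolated actions like a blink are possible.
        return self.bindings[keypoint_name], offset

model = AvatarFaceModel()
model.bind("upper_eyelid_l", "eyelid")
node, off = model.drive("upper_eyelid_l", (-0.2, 0.0, 0.0))
```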
Step 507: Acquire and parse the imitated person's three-dimensional facial image in real time.
In this embodiment, Xiao A's three-dimensional facial image is acquired and parsed in real time, i.e. Xiao A's facial expression is captured continuously, so that the virtual human model's three-dimensional facial image can imitate it synchronously, avoiding the poor user experience caused by imitation lag.
Step 508: When the parsing determines that the imitated person's head is rotating, determine a rotation matrix from the difference between the key points of the three-dimensional facial images before and after the head rotation, as obtained by the parsing.
In this embodiment, when the coordinates of the key points of Xiao A's three-dimensional facial image change by shifting toward one side of the global coordinate system, it can be determined that Xiao A's head is rotating; the rotation matrix is then determined from the difference between the key points of the three-dimensional facial images before and after Xiao A's head rotation, as obtained by the parsing.
Whether the face in the three-dimensional facial image is turning, or the head is shaking, is determined in the same way.
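The patent does not spell out how the rotation matrix is computed from the key-point differences. One standard choice, sketched below with NumPy, is the Kabsch algorithm: an SVD of the cross-covariance of the centered before/after key-point sets yields the best-fit proper rotation. The key-point coordinates and the 30-degree test rotation are illustrative.

```python
import numpy as np

def rotation_from_keypoints(before, after):
    """Best-fit rotation R with after ≈ before @ R.T (Kabsch algorithm)."""
    P = np.asarray(before, float) - np.mean(before, axis=0)  # center both sets
    Q = np.asarray(after, float) - np.mean(after, axis=0)
    U, _, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])          # guard against reflections
    return Vt.T @ D @ U.T               # maps "before" points onto "after"

# A 30-degree head turn about the vertical (y) axis, applied to three key points:
theta = np.radians(30)
R_true = np.array([[np.cos(theta), 0, np.sin(theta)],
                   [0, 1, 0],
                   [-np.sin(theta), 0, np.cos(theta)]])
before = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.5, 1.0]])
after = before @ R_true.T
R = rotation_from_keypoints(before, after)   # recovers R_true
```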
Step 509: Determine the rotation angle of the imitated person's head rotation from the rotation matrix.
In this embodiment, the rotation matrix is used to compute the multi-dimensional rotation angle from each captured frame of the face; this angle is the precise rotation angle of the imitated person Xiao A's head.
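The angle of step 509 can be recovered from any 3D rotation matrix through the standard trace identity trace(R) = 1 + 2·cos θ; this is ordinary linear algebra, not a computation the patent specifies. The 30-degree yaw matrix below is an illustrative test case.

```python
import numpy as np

def rotation_angle(R):
    """Rotation angle in degrees of a 3x3 rotation matrix, via its trace."""
    c = (np.trace(R) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))  # clip guards rounding

theta = np.radians(30)
R = np.array([[np.cos(theta), 0, np.sin(theta)],
              [0, 1, 0],
              [-np.sin(theta), 0, np.cos(theta)]])
angle = rotation_angle(R)   # ≈ 30 degrees
```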
Step 510: Control the virtual human's three-dimensional face model to imitate the imitated person's head rotation according to the rotation angle.
In this embodiment, once the rotation angle is obtained, the virtual human model's three-dimensional facial image can perform the corresponding rotation according to that angle, realizing a more accurate imitation.
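As an illustration of step 510, the recovered angle can be applied to the model's node coordinates. The sketch below assumes a yaw about the vertical (y) axis; the node name and position are hypothetical.

```python
import numpy as np

def rotate_nodes(nodes, angle_deg):
    """Rotate each model node about the vertical y axis by angle_deg degrees."""
    t = np.radians(angle_deg)
    R = np.array([[np.cos(t), 0, np.sin(t)],
                  [0, 1, 0],
                  [-np.sin(t), 0, np.cos(t)]])
    return {name: R @ np.asarray(p, float) for name, p in nodes.items()}

nodes = {"nose_tip": (0.0, 0.0, 1.0)}
turned = rotate_nodes(nodes, 90.0)   # nose tip swings from +z toward +x
```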
The multi-modal interaction processing method based on a virtual human provided by this embodiment enables the virtual human to imitate the user's facial expressions in real time as a multi-modal interactive skill, achieving a lifelike, fluid, anthropomorphic interaction effect.
Fig. 6 to Fig. 8 are structural schematic diagrams of a multi-modal interaction processing system based on a virtual human provided by embodiments of the present application. Since the system embodiments are substantially similar to the method embodiments, they are described only schematically; for related details, refer to the description of the method embodiments.
Referring to Fig. 6, the present application provides a multi-modal interaction processing system based on a virtual human, comprising a smart device and a server. The smart device comprises an acquisition module 601, a generation module 606, and an output module 607; the server comprises an extraction module 602, a determination module 603, a binding module 604, and a parsing module 605, wherein:
the acquisition module 601 is configured to acquire multi-modal data of the imitated person;
the extraction module 602 is configured to extract the imitated person's three-dimensional facial image from the multi-modal data;
the determination module 603 is configured to parse the three-dimensional facial image and determine its key points;
the binding module 604 is configured to bind the key points to the corresponding nodes of the virtual human's three-dimensional face model;
the parsing module 605 is configured to acquire and parse the imitated person's three-dimensional facial image in real time;
the generation module 606 is configured to map the key points of the parsed three-dimensional facial image onto the nodes of the virtual human's three-dimensional face model to generate imitation data;
the output module 607 is configured to output the imitation data.
Optionally, the smart device comprises a wake module configured to wake the virtual human and display it in a preset display area; the virtual human runs on the smart device.
Optionally, the virtual human is generated from a high-poly 3D model and possesses a preset appearance and skills.
The virtual human comprises an application program or executable file running on the smart device, or a hologram projected by the smart device.
Optionally, the operating system used by the smart device includes WINDOWS, MAC OS, or the built-in system of a holographic device.
Optionally, the preset display area includes the display interface of the smart device or the projection area of the smart device.
Optionally, the output module is configured to have the virtual human's three-dimensional face model imitate the imitated person according to the received imitation data while outputting multi-modal interaction data.
Optionally, referring to Fig. 7, the server comprises:
a data parsing module 701, configured to parse the multi-modal data of the imitated person and output multi-modal interaction data according to the resulting decision, the parsing including semantic understanding, visual recognition, affective computing, and cognitive computing; and
an image acquisition module 702, configured to, when the parsing result contains an intention to invoke the imitation skill, enable the imitation skill and start the acquisition device to capture the imitated person's three-dimensional facial image.
Optionally, referring to Fig. 8, the generation module 606 comprises:
a rotation matrix determination submodule 801, configured to, when the parsing determines that the imitated person's head or face is rotating, determine a rotation matrix from the difference between the key points of the three-dimensional facial images before and after the imitated person's head rotation, as obtained by the parsing;
a rotation angle determination submodule 802, configured to determine the rotation angle of the imitated person's head rotation from the rotation matrix; and
a head rotation imitation submodule 803, configured to control the virtual human's three-dimensional face model to imitate the imitated person's head rotation according to the rotation angle.
Optionally, the key points include the distribution points of the facial bones, muscles, and/or features.
Optionally, the nodes of the virtual human's three-dimensional face model include the eyebrows, eyes, eyelids, mouth, and/or mouth corners.
In the multi-modal interaction processing system based on a virtual human provided by the present application, the imitated person's three-dimensional facial image is acquired and parsed in real time, the key points of the parsed image are mapped onto the nodes of the virtual human's three-dimensional face model to generate imitation data, and the imitation data is finally output, so that the virtual human's three-dimensional face model presents a lifelike, fluid human-computer interaction effect and improves the user experience.
The smart device of the present application may comprise a processor and a memory, the memory storing computer instructions that the processor invokes to perform the foregoing multi-modal interaction processing method based on a virtual human.
The processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the smart device, connecting every part of the smart device via various interfaces and lines.
The memory mainly comprises a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created according to the use of the device (such as audio data or a phone book). The memory may comprise high-speed random access memory, and may also comprise non-volatile memory such as a hard disk, internal memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another solid-state storage component.
An embodiment of the present application also provides a virtual human that performs the above multi-modal interaction processing method based on a virtual human.
The above is an exemplary scheme of the virtual human of this embodiment. It should be noted that the technical scheme of the virtual human and the technical scheme of the above multi-modal interaction processing method based on a virtual human belong to the same concept; for details not described in the technical scheme of the virtual human, refer to the description of the technical scheme of the method.
An embodiment of the present application also provides a storage medium storing computer instructions that perform the above multi-modal interaction processing method based on a virtual human.
The above is an exemplary scheme of the storage medium of this embodiment. It should be noted that the technical scheme of the storage medium and the technical scheme of the above multi-modal interaction processing method based on a virtual human belong to the same concept; for details not described in the technical scheme of the storage medium, refer to the description of the technical scheme of the method.
The computer instructions comprise computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content included in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that, for brevity of description, the foregoing method embodiments are all expressed as series of action combinations; however, those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions and modules involved are not necessarily all required by the present application.
In the above embodiments, each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
The preferred embodiments of the present application disclosed above are intended only to help illustrate the present application. The alternative embodiments do not exhaustively describe all details, nor do they limit the invention to the specific embodiments described. Obviously, many modifications and variations can be made according to the content of this specification. These embodiments were chosen and specifically described in order to better explain the principles and practical application of the present application, so that those skilled in the art can well understand and utilize it. The present application is limited only by the claims and their full scope and equivalents.
Claims (16)
- 1. A multi-modal interaction processing method based on a virtual human, wherein the virtual human runs on a smart device, the method comprising: acquiring multi-modal data of an imitated person; extracting the imitated person's three-dimensional facial image from the multi-modal data; parsing the three-dimensional facial image and determining key points of the three-dimensional facial image; binding the key points to corresponding nodes of the virtual human's three-dimensional face model; acquiring and parsing the imitated person's three-dimensional facial image in real time; mapping the key points of the parsed three-dimensional facial image onto the nodes of the virtual human's three-dimensional face model to generate imitation data; and outputting the imitation data.
- 2. The method according to claim 1, wherein before acquiring the multi-modal data of the imitated person, the method comprises: waking the virtual human and displaying the virtual human in a preset display area.
- 3. The method according to claim 1, wherein the virtual human is generated from a high-poly 3D model and possesses a preset appearance and skills; and the virtual human comprises an application program or executable file running on the smart device, or a hologram projected by the smart device.
- 4. The method according to claim 1, wherein the operating system used by the smart device includes WINDOWS, MAC OS, or the built-in system of a holographic device.
- 5. The method according to claim 2, wherein the preset display area includes the display interface of the smart device or the projection area of the smart device.
- 6. The method according to claim 1, wherein outputting the imitation data comprises: having the virtual human's three-dimensional face model imitate the imitated person according to the received imitation data while outputting multi-modal interaction data.
- 7. The method according to claim 1, wherein extracting the imitated person's three-dimensional facial image from the multi-modal data comprises: parsing the multi-modal data of the imitated person and outputting multi-modal interaction data according to the resulting decision, the parsing including semantic understanding, visual recognition, affective computing, and cognitive computing; and when the parsing result contains an intention to invoke the imitation skill, enabling the imitation skill and starting an acquisition device to capture the imitated person's three-dimensional facial image.
- 8. The method according to claim 1, wherein, when the parsing determines that the imitated person's head or face is rotating, mapping the key points of the parsed three-dimensional facial image onto the nodes of the virtual human's three-dimensional face model to generate imitation data comprises: determining a rotation matrix from the difference between the key points of the three-dimensional facial images before and after the imitated person's head rotation, as obtained by the parsing; determining the rotation angle of the imitated person's head rotation from the rotation matrix; and controlling the virtual human's three-dimensional face model to imitate the imitated person's head rotation according to the rotation angle.
- 9. The method according to claim 1, wherein the key points include: distribution points of the facial bones, muscles, and/or features.
- 10. The method according to claim 1, wherein the nodes of the virtual human's three-dimensional face model include: the eyebrows, eyes, eyelids, mouth, and/or mouth corners.
- 11. A multi-modal interaction processing system based on a virtual human, comprising a smart device and a server, the smart device comprising an acquisition module, a generation module, and an output module, and the server comprising an extraction module, a determination module, a binding module, and a parsing module, wherein: the acquisition module is configured to acquire multi-modal data of an imitated person; the extraction module is configured to extract the imitated person's three-dimensional facial image from the multi-modal data; the determination module is configured to parse the three-dimensional facial image and determine key points of the three-dimensional facial image; the binding module is configured to bind the key points to corresponding nodes of the virtual human's three-dimensional face model; the parsing module is configured to acquire and parse the imitated person's three-dimensional facial image in real time; the generation module is configured to map the key points of the parsed three-dimensional facial image onto the nodes of the virtual human's three-dimensional face model to generate imitation data; and the output module is configured to output the imitation data.
- 12. The system according to claim 11, wherein the output module is configured to have the virtual human's three-dimensional face model imitate the imitated person according to the received imitation data while outputting multi-modal interaction data.
- 13. The system according to claim 12, wherein the server comprises: a data parsing module, configured to parse the multi-modal data of the imitated person and output multi-modal interaction data according to the resulting decision, the parsing including semantic understanding, visual recognition, affective computing, and cognitive computing; and an image acquisition module, configured to, when the parsing result contains an intention to invoke the imitation skill, enable the imitation skill and start an acquisition device to capture the imitated person's three-dimensional facial image.
- 14. The system according to claim 11, wherein the generation module comprises: a rotation matrix determination submodule, configured to, when the parsing determines that the imitated person's head or face is rotating, determine a rotation matrix from the difference between the key points of the three-dimensional facial images before and after the imitated person's head rotation, as obtained by the parsing; a rotation angle determination submodule, configured to determine the rotation angle of the imitated person's head rotation from the rotation matrix; and a head rotation imitation submodule, configured to control the virtual human's three-dimensional face model to imitate the imitated person's head rotation according to the rotation angle.
- 15. A virtual human, wherein the virtual human performs the method according to any one of claims 1-10.
- 16. A storage medium storing computer instructions, wherein the computer instructions perform the method according to any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711026544.5A CN107679519A (en) | 2017-10-27 | 2017-10-27 | A kind of multi-modal interaction processing method and system based on visual human |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711026544.5A CN107679519A (en) | 2017-10-27 | 2017-10-27 | A kind of multi-modal interaction processing method and system based on visual human |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107679519A true CN107679519A (en) | 2018-02-09 |
Family
ID=61143468
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711026544.5A Pending CN107679519A (en) | 2017-10-27 | 2017-10-27 | A kind of multi-modal interaction processing method and system based on visual human |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107679519A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108335345A (en) * | 2018-02-12 | 2018-07-27 | 北京奇虎科技有限公司 | The control method and device of FA Facial Animation model, computing device |
CN109117770A (en) * | 2018-08-01 | 2019-01-01 | 吉林盘古网络科技股份有限公司 | FA Facial Animation acquisition method, device and terminal device |
CN109278051A (en) * | 2018-08-09 | 2019-01-29 | 北京光年无限科技有限公司 | Exchange method and system based on intelligent robot |
CN110751717A (en) * | 2019-09-10 | 2020-02-04 | 平安科技(深圳)有限公司 | Virtual head model construction method and device, computer equipment and storage medium |
CN111360819A (en) * | 2020-02-13 | 2020-07-03 | 平安科技(深圳)有限公司 | Robot control method and device, computer device and storage medium |
CN112528978A (en) * | 2021-02-10 | 2021-03-19 | 腾讯科技(深圳)有限公司 | Face key point detection method and device, electronic equipment and storage medium |
CN112667068A (en) * | 2019-09-30 | 2021-04-16 | 北京百度网讯科技有限公司 | Virtual character driving method, device, equipment and storage medium |
CN113379880A (en) * | 2021-07-02 | 2021-09-10 | 福建天晴在线互动科技有限公司 | Automatic expression production method and device |
CN114998816A (en) * | 2022-08-08 | 2022-09-02 | 深圳市指南针医疗科技有限公司 | Skeleton AI video-based case improvement method, device and storage medium |
CN115052030A (en) * | 2022-06-27 | 2022-09-13 | 北京蔚领时代科技有限公司 | Virtual digital person control method and system |
CN115458128A (en) * | 2022-11-10 | 2022-12-09 | 北方健康医疗大数据科技有限公司 | Method, device and equipment for generating digital human body image based on key points |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102376100A (en) * | 2010-08-20 | 2012-03-14 | 北京盛开互动科技有限公司 | Single-photo-based human face animating method |
CN104346824A (en) * | 2013-08-09 | 2015-02-11 | 汉王科技股份有限公司 | Method and device for automatically synthesizing three-dimensional expression based on single facial image |
CN204945944U (en) * | 2015-07-08 | 2016-01-06 | 赵刚 | A kind of holographic interaction image system |
CN106447785A (en) * | 2016-09-30 | 2017-02-22 | 北京奇虎科技有限公司 | Method for driving virtual character and device thereof |
CN106919899A (en) * | 2017-01-18 | 2017-07-04 | 北京光年无限科技有限公司 | The method and system for imitating human face expression output based on intelligent robot |
CN106959839A (en) * | 2017-03-22 | 2017-07-18 | 北京光年无限科技有限公司 | A kind of human-computer interaction device and method |
CN206411651U (en) * | 2016-11-23 | 2017-08-15 | 朴明义 | A kind of virtual imaging system |
- 2017-10-27: CN CN201711026544.5A patent/CN107679519A/en active Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102376100A (en) * | 2010-08-20 | 2012-03-14 | 北京盛开互动科技有限公司 | Single-photo-based human face animating method |
CN104346824A (en) * | 2013-08-09 | 2015-02-11 | 汉王科技股份有限公司 | Method and device for automatically synthesizing three-dimensional expression based on single facial image |
CN204945944U (en) * | 2015-07-08 | 2016-01-06 | 赵刚 | A kind of holographic interaction image system |
CN106447785A (en) * | 2016-09-30 | 2017-02-22 | 北京奇虎科技有限公司 | Method for driving virtual character and device thereof |
CN206411651U (en) * | 2016-11-23 | 2017-08-15 | 朴明义 | A kind of virtual imaging system |
CN106919899A (en) * | 2017-01-18 | 2017-07-04 | 北京光年无限科技有限公司 | The method and system for imitating human face expression output based on intelligent robot |
CN106959839A (en) * | 2017-03-22 | 2017-07-18 | 北京光年无限科技有限公司 | A kind of human-computer interaction device and method |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108335345A (en) * | 2018-02-12 | 2018-07-27 | 北京奇虎科技有限公司 | The control method and device of FA Facial Animation model, computing device |
CN108335345B (en) * | 2018-02-12 | 2021-08-24 | 北京奇虎科技有限公司 | Control method and device of facial animation model and computing equipment |
CN109117770A (en) * | 2018-08-01 | 2019-01-01 | 吉林盘古网络科技股份有限公司 | FA Facial Animation acquisition method, device and terminal device |
CN109278051A (en) * | 2018-08-09 | 2019-01-29 | 北京光年无限科技有限公司 | Exchange method and system based on intelligent robot |
CN110751717A (en) * | 2019-09-10 | 2020-02-04 | 平安科技(深圳)有限公司 | Virtual head model construction method and device, computer equipment and storage medium |
CN112667068A (en) * | 2019-09-30 | 2021-04-16 | 北京百度网讯科技有限公司 | Virtual character driving method, device, equipment and storage medium |
CN111360819A (en) * | 2020-02-13 | 2020-07-03 | 平安科技(深圳)有限公司 | Robot control method and device, computer device and storage medium |
CN111360819B (en) * | 2020-02-13 | 2022-09-27 | 平安科技(深圳)有限公司 | Robot control method and device, computer device and storage medium |
CN112528978B (en) * | 2021-02-10 | 2021-05-14 | 腾讯科技(深圳)有限公司 | Face key point detection method and device, electronic equipment and storage medium |
CN112528978A (en) * | 2021-02-10 | 2021-03-19 | 腾讯科技(深圳)有限公司 | Face key point detection method and device, electronic equipment and storage medium |
CN113379880A (en) * | 2021-07-02 | 2021-09-10 | 福建天晴在线互动科技有限公司 | Automatic expression production method and device |
CN113379880B (en) * | 2021-07-02 | 2023-08-11 | 福建天晴在线互动科技有限公司 | Expression automatic production method and device |
CN115052030A (en) * | 2022-06-27 | 2022-09-13 | 北京蔚领时代科技有限公司 | Virtual digital person control method and system |
CN114998816A (en) * | 2022-08-08 | 2022-09-02 | 深圳市指南针医疗科技有限公司 | Skeleton AI video-based case improvement method, device and storage medium |
CN115458128A (en) * | 2022-11-10 | 2022-12-09 | 北方健康医疗大数据科技有限公司 | Method, device and equipment for generating digital human body image based on key points |
CN115458128B (en) * | 2022-11-10 | 2023-03-24 | 北方健康医疗大数据科技有限公司 | Method, device and equipment for generating digital human body image based on key points |
Similar Documents
Publication | Publication Date | Title
---|---|---
CN107679519A (en) | | Multi-modal interaction processing method and system based on visual human |
CN107944542A (en) | | Multi-modal interactive output method and system based on visual human |
CN107894833A (en) | | Multi-modal interaction processing method and system based on visual human |
CN110163054B (en) | | Method and device for generating human face three-dimensional image |
CN105378742B (en) | | Managed biometric identity |
CN107797663A (en) | | Multi-modal interaction processing method and system based on visual human |
CN107765852A (en) | | Multi-modal interaction processing method and system based on visual human |
CN107765856A (en) | | Visual human visual processing method and system based on multi-modal interaction |
CN109271018A (en) | | Interaction method and system based on visual human behavior standard |
CN105931506B (en) | | Children's coloring system based on augmented reality and display method thereof |
CN108665492A (en) | | Dance teaching data processing method and system based on visual human |
CN107831905A (en) | | Virtual image interaction method and system based on holographic projection equipment |
CN108942919A (en) | | Interaction method and system based on visual human |
JP2019537758A (en) | | Control method, controller, smart mirror, and computer-readable storage medium |
CN108052250A (en) | | Virtual idol performance data processing method and system based on multi-modal interaction |
CN109324688A (en) | | Interaction method and system based on visual human behavior standard |
CN109035373A (en) | | Three-dimensional special effect program file package generation and three-dimensional special effect generation method and device |
CN109343695A (en) | | Interaction method and system based on visual human behavior standard |
CN109032328A (en) | | Interaction method and system based on visual human |
CN108595012A (en) | | Visual interaction method and system based on visual human |
CN109086860A (en) | | Interaction method and system based on visual human |
CN109035415B (en) | | Virtual model processing method, device, equipment and computer-readable storage medium |
CN108416420A (en) | | Limb interaction method and system based on visual human |
CN110837294A (en) | | Facial expression control method and system based on eyeball tracking |
CN109542389A (en) | | Sound effect control method and system for multi-modal story content output |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 2018-02-09