CN109271018A - Interaction method and system based on virtual human behavioral standards - Google Patents
Interaction method and system based on virtual human behavioral standards Download PDF Info
- Publication number
- CN109271018A CN201810955494.7A CN201810955494A
- Authority
- CN
- China
- Prior art keywords
- data
- human
- modal
- interaction
- virtual human
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The present invention provides an interaction method based on virtual human behavioral standards. The virtual human is displayed by a smart device and, when in an interactive state, activates its speech, emotion, vision and sensing capabilities. The method includes: obtaining multi-modal interaction data and parsing it to obtain the user's interaction intent; generating, according to the interaction intent, multi-modal response data and virtual human emotion expression data matching the multi-modal response data, wherein the emotion expression data conveys the virtual human's current mood through its facial expressions and limb actions; and outputting the multi-modal response data in coordination with the emotion expression data. The present invention provides a virtual human capable of multi-modal interaction with the user. Moreover, when outputting multi-modal response data, the invention can also output matching virtual human emotion expression data, which conveys the virtual human's current mood, so that the user enjoys a lifelike, human-like interactive experience.
Description
Technical field
The present invention relates to the field of artificial intelligence, and in particular to an interaction method and system based on virtual human behavioral standards.
Background art
The development of multi-modal interactive systems for robots aims to imitate human conversation in context, attempting to reproduce the interaction that takes place between people. At present, however, the development of multi-modal interactive systems for virtual humans remains incomplete: no virtual human yet carries out multi-modal interaction, and, more importantly, there is no interactive product whose interaction is based on the virtual human's own behavioral standards.
The present invention therefore provides an interaction method and system based on virtual human behavioral standards.
Summary of the invention
To solve the above problems, the present invention provides an interaction method based on virtual human behavioral standards. The virtual human is displayed by a smart device and, when in an interactive state, activates its speech, emotion, vision and sensing capabilities. The method comprises the following steps:
obtaining multi-modal interaction data and parsing it to obtain the user's interaction intent;
generating, according to the interaction intent, multi-modal response data and virtual human emotion expression data matching the multi-modal response data, wherein the emotion expression data conveys the virtual human's current mood through its facial expressions and limb actions;
outputting the multi-modal response data in coordination with the emotion expression data.
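The three claimed steps can be sketched in code. This is a minimal illustration only, not the patent's implementation; the intent rule, the response table and every name below are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class EmotionExpression:
    facial_expression: str   # e.g. "smile"
    limb_action: str         # e.g. "wave"

def parse_intent(interaction_data: dict) -> str:
    """Step 1: parse multi-modal interaction data into a user intent."""
    text = interaction_data.get("text", "")
    return "greeting" if "hello" in text.lower() else "unknown"

def generate_response(intent: str) -> tuple:
    """Step 2: generate response data plus matching emotion expression data."""
    if intent == "greeting":
        return "Hello!", EmotionExpression("smile", "wave")
    return "Could you rephrase that?", EmotionExpression("neutral", "tilt_head")

def output(response: str, emotion: EmotionExpression) -> dict:
    """Step 3: output the response in coordination with the emotion expression."""
    return {"speech": response,
            "face": emotion.facial_expression,
            "body": emotion.limb_action}

result = output(*generate_response(parse_intent({"text": "Hello there"})))
```

The point of the sketch is the data flow: the emotion expression data is generated together with, and emitted alongside, the response data rather than derived afterwards.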
According to one embodiment of the present invention, the step of generating multi-modal response data and matching virtual human emotion expression data according to the interaction intent further comprises the steps of:
determining the current emotional parameters of the virtual human according to the interaction intent and the context of the interaction;
generating, according to the emotional parameters, emotion expression data matching the multi-modal response data.
According to one embodiment of the present invention, the emotional parameters include a positive-emotion parameter, an anger parameter and a fear parameter.
According to one embodiment of the present invention, the step of generating emotion expression data matching the multi-modal response data according to the emotional parameters comprises the steps of:
generating virtual human facial expression data and virtual human limb action data matching the emotional parameters, wherein the facial expression data and the limb action data both belong to the virtual human emotion expression data.
According to one embodiment of the present invention, the limb action data include any one or any combination of head action data, hand action data, arm action data and torso action data.
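The matching of emotional parameters to expression data described above can be illustrated as a lookup keyed by the dominant parameter. The table entries and category names are assumptions made for the example; the patent does not publish its own mapping.

```python
# Maps the three emotion parameters named in the text (positive, angry,
# fearful) to illustrative facial-expression and limb-action data.
EXPRESSION_TABLE = {
    "positive": {"face": "smile",      "head": "nod",     "hand": "open_palm"},
    "angry":    {"face": "frown",      "head": "lower",   "hand": "clench"},
    "fearful":  {"face": "widen_eyes", "head": "retract", "hand": "raise"},
}

def emotion_expression(params: dict) -> dict:
    """Pick the expression data matching the highest-valued emotion parameter."""
    dominant = max(params, key=params.get)
    return {"emotion": dominant, **EXPRESSION_TABLE[dominant]}

data = emotion_expression({"positive": 0.7, "angry": 0.1, "fearful": 0.2})
```

A real system would presumably blend parameters rather than pick a single winner; the winner-takes-all rule is only the simplest way to show the parameter-to-expression matching.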
According to one embodiment of the present invention, when outputting the multi-modal response data:
if the user's view on the current interaction topic agrees with the virtual human's, the limb action output is a nodding action and an agreement gesture;
if the user's view on the current interaction topic is opposed to the virtual human's, the limb action output is a head-shaking action and a disagreement gesture.
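The agree/disagree rule above reduces to a two-way branch on detected stance. How the stance is computed is not specified in the text, so the function below simply takes both views as inputs; the action labels are illustrative.

```python
def select_limb_action(user_view: str, virtual_human_view: str) -> dict:
    """Nod and gesture agreement when views match; shake head otherwise."""
    if user_view == virtual_human_view:
        return {"head": "nod", "gesture": "agree"}
    return {"head": "shake", "gesture": "disagree"}

action = select_limb_action("in_favor", "in_favor")
```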
According to one embodiment of the present invention, during the interaction the virtual human outputs preset actions carrying a specific intent in order to engage the user.
According to another aspect of the present invention, an interactive device based on virtual human behavioral standards is also provided. The device includes:
an interaction intent acquisition module, configured to obtain multi-modal interaction data and parse it to obtain the user's interaction intent;
a generation module, configured to generate, according to the interaction intent, multi-modal response data and virtual human emotion expression data matching the multi-modal response data, wherein the emotion expression data conveys the virtual human's current mood through its expressions and limb actions;
an output module, configured to output the multi-modal response data in coordination with the emotion expression data.
According to another aspect of the present invention, a program product is also provided, containing a series of instructions for executing the method steps of any of the above.
According to another aspect of the present invention, an interactive system based on virtual human behavioral standards is also provided. The system includes:
a smart device on which the virtual human is mounted, configured to obtain multi-modal interaction data and equipped with speech, emotion, expression and action output capabilities, the smart device including holographic equipment;
a cloud brain, configured to perform semantic understanding, visual recognition, cognitive computation and emotion computation on the multi-modal interaction data, so as to decide the multi-modal response data output by the virtual human.
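The division of labor in the claimed system, the device collecting inputs and the cloud brain deciding the response, can be sketched as two cooperating functions. Both bodies are placeholders standing in for the four cloud-side capabilities the text names; none of this reflects the actual protocol.

```python
def cloud_brain(interaction_data: dict) -> dict:
    """Stand-in for semantic understanding, visual recognition, cognitive
    computation and emotion computation on the forwarded data."""
    semantics = interaction_data.get("text", "").strip().lower()
    emotion = "positive" if "thanks" in semantics else "neutral"
    return {"reply": f"Understood: {semantics}", "emotion": emotion}

def smart_device(raw_inputs: dict) -> dict:
    """Collects multi-modal inputs locally and delegates the decision to
    the cloud; the device then renders speech, expression and action."""
    return cloud_brain(raw_inputs)

out = smart_device({"text": "Thanks for the help"})
```

The design point the sketch preserves is that the device never decides the response itself: it only acquires data and renders whatever decision the cloud brain returns.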
The interaction method and system based on virtual human behavioral standards provided by the present invention supply a virtual human that has a preset image and preset attributes and can interact with the user in multiple modalities. Moreover, when outputting multi-modal response data, the method and system can also output matching virtual human emotion expression data that conveys the virtual human's current mood, enabling fluent exchange between the user and the virtual human and giving the user a lifelike, human-like interactive experience.
Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from it, or be understood through practice of the invention. The objectives and other advantages of the invention can be realized and obtained through the structure particularly pointed out in the specification, claims and accompanying drawings.
Brief description of the drawings
The accompanying drawings provide a further understanding of the present invention and form part of the specification. Together with the embodiments of the invention, they serve to explain the invention and are not to be construed as limiting it. In the drawings:
Fig. 1 is an interaction diagram of an interactive system based on virtual human behavioral standards according to an embodiment of the invention;
Fig. 2 is a structural block diagram of an interactive system based on virtual human behavioral standards according to an embodiment of the invention;
Fig. 3 is a module block diagram of an interactive system based on virtual human behavioral standards according to an embodiment of the invention;
Fig. 4 is a structural block diagram of an interactive system based on virtual human behavioral standards according to another embodiment of the invention;
Fig. 5 is a flowchart of an interaction method based on virtual human behavioral standards according to an embodiment of the invention;
Fig. 6 is a flowchart of generating virtual human emotion expression data in an interaction method based on virtual human behavioral standards according to an embodiment of the invention;
Fig. 7 is a flowchart of outputting multi-modal response data in an interaction method based on virtual human behavioral standards according to an embodiment of the invention;
Fig. 8 is a schematic diagram of the matching between emotional parameters and emotion expression data according to an embodiment of the invention;
Fig. 9 is another flowchart of an interaction method based on virtual human behavioral standards according to an embodiment of the invention; and
Fig. 10 is a flowchart of the communication among the user, the smart device and the cloud brain according to an embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
For clarity, the following needs to be stated before the embodiments:
The virtual human of the present invention is mounted on a smart device supporting perception, control and other input/output modules. It uses a highly realistic 3D virtual character image as its main user interface and has a distinctive character appearance. It supports multi-modal human-computer interaction, with AI capabilities such as natural language understanding, visual perception, touch perception, speech output, and emotional expression and action output. Its social attributes, personality attributes, character skills and the like are configurable, giving the user an intelligent and personalized, fluent experience with the virtual character.
The smart devices on which the virtual human is mounted are those with non-touch, non-mouse-and-keyboard screen input (holographic screens, TV screens, multimedia display screens, LED screens, etc.) that carry a camera; they may be holographic equipment, VR equipment or PCs. Other smart devices are not excluded, such as handheld tablets, naked-eye 3D equipment and even smartphones.
The virtual human interacts with the user at the system level. An operating system runs on the system hardware, such as the built-in system of holographic equipment, or Windows or Mac OS in the case of a PC.
The virtual human is a system application or an executable file.
The virtual robot obtains the user's multi-modal interaction data through the hardware of the smart device and, supported by the capabilities of the cloud brain, performs semantic understanding, visual recognition, cognitive computation and emotion computation on the multi-modal interaction data to complete the decision-and-output process.
The cloud brain mentioned here is the terminal that provides the virtual human with the processing capability for semantic understanding (language semantic understanding, action semantic understanding, visual recognition, emotion computation, cognitive computation) of the user's interaction demands, realizing the interaction with the user and deciding the multi-modal response data output by the virtual human.
Each embodiment of the present invention is described in detail below with reference to the accompanying drawings.
Fig. 1 is an interaction diagram of an interactive system based on virtual human behavioral standards according to an embodiment of the invention. As shown in Fig. 1, multi-modal interaction involves a user 101, a smart device 102, a virtual human 103 and a cloud brain 104. The user 101 interacting with the virtual human may be a real person, another virtual human or an embodied virtual human; the interaction of another virtual human or an embodied virtual human with the virtual human is similar to that of a single person with the virtual human. Therefore, Fig. 1 only shows the multi-modal interaction between a (human) user and the virtual human.
In addition, the smart device 102 includes a display area 1021 and hardware support equipment 1022 (essentially a core processor). The display area 1021 displays the image of the virtual human 103, and the hardware support equipment 1022 works with the cloud brain 104 to process the data in the interaction. The virtual human 103 requires a screen carrier for its presentation. Therefore, the display area 1021 includes holographic screens, TV screens, multimedia display screens, LED screens and the like.
The process of interaction between the virtual human and the user 101 in Fig. 1 is as follows:
The preparations or preconditions required for interaction are that the virtual human is mounted and running on the smart device 102 and that the virtual human has specific image features. The virtual human has AI capabilities such as natural language understanding, visual perception, touch perception, speech output, and emotional expression and action output. To support the virtual human's touch perception, the smart device must also be equipped with a component with touch perception capability. According to one embodiment of the present invention, to improve the interactive experience, the virtual human is displayed in a preset area as soon as it is activated.
It should be noted that the image and outfit of the virtual human 103 are not limited to one mode. The virtual human 103 may have different images and outfits. Its image is generally a high-poly 3D animated figure. Each image of the virtual human 103 may also correspond to a variety of outfits, which may be classified by season or by occasion. These images and outfits may reside in the cloud brain 104 or in the smart device 102, and can be called up whenever they are needed.
The social attributes, personality attributes and character skills of the virtual human 103 are likewise not limited to one kind. The virtual human 103 may have multiple social attributes, multiple personality attributes and multiple character skills. These social attributes, personality attributes and character skills may be combined freely rather than being fixed in one configuration, and the user may select and combine them as needed.
Specifically, the social attributes may include attributes such as appearance, name, clothing, decoration, gender, birthplace, age, family relationships, occupation, position, religious belief, relationship status and educational background; the personality attributes may include attributes such as character and temperament; the character skills may include professional skills such as singing, dancing, storytelling and training, and the display of character skills is not limited to skills shown through the limbs, expressions, head and/or mouth.
In this application, the social attributes, personality attributes, character skills and the like of the virtual human make the parsing and decision results of the multi-modal interaction lean toward, or better suit, that virtual human.
The multi-modal interaction process is as follows. First, multi-modal interaction data is obtained and parsed to obtain the user's interaction intent. The receiving devices for obtaining multi-modal interaction data are mounted or configured on the smart device 102; they include a text receiver for receiving text, a voice receiver for receiving speech, a camera for receiving vision, and infrared equipment for receiving perception information, etc.
Then, according to the interaction intent, multi-modal response data and matching virtual human emotion expression data are generated, the emotion expression data conveying the virtual human's current mood through its facial expressions and limb actions.
Finally, the multi-modal response data is output in coordination with the emotion expression data.
According to one embodiment of the present invention, during the interaction the virtual human outputs preset actions carrying a specific intent in order to engage the user.
Fig. 2 is a structural block diagram of an interactive system based on virtual human behavioral standards according to an embodiment of the invention. As shown in Fig. 2, completing multi-modal interaction requires the user 101, the smart device 102 and the cloud brain 104. The smart device 102 includes a receiving device 102A, a processing device 102B, an output device 102C and a connecting device 102D. The cloud brain 104 includes a communication device 104A.
The interactive system based on virtual human behavioral standards provided by the present invention requires unobstructed communication channels to be established among the user 101, the smart device 102 and the cloud brain 104 so that the interaction between the user 101 and the virtual human can be completed. To accomplish the interactive task, the smart device 102 and the cloud brain 104 are provided with the devices and components that support the interaction. The party interacting with the virtual human may be one party or multiple parties.
The smart device 102 includes the receiving device 102A, the processing device 102B, the output device 102C and the connecting device 102D. The receiving device 102A receives the multi-modal interaction data; examples of the receiving device 102A include a microphone for voice operation, a scanner, and a camera (detecting movements that do not involve touch, using visible or invisible wavelengths). The smart device 102 can obtain multi-modal interaction data through these input devices. The output device 102C outputs the multi-modal response data with which the virtual human interacts with the user 101; its configuration is roughly equivalent to that of the receiving device 102A and is not described again here.
The processing device 102B processes the interaction data transmitted by the cloud brain 104 during the interaction. The connecting device 102D maintains contact with the cloud brain 104, and the processing device 102B handles the multi-modal interaction data pre-processed by the receiving device 102A or the data transmitted by the cloud brain 104. The connecting device 102D sends call instructions to invoke the robot capabilities on the cloud brain 104.
The communication device 104A included in the cloud brain 104 completes the correspondence with the smart device 102. It keeps in communication with the connecting device 102D on the smart device 102, receives the requests sent by the smart device 102, and sends out the processing results issued by the cloud brain 104; it is the medium of communication between the smart device 102 and the cloud brain 104.
Fig. 3 is a module block diagram of an interactive system based on virtual human behavioral standards according to another embodiment of the invention. As shown in Fig. 3, the system includes an interaction intent acquisition module 301, a generation module 302 and an output module 303. The interaction intent acquisition module 301 includes a text acquisition unit 3011, an audio acquisition unit 3012, a vision acquisition unit 3013, a perception acquisition unit 3014 and a parsing unit 3015. The generation module 302 includes an emotional parameter determination unit 3021 and an emotion expression data generation unit 3022. The output module 303 includes a judging unit 3031.
The interaction intent acquisition module 301 obtains multi-modal interaction data and parses it to obtain the user's interaction intent. The virtual human 103 is displayed by the smart device 102 and activates its speech, emotion, vision and sensing capabilities when in an interactive state. The text acquisition unit 3011 acquires text information, the audio acquisition unit 3012 acquires audio information, the vision acquisition unit 3013 acquires visual information, and the perception acquisition unit 3014 acquires perception information. Examples of these acquisition units include a microphone for voice operation, a scanner, a camera, and sensing equipment, for instance using visible or invisible wavelength rays, signals, environmental data and so on. Multi-modal interaction data can be obtained through these input devices. The multi-modal interaction may include one or more of text, audio, visual and perception data; the present invention is not restricted in this respect.
The generation module 302 generates, according to the interaction intent, multi-modal response data and matching virtual human emotion expression data, the emotion expression data conveying the virtual human's current mood through its expressions and limb actions.
The emotional parameter determination unit 3021 determines the current emotional parameters of the virtual human according to the interaction intent and the context of the interaction. The emotional parameters of the virtual human include a positive-emotion parameter, an anger parameter and a fear parameter.
The emotion expression data generation unit 3022 generates, according to the emotional parameters, virtual human emotion expression data matching the multi-modal response data. According to one embodiment of the present invention, virtual human facial expression data and virtual human limb action data matching the emotional parameters are generated, both of which belong to the virtual human emotion expression data.
In addition, the limb action data include any one or any combination of head action data, hand action data, arm action data and torso action data.
The output module 303 outputs the multi-modal response data in coordination with the emotion expression data. The judging unit 3031 judges whether the user's view on the current interaction topic agrees with the virtual human's. When the user agrees with the virtual human on the current interaction topic, the limb action output is a nodding action and an agreement gesture. When the user's view on the current interaction topic is opposed to the virtual human's, the limb action output is a head-shaking action and a disagreement gesture.
Fig. 4 is a structural block diagram of an interactive system based on virtual human behavioral standards according to another embodiment of the invention. As shown in Fig. 4, completing the interaction requires the user 101, the smart device 102 and the cloud brain 104. The smart device 102 includes a human-machine interface 401, a data processing unit 402, an input/output device 403 and an interface unit 404. The cloud brain 104 includes a semantic understanding interface 1041, a visual recognition interface 1042, a cognitive computation interface 1043 and an emotion computation interface 1044.
The interactive system based on virtual human behavioral standards provided by the present invention includes the smart device 102 and the cloud brain 104. The virtual human 103 runs on the smart device 102, has a preset image and preset attributes, and can activate its speech, emotion, vision and sensing capabilities when in an interactive state.
In one embodiment, the smart device 102 may include the human-machine interface 401, the data processing unit 402, the input/output device 403 and the interface unit 404. The human-machine interface 401 displays the running virtual human 103 in a preset area of the smart device 102.
The data processing unit 402 processes the data generated in the multi-modal interaction between the user 101 and the virtual human 103. The processor used may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor; the processor is the control center of the terminal, connecting the various parts of the entire terminal through various interfaces and lines.
The smart device 102 includes a memory, which mainly comprises a program storage area and a data storage area. The program storage area can store the operating system and the application programs required for at least one function (for example a sound-playing function or an image-playing function); the data storage area can store data created according to the use of the smart device 102 (such as audio data and browsing records). In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device or another volatile solid-state storage device.
The input/output device 403 obtains the multi-modal interaction data and outputs the output data in the interaction. The interface unit 404 communicates with the cloud brain 104 and, through the interfaces in the cloud brain 104, invokes the virtual human capabilities in the cloud brain 104.
The cloud brain 104 includes the semantic understanding interface 1041, the visual recognition interface 1042, the cognitive computation interface 1043 and the emotion computation interface 1044. These interfaces communicate with the interface unit 404 in the smart device 102. Moreover, the cloud brain 104 also contains semantic understanding logic corresponding to the semantic understanding interface 1041, visual recognition logic corresponding to the visual recognition interface 1042, cognitive computation logic corresponding to the cognitive computation interface 1043, and emotion computation logic corresponding to the emotion computation interface 1044.
As shown in Fig. 4, during multi-modal data parsing each capability interface calls its corresponding logic processing. The interfaces are explained below:
The semantic understanding interface 1041 receives the specific voice instructions forwarded from the interface unit 404, performing speech recognition on them and natural language processing based on a large corpus.
The visual recognition interface 1042 can perform video content detection, recognition, tracking and the like for human bodies, faces and scenes according to computer vision algorithms, deep learning algorithms and so on; that is, images are recognized according to predetermined algorithms to obtain quantitative detection results. It has an image preprocessing function, a feature extraction function, a decision function and application-specific functions:
the image preprocessing function may perform basic processing on the acquired visual data, including color space conversion, edge extraction, image transformation and image thresholding;
the feature extraction function may extract feature information such as the skin color, color, texture, motion and coordinates of the target in the image;
the decision function may distribute the feature information, according to a certain decision strategy, to the specific multi-modal output devices or multi-modal output applications that need it, implementing, for example, face detection, human limb recognition and motion detection.
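The three stages of the visual recognition interface, preprocessing, feature extraction and decision, can be rendered schematically. A real system would use a vision library and the algorithms named above; here each stage is a toy placeholder operating on a tiny grayscale "image" given as nested lists, and the detection rule is invented.

```python
def preprocess(image):
    """Image thresholding, one of the basic preprocessing operations listed."""
    return [[1 if px > 127 else 0 for px in row] for row in image]

def extract_features(binary):
    """Report foreground area and pixel coordinates as toy features."""
    coords = [(r, c) for r, row in enumerate(binary)
              for c, px in enumerate(row) if px]
    return {"area": len(coords), "coords": coords}

def decide(features):
    """Decision stage: flag a detection when enough foreground is present."""
    return {"face_detected": features["area"] >= 2}

image = [[200, 30], [180, 90]]   # 2x2 grayscale values
result = decide(extract_features(preprocess(image)))
```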
The cognitive computation interface 1043 receives the multi-modal data forwarded from the interface unit 404 and performs data acquisition, recognition and learning on it to obtain the user portrait, knowledge graph and the like, so as to make rational decisions on the multi-modal output data.
The affective computation interface 1044 receives the multi-modal data forwarded from the interface unit 404 and uses affective computation logic (which may be emotion recognition technology) to compute the user's current emotional state. Emotion recognition is an important component of affective computation; its research covers the recognition of facial expressions, speech, behavior, text, physiological signals and so on, from which the user's emotional state can be judged. Emotion recognition may monitor the user's emotional state through visual emotion recognition alone, or through a combination of visual and acoustic emotion recognition, and is not limited thereto. In the present embodiment, the combination of the two is preferably used to monitor emotion.
When performing visual emotion recognition, the affective computation interface 1044 uses an image capture device to collect images of the human face, converts them into analyzable data, and then applies techniques such as image processing to analyze the expressed emotion. Understanding a facial expression usually requires detecting subtle changes in it, such as changes in the cheek muscles and mouth, or raised eyebrows.
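The combined visual-plus-acoustic monitoring preferred in this embodiment can be sketched as a late fusion of per-modality emotion scores; the emotion labels, the example scores and the 0.6 visual weight are assumptions for illustration only:

```python
def fuse_emotions(visual_scores, audio_scores, w_visual=0.6):
    """Late fusion of per-modality emotion scores (weights are assumptions).

    Returns the top fused label and the full fused score dictionary.
    """
    labels = set(visual_scores) | set(audio_scores)
    fused = {
        label: w_visual * visual_scores.get(label, 0.0)
        + (1.0 - w_visual) * audio_scores.get(label, 0.0)
        for label in labels
    }
    return max(fused, key=fused.get), fused
```

A missing label in one modality simply contributes a score of zero, so the sketch also covers the vision-only mode mentioned above by passing an empty audio dictionary.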
Fig. 5 shows a flowchart of the interaction method based on virtual human behavioral standards according to an embodiment of the present invention.
As shown in Fig. 5, first, in step S501, multi-modal interaction data is obtained and parsed to obtain the user's interaction intent. During multi-modal interaction, the virtual robot obtains the multi-modal interaction data through the receiving devices on the smart device. The multi-modal interaction data may include text data, voice data, perception data, action data and the like.
Then, in step S502, multi-modal response data and virtual-human emotion expression data matching the multi-modal response data are generated according to the interaction intent, wherein the virtual-human emotion expression data shows the current mood of the virtual human through its facial expressions and limb movements. In one embodiment, the facial expressions may include eye-expression data, mouth data, eyebrow data and the like, and the limb movement data may include head data, limb data, trunk data and the like.
Finally, in step S503, the multi-modal response data is output in cooperation with the virtual-human emotion expression data. For the virtual human to achieve a more lifelike effect, it needs to output emotion expression data representing its own mood when interacting with the user. The emotion expression data can be output together with the multi-modal response data to represent the mental state of the virtual human at that moment, giving the user a more realistic, human-like interactive experience.
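Steps S501 through S503 can be sketched as a three-stage pipeline whose understanding, generation and rendering stages are injected as callables; the toy stand-ins in the test below are invented for the sketch and do not reflect any concrete implementation in this disclosure:

```python
def interact(multimodal_input, understand, generate, render):
    """Run the S501 -> S502 -> S503 flow of the method."""
    # S501: parse the multi-modal interaction data into an interaction intent.
    intent = understand(multimodal_input)
    # S502: generate response data plus matching emotion-expression data.
    response, emotion = generate(intent)
    # S503: output the response in cooperation with the emotion expression.
    return render(response, emotion)
```

In the described system, `understand` and `generate` would be served by the cloud brain's capability interfaces, while `render` runs on the smart device.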
In addition, the visual interaction system based on the virtual human provided by the present invention can also cooperate with a program product containing a series of instructions for executing the steps of the interaction method based on virtual human behavioral standards. The program product can run computer instructions, which include computer program code; the computer program code may be in source-code form, object-code form, an executable file, some intermediate form, or the like.
The program product may include any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random access memory (RAM), an electric carrier signal, a telecommunication signal, a software distribution medium and the like.
It should be noted that the content contained in the program product may be appropriately increased or decreased according to the requirements of legislation and patent practice in each jurisdiction; for example, in certain jurisdictions, in accordance with legislation and patent practice, the program product does not include electric carrier signals and telecommunication signals.
Fig. 6 shows a flowchart of generating the virtual-human emotion expression data in the interaction method based on virtual human behavioral standards according to an embodiment of the present invention.
As shown in Fig. 6, in step S601, the emotional parameters of the current virtual human are determined according to the interaction intent and the context of the interaction. The emotional parameters include a positive-emotion parameter, an anger-emotion parameter and a fear-emotion parameter.
In step S602, virtual-human emotion expression data matching the multi-modal response data is generated according to the emotional parameters. In one embodiment, virtual-human facial expression data and virtual-human limb movement data matching the emotional parameters are generated, where both the facial expression data and the limb movement data belong to the virtual-human emotion expression data. The limb movement data includes any one or any combination of head movement data, hand movement data, arm movement data and body movement data.
Fig. 7 shows a flowchart of outputting the multi-modal response data in the interaction method based on virtual human behavioral standards according to an embodiment of the present invention.
First, in step S701, it is judged whether the user's view on the current interaction topic agrees with the virtual human's. The judgment is made according to the user's interaction intent and the emotional parameters of the virtual human.
In step S702, when the user agrees with the virtual human on the current interaction topic, the output virtual-human limb movements are a nodding action and an agreement gesture. In step S703, when the user's view on the current interaction topic is opposite to the virtual human's, the output virtual-human limb movements are a head-shaking action and a disagreement gesture.
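The S702/S703 branching can be sketched directly; the action labels below are illustrative strings, not a data format defined by this disclosure:

```python
def limb_actions_for_stance(user_agrees: bool) -> list:
    """Map the user's stance on the current topic to output limb actions."""
    if user_agrees:
        # S702: agreement -> nodding action plus agreement gesture.
        return ["nod", "agreement_gesture"]
    # S703: opposite view -> head shake plus disagreement gesture.
    return ["head_shake", "disagreement_gesture"]
```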
Fig. 8 shows a schematic diagram of matching emotional parameters with emotion expression data according to an embodiment of the present invention. As shown in Fig. 8, the emotional parameters include a positive-emotion parameter, an anger-emotion parameter, a fear-emotion parameter and the like. Each emotional parameter has matching facial expression data and limb movement data. When these data are output in cooperation, the virtual human can display human-like emotion more anthropomorphically.
As shown in Fig. 8, when the emotional parameter of the virtual human is the positive-emotion parameter, the facial expression of the virtual human may be a gaze directed at the user. The head movement may face the user. The hands may be naturally extended. The arms may hang naturally. The body may be relaxed, or may lean toward the user.
As shown in Fig. 8, when the emotional parameter of the virtual human is the anger-emotion parameter, the facial expression may be suppressed, angry and slightly flushed. The head may be turned sideways from the user. The hands may be clenched into fists. The limb movement may be a foot stamp. The body trunk may be tense.
As shown in Fig. 8, when the emotional parameter of the virtual human is the fear-emotion parameter, the facial expression may be an evasive gaze. The head may be slightly lowered. The hands may be curled up and trembling. The arms may be crossed over the chest. The body may be relaxed, or may tend to back away.
In fact, the emotional parameters of the virtual human are not limited to the three listed above; richer emotional parameters may be included, such as an excitement parameter, a sadness parameter, a nervousness parameter and the like. Nor are the facial expression data and limb movement data corresponding to an emotional parameter limited to those shown in Fig. 8: under a given emotional parameter, the virtual human may have many more finely differentiated facial expression data and limb movement data. Any form of expression that can show the virtual human's current mood can be used in the embodiments of the present invention, and the present invention places no limitation on this.
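The Fig. 8 pairings can be represented as a simple lookup table; the string labels paraphrase the figure's examples, and the fallback to the positive expression for unmodeled emotions is an assumption of this sketch, not a rule stated in the disclosure:

```python
# Emotion parameter -> expression/limb data, following the Fig. 8 examples.
# Keys and values are illustrative labels, not the patent's data format.
EXPRESSION_TABLE = {
    "positive": {
        "face": "gaze at user",
        "head": "facing user",
        "hands": "naturally extended",
        "arms": "hanging naturally",
        "body": "relaxed or leaning toward user",
    },
    "angry": {
        "face": "suppressed, angry, slightly flushed",
        "head": "turned sideways from user",
        "hands": "clenched fists",
        "arms": "foot-stamping posture",
        "body": "trunk tense",
    },
    "fear": {
        "face": "evasive gaze",
        "head": "slightly lowered",
        "hands": "curled and trembling",
        "arms": "crossed over chest",
        "body": "relaxed or backing away",
    },
}

def expression_for(emotion: str) -> dict:
    # Fall back to the positive expression for emotions not yet modeled.
    return EXPRESSION_TABLE.get(emotion, EXPRESSION_TABLE["positive"])
```

Finer-grained variants of each emotion, as the paragraph above allows, would simply become additional entries in the table.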
Fig. 9 shows another flowchart of the interaction method based on virtual human behavioral standards according to an embodiment of the present invention.
As shown in Fig. 9, in step S901, the smart device 102 issues a request to the cloud brain 104. Afterwards, in step S902, the smart device 102 stays in a state of waiting for the cloud brain 104 to reply, and while waiting it times how long the returned data takes.
In step S903, if no reply data is returned for a long time, for example longer than a predetermined length of time such as 5 s, the smart device 102 can choose to reply locally and generate local common reply data. Then, in step S904, an animation matching the local common reply is output, and the voice playback device is called to play the speech.
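The timed wait and local fallback of steps S902 and S903 can be sketched with a worker thread and a bounded queue; the 5-second default follows the example above, while the canned reply text is an assumption of this sketch:

```python
import queue
import threading

LOCAL_FALLBACK = "Sorry, let me get back to you."  # assumed canned local reply

def reply_with_timeout(ask_cloud, request, timeout_s=5.0):
    """Wait for the cloud brain's reply; after timeout_s fall back locally."""
    result = queue.Queue(maxsize=1)
    worker = threading.Thread(
        target=lambda: result.put(ask_cloud(request)), daemon=True
    )
    worker.start()
    try:
        return result.get(timeout=timeout_s)  # S902: timed wait for the reply
    except queue.Empty:
        return LOCAL_FALLBACK                 # S903: local common reply
```

Step S904's animation and speech playback would then be driven by whichever reply string is returned.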
Fig. 10 shows a flowchart of communication among the user, the smart device and the cloud brain according to an embodiment of the present invention.
To realize multi-modal interaction between the smart device 102 and the user 101, communication connections need to be established among the user 101, the smart device 102 and the cloud brain 104. These connections should be real-time and unobstructed, so that the interaction is not impaired.
Certain conditions or prerequisites are needed to complete the interaction. These include that the virtual human is loaded and running on the smart device 102, and that the smart device 102 has hardware facilities with perception and control functions. The virtual human starts its voice, emotion, vision and perception capabilities when in the interaction state.
After the preparations are complete, the smart device 102 begins to interact with the user 101. First, the smart device 102 obtains the multi-modal interaction data, which may include data in various forms, for example text data, voice data, perception data and action data. The smart device 102 is configured with corresponding devices for receiving the multi-modal interaction data sent by the user 101. At this point, the two parties in the data transfer are the user 101 and the smart device 102, and the data flows from the user 101 to the smart device 102.
Then, the smart device 102 sends a request to the cloud brain 104, asking the cloud brain 104 to perform semantic understanding, visual recognition, cognitive computation and affective computation on the multi-modal interaction data to assist decision-making. At this point, the multi-modal interaction data is parsed to obtain the user's interaction intent, and multi-modal response data and matching virtual-human emotion expression data are generated according to the interaction intent, where the emotion expression data shows the current mood of the virtual human through its facial expressions and limb movements. The cloud brain 104 then transmits the reply data to the smart device 102. At this point, the two communicating parties are the smart device 102 and the cloud brain 104.
Finally, after the smart device 102 receives the data transmitted by the cloud brain 104, it outputs the multi-modal response data in cooperation with the virtual-human emotion expression data. At this point, the two communicating parties are the smart device 102 and the user 101.
The interaction method and system based on virtual human behavioral standards provided by the present invention provide a virtual human that has a default image and preset attributes and can carry out multi-modal interaction with the user. Moreover, when outputting the multi-modal response data, the method and system can also output virtual-human emotion expression data representing the current mood of the virtual human, so that the exchange between user and virtual human flows smoothly and the user enjoys an anthropomorphic interactive experience.
It should be understood that the disclosed embodiments of the present invention are not limited to the specific structures, processing steps or materials disclosed herein, but extend to their equivalents as understood by those of ordinary skill in the relevant arts. It should also be understood that the terminology used herein is for describing particular embodiments only and is not intended to be limiting.
" one embodiment " or " embodiment " mentioned in specification means the special characteristic described in conjunction with the embodiments, structure
Or characteristic is included at least one embodiment of the present invention.Therefore, the phrase " reality that specification various places throughout occurs
Apply example " or " embodiment " the same embodiment might not be referred both to.
Although the embodiments are disclosed above, the content described is only an implementation adopted to facilitate understanding of the present invention and is not intended to limit it. Any person skilled in the art to which this invention pertains may make modifications and changes in form and detail without departing from the spirit and scope disclosed herein, but the scope of patent protection of the present invention is still subject to the scope defined by the appended claims.
Claims (10)
1. An interaction method based on virtual human behavioral standards, characterized in that the virtual human is displayed by a smart device and, when in the interaction state, starts voice, emotion, vision and perception capabilities; the method comprises the following steps:
obtaining multi-modal interaction data, parsing the multi-modal interaction data, and obtaining the user's interaction intent;
generating, according to the interaction intent, multi-modal response data and virtual-human emotion expression data matching the multi-modal response data, wherein the virtual-human emotion expression data shows the current mood of the virtual human through its facial expressions and limb movements;
outputting the multi-modal response data in cooperation with the virtual-human emotion expression data.
2. The method according to claim 1, characterized in that the step of generating, according to the interaction intent, multi-modal response data and virtual-human emotion expression data matching the multi-modal response data further comprises the steps of:
determining the emotional parameters of the current virtual human according to the interaction intent and the context of the interaction;
generating, according to the emotional parameters, the virtual-human emotion expression data matching the multi-modal response data.
3. The method according to claim 2, characterized in that the emotional parameters include a positive-emotion parameter, an anger-emotion parameter and a fear-emotion parameter.
4. The method according to claim 2, characterized in that the step of generating, according to the emotional parameters, the virtual-human emotion expression data matching the multi-modal response data comprises the step of:
generating virtual-human facial expression data and virtual-human limb movement data matching the emotional parameters, wherein the virtual-human facial expression data and the virtual-human limb movement data belong to the virtual-human emotion expression data.
5. The method according to claim 4, characterized in that the limb movement data includes any one or any combination of head movement data, hand movement data, arm movement data and body movement data.
6. The method according to any one of claims 1 to 5, characterized in that:
when outputting the multi-modal response data, if the user agrees with the virtual human on the current interaction topic, the output virtual-human limb movements are a nodding action and an agreement gesture;
when outputting the multi-modal response data, if the user's view on the current interaction topic is opposite to the virtual human's, the output virtual-human limb movements are a head-shaking action and a disagreement gesture.
7. The method according to any one of claims 1 to 6, characterized in that, during the interaction, the virtual human outputs preset actions with specific intentions to interact with the user.
8. An interaction apparatus based on virtual human behavioral standards, characterized in that the apparatus comprises:
an interaction-intent acquisition module, configured to obtain multi-modal interaction data, parse the multi-modal interaction data, and obtain the user's interaction intent;
a generation module, configured to generate, according to the interaction intent, multi-modal response data and virtual-human emotion expression data matching the multi-modal response data, wherein the virtual-human emotion expression data shows the current mood of the virtual human through its expressions and limb movements;
an output module, configured to output the multi-modal response data in cooperation with the virtual-human emotion expression data.
9. A program product containing a series of instructions for executing the method steps according to any one of claims 1 to 7.
10. An interaction system based on virtual human behavioral standards, characterized in that the system comprises:
a smart device on which a virtual human is loaded, configured to obtain multi-modal interaction data and having the abilities of voice, emotion, expression and action output, the smart device including holographic equipment;
a cloud brain, configured to perform semantic understanding, visual recognition, cognitive computation and affective computation on the multi-modal interaction data, and to output multi-modal response data for the virtual human's decision-making.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810955494.7A CN109271018A (en) | 2018-08-21 | 2018-08-21 | Exchange method and system based on visual human's behavioral standard |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810955494.7A CN109271018A (en) | 2018-08-21 | 2018-08-21 | Exchange method and system based on visual human's behavioral standard |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109271018A true CN109271018A (en) | 2019-01-25 |
Family
ID=65154099
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810955494.7A Pending CN109271018A (en) | 2018-08-21 | 2018-08-21 | Exchange method and system based on visual human's behavioral standard |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109271018A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110647636A (en) * | 2019-09-05 | 2020-01-03 | 深圳追一科技有限公司 | Interaction method, interaction device, terminal equipment and storage medium |
CN110688911A (en) * | 2019-09-05 | 2020-01-14 | 深圳追一科技有限公司 | Video processing method, device, system, terminal equipment and storage medium |
CN110765294A (en) * | 2019-10-25 | 2020-02-07 | 深圳追一科技有限公司 | Image searching method and device, terminal equipment and storage medium |
CN110807388A (en) * | 2019-10-25 | 2020-02-18 | 深圳追一科技有限公司 | Interaction method, interaction device, terminal equipment and storage medium |
CN111897434A (en) * | 2020-08-05 | 2020-11-06 | 上海永骁智能技术有限公司 | System, method, and medium for signal control of virtual portrait |
CN112133406A (en) * | 2020-08-25 | 2020-12-25 | 合肥工业大学 | Multi-mode emotion guidance method and system based on emotion maps and storage medium |
CN112182173A (en) * | 2020-09-23 | 2021-01-05 | 支付宝(杭州)信息技术有限公司 | Human-computer interaction method and device based on virtual life and electronic equipment |
CN112379780A (en) * | 2020-12-01 | 2021-02-19 | 宁波大学 | Multi-mode emotion interaction method, intelligent device, system, electronic device and medium |
CN113633870A (en) * | 2021-08-31 | 2021-11-12 | 武汉轻工大学 | Emotional state adjusting system and method |
CN114070811A (en) * | 2020-07-30 | 2022-02-18 | 庄连豪 | Intelligent video and audio fusion system and implementation method thereof |
WO2023216765A1 (en) * | 2022-05-09 | 2023-11-16 | 阿里巴巴(中国)有限公司 | Multi-modal interaction method and apparatus |
WO2023246163A1 (en) * | 2022-06-22 | 2023-12-28 | 海信视像科技股份有限公司 | Virtual digital human driving method, apparatus, device, and medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105427856A (en) * | 2016-01-12 | 2016-03-23 | 北京光年无限科技有限公司 | Invitation data processing method and system for intelligent robot |
CN105843381A (en) * | 2016-03-18 | 2016-08-10 | 北京光年无限科技有限公司 | Data processing method for realizing multi-modal interaction and multi-modal interaction system |
CN107009362A (en) * | 2017-05-26 | 2017-08-04 | 深圳市阿西莫夫科技有限公司 | Robot control method and device |
CN107301168A (en) * | 2017-06-01 | 2017-10-27 | 深圳市朗空亿科科技有限公司 | Intelligent robot and its mood exchange method, system |
CN107765852A (en) * | 2017-10-11 | 2018-03-06 | 北京光年无限科技有限公司 | Multi-modal interaction processing method and system based on visual human |
CN107894831A (en) * | 2017-10-17 | 2018-04-10 | 北京光年无限科技有限公司 | A kind of interaction output intent and system for intelligent robot |
CN108416420A (en) * | 2018-02-11 | 2018-08-17 | 北京光年无限科技有限公司 | Limbs exchange method based on visual human and system |
- 2018-08-21: CN201810955494.7A patent/CN109271018A/en, status active, Pending
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110688911B (en) * | 2019-09-05 | 2021-04-02 | 深圳追一科技有限公司 | Video processing method, device, system, terminal equipment and storage medium |
CN110688911A (en) * | 2019-09-05 | 2020-01-14 | 深圳追一科技有限公司 | Video processing method, device, system, terminal equipment and storage medium |
CN110647636A (en) * | 2019-09-05 | 2020-01-03 | 深圳追一科技有限公司 | Interaction method, interaction device, terminal equipment and storage medium |
CN110765294A (en) * | 2019-10-25 | 2020-02-07 | 深圳追一科技有限公司 | Image searching method and device, terminal equipment and storage medium |
CN110807388A (en) * | 2019-10-25 | 2020-02-18 | 深圳追一科技有限公司 | Interaction method, interaction device, terminal equipment and storage medium |
CN114070811A (en) * | 2020-07-30 | 2022-02-18 | 庄连豪 | Intelligent video and audio fusion system and implementation method thereof |
CN111897434A (en) * | 2020-08-05 | 2020-11-06 | 上海永骁智能技术有限公司 | System, method, and medium for signal control of virtual portrait |
CN112133406A (en) * | 2020-08-25 | 2020-12-25 | 合肥工业大学 | Multi-mode emotion guidance method and system based on emotion maps and storage medium |
CN112133406B (en) * | 2020-08-25 | 2022-11-04 | 合肥工业大学 | Multi-mode emotion guidance method and system based on emotion maps and storage medium |
CN112182173A (en) * | 2020-09-23 | 2021-01-05 | 支付宝(杭州)信息技术有限公司 | Human-computer interaction method and device based on virtual life and electronic equipment |
CN112379780A (en) * | 2020-12-01 | 2021-02-19 | 宁波大学 | Multi-mode emotion interaction method, intelligent device, system, electronic device and medium |
CN112379780B (en) * | 2020-12-01 | 2021-10-26 | 宁波大学 | Multi-mode emotion interaction method, intelligent device, system, electronic device and medium |
CN113633870A (en) * | 2021-08-31 | 2021-11-12 | 武汉轻工大学 | Emotional state adjusting system and method |
CN113633870B (en) * | 2021-08-31 | 2024-01-23 | 武汉轻工大学 | Emotion state adjustment system and method |
WO2023216765A1 (en) * | 2022-05-09 | 2023-11-16 | 阿里巴巴(中国)有限公司 | Multi-modal interaction method and apparatus |
WO2023246163A1 (en) * | 2022-06-22 | 2023-12-28 | 海信视像科技股份有限公司 | Virtual digital human driving method, apparatus, device, and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109271018A (en) | Exchange method and system based on visual human's behavioral standard | |
CN110531860B (en) | Animation image driving method and device based on artificial intelligence | |
CN109343695A (en) | Exchange method and system based on visual human's behavioral standard | |
CN109324688A (en) | Exchange method and system based on visual human's behavioral standard | |
CN110286756A (en) | Method for processing video frequency, device, system, terminal device and storage medium | |
US12002160B2 (en) | Avatar generation method, apparatus and device, and medium | |
CN109522835A (en) | Children's book based on intelligent robot is read and exchange method and system | |
CN110400251A (en) | Method for processing video frequency, device, terminal device and storage medium | |
CN107294837A (en) | Engaged in the dialogue interactive method and system using virtual robot | |
CN107340865A (en) | Multi-modal virtual robot exchange method and system | |
CN107797663A (en) | Multi-modal interaction processing method and system based on visual human | |
CN107784355A (en) | The multi-modal interaction data processing method of visual human and system | |
CN109871450A (en) | Based on the multi-modal exchange method and system for drawing this reading | |
CN108595012A (en) | Visual interactive method and system based on visual human | |
CN108942919A (en) | A kind of exchange method and system based on visual human | |
CN109176535A (en) | Exchange method and system based on intelligent robot | |
CN107632706A (en) | The application data processing method and system of multi-modal visual human | |
CN108416420A (en) | Limbs exchange method based on visual human and system | |
CN107679519A (en) | A kind of multi-modal interaction processing method and system based on visual human | |
CN107808191A (en) | The output intent and system of the multi-modal interaction of visual human | |
CN108052250A (en) | Virtual idol deductive data processing method and system based on multi-modal interaction | |
CN109032328A (en) | A kind of exchange method and system based on visual human | |
CN108681398A (en) | Visual interactive method and system based on visual human | |
CN109278051A (en) | Exchange method and system based on intelligent robot | |
CN109086860A (en) | A kind of exchange method and system based on visual human |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190125 |