CN109271018A - Interaction method and system based on virtual human behavior standards - Google Patents

Interaction method and system based on virtual human behavior standards

Info

Publication number
CN109271018A
CN109271018A (application CN201810955494.7A)
Authority
CN
China
Prior art keywords
data
human
modal
interaction
visual human
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810955494.7A
Other languages
Chinese (zh)
Inventor
尚小维
李晓丹
俞志晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guangnian Wuxian Technology Co Ltd
Original Assignee
Beijing Guangnian Wuxian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Guangnian Wuxian Technology Co Ltd filed Critical Beijing Guangnian Wuxian Technology Co Ltd
Priority to CN201810955494.7A priority Critical patent/CN109271018A/en
Publication of CN109271018A publication Critical patent/CN109271018A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present invention provides an interaction method based on virtual human behavior standards. The virtual human is displayed by a smart device and, when in an interactive state, activates its voice, emotion, vision, and perception capabilities. The method comprises: acquiring multi-modal interaction data and parsing it to obtain the user's interaction intention; generating, according to the interaction intention, multi-modal response data together with virtual human emotion expression data matched to the multi-modal response data, where the emotion expression data conveys the virtual human's current mood through the virtual human's facial expressions and body movements; and outputting the multi-modal response data in coordination with the emotion expression data. The present invention provides a virtual human that can engage in multi-modal interaction with a user. Moreover, when outputting multi-modal response data, the invention can simultaneously output virtual human emotion expression data, which conveys the virtual human's current mood, so that the user enjoys a lifelike, human-like interactive experience.

Description

Interaction method and system based on virtual human behavior standards
Technical field
The present invention relates to the field of artificial intelligence, and in particular to an interaction method and system based on virtual human behavior standards.
Background art
The development of multi-modal robot interaction systems is dedicated to imitating human conversation, attempting to reproduce the context-aware interaction that takes place between humans. At present, however, the development of multi-modal interaction systems for virtual humans is still far from complete: no virtual human capable of multi-modal interaction has yet appeared and, more importantly, there is no interactive product that conducts interaction based on the virtual human's own behavior standards.
The present invention therefore provides an interaction method and system based on virtual human behavior standards.
Summary of the invention
To solve the above problems, the present invention provides an interaction method based on virtual human behavior standards. The virtual human is displayed by a smart device and, when in an interactive state, activates its voice, emotion, vision, and perception capabilities. The method comprises the following steps:
acquiring multi-modal interaction data and parsing it to obtain the user's interaction intention;
generating, according to the interaction intention, multi-modal response data together with virtual human emotion expression data matched to the multi-modal response data, wherein the emotion expression data conveys the virtual human's current mood through the virtual human's facial expressions and body movements;
outputting the multi-modal response data in coordination with the virtual human emotion expression data.
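For illustration only, the following is a minimal Python sketch of this three-step flow. All names and data shapes here are assumptions of the sketch; the patent specifies behavior rather than an implementation.

```python
from dataclasses import dataclass

@dataclass
class EmotionExpression:
    facial_expression: str  # e.g. "smile", "frown"
    body_action: str        # e.g. "nod", "shake_head"

def parse_intention(interaction_data: dict) -> str:
    # Placeholder parser: reduce multi-modal input (text, audio, vision,
    # perception) to a single intention label; here only text is used.
    return interaction_data.get("text", "").strip().lower()

def generate_response(intention: str) -> tuple[str, EmotionExpression]:
    # Generate the multi-modal response together with a matched
    # emotion expression (fixed here purely for illustration).
    response = f"Here is my reply to: {intention}"
    return response, EmotionExpression("smile", "nod")

def output_in_coordination(response: str, expression: EmotionExpression) -> None:
    # Output the response data in coordination with the emotion expression.
    print(f"[{expression.facial_expression}, {expression.body_action}] {response}")

interaction = {"text": "Hello, virtual human"}
response, expression = generate_response(parse_intention(interaction))
output_in_coordination(response, expression)
```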
According to one embodiment of the present invention, the step of generating, according to the interaction intention, multi-modal response data together with virtual human emotion expression data matched to the multi-modal response data further comprises the following steps:
determining the current emotion parameters of the virtual human according to the interaction intention and the context of the interaction;
generating, according to the emotion parameters, virtual human emotion expression data matched to the multi-modal response data.
According to one embodiment of the present invention, the emotion parameters include a positive-emotion parameter, an anger parameter, and a fear parameter.
According to one embodiment of the present invention, the step of generating, according to the emotion parameters, virtual human emotion expression data matched to the multi-modal response data comprises the following step:
generating virtual human facial expression data and virtual human body action data matched to the emotion parameters, wherein the facial expression data and the body action data both belong to the virtual human emotion expression data.
According to one embodiment of the present invention, the body action data includes any one of, or any combination of, head action data, hand action data, leg action data, and torso action data.
According to one embodiment of the present invention,
when outputting the multi-modal response data, if the user's view on the current interaction topic agrees with the virtual human's, the body actions output by the virtual human are a nodding action and an agreeing gesture;
when outputting the multi-modal response data, if the user's view on the current interaction topic is contrary to the virtual human's, the body actions output by the virtual human are a head-shaking action and a disagreeing gesture.
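A minimal sketch of this stance-dependent gesture selection follows; the boolean agreement test is an assumption (the specification derives agreement from the interaction intention and the emotion parameters):

```python
def select_gesture(user_agrees_with_virtual_human: bool) -> tuple[str, str]:
    """Map agreement on the current interaction topic to head and hand gestures."""
    if user_agrees_with_virtual_human:
        return ("nod", "agree_gesture")
    return ("shake_head", "disagree_gesture")

print(select_gesture(True))   # ('nod', 'agree_gesture')
print(select_gesture(False))  # ('shake_head', 'disagree_gesture')
```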
According to one embodiment of the present invention, during interaction the virtual human outputs preset actions with specific intentions to engage the user.
According to another aspect of the present invention, an interaction apparatus based on virtual human behavior standards is also provided. The apparatus includes:
an interaction intention acquisition module, configured to acquire multi-modal interaction data and parse it to obtain the user's interaction intention;
a generation module, configured to generate, according to the interaction intention, multi-modal response data together with virtual human emotion expression data matched to the multi-modal response data, wherein the emotion expression data conveys the virtual human's current mood through the virtual human's expressions and body movements;
an output module, configured to output the multi-modal response data in coordination with the virtual human emotion expression data.
According to another aspect of the present invention, a program product is also provided, containing a series of instructions for executing the method steps of any of the above.
According to another aspect of the present invention, an interaction system based on virtual human behavior standards is also provided. The system includes:
a smart device on which the virtual human is mounted, configured to acquire multi-modal interaction data and equipped with voice, emotion, expression, and action output capabilities, the smart device including a holographic device;
a cloud brain, configured to perform semantic understanding, visual recognition, cognitive computation, and affective computation on the multi-modal interaction data, so as to decide the multi-modal response data that the virtual human outputs.
The interaction method and system based on virtual human behavior standards provided by the present invention supply a virtual human with a preset image and preset attributes that can engage in multi-modal interaction with a user. Moreover, when outputting multi-modal response data, they can simultaneously output virtual human emotion expression data, which conveys the virtual human's current mood, enabling fluent communication between the user and the virtual human and giving the user a lifelike, human-like interactive experience.
Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims, and the accompanying drawings.
Description of the drawings
The accompanying drawings are provided to give a further understanding of the present invention and constitute part of the specification. Together with the embodiments of the invention, they serve to explain the invention and do not limit it. In the drawings:
Fig. 1 is an interaction schematic of an interaction system based on virtual human behavior standards according to one embodiment of the invention;
Fig. 2 is a structural block diagram of an interaction system based on virtual human behavior standards according to one embodiment of the invention;
Fig. 3 is a module block diagram of an interaction system based on virtual human behavior standards according to one embodiment of the invention;
Fig. 4 is a structural block diagram of an interaction system based on virtual human behavior standards according to another embodiment of the invention;
Fig. 5 is a flowchart of an interaction method based on virtual human behavior standards according to one embodiment of the invention;
Fig. 6 is a flowchart of generating virtual human emotion expression data in an interaction method based on virtual human behavior standards according to one embodiment of the invention;
Fig. 7 is a flowchart of outputting multi-modal response data in an interaction method based on virtual human behavior standards according to one embodiment of the invention;
Fig. 8 is a schematic of the matching between emotion parameters and emotion expression data according to one embodiment of the invention;
Fig. 9 is another flowchart of an interaction method based on virtual human behavior standards according to one embodiment of the invention; and
Fig. 10 is a flowchart of the communication among the user, the smart device, and the cloud brain according to one embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the invention are described in further detail below with reference to the accompanying drawings.
For clarity, the following needs to be stated before the embodiments are described:
The virtual human referred to in the present invention is mounted on a smart device supporting input/output modules for perception, control, and the like. It uses a highly realistic 3D virtual character image as its main user interface and has a distinctive character appearance. It supports multi-modal human-computer interaction and possesses AI capabilities such as natural language understanding, visual perception, touch perception, speech output, and emotional expression and action output. Its social attributes, personality attributes, character skills, and the like are configurable, giving the user an intelligent, personalized, and fluent virtual-character experience.
The smart device carrying the virtual human has a non-touch, non-mouse-and-keyboard screen for input (holographic, TV, multimedia display, LED, and similar screens) and carries a camera. It may be a holographic device, a VR device, or a PC, and other smart devices, such as handheld tablets, naked-eye 3D devices, and even smartphones, are not excluded.
The virtual human interacts with the user at the system level. An operating system runs on the system hardware, such as the built-in system of a holographic device, or Windows or macOS in the case of a PC.
The virtual human is a system application or an executable file.
The virtual robot acquires the user's multi-modal interaction data through the hardware of the smart device and, supported by the capabilities of the cloud brain, performs semantic understanding, visual recognition, cognitive computation, and affective computation on the multi-modal interaction data to complete the decision-and-output process.
The cloud brain mentioned here is the terminal that provides the virtual human with the processing capability of semantic understanding (language semantic understanding, action semantic understanding, visual recognition, affective computation, cognitive computation) of the user's interaction demands, realizing interaction with the user so as to decide the multi-modal response data output by the virtual human.
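For illustration only, the smart device's view of these four processing capabilities could take the following shape; the method names and signatures are assumptions, since the patent names the capabilities but not an API:

```python
class CloudBrainClient:
    """Illustrative client for the four capability interfaces of the cloud
    brain. The patent names the capabilities; the transport and method
    names here are assumptions of this sketch."""

    def semantic_understanding(self, voice_instruction: bytes) -> str:
        # Speech recognition plus corpus-based natural language processing.
        raise NotImplementedError

    def visual_recognition(self, frame: bytes) -> dict:
        # Detection, recognition, and tracking of bodies, faces, and scenes.
        raise NotImplementedError

    def cognitive_computation(self, data: dict) -> dict:
        # User portrait, knowledge graph, rational decision on output data.
        raise NotImplementedError

    def affective_computation(self, data: dict) -> str:
        # The user's current emotional state.
        raise NotImplementedError
```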
Each embodiment of the present invention is described in detail below with reference to the accompanying drawings.
Fig. 1 is an interaction schematic of an interaction system based on virtual human behavior standards according to one embodiment of the invention. As shown in Fig. 1, multi-modal interaction involves a user 101, a smart device 102, a virtual human 103, and a cloud brain 104. The user 101 interacting with the virtual human may be a real person, another virtual human, or an embodied virtual human; the interaction of another virtual human or an embodied virtual human with the virtual human is similar to single-person interaction with it. Therefore, Fig. 1 shows only the multi-modal interaction process between a user (a person) and the virtual human.
In addition, the smart device 102 includes a display area 1021 and hardware support equipment 1022 (essentially a core processor). The display area 1021 displays the image of the virtual human 103, and the hardware support equipment 1022 works with the cloud brain 104 to process data during interaction. The virtual human 103 needs a screen carrier for display, so the display area 1021 includes holographic screens, TV screens, multimedia display screens, LED screens, and the like.
The process of interaction between the virtual human and the user 101 in Fig. 1 is as follows:
The preparations or preconditions for interaction are that the virtual human is mounted and running on the smart device 102 and has specific image characteristics, and that it possesses AI capabilities such as natural language understanding, visual perception, touch perception, speech output, and emotional expression and action output. To support the virtual human's touch perception function, the smart device also needs to be fitted with components that have a touch perception function. According to one embodiment of the present invention, to improve the interactive experience, the virtual human is displayed in a preset area as soon as it is activated.
It should be noted that the image and outfit of the virtual human 103 are not limited to one mode. The virtual human 103 can have different images and outfits; its image is generally a high-polygon 3D animated figure. Each image of the virtual human 103 can also correspond to several different outfits, which may be classified by season or by occasion. These images and outfits can reside in the cloud brain 104 or on the smart device 102 and can be called up at any time when needed.
The social attributes, personality attributes, and character skills of the virtual human 103 are likewise not limited to one kind. The virtual human 103 can have multiple social attributes, personality attributes, and character skills, which can be combined freely rather than being fixed to one pairing; the user can select and combine them as needed.
Specifically, the social attributes may include attributes such as appearance, name, clothing, decoration, gender, birthplace, age, family relations, occupation, position, religious belief, relationship status, and educational background; the personality attributes may include attributes such as character and temperament; and the character skills may include professional skills such as singing, dancing, storytelling, and training, where the display of a skill is not limited to the limbs but may also come through expressions, the head, and/or the mouth. A configurable profile of this kind is sketched below.
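Purely as an illustrative sketch, such a configurable profile might be represented as follows; the field names and defaults are assumptions, not a schema from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class VirtualHumanProfile:
    # Social attributes (a selection; the patent lists many more)
    name: str = "Xiaowei"        # hypothetical default
    gender: str = "female"
    age: int = 25
    occupation: str = "assistant"
    # Personality attributes
    character: str = "cheerful"
    temperament: str = "gentle"
    # Character skills, freely combinable with the attributes above
    skills: list[str] = field(default_factory=lambda: ["singing", "storytelling"])
```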
In this application, the virtual human's social attributes, personality attributes, character skills, and so on can make the parsing and decision results of the multi-modal interaction lean toward, or better suit, that particular virtual human.
The multi-modal interaction process is as follows. First, multi-modal interaction data is acquired and parsed to obtain the user's interaction intention. The receiving devices that acquire the multi-modal interaction data are all mounted on or configured in the smart device 102; they include text receivers for receiving text, voice receivers for receiving speech, cameras for receiving vision, and infrared devices for receiving perception information, among others.
Then, according to the interaction intention, multi-modal response data is generated together with virtual human emotion expression data matched to it, where the emotion expression data conveys the virtual human's current mood through its facial expressions and body movements.
Finally, the multi-modal response data is output in coordination with the virtual human emotion expression data.
According to one embodiment of the present invention, during interaction the virtual human outputs preset actions with specific intentions to engage the user.
Fig. 2 is a structural block diagram of an interaction system based on virtual human behavior standards according to one embodiment of the invention. As shown in Fig. 2, completing multi-modal interaction requires a user 101, a smart device 102, and a cloud brain 104. The smart device 102 includes a receiving device 102A, a processing device 102B, an output device 102C, and a connecting device 102D. The cloud brain 104 includes a communication device 104A.
The interaction system based on virtual human behavior standards provided by the present invention requires an unobstructed communication channel to be established among the user 101, the smart device 102, and the cloud brain 104, so that the interaction between the user 101 and the virtual human can be completed. To accomplish the interaction task, the smart device 102 and the cloud brain 104 are provided with devices and components that support completing the interaction. The party interacting with the virtual human may be one party or multiple parties.
The smart device 102 includes a receiving device 102A, a processing device 102B, an output device 102C, and a connecting device 102D. The receiving device 102A receives the multi-modal interaction data; examples include microphones for voice operation, scanners, and cameras (detecting non-contact movements using visible or invisible wavelengths). The smart device 102 acquires the multi-modal interaction data through these input devices. The output device 102C outputs the multi-modal response data of the virtual human's interaction with the user 101; its configuration is broadly comparable to that of the receiving device 102A and is not repeated here.
The processing device 102B processes the interaction data transmitted by the cloud brain 104 during interaction. The connecting device 102D maintains contact with the cloud brain 104; the processing device 102B handles the multi-modal interaction data pre-processed by the receiving device 102A or the data transmitted by the cloud brain 104, and the connecting device 102D sends call instructions to invoke the robot capabilities in the cloud brain 104.
The communication device 104A of the cloud brain 104 completes the correspondence with the smart device 102. It keeps in contact with the connecting device 102D on the smart device 102, receives the requests sent by the smart device 102, and sends out the processing results issued by the cloud brain 104, serving as the medium of communication between the smart device 102 and the cloud brain 104.
Fig. 3 is a module block diagram of an interaction system based on virtual human behavior standards according to another embodiment of the invention. As shown in Fig. 3, the system includes an interaction intention acquisition module 301, a generation module 302, and an output module 303. The interaction intention acquisition module 301 includes a text collection unit 3011, an audio collection unit 3012, a vision collection unit 3013, a perception collection unit 3014, and a parsing unit 3015. The generation module 302 includes an emotion parameter determination unit 3021 and an emotion expression data generation unit 3022. The output module 303 includes a judging unit 3031.
The interaction intention acquisition module 301 acquires the multi-modal interaction data and parses it to obtain the user's interaction intention. The virtual human 103 is displayed by the smart device 102 and, when in an interactive state, activates its voice, emotion, vision, and perception capabilities. The text collection unit 3011 collects text information, the audio collection unit 3012 collects audio information, the vision collection unit 3013 collects visual information, and the perception collection unit 3014 collects perception information. Examples of these collection units include microphones for voice operation, scanners, cameras, and sensing devices, which may use visible or invisible wavelengths, signals, environmental data, and so on. The multi-modal interaction data can be acquired through these input devices. The multi-modal interaction may include one or more of text, audio, vision, and perception data; the present invention is not restricted in this regard.
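As a sketch only, aggregating these units might look like the following; the collector functions are placeholders standing in for units 3011 to 3014, and real implementations would read from the microphone, camera, and sensors:

```python
def read_text_input() -> str:
    return input("user> ")  # placeholder for the text collection unit 3011

def record_audio_frame() -> bytes:
    return b""  # placeholder for the audio collection unit 3012

def capture_camera_frame() -> bytes:
    return b""  # placeholder for the vision collection unit 3013

def read_sensors() -> dict:
    return {}  # placeholder for the perception collection unit 3014

def collect_multimodal_data() -> dict:
    """Gather one round of multi-modal input; any subset of the four
    modalities may be present in a given interaction."""
    return {
        "text": read_text_input(),
        "audio": record_audio_frame(),
        "vision": capture_camera_frame(),
        "perception": read_sensors(),
    }
```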
The generation module 302 generates, according to the interaction intention, multi-modal response data together with virtual human emotion expression data matched to it, where the emotion expression data conveys the virtual human's current mood through its expressions and body movements.
The emotion parameter determination unit 3021 determines the current emotion parameters of the virtual human according to the interaction intention and the context of the interaction. The emotion parameters of the virtual human include a positive-emotion parameter, an anger parameter, and a fear parameter.
The emotion expression data generation unit 3022 generates, according to the emotion parameters, virtual human emotion expression data matched to the multi-modal response data. According to one embodiment of the present invention, virtual human facial expression data and virtual human body action data matched to the emotion parameters are generated, both of which belong to the virtual human emotion expression data.
In addition, the body action data includes any one of, or any combination of, head action data, hand action data, leg action data, and torso action data.
The output module 303 outputs the multi-modal response data in coordination with the virtual human emotion expression data. The judging unit 3031 judges whether the user's view on the current interaction topic agrees with the virtual human's. If the user agrees with the virtual human on the current interaction topic, the body actions output by the virtual human are a nodding action and an agreeing gesture; if the user's view on the current interaction topic is contrary to the virtual human's, the body actions output are a head-shaking action and a disagreeing gesture.
Fig. 4 is a structural block diagram of an interaction system based on virtual human behavior standards according to another embodiment of the invention. As shown in Fig. 4, completing the interaction requires a user 101, a smart device 102, and a cloud brain 104. The smart device 102 includes a human-machine interface 401, a data processing unit 402, an input/output device 403, and an interface unit 404. The cloud brain 104 includes a semantic understanding interface 1041, a visual recognition interface 1042, a cognitive computation interface 1043, and an affective computation interface 1044.
The interaction system based on virtual human behavior standards provided by the present invention includes the smart device 102 and the cloud brain 104. The virtual human 103 runs on the smart device 102; it has a preset image and preset attributes and can activate its voice, emotion, vision, and perception capabilities when in an interactive state.
In one embodiment, the smart device 102 may include a human-machine interface 401, a data processing unit 402, an input/output device 403, and an interface unit 404, where the human-machine interface 401 displays the running virtual human 103 in a preset area of the smart device 102.
The data processing unit 402 processes the data generated during the multi-modal interaction between the user 101 and the virtual human 103. The processor used may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor; the processor is the control center of the terminal and uses various interfaces to connect the various parts of the entire terminal.
The smart device 102 includes memory, which mainly comprises a program storage area and a data storage area. The program storage area can store the operating system and the application programs required for at least one function (such as sound playback or image playback); the data storage area can store data created through use of the smart device 102 (such as audio data and browsing records). The memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic-disk storage device, a flash memory device, or another solid-state storage device.
The input/output device 403 acquires the multi-modal interaction data and outputs the output data during interaction. The interface unit 404 communicates with the cloud brain 104 and, by docking with the interfaces in the cloud brain 104, invokes the virtual human capabilities in the cloud brain 104.
The cloud brain 104 includes a semantic understanding interface 1041, a visual recognition interface 1042, a cognitive computation interface 1043, and an affective computation interface 1044, each of which communicates with the interface unit 404 in the smart device 102. The cloud brain 104 also contains semantic understanding logic corresponding to the semantic understanding interface 1041, visual recognition logic corresponding to the visual recognition interface 1042, cognitive computation logic corresponding to the cognitive computation interface 1043, and affective computation logic corresponding to the affective computation interface 1044.
As shown in Fig. 4, in the multi-modal data parsing process each capability interface calls its corresponding logical processing. The interfaces are explained below:
The semantic understanding interface 1041 receives the specific voice instructions forwarded from the interface unit 404, performing speech recognition on them and natural language processing based on a large corpus.
The visual recognition interface 1042 can perform video content detection, recognition, tracking, and the like for human bodies, faces, and scenes according to computer vision algorithms, deep learning algorithms, and so on; that is, it recognizes images according to predetermined algorithms and returns quantitative detection results. It has an image preprocessing function, a feature extraction function, a decision function, and specific application functions:
the image preprocessing function may perform basic processing on the collected visual data, including color space conversion, edge extraction, image transformation, and image thresholding;
the feature extraction function can extract feature information such as the skin color, color, texture, motion, and coordinates of targets in the image;
the decision function distributes the feature information, according to a given decision strategy, to the specific multi-modal output devices or multi-modal output applications that need it, for example realizing face detection, human limb recognition, and motion detection.
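For illustration, a minimal sketch of this preprocessing, feature-extraction, and decision pipeline; OpenCV is used here as an assumed library, and the thresholds and decision rule are invented for the example:

```python
import cv2
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    # Image preprocessing: color space conversion and edge extraction.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return cv2.Canny(gray, 100, 200)

def extract_features(edges: np.ndarray) -> dict:
    # Feature extraction: coordinates of candidate regions (illustrative only).
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return {"boxes": [cv2.boundingRect(c) for c in contours]}

def decide(features: dict) -> str:
    # Decision: dispatch according to a simple strategy, e.g. flag a
    # human-limb candidate when any sufficiently large region is found.
    if any(w * h > 1000 for (_, _, w, h) in features["boxes"]):
        return "human_limb_candidate"
    return "no_detection"
```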
The cognitive computation interface 1043 receives the multi-modal data forwarded from the interface unit 404 and processes it through data acquisition, recognition, and learning to obtain the user portrait, knowledge graph, and so on, so as to make rational decisions about the multi-modal output data.
The affective computation interface 1044 receives the multi-modal data forwarded from the interface unit 404 and uses affective computation logic (which may be emotion recognition technology) to calculate the user's current emotional state. Emotion recognition is an important component of affective computation; its research content covers facial expression, speech, behavior, text, and physiological signal recognition, through which the user's emotional state can be judged. Emotion recognition technology may monitor the user's emotional state through visual emotion recognition alone, or through visual emotion recognition combined with acoustic emotion recognition, and is not limited to these. In this embodiment, the combination of the two is the preferred way to monitor emotion.
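As a sketch of the preferred combined mode, the per-modality results might be fused as follows; the score format and the equal weighting are assumptions (the specification states only that vision and sound are combined):

```python
def fuse_emotion(vision_scores: dict, audio_scores: dict) -> str:
    """Combine per-emotion confidences from the visual and acoustic
    recognizers (equal weighting assumed) and return the top emotion."""
    emotions = set(vision_scores) | set(audio_scores)
    combined = {
        e: 0.5 * vision_scores.get(e, 0.0) + 0.5 * audio_scores.get(e, 0.0)
        for e in emotions
    }
    return max(combined, key=combined.get)

# Example: vision leans 'happy', audio leans 'neutral'; fusion picks 'happy'.
print(fuse_emotion({"happy": 0.8, "neutral": 0.2}, {"happy": 0.4, "neutral": 0.6}))
```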
When performing visual emotion recognition, the affective computation interface 1044 uses an image capture device to collect images of human facial expressions, converts them into analyzable data, and then applies image processing and similar techniques to analyze the expressed emotion. Understanding facial expressions usually requires detecting subtle changes in expression, such as movement of the cheek muscles, changes of the mouth, and raised eyebrows.
Fig. 5 is a flowchart of an interaction method based on virtual human behavior standards according to one embodiment of the invention. As shown in Fig. 5, first, in step S501, multi-modal interaction data is acquired and parsed to obtain the user's interaction intention. During multi-modal interaction, the virtual robot acquires the multi-modal interaction data through the receiving devices on the smart device; this data may include text data, voice data, perception data, action data, and so on.
Then, in step S502, multi-modal response data is generated according to the interaction intention, together with virtual human emotion expression data matched to it, where the emotion expression data conveys the virtual human's current mood through its facial expressions and body movements. In one embodiment, the facial expression data may include gaze data, mouth data, eyebrow data, and so on, and the body action data may include head data, limb data, torso data, and so on.
Finally, in step S503, the multi-modal response data is output in coordination with the virtual human emotion expression data. For the virtual human to achieve a more human-like effect, it needs to output emotion expression data representing its own mood when interacting with the user. The emotion expression data can be output in coordination with the multi-modal response data to represent the virtual human's mental state at that moment, giving the user a more genuinely human-like interactive experience.
In addition, the visual interaction system based on the virtual human provided by the present invention can also cooperate with a program product containing a series of instructions for executing the steps of the interaction method based on virtual human behavior standards. The program product can run computer instructions; the computer instructions include computer program code, which may be in source code form, object code form, an executable file, certain intermediate forms, and so on.
The program product may include any entity or apparatus capable of carrying computer program code: recording media, USB flash drives, removable hard disks, magnetic disks, optical discs, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, software distribution media, and so on.
It should be noted that the content contained in the program product may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, in line with legislation and patent practice, the program product does not include electrical carrier signals or telecommunication signals.
Fig. 6 is a flowchart of generating virtual human emotion expression data in an interaction method based on virtual human behavior standards according to one embodiment of the invention.
As shown in Fig. 6, in step S601, the current emotion parameters of the virtual human are determined according to the interaction intention and the context of the interaction, where the emotion parameters include a positive-emotion parameter, an anger parameter, and a fear parameter.
In step S602, virtual human emotion expression data matched to the multi-modal response data is generated according to the emotion parameters. In one embodiment, virtual human facial expression data and virtual human body action data matched to the emotion parameters are generated, both of which belong to the virtual human emotion expression data. The body action data includes any one of, or any combination of, head action data, hand action data, leg action data, and torso action data.
Fig. 7 is a flowchart of outputting multi-modal response data in an interaction method based on virtual human behavior standards according to one embodiment of the invention.
First, in step S701, it is judged whether the user's view on the current interaction topic agrees with the virtual human's. The judgment is made according to the user's interaction intention and the virtual human's emotion parameters.
In step S702, if the user agrees with the virtual human on the current interaction topic, the body actions output by the virtual human are a nodding action and an agreeing gesture. In step S703, if the user's view on the current interaction topic is contrary to the virtual human's, the body actions output are a head-shaking action and a disagreeing gesture.
Fig. 8 is a schematic of the matching between emotion parameters and emotion expression data according to one embodiment of the invention. As shown in Fig. 8, the emotion parameters include a positive-emotion parameter, an anger parameter, a fear parameter, and so on. Each emotion parameter has matching facial expression data and body action data; when these data are output in coordination, the virtual human can display human-like emotional expression and appear more lifelike.
As shown in Fig. 8, when the virtual human's emotion parameter is the positive-emotion parameter, its facial expression can be a gaze facing the user, its head action can be oriented toward the user, its hands can be naturally extended, its limbs can hang naturally, and its torso can be relaxed or leaning toward the user.
As shown in Fig. 8, when the virtual human's emotion parameter is the anger parameter, its facial expression can be suppressed, angry, and slightly flushed, its head can be turned sideways from the user, its hands can be clenched into fists, it can stamp its feet, and its torso can be tense.
As shown in Fig. 8, when the virtual human's emotion parameter is the fear parameter, its facial expression can be an evasive gaze, its head can be slightly lowered, its hands can be curled and trembling, its arms can be crossed over the chest, and its torso can be relaxed or tending to retreat.
In fact, the virtual human's emotion parameters are not limited to the three listed above and may include richer emotion parameters, such as an excitement parameter, a sadness parameter, and a tension parameter. The facial expression data and body action data corresponding to an emotion parameter are also not limited to those shown in Fig. 8; under a given emotion parameter, the virtual human can have many more finely divided facial expression data and body action data. Any form of expression capable of showing the virtual human's current mood can be used in the embodiments of the present invention, and the invention imposes no restriction in this regard.
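For illustration, the Fig. 8 matching described above could be held as a simple lookup table; the strings paraphrase the figure as narrated here, and the key names are assumptions:

```python
EMOTION_EXPRESSIONS = {
    "positive": {
        "face": "gaze facing the user",
        "head": "toward the user",
        "hands": "naturally extended",
        "limbs": "hanging naturally",
        "torso": "relaxed or leaning toward the user",
    },
    "anger": {
        "face": "suppressed, angry, slightly flushed",
        "head": "turned sideways",
        "hands": "clenched fists",
        "limbs": "stamping feet",
        "torso": "tense",
    },
    "fear": {
        "face": "evasive gaze",
        "head": "slightly lowered",
        "hands": "curled and trembling",
        "limbs": "arms crossed over the chest",
        "torso": "relaxed or retreating",
    },
}

def expression_for(emotion: str) -> dict:
    """Look up matched expression data; unknown emotions fall back to positive."""
    return EMOTION_EXPRESSIONS.get(emotion, EMOTION_EXPRESSIONS["positive"])
```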
Fig. 9 is another flowchart of an interaction method based on virtual human behavior standards according to one embodiment of the invention.
As shown in Fig. 9, in step S901 the smart device 102 issues a request to the cloud brain 104. Then, in step S902, the smart device 102 remains in a state of waiting for the cloud brain 104 to reply; while waiting, the smart device 102 can time how long the return of data is taking.
In step S903, if no reply data is returned for a long time, for example beyond a predetermined span of 5 s, the smart device 102 can choose to reply locally and generates common local reply data. Then, in step S904, it outputs an animation coordinated with the common local reply and calls the voice playback equipment for speech playback.
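A minimal sketch of this timeout-and-fallback behavior, assuming an HTTP transport (the 5-second span comes from the example above; the endpoint, payload shape, and fallback reply are placeholders):

```python
import json
import socket
from urllib import request, error

FALLBACK_REPLY = "Sorry, let me think about that for a moment."  # placeholder

def request_with_fallback(url: str, payload: dict, timeout_s: float = 5.0) -> dict:
    """Ask the cloud brain for response data; fall back to a common
    local reply if nothing returns within the predetermined span."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with request.urlopen(req, timeout=timeout_s) as resp:
            return json.load(resp)
    except (error.URLError, socket.timeout):
        # Steps S903/S904: common local reply plus a coordinated animation.
        return {"text": FALLBACK_REPLY, "animation": "idle_thinking"}
```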
Fig. 10 is a flowchart of the communication among the user, the smart device, and the cloud brain according to one embodiment of the invention.
To realize multi-modal interaction between the smart device 102 and the user 101, communication connections need to be set up among the user 101, the smart device 102, and the cloud brain 104. These connections should be real-time and unobstructed, to guarantee that the interaction is unaffected.
Certain conditions or prerequisites must be in place before the interaction can be completed: the virtual human is loaded and running on the smart device 102, the smart device 102 has hardware facilities with perception and control functions, and the virtual human activates its voice, emotion, vision, and perception capabilities when in an interactive state.
After the preparations are complete, the smart device 102 begins interacting with the user 101. First, the smart device 102 acquires the multi-modal interaction data, which may include data in many forms, for example text data, voice data, perception data, and action data. The smart device 102 is configured with devices for receiving the multi-modal interaction data sent by the user 101. At this point, the two parties in the data transfer are the user 101 and the smart device 102, and the data travels from the user 101 to the smart device 102.
Then, the smart device 102 sends a request to the cloud brain 104, asking the cloud brain 104 to perform semantic understanding, visual recognition, cognitive computation, and affective computation on the multi-modal interaction data to assist decision-making. At this point the multi-modal interaction data is parsed to obtain the user's interaction intention, and, according to the interaction intention, multi-modal response data is generated together with virtual human emotion expression data matched to it, where the emotion expression data conveys the virtual human's current mood through its facial expressions and body movements. The cloud brain 104 then transmits the reply data to the smart device 102. At this point, the two communicating parties are the smart device 102 and the cloud brain 104.
Finally, after the smart device 102 receives the data transmitted by the cloud brain 104, it outputs the multi-modal response data in coordination with the virtual human emotion expression data. At this point, the two communicating parties are the smart device 102 and the user 101.
The interaction method and system based on virtual human behavior standards provided by the present invention supply a virtual human with a preset image and preset attributes that can engage in multi-modal interaction with a user. Moreover, when outputting multi-modal response data, the method and system can simultaneously output virtual human emotion expression data, which conveys the virtual human's current mood, enabling fluent communication between the user and the virtual human and giving the user a lifelike, human-like interactive experience.
It should be understood that the disclosed embodiments of the present invention are not limited to the specific structures, processing steps, or materials disclosed herein, but extend to equivalents of these features as would be understood by those of ordinary skill in the relevant arts. It should also be understood that the terms used herein are only for describing specific embodiments and are not intended to be limiting.
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrase "one embodiment" or "an embodiment" in various places throughout the specification do not necessarily all refer to the same embodiment.
Although the embodiments are disclosed above, the content described is adopted only to facilitate understanding of the present invention and is not intended to limit it. Any person skilled in the art to which this invention pertains may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention shall still be subject to the scope defined by the appended claims.

Claims (10)

1. An interaction method based on virtual human behavior standards, characterized in that the virtual human is displayed by a smart device and, when in an interactive state, activates its voice, emotion, vision, and perception capabilities, the method comprising the following steps:
acquiring multi-modal interaction data and parsing it to obtain the user's interaction intention;
generating, according to the interaction intention, multi-modal response data together with virtual human emotion expression data matched to the multi-modal response data, wherein the emotion expression data conveys the virtual human's current mood through the virtual human's facial expressions and body movements;
outputting the multi-modal response data in coordination with the virtual human emotion expression data.
2. The method of claim 1, characterized in that the step of generating, according to the interaction intention, multi-modal response data together with virtual human emotion expression data matched to the multi-modal response data further comprises the following steps:
determining the current emotion parameters of the virtual human according to the interaction intention and the context of the interaction;
generating, according to the emotion parameters, virtual human emotion expression data matched to the multi-modal response data.
3. The method of claim 2, characterized in that the emotion parameters include a positive-emotion parameter, an anger parameter, and a fear parameter.
4. The method of claim 2, characterized in that the step of generating, according to the emotion parameters, virtual human emotion expression data matched to the multi-modal response data comprises the following step:
generating virtual human facial expression data and virtual human body action data matched to the emotion parameters, wherein the facial expression data and the body action data both belong to the virtual human emotion expression data.
5. The method of claim 4, characterized in that the body action data includes any one of, or any combination of, head action data, hand action data, leg action data, and torso action data.
6. The method of any one of claims 1 to 5, characterized in that:
when outputting the multi-modal response data, if the user agrees with the virtual human on the current interaction topic, the body actions output by the virtual human are a nodding action and an agreeing gesture;
when outputting the multi-modal response data, if the user's view on the current interaction topic is contrary to the virtual human's, the body actions output by the virtual human are a head-shaking action and a disagreeing gesture.
7. The method of any one of claims 1 to 6, characterized in that, during interaction, the virtual human outputs preset actions with specific intentions to engage the user.
8. An interaction apparatus based on virtual human behavior standards, characterized in that the apparatus includes:
an interaction intention acquisition module, configured to acquire multi-modal interaction data and parse it to obtain the user's interaction intention;
a generation module, configured to generate, according to the interaction intention, multi-modal response data together with virtual human emotion expression data matched to the multi-modal response data, wherein the emotion expression data conveys the virtual human's current mood through the virtual human's expressions and body movements;
an output module, configured to output the multi-modal response data in coordination with the virtual human emotion expression data.
9. A program product containing a series of instructions for executing the method steps of any one of claims 1 to 7.
10. An interaction system based on virtual human behavior standards, characterized in that the system includes:
a smart device on which the virtual human is mounted, configured to acquire multi-modal interaction data and equipped with voice, emotion, expression, and action output capabilities, the smart device including a holographic device;
a cloud brain, configured to perform semantic understanding, visual recognition, cognitive computation, and affective computation on the multi-modal interaction data, so as to decide the multi-modal response data output by the virtual human.
CN201810955494.7A 2018-08-21 2018-08-21 Interaction method and system based on virtual human behavior standards Pending CN109271018A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810955494.7A CN109271018A (en) 2018-08-21 2018-08-21 Interaction method and system based on virtual human behavior standards

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810955494.7A CN109271018A (en) 2018-08-21 2018-08-21 Interaction method and system based on virtual human behavior standards

Publications (1)

Publication Number Publication Date
CN109271018A true CN109271018A (en) 2019-01-25

Family

ID=65154099

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810955494.7A Pending CN109271018A (en) Interaction method and system based on virtual human behavior standards

Country Status (1)

Country Link
CN (1) CN109271018A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110647636A (en) * 2019-09-05 2020-01-03 深圳追一科技有限公司 Interaction method, interaction device, terminal equipment and storage medium
CN110688911A (en) * 2019-09-05 2020-01-14 深圳追一科技有限公司 Video processing method, device, system, terminal equipment and storage medium
CN110765294A (en) * 2019-10-25 2020-02-07 深圳追一科技有限公司 Image searching method and device, terminal equipment and storage medium
CN110807388A (en) * 2019-10-25 2020-02-18 深圳追一科技有限公司 Interaction method, interaction device, terminal equipment and storage medium
CN111897434A (en) * 2020-08-05 2020-11-06 上海永骁智能技术有限公司 System, method, and medium for signal control of virtual portrait
CN112133406A (en) * 2020-08-25 2020-12-25 合肥工业大学 Multi-mode emotion guidance method and system based on emotion maps and storage medium
CN112182173A (en) * 2020-09-23 2021-01-05 支付宝(杭州)信息技术有限公司 Human-computer interaction method and device based on virtual life and electronic equipment
CN112379780A (en) * 2020-12-01 2021-02-19 宁波大学 Multi-mode emotion interaction method, intelligent device, system, electronic device and medium
CN113633870A (en) * 2021-08-31 2021-11-12 武汉轻工大学 Emotional state adjusting system and method
CN114070811A (en) * 2020-07-30 2022-02-18 庄连豪 Intelligent video and audio fusion system and implementation method thereof
WO2023216765A1 (en) * 2022-05-09 2023-11-16 阿里巴巴(中国)有限公司 Multi-modal interaction method and apparatus
WO2023246163A1 (en) * 2022-06-22 2023-12-28 海信视像科技股份有限公司 Virtual digital human driving method, apparatus, device, and medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105427856A (en) * 2016-01-12 2016-03-23 北京光年无限科技有限公司 Invitation data processing method and system for intelligent robot
CN105843381A (en) * 2016-03-18 2016-08-10 北京光年无限科技有限公司 Data processing method for realizing multi-modal interaction and multi-modal interaction system
CN107009362A (en) * 2017-05-26 2017-08-04 深圳市阿西莫夫科技有限公司 Robot control method and device
CN107301168A (en) * 2017-06-01 2017-10-27 深圳市朗空亿科科技有限公司 Intelligent robot and its mood exchange method, system
CN107765852A (en) * 2017-10-11 2018-03-06 北京光年无限科技有限公司 Multi-modal interaction processing method and system based on visual human
CN107894831A (en) * 2017-10-17 2018-04-10 北京光年无限科技有限公司 A kind of interaction output intent and system for intelligent robot
CN108416420A (en) * 2018-02-11 2018-08-17 北京光年无限科技有限公司 Limbs exchange method based on visual human and system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105427856A (en) * 2016-01-12 2016-03-23 北京光年无限科技有限公司 Invitation data processing method and system for intelligent robot
CN105843381A (en) * 2016-03-18 2016-08-10 北京光年无限科技有限公司 Data processing method for realizing multi-modal interaction and multi-modal interaction system
CN107009362A (en) * 2017-05-26 2017-08-04 深圳市阿西莫夫科技有限公司 Robot control method and device
CN107301168A (en) * 2017-06-01 2017-10-27 深圳市朗空亿科科技有限公司 Intelligent robot and its mood exchange method, system
CN107765852A (en) * 2017-10-11 2018-03-06 北京光年无限科技有限公司 Multi-modal interaction processing method and system based on visual human
CN107894831A (en) * 2017-10-17 2018-04-10 北京光年无限科技有限公司 A kind of interaction output intent and system for intelligent robot
CN108416420A (en) * 2018-02-11 2018-08-17 北京光年无限科技有限公司 Limbs exchange method based on visual human and system

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110688911B (en) * 2019-09-05 2021-04-02 深圳追一科技有限公司 Video processing method, device, system, terminal equipment and storage medium
CN110688911A (en) * 2019-09-05 2020-01-14 深圳追一科技有限公司 Video processing method, device, system, terminal equipment and storage medium
CN110647636A (en) * 2019-09-05 2020-01-03 深圳追一科技有限公司 Interaction method, interaction device, terminal equipment and storage medium
CN110765294A (en) * 2019-10-25 2020-02-07 深圳追一科技有限公司 Image searching method and device, terminal equipment and storage medium
CN110807388A (en) * 2019-10-25 2020-02-18 深圳追一科技有限公司 Interaction method, interaction device, terminal equipment and storage medium
CN114070811A (en) * 2020-07-30 2022-02-18 庄连豪 Intelligent video and audio fusion system and implementation method thereof
CN111897434A (en) * 2020-08-05 2020-11-06 上海永骁智能技术有限公司 System, method, and medium for signal control of virtual portrait
CN112133406A (en) * 2020-08-25 2020-12-25 合肥工业大学 Multi-mode emotion guidance method and system based on emotion maps and storage medium
CN112133406B (en) * 2020-08-25 2022-11-04 合肥工业大学 Multi-mode emotion guidance method and system based on emotion maps and storage medium
CN112182173A (en) * 2020-09-23 2021-01-05 支付宝(杭州)信息技术有限公司 Human-computer interaction method and device based on virtual life and electronic equipment
CN112379780A (en) * 2020-12-01 2021-02-19 宁波大学 Multi-mode emotion interaction method, intelligent device, system, electronic device and medium
CN112379780B (en) * 2020-12-01 2021-10-26 宁波大学 Multi-mode emotion interaction method, intelligent device, system, electronic device and medium
CN113633870A (en) * 2021-08-31 2021-11-12 武汉轻工大学 Emotional state adjusting system and method
CN113633870B (en) * 2021-08-31 2024-01-23 武汉轻工大学 Emotion state adjustment system and method
WO2023216765A1 (en) * 2022-05-09 2023-11-16 阿里巴巴(中国)有限公司 Multi-modal interaction method and apparatus
WO2023246163A1 (en) * 2022-06-22 2023-12-28 海信视像科技股份有限公司 Virtual digital human driving method, apparatus, device, and medium

Similar Documents

Publication Publication Date Title
CN109271018A (en) Interaction method and system based on virtual human behavior standards
CN110531860B (en) Animation image driving method and device based on artificial intelligence
CN109343695A (en) Exchange method and system based on visual human's behavioral standard
CN109324688A (en) Exchange method and system based on visual human's behavioral standard
CN110286756A (en) Method for processing video frequency, device, system, terminal device and storage medium
US12002160B2 (en) Avatar generation method, apparatus and device, and medium
CN109522835A (en) Children's book based on intelligent robot is read and exchange method and system
CN110400251A (en) Method for processing video frequency, device, terminal device and storage medium
CN107294837A (en) Engaged in the dialogue interactive method and system using virtual robot
CN107340865A (en) Multi-modal virtual robot exchange method and system
CN107797663A (en) Multi-modal interaction processing method and system based on visual human
CN107784355A (en) The multi-modal interaction data processing method of visual human and system
CN109871450A (en) Based on the multi-modal exchange method and system for drawing this reading
CN108595012A (en) Visual interactive method and system based on visual human
CN108942919A (en) A kind of exchange method and system based on visual human
CN109176535A (en) Exchange method and system based on intelligent robot
CN107632706A (en) The application data processing method and system of multi-modal visual human
CN108416420A (en) Limbs exchange method based on visual human and system
CN107679519A (en) A kind of multi-modal interaction processing method and system based on visual human
CN107808191A (en) The output intent and system of the multi-modal interaction of visual human
CN108052250A (en) Virtual idol deductive data processing method and system based on multi-modal interaction
CN109032328A (en) A kind of exchange method and system based on visual human
CN108681398A (en) Visual interactive method and system based on visual human
CN109278051A (en) Exchange method and system based on intelligent robot
CN109086860A (en) A kind of exchange method and system based on visual human

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20190125