CN110196927B - Multi-round man-machine conversation method, device and equipment - Google Patents

Multi-round man-machine conversation method, device and equipment

Info

Publication number
CN110196927B
Authority
CN
China
Prior art keywords
response data
machine response
server
client
machine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910383367.9A
Other languages
Chinese (zh)
Other versions
CN110196927A (en)
Inventor
吕飞飞
张子隆
刘炎
吴浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Volkswagen Mobvoi Beijing Information Technology Co Ltd
Original Assignee
Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Volkswagen Mobvoi Beijing Information Technology Co Ltd filed Critical Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority to CN201910383367.9A priority Critical patent/CN110196927B/en
Publication of CN110196927A publication Critical patent/CN110196927A/en
Application granted granted Critical
Publication of CN110196927B publication Critical patent/CN110196927B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90332 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Telephonic Communication Services (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

An embodiment of the invention discloses a method, a device and equipment for multi-round man-machine conversation. The method comprises the following steps: a client obtains user interaction voice input by a user in a current conversation turn and performs instruction analysis on the user interaction voice to obtain an analysis instruction; if the client determines that the analysis instruction is a return instruction, the client acquires an information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn and sends the information identifier to a server; and the client receives the confirmation return response of the server and presents the stored upper-level machine response data to the user. According to this technical scheme, when the client parses the user interaction voice as a return instruction, it sends the information identifier, obtains the confirmation return response of the server, and then retrieves the corresponding machine response data and presents it to the user. Multi-round conversation between human and machine is thereby realized, user experience is improved, the data bandwidth occupied by the client is reduced, and server resources are saved.

Description

Multi-round man-machine conversation method, device and equipment
Technical Field
The embodiment of the invention relates to the technical field of human-computer interaction, in particular to a method, a device and equipment for multi-round human-computer conversation.
Background
With the continuous progress of software technology, a wide variety of applications (APPs) have come into view, and the voice interaction function, as an intangible link between the user and the application, has become an extremely important part of application development.
Currently developed applications use single-round conversation during voice interaction. For example, when the user says "what good food is nearby", the voice interaction function returns a food list; the user can then say the name of a restaurant or its index number in the list, such as "the first one", to enter the details interface for that restaurant. When the user dislikes the restaurant or wants to view other restaurants, the user has to say "what good food is nearby" again.
This voice interaction mode has a major logical defect: it lacks association between contexts, so the server has to provide the same conversation content multiple times. In particular, when the conversation has many levels, the user often has to input the same question repeatedly and can reach the required conversation level only after several rounds of screening, which greatly increases the number of interactions and prolongs the conversation.
Disclosure of Invention
The embodiments of the invention provide a method, a device and equipment for multi-round man-machine conversation, which realize multi-round conversation between human and machine, ensure the accuracy of the presented data, prevent the client from repeatedly acquiring the same data content, and save server resources.
In a first aspect, an embodiment of the present invention provides a method for multi-turn human-machine conversation, including:
a client obtains user interaction voice input by a user in a current conversation turn, and performs instruction analysis on the user interaction voice to obtain an analysis instruction;
if the client determines that the analysis instruction is a return instruction, the client acquires an information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn, and sends the information identifier to a server;
and the client determines, according to a confirmation return response that is fed back by the server and matched with the information identifier, that the user interaction voice meets the request condition of historical machine response data, and presents the stored upper-level machine response data to the user.
In a second aspect, an embodiment of the present invention provides a method for multiple rounds of human-machine conversation, including:
a server receives user interaction voice that is sent by a client and input by a user in a current conversation turn;
if the server determines that the user interaction voice is a return instruction, the server acquires an information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn, feeds back a confirmation return response matched with the information identifier, and determines that the user interaction voice meets the request condition of historical machine response data;
and the server takes the upper-level machine response data as the current machine response data, so that the server and the client keep their data synchronized.
In a third aspect, an embodiment of the present invention provides a multi-turn human-machine conversation device, applied to a client, including:
an instruction analysis module, configured to obtain user interaction voice input by a user in a current conversation turn, and to perform instruction analysis on the user interaction voice to obtain an analysis instruction;
an information identifier acquisition module, configured to, if the analysis instruction is determined to be a return instruction, acquire an information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn and send the information identifier to a server;
and a machine response data presentation module, configured to determine, according to a confirmation return response that is fed back by the server and matched with the information identifier, that the user interaction voice meets the request condition of historical machine response data, and to present the stored upper-level machine response data to the user.
In a fourth aspect, an embodiment of the present invention provides a multi-turn human-machine conversation apparatus, applied in a server, including:
a user interaction voice acquisition module, configured to receive user interaction voice that is sent by a client and input by a user in a current conversation turn;
an instruction response module, configured to, if the user interaction voice is determined to be a return instruction, acquire an information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn, feed back a confirmation return response matched with the information identifier, and determine that the user interaction voice meets the request condition of historical machine response data;
and a first data synchronization module, configured to take the upper-level machine response data as the current machine response data, so that the server and the client keep their data synchronized.
In a fifth aspect, an embodiment of the present invention provides an apparatus, where the apparatus includes:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the multi-round man-machine conversation method of any embodiment of the invention.
In a sixth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the multi-round human-machine conversation method according to any embodiment of the present invention.
According to the technical scheme of the embodiment of the invention, the client analyzes the user interaction voice; when the user interaction voice is a return instruction, the client sends the stored information identifier to the server and, after receiving the confirmation return response of the server, retrieves the corresponding machine response data locally and presents it to the user. Multi-round conversation between human and machine is thereby realized and user experience is improved. Verifying the validity of the information identifier ensures the accuracy of the presented data; at the same time, the data bandwidth occupied by the client is reduced, the same data content is not repeatedly acquired from the server, and server resources are saved.
Drawings
FIG. 1A is a flowchart of a multi-round man-machine conversation method according to a first embodiment of the present invention;
FIG. 1B is a data flow diagram of a multi-round man-machine conversation method according to the first embodiment of the present invention;
FIG. 2A is a flowchart of a multi-round man-machine conversation method according to a second embodiment of the present invention;
FIG. 2B is a data flow diagram of a multi-round man-machine conversation method according to the second embodiment of the present invention;
FIG. 3 is a block diagram of a multi-round man-machine conversation device according to a third embodiment of the present invention;
FIG. 4 is a block diagram of a multi-round man-machine conversation device according to a fourth embodiment of the present invention;
FIG. 5 is a block diagram of a device according to a fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1A is a flowchart of a multi-turn human-machine conversation method according to the first embodiment of the present invention. This embodiment is applicable to a situation where a user performs multiple turns of human-machine conversation with a client. The method may be executed by a multi-turn human-machine conversation device according to an embodiment of the present invention; the device may be implemented by software and/or hardware, may generally be integrated in a client that provides a human-machine interaction function and is used in cooperation with a server that provides machine response data, and may typically be integrated in a vehicle navigation client. The method specifically includes the following steps:
S110, the client obtains user interaction voice input by the user in the current conversation turn, and performs instruction analysis on the user interaction voice to obtain an analysis instruction.
A client is an application program that provides local services for a user. It is installed in the user's terminal device, for example in electronic devices such as a mobile phone or a computer, or in communication devices of vehicles such as automobiles, trains and airplanes. Clients take various forms, such as a browser used for browsing web pages and various types of applications (APPs). Optionally, in the embodiment of the present invention, neither the type of the client nor the type of the terminal device on which the client is installed is specifically limited.
The client in the embodiment of the invention is a client with a human-machine interaction function and can acquire the interaction voice of a user. In the current conversation turn, when the client obtains the user interaction voice, it performs instruction analysis on the voice to obtain an analysis instruction. Optionally, in the embodiment of the present invention, instruction analysis is performed on the user interaction voice by using Automatic Speech Recognition (ASR) technology and/or Natural Language Understanding (NLU) technology to obtain the analysis instruction. ASR converts the lexical content of human speech into computer-readable input such as keystrokes, binary codes or character sequences. NLU extracts the semantics of text, that is, it converts text content into text semantics; the exact wording of the text matters less than the semantic information the text conveys.
S120, if the client determines that the analysis instruction is a return instruction, the client acquires an information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn, and sends the information identifier to a server.
The client determines from the analyzed instruction whether it is a return instruction. A return instruction is an instruction issued by the user to view the upper-level machine response data. If the machine response data of the current conversation turn was acquired based on information in other machine response data, the machine response data of the current conversation turn is the next-level machine response data of that other machine response data, and that other machine response data is the upper-level machine response data of the current conversation turn. For example, if the recognized instruction includes keywords such as "return", "previous level", or "reselection", it is regarded as a return instruction; the client then acquires the information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn and sends the information identifier to the server. For example, when the user utters the voice message "nearby food", the client provides a corresponding "food list" as first-level machine response data. Within this conversation, the user then says a food name or an index number from the food list to view the detailed information of one food item; the detailed food information provided by the client serves as second-level machine response data, and the first-level machine response data is the upper-level machine response data of the second-level machine response data.
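The keyword test described above can be sketched as follows. This is a minimal illustration, not the patented parser: it assumes the ASR/NLU stage has already produced English text, and the names `RETURN_KEYWORDS` and `is_return_instruction` are hypothetical.

```python
# Hypothetical keyword list based on the examples in the text
# ("return", "previous level", "reselection"). "reselect" is used so that
# both "reselect" and "reselection" match.
RETURN_KEYWORDS = ("return", "previous level", "reselect")


def is_return_instruction(parsed_text: str) -> bool:
    """Return True if the parsed instruction contains any return keyword."""
    normalized = parsed_text.strip().lower()
    return any(keyword in normalized for keyword in RETURN_KEYWORDS)


print(is_return_instruction("Return to the previous level"))
print(is_return_instruction("nearby supermarket"))
```

A plain substring match is the simplest reading of "includes keywords"; a real NLU component would classify intent rather than match literal strings.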
When each piece of machine response data is generated, a matched, unique information identifier is generated to represent it. Therefore, when the client receives a return instruction under the machine response data of the current conversation turn, that is, in the conversation showing the detailed food information, it looks up the information identifier corresponding to the upper-level machine response data, that is, the identifier of the food list, and sends it to the server.
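One way to picture the client-side lookup is a list of stored responses ordered by level, where the upper-level identifier is the one just before the current entry. The sketch below is an assumption about the storage layout, which the text does not specify; `MachineResponse`, `ClientHistory` and the identifier strings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MachineResponse:
    info_id: str   # unique information identifier generated with the response
    payload: str   # content presented to the user


@dataclass
class ClientHistory:
    # Stored responses ordered from first level to current level (assumed layout).
    levels: List[MachineResponse] = field(default_factory=list)

    def push(self, response: MachineResponse) -> None:
        """Store a newly received response as the current level."""
        self.levels.append(response)

    def parent_info_id(self) -> Optional[str]:
        """Identifier of the upper-level response, or None if there is none."""
        if len(self.levels) < 2:
            return None
        return self.levels[-2].info_id


history = ClientHistory()
history.push(MachineResponse("id-food-list", "food list"))
history.push(MachineResponse("id-food-detail", "details of one food item"))
print(history.parent_info_id())
```

With the food example, the current level is the food detail, so the identifier sent to the server is the one stored with the food list.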
S130, the client determines, according to the confirmation return response that is fed back by the server and matched with the information identifier, that the user interaction voice meets the request condition of historical machine response data, and presents the stored upper-level machine response data to the user.
If the client receives the confirmation return response fed back by the server, the information identifier (for example, the identifier corresponding to the food list) is valid. The client then determines that the user interaction voice meets the request condition of historical machine response data and presents to the user the locally stored upper-level machine response data of the current machine response data, that is, the food list, which is the upper level of the detailed food information. In particular, the machine response data may be presented to the user as voice and/or a text list, or in other forms.
If the client receives an identifier-invalid instruction fed back by the server, or does not receive the confirmation return response within a set time, the client informs the user by voice and/or text that the return instruction is invalid, so that the user can perform voice interaction again.
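The three outcomes of S130 (confirmation, invalid identifier, timeout) can be sketched as a single client-side function. The message constants, the `ask_server` callback and the decision to drop the current level after a confirmed return are all assumptions made for illustration; a timeout is modeled as the callback returning `None`.

```python
from typing import Callable, List, Optional, Tuple

CONFIRM = "confirm_return"        # hypothetical server reply: identifier is valid
INVALID = "identifier_invalid"    # hypothetical server reply: identifier is invalid
INVALID_MESSAGE = "The return instruction is invalid; please try again."


def handle_return(
    stored: List[Tuple[str, str]],
    ask_server: Callable[[str], Optional[str]],
) -> str:
    """Handle a return instruction on the client.

    `stored` holds (info_id, payload) pairs from first level to current level.
    `ask_server` sends the identifier and returns CONFIRM, INVALID, or None
    when no reply arrives within the set time.
    """
    if len(stored) < 2:
        return INVALID_MESSAGE        # no upper level to return to
    parent_id, parent_payload = stored[-2]
    reply = ask_server(parent_id)
    if reply == CONFIRM:
        stored.pop()                  # assumed: the upper level becomes current
        return parent_payload         # presented from local storage, not re-downloaded
    return INVALID_MESSAGE            # invalid identifier or timeout


stored = [("id-food-list", "food list"), ("id-food-detail", "food details")]
print(handle_return(stored, lambda info_id: CONFIRM))
```

Only the identifier travels to the server and only a short confirmation comes back, which is where the bandwidth saving claimed in the text comes from.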
Optionally, in the embodiment of the present invention, after the client obtains the user interaction voice input by the user in the current conversation turn and performs instruction analysis on it to obtain an analysis instruction, if the client determines that the analysis instruction is a non-return instruction, the client sends the analysis instruction to the server, so that the server looks up the machine response data matched with the analysis instruction and generates an information identifier matched with that machine response data. A non-return instruction is any user instruction other than a return instruction; in particular, a user instruction recognized as containing none of the keywords such as "return", "previous level" and "reselection" is regarded as a non-return instruction. For example, suppose the current conversation turn is the "food list" that the client provided in response to the user's voice message "nearby food", and the user then says "nearby supermarket". The client determines from the analyzed instruction that it is a non-return instruction and sends it to the server, so that the server looks up the "supermarket list" matched with "nearby supermarket" and generates an information identifier matched with the "supermarket list". When the client receives the supermarket list and its matched information identifier from the server, it stores them locally and presents the supermarket list to the user.
According to the technical scheme of the embodiment of the invention, the client analyzes the user interaction voice; when the user interaction voice is a return instruction, the client sends the stored information identifier to the server and, after receiving the confirmation return response of the server, retrieves the corresponding machine response data locally and presents it to the user. Multi-round conversation between human and machine is thereby realized and user experience is improved. Verifying the validity of the information identifier ensures the accuracy of the presented data; at the same time, the data bandwidth occupied by the client is reduced, the same data content is not repeatedly acquired from the server, and server resources are saved.
Specific application scenario one
Fig. 1B is a data flow chart of a multi-round man-machine conversation method according to a specific application scenario of the present invention, provided on the basis of the above embodiment. The data flow is as follows:
the client obtains and analyzes user interaction voice input by a user in the current conversation turn; the client determines that the analyzed instruction is a return instruction; the client acquires the information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn; the client sends the information identifier to a server; the server receives the information identifier sent by the client; the server verifies that the information identifier is valid; the server generates a confirmation return response matched with the information identifier, and updates the current machine response data to keep its data synchronized with the client; the server sends the confirmation return response to the client; the client receives the confirmation return response sent by the server; the client determines that the user interaction voice meets the request condition of historical machine response data; and the client presents the stored upper-level machine response data to the user.
According to the technical scheme of the embodiment of the invention, the client analyzes the user interaction voice; when the user interaction voice is a return instruction, the client sends the stored information identifier to the server and, after receiving the confirmation return response of the server, retrieves the corresponding machine response data locally and presents it to the user. Multi-round conversation between human and machine is thereby realized and user experience is improved. Verifying the validity of the information identifier ensures the accuracy of the presented data; at the same time, the data bandwidth occupied by the client is reduced, the same data content is not repeatedly acquired from the server, and server resources are saved.
Example two
Fig. 2A is a flowchart of a multi-round human-machine conversation method according to a second embodiment of the present invention. This embodiment is applicable to a situation where a user performs multi-round human-machine conversation with a client. The method may be executed by a multi-round human-machine conversation device according to the second embodiment of the present invention; the device may be implemented by software and/or hardware, may generally be integrated in a server that has a human-machine conversation processing function and is used in cooperation with a client that obtains user interaction voice, and may typically be integrated in a vehicle-mounted navigation server. The method specifically includes the following steps:
S210, the server receives user interaction voice that is sent by the client and input by the user in the current conversation turn.
S220, if the server determines that the user interaction voice is a return instruction, the server acquires an information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn, feeds back a confirmation return response matched with the information identifier, and determines that the user interaction voice meets the request condition of historical machine response data.
Optionally, in the embodiment of the present invention, the server determines whether the user interaction voice is a return instruction by using ASR technology and/or NLU technology. If the server determines that the user interaction voice is a return instruction, it acquires the information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn. If the information identifier can be acquired, the return instruction is valid, and a confirmation return response is sent to the client so that the client responds to the user's return instruction; if the information identifier cannot be acquired, the return instruction is invalid, and an invalid return response is sent to the client so that the client informs the user that the return instruction is invalid.
Optionally, in the embodiment of the present invention, a validity period may be set for the information identifier. The server stores the information identifier during the validity period and deletes it once the period has elapsed, after which the identifier can no longer be queried and is invalid. For example, as in the above embodiment, suppose the machine response data of the current conversation turn is the detailed data of a food item and the validity period of the information identifier is ten minutes. If the server determines at some moment that the user interaction voice is a return instruction, and the time elapsed between the generation of the information identifier of the upper-level machine response data (the "food list") and that moment is less than or equal to ten minutes, the information identifier is still stored in the server and the server can obtain it, that is, the information identifier is valid. Different storage times can be set according to user tier; for example, a longer storage time for VIP users and a shorter one for general users.
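The validity period can be sketched as an expiry map on the server. The ten-minute period for general users comes from the example above; the thirty-minute VIP period, the tier names and the `IdentifierStore` class are hypothetical. The `now` parameter exists only so the behavior can be checked without waiting.

```python
import time
from typing import Dict, Optional

# Validity periods in seconds per user tier. Ten minutes matches the example
# in the text; the VIP value is an assumed "longer" period.
TTL_BY_TIER = {"general": 10 * 60, "vip": 30 * 60}


class IdentifierStore:
    """Server-side store that forgets identifiers after their validity period."""

    def __init__(self) -> None:
        self._expires_at: Dict[str, float] = {}

    def save(self, info_id: str, tier: str, now: Optional[float] = None) -> None:
        """Record the identifier at generation time with a tier-specific expiry."""
        now = time.time() if now is None else now
        self._expires_at[info_id] = now + TTL_BY_TIER[tier]

    def is_valid(self, info_id: str, now: Optional[float] = None) -> bool:
        """True while the elapsed time is less than or equal to the validity period."""
        now = time.time() if now is None else now
        expires_at = self._expires_at.get(info_id)
        if expires_at is None:
            return False
        if now > expires_at:
            del self._expires_at[info_id]   # expired: delete so it cannot be queried
            return False
        return True


store = IdentifierStore()
store.save("id-food-list", "general", now=0.0)
print(store.is_valid("id-food-list", now=600.0))   # exactly ten minutes later
print(store.is_valid("id-food-list", now=601.0))   # just past ten minutes
```

This sketch deletes lazily on lookup; a periodic cleanup task would satisfy the same description.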
Optionally, the information identifier includes a hash value. A hash value is obtained by mapping a longer piece of data to a shorter one through a hash algorithm, such as the MD5 Message-Digest Algorithm (MD5) or Secure Hash Algorithm 1 (SHA-1); the resulting shorter data is the hash value of the longer data. In the embodiment of the present invention, the algorithm used to obtain the hash value is not particularly limited. In particular, for user requests with the same content at the same time, for example when different users utter the same interactive voice "nearby food" at the same moment, the machine response data obtained by the query differ because the users' locations and user tiers differ, that is, the data sources differ, so the hash values generated from the machine response data also differ. Using the uniqueness of hash values, the hash value serves as the information identifier so that different machine response data can be accurately distinguished.
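As an illustration of deriving the identifier from the response data, the sketch below hashes a canonical JSON form of the data. Since the text leaves the algorithm open, SHA-256 from Python's standard `hashlib` is used here instead of the MD5 or SHA-1 examples it names; the function name `make_info_id` and the dictionary fields are hypothetical.

```python
import hashlib
import json


def make_info_id(response_data: dict) -> str:
    """Derive an information identifier from machine response data.

    The data is serialized with sorted keys so that the same content always
    yields the same digest regardless of key order.
    """
    canonical = json.dumps(response_data, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same query from two users at different locations: the response data differ,
# so the identifiers differ.
user_a = {"query": "nearby food", "location": "district A", "items": ["noodle shop"]}
user_b = {"query": "nearby food", "location": "district B", "items": ["dumpling shop"]}
print(make_info_id(user_a) != make_info_id(user_b))
```

Hashing the response data rather than the query text is what lets two identical utterances map to different identifiers.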
The server feeds back the confirmation return response corresponding to the information identifier, so that the uniqueness of the information identifier is utilized, the response accuracy is ensured, the same machine response data is prevented from being sent to the client again, the data bandwidth occupied by the client is reduced, and the communication resource of the server is saved.
S230, the server takes the upper-level machine response data as the current machine response data, so that the server and the client keep their data synchronized.
While sending the confirmation return response to the client, the server takes the upper-level machine response data as the current machine response data so that the data on the server and the client stay synchronized.
Optionally, in the embodiment of the present invention, after the server receives the user interaction voice that is sent by the client and input by the user in the current conversation turn, if the server determines that the user interaction voice is a non-return instruction, the server obtains the machine response data matched with the user interaction voice, generates an information identifier matched with the machine response data, and feeds back the machine response data and the information identifier to the client, so that the client presents the machine response data to the user. For example, suppose that in the current conversation turn the server has parsed the user's voice message as "nearby food" and provided a "food list" as the current machine response data. The server then receives user interaction voice again and parses it as "nearby supermarket", so it determines that the analysis instruction is a non-return instruction, obtains the "supermarket list" matched with "nearby supermarket", and generates an information identifier matched with the "supermarket list". The server sends the "supermarket list" and its matched information identifier to the client so that the client presents the "supermarket list" to the user; at the same time, the server takes the "supermarket list" as the current machine response data to keep its data synchronized with the client.
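The server-side handling of return and non-return instructions in S220 and S230 can be sketched together. Several simplifications are assumed: text stands in for the output of ASR/NLU, every non-return response is stored as a deeper level, and the class `ServerSession`, the `lookup` callback and the reply dictionaries are hypothetical.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

RETURN_KEYWORDS = ("return", "previous level", "reselect")  # assumed keyword list


class ServerSession:
    """Per-user conversation state kept on the server."""

    def __init__(self, lookup: Callable[[str], str]) -> None:
        self.lookup = lookup                       # maps an instruction to response data
        self.history: List[Tuple[str, str]] = []   # (info_id, data); last is current

    def handle(self, instruction: str) -> Dict[str, str]:
        text = instruction.lower()
        if any(keyword in text for keyword in RETURN_KEYWORDS):
            if len(self.history) < 2:
                return {"type": "invalid_return"}  # no upper-level identifier found
            self.history.pop()                     # upper level becomes current (S230)
            info_id, _ = self.history[-1]
            # Only the identifier is sent back, not the response data itself.
            return {"type": "confirm_return", "info_id": info_id}
        data = self.lookup(instruction)
        info_id = hashlib.sha256(data.encode("utf-8")).hexdigest()
        self.history.append((info_id, data))       # new data becomes current
        return {"type": "response", "info_id": info_id, "data": data}


catalog = {"nearby food": "food list", "the first one": "details of food 1"}
session = ServerSession(catalog.__getitem__)
first = session.handle("nearby food")
session.handle("the first one")
back = session.handle("return")
print(back["type"], back["info_id"] == first["info_id"])
```

After the return, the session's current entry is the food list again, mirroring what the client presents from its own storage.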
According to the technical scheme of the embodiment of the invention, the server analyzes the user interaction voice; when the user interaction voice is a return instruction, the server sends a confirmation return response to the client according to the information identifier and updates the current machine response data to keep its data synchronized with the client. Multi-round conversation between human and machine is thereby realized and user experience is improved. Verifying the validity of the information identifier ensures both the accuracy of the presented data and the data synchronization between the client and the server; at the same time, the server does not repeatedly send the same data content to the same client, and communication resources are saved.
Specific application scenario two
Fig. 2B is a data flow chart of a multi-round human-computer conversation method provided on the basis of the above embodiment according to a second specific application scenario of the present invention, where the data flow is as follows:
the client obtains the user interaction voice input by the user in the current conversation turn; the client sends the user interaction voice to the server; the server receives and parses the user interaction voice sent by the client; the server determines that the user interaction voice is a return instruction; the server obtains the information identifier of the upper-level machine response data that matches the machine response data of the current conversation turn; the server generates a confirmation return response matching the information identifier and updates the current machine response data; the server sends the confirmation return response matching the information identifier to the client; the client receives the confirmation return response sent by the server; and the client presents the stored upper-level machine response data to the user.
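The data flow above can be traced end to end in a short sketch. Everything named here is hypothetical: the `Server` and `Client` classes, the message fields `type` and `info_id`, and the literal phrase "go back" standing in for real speech parsing. The point it illustrates is that the confirmation return response carries only an identifier, and the client presents data it already stored.

```python
class Server:
    """Hypothetical server: tracks identifiers of the responses it has sent."""

    def __init__(self):
        self.history = []  # information identifiers, oldest first; last is current

    def handle_voice(self, text):
        # Stand-in for real voice parsing: only the phrase "go back" is understood.
        if text.strip().lower() != "go back":
            raise NotImplementedError("non-return instructions are outside this sketch")
        if len(self.history) < 2:
            return {"type": "no_history"}
        # Obtain the identifier of the upper-level machine response data ...
        upper_id = self.history[-2]
        # ... and update the current machine response data to match the client.
        self.history.pop()
        return {"type": "confirm_return", "info_id": upper_id}


class Client:
    """Hypothetical client: keeps every received response in local storage."""

    def __init__(self, server):
        self.server = server
        self.cache = {}    # info_id -> machine response data
        self.shown = None  # what is currently presented to the user

    def store(self, info_id, data):
        self.cache[info_id] = data
        self.shown = data

    def on_user_voice(self, text):
        reply = self.server.handle_voice(text)
        if reply["type"] == "confirm_return":
            # Present the stored upper-level data; nothing is downloaded again.
            self.shown = self.cache[reply["info_id"]]
        return self.shown


server = Server()
client = Client(server)
# Two earlier turns, simulated: both sides already know these identifiers.
server.history = ["id-food", "id-market"]
client.store("id-food", "food list")
client.store("id-market", "supermarket list")

result = client.on_user_voice("go back")
print(result)  # food list
```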
According to the technical solution of this embodiment of the present invention, the server parses the user interaction voice; when the user interaction voice is a return instruction, the server sends a confirmation return response to the client according to the information identifier and updates the current machine response data to stay synchronized with the client. This realizes multi-round conversation between human and machine and improves the user experience. Verifying the validity of the information identifier ensures both the accuracy of the presented data and data synchronization between the client and the server, while the server avoids repeatedly sending the same data content to the same client, which saves communication resources.
Embodiment Three
Fig. 3 is a block diagram of a multi-turn human-machine conversation apparatus provided in a third embodiment of the present invention, where the apparatus is applied to a client, and specifically includes: an instruction parsing module 310, an information identifier obtaining module 320, and a machine response data presenting module 330.
The instruction parsing module 310 is configured to obtain user interaction voice input by a user in a current conversation turn, and to perform instruction parsing on the user interaction voice to obtain an analysis instruction;
the information identifier obtaining module 320 is configured to, if it is determined that the analysis instruction is a return instruction, obtain the information identifier of the upper-level machine response data that matches the machine response data of the current conversation turn, and send the information identifier to a server;
and the machine response data presenting module 330 is configured to determine, according to the confirmation return response that the server feeds back and that matches the information identifier, that the user interaction voice meets a historical machine response data request condition, and to present the stored upper-level machine response data to the user.
According to the technical solution of this embodiment of the present invention, the client parses the user interaction voice, sends the stored information identifier to the server when the user interaction voice is a return instruction, and, after receiving the server's confirmation return response, retrieves the corresponding machine response data locally and presents it to the user. This realizes multi-round conversation between human and machine and improves the user experience. Verifying the validity of the information identifier ensures the accuracy of the presented data, while the client uses less data bandwidth and avoids repeatedly fetching the same data content from the server, which saves server resources.
Optionally, on the basis of the foregoing embodiments, the multi-round human-machine interaction device further includes:
the non-return instruction determining module is used for sending the analysis instruction to a server if the analysis instruction is determined to be a non-return instruction, so that the server searches machine response data matched with the analysis instruction and generates an information identifier matched with the machine response data;
and the local storage module is used for locally storing the machine response data and presenting the machine response data to a user if the machine response data sent by the server and the information identifier matched with the machine response data are obtained.
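As a rough illustration of how the client-side modules listed above could fit together, the following sketch folds them into one class. The class `ClientDevice`, the `send` transport callable, the message fields, the `"RETURN"` marker, and the `fake_server` stub are assumptions for illustration only; the embodiment does not prescribe any of them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ClientDevice:
    # Hypothetical transport: takes a request dict, returns the server's reply dict.
    send: Callable[[dict], dict]
    store: Dict[str, str] = field(default_factory=dict)  # local storage: info_id -> data
    trail: List[str] = field(default_factory=list)       # identifiers in presentation order

    def on_instruction(self, instruction: str) -> Optional[str]:
        if instruction == "RETURN":
            if len(self.trail) < 2:
                return None  # no upper-level response to go back to
            # Information identifier obtaining: look up the upper-level identifier
            # and send only that identifier to the server.
            upper_id = self.trail[-2]
            reply = self.send({"type": "return", "info_id": upper_id})
            # Presenting: show stored data only if the confirmation matches.
            if reply.get("type") == "confirm_return" and reply.get("info_id") == upper_id:
                self.trail.pop()
                return self.store[upper_id]
            return None
        # Non-return instruction: forward it, then store and present the result.
        reply = self.send({"type": "query", "instruction": instruction})
        self.store[reply["info_id"]] = reply["data"]
        self.trail.append(reply["info_id"])
        return reply["data"]


def fake_server(request: dict) -> dict:
    """Stub server for the example; it confirms every return request."""
    if request["type"] == "return":
        return {"type": "confirm_return", "info_id": request["info_id"]}
    return {"info_id": "id:" + request["instruction"],
            "data": request["instruction"] + " list"}


client = ClientDevice(send=fake_server)
client.on_instruction("food")
client.on_instruction("supermarket")
print(client.on_instruction("RETURN"))  # food list
```

The return request in this sketch carries no response data in either direction, which is where the bandwidth saving claimed for the client comes from.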
Optionally, on the basis of the foregoing embodiments, the instruction parsing module 310 is specifically configured to:
parse the user interaction voice using automatic speech recognition and/or natural language understanding to obtain the analysis instruction.
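Real speech recognition and language understanding are well beyond a short example, but the final classification step can be sketched. The following assumes the recognizer has already produced a text transcript; the trigger phrases in `RETURN_PHRASES` and the output dictionary shape are hypothetical, and simple substring matching is a stand-in for an actual natural language understanding component.

```python
# Hypothetical trigger phrases; a real system would use an NLU model instead.
RETURN_PHRASES = ("go back", "previous page", "return to the last")


def parse_instruction(transcript: str) -> dict:
    """Classify a recognized transcript as a return or non-return instruction."""
    # Normalize case and whitespace before matching.
    text = " ".join(transcript.lower().split())
    if any(phrase in text for phrase in RETURN_PHRASES):
        return {"kind": "return"}
    return {"kind": "non_return", "query": text}


print(parse_instruction("Go  back")["kind"])            # return
print(parse_instruction("Nearby supermarket")["kind"])  # non_return
```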
The device can execute the multi-round man-machine conversation method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details not described in detail in this embodiment, reference may be made to the method provided in any embodiment of the present invention.
Embodiment Four
Fig. 4 is a block diagram of a multi-turn human-machine conversation apparatus according to a fourth embodiment of the present invention, where the apparatus is applied to a server, and specifically includes: a user interaction voice acquisition module 410, an instruction response module 420 and a first data synchronization module 430.
A user interaction voice obtaining module 410, configured to receive a user interaction voice, which is sent by a client and is input by a user in a current conversation turn;
an instruction response module 420, configured to, if it is determined that the user interaction voice is a return instruction, obtain an information identifier of the previous-stage machine response data that matches the machine response data of the current conversation turn, feed back a confirmation return response matching the information identifier, and determine that the user interaction voice satisfies a historical machine response data request condition;
a first data synchronization module 430, configured to use the previous-level machine response data as current machine response data, so that the server and the client maintain data synchronization.
According to the technical solution of this embodiment of the present invention, the server parses the user interaction voice; when the user interaction voice is a return instruction, the server sends a confirmation return response to the client according to the information identifier and updates the current machine response data to stay synchronized with the client. This realizes multi-round conversation between human and machine and improves the user experience. Verifying the validity of the information identifier ensures both the accuracy of the presented data and data synchronization between the client and the server, while the server avoids repeatedly sending the same data content to the same client, which saves communication resources.
Optionally, on the basis of the foregoing embodiments, the multi-round human-machine interaction device further includes:
the machine response data sending module is used for acquiring machine response data matched with the user interaction voice if the user interaction voice is determined to be a non-return instruction, generating an information identifier matched with the machine response data, and feeding back the machine response data and the information identifier to the client so that the client presents the machine response data to a user;
and the second data synchronization module is used for taking the machine response data as current machine response data so as to keep data synchronization between the server and the client.
Optionally, on the basis of the foregoing embodiments, the information identifier includes a hash value.
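One way a hash value could serve as the information identifier is to hash a canonical serialization of the machine response data, so that identical content always yields the same identifier. The sketch below assumes SHA-256 and JSON serialization; neither the hash function nor the serialization is specified in the text, and the function name `info_id_for` is hypothetical.

```python
import hashlib
import json


def info_id_for(response_data) -> str:
    """Derive an identifier from response content (assumed scheme: SHA-256 of JSON)."""
    # Sorting keys and fixing separators makes the serialization canonical,
    # so key order in the input does not change the identifier.
    canonical = json.dumps(response_data, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


food = {"title": "food list", "items": ["noodle shop", "dumpling house"]}
same_food = {"items": ["noodle shop", "dumpling house"], "title": "food list"}
market = {"title": "supermarket list", "items": ["market A", "market B"]}

print(info_id_for(food) == info_id_for(same_food))  # True
print(info_id_for(food) == info_id_for(market))     # False
```

A content-derived identifier also lets either side check that the data it holds is the data the identifier refers to, by recomputing the hash.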
The device can execute the multi-round man-machine conversation method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details not described in detail in this embodiment, reference may be made to the method provided in any embodiment of the present invention.
Embodiment Five
Fig. 5 is a schematic structural diagram of a multi-turn human-machine interaction device according to a fifth embodiment of the present invention. As shown in Fig. 5, the device includes a processor 50, a memory 51, an input device 52, and an output device 53. The device may have one or more processors 50; Fig. 5 takes one processor 50 as an example. The processor 50, the memory 51, the input device 52, and the output device 53 may be connected by a bus or by other means; Fig. 5 takes a bus connection as an example.
The memory 51 is a computer-readable storage medium and can be used for storing software programs, computer-executable programs, and modules, such as the modules corresponding to the client-side multi-round human-machine conversation apparatus in the embodiments of the present invention (the instruction parsing module 310, the information identifier obtaining module 320, and the machine response data presenting module 330), or the modules corresponding to the server-side multi-round human-machine conversation apparatus in the embodiments of the present invention (the user interaction voice acquisition module 410, the instruction response module 420, and the first data synchronization module 430). The processor 50 runs the software programs, instructions, and modules stored in the memory 51 to execute the various functional applications and data processing of the device, that is, to implement the multi-round human-machine conversation method described above.
The memory 51 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 51 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 51 may further include memory located remotely from the processor 50, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 52 is operable to receive input numeric or character information and to generate key signal inputs relating to user settings and function controls of the apparatus. The output device 53 may include a display device such as a display screen.
Embodiment Six
An embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a computer processor, perform a multi-round human-machine conversation method, the method including:
the client obtains user interaction voice input by a user in a current conversation turn, and performs instruction parsing on the user interaction voice to obtain an analysis instruction;
if the client determines that the analysis instruction is a return instruction, the client obtains the information identifier of the upper-level machine response data that matches the machine response data of the current conversation turn, and sends the information identifier to a server;
and the client determines, according to the confirmation return response that the server feeds back and that matches the information identifier, that the user interaction voice meets the historical machine response data request condition, and presents the stored upper-level machine response data to the user.
Alternatively, the computer-readable storage medium stores computer-executable instructions which, when executed by a computer processor, perform a multi-round human-machine conversation method, the method including:
the server receives user interaction voice which is sent by the client and input by a user in the current conversation turn;
if the server determines that the user interaction voice is a return instruction, the server acquires an information identifier of the previous-stage machine response data matched with the machine response data of the current conversation turn, feeds back a confirmation return response matched with the information identifier, and determines that the user interaction voice meets the request condition of historical machine response data;
and the server takes the upper-level machine response data as current machine response data so as to keep data synchronization between the server and the client.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in a multi-round man-machine conversation method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the multi-round human-computer interaction device, the included modules are only divided according to functional logic, but are not limited to the above division, as long as the corresponding functions can be realized; in addition, the specific names of the functional modules are only for convenience of distinguishing from each other and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (8)

1. A method for multiple rounds of human-computer conversation, comprising:
the client obtains user interaction voice input by a user in a current conversation turn, and performs instruction parsing on the user interaction voice to obtain an analysis instruction;
if the client determines that the analysis instruction is a return instruction, the client obtains the information identifier of the upper-level machine response data that matches the machine response data of the current conversation turn, and sends the information identifier to a server;
the client determines, according to the confirmation return response that the server feeds back and that matches the information identifier, that the user interaction voice meets the historical machine response data request condition, and presents the stored upper-level machine response data to the user;
if the client determines that the analysis instruction is a non-return instruction, the client sends the analysis instruction to a server so that the server searches machine response data matched with the analysis instruction and generates an information identifier matched with the machine response data;
and if the client acquires the machine response data sent by the server and the information identifier matched with the machine response data, the client performs local storage and presents the machine response data to a user.
2. The method of claim 1, wherein the performing instruction parsing on the user interaction voice to obtain a parsing instruction comprises: and analyzing the user interaction voice by using an automatic voice recognition technology and/or a natural language understanding technology to obtain an analysis instruction.
3. A method for multiple rounds of human-computer conversation, comprising:
the server receives user interaction voice which is sent by the client and input by a user in the current conversation turn;
if the server determines that the user interaction voice is a return instruction, the server acquires an information identifier of the previous-stage machine response data matched with the machine response data of the current conversation turn, feeds back a confirmation return response matched with the information identifier, and determines that the user interaction voice meets the request condition of historical machine response data;
the server takes the upper-level machine response data as current machine response data so as to keep data synchronization between the server and the client;
if the server determines that the user interaction voice is a non-return instruction, acquiring machine response data matched with the user interaction voice, generating an information identifier matched with the machine response data, and feeding back the machine response data and the information identifier to the client so that the client presents the machine response data to a user;
and the server takes the machine response data as current machine response data so as to keep data synchronization between the server and the client.
4. The method of claim 3, wherein the information identifier comprises a hash value.
5. A multi-turn man-machine conversation device is applied to a client side and is characterized by comprising:
the instruction analysis module is used for acquiring user interaction voice input by a user under the current conversation turn, and performing instruction analysis on the user interaction voice to obtain an analysis instruction;
the information identifier acquisition module is used for acquiring the information identifier of the upper-level machine response data matched with the machine response data of the current conversation turn and sending the information identifier to a server if the analysis instruction is determined to be a return instruction;
the machine response data presentation module is used for determining, according to the confirmation return response that the server feeds back and that matches the information identifier, that the user interaction voice meets the request condition of historical machine response data, and presenting the stored upper-level machine response data to the user;
the non-return instruction determining module is used for sending the analysis instruction to a server if the analysis instruction is determined to be a non-return instruction, so that the server searches machine response data matched with the analysis instruction and generates an information identifier matched with the machine response data;
and the local storage module is used for locally storing the machine response data and presenting the machine response data to a user if the machine response data sent by the server and the information identifier matched with the machine response data are obtained.
6. The apparatus of claim 5, wherein the instruction parsing module is specifically configured to:
and analyzing the user interaction voice by using an automatic voice recognition technology and/or a natural language understanding technology to obtain an analysis instruction.
7. A multi-round man-machine conversation device applied to a server is characterized by comprising:
the user interaction voice acquisition module is used for receiving user interaction voice which is sent by the client and input by a user in the current conversation turn;
the instruction response module is used for acquiring an information identifier of the previous-stage machine response data matched with the machine response data of the current conversation turn if the user interaction voice is determined to be a return instruction, feeding back a confirmation return response matched with the information identifier, and determining that the user interaction voice meets the request condition of historical machine response data;
the first data synchronization module is used for taking the upper-level machine response data as current machine response data so as to keep data synchronization between the server and the client;
the machine response data sending module is used for acquiring machine response data matched with the user interaction voice if the user interaction voice is determined to be a non-return instruction, generating an information identifier matched with the machine response data, and feeding back the machine response data and the information identifier to the client so that the client presents the machine response data to a user;
and the second data synchronization module is used for taking the machine response data as current machine response data so as to keep data synchronization between the server and the client.
8. An apparatus, characterized in that the apparatus comprises:
one or more processors;
storage means for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the multi-round human-machine conversation method of claim 1 or 2, or the multi-round human-machine conversation method of claim 3 or 4.
CN201910383367.9A 2019-05-09 2019-05-09 Multi-round man-machine conversation method, device and equipment Active CN110196927B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910383367.9A CN110196927B (en) 2019-05-09 2019-05-09 Multi-round man-machine conversation method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910383367.9A CN110196927B (en) 2019-05-09 2019-05-09 Multi-round man-machine conversation method, device and equipment

Publications (2)

Publication Number Publication Date
CN110196927A CN110196927A (en) 2019-09-03
CN110196927B true CN110196927B (en) 2021-09-10

Family

ID=67752607

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910383367.9A Active CN110196927B (en) 2019-05-09 2019-05-09 Multi-round man-machine conversation method, device and equipment

Country Status (1)

Country Link
CN (1) CN110196927B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110941693A (en) * 2019-10-09 2020-03-31 深圳软通动力信息技术有限公司 Task-based man-machine conversation method, system, electronic equipment and storage medium
CN110737765A (en) * 2019-10-25 2020-01-31 上海喜马拉雅科技有限公司 Dialogue data processing method for multi-turn dialogue and related device
CN112417109B (en) * 2020-10-26 2023-08-01 问问智能信息科技有限公司 Method and device for testing man-machine dialogue system
CN113656562B (en) * 2020-11-27 2024-07-02 话媒(广州)科技有限公司 Multi-round man-machine psychological interaction method and device
CN113079400A (en) * 2021-03-25 2021-07-06 海信视像科技股份有限公司 Display device, server and voice interaction method
CN116521841B (en) * 2023-04-18 2024-05-14 百度在线网络技术(北京)有限公司 Method, device, equipment and medium for generating reply information

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103927006A (en) * 2014-04-08 2014-07-16 弗徕威智能机器人科技(上海)有限公司 Robot based information interaction system and method
CN108366281A (en) * 2018-02-05 2018-08-03 山东浪潮商用***有限公司 A kind of full voice exchange method applied to set-top box
CN109151063A (en) * 2018-10-10 2019-01-04 小雅智能平台(深圳)有限公司 A kind of method and system controlling intelligent robot

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140278403A1 (en) * 2013-03-14 2014-09-18 Toytalk, Inc. Systems and methods for interactive synthetic character dialogue
CN106095568B (en) * 2016-06-01 2019-10-29 努比亚技术有限公司 Memory management device, mobile terminal and method
WO2018000278A1 (en) * 2016-06-29 2018-01-04 深圳狗尾草智能科技有限公司 Context sensitive multi-round dialogue management system and method based on state machines
CN107053208B (en) * 2017-05-24 2018-06-01 北京无忧创新科技有限公司 A kind of method of active dialog interaction robot system and the system active interlocution

Also Published As

Publication number Publication date
CN110196927A (en) 2019-09-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant