CN111159371A - Dialogue strategy method for task-oriented dialogue system - Google Patents


Publication number
CN111159371A
Authority
CN
China
Prior art keywords
state
dialog
conversation
task
information entropy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911331882.9A
Other languages
Chinese (zh)
Other versions
CN111159371B (en
Inventor
赵阳洋
王振宇
王佩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201911331882.9A priority Critical patent/CN111159371B/en
Publication of CN111159371A publication Critical patent/CN111159371A/en
Priority to PCT/CN2020/142579 priority patent/WO2021121436A1/en
Application granted granted Critical
Publication of CN111159371B publication Critical patent/CN111159371B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a dialogue strategy method for a task-oriented dialogue system, applied to a knowledge-graph-based intelligent music search scenario. The method comprises the following steps: S1, constructing a Markov decision model for a specific domain; S2, calculating a state value function matrix using the Bellman equation; S3, matching the knowledge graph and searching the knowledge base according to the dialogue state at the current moment to obtain music results that satisfy the user goal; S4, calculating the attribute information entropy of the search results; S5, analyzing the calculated attribute information entropy; and S6, calculating the next-round action through the state transition matrix. The invention overcomes the difficulty of a completely cold start in a task-oriented dialogue system: it calculates the state value function matrix by constructing a reinforcement learning model, obtains the next-round action by combining the state value function matrix with the attribute information entropy of the state, completes the knowledge search task in fewer dialogue rounds, and has good usability.

Description

Dialogue strategy method for task-oriented dialogue system
Technical Field
The invention relates to the field of knowledge-graph-based intelligent search in task-oriented dialogue systems, and in particular to a dialogue strategy method for a task-oriented dialogue system.
Background
With the rapid development of artificial intelligence technology, interaction between people and intelligent devices is becoming more intelligent and is gradually shifting from the traditional Graphical User Interface (GUI) to the Conversational User Interface (CUI), in which an artificial intelligence assistant helps users complete multiple tasks or services. By function, human-machine dialogue systems fall into two main categories: non-task-oriented dialogue systems and task-oriented dialogue systems. Task-oriented dialogue systems, also known as goal-driven dialogue systems, such as customer service robots and airline ticket booking systems, provide domain-specific services that help users complete tasks such as shopping and booking airline tickets. Human-machine dialogue systems can greatly reduce labor costs, simplify the human-machine interaction process and make applications more intelligent, and therefore have broad research and application value.
In a task-oriented dialogue system, the user conducts multiple rounds of dialogue with the system to complete a particular task. In knowledge-graph-based intelligent search over multi-round dialogue, the system must help the user find the knowledge items that satisfy the constraints in as few rounds as possible. In this process, the guidance given by the system is decisive for the path the dialogue follows. A good dialogue strategy guides users directly and simply to state the target attributes, which determine the constraints for knowledge graph matching and knowledge base search. The intelligence of the dialogue strategy therefore directly affects the search efficiency of the system. Industrial applications of task-oriented dialogue systems often lack domain-specific training data sets, so supervised training is not possible. Currently, most dialogue systems solve the completely cold start problem by formulating dialogue rules manually. Manually designed strategies can be set up quickly, but designing them consumes a great deal of manpower, and they generalize and transfer across domains poorly. How to build a dialogue robot that suits a completely cold start scenario while remaining intelligent and transferable across domains is the background of the present invention.
The mainstream models for implementing a dialogue strategy fall mainly into the following categories: dialogue strategies based on finite state automata (Zhu Xiao Yan. method research [ J ] based on the slot characteristic finite state automata in the dialog management, 2004,27(8): 1092-); slot-filling methods (free-switches, Tianhuafeng, Dubo, et al. study and implementation of framework-based dialog management models [ J ] computer engineering, 2005(13): 221-); and dialogue strategies based on probabilistic models (POMDP model and solution [ J ] for zhangbo, zai celebration, guobaining. spoken dialog systems. computer research and development, 2002(02): 90-97). A dialogue strategy based on finite state automata defines the interaction between the user and the system as a process that alternates between states and triggered actions, 'initial state -> action -> updated state -> ... -> termination state'. It is a typical system-initiative method: the pace of the dialogue is determined entirely by the system, the user must supply information in the order the system prescribes, and the approach lacks flexibility and extensibility. The slot-filling dialogue strategy improves on the finite-state approach to some extent by modeling the dialogue as a slot-filling process. It gives the user a relatively flexible input mode, supports mixed-initiative systems in which either the user or the system can lead, and suits relatively complex information-gathering scenarios. However, the complexity of the algorithm grows sharply as the number of slots increases, so the method is not suitable for more complex scenarios. For complex scenarios with a large number of slots, methods based on probabilistic models scale better.
When the state or action space is so large that traditional reinforcement learning struggles to explore it efficiently, deep reinforcement learning can greatly improve the convergence rate of the model.
Building on the three dialogue strategy approaches above, the invention proposes a multi-round dialogue strategy method that combines reinforcement learning with information entropy, addressing the following two problems in knowledge-graph-based search dialogue systems:
(1) In a task-oriented multi-round dialogue system, large-scale dialogue data for the specific domain is usually lacking because of the domain's specificity, so a supervised model cannot be trained. How to construct a dialogue strategy model for a cold start, before the system has collected dialogue data online in a real application environment, is an important problem.
(2) A knowledge-graph-based knowledge search dialogue system must generate a knowledge base query statement from the user goal, combine the external knowledge base with the knowledge graph to help the user find the required information, and produce the system's reply. The dialogue strategy task must therefore consider not only the current dialogue state but also the knowledge base query results and the knowledge graph matching results. How to construct a dialogue strategy model that takes knowledge-graph-based knowledge base search results into account is a major problem for the dialogue strategy task.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and to provide a dialogue strategy method for a task-oriented dialogue system.
The invention is realized by at least one of the following technical solutions.
A dialogue strategy method for a task-oriented dialogue system comprises the following steps:
S1, constructing a Markov decision model for a knowledge-graph-based search dialogue system in a vertical domain;
S2, obtaining a state value function matrix using the Bellman equation according to step S1;
S3, matching the knowledge graph and searching the knowledge base according to the dialogue state at the current moment to obtain results that satisfy the user goal;
S4, calculating the attribute information entropy of the search results;
S5, analyzing the calculated attribute information entropy;
S6, obtaining the next-round system action through the state transition matrix.
The multi-round dialogue strategy optimization method is applied to the dialogue strategy module of a multi-round dialogue music search system, realizing a complete multi-round dialogue music search system, which is integrated with a WeChat official account for demonstration.
Further, step S1 specifically comprises the following steps:
S11, defining the five-tuple (S, A, P, R, γ) of the domain according to the number of slots in the dialogue, where S is the set of all states including the termination states, A is the set of all actions, P is the state transition probability, R is the reward function, and γ is the discount factor in the interval [0, 1];
S12, defining custom examples, searching the database with them, and defining the termination states of the dialogue according to the database search results.
Further, in step S2, the state value function of a state s in the Markov decision process is the expectation of its return, i.e. $v(s) = \mathbb{E}[G_t \mid S_t = s]$, where $G_t$ is the return from time t and $S_t$ is the state at time t. The state value function matrix v(s) is calculated iteratively from the Bellman equation of the state value function:

$$v_\pi(s) = \sum_{a \in A} \pi(a \mid s) \left( R_s^a + \gamma \sum_{s' \in S} P_{ss'}^a \, v_\pi(s') \right)$$

where $\pi(a \mid s)$ is the probability distribution over actions that the strategy follows in a given state, $R_s^a$ is the immediate reward for performing action a in state s, $\gamma$ is the discount factor, $P_{ss'}^a$ is the probability that the state at the next moment is s' when the state at the current moment is s, $v_\pi(s')$ is the state value function of the next state s', and A is the set of all actions a.
Further, step S3 specifically comprises the following steps:
S31, receiving the triple output by the natural language understanding module of the dialogue system, namely the domain identification, the intent identification and the slot-value pairs, to obtain the understanding result for a single utterance;
S32, performing dialogue state tracking in combination with the historical slot-value state, updating the current dialogue state, and converting it into the state S_t;
S33, taking the constraints of the current user goal, namely the list of slot-value pairs, from the dialogue state tracker, converting them into a knowledge base query statement, and performing knowledge graph matching and knowledge search.
Further, step S4 specifically comprises the following steps:
S41, checking the number of search results: if the number is greater than N, calculating the attribute information entropy of the results; otherwise, directly instructing the system to present the search result list;
S42, calculating the information entropy of the attribute attr according to the formula $H(attr) = -\sum_{x \in \chi} p(x) \log p(x)$, where χ is the set of possible values of the attribute attr and p(x) is the probability that attr takes the value x.
Further, step S5 specifically comprises the following steps:
S51, counting the attributes whose information entropy is greater than 0: if there is no more than one, only one attribute can still distinguish the results, so in the next round the system should ask the user for the target constraint of that slot;
S52, if more than one attribute has information entropy greater than 0, looking up the column vector P_s corresponding to the current state s in the state transition matrix P, converting the transition probability vector P_s of state s into a 0/1 vector T_s in which the entries with transition probability > 0 take the value 1, and using T_s to filter the state value function matrix v, obtaining the next states s' that can be reached and their state values;
S53, choosing as the next state s' the one that maximizes the state value, i.e. $v^* = \max_{s'} v(s')$, where $v^*$ denotes the maximum state value; comparing s with s' and finding the slots that are 0 in s and 1 in s'; if s and s' differ in several slots, enumerating the combinations of those slots, filtering out the slots whose information entropy is 0 to obtain new candidates s', comparing their state values again, and taking the slot with the largest information entropy as the slot queried by the system action.
Further, in step S6 the selected slots are spliced into the query action that the system performs in the next round.
Compared with the prior art, the invention has at least the following beneficial effects:
1. The method constructs a Markov decision model of the dialogue strategy by defining a dialogue state set S, a system action set A, a state transition probability P, a reward function R and a discount factor γ;
2. The method combines the attribute information entropy of the music search results with the state value function to find the slot attribute with the highest search value, thereby determining the query action of the system;
3. The invention overcomes the difficulty of cold-starting a multi-round dialogue system. Without training on a domain-specific dialogue data set, it constructs a knowledge base search statement from the dialogue state of each round and derives a dynamic dialogue strategy from the attribute information entropy of the knowledge base search results and the knowledge graph matching results. By combining reinforcement learning with attribute information entropy, it constructs the dialogue strategy submodule of the dialogue management module in the dialogue system and improves the intelligence of the system.
Drawings
FIG. 1 is a flowchart of a dialogue strategy method for a task-oriented dialogue system according to an embodiment of the present invention;
FIG. 2 is a diagram of the process of calculating and selecting the information entropy of the music search results according to this embodiment.
Detailed Description
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict, and the present invention is further described in detail with reference to the drawings and the embodiments.
FIG. 1 shows a dialogue strategy method for a task-oriented dialogue system, which takes a music search task as an example and comprises the following steps:
S1, constructing a Markov decision model for a knowledge-graph-based search dialogue system in a vertical domain (music search, book search, etc.) and defining the five-tuple (S, A, P, R, γ) of the domain, where S is the set of all states (including termination states), A is the set of all actions, P is the state transition probability, R is the reward function, and γ is the discount factor in the interval [0, 1] (generally 0.9 by default);
Step S1 specifically comprises the following steps:
S11, defining the five-tuple (S, A, P, R, γ) of the domain according to the number of slots in the dialogue: the dialogue state set S, the system action set A, the state transition probability P, the reward function R and the discount factor γ;
(1) State set S
In the music search task, the dialogue state is represented by whether each of 6 slots has a value; each slot is either filled or unfilled. The dialogue state output by the dialogue state tracking module is converted into a numbered representation, giving 2^6 = 64 states in total. The states are encoded with six-bit 0/1 subscripts whose bits indicate, in order, whether the song, singer, album, lyricwriter, composer and label slots are filled; the states and their numbers are listed in Table 2. For example, if the current dialogue state is <singer(Zhoujilun), song(rice incense)>, the corresponding state code is S_110000. The state set is therefore S = {S_000000, S_100000, ..., S_111111}.
TABLE 2 numbered representation of dialog states
Figure BDA0002329838700000051
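The six-bit encoding described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation; the helper names `encode_state`, `state_index` and `ALL_STATES` are hypothetical, and only the slot order is taken from the text.

```python
# Slot order follows the six-bit code described in the text.
SLOTS = ("song", "singer", "album", "lyricwriter", "composer", "label")


def encode_state(filled):
    """Return the six-bit 0/1 code for a collection of filled slot names."""
    unknown = set(filled) - set(SLOTS)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    return "".join("1" if slot in filled else "0" for slot in SLOTS)


def state_index(code):
    """Map a code such as '110000' to an integer index in 0..63 (hypothetical numbering)."""
    return int(code, 2)


# All 2^6 = 64 state codes, from '000000' to '111111'.
ALL_STATES = [format(i, "06b") for i in range(2 ** len(SLOTS))]

# The example from the text: singer and song are filled.
print(encode_state({"singer", "song"}))
```

Keeping the state as a string of bits makes it easy to compare two states slot by slot later, when the strategy looks for the slots that differ between s and s'.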
(2) Action set A
The system actions are divided into the query action request() and the provide-song-list action offer(). According to the slot being queried, the query action is further divided into six actions: request(song) asks for the song name, request(singer) for the singer, request(album) for the album, request(lyricwriter) for the lyricist, request(composer) for the composer, and request(label) for the song type. The action set is therefore A = {offer(songs), request(attrs)}, where attrs ∈ {song, singer, album, lyricwriter, composer, label}.
(3) Transition probability P between states
The transition probability between a pair of states (s, s') is defined as P(s, s') = 1/N, where N is the number of possible next states s' and the current state s is a non-termination state. Because a user may provide information for more than one slot in a single dialogue round, the transition probabilities between dialogue states are defined as shown in Tables 3 and 4:
TABLE 3 dialog state transition probability example table
Figure BDA0002329838700000061
Table 4 dialog state transition probability example table
Figure BDA0002329838700000062
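As a rough sketch of the uniform 1/N transition rule, the following Python builds one row of the transition matrix. Tables 3 and 4 are not reproduced in this text, so the successor rule used here is an assumption: a next state keeps every filled slot of the current state and fills at least one more. The names `successors` and `transition_row` are hypothetical.

```python
from itertools import product

N_SLOTS = 6
# All 64 six-bit state codes, '000000' .. '111111'.
STATES = ["".join(bits) for bits in product("01", repeat=N_SLOTS)]


def successors(state):
    """Assumed rule: keep the filled slots of `state` and fill at least one more."""
    return [
        nxt for nxt in STATES
        if nxt != state and all(b == "1" for a, b in zip(state, nxt) if a == "1")
    ]


def transition_row(state, terminal=frozenset()):
    """Return {next_state: 1/N} for a non-termination state, else an empty row."""
    if state in terminal:
        return {}
    nxt = successors(state)
    return {s2: 1.0 / len(nxt) for s2 in nxt}


row = transition_row("110000")
print(len(row), row["111000"])
```

Under this assumed rule a state with k unfilled slots has 2^k - 1 successors, so the row for S_110000 spreads its probability over 15 states.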
(4) Immediate reward R
When the dialogue state reaches one of the 49 defined termination states, the user has completed the current task, and the reward for that transition is set to 100; the reward for a transition between any other pair of dialogue states is -1, as shown in Tables 5 and 6, where the termination states are shown in bold:
TABLE 5 example State transition reward matrix
Figure BDA0002329838700000063
TABLE 6 example State transition reward matrix
Figure BDA0002329838700000071
(5) Discount factor γ
The discount factor expresses how much future rewards matter to the current state, γ ∈ [0, 1]; this embodiment sets γ = 0.8.
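To make the effect of the discount factor concrete, the following worked example, offered as an illustration rather than as part of the original embodiment, evaluates the discounted return of a dialogue that reaches a termination state after n transitions. It assumes the rewards of item (4) (100 on the transition into a termination state, -1 on every other transition) and γ = 0.8; the symbol n is introduced here only for illustration.

```latex
\begin{align*}
G_t &= \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1}
     = -\sum_{k=0}^{n-2} \gamma^{k} + 100\,\gamma^{n-1}, \\
n = 1:&\quad G_t = 100, \\
n = 2:&\quad G_t = -1 + 0.8 \cdot 100 = 79, \\
n = 3:&\quad G_t = -1 - 0.8 + 0.64 \cdot 100 = 62.2.
\end{align*}
```

Shorter dialogues therefore earn a higher return, which is why the value function favors states that lead to termination in fewer rounds.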
S12, defining custom examples (as shown in Table 1), searching the database with them, and defining the termination states of the dialogue according to the database search results.
Table 1 example of finding termination state
Figure BDA0002329838700000072
The termination state represents the end of the dialogue: when a termination state is reached, the system should present the song list with offer() and end the dialogue. Based on experience and common knowledge, the following rules define the termination states of the dialogue:
1. when the user gives the song name (song) and any one other attribute of the song, the state is a termination state; there are 5 such states;
2. when the user gives the album name (album) together with the lyricist (lyricwriter) or the composer (composer), the state is a termination state; there are 2 such states;
3. when any three or more of the six attributes are known, the state is a termination state; there are 20 + 15 + 6 + 1 = 42 such states. In total, 5 + 2 + 42 = 49 termination states are defined, as shown in Tables 7 and 8:
TABLE 7 description of the termination status of a conversation
Figure BDA0002329838700000081
TABLE 8 description of the termination status of a conversation
Figure BDA0002329838700000082
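The three rules can be checked by enumerating all 64 state codes. The sketch below is an illustration under the slot order given earlier; the function name `is_terminal` is hypothetical, and rule 1 is read as "song plus at least one other attribute", with the larger combinations already covered by rule 3.

```python
from itertools import product

SLOTS = ("song", "singer", "album", "lyricwriter", "composer", "label")


def is_terminal(code):
    """Apply the three termination rules to a six-bit state code."""
    filled = {slot for slot, bit in zip(SLOTS, code) if bit == "1"}
    if len(filled) >= 3:                       # rule 3: any three or more attributes
        return True
    if "song" in filled and len(filled) >= 2:  # rule 1: song plus another attribute
        return True
    # rule 2: album together with the lyricist or the composer
    return filled in ({"album", "lyricwriter"}, {"album", "composer"})


TERMINAL = [
    code for code in ("".join(bits) for bits in product("01", repeat=6))
    if is_terminal(code)
]
print(len(TERMINAL))
```

The enumeration yields 5 + 2 + 42 = 49 termination states, matching the count used when the reward is defined in item (4).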
S2, according to step S1, obtaining the state value function matrix using the Bellman equation, specifically as follows:
S21, in the Markov decision process, the state value function of a state s is the expectation of its return, i.e. $v(s) = \mathbb{E}[G_t \mid S_t = s]$, where $G_t$ is the return from time t and $S_t$ is the state at time t. The state value function matrix v(s) is calculated iteratively from the Bellman equation of the state value function:

$$v_\pi(s) = \sum_{a \in A} \pi(a \mid s) \left( R_s^a + \gamma \sum_{s' \in S} P_{ss'}^a \, v_\pi(s') \right)$$

where $\pi(a \mid s)$ is the probability distribution over actions that the strategy follows in a given state, $R_s^a$ is the immediate reward for performing action a in state s, $\gamma$ is the discount factor, $P_{ss'}^a$ is the probability that the state at the next moment is s' when the state at the current moment is s, $v_\pi(s')$ is the state value function of the next state s', and A is the set of all actions a.
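The iterative calculation in S21 corresponds to what is commonly called iterative policy evaluation. The sketch below applies it to a hypothetical three-state chain rather than the 64-state music model; the dictionaries `pi`, `reward` and `trans`, the state names and the action name `ask` are assumptions made for illustration, and the reward is attached to (state, action) pairs as in $R_s^a$.

```python
def evaluate_policy(states, actions, pi, reward, trans, gamma=0.8,
                    tol=1e-9, max_iter=10_000):
    """Iterate v(s) <- sum_a pi(a|s) * (R_s^a + gamma * sum_s' P_ss'^a * v(s'))."""
    v = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_v = {}
        delta = 0.0
        for s in states:
            total = 0.0
            for a in actions:
                # Expected value of the next state under action a.
                expected_next = sum(
                    p * v[s2] for s2, p in trans.get((s, a), {}).items()
                )
                total += pi.get((s, a), 0.0) * (
                    reward.get((s, a), 0.0) + gamma * expected_next
                )
            new_v[s] = total
            delta = max(delta, abs(total - v[s]))
        v = new_v
        if delta < tol:  # stop once the values no longer change
            break
    return v


# Hypothetical chain A -> B -> T, where T is a termination state with no actions.
states = ["A", "B", "T"]
actions = ["ask"]
pi = {("A", "ask"): 1.0, ("B", "ask"): 1.0}
reward = {("A", "ask"): -1.0, ("B", "ask"): 100.0}
trans = {("A", "ask"): {"B": 1.0}, ("B", "ask"): {"T": 1.0}}

values = evaluate_policy(states, actions, pi, reward, trans, gamma=0.8)
print(values)
```

In this toy chain the state one step from termination is worth 100 and the state two steps away is worth -1 + 0.8 × 100 = 79, so states closer to termination receive higher values.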
S3, matching the knowledge graph and searching the knowledge base according to the dialogue state at the current moment to obtain results that satisfy the user goal, specifically comprising the following steps:
S31, receiving the triple output by the natural language understanding module of the dialogue system, namely the domain identification, the intent identification and the slot-value pairs, to obtain the understanding result for a single utterance;
S32, performing dialogue state tracking in combination with the historical slot-value state, updating the current dialogue state, and converting it into the state S_t;
S33, taking the constraints of the current user goal, namely the list of slot-value pairs, from the dialogue state tracker, converting them into a knowledge base query statement, and performing knowledge graph matching and knowledge search. The conversion generates the corresponding query constraint from the value of each slot.
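Steps S31 to S33 can be illustrated with a small in-memory sketch. The source does not specify the knowledge base or its query language, so a list of dictionaries stands in for it here; `update_state`, `state_code`, `search`, the `RECORDS` rows and the domain and intent labels are all hypothetical.

```python
SLOTS = ("song", "singer", "album", "lyricwriter", "composer", "label")


def update_state(history, nlu_triple):
    """S31-S32: merge this turn's slot-value pairs into the tracked slot values."""
    _domain, _intent, slot_values = nlu_triple
    merged = dict(history)
    merged.update({k: v for k, v in slot_values.items() if k in SLOTS and v})
    return merged


def state_code(tracked):
    """Six-bit 0/1 code of the tracked state, in the slot order above."""
    return "".join("1" if tracked.get(slot) else "0" for slot in SLOTS)


def search(tracked, records):
    """S33: keep the records that satisfy every slot-value constraint."""
    return [r for r in records if all(r.get(k) == v for k, v in tracked.items())]


# Hypothetical knowledge base rows, for illustration only.
RECORDS = [
    {"song": "Song 1", "singer": "Singer A", "album": "Album X"},
    {"song": "Song 2", "singer": "Singer A", "album": "Album Y"},
    {"song": "Song 3", "singer": "Singer B", "album": "Album Z"},
]

tracked = update_state({}, ("music", "search", {"singer": "Singer A"}))
tracked = update_state(tracked, ("music", "search", {"album": "Album Y"}))
print(state_code(tracked), search(tracked, RECORDS))
```

In a real system the `search` function would be replaced by a query against the knowledge graph; the equality filter here only shows how each filled slot contributes one constraint.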
S4, as shown in FIG. 2, calculating the attribute information entropy of the search results, specifically comprising the following steps:
S41, checking the number of search results: if the number is greater than 10, calculating the attribute information entropy of the results; otherwise, directly instructing the system to present the search result list;
S42, calculating the information entropy of the attribute attr according to the formula $H(attr) = -\sum_{x \in \chi} p(x) \log p(x)$, where attr is the name of the attribute, χ is the set of possible values of attr, and p(x) is the probability that attr takes the value x.
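The entropy calculation of S41 and S42 can be sketched as follows. The source does not state the base of the logarithm, so base 2 is assumed here; p(x) is estimated as the relative frequency of value x among the search results, and the function names and sample rows are hypothetical.

```python
import math
from collections import Counter


def attribute_entropy(results, attr):
    """H(attr) = -sum_x p(x) * log2 p(x) over the values of attr in the results."""
    values = [r[attr] for r in results if r.get(attr) is not None]
    if not values:
        return 0.0
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(values).values())


def entropies_if_needed(results, attrs, n=10):
    """S41: compute entropies only when there are more than n results."""
    if len(results) <= n:
        return None  # few enough results: the system can present the list directly
    return {attr: attribute_entropy(results, attr) for attr in attrs}


# Hypothetical search results: two singers, one label.
sample = [
    {"singer": "A", "label": "pop"},
    {"singer": "A", "label": "pop"},
    {"singer": "B", "label": "pop"},
    {"singer": "B", "label": "pop"},
]
print(entropies_if_needed(sample, ["singer", "label"], n=2))
```

An attribute whose value is the same in every result has entropy 0 and cannot narrow the search further, which is why such attributes are excluded when the next slot to ask about is chosen.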
S5, analyzing the calculated attribute information entropy, specifically comprising the following steps:
S51, counting the attributes whose information entropy is greater than 0: if there is no more than one, only one attribute can still distinguish the results, so in the next round the system should ask the user for the target constraint of that slot;
S52, if more than one attribute has information entropy greater than 0, looking up the column vector P_s corresponding to the current state s in the state transition matrix P and converting the transition probability vector P_s of state s into a 0/1 vector T_s, in which the entries with transition probability > 0 take the value 1 (applied to the whole state transition matrix, this yields a 0/1 matrix with the same dimensions as the state transition matrix); using T_s to filter the state value function matrix v, obtaining the next states s' that can be reached and their state values; the filtering sets to 0 every node whose state transition probability is 0;
S53, choosing as the next state s' the one that maximizes the state value, i.e. $v^* = \max_{s'} v(s')$, where $v^*$ denotes the maximum state value; comparing s with s' and finding the slots that are 0 in s and 1 in s'; if s and s' differ in several slots, enumerating the combinations of those slots, filtering out the slots whose information entropy is 0 to obtain new candidates s', comparing their state values again, and taking the slot with the largest information entropy as the slot queried by the system action.
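One possible reading of S51 to S53 is sketched below. It is an interpretation under stated assumptions, not the definitive procedure: candidate next states are those with non-zero transition probability that newly fill only slots with positive entropy, the candidate with the highest state value is chosen, and among its newly filled slots the one with the largest entropy is queried. The names `choose_slot`, `newly_filled` and `format_action`, and the fallback to offer(songs) when no attribute is informative, are hypothetical.

```python
SLOTS = ("song", "singer", "album", "lyricwriter", "composer", "label")


def newly_filled(state, nxt):
    """Slots that are 0 in `state` and 1 in `nxt`."""
    return [SLOTS[i] for i, (a, b) in enumerate(zip(state, nxt))
            if a == "0" and b == "1"]


def choose_slot(state, entropy, value, trans_row):
    informative = [s for s in SLOTS if entropy.get(s, 0.0) > 0.0]
    if len(informative) <= 1:                      # S51: at most one useful attribute
        return informative[0] if informative else None
    # S52: 0/1 mask T_s over the transition row of the current state.
    mask = {nxt: 1 if p > 0 else 0 for nxt, p in trans_row.items()}
    candidates = []
    for nxt, keep in mask.items():
        slots = newly_filled(state, nxt)
        # Drop next states that would fill a slot with zero entropy.
        if keep and slots and all(s in informative for s in slots):
            candidates.append(nxt)
    if not candidates:
        return max(informative, key=lambda s: entropy[s])
    best = max(candidates, key=lambda nxt: value.get(nxt, 0.0))   # S53
    return max(newly_filled(state, best), key=lambda s: entropy[s])


def format_action(slot):
    """Splice the chosen slot into a system action string."""
    return f"request({slot})" if slot else "offer(songs)"


# Hypothetical inputs: the singer is known; song and album still vary in the results.
entropy = {"song": 2.0, "album": 1.0}
trans_row = {"110000": 0.25, "011000": 0.25, "111000": 0.25, "010100": 0.25}
value = {"110000": 100.0, "011000": 79.0, "111000": 90.0, "010100": 79.0}
print(format_action(choose_slot("010000", entropy, value, trans_row)))
```

The value function steers the choice toward states that end the dialogue sooner, while the entropy filter ensures the system never asks about an attribute that cannot narrow the current result set.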
S6, obtaining the next-round system action through the state transition matrix, specifically by splicing the selected slots into the query action that the system performs in the next round.
The method constructs an effective multi-turn dialogue management model and has good usability.
The multi-round dialogue strategy optimization method is applied to the dialogue strategy module of a multi-round dialogue music search system, realizing a complete multi-round dialogue music search system, which is integrated with a WeChat official account for demonstration.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that various equivalent changes, modifications, substitutions and alterations can be made herein without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.

Claims (7)

1. A dialogue strategy method for a task-oriented dialogue system, characterized by comprising the following steps:
S1, constructing a Markov decision model for a knowledge-graph-based search dialogue system in a vertical domain;
S2, obtaining a state value function matrix using the Bellman equation according to step S1;
S3, matching the knowledge graph and searching the knowledge base according to the dialogue state at the current moment to obtain results that satisfy the user goal;
S4, calculating the attribute information entropy of the search results;
S5, analyzing the calculated attribute information entropy;
S6, obtaining the next-round system action through the state transition matrix.
2. The dialogue strategy method for a task-oriented dialogue system according to claim 1, wherein step S1 specifically comprises the following steps:
S11, defining the five-tuple (S, A, P, R, γ) of the domain according to the number of slots in the dialogue, where S is the set of all states including the termination states, A is the set of all actions, P is the state transition probability, R is the reward function, and γ is the discount factor in the interval [0, 1];
S12, defining custom examples, searching the database with them, and defining the termination states of the dialogue according to the database search results.
3. The dialogue strategy method for a task-oriented dialogue system according to claim 1, characterized in that in step S2 the state value function of a state s in the Markov decision process is the expectation of its return, i.e. $v(s) = \mathbb{E}[G_t \mid S_t = s]$, where $G_t$ is the return from time t and $S_t$ is the state at time t, and the state value function matrix v(s) is calculated iteratively from the Bellman equation of the state value function:

$$v_\pi(s) = \sum_{a \in A} \pi(a \mid s) \left( R_s^a + \gamma \sum_{s' \in S} P_{ss'}^a \, v_\pi(s') \right)$$

where $\pi(a \mid s)$ is the probability distribution over actions that the strategy follows in a given state, $R_s^a$ is the immediate reward for performing action a in state s, $\gamma$ is the discount factor, $P_{ss'}^a$ is the probability that the state at the next moment is s' when the state at the current moment is s, $v_\pi(s')$ is the state value function of the next state s', and A is the set of all actions a.
4. The dialog strategy method for a task-oriented dialog system according to claim 1, wherein the step S3 specifically comprises the steps of:
S31, receiving the triple output by the natural language understanding module in the dialog system, namely domain recognition, intent recognition and slot-value pairs, to obtain the understanding result for a single sentence;
S32, tracking the dialog state in combination with the historical slot-value state, updating the current dialog state, and converting it into the state S_t;
S33, taking the constraints of the current user goal, namely the slot-value pair list, from the dialog state tracker, converting them into a knowledge base query statement, and performing knowledge graph matching and knowledge search.
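Steps S31 to S33 can be sketched as below, under several assumptions: the NLU triple is a `(domain, intent, slot_values)` tuple, the state S_t is a 0/1 tuple over a fixed slot order, and the knowledge-base query is replaced by an in-memory filter. The slot names, the `KB` records and all function names are hypothetical. A real system would generate a query statement for its knowledge graph instead.

```python
SLOTS = ("cuisine", "area", "price")  # hypothetical slots for a restaurant domain


def update_state(history, nlu_result):
    """S32: merge the current utterance's slot-value pairs into the historical ones."""
    _domain, _intent, slot_values = nlu_result
    merged = dict(history)
    merged.update({k: v for k, v in slot_values.items() if k in SLOTS})
    return merged


def to_state_vector(slot_values):
    """S_t as a 0/1 tuple: 1 where the user has already constrained the slot."""
    return tuple(1 if slot in slot_values else 0 for slot in SLOTS)


def search(knowledge_base, constraints):
    """S33 stand-in: keep the records that satisfy every slot-value constraint."""
    return [rec for rec in knowledge_base
            if all(rec.get(k) == v for k, v in constraints.items())]


KB = [
    {"name": "A", "cuisine": "thai", "area": "north", "price": "cheap"},
    {"name": "B", "cuisine": "thai", "area": "south", "price": "cheap"},
    {"name": "C", "cuisine": "italian", "area": "north", "price": "expensive"},
]

history = {"price": "cheap"}                         # slots filled in earlier turns
nlu = ("restaurant", "inform", {"cuisine": "thai"})  # S31: output for one sentence
slot_values = update_state(history, nlu)
state = to_state_vector(slot_values)
results = search(KB, slot_values)
print(state, [r["name"] for r in results])
```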
5. The dialog strategy method for a task-oriented dialog system according to claim 1, wherein the step S4 specifically comprises the steps of:
S41, checking the number of search results; if the number is greater than N, calculating the information entropy of the attributes of the results, and if it is not greater than N, directly instructing the system to present the search result list;
S42, calculating the information entropy of the attribute attr according to the formula $H(attr) = -\sum_{x \in \chi} p(x) \log p(x)$, wherein χ represents the set of possible values of the attribute attr, and p(x) represents the probability that the attribute attr takes the value x.
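The entropy computation of S41 and S42 might look like the following sketch. It assumes p(x) is estimated as the relative frequency of value x among the search results and uses the natural logarithm, since the claim does not state a base. The function names and the sample records are hypothetical.

```python
import math
from collections import Counter


def attribute_entropy(results, attr):
    """H(attr) = -sum over x of p(x) * log p(x), with p(x) the share of results having value x."""
    counts = Counter(rec[attr] for rec in results)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def entropies_if_needed(results, attrs, n):
    """S41: compute entropies only when more than n results remain, else return None."""
    if len(results) <= n:
        return None  # few enough results: the caller presents the list directly
    return {attr: attribute_entropy(results, attr) for attr in attrs}


sample = [
    {"name": "A", "area": "north", "price": "cheap"},
    {"name": "B", "area": "south", "price": "cheap"},
]
H = entropies_if_needed(sample, ["area", "price"], n=1)
print(H)
```

Here `area` splits the two results evenly, so its entropy is log 2, while `price` takes a single value and has entropy 0: asking about price would not narrow the results.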
6. The dialog strategy method for a task-oriented dialog system according to claim 1, wherein the step S5 specifically comprises the steps of:
S51, counting the attributes whose information entropy is greater than 0; if the count is not greater than 1, only one attribute is distinguishable, so the next dialogue turn should ask the user for the target constraint on that slot;
S52, if more than one attribute has an information entropy greater than 0, looking up the column vector P_s corresponding to the current state s in the state transition matrix P, converting the transition probability vector P_s of the state s into a 0-1 vector T_s in which the nodes with transition probability > 0 take the value 1, and using T_s to filter the state value function matrix v to obtain the possible next states s' and their corresponding state values;
S53, selecting the next state s' that maximizes the state value, i.e. $v^* = \max v(s')$, wherein $v^*$ represents the maximum state value function; comparing s with s' and finding the slots that are 0 in s and 1 in s'; if the values differ on several slots, filtering out the slots whose information entropy is 0 over all permutations and combinations to obtain a new s', then comparing the state values again, and taking the slots whose information entropy is not 0 as the slots queried by the system action.
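One possible reading of S51 to S53 is sketched below. It assumes states are 0/1 tuples, the transition vector for state s is stored as `P[s] = {s_next: probability}`, the state values are a dictionary `v`, and the slots finally queried are those with non-zero entropy. The "permutation and combination" filtering is interpreted here as resetting zero-entropy slots in each candidate s' before comparing state values. That interpretation and the name `choose_slots` are assumptions, not the patent's own wording.

```python
def choose_slots(s, slot_names, entropy, P, v):
    """Return the slot names the system should ask about next."""
    informative = [name for name in slot_names if entropy.get(name, 0.0) > 0]
    if len(informative) <= 1:
        return informative  # S51: at most one distinguishing attribute

    best_slots, best_value = [], float("-inf")
    for s_next, prob in P.get(s, {}).items():
        if prob <= 0:
            continue  # S52: the 0-1 vector T_s keeps only reachable next states
        # Slots that change from 0 to 1, with zero-entropy slots dropped.
        asked = [i for i, (old, new) in enumerate(zip(s, s_next))
                 if old == 0 and new == 1 and entropy.get(slot_names[i], 0.0) > 0]
        if not asked:
            continue
        # New s' after filtering: current state plus only the informative slots.
        filtered = tuple(1 if (old == 1 or i in asked) else 0 for i, old in enumerate(s))
        value = v.get(filtered, float("-inf"))
        if value > best_value:  # S53: keep the candidate with the largest state value
            best_slots, best_value = [slot_names[i] for i in asked], value
    return best_slots


slot_names = ("cuisine", "area", "price")
entropy = {"cuisine": 0.0, "area": 0.69, "price": 0.45}
P = {(1, 0, 0): {(1, 1, 0): 0.5, (1, 0, 1): 0.3, (1, 1, 1): 0.2, (0, 0, 0): 0.0}}
v = {(1, 1, 0): 2.0, (1, 0, 1): 3.5, (1, 1, 1): 1.0}
next_slots = choose_slots((1, 0, 0), slot_names, entropy, P, v)
print(next_slots)
```

In this toy setup the state reached by filling only `price` has the highest value among the reachable states, so `price` is the slot to ask about.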
7. The dialog strategy method for a task-oriented dialog system according to claim 1, wherein the step S6 specifically comprises combining the selected slots into the action with which the system queries the user in the next round.
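As a small illustration of S6, the selected slots can be joined into a single system action. The `request(...)` notation is a common dialogue-act convention assumed here for illustration; the claims do not specify the action format, and `build_request_action` is a hypothetical name.

```python
def build_request_action(slots):
    """Join the selected slot names into one system action string."""
    return "request(" + ", ".join(slots) + ")"


action = build_request_action(["area", "price"])
print(action)
```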
CN201911331882.9A 2019-12-21 2019-12-21 Dialogue strategy method for task-oriented dialogue system Active CN111159371B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911331882.9A CN111159371B (en) 2019-12-21 2019-12-21 Dialogue strategy method for task-oriented dialogue system
PCT/CN2020/142579 WO2021121436A1 (en) 2019-12-21 2020-12-31 Dialogue strategy method for task-oriented dialogue system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911331882.9A CN111159371B (en) 2019-12-21 2019-12-21 Dialogue strategy method for task-oriented dialogue system

Publications (2)

Publication Number Publication Date
CN111159371A (en) 2020-05-15
CN111159371B (en) 2023-04-21

Family

ID=70557681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911331882.9A Active CN111159371B (en) 2019-12-21 2019-12-21 Dialogue strategy method for task-oriented dialogue system

Country Status (2)

Country Link
CN (1) CN111159371B (en)
WO (1) WO2021121436A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111816173A (en) * 2020-06-01 2020-10-23 珠海格力电器股份有限公司 Dialogue data processing method, device, storage medium and computer equipment
CN112052322A (en) * 2020-09-03 2020-12-08 哈尔滨工业大学 Intelligent robot conversation strategy generation method based on particle calculation
CN112364147A (en) * 2020-12-01 2021-02-12 四川长虹电器股份有限公司 Cross-domain multi-turn dialogue method based on knowledge graph and implementation system
WO2021121436A1 (en) * 2019-12-21 2021-06-24 华南理工大学 Dialogue strategy method for task-oriented dialogue system
CN113239171A (en) * 2021-06-07 2021-08-10 平安科技(深圳)有限公司 Method and device for updating conversation management system, computer equipment and storage medium
CN114201286A (en) * 2022-02-16 2022-03-18 成都明途科技有限公司 Task processing method and device, electronic equipment and storage medium
CN114862527A (en) * 2022-06-17 2022-08-05 阿里巴巴(中国)有限公司 Object recommendation method and device
CN115577089A (en) * 2022-11-24 2023-01-06 零犀(北京)科技有限公司 Method, device, equipment and storage medium for optimizing nodes in conversation process

Families Citing this family (3)

Publication number Priority date Publication date Assignee Title
CN113688223A (en) * 2021-09-10 2021-11-23 上海汽车集团股份有限公司 Task type conversation management method and device
CN115809669B (en) * 2022-12-30 2024-03-29 联通智网科技股份有限公司 Dialogue management method and electronic equipment
CN117407514B (en) * 2023-11-28 2024-07-09 星环信息科技(上海)股份有限公司 Solution plan generation method, device, equipment and storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
CN105788593A (en) * 2016-02-29 2016-07-20 中国科学院声学研究所 Method and system for generating dialogue strategy
CN108282587A (en) * 2018-01-19 2018-07-13 重庆邮电大学 Mobile customer service dialogue management method under being oriented to strategy based on status tracking

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
CN105068661B (en) * 2015-09-07 2018-09-07 百度在线网络技术(北京)有限公司 Man-machine interaction method based on artificial intelligence and system
CN105183850A (en) * 2015-09-07 2015-12-23 百度在线网络技术(北京)有限公司 Information querying method and device based on artificial intelligence
CN109543010A (en) * 2018-10-22 2019-03-29 拓科(武汉)智能技术股份有限公司 The interactive method and system of fused data library inquiry
CN111159371B (en) * 2019-12-21 2023-04-21 华南理工大学 Dialogue strategy method for task-oriented dialogue system

Non-Patent Citations (1)

Title
钟可立; 王小捷: "基于信息熵的POMDP模型观测函数估计" (Estimation of the observation function of a POMDP model based on information entropy) *

Cited By (12)

Publication number Priority date Publication date Assignee Title
WO2021121436A1 (en) * 2019-12-21 2021-06-24 华南理工大学 Dialogue strategy method for task-oriented dialogue system
CN111816173A (en) * 2020-06-01 2020-10-23 珠海格力电器股份有限公司 Dialogue data processing method, device, storage medium and computer equipment
CN111816173B (en) * 2020-06-01 2024-06-07 珠海格力电器股份有限公司 Dialogue data processing method and device, storage medium and computer equipment
CN112052322A (en) * 2020-09-03 2020-12-08 哈尔滨工业大学 Intelligent robot conversation strategy generation method based on particle calculation
CN112364147A (en) * 2020-12-01 2021-02-12 四川长虹电器股份有限公司 Cross-domain multi-turn dialogue method based on knowledge graph and implementation system
CN113239171A (en) * 2021-06-07 2021-08-10 平安科技(深圳)有限公司 Method and device for updating conversation management system, computer equipment and storage medium
WO2022257468A1 (en) * 2021-06-07 2022-12-15 平安科技(深圳)有限公司 Method and apparatus for updating dialogue management system, and computer device and storage medium
CN113239171B (en) * 2021-06-07 2023-08-01 平安科技(深圳)有限公司 Dialogue management system updating method, device, computer equipment and storage medium
CN114201286A (en) * 2022-02-16 2022-03-18 成都明途科技有限公司 Task processing method and device, electronic equipment and storage medium
CN114201286B (en) * 2022-02-16 2022-04-26 成都明途科技有限公司 Task processing method and device, electronic equipment and storage medium
CN114862527A (en) * 2022-06-17 2022-08-05 阿里巴巴(中国)有限公司 Object recommendation method and device
CN115577089A (en) * 2022-11-24 2023-01-06 零犀(北京)科技有限公司 Method, device, equipment and storage medium for optimizing nodes in conversation process

Also Published As

Publication number Publication date
CN111159371B (en) 2023-04-21
WO2021121436A1 (en) 2021-06-24

Similar Documents

Publication Publication Date Title
CN111159371A (en) Dialogue strategy method for task-oriented dialogue system
Lemon et al. Machine learning for spoken dialogue systems
Lehmann et al. Autosparql: Let users query your knowledge base
CN103049792B (en) Deep-neural-network distinguish pre-training
CN111046187B (en) Sample knowledge graph relation learning method and system based on confrontation type attention mechanism
CN110245238B (en) Graph embedding method and system based on rule reasoning and syntax mode
CN109960722B (en) Information processing method and device
CN113158691B (en) Dialogue method and device based on mixed knowledge management and electronic equipment
CN110807566A (en) Artificial intelligence model evaluation method, device, equipment and storage medium
CN110059170B (en) Multi-turn dialogue online training method and system based on user interaction
CN111400461A (en) Intelligent customer service problem matching method and device
CN112328808A (en) Knowledge graph-based question and answer method and device, electronic equipment and storage medium
CN111143539A (en) Knowledge graph-based question-answering method in teaching field
CN111125316A (en) Knowledge base question-answering method integrating multiple loss functions and attention mechanism
CN117542509A (en) Multi-round consultation method based on diagnosis and treatment guidance tree and diagnosis and treatment reasoning engine
Lee et al. A situation-based dialogue management using dialogue examples
CN110909124A (en) Hybrid enhanced intelligent demand accurate sensing method and system based on human-in-loop
Smith How can research on past urban adaptations be made useful for sustainability science?
CN116028610B (en) N-element complex query embedding method on super-relation knowledge graph
CN116737911A (en) Deep learning-based hypertension question-answering method and system
Fan et al. Integrating multi-granularity model and similarity measurement for transforming process data into different granularity knowledge
CN116403608A (en) Speech emotion recognition method based on multi-label correction and space-time collaborative fusion
CN110990426B (en) RDF query method based on tree search
CN110442690B (en) Query optimization method, system and medium based on probabilistic reasoning
Zhou et al. DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered by Large Language Models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant