CN111104502A

CN111104502A - Dialogue management method, system, electronic device and storage medium for outbound system

Info

Publication number: CN111104502A
Application number: CN201911346577.7A
Authority: CN
Inventors: 江小林; 罗超; 胡泓
Original assignee: Ctrip Computer Technology Shanghai Co Ltd
Current assignee: Ctrip Computer Technology Shanghai Co Ltd
Priority date: 2019-12-24
Filing date: 2019-12-24
Publication date: 2020-05-05

Abstract

The invention discloses a dialogue management method and system, an electronic device, and a storage medium for an outbound system. The dialogue management method comprises the following steps: obtaining semantic information; updating the current dialogue state according to the semantic information; inputting the dialogue state into a prestored state machine for a state jump; and outputting, by the outbound system, questions and answers according to the state reached after the state machine jumps, while reinforcement learning is used to improve the generalization of the dialogue system. With this method, the user and the outbound system can communicate in spoken language and the outbound system can answer the user's questions, which makes the communication more natural, improves user acceptance, and saves enterprises a large amount of labor cost. The outbound system can also transfer the call when the user is reached through an extension, so the design is flexible.

Description

Dialogue management method, system, electronic device and storage medium for outbound system

Technical Field

The present invention relates to the field of natural language processing technologies, and in particular, to a dialog management method and system for an outbound call system, an electronic device, and a storage medium.

Background

For some service industries, such as the OTA (online travel agency) industry, the demand for outbound calls is very large, and many OTA enterprises have their own outbound systems. Existing outbound systems use an Interactive Voice Response (IVR) system: the user can only give feedback by pressing keys and cannot communicate with the outbound system in spoken language. Existing outbound systems also cannot reply to the user's questions and cannot transfer the call when the user is reached through an extension. Their design is rigid, the communication is not natural, and user acceptance is low.

Disclosure of Invention

The invention aims to overcome the defect that the outbound system cannot communicate with a user in spoken language in the prior art, and provides a dialogue management method, a dialogue management system, electronic equipment and a storage medium of the outbound system.

The invention solves the technical problems through the following technical scheme:

a dialog management method of an outbound system, the dialog management method comprising:

obtaining semantic information;

updating the current conversation state according to the semantic information;

inputting the dialogue state into a prestored state machine for state jumping;

and the outbound system outputs questions and answers according to the state reached after the state machine jumps.

Preferably, the dialog management method further comprises the steps of:

after the state machine jumps to a preset state, the semantic information is input into a pre-trained reinforcement learning model;

and the outbound system outputs questions and answers according to the output result of the reinforcement learning model.

Preferably, before the step of obtaining semantic information, the dialog management method further includes the steps of:

and acquiring slot information in the domain information, wherein the slot information is used for forming the conversation state.

Preferably, the state machine performs state jump according to the current dialog state and the trigger event;

the dialog state comprises a plurality of sub-dialog states;

the sub-dialog state is structured data and comprises attributes and member variables;

the attributes include state attributes, and the member variables include state transfer functions;

the sub-dialog state updates the dialog state according to the trigger event; and/or,

the triggering event is a user intention in the semantic information.

Preferably, the dialog management method further includes a step of training a reinforcement learning model, the step of training the reinforcement learning model including:

presetting a reward mechanism to carry out automatic data annotation;

collecting first dialogue data by adopting a hot start mode;

collecting second dialogue data by adopting an epsilon-greedy (greedy algorithm) exploration mode;

and vectorizing and representing the dialogue states and actions in the first dialogue data and the second dialogue data, inputting the dialogue states and actions into a reinforcement learning model, and acquiring a model output result.

Preferably, the semantic information includes the dialog state and the action;

after the state machine jumps to a preset state, vectorizing and representing the dialogue state and the action, and inputting the dialogue state and the action into a pre-trained reinforcement learning model;

the outbound system outputs the question and answer output by the pre-trained reinforcement learning model; and/or,

the reinforcement learning model is a DQN (deep Q network) algorithm model.

A dialog management system for an outbound system, the dialog management system comprising:

the semantic information acquisition module is used for acquiring semantic information;

the dialogue state updating module is used for updating the current dialogue state according to the semantic information;

the state skipping module is used for inputting the dialogue state into a prestored state machine for state skipping;

and the first question-answer output module is used for outputting questions and answers by the outbound system according to the state reached after the state machine jumps.

Preferably, the dialog management system further comprises the following modules:

the first model input module is used for inputting the semantic information into a pre-trained reinforcement learning model after the state machine jumps to a preset state;

and the second question-answer output module is used for outputting a question answer by the outbound system according to the output result of the reinforcement learning model.

and the slot information acquisition module is used for acquiring slot information in the field information, and the slot information is used for forming the conversation state.

the dialog state comprises a plurality of sub-dialog states;

the triggering event is a user intention in the semantic information.

Preferably, the dialog management system further comprises a training model module, the training model module comprising:

the data marking unit is used for presetting a reward mechanism to carry out automatic data marking;

the first data collection unit is used for collecting first dialogue data in a hot start mode;

the second data collection unit is used for collecting second dialogue data in an epsilon-greedy exploration mode;

and the model result acquisition unit is used for vectorizing and representing the conversation state and the action in the first conversation data and the second conversation data, inputting the vectorized and represented conversation state and action into the reinforcement learning model and acquiring a model output result.

Preferably, the semantic information includes the dialog state and the action;

the first model input module is used for vectorizing and representing the dialogue state and the action and inputting the dialogue state and the action into a pre-trained reinforcement learning model after the state machine jumps to a preset state;

the second question-answer output module is used for outputting, through the outbound system, the question and answer output by the pre-trained reinforcement learning model; and/or,

the reinforcement learning model is a DQN algorithm model.

An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the dialogue management method of the outbound system described in any one of the above when executing the program.

A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the dialog management method of the outbound system of any of the above mentioned.

The positive effects of the invention are as follows: with the dialogue management method of the outbound system, the user and the outbound system can communicate in spoken language and the outbound system can answer the user's questions, which makes the communication more natural, improves user acceptance, and saves enterprises a large amount of labor cost; the outbound system can also transfer the call when the user is reached through an extension, so the design is flexible.

Drawings

Fig. 1 is a flowchart of the dialogue management method of the outbound system according to preferred embodiment 1 of the present invention.

Fig. 2 is a flowchart of the dialogue management method of the outbound system according to preferred embodiment 2 of the present invention.

Fig. 3 is a flowchart of the dialogue management method of the outbound system according to preferred embodiment 3 of the present invention.

Fig. 4 is a state transition diagram of the state machine according to preferred embodiment 3 of the present invention.

Fig. 5 is a block diagram of the dialogue management system of the outbound system according to preferred embodiment 4 of the present invention.

Fig. 6 is a block diagram of the dialogue management system of the outbound system according to preferred embodiment 5 of the present invention.

Fig. 7 is a block diagram of the dialogue management system of the outbound system according to preferred embodiment 6 of the present invention.

Fig. 8 is a schematic structural diagram of the electronic device according to preferred embodiment 7 of the present invention.

Detailed Description

The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.

Example 1

The embodiment provides a dialogue management method of an outbound system. As shown in fig. 1, the dialog management method of the outbound system of this embodiment includes the following steps:

step S101, obtaining semantic information;

step S102, updating the current conversation state according to the semantic information;

step S103, inputting the dialogue state into a prestored state machine for state jump;

and step S104, the outbound system outputs questions and answers according to the state reached after the state machine jumps.

With the dialogue management method of the outbound system of this embodiment, the user and the outbound system can communicate in spoken language and the outbound system can answer the user's questions, which makes the communication more natural, improves user acceptance, and saves enterprises a large amount of labor cost. The outbound system can also transfer the call when the user is reached through an extension, so the design is flexible.
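As a non-limiting illustration, one dialog turn through steps S101 to S104 could be sketched in Java as follows. The class name, the state labels and the reply texts are hypothetical and chosen only for this sketch; the intent strings `order_confirm` and `order_not_confirm` follow the intent constants given later in the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of one dialog turn (steps S101-S104); all names are illustrative.
class DialogTurnSketch {
    // Hypothetical state labels, not taken from the description.
    static final String ASK_RECEIVE_ORDER = "ask_receive_order";
    static final String ASK_ORAL_RESERVATION = "ask_oral_reservation";
    static final String ASK_NAME = "ask_name";

    // Dialog state: the current state-machine state plus the last user intent.
    String currentState = ASK_RECEIVE_ORDER;
    String lastIntent = null;

    // Prestored state machine: "state|intent" -> next state.
    private final Map<String, String> transitions = new HashMap<>();
    // Reply associated with each state.
    private final Map<String, String> replies = new HashMap<>();

    DialogTurnSketch() {
        transitions.put(ASK_RECEIVE_ORDER + "|order_confirm", ASK_NAME);
        transitions.put(ASK_RECEIVE_ORDER + "|order_not_confirm", ASK_ORAL_RESERVATION);
        replies.put(ASK_RECEIVE_ORDER, "Can the order be accepted?");
        replies.put(ASK_NAME, "May I have your name?");
        replies.put(ASK_ORAL_RESERVATION, "Can the order be reserved orally?");
    }

    // userIntent is the semantic information obtained in S101.
    String handleTurn(String userIntent) {
        lastIntent = userIntent;                                        // S102: update dialog state
        String next = transitions.get(currentState + "|" + userIntent); // S103: state jump
        if (next != null) {
            currentState = next;
        }
        return replies.get(currentState);                               // S104: output
    }
}
```

When no transition is registered for an intent, this sketch stays in the current state and repeats its question; that fallback is an assumption of the sketch rather than something the embodiment specifies.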

Example 2

On the basis of embodiment 1, the present embodiment provides a dialog management method for an outbound system. As shown in fig. 2, after step S104, the dialog management method of the outbound system of this embodiment further includes the following steps:

step S201, after the state machine jumps to a preset state, semantic information is input into a pre-trained reinforcement learning model;

and S202, outputting the question and answer by the outbound system according to the output result of the reinforcement learning model.

The dialogue management method of the outbound system in this embodiment uses a trained reinforcement learning model for dialogue management. In the initial cold-start stage, the state machine is used to quickly collect man-machine dialogue data for training the reinforcement learning model; in the hot-start stage, the method switches to the reinforcement learning model for dialogue exploration. Dialogue learning can therefore be performed automatically and paths that were not considered in advance are covered, which further improves the accuracy and intelligence of the dialogue system and makes it more natural to talk to.
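As a non-limiting illustration, the hand-over from the state machine to the reinforcement learning model (steps S201 and S202) could be sketched as follows. The class name and the two functions standing in for the state machine and for the pre-trained model are hypothetical; the embodiment does not prescribe this interface.

```java
import java.util.function.Function;

// Hypothetical sketch: rule-based replies until a preset state is reached,
// then replies taken from the output of a pre-trained model.
class HybridPolicySketch {
    private final String presetState;                  // assumed hand-over state
    private final Function<String, String> ruleReply;  // state -> reply (state machine)
    private final Function<String, String> modelReply; // semantic info -> reply (stand-in for the RL model)

    HybridPolicySketch(String presetState,
                       Function<String, String> ruleReply,
                       Function<String, String> modelReply) {
        this.presetState = presetState;
        this.ruleReply = ruleReply;
        this.modelReply = modelReply;
    }

    // Chooses the reply source according to the state reached after the jump.
    String reply(String stateAfterJump, String semanticInfo) {
        if (presetState.equals(stateAfterJump)) {
            return modelReply.apply(semanticInfo);     // S201-S202: model-driven reply
        }
        return ruleReply.apply(stateAfterJump);        // S104: rule-driven reply
    }
}
```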

Example 3

On the basis of embodiment 2, the present embodiment provides a dialog management method for an outbound call system. As shown in fig. 3, before step S101, the dialog management method of the outbound system of this embodiment further includes the following steps:

step S100, acquiring slot information in the domain information, wherein the slot information is used for forming the dialog state;

The dialog is divided by domain, and the information to be acquired is determined from the domain information. Take the order confirmation domain, which is common in the OTA industry, as an example: in this domain, the OTA vendor confirms with a hotel whether a certain order can be accepted. The slot information to be extracted is, for example, whether the order can be confirmed and, if it cannot, the reason. The robot's actions are then chosen in a targeted way according to the slot information. For example, if the acquired slot information includes values such as "can be confirmed", "full room", or "no cooperation", the robot can ask targeted questions such as "May I ask, are the rooms full?" and "May I ask, is it that the hotel does not cooperate?".
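As a non-limiting illustration, the slots of the order confirmation domain could be held in a small Java class such as the one below. The class, field and method names and the question wording are hypothetical; only the two pieces of information (whether the order can be confirmed, and the reason if not) come from the description.

```java
// Hypothetical slot container for the order confirmation domain.
class OrderConfirmSlots {
    Boolean canConfirm;   // null until the hotel has answered
    String reason;        // e.g. "full_room" or "no_cooperation"; only needed when canConfirm is false

    // True once every slot needed to end the dialog is filled.
    boolean isComplete() {
        if (canConfirm == null) {
            return false;
        }
        return canConfirm || reason != null;
    }

    // Picks a targeted follow-up question from the slot values, or null if nothing is left to ask.
    String nextQuestion() {
        if (canConfirm == null) {
            return "Can the order be confirmed?";
        }
        if (!canConfirm && reason == null) {
            return "May I ask why the order cannot be confirmed?";
        }
        return null;
    }
}
```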

step S101, obtaining semantic information; the semantic information includes the dialog state, actions, and the user intent;

Every dialog between the outbound system and the user in this embodiment is saved. The whole dialog is split into a plurality of dialog states, and each dialog state is encapsulated and saved as an object. To ensure that the dialog state persists throughout the whole dialog, all information of the dialog state is stored in a database such as Redis (Remote Dictionary Server), and the dialog state is updated on every dialog interaction. This embodiment adopts an object-oriented design: the dialog state is structured data comprising attributes and member variables; the attributes comprise state attributes, and the member variables comprise state transition functions. Each state within the dialog state has its own attributes and performs a certain operation (its state transition function) when it receives a certain trigger event. Each state can therefore be a class, with the state attributes represented by member variables of the class and the state transition function implemented by a member function of the class. The trigger event is primarily a user intent. The specific dialog state includes: the user intent field, the most recent robot action, and a number of count limits, such as the number of times the hotel asked for a repeat, the number of double-check inconsistencies, the number of calls, the number of times the reason for not accepting the order was asked, the number of unrecognized utterances, and the number of name inquiries. Dialog states are identified by codes, e.g.

// Whether the order is accepted: clarify whether the order is accepted
public static final int CLARIFY_RECEIVE_ORDER = 2;

// Whether an oral reservation is possible: clarify whether the order can be reserved orally
public static final int CLARIFY_ORAL_RESERVATION = 3;
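As a non-limiting illustration of the object-oriented design described above, each state could be a class whose attributes are member variables and whose state transition function is a member function. In the sketch below the class names, the `repeatCount` attribute and the transition on `order_not_confirm` are hypothetical; only the two state codes come from the description. Persistence to a store such as Redis is omitted.

```java
// Sketch of "each state is a class": attributes as member variables,
// the state transition function as a member function.
abstract class SubDialogState {
    final int code;       // coded identification of the state
    int repeatCount = 0;  // example counter attribute (hypothetical)

    SubDialogState(int code) {
        this.code = code;
    }

    // State transition function: returns the next state for a trigger event (user intent).
    abstract SubDialogState onEvent(String userIntent);
}

class ClarifyReceiveOrder extends SubDialogState {
    static final int CLARIFY_RECEIVE_ORDER = 2;     // value from the description

    ClarifyReceiveOrder() {
        super(CLARIFY_RECEIVE_ORDER);
    }

    @Override
    SubDialogState onEvent(String userIntent) {
        if ("order_not_confirm".equals(userIntent)) {
            return new ClarifyOralReservation();    // assumed transition
        }
        repeatCount++;                              // stay in this state and count the repeat
        return this;
    }
}

class ClarifyOralReservation extends SubDialogState {
    static final int CLARIFY_ORAL_RESERVATION = 3;  // value from the description

    ClarifyOralReservation() {
        super(CLARIFY_ORAL_RESERVATION);
    }

    @Override
    SubDialogState onEvent(String userIntent) {
        return this;                                // terminal in this sketch
    }
}
```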

For a specific domain, in addition to the slot information required by the specific scenario in the previous step, the dialogue management method of the outbound system of this embodiment also needs a series of general components, specifically: the latest user message, which is the result of the semantic analysis module and comprises the intent, entities, confidence and the like; the system response for the current turn; the dialog state; the latest action name, which tells the robot what to do next; and an event queue, which records every exchange between the user and the robot in chronological order together with the complete context information, so that other modules can use it or the dialog can be analyzed later, and which keeps a series of counts used to control the whole dialog.
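As a non-limiting illustration, the general components listed above could be grouped in a single tracker object. All names in this sketch are hypothetical, and the event queue is reduced to a list of strings for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tracker holding the general components of one dialog.
class DialogTrackerSketch {
    String latestIntent;       // latest user message: recognized intent
    double latestConfidence;   // latest user message: confidence of the recognition
    String latestAction;       // name of the latest robot action
    final List<String> events = new ArrayList<>();          // chronological event queue
    final Map<String, Integer> counters = new HashMap<>();  // e.g. repeat or unrecognized counts

    // Records a user turn and appends it to the event queue.
    void recordUser(String intent, double confidence) {
        latestIntent = intent;
        latestConfidence = confidence;
        events.add("user:" + intent);
    }

    // Records a robot action and appends it to the event queue.
    void recordAction(String action) {
        latestAction = action;
        events.add("robot:" + action);
    }

    // Increments a named counter and returns its new value.
    int bump(String counter) {
        int value = counters.getOrDefault(counter, 0) + 1;
        counters.put(counter, value);
        return value;
    }
}
```

Counters of this kind would let the state machine enforce limits such as a maximum number of repeat requests.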

For the dialog state obtained in the previous step, a series of actions needs to be designed according to the scenario to serve as the robot's next reply to the user. Actions are not limited to replies; they may also be executable operations such as dialing an extension, calling an API (Application Programming Interface), or hanging up the call. For the OTA order confirmation scenario, common actions include asking whether the order can be confirmed, asking whether the order is accepted, and so on.

step S102, updating the current dialog state according to the user intent in the semantic information; the user intent represents what the user said, such as "full room";

step S104, the outbound system outputs questions and answers according to the state reached after the state machine jumps.

In the initial stage of dialogue management, the dialog flow is defined manually and divided into jumps among different states; for the whole dialog, the end condition is that the specified information has been acquired. In the cold-start stage of the project, this rule-based, state-machine implementation can collect data quickly and effectively and can support the basic flow of the dialog.

Taking the field of order confirmation as an example, the design method of the state machine is as follows:

1. First, according to the determined dialog state, the whole dialog is divided into a plurality of dialog states, and each dialog state is encapsulated and saved as an object. To ensure that the dialog state persists throughout the whole dialog, all information of the dialog state is stored in a database such as Redis, and every dialog interaction updates the dialog state.

2. Second, a corresponding event-driven mechanism is designed. The event is generally the semantic analysis result of the transcribed text. According to the current state and the trigger event, the dialogue management system looks up the state transition function and the next function in the system and then executes the state transition function to perform the state transition.
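As a non-limiting illustration, the event-driven lookup in item 2 could be sketched as a table keyed by the current state and the trigger event, whose entries are the state transition functions to execute. The state names and event names below are hypothetical and cover only a small fragment of an order confirmation flow.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical transition table: (current state, trigger event) -> transition function.
class TransitionTableSketch {
    enum State { ASK_DEPARTMENT, REQUEST_TRANSFER, ASK_RECEIVE_ORDER, END }

    private final Map<State, Map<String, Supplier<State>>> table = new EnumMap<>(State.class);

    // Registers a transition function for a state and a trigger event.
    void on(State from, String event, Supplier<State> transition) {
        table.computeIfAbsent(from, k -> new HashMap<>()).put(event, transition);
    }

    // Looks up and executes the transition function; stays in place if none is registered.
    State fire(State current, String event) {
        Map<String, Supplier<State>> row = table.get(current);
        if (row == null || !row.containsKey(event)) {
            return current;
        }
        return row.get(event).get();
    }

    // Small example table with assumed event names.
    static TransitionTableSketch orderConfirmExample() {
        TransitionTableSketch t = new TransitionTableSketch();
        t.on(State.ASK_DEPARTMENT, "is_reservation_department", () -> State.ASK_RECEIVE_ORDER);
        t.on(State.ASK_DEPARTMENT, "not_reservation_department", () -> State.REQUEST_TRANSFER);
        t.on(State.REQUEST_TRANSFER, "transfer_ok", () -> State.ASK_DEPARTMENT);
        t.on(State.REQUEST_TRANSFER, "transfer_failed", () -> State.END);
        return t;
    }
}
```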

Fig. 4 shows the state jump diagram of the state machine of this embodiment. The state machine is based on the order confirmation domain, and the dialogue system that uses it works as follows:

The dialogue system first asks whether it has reached the reservations department, and receives and recognizes the user's answer. If the answer indicates the reservations department, the dialogue system goes on to ask whether the order can be accepted. If the answer indicates that it is not the reservations department, the dialogue system requests a transfer; after a successful transfer it again asks whether it has reached the reservations department, and if the transfer fails the dialogue system outputs a Rakaye. If the user says something else, the dialog enters the general/output intent recognition branch: the system prompts the other party to explain again, recognizes the answer again, and compares the intents of the two utterances. If the two intents are consistent, the intent is output and confirmed with the user. If the two intents are inconsistent, the user is prompted to speak once more for recognition and the consistency of the results is judged again; if they are still inconsistent, the Bayer process is output.

If the recognized user intent is not among the preset intent types, the Rakah rakah is output directly. If the user is recognized as willing to accept the order, the system further asks whether the order can be confirmed and double-checks the answer; if the order is confirmed both times, the system asks for the user's name and outputs the name once it is obtained. If the user suddenly changes their mind and does not want to confirm acceptance of the order, the dialog enters the general/output intent recognition branch. If the user is recognized as not accepting the order, the system asks whether an oral reservation is possible; if the user is recognized as willing to make an oral reservation, the dialog proceeds to the IVR voice broadcast and to the branch that asks whether the order can be confirmed; if the user is recognized as not accepting an oral reservation, a rakah is output. If the reservations department has not yet come on duty or has already gone off duty, the dialog enters the general intent recognition branch. If the answer is not clearly audible or cannot be recognized, it is clarified again.

The diamonds and squares in fig. 4 are dialog states. Once the dialog enters a state, it switches to another state only if the corresponding condition is met; interruptions and answers may occur in any state.

Type = 18: none of the 10 defined types is recognized; 18 denotes "others".

Type < 18: one of the 10 defined types is recognized.

Personalized double check: different check wording is set for the different types.

If the personalized double check is still unresolved after 2 attempts, the dialog enters "query last_name" and the returned type is 18.
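As a non-limiting illustration, the double check described above (compare the intent recognized in successive utterances and fall back to type 18 after repeated inconsistency) could be sketched as follows. The class name, the `-1` "ask again" return value and the exact counting rule are assumptions of this sketch; the description only states that the intents of two utterances are compared and that type 18 denotes "others".

```java
// Hypothetical double-check helper for recognized intent types.
class DoubleCheckSketch {
    static final int TYPE_OTHER = 18;     // "others" / not recognized
    static final int MAX_MISMATCHES = 2;  // assumed limit on inconsistent repetitions

    private Integer previous = null;
    private int mismatches = 0;

    // Feeds one recognized type. Returns the confirmed type when two consecutive
    // recognitions agree, TYPE_OTHER when giving up, or -1 while another repetition is needed.
    int offer(int recognizedType) {
        if (previous == null) {
            previous = recognizedType;
            return -1;
        }
        if (previous == recognizedType) {
            return recognizedType;
        }
        mismatches++;
        previous = recognizedType;
        return mismatches >= MAX_MISMATCHES ? TYPE_OTHER : -1;
    }
}
```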

The rule-based state machine implemented in the previous step makes it possible to collect man-machine dialogue data quickly for training a reinforcement learning algorithm. For dialogue management, the transition relationships among states can be learned through reinforcement learning from the states and actions of the dialog decision process defined in the previous steps: good behaviors are rewarded and bad behaviors are penalized according to a reward function, and the sequence of behaviors is optimized. First, the dialog states and actions are represented as vectors, and iteration is performed with the DQN algorithm. The neural network part only needs an ordinary DNN (deep neural network), the DQN part uses a target network for algorithm iteration, and the experience replay part uses rule-based simulation data. The reward measures whether the required dialog information has been obtained and whether the user's questions have all been answered.
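For reference, the update commonly used with a DQN and a target network can be written as below. This is the standard textbook formulation, not a formula given in this description; the symbols are assumptions: $s$ and $a$ are the vectorized dialog state and action, $r$ is the reward, $s'$ is the next state, $\gamma$ is a discount factor, $\theta$ are the parameters of the online network, and $\theta^{-}$ are the parameters held by the target network.

```latex
y = r + \gamma \max_{a'} Q(s', a'; \theta^{-}), \qquad
L(\theta) = \mathbb{E}_{(s, a, r, s')}\left[ \big( y - Q(s, a; \theta) \big)^{2} \right]
```

The transitions $(s, a, r, s')$ are sampled from the experience replay buffer, and $\theta^{-}$ is refreshed from $\theta$ only periodically, which keeps the regression target $y$ stable between refreshes.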

The dialog management method of the outbound system in this embodiment further includes, on the basis of embodiment 2, a step of training a reinforcement learning model:

step S301, presetting a reward mechanism to perform automatic data annotation;

step S302, collecting first dialogue data by adopting a hot start mode;

step S303, collecting second dialogue data by adopting an epsilon-greedy exploration mode;

and step S304, vectorizing and representing the conversation state and action in the first conversation data and the second conversation data, and inputting the vectorized and represented conversation state and action into the reinforcement learning model to obtain a model output result.
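As a non-limiting illustration of the preset reward mechanism in step S301, a finished dialog could be labeled automatically from two observations: whether the required information was obtained and whether all user questions were answered. The numeric values below are assumptions made for this sketch; the description does not give reward values.

```java
// Hypothetical reward function used to label finished dialogs automatically.
class RewardSketch {
    // Assumed reward values.
    static final double INFO_OBTAINED = 1.0;
    static final double INFO_MISSING = -1.0;
    static final double ALL_QUESTIONS_ANSWERED = 0.5;

    // Rewards obtaining the required information and answering every user question.
    static double reward(boolean infoObtained, int questionsAsked, int questionsAnswered) {
        double r = infoObtained ? INFO_OBTAINED : INFO_MISSING;
        if (questionsAsked > 0 && questionsAnswered == questionsAsked) {
            r += ALL_QUESTIONS_ANSWERED;
        }
        return r;
    }
}
```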

The specific training process is as follows:

1. During training, an epsilon-greedy exploration mode is adopted, in which a parameter gives the probability of taking the action selected by the model. For example, if the value is 0.9, an action is selected at random with probability 0.1, which makes the system more exploratory and lets it discover new dialog paths.

2. In each training round, the data in the buffer is used for learning and the network is updated many times; after the updates of the round are finished, the network parameters held by the target network are updated.

The trained DQN network is then used to select the next robot action.
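As a non-limiting illustration of the two training points above, the sketch below shows epsilon-greedy action selection with the parameter interpreted as the probability of taking the best-valued action (0.9 in the example, leaving 0.1 for random exploration), and a helper that copies the online network's parameters into the target network. The class and method names are hypothetical, and the network is reduced to a plain parameter array.

```java
import java.util.Random;

// Hypothetical epsilon-greedy selector and target-network synchronization.
class EpsilonGreedySketch {
    private final double greedyProbability;  // e.g. 0.9: exploit with p = 0.9, explore with p = 0.1
    private final Random random;

    EpsilonGreedySketch(double greedyProbability, long seed) {
        this.greedyProbability = greedyProbability;
        this.random = new Random(seed);
    }

    // Returns the index of the chosen action given the Q-values of the current state.
    int select(double[] qValues) {
        if (random.nextDouble() < greedyProbability) {
            return argmax(qValues);                  // exploit: best-valued action
        }
        return random.nextInt(qValues.length);       // explore: uniformly random action
    }

    // Index of the largest value (first one on ties).
    static int argmax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    // Copies the online-network parameters into the target network.
    static void syncTarget(double[] online, double[] target) {
        System.arraycopy(online, 0, target, 0, online.length);
    }
}
```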

In the present embodiment, the action is taken as the system output, e.g.

check order confirm: "OK, the order can be confirmed, is that right?"

check order: "May I ask what the reason is that the order cannot be accepted?"

check full room: "May I ask, are the rooms full?"

check full room reply: "OK, the room type of the order has been set to full, thank you, goodbye."

The intention in this embodiment is a user intention, for example:

// 1. Order confirmed
public static final String ORDER_CONFIRM = "order_confirm";

// 2. Order not confirmed
public static final String ORDER_NOT_CONFIRM = "order_not_confirm";

Step S201 specifically includes:

step S2010, after the state machine jumps to a preset state, vectorizing and representing the conversation state and the action, and inputting the vectorized and represented conversation state and action into a pre-trained reinforcement learning model;

and S202, outputting the question and answer of the pre-trained reinforcement learning model by the outbound system.
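As a non-limiting illustration of the vectorized representation in step S2010, the dialog state and the latest action could each be one-hot encoded and concatenated into a single input vector. One-hot encoding is an assumption of this sketch; the description does not specify the encoding.

```java
// Hypothetical one-hot encoding of a dialog state and an action.
class OneHotSketch {
    // One-hot vector of the given length with a 1 at the given index.
    static double[] oneHot(int index, int length) {
        double[] v = new double[length];
        v[index] = 1.0;
        return v;
    }

    // Concatenates the one-hot encodings of the dialog state and the latest action.
    static double[] encode(int stateIndex, int numStates, int actionIndex, int numActions) {
        double[] out = new double[numStates + numActions];
        System.arraycopy(oneHot(stateIndex, numStates), 0, out, 0, numStates);
        System.arraycopy(oneHot(actionIndex, numActions), 0, out, numStates, numActions);
        return out;
    }
}
```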

The dialogue management method of the outbound call system of the embodiment is used for OTA industry. The dialogue management method of the outbound system of the embodiment further uses a reinforcement learning model to manage the dialogue system on the basis of the state machine, so that the dialogue accuracy of the system can be further improved, and the application scenes of the system can be increased.

Example 4

As shown in fig. 5, the dialog management system of the outbound system of this embodiment includes the following modules:

the semantic information acquisition module 1 is used for acquiring semantic information;

the dialogue state updating module 2 is used for updating the current dialogue state according to the semantic information;

the state skipping module 3 is used for inputting the dialogue state into a prestored state machine for state skipping;

and the first question-answer output module 4 is used for outputting questions and answers by the outbound system according to the state reached after the state machine jumps.

With the dialogue management system of the outbound system of this embodiment, the user and the outbound system can communicate in spoken language and the outbound system can answer the user's questions, which makes the communication more natural, improves user acceptance, and saves enterprises a large amount of labor cost. The outbound system can also transfer the call when the user is reached through an extension, so the design is flexible.

Example 5

On the basis of embodiment 4, this embodiment provides a dialog management system of an outbound system, and as shown in fig. 6, the dialog management system of the outbound system of this embodiment further includes the following modules:

the first model input module 5 is used for inputting semantic information into a pre-trained reinforcement learning model after the state machine jumps to a preset state;

and the second question-answer output module 6 is used for outputting the question answer by the outbound system according to the output result of the reinforcement learning model.

The dialogue management system of the outbound system of this embodiment uses a trained reinforcement learning model for dialogue management. In the initial cold-start stage, the state machine is used to quickly collect man-machine dialogue data for training the reinforcement learning model; in the hot-start stage, the reinforcement learning model is used for dialogue exploration. Dialogue learning can therefore be performed automatically and paths that were not considered in advance are covered, which further improves the accuracy and intelligence of the dialogue system and makes it more natural to talk to.

Example 6

On the basis of embodiment 5, this embodiment provides a dialog management system of an outbound system, and as shown in fig. 7, the dialog management system of the outbound system of this embodiment further includes the following modules:

the slot information acquisition module 7 is used for acquiring slot information in the domain information, and the slot information is used for forming a conversation state;

Every dialog between the outbound system and the user in this embodiment is saved. The whole dialog is split into a plurality of dialog states, and each dialog state is encapsulated and saved as an object. To ensure that the dialog state persists throughout the whole dialog, all information of the dialog state is stored in a database such as Redis, and the dialog state is updated on every dialog interaction. This embodiment adopts an object-oriented design: the dialog state is structured data comprising attributes and member variables; the attributes comprise state attributes, and the member variables comprise state transition functions. Each state within the dialog state has its own attributes and performs a certain operation (its state transition function) when it receives a certain trigger event. Each state can therefore be a class, with the state attributes represented by member variables of the class and the state transition function implemented by a member function of the class. The trigger event is primarily a user intent. The specific dialog state includes: the intent, the most recent robot action, and a number of count limits, such as the number of times the hotel asked for a repeat, the number of double-check inconsistencies, the number of calls, the number of times the reason for not accepting the order was asked, the number of unrecognized utterances, and the number of name inquiries. Dialog states are identified by codes, e.g.

// Whether the order is accepted: clarify whether the order is accepted
public static final int CLARIFY_RECEIVE_ORDER = 2;

// Whether an oral reservation is possible: clarify whether the order can be reserved orally
public static final int CLARIFY_ORAL_RESERVATION = 3;

For the dialog state obtained in the previous step, a series of actions needs to be designed according to the scenario to serve as the robot's next reply to the user. Actions are not limited to replies; they may also be executable operations such as dialing an extension, calling an API, or hanging up the call. For the OTA order confirmation scenario, common actions include asking whether the order can be confirmed, asking whether the order is accepted, and so on.

The state machine used in the dialog management system of the outbound call system in this embodiment is as described in embodiment 3 above, and is not described here again.

A training model module 8, wherein the training model module 8 specifically comprises:

the data labeling unit 81 is used for presetting a reward mechanism to perform automatic data labeling;

a first data collection unit 82 for collecting first dialogue data by a hot start method;

a second data collection unit 83 for collecting second dialogue data in an epsilon-greedy exploration mode;

a model result obtaining unit 84, which performs vectorization representation on the dialogue states and actions in the first dialogue data and the second dialogue data and inputs the dialogue states and actions into the reinforcement learning model to obtain a model output result;

The specific training process is as follows:

1. During training, an epsilon-greedy exploration mode is adopted, in which a parameter gives the probability of taking the action selected by the model. For example, if the value is 0.9, an action is selected at random with probability 0.1, which makes the system more exploratory and lets it discover new dialog paths.

2. In each training round, the data in the buffer is used for learning and the network is updated many times; after the updates of the round are finished, the network parameters held by the target network are updated.

The trained DQN network is then used to select the next robot action.

In the present embodiment, the action is the system output, for example:

check order confirm good, can the order confirm, is?

The reason why the check order request cannot be answered is that

check full room request is full room

The intention in this embodiment is a user intention, for example:

1. Order confirmed:

public static final String ORDER_CONFIRM = "order_confirm";

2. Order not confirmed:

public static final String ORDER_NOT_CONFIRM = "order_not_confirm";

The first model input module 5 is used for vectorizing the dialog state and the action and inputting the vectorized dialog state and action into a pre-trained reinforcement learning model.
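One way to picture the vectorization step is a one-hot encoding of the dialog state concatenated with a one-hot encoding of the action. The text does not say which encoding is used, so one-hot is an assumption here, and the class `StateActionVectorizer` as well as the state and action names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical one-hot encoder for (dialog state, action) pairs. */
final class StateActionVectorizer {
    private final List<String> states;
    private final List<String> actions;

    StateActionVectorizer(List<String> states, List<String> actions) {
        // Defensive copies so later changes to the caller's lists do not shift indices.
        this.states = new ArrayList<>(states);
        this.actions = new ArrayList<>(actions);
    }

    /** Returns [one-hot(state) | one-hot(action)] as a single input vector. */
    double[] encode(String state, String action) {
        int stateIndex = states.indexOf(state);
        int actionIndex = actions.indexOf(action);
        if (stateIndex < 0 || actionIndex < 0) {
            throw new IllegalArgumentException("unknown state or action");
        }
        double[] vector = new double[states.size() + actions.size()];
        vector[stateIndex] = 1.0;
        vector[states.size() + actionIndex] = 1.0;
        return vector;
    }

    public static void main(String[] args) {
        // State and action names are hypothetical.
        StateActionVectorizer vectorizer = new StateActionVectorizer(
                Arrays.asList("ASK_ORDER_CONFIRM", "ASK_REASON", "END"),
                Arrays.asList("check_order_confirm", "check_full_room"));
        double[] input = vectorizer.encode("ASK_REASON", "check_full_room");
        System.out.println(Arrays.toString(input)); // prints [0.0, 1.0, 0.0, 0.0, 1.0]
    }
}
```

The resulting array is what would be handed to the reinforcement learning model as its input in this sketch; a learned embedding could be substituted without changing the surrounding flow.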

The specific implementation principle of the dialog management system of the outbound system in this embodiment is the same as that described in embodiment 3, and details are not repeated here.

The dialogue management system of the outbound system of the embodiment is used in the OTA industry.

The dialogue management method of the outbound system of the embodiment further uses a reinforcement learning model to manage the dialogue system on the basis of the state machine, so that the dialogue accuracy of the system can be further improved, and the application scenes of the system can be increased.

Example 7

Fig. 8 is a module schematic diagram of an electronic device according to embodiment 7 of the present invention. The electronic device comprises a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the dialog management method of the outbound system of embodiment 1, 2 or 3 when executing the program. The electronic device 30 shown in fig. 8 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.

As shown in fig. 8, the electronic device 30 may be embodied in the form of a general purpose computing device, which may be, for example, a server device. The components of the electronic device 30 may include, but are not limited to: the at least one processor 31, the at least one memory 32, and a bus 33 connecting the various system components (including the memory 32 and the processor 31).

The bus 33 includes a data bus, an address bus, and a control bus.

The memory 32 may include volatile memory, such as Random Access Memory (RAM) 321 and/or cache memory 322, and may further include Read Only Memory (ROM) 323.

Memory 32 may also include a program/utility 325 having a set (at least one) of program modules 324, such program modules 324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.

The processor 31 executes various functional applications and data processing, such as the dialog management method of the outbound system provided in embodiment 1, 2 or 3 of the present invention, by running a computer program stored in the memory 32.

The electronic device 30 may also communicate with one or more external devices 34 (e.g., a keyboard, a pointing device, etc.). Such communication may take place through input/output (I/O) interfaces 35. The electronic device 30 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via a network adapter 36. As shown, the network adapter 36 communicates with the other modules of the electronic device 30 via the bus 33. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 30, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, and data backup storage systems.

It should be noted that although several units/modules or sub-units/modules of the electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided and embodied by a plurality of units/modules.

Example 8

The present embodiment provides a computer-readable storage medium on which a computer program is stored, which when executed by a processor implements the steps of the dialog management method of the outbound system provided in embodiment 1, 2 or 3.

More specifically, the readable storage medium may include, but is not limited to: a portable disk, a hard disk, random access memory, read only memory, erasable programmable read only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

In a possible embodiment, the invention can also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps of the dialog management method of the outbound system of embodiment 1, 2 or 3.

The program code for carrying out the invention may be written in any combination of one or more programming languages, and may execute entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on the remote device.

While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims

1. A dialog management method for an outbound system, the dialog management method comprising:

obtaining semantic information;

updating the current conversation state according to the semantic information;

inputting the dialogue state into a prestored state machine for state jumping;

2. A dialog management method for an outbound system as claimed in claim 1, wherein the dialog management method further comprises the steps of:

3. A dialog management method for an outbound system as claimed in claim 1 or 2, characterised in that the dialog management method further comprises, before the step of obtaining semantic information, the steps of:

4. The dialog management method of an outbound system as claimed in claim 1, wherein said state machine performs a state jump based on a current state of said dialog and a triggering event;

the dialog state comprises a plurality of sub-dialog states;

the triggering event is a user intention in the semantic information.

5. The dialog management method of an outbound system as claimed in claim 2, wherein the dialog management method further comprises the step of training a reinforcement learning model, the step of training a reinforcement learning model comprising:

presetting a reward mechanism to carry out automatic data annotation;

collecting first dialogue data by adopting a hot start mode;

collecting second dialogue data by adopting an epsilon-greedy exploration mode;

6. The dialog management method of an outbound system as claimed in claim 5,

the semantic information includes the dialog state and the action;

the reinforcement learning model is a DQN algorithm model.

7. A dialog management system for an outbound system, the dialog management system comprising:

8. The dialog management system of an outbound system of claim 7 wherein the dialog management system further comprises the modules of:

9. A dialog management system for an outbound system as claimed in claim 7 or 8, characterised in that the dialog management system further comprises the following modules:

10. The dialog management system of an outbound system of claim 7 wherein the state machine performs a state jump based on the current state of the dialog and a triggering event;

the dialog state comprises a plurality of sub-dialog states;

the triggering event is a user intention in the semantic information.

11. The dialog management system of the outbound system of claim 8 wherein the dialog management system further comprises a training model module, the training model module comprising:

12. The dialog management system of an outbound system of claim 11,

the semantic information includes the dialog state and the action;

the reinforcement learning model is a DQN algorithm model.

13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the dialog management method of the outbound system of any of claims 1 to 6 when executing the program.

14. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of a dialog management method for an outbound system according to any one of claims 1 to 6.