CN110866103B - Sentence diversity generation method and system in dialogue system - Google Patents

Sentence diversity generation method and system in dialogue system Download PDF

Info

Publication number
CN110866103B
CN110866103B (granted publication of application CN201911087246.6A)
Authority
CN
China
Prior art keywords
sentence
answer
answer sentence
graph
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911087246.6A
Other languages
Chinese (zh)
Other versions
CN110866103A (en)
Inventor
梁小丹 (Liang Xiaodan)
陈炳成 (Chen Bingcheng)
林倞 (Lin Jing)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN201911087246.6A priority Critical patent/CN110866103B/en
Publication of CN110866103A publication Critical patent/CN110866103A/en
Application granted granted Critical
Publication of CN110866103B publication Critical patent/CN110866103B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model based on the proximity to a decision surface, e.g. support vector machines
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a sentence diversity generation method and system in a dialogue system. The method comprises the following steps: Step S1, extracting a dependency tree of an answer sentence and converting the dependency tree into an undirected graph; Step S2, inputting the answer sentence and the undirected graph obtained in step S1 into a graph structure converter to obtain a feature vector of the answer sentence; Step S3, extracting a feature vector of the dialogue history of the answer sentence using a sequence structure converter; Step S4, inputting the feature vector of the answer sentence obtained in step S2 and the feature vector of the dialogue history obtained in step S3 into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history.

Description

Sentence diversity generation method and system in dialogue system
Technical Field
The invention relates to the technical field of dialogue systems, and in particular to a sentence diversity generation method and system that fuse sentence syntactic structure into sentence generation in a dialogue system.
Background
Dialogue systems are a research direction of natural language processing; their goal is to generate the next sentence of a conversation given the dialogue history between a user and a dialogue robot. In this field a number of related technologies have been developed, mainly retrieval-based dialogue systems, generative dialogue systems, and hybrid systems that combine retrieval and generation.
In reality, the same dialogue history can have many different valid answers; this is the sentence diversity generation problem in dialogue systems. However, prior-art dialogue systems do not use the syntactic structure of the answer sentence when generating sentences, so in some cases the generated sentences are weakly relevant and a good dialogue effect cannot be achieved.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a sentence diversity generation method and system in a dialogue system so as to improve the sentence diversity in the dialogue system.
In order to achieve the above and other objects, the present invention provides a sentence diversity generation method in a dialogue system, comprising the steps of:
Step S1, extracting a dependency tree of an answer sentence, and converting the dependency tree into an undirected graph;
Step S2, inputting the answer sentence and the undirected graph obtained in step S1 into a graph structure converter to obtain a feature vector of the answer sentence;
Step S3, extracting a feature vector of the dialogue history of the answer sentence using a sequence structure converter;
Step S4, inputting the feature vector of the answer sentence obtained in step S2 and the feature vector of the dialogue history obtained in step S3 into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history.
Preferably, step S1 further comprises:
step S100, extracting a dependency tree of the answer sentence by using an open source natural language processing tool;
step S101, representing the dependency tree by a directed graph, wherein nodes in the dependency tree are words of sentences, and directed edges in the dependency tree represent syntactic relations among the words;
and step S102, changing the directed edges in the directed graph into undirected edges to obtain an undirected graph of the answer sentence.
Preferably, in step S1, the undirected graph is represented by an adjacency matrix.
Preferably, if the answer sentence has n words, the adjacency matrix of the answer sentence is a matrix M of dimension n×n, and the value M_ij in the i-th row and j-th column of M is determined by the following condition:
[equation shown as an image in the original: as the adjacency matrix of the undirected graph, M_ij = 1 when words i and j are connected by an edge and 0 otherwise]
preferably, step S2 further comprises
Step S200, graph Attention operation is carried out on the feature V of the answer sentence and the adjacency matrix M of the undirected Graph;
step S201, adding the result of Graph attribute operation and the feature V, and performing layer normalization operation;
step S202, junction of step S201
Figure BDA0002265795350000022
Inputting a layer of feedforward neural network, and performing layer normalization operation to obtain the feature vector of the answer sentence.
Preferably, in step S3, the m sentences of the dialogue history are acquired, arranged in order, and concatenated end to end into one sentence C, and the sentence C is input to the sequence structure converter to obtain the feature vector of the dialogue history.
Preferably, the conditional variational autoencoder consists of an encoder and a decoder: the feature vector E' of the dialogue history obtained in step S3 is input to the encoder of the conditional variational autoencoder to obtain a normal distribution z', several samples are drawn from z', and each sample is input to the decoder to obtain a different answer sentence.
In order to achieve the above object, the present invention further provides a sentence diversity generation system in a dialogue system, including:
an answer sentence processing unit for extracting a dependency tree of an answer sentence and converting the dependency tree into an undirected graph;
an answer sentence feature vector extraction unit configured to input the answer sentence and the undirected graph of the answer sentence obtained by the answer sentence processing unit into a graph structure converter to obtain a feature vector of the answer sentence;
a dialogue history feature extraction unit for extracting feature vectors of dialogue histories of the answer sentences using a sequence structure converter;
and a diversity sentence generating unit for inputting the feature vector from the answer sentence feature vector extraction unit and the feature vector from the dialogue history feature extraction unit into the conditional variational autoencoder to obtain a new answer sentence for the dialogue history.
Preferably, in the answer sentence processing unit, the conversion method of the dependency tree into an undirected graph is to change a directed edge into an undirected edge, and the undirected graph is represented by an adjacency matrix.
Preferably, in the dialogue history feature extraction unit, the m sentences of the dialogue history are acquired, arranged in order, and concatenated end to end into one sentence C, and the sentence C is input to the sequence structure converter to obtain the feature vector of the dialogue history.
Compared with the prior art, the sentence diversity generation method and system in a dialogue system of the present invention extract the dependency tree of the answer sentence and convert it into an undirected graph, input the answer sentence and the undirected graph into a graph structure converter to obtain the feature vector of the answer sentence, extract the feature vector of the dialogue history of the answer sentence using a sequence structure converter, and finally input the feature vector of the answer sentence and the feature vector of the dialogue history into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history, thereby improving the diversity of sentence generation in the dialogue system.
Drawings
FIG. 1 is a flow chart showing the steps of a sentence diversity generation method in a dialogue system according to the present invention;
FIG. 2 is a schematic diagram of a dependency tree in an embodiment of the present invention;
FIG. 3 is a diagram of a directed graph representation of a dependency tree in an embodiment of the present invention;
FIG. 4 is a diagram illustrating the transformation of a dependency tree into undirected graph according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the adjacency matrix of an answer sentence in an embodiment of the invention;
FIG. 6 is a block diagram of the graph structure converter (Graph Transformer) in an embodiment of the present invention;
FIG. 7 is a block diagram of the conditional variational autoencoder in an embodiment of the invention;
FIG. 8 is a system architecture diagram of a sentence diversity generation system in a dialogue system according to the present invention.
Detailed Description
Other advantages and effects of the present invention will become readily apparent to those skilled in the art from the following disclosure, when considered in light of the accompanying drawings, by describing embodiments of the present invention with specific embodiments thereof. The invention may be practiced or carried out in other embodiments and details within the scope and range of equivalents of the various features and advantages of the invention.
Fig. 1 is a flowchart illustrating steps of a sentence diversity generation method in a dialogue system according to the present invention. As shown in fig. 1, the sentence diversity generating method in the dialogue system of the present invention includes the following steps:
step S1, extracting a dependency tree of the answer sentence, and converting the dependency tree into an undirected graph, wherein the undirected graph is represented by an adjacent matrix M.
Specifically, the answer sentence is an answer to a question in the dialogue system, and the dependency tree of the answer sentence can be extracted using an open-source natural language processing tool such as Stanford CoreNLP, AllenNLP, and the like. The dependency tree is a directed graph: the nodes of the dependency tree are the words of the sentence, and the directed edges of the dependency tree represent syntactic relationships between the words. If there is a syntactic relationship between two words, there is a directed edge between the nodes representing those two words in the directed graph.
In the invention, the conversion method for converting the dependency tree into the undirected graph is to change the directed edge of the dependency tree into the undirected edge.
Specifically, assuming that the answer sentence has n words, the adjacency matrix of the answer sentence is a matrix M of dimension n×n, and the value M_ij in the i-th row and j-th column of M is determined by the following condition:
[equation shown as an image in the original: as the adjacency matrix of the undirected graph, M_ij = 1 when words i and j are connected by an edge and 0 otherwise]
An example of extracting a sentence dependency tree and computing the adjacency matrix is as follows, using the sentence "A syntactic structure is fused in sentence feature extraction."
First, extracting a dependency tree of the sentence by using an open-source natural language processing tool, as shown in fig. 2;
then, the above dependency tree is represented by a directed graph, the nodes in the dependency tree are words of the sentence, and the directed edges in the dependency tree represent the syntactic relationship between the words, as shown in fig. 3.
And changing the directed edge in the directed graph into an undirected edge to obtain an undirected graph of the sentence, as shown in fig. 4.
Finally, the undirected graph of the example sentence "A syntactic structure is fused in sentence feature extraction." is converted to the adjacency matrix M shown in fig. 5.
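As a concrete illustration of step S1, the following minimal sketch builds the adjacency matrix from a dependency parse. It assumes the spaCy library and its en_core_web_sm English model in place of the Stanford CoreNLP / AllenNLP tools named above, and the function name sentence_to_adjacency is illustrative rather than taken from the patent.

```python
# Minimal sketch of step S1: dependency tree -> undirected graph -> adjacency matrix.
# Assumes spaCy and its small English model are installed; the patent itself names
# Stanford CoreNLP / AllenNLP, so the parser choice here is only illustrative.
import numpy as np
import spacy

nlp = spacy.load("en_core_web_sm")

def sentence_to_adjacency(sentence: str):
    """Parse a sentence, treat every dependency arc as an undirected edge,
    and return the word list plus the n x n adjacency matrix M."""
    doc = nlp(sentence)
    words = [tok.text for tok in doc]
    n = len(words)
    M = np.zeros((n, n), dtype=np.int64)
    for tok in doc:
        head = tok.head
        if head.i != tok.i:          # the root token points to itself; skip that self-arc
            M[tok.i, head.i] = 1     # directed edge head -> dependent ...
            M[head.i, tok.i] = 1     # ... symmetrized into an undirected edge
    return words, M

if __name__ == "__main__":
    words, M = sentence_to_adjacency(
        "A syntactic structure is fused in sentence feature extraction.")
    print(words)
    print(M)
```

Symmetrizing the matrix is exactly the "directed edge to undirected edge" conversion of step S102; any dependency parser that exposes head indices could be substituted.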
Step S2, inputting the answer sentence and the adjacency matrix M of the undirected graph of step S1 into a graph structure converter (Graph Transformer) to obtain the feature vector of the answer sentence;
FIG. 6 is a block diagram of the graph structure converter (Graph Transformer) according to an embodiment of the present invention; the feature extraction process of the graph structure converter of the present invention is described below with reference to FIG. 6.
specifically, assuming that the answer sentence is composed of n words, the i-th word is composed of a k-dimensional feature vector V i Expressed, the feature of the answer sentence is expressed as v= (V 1 ,...,V n ). The feature V of the reply sentence and the adjacency matrix M of the undirected graph are input to the graph structure converter (Graph Transformer).
The characteristic extraction process of the graph structure converter is as follows:
1. Perform a Graph Attention operation on the feature V of the answer sentence and the adjacency matrix M of the undirected graph. Specifically, for the feature vector V_i of the i-th word, the Graph Attention is calculated as follows:
[Graph Attention equations shown as images in the original: attention weights between word i and each word j are computed and masked by the adjacency value M_ij, and the masked weights are used to aggregate the word features into an attended vector for word i]
where M_ij is the value in the i-th row and j-th column of the adjacency matrix M from step S1.
2. The output of the Graph Attention operation for word i is added to V_i, and a layer normalization operation is performed:
[equation shown as an image in the original: the attended vector plus V_i is passed through LayerNorm]
where LayerNorm is the layer normalization operation; since layer normalization is prior art, it is not described in detail here.
3. The result of step 2 is fed into a one-layer feedforward neural network, and a layer normalization operation is then performed:
[equation shown as an image in the original: the result of step 2 is passed through FFN and then LayerNorm]
where FFN is a one-layer feedforward neural network.
Thus, for the i-th word, the feature vector V_i is transformed by the Graph Transformer into a new feature vector, and the transformed feature vectors of all words together form the Graph-Transformer-encoded features of the answer sentence. Finally, these word features are combined by the operation shown as an equation image in the original to obtain the final feature vector V' of the answer sentence.
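As an illustration of the feature extraction in step S2, the sketch below gives one plausible reading of the Graph Transformer layer in PyTorch. Because the patent's Graph Attention, residual, and pooling formulas appear only as images, the scaled dot-product attention form, the added self-loops, the ReLU in the feedforward layer, and the final mean pooling are all assumptions; the class name GraphTransformerLayer and the helper answer_sentence_feature are likewise illustrative.

```python
# Hedged sketch of the graph structure converter (Graph Transformer) of step S2,
# written in PyTorch. The attention form, self-loops, ReLU and mean pooling are
# assumptions; the patent's own equations appear only as images.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphTransformerLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.norm1 = nn.LayerNorm(dim)
        self.ffn = nn.Linear(dim, dim)   # "a layer of feedforward neural network"
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, V: torch.Tensor, M: torch.Tensor) -> torch.Tensor:
        # V: (n, k) word features; M: (n, n) adjacency matrix of the undirected graph.
        adj = M.float() + torch.eye(M.size(0))                 # assumed self-loops so each word attends to itself
        scores = self.q(V) @ self.k(V).t() / V.size(-1) ** 0.5
        scores = scores.masked_fill(adj == 0, float("-inf"))   # Graph Attention: only adjacent words attend (step S200)
        attn = torch.softmax(scores, dim=-1)
        attended = attn @ self.v(V)                            # aggregate neighbour features
        h = self.norm1(attended + V)                           # add & layer-normalize (step S201)
        return self.norm2(F.relu(self.ffn(h)))                 # FFN + layer norm (step S202)

def answer_sentence_feature(V: torch.Tensor, M: torch.Tensor,
                            layer: GraphTransformerLayer) -> torch.Tensor:
    """V': a single vector for the answer sentence; mean pooling over words
    is an assumption, since the patent's pooling formula is shown as an image."""
    return layer(V, M).mean(dim=0)
```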
step S3, extracting the characteristics of the dialogue history of the answer sentence by using a sequence structure converter (transducer);
specifically, in a dialogue system, a dialogue sample is typically composed of dialogue history and answer sentences, an example of which is as follows:
conversation history (m sentences):
1. How is the weather today?
2. The weather is fine today, with bright sunshine.
……
m. Do you think it will rain heavily next week?
The answer sentence is the next sentence of the dialogue history, for example:
I think there will be heavy rain next week.
It is assumed that in the dialogue system the dialogue history of the answer sentence is composed of m sentences arranged in order; the m sentences are concatenated end to end in order into one sentence C, and the sentence C is input to a sequence structure converter (Transformer) to obtain the feature vector of the dialogue history.
Specifically, suppose that sentence C is composed of r words and that the i-th word in sentence C is represented by a k-dimensional feature vector E_i, so that the feature of sentence C is expressed as E = (E_1, ..., E_r). After E is input into the Transformer, the transformed feature of sentence C is obtained (shown as an equation image in the original). The transformed feature of sentence C is then combined by the operation shown as an equation image in the original to obtain the final dialogue history feature E'.
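A minimal sketch of step S3 is shown below, assuming a standard PyTorch TransformerEncoder over the concatenated history and mean pooling to produce E'. The tokenization, embedding size, and pooling choice are assumptions, since the patent's aggregation formula appears only as an image.

```python
# Hedged sketch of step S3: concatenate the m history sentences end to end into one
# sequence C, encode it with a Transformer, and pool into the history feature E'.
# Tokenization, embedding size and mean pooling are illustrative assumptions.
import torch
import torch.nn as nn

class DialogueHistoryEncoder(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 256, heads: int = 4, layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)

    def forward(self, history_token_ids):
        # history_token_ids: list of m token-id lists, one per history sentence, in order.
        c = torch.tensor([tid for sent in history_token_ids for tid in sent]).unsqueeze(0)  # sentence C, shape (1, r)
        E = self.embed(c)                 # word features E_1 ... E_r
        H = self.encoder(E)               # Transformer-encoded features of sentence C
        return H.mean(dim=1).squeeze(0)   # E': mean pooling is an assumption
```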
step S4, inputting the feature vector V 'of the answer sentence of step S2 and the feature vector E' of the dialogue history of step S3 into a conditional variation automatic encoder to generate the answer sentence.
The structure of the conditional variational autoencoder is shown in fig. 7. The conditional variational autoencoder consists of an encoder and a decoder: the dialogue history feature E' is input to the encoder of the conditional variational autoencoder to obtain a normal distribution z', and several different answer sentences can be obtained simply by drawing several samples from z' and inputting each of them to the decoder. Specifically, after the feature vector E' of the dialogue history is input to the encoder of the conditional variational autoencoder, a normal distribution z' is obtained; z' is then sampled multiple times, each sample is input to the decoder, and the decoder generates a different answer sentence for each sample, thereby achieving diverse generation of answer sentences.
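The sketch below shows one common way to realize the conditional variational autoencoder of step S4: a prior network maps E' to a normal distribution z', the reparameterization trick draws samples, and each sample conditions a decoder to produce a different answer. The recognition network over [E'; V'] (used only during training), the Gaussian parameterization, and all layer sizes are assumptions; the patent specifies the structure only through fig. 7.

```python
# Hedged sketch of the conditional variational autoencoder of step S4. The Gaussian
# prior/recognition networks, layer sizes, and the abstract decoder conditioning are
# assumptions; the patent specifies the structure only through fig. 7.
import torch
import torch.nn as nn

class ConditionalVAE(nn.Module):
    def __init__(self, feat_dim: int = 256, z_dim: int = 64):
        super().__init__()
        self.prior = nn.Linear(feat_dim, 2 * z_dim)               # E' -> (mu, logvar) of z'
        self.recognition = nn.Linear(2 * feat_dim, 2 * z_dim)     # [E'; V'] -> posterior, used during training
        self.decoder_in = nn.Linear(feat_dim + z_dim, feat_dim)   # conditions the (not shown) sentence decoder

    @staticmethod
    def sample(mu, logvar):
        # Reparameterization trick: z' = mu + sigma * eps
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def generate(self, E_hist: torch.Tensor, num_answers: int = 3):
        """Draw several samples from the normal distribution z' predicted from E';
        each sample, passed to the decoder, yields a different answer sentence."""
        mu, logvar = self.prior(E_hist).chunk(2, dim=-1)
        decoder_inputs = []
        for _ in range(num_answers):
            z = self.sample(mu, logvar)
            decoder_inputs.append(self.decoder_in(torch.cat([E_hist, z], dim=-1)))
        return decoder_inputs   # each would be fed to a sentence decoder to produce text
```

Drawing multiple samples of z' and decoding each one separately is what yields the diversity of answer sentences described above.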
Fig. 8 is a system architecture diagram of a sentence diversity generation system in a dialog system according to the present invention. As shown in fig. 8, the sentence diversity generating system in the dialogue system of the present invention includes:
an answer sentence processing unit 201 for extracting a dependency tree of an answer sentence and converting the dependency tree into an undirected graph represented by an adjacency matrix;
specifically, the answer sentence is an answer sentence to a question in the dialogue system, and the dependency tree of the answer sentence can be extracted by using an open-source natural language processing tool, such as Stanford CoreNLP, allenlp, and the like. The dependency tree is a directed graph, the nodes in the dependency tree are words of a sentence, and the directed edges in the dependency tree represent syntactic relationships between the words. If there is a certain syntactic relationship between words, there will be a directed edge between the nodes represented by the two words in the directed graph.
In the present invention, the answer sentence processing unit 201 converts the dependency tree into an undirected graph by changing the directed edges of the dependency tree into undirected edges.
Specifically, assuming that the answer sentence has n words, the adjacency matrix of the answer sentence is a matrix M of dimension n×n, and the value M_ij in the i-th row and j-th column of M is determined by the following condition:
[equation shown as an image in the original: as the adjacency matrix of the undirected graph, M_ij = 1 when words i and j are connected by an edge and 0 otherwise]
an answer sentence feature vector extraction unit 202 for inputting the answer sentence and the undirected graph of the answer sentence obtained by the answer sentence processing unit 201 into a graph structure converter (Graph Transformer) to obtain a feature vector of the answer sentence.
Assuming that the answer sentence is composed of n words and that the i-th word is represented by a k-dimensional feature vector V_i, the feature of the answer sentence is expressed as V = (V_1, ..., V_n). The feature V of the answer sentence and the adjacency matrix M of the undirected graph are input to the graph structure converter (Graph Transformer).
The feature extraction process of the graph structure converter is as follows:
1. Perform a Graph Attention operation on the feature V of the answer sentence and the adjacency matrix M of the undirected graph. Specifically, for the feature vector V_i of the i-th word, the Graph Attention is calculated as follows:
[Graph Attention equations shown as images in the original: attention weights between word i and each word j are computed and masked by the adjacency value M_ij, and the masked weights are used to aggregate the word features into an attended vector for word i]
where M_ij is the value in the i-th row and j-th column of the adjacency matrix M obtained by the answer sentence processing unit 201.
2. The output of the Graph Attention operation for word i is added to V_i, and a layer normalization operation is performed:
[equation shown as an image in the original: the attended vector plus V_i is passed through LayerNorm]
where LayerNorm is the layer normalization operation; since layer normalization is prior art, it is not described in detail here.
3. The result of step 2 is fed into a one-layer feedforward neural network, and a layer normalization operation is then performed:
[equation shown as an image in the original: the result of step 2 is passed through FFN and then LayerNorm]
where FFN is a one-layer feedforward neural network.
Thus, for the i-th word, the feature vector V_i is transformed by the graph structure converter (Graph Transformer) into a new feature vector, and the transformed feature vectors of all words together form the Graph-Transformer-encoded features of the answer sentence. Finally, these word features are combined by the operation shown as an equation image in the original to obtain the final feature vector V' of the answer sentence.
a dialogue history feature extraction unit 203 for acquiring the dialogue history of the answer sentence and extracting the feature vector of the dialogue history using a sequence structure converter (Transformer);
specifically, it is assumed that in the dialogue system, the dialogue history of the answer sentence is composed of m sentences, and the m sentences are arranged in order, the m sentences are sequentially spliced into one sentence C, and the sentence C is input to a sequence structure converter (converter) to obtain a feature vector of the dialogue history.
Specifically, suppose that sentence C is composed of r words and that the i-th word in sentence C is represented by a k-dimensional feature vector E_i, so that the feature of sentence C is expressed as E = (E_1, ..., E_r). After E is input into the sequence structure converter (Transformer), the transformed feature of sentence C is obtained (shown as an equation image in the original). The transformed feature of sentence C is then combined by the operation shown as an equation image in the original to obtain the final dialogue history feature E'.
and a diversity sentence generating unit 204 for inputting the feature vector of the answer sentence from the answer sentence feature vector extraction unit 202 and the feature vector of the dialogue history from the dialogue history feature extraction unit 203 into the conditional variational autoencoder to obtain a new answer sentence for the dialogue history.
The conditional variational autoencoder consists of an encoder and a decoder: the dialogue history feature E' is input to the encoder of the conditional variational autoencoder to obtain a normal distribution z', several samples are drawn from z', and each sample is input to the decoder to obtain a different answer sentence.
In summary, the sentence diversity generation method and system in a dialogue system of the present invention extract the dependency tree of the answer sentence and convert it into an undirected graph, input the answer sentence and the undirected graph into a graph structure converter to obtain the feature vector of the answer sentence, extract the feature vector of the dialogue history of the answer sentence using a sequence structure converter, and finally input the feature vector of the answer sentence and the feature vector of the dialogue history into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history, thereby improving the diversity of sentence generation in the dialogue system.
The above embodiments are merely illustrative of the principles of the present invention and its effectiveness, and are not intended to limit the invention. Modifications and variations may be made to the above-described embodiments by those skilled in the art without departing from the spirit and scope of the invention. Accordingly, the scope of the invention is to be indicated by the appended claims.

Claims (8)

1. A sentence diversity generation method in a dialogue system includes the steps of:
s1, extracting a dependency tree of an answer sentence, and converting the dependency tree into an undirected graph;
step S2, inputting the answer sentence and the undirected graph obtained in the step S1 into a graph structure converter to obtain a feature vector of the answer sentence;
step S3, extracting the feature vector of the dialogue history of the answer sentence by using a sequence structure converter;
step S4, inputting the feature vector of the answer sentence obtained in step S2 and the feature vector of the dialogue history obtained in step S3 into a conditional variational autoencoder to obtain a new answer sentence for the dialogue history;
if the answer sentence has n words, the adjacency matrix of the answer sentence is a matrix M of dimension n×n, and the value M_ij in the i-th row and j-th column of M is determined by the following condition:
[equation shown as an image in the original: as the adjacency matrix of the undirected graph, M_ij = 1 when words i and j are connected by an edge and 0 otherwise]
step S2 further comprises:
step S200, performing a Graph Attention operation on the feature V of the answer sentence and the adjacency matrix M of the undirected graph;
for the feature vector V_i of the i-th word, the Graph Attention is calculated as follows:
[Graph Attention equations shown as images in the original]
wherein M_ij is the value in the i-th row and j-th column of the adjacency matrix M in step S1;
step S201, adding the result of the Graph Attention operation to the feature V and performing a layer normalization operation;
step S202, feeding the result of step S201 into a one-layer feedforward neural network and performing a layer normalization operation to obtain the feature vector of the answer sentence.
2. The sentence diversity generation method in a dialog system of claim 1, wherein step S1 further comprises:
step S100, extracting a dependency tree of the answer sentence by using an open source natural language processing tool;
step S101, representing the dependency tree by a directed graph, wherein nodes in the dependency tree are words of sentences, and directed edges in the dependency tree represent syntactic relations among the words;
and step S102, changing the directed edges in the directed graph into undirected edges to obtain an undirected graph of the answer sentence.
3. The sentence diversity generation method in a dialog system of claim 2, wherein: in step S1, the undirected graph is represented by an adjacency matrix.
4. The sentence diversity generation method in a dialog system of claim 1, wherein: in step S3, the m sentences of the dialogue history are acquired, arranged in order, and concatenated end to end into one sentence C, and the sentence C is input to the sequence structure converter to obtain the feature vector of the dialogue history.
5. The sentence diversity generation method in a dialog system of claim 4, wherein: the conditional variational autoencoder consists of an encoder and a decoder; the feature vector E' of the dialogue history obtained in step S3 is input to the encoder of the conditional variational autoencoder to obtain a normal distribution z', several samples are drawn from z', and each sample is input to the decoder to obtain a plurality of different answer sentences.
6. A sentence diversity generation system in a dialog system for implementing the sentence diversity generation method of any of claims 1 to 5, comprising:
an answer sentence processing unit for extracting a dependency tree of an answer sentence and converting the dependency tree into an undirected graph;
an answer sentence feature vector extraction unit configured to input the answer sentence and the undirected graph of the answer sentence obtained by the answer sentence processing unit into a graph structure converter to obtain a feature vector of the answer sentence;
a dialogue history feature extraction unit for extracting feature vectors of dialogue histories of the answer sentences using a sequence structure converter;
and a diversity sentence generating unit for inputting the feature vector from the answer sentence feature vector extraction unit and the feature vector from the dialogue history feature extraction unit into the conditional variational autoencoder to obtain a new answer sentence for the dialogue history.
7. The sentence diversity generation system in a dialog system of claim 6, wherein: in the answer sentence processing unit, the conversion method of the dependency tree into the undirected graph is to change the directed edge into the undirected edge, and the undirected graph is represented by an adjacency matrix.
8. The sentence diversity generation system in a dialog system of claim 6, wherein: in the dialogue history feature extraction unit, the m sentences of the dialogue history are acquired, arranged in order, and concatenated end to end into one sentence C, and the sentence C is input to the sequence structure converter to obtain the feature vector of the dialogue history.
CN201911087246.6A 2019-11-08 2019-11-08 Sentence diversity generation method and system in dialogue system Active CN110866103B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911087246.6A CN110866103B (en) 2019-11-08 2019-11-08 Sentence diversity generation method and system in dialogue system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911087246.6A CN110866103B (en) 2019-11-08 2019-11-08 Sentence diversity generation method and system in dialogue system

Publications (2)

Publication Number Publication Date
CN110866103A CN110866103A (en) 2020-03-06
CN110866103B (en) 2023-07-07

Family

ID=69654516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911087246.6A Active CN110866103B (en) 2019-11-08 2019-11-08 Sentence diversity generation method and system in dialogue system

Country Status (1)

Country Link
CN (1) CN110866103B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543020A (en) * 2018-11-27 2019-03-29 科大讯飞股份有限公司 Inquiry handles method and system
CN109597876A (en) * 2018-11-07 2019-04-09 中山大学 A kind of more wheels dialogue answer preference pattern and its method based on intensified learning
CN109726276A (en) * 2018-12-29 2019-05-07 中山大学 A kind of Task conversational system based on depth e-learning
CN110309287A (en) * 2019-07-08 2019-10-08 北京邮电大学 The retrieval type of modeling dialog round information chats dialogue scoring method

Also Published As

Publication number Publication date
CN110866103A (en) 2020-03-06

Similar Documents

Publication Publication Date Title
CN112528672B (en) Aspect-level emotion analysis method and device based on graph convolution neural network
RU2722571C1 (en) Method of recognizing named entities in network text based on elimination of probability ambiguity in neural network
CN110418210B (en) Video description generation method based on bidirectional cyclic neural network and depth output
CN109964223B (en) Session information processing method and device, storage medium
CN109492113B (en) Entity and relation combined extraction method for software defect knowledge
CN109344242B (en) Dialogue question-answering method, device, equipment and storage medium
CN110704576A (en) Text-based entity relationship extraction method and device
CN111402861A (en) Voice recognition method, device, equipment and storage medium
JP6810580B2 (en) Language model learning device and its program
CN114118417A (en) Multi-mode pre-training method, device, equipment and medium
CN110795549A (en) Short text conversation method, device, equipment and storage medium
CN112528654A (en) Natural language processing method and device and electronic equipment
CN115221306B (en) Automatic response evaluation method and device
CN114925195A (en) Standard content text abstract generation method integrating vocabulary coding and structure coding
CN114254637A (en) Summary generation method, device, equipment and storage medium
CN114154518A (en) Data enhancement model training method and device, electronic equipment and storage medium
CN115994317A (en) Incomplete multi-view multi-label classification method and system based on depth contrast learning
CN112069781A (en) Comment generation method and device, terminal device and storage medium
CN113326367B (en) Task type dialogue method and system based on end-to-end text generation
CN114510576A (en) Entity relationship extraction method based on BERT and BiGRU fusion attention mechanism
WO2019171925A1 (en) Device, method and program using language model
CN114239607A (en) Conversation reply method and device
CN110866103B (en) Sentence diversity generation method and system in dialogue system
CN111813907A (en) Question and sentence intention identification method in natural language question-answering technology
CN116340507A (en) Aspect-level emotion analysis method based on mixed weight and double-channel graph convolution

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
CB03 Change of inventor or designer information

Inventor after: Liang Xiaodan

Inventor after: Chen Bingcheng

Inventor after: Lin Jing

Inventor before: Chen Bingcheng

Inventor before: Liang Xiaodan

Inventor before: Lin Jing

GR01 Patent grant
GR01 Patent grant