CN111507070A - Natural language generation method and device - Google Patents

Natural language generation method and device Download PDF

Info

Publication number
CN111507070A
Authority
CN
China
Prior art keywords
graph
amr
diagram
line
natural language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010297512.4A
Other languages
Chinese (zh)
Other versions
CN111507070B (en)
Inventor
俞凯 (Kai Yu)
赵晏彬 (Yanbin Zhao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AI Speech Ltd
Original Assignee
AI Speech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AI Speech Ltd filed Critical AI Speech Ltd
Priority to CN202010297512.4A priority Critical patent/CN111507070B/en
Publication of CN111507070A publication Critical patent/CN111507070A/en
Application granted granted Critical
Publication of CN111507070B publication Critical patent/CN111507070B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/10 - Text processing
    • G06F40/12 - Use of codes for handling textual entities
    • G06F40/126 - Character encoding
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a natural language generation method and a natural language generation device. The method comprises the following steps: receiving an AMR graph and a line graph corresponding to the AMR graph, and taking the AMR graph and the line graph as the input of an encoder; in the encoder, encoding the AMR graph and the line graph respectively by using a graph neural network, wherein high-order adjacency information of the AMR graph is incorporated in the encoding process by the graph neural network, and the order of the high-order adjacency information is greater than 1; and after the encoding is finished, parsing out each word in the natural language corresponding to the AMR graph by using a decoder. In the scheme of the embodiment of the application, because the relationships between indirectly adjacent nodes are also considered during encoding, the model can better explore the information in the AMR graph.

Description

Natural language generation method and device
Technical Field
The invention belongs to the technical field of natural language generation, and particularly relates to a natural language generation method and device.
Background
In the prior art, Abstract Meaning Representation (AMR) is a sentence-level semantic representation for structurally describing the semantics contained in a sentence. AMR is stored in a computer as a graph structure, in which each node of the graph represents a semantic concept and the edges of the graph represent relationships between semantic concepts. FIG. 1 illustrates an AMR graph (abstract semantic representation graph) that reflects the semantics of the sentence "He runs as fast as the wind".
The abstract semantic text generation task is to restore this highly abstract and structured semantic graph representation to the corresponding natural language. This is a typical "graph-to-sequence" natural language generation task that can be widely applied in intelligent dialog systems. Three approaches are currently popular for this task:
1) a rule-based generation model;
2) a "sequence-to-sequence" (Seq2Seq) model based on traditional recurrent neural networks;
3) a "graph-to-sequence" (Graph2Seq) model based on graph neural networks.
A rule-based system fully considers the various relationships among the nodes in the graph and maps these relationships into the corresponding natural language by constructing a large number of rules. A Seq2Seq model based on a recurrent neural network borrows the idea of neural machine translation and adopts an Encoder-Decoder structure: the AMR is encoded by the encoder, and the corresponding natural language is then parsed out by the decoder. The neural network is trained on a large amount of parallel corpus data to fit a suitable mapping function. However, since the input to the neural network in this task is an AMR with a graph structure rather than a sequence, the graph needs to be serialized by some means before further training. The Graph2Seq model based on graph neural networks is a new model that has emerged in recent years. It also employs an encoder-decoder architecture, but differs in that a graph neural network is used to directly encode the AMR graph structure at the encoder stage, thereby omitting the serialization process.
The inventors found, in the process of implementing the present application, that the prior art schemes mainly have the following defects:
For rule-based models, the rules often cannot cover all patterns due to the complexity and diversity of natural language. The restored natural language is often rather stiff and frequently loses fluency. Rule-based systems have by now been largely phased out.
The Seq2Seq model based on a neural network needs to serialize the graph before encoding it; however, serializing the graph means that the structural information in the graph is lost and the information in the graph cannot be effectively encoded, thereby compromising performance.
The Graph2Seq model based on graph neural networks can retain the structural information of the graph well. However, existing graph neural network models still have two problems: a. current graph neural network structures tend to only consider relationships between neighboring nodes, ignoring higher-order graph adjacency relationships; b. current graph encoders only consider relationships between nodes in the graph, while ignoring relationships between edges. These two disadvantages prevent the model from exploring more graph information, and as graphs become larger and more complex, the performance of the model is greatly reduced.
Disclosure of Invention
An embodiment of the present invention provides a natural language generating method and apparatus, which are used to solve at least one of the above technical problems.
In a first aspect, an embodiment of the present invention provides a natural language generation method, including: receiving an AMR graph and a line graph corresponding to the AMR graph, and taking the AMR graph and the line graph as the input of an encoder; in the encoder, encoding the AMR graph and the line graph respectively by using a graph neural network, wherein high-order adjacency information of the AMR graph is incorporated in the encoding process by the graph neural network, and the order of the high-order adjacency information is greater than 1; and after the encoding is finished, parsing out each word in the natural language corresponding to the AMR graph by using a decoder.
In a second aspect, an embodiment of the present invention provides a natural language generation apparatus, including: a receiving module configured to receive an AMR graph and a line graph corresponding to the AMR graph, and to take the AMR graph and the line graph as the input of an encoder; an encoding module configured to encode, in the encoder, the AMR graph and the line graph respectively by using a graph neural network, wherein the graph neural network encodes the original AMR graph and the line graph by using a graph attention network, and high-order adjacency information of the AMR graph is incorporated in the encoding process, the order of the high-order adjacency information being greater than 1; and a decoding module configured to parse out each word in the natural language corresponding to the AMR graph by using a decoder after the encoding is finished.
In a third aspect, an electronic device is provided, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the natural language generation method of any of the embodiments of the present invention.
In a fourth aspect, the present invention also provides a computer program product, which includes a computer program stored on a non-volatile computer-readable storage medium, the computer program including program instructions which, when executed by a computer, cause the computer to execute the steps of the natural language generation method of any one of the embodiments of the present invention.
According to the scheme provided by the method and the device, an AMR graph and the line graph corresponding to the AMR graph are first received, the AMR graph and the line graph are then respectively input to a graph neural network for encoding, high-order adjacency information of the AMR graph is incorporated in the encoding process, and finally the natural language corresponding to the AMR graph is output through a decoder. Since the relationships between non-directly adjacent nodes are also considered during encoding, the model can better explore the information in the AMR graph.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
FIG. 1 is a diagram illustrating an abstract semantic representation according to an embodiment of the present invention;
FIG. 2 is a flowchart of a natural language generation method according to an embodiment of the present invention;
FIG. 3 shows an original AMR graph and its corresponding line graph;
FIG. 4 shows neighborhood information of different orders;
FIG. 5 shows an overview of a model provided by an embodiment of the present application;
FIG. 6 shows an example of finding a line graph;
FIG. 7 shows the BLEU variation between models with different orders K with respect to AMR graph size;
FIG. 8 shows the BLEU variation between models with different Ke with respect to AMR graph size (left) and the number of reentrant nodes (right);
FIG. 9(a) shows an example comparison between different methods, and FIG. 9(b) shows our method and several baselines;
fig. 10 is a block diagram of a natural language generating apparatus for a cloud server according to an embodiment of the present invention;
fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 2, a flowchart of an embodiment of a natural language generation method of the present application is shown. The natural language generation method of this embodiment may be applied to converting a graph of an abstract semantic representation into natural language, and the present application is not limited herein.
As shown in FIG. 2, in step 201, an AMR graph and a line graph corresponding to the AMR graph are received, and the AMR graph and the line graph are taken as the input of an encoder;
in step 202, in the encoder, the AMR graph and the line graph are encoded separately using a graph neural network;
in step 203, after the encoding is completed, each word in the natural language corresponding to the AMR graph is parsed out using a decoder.
In this embodiment, for step 201, the natural language generating device adopts an encoder-decoder structure. First, the natural language generating device receives an AMR graph and a line graph corresponding to the AMR graph, and takes the AMR graph and the line graph as the input of the encoder.
Then, for step 202, in the encoder, the natural language generating device encodes the AMR graph and the line graph respectively using a graph neural network, wherein high-order adjacency information of the AMR graph is incorporated in the process of encoding with the graph neural network, and the order of the high-order adjacency information is greater than 1. For a node x_i in the graph, its first-order adjacency information R_1(x_i) denotes the set of nodes reachable from x_i within 1 step (its neighbor nodes); the second-order adjacency information R_2(x_i) denotes the set of nodes reachable from x_i within 2 steps; by analogy, the K-th order adjacency information R_K(x_i) denotes the set of nodes reachable from x_i within K steps.
Finally, for step 203, after the encoding is completed, each word in the natural language corresponding to the AMR graph is parsed out using the decoder.
According to the scheme of the embodiment of the application, an AMR graph and the line graph corresponding to the AMR graph are first received, the AMR graph and the line graph are then respectively input to a graph neural network for encoding, high-order adjacency information of the AMR graph is incorporated into the encoding process, and finally the natural language corresponding to the AMR graph is output through a decoder. Since the relationships between non-directly adjacent nodes are also considered during encoding, the model can better explore the information in the AMR graph.
In some optional embodiments, before the receiving of an AMR graph and a line graph corresponding to the AMR graph, the method further comprises: receiving an AMR graph, and converting the attributes of the edges in the AMR graph into corresponding nodes to generate the line graph corresponding to the AMR graph, wherein the line graph reflects the relationships among the edges of the AMR graph. Therefore, when only the AMR graph exists, the line graph corresponding to the AMR graph is obtained first and then the subsequent processing is carried out, so that the user only needs to provide the AMR graph to obtain the corresponding line graph.
In some optional embodiments, the encoding of the AMR graph and the line graph respectively using a graph neural network comprises: in the graph neural network, adopting a graph attention network to encode the original AMR graph and the line graph respectively.
In some optional embodiments, after the AMR graph and the line graph are encoded respectively using a graph neural network, the method further comprises: after the encoding is completed, performing information transfer between the AMR graph and the line graph by using an attention mechanism, so as to model the relationships between the nodes and the edges in the AMR graph and the line graph. In this way, the relationships between edges are fused, which further enhances the model's ability to encode the graph.
In some optional embodiments, the parsing out of each word in the natural language corresponding to the AMR graph using a decoder comprises: iteratively parsing out each word in the natural language corresponding to the AMR graph using the decoder in an autoregressive manner.
In some optional embodiments, the order of the high-order adjacency information may be greater than or equal to 1, and further, the order of the high-order adjacency information may be greater than or equal to 4. The inventors found that when the order of the high-order adjacency information is greater than or equal to 4, the performance of the whole model is optimal.
The following description is provided to enable those skilled in the art to better understand the present disclosure by describing some of the problems encountered by the inventors in implementing the present disclosure and by describing one particular embodiment of the finally identified solution.
The inventors found, in the process of implementing the present application, that the root cause of the above drawbacks can be generalized as a graph encoding problem. Existing models cannot effectively encode the information of the graph structure and cannot comprehensively discover the various kinds of information in the graph, so when the graph is converted into natural language, the encoding errors further harm the text generation performance.
The inventors also found that, in order to solve the graph encoding problem, many researchers tend to better encode the graph by optimizing the model structure, but neglect extracting richer information from the graph itself for encoding.
The scheme of the embodiment of the application provides two improvement strategies. First, we incorporate high-order adjacency information into the graph encoding process, as shown in fig. 2. For a node x_i in the graph, its first-order adjacency information R_1(x_i) denotes the set of nodes reachable from x_i within 1 step (its neighbor nodes); the second-order adjacency information R_2(x_i) denotes the set of nodes reachable from x_i within 2 steps; by analogy, the K-th order adjacency information R_K(x_i) denotes the set of nodes reachable from x_i within K steps.
The traditional graph neural network model only considers the first-order adjacency information R_1(x_i): when the graph neural network operation is carried out, each node only interacts with its adjacent nodes. In our method, each node interacts with the nodes in R_1(x_i), R_2(x_i), ..., R_K(x_i) respectively. In this way, the relationships between non-directly adjacent nodes are also taken into account during encoding, so that the model can better exploit the information in the AMR graph.
Second, we further consider the relationships between edges in the graph, in addition to the relationships between nodes in the AMR graph. To achieve this, we introduce a line graph in the input to the model to reflect the relationship between edges in the AMR graph.
The line graph is a concept in graph theory. It is defined as follows: given a graph G, its line graph L(G) satisfies two conditions:
1. each node of L(G) represents an edge in G;
2. two nodes of L(G) are adjacent if and only if the edges they represent share a common endpoint in G.
FIG. 3 shows the original AMR graph and its corresponding line graph. It can be seen that the attributes on the edges of the original AMR graph are converted into corresponding nodes and separately form a new graph (the line graph). This graph reflects the relationships between the edges. In graph encoding, we also encode the line graph corresponding to the original AMR graph. Fusing the relationships between edges further enhances the model's ability to encode the graph.
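The conversion from labeled AMR edges to line-graph nodes can be sketched as follows; the triple-based edge format and the example labels are simplified assumptions for illustration, not the exact data structures of the embodiment.

    def build_line_graph(edges):
        # edges: labeled AMR edges as (source, label, target) triples.
        # Returns one line-graph node per edge, plus line-graph edges that
        # connect two edge-nodes whose original edges share an endpoint.
        nodes = [label for (_, label, _) in edges]
        line_edges = []
        for i, (s1, _, t1) in enumerate(edges):
            for j, (s2, _, t2) in enumerate(edges):
                if i < j and {s1, t1} & {s2, t2}:
                    line_edges.append((i, j))
        return nodes, line_edges

    amr_edges = [("run-02", "ARG0", "he"),
                 ("run-02", "manner", "fast"),
                 ("fast", "compared-to", "wind")]
    print(build_line_graph(amr_edges))
    # (['ARG0', 'manner', 'compared-to'], [(0, 1), (1, 2)])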
FIG. 5 shows the structure of our model, which also follows the encoder-decoder structure. On the encoder side, the encoder accepts the original AMR graph and its line graph as input. The two input graphs are each encoded using a graph neural network. In this system, a graph attention network (GAT) is adopted to encode the graphs, and high-order adjacency information of the graphs is incorporated in the encoding process. After the original AMR graph and its line graph are encoded, messages are also passed between the two graphs. The purpose of this step is to model the relationships between the nodes and the edges in the graphs. In this model, we use an attention mechanism (Attention) to accomplish the information transfer between the graphs.
After the AMR graph is encoded, we use the decoder to further decode the encoded graph and generate the corresponding natural language. The decoder adopts a traditional Transformer structure and iteratively parses out each word in the natural language in an autoregressive manner. This completes the process from the abstract semantic representation (AMR) to natural language.
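The autoregressive decoding loop mentioned above can be summarized by the following sketch; here `step_fn` is a hypothetical placeholder standing in for one step of the Transformer decoder (which would also consume the encoder output).

    def greedy_decode(step_fn, bos_id, eos_id, max_len=50):
        # Autoregressive decoding: at every step the decoder consumes the tokens
        # generated so far and emits the next word, until an end-of-sentence
        # token is produced or the length limit is reached.
        tokens = [bos_id]
        for _ in range(max_len):
            next_id = step_fn(tokens)
            tokens.append(next_id)
            if next_id == eos_id:
                break
        return tokens

    # Toy step function that "generates" a fixed sentence ending in EOS (id 0).
    canned = [7, 4, 9, 0]
    print(greedy_decode(lambda toks: canned[len(toks) - 1], bos_id=1, eos_id=0))
    # [1, 7, 4, 9, 0]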
The embodiment of the application can achieve the following technical effects: the scheme is mainly applied to graph-to-sequence generation tasks, and by introducing the high-order adjacency information of the graph and the relationships between the edges in the graph, the model's ability to encode the graph can be greatly improved, and more fluent and correct natural text can be generated. More deeply, traditional models suffer a large performance drop when the graph becomes larger and more complex; because our model fuses more information, it has greater advantages in such complex situations.
The technical solutions of the present application are analyzed and tested in the following procedures and results, so that those skilled in the art can better understand the solutions of the present application.
Abstract semantic text generation based on line graph enhancement and high-order graph neural network
Abstract
In many graph-based models, efficient structural encoding of graphs containing attributed edges is an important but challenging aspect. This work mainly studies text generation from abstract semantic representations: a graph-to-sequence task aimed at recovering the corresponding natural language from an abstract meaning representation (AMR). Existing graph-to-sequence methods typically use graph neural networks as their encoders, but have two major drawbacks: 1) the message propagation process in the AMR graph only considers the relationships between adjacent nodes and ignores higher-order graph adjacency relationships; 2) only the relationships between nodes in the graph are considered, while the interrelationships between edges are ignored. In the embodiment of the application, a novel graph encoding framework is provided, which can effectively explore the relationships between the edges in a graph. We also incorporate higher-order adjacency information into the graph attention network to facilitate the model's encoding of the rich structure in AMR graphs. Experimental results show that the method provided by the embodiment of the application achieves the best performance on the English AMR benchmark datasets. Experimental analysis also shows that edge relationships and high-order information are very helpful for graph-to-sequence modeling.
1 Introduction
Abstract Meaning Representation (AMR) is a sentence-level semantic representation formalized as a directed graph, where nodes are concepts and edges are semantic relationships. Because AMR is a highly structured meaning representation, it can facilitate many semantics-related tasks such as machine translation and summarization. However, using AMR graphs can be challenging, because it is not easy to fully capture the rich structural information in graph-based data, especially when the edges of the graph carry attribute labels. The purpose of AMR-to-text generation is to convert AMR semantics into the surface form (natural language).
With continued reference to FIG. 3, an original AMR graph is shown together with its corresponding concept and relation (line) graphs. The natural language meaning of the AMR on the left side of FIG. 3 is "He runs as fast as the wind". The two graphs are aligned with each other based on the node-edge relationships in the original graph.
This is a basic graph-to-sequence task that takes AMR directly as input. FIG. 3 (left) shows a standard AMR graph and its corresponding natural language form. Early work utilized the sequence-to-sequence framework by linearizing the entire graph. Such representations may lose useful structural information. In recent studies, Graph Neural Networks (GNNs) have dominated this task and achieved the most advanced performance. However, in these GNN-based models, the representation of each concept node is updated only by aggregating information from its neighbors, which results in two limitations: 1) the interaction between indirectly connected nodes depends to a large extent on the number of stacked layers; as the size of the graph becomes larger, the dependencies between distant AMR concepts cannot be sufficiently explored. 2) They only focus on modeling the relationships between concepts and ignore the edge relationships and their structure. Some researchers use the Transformer to model arbitrary pairs of concepts, whether directly connected or not, but they still ignore the topology of the edges throughout the AMR graph.
To address the above limitations, we propose a novel graph-to-sequence model based on the graph attention network. We convert the edge labels into relation nodes and construct a new graph that directly reflects the edge relationships. In graph theory, such a graph is referred to as a line graph. As shown in FIG. 3, we therefore split the original AMR graph into two subgraphs without labeled edges: a concept graph and a relation graph. These two graphs describe the dependencies of the AMR concepts and of the edges, respectively, which helps to model these relationships (especially for edges). Our model takes these subgraphs as input, and the communication between the two graphs is based on an attention mechanism. Furthermore, for both graphs, we mix higher-order adjacency information into the respective graph encoders in order to model the relationships between indirectly connected nodes.
Empirical studies on two English benchmark datasets show that our model achieves state-of-the-art performance of 30.58 and 32.46 BLEU on LDC2015E86 and LDC2017T10, respectively.
We propose a novel graph-to-sequence model that, for the first time, models the relationships between AMR edges using a line graph.
We integrate higher-order neighborhood information into the graph encoder to model the relationships between indirectly connected nodes.
We show that both high-order adjacency information and edge relationships are important for graph-to-sequence modeling.
2 Mixed-Order Graph Attention Network
Here we first introduce the graph attention network (GAT) and its mixed-order extension, which are the basis of our proposed model.
2.1 Graph Attention Networks (GAT)
GAT is a special type of network that processes graph-structured data through an attention mechanism. It is given a graph G = (V, E), where V is the set of nodes x_i and E is the set of edges (e_ij, l_e), in which l_e is an edge label that is not considered in the GAT layer.
FIG. 4 shows neighborhood information of different orders.
N(x_i) denotes the nodes directly connected to x_i, and N+(x_i) is the set comprising x_i and all its direct neighbors, i.e. N+(x_i) = N(x_i) ∪ {x_i}.
Each node x_i in the graph has an initial feature h_i^0 ∈ R^d, where d is the feature dimension. The representation of each node is iteratively updated by the graph attention operation. In step l, each node x_i aggregates context information by attending over its neighbors and itself. The updated representation h_i^l is calculated as a weighted average of the connected nodes:

h_i^l = σ( Σ_{j ∈ N+(x_i)} α_ij W^l h_j^{l-1} )

The attention coefficient α_ij is calculated as:

α_ij = exp( σ( a^T [ U^l h_i^{l-1} ; V^l h_j^{l-1} ] ) ) / Σ_{k ∈ N+(x_i)} exp( σ( a^T [ U^l h_i^{l-1} ; V^l h_k^{l-1} ] ) )

where σ is a non-linear activation function, e.g. ReLU, and W^l, U^l, V^l and a are learnable projection parameters. After L steps, each node finally obtains a context-aware representation h_i^L. To achieve a stable training process, we also use residual connections followed by layer normalization between two graph attention layers.
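A minimal single-head sketch of this update in PyTorch is given below, assuming node features are stored in a matrix and N+(x_i) is supplied as a 0/1 adjacency mask; the exact parameterization of the score function here (W, U, V, a) is an assumption rather than the precise form used in the embodiment.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SimpleGATLayer(nn.Module):
        # Single-head graph attention update; adj_mask[i, j] = 1 iff x_j is in N+(x_i).
        def __init__(self, d):
            super().__init__()
            self.W = nn.Linear(d, d, bias=False)   # value projection
            self.U = nn.Linear(d, d, bias=False)   # query projection
            self.V = nn.Linear(d, d, bias=False)   # key projection
            self.a = nn.Linear(2 * d, 1, bias=False)

        def forward(self, h, adj_mask):
            n = h.size(0)
            q = self.U(h).unsqueeze(1).expand(n, n, -1)
            k = self.V(h).unsqueeze(0).expand(n, n, -1)
            scores = self.a(torch.cat([q, k], dim=-1)).squeeze(-1)   # (n, n) pairwise scores
            scores = scores.masked_fill(adj_mask == 0, float("-inf"))
            alpha = F.softmax(scores, dim=-1)                        # attention over N+(x_i)
            return F.relu(alpha @ self.W(h))                         # weighted average + activation

    h = torch.randn(4, 8)
    adj = torch.eye(4) + torch.tensor([[0., 1., 1., 0.],
                                       [1., 0., 0., 0.],
                                       [1., 0., 0., 1.],
                                       [0., 0., 1., 0.]])
    print(SimpleGATLayer(8)(h, adj).shape)  # torch.Size([4, 8])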
2.2 Mixing Higher-Order Information
In a traditional graph attention layer, the relationships between indirectly connected nodes are ignored. However, Mix-Order GAT can explore these relationships in a single-step operation by mixing higher-order adjacency information. Before describing the details of Mix-Order GAT, we first give some notation. We use R^K = {R_1, ..., R_K} to denote neighborhood information from order 1 to order K. R_k(x_i) denotes the k-th order neighborhood, which means that every node in R_k(x_i) is reachable from x_i within k hops.
R_1(x_i) = N+(x_i). As shown in FIG. 4, we can derive:

R_1(x_i) ⊆ R_2(x_i) ⊆ ... ⊆ R_K(x_i)

K-Mix GAT integrates the neighborhood information R^K. In the l-th update step, each x_i interacts with its reachable neighbors of different orders and computes attention features independently. The representation h_i^l is updated by concatenating features from the different orders, i.e.

h_i^l = ||_{k=1..K} σ( Σ_{j ∈ R_k(x_i)} α_ij^k W_k^l h_j^{l-1} )

where || denotes concatenation, α_ij^k is the attention weight of the k-th order, and W_k^l are the learnable projection weights. In the next section, we will denote the Mix-Order GAT layer by MixGAT(·).
3 Method
FIG. 5 shows an overview of our proposed model. As described above, we split the AMR graph into two subgraphs without labeled edges. Our model follows the encoder-decoder architecture, where the encoder takes the two subgraphs as inputs and the decoder generates the corresponding text from the encoded information. We first give some details about the line graph and the input representation.
3.1 Line graph & input representation
The line graph of a graph G is another graph L(G) that represents the adjacency relationships between the edges of G. L(G) is defined as follows: each node of L(G) represents an edge of G; two nodes of L(G) are adjacent if and only if their corresponding edges share a common node in G.
As described above, we split the original AMR graph into a concept graph G_c = (V_c, E_c) and a relation (line) graph G_e = (V_e, E_e), where E_c and E_e are sets of edges without label information. Since the edges in both G_c and G_e carry no attribute labels, both graphs can be efficiently encoded by Mix-Order GAT. We use R^K_c and R^K_e to denote the 1-st to K-th order neighborhood information of G_c and G_e, respectively. Each concept node x_i ∈ V_c is represented by an initial embedding c_i^0, and each relation node y_i ∈ V_e by an embedding e_i^0. The node embeddings are written as C^0 = [c_1^0, ..., c_m^0] and E^0 = [e_1^0, ..., e_n^0], where m = |V_c| and n = |V_e| denote the numbers of concept nodes and relation nodes, respectively. Thus, the input to our system can be represented by {C^0, E^0, R^K_c, R^K_e}.
FIG. 6 shows an example of deriving a line graph. On the left, e1 and e2 have opposite directions, so each direction is kept in the line graph. On the right, e1 and e2 follow the same direction, so there is only one direction in the corresponding line graph.
3.2 self-update
The encoder of our system consists of N stacked graph encoding layers. As shown in FIG. 5, each encoding layer has two parts: a per-graph self-update and masked cross-attention. For G_c and G_e, we use C^{l-1} and E^{l-1} to denote the input node embeddings of the l-th encoding layer. The mixed-order graph attention network (MixGAT) updates the two representations separately. In step (layer) l, we have:

C_fwd^l = MixGAT( C^{l-1}, R^K_c )
E_fwd^l = MixGAT( E^{l-1}, R^K_e )

where C_fwd^l and E_fwd^l are the representations updated based on the mixed-order neighborhood information R^K_c and R^K_e. One thing to note is that G_c and G_e are both directed graphs. This means that information propagation in the graphs follows a top-down manner along the pre-specified direction. However, one-way propagation may lose structural information in the reverse direction. To establish bi-directional communication, we use dual graphs. A dual graph has the same node representations as the original graph, but its edges point in the opposite direction. For example, if edge A → B is in the original graph, it becomes B → A in the corresponding dual graph. Since the dual graph has the same node representations, we only need to change the neighborhood information. Let the dual graphs of G_c and G_e have neighborhood information ~R^K_c and ~R^K_e. We have:

C_bwd^l = MixGAT( C^{l-1}, ~R^K_c )
E_bwd^l = MixGAT( E^{l-1}, ~R^K_e )

Since the node embeddings have been updated in both directions, the final representation of the independent graph-update process is a combination of the bi-directional embeddings, i.e.

C_self^l = [ C_fwd^l ; C_bwd^l ] W_c
E_self^l = [ E_fwd^l ; E_bwd^l ] W_e

where W_c and W_e are trainable projection matrices, and C_self^l and E_self^l are the results of the self-update process.
3.3 Cross-attention mechanism with mask (Masked Cross Attention)
From the structure of the original AMR graph, we can easily establish an alignment between G_c and G_e: a relation node y_i is aligned directly with a concept node x_j if x_j is the start or end of the edge corresponding to y_i. As shown in FIG. 3, ARG0 is the edge between run-02 and he; as a result, node ARG0 in G_e is directly aligned with run-02 and he in G_c. Based on this alignment, we define a mask matrix M ∈ R^{n×m}: for each element m_ij of M, let m_ij = 0 if y_i ∈ V_e is aligned with x_j ∈ V_c, and m_ij = -∞ otherwise.
Given C_self^l and E_self^l from the self-update step, the masked cross-attention weight matrix A^l is calculated as:

A^l = softmax( ( E_self^l W_q ) ( C_self^l W_k )^T + M )

where W_q and W_k are learnable projection matrices. According to M, the weight scores of non-aligned pairs are set to -∞. For the nodes in E_self^l, A^l is used to retrieve the related representations from C_self^l:

E_cross^l = A^l C_self^l

where E_cross^l is the masked weighted sum over C_self^l. The same calculation is performed for the nodes in C_self^l:

C_cross^l = ( A^l )^T E_self^l

The final output of the graph encoding layer is a combination of the original embedding and the context representation from the other graph. We also take the output of the previous layer as a residual input, i.e.

C^l = FFN( [ C_self^l ; C_cross^l ] ) + C^{l-1}
E^l = FFN( [ E_self^l ; E_cross^l ] ) + E^{l-1}

where FFN is a feed-forward network consisting of two linear transformations. After N stacked graph encoding layers, the two graphs G_c and G_e are finally encoded as C^N and E^N.
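A compact sketch of this masked cross-attention between the two graphs is given below; the scaled dot-product score and the toy alignment matrix are illustrative assumptions rather than the exact formulation of the embodiment.

    import torch
    import torch.nn.functional as F

    def masked_cross_attention(C, E, align):
        # C: concept representations (m, d); E: relation representations (n, d).
        # align[i, j] is True iff relation node y_i is aligned with concept node x_j;
        # non-aligned pairs receive a score of -inf, matching the mask matrix M.
        mask = torch.zeros(align.shape)
        mask[~align] = float("-inf")
        scores = (E @ C.t()) / C.size(-1) ** 0.5 + mask          # (n, m)
        E_cross = F.softmax(scores, dim=-1) @ C                  # relation nodes read from concepts
        C_cross = F.softmax(scores.t(), dim=-1) @ E              # concept nodes read from relations
        return C_cross, E_cross

    C = torch.randn(3, 8)    # e.g. concepts: run-02, he, fast
    E = torch.randn(2, 8)    # e.g. relations: ARG0, manner
    align = torch.tensor([[True, True, False],
                          [True, False, True]])
    C_cross, E_cross = masked_cross_attention(C, E, align)
    print(C_cross.shape, E_cross.shape)  # torch.Size([3, 8]) torch.Size([2, 8])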
3.4 decoder
The decoder of our system is similar to the Transformer decoder. At each generation step, the representation of the output word is updated through multiple rounds of attention over both the previously generated tokens and the encoder output. Note that the output of our graph encoder is divided into two parts: the concept representation C^N and the relation representation E^N. The concept information is more important for generation because the concept graph directly contains natural words. Through the multi-step cross-attention, C^N also carries rich relation information. For simplicity, we use only C^N as the encoder output on the decoder side. To alleviate the data sparsity problem in sequence generation, we use Byte Pair Encoding (BPE) following Zhu et al. We split the word nodes in the AMR graph and the reference sentences into sub-words, and the decoder vocabulary is shared with the encoder of the concept graph.
4 Experiments
4.1 Settings
Data and preprocessing. We performed experiments using two benchmark datasets, LDC2015E86 and LDC2017T10. These two datasets contain 16833 and 36521 training samples, respectively, and share a common development set containing 1368 samples and a common test set containing 1371 samples.
Training details
For the model parameters, the number of graph encoding layers is fixed to 6 and the representation dimension d is set to 512. We set the graph neighborhood order K to 1, 2 and 4 for G_c and G_e. The Transformer decoder has 6 layers, size 512 and 8 heads, and is based on Open-NMT. We use Adam (Kingma and Ba, 2015) as the optimizer with β = (0.9, 0.98). Similar to Vaswani et al., the learning rate varies during training:

lr = γ · d^{-0.5} · min( t^{-0.5}, t · w^{-1.5} ),    (13)

where t denotes the cumulative training step and w denotes the warm-up step. We use w = 16000 with the coefficient γ set to 0.75. For the batch size, we use 80 for LDC2015E86 and 120 for LDC2017T10.
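The warm-up schedule of equation (13) can be written directly as a small helper; the parameter values are the ones given in the text above, and the printed steps are arbitrary illustrative points.

    def learning_rate(t, d=512, w=16000, gamma=0.75):
        # lr = gamma * d^(-0.5) * min(t^(-0.5), t * w^(-1.5)),
        # where t is the cumulative training step and w the number of warm-up steps.
        return gamma * d ** -0.5 * min(t ** -0.5, t * w ** -1.5)

    for step in (1, 1000, 16000, 100000):
        print(step, round(learning_rate(step), 8))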
4.2 Results
We compare our system with several baselines, including a traditional sequence-to-sequence model, several graph-to-sequence models with various graph encoders, and Transformer-based models.
The results of the experiments on the LDC2015E86 and LDC2017T10 test sets are reported in Table 1. We can see that the sequence-based models perform the worst because they lose useful structural information in the graph.
Table 1: primary results of our method and several baselines on the test sets of LDC2015E86 and LDC2017T10.
In the table, the Model column shows the names of the various models; Sequence-Based Model denotes a sequence-based model, Graph-Based Model denotes a graph-based model, Transformer-Based Model denotes a Transformer-based model, and Our Approach denotes our method. LDC2015E86 and LDC2017T10 are the names of the test datasets. BLEU: Bilingual Evaluation Understudy, an automatic translation evaluation metric. Meteor: Metric for Evaluation of Translation with Explicit ORdering, a translation evaluation metric with explicit ordering.
Compared with previous research, our method with K = 4 order neighborhood information achieves the best BLEU scores, improving over the previous state-of-the-art models on the two datasets by 0.92, and a similar phenomenon can be found on the Meteor metric.
5 Analysis
As mentioned above, our system has two key points: high-order graph neighborhood information and the relationships between AMR edges. To verify the effectiveness of these two designs, we performed a series of experimental tests based on different characteristics of the graph.
5.1 Neighborhood information ablation study
As shown in Table 1, if a graph node interacts only with its direct neighbors (K = 1), the performance is worse than previous Transformer-based models. But when we integrate high-order adjacency information, a significant improvement can be observed: the BLEU score increases by 1.94 and 2.50 on LDC2015E86 and LDC2017T10, respectively, as K grows from 1 to 4. As described above, if only first-order neighborhoods are considered, the dependencies between distant AMR concepts cannot be fully explored as the size of the graph grows larger.
FIG. 7 shows the BLEU variation between models with different orders K with respect to the AMR graph size.
FIG. 8 shows the BLEU variation between models with different Ke with respect to the AMR graph size (left) and the number of reentrant nodes (right).
5.2 Ablation study of labeled edge relationships
We take into account the relationships between the labeled edges in AMR graphs by integrating the line graph (relation graph) Ge into the system. This section analyzes the effectiveness of this contribution. In the previous setup, the graph neighborhood order K was the same for Gc and Ge. For the ablation test, we fix the neighborhood order Kc of Gc and vary the order Ke of the relation graph Ge. We set Ke = 0, 1 and 4, where Ke = 0 indicates that a relation node in Ge can only interact with itself. This means that dependencies between AMR edges are completely ignored and the edge information is simply combined with the corresponding concepts. We report the results on the two test sets in Table 2.
Table 2: model results for the relation graph Ge with different neighborhood orders. BLEU scores that differ significantly from the best model are marked with * (p < 0.01), tested by bootstrap resampling. If we ignore the dependencies between AMR edges (Ke = 0), performance drops significantly: the BLEU scores on LDC2015E86 and LDC2017T10 drop by 1.69 and 1.38, respectively. When Ke > 0, performance increases, which means that edge relationships do provide benefits for graph encoding and sequence generation. When Ke = 4, the edge relationships are fully explored across different neighborhood orders and the best performance is achieved on both datasets.
We also investigated the effectiveness of edge relationships when dealing with reentrant nodes. A reentrant node is a node with multiple parents. Such structures can easily be identified in AMR graphs. We believe that the relation graph Ge helps to explore different dependencies involving the same concept, which can benefit graphs containing more reentrancies. To test this hypothesis, we split the test set into different partitions according to the number of reentrancies and evaluated our model with Ke = 4 and Ke = 0 on the different partitions. As shown in FIG. 8 (right), when the number of reentrancies increases to 5, the difference between the two becomes large. Moreover, the edge relationships are more important for processing graphs with reentrancies than for handling graph size.
5.3 Case study
To understand the model performance more deeply, Table 3 provides some examples. The AMR graphs in all examples contain reentrant nodes (marked in bold). In example (a), the concepts of the reentrant node's two parents are the same (want). While our Ke = 0 model successfully finds that it is the subject of both want concepts, it fails to recognize the parallel relationship between the money and face objects and treats face as a verb. In contrast, our model with Ke = 4 perfectly recovers the parallel structure in the AMR graph and reconstructs the correct sentence.
In example (b), we compare the best model with two baselines: GCNSEQ and the Structured Transformer (denoted ST-Transformer). The AMR graph in example (b) has two reentrancies, which makes it more difficult for the model to recover the corresponding sentence. As we can see, the conventional graph-based model GCNSEQ cannot predict the correct subject of the predicate can. The Structured Transformer uses the correct subject, but the resulting sentence is not very fluent due to the presence of the redundant word "people". This over-generation problem is mainly due to reentrancy. However, our model can efficiently deal with this problem and generate a correct sentence with the correct semantics.
An example comparison between different methods is shown in FIG. 9(a) (upper part). FIG. 9(b) (lower part) shows our method and several baselines.
6 Related work
AMR-to-text generation is a typical graph-to-sequence task. Early studies employed rule-based approaches to address this problem: using a two-stage approach, the graph is first partitioned into spanning trees and the natural language is then generated using multiple tree transducers, or graph-to-string rules are learned using heuristic extraction algorithms. More work treats the task as a translation problem and uses phrase-based or neural models; these methods typically require linearizing the input graph by a depth-first traversal, and better sequence-based models are obtained by using additional syntactic information.
Turning to graph-to-sequence approaches, researchers first demonstrated that graph neural networks can significantly improve generation performance by explicitly encoding the structure of the graph. Since then, new approaches with various graph encoders have been proposed in recent years, such as the graph LSTM, Gated Graph Neural Networks (GGNN), and graph convolutional neural networks.
Our model follows the same spirit of exploring relationships between indirectly connected nodes, but our approach is very different: (1) we use a graph-based method integrated with higher-order adjacency information while preserving the explicit structure of the graph; (2) we are the first to consider the relationships between labeled edges by introducing a line graph.
Conclusion and future work
In this work, we propose a novel graph-to-sequence generation method that, for the first time, uses a line graph to model the relationships between the labeled edges in the original AMR graph.
Referring to fig. 10, a block diagram of a natural language generating apparatus for a speech module according to an embodiment of the present invention is shown.
As shown in fig. 10, a natural language generating apparatus 1000 includes a receiving module 1010, an encoding module 1020, and a decoding module 1030.
The receiving module 1010 is configured to receive an AMR graph and a line graph corresponding to the AMR graph, and to take the AMR graph and the line graph as the input of an encoder; the encoding module 1020 is configured to encode, in the encoder, the AMR graph and the line graph respectively using a graph neural network, wherein the graph neural network encodes the original AMR graph and the line graph using a graph attention network, and high-order adjacency information of the AMR graph is incorporated in the encoding process, the order of the high-order adjacency information being greater than 1; and the decoding module 1030 is configured to parse out each word in the natural language corresponding to the AMR graph using a decoder after the encoding is completed.
In some optional embodiments, the natural language generating apparatus further includes: and a node and edge relation modeling module (not shown in the figure) configured to perform information transfer between the AMR graph and the line graph by using an attention mechanism after the encoding is completed so as to model the relation between the nodes and the edges in the AMR graph and the line graph.
It should be understood that the modules recited in fig. 10 correspond to various steps in the method described with reference to fig. 2. Thus, the operations and features described above for the method and the corresponding technical effects are also applicable to the modules in fig. 10, and are not described again here.
It should be noted that the modules in the embodiments of the present application are not intended to limit the scheme of the present application; for example, the decoding module may be described as a module that, after the encoding is completed, uses a decoder to parse out each word in the natural language corresponding to the AMR graph. In addition, the related functional modules may also be implemented by a hardware processor; for example, the result returning module may also be implemented by a processor, which is not described herein again.
In other embodiments, the present invention further provides a non-volatile computer storage medium, where the computer storage medium stores computer-executable instructions, and the computer-executable instructions can execute the natural language generation method in any of the above method embodiments;
as one embodiment, a non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
receiving an AMR graph and a line graph corresponding to the AMR graph, and taking the AMR graph and the line graph as the input of an encoder;
in the encoder, encoding the AMR graph and the line graph respectively by using a graph neural network, wherein high-order adjacency information of the AMR graph is incorporated in the encoding process by the graph neural network, and the order of the high-order adjacency information is greater than 1;
and after the encoding is finished, parsing out each word in the natural language corresponding to the AMR graph by using a decoder.
The non-volatile computer-readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the natural language generating apparatus, and the like. Further, the non-volatile computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-transitory computer-readable storage medium optionally includes memory located remotely from the processor, which may be connected to the natural language generating device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Embodiments of the present invention also provide a computer program product, which includes a computer program stored on a non-volatile computer-readable storage medium, where the computer program includes program instructions, which, when executed by a computer, cause the computer to execute any one of the above-mentioned natural language generation methods.
Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 11, the electronic device includes: one or more processors 1110 and a memory 1120, with one processor 1110 being an example in fig. 11. The apparatus of the natural language generating method may further include: an input device 1130 and an output device 1140. The processor 1110, the memory 1120, the input device 1130, and the output device 1140 may be connected by a bus or other means, and the bus connection is exemplified in fig. 11. The memory 1120 is a non-volatile computer-readable storage medium as described above. The processor 1110 executes various functional applications of the server and data processing by executing nonvolatile software programs, instructions, and modules stored in the memory 1120, that is, implements the natural language generation method of the above-described method embodiment. The input device 1130 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the natural language generating device. The output device 1140 may include a display device such as a display screen.
The product can execute the method provided by the embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided by the embodiment of the present invention.
As an embodiment, the electronic device is applied to a natural language generating apparatus, and is used for a speech module, and includes:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to:
receiving an AMR graph and a line graph corresponding to the AMR graph, and taking the AMR graph and the line graph as the input of an encoder;
in the encoder, encoding the AMR graph and the line graph respectively by using a graph neural network, wherein high-order adjacency information of the AMR graph is incorporated in the encoding process by the graph neural network, and the order of the high-order adjacency information is greater than 1;
and after the encoding is finished, parsing out each word in the natural language corresponding to the AMR graph by using a decoder.
The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) a mobile communication device: such devices are characterized by mobile communications capabilities and are primarily targeted at providing voice, data communications. Such terminals include: smart phones (e.g., iphones), multimedia phones, functional phones, and low-end phones, among others.
(2) Ultra mobile personal computer device: the equipment belongs to the category of personal computers, has calculation and processing functions and generally has the characteristic of mobile internet access. Such terminals include: PDA, MID, and UMPC devices, etc., such as ipads.
(3) A portable entertainment device: such devices can display and play multimedia content. This type of device comprises: audio, video players (e.g., ipods), handheld game consoles, electronic books, and smart toys and portable car navigation devices.
(4) A server: the device for providing the computing service comprises a processor, a hard disk, a memory, a system bus and the like, and the server is similar to a general computer architecture, but has higher requirements on processing capacity, stability, reliability, safety, expandability, manageability and the like because of the need of providing high-reliability service.
(5) And other electronic devices with data interaction functions.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A natural language generation method, comprising:
receiving an AMR graph and a line graph corresponding to the AMR graph, and taking the AMR graph and the line graph as the input of an encoder;
in the encoder, encoding the AMR graph and the line graph respectively by using a graph neural network, wherein high-order adjacency information of the AMR graph is incorporated in the encoding process by the graph neural network, and the order of the high-order adjacency information is greater than 1;
and after the encoding is finished, parsing out each word in the natural language corresponding to the AMR graph by using a decoder.
2. The method of claim 1, wherein before the receiving of an AMR graph and a line graph corresponding to the AMR graph, the method further comprises:
receiving an AMR graph, and converting the attributes of the edges in the AMR graph into corresponding nodes to generate the line graph corresponding to the AMR graph, wherein the line graph reflects the relationships among the edges of the AMR graph.
3. The method of claim 1, wherein the encoding of the AMR graph and the line graph respectively with a graph neural network comprises:
in the graph neural network, encoding the original AMR graph and the line graph respectively with a graph attention network.
4. The method of claim 1, wherein after the encoding of the AMR graph and the line graph respectively with a graph neural network, the method further comprises:
after encoding is completed, performing information transfer between the AMR graph and the line graph with an attention mechanism, so as to model the relationships between the nodes and the edges in the AMR graph and the line graph.
5. The method of any of claims 1 to 4, wherein the parsing out of each word of the natural language corresponding to the AMR graph with a decoder comprises:
iteratively parsing out each word of the natural language corresponding to the AMR graph with the decoder in an autoregressive manner.
6. The method of claim 5, wherein the order of the higher-order adjacency information is greater than or equal to 4.
7. A natural language generation apparatus comprising:
a receiving module configured to receive an AMR graph and a line graph corresponding to the AMR graph, and to use the AMR graph and the line graph as the input of an encoder;
an encoding module configured to encode, in the encoder, the AMR graph and the line graph respectively with a graph neural network, wherein the graph neural network encodes the original AMR graph and the line graph with a graph attention network, and higher-order adjacency information of the AMR graph is incorporated during encoding, the order of the higher-order adjacency information being greater than 1;
and a decoding module configured to parse out each word of the natural language corresponding to the AMR graph with a decoder after encoding is completed.
8. The apparatus of claim 7, further comprising:
a node-edge relationship modeling module configured to perform information transfer between the AMR graph and the line graph with an attention mechanism after encoding is completed, so as to model the relationships between the nodes and the edges in the AMR graph and the line graph.
9. A computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the steps of the method of any of claims 1 to 6.
10. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1 to 6.
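For illustration of the conversion recited in claim 2 (each labelled edge of the AMR graph becomes a node of the line graph, and two such nodes are connected when the underlying edges share an endpoint), the following is a minimal sketch in plain Python; the function name build_line_graph, the edge representation, and the example AMR fragment are assumptions made for this sketch only.

```python
# Illustrative sketch (not from the patent text): build the line graph of an AMR graph.
# Each labelled edge of the AMR graph becomes a line-graph node carrying the edge's
# attribute (relation label); two line-graph nodes are adjacent when the underlying
# AMR edges share an endpoint, so the line graph reflects relationships among edges.
from itertools import combinations
from typing import Dict, List, Tuple

AmrEdge = Tuple[str, str, str]  # (source concept, relation label, target concept)


def build_line_graph(amr_edges: List[AmrEdge]) -> Tuple[List[str], Dict[int, List[int]]]:
    node_labels = [label for _, label, _ in amr_edges]          # one node per AMR edge
    adjacency: Dict[int, List[int]] = {i: [] for i in range(len(amr_edges))}
    for (i, (s1, _, t1)), (j, (s2, _, t2)) in combinations(enumerate(amr_edges), 2):
        if {s1, t1} & {s2, t2}:                                  # the edges share an endpoint
            adjacency[i].append(j)
            adjacency[j].append(i)
    return node_labels, adjacency


if __name__ == "__main__":
    # AMR fragment for "the boy wants to go" (example chosen for this sketch).
    edges = [("want-01", ":ARG0", "boy"),
             ("want-01", ":ARG1", "go-02"),
             ("go-02", ":ARG0", "boy")]
    labels, adj = build_line_graph(edges)
    print(labels)  # [':ARG0', ':ARG1', ':ARG0']
    print(adj)     # {0: [1, 2], 1: [0, 2], 2: [0, 1]} - every pair of edges shares an endpoint
```

The line graph produced this way can then be fed to its own graph encoder, while the attention-based information transfer of claim 4 would, under this reading, exchange states between each line-graph node and the endpoints of its underlying AMR edge.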
CN202010297512.4A 2020-04-15 2020-04-15 Natural language generation method and device Active CN111507070B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010297512.4A CN111507070B (en) 2020-04-15 2020-04-15 Natural language generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010297512.4A CN111507070B (en) 2020-04-15 2020-04-15 Natural language generation method and device

Publications (2)

Publication Number Publication Date
CN111507070A true CN111507070A (en) 2020-08-07
CN111507070B CN111507070B (en) 2023-08-01

Family

ID=71864823

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010297512.4A Active CN111507070B (en) 2020-04-15 2020-04-15 Natural language generation method and device

Country Status (1)

Country Link
CN (1) CN111507070B (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110018820A (en) * 2019-04-08 2019-07-16 浙江大学滨海产业技术研究院 A method of the Graph2Seq based on deeply study automatically generates Java code annotation

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
AKIKO ERIGUCHI et al.: "Tree-to-Sequence Attentional Neural Machine Translation" *
XU SHAOFENG et al.: "Automatic Generation of Code Comments Based on a Structure-Aware Dual Encoder" (基于结构感知双编码器的代码注释自动生成) *
WANG JIE et al.: "A Semi-Supervised Network Representation Learning Model Based on Graph Convolutional Networks and Autoencoders" (基于图卷积网络和自编码器的半监督网络表示学习模型) *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111984783A (en) * 2020-08-28 2020-11-24 达闼机器人有限公司 Training method of text generation model, text generation method and related equipment
CN111984783B (en) * 2020-08-28 2024-04-02 达闼机器人股份有限公司 Training method of text generation model, text generation method and related equipment
CN112417864A (en) * 2020-11-29 2021-02-26 中国科学院电子学研究所苏州研究院 Gated copy and mask based multi-round conversation omission recovery method
CN112417864B (en) * 2020-11-29 2023-10-31 中国科学院电子学研究所苏州研究院 Multi-round dialogue omission recovery method based on gating copy and mask
CN112580370A (en) * 2020-12-24 2021-03-30 内蒙古工业大学 Mongolian Chinese neural machine translation method fusing semantic knowledge
CN112580370B (en) * 2020-12-24 2023-09-26 内蒙古工业大学 Mongolian nerve machine translation method integrating semantic knowledge
CN114185595A (en) * 2021-11-02 2022-03-15 武汉大学 Method name generation method based on code structure guidance
CN114185595B (en) * 2021-11-02 2024-03-29 武汉大学 Code structure guidance-based method name generation method

Also Published As

Publication number Publication date
CN111507070B (en) 2023-08-01

Similar Documents

Publication Publication Date Title
EP4060565A1 (en) Method and apparatus for acquiring pre-trained model
CN111507070A (en) Natural language generation method and device
CN112214608B (en) Text generation method, medium, device and computing equipment based on knowledge reasoning
CN111930906A (en) Knowledge graph question-answering method and device based on semantic block
CN114281968B (en) Model training and corpus generation method, device, equipment and storage medium
CN113641830B (en) Model pre-training method, device, electronic equipment and storage medium
CN114528898A (en) Scene graph modification based on natural language commands
CN115543437B (en) Code annotation generation method and system
CN112560456A (en) Generation type abstract generation method and system based on improved neural network
US12038955B2 (en) Method for generating query statement, electronic device and storage medium
CN113177393B (en) Method and apparatus for pre-training language model for improved understanding of web page structure
CN113688207A (en) Modeling processing method and device for reading and understanding structure based on network
Lin et al. A multimodal dialogue system for conversational image editing
CN115357710B (en) Training method and device for table description text generation model and electronic equipment
US20230267286A1 (en) Translation model training method, translation method, apparatus, device, and storage medium
CN113591493B (en) Translation model training method and translation model device
CN115994522A (en) Text processing method, article generating method and text processing model training method
WO2022148087A1 (en) Method and apparatus for training programming language translation model, device, and storage medium
CN116028888A (en) Automatic problem solving method for plane geometry mathematics problem
CN115688792A (en) Problem generation method and device based on document and server
CN112052680B (en) Question generation method, device, equipment and storage medium
CN111241830B (en) Method for generating word vector and training model for generating word
CN115167863A (en) Code completion method and device based on code sequence and code graph fusion
Bin et al. Non-autoregressive sentence ordering
CN113486180A (en) Remote supervision relation extraction method and system based on relation hierarchy interaction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant before: AI SPEECH Co.,Ltd.

GR01 Patent grant