CN107657313B - System and method for transfer learning of natural language processing task based on field adaptation - Google Patents

System and method for transfer learning of natural language processing task based on field adaptation

Info

Publication number
CN107657313B
CN107657313B
Authority
CN
China
Prior art keywords
layer
domain
open
output
natural language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710877941.7A
Other languages
Chinese (zh)
Other versions
CN107657313A (en)
Inventor
肖仰华
谢晨昊
梁家卿
崔万云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Shuyan Technology Development Co ltd
Original Assignee
Shanghai Shuyan Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shuyan Technology Development Co ltd filed Critical Shanghai Shuyan Technology Development Co ltd
Priority to CN201710877941.7A priority Critical patent/CN107657313B/en
Publication of CN107657313A publication Critical patent/CN107657313A/en
Application granted granted Critical
Publication of CN107657313B publication Critical patent/CN107657313B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
  • Machine Translation (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a transfer learning system and method for natural language processing tasks based on domain adaptation, comprising an open-domain part module and a specific-domain part module. The open-domain part module is configured to train on existing open-domain data and to modify its output layer according to the natural language processing task at hand; specifically, an LSTM network is trained on the open domain to obtain the weight matrix of the open-domain neural network. The specific-domain part module is configured to construct a specific-domain neural network and to modify its output layer according to the task, where this output layer is consistent with part of the open-domain output layer; the specific-domain neural network is then trained, and the results of the different natural language processing tasks are obtained from the output-layer results. The invention solves the problem of domain adaptation in natural language processing tasks.

Description

System and method for transfer learning of natural language processing task based on field adaptation
Technical Field
The invention relates to a migration learning system and a migration learning method of a natural language processing task based on field adaptation, which solve the problem of field adaptation in the natural language processing task.
Background
Natural Language Processing (NLP) is a branch of artificial intelligence and linguistics. The field studies how to process and use natural language, including natural language recognition, natural language generation, and natural language understanding. Natural language cognition is the understanding of human language by computers. A natural language generation system converts computer data into natural language; a natural language understanding system translates natural language into a form more easily handled by computer programs. The main tasks of natural language processing include part-of-speech tagging, emotion analysis, syntactic analysis, and the like.
Current natural language processing technology still faces problems in application. Most current systems are trained on standard data sets, but in many practical applications natural language processing must be performed in a specific domain. The ambiguity of natural language and overfitting of the model at the embedding layer cause models to perform poorly when applied in specific domains. At the same time, because training data in the specific domain are scarce, training cannot be carried out using specific-domain data alone. However, natural language processing tasks share a large number of similar features between the open domain and the specific domain (such as a common vocabulary and common syntax), and open-domain training data are usually plentiful, so these problems can be addressed by using transfer learning to migrate knowledge from the open domain to the specific domain.
Deep learning has achieved good results in natural language processing tasks, including part-of-speech tagging tasks, emotion analysis tasks, and syntactic analysis tasks. In natural language processing tasks, a deep neural network can often represent different levels of knowledge of a natural language, each layer in the neural network can represent different levels of knowledge, for example, an embedding layer can represent word level knowledge, and a convolutional neural network and a cyclic neural network can represent phrase/sentence level knowledge. These neural network layers play an important role in natural language understanding.
Disclosure of Invention
The invention aims to solve the technical problem of providing a system and a method for transfer learning of natural language processing tasks based on field adaptation, which are used for solving the problems in the prior art.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a domain-adaptive natural language processing task based transfer learning system includes:
an open-realm part module and a domain-specific part module, wherein the open-realm part module is configured to:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific part module to: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
and training the neural network in the specific field, and obtaining results of different natural language processing tasks according to the results of the output layer.
Preferably, the open-domain neural network comprises an Embedding layer, an LSTM layer and an output layer, wherein the Embedding layer embeds the words of a sentence into d-dimensional vectors, and the LSTM layer receives these vectors as its inputs x_t;
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
furthermore, the output layer is variable for different natural language processing tasks.
Preferably, the specific-domain neural network specifically comprises:
the Embedding layer and LSTM layer of the open-domain part module, together with its own independent Embedding layer, LSTM layer and output layer;
the specific-domain Embedding layer is similar to the Embedding layer of the open-domain part;
the specific-domain LSTM layer receives input from the open-domain part Embedding layer and the specific-domain part Embedding layer simultaneously;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
the output layer receives the output of the open-domain LSTM layer and the output of the specific-domain LSTM layer simultaneously for its own output;
y = f(W_y [h^O ; h^S] + b_y)
the output layer is consistent with the output layer of the open-domain part and is variable for different natural language processing tasks.
Preferably, when the task is emotion analysis, the output layer outputs an emotion score;
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
The method for transfer learning of the natural language processing task based on the field adaptation comprises the following steps:
an open-field partial learning step and a specific-field partial learning step, wherein the open-field partial learning step includes:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific partial learning step comprising: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
and training the neural network in the specific field, and obtaining results of different natural language processing tasks according to the results of the output layer.
Preferably, the open-domain neural network comprises an Embedding layer, an LSTM layer and an output layer, wherein the Embedding layer embeds the words of a sentence into d-dimensional vectors, and the LSTM layer receives these vectors as its inputs x_t;
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
furthermore, the output layer is variable for different natural language processing tasks.
Preferably, the specific-domain neural network specifically comprises:
the Embedding layer and LSTM layer of the open-domain part module, together with its own independent Embedding layer, LSTM layer and output layer;
the specific-domain Embedding layer is similar to the Embedding layer of the open-domain part;
the specific-domain LSTM layer receives input from the open-domain part Embedding layer and the specific-domain part Embedding layer simultaneously;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
the output layer receives the output of the open-domain LSTM layer and the output of the specific-domain LSTM layer simultaneously for its own output;
y = f(W_y [h^O ; h^S] + b_y)
the output layer is consistent with the output layer of the open-domain part and is variable for different natural language processing tasks.
Preferably, when the task is emotion analysis, the output layer outputs an emotion score;
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
The invention utilizes the neural network to establish the transfer learning model and solves the problem of field adaptation in the natural language processing task.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The present invention will be described in detail below with reference to the accompanying drawings so that the above advantages of the present invention will be more apparent. Wherein,
FIG. 1 is a schematic diagram of the architecture of the domain adaptive natural language processing task based transfer learning system of the present invention.
Detailed Description
The following detailed description of the embodiments of the present invention will be provided with reference to the drawings and examples, so that how to apply the technical means to solve the technical problems and achieve the technical effects can be fully understood and implemented. It should be noted that, as long as there is no conflict, the embodiments and the features of the embodiments of the present invention may be combined with each other, and the technical solutions formed are within the scope of the present invention.
Additionally, the steps illustrated in the flow charts of the figures may be performed in a computer system such as a set of computer-executable instructions and, although a logical order is illustrated in the flow charts, in some cases, the steps illustrated or described may be performed in an order different than here.
The invention relates to a migration learning method of a natural language processing task based on field adaptation, which utilizes a neural network to establish a migration learning model and solves the problem of field adaptation in the natural language processing task.
First, for convenience of explanation, we list some of the labels used in this patent in Table 1.
TABLE 1
(Table 1, which lists the notation used in this patent, is provided as an image in the original publication.)
Wherein, the natural language processing task's based on field adaptation migratory learning system includes:
an open-realm part module and a domain-specific part module, wherein the open-realm part module is configured to:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific part module to: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
and training the neural network in the specific field, and obtaining results of different natural language processing tasks according to the results of the output layer.
Preferably, the open-domain neural network comprises an Embedding layer, an LSTM layer and an output layer, wherein the Embedding layer embeds the words of a sentence into d-dimensional vectors, and the LSTM layer receives these vectors as its inputs x_t;
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
furthermore, the output layer is variable for different natural language processing tasks.
Preferably, the specific-domain neural network specifically comprises:
the Embedding layer and LSTM layer of the open-domain part module, together with its own independent Embedding layer, LSTM layer and output layer;
the specific-domain Embedding layer is similar to the Embedding layer of the open-domain part;
the specific-domain LSTM layer receives input from the open-domain part Embedding layer and the specific-domain part Embedding layer simultaneously;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
the output layer receives the output of the open-domain LSTM layer and the output of the specific-domain LSTM layer simultaneously for its own output;
y = f(W_y [h^O ; h^S] + b_y)
the output layer is consistent with the output layer of the open-domain part and is variable for different natural language processing tasks.
Preferably, when the task is emotion analysis, the output layer outputs an emotion score;
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
The method for transfer learning of the natural language processing task based on the field adaptation comprises the following steps:
an open-field partial learning step and a specific-field partial learning step, wherein the open-field partial learning step includes:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific partial learning step comprising: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
and training the neural network in the specific field, and obtaining results of different natural language processing tasks according to the results of the output layer.
Preferably, the open-domain neural network comprises an Embedding layer, an LSTM layer and an output layer, wherein the Embedding layer embeds the words of a sentence into d-dimensional vectors, and the LSTM layer receives these vectors as its inputs x_t;
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
furthermore, the output layer is variable for different natural language processing tasks.
Preferably, the specific-domain neural network specifically comprises:
the Embedding layer and LSTM layer of the open-domain part module, together with its own independent Embedding layer, LSTM layer and output layer;
the specific-domain Embedding layer is similar to the Embedding layer of the open-domain part;
the specific-domain LSTM layer receives input from the open-domain part Embedding layer and the specific-domain part Embedding layer simultaneously;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
the output layer receives the output of the open-domain LSTM layer and the output of the specific-domain LSTM layer simultaneously for its own output;
y = f(W_y [h^O ; h^S] + b_y)
the output layer is consistent with the output layer of the open-domain part and is variable for different natural language processing tasks.
Preferably, when the task is emotion analysis, the output layer outputs an emotion score;
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
The invention utilizes the neural network to establish the transfer learning model and solves the problem of field adaptation in the natural language processing task.
As in fig. 1, we divide the system as a whole into two parts: open field section and domain specific section.
Open-domain part: first, training is carried out with the existing open-domain data, and the output layer is modified according to the natural language processing task. The open domain is trained using the LSTM network in the yellow area of the figure, yielding the weight matrix of the open-domain neural network.
Specific-domain part: construct the specific-domain neural network (the green part of the figure) and modify the output layer according to the natural language processing task; this output layer is consistent with the output layer of the open-domain part. Keep the weight matrices of the repeated (shared) part of the network identical to the trained weight matrices of the open-domain neural network, and train the specific-domain neural network. The results of the different natural language processing tasks are then obtained from the output-layer results.
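As a minimal sketch of the weight-sharing step just described (the parameter names and toy sizes are hypothetical; the patent does not prescribe an implementation), the specific-domain model can reuse an exact copy of the trained open-domain matrices and exclude that copy from further updates:

```python
import copy

import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the trained open-domain weight matrices (toy sizes).
open_domain_weights = {
    "embedding": rng.normal(size=(1000, 50)),  # vocabulary x d
    "lstm_W": rng.normal(size=(200, 50)),      # gate weights, stacked
}

# The specific-domain model reuses an exact copy of the trained
# open-domain network and adds its own independent layers.
specific_domain_model = {
    "shared": copy.deepcopy(open_domain_weights),  # kept fixed during training
    "own_embedding": rng.normal(size=(1000, 50)),
    "own_lstm_W": rng.normal(size=(200, 100)),     # input is both embeddings
}
frozen_keys = {"shared"}  # the optimizer skips these during specific-domain training

# The shared copy is identical to the trained open-domain weights.
assert np.array_equal(specific_domain_model["shared"]["lstm_W"],
                      open_domain_weights["lstm_W"])
```

In practice a deep-learning framework's "freeze" mechanism would play the role of `frozen_keys`; the point is only that the repeated part starts from, and stays at, the open-domain solution.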
We next explain the specific implementation of each part of the system.
LSTM and open field partial neural network construction steps:
the neural network of the open field part comprises an embedding layer, an LSTM layer and an output layer.
The Embedding layer embeds the words of a sentence into d-dimensional vectors, which the LSTM layer receives as its inputs x_t.
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
The output layer is variable for different natural language processing tasks.
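The gate equations above can be sketched as a single LSTM step. This is an illustrative numpy rendering, not the patent's implementation; the parameter containers `W`, `U`, `b` are hypothetical names for the trainable matrices of the text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One step of the LSTM gate equations.

    W, U, b map gate names ("i", "f", "o", "u") to the trainable
    parameters written W_(), U_(), b_() in the text.
    """
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])  # input gate
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])  # forget gate
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])  # output gate
    u_t = np.tanh(W["u"] @ x_t + U["u"] @ h_prev + b["u"])  # candidate update
    c_t = i_t * u_t + f_t * c_prev   # memory cell state
    h_t = o_t * np.tanh(c_t)         # hidden state passed to the output layer
    return h_t, c_t

# Tiny usage example: input dimension d = 4, hidden size 3.
rng = np.random.default_rng(1)
W = {g: rng.normal(size=(3, 4)) for g in "ifou"}
U = {g: rng.normal(size=(3, 3)) for g in "ifou"}
b = {g: np.zeros(3) for g in "ifou"}
h, c = lstm_step(rng.normal(size=4), np.zeros(3), np.zeros(3), W, U, b)
assert h.shape == (3,) and c.shape == (3,)
```

Running the step over a sentence (one call per word vector, threading `h` and `c` through) produces the hidden states the output layer consumes.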
Constructing a neural network in a specific field:
a neural network of a domain-specific part comprises an embedding layer, an LSTM layer and independent embedding, LSTM and output layers of an open domain part.
The Embedding layer is similar to the Embedding layer of the open field section. The LSTM layer receives input from both the open realm part embedding layer and the domain specific part embedding layer.
Figure GDA0002938277800000096
Figure GDA0002938277800000097
Figure GDA0002938277800000098
Figure GDA0002938277800000099
Figure GDA00029382778000000910
ct=it⊙ut+ft⊙ct-1
Figure GDA00029382778000000911
The output layer receives and outputs the output of the open domain partial LSTM layer and the output of the domain-specific partial LSTM layer at the same time.
Figure GDA0002938277800000101
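Under the same illustrative conventions (hypothetical parameter names; a sketch under the assumption that "combination via a transverse connection" means concatenation), the specific-domain input and the combined output layer can be rendered as:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def specific_domain_input(e_open_t, e_spec_t):
    """Transverse connection: the specific-domain LSTM consumes the
    concatenation of the open-domain and specific-domain embeddings."""
    return np.concatenate([e_open_t, e_spec_t])

def combined_output(h_open, h_spec, W_y, b_y, f=np.tanh):
    """Output layer fed by both LSTM layers' outputs; f is the
    task-dependent activation function."""
    return f(W_y @ np.concatenate([h_open, h_spec]) + b_y)

# Tiny usage example: d = 4 embeddings, hidden size 3, scalar output.
rng = np.random.default_rng(2)
x_t = specific_domain_input(rng.normal(size=4), rng.normal(size=4))
assert x_t.shape == (8,)
y = combined_output(rng.normal(size=3), rng.normal(size=3),
                    rng.normal(size=(1, 6)), np.zeros(1), f=sigmoid)
assert y.shape == (1,)
```

The choice of `f` (and the width of `W_y`) is what changes per task, which is why the output layer is described as variable.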
The output layer is consistent with the output layer of the open field part and is variable for different natural language processing tasks. For example, the emotion analysis task, the output layer outputs an emotion score.
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
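For the emotion-analysis case, a sketch of the sigmoid-scored output layers follows. The exact score formulas are rendered as images in this copy, so the weight names and the use of the concatenated hidden states for the specific-domain score are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def open_domain_score(h_open, w_o, b_o):
    """Emotion score S_O of the open-domain output layer (assumed linear + sigmoid)."""
    return sigmoid(w_o @ h_open + b_o)

def specific_domain_score(h_open, h_spec, w_s, b_s):
    """Emotion score S_S; the specific-domain output layer sees both
    LSTM layers' outputs (assumed concatenation)."""
    return sigmoid(w_s @ np.concatenate([h_open, h_spec]) + b_s)

# Tiny usage example with hidden size 3.
rng = np.random.default_rng(3)
h_o, h_s = rng.normal(size=3), rng.normal(size=3)
s_o = open_domain_score(h_o, rng.normal(size=3), 0.0)
s_s = specific_domain_score(h_o, h_s, rng.normal(size=6), 0.0)
assert 0.0 < s_o < 1.0 and 0.0 < s_s < 1.0
```

The sigmoid keeps both scores in (0, 1), which is what makes them usable directly as emotion probabilities.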
Table 2 shows the results under the emotion analysis task with the STS data set as the open domain and the SSTB data set as the specific domain, together with a comparison against other systems, where baseline1 and baseline2 are LSTM neural networks trained directly, without transfer learning;
TABLE 2
(Table 2 is provided as an image in the original publication.)
Table 3 shows the system's results under the part-of-speech tagging task on daily-newspaper text in the specific domains of "economy", "education", "science" and "military", together with a comparison with other systems.
TABLE 3
(Table 3 is provided as an image in the original publication.)
Table 4 shows the results of our system on the Brown corpus under the entity boundary identification task and a comparison with other systems.
TABLE 4
(Table 4 is provided as an image in the original publication.)
It should be noted that for simplicity of description, the above method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.
Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. A domain-adaptive natural language processing task based transfer learning system includes:
an open-realm part module and a domain-specific part module, wherein the open-realm part module is configured to:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific part module to: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
keeping the weight matrix of the repeated part of the neural network the same as the trained weight matrix of the open-domain neural network, training the specific-domain neural network, and obtaining the results of different natural language processing tasks from the output-layer results;
the domain-specific neural network specifically comprises:
the embedded layer, the LSTM layer and the independent embedded layer, the LSTM layer and the output layer of the open field part module;
the LSTM layer of the domain-specific part receives input from both the open domain part embedding layer and the domain-specific part embedding layer;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
wherein x_t is the combination, input via a transverse connection, of the two domains; i_t is the input gate of the t-th cell; f_t is the forget gate of the t-th cell; o_t is the output gate of the t-th cell; W_(·), b_(·) and U_(·) are trainable parameters; and c_t is the memory cell state of the t-th cell;
the output layer simultaneously receives the output of the LSTM layer of the open field part and the output of the LSTM layer of the specific field part for outputting;
y = f(W_y [h^O ; h^S] + b_y)
wherein f is an activation function; the output layer is consistent with the output layer of the open field part and is variable for different natural language processing tasks.
2. The system of claim 1, wherein the open-domain neural network comprises an Embedding layer, an LSTM layer and an output layer, wherein the Embedding layer embeds words in sentences into d-dimensional vectors, and the LSTM layer receives the words;
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
wherein i_t is the input gate of the t-th cell; f_t is the forget gate of the t-th cell; o_t is the output gate of the t-th cell; W_(·), b_(·) and U_(·) are trainable parameters; and c_t is the memory cell state of the t-th cell;
furthermore, the output layer is variable for different natural language processing tasks.
3. The domain-adaptive natural language processing task based migration learning system of claim 1, wherein the output layer outputs an emotional score when it is an emotion analysis task;
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
4. The method for transfer learning of the natural language processing task based on the field adaptation comprises the following steps:
an open-field partial learning step and a specific-field partial learning step, wherein the open-field partial learning step includes:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific partial learning step comprising: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
keeping the weight matrix of the repeated part of the neural network the same as the trained weight matrix of the open-domain neural network, training the specific-domain neural network, and obtaining the results of different natural language processing tasks from the output-layer results;
the domain-specific neural network specifically comprises:
the embedded layer, the LSTM layer and the independent embedded layer, the LSTM layer and the output layer of the open field part module;
the LSTM layer of the domain-specific part receives input from both the open domain part embedding layer and the domain-specific part embedding layer;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
wherein x_t is the combination, input via a transverse connection, of the two domains; i_t is the input gate of the t-th cell; f_t is the forget gate of the t-th cell; o_t is the output gate of the t-th cell; W_(·), b_(·) and U_(·) are trainable parameters; and c_t is the memory cell state of the t-th cell;
the output layer simultaneously receives the output of the LSTM layer of the open field part and the output of the LSTM layer of the specific field part for outputting;
y_t = f(W^(y) (h_t^O ⊕ h_t^S) + b^(y))
wherein f is an activation function; the output layer is consistent with that of the open-domain part and varies with the natural language processing task.
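The lateral connection and shared output layer of claim 4 can be illustrated with a minimal NumPy sketch. Gates, biases and the recurrence are omitted, and all names (e_open, W_lstm, open_part, etc.) are invented for illustration; this is not the patented implementation, only a shape-level picture of how the duplicated open-domain weights are copied in and how the two embeddings are concatenated:

```python
import numpy as np

rng = np.random.default_rng(1)
d, hdim = 4, 3  # embedding dimension and hidden size (arbitrary)

# 1) Pretend these are the weight matrices obtained by training the
#    open-domain network (step one of the method).
open_weights = {"W_lstm": rng.standard_normal((hdim, d)) * 0.1,
                "W_out":  rng.standard_normal((1, hdim)) * 0.1}

# 2) Build the domain-specific network: its duplicated (open) part is
#    initialised with the trained open-domain weight matrices, while its
#    own LSTM takes the lateral concatenation of BOTH embeddings,
#    hence input width 2*d, and its output layer sees both hidden states.
specific = {"open_part": {k: v.copy() for k, v in open_weights.items()},
            "W_lstm": rng.standard_normal((hdim, 2 * d)) * 0.1,
            "W_out":  rng.standard_normal((1, 2 * hdim)) * 0.1}

e_open = rng.standard_normal(d)   # open-domain embedding of one word
e_spec = rng.standard_normal(d)   # domain-specific embedding of the same word
x_t = np.concatenate([e_open, e_spec])          # transverse (lateral) connection
h_open = np.tanh(specific["open_part"]["W_lstm"] @ e_open)
h_spec = np.tanh(specific["W_lstm"] @ x_t)
# 3) The output layer receives both LSTM outputs at once.
y = specific["W_out"] @ np.concatenate([h_open, h_spec])
```

Only `specific["W_lstm"]`, `specific["W_out"]` and the specific embeddings would be trained in step two; the copied `open_part` carries the transferred knowledge.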
5. The domain-adaptation-based transfer learning method for natural language processing tasks of claim 4, wherein the open-domain neural network comprises an embedding layer, an LSTM layer and an output layer; the embedding layer embeds each word of the sentence into a d-dimensional vector, and the LSTM layer receives these word vectors:
i_t = σ(W^(i) x_t + U^(i) h_{t-1} + b^(i))
f_t = σ(W^(f) x_t + U^(f) h_{t-1} + b^(f))
o_t = σ(W^(o) x_t + U^(o) h_{t-1} + b^(o))
u_t = tanh(W^(u) x_t + U^(u) h_{t-1} + b^(u))
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
wherein i_t is the input gate of the t-th cell; f_t is the forget gate of the t-th cell; o_t is the output gate of the t-th cell; W^(·), b^(·) and U^(·) are trainable parameters; and c_t is the memory cell state of the t-th cell;
furthermore, the output layer varies with the natural language processing task.
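The gate equations of claim 5 follow the standard LSTM cell, so they can be sketched directly in NumPy. The parameter dictionary keys (`W_i`, `U_f`, `b_o`, …) and sizes are chosen for illustration; they are not prescribed by the claim:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, p):
    # Gate equations from the claim: input gate i, forget gate f,
    # output gate o, candidate u, then cell state c and hidden output h.
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])
    u_t = np.tanh(p["W_u"] @ x_t + p["U_u"] @ h_prev + p["b_u"])
    c_t = i_t * u_t + f_t * c_prev      # elementwise "*" plays the role of ⊙
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

dim_in, dim_h = 4, 3   # word-vector and hidden sizes (arbitrary)
rng = np.random.default_rng(0)
p = {f"{m}_{g}": rng.standard_normal((dim_h, dim_in if m == "W" else dim_h)) * 0.1
     for m in ("W", "U") for g in ("i", "f", "o", "u")}
p.update({f"b_{g}": np.zeros(dim_h) for g in ("i", "f", "o", "u")})

h_t, c_t = lstm_cell(rng.standard_normal(dim_in), np.zeros(dim_h), np.zeros(dim_h), p)
```

Iterating `lstm_cell` over the d-dimensional word vectors of a sentence yields the hidden sequence that the task-specific output layer consumes.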
6. The method of claim 4, wherein, when the output layer serves a sentiment analysis task, it outputs a sentiment score:
S^O = σ(w^O · h^O + b^O)
S^S = σ(w^S · (h^O ⊕ h^S) + b^S)
wherein S^O and S^S are the outputs of the output layers of the open-domain part and the domain-specific part respectively, and σ is the sigmoid function.
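For the sentiment-analysis head of claim 6, the claim only fixes the sigmoid form of the two scores; the weight names (`w_open`, `w_spec`) and the choice to feed the domain-specific head the concatenation of both hidden states are illustrative assumptions in this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
h_open = rng.standard_normal(3)   # final hidden state of the open-domain LSTM
h_spec = rng.standard_normal(3)   # final hidden state of the specific-domain LSTM

# Hypothetical score weights for each output layer.
w_open = rng.standard_normal(3)
w_spec = rng.standard_normal(6)

s_open = sigmoid(w_open @ h_open)                            # S^O
s_spec = sigmoid(w_spec @ np.concatenate([h_open, h_spec]))  # S^S
```

The sigmoid squashes each score into (0, 1), so both outputs can be read directly as sentiment probabilities.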
CN201710877941.7A 2017-09-26 2017-09-26 System and method for transfer learning of natural language processing task based on field adaptation Active CN107657313B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710877941.7A CN107657313B (en) 2017-09-26 2017-09-26 System and method for transfer learning of natural language processing task based on field adaptation

Publications (2)

Publication Number Publication Date
CN107657313A CN107657313A (en) 2018-02-02
CN107657313B true CN107657313B (en) 2021-05-18

Family

ID=61131087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710877941.7A Active CN107657313B (en) 2017-09-26 2017-09-26 System and method for transfer learning of natural language processing task based on field adaptation

Country Status (1)

Country Link
CN (1) CN107657313B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3782080A1 (en) 2018-04-18 2021-02-24 DeepMind Technologies Limited Neural networks for scalable continual learning in domains with sequentially learned tasks
CN109190120B (en) * 2018-08-31 2020-01-21 第四范式(北京)技术有限公司 Neural network training method and device and named entity identification method and device
CN109782600A (en) * 2019-01-25 2019-05-21 东华大学 A method of autonomous mobile robot navigation system is established by virtual environment
CN111079447B (en) * 2020-03-23 2020-07-14 深圳智能思创科技有限公司 Chinese-oriented pre-training method and system
CN111539474B (en) * 2020-04-23 2022-05-10 大连理工大学 Classifier model transfer learning method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104538028A (en) * 2014-12-25 2015-04-22 清华大学 Continuous voice recognition method based on deep long and short term memory recurrent neural network
CN106528776A (en) * 2016-11-07 2017-03-22 上海智臻智能网络科技股份有限公司 Text classification method and device
CN106952181A (en) * 2017-03-08 2017-07-14 深圳市景程信息科技有限公司 Electric Load Prediction System based on long Memory Neural Networks in short-term
CN106980683A (en) * 2017-03-30 2017-07-25 中国科学技术大学苏州研究院 Blog text snippet generation method based on deep learning
CN107168955A (en) * 2017-05-23 2017-09-15 南京大学 Word insertion and the Chinese word cutting method of neutral net using word-based context

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A Convolution-LSTM-Based Deep Neural Network for Cross-Domain MOOC Forum Post Classification; Xiaocong Wei et al.; Information; 2017-07-30; Vol. 8, No. 3; pp. 1-16 *
Recurrent Neural Collective Classification; Derek D. Monner, James A. Reggia; IEEE Transactions on Neural Networks and Learning Systems; 2013-12-31; Vol. 24, No. 12; pp. 1932-1943 *
Research on Image Annotation Methods Based on Transfer Learning and Deep Convolutional Features; Song Guanghui; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2017-08-15; pp. 33-47 of the thesis *

Also Published As

Publication number Publication date
CN107657313A (en) 2018-02-02

Similar Documents

Publication Publication Date Title
CN107657313B (en) System and method for transfer learning of natural language processing task based on field adaptation
Jupalle et al. Automation of human behaviors and its prediction using machine learning
CN111368996B (en) Retraining projection network capable of transmitting natural language representation
CN113987209B (en) Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment
CN112257858B (en) Model compression method and device
CN109992779B (en) Emotion analysis method, device, equipment and storage medium based on CNN
CN112131366A (en) Method, device and storage medium for training text classification model and text classification
Gallant et al. Representing objects, relations, and sequences
Dinsmore The symbolic and connectionist paradigms: closing the gap
CN108962224B (en) Joint modeling method, dialogue method and system for spoken language understanding and language model
CN107408384A (en) The end-to-end speech recognition of deployment
CN114565104A (en) Language model pre-training method, result recommendation method and related device
CN113987179A (en) Knowledge enhancement and backtracking loss-based conversational emotion recognition network model, construction method, electronic device and storage medium
CN113901191A (en) Question-answer model training method and device
CN110647919A (en) Text clustering method and system based on K-means clustering and capsule network
Giles Adaptive Processing of Sequences and Data Structures: International Summer School on Neural Networks," ER Caianiello", Vietri Sul Mare, Salerno, Italy, September 6-13, 1997, Tutorial Lectures
Cree et al. Computational models of semantic memory
CN110442693B (en) Reply message generation method, device, server and medium based on artificial intelligence
CN112131879A (en) Relationship extraction system, method and device
Zaman et al. Convolutional recurrent neural network for question answering
Jati et al. Multilingual named entity recognition model for Indonesian health insurance question answering system
Luo Application analysis of multimedia technology in college English curriculum
CN111625623B (en) Text theme extraction method, text theme extraction device, computer equipment, medium and program product
Wang et al. How internal neurons represent the short context: an emergent perspective
CN109033088B (en) Neural network-based second language learning model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant