CN107657313B - System and method for transfer learning of natural language processing task based on field adaptation - Google Patents

System and method for transfer learning of natural language processing task based on field adaptation

Info

Publication number
CN107657313B
CN107657313B
Authority
CN
China
Prior art keywords
layer
domain
open
output
natural language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710877941.7A
Other languages
Chinese (zh)
Other versions
CN107657313A (en)
Inventor
肖仰华
谢晨昊
梁家卿
崔万云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Shuyan Technology Development Co ltd
Original Assignee
Shanghai Shuyan Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shuyan Technology Development Co ltd filed Critical Shanghai Shuyan Technology Development Co ltd
Priority to CN201710877941.7A priority Critical patent/CN107657313B/en
Publication of CN107657313A publication Critical patent/CN107657313A/en
Application granted granted Critical
Publication of CN107657313B publication Critical patent/CN107657313B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
  • Machine Translation (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a transfer learning system and method for natural language processing tasks based on domain adaptation, comprising an open-domain part module and a specific-domain part module. The open-domain part module is configured to train on existing open-domain data and to modify its output layer according to the natural language processing task at hand; specifically, an LSTM network is trained on the open domain to obtain the weight matrix of the open-domain neural network. The specific-domain part module is configured to construct a specific-domain neural network and to modify its output layer according to the task, where this output layer is consistent with part of the open-domain output layer; the specific-domain neural network is then trained, and the results of the different natural language processing tasks are obtained from the output-layer results. The invention solves the problem of domain adaptation in natural language processing tasks.

Description

System and method for transfer learning of natural language processing task based on field adaptation
Technical Field
The invention relates to a migration learning system and a migration learning method of a natural language processing task based on field adaptation, which solve the problem of field adaptation in the natural language processing task.
Background
Natural Language Processing (NLP) is a branch of artificial intelligence and linguistics. The field studies how to process and use natural language, including natural language recognition, natural language generation, and natural language understanding. Natural language cognition is the understanding of human language by computers. A natural language generation system converts computer data into natural language; a natural language understanding system translates natural language into a form more easily handled by computer programs. The main tasks of natural language processing include part-of-speech tagging, emotion analysis, syntactic analysis, and the like.
Current natural language processing technology still faces problems in application. Most current systems are trained on standard data sets, but in many practical applications natural language processing must be performed in a specific domain. The ambiguity of natural language and overfitting of the model at the embedding layer cause models to perform poorly when applied in specific domains. At the same time, because training data in the specific domain are scarce, training cannot be carried out using specific-domain data alone. However, natural language processing tasks share a large number of similar features between the open domain and the specific domain (such as a common vocabulary and common syntax), and open-domain training data are usually plentiful, so these problems can be addressed by using transfer learning to migrate knowledge from the open domain to the specific domain.
Deep learning has achieved good results in natural language processing tasks, including part-of-speech tagging tasks, emotion analysis tasks, and syntactic analysis tasks. In natural language processing tasks, a deep neural network can often represent different levels of knowledge of a natural language, each layer in the neural network can represent different levels of knowledge, for example, an embedding layer can represent word level knowledge, and a convolutional neural network and a cyclic neural network can represent phrase/sentence level knowledge. These neural network layers play an important role in natural language understanding.
Disclosure of Invention
The invention aims to solve the technical problem of providing a system and a method for transfer learning of natural language processing tasks based on field adaptation, which are used for solving the problems in the prior art.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a domain-adaptive natural language processing task based transfer learning system includes:
an open-realm part module and a domain-specific part module, wherein the open-realm part module is configured to:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific part module to: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
and training the neural network in the specific field, and obtaining results of different natural language processing tasks according to the results of the output layer.
Preferably, the open-domain neural network comprises an Embedding layer, an LSTM layer and an output layer, wherein the Embedding layer embeds the words of a sentence into d-dimensional vectors, and the LSTM layer receives these vectors as its inputs x_t;
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
furthermore, the output layer is variable for different natural language processing tasks.
Preferably, the specific-domain neural network specifically comprises:
the Embedding layer and LSTM layer of the open-domain part module, together with its own independent Embedding layer, LSTM layer and output layer;
the specific-domain Embedding layer is similar to the Embedding layer of the open-domain part;
the specific-domain LSTM layer receives input from the open-domain part Embedding layer and the specific-domain part Embedding layer simultaneously;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
the output layer receives the output of the open-domain LSTM layer and the output of the specific-domain LSTM layer simultaneously for its own output;
y = f(W_y [h^O ; h^S] + b_y)
the output layer is consistent with the output layer of the open-domain part and is variable for different natural language processing tasks.
Preferably, when the task is emotion analysis, the output layer outputs an emotion score;
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
The method for transfer learning of the natural language processing task based on the field adaptation comprises the following steps:
an open-field partial learning step and a specific-field partial learning step, wherein the open-field partial learning step includes:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific partial learning step comprising: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
and training the neural network in the specific field, and obtaining results of different natural language processing tasks according to the results of the output layer.
Preferably, the open-domain neural network comprises an Embedding layer, an LSTM layer and an output layer, wherein the Embedding layer embeds the words of a sentence into d-dimensional vectors, and the LSTM layer receives these vectors as its inputs x_t;
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
furthermore, the output layer is variable for different natural language processing tasks.
Preferably, the specific-domain neural network specifically comprises:
the Embedding layer and LSTM layer of the open-domain part module, together with its own independent Embedding layer, LSTM layer and output layer;
the specific-domain Embedding layer is similar to the Embedding layer of the open-domain part;
the specific-domain LSTM layer receives input from the open-domain part Embedding layer and the specific-domain part Embedding layer simultaneously;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
the output layer receives the output of the open-domain LSTM layer and the output of the specific-domain LSTM layer simultaneously for its own output;
y = f(W_y [h^O ; h^S] + b_y)
the output layer is consistent with the output layer of the open-domain part and is variable for different natural language processing tasks.
Preferably, when the task is emotion analysis, the output layer outputs an emotion score;
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
The invention utilizes the neural network to establish the transfer learning model and solves the problem of field adaptation in the natural language processing task.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The present invention will be described in detail below with reference to the accompanying drawings so that the above advantages of the present invention will be more apparent. Wherein,
FIG. 1 is a schematic diagram of the architecture of the domain adaptive natural language processing task based transfer learning system of the present invention.
Detailed Description
The following detailed description of the embodiments of the present invention will be provided with reference to the drawings and examples, so that how to apply the technical means to solve the technical problems and achieve the technical effects can be fully understood and implemented. It should be noted that, as long as there is no conflict, the embodiments and the features of the embodiments of the present invention may be combined with each other, and the technical solutions formed are within the scope of the present invention.
Additionally, the steps illustrated in the flow charts of the figures may be performed in a computer system such as a set of computer-executable instructions and, although a logical order is illustrated in the flow charts, in some cases, the steps illustrated or described may be performed in an order different than here.
The invention relates to a migration learning method of a natural language processing task based on field adaptation, which utilizes a neural network to establish a migration learning model and solves the problem of field adaptation in the natural language processing task.
First, for convenience of explanation, we list some of the labels used in this patent in Table 1.
TABLE 1
(Table 1, which lists the notation used in this patent, is provided as an image in the original publication.)
Wherein, the natural language processing task's based on field adaptation migratory learning system includes:
an open-realm part module and a domain-specific part module, wherein the open-realm part module is configured to:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific part module to: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
and training the neural network in the specific field, and obtaining results of different natural language processing tasks according to the results of the output layer.
Preferably, the open-domain neural network comprises an Embedding layer, an LSTM layer and an output layer, wherein the Embedding layer embeds the words of a sentence into d-dimensional vectors, and the LSTM layer receives these vectors as its inputs x_t;
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
furthermore, the output layer is variable for different natural language processing tasks.
Preferably, the specific-domain neural network specifically comprises:
the Embedding layer and LSTM layer of the open-domain part module, together with its own independent Embedding layer, LSTM layer and output layer;
the specific-domain Embedding layer is similar to the Embedding layer of the open-domain part;
the specific-domain LSTM layer receives input from the open-domain part Embedding layer and the specific-domain part Embedding layer simultaneously;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
the output layer receives the output of the open-domain LSTM layer and the output of the specific-domain LSTM layer simultaneously for its own output;
y = f(W_y [h^O ; h^S] + b_y)
the output layer is consistent with the output layer of the open-domain part and is variable for different natural language processing tasks.
Preferably, when the task is emotion analysis, the output layer outputs an emotion score;
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
The method for transfer learning of the natural language processing task based on the field adaptation comprises the following steps:
an open-field partial learning step and a specific-field partial learning step, wherein the open-field partial learning step includes:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific partial learning step comprising: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
and training the neural network in the specific field, and obtaining results of different natural language processing tasks according to the results of the output layer.
Preferably, the open-domain neural network comprises an Embedding layer, an LSTM layer and an output layer, wherein the Embedding layer embeds the words of a sentence into d-dimensional vectors, and the LSTM layer receives these vectors as its inputs x_t;
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
furthermore, the output layer is variable for different natural language processing tasks.
Preferably, the specific-domain neural network specifically comprises:
the Embedding layer and LSTM layer of the open-domain part module, together with its own independent Embedding layer, LSTM layer and output layer;
the specific-domain Embedding layer is similar to the Embedding layer of the open-domain part;
the specific-domain LSTM layer receives input from the open-domain part Embedding layer and the specific-domain part Embedding layer simultaneously;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
the output layer receives the output of the open-domain LSTM layer and the output of the specific-domain LSTM layer simultaneously for its own output;
y = f(W_y [h^O ; h^S] + b_y)
the output layer is consistent with the output layer of the open-domain part and is variable for different natural language processing tasks.
Preferably, when the task is emotion analysis, the output layer outputs an emotion score;
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
The invention utilizes the neural network to establish the transfer learning model and solves the problem of field adaptation in the natural language processing task.
As in fig. 1, we divide the system as a whole into two parts: open field section and domain specific section.
Open-domain part: first, training is carried out with the existing open-domain data, and the output layer is modified according to the natural language processing task. The open domain is trained using the LSTM network in the yellow area of the figure, yielding the weight matrix of the open-domain neural network.
Specific-domain part: construct the specific-domain neural network (the green part of the figure) and modify the output layer according to the natural language processing task; this output layer is consistent with the output layer of the open-domain part. Keep the weight matrices of the repeated (shared) part of the network identical to the trained weight matrices of the open-domain neural network, and train the specific-domain neural network. The results of the different natural language processing tasks are then obtained from the output-layer results.
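As a minimal sketch of the weight-sharing step just described (the parameter names and toy sizes are hypothetical; the patent does not prescribe an implementation), the specific-domain model can reuse an exact copy of the trained open-domain matrices and exclude that copy from further updates:

```python
import copy

import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the trained open-domain weight matrices (toy sizes).
open_domain_weights = {
    "embedding": rng.normal(size=(1000, 50)),  # vocabulary x d
    "lstm_W": rng.normal(size=(200, 50)),      # gate weights, stacked
}

# The specific-domain model reuses an exact copy of the trained
# open-domain network and adds its own independent layers.
specific_domain_model = {
    "shared": copy.deepcopy(open_domain_weights),  # kept fixed during training
    "own_embedding": rng.normal(size=(1000, 50)),
    "own_lstm_W": rng.normal(size=(200, 100)),     # input is both embeddings
}
frozen_keys = {"shared"}  # the optimizer skips these during specific-domain training

# The shared copy is identical to the trained open-domain weights.
assert np.array_equal(specific_domain_model["shared"]["lstm_W"],
                      open_domain_weights["lstm_W"])
```

In practice a deep-learning framework's "freeze" mechanism would play the role of `frozen_keys`; the point is only that the repeated part starts from, and stays at, the open-domain solution.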
We next explain the specific implementation of each part of the system.
LSTM and open field partial neural network construction steps:
the neural network of the open field part comprises an embedding layer, an LSTM layer and an output layer.
The Embedding layer embeds the words of a sentence into d-dimensional vectors, which the LSTM layer receives as its inputs x_t.
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
The output layer is variable for different natural language processing tasks.
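The gate equations above can be sketched as a single LSTM step. This is an illustrative numpy rendering, not the patent's implementation; the parameter containers `W`, `U`, `b` are hypothetical names for the trainable matrices of the text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One step of the LSTM gate equations.

    W, U, b map gate names ("i", "f", "o", "u") to the trainable
    parameters written W_(), U_(), b_() in the text.
    """
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])  # input gate
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])  # forget gate
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])  # output gate
    u_t = np.tanh(W["u"] @ x_t + U["u"] @ h_prev + b["u"])  # candidate update
    c_t = i_t * u_t + f_t * c_prev   # memory cell state
    h_t = o_t * np.tanh(c_t)         # hidden state passed to the output layer
    return h_t, c_t

# Tiny usage example: input dimension d = 4, hidden size 3.
rng = np.random.default_rng(1)
W = {g: rng.normal(size=(3, 4)) for g in "ifou"}
U = {g: rng.normal(size=(3, 3)) for g in "ifou"}
b = {g: np.zeros(3) for g in "ifou"}
h, c = lstm_step(rng.normal(size=4), np.zeros(3), np.zeros(3), W, U, b)
assert h.shape == (3,) and c.shape == (3,)
```

Running the step over a sentence (one call per word vector, threading `h` and `c` through) produces the hidden states the output layer consumes.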
Constructing a neural network in a specific field:
a neural network of a domain-specific part comprises an embedding layer, an LSTM layer and independent embedding, LSTM and output layers of an open domain part.
The Embedding layer is similar to the Embedding layer of the open field section. The LSTM layer receives input from both the open realm part embedding layer and the domain specific part embedding layer.
Figure GDA0002938277800000096
Figure GDA0002938277800000097
Figure GDA0002938277800000098
Figure GDA0002938277800000099
Figure GDA00029382778000000910
ct=it⊙ut+ft⊙ct-1
Figure GDA00029382778000000911
The output layer receives and outputs the output of the open domain partial LSTM layer and the output of the domain-specific partial LSTM layer at the same time.
Figure GDA0002938277800000101
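Under the same illustrative conventions (hypothetical parameter names; a sketch under the assumption that "combination via a transverse connection" means concatenation), the specific-domain input and the combined output layer can be rendered as:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def specific_domain_input(e_open_t, e_spec_t):
    """Transverse connection: the specific-domain LSTM consumes the
    concatenation of the open-domain and specific-domain embeddings."""
    return np.concatenate([e_open_t, e_spec_t])

def combined_output(h_open, h_spec, W_y, b_y, f=np.tanh):
    """Output layer fed by both LSTM layers' outputs; f is the
    task-dependent activation function."""
    return f(W_y @ np.concatenate([h_open, h_spec]) + b_y)

# Tiny usage example: d = 4 embeddings, hidden size 3, scalar output.
rng = np.random.default_rng(2)
x_t = specific_domain_input(rng.normal(size=4), rng.normal(size=4))
assert x_t.shape == (8,)
y = combined_output(rng.normal(size=3), rng.normal(size=3),
                    rng.normal(size=(1, 6)), np.zeros(1), f=sigmoid)
assert y.shape == (1,)
```

The choice of `f` (and the width of `W_y`) is what changes per task, which is why the output layer is described as variable.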
The output layer is consistent with the output layer of the open field part and is variable for different natural language processing tasks. For example, the emotion analysis task, the output layer outputs an emotion score.
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
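For the emotion-analysis case, a sketch of the sigmoid-scored output layers follows. The exact score formulas are rendered as images in this copy, so the weight names and the use of the concatenated hidden states for the specific-domain score are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def open_domain_score(h_open, w_o, b_o):
    """Emotion score S_O of the open-domain output layer (assumed linear + sigmoid)."""
    return sigmoid(w_o @ h_open + b_o)

def specific_domain_score(h_open, h_spec, w_s, b_s):
    """Emotion score S_S; the specific-domain output layer sees both
    LSTM layers' outputs (assumed concatenation)."""
    return sigmoid(w_s @ np.concatenate([h_open, h_spec]) + b_s)

# Tiny usage example with hidden size 3.
rng = np.random.default_rng(3)
h_o, h_s = rng.normal(size=3), rng.normal(size=3)
s_o = open_domain_score(h_o, rng.normal(size=3), 0.0)
s_s = specific_domain_score(h_o, h_s, rng.normal(size=6), 0.0)
assert 0.0 < s_o < 1.0 and 0.0 < s_s < 1.0
```

The sigmoid keeps both scores in (0, 1), which is what makes them usable directly as emotion probabilities.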
Table 2 shows the results under the emotion analysis task with the STS data set as the open domain and the SSTB data set as the specific domain, together with a comparison against other systems, where baseline1 and baseline2 are LSTM neural networks trained directly, without transfer learning;
TABLE 2
(Table 2 is provided as an image in the original publication.)
Table 3 shows the system's results under the part-of-speech tagging task on daily-newspaper text in the specific domains of "economy", "education", "science" and "military", together with a comparison with other systems.
TABLE 3
(Table 3 is provided as an image in the original publication.)
Table 4 shows the results of our system on the Brown corpus under the entity boundary identification task and a comparison with other systems.
TABLE 4
(Table 4 is provided as an image in the original publication.)
It should be noted that for simplicity of description, the above method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.
Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. A domain-adaptive natural language processing task based transfer learning system includes:
an open-realm part module and a domain-specific part module, wherein the open-realm part module is configured to:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific part module to: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
keeping the weight matrix of the repeated part of the neural network the same as the trained weight matrix of the open-domain neural network, training the specific-domain neural network, and obtaining the results of different natural language processing tasks from the output-layer results;
the domain-specific neural network specifically comprises:
the embedded layer, the LSTM layer and the independent embedded layer, the LSTM layer and the output layer of the open field part module;
the LSTM layer of the domain-specific part receives input from both the open domain part embedding layer and the domain-specific part embedding layer;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
wherein x_t is the combination, input via a transverse connection, of the two domains; i_t is the input gate of the t-th cell; f_t is the forget gate of the t-th cell; o_t is the output gate of the t-th cell; W_(·), b_(·) and U_(·) are trainable parameters; and c_t is the memory cell state of the t-th cell;
the output layer simultaneously receives the output of the LSTM layer of the open field part and the output of the LSTM layer of the specific field part for outputting;
y = f(W_y [h^O ; h^S] + b_y)
wherein f is an activation function; the output layer is consistent with the output layer of the open field part and is variable for different natural language processing tasks.
2. The system of claim 1, wherein the open-domain neural network comprises an Embedding layer, an LSTM layer and an output layer, wherein the Embedding layer embeds words in sentences into d-dimensional vectors, and the LSTM layer receives the words;
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
wherein i_t is the input gate of the t-th cell; f_t is the forget gate of the t-th cell; o_t is the output gate of the t-th cell; W_(·), b_(·) and U_(·) are trainable parameters; and c_t is the memory cell state of the t-th cell;
furthermore, the output layer is variable for different natural language processing tasks.
3. The domain-adaptive natural language processing task based migration learning system of claim 1, wherein the output layer outputs an emotional score when it is an emotion analysis task;
S_O = σ(W_O h^O + b_O)
S_S = σ(W_S [h^O ; h^S] + b_S)
where S_O and S_S are the outputs of the output layers of the open-domain part and the specific-domain part, respectively, and σ is a sigmoid function.
4. The method for transfer learning of the natural language processing task based on the field adaptation comprises the following steps:
an open-field partial learning step and a specific-field partial learning step, wherein the open-field partial learning step includes:
training by using the existing open-domain data, and modifying the output layer according to different natural language processing tasks, specifically comprising: training an LSTM network on the open domain to obtain the weight matrix of the open-domain neural network;
a domain-specific partial learning step comprising: constructing a neural network in a specific field, and modifying an output layer according to different natural language processing tasks, wherein the output layer is consistent with part of output layers in the open field;
keeping the weight matrix of the repeated part of the neural network the same as the trained weight matrix of the open-domain neural network, training the specific-domain neural network, and obtaining the results of different natural language processing tasks from the output-layer results;
the domain-specific neural network specifically comprises:
the embedded layer, the LSTM layer and the independent embedded layer, the LSTM layer and the output layer of the open field part module;
the LSTM layer of the domain-specific part receives input from both the open domain part embedding layer and the domain-specific part embedding layer;
x_t = [e_t^O ; e_t^S]
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
u_t = tanh(W_u x_t + U_u h_{t-1} + b_u)
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
wherein x_t is the combination, input via a transverse connection, of the two domains; i_t is the input gate of the t-th cell; f_t is the forget gate of the t-th cell; o_t is the output gate of the t-th cell; W_(·), b_(·) and U_(·) are trainable parameters; and c_t is the memory cell state of the t-th cell;
the output layer simultaneously receives the output of the LSTM layer of the open field part and the output of the LSTM layer of the specific field part for outputting;
y_t = f(W^(y) (h_t^O ⊕ h_t^S) + b^(y))
wherein f is an activation function; the output layer is consistent with that of the open-domain part and varies with the natural language processing task.
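The lateral connection and shared output layer of claim 4 can be illustrated with a minimal NumPy sketch. Gates, biases and the recurrence are omitted, and all names (e_open, W_lstm, open_part, etc.) are invented for illustration; this is not the patented implementation, only a shape-level picture of how the duplicated open-domain weights are copied in and how the two embeddings are concatenated:

```python
import numpy as np

rng = np.random.default_rng(1)
d, hdim = 4, 3  # embedding dimension and hidden size (arbitrary)

# 1) Pretend these are the weight matrices obtained by training the
#    open-domain network (step one of the method).
open_weights = {"W_lstm": rng.standard_normal((hdim, d)) * 0.1,
                "W_out":  rng.standard_normal((1, hdim)) * 0.1}

# 2) Build the domain-specific network: its duplicated (open) part is
#    initialised with the trained open-domain weight matrices, while its
#    own LSTM takes the lateral concatenation of BOTH embeddings,
#    hence input width 2*d, and its output layer sees both hidden states.
specific = {"open_part": {k: v.copy() for k, v in open_weights.items()},
            "W_lstm": rng.standard_normal((hdim, 2 * d)) * 0.1,
            "W_out":  rng.standard_normal((1, 2 * hdim)) * 0.1}

e_open = rng.standard_normal(d)   # open-domain embedding of one word
e_spec = rng.standard_normal(d)   # domain-specific embedding of the same word
x_t = np.concatenate([e_open, e_spec])          # transverse (lateral) connection
h_open = np.tanh(specific["open_part"]["W_lstm"] @ e_open)
h_spec = np.tanh(specific["W_lstm"] @ x_t)
# 3) The output layer receives both LSTM outputs at once.
y = specific["W_out"] @ np.concatenate([h_open, h_spec])
```

Only `specific["W_lstm"]`, `specific["W_out"]` and the specific embeddings would be trained in step two; the copied `open_part` carries the transferred knowledge.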
5. The domain-adaptation-based transfer learning method for natural language processing tasks of claim 4, wherein the open-domain neural network comprises an embedding layer, an LSTM layer and an output layer; the embedding layer embeds each word of the sentence into a d-dimensional vector, and the LSTM layer receives these word vectors:
i_t = σ(W^(i) x_t + U^(i) h_{t-1} + b^(i))
f_t = σ(W^(f) x_t + U^(f) h_{t-1} + b^(f))
o_t = σ(W^(o) x_t + U^(o) h_{t-1} + b^(o))
u_t = tanh(W^(u) x_t + U^(u) h_{t-1} + b^(u))
c_t = i_t ⊙ u_t + f_t ⊙ c_{t-1}
h_t = o_t ⊙ tanh(c_t)
wherein i_t is the input gate of the t-th cell; f_t is the forget gate of the t-th cell; o_t is the output gate of the t-th cell; W^(·), b^(·) and U^(·) are trainable parameters; and c_t is the memory cell state of the t-th cell;
furthermore, the output layer varies with the natural language processing task.
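The gate equations of claim 5 follow the standard LSTM cell, so they can be sketched directly in NumPy. The parameter dictionary keys (`W_i`, `U_f`, `b_o`, …) and sizes are chosen for illustration; they are not prescribed by the claim:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, p):
    # Gate equations from the claim: input gate i, forget gate f,
    # output gate o, candidate u, then cell state c and hidden output h.
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])
    u_t = np.tanh(p["W_u"] @ x_t + p["U_u"] @ h_prev + p["b_u"])
    c_t = i_t * u_t + f_t * c_prev      # elementwise "*" plays the role of ⊙
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

dim_in, dim_h = 4, 3   # word-vector and hidden sizes (arbitrary)
rng = np.random.default_rng(0)
p = {f"{m}_{g}": rng.standard_normal((dim_h, dim_in if m == "W" else dim_h)) * 0.1
     for m in ("W", "U") for g in ("i", "f", "o", "u")}
p.update({f"b_{g}": np.zeros(dim_h) for g in ("i", "f", "o", "u")})

h_t, c_t = lstm_cell(rng.standard_normal(dim_in), np.zeros(dim_h), np.zeros(dim_h), p)
```

Iterating `lstm_cell` over the d-dimensional word vectors of a sentence yields the hidden sequence that the task-specific output layer consumes.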
6. The method of claim 4, wherein, when the output layer serves a sentiment analysis task, it outputs a sentiment score:
S^O = σ(w^O · h^O + b^O)
S^S = σ(w^S · (h^O ⊕ h^S) + b^S)
wherein S^O and S^S are the outputs of the output layers of the open-domain part and the domain-specific part respectively, and σ is the sigmoid function.
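For the sentiment-analysis head of claim 6, the claim only fixes the sigmoid form of the two scores; the weight names (`w_open`, `w_spec`) and the choice to feed the domain-specific head the concatenation of both hidden states are illustrative assumptions in this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
h_open = rng.standard_normal(3)   # final hidden state of the open-domain LSTM
h_spec = rng.standard_normal(3)   # final hidden state of the specific-domain LSTM

# Hypothetical score weights for each output layer.
w_open = rng.standard_normal(3)
w_spec = rng.standard_normal(6)

s_open = sigmoid(w_open @ h_open)                            # S^O
s_spec = sigmoid(w_spec @ np.concatenate([h_open, h_spec]))  # S^S
```

The sigmoid squashes each score into (0, 1), so both outputs can be read directly as sentiment probabilities.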
CN201710877941.7A 2017-09-26 2017-09-26 System and method for transfer learning of natural language processing task based on field adaptation Active CN107657313B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710877941.7A CN107657313B (en) 2017-09-26 2017-09-26 System and method for transfer learning of natural language processing task based on field adaptation

Publications (2)

Publication Number Publication Date
CN107657313A CN107657313A (en) 2018-02-02
CN107657313B true CN107657313B (en) 2021-05-18

Family

ID=61131087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710877941.7A Active CN107657313B (en) 2017-09-26 2017-09-26 System and method for transfer learning of natural language processing task based on field adaptation

Country Status (1)

Country Link
CN (1) CN107657313B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3782080A1 (en) 2018-04-18 2021-02-24 DeepMind Technologies Limited Neural networks for scalable continual learning in domains with sequentially learned tasks
CN109190120B (en) * 2018-08-31 2020-01-21 第四范式(北京)技术有限公司 Neural network training method and device and named entity identification method and device
CN109782600A (en) * 2019-01-25 2019-05-21 东华大学 A method of autonomous mobile robot navigation system is established by virtual environment
CN111079447B (en) * 2020-03-23 2020-07-14 深圳智能思创科技有限公司 Chinese-oriented pre-training method and system
CN111539474B (en) * 2020-04-23 2022-05-10 大连理工大学 Classifier model transfer learning method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104538028A (en) * 2014-12-25 2015-04-22 清华大学 Continuous voice recognition method based on deep long and short term memory recurrent neural network
CN106528776A (en) * 2016-11-07 2017-03-22 上海智臻智能网络科技股份有限公司 Text classification method and device
CN106952181A (en) * 2017-03-08 2017-07-14 深圳市景程信息科技有限公司 Electric Load Prediction System based on long Memory Neural Networks in short-term
CN106980683A (en) * 2017-03-30 2017-07-25 中国科学技术大学苏州研究院 Blog text snippet generation method based on deep learning
CN107168955A (en) * 2017-05-23 2017-09-15 南京大学 Word insertion and the Chinese word cutting method of neutral net using word-based context

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A Convolution-LSTM-Based Deep Neural Network for Cross-Domain MOOC Forum Post Classification; Xiaocong Wei et al.; Information; 2017-07-30; Vol. 8, No. 3; pp. 1-16 *
Recurrent Neural Collective Classification; Derek D. Monner, James A. Reggia; IEEE Transactions on Neural Networks and Learning Systems; 2013-12-31; Vol. 24, No. 12; pp. 1932-1943 *
Research on Image Annotation Methods Based on Transfer Learning and Deep Convolutional Features; Song Guanghui; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2017-08-15; pp. 33-47 of the thesis *

Also Published As

Publication number Publication date
CN107657313A (en) 2018-02-02

Similar Documents

Publication Publication Date Title
CN107657313B (en) System and method for transfer learning of natural language processing task based on field adaptation
Jupalle et al. Automation of human behaviors and its prediction using machine learning
CN111368996B (en) Retraining projection network capable of transmitting natural language representation
CN113987209B (en) Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment
CN112257858B (en) Model compression method and device
CN109992779B (en) Emotion analysis method, device, equipment and storage medium based on CNN
CN112131366A (en) Method, device and storage medium for training text classification model and text classification
Gallant et al. Representing objects, relations, and sequences
Dinsmore The symbolic and connectionist paradigms: closing the gap
CN108962224B (en) Joint modeling method, dialogue method and system for spoken language understanding and language model
CN107408384A (en) The end-to-end speech recognition of deployment
CN114565104A (en) Language model pre-training method, result recommendation method and related device
CN113987179A (en) Knowledge enhancement and backtracking loss-based conversational emotion recognition network model, construction method, electronic device and storage medium
CN113901191A (en) Question-answer model training method and device
CN110647919A (en) Text clustering method and system based on K-means clustering and capsule network
Giles Adaptive Processing of Sequences and Data Structures: International Summer School on Neural Networks," ER Caianiello", Vietri Sul Mare, Salerno, Italy, September 6-13, 1997, Tutorial Lectures
Cree et al. Computational models of semantic memory
CN110442693B (en) Reply message generation method, device, server and medium based on artificial intelligence
CN112131879A (en) Relationship extraction system, method and device
Zaman et al. Convolutional recurrent neural network for question answering
Jati et al. Multilingual named entity recognition model for Indonesian health insurance question answering system
Luo Application analysis of multimedia technology in college English curriculum
CN111625623B (en) Text theme extraction method, text theme extraction device, computer equipment, medium and program product
Wang et al. How internal neurons represent the short context: an emergent perspective
CN109033088B (en) Neural network-based second language learning model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant