CN113987209A - Natural language processing method and device based on knowledge-guided prefix fine tuning, computing equipment and storage medium - Google Patents
- Publication number
- CN113987209A CN113987209A CN202111300021.1A CN202111300021A CN113987209A CN 113987209 A CN113987209 A CN 113987209A CN 202111300021 A CN202111300021 A CN 202111300021A CN 113987209 A CN113987209 A CN 113987209A
- Authority
- CN
- China
- Prior art keywords
- prefix
- training
- language model
- words
- task
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Abstract
The invention discloses a natural language processing method, apparatus, computing device and storage medium based on knowledge-guided prefix fine-tuning. The method first constructs prefix prompt words related to a downstream task, together with label words related to each task category obtained from a knowledge graph. The embedded vectors of the prefix prompt words are then spliced with the key values and the values of the input text before the self-attention calculation, so that the prefix prompt words and the input text are closely combined for learning; at the same time, all the label words are combined to determine the learning labels. In other words, ontology knowledge related to the task categories is used to guide the fine-tuning of the pre-trained language model, so that the fine-tuned model predicts the downstream task better and its prediction accuracy is improved. When the downstream tasks are emotion analysis and relation extraction tasks, the pre-trained language model obtained by the corresponding method improves the accuracy of emotion analysis and relation extraction respectively.
Description
Technical Field
The invention belongs to the field of natural language processing, and particularly relates to a natural language processing method and apparatus based on knowledge-guided prefix fine-tuning, a computing device and a storage medium.
Background
A pre-trained model is a model obtained by training on a large reference dataset; large pre-trained language models such as BERT, GPT and XLNet are obtained by pre-training on massive corpora. Because a pre-trained model has undergone unsupervised learning on a large corpus, knowledge in the corpus has already been transferred into the embeddings of the pre-trained model.
Fine-tuning is the main method for transferring pre-trained model (PTM) knowledge to downstream tasks, and the fine-tuning methods in common use all add a task-specific network structure so as to adapt the model to a particular task. However, such fine-tuning methods have the following drawbacks: (1) low parameter efficiency: each downstream task has its own set of fine-tuning parameters; (2) the pre-training objective differs from the fine-tuning objective, so the generalization ability of the pre-trained model is poor; (3) unlike the parameters learned in the pre-training stage, the newly added network parameters require a large amount of data to learn. These shortcomings lead to poor task performance on emotion analysis tasks, relation extraction tasks and various classification tasks.
Prior patent document CN112100383A discloses a meta-knowledge fine-tuning method and platform for multi-task language models. Based on cross-domain typicality-score learning, the method obtains highly transferable common knowledge, i.e. meta-knowledge, across different datasets of similar tasks, and makes the learning processes of similar tasks on the corresponding domains mutually associated and mutually reinforcing. This improves the fine-tuning effect of similar downstream tasks on datasets from different domains, as well as the parameter-initialization and generalization abilities of a general language model for similar tasks. However, the method does not consider ontology knowledge, and its fine-tuning effect on downstream tasks is poor.
Patent document CN113032559A discloses a language-model fine-tuning method for text classification in low-resource agglutinative languages. It constructs a low-noise fine-tuning dataset through morphological analysis and stem extraction, fine-tunes a cross-lingual pre-trained model on this dataset, and thereby provides a meaningful, easy-to-use feature extractor for downstream text classification, better selecting relevant semantic and syntactic information from the pre-trained language model and using these features for the downstream tasks. However, the method does not consider ontology knowledge, and its fine-tuning effect on downstream tasks is poor.
Disclosure of Invention
In view of the foregoing, an object of the present invention is to provide a natural language processing method, apparatus, computing device and storage medium based on knowledge-guided prefix fine-tuning, in which a pre-trained language model is fine-tuned using prefix prompts and ontology knowledge related to the downstream task, so as to improve the accuracy of the model's predictions on that task.
In a first aspect, an embodiment provides a natural language processing method based on knowledge-guided prefix fine tuning, including the following steps:
constructing an initial prefix cue word according to a downstream task, and mapping the initial prefix cue word through a function into a number of embedded vectors equal to the number of layers of a pre-trained language model, wherein the dimension of each embedded vector is twice that of the corresponding model layer;
linking each task category of the downstream tasks to a knowledge graph, and taking words related to each task category in the knowledge graph as tag words;
converting the pre-trained language model to a masked-token form of the downstream task according to the prefix cue words and the label words, and performing fine-tuning training on the pre-trained language model, which comprises: inputting a training text into the pre-trained language model; at each layer, splitting the embedded vector of the prefix cue word into 2 parts with the same dimension as the corresponding model layer and splicing them with the key value and the value of the training text respectively so as to participate in the self-attention calculation; and, taking the weighted result of all label words corresponding to each task category as the label, simultaneously optimizing the embedded vector of the prefix cue word, the parameters of the pre-trained language model, and the weights of the label words;
when the method is applied, the predicted text and the embedded vectors of the prefix cue words are input into the fine-tuned pre-trained language model, and after calculation the weighted result of the predicted values of all the label words with their corresponding weights is taken as the prediction result.
Preferably, the mapping the initial prefix cue words into the embedded vectors with the same number as the number of layers of the pre-training language model through the function includes:
and initially encoding the initial prefix cue words into initial embedded vectors, and then mapping the initial embedded vectors once by adopting function mapping to obtain the embedded vectors with the same number as the number of layers of the pre-training language model.
Preferably, the mapping the initial prefix cue words into the embedded vectors with the same number as the number of layers of the pre-training language model through the function includes:
and initially encoding the initial prefix cue words into initial embedded vectors, and mapping the initial embedded vectors to each layer of the pre-trained language model through a multi-layer MLP, so as to obtain the embedded vector corresponding to each layer.
Preferably, when the pre-trained language model is subjected to fine-tuning training, the self-attention calculation is performed as:

h_l = softmax( Q_l [P_K^(l); K_l]^T / √d ) [P_V^(l); V_l]

wherein l denotes the layer index, Q_l denotes the query value, K_l denotes the key value, V_l denotes the value, P_K^(l) denotes the part of the prefix cue word's embedded vector that is spliced with the key value, P_V^(l) denotes the part that is spliced with the value, softmax(·) denotes the softmax function, and [ ; ] denotes the splicing operation.
Preferably, the pre-trained language model comprises: BERT, RoBERTa, and GPT-series models.
In one embodiment, the downstream task is an emotion analysis task, and the corresponding initial prefix word is "emotion analysis". Each task category of the emotion analysis task is linked to a financial-domain knowledge graph to find the words related to each category as label words. The pre-trained language model is then converted to a masked-token emotion analysis task according to "emotion analysis" and the label words, and fine-tuning training is performed. Finally, in application, the predicted text and the embedded vectors for "emotion analysis" are input into the fine-tuned pre-trained language model, and after calculation the weighted result of the predicted values of all the label words with their corresponding weights is taken as the emotion analysis prediction result.
In another embodiment, the downstream task is a relation extraction task, and the corresponding initial prefix word is "relation extraction". Each task category of the relation extraction task is linked to a medical-domain knowledge graph to find the words related to each category as label words. The pre-trained language model is then converted to a masked-token relation extraction task according to "relation extraction" and the label words, and fine-tuning training is performed. Finally, in application, the predicted text and the embedded vectors for "relation extraction" are input into the fine-tuned pre-trained language model, and after calculation the weighted result of the predicted values of all the label words with their corresponding weights is taken as the relation extraction result.
In a second aspect, an embodiment provides a natural language processing apparatus based on knowledge-guided prefix fine tuning, including:
the prefix cue word processing module is used for constructing an initial prefix cue word according to a downstream task, and mapping the initial prefix cue word into embedded vectors with the same number as the number of layers of the pre-training language model through a function, wherein the dimension of each embedded vector is 2 times that of the corresponding model layer;
the system comprises a tag word processing module, a knowledge graph and a task processing module, wherein the tag word processing module is used for linking each task category of a downstream task to the knowledge graph and taking words related to each task category in the knowledge graph as tag words;
the fine-tuning module is used for converting the pre-trained language model to a masked-token form of the downstream task according to the prefix cue words and the label words, and performing fine-tuning training on the pre-trained language model, which comprises: inputting a training text into the pre-trained language model; at each layer, splitting the embedded vector of the prefix cue word into 2 parts with the same dimension as the corresponding model layer and splicing them with the key value and the value of the training text respectively so as to participate in the self-attention calculation; and, taking the weighted result of all label words corresponding to each task category as the label, simultaneously optimizing the embedded vector of the prefix cue word, the parameters of the pre-trained language model, and the weights of the label words;
and the application module is used for inputting the predicted text and the embedded vectors of the prefix prompt words into the fine-tuned pre-trained language model, and after calculation taking the weighted result of the predicted values of all the label words with their corresponding weights as the prediction result.
In a third aspect, an embodiment provides a computing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the natural language processing method based on knowledge-guided prefix fine tuning described in the first aspect when executing the computer program.
In a fourth aspect, an embodiment provides a computer storage medium, on which a computer program is stored; when the computer program is executed by a processor, the natural language processing method based on knowledge-guided prefix fine-tuning of the first aspect is implemented.
Compared with the prior art, the invention has the beneficial effects that at least:
according to the technical scheme provided by the embodiment, the prefix prompt words related to the downstream task and the label words related to the task categories obtained from the knowledge map are firstly constructed, then the embedded vectors of the prefix prompt words are spliced with the key values and the value values of the input text, and then self-entry calculation is carried out, so that the prefix prompt words and the input text are closely combined for learning, and meanwhile, all the label words are integrated to determine learning labels, namely, body knowledge related to the task categories is used for guiding fine adjustment of the pre-training language model, so that the prediction effect of the fine-adjusted pre-training language model on the downstream task is better, and the prediction accuracy of the pre-training language model is improved. The downstream tasks are emotion analysis tasks and relation extraction tasks, and the emotion analysis accuracy and the relation extraction accuracy improved by the pre-training language model obtained by the corresponding method are adopted.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flowchart of a natural language processing method based on knowledge-guided prefix fine-tuning provided by an embodiment;
fig. 2 is a schematic structural diagram of a natural language processing apparatus for guiding prefix fine-tuning based on knowledge according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
To address the inaccuracy of emotion analysis and relation extraction performed with conventionally fine-tuned pre-trained language models, this embodiment provides a way of fine-tuning a pre-trained language model guided by ontology knowledge and prefix prompt words related to the emotion analysis and relation extraction tasks; a pre-trained language model obtained through this fine-tuning can improve the task prediction results.
Fig. 1 is a flowchart of the natural language processing method based on knowledge-guided prefix fine-tuning provided by an embodiment. As shown in fig. 1, the method includes the following steps:
step 1, constructing an initial prefix cue word related to a downstream task, and mapping to obtain an embedded vector.
In an embodiment, the prefix prompt words are phrases closely related to the downstream task, where a phrase is a text composed of at least one word. When the downstream task is emotion analysis of a given text sentence (e.g., "today the stock market is all green, which is bad"), the prefix prompt is "emotion analysis". When the downstream task is relation extraction for a text sentence (e.g., "external irradiation can effectively improve pain symptoms in patients with chronic pancreatitis"), the prefix prompt is "relation extraction". After the prefix prompt words are initialized, they are mapped to obtain a number of embedded vectors equal to the number of layers of the pre-trained language model. So that the embedded vectors can be combined with the key value and the value of each layer respectively, the dimension of each embedded vector must be twice the dimension of the corresponding model layer.
In an embodiment, the initial prefix prompt words may first be encoded into initial embedded vectors, which are then mapped once by a mapping function to obtain a number of embedded vectors equal to the number of layers of the pre-trained language model. For example, if the pre-trained language model has 10 layers and the prefix representation at each layer has size 5 × 768, the mapping function maps directly to an embedded vector of size 5 × 768 × 10 × 2, where 10 indicates that 10 embedded vectors of size 5 × 768 × 2 are obtained, and 2 indicates that each vector's dimension is twice the per-layer size of 5 × 768.
In an embodiment, the initial prefix hint words may also be initially encoded into initial embedded vectors, and the initial embedded vectors are mapped to each layer of the pre-training language model by using multiple layers of MLPs, so as to obtain embedded vectors corresponding to each layer. That is, after multiple times of mapping, the embedded vector corresponding to each layer is obtained, but the dimension of the embedded vector corresponding to each layer is also ensured to be 2 times of the size of each layer.
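The one-shot mapping variant above can be sketched as follows. This is a minimal NumPy illustration assuming the example dimensions from the text (10 layers, 5 prefix tokens, hidden size 768); the random initialization, variable names and the 0.02 scale are illustrative placeholders, not prescribed by the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

num_layers = 10   # layers of the pre-trained model (example from the text)
prefix_len = 5    # number of prefix prompt tokens
hidden = 768      # hidden size of each model layer

# Initial encoding of the prefix prompt words: (prefix_len, hidden)
init_embed = rng.standard_normal((prefix_len, hidden))

# Single function mapping: one linear map producing, for every layer, an
# embedding of twice the layer dimension (a key half and a value half)
W_map = rng.standard_normal((hidden, num_layers * 2 * hidden)) * 0.02
per_layer = (init_embed @ W_map).reshape(prefix_len, num_layers, 2, hidden)

# Split each layer's embedding into the key part and the value part
prefix_k = per_layer[:, :, 0, :]  # spliced with the keys at each layer
prefix_v = per_layer[:, :, 1, :]  # spliced with the values at each layer

print(prefix_k.shape, prefix_v.shape)
```

The multi-layer MLP variant differs only in how `per_layer` is produced (one MLP head per model layer instead of a single linear map); the resulting key/value halves have the same shapes.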
Step 2: link each task category of the downstream task to the knowledge graph, and take the words related to each task category in the knowledge graph as label words.
In an embodiment, the task categories are related to the downstream task. For the emotion analysis task, the task categories include positive emotion, negative emotion and so on. Positive emotion and negative emotion can be linked to financial-domain knowledge resources such as the HowNet sentiment dictionary to obtain words related to each category as label words; for example, words related to positive emotion such as "good review", "excellent" and "good" can be collected into a label-word set for constructing the task's supervised learning labels. For the relation extraction task, the task categories include radiotherapy and so on. Radiotherapy can be linked to medical-domain knowledge graphs such as DiseasKG and Yidu-N7K to obtain text related to radiotherapy, such as "external irradiation can effectively improve the pain symptoms of patients with chronic pancreatitis", from which the label words "external irradiation", "chronic pancreatitis" and "pain symptoms" are extracted.
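A minimal sketch of what such knowledge-graph-derived label-word sets could look like in code; the words below are illustrative placeholders (a real system would query HowNet, DiseasKG or Yidu-N7K), and the uniform initialization of the learnable per-word weights is an assumption:

```python
# Hypothetical label-word sets retrieved from a knowledge graph,
# keyed by task category of the downstream task
label_words = {
    "positive_emotion": ["good", "excellent", "favorable"],
    "negative_emotion": ["bad", "poor", "unfavorable"],
}

# One learnable weight per label word, initialized uniformly; these are
# optimized jointly with the prefix embeddings during fine-tuning
label_weights = {
    category: [1.0 / len(words)] * len(words)
    for category, words in label_words.items()
}

print(label_weights["positive_emotion"])
```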
Step 3: convert the pre-trained language model to a masked-token form of the downstream task according to the prefix prompt words and the label words, and perform fine-tuning training on the pre-trained language model.
In an embodiment, the pre-trained language model may be a BERT, RoBERTa or GPT-series model. These models map the input text to query, key and value vectors, and all contain a self-attention mechanism for performing the self-attention calculation.
In an embodiment, the fine-tuning training process comprises: inputting a training text into the pre-trained language model; at each layer, splitting the embedded vector of the prefix prompt word into 2 parts with the same dimension as the corresponding model layer and splicing them with the key value and the value of the training text respectively so as to participate in the self-attention calculation; and, taking the weighted result of all label words corresponding to each task category as the label, simultaneously optimizing the embedded vector of the prefix prompt word, the parameters of the pre-trained language model, and the weights of the label words.
In the l-th layer of the pre-trained language model, the representation X_l of the input text sequence is first mapped to the query/key/value vectors:

Q_l = X_l W_Q, K_l = X_l W_K, V_l = X_l W_V

wherein W_Q, W_K and W_V are model parameters. The self-attention calculation then becomes:

h_l = softmax( Q_l [P_K^(l); K_l]^T / √d ) [P_V^(l); V_l]

wherein Q_l denotes the query value, K_l denotes the key value, V_l denotes the value, P_K^(l) denotes the part of the prefix prompt word's embedded vector that is spliced with the key value, P_V^(l) denotes the part that is spliced with the value, softmax(·) denotes the softmax function, and [ ; ] denotes the splicing operation.
In an embodiment, a weight is initialized for each label word, and the label words are then combined according to their weights to obtain the training label. For example, suppose the weights of "external irradiation", "chronic pancreatitis" and "pain symptoms" are initialized to 0.2, 0.5 and 0.3 respectively. When the pre-trained language model performs the masked-token prediction task, i.e. predicts the vocabulary at the masked position [MASK] of the input text sequence, the label of this category is treated as the weighted combination 0.2 × "external irradiation" + 0.5 × "chronic pancreatitis" + 0.3 × "pain symptoms". Then, based on the learnable embedded vectors of the prefix prompt words and the weight vector, the parameters of the pre-trained language model are fine-tuned on sample data, and better performance of the pre-trained language model can be obtained.
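The weighted label-word scoring at the [MASK] position can be sketched as follows. The 0.2/0.5/0.3 weights mirror the example above; the toy vocabulary and the model logits are illustrative placeholders (in practice the logits come from the fine-tuned model's masked-token head over its real vocabulary):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy vocabulary; in practice this is the pre-trained model's vocabulary
vocab = ["external irradiation", "chronic pancreatitis", "pain symptoms", "other"]
word_ids = {w: i for i, w in enumerate(vocab)}

# Label words and weights for one task category, as in the example above
category_words = ["external irradiation", "chronic pancreatitis", "pain symptoms"]
weights = [0.2, 0.5, 0.3]

# Hypothetical model logits at the [MASK] position
mask_logits = np.array([2.0, 1.0, 0.5, -1.0])
probs = softmax(mask_logits)

# Category score: weighted combination of the label words' predicted values;
# the category with the highest score is taken as the prediction
score = sum(w * probs[word_ids[t]] for t, w in zip(category_words, weights))
print(round(float(score), 4))
```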
Step 4: in application, input the predicted text and the embedded vectors of the prefix prompt words into the fine-tuned pre-trained language model, and after calculation take the weighted result of the predicted values of all the label words with their corresponding weights as the prediction result.
In the embodiment, for the emotion prediction task, the embedded vectors of the predicted text and the prefix prompt words are input into the fine-tuned pre-training language model, and the prediction values of all the label words and the weighting results of the corresponding weights are used as the prediction results of the predicted text after calculation.
The natural language processing method based on knowledge-guided prefix fine-tuning provided by this embodiment generates the multi-layer embedded vectors of the knowledge prefix prompt words and the label-word set from the downstream task description and an external knowledge base, and converts the downstream task into a masked-token prediction task.
In the natural language processing method based on knowledge-guided prefix fine-tuning provided by the above embodiment, the pre-trained language model is a neural network model dedicated to learning semantic information from a large-scale unlabeled corpus in an unsupervised manner. It is a complex learning model composed of multiple layers of neural networks, and it can capture the semantic information in text more accurately, thereby improving the model's accuracy on downstream tasks.
By adopting knowledge-guided prefix fine-tuning, the natural language processing method provided by the embodiment can remarkably improve the accuracy and efficiency of downstream tasks and meet the requirements of different applications; it is not limited to classification tasks in natural language processing, and is also applicable to text generation tasks.
As shown in fig. 2, the embodiment further provides a fine tuning apparatus 200 for a language model, including:
the prefix cue word processing module 201 is configured to construct an initial prefix cue word according to a downstream task, and map the initial prefix cue word into embedded vectors with the same number as the number of layers of the pre-training language model through a function, wherein the dimension of each embedded vector is 2 times that of the corresponding model layer;
the tag word processing module 202 is configured to link each task category of the downstream task to the knowledge graph, and use a word related to each task category in the knowledge graph as a tag word;
the fine tuning module 203 is configured to convert the pre-training language model into a downstream task of masking the token according to the prefix cue word and the tag word, and perform fine tuning training on the pre-training language model, including: inputting a training text into a pre-training language model, splitting an embedded vector of a prefix cue word into 2 parts with the same dimensionality as that of a corresponding model layer on each layer, splicing a key value and a value corresponding to the training text respectively, participating in self-attribute calculation, and simultaneously optimizing the embedded vector of the prefix cue word, parameters of the pre-training language model and the weight of a label word by taking the weighted result of all label words corresponding to each task category as a label;
and the application module 204 is configured to input the embedded vectors of the predicted text and the prefix prompt word into the fine-tuned pre-training language model, and take the prediction values of all the label words and the weighting results of the corresponding weights as prediction results through calculation.
It should be noted that the division of functional modules described above is only an example; in practical applications, the above functions may be distributed among different functional modules as needed, that is, the internal structure of the terminal or server may be divided into different functional modules to perform all or part of the functions described above. In addition, the natural language processing apparatus provided in the embodiment and the natural language processing method embodiment belong to the same concept; its specific implementation is detailed in the method embodiment and is not described again here.
The embodiments also provide a computing device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the natural language processing method based on knowledge-guided prefix fine-tuning.
The embodiments also provide a computer storage medium having stored thereon a computer program that, when executed by a processor, implements the natural language processing method based on knowledge-guided prefix fine-tuning.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware; the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to the memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory.
The above embodiments are intended to illustrate the technical solutions and advantages of the present invention. It should be understood that they are only preferred embodiments and are not intended to limit the present invention; any modifications, additions, equivalent substitutions, and the like made within the scope of the principles of the present invention shall fall within the protection scope of the present invention.
Claims (10)
1. A natural language processing method based on knowledge-guided prefix fine tuning is characterized by comprising the following steps:
constructing an initial prefix prompt word according to a downstream task, and mapping the initial prefix prompt word, via a function, into embedded vectors equal in number to the layers of a pre-training language model, wherein the dimension of each embedded vector is 2 times that of the corresponding model layer;
linking each task category of the downstream task to a knowledge graph, and taking the words related to each task category in the knowledge graph as label words;
converting the pre-training language model into a downstream masked-token prediction task according to the prefix prompt words and the label words, and performing fine-tuning training on the pre-training language model, comprising: inputting a training text into the pre-training language model; at each layer, splitting the embedded vector of the prefix prompt word into two parts with the same dimension as the corresponding model layer and concatenating them with the key and the value of the training text respectively, to participate in the self-attention calculation; and, taking the weighted result of all label words corresponding to each task category as the label, simultaneously optimizing the embedded vectors of the prefix prompt words, the parameters of the pre-training language model, and the weights of the label words;
in application, inputting the predicted text and the embedded vectors of the prefix prompt words into the fine-tuned pre-training language model and, after calculation, taking the weighted result of the predicted values of all label words and their corresponding weights as the prediction result.
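The knowledge-guided step above — linking task categories to a knowledge graph and collecting related words as label words — can be sketched as follows. The toy graph `TOY_KG` and the helper name `collect_label_words` are assumptions for illustration only; a real system would query a domain knowledge graph (financial, medical, etc.).

```python
# A toy knowledge graph as an adjacency mapping from a category to related words.
TOY_KG = {
    "positive": ["rise", "profit", "growth"],
    "negative": ["fall", "loss", "risk"],
    "neutral":  ["steady", "unchanged"],
}

def collect_label_words(task_categories, knowledge_graph):
    """Link each task category to the graph and take its related words as that
    category's label words, falling back to the category name itself if the
    category has no graph entry."""
    return {c: knowledge_graph.get(c, [c]) for c in task_categories}
```

The resulting word lists are what the fine-tuning step later scores and weights at the masked position.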
2. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein mapping the initial prefix prompt word via a function into embedded vectors equal in number to the layers of the pre-training language model comprises:
initially encoding the initial prefix prompt word into an initial embedded vector, and then applying a single function mapping to the initial embedded vector to obtain embedded vectors equal in number to the layers of the pre-training language model.
3. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein mapping the initial prefix prompt word via a function into embedded vectors equal in number to the layers of the pre-training language model comprises:
initially encoding the initial prefix prompt word into an initial embedded vector, and mapping the initial embedded vector to each layer of the pre-training language model by means of multi-layer perceptrons (MLPs), to obtain the embedded vector corresponding to each layer.
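A minimal numpy sketch of the per-layer MLP mapping in claim 3, under assumed sizes (number of layers, prefix length, embedding and hidden dimensions); the two-layer tanh MLP and its initialization are illustrative choices, not specified by the claim. Note each layer's output has dimension `2 * EMB_DIM`, matching the claim's requirement that the embedded vector later splits into a key part and a value part.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_LAYERS, PREFIX_LEN, EMB_DIM, HIDDEN = 3, 4, 16, 32  # assumed sizes

def make_mlp(in_dim, hidden, out_dim):
    # one small two-layer MLP per model layer (illustrative parameterization)
    return {"W1": rng.normal(0, 0.02, (in_dim, hidden)),
            "W2": rng.normal(0, 0.02, (hidden, out_dim))}

# one MLP per layer of the pre-training language model; the output dimension
# is 2 * EMB_DIM so each layer's vector can split into key and value halves
mlps = [make_mlp(EMB_DIM, HIDDEN, 2 * EMB_DIM) for _ in range(NUM_LAYERS)]
initial_embedding = rng.normal(0, 0.02, (PREFIX_LEN, EMB_DIM))

# map the single initial embedding to one embedded vector per model layer
layer_vectors = [np.tanh(initial_embedding @ m["W1"]) @ m["W2"] for m in mlps]
```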
4. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein, during fine-tuning training of the pre-training language model, the self-attention calculation is performed as:

$$\mathrm{head}^{l} = \mathrm{softmax}\!\left(\frac{Q^{l}\,[P_{K}^{l};\,K^{l}]^{\top}}{\sqrt{d}}\right)[P_{V}^{l};\,V^{l}]$$

wherein $l$ denotes the layer index, $Q^{l}$ the query, $K^{l}$ the key, and $V^{l}$ the value of that layer; $P_{K}^{l}$ denotes the part of the prefix prompt word's embedded vector split out to correspond to the key, and $P_{V}^{l}$ the part corresponding to the value; $\mathrm{softmax}(\cdot)$ denotes the softmax function, $d$ the dimension of the model layer, and $[\,;\,]$ the concatenation operation.
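The concatenated attention of claim 4 can be sketched directly from the formula; the function name `prefix_attention` and the array shapes are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def prefix_attention(Q, K, V, P_k, P_v):
    """Scaled dot-product attention with the prefix's key/value halves
    prepended to the training text's keys and values.

    Q: [seq, d] queries from the text; K, V: [seq, d] keys/values from the text;
    P_k, P_v: [prefix_len, d] the two halves of this layer's prefix embedding.
    """
    K_cat = np.concatenate([P_k, K], axis=0)  # [P_K^l ; K^l]
    V_cat = np.concatenate([P_v, V], axis=0)  # [P_V^l ; V^l]
    d = Q.shape[-1]
    return softmax(Q @ K_cat.T / np.sqrt(d)) @ V_cat  # [seq, d]
```

Because only the concatenated keys and values change, the text tokens attend to the prefix positions without the prefix ever appearing in the input sequence itself.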
5. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein the pre-training language model comprises the BERT, RoBERTa, and GPT series of models.
6. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein the downstream task is an emotion analysis task and the corresponding initial prefix prompt word is "emotion analysis"; each task category of the emotion analysis task is linked to a financial-domain knowledge graph to find the words related to each task category as label words; the pre-training language model is then converted into a masked-token emotion analysis task according to the emotion analysis prefix and the label words, and fine-tuning training is performed on the pre-training language model; finally, in application, the predicted text and the embedded vectors of the emotion analysis prefix are input into the fine-tuned pre-training language model and, after calculation, the weighted result of the predicted values of all label words and their corresponding weights is taken as the emotion analysis prediction result.
7. The natural language processing method based on knowledge-guided prefix fine-tuning according to claim 1, wherein the downstream task is a relation extraction task and the corresponding initial prefix prompt word is "relation extraction"; each task category of the relation extraction task is linked to a medical-domain knowledge graph to find the words related to each task category as label words; the pre-training language model is then converted into a masked-token relation extraction task according to the relation extraction prefix and the label words, and fine-tuning training is performed on the pre-training language model; finally, in application, the predicted text and the embedded vectors of the relation extraction prefix are input into the fine-tuned pre-training language model and, after calculation, the weighted result of the predicted values of all label words and their corresponding weights is taken as the relation extraction result.
8. A natural language processing apparatus based on knowledge-guided prefix fine-tuning, comprising:
a prefix prompt word processing module, used for constructing an initial prefix prompt word according to a downstream task and mapping the initial prefix prompt word, via a function, into embedded vectors equal in number to the layers of the pre-training language model, wherein the dimension of each embedded vector is 2 times that of the corresponding model layer;
a label word processing module, used for linking each task category of the downstream task to a knowledge graph and taking the words related to each task category in the knowledge graph as label words;
a fine tuning module, used for converting the pre-training language model into a downstream masked-token prediction task according to the prefix prompt words and the label words, and performing fine-tuning training on the pre-training language model, including: inputting a training text into the pre-training language model; at each layer, splitting the embedded vector of the prefix prompt word into two parts with the same dimension as the corresponding model layer and concatenating them with the key and the value of the training text respectively, to participate in the self-attention calculation; and, taking the weighted result of all label words corresponding to each task category as the label, simultaneously optimizing the embedded vectors of the prefix prompt words, the parameters of the pre-training language model, and the weights of the label words;
and an application module, used for inputting the predicted text and the embedded vectors of the prefix prompt words into the fine-tuned pre-training language model and, after calculation, taking the weighted result of the predicted values of all label words and their corresponding weights as the prediction result.
9. A computing device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the natural language processing method based on knowledge-guided prefix fine-tuning according to any one of claims 1-7.
10. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the natural language processing method based on knowledge-guided prefix fine-tuning according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111300021.1A CN113987209B (en) | 2021-11-04 | 2021-11-04 | Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111300021.1A CN113987209B (en) | 2021-11-04 | 2021-11-04 | Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113987209A true CN113987209A (en) | 2022-01-28 |
CN113987209B CN113987209B (en) | 2024-05-24 |
Family
ID=79746414
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111300021.1A Active CN113987209B (en) | 2021-11-04 | 2021-11-04 | Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113987209B (en) |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114612290A (en) * | 2022-03-11 | 2022-06-10 | 北京百度网讯科技有限公司 | Training method of image editing model and image editing method |
CN114792097A (en) * | 2022-05-14 | 2022-07-26 | 北京百度网讯科技有限公司 | Method and device for determining prompt vector of pre-training model and electronic equipment |
CN114862493A (en) * | 2022-04-07 | 2022-08-05 | 北京中科深智科技有限公司 | Generation model for generating personalized commodity description based on light-weight fine adjustment |
CN114943211A (en) * | 2022-07-25 | 2022-08-26 | 北京澜舟科技有限公司 | Text generation method and system based on prefix and computer readable storage medium |
CN115563283A (en) * | 2022-10-20 | 2023-01-03 | 北京大学 | Text classification method based on prompt learning |
CN115640520A (en) * | 2022-11-07 | 2023-01-24 | 北京百度网讯科技有限公司 | Method, device and storage medium for pre-training cross-language cross-modal model |
CN115906815A (en) * | 2023-03-08 | 2023-04-04 | 北京语言大学 | Error correction method and device for modifying one or more types of wrong sentences |
CN116186200A (en) * | 2023-01-19 | 2023-05-30 | 北京百度网讯科技有限公司 | Model training method, device, electronic equipment and storage medium |
CN116306917A (en) * | 2023-05-17 | 2023-06-23 | 卡奥斯工业智能研究院(青岛)有限公司 | Task processing method, device, equipment and computer storage medium |
CN116737938A (en) * | 2023-07-19 | 2023-09-12 | 人民网股份有限公司 | Fine granularity emotion detection method and device based on fine tuning large model online data network |
CN116861928A (en) * | 2023-07-07 | 2023-10-10 | 北京中关村科金技术有限公司 | Method, device, equipment and medium for generating instruction fine tuning data |
CN116956835A (en) * | 2023-09-15 | 2023-10-27 | 京华信息科技股份有限公司 | Document generation method based on pre-training language model |
CN117194637A (en) * | 2023-09-18 | 2023-12-08 | 深圳市大数据研究院 | Multi-level visual evaluation report generation method and device based on large language model |
CN117216227A (en) * | 2023-10-30 | 2023-12-12 | 广东烟草潮州市有限责任公司 | Tobacco enterprise intelligent information question-answering method based on knowledge graph and large language model |
CN117332419A (en) * | 2023-11-29 | 2024-01-02 | 武汉大学 | Malicious code classification method and device based on pre-training |
CN117474084A (en) * | 2023-12-25 | 2024-01-30 | 淘宝(中国)软件有限公司 | Bidirectional iteration method, equipment and medium for pre-training model and downstream sequence task |
WO2024031891A1 (en) * | 2022-08-10 | 2024-02-15 | 浙江大学 | Fine tuning method and apparatus for knowledge representation-disentangled classification model, and application |
CN117875273A (en) * | 2024-03-13 | 2024-04-12 | 中南大学 | News abstract automatic generation method, device and medium based on large language model |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111444721A (en) * | 2020-05-27 | 2020-07-24 | 南京大学 | Chinese text key information extraction method based on pre-training language model |
CN112100383A (en) * | 2020-11-02 | 2020-12-18 | 之江实验室 | Meta-knowledge fine tuning method and platform for multitask language model |
US20210035556A1 (en) * | 2019-08-02 | 2021-02-04 | Babylon Partners Limited | Fine-tuning language models for supervised learning tasks via dataset preprocessing |
CN112699218A (en) * | 2020-12-30 | 2021-04-23 | 成都数之联科技有限公司 | Model establishing method and system, paragraph label obtaining method and medium |
CN113033182A (en) * | 2021-03-25 | 2021-06-25 | 网易(杭州)网络有限公司 | Text creation auxiliary method and device and server |
CN113468877A (en) * | 2021-07-09 | 2021-10-01 | 浙江大学 | Language model fine-tuning method and device, computing equipment and storage medium |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210035556A1 (en) * | 2019-08-02 | 2021-02-04 | Babylon Partners Limited | Fine-tuning language models for supervised learning tasks via dataset preprocessing |
CN111444721A (en) * | 2020-05-27 | 2020-07-24 | 南京大学 | Chinese text key information extraction method based on pre-training language model |
CN112100383A (en) * | 2020-11-02 | 2020-12-18 | 之江实验室 | Meta-knowledge fine tuning method and platform for multitask language model |
CN112699218A (en) * | 2020-12-30 | 2021-04-23 | 成都数之联科技有限公司 | Model establishing method and system, paragraph label obtaining method and medium |
CN113033182A (en) * | 2021-03-25 | 2021-06-25 | 网易(杭州)网络有限公司 | Text creation auxiliary method and device and server |
CN113468877A (en) * | 2021-07-09 | 2021-10-01 | 浙江大学 | Language model fine-tuning method and device, computing equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
Han Chengcheng; Li Lei; Liu Tingting; Gao Ming: "Semantic text similarity computation methods", Journal of East China Normal University (Natural Science Edition), no. 05, 25 September 2020 (2020-09-25) *
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114612290B (en) * | 2022-03-11 | 2023-07-21 | 北京百度网讯科技有限公司 | Training method of image editing model and image editing method |
CN114612290A (en) * | 2022-03-11 | 2022-06-10 | 北京百度网讯科技有限公司 | Training method of image editing model and image editing method |
CN114862493A (en) * | 2022-04-07 | 2022-08-05 | 北京中科深智科技有限公司 | Generation model for generating personalized commodity description based on light-weight fine adjustment |
CN114792097A (en) * | 2022-05-14 | 2022-07-26 | 北京百度网讯科技有限公司 | Method and device for determining prompt vector of pre-training model and electronic equipment |
CN114943211A (en) * | 2022-07-25 | 2022-08-26 | 北京澜舟科技有限公司 | Text generation method and system based on prefix and computer readable storage medium |
WO2024031891A1 (en) * | 2022-08-10 | 2024-02-15 | 浙江大学 | Fine tuning method and apparatus for knowledge representation-disentangled classification model, and application |
CN115563283A (en) * | 2022-10-20 | 2023-01-03 | 北京大学 | Text classification method based on prompt learning |
CN115563283B (en) * | 2022-10-20 | 2023-04-25 | 北京大学 | Text classification method based on prompt learning |
CN115640520A (en) * | 2022-11-07 | 2023-01-24 | 北京百度网讯科技有限公司 | Method, device and storage medium for pre-training cross-language cross-modal model |
CN116186200B (en) * | 2023-01-19 | 2024-02-09 | 北京百度网讯科技有限公司 | Model training method, device, electronic equipment and storage medium |
CN116186200A (en) * | 2023-01-19 | 2023-05-30 | 北京百度网讯科技有限公司 | Model training method, device, electronic equipment and storage medium |
CN115906815A (en) * | 2023-03-08 | 2023-04-04 | 北京语言大学 | Error correction method and device for modifying one or more types of wrong sentences |
CN116306917B (en) * | 2023-05-17 | 2023-09-08 | 卡奥斯工业智能研究院(青岛)有限公司 | Task processing method, device, equipment and computer storage medium |
CN116306917A (en) * | 2023-05-17 | 2023-06-23 | 卡奥斯工业智能研究院(青岛)有限公司 | Task processing method, device, equipment and computer storage medium |
CN116861928A (en) * | 2023-07-07 | 2023-10-10 | 北京中关村科金技术有限公司 | Method, device, equipment and medium for generating instruction fine tuning data |
CN116861928B (en) * | 2023-07-07 | 2023-11-17 | 北京中关村科金技术有限公司 | Method, device, equipment and medium for generating instruction fine tuning data |
CN116737938A (en) * | 2023-07-19 | 2023-09-12 | 人民网股份有限公司 | Fine granularity emotion detection method and device based on fine tuning large model online data network |
CN116956835A (en) * | 2023-09-15 | 2023-10-27 | 京华信息科技股份有限公司 | Document generation method based on pre-training language model |
CN116956835B (en) * | 2023-09-15 | 2024-01-02 | 京华信息科技股份有限公司 | Document generation method based on pre-training language model |
CN117194637A (en) * | 2023-09-18 | 2023-12-08 | 深圳市大数据研究院 | Multi-level visual evaluation report generation method and device based on large language model |
CN117194637B (en) * | 2023-09-18 | 2024-04-30 | 深圳市大数据研究院 | Multi-level visual evaluation report generation method and device based on large language model |
CN117216227A (en) * | 2023-10-30 | 2023-12-12 | 广东烟草潮州市有限责任公司 | Tobacco enterprise intelligent information question-answering method based on knowledge graph and large language model |
CN117216227B (en) * | 2023-10-30 | 2024-04-16 | 广东烟草潮州市有限责任公司 | Tobacco enterprise intelligent information question-answering method based on knowledge graph and large language model |
CN117332419A (en) * | 2023-11-29 | 2024-01-02 | 武汉大学 | Malicious code classification method and device based on pre-training |
CN117332419B (en) * | 2023-11-29 | 2024-02-20 | 武汉大学 | Malicious code classification method and device based on pre-training |
CN117474084A (en) * | 2023-12-25 | 2024-01-30 | 淘宝(中国)软件有限公司 | Bidirectional iteration method, equipment and medium for pre-training model and downstream sequence task |
CN117474084B (en) * | 2023-12-25 | 2024-05-03 | 淘宝(中国)软件有限公司 | Bidirectional iteration method, equipment and medium for pre-training model and downstream sequence task |
CN117875273A (en) * | 2024-03-13 | 2024-04-12 | 中南大学 | News abstract automatic generation method, device and medium based on large language model |
CN117875273B (en) * | 2024-03-13 | 2024-05-28 | 中南大学 | News abstract automatic generation method, device and medium based on large language model |
Also Published As
Publication number | Publication date |
---|---|
CN113987209B (en) | 2024-05-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113987209B (en) | Natural language processing method, device, computing equipment and storage medium based on knowledge-guided prefix fine adjustment | |
CN111738003B (en) | Named entity recognition model training method, named entity recognition method and medium | |
CN109325229B (en) | Method for calculating text similarity by utilizing semantic information | |
CN112905795A (en) | Text intention classification method, device and readable medium | |
CN113468877A (en) | Language model fine-tuning method and device, computing equipment and storage medium | |
CN111062217A (en) | Language information processing method and device, storage medium and electronic equipment | |
CN114676255A (en) | Text processing method, device, equipment, storage medium and computer program product | |
CN112052318A (en) | Semantic recognition method and device, computer equipment and storage medium | |
CN117149984B (en) | Customization training method and device based on large model thinking chain | |
CN116992007B (en) | Limiting question-answering system based on question intention understanding | |
CN110717021A (en) | Input text and related device for obtaining artificial intelligence interview | |
CN115858750A (en) | Power grid technical standard intelligent question-answering method and system based on natural language processing | |
CN112488111B (en) | Indication expression understanding method based on multi-level expression guide attention network | |
CN114239599A (en) | Method, system, equipment and medium for realizing machine reading understanding | |
Seo et al. | Plain template insertion: korean-prompt-based engineering for few-shot learners | |
CN112905750A (en) | Generation method and device of optimization model | |
Yang et al. | Task independent fine tuning for word embeddings | |
CN111813907A (en) | Question and sentence intention identification method in natural language question-answering technology | |
CN116757195A (en) | Implicit emotion recognition method based on prompt learning | |
CN113408267B (en) | Word alignment performance improving method based on pre-training model | |
CN111401069A (en) | Intention recognition method and intention recognition device for conversation text and terminal | |
Alwaneen et al. | Stacked dynamic memory-coattention network for answering why-questions in Arabic | |
Chakkarwar et al. | A Review on BERT and Its Implementation in Various NLP Tasks | |
CN114239555A (en) | Training method of keyword extraction model and related device | |
Khandait et al. | Automatic question generation through word vector synchronization using lamma |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |