CN114091432A - Method and device for extracting traffic quality inspection violation reasons based on multi-task learning - Google Patents

Method and device for extracting traffic quality inspection violation reasons based on multi-task learning

Info

Publication number
CN114091432A
Authority
CN
China
Prior art keywords
neural network
model
violation
convolution neural
graph convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111461750.5A
Other languages
Chinese (zh)
Inventor
华旭明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Chuang Frame Software Co ltd
Original Assignee
Shanghai Chuang Frame Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Chuang Frame Software Co ltd filed Critical Shanghai Chuang Frame Software Co ltd
Priority to CN202111461750.5A priority Critical patent/CN114091432A/en
Publication of CN114091432A publication Critical patent/CN114091432A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a method and a device for extracting traffic quality inspection violation reasons based on multi-task learning, wherein the method comprises the following steps: performing data preprocessing on the original sentence transcribed from the recording, turning it into a form that a pre-training model can process; restructuring the original sentence into the form of a dependency syntax tree, processing the tree structure into an adjacency matrix that a graph convolutional neural network can process, and processing the syntactic structure features through the graph convolutional neural network; and jointly training the syntactic structure processed by the graph convolutional neural network with the machine reading comprehension model, taking the obtained result as the start and end index positions of the violation reason predicted by the model. The invention greatly improves the model's information extraction capability, and the extracted violation reasons conform better to Chinese grammar rules.

Description

Method and device for extracting traffic quality inspection violation reasons based on multi-task learning
Technical Field
The invention relates to the technical field of natural language processing, in particular to a method and a device for extracting traffic quality inspection violation reasons based on multi-task learning.
Background
The telephone traffic quality inspection system detects whether manual calls, such as customer service and debt collection calls, violate content prohibited by the state or the enterprise. Traditional systems rely on keyword matching; in this mode, customer service agents and collectors can easily bypass the keywords and achieve their purpose through new forms of violation, which can cause irreparable losses to customers and companies. With the continuous development of natural language processing technology in artificial intelligence, this problem is gradually being solved: through natural language processing, a machine can understand the meaning of a Chinese sentence to a certain degree, check whether the sentence violates the rules, and extract the corresponding violating part.
The technology used to extract the violating part of a dialogue is the information extraction technology of natural language processing, which mainly uses machine learning methods to automatically extract factual information from free text. Currently, information extraction research mainly covers named entity recognition, relation extraction, event extraction, and the like. Event extraction studies how to extract events of interest to a user from unstructured text and describe them in a structured text form so that the user can further query, trace, and analyze them; it is a very important research direction in the field of natural language processing.
Event extraction, given a document, predicts the event description, the event trigger word, the elements corresponding to the event, and the role corresponding to each element. However, most existing Chinese event extraction models adopt a pipeline approach, i.e. the event trigger word is recognized first and then the event elements are recognized. In addition, the violation-reason extraction task has none of the event trigger words, event elements, and the like of the traditional event extraction paradigm.
Disclosure of Invention
In order to solve the problem that existing models cannot handle the violation-reason extraction task well, the invention aims to provide a method and a device for extracting telephone traffic quality inspection violation reasons based on multi-task learning, whose results are more grammatical and logical.
In order to solve the problems, the technical scheme of the invention is as follows:
A traffic quality inspection violation reason extraction method based on multi-task learning comprises the following steps:
performing data preprocessing on the original sentence transcribed from the recording, turning it into a form that a pre-training model can process;
restructuring the original sentence into the form of a dependency syntax tree, processing the tree structure into an adjacency matrix that a graph convolutional neural network can process, and processing the syntactic structure features through the graph convolutional neural network; and
jointly training the syntactic structure processed by the graph convolutional neural network with the machine reading comprehension model, and taking the obtained result as the start and end index positions of the violation reason predicted by the model.
Optionally, the data preprocessing of the original transcribed sentence specifically comprises: adopting two BERT models to process the Chinese characters and the pinyin respectively, splicing their outputs, connecting a softmax function, and finally calculating a cross entropy loss.
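The fusion head of this correction step (splice, softmax, cross entropy) can be sketched numerically as below. The random arrays stand in for the outputs of the two BERT encoders, and the function name, shapes, and weights are illustrative assumptions, not the patent's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def correction_head(char_hidden, pinyin_hidden, W, b, target_ids):
    """Fuse character and pinyin encoder outputs, project to the
    vocabulary, and compute the mean cross-entropy loss per token.

    char_hidden, pinyin_hidden: (seq_len, hidden) arrays, stand-ins
    for the outputs of the two BERT encoders described in the text.
    W: (2*hidden, vocab) projection; b: (vocab,) bias.
    target_ids: (seq_len,) gold character indices.
    """
    fused = np.concatenate([char_hidden, pinyin_hidden], axis=-1)  # splice
    probs = softmax(fused @ W + b)                                 # softmax over vocab
    # cross-entropy: -log p(gold character), averaged over tokens
    loss = -np.log(probs[np.arange(len(target_ids)), target_ids]).mean()
    return probs, loss
```

In a real system the two hidden-state matrices would come from two separately encoded views (characters and standard pinyin) of the same sentence, so the head sees both orthographic and phonetic evidence for each position.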
Optionally, the step of jointly training the syntactic structure processed by the graph convolutional neural network with the machine reading comprehension model and taking the obtained result as the start and end index positions of the violation reason predicted by the model specifically comprises: processing the syntactic structure information through the graph convolutional neural network model, processing the violation type information through the machine reading comprehension model to obtain the related violation reason, and using a pre-training model that shares the pre-training network structure, its corresponding parameters, and the output hidden representation.
Optionally, the hidden representation is divided into two parts: one part undergoes a dimension transformation in the machine reading comprehension model to calculate its loss function; the other part is combined with the adjacency matrix and input into the graph convolutional neural network to obtain a new hidden representation, which then undergoes a dimension transformation to calculate its loss function.
Optionally, the weight parameters are dynamically updated with a gradient update formula using the GradNorm method, where the gradient loss function Grad Loss is defined as the sum, over tasks, of the absolute values of the differences between each task's actual gradient norm and its ideal gradient norm:

$$L_{grad}(t) = \sum_i \left| G_W^{(i)}(t) - \bar{G}_W(t)\,[r_i(t)]^{\alpha} \right|$$

wherein:

$G_W^{(i)}(t) = \lVert \nabla_W\, w_i(t) L_i(t) \rVert_2$ is the actual gradient norm: the L2 norm of the gradient, with respect to the neural network parameters $W$ to be updated, of the weighted loss $w_i(t)L_i(t)$ of task $i$;

$\bar{G}_W(t) = E_{task}\!\left[G_W^{(i)}(t)\right]$ is the ideal gradient norm, obtained as the average of $G_W^{(i)}(t)$ over all tasks;

$\tilde{L}_i(t) = L_i(t)/L_i(0)$ is the inverse training rate of task $i$: the larger $\tilde{L}_i(t)$, the slower the training; $r_i(t) = \tilde{L}_i(t)/E_{task}[\tilde{L}_i(t)]$ is the relative inverse training rate of task $i$.
Optionally, $G_W^{(i)}(t)$ and $\bar{G}_W(t)\,[r_i(t)]^{\alpha}$ balance the magnitudes of the losses: when one task's loss is too large, $G_W^{(i)}(t)$ becomes larger than $\bar{G}_W(t)\,[r_i(t)]^{\alpha}$, producing a large gradient and thus reducing that task's weight parameter; $[r_i(t)]^{\alpha}$ balances the training speed, i.e. when a task trains too fast, $[r_i(t)]^{\alpha}$ becomes smaller, producing a large gradient penalty and hence a smaller weight parameter.
Furthermore, the invention also provides a device for extracting telephone traffic quality inspection violation reasons based on multi-task learning, the device comprising:
a data preprocessing module, used for preprocessing the original sentence transcribed from the recording into a form that a pre-training model can process;
a graph convolutional neural network processing module, used for restructuring the original sentence into the form of a dependency syntax tree, processing the tree structure into an adjacency matrix that a graph convolutional neural network can process, and processing the syntactic structure features through the graph convolutional neural network; and
a joint training module, used for jointly training the syntactic structure processed by the graph convolutional neural network with the machine reading comprehension model, and taking the obtained result as the start and end index positions of the violation reason predicted by the model.
Compared with the prior art, the invention makes full use of the syntactic structure information of Chinese sentences and performs multi-task learning with a graph convolutional neural network and a machine reading comprehension task, fully exploiting the data of the different tasks. Compared with violation-reason extraction using the machine reading comprehension task alone, the evaluation index ROUGE improves from 0.43 to 0.54, greatly improving the model's information extraction capability, and the extracted violation reasons conform better to Chinese grammar rules.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
fig. 1 is a flow chart of a method for extracting a traffic quality inspection violation cause based on multitask learning according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating an overall structure of a multi-task learning model provided by an embodiment of the invention;
FIG. 3 is a schematic diagram of a machine reading understanding model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a graph convolution neural network model structure provided by an embodiment of the present invention;
FIG. 5 is a diagram illustrating a result fusion of a machine reading understanding model and a graph convolution neural network model according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a device for extracting traffic quality inspection violation causes based on multi-task learning according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit it in any way. It should be noted that various changes and modifications, obvious to those skilled in the art, can be made without departing from the spirit of the invention; all of these fall within the scope of the present invention.
Specifically, as shown in fig. 1, the present invention provides a method for extracting traffic quality inspection violation causes based on multitask learning, where the method includes the following steps:
s1: carrying out data preprocessing on the original sentence recording, and processing the original sentence recording into a form which can be processed by a pre-training model;
specifically, for example:
original sentence: i speak that you owe money with your relatives and friends to let your relatives and friends know that you owe money
Type of violation: threatening threat of scare
The violation causes: telling your money not to be paid with your relatives and friends
Because the original sentence is converted from a recording into text by an intelligent speech algorithm, it contains inaccuracies and requires a certain amount of correction by a text error correction algorithm. The text error correction model extracts not only Chinese character features but also uses the standard pinyin as features: the model adopts two BERT models to process the Chinese characters and the pinyin respectively, splices their outputs, connects a softmax function, and finally calculates a cross entropy loss. For the machine reading comprehension task, the original sentence must be constructed into question form: the violation type is spliced onto the original sentence, and the result serves as a machine reading comprehension sample. The new samples then need to be processed into a form the model can handle: the pre-training model's tokenizer maps each word to its unique index in the pre-training model's vocabulary, and the label information is mapped into numeric form, so that the pre-training model can process the input text.
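The question construction and label mapping described above can be sketched as follows. A toy character-level tokenization stands in for the pretrained model's tokenizer, and the helper name and special-token layout are illustrative assumptions, not the patent's actual code:

```python
def build_mrc_sample(violation_type, sentence, answer):
    """Splice the violation type onto the sentence as a machine-reading
    question, and map the gold answer span to start/end token indices.
    A character-level split stands in for the real tokenizer here."""
    tokens = ["[CLS]"] + list(violation_type) + ["[SEP]"] + list(sentence) + ["[SEP]"]
    offset = 2 + len(violation_type)   # sentence characters start here
    start = sentence.find(answer)      # answer is a contiguous span of the sentence
    if start < 0:
        raise ValueError("answer span not found in sentence")
    end = start + len(answer) - 1
    return tokens, offset + start, offset + end
```

With a real pretrained tokenizer, subword segmentation shifts character positions, so an offset mapping between characters and tokens would be needed instead of the simple arithmetic above.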
S2: reconstructing the original sentence into a form of a dependency syntax tree, processing the tree structure into an adjacent matrix which can be processed by a graph convolution neural network, and processing syntax structure characteristics through the graph convolution neural network;
for the neural network of the graph, it is first required to transform the original sentence into the form of a dependency syntax tree, the dependency syntax analysis is to determine the syntax structure of the sentence by analyzing the dependency relationship between words in the sentence, and the result of the DDParser processing by the hundred degree open source syntax analysis tool DDParser, as in the above example, is "[ { ' word ' [ ', ' follow ', ' of ', ' friends ', ' say ', ' you ', ' owen ', ' not ', ' also ', ' let ', ' you ', ' family ', ' friend ', ' know ' ]head ': 7,7,6,3,6,2,13,9,7,11,12,13,0,16,16,13,13], ' deprel ': SBV ', ' ATT ', ' POB ',
'SBV', 'VOB', 'ADV', 'ADV', 'HED', 'ATT', 'ATT', 'DBL', 'DBL' ] }, wherein word is followed by a word that reacts to the tree syntax structure of the whole sentence, and the word is interpreted from the tree structure, and the number n inside represents that the current word is a subtree node of the nth word, thus forming a complete tree structure.
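The head list can then be turned into the adjacency matrix that the graph convolutional network consumes. A minimal sketch, assuming 1-based head indices with 0 marking the root (as in the DDParser output above) and treating the tree as undirected with self-loops:

```python
import numpy as np

def heads_to_adjacency(heads):
    """Convert a DDParser 'head' list (1-based; 0 marks the root) into
    a symmetric adjacency matrix with self-loops for the GCN."""
    n = len(heads)
    A = np.eye(n)                       # self-loop for every word
    for child, head in enumerate(heads):
        if head > 0:                    # 0 is the virtual root: no edge
            A[child, head - 1] = 1.0
            A[head - 1, child] = 1.0    # treat the dependency as undirected
    return A
```

Whether to symmetrize the tree or keep directed head-to-child edges is a design choice; the symmetric form lets information flow both ways in a single GCN layer.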
For the standalone reading comprehension task model, the structure is relatively simple: the preprocessed result is input into the pre-training model Roberta to obtain the hidden representation of each word, and the multi-dimensional hidden representation undergoes a dimension transformation with the start position and end position of the reason as the respective classification targets. In the model training stage, a loss function is calculated between the transformed two-dimensional matrix and the two-dimensional matrix of the labeled data, and back propagation according to the loss updates the model parameters. In the prediction stage, the result obtained after processing each word's hidden representation with an argmax function is taken as the start and end index positions of the violation reason predicted by the model.
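The prediction-stage argmax step can be sketched as below. The projection vectors stand in for the learned dimension transformation, and constraining the end index to follow the start is one common decoding convention, assumed here rather than taken from the patent:

```python
import numpy as np

def predict_span(hidden, w_start, w_end):
    """Project each token's hidden representation to start/end scores
    and take argmax, as in the prediction stage described above.

    hidden: (seq_len, dim) per-token hidden representations.
    w_start, w_end: (dim,) illustrative projection vectors.
    """
    start_logits = hidden @ w_start            # (seq_len,)
    end_logits = hidden @ w_end
    start = int(np.argmax(start_logits))
    # decode the end index at or after the start so the span is valid
    end = start + int(np.argmax(end_logits[start:]))
    return start, end
```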
For the standalone graph neural network model, the hidden representations of the words obtained from the pre-training model Roberta and the adjacency matrix obtained in feature engineering serve as the input of the graph convolutional neural network, which outputs for each word a group of hidden representations combined with the syntactic structure. These hidden representations are updated by the graph convolutional network, whose formula is:

$$H^{(l+1)} = \sigma\big(\tilde{D}^{-1}\tilde{A}\,H^{(l)}W^{(l)}\big)$$

wherein $\tilde{A} = A + I$ is the adjacency matrix plus the self-loops of the graph vertices, and $\tilde{D}$ is the degree matrix of the graph, $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$. Since the numbers of neighboring edges (degrees) of the graph vertices are not the same, regularization, i.e. division by the degree $d_i$, is required to reduce the variance. The hidden representation exists in the form of a multi-dimensional matrix, so it must undergo dimension transformation and pooling operations to be converted into the vector representation corresponding to each word. Similarly, in the model training stage, a loss function is calculated between the processed matrix vector and the two-dimensional matrix of the labeled data, and back propagation through the neural network updates the model parameters; in the prediction stage, the results obtained after argmax processing are taken as the start and end index positions of the violation reason predicted by the model.
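One graph-convolution layer following the formula above (self-loops, division by the degree d_i, then a nonlinearity) can be sketched as:

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph-convolution layer: add self-loops to A, regularize each
    row by its degree d_i, aggregate neighbor features, apply a ReLU.

    H: (n, dim) node features; A: (n, n) adjacency; W: (dim, out) weights.
    """
    A_hat = A + np.eye(len(A))             # adjacency plus self-loops
    d = A_hat.sum(axis=1, keepdims=True)   # degree of each vertex
    H_new = (A_hat / d) @ H @ W            # divide by d_i to reduce variance
    return np.maximum(H_new, 0.0)          # ReLU
```

Stacking such layers (as in FIG. 4) lets each word's representation absorb information from progressively larger neighborhoods of the dependency tree.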
S3: and performing joint training on the syntactic structure processed by the graph convolution neural network and the machine reading understanding model, and taking the obtained result as the positions of the initial and end point indexes of the violation reasons predicted by the model.
The syntactic structure information is processed by the graph convolutional neural network model, while the violation type information is processed by the machine reading comprehension model to obtain the related violation reason. On the network structure, a pre-training model is used, sharing the pre-training network structure, its corresponding parameters, and the output hidden representation. The hidden representation is divided into two parts: one part undergoes a dimension transformation in the machine reading comprehension model to calculate its loss function; the other part is combined with the adjacency matrix and input into the graph convolutional neural network to obtain a new hidden representation, which then undergoes a dimension transformation to calculate its loss function.
It must be noted in particular that the two calculated losses cannot simply be added, because their orders of magnitude differ greatly; adding them directly would unbalance the gradients and make the convergence speeds of the different tasks inconsistent, affecting the result of the whole model training. In this embodiment, the invention introduces weights to balance the gradients, using the GradNorm method, an optimization method that dynamically adjusts the weight parameters of the loss function according to the gradients. The weight parameters can be dynamically updated with a gradient update formula, where the gradient loss function Grad Loss is defined as the sum of the absolute values of the differences between each task's actual gradient norm and ideal gradient norm:
$$L_{grad}(t) = \sum_i \left| G_W^{(i)}(t) - \bar{G}_W(t)\,[r_i(t)]^{\alpha} \right|$$

wherein:

$G_W^{(i)}(t) = \lVert \nabla_W\, w_i(t) L_i(t) \rVert_2$ is the actual gradient norm: the L2 norm of the gradient, with respect to the neural network parameters $W$ to be updated, of the weighted loss $w_i(t)L_i(t)$ of task $i$;

$\bar{G}_W(t) = E_{task}\!\left[G_W^{(i)}(t)\right]$ is the ideal gradient norm, obtained as the average of $G_W^{(i)}(t)$ over all tasks;

$\tilde{L}_i(t) = L_i(t)/L_i(0)$ is the inverse training rate of task $i$: the larger $\tilde{L}_i(t)$, the slower the training; $r_i(t) = \tilde{L}_i(t)/E_{task}[\tilde{L}_i(t)]$ is the relative inverse training rate of task $i$.
$G_W^{(i)}(t)$ and $\bar{G}_W(t)\,[r_i(t)]^{\alpha}$ balance the magnitudes of the losses: when one task's loss is too large in magnitude, $G_W^{(i)}(t)$ becomes larger than $\bar{G}_W(t)\,[r_i(t)]^{\alpha}$, producing a large gradient of Grad Loss and thus reducing that task's weight parameter. $[r_i(t)]^{\alpha}$ balances the training speed: when a task trains too fast, $[r_i(t)]^{\alpha}$ becomes smaller, producing a large gradient penalty and hence a smaller weight parameter. Therefore, using the GradNorm method solves the problem of the large gap in loss magnitude between the multiple tasks.
Specifically, as shown in FIG. 2, Roberta is the pre-training model module; MRC is the machine reading comprehension module; GCN is the graph convolutional neural network module; and CONCAT is the module that fuses the results of the MRC and GCN branches. In the figure, the sentence text is mapped into word vectors, which are spliced and input into the Roberta pre-training model to obtain a new vector $H_i$ (the hidden representation) fusing the context information; the three input embeddings shown in the figure are the vectors corresponding to the word level, paragraph level, and position level, respectively.

As shown in FIG. 3, the new vector $H_i$ output by the pre-training model module passes through a multilayer neural network (MLP) to change its dimensionality, yielding a new hidden representation $H'_i$. As shown in FIG. 4, $A_{ij}$ and $D_{ij}$ are the adjacency matrix and the degree matrix constructed from the sentence's dependency syntax tree; together with the new vector $H_i$ output by the pre-training model module, they form the input of the graph convolutional neural network, and after multiple layers of convolution and ReLU operations a new hidden representation $H''_i$ is obtained. As shown in FIG. 5, the hidden representation obtained from the MRC module is $H'_i$ and the hidden representation obtained from the GCN module is $H''_i$; after a dimension change through a multilayer neural network (MLP), their splice is processed by an argmax function to obtain the output results start and end, i.e. the start position and end position of the extracted reason.
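The CONCAT fusion of FIG. 5 can be sketched as follows. The per-token hidden matrices and the projection weights are toy stand-ins for the MRC/GCN branch outputs and the MLP, with names chosen for illustration:

```python
import numpy as np

def fuse_and_predict(h_mrc, h_gcn, w_start, w_end):
    """Splice the MRC and GCN hidden representations per token, project
    with illustrative MLP weights, and take argmax for the start and
    end positions of the extracted reason.

    h_mrc, h_gcn: (seq_len, dim) branch outputs (H'_i and H''_i).
    w_start, w_end: (2*dim,) projection vectors standing in for the MLP.
    """
    fused = np.concatenate([h_mrc, h_gcn], axis=-1)  # (seq_len, 2*dim)
    start = int(np.argmax(fused @ w_start))
    end = int(np.argmax(fused @ w_end))
    return start, end
```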
As shown in fig. 6, an embodiment of the present invention discloses a device for extracting traffic quality inspection violation reasons based on multi-task learning, the device comprising:
a data preprocessing module 61, used for preprocessing the original sentence transcribed from the recording into a form that a pre-training model can process;
a graph convolutional neural network processing module 62, used for restructuring the original sentence into the form of a dependency syntax tree, processing the tree structure into an adjacency matrix that a graph convolutional neural network can process, and processing the syntactic structure features through the graph convolutional neural network; and
a joint training module 63, used for jointly training the syntactic structure processed by the graph convolutional neural network with the machine reading comprehension model, and taking the obtained result as the start and end index positions of the violation reason predicted by the model.
The device for extracting traffic quality inspection violation reasons based on multi-task learning is used for executing the above method for extracting traffic quality inspection violation reasons based on multi-task learning.
Compared with the prior art, the invention makes full use of the syntactic structure information of Chinese sentences and performs multi-task learning with a graph convolutional neural network and a machine reading comprehension task, fully exploiting the data of the different tasks. Compared with violation-reason extraction using the machine reading comprehension task alone, the evaluation index ROUGE improves from 0.43 to 0.54, greatly improving the model's information extraction capability, and the extracted violation reasons conform better to Chinese grammar rules.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims (7)

1. A traffic quality inspection violation reason extraction method based on multi-task learning, characterized by comprising the following steps:
performing data preprocessing on the original sentence transcribed from the recording, turning it into a form that a pre-training model can process;
restructuring the original sentence into the form of a dependency syntax tree, processing the tree structure into an adjacency matrix that a graph convolutional neural network can process, and processing the syntactic structure features through the graph convolutional neural network; and
jointly training the syntactic structure processed by the graph convolutional neural network with the machine reading comprehension model, and taking the obtained result as the start and end index positions of the violation reason predicted by the model.
2. The method for extracting traffic quality inspection violation reasons based on multi-task learning according to claim 1, wherein the data preprocessing of the original transcribed sentence specifically comprises: adopting two BERT models to process the Chinese characters and the pinyin respectively, splicing their outputs, connecting a softmax function, and finally calculating a cross entropy loss.
3. The method for extracting traffic quality inspection violation reasons based on multi-task learning according to claim 1, wherein the step of jointly training the syntactic structure processed by the graph convolutional neural network with the machine reading comprehension model and taking the obtained result as the start and end index positions of the violation reason predicted by the model specifically comprises: processing the syntactic structure information through the graph convolutional neural network model, processing the violation type information through the machine reading comprehension model to obtain the related violation reason, and using a pre-training model that shares the pre-training network structure, its corresponding parameters, and the output hidden representation.
4. The method for extracting traffic quality inspection violation reasons based on multi-task learning according to claim 3, wherein the hidden representation is divided into two parts: one part undergoes a dimension transformation in the machine reading comprehension model to calculate its loss function; the other part is combined with the adjacency matrix and input into the graph convolutional neural network to obtain a new hidden representation, which then undergoes a dimension transformation to calculate its loss function.
5. The method for extracting traffic quality inspection violation reasons based on multi-task learning according to claim 4, wherein the weight parameters are dynamically updated with a gradient update formula using the GradNorm method, and the gradient loss function Grad Loss is defined as the sum of the absolute values of the differences between each task's actual gradient norm and ideal gradient norm:

$$L_{grad}(t) = \sum_i \left| G_W^{(i)}(t) - \bar{G}_W(t)\,[r_i(t)]^{\alpha} \right|$$

wherein: $G_W^{(i)}(t) = \lVert \nabla_W\, w_i(t) L_i(t) \rVert_2$ is the actual gradient norm, the L2 norm of the gradient, with respect to the neural network parameters $W$ to be updated, of the weighted loss $w_i(t)L_i(t)$ of task $i$; $\bar{G}_W(t) = E_{task}\!\left[G_W^{(i)}(t)\right]$ is the ideal gradient norm, obtained as the average of $G_W^{(i)}(t)$ over all tasks; $\tilde{L}_i(t) = L_i(t)/L_i(0)$ is the inverse training rate of task $i$, the larger $\tilde{L}_i(t)$ the slower the training; and $r_i(t) = \tilde{L}_i(t)/E_{task}[\tilde{L}_i(t)]$ is the relative inverse training rate of task $i$.
6. The method for extracting traffic quality inspection violation causes based on multi-task learning according to claim 5, wherein $G_W^{(i)}(t)$ and $\bar{G}_W(t)\,[r_i(t)]^{\alpha}$ balance the magnitudes of the losses: when the loss of a task is too large, $G_W^{(i)}(t)$ becomes larger than $\bar{G}_W(t)\,[r_i(t)]^{\alpha}$, producing a large gradient and therefore reducing that task's weight parameter; $[r_i(t)]^{\alpha}$ balances the training speeds, i.e. when a task trains too fast, $[r_i(t)]^{\alpha}$ becomes smaller, likewise producing a large gradient penalty and hence a smaller weight parameter.
7. A device for extracting traffic quality inspection violation causes based on multi-task learning, the device comprising:
a data preprocessing module, configured to preprocess the original recorded sentences into a form that the pre-trained model can process;
a graph convolutional neural network processing module, configured to reconstruct the original sentence into a dependency syntax tree, convert the tree structure into an adjacency matrix that the graph convolutional neural network can process, and extract the syntactic structure features through the graph convolutional neural network; and
a joint training module, configured to jointly train the syntactic structure processed by the graph convolutional neural network with the machine reading comprehension model, and take the obtained result as the start and end index positions of the violation cause predicted by the model.
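As a hypothetical sketch of what the joint training module's output looks like, a machine-reading-comprehension pointer head scores every token as a span start or end and returns the highest-probability (start, end) pair; the one-hot hidden states and hand-picked pointer vectors below are toy values, not the patent's parameters:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_span(H, w_start, w_end):
    """Score each token as a span start/end and return the best
    (start, end) pair with start <= end."""
    start_probs = softmax(H @ w_start)
    end_probs = softmax(H @ w_end)
    best, best_score = (0, 0), -1.0
    for s in range(len(start_probs)):
        for e in range(s, len(end_probs)):
            score = start_probs[s] * end_probs[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best

# Toy demo: 4 tokens with one-hot hidden states
H = np.eye(4)
w_start = np.array([0., 0., 5., 0.])  # makes token 2 the most likely start
w_end = np.array([0., 0., 0., 5.])    # makes token 3 the most likely end
span = predict_span(H, w_start, w_end)  # -> (2, 3)
```

The returned indices play the role of the predicted start and end positions of the violation cause within the recorded sentence.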
CN202111461750.5A 2021-12-02 2021-12-02 Method and device for extracting traffic quality inspection violation reasons based on multi-task learning Pending CN114091432A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111461750.5A CN114091432A (en) 2021-12-02 2021-12-02 Method and device for extracting traffic quality inspection violation reasons based on multi-task learning

Publications (1)

Publication Number Publication Date
CN114091432A true CN114091432A (en) 2022-02-25

Family

ID=80306552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111461750.5A Pending CN114091432A (en) 2021-12-02 2021-12-02 Method and device for extracting traffic quality inspection violation reasons based on multi-task learning

Country Status (1)

Country Link
CN (1) CN114091432A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114925813A (en) * 2022-05-25 2022-08-19 支付宝(杭州)信息技术有限公司 Training method and device of target detection system

Similar Documents

Publication Publication Date Title
CN110633409B (en) Automobile news event extraction method integrating rules and deep learning
CN112528672A (en) Aspect-level emotion analysis method and device based on graph convolution neural network
WO2022141878A1 (en) End-to-end language model pretraining method and system, and device and storage medium
CN114547329A (en) Method for establishing pre-training language model, semantic analysis method and device
CN112446215B (en) Entity relation joint extraction method
CN115392259B (en) Microblog text sentiment analysis method and system based on confrontation training fusion BERT
Wu et al. Aspect-level sentiment classification based on location and hybrid multi attention mechanism
Zhang et al. Chatbot design method using hybrid word vector expression model based on real telemarketing data
CN113204624B (en) Multi-feature fusion text emotion analysis model and device
US11966700B2 (en) Neural tagger with deep multi-level model
CN114091432A (en) Method and device for extracting traffic quality inspection violation reasons based on multi-task learning
CN117473054A (en) Knowledge graph-based general intelligent question-answering method and device
Lamons et al. Python Deep Learning Projects: 9 projects demystifying neural network and deep learning models for building intelligent systems
CN113051886B (en) Test question duplicate checking method, device, storage medium and equipment
CN113051607B (en) Privacy policy information extraction method
Rojan et al. Natural Language Processing based Text Imputation for Malayalam Corpora
Sun et al. Chinese named entity recognition using the improved transformer encoder and the lexicon adapter
Cairang et al. Research on error correction method of Tibetan text based on deep learning
Nie et al. Graph neural net-based user simulator
CN109299442A (en) Chinese chapter primary-slave relation recognition methods and system
Zhang et al. A Chinese Document-level Event Extraction Method based on ERNIE
CN113869054B (en) Deep learning-based power field project feature recognition method
Pingan et al. Image caption description generation method based on reflective attention mechanism
Xu et al. LayoutLM-Critic: Multimodal Language Model for Text Error Correction of Optical Character Recognition
Li et al. STCP: An Efficient Model Combining Subject Triples and Constituency Parsing for Recognizing Textual Entailment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination