CN115964497A - Event extraction method integrating attention mechanism and convolutional neural network


Info

Publication number
CN115964497A
Authority
CN
China
Prior art keywords
text
extracted
event
vector
word
Prior art date
Legal status
Pending
Application number
CN202310154608.9A
Other languages
Chinese (zh)
Inventor
周永彬
周沁仪
林海伦
张倩
Current Assignee
Nanjing University of Science and Technology
Institute of Information Engineering of CAS
Original Assignee
Nanjing University of Science and Technology
Institute of Information Engineering of CAS
Priority date
Filing date
Publication date
Application filed by Nanjing University of Science and Technology, Institute of Information Engineering of CAS filed Critical Nanjing University of Science and Technology
Priority to CN202310154608.9A priority Critical patent/CN115964497A/en
Publication of CN115964497A publication Critical patent/CN115964497A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses an event extraction method integrating an attention mechanism and a convolutional neural network, comprising the following steps: 1) Performing feature representation on the text content to be extracted with a text encoder to obtain the distributed features of the text to be extracted; 2) Extracting the contextual features of the text to be extracted and the association information between words from the distributed features with a feature extractor; 3) Inputting the contextual features of the text to be extracted and the association information between words into an event trigger word classifier, outputting the event trigger words of the text to be extracted, and then determining the event type of the text to be extracted based on those trigger words; 4) The event element classifier judging in turn, according to the event type and the contextual features of the text to be extracted, whether each token in the text to be extracted is an event element; 5) Identifying a role category for each event element with an element role classifier. The invention greatly improves the accuracy of event extraction while remaining highly efficient.

Description

Event extraction method integrating attention mechanism and convolutional neural network
Technical Field
The invention belongs to an information extraction technology in the field of natural language processing, and particularly relates to an event extraction method integrating an attention mechanism and a convolutional neural network.
Background
Information extraction extracts specific information from natural language text, so as to automatically classify, extract and reconstruct massive heterogeneous text content. It mainly comprises entity extraction, relation extraction and event extraction, of which event extraction is the task that most highly structures the various entities and relations. Event extraction identifies the important arguments of target-related events from semi-structured or unstructured text, i.e., it obtains event trigger words and event-related argument information and organizes them into event information, which is widely applied in fields such as semantic search, information analysis, event reasoning, risk early warning and intelligent question answering. The event extraction task divides into two parts, event identification and event argument identification: finding the event trigger word and determining the event type, then identifying the key event arguments and determining their argument roles. Since understanding the world and solving problems in units of events fits human cognitive habits, event extraction has become a focus of attention in industry and academia at home and abroad. In recent years most of this research has relied on deep learning to obtain event information, and the existing event extraction methods are mainly classified into the following categories according to the neural network architecture used:
(1) Event extraction based on convolutional neural networks induces k-gram information with a convolutional neural network, captures local semantic features, learns the compositional semantic features of sentences, and completes the identification and extraction of event information. This approach can handle multiple-event scenarios. However, it cannot encode vocabulary semantics differently in different contexts, and if more global information is needed, the receptive field must be enlarged by cascading.
(2) Event extraction based on recurrent neural networks models sequence information, captures the dependency between argument roles and trigger word types, and mines temporal and long-distance relationships in the text. This method can effectively analyze long-dependency text, but it is essentially a Markov decision process: it cannot learn global structural information well, neglects the position information of entity pairs, cannot be executed in parallel, and is slow.
(3) Event extraction based on the attention mechanism learns the dependencies among words at different distances, analyzes the weights among words, judges semantic relations, reuses event schema information and simplifies data annotation. This method can capture important semantic information and effectively utilize global information, but because sequences are directly compared pairwise, position cannot be modeled well.
(4) Event extraction based on combined neural networks improves the extraction of semantic features by stacking two or more neural networks, which verifies that fusing different neural networks to improve the accuracy and efficiency of event extraction is effective.
However, the existing event extraction methods still fail to fully utilize the semantic features in the text, and they ignore the improvement in event extraction accuracy that the semantic correlation between event types and event arguments can bring during extraction. Fusing multiple neural networks is an effective mode of feature extraction, but how to design a suitable neural network that fully exploits this semantic correlation still needs research.
Disclosure of Invention
The invention aims to design an event extraction method fusing multiple neural networks that fully utilizes the semantic correlation in text; the method can improve the accuracy of event extraction while guaranteeing the efficiency of event extraction.
In order to achieve this purpose, the invention adopts the following technical scheme: an event extraction method integrating an attention mechanism and a convolutional neural network, characterized in that: (1) the combined semantic feature vectors in the literature sentences are effectively acquired through a convolutional neural network, and a max-pooling operation is performed on them to obtain the sentence-level local features and semantic structure; (2) the attention fusion mechanism solves the long-distance dependence problem by calculating the mutual influence among words in the same sentence; multi-head attention fuses the different knowledge generated by the same attention pooling, the differences coming from different subspace representations of the same query, key and value, which effectively reduces the influence of text noise on the extraction effect; (3) word vectors generated by loading a pre-trained model (such as BERT) map each input word to a corresponding word vector representation, i.e., feature vectors fusing context information are dynamically generated through fine-tuning and the plain text is represented as distributed feature information, so that the hidden state of each word contains the influence of the words at different positions in the sentence while the sparsity of the parameter space is avoided.
An event extraction method fusing an attention mechanism and a convolutional neural network comprises the following steps:
1) Performing feature representation on the text content to be extracted by using a text encoder to obtain the distributed features of the text to be extracted;
2) Extracting the contextual features of the text to be extracted and the association information between words from the distributed features by using a feature extractor;
3) Inputting the contextual features of the text to be extracted and the association information between words into an event trigger word classifier, outputting the event trigger words of the text to be extracted, and then determining the event type of the text to be extracted based on the event trigger words of the text to be extracted;
4) The event element classifier judging in turn, according to the event type and the contextual features of the text to be extracted, whether each token in the text to be extracted is an event element;
5) Identifying a role category for each of the event elements using an element role classifier.
Further, the method for obtaining the distributed features of the text to be extracted comprises: the text encoder generates a text sequence S = [x_1, x_2, x_3, ..., x_n] from the text to be extracted, where n is the number of tokens in the text to be extracted and x_i is the i-th token; then a word embedding vector, a segmentation vector and a position vector are generated for each token in the text sequence S, and S is converted into an input sequence T = (t_1, t_2, t_3, ..., t_n) by summing the word embedding vector, the segmentation vector and the position vector, where t_n is the fused vector of the word embedding, segmentation and position vectors of the n-th token; the sequence T = (t_1, t_2, t_3, ..., t_n) is input into a Transformer layer to obtain the inter-word associations and distribute weights through a self-attention function, obtaining feature vectors fusing context information; the feature vectors fusing context information are input into a pre-trained model to obtain the sequence E_m = {e_1, e_2, e_3, ..., e_n} of the text to be extracted, where e_n represents the word vector corresponding to the n-th token.
Further, the feature extractor comprises a bidirectional long short-term memory network, a convolutional layer and an attention unit. The method for extracting the contextual features of the text to be extracted and the association information between words comprises: first, the distributed features are input into the bidirectional LSTM to obtain sequence feature vectors, which are input into the convolutional layer; the convolutional layer performs convolution calculation on the sequence feature vectors to obtain the local-feature and semantic-structure high-dimensional feature vectors of the text to be extracted; the semantic-structure high-dimensional feature vectors are input into the attention unit to obtain, for each token x_i in the text to be extracted, the correlation feature vector r_i with the target entity, where i ∈ [1, n].
Further, the event trigger word classifier splices h_i and r_i and inputs the result into a conditional random field to obtain the event type corresponding to each token x_i in the text to be extracted, where h_i is the semantic-structure high-dimensional feature vector corresponding to token x_i.
Further, the pre-training model is a BERT model.
A server, comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for carrying out the steps of the above method.
A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the above-mentioned method.
The invention completes event extraction through a text encoder, a feature extractor, an event trigger word classifier, an event element classifier and an element role classifier, as follows:
Firstly, the text encoder performs feature representation on the text content to be extracted, representing the plain text as distributed feature information.
Secondly, the feature extractor successively acquires features at different degrees of abstraction: a bidirectional long short-term memory network automatically captures text sentence features from the distributed features produced by the text encoder; a convolutional neural network then extracts the local features around each word; combining the local feature vectors extracted by the convolutional layer, the contextual features and key information features among words are further extracted from the sentence features captured by the BiLSTM; and an attention mechanism assigns different weights to the different vector representations, learning the correlation information among words from the sentence features captured by the BiLSTM and reducing the influence of text noise on the extraction effect.
Then, using the deep features extracted by the convolutional neural network and the attention mechanism, the event trigger word classifier classifies each new sample, judges whether a word is an event trigger word, classifies the event category based on the trigger word information, and completes event type identification.
Next, according to the event type and the acquired features, the event element classifier judges each entity in the sentence in turn and decides whether the phrase is an element of the event.
Finally, the element role classifier judges the role type of each acquired event element, finally yielding a structured event comprising the trigger word and the event elements with their types.
The text encoder performs feature representation on the text of the information to be extracted, divided into two processes: feature representation model construction and model learning. The specific execution steps are as follows:
1) Constructing a feature representation model: prepare a text sequence S = [x_1, x_2, x_3, ..., x_n] containing n tokens, where x_i represents the i-th character of the input text sequence; convert the text sequence S into an input sequence T = (t_1, t_2, t_3, ..., t_n) by summing a word embedding vector, a segmentation vector and a position vector, where the word embedding vector is obtained by querying a word vector table, the segmentation vector indicates the sentence to which the token belongs, the position vector encodes the token position information, and t_n is the sum of the word embedding, segmentation and position vectors of the n-th token in S; input the sequence T = (t_1, t_2, t_3, ..., t_n) into a Transformer layer to obtain the associations among tokens and distribute weights through the self-attention function, obtaining feature vectors fusing context information:

Attention(Q, K, V) = softmax(Q·Kᵀ / √d_k)·V

Q represents the hidden vector of the decoded character and K holds the hidden vectors corresponding to the different words in the encoder; Q computes a value with each word of the encoding region through a vector dot product, and the weights are computed through the softmax function. Intuitively speaking, Q, K and V are the same sentence: the relation matrix between tokens obtained through the vector product is normalized and then applied back to the input sentence itself. Projecting Q, K and V through several different linear transformations yields single-head attention:

Head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V)

The multi-head attention mechanism splices the results of the different self-attention heads and computes position information in different spatial dimensions:

MultiHead(Q, K, V) = Concat(Head_1, Head_2, Head_3, ..., Head_n)·W^O
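To make the computation concrete, here is a minimal PyTorch sketch of the scaled dot-product attention and multi-head splicing described above; the class name, tensor shapes and head count are illustrative assumptions, not specified by the patent.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5  # token-pair relation matrix
    return F.softmax(scores, dim=-1) @ V

class MultiHeadSelfAttention(torch.nn.Module):
    def __init__(self, d_model=768, n_heads=12):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        # different linear projections of the same Q, K, V: one subspace per head
        self.w_q = torch.nn.Linear(d_model, d_model)
        self.w_k = torch.nn.Linear(d_model, d_model)
        self.w_v = torch.nn.Linear(d_model, d_model)
        self.w_o = torch.nn.Linear(d_model, d_model)  # W^O, applied after Concat

    def forward(self, x):  # x: (batch, n, d_model), one row per token
        B, n, _ = x.shape
        split = lambda t: t.view(B, n, self.n_heads, self.d_head).transpose(1, 2)
        # self-attention: Q, K and V all come from the same sentence x
        heads = scaled_dot_product_attention(
            split(self.w_q(x)), split(self.w_k(x)), split(self.w_v(x)))
        concat = heads.transpose(1, 2).reshape(B, n, -1)  # Concat(Head_1, ..., Head_h)
        return self.w_o(concat)
```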
2) Model learning: train the feature representation model on a training set drawn from the literary work, the Epic of King Gesar, dynamically generating feature vectors fusing context information to obtain the sequence E_m = BERT(T), mapped as E_m = {e_1, e_2, e_3, ..., e_n}, e_i = [w_1, w_2, ..., w_m]. E_m is the resulting distributed feature, e_i is the word embedding vector corresponding to the i-th input character, w ∈ [-1, 1], and m is the designated word vector dimension.
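A hedged sketch of this encoding step using the Hugging Face transformers library; the checkpoint name bert-base-chinese and the example sentence are assumptions for illustration, since the patent only specifies a fine-tuned pre-trained BERT model.

```python
import torch
from transformers import BertTokenizer, BertModel

# "bert-base-chinese" is an assumption for illustration; the patent only states
# that a pre-trained BERT model is fine-tuned on the corpus
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

sentence = "格萨尔降伏了妖魔"  # an illustrative sentence, not taken from the actual corpus
inputs = tokenizer(sentence, return_tensors="pt")  # segment and position vectors are added inside BERT
with torch.no_grad():
    E_m = encoder(**inputs).last_hidden_state  # (1, n, 768): one context-fused vector e_i per token
```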
The feature extractor is divided into two processes for distributed feature extraction: sentence-level feature capture and high-level semantic feature capture. The specific execution steps are as follows:
1) Sentence-level feature capture: a bidirectional long short-term memory network is used to capture sentence vector context information. The distributed features E_m are input into the bidirectional LSTM, and the forward output →h_i and the backward output ←h_i of the i-th character are concatenated to obtain the sequence feature vector of the i-th character, denoted L_i = [→h_i; ←h_i], the sentence-level semantic representation of the i-th character.
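A brief PyTorch sketch of this capture step under assumed dimensions (768-dimensional input vectors, 256 hidden units per direction); the dropout rate is likewise an assumption:

```python
import torch

# dimensions are assumptions: 768-dim encoder vectors in, 256 hidden units per direction
bilstm = torch.nn.LSTM(input_size=768, hidden_size=256,
                       batch_first=True, bidirectional=True)
dropout = torch.nn.Dropout(0.5)  # guards against overfitting, as in step 301 below

E_m = torch.randn(1, 20, 768)    # distributed features from the text encoder
L, _ = bilstm(E_m)
L = dropout(L)
# L[:, i, :] is L_i: the forward and backward hidden states of the i-th
# character concatenated, shape (1, 20, 2 * 256)
```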
2) High-level semantic feature capture: compute c_i = Conv_k(L_i), performing local feature extraction on the acquired L_i through the convolutional layer, using multiple convolution kernels in combination, where Conv_k denotes the convolution operation with kernel size k; compute h_i = MaxPooling(c_i), the max-pooling operation after processing by the different convolution kernels, obtaining the local-feature and semantic-structure high-dimensional feature vector h_i; through the fusion attention mechanism compute α = softmax(Q×K), where Q represents the hidden vector of the decoded word and K holds the hidden vectors corresponding to the different words in the encoder; Q computes a value with each word in the encoding region through a vector dot product, and the weights are computed through the softmax function; compute the output vector r_i = α·V, obtaining the correlation feature vector between the words in the sentence and the target entity, i.e., r_i is the correlation feature vector of token x_i in the text to be extracted with the target entity, where i ∈ [1, n].
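The sketch below shows one plausible PyTorch reading of this step: per-token convolution with several kernel sizes, windowed max pooling, and the fusion attention α = softmax(Q×K), r = α·V. The filter counts, kernel sizes and pooling window are assumptions (the later embodiment uses kernel sizes 2, 3 and 4 with 20 filters each), and the patent does not fix the pooling detail.

```python
import torch
import torch.nn.functional as F

class ConvAttentionExtractor(torch.nn.Module):
    """High-level semantic feature capture: Conv_k, max pooling, fusion attention."""

    def __init__(self, d_in=512, n_filters=20, kernel_sizes=(2, 3, 4)):
        super().__init__()
        # one Conv1d per kernel size k; padding="same" keeps one vector per token
        self.convs = torch.nn.ModuleList(
            torch.nn.Conv1d(d_in, n_filters, k, padding="same") for k in kernel_sizes)
        d_h = n_filters * len(kernel_sizes)
        self.w_q = torch.nn.Linear(d_h, d_h)
        self.w_k = torch.nn.Linear(d_h, d_h)
        self.w_v = torch.nn.Linear(d_h, d_h)

    def forward(self, L):            # L: (batch, n, d_in) from the BiLSTM
        x = L.transpose(1, 2)        # Conv1d expects (batch, channels, n)
        # windowed max pooling keeps a per-token vector h_i; the window of 3
        # is an assumption, as the patent leaves this detail open
        feats = [F.max_pool1d(torch.relu(conv(x)), 3, stride=1, padding=1)
                 for conv in self.convs]
        h = torch.cat(feats, dim=1).transpose(1, 2)  # (batch, n, d_h)
        # fusion attention: alpha = softmax(Q x K), r = alpha V
        alpha = F.softmax(self.w_q(h) @ self.w_k(h).transpose(1, 2), dim=-1)
        r = alpha @ self.w_v(h)      # r_i: correlation feature of token x_i
        return h, r

h, r = ConvAttentionExtractor()(torch.randn(1, 20, 512))  # usage sketch
```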
The event trigger word classifier splices the vectors h_i and r_i output by the feature extractor and inputs them into a conditional random field, completing acquisition of the event type. This is divided into two parts: constructing the event type dependency relationship and inferring the event type. The specific execution steps are as follows:
1) Constructing event type dependency relations: dependencies between tags are modeled using a conditional random field. For a given sentence S = {x_1, x_2, ..., x_n} and its corresponding sequence label y = {y_1, y_2, ..., y_n}, the conditional probability p(y|S) over all possible label sequences is calculated as

p(y|S) = exp(score(S, y)) / Σ_{y'∈β(S)} exp(score(S, y'))

where β(S) denotes the possible event type label sequences of S. The score is computed as

score(S, y) = Σ_i ( W_{y_i}·f(x_i) + T_{y_{i-1}, y_i} )

where f is a mapping function that maps the feature vector to event type label scores, W_y is the prediction weight matrix, and T_{y_{i-1}, y_i} is the transition weight. The loss function is computed as L = -Σ log p(y|S);
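For reference, a compact PyTorch sketch of the CRF loss above, computing the gold-path score and the log-partition over all label sequences β(S) with the forward algorithm; tensor shapes and the toy call at the end are illustrative:

```python
import torch

def crf_neg_log_likelihood(emissions, transitions, tags):
    """Linear-chain CRF loss  L = -log p(y|S)  for one sentence.

    emissions:   (n, num_tags) emission scores, one score per tag per token
    transitions: (num_tags, num_tags) transition weights T[y_prev, y_next]
    tags:        gold label sequence y as a list of tag indices
    """
    n, num_tags = emissions.shape
    # score(S, y): emission plus transition scores along the gold path
    gold = emissions[0, tags[0]]
    for t in range(1, n):
        gold = gold + transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    # log-partition over all label sequences beta(S), by the forward algorithm
    log_alpha = emissions[0]  # (num_tags,)
    for t in range(1, n):
        log_alpha = emissions[t] + torch.logsumexp(
            log_alpha.unsqueeze(1) + transitions, dim=0)
    log_Z = torch.logsumexp(log_alpha, dim=0)
    return log_Z - gold  # -log p(y|S)

emissions = torch.randn(6, 5)      # 6 tokens, 5 event-type tags (illustrative)
transitions = torch.randn(5, 5)
loss = crf_neg_log_likelihood(emissions, transitions, [0, 2, 2, 1, 0, 4])
```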
2) Inferring the event type: the input model τ = (A, B, π) and the observations O = {o_1, o_2, ..., o_T} are decoded using the Viterbi algorithm. Initialize δ_1(i) = π_i·b_i(o_1), i ∈ [1, N]; recurse

δ_t(i) = max_{1≤j≤N}[δ_{t-1}(j)·a_ji]·b_i(o_t) and ψ_t(i) = argmax_{1≤j≤N}[δ_{t-1}(j)·a_ji]

until P* = max_{1≤i≤N} δ_T(i) and i*_T = argmax_{1≤i≤N}[δ_T(i)]; backtrack via i*_t = ψ_{t+1}(i*_{t+1}) to determine the optimal path I* = (i*_1, i*_2, ..., i*_T), obtain the analysis result, and output the event type corresponding to the trigger word.
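The Viterbi recursion above can be written directly in a few lines of Python with NumPy; the toy model at the end is illustrative only:

```python
import numpy as np

def viterbi(pi, A, B, O):
    """Viterbi decoding for model tau = (A, B, pi) and observations O.

    pi: (N,) initial probabilities;  A: (N, N) transitions, A[j, i] = a_ji
    B:  (N, K) emission probabilities b_i(o_t);  O: observation indices
    """
    N, T = len(pi), len(O)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, O[0]]                    # delta_1(i) = pi_i b_i(o_1)
    for t in range(1, T):
        trans = delta[t - 1, :, None] * A         # delta_{t-1}(j) a_ji
        psi[t] = trans.argmax(axis=0)             # psi_t(i)
        delta[t] = trans.max(axis=0) * B[:, O[t]] # delta_t(i)
    best = [int(delta[T - 1].argmax())]           # i*_T = argmax delta_T(i)
    for t in range(T - 1, 0, -1):                 # backtrack the optimal path
        best.append(int(psi[t][best[-1]]))
    return delta[T - 1].max(), best[::-1]         # P* and the optimal path I*

pi = np.array([0.6, 0.4]); A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.5], [0.1, 0.9]])
p_star, path = viterbi(pi, A, B, [0, 1, 1])
```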
The event element classifier is divided into three parts: feature splicing, constructing the event argument dependency relationship, and inferring the event argument type. The specific execution steps are as follows:
1) Feature splicing: the event type corresponding to each token x_i in the text to be extracted is mapped to an event vector V_i and spliced onto r_i, yielding the feature vector fusing event type information, R_i = {V_i, r_i};
2) Constructing the event argument dependency relationship: dependencies between tags are modeled using a conditional random field. For a given sentence S = {x_1, x_2, ..., x_n} and its corresponding sequence label g = {g_1, g_2, ..., g_n}, the conditional probability p(g|S) over all possible label sequences is calculated as

p(g|S) = exp(score(S, g)) / Σ_{g'∈β(S)} exp(score(S, g'))

where β(S) denotes the possible event argument label sequences of S. The score is computed as

score(S, g) = Σ_i ( W_{g_i}·f(R_i) + T_{g_{i-1}, g_i} )

where f is a mapping function that maps the feature vector to event argument label scores, W_g is the prediction weight matrix, and T_{g_{i-1}, g_i} is the transition weight. The loss function is computed as L = -Σ log p(g|S);
3) Inferring the event argument type: the input model τ = (C, D, π) and the observations O = {o_1, o_2, ..., o_T} are decoded using the Viterbi algorithm. Initialize δ_1(i) = π_i·b_i(o_1), i ∈ [1, N]; recurse

δ_t(i) = max_{1≤j≤N}[δ_{t-1}(j)·a_ji]·b_i(o_t) and ψ_t(i) = argmax_{1≤j≤N}[δ_{t-1}(j)·a_ji]

until P* = max_{1≤i≤N} δ_T(i) and i*_T = argmax_{1≤i≤N}[δ_T(i)]; backtrack via i*_t = ψ_{t+1}(i*_{t+1}) to determine the optimal path I* = (i*_1, ..., i*_T), obtain the analysis result, and output the argument and the argument role.
Compared with the prior art, the invention has the following positive effects:
according to the event extraction method fusing a plurality of neural networks, the convolutional neural network, the attention mechanism and the pre-training model are combined, so that the training effect is guaranteed, and the model training speed is effectively increased; on the other hand, the method pays attention to the importance of the semantic relation between the event type and the event argument to the event extraction and focuses on utilizing the relation, so that the accuracy and the efficiency of the model are remarkably improved compared with other models of the same type.
Drawings
FIG. 1 is an overall flow diagram of the method of the present invention.
FIG. 2 is a flow chart of an algorithm of an event extraction method for merging an attention mechanism with a convolutional neural network according to the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
Fig. 1 is a system model diagram of an event extraction method combining an attention mechanism and a convolutional neural network according to the present invention. The method mainly comprises the following five steps to complete event extraction:
step 101, preparing the event text data set used for training, namely the Epic of King Gesar;
step 102, constructing text feature representation by using a text encoder, and acquiring distributed feature representation of a text S to be extracted;
and 103, capturing sentence-level semantic representation, high-dimensional feature vectors of local features and semantic structures, and correlation feature vectors of words and target entities in the distributed feature representation by using a feature extractor.
step 104, using an event trigger word classifier to detect, in the feature representation acquired in step 103, the trigger words that represent events, and acquiring the event type corresponding to the text through the event trigger words;
step 105, using an event element classifier and an element role classifier to merge the event type obtained in step 104 with the feature representation obtained in step 103, judging each entity in the sentence in turn through the merged feature vector, deciding whether the phrase is an element of the event, and identifying its role category;
further, in step 101, an event text data set is collected in the literature, "the university of gesarang", and the data set is divided into a training set and a testing set according to a ratio of 9. The training set is input into a text coder and training events are extracted. The event extraction model adopts supervised learning, and the labeling of event corpora adopts a double-pointer labeling method. The double-pointer marking method can effectively solve the entity nesting problem by endowing each character with the corresponding starting position and the corresponding ending position of each label, and the labels cover entities, event trigger words, event elements and corresponding roles. Each corpus comprises two forms of a single sentence and a plurality of sentences, has event theme consistency and only considers main events. There are 22 experimental corpus event types.
Further, in step 102, the pre-trained model used in the text encoder adopts the bert-base-hierarchy version from Hugging Face, and the parameters of the model are learned using the Adam method. The steps are as follows:
step 201, selecting the training data set M for model learning and initializing the algorithm input, the event type set C and the relationship set L; setting the model learning parameters: maximum number of iterations epochs, learning rate λ, maximum input sequence length b, and batch size k (in this example epochs is 100, λ is 5e-5, b is 128, and k is 32);
step 202, training the feature representation model constructed by the text encoder on the data set selected in step 201 and learning the model parameters, adopting the Adam method during model learning;
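A minimal configuration sketch of this training setup in PyTorch, using the parameter values from step 201; the placeholder model stands in for the assembled network, which is defined elsewhere:

```python
import torch

# learning parameters from step 201
epochs, lr, max_seq_len, batch_size = 100, 5e-5, 128, 32

# `model` stands for the full network (text encoder, feature extractor and the
# three classifiers); a single linear layer is used here only as a placeholder
model = torch.nn.Linear(768, 22)
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

for epoch in range(epochs):
    pass  # iterate mini-batches of size batch_size, truncated to max_seq_len,
          # compute the CRF losses, call loss.backward() and optimizer.step()
```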
further, in step 103, the feature extractor includes a stack of a bidirectional LSTM neural network, a convolutional neural network, and an attention mechanism, including:
step 301, performing feature extraction on the learned feature representation by using a bidirectional LSTM neural network, with a dropout function used to prevent overfitting during training;
step 302, performing secondary feature extraction on the shallow feature representation obtained in step 301 by using a convolutional neural network, setting m convolution kernel lengths with n kernels of each length (in this example m is 3, the lengths are 2, 3 and 4, and n is 20);
step 303, performing attention calculation on the shallow feature representation obtained in step 301 by using an attention mechanism to obtain the key information between words and sentences, with similarity computed by the cosine distance formula;
step 304, splicing the feature representations obtained in the steps 302 and 303 to obtain a highly abstracted feature representation;
further, in step 104, the event type identification module takes the feature representation from step 304 as input, uses the transition matrix of the conditional random field layer to find the relations between the entities labelled in the training-set text and the labels, and extracts the event trigger words to obtain the event type;
further, in step 105, the event element classifier maps the event type obtained in step 104 into a 1-dimensional array and splices it onto the output of step 304, introducing event type label information to enhance the descriptive capability; it then uses the transition matrix of the conditional random field layer to find the relations between labels, predicts the constraint relations between labels, and extracts the event arguments and their corresponding roles.
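A small sketch of the feature splicing this step performs, reading the "1-dimensional array" as a learned event-type embedding V_i concatenated onto r_i; the embedding width and feature dimension are assumptions:

```python
import torch

NUM_EVENT_TYPES, D_R = 22, 256  # 22 event types (step 101); D_R, the width of r_i, is assumed

# map the predicted event type to a vector V_i and splice it onto r_i
type_embedding = torch.nn.Embedding(NUM_EVENT_TYPES, 32)  # the 32-dim V_i is illustrative

def fuse_event_type(r, event_type_ids):
    """R_i = {V_i, r_i}: feature vector fused with event type information.

    r: (n, D_R) correlation features; event_type_ids: (n,) predicted types
    """
    V = type_embedding(event_type_ids)  # (n, 32)
    return torch.cat([V, r], dim=-1)    # input to the argument CRF layer

# usage sketch
r = torch.randn(10, D_R)
types = torch.zeros(10, dtype=torch.long)
R = fuse_event_type(r, types)           # (10, 32 + D_R)
```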
The invention provides an event extraction method integrating an attention mechanism and a convolutional neural network. The method combines the convolutional neural network, the attention mechanism and the pre-trained model to effectively obtain the sentence-level event features in the literary description, learn the relationships among words, and dynamically generate feature vectors fusing context information; it guarantees extraction efficiency and finally realizes full utilization of the semantic correlation between event types and event arguments in the event extraction flow, greatly improving the accuracy of event extraction.
Of course, the present invention may have other embodiments; it is not limited to those described in the detailed description, and other embodiments derived by those skilled in the art from the technical solutions of the present invention also fall within the scope of the appended claims.

Claims (7)

1. An event extraction method fusing an attention mechanism and a convolutional neural network, comprising the following steps:
1) Performing feature representation on the text content to be extracted by using a text encoder to obtain the distributed features of the text to be extracted;
2) Extracting the contextual features of the text to be extracted and the association information between words from the distributed features by using a feature extractor;
3) Inputting the contextual features of the text to be extracted and the association information between words into an event trigger word classifier, outputting the event trigger words of the text to be extracted, and then determining the event type of the text to be extracted based on the event trigger words of the text to be extracted;
4) The event element classifier judging in turn, according to the event type and the contextual features of the text to be extracted, whether each token in the text to be extracted is an event element;
5) Identifying a role category for each of the event elements using an element role classifier.
2. The method of claim 1, wherein the method for obtaining the distributed features of the text to be extracted comprises: the text encoder generates a text sequence S = [x_1, x_2, x_3, ..., x_n] from the text to be extracted, where n is the number of tokens in the text to be extracted and x_i is the i-th token; then a word embedding vector, a segmentation vector and a position vector are generated for each token in the text sequence S, and S is converted into an input sequence T = (t_1, t_2, t_3, ..., t_n) by summing the word embedding vector, the segmentation vector and the position vector, where t_n is the fused vector of the word embedding, segmentation and position vectors of the n-th token; the sequence T = (t_1, t_2, t_3, ..., t_n) is input into a Transformer layer to obtain the inter-word associations and distribute weights through a self-attention function, obtaining feature vectors fusing context information; the feature vectors fusing context information are input into a pre-trained model to obtain the sequence E_m = {e_1, e_2, e_3, ..., e_n} of the text to be extracted, where e_n represents the word vector corresponding to the n-th token.
3. The method of claim 2, wherein the feature extractor comprises a bidirectional long short-term memory network, a convolutional layer and an attention unit; the method for extracting the contextual features of the text to be extracted and the association information between words comprises: first, the distributed features are input into the bidirectional LSTM to obtain sequence feature vectors, which are input into the convolutional layer; the convolutional layer performs convolution calculation on the sequence feature vectors to obtain the local-feature and semantic-structure high-dimensional feature vectors of the text to be extracted; the semantic-structure high-dimensional feature vectors are input into the attention unit to obtain, for each token x_i in the text to be extracted, the correlation feature vector r_i with the target entity, where i ∈ [1, n].
4. The method of claim 3, wherein the event trigger word classifier splices h_i and r_i and inputs the result into a conditional random field to obtain the event type corresponding to each token x_i in the text to be extracted, where h_i is the semantic-structure high-dimensional feature vector corresponding to token x_i.
5. The method of claim 2, wherein the pre-training model is a BERT model.
6. A server, comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for carrying out the steps of the method according to any one of claims 1 to 5.
7. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of one of claims 1 to 5.
CN202310154608.9A 2023-02-23 2023-02-23 Event extraction method integrating attention mechanism and convolutional neural network Pending CN115964497A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310154608.9A CN115964497A (en) 2023-02-23 2023-02-23 Event extraction method integrating attention mechanism and convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310154608.9A CN115964497A (en) 2023-02-23 2023-02-23 Event extraction method integrating attention mechanism and convolutional neural network

Publications (1)

Publication Number Publication Date
CN115964497A (en) 2023-04-14

Family

ID=87358657

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310154608.9A Pending CN115964497A (en) 2023-02-23 2023-02-23 Event extraction method integrating attention mechanism and convolutional neural network

Country Status (1)

Country Link
CN (1) CN115964497A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116701576A (en) * 2023-08-04 2023-09-05 华东交通大学 Event detection method and system without trigger words
CN116701576B (en) * 2023-08-04 2023-10-10 华东交通大学 Event detection method and system without trigger words


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination