CN111125367B - Multi-character relation extraction method based on multi-level attention mechanism - Google Patents
- Publication number
- CN111125367B CN111125367B CN201911362557.9A CN201911362557A CN111125367B CN 111125367 B CN111125367 B CN 111125367B CN 201911362557 A CN201911362557 A CN 201911362557A CN 111125367 B CN111125367 B CN 111125367B
- Authority
- CN
- China
- Prior art keywords
- text
- layer
- vector
- word
- entity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a multi-character relation extraction method based on a multi-level attention mechanism, comprising the following steps: preprocessing the collected text; aligning and labeling the original person named entities by remote supervision to obtain text containing the entities together with entity description information; training Chinese word vectors on the obtained entity-bearing text; constructing a bidirectional long short-term memory (BiLSTM) network containing two levels of attention mechanisms and training it to obtain a multi-class model for extracting multiple character relations; and inputting the preprocessed text to obtain the relation extraction result. The invention remedies the shortcomings of prior relation extraction on texts containing multiple character relations and improves the experimental results on such texts.
Description
Technical Field
The invention relates to the field of natural language processing, in particular to a method for extracting various character relations based on a multi-level attention mechanism.
Background
With the rapid development of internet technology, text data on the network grows exponentially, but it is largely unstructured. Information extraction is a natural language processing task that aims to extract structured information from unstructured text. It covers two subtasks: named entity recognition, which discovers the entities present in a text, and relation extraction, which determines the relationship between the discovered entities, i.e., obtains for a given text the entity pair e_1 and e_2 and the relation r between them as a triple (e_1, r, e_2). Relation extraction has been widely used in knowledge graphs, information retrieval, and other fields.
Conventional non-deep-learning methods for relation extraction are typically supervised and fall into feature-based and kernel-based methods; both rely on existing NLP tools, which leads to downstream error accumulation. Deep learning removes the need for manual feature engineering, but supervised deep learning requires large amounts of training data to learn features, and labeling such data takes considerable time and effort and is biased toward a fixed domain. Mintz et al. proposed remote supervision (distant supervision) in 2009: under the strong assumption that entity relations in a knowledge base also hold in text, a large amount of training data is generated by aligning the knowledge base with the text.
However, the strong assumption of remote supervision does not always hold: the entity relation expressed in a text is not necessarily the same as the relation recorded in the knowledge base. To alleviate this, Riedel et al. introduced multi-instance learning. In 2016, Lin et al. first combined a piecewise convolutional neural network with a sentence-level attention mechanism; introducing deep learning and attention in this way achieved better relation extraction results.
Most relation extraction work targets English text. For Chinese text, and especially Chinese text containing multiple character relations, how to use deep learning with attention mechanisms to better extract multiple character relations still needs to be studied.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a multi-character relation extraction method based on a multi-level attention mechanism. The method obtains a global feature representation of the text with a bidirectional long short-term memory network and a word-level attention mechanism, where the word-level attention strengthens the weights of words that are more important for relation extraction; it then adopts multi-instance learning, in which a sentence-level attention mechanism builds a bag representation from multiple sentence representations, and the description information of the named entities is added to strengthen the bag representation. The invention obtains better experimental results on a remotely supervised relation extraction data set.
The invention can be realized by the following technical scheme:
a multi-character relation extraction method based on a multi-level attention mechanism comprises the following steps:
preprocessing the collected text;
the method comprises the steps of adopting a remote supervision technology to carry out alignment labeling on an original personage named entity to obtain a text containing the entity and entity description information;
training the Chinese word vector of the obtained text containing the entity;
constructing a bidirectional long-short-time memory network containing two levels of attention mechanisms, and training the constructed model to obtain a multi-classification model for extracting various character relations;
and inputting the preprocessed text to obtain a text relation extraction result.
Specifically, the pretreatment includes:
removing English data in the text;
removing emoticons and hyperlinks in the text;
removing stop words in the text according to the Chinese stop word list;
and performing Chinese word segmentation on the text subjected to the processing.
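As a sketch, the four preprocessing steps above could look like the following. The stop-word set and the character-level fallback tokenizer are illustrative stand-ins; a real pipeline would load a full Chinese stop-word file and segment with a tool such as jieba.

```python
import re

# Illustrative stop-word set; a real pipeline loads a Chinese stop-word file.
STOP_WORDS = {"的", "了", "是"}

def preprocess(text: str) -> str:
    """Clean one collected text: drop hyperlinks, English data, and
    emoticons, then remove stop words and return space-separated tokens."""
    text = re.sub(r"https?://\S+", " ", text)   # remove hyperlinks
    text = re.sub(r"[A-Za-z]+", " ", text)      # remove English data
    # remove common emoji/emoticon code-point ranges
    text = re.sub(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]", " ", text)
    # Chinese word segmentation would normally use jieba.lcut(text);
    # a character-level split is used here as a self-contained fallback.
    tokens = [ch for ch in text if ch.strip()]
    return " ".join(t for t in tokens if t not in STOP_WORDS)
```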
Specifically, in the step of aligning and labeling the original character named entities by remote supervision, person name entries are acquired from the Chinese online Baidu Baike encyclopedia; two related persons and their relation form a triple, from which a character relation knowledge base is finally constructed. Entity pairs that appear both in the text and in the knowledge base are labeled with the relation of the corresponding triple. The final labeled data set in the invention has 35 relation types.
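A minimal sketch of the distant-supervision alignment described above; the knowledge-base triples below are hypothetical stand-ins for entries harvested from the encyclopedia.

```python
# Hypothetical character-relation knowledge base: (person1, person2) -> relation.
KB = {("张三", "李四"): "夫妻", ("王五", "赵六"): "师生"}

def label_sentence(sentence: str):
    """Distant supervision: if both persons of a knowledge-base pair occur
    in the sentence, label the sentence with that pair's relation triple."""
    labels = []
    for (e1, e2), rel in KB.items():
        if e1 in sentence and e2 in sentence:
            labels.append((e1, rel, e2))
    return labels
```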
Specifically, in the step of training Chinese word vectors on the text, the distributed word vector representation method Word2Vec is adopted, and the dimension of the output word vectors is set to 300.
Specifically, in the step of constructing the bidirectional long short-term memory network containing two levels of attention mechanisms, PyTorch is used to build the BiLSTM (bidirectional LSTM) and two-level attention network structure: the first layer of the network is an embedding layer, the second a bidirectional LSTM layer, the third a word-level attention layer, the fourth a sentence-level attention layer, and the fifth a softmax classifier layer.
Further, the input of the embedding layer is the trained word vector sequence. The length of a text sequence (the number of word vectors) is set to m; shorter sequences are padded with 0 and longer ones truncated to m, and the relative positions of the words in each text with respect to the two entities also have length m. The embedding layer uses Baidu Baike as the corpus, and the word vectors are obtained with Word2Vec using the Gensim tool. With word vector dimension d_w and randomly initialized position vector dimension d_p, a vector sequence w = {w_1, w_2, …, w_m}, w_i ∈ R^d is obtained, where d = d_w + 2·d_p.
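The padding, truncation, and relative-position scheme of the embedding layer input can be sketched as follows. The concrete values m=70 and the "<pad>" token are assumptions; the patent fixes the length only symbolically as m.

```python
def build_input(tokens, e1_idx, e2_idx, m=70):
    """Pad with a <pad> token (embedded as 0) or truncate to length m, and
    compute each position's relative distance to the two entity positions."""
    tokens = (tokens + ["<pad>"] * m)[:m]
    pos1 = [i - e1_idx for i in range(m)]  # relative position to entity 1
    pos2 = [i - e2_idx for i in range(m)]  # relative position to entity 2
    return tokens, pos1, pos2

# Each position is then mapped to a d-dimensional vector with d = dw + 2*dp,
# e.g. d = 300 + 2*5 = 310 if dp = 5 (the value of dp is an assumption).
```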
Still further, in the bidirectional LSTM layer, the unidirectional LSTM is expressed as follows, where i_t is the input gate, f_t the forget gate, c_t the cell state, o_t the output gate, h_t the hidden vector, and W_x, W_h, W_c are weights:

i_t = σ(W_xi x_t + W_hi h_(t-1) + W_ci c_(t-1) + b_i)
f_t = σ(W_xf x_t + W_hf h_(t-1) + W_cf c_(t-1) + b_f)
c_t = f_t c_(t-1) + i_t tanh(W_xc x_t + W_hc h_(t-1) + b_c)
o_t = σ(W_xo x_t + W_ho h_(t-1) + W_co c_t + b_o)
h_t = o_t tanh(c_t)

The hidden vector h_t of the bidirectional LSTM is obtained by combining the forward output h→_t and the backward output h←_t.
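One step of these gate equations can be sketched numerically as below. Diagonal (elementwise) peephole weights W_ci, W_cf, W_co are assumed, which is the standard reading of this gate formulation; the weight shapes are illustrative, not fixed by the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One step of the unidirectional LSTM equations above.
    W maps gate names to weights (matrices for the x/h terms, vectors for
    the elementwise peephole c terms); b maps gate names to bias vectors."""
    i = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev + W["ci"] * c_prev + b["i"])
    f = sigmoid(W["xf"] @ x_t + W["hf"] @ h_prev + W["cf"] * c_prev + b["f"])
    c = f * c_prev + i * np.tanh(W["xc"] @ x_t + W["hc"] @ h_prev + b["c"])
    o = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + W["co"] * c + b["o"])
    h = o * np.tanh(c)
    return h, c
```

The bidirectional layer runs this step forward and backward over the sequence and combines the two hidden vectors at each position.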
Still further, the word-level attention layer is used to strengthen the weight of words that are more important for relation extraction.
In the word-level attention layer, u_i denotes the relevance score of each word in a sentence, r is a random query vector, and h_i is the hidden vector of the i-th word (the i-th h_t); the specific relationship is:

u_i = h_i · r

α_i is the weight obtained by the word-level attention mechanism, calculated as follows:

α_i = exp(u_i) / Σ_j exp(u_j)

s is the vector representation of the sentence, calculated as follows:

s = Σ_i α_i h_i
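A numerical sketch of the word-level attention computation; a numerically stabilized softmax is used for the weights.

```python
import numpy as np

def word_attention(H, r):
    """H: (m, d) hidden vectors h_i; r: (d,) random query vector.
    Computes u_i = h_i . r, alpha = softmax(u), s = sum_i alpha_i * h_i."""
    u = H @ r                   # relevance score of each word
    e = np.exp(u - u.max())     # stabilized softmax numerator
    alpha = e / e.sum()
    s = alpha @ H               # sentence representation
    return s, alpha
```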
still further, the sentence-level attention layer is configured to add entity description information.
In the sentence-level attention layer, e i Representing an input sentence s i Relation r relative to predictions k The matching degree of (2) is calculated by the following formula:
α i the weight obtained by the sentence level attention mechanism is calculated as follows:
b is a vector representation of a packet, equal to the weighted sum of all sentences, calculated as follows:
the obtained package represents the description information of the entity on the splice, namely the category information vector of the entity, and is expressed as follows:
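A sketch of the sentence-level attention and the concatenation of entity description information. The bilinear matching form s_i A r_k is an assumption in the spirit of selective attention over instances; the patent defines e_i only as a matching degree.

```python
import numpy as np

def bag_representation(S, A, r_k, desc):
    """S: (n, d) sentence vectors s_i; A: (d, d) weight matrix; r_k: (d,)
    predicted-relation vector; desc: entity category/description vector.
    Computes e_i = s_i A r_k, beta = softmax(e), b = sum_i beta_i * s_i,
    then concatenates the entity description information onto b."""
    e = S @ A @ r_k
    w = np.exp(e - e.max())     # stabilized softmax
    beta = w / w.sum()
    b = beta @ S                # bag representation
    return np.concatenate([b, desc]), beta
```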
the fifth layer is the classifier softmax layer for generating results of the relational extraction by the softmax multi-classifier.
The method obtains a global feature representation of the text using BiLSTM (bidirectional LSTM) and word-level attention; the word-level attention strengthens the weights of words that are more important for relation extraction. It then adopts multi-instance learning: sentence-level attention builds a bag representation from multiple sentence representations, and the description information of the named entities is added to strengthen the bag representation.
Compared with the prior art, the invention has the following beneficial effects:
for Chinese texts with various task relations, the invention better avoids noise caused by remote supervision by adopting two levels of attention mechanisms, and adds entity description information into the Chinese texts, so that the semantic characteristics of the texts are enhanced, and better relation extraction results are obtained.
Drawings
FIG. 1 is a flow chart of a method for extracting relationships between multiple people based on a multi-level attention mechanism according to the present invention.
FIG. 2 is a diagram of a multiple persona relationship extraction network model based on a multi-level attention mechanism in accordance with the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Examples
Fig. 1 is a flowchart of a method for extracting multiple character relationships based on a multi-level attention mechanism, the method comprising the steps of:
(1) Preprocessing the collected text;
the disclosed remote supervision multiple persona relationship extraction data sets (e.g., CCKS 2019 IPER data sets) are used in this embodiment. The following operations are performed: firstly, removing English data in a text;
removing special symbols in the text, such as: emoji and hyperlinks, representing emoji as "expression", removing hyperlinks, etc.; and removing stop words in the text according to the Chinese stop word list.
(2) The method comprises the steps of adopting a remote supervision technology to carry out alignment labeling on an original personage named entity to obtain a text containing the entity and entity description information;
in the entity labeling stage, the character relationship knowledge base is finally constructed by utilizing the acquired name entry of the Chinese online hundred degrees encyclopedia to form a triplet of two characters with relationship and the relationship thereof. Pairs of entities, i.e., relationships of two entities, that appear in the text with the knowledge base are labeled as relationships in triples. The final annotated dataset had 35 relationship types.
(3) Training the Chinese word vector of the obtained text containing the entity;
in the text vectorization step of this embodiment, the word2vec method is used, chinese word segmentation is performed on the text processed by the above method using a barker word segmentation tool, word2vec training is performed using a genesim package, and the vector dimension of each word is 300.
(4) Constructing a bidirectional long-short-time memory network containing two levels of attention mechanisms, and training the constructed model to obtain a multi-classification model for extracting various character relations;
as shown in fig. 2, the network model constructed in this embodiment includes: an embedded layer, a bi-directional LSTM layer, a word level attention layer, a sentence level attention layer, and a softmax classification layer.
The neural network model constructed in this embodiment is trained on the downloaded Chinese Baidu Baike data set. The loss function is cross-entropy and the optimizer is Adam. After tuning the other model parameters, training stops after 15 epochs or when the loss has not changed for 1000 batches. The test set is then evaluated, and the relation extraction result is measured with a P-R curve, which plots the precision and recall of the results; a curve lying higher in the two-dimensional coordinate system indicates a better relation extraction effect.
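The P-R curve evaluation mentioned above can be sketched as follows: predicted relation facts are ranked by confidence, and precision/recall are computed at each cut-off.

```python
def precision_recall_points(scored, n_positive):
    """scored: list of (confidence, is_correct) predictions;
    n_positive: total number of gold relation facts.
    Returns the (precision, recall) points tracing the P-R curve."""
    points, hits = [], 0
    ranked = sorted(scored, key=lambda t: -t[0])  # descending confidence
    for k, (_, correct) in enumerate(ranked, start=1):
        hits += int(correct)
        points.append((hits / k, hits / n_positive))
    return points
```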
(5) And preprocessing the text required to be extracted, and inputting the preprocessed text into a trained model to obtain a text relation extraction result.
In the multi-character relation extraction method established above, the input text is converted into vector form by the embedding layer; the BiLSTM layer yields hidden vectors with richer features; the word-level attention mechanism gives higher weights to the words of the text that are more important; and the sentence-level attention mechanism produces a better bag representation that can suppress the noise brought by remote supervision, thereby obtaining better experimental results.
The above examples are preferred embodiments of the invention, but embodiments of the invention are not limited to them; any other change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the invention shall be regarded as an equivalent replacement and is included in the protection scope of the invention.
Claims (3)
1. A multi-character relation extraction method based on a multi-level attention mechanism is characterized by comprising the following steps:
preprocessing the collected text;
the method comprises the steps of adopting a remote supervision technology to carry out alignment labeling on an original personage named entity to obtain a text containing the entity and entity description information;
training the Chinese word vector of the obtained text containing the entity;
constructing a bidirectional long-short-time memory network containing two levels of attention mechanisms, and training the constructed model to obtain a multi-classification model for extracting various character relations;
inputting the preprocessed text to obtain a text relation extraction result;
in the step of aligning and labeling the original character named entities by remote supervision, person name entries are acquired from the Chinese online Baidu Baike encyclopedia; two related persons and their relation form a triple, from which a character relation knowledge base is finally constructed; entity pairs that appear both in the text and in the knowledge base are labeled with the relation of the corresponding triple;
in the step of constructing a bidirectional long short-term memory network containing two levels of attention mechanisms, PyTorch is used to build the BiLSTM and two-level attention network structure, wherein the first layer of the network is an embedding layer, the second layer a bidirectional LSTM layer, the third layer a word-level attention layer, the fourth layer a sentence-level attention layer, and the fifth layer a softmax classifier layer;
the input of the embedding layer is a trained word vector sequence, the length of a text sequence, namely the number of word vector sequences is set to be m, the number of word vector sequences is less than m and filled with 0, the relative position length of words in each text relative to two entities is also m, the embedding layer adopts a pre-trained word vector dimension dw and a randomly initialized position vector dimension dp, and therefore a vector sequence w= { w is obtained 1 ,w 2 ,...,w m },w i ∈R d Wherein d=dw+dp×2;
in the bidirectional LSTM layer, the unidirectional LSTM is expressed as follows, where i_t is the input gate, f_t the forget gate, c_t the cell state, o_t the output gate, h_t the hidden vector, and W_x, W_h, W_c are weights:
i_t = σ(W_xi x_t + W_hi h_(t-1) + W_ci c_(t-1) + b_i)
f_t = σ(W_xf x_t + W_hf h_(t-1) + W_cf c_(t-1) + b_f)
c_t = f_t c_(t-1) + i_t tanh(W_xc x_t + W_hc h_(t-1) + b_c)
o_t = σ(W_xo x_t + W_ho h_(t-1) + W_co c_t + b_o)
h_t = o_t tanh(c_t)
the hidden vector h_t of the bidirectional LSTM is obtained by combining the forward output h→_t and the backward output h←_t;
in the word-level attention layer, u_i denotes the relevance score of each word in a sentence, r is a random query vector, and h_i is the hidden vector of the i-th word (the i-th h_t); the specific relationship is:
u_i = h_i · r
α_i is the weight obtained by the word-level attention mechanism, calculated as follows:
α_i = exp(u_i) / Σ_j exp(u_j)
s is the vector representation of the sentence, calculated as follows:
s = Σ_i α_i h_i
the sentence-level attention layer is used for adding entity description information;
in the sentence-level attention layer, e_i denotes the matching degree of an input sentence s_i relative to the predicted relation r_k, computed with a bilinear form (A is a learned weight matrix):
e_i = s_i A r_k
β_i is the weight obtained by the sentence-level attention mechanism, calculated as follows:
β_i = exp(e_i) / Σ_j exp(e_j)
b is the vector representation of a bag, equal to the weighted sum of all sentence representations:
b = Σ_i β_i s_i
the obtained bag representation is concatenated with the description information of the entities, i.e., the category information vectors of the two entities:
b* = [b; c_e1; c_e2].
2. the method of claim 1, wherein the preprocessing comprises:
removing English data in the text;
removing emoticons and hyperlinks in the text;
removing stop words in the text according to the Chinese stop word list;
and performing Chinese word segmentation on the text subjected to the processing.
3. The method of claim 1, wherein the step of training the text for the chinese Word vector uses a distributed Word vector representation method Word2Vec, and the dimension of the output Word vector is set to 300.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911362557.9A CN111125367B (en) | 2019-12-26 | 2019-12-26 | Multi-character relation extraction method based on multi-level attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911362557.9A CN111125367B (en) | 2019-12-26 | 2019-12-26 | Multi-character relation extraction method based on multi-level attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111125367A CN111125367A (en) | 2020-05-08 |
CN111125367B true CN111125367B (en) | 2023-05-23 |
Family
ID=70502727
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911362557.9A Active CN111125367B (en) | 2019-12-26 | 2019-12-26 | Multi-character relation extraction method based on multi-level attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111125367B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111651606B (en) * | 2020-06-05 | 2024-03-01 | 深圳市慧择时代科技有限公司 | Text processing method and device and electronic equipment |
CN112101009B (en) * | 2020-09-23 | 2024-03-26 | 中国农业大学 | Method for judging similarity of red-building dream character relationship frames based on knowledge graph |
CN112560490A (en) * | 2020-12-08 | 2021-03-26 | 吉林大学 | Knowledge graph relation extraction method and device, electronic equipment and storage medium |
CN112818683A (en) * | 2021-01-26 | 2021-05-18 | 山西三友和智慧信息技术股份有限公司 | Chinese character relationship extraction method based on trigger word rule and Attention-BilSTM |
CN112926325A (en) * | 2021-02-14 | 2021-06-08 | 北京工业大学 | Chinese character relation extraction construction method based on BERT neural network |
CN113128229B (en) * | 2021-04-14 | 2023-07-18 | 河海大学 | Chinese entity relation joint extraction method |
CN113919350A (en) * | 2021-09-22 | 2022-01-11 | 上海明略人工智能(集团)有限公司 | Entity identification method, system, electronic equipment and storage medium |
CN117057345B (en) * | 2023-10-11 | 2024-01-30 | 腾讯科技(深圳)有限公司 | Role relation acquisition method and related products |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109408812A (en) * | 2018-09-30 | 2019-03-01 | 北京工业大学 | A method of the sequence labelling joint based on attention mechanism extracts entity relationship |
CN110502749A (en) * | 2019-08-02 | 2019-11-26 | 中国电子科技集团公司第二十八研究所 | A kind of text Relation extraction method based on the double-deck attention mechanism Yu two-way GRU |
-
2019
- 2019-12-26 CN CN201911362557.9A patent/CN111125367B/en active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109408812A (en) * | 2018-09-30 | 2019-03-01 | 北京工业大学 | A method of the sequence labelling joint based on attention mechanism extracts entity relationship |
CN110502749A (en) * | 2019-08-02 | 2019-11-26 | 中国电子科技集团公司第二十八研究所 | A kind of text Relation extraction method based on the double-deck attention mechanism Yu two-way GRU |
Also Published As
Publication number | Publication date |
---|---|
CN111125367A (en) | 2020-05-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111125367B (en) | Multi-character relation extraction method based on multi-level attention mechanism | |
CN110502749B (en) | Text relation extraction method based on double-layer attention mechanism and bidirectional GRU | |
CN110866117B (en) | Short text classification method based on semantic enhancement and multi-level label embedding | |
CN108984526B (en) | Document theme vector extraction method based on deep learning | |
CN108363743B (en) | Intelligent problem generation method and device and computer readable storage medium | |
CN111858944B (en) | Entity aspect level emotion analysis method based on attention mechanism | |
CN110765775B (en) | Self-adaptive method for named entity recognition field fusing semantics and label differences | |
Sun et al. | Sentiment analysis for Chinese microblog based on deep neural networks with convolutional extension features | |
CN111401061A (en) | Method for identifying news opinion involved in case based on BERT and Bi L STM-Attention | |
CN107943784B (en) | Relationship extraction method based on generation of countermeasure network | |
CN110969020A (en) | CNN and attention mechanism-based Chinese named entity identification method, system and medium | |
CN108388554B (en) | Text emotion recognition system based on collaborative filtering attention mechanism | |
CN111241816A (en) | Automatic news headline generation method | |
CN113392209B (en) | Text clustering method based on artificial intelligence, related equipment and storage medium | |
CN112069831A (en) | Unreal information detection method based on BERT model and enhanced hybrid neural network | |
CN113704416B (en) | Word sense disambiguation method and device, electronic equipment and computer-readable storage medium | |
Imaduddin et al. | Word embedding comparison for indonesian language sentiment analysis | |
Suyanto | Synonyms-based augmentation to improve fake news detection using bidirectional LSTM | |
CN116932661A (en) | Event knowledge graph construction method oriented to network security | |
CN114491062B (en) | Short text classification method integrating knowledge graph and topic model | |
CN111145914A (en) | Method and device for determining lung cancer clinical disease library text entity | |
CN111353032B (en) | Community question and answer oriented question classification method and system | |
CN113204975A (en) | Sensitive character wind identification method based on remote supervision | |
CN117216265A (en) | Improved graph annotation meaning network news topic classification method | |
Rafi et al. | A linear sub-structure with co-variance shift for image captioning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |