CN113191118B - Text relation extraction method based on sequence annotation - Google Patents


Info

Publication number
CN113191118B
CN113191118B (application CN202110501103.6A)
Authority
CN
China
Prior art keywords
word
entity
vector
sequence
relationship
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110501103.6A
Other languages
Chinese (zh)
Other versions
CN113191118A (en)
Inventor
展一鸣
李钊
吴士伟
李慧娟
辛国茂
陈通
胡传会
张超
赵秀浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Computer Science Center National Super Computing Center in Jinan
Original Assignee
Shandong Computer Science Center National Super Computing Center in Jinan
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Computer Science Center National Super Computing Center in Jinan filed Critical Shandong Computer Science Center National Super Computing Center in Jinan
Priority to CN202110501103.6A priority Critical patent/CN113191118B/en
Publication of CN113191118A publication Critical patent/CN113191118A/en
Application granted granted Critical
Publication of CN113191118B publication Critical patent/CN113191118B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F 40/117 — Handling natural language data; Text processing; Tagging; Marking up; Designating a block; Setting of attributes
    • G06F 18/241 — Pattern recognition; Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 40/211 — Natural language analysis; Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F 40/242 — Natural language analysis; Lexical tools; Dictionaries
    • G06N 3/045 — Neural networks; Combinations of networks
    • G06N 3/084 — Neural network learning methods; Backpropagation, e.g. using gradient descent
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the technical field of data processing, and in particular to a text relation extraction method based on sequence annotation. The method comprises: constructing a training data set similar to the data to be predicted, and presetting all possible bidirectional entity relations together with three fixed dependency relations; splitting an input sentence into a word sequence and feeding it into a pre-training model to obtain a representation vector for each word in the sentence; forming a unique word-pair sequence from the word-vector sequence in a handshake-like manner; feeding the resulting word-vector-pair sequence into a neural-network classification layer; calculating the loss and back-propagating; judging the category of each word pair, i.e. whether the pair carries the relation associated with each label position; and decoding the final result with the pseudo code shown in the drawings according to these correspondences, finally obtaining all extracted triples. The invention completes two tasks simultaneously, entity recognition and relation classification, and markedly improves extraction precision and recall.

Description

Text relation extraction method based on sequence annotation
Technical Field
The invention relates to the technical field of data processing, in particular to a text relation extraction method based on sequence labeling.
Background
Triple relations have been widely used in the field of natural language understanding as a knowledge representation that can be stored in a structured manner, and they play an important role. In natural language text, a piece of textual knowledge can always be represented by one or more triples. Extracting structured triple relations from one or more pieces of text, i.e. converting the knowledge expressed in discrete text into a form that a machine can understand or store, is a task referred to as relation extraction.
Relation extraction methods have evolved over many years, from early statistical methods to recent neural-network methods, and the relation extraction task has progressed from simple single-relation extraction to overlapping-relation extraction. Single-relation extraction means a piece of text contains exactly one triple, while overlapping-relation extraction means a piece of text contains multiple, overlapping triples. Overlapping relations are divided into entity-pair overlap and single-entity overlap; to better explain overlapping relations, the three relation types are shown in FIG. 1.
Conventional methods have three disadvantages. 1) Some relation extraction methods treat the task as pure relation classification, i.e. they classify the relation between entities already annotated in a given text, which is unsuitable for practical application. 2) Other methods simplify the problem and only solve single-relation extraction, ignoring the actual situation; in practice, multi-relation extraction is the most common case. 3) Pipeline methods divide relation extraction into two independent tasks, entity recognition and relation classification, without considering the correlation between them, which causes the exposure-bias problem: when the final result is generated in sequence, the result of a later step is affected by the result of the earlier step.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides a text relation extraction method based on sequence labeling, which can simultaneously complete two tasks: entity identification and relationship classification.
The technical scheme adopted for solving the technical problems is as follows:
a text relation extraction method based on sequence labeling comprises the following steps:
step 1, presetting all possible entity relation categories and establishing a relation set R;
step 2, constructing a training data set suitable for the service field, wherein entity relations in the training data must comprise all preset relation types;
step 3, expanding the relation set R by presetting all possible bidirectional entity relations to obtain a relation set R';
step 4, constructing a dependency relation set R_s for the triples;
step 5, dividing the input sentence into a word sequence and inputting it into a pre-training model to obtain the set of word representation vectors H = {h_1, h_2, …, h_n} ⊂ R^d for each sentence, where d is the hidden-layer vector dimension (a preset hyper-parameter) and h_i is a word vector;
step 6, forming a unique word-pair sequence from the word-vector sequence H obtained in step 5 by double-traversing the word sequence to form word-vector pairs;
step 7, constructing training targets for the word pairs generated in step 6: for each word pair (W_i, W_j), j ≥ i, construct a zero vector v_{i,j} of length 2r+3 (the target vector), where r is the number of relations in the relation set R;
step 8, inputting the word-vector-pair sequence obtained in step 6 into a neural-network classification layer that outputs scores over 2r+3 categories; the classification layer is:
h^(1)_{i,j} = Post-Norm([h_i; h_j])
h^(2)_{i,j} = W^(1) h^(1)_{i,j} + b^(1)
r_{i,j} = W^(2) h^(2)_{i,j} + b^(2)
where h_i ∈ H, h_j ∈ H, [h_i; h_j] is the vector concatenation operation, h^(1)_{i,j} is the intermediate vector obtained by applying the Post-Norm function to the concatenated vector, h^(2)_{i,j} is h^(1)_{i,j} after a linear transformation, W^(1) and W^(2) are trainable parameters, and b^(1) and b^(2) are bias parameters;
step 9, in the training stage, inputting r_{i,j} into the Circle Loss function, calculating the loss, and back-propagating;
step 10, in the prediction stage, judging the category of each word pair from the r_{i,j} obtained in step 8: each position of r_{i,j} represents a different label; if the value at a position is greater than 0, the word pair is judged to carry the relation corresponding to that position, and if it is less than 0, it is judged not to carry that relation;
step 11, decoding the final result with the pseudo code according to the correspondences of each word pair obtained in step 10, finally obtaining all extracted triples.
Further, in step 3, a bidirectional entity relation consists of two relations with different directions, namely a forward-pointing relation and a reverse-pointing relation.
Further, in step 4, the dependency relation set R_s contains three dependencies: head-entity head to tail-entity head, head-entity tail to tail-entity tail, and entity head to entity tail.
Further, in step 5, the pre-training model is one of BERT, ALBERT, RoBERTa, ERNIE, XLNet, and the like.
Further, in step 6, the traversal process is: each word vector forms a word-vector pair with itself and every word vector after it; traversing the word sequence yields n(n+1)/2 word-vector pairs, where n is the number of word vectors.
Further, in step 9, the loss function is:
L = log(1 + Σ_{i∈L} e^(−s_i^pos)) + log(1 + Σ_{j∈K} e^(s_j^neg))
where s_i^pos and s_j^neg are the predicted scores of positive and negative examples respectively, L and K are the positive and negative example sets, L is the Circle Loss value, and e is the base of the natural logarithm.
Further, in step 11, the pseudo code is as shown in fig. 3.
The invention has the technical effects that:
compared with the prior art, the text relation extraction method based on the sequence labeling can simultaneously complete two tasks: entity identification and relationship classification. Aiming at the entity overlapping problem in the text, the invention designs a novel labeling method which can solve the problem of overlapping of a plurality of entities. In order to solve the exposure deviation, the labeling method of the invention uses a joint extraction mode, which can label the relationship between the entities at the same time of labeling the entities. The method has the advantages of remarkably improving the extraction accuracy and recall rate and greatly improving the extraction accuracy and recall rate.
Drawings
FIG. 1 is an exemplary diagram of an overlapping relationship in accordance with the present invention;
FIG. 2 is a diagram of a model of the present invention;
FIG. 3 is a pseudo code diagram of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings of the specification.
Example 1:
the text relation extraction method based on sequence labeling, which is related to the embodiment, comprises the following steps:
Step 1, presetting all possible entity relation categories and establishing a relation set R: for the business scenario, key knowledge information in the field is determined by expert consultation, or the important attributes of the business objects in the scenario are determined. For example, in the medical field, for attribute analysis of drugs, the important attributes include: indications, administration method, adverse reactions, precautions, etc.; from these, four or more preset relations for drugs can be obtained and added to the relation set R.
Step 2, constructing a training data set suitable for the business field: first, we define a complete piece of training data, which must contain the original sentence S and all triple relations in S that exist in the relation set R. Second, given a paragraph or sentence of text, the required triples are extracted by manual analysis in combination with the preset relations R to form training data. For example, for the sentence S "The indications of drug A are B, C and D", three triples of S can be determined: (drug A, applicability, B), (drug A, applicability, C), (drug A, applicability, D). In this way an effective training sentence can be constructed.
Step 3, expanding the relation set R (containing r relations): each relation is expanded into two relations with different directions (forward and reverse), yielding a relation set R' containing 2r relations. For example, the relation "applicability" is extended to "applicability-forward" and "applicability-reverse". For two entities (A, B), the relation pointed from A to B is defined as the forward relation, and the relation pointed from B to A is the reverse relation.
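The expansion in step 3 can be sketched as follows (function name and the "-forward"/"-reverse" suffixes are illustrative, not from the patent):

```python
# Hypothetical sketch of step 3: expand each relation into a forward and a
# reverse variant, so r relations become 2r directed relations.
def expand_relations(relations):
    expanded = []
    for rel in relations:
        expanded.append(rel + "-forward")  # relation pointing from A to B
        expanded.append(rel + "-reverse")  # relation pointing from B to A
    return expanded

R = ["applicability", "administration method"]
R_prime = expand_relations(R)  # 2r = 4 directed relations
```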
Step 4, constructing a dependency relation set R_s for the triples; there are three dependencies: head-to-head (the start-entity head points to the target-entity head), tail-to-tail (the start-entity tail points to the target-entity tail), and head-to-tail (an entity's head points to its own tail). Here, the start entity is the former entity in a triple, i.e. the initiator of the relation, and the target entity is the latter entity in the triple, i.e. the recipient of the relation.
Step 5, dividing the input sentence into a word sequence and inputting it into the BERT pre-training model to obtain the set of word representation vectors H = {h_1, h_2, …, h_n} ⊂ R^d for each sentence, where d is the hidden-layer vector dimension (a preset hyper-parameter) and h_i is a word vector.
Step 6, forming a unique word-pair sequence from the word-vector sequence H obtained in step 5 in a handshake-like manner: each word vector "handshakes" with itself and every word vector after it to form a word-vector pair; double-traversing the word sequence thus yields n(n+1)/2 word-vector pairs, where n is the number of word vectors;
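The handshake pairing above can be sketched as:

```python
# Sketch of the step-6 "handshake": position i pairs with every j >= i
# (itself included), so n words yield n*(n+1)/2 ordered pairs.
def handshake_pairs(n):
    return [(i, j) for i in range(n) for j in range(i, n)]

pairs = handshake_pairs(4)  # 4*5/2 = 10 pairs for a 4-word sentence
```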
Step 7, constructing training targets for the word pairs generated in step 6: for each word pair (W_i, W_j), j ≥ i, construct a zero vector v_{i,j} of length 2r+3 (the target vector), where r is the number of relations in the relation set R. For a relation triple (E_1, R_{i'}, E_2) with R_{i'} ∈ R', we define E_{j'}[1], j' ∈ {1, 2}, as the first character of entity E_{j'}, E_{j'}[-1] as the last character of E_{j'}, and v_{i,j}[idx] as the value at position idx of the word pair's target vector. If W_i ∈ E_1 and W_j ∈ E_2, set v_{i,j}[i'] = 1; if W_i = E_1[1] and W_j = E_1[-1], or W_i = E_2[1] and W_j = E_2[-1], set v_{i,j}[2r+3] = 1; if W_i = E_1[1] and W_j = E_2[1], set v_{i,j}[2r+2] = 1; if W_i = E_1[-1] and W_j = E_2[-1], set v_{i,j}[2r+1] = 1.
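A hedged sketch of the step-7 target construction for one word pair, using 0-based indexing (so the patent's 1-based slots 2r+1..2r+3 become 2r..2r+2 here); the span and slot conventions are assumptions for illustration:

```python
# Hypothetical sketch: build the length-(2r+3) target vector for word pair
# (i, j), j >= i, from gold triples given as character-span annotations.
def build_target(i, j, triples, r):
    """triples: list of (head_span, rel_idx, tail_span), spans as (start, end)."""
    v = [0] * (2 * r + 3)
    for (h_start, h_end), rel, (t_start, t_end) in triples:
        if h_start <= i <= h_end and t_start <= j <= t_end:
            v[rel] = 1                  # pair carries bidirectional relation rel
        if (i, j) in ((h_start, h_end), (t_start, t_end)):
            v[2 * r + 2] = 1            # head-to-tail of a single entity
        if (i, j) == (h_start, t_start):
            v[2 * r + 1] = 1            # head-to-head dependency
        if (i, j) == (h_end, t_end):
            v[2 * r] = 1                # tail-to-tail dependency
    return v
```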
Step 8, inputting the word-vector-pair sequence obtained in step 6 into a neural-network classification layer that outputs 2r+3 category scores. The classification layer is:
h^(1)_{i,j} = Post-Norm([h_i; h_j])
h^(2)_{i,j} = W^(1) h^(1)_{i,j} + b^(1)
r_{i,j} = W^(2) h^(2)_{i,j} + b^(2)
where h_i ∈ H, h_j ∈ H, [h_i; h_j] is the vector concatenation operation, h^(1)_{i,j} is the intermediate vector obtained by applying the Post-Norm function to the concatenated vector, h^(2)_{i,j} is h^(1)_{i,j} after a linear transformation, W^(1) and W^(2) are trainable parameters, and b^(1) and b^(2) are bias parameters. Specifically, the Post-Norm function is:
h′ = Post-Norm(h) = LayerNorm(DropOut(GELU(Wh + b)) + h)
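A minimal numpy sketch of this classification layer (weights random, shapes illustrative; DropOut is the identity at inference time):

```python
import numpy as np

# Sketch of the step-8 classifier: concatenate h_i and h_j, apply
# Post-Norm = LayerNorm(DropOut(GELU(Wh + b)) + h), then two affine
# transformations down to 2r+3 scores.
def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-5):
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def post_norm(h, W, b):
    return layer_norm(gelu(W @ h + b) + h)  # residual connection, then LayerNorm

def classify(h_i, h_j, p):
    h = np.concatenate([h_i, h_j])          # [h_i; h_j]
    h1 = post_norm(h, p["W0"], p["b0"])     # h^(1)_{i,j}
    h2 = p["W1"] @ h1 + p["b1"]             # h^(2)_{i,j}
    return p["W2"] @ h2 + p["b2"]           # r_{i,j}: 2r+3 scores

rng = np.random.default_rng(0)
d, out = 4, 7                               # hidden size d; out = 2r+3 with r = 2
p = {"W0": rng.normal(size=(2 * d, 2 * d)), "b0": rng.normal(size=2 * d),
     "W1": rng.normal(size=(2 * d, 2 * d)), "b1": rng.normal(size=2 * d),
     "W2": rng.normal(size=(out, 2 * d)), "b2": rng.normal(size=out)}
scores = classify(rng.normal(size=d), rng.normal(size=d), p)  # shape (7,)
```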
Step 9, in the training stage, compare r_{i,j} with the target vector v_{i,j}; according to v_{i,j}, divide the entries of r_{i,j} into positive and negative examples, input r_{i,j} into the Circle Loss function, calculate the loss, and back-propagate. The loss function is:
L = log(1 + Σ_{i∈L} e^(−s_i^pos)) + log(1 + Σ_{j∈K} e^(s_j^neg))
where s_i^pos and s_j^neg are the predicted scores of positive and negative examples respectively, L and K are the positive and negative example sets, L is the Circle Loss value, and e is the base of the natural logarithm.
Step 10, in the prediction stage, judge the category of each word pair from the r_{i,j} obtained in step 8: each position of r_{i,j} represents a different label; if the value at a position is greater than 0, the word pair is judged to carry the relation corresponding to that position, and if it is less than 0, it is judged not to carry that relation.
Step 11, decode the final result using the pseudo code shown in fig. 3 according to the correspondences of each word pair obtained in step 10, finally obtaining all extracted triples.
The pseudo code of fig. 3 receives the labeled sequence S, a dictionary M mapping indices of S to index pairs of the original sentence, and the original input sentence, and outputs the final set of triples:
- Line 1 initializes a set E of word-index pairs carrying the "head-to-tail" dependency.
- Line 2 initializes a dictionary H mapping bidirectional entity relations to "head-to-head" word-index pairs.
- Line 3 initializes a dictionary T mapping bidirectional entity relations to "tail-to-tail" word-index pairs.
- Line 4 initializes the result set R and builds a dictionary mapping indices to relation types.
- Line 5 initializes the number r of bidirectional entity relations.
- Line 7 initializes the indices of the three dependency relations within the target vector.
- Lines 8-20 traverse the whole labeled sequence S: lines 9-11 convert indices labeled with the "head-to-tail" relation into head-tail word-index pairs of the original sentence via the dictionary M and add them to E; lines 12-19 traverse all bidirectional relations, mapping word-index pairs labeled "head-to-head" onto the set of bidirectional relations they carry, and likewise for word-index pairs labeled "tail-to-tail".
- Lines 21-38 perform a double traversal: lines 23-24 take E[i] and E[j] as the head-tail word-index pairs of two entities; line 25 defines P_h as the head-word index pair of E[i] and E[j], representing the "head-to-head" relation; line 26 defines P_t as the tail-word index pair, representing the "tail-to-tail" relation; line 27 looks up H[P_h], the bidirectional relations associated with P_h in the dictionary H; line 28 looks up T[P_t], the bidirectional relations associated with P_t in the dictionary T; line 29 takes the intersection Set_r = H[P_h] ∩ T[P_t]; lines 30-36 check that Set_r is non-empty and add triples to the result set R: line 31 extracts the start-entity fragment from the original sentence via its head and tail indices, line 32 extracts the target-entity fragment likewise, and lines 33-35 traverse Set_r, adding each predicted relation to the final output R.
- Line 39 returns the final result and the procedure ends.
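This decoding can be sketched in simplified form (the data layout is assumed, not the patent's exact pseudo code): entity spans come from the "head-to-tail" labels, and a triple is emitted whenever a relation appears in both the head-to-head set and the tail-to-tail set of a candidate entity pair:

```python
# Simplified sketch of the fig. 3 decoding.
# entity_spans: list of (start, end) pairs from "head-to-tail" labels;
# head_rel: dict mapping (start_i, start_j) -> set of relation names;
# tail_rel: dict mapping (end_i, end_j) -> set of relation names.
def decode(entity_spans, head_rel, tail_rel):
    triples = set()
    for (s1, e1) in entity_spans:           # candidate start entity
        for (s2, e2) in entity_spans:       # candidate target entity
            # relations supported by both dependency types
            rels = head_rel.get((s1, s2), set()) & tail_rel.get((e1, e2), set())
            for rel in rels:
                triples.add(((s1, e1), rel, (s2, e2)))
    return triples
```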
An entity is the general name for an objectively existing object or concept. A triple is a knowledge representation of the form (start entity, relation, target entity). The entity head is the first character of the entity's character segment: for Chinese, the first Chinese character of the entity word; for English, the first word of the entity phrase. The entity tail is the last character of the entity's character segment: for Chinese, the last Chinese character of the entity word; for English, the last word of the entity phrase.
test example:
the model of the invention reaches the leading level in the public data set NYT and WEBNLG, which are both English data sets, wherein the NYT data set comes from paper Modeling Relations and TheirMentions without LabeledText, and the paper uses a remote supervision learning method to report the triples extracted from the corpus in New York; the WEBNLG dataset is from paper Creating training corpora for nlg micro-planners, which uses a text generation approach to generate a piece of text and matches a pre-set ternary relationship. The experimental results are shown in table 1.
Table 1 comparison between the method of this patent and other methods
The NovelTagging method is from the paper "Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme", the GraphRel method is from the paper "Extracting Relational Facts by an End-to-End Neural Model with Copy Mechanism", the OrderCopyRE method is from the paper "Learning the Extraction Order of Multiple Relational Facts in a Sentence with Reinforcement Learning", and the CasRel method is from the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction".
The calculation method of the F1 evaluation index is as follows:
F1 = 2 × Precision × Recall / (Precision + Recall)
The F1 index reflects the overall level of the predicted results and is the harmonic mean of the model's precision and recall. As can be seen from Table 1, the method of the invention achieves the best effect on both data sets, improving the key index F1 by 1-2% over the latest methods, with precision and recall significantly better than other methods. Three problems are thereby better addressed: 1) entity extraction and inter-entity relation classification are solved simultaneously; 2) overlapping triples in the text are extracted correctly; 3) the exposure-bias problem is resolved, improving extraction accuracy.
The foregoing embodiments are merely examples of the present invention, and the scope of the present invention includes, but is not limited to, the forms and styles of the foregoing embodiments, and any suitable changes or modifications made by those skilled in the art, which are consistent with the claims of the present invention, shall fall within the scope of the present invention.

Claims (4)

1. A text relation extraction method based on sequence labeling, characterized by comprising the following steps:
step 1, presetting all possible entity relation categories and establishing a relation set R;
step 2, constructing a training data set suitable for the service field, wherein entity relations in the training data must comprise all preset relation types;
step 3, expanding a relation set R, and presetting all possible bidirectional entity relations to obtain a data set R';
step 4, constructing a dependency relation set R_s for the triples;
step 5, dividing the input sentence into a word sequence and inputting it into a pre-training model to obtain the set of word representation vectors H = {h_1, h_2, …, h_n} ⊂ R^d for each sentence, where d is the hidden-layer vector dimension (a preset hyper-parameter) and h_i is a word vector;
step 6, forming a unique word-pair sequence from the word-vector sequence H obtained in step 5 by double-traversing the word sequence to form word-vector pairs; the traversal process is: each word vector forms a word-vector pair with itself and every word vector after it; traversing the word sequence yields n(n+1)/2 word-vector pairs, where n is the number of word vectors;
step 7, constructing training targets for the word pairs generated in step 6: for each word pair (W_i, W_j), j ≥ i, construct a zero vector v_{i,j} of length 2r+3 (the target vector), where r is the number of relations in the relation set R;
step 8, inputting the word-vector-pair sequence obtained in step 6 into a neural-network classification layer that outputs scores over 2r+3 categories; the classification layer is:
h^(1)_{i,j} = Post-Norm([h_i; h_j])
h^(2)_{i,j} = W^(1) h^(1)_{i,j} + b^(1)
r_{i,j} = W^(2) h^(2)_{i,j} + b^(2)
where h_i ∈ H, h_j ∈ H, [h_i; h_j] is the vector concatenation operation, h^(1)_{i,j} is the intermediate vector obtained by applying the Post-Norm function to the concatenated vector, h^(2)_{i,j} is h^(1)_{i,j} after a linear transformation, W^(1) and W^(2) are trainable parameters, and b^(1) and b^(2) are bias parameters;
step 9, in the training stage, inputting r_{i,j} into the Circle Loss function, calculating the loss, and back-propagating; the loss function is:
L = log(1 + Σ_{i∈L} e^(−s_i^pos)) + log(1 + Σ_{j∈K} e^(s_j^neg))
where s_i^pos and s_j^neg are the predicted scores of positive and negative examples respectively, L and K are the positive and negative example sets, L is the Circle Loss value, and e is the base of the natural logarithm;
step 10, in the prediction stage, judging the category of each word pair from the r_{i,j} obtained in step 8: each position of r_{i,j} represents a different label; if the value at a position is greater than 0, the word pair is judged to carry the relation corresponding to that position, and if it is less than 0, it is judged not to carry that relation;
step 11, decoding the final result with the pseudo code according to the correspondences of each word pair obtained in step 10, finally obtaining all extracted triples.
2. The text relation extraction method based on sequence labeling according to claim 1, wherein in step 3 a bidirectional entity relation consists of two relations with different directions, namely a forward-pointing relation and a reverse-pointing relation.
3. The text relation extraction method based on sequence labeling according to claim 1, wherein in step 4 the dependency relation set R_s contains three dependencies: head-entity head to tail-entity head, head-entity tail to tail-entity tail, and entity head to entity tail.
4. The text relation extraction method based on sequence labeling according to claim 1, wherein in step 5 the pre-training model is one of BERT, ALBERT, RoBERTa, ERNIE, and XLNet.
CN202110501103.6A 2021-05-08 2021-05-08 Text relation extraction method based on sequence annotation Active CN113191118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110501103.6A CN113191118B (en) 2021-05-08 2021-05-08 Text relation extraction method based on sequence annotation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110501103.6A CN113191118B (en) 2021-05-08 2021-05-08 Text relation extraction method based on sequence annotation

Publications (2)

Publication Number Publication Date
CN113191118A CN113191118A (en) 2021-07-30
CN113191118B true CN113191118B (en) 2023-07-18

Family

ID=76984478

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110501103.6A Active CN113191118B (en) 2021-05-08 2021-05-08 Text relation extraction method based on sequence annotation

Country Status (1)

Country Link
CN (1) CN113191118B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113792539B (en) * 2021-09-15 2024-02-20 平安科技(深圳)有限公司 Entity relationship classification method and device based on artificial intelligence, electronic equipment and medium
CN115358341B (en) * 2022-08-30 2023-04-28 北京睿企信息科技有限公司 Training method and system for instruction disambiguation based on relational model

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101425065A (en) * 2007-10-31 2009-05-06 日电(中国)有限公司 Entity relation excavating method and device
CN103678316A (en) * 2012-08-31 2014-03-26 富士通株式会社 Entity relationship classifying device and entity relationship classifying method
CN106484675A (en) * 2016-09-29 2017-03-08 北京理工大学 Fusion distributed semantic and the character relation abstracting method of sentence justice feature
CN106649275A (en) * 2016-12-28 2017-05-10 成都数联铭品科技有限公司 Relation extraction method based on part-of-speech information and convolutional neural network
CN107133220A (en) * 2017-06-07 2017-09-05 东南大学 Name entity recognition method in a kind of Geography field
CN108280062A (en) * 2018-01-19 2018-07-13 北京邮电大学 Entity based on deep learning and entity-relationship recognition method and device
CN109408812A (en) * 2018-09-30 2019-03-01 北京工业大学 A method of the sequence labelling joint based on attention mechanism extracts entity relationship
CN111931506A (en) * 2020-05-22 2020-11-13 北京理工大学 Entity relationship extraction method based on graph information enhancement


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding; Jacob Devlin et al.; arXiv; 1-16 *
Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition; Takuma Kato et al.; arXiv; 1-8 *
An integrated model of Chinese word segmentation and part-of-speech tagging based on CNN and bidirectional LSTM; Zhang Jianhu; China Master's Theses Full-text Database; I138-1265 *
Named entity recognition for online medical consultation texts based on deep learning; Chen Hehong; China Master's Theses Full-text Database; E054-66 *

Also Published As

Publication number Publication date
CN113191118A (en) 2021-07-30

Similar Documents

Publication Publication Date Title
CN110032648B (en) Medical record structured analysis method based on medical field entity
CN107748757B (en) Question-answering method based on knowledge graph
CN110532554B (en) Chinese abstract generation method, system and storage medium
WO2020063092A1 (en) Knowledge graph processing method and apparatus
US7689527B2 (en) Attribute extraction using limited training data
WO2018153215A1 (en) Method for automatically generating sentence sample with similar semantics
CN106599032A (en) Text event extraction method combining sparse coding and a structured perceptron
CN110196906A (en) Deep learning-based text similarity detection method for the financial industry
CN112084381A (en) Event extraction method, system, storage medium and equipment
CN113191118B (en) Text relation extraction method based on sequence annotation
CN111222318B (en) Trigger word recognition method based on double-channel bidirectional LSTM-CRF network
CN112989208B (en) Information recommendation method and device, electronic equipment and storage medium
CN111581923A (en) Method, device and equipment for generating file and computer readable storage medium
CN110188359B (en) Text entity extraction method
CN114547298A (en) Biomedical relation extraction method, device and medium combining multi-head attention, a graph convolutional network, and an R-Drop mechanism
CN112101014B (en) Chinese chemical industry document word segmentation method based on mixed feature fusion
CN111125295A (en) Method and system for obtaining food safety question answers based on LSTM
CN114564563A (en) End-to-end entity relationship joint extraction method and system based on relationship decomposition
CN109815478A (en) Medicine entity recognition method and system based on convolutional neural networks
Logacheva et al. Word sense disambiguation for 158 languages using word embeddings only
Wang et al. Aspect-based sentiment analysis with graph convolutional networks over dependency awareness
CN112667819A (en) Entity description reasoning knowledge base construction and reasoning evidence quantitative information acquisition method and device
CN111492364A (en) Data labeling method and device and storage medium
CN113468311B (en) Knowledge graph-based complex question and answer method, device and storage medium
CN113886521A (en) Text relation automatic labeling method based on similar vocabulary

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant