CN110968665A - Method for recognizing upper and lower level word relation based on gradient enhanced decision tree - Google Patents

Method for recognizing upper and lower level word relation based on gradient enhanced decision tree

Info

Publication number
CN110968665A
CN110968665A (application CN201911086620.0A; granted as CN110968665B)
Authority
CN
China
Prior art keywords
sample
word
path
samples
decision tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911086620.0A
Other languages
Chinese (zh)
Other versions
CN110968665B (en)
Inventor
潘翔
阮义彰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN201911086620.0A priority Critical patent/CN110968665B/en
Publication of CN110968665A publication Critical patent/CN110968665A/en
Application granted granted Critical
Publication of CN110968665B publication Critical patent/CN110968665B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/355Class or cluster creation or modification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method for identifying hypernym-hyponym (upper-lower level word) relations based on a gradient enhanced decision tree. To train the classification model, the inputs are entity pairs together with their path information, and the output is 1 (hypernym-hyponym relation) or 0 (no such relation). A high-confidence recommendation set, drawn from positively classified results, is obtained by jointly training two classifiers. By continuously iterating on the high-confidence set, the model adapts quickly to the regularities of unlabeled corpus text. The method is well suited to mining hypernym-hyponym relations in the e-commerce domain.

Description

Method for recognizing upper and lower level word relation based on gradient enhanced decision tree
Technical Field
The invention relates to the technical field of natural language processing, in particular to a method for recognizing hypernym-hyponym (upper and lower level word) relations based on a gradient enhanced decision tree (that is, a gradient boosted decision tree, GBDT).
Background
Automatically mining and verifying hypernym-hyponym relationships between entities is an important task in e-commerce. Such a relationship holds between a generic entity (hypernym) and a specific instance of it (hyponym), for example "appliance" and "refrigerator". In e-commerce, mining these relations helps to better understand user queries and to recommend commodities.
However, in e-commerce this task faces many challenges. First, web text corpora contain a great deal of noise and are updated frequently. The noise makes it difficult for general-purpose methods to extract valid information from e-commerce text, and the high update frequency wastes significant labeling effort on new and proprietary words. Second, there are currently about 10 billion known commodity entities (including a large number of isomorphs). Assume each node in the category tree has at least one root node (y) and zero or more associated leaf nodes (x); the category tree over commodity entities is then enormous, so a mining method must achieve good recall while maintaining precision. To address the particularities of e-commerce corpus text, this invention draws on semi-supervised learning and proposes a gradient enhanced decision tree method based on joint training, which can automatically mine hypernym-hyponym entity relations from domain-specific and noisy text.

Existing entity-relation mining methods can be classified as supervised, semi-supervised, or unsupervised. In binary classification, jointly training multiple classifiers achieves higher accuracy than training each alone, provided the related tasks share a similar representation. Bootstrapping trains a classifier on a small number of labeled samples and then iteratively augments the training set with samples the current model trusts highly; it is good at absorbing new or large unlabeled e-commerce corpora from a small set of seed samples, but suffers from "semantic drift" after many iterations. To reduce the errors that semi-supervised iteration continually introduces, one approach cross-trains on different kinds of samples to prevent precision from degrading; another reduces the bias from labeling errors through conditionally independent splits of the feature space. Other, non-bootstrapping techniques obtain predictions from multiple extractors whose errors are largely independent, and combine those predictions to improve extraction accuracy. Beyond these, two complementary families of methods handle hypernym-hyponym relation mining: distributional methods and path-based methods. Distributional methods excel at discovering entity relations, while some path-based methods encode paths with a recurrent neural network and achieve comparable results.
Mining hypernym-hyponym relations in complex text mainly serves the following purposes:
First, when a user searches for commodities, the search content can be expanded through hypernyms and hyponyms, reducing repeated searches and improving the user experience.
Second, increasing commodity-information recall: without changing the dimensionality, the precision of recalled information improves and its volume grows richer.
Third, improving how many times a scene card can be reused within an application scenario.
Fourth, layering the related words of the commodity domain into a system of categories, attributes, attribute values, and an auxiliary classification tree.
Fifth, locating trending new words.
Disclosure of Invention
The aim of the invention is to overcome the above drawbacks by providing a method for identifying hypernym-hyponym relationships based on a gradient enhanced decision tree. To train the classification model, the inputs are entity pairs together with their path information, and the output is 1 (hypernym-hyponym relation) or 0 (no such relation). A high-confidence recommendation set, drawn from positively classified results, is obtained by jointly training two classifiers. By continuously iterating on the high-confidence set, the model adapts quickly to the regularities of unlabeled corpus text. The method is well suited to mining hypernym-hyponym relations in the e-commerce domain.
The invention achieves this aim through the following technical scheme: a method for recognizing hypernym-hyponym relations based on a gradient enhanced decision tree, comprising the following steps:
(1) constructing a randomly misplaced sample training set;
(2) constructing a path-based sample training set;
(3) training a semi-supervised joint gradient enhanced decision tree model on the constructed randomly misplaced and path-based sample training sets, and using the trained model to recognize hypernym-hyponym relations.
Preferably, the randomly misplaced sample training set is constructed as follows:
(1.1) the corpus text is segmented with the Alibaba Word Segmenter lexical analysis system; hypernym-hyponym word pairs extracted from an existing lexicon are matched against the text, and positive samples are constructed from each pair together with the text between its words;
(1.2) the hypernym and hyponym of each successfully matched pair are misplaced to form negative word pairs, and the misplaced pairs are matched against the text to construct randomly misplaced negative samples;
(1.3) the positive and negative samples obtained above are combined into the randomly misplaced sample training set.
Preferably, the path-based sample training set is constructed as follows:
(2.1) the corpus text is fragmented, recorded as S_split = Split({S_1, S_2, S_3, …, S_n});
(2.2) the misplaced word pairs from the randomly misplaced samples are matched against the corpus text to obtain the set of sentences containing a misplaced hypernym-hyponym pair, S_<x,y> = {S_<x1,y1>, S_<x2,y2>, S_<x3,y3>, …, S_<xn,yn>};
(2.3) the path between each misplaced word pair is extracted and recorded as P = {P_1, P_2, P_3, …, P_n};
(2.4) the extracted paths are matched against the corpus fragments {S_1, S_2, S_3, …, S_n}; after a successful match, the fragment's original sentence is looked up, and the first word before and after the path P′ that does not belong to the original misplaced pair is taken as a path-based negative word pair; the path-based sample training set is obtained by combining these with the positive samples.
Preferably, the corpus fragmentation uses an N-gram algorithm to enumerate the sentence fragments formed by all runs of consecutive tokens; each token counts as length 1, and fragments of length at most 5 are taken.
Preferably, the semi-supervised joint gradient enhanced decision tree model is an additive model, the learning algorithm is the forward stagewise algorithm, and the base function is a CART tree; the loss function is the squared-error loss, i.e.:
L(y, f(x)) = (1/2) · (y − f(x))²
The negative gradient is then:
−∂L(y, f(x)) / ∂f(x) = y − f(x)
where y − f(x) is the residual; the output is the classification tree f(x).
Preferably, the semi-supervised joint gradient enhanced decision tree training method comprises the following steps:
Input: a text corpus T, pre-trained word embeddings, and a maximum number of iterations I;
(i) preprocess the data in T and extract the two kinds of training samples X_p and X_d, where X_p is the path-based sample training set and X_d is the randomly misplaced sample training set;
(ii) convert each training sample into a vector representation using the word embeddings W;
(iii) set up the labeled sets X′_p and X′_d, where X′_p denotes path samples and X′_d denotes randomly misplaced samples;
(iv) train two classifiers f_1 and f_2 on X_p ∪ X′_p and X_d ∪ X′_d respectively;
(v) predict the unlabeled samples and select high-confidence positive samples to expand X′_p and X′_d as new training samples;
(vi) repeat steps (iv) and (v) until no new labeled samples are added to X′_p and X′_d;
Output: the two classifiers and the predicted labels for the test samples.
The beneficial effects of the invention are: the method can construct samples from complex text and predict labels for unlabeled entities; it analyzes the characteristics of e-commerce-domain text, summarizes hypernym-hyponym pairs of the e-commerce domain by means of substrings, patterns, and rule learning, and thus mines hypernym-hyponym relations in the e-commerce domain more effectively.
Drawings
FIG. 1 is a schematic flow diagram of the method of the invention;
FIG. 2 is a schematic diagram of the randomly misplaced sample training set according to an embodiment of the invention;
FIG. 3 is a schematic diagram of the construction of the path-based sample training set according to an embodiment of the invention;
FIG. 4 is a schematic diagram of the training process of the gradient enhanced decision tree model according to an embodiment of the invention.
Detailed Description
The invention will be further described with reference to specific examples, but the scope of the invention is not limited thereto:
Example: as shown in FIG. 1, a method for recognizing hypernym-hyponym relations based on a gradient enhanced decision tree includes the following steps:
(1) The randomly misplaced sample training set is constructed as follows (see FIG. 2):
The corpus text is first tokenized by the AliWS (Alibaba Word Segmenter) lexical analysis system. Hypernym-hyponym word pairs extracted from an existing lexicon are matched against the text, and positive samples are constructed from each pair together with the text between its words. The hypernym and hyponym of each successfully matched pair are then misplaced to form negative word pairs, and the misplaced pairs are matched against the text to construct randomly misplaced negative samples, for example:
(1) < apple is a fruit >
(2) < fruits such as apple >
(3) < dog is an animal >
After misplacement, <apple, fruit> and <dog, animal> become <apple, animal> and <dog, fruit>. Sentence paths matching these pairs are then sought in the corpus. After screening, the following are obtained:
(1) < apple for tropical animals >
(2) < dogs do not eat fruit >
Each misplaced word pair, together with its path information, is taken as a whole to form a negative sample (see the sketch below). The positive and negative samples are combined into the randomly misplaced sample training set.
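The following minimal sketch illustrates this construction, assuming whitespace-tokenized English text; the function name misplaced_negatives, the substring matching, and the tiny corpus are illustrative stand-ins for the AliWS-based pipeline, not the patent's implementation.

```python
# Hedged sketch: derive misplaced negative pairs from matched positive
# pairs, then keep only pairs that co-occur in some corpus sentence.
import random

def misplaced_negatives(positive_pairs, sentences):
    hyponyms = [x for x, _ in positive_pairs]
    hypernyms = [y for _, y in positive_pairs]
    shuffled = hypernyms[:]
    random.shuffle(shuffled)                 # randomly misplace the hypernyms
    negatives = []
    for hypo, wrong_hyper in zip(hyponyms, shuffled):
        if (hypo, wrong_hyper) in positive_pairs:
            continue                         # skip accidentally correct pairs
        for s in sentences:                  # find a sentence containing both words
            if hypo in s and wrong_hyper in s:
                negatives.append((hypo, wrong_hyper, s))
                break
    return negatives

pairs = [("apple", "fruit"), ("dog", "animal")]
corpus = ["apple for tropical animals", "dogs do not eat fruit"]
print(misplaced_negatives(pairs, corpus))    # e.g. [('apple', 'animal', ...)]
```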
The path between the two words of a pair is kept only if it satisfies the following conditions (see the sketch after this list):
1. Its length does not exceed 5 tokens; for example, "is a" has length 3 under the original tokenization.
2. Single-character tokens (byte length less than 2) are excluded; in non-hypernym-hyponym pairs these include words such as "one" and "no".
3. If no corpus sentence contains both words of a pair at the same time, no training sample is constructed from that pair.
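A hedged sketch of these three screening rules follows; the whitespace tokenization and the exact treatment of single-character tokens are assumptions about the original pipeline.

```python
# Sketch of the path-screening rules; returns True if a candidate path
# between word pair (x, y) should be kept as training material.
def keep_path(path_tokens, pair, sentence_tokens):
    x, y = pair
    if len(path_tokens) > 5:                   # rule 1: at most 5 tokens
        return False
    if any(len(t) < 2 for t in path_tokens):   # rule 2: drop single-character tokens
        return False
    if x not in sentence_tokens or y not in sentence_tokens:
        return False                           # rule 3: both words must co-occur
    return True

print(keep_path(["such", "as"], ("fruit", "apple"),
                ["fruit", "such", "as", "apple"]))   # True
```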
Based on the word-pattern feature vector obtained above, the word embeddings of the two words are concatenated with the word-pattern feature vector, and the final concatenated feature vector is used as the representation of the word pair, i.e., the vector representing a given word pair <x, y> together with its path.
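A minimal sketch of this pair representation follows, assuming 100-dimensional embeddings and a path vector obtained by averaging the path tokens' embeddings; the dimensions and the pooling choice are illustrative assumptions, since the published text does not reproduce the exact equations.

```python
# Sketch: represent a word pair <x, y> by concatenating the two word
# embeddings with a pooled feature vector of the path between them.
import numpy as np

def pair_vector(emb, x, y, path_tokens):
    path_vec = np.mean([emb[t] for t in path_tokens], axis=0)  # pooled path features
    return np.concatenate([emb[x], emb[y], path_vec])

rng = np.random.default_rng(0)
emb = {w: rng.random(100) for w in ["apple", "is", "a", "fruit"]}
v = pair_vector(emb, "apple", "fruit", ["is", "a"])
print(v.shape)   # (300,): hyponym + hypernym + path features
```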
(2) The path-based sample training set is constructed as follows (see FIG. 3): First, the corpus text is fragmented, recorded as S_split = Split({S_1, S_2, S_3, …, S_n}). Fragmentation uses an N-gram algorithm that enumerates the sentence fragments formed by all runs of consecutive tokens; each token counts as length 1, and fragments of length at most 5 are taken. For example, fragmenting the sentence "dragon fruit is a rose fruit", of length 7 under the original tokenization, yields:
(1) dragon fruit
(2) dragon fruit is
(3) dragon fruit is a
(4) dragon fruit is a rose
(5) dragon fruit is a rose fruit
… and so on, for 28 fragments in total.
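A small sketch of the fragment enumeration follows; the tokenized English sentence is an illustrative stand-in for the Chinese original, and for 7 tokens the enumeration indeed yields 7 + 6 + … + 1 = 28 fragments, matching the count above.

```python
# Enumerate all fragments of consecutive tokens (each token has length 1);
# an optional max_len implements the length-at-most-5 restriction.
def fragments(tokens, max_len=None):
    n = len(tokens)
    max_len = max_len or n
    return [tokens[i:i + k]
            for k in range(1, max_len + 1)
            for i in range(n - k + 1)]

sent = "dragon fruit is a rose family fruit".split()  # 7 tokens, illustrative
print(len(fragments(sent)))      # 28 fragments in total
print(len(fragments(sent, 5)))   # 25 fragments of length at most 5
```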
The misplaced word pairs from the randomly misplaced samples are matched against the corpus text to obtain the set of sentences containing a misplaced hypernym-hyponym pair, S_<x,y> = {S_<x1,y1>, S_<x2,y2>, S_<x3,y3>, …, S_<xn,yn>}, for example:
(1) S_<x1,y1> = <apple for tropical animals>
(2) S_<x2,y2> = <dogs do not eat fruit>
The path between each misplaced word pair is extracted and recorded as P = {P_1, P_2, P_3, …, P_n}. The extracted paths are matched against the corpus fragments {S_1, S_2, S_3, …, S_n}. After a successful match, the fragment's original sentence is looked up, and the first word before and after the path P′ that does not belong to the original misplaced pair is taken as a path-based negative word pair. For example:
P_1 = <for tropical zones>
P_2 = <cannot eat>
Matching against the text fragments yields the sentences:
S′_1 = <such a temperature is very suitable for tropical animals>
S′_2 = <people in cold weather do not eat cold food>
from which the path-based negative word pairs <temperature, animal> and <people, cold food> are obtained. Finally, these are combined with the positive samples to obtain the path-based sample training set.
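The sketch below illustrates this flanking-word extraction under a simplified whitespace tokenization; under the original Chinese tokenization the patent's examples yield <temperature, animal> and <people, cold food>, whereas the simplified English version picks the literal neighboring tokens.

```python
# Hedged sketch of step (2.4): locate an extracted path inside a source
# sentence and take the words immediately before and after it as a
# path-based negative word pair.
def path_negative(path, sentences):
    p = path.split()
    for sent in sentences:
        tokens = sent.split()
        for i in range(len(tokens) - len(p) + 1):
            if tokens[i:i + len(p)] == p:
                left = tokens[i - 1] if i > 0 else None
                right = tokens[i + len(p)] if i + len(p) < len(tokens) else None
                if left and right:
                    return (left, right, sent)   # negative pair and its source
    return None

print(path_negative("do not eat",
                    ["people in cold weather do not eat cold food"]))
# ('weather', 'cold', 'people in cold weather do not eat cold food')
```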
(3) The semi-supervised joint gradient enhanced decision tree model is trained on the constructed randomly misplaced and path-based sample training sets; the training process is shown in FIG. 4. The trained model is then used to recognize hypernym-hyponym relations.
After the two kinds of training samples, randomly misplaced samples and path-based samples, are constructed, training of the semi-supervised joint gradient enhanced decision tree model begins. When constructed, the misplacement-based samples vary the 100-dimensional vectors of the path and of the hyponym (300 dimensions in total), while the path-based samples vary the 200-dimensional vector of the hypernym and hyponym.
The semi-supervised joint gradient enhanced decision tree training method comprises the following steps:
Input: a text corpus T, pre-trained word embeddings, and a maximum number of iterations I;
(i) preprocess the data in T and extract the two kinds of training samples X_p and X_d, where X_p is the path-based sample training set and X_d is the randomly misplaced sample training set;
(ii) convert each training sample into a vector representation using the word embeddings W;
(iii) set up the labeled sets X′_p and X′_d, where X′_p denotes path samples and X′_d denotes randomly misplaced samples;
(iv) train two classifiers f_1 and f_2 on X_p ∪ X′_p and X_d ∪ X′_d respectively;
(v) predict the unlabeled samples and select high-confidence positive samples to expand X′_p and X′_d as new training samples;
(vi) repeat steps (iv) and (v) until no new labeled samples are added to X′_p and X′_d;
Output: the two classifiers and the predicted labels for the test samples (a sketch of this loop follows).
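A minimal sketch of steps (i)-(vi) follows, assuming scikit-learn-style classifiers and row-aligned featurizations of the same unlabeled pairs in the two views; the 0.8 threshold anticipates the high-confidence rule described below, and the function name and max_iter cap are illustrative.

```python
# Co-training sketch: two GBDT classifiers, each trained on its own view,
# jointly promote high-confidence unlabeled positives into the training set.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def co_train(Xp, yp, Xd, yd, Up, Ud, threshold=0.8, max_iter=10):
    """Xp/Xd: labeled path / misplaced features; Up/Ud: the same unlabeled
    pairs featurized in the two views (row-aligned)."""
    Xp_new, Xd_new = np.empty((0, Xp.shape[1])), np.empty((0, Xd.shape[1]))
    y_new = np.empty(0)
    f1 = f2 = None
    for _ in range(max_iter):                                  # steps (iv)-(vi)
        f1 = GradientBoostingClassifier().fit(
            np.vstack([Xp, Xp_new]), np.concatenate([yp, y_new]))
        f2 = GradientBoostingClassifier().fit(
            np.vstack([Xd, Xd_new]), np.concatenate([yd, y_new]))
        p1 = f1.predict_proba(Up)[:, 1]                        # step (v)
        p2 = f2.predict_proba(Ud)[:, 1]
        mask = (p1 > threshold) & (p2 > threshold)             # high confidence
        if not mask.any():
            break                                              # step (vi): stop
        Xp_new = np.vstack([Xp_new, Up[mask]])
        Xd_new = np.vstack([Xd_new, Ud[mask]])
        y_new = np.concatenate([y_new, np.ones(mask.sum())])
        Up, Ud = Up[~mask], Ud[~mask]
    return f1, f2
```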
Hypernym-hyponym relation mining is essentially a binary classification task. Among conventional machine-learning algorithms, the gradient enhanced decision tree is one of the best at fitting the true distribution: it classifies or regresses data with an additive model (a linear combination of base functions) that continually reduces the residuals produced during training. The model is additive, the learning algorithm is the forward stagewise algorithm, and the base function is a CART tree. The loss function is the squared-error loss, i.e.:
L(y, f(x)) = (1/2) · (y − f(x))²
The negative gradient is then:
−∂L(y, f(x)) / ∂f(x) = y − f(x)
where y − f(x) is the residual; in each iteration the model learns a weak classifier by fitting this residual. Weak classifiers are generally required to be simple, with low variance and high bias, because the training process improves the accuracy of the final classifier by steadily reducing the bias. The core idea is that each weak classifier fits the residual of the sum of all previous classifiers, the residual being the amount that, added to the predicted value, gives the true value. The model's input is labeled samples, divided into path samples and misplaced samples. Since this is a binary task, the label indicates a hypernym-hyponym pair or a non-hypernym-hyponym pair by 1 or 0; each sample thus has the format (label, feature vector). The output is the classification tree f(x).
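As an off-the-shelf stand-in for this classifier, the sketch below trains scikit-learn's GradientBoostingClassifier on (label, feature-vector) samples of the format just described; the feature dimension and the random data are illustrative only.

```python
# Train a GBDT on labeled pair vectors: label 1 = hypernym-hyponym pair,
# label 0 = not. Random features stand in for the real pair vectors.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 300))             # 200 pair vectors (illustrative)
y = rng.integers(0, 2, 200)            # labels in {0, 1}

clf = GradientBoostingClassifier(n_estimators=100, max_depth=3)
clf.fit(X, y)
print(clf.predict_proba(X[:3])[:, 1])  # positive-class probabilities
```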
The method mainly comprises the following steps:
(1) Initialization: f_0(x) = argmin_c Σ_{i=1..N} L(y_i, c), a tree with a single root node whose constant value c minimizes the loss function; for the squared loss this constant is the mean of the node:
f_0(x) = (1/N) Σ_{i=1..N} y_i
(2) For m = 1, 2, 3, …, M:
(a) compute the residual for each sample i = 1, 2, 3, …, N:
r_{mi} = −[∂L(y_i, f(x_i)) / ∂f(x_i)]_{f = f_{m−1}} = y_i − f_{m−1}(x_i)
(b) fit a classification tree to {(x_1, r_{m1}), …, (x_N, r_{mN})}, obtaining the leaf-node regions R_{mj}, j = 1, 2, …, J of the m-th tree;
(c) for j = 1, 2, …, J, estimate the value of each leaf-node region by line search to minimize the loss function:
c_{mj} = argmin_c Σ_{x_i ∈ R_{mj}} L(y_i, f_{m−1}(x_i) + c) = (1/K) Σ_{x_i ∈ R_{mj}} r_{mi}
where K is the number of samples in the j-th node of the m-th tree; c_{mj} is thus the mean of the residuals in that node;
(d) update the model, where I(·) is the indicator function:
f_m(x) = f_{m−1}(x) + Σ_{j=1..J} c_{mj} · I(x ∈ R_{mj})
(3) Obtain the final classification tree:
f_M(x) = f_0(x) + Σ_{m=1..M} Σ_{j=1..J} c_{mj} · I(x ∈ R_{mj})
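A from-scratch sketch of steps (1)-(3) under the squared loss follows; it uses one CART regression tree per round (scikit-learn's DecisionTreeRegressor already sets each leaf value to the mean of its targets, matching c_{mj} above), and M and max_depth are illustrative choices.

```python
# Gradient boosting by residual fitting: initialize with the mean, then
# repeatedly fit a CART tree to the residuals and add its predictions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbdt_fit(X, y, M=50, max_depth=3):
    f0 = y.mean()                        # step (1): constant minimizing squared loss
    F = np.full(len(y), f0)
    trees = []
    for _ in range(M):                   # step (2)
        r = y - F                        # (a) residual = negative gradient
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, r)  # (b), (c)
        F += tree.predict(X)             # (d) leaf values are residual means
        trees.append(tree)
    return f0, trees

def gbdt_predict(f0, trees, X):          # step (3): f_M = f_0 + sum of trees
    return f0 + sum(t.predict(X) for t in trees)
```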
After the gradient enhanced decision tree classification function is obtained, path samples are constructed from the unlabeled data T′_1 and fed into the classification tree for prediction.
(4) during training, a classification regression tree is trained for each possible class of the sample X. The training set has two types, namely an upper-lower relation or a non-upper-lower word relation, for the sample<x,y>The prediction result 0 indicates a non-hypernym relationship, and 1 indicates a hypernym relationship. After multi-round iterative training, two trees are generated, and a new sample is obtained<x’,y’>Is respectively F1(x),F2(x) Then the probability that the sample belongs to a certain class c is:
Figure BDA0002265607880000122
the method comprises the steps of training two classifiers by constructing different samples, taking a sample of which the prediction result of the same sample on the two classifiers is greater than 0.8 as a high-confidence sample.
When the texts {T_1, T_2, T_3, …, T_n} are pairwise disjoint, the high-confidence set generated by a new text T_2 is added directly to the training set after review, and the training set grows at a rate determined by that text. By the time the semi-supervised model has learned the high-confidence sets of the n mutually disjoint texts, the growth rate tends to 0.
when text { T'1,T′2,T′3,...,T′NT 'for any two texts'n,T′mWhen there is intersection
T′n∪T′m=T′n\T′m+T′n∩T′m+T′m\Tn
I.e. for T'1Newly-added T'm,T′nThe effect of the text is equivalent to newly added T'n\T′m+T′n∩T′m+T′m\T′n. And taking the intersection of the upper and lower word pairs in the meaning of the intersection.
Then, when n texts are newly added, any n pairwise intersecting texts can be split into at most 2^n − 1 mutually disjoint texts. When the model learns the n-th text (assuming i ≠ j), let T′_i \ T′_j = T′_ij, T′_j \ T′_i = T′_ji, and T′_i ∩ T′_j = T′_(j,i); then T′_ij, T′_ji, and T′_(j,i) are mutually disjoint, and the text growth rate can be computed over these disjoint parts. As n → N, the number of disjoint parts involved equals that of the fully disjoint case, so the amount of newly added information tends to 0; that is, as the texts approach the full corpus, learning adds nothing new. Since for any T′_i the model's growth rate upon learning T′_i is greater than or equal to 0, and the growth rate tends to 0 as i tends to infinity, the model converges.
While the invention has been described in connection with specific embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (6)

1. A method for recognizing the relation between upper and lower level words based on a gradient enhanced decision tree is characterized by comprising the following steps:
(1) constructing a randomly misplaced sample training set;
(2) constructing a path-based sample training set;
(3) training a semi-supervised joint gradient enhanced decision tree model on the constructed randomly misplaced and path-based sample training sets, and using the trained model to recognize the relation between upper and lower level words.
2. The method for recognizing the relation between upper and lower level words based on the gradient enhanced decision tree as claimed in claim 1, wherein the randomly misplaced sample training set is constructed as follows:
(1.1) the corpus text is segmented with the Alibaba Word Segmenter lexical analysis system; hypernym-hyponym word pairs extracted from an existing lexicon are matched against the text, and positive samples are constructed from each pair together with the text between its words;
(1.2) the hypernym and hyponym of each successfully matched pair are misplaced to form negative word pairs, and the misplaced pairs are matched against the text to construct randomly misplaced negative samples;
(1.3) the positive and negative samples obtained above are combined into the randomly misplaced sample training set.
3. The method for recognizing the relation between upper and lower level words based on the gradient enhanced decision tree as claimed in claim 1, wherein the path-based sample training set is constructed as follows:
(2.1) the corpus text is fragmented, recorded as S_split = Split({S_1, S_2, S_3, …, S_n});
(2.2) the misplaced word pairs from the randomly misplaced samples are matched against the corpus text to obtain the set of sentences containing a misplaced hypernym-hyponym pair, S_<x,y> = {S_<x1,y1>, S_<x2,y2>, S_<x3,y3>, …, S_<xn,yn>};
(2.3) the path between each misplaced word pair is extracted and recorded as P = {P_1, P_2, P_3, …, P_n};
(2.4) the extracted paths are matched against the corpus fragments {S_1, S_2, S_3, …, S_n}; after a successful match, the fragment's original sentence is looked up, and the first word before and after the path P′ that does not belong to the original misplaced pair is taken as a path-based negative word pair; the path-based sample training set is obtained by combining these with the positive samples.
4. The method for recognizing the relation between upper and lower level words based on the gradient enhanced decision tree as claimed in claim 3, wherein the corpus fragmentation uses an N-gram algorithm to enumerate the sentence fragments formed by all runs of consecutive tokens; each token counts as length 1, and fragments of length at most 5 are taken.
5. The method for recognizing the relation between upper and lower level words based on the gradient enhanced decision tree as claimed in claim 1, wherein the semi-supervised joint gradient enhanced decision tree model is an additive model, the learning algorithm is the forward stagewise algorithm, and the base function is a CART tree; the loss function is the squared-error loss, i.e.:
L(y, f(x)) = (1/2) · (y − f(x))²
The negative gradient is then:
−∂L(y, f(x)) / ∂f(x) = y − f(x)
where y − f(x) is the residual; the output is the classification tree f(x).
6. The method for recognizing the relation between upper and lower level words based on the gradient enhanced decision tree as claimed in claim 1, wherein the semi-supervised joint gradient enhanced decision tree training method comprises the following steps:
Input: a text corpus T, pre-trained word embeddings, and a maximum number of iterations I;
(i) preprocess the data in T and extract the two kinds of training samples X_p and X_d, where X_p is the path-based sample training set and X_d is the randomly misplaced sample training set;
(ii) convert each training sample into a vector representation using the word embeddings W;
(iii) set up the labeled sets X′_p and X′_d, where X′_p denotes path samples and X′_d denotes randomly misplaced samples;
(iv) train two classifiers f_1 and f_2 on X_p ∪ X′_p and X_d ∪ X′_d respectively;
(v) predict the unlabeled samples and select high-confidence positive samples to expand X′_p and X′_d as new training samples;
(vi) repeat steps (iv) and (v) until no new labeled samples are added to X′_p and X′_d;
Output: the two classifiers and the predicted labels for the test samples.
CN201911086620.0A 2019-11-08 2019-11-08 Method for recognizing upper and lower level word relation based on gradient enhanced decision tree Active CN110968665B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911086620.0A CN110968665B (en) 2019-11-08 2019-11-08 Method for recognizing upper and lower level word relation based on gradient enhanced decision tree

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911086620.0A CN110968665B (en) 2019-11-08 2019-11-08 Method for recognizing upper and lower level word relation based on gradient enhanced decision tree

Publications (2)

Publication Number Publication Date
CN110968665A (en) 2020-04-07
CN110968665B CN110968665B (en) 2022-09-23

Family

ID=70030486

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911086620.0A Active CN110968665B (en) 2019-11-08 2019-11-08 Method for recognizing upper and lower level word relation based on gradient enhanced decision tree

Country Status (1)

Country Link
CN (1) CN110968665B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2008202384A1 (en) * 2008-05-23 2009-12-10 O'Collins, Frank Anthony Mr Ucadia Semantic Classification System
CN108733702A (en) * 2017-04-20 2018-11-02 北京京东尚科信息技术有限公司 User inquires method, apparatus, electronic equipment and the medium of hyponymy extraction
CN107506486A (en) * 2017-09-21 2017-12-22 北京航空航天大学 A kind of relation extending method based on entity link
CN109408642A (en) * 2018-08-30 2019-03-01 昆明理工大学 A kind of domain entities relation on attributes abstracting method based on distance supervision
CN110196982A (en) * 2019-06-12 2019-09-03 腾讯科技(深圳)有限公司 Hyponymy abstracting method, device and computer equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郭茂盛 (Guo Maosheng) et al., "Research Progress and Prospects of Textual Entailment Recognition and Knowledge Acquisition" (文本蕴含关系识别与知识获取研究进展及展望), Chinese Journal of Computers (《计算机学报》) *

Also Published As

Publication number Publication date
CN110968665B (en) 2022-09-23

Similar Documents

Publication Publication Date Title
CN107330032B (en) Implicit discourse relation analysis method based on recurrent neural network
CN110427623B (en) Semi-structured document knowledge extraction method and device, electronic equipment and storage medium
CN108984724B (en) Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation
CN108897857B (en) Chinese text subject sentence generating method facing field
CN108829722B (en) Remote supervision Dual-Attention relation classification method and system
CN108182295B (en) Enterprise knowledge graph attribute extraction method and system
CN111444343B (en) Cross-border national culture text classification method based on knowledge representation
CN111753024B (en) Multi-source heterogeneous data entity alignment method oriented to public safety field
CN111694924A (en) Event extraction method and system
CN113268995B (en) Chinese academy keyword extraction method, device and storage medium
CN106649561A (en) Intelligent question-answering system for tax consultation service
CN108304373B (en) Semantic dictionary construction method and device, storage medium and electronic device
CN101561805A (en) Document classifier generation method and system
CN112307351A (en) Model training and recommending method, device and equipment for user behavior
CN112069312B (en) Text classification method based on entity recognition and electronic device
CN112966525B (en) Law field event extraction method based on pre-training model and convolutional neural network algorithm
CN110210036A (en) A kind of intension recognizing method and device
CN112131876A (en) Method and system for determining standard problem based on similarity
CN113157859A (en) Event detection method based on upper concept information
CN111159405B (en) Irony detection method based on background knowledge
CN114417851A (en) Emotion analysis method based on keyword weighted information
CN114564563A (en) End-to-end entity relationship joint extraction method and system based on relationship decomposition
CN112115259A (en) Feature word driven text multi-label hierarchical classification method and system
Katumullage et al. Using neural network models for wine review classification
CN113722439B (en) Cross-domain emotion classification method and system based on antagonism class alignment network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant