CN108563637A - Sentence entity completion method fusing a triple knowledge base - Google Patents

Sentence entity completion method fusing a triple knowledge base

Info

Publication number
CN108563637A
Authority
CN
China
Prior art keywords
entity
sentence
entities
triple
vector
Prior art date
Legal status
Pending
Application number
CN201810328826.9A
Other languages
Chinese (zh)
Inventor
黄河燕
魏骁驰
史学文
刘茜
Current Assignee
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN201810328826.9A
Publication of CN108563637A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

A sentence entity completion method fusing a triple knowledge base, belonging to the field of computer natural language processing. The concrete steps are: 1. build the data set used for model training; 2. represent entities, relations, and sentence templates as vectors; 3. complete the entity word in the sentence. Compared with the prior art, the proposed method can take into account the relations between the entity word to be completed and the other entity words in the sentence when completing an entity word, which effectively solves the difficulty that conventional sentence completion methods have in completing entity words. Experiments show that the proposed method achieves a clear improvement on the mean rank (MR) and top-10 hit rate (H@10) evaluation metrics.

Description

Sentence entity completion method fusing a triple knowledge base
Technical field
The present invention relates to a sentence entity completion method that fuses a triple knowledge base, and belongs to the technical field of computer natural language processing.
Background technology
Sentence completion is a technique that supplements an incomplete sentence into a complete one. It is widely applicable to intelligent prompting in input methods, query suggestion in search engines, and similar scenarios. With the widespread adoption of computers and other electronic devices, text input has become a basic operation on these devices. Sentence completion technology can greatly reduce the number of words a user has to type, increase input speed, and improve the user experience, and it is already used at scale by input-method vendors.
Traditional sentence completion methods are mostly based on word co-occurrence statistics. By counting over large-scale text, they obtain the co-occurrence probabilities between each word and all other words; these probabilities are then used to compute the probability of filling different words into the sentence to be completed, the candidate words are ranked by this probability, and the several highest-probability words are returned to the user as candidates to choose from.
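As a point of reference only, a minimal sketch of such a co-occurrence baseline (not part of the patent; all names and the toy corpus are illustrative) might look like this:

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence(sentences):
    """Count how often each pair of words appears in the same sentence."""
    cooc = defaultdict(Counter)
    for sent in sentences:
        words = set(sent.lower().split())
        for w1, w2 in combinations(words, 2):
            cooc[w1][w2] += 1
            cooc[w2][w1] += 1
    return cooc

def rank_candidates(cooc, context_words, vocabulary, top_k=10):
    """Score each candidate by its total co-occurrence with the context words."""
    scores = {c: sum(cooc[w][c] for w in context_words) for c in vocabulary}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

corpus = ["obama was born in hawaii", "paris is the capital of france"]
cooc = build_cooccurrence(corpus)
print(rank_candidates(cooc, ["obama", "born"], ["hawaii", "france"]))
```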
The drawback of these traditional methods is that they only consider co-occurrence information between words in the sentence and ignore the semantic relations between entity words in the sentence. When an entity word in the sentence needs to be completed, the semantic relation between the entity word to be completed and the other entity words in the sentence is often more important than co-occurrence information; this is especially evident for subject-predicate-object sentences that describe relations between entities. Since the co-occurrence information used by conventional methods cannot effectively capture the relations between entities, it is difficult for them to produce satisfactory completions for missing entity words.
Summary of the invention
The purpose of the present invention is to solve the problem that sentence completion methods have difficulty completing entity words; to this end, a sentence entity completion method fusing a triple knowledge base is proposed.
The present invention is achieved through the following technical solution.
The related definitions are given first, as follows:
Definition 1: entity, an identifier that represents a specific thing;
Definition 2: relation, an identifier that represents the connection between the things represented by two entities;
Definition 3: triple, a structure composed of two entities and the relation between them; wherein the two entities form an entity pair;
Definition 4: triple knowledge base, a set composed of a large number of triples;
wherein the set of all entities appearing in the triples is called the entity set, and the set of all relations is called the relation set;
Definition 5: sentence entity, a noun in a sentence that can be matched to an entity;
Definition 6: sentence template, the content that remains after deleting any two sentence entities from a sentence;
Definition 7: named entity recognition method, a method that takes a sentence as input and returns its sentence entities;
Definition 8: entity linking method, a method that takes a sentence entity as input and finds the entity it can be matched to;
Definition 9: triple connection rule, i.e., if the second entity of one triple is identical to the first entity of another triple, the two triples can be connected through that shared entity;
Definition 10: relation path, i.e., the set of relations contained in a chain of triples connected according to the triple connection rule;
wherein the triple connection rule is as in Definition 9; a small sketch of the data structures implied by these definitions is given after this list.
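These definitions map naturally onto simple data structures. The following is a minimal sketch (all class and function names, and the toy triples, are illustrative and not taken from the patent):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    head: str      # first entity (Definition 1)
    relation: str  # relation between the two entities (Definition 2)
    tail: str      # second entity

# A triple knowledge base (Definition 4) is a collection of triples.
kb = [
    Triple("Barack_Obama", "born_in", "Honolulu"),
    Triple("Honolulu", "located_in", "Hawaii"),
]

entity_set = {e for t in kb for e in (t.head, t.tail)}    # entity set
relation_set = {t.relation for t in kb}                   # relation set

def can_connect(t1: Triple, t2: Triple) -> bool:
    """Triple connection rule (Definition 9): t1's second entity equals t2's first."""
    return t1.tail == t2.head

# A relation path (Definition 10) of a connected chain of triples is the
# sequence of their relations, e.g. ["born_in", "located_in"].
path = [kb[0].relation, kb[1].relation] if can_connect(kb[0], kb[1]) else []
print(path)
```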
The concrete steps of the sentence entity completion method fusing a triple knowledge base are:
Step 1: build the data set used for model training.
Given a sentence set, perform steps 1.1-1.3 for each sentence in the sentence set:
Step 1.1: extract all sentence entities in the sentence using the named entity recognition method; the resulting sentence entities form the set N;
wherein sentence entity is as in Definition 5 and the named entity recognition method is as in Definition 7;
Step 1.2: pair up the sentence entities in the N obtained in step 1.1 by traversing all combinations; every two matched sentence entities form a sentence entity pair;
Step 1.3: for each sentence entity pair obtained in step 1.2, perform steps 1.3.1-1.3.2:
Step 1.3.1: delete the two sentence entities of the pair from the sentence to obtain a sentence template;
wherein sentence template is as in Definition 6;
Step 1.3.2: use the entity linking method to find the two entities in the entity set that match the two sentence entities of the pair, and form an entity pair; then traverse all triples in the triple knowledge base to find a relation path that connects the two entities of the entity pair;
wherein the entity linking method is as in Definition 8, the triple knowledge base is as in Definition 4, and the relation path is as in Definition 10. A sketch of this data-set construction follows.
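A minimal sketch of step 1, in which the hypothetical callables `recognize_entities` and `link_entity` stand in for the named entity recognition and entity linking methods of Definitions 7 and 8 (the knowledge base is passed as plain (head, relation, tail) tuples):

```python
from itertools import combinations
from collections import defaultdict, deque

def build_training_examples(sentences, kb, recognize_entities, link_entity):
    """Step 1: turn raw sentences into (template, entity pair, relation path) examples."""
    # Index the knowledge base for path search: head entity -> [(relation, tail)].
    out_edges = defaultdict(list)
    for head, relation, tail in kb:
        out_edges[head].append((relation, tail))

    examples = []
    for sentence in sentences:
        n = recognize_entities(sentence)                        # step 1.1: sentence entities
        for w1, w2 in combinations(n, 2):                       # step 1.2: sentence entity pairs
            template = sentence.replace(w1, "").replace(w2, "") # step 1.3.1: sentence template
            e1, e2 = link_entity(w1), link_entity(w2)           # step 1.3.2: entity pair
            path = find_relation_path(out_edges, e1, e2)        # step 1.3.2: relation path
            if path:
                examples.append((template, (e1, e2), path))
    return examples

def find_relation_path(out_edges, source, target, max_len=3):
    """Breadth-first search for a chain of triples (Definitions 9 and 10) linking source to target."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        entity, path = queue.popleft()
        if entity == target and path:
            return path
        if len(path) >= max_len:
            continue
        for relation, tail in out_edges[entity]:
            if tail not in seen:
                seen.add(tail)
                queue.append((tail, path + [relation]))
    return []
```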
Step 2: represent entities, relations, and sentence templates as vectors;
wherein entity is as in Definition 1, relation is as in Definition 2, and sentence template is as in Definition 6;
Step 2.1: randomly initialize a vector for each entity in the entity set, the set of all resulting vectors being denoted E, and randomly initialize a vector for each relation in the relation set, the set of all resulting vectors being denoted R;
Step 2.2: substitute every triple in the triple knowledge base into the scoring formula in turn to compute L_k^i, then sum L_k^i over all triples to obtain L_k;
wherein e_i^1 denotes the first entity of the i-th triple, e_i^2 the second entity, r_i the relation of the i-th triple, and E(e_i^1), E(e_i^2) and E(r_i) the vectors randomly initialized in step 2.1 for these two entities and this relation.
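The formula for L_k^i is not reproduced in this text (it appears only as an image in the published document). Given the variables listed above, a TransE-style translation distance is a natural reading; the sketch below is therefore an assumption, not the patent's verbatim formula, and the entities, relations, and dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50

# Step 2.1: randomly initialize one vector per entity and per relation.
entities = ["Barack_Obama", "Honolulu", "Hawaii"]
relations = ["born_in", "located_in"]
E = {e: rng.normal(size=dim) for e in entities}
R = {r: rng.normal(size=dim) for r in relations}

def triple_loss(e1, r, e2):
    """Assumed TransE-style term: L_k^i = ||E(e1) + E(r) - E(e2)||_2."""
    return np.linalg.norm(E[e1] + R[r] - E[e2])

kb = [("Barack_Obama", "born_in", "Honolulu"), ("Honolulu", "located_in", "Hawaii")]
L_k = sum(triple_loss(*t) for t in kb)   # step 2.2: sum over all triples
print(L_k)
```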
Step 2.3: for each sentence template obtained in step 1.3.1, perform steps 2.3.1-2.3.2:
Step 2.3.1: feed the i-th sentence template s_i into the sequence-based neural network model; its output is the vector representation of the template, denoted f(s_i);
Step 2.3.2: by Definition 6, each sentence template has a corresponding sentence entity pair; substitute the two entities of the entity pair obtained for this sentence template in step 1.3.2, together with the f(s_i) obtained in step 2.3.1, into formula (1) to compute L_s^i;
wherein E(w_i^1) and E(w_i^2) denote the vectors initialized in step 2.1 for the two entities of the entity pair, and ‖·‖ denotes the two-norm;
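Formula (1) is likewise not reproduced here. A plausible assumed form, consistent with the two-norm and the variables named above, lets the template vector play the role the relation vector plays in step 2.2; the sketch below uses a plain recurrent encoder (the embodiment later mentions a recurrent neural network) and is a sketch under that assumption, not the patent's exact model:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 50

# Toy word vectors for template tokens and toy entity vectors (illustrative only).
word_vecs = {w: rng.normal(size=dim) for w in ["was", "born", "in"]}
E = {e: rng.normal(size=dim) for e in ["Barack_Obama", "Honolulu"]}

# A vanilla recurrent encoder standing in for the sequence-based model of step 2.3.1.
W_h = rng.normal(size=(dim, dim)) * 0.1
W_x = rng.normal(size=(dim, dim)) * 0.1

def encode_template(tokens):
    """f(s_i): run the template tokens through a simple RNN and return the last state."""
    h = np.zeros(dim)
    for tok in tokens:
        h = np.tanh(W_h @ h + W_x @ word_vecs.get(tok, np.zeros(dim)))
    return h

def template_loss(template_tokens, e1, e2):
    """Assumed formula (1): L_s^i = ||E(w_i^1) + f(s_i) - E(w_i^2)||_2."""
    return np.linalg.norm(E[e1] + encode_template(template_tokens) - E[e2])

print(template_loss(["was", "born", "in"], "Barack_Obama", "Honolulu"))
```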
Step 2.4: sum all L_s^i obtained in step 2.3 to obtain L_s;
Step 2.5: for every sentence template s_i obtained in step 1.3.1, substitute the vector f(s_i) obtained in step 2.3.1 and the relation path obtained in step 1.3.2 for the sentence entity pair corresponding to this template into formula (2) to compute L_p^i;
wherein E(r_j) denotes the vector initialized in step 2.1 for each relation on this relation path, and Σ denotes summation; then sum L_p^i over all sentence templates to obtain L_p;
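Formula (2) is also not rendered in this text. Since it combines the template vector with the relation vectors of the path under a summation, an assumed form is a two-norm between the template vector and the sum of the path's relation vectors, as sketched below (an assumption, not the patent's exact formula):

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 50
R = {r: rng.normal(size=dim) for r in ["born_in", "located_in"]}

def path_loss(template_vec, relation_path):
    """Assumed formula (2): L_p^i = ||f(s_i) - sum_j E(r_j)||_2 over the relation path."""
    path_sum = np.sum([R[r] for r in relation_path], axis=0)
    return np.linalg.norm(template_vec - path_sum)

f_si = rng.normal(size=dim)    # f(s_i) from the template encoder
print(path_loss(f_si, ["born_in", "located_in"]))
```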
Step 2.6: sum the L_k obtained in step 2.2, the L_s obtained in step 2.4, and the L_p obtained in step 2.5 to obtain the optimization objective function L;
Step 2.7: use a gradient descent algorithm to optimize the entity vectors, the relation vectors, and the parameters of the sequence-based neural network model that appear in the objective function L so that L is minimized; after optimization, the optimal entity vectors, relation vectors, and parameters of the sequence-based neural network model are obtained.
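Putting the three terms together, a minimal training sketch for L = L_k + L_s + L_p could look like the following. The patent does not name a framework; PyTorch, the GRU encoder, and the assumed loss forms are illustrative choices, and the integer-id tensors (triples, token ids, entity ids, relation-path ids) are hypothetical inputs:

```python
import torch
import torch.nn as nn

class CompletionModel(nn.Module):
    """Joint model: entity/relation embeddings plus a sequence encoder for templates."""
    def __init__(self, n_entities, n_relations, vocab_size, dim=50):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)    # step 2.1: entity vectors
        self.rel = nn.Embedding(n_relations, dim)   # step 2.1: relation vectors
        self.word = nn.Embedding(vocab_size, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)  # sequence-based model

    def encode_template(self, token_ids):
        emb = self.word(token_ids).unsqueeze(0)     # (1, seq_len, dim), batch of one
        _, h = self.encoder(emb)                    # h: (1, 1, dim)
        return h[-1, 0]                             # f(s_i): (dim,)

    def loss(self, triples, templates):
        # triples: LongTensor (N, 3) of (entity, relation, entity) ids.
        # templates: list of (token_ids LongTensor, (w1, w2) entity-id tensors, path LongTensor).
        e1, r, e2 = (triples[:, i] for i in range(3))
        # L_k: assumed TransE-style term over knowledge-base triples.
        L_k = torch.norm(self.ent(e1) + self.rel(r) - self.ent(e2), dim=1).sum()
        # L_s and L_p: assumed forms of formulas (1) and (2) over training templates.
        L_s = L_p = torch.zeros(())
        for token_ids, (w1, w2), path in templates:
            f_s = self.encode_template(token_ids)
            L_s = L_s + torch.norm(self.ent(w1) + f_s - self.ent(w2))
            L_p = L_p + torch.norm(f_s - self.rel(path).sum(dim=0))
        return L_k + L_s + L_p                      # step 2.6: overall objective L

model = CompletionModel(n_entities=100, n_relations=20, vocab_size=1000)
opt = torch.optim.SGD(model.parameters(), lr=0.01)  # step 2.7: gradient descent
# One update would be: opt.zero_grad(); model.loss(triples, templates).backward(); opt.step()
```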
Step 3: complete the entity word in the sentence, which specifically includes the following sub-steps:
Step 3.1: the user provides a sentence s in which an entity word needs to be completed; extract all sentence entities in s using the named entity recognition method; the resulting sentence entities form the set E_1;
wherein the named entity recognition method is as in Definition 7;
Step 3.2: for each sentence entity in E_1, perform steps 3.2.1-3.2.6, where the i-th sentence entity is denoted w_i:
Step 3.2.1: combine the sentence entity w_i with the sentence entity w to be completed to form a sentence entity pair;
Step 3.2.2: delete w_i from s to obtain a sentence template;
wherein sentence template is as in Definition 6;
Step 3.2.3: feed the sentence template and the optimal parameters of the sequence-based neural network model obtained in step 2.7 into the sequence-based neural network model to obtain the vector representing the sentence template;
Step 3.2.4: use the entity linking method to find the entity e_i in the entity set that matches the sentence entity w_i;
Step 3.2.5: substitute the entity e_i obtained in step 3.2.4 and the template vector obtained in step 3.2.3 into formula (3) to compute the vector E_i(e);
wherein E(e_i) is the vector obtained for e_i in step 2.7;
Step 3.2.6: use a similarity formula to compute the similarity between the vector E_i(e) and the vector of every entity obtained in step 2.7, sort all entities by this similarity in descending order, and record the rank of each entity;
Step 3.3: for each entity in the entity set, sum the ranks it received in step 3.2.6 to obtain its rank sum;
Step 3.4: sort the entities of the entity set by the rank sum obtained in step 3.3 in ascending order, return the sentence entity corresponding to the top-ranked entity, and use it to complete the sentence s given by the user in step 3.1;
wherein sentence entity is as in Definition 5. A sketch of this completion procedure is given below.
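Formula (3) is not reproduced either. Because the resulting vector E_i(e) is then compared against all entity vectors, an assumed translation-style form E_i(e) = E(e_i) + f_i (mirroring the assumed formula (1)) is used in the sketch below, with cosine similarity as the similarity formula (the embodiment uses cosine similarity); the helper callables are hypothetical:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def complete_entity(sentence_entities, entity_vecs, encode_template, link_entity, make_template):
    """Steps 3.2-3.4: rank every knowledge-base entity by its summed rank over all cues."""
    rank_sums = {e: 0 for e in entity_vecs}
    for w_i in sentence_entities:
        f_i = encode_template(make_template(w_i))          # step 3.2.3: template vector
        e_i = link_entity(w_i)                             # step 3.2.4: linked entity
        query = entity_vecs[e_i] + f_i                     # step 3.2.5: assumed formula (3)
        ranked = sorted(entity_vecs,                       # step 3.2.6: similarity ranking
                        key=lambda e: cosine(query, entity_vecs[e]),
                        reverse=True)
        for rank, e in enumerate(ranked, start=1):         # step 3.3: accumulate ranks
            rank_sums[e] += rank
    return min(rank_sums, key=rank_sums.get)               # step 3.4: smallest rank sum wins
```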
Advantageous effects
Compared with existing sentence completion methods, the sentence entity completion method fusing a triple knowledge base of the present invention has the following advantages:
1. When completing an entity word in a sentence, it can take into account the relations between the entity word to be completed and the other entity words in the sentence, which solves the problem that conventional sentence completion methods have difficulty completing entity words.
2. In a sentence entity completion task on a data set constructed from Wikipedia and Freebase, with the Wikipedia sentences split randomly into a training set and a test set, the experimental results show that, on the same data set, the proposed completion method is clearly better than both a traditional language model based on word co-occurrence and a restricted system that does not use triple information, as measured by the mean rank (MR) and top-10 hit rate (H@10) evaluation metrics.
Description of the drawings
Fig. 1 is the overall framework and design flow chart of the sentence entity completion method fusing a triple knowledge base of the present invention.
Detailed description
The method of the invention is described in detail below with reference to the accompanying drawings and embodiments.
Embodiment 1
The detailed flow of the sentence entity completion method fusing a triple knowledge base is shown in Fig. 1. This embodiment describes the flow of the method of the invention and its specific implementation.
The data used in this embodiment consist of 50,000 sentences from Wikipedia and 200 million triples from Freebase.
The flow chart of the sentence entity completion method fusing a triple knowledge base used in this embodiment is shown in Fig. 1; the specific steps are:
Step A: build the data set used for model training.
Given a sentence set, perform steps A.1-A.3 for each sentence in the set:
Step A.1: extract all sentence entities in the sentence using the named entity recognition method; the resulting sentence entities form the set E. The named entity recognition method is implemented with the open-source tool OpenNLP: all sentences are fed into OpenNLP to obtain the entity words.
Step A.2: pair up the sentence entities in the E obtained in step A.1 by traversing all combinations; every two matched sentence entities form a sentence entity pair;
Step A.3: for each sentence entity pair obtained in step A.2, perform steps A.3.1-A.3.2:
Step A.3.1: delete the two sentence entities of the pair from the sentence to obtain a sentence template;
Step A.3.2: use the entity linking method to find the two entities in the entity set that match the two sentence entities of the pair and form an entity pair; the entity linking is realized by matching against anchor texts in Wikipedia. Then traverse all triples in Freebase to find a relation path that connects the two entities of the entity pair;
wherein step A.3.1 and step A.3.2 can be performed simultaneously, or step A.3.2 can be performed first and step A.3.1 afterwards. A sketch of the anchor-text linking idea is given below.
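OpenNLP itself is a Java library, so the NER step is not sketched here; the snippet below only illustrates the anchor-text linking idea, under the assumption that a table mapping Wikipedia anchor strings to knowledge-base entities has been extracted beforehand (the table contents and function name are illustrative):

```python
def link_by_anchor_text(sentence_entity, anchor_table):
    """Entity linking via Wikipedia anchor texts: look the surface form up in a
    precomputed table mapping anchor strings to knowledge-base entities."""
    return anchor_table.get(sentence_entity.lower())

# Toy anchor table; in the embodiment this would be extracted from Wikipedia dumps.
anchor_table = {
    "obama": "Barack_Obama",
    "barack obama": "Barack_Obama",
    "hawaii": "Hawaii",
}

print(link_by_anchor_text("Obama", anchor_table))   # -> "Barack_Obama"
```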
Step B: represent entities, relations, and sentence templates as vectors.
Step B.1: randomly initialize a vector for each entity in the Freebase entity set and a vector for each relation in the Freebase relation set;
Step B.2: substitute every triple in Freebase into the scoring formula to compute L_k^i, then sum over all triples to obtain L_k; wherein e_i^1 denotes the first entity of the i-th Freebase triple, e_i^2 the second entity, r_i its relation, and E(e_i^1), E(e_i^2) and E(r_i) the vectors randomly initialized in step B.1 for these two entities and this relation;
Step B.3: for each sentence template obtained in step A.3.1, perform steps B.3.1-B.3.2:
Step B.3.1: feed the i-th sentence template s_i into the sequence-based neural network model; its output is the vector representation of the template, denoted f(s_i);
Step B.3.2: substitute the two entities of the entity pair obtained for this sentence template in step A.3.2 and the f(s_i) obtained in step B.3.1 into formula (1) to compute L_s^i; wherein E(w_i^1) and E(w_i^2) denote the vectors initialized in step B.1 for the two entities of the entity pair;
Step B.4: sum all L_s^i obtained in step B.3 to obtain L_s;
Step B.5: for every sentence template obtained in step A.3.1, substitute the vector obtained for it in step B.3.1 and the relation path obtained through steps A.3.1-A.3.2 into formula (2) to compute L_p^i; wherein E(r_j) denotes the vector initialized in step B.1 for each relation on the relation path between the two entities in Freebase; then sum L_p^i over all sentence templates to obtain L_p;
Step B.6: sum the L_k obtained in step B.2, the L_s obtained in step B.4, and the L_p obtained in step B.5 to obtain the optimization objective function L;
Step B.7: use a gradient descent algorithm to optimize the entity vectors, the relation vectors, and the parameters of the recurrent neural network model in the objective function L so that L is minimized; after optimization, the optimal vectors of all Freebase entities, the vectors of all relations, and the parameters of the recurrent neural network model are obtained.
Step C: complete the entity word in the sentence.
Step C.1: the given sentence to be completed is "Obama was born in ___, U.S."; using the named entity recognition method, all sentence entities in it, "Obama" and "U.S.", are extracted, and the resulting sentence entities form the set E_1;
Step C.2: for each sentence entity in E_1, i.e., "Obama" and "U.S.", perform steps C.2.1-C.2.6:
Step C.2.1: combine this sentence entity with the sentence entity w to be completed to form a sentence entity pair;
Step C.2.2: delete this sentence entity from the original sentence to obtain a sentence template; for example, deleting "Obama" yields the template "___ was born in ___, U.S.", and deleting "U.S." yields the template "Obama was born in ___, ___";
Step C.2.3: feed the sentence template and the optimal parameters of the recurrent neural network obtained in step B.7 into the recurrent neural network model to obtain the vector representing the sentence template;
Step C.2.4: use the entity linking method to find the entity e_i in the Freebase entity set that matches this sentence entity;
Step C.2.5: substitute the entity obtained in step C.2.4 and the template vector f obtained in step C.2.3 into formula (3) to compute E_i(e); wherein E(e_i) is the vector obtained for e_i in step B.7;
Step C.2.6: use cosine similarity to compute the similarity between the vector E_i(e) and the vector of every Freebase entity obtained in step B.7, sort all entities by this similarity in descending order, and record the rank of each entity;
Step C.3: for each entity in the Freebase entity set, sum all the ranks it received in step C.2.6 to obtain its rank sum;
Step C.4: sort the entities of the Freebase entity set by the rank sum obtained in step C.3 in ascending order, return the sentence entity "Hawaii" corresponding to the top-ranked entity, and complete the original sentence with it, obtaining the complete sentence "Obama was born in Hawaii, U.S.". A worked sketch of this rank aggregation follows.
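As an illustration of steps C.2.6-C.4, the toy sketch below aggregates the ranks produced by the two cues ("Obama" and "U.S.") and returns the entity with the smallest rank sum; the candidate entities and their orderings are invented for illustration only:

```python
# Descending-similarity rankings produced (hypothetically) by the two cues.
ranking_from_obama = ["Hawaii", "Ohio", "Texas"]
ranking_from_us    = ["Texas", "Hawaii", "Ohio"]

rank_sums = {}
for ranking in (ranking_from_obama, ranking_from_us):
    for rank, entity in enumerate(ranking, start=1):            # step C.2.6: rank per cue
        rank_sums[entity] = rank_sums.get(entity, 0) + rank     # step C.3: rank sum

best = min(rank_sums, key=rank_sums.get)                        # step C.4: smallest rank sum
print(rank_sums, best)    # "Hawaii" has rank sum 1 + 2 = 3 and is returned
```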
Embodiment 2
In a sentence entity completion task on the data set constructed from Wikipedia and Freebase, the Wikipedia sentences are split randomly into a training set and a test set. Under the same data set, the method used herein is compared with a traditional language model based on word co-occurrence and with a restricted system that does not use triple information, using the mean rank (MR) and top-10 hit rate (H@10) as evaluation metrics; the following experimental results are obtained.
Table 1: performance comparison between the method proposed by the present invention and other sentence completion methods
The experimental results in Table 1 show that, with identical training and test data, the method of the invention achieves a clear improvement on the MR and H@10 metrics compared with the methods that do not use the present invention.
The above is a preferred embodiment of the present invention, and the present invention should not be limited to the content disclosed in the embodiment and the accompanying drawings. Any equivalent implementation or modification completed without departing from the spirit of the present disclosure falls within the scope of protection of the present invention.

Claims (1)

1. A sentence entity completion method fusing a triple knowledge base, characterized in that: the related definitions are given first, as follows:
Definition 1: entity, an identifier that represents a specific thing;
Definition 2: relation, an identifier that represents the connection between the things represented by two entities;
Definition 3: triple, a structure composed of two entities and the relation between them; wherein the two entities form an entity pair;
Definition 4: triple knowledge base, a set composed of a large number of triples;
wherein the set of all entities appearing in the triples is called the entity set, and the set of all relations is called the relation set;
Definition 5: sentence entity, a noun in a sentence that can be matched to an entity;
Definition 6: sentence template, the content that remains after deleting any two sentence entities from a sentence;
Definition 7: named entity recognition method, a method that takes a sentence as input and returns its sentence entities;
Definition 8: entity linking method, a method that takes a sentence entity as input and finds the entity it can be matched to;
Definition 9: triple connection rule, i.e., if the second entity of one triple is identical to the first entity of another triple, the two triples can be connected through that shared entity;
Definition 10: relation path, i.e., the set of relations contained in a chain of triples connected according to the triple connection rule;
wherein the triple connection rule is as in Definition 9;
the concrete steps of the sentence entity completion method fusing a triple knowledge base are:
Step 1: build the data set used for model training;
given a sentence set, perform steps 1.1-1.3 for each sentence in the sentence set:
Step 1.1: extract all sentence entities in the sentence using the named entity recognition method; the resulting sentence entities form the set N;
wherein sentence entity is as in Definition 5 and the named entity recognition method is as in Definition 7;
Step 1.2: pair up the sentence entities in the N obtained in step 1.1 by traversing all combinations; every two matched sentence entities form a sentence entity pair;
Step 1.3: for each sentence entity pair obtained in step 1.2, perform steps 1.3.1-1.3.2:
Step 1.3.1: delete the two sentence entities of the pair from the sentence to obtain a sentence template;
wherein sentence template is as in Definition 6;
Step 1.3.2: use the entity linking method to find the two entities in the entity set that match the two sentence entities of the pair, and form an entity pair; then traverse all triples in the triple knowledge base to find a relation path that connects the two entities of the entity pair;
wherein the entity linking method is as in Definition 8, the triple knowledge base is as in Definition 4, and the relation path is as in Definition 10;
Step 2: represent entities, relations, and sentence templates as vectors;
wherein entity is as in Definition 1, relation is as in Definition 2, and sentence template is as in Definition 6;
Step 2.1: randomly initialize a vector for each entity in the entity set, the set of all resulting vectors being denoted E, and randomly initialize a vector for each relation in the relation set, the set of all resulting vectors being denoted R;
Step 2.2: substitute every triple in the triple knowledge base into the scoring formula in turn to compute L_k^i, then sum L_k^i over all triples to obtain L_k;
wherein e_i^1 denotes the first entity of the i-th triple, e_i^2 the second entity, r_i the relation of the i-th triple, and E(e_i^1), E(e_i^2) and E(r_i) the vectors randomly initialized in step 2.1 for these two entities and this relation;
Step 2.3: for each sentence template obtained in step 1.3.1, perform steps 2.3.1-2.3.2:
Step 2.3.1: feed the i-th sentence template s_i into the sequence-based neural network model; its output is the vector representation of the template, denoted f(s_i);
Step 2.3.2: by Definition 6, each sentence template has a corresponding sentence entity pair; substitute the two entities of the entity pair obtained for this sentence template in step 1.3.2, together with the f(s_i) obtained in step 2.3.1, into formula (1) to compute L_s^i;
wherein E(w_i^1) and E(w_i^2) denote the vectors initialized in step 2.1 for the two entities of the entity pair, and ‖·‖ denotes the two-norm;
Step 2.4: sum all L_s^i obtained in step 2.3 to obtain L_s;
Step 2.5: for every sentence template s_i obtained in step 1.3.1, substitute the vector f(s_i) obtained in step 2.3.1 and the relation path obtained in step 1.3.2 for the sentence entity pair corresponding to this template into formula (2) to compute L_p^i;
wherein E(r_j) denotes the vector initialized in step 2.1 for each relation on this relation path, and Σ denotes summation; then sum L_p^i over all sentence templates to obtain L_p;
Step 2.6: sum the L_k obtained in step 2.2, the L_s obtained in step 2.4, and the L_p obtained in step 2.5 to obtain the optimization objective function L;
Step 2.7: use a gradient descent algorithm to optimize the entity vectors, the relation vectors, and the parameters of the sequence-based neural network model that appear in the objective function L so that L is minimized; after optimization, the optimal entity vectors, relation vectors, and parameters of the sequence-based neural network model are obtained;
Step 3: complete the entity word in the sentence, which specifically includes the following sub-steps:
Step 3.1: the user provides a sentence s in which an entity word needs to be completed; extract all sentence entities in s using the named entity recognition method; the resulting sentence entities form the set E_1;
wherein the named entity recognition method is as in Definition 7;
Step 3.2: for each sentence entity in E_1, perform steps 3.2.1-3.2.6, where the i-th sentence entity is denoted w_i:
Step 3.2.1: combine the sentence entity w_i with the sentence entity w to be completed to form a sentence entity pair;
Step 3.2.2: delete w_i from s to obtain a sentence template;
wherein sentence template is as in Definition 6;
Step 3.2.3: feed the sentence template and the optimal parameters of the sequence-based neural network model obtained in step 2.7 into the sequence-based neural network model to obtain the vector representing the sentence template;
Step 3.2.4: use the entity linking method to find the entity e_i in the entity set that matches the sentence entity w_i;
Step 3.2.5: substitute the entity e_i obtained in step 3.2.4 and the template vector obtained in step 3.2.3 into formula (3) to compute the vector E_i(e);
wherein E(e_i) is the vector obtained for e_i in step 2.7;
Step 3.2.6: use a similarity formula to compute the similarity between the vector E_i(e) and the vector of every entity obtained in step 2.7, sort all entities by this similarity in descending order, and record the rank of each entity;
Step 3.3: for each entity in the entity set, sum the ranks it received in step 3.2.6 to obtain its rank sum;
Step 3.4: sort the entities of the entity set by the rank sum obtained in step 3.3 in ascending order, return the sentence entity corresponding to the top-ranked entity, and use it to complete the sentence s given by the user in step 3.1;
wherein sentence entity is as in Definition 5.
CN201810328826.9A 2018-04-13 2018-04-13 Sentence entity completion method fusing a triple knowledge base Pending CN108563637A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810328826.9A CN108563637A (en) Sentence entity completion method fusing a triple knowledge base

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810328826.9A CN108563637A (en) Sentence entity completion method fusing a triple knowledge base

Publications (1)

Publication Number Publication Date
CN108563637A true CN108563637A (en) 2018-09-21

Family

ID=63534799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810328826.9A Pending CN108563637A (en) Sentence entity completion method fusing a triple knowledge base

Country Status (1)

Country Link
CN (1) CN108563637A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1049532A (en) * 1996-07-31 1998-02-20 A T R Onsei Honyaku Tsushin Kenkyusho:Kk Example machine translation system
CN102262658A (en) * 2011-07-13 2011-11-30 东北大学 Method for extracting web data from bottom to top based on entity
CN105808688A (en) * 2016-03-02 2016-07-27 百度在线网络技术(北京)有限公司 Complementation retrieval method and device based on artificial intelligence
CN107357787A (en) * 2017-07-26 2017-11-17 微鲸科技有限公司 Semantic interaction method, apparatus and electronic equipment
CN107491500A (en) * 2017-07-28 2017-12-19 中国人民大学 A kind of knowledge base complementing method of strong adaptability

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIAOCHI WEI et al.: "I Know What You Want to Express: Sentence Element Inference by Incorporating External Knowledge Base", IEEE Transactions on Knowledge and Data Engineering *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109918640A (en) * 2018-12-22 2019-06-21 浙江工商大学 A kind of Chinese text proofreading method of knowledge based map
CN109918640B (en) * 2018-12-22 2023-05-02 浙江工商大学 Chinese text proofreading method based on knowledge graph
CN111858867A (en) * 2019-04-30 2020-10-30 广东小天才科技有限公司 Incomplete corpus completion method and device
CN110263324A (en) * 2019-05-16 2019-09-20 华为技术有限公司 Text handling method, model training method and device
US20220147715A1 (en) * 2019-05-16 2022-05-12 Huawei Technologies Co., Ltd. Text processing method, model training method, and apparatus
CN114090722A (en) * 2022-01-19 2022-02-25 支付宝(杭州)信息技术有限公司 Method and device for automatically completing query content

Similar Documents

Publication Publication Date Title
CN109902171B (en) Text relation extraction method and system based on hierarchical knowledge graph attention model
CN104615767B (en) Training method, search processing method and the device of searching order model
CN106202032B (en) A kind of sentiment analysis method and its system towards microblogging short text
CN106294593B (en) In conjunction with the Relation extraction method of subordinate clause grade remote supervisory and semi-supervised integrated study
CN108563637A (en) A kind of sentence entity complementing method of fusion triple knowledge base
CN107273913B (en) Short text similarity calculation method based on multi-feature fusion
CN106649275A (en) Relation extraction method based on part-of-speech information and convolutional neural network
CN106855853A (en) Entity relation extraction system based on deep neural network
CN106055604B (en) Word-based network carries out the short text topic model method for digging of feature extension
CN109101479A (en) A kind of clustering method and device for Chinese sentence
CN109902159A (en) A kind of intelligent O&M statement similarity matching process based on natural language processing
WO2020063092A1 (en) Knowledge graph processing method and apparatus
CN111310438A (en) Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model
CN108763376B (en) Knowledge representation learning method for integrating relationship path, type and entity description information
CN108280064A (en) Participle, part-of-speech tagging, Entity recognition and the combination treatment method of syntactic analysis
CN111639252A (en) False news identification method based on news-comment relevance analysis
CN110175221B (en) Junk short message identification method by combining word vector with machine learning
CN110675269B (en) Text auditing method and device
CN111651566B (en) Multi-task small sample learning-based referee document dispute focus extraction method
CN112100365A (en) Two-stage text summarization method
CN107679225A (en) A kind of reply generation method based on keyword
CN108073576A (en) Intelligent search method, searcher and search engine system
CN109117474A (en) Calculation method, device and the storage medium of statement similarity
CN107092605A (en) A kind of entity link method and device
CN105446955A (en) Adaptive word segmentation method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180921

WD01 Invention patent application deemed withdrawn after publication