CN109766453A - Method and system for semantic understanding of user corpus - Google Patents

Method and system for semantic understanding of user corpus

Info

Publication number
CN109766453A
Authority
CN
China
Prior art keywords
user
corpus
participle
semantic
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910046978.4A
Other languages
Chinese (zh)
Inventor
魏誉荧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Genius Technology Co Ltd
Original Assignee
Guangdong Genius Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Genius Technology Co Ltd filed Critical Guangdong Genius Technology Co Ltd
Priority to CN201910046978.4A priority Critical patent/CN109766453A/en
Publication of CN109766453A publication Critical patent/CN109766453A/en
Pending legal-status Critical Current

Landscapes

  • Machine Translation (AREA)

Abstract

The present invention provides a method and system for semantic understanding of user corpus. The method includes: establishing a knowledge graph; obtaining corpus samples; performing part-of-speech tagging and syntactic annotation on the corpus samples; extracting main words and associated words from the corpus samples according to the part-of-speech tagging and the syntactic annotation; generating semantic slots according to the part-of-speech tagging and the main words; matching the main words against the knowledge graph to obtain the connection relationship between the semantic slots; generating regular expressions according to the semantic slots, the connection relationship, and the associated words; obtaining a user corpus; and comparing the user corpus with the semantic slots and the regular expressions to parse out the corresponding user semantics. The invention parses the obtained user corpus on the basis of the knowledge graph to obtain the corresponding user semantics.

Description

Method and system for semantic understanding of user corpus
Technical field
The present invention relates to the field of language processing technology, and in particular to a method and system for semantic understanding of user corpus.
Background
With the rapid development of networks, intelligent information processing is becoming increasingly common. Computers, smart devices, and the like may need to process thousands of pieces of information every day. A smart device generally analyzes corpora to obtain corresponding regular expressions and then uses them to parse corpora.
However, in semantic parsing, current word segmentation technology extracts multiple mutually independent words and cannot determine the logical relationships between them. This confuses the semantic parsing logic, so the semantics of a user corpus cannot be parsed correctly.
Therefore, a method and system for semantic understanding of user corpus is needed that parses a user corpus to obtain the most likely corresponding user semantics.
Summary of the invention
The object of the present invention is to provide a method and system for semantic understanding of user corpus that parses the obtained user corpus on the basis of a knowledge graph to obtain the corresponding user semantics.
The technical solution provided by the present invention is as follows:
The present invention provides a method for semantic understanding of user corpus, comprising:
establishing a knowledge graph;
obtaining corpus samples;
performing part-of-speech tagging and syntactic annotation on the corpus samples;
extracting main words and associated words from the corpus samples according to the part-of-speech tagging and the syntactic annotation;
generating semantic slots according to the part-of-speech tagging and the main words;
matching the main words against the knowledge graph to obtain the connection relationship between the semantic slots;
generating regular expressions according to the semantic slots, the connection relationship, and the associated words;
obtaining a user corpus;
comparing the user corpus with the semantic slots and the regular expressions to parse out the corresponding user semantics.
Further, establishing the knowledge graph specifically includes:
obtaining knowledge points and the association relationships corresponding to the knowledge points;
establishing the knowledge graph according to the knowledge points and the hierarchical relationship.
Further, matching the main words against the knowledge graph to obtain the connection relationship between the semantic slots specifically includes:
matching the main words against the knowledge graph to obtain the corpus knowledge points and the corpus hierarchical relationship corresponding to the main words;
obtaining the connection relationship between the semantic slots according to the corpus knowledge points and the corpus hierarchical relationship.
Further, generating regular expressions according to the semantic slots, the connection relationship, and the associated words specifically includes:
generating, according to the semantic slots, the connection relationship, and the associated words, multiple regular expressions that differ in sentence pattern but are semantically identical;
obtaining the logical relationship between the semantic slots according to the multiple regular expressions.
Further, comparing the user corpus with the semantic slots and the regular expressions to parse out the corresponding user semantics specifically includes:
segmenting the user corpus using word segmentation technology to obtain the corresponding user word segments and their parts of speech;
comparing the user word segments and their parts of speech with the semantic slots to obtain the segment connection relationship between the user word segments;
generating a corresponding user regular expression from the user word segments and their parts of speech, and comparing the user regular expression with the regular expressions to obtain the segment logical relationship between the user word segments;
parsing the user corpus according to the user word segments, their parts of speech, the segment connection relationship, and the segment logical relationship to obtain the corresponding user semantics.
The present invention also provides a system for semantic understanding of user corpus, comprising:
a graph establishing module, which establishes a knowledge graph;
a sample acquisition module, which obtains corpus samples;
an annotation module, which performs part-of-speech tagging and syntactic annotation on the corpus samples obtained by the sample acquisition module;
an extraction module, which extracts main words and associated words from the corpus samples according to the part-of-speech tagging and the syntactic annotation produced by the annotation module;
a semantic slot generation module, which generates semantic slots according to the part-of-speech tagging produced by the annotation module and the main words obtained by the extraction module;
a matching module, which matches the main words obtained by the extraction module against the knowledge graph established by the graph establishing module to obtain the connection relationship between the semantic slots;
a regular expression generation module, which generates regular expressions according to the semantic slots generated by the semantic slot generation module, the connection relationship obtained by the matching module, and the associated words obtained by the extraction module;
a corpus acquisition module, which obtains a user corpus;
a parsing module, which compares the user corpus obtained by the corpus acquisition module with the semantic slots generated by the semantic slot generation module and the regular expressions generated by the regular expression generation module to parse out the corresponding user semantics.
Further, the graph establishing module specifically includes:
an acquiring unit, which obtains knowledge points and the association relationships corresponding to the knowledge points;
a graph establishing unit, which establishes the knowledge graph according to the knowledge points obtained by the acquiring unit and the hierarchical relationship.
Further, the matching module specifically includes:
a matching unit, which matches the main words obtained by the extraction module against the knowledge graph established by the graph establishing module to obtain the corpus knowledge points and the corpus hierarchical relationship corresponding to the main words;
an analysis unit, which obtains the connection relationship between the semantic slots according to the corpus knowledge points and the corpus hierarchical relationship obtained by the matching unit.
Further, the regular expression generation module specifically includes:
a regular expression generation unit, which generates, according to the semantic slots generated by the semantic slot generation module, the connection relationship obtained by the matching module, and the associated words obtained by the extraction module, multiple regular expressions that differ in sentence pattern but are semantically identical;
a processing unit, which obtains the logical relationship between the semantic slots according to the multiple regular expressions generated by the regular expression generation unit.
Further, the parsing module specifically includes:
a word segmentation unit, which segments the user corpus obtained by the corpus acquisition module using word segmentation technology to obtain the corresponding user word segments and their parts of speech;
a comparison unit, which compares the user word segments and their parts of speech obtained by the word segmentation unit with the semantic slots generated by the semantic slot generation module to obtain the segment connection relationship between the user word segments;
the comparison unit further generates a corresponding user regular expression from the user word segments and their parts of speech obtained by the word segmentation unit, and compares the user regular expression with the regular expressions generated by the regular expression generation module to obtain the segment logical relationship between the user word segments;
a parsing unit, which parses the user corpus according to the user word segments and their parts of speech obtained by the word segmentation unit and the segment connection relationship and segment logical relationship obtained by the comparison unit, to obtain the corresponding user semantics.
The method and system for semantic understanding of user corpus provided by the present invention can bring at least one of the following beneficial effects:
1. In the present invention, a knowledge graph is established from knowledge points, semantic slots and regular expressions are obtained from corpus samples in combination with the knowledge graph information, and the user corpus is then parsed according to the semantic slots and regular expressions to obtain the corresponding user semantics.
2. In the present invention, the corresponding knowledge graph is established according to knowledge points and association relationships. It expresses the systematic structure of the knowledge points clearly and accurately, which makes it easy for users to organize and understand them, and also makes it easy to clarify the logical relationships between user word segments in the user corpus.
Brief description of the drawings
Preferred embodiments are described below in a clear and easily understood manner with reference to the accompanying drawings, to further explain the above characteristics, technical features, advantages, and implementations of the method and system for semantic understanding of user corpus.
Fig. 1 is a flowchart of one embodiment of the method for semantic understanding of user corpus of the present invention;
Fig. 2 is a flowchart of another embodiment of the method for semantic understanding of user corpus of the present invention;
Fig. 3 is a flowchart of another embodiment of the method for semantic understanding of user corpus of the present invention;
Fig. 4 is a flowchart of another embodiment of the method for semantic understanding of user corpus of the present invention;
Fig. 5 is a flowchart of another embodiment of the method for semantic understanding of user corpus of the present invention;
Fig. 6 is a schematic structural diagram of one embodiment of the system for semantic understanding of user corpus of the present invention.
Description of reference numerals:
100 system for semantic understanding of user corpus
110 graph establishing module; 111 acquiring unit; 112 graph establishing unit
120 sample acquisition module
130 annotation module; 140 extraction module
150 semantic slot generation module
160 matching module; 161 matching unit; 162 analysis unit
170 regular expression generation module; 171 regular expression generation unit; 172 processing unit
180 corpus acquisition module
190 parsing module; 191 word segmentation unit; 192 comparison unit; 193 parsing unit
Detailed description of the embodiments
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, specific embodiments of the present invention are described below with reference to the accompanying drawings. Obviously, the drawings in the following description are only some embodiments of the present invention. Those of ordinary skill in the art can obtain other drawings, and other embodiments, from these drawings without creative effort.
To keep the drawings simple, each figure only schematically shows the parts relevant to the present invention; they do not represent its actual structure as a product. In addition, to keep the drawings simple and easy to understand, where several components in a figure have the same structure or function, only one of them is schematically drawn or labeled. Herein, "one" means not only "only one" but can also mean "more than one".
One embodiment of the present of invention, as shown in Figure 1, a kind of method of user's corpus semantic understanding, comprising:
S100 establishes knowledge mapping.
Specifically, knowledge mapping is also known as mapping knowledge domains, it is the one of explicit knowledge's development process and structural relation The a variety of different figures of series describe knowledge resource and its carrier, excavation, analysis, building, drafting and display with visualization technique Knowledge and connecting each other between them.
Corresponding knowledge mapping is established, the corresponding one big knowledge point of each node or general knowledge point in knowledge mapping are known Knowing point includes many specific physical contents, for example, the physical contents for including in knowledge point " trigonometric function " have trigonometric function general The study stage that explanation, the explanation of trigonometric function course, trigonometric function exercise etc. and trigonometric function are related to is read, for example just It is middle and high medium.
S200: Obtain corpus samples.
Specifically, a large number of corpus samples are obtained. A corpus sample may be standard written text, or it may be user speech, audio, and the like, because both voice input and text input are mainstream interaction modes in human-computer interaction.
In addition, since the entire parsing process operates on written text, if what is collected is a speech file such as user speech or audio, the speech file must first be converted into recognized text, and the recognized text is then processed accordingly.
S300: Perform part-of-speech tagging and syntactic annotation on the corpus samples.
Specifically, each corpus sample is segmented using word segmentation technology, and the part of speech of each word segment in every sentence of the corpus sample is identified and tagged. The syntactic structure of every sentence is then analyzed to obtain the relationships between word segments, which are recorded as the syntactic annotation.
For example, suppose a corpus sample is "why can whales spray water". The part of speech of "whale" is noun, that of "why" is pronoun, that of "can" is verb, and that of "spray water" is verb. Analyzing the syntactic structure shows that "whale" and "spray water" are in a subject-predicate relationship, "why" and "spray water" are in an adverbial-head relationship, and "can" and "spray water" are in an adverbial-head relationship.
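The annotated sample in this example can be held as plain data. The following is a minimal sketch, not the patent's implementation: the `Token` class, its field names, and the relation labels are hypothetical, and the annotations are written by hand rather than produced by a real tagger or parser.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    text: str                # the word segment
    pos: str                 # part-of-speech tag
    head: Optional[str]      # word segment this one attaches to (None for the root)
    relation: Optional[str]  # syntactic relation to the head


# Hand-written annotation of the sample "why can whales spray water".
SAMPLE = [
    Token("whale", "noun", "spray water", "subject-predicate"),
    Token("why", "pronoun", "spray water", "adverbial-head"),
    Token("can", "verb", "spray water", "adverbial-head"),
    Token("spray water", "verb", None, None),
]


def syntax_annotations(tokens):
    """Return (dependent, relation, head) triples for every non-root token."""
    return [(t.text, t.relation, t.head) for t in tokens if t.head is not None]
```

Keeping the part of speech and the syntactic relation on the same record means the later extraction step can test both with a single pass over the tokens.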
S400: Extract the main words and associated words from the corpus samples according to the part-of-speech tagging and the syntactic annotation.
Specifically, the main words and associated words in a corpus sample are extracted by combining the part-of-speech tagging and the syntactic annotation. For example, a noun whose syntactic annotation is a subject-predicate relationship may be extracted as a main word, or a pronoun whose syntactic annotation is an adverbial-head relationship may be extracted as an associated word. These examples are given only to aid understanding; in practice the extraction rules for main words and associated words are not limited to these two.
For example, take the corpus sample "why can whales spray water", in which "whale" is a noun, "why" is a pronoun, "can" is a verb, and "spray water" is a verb; "whale" and "spray water" are in a subject-predicate relationship, and "why" and "can" are each in an adverbial-head relationship with "spray water". "Whale" and "spray water" are extracted as main words, and "why" is extracted as the associated word.
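The two illustrative rules can be sketched as a small filter over annotated word segments. This is only a sketch under stated assumptions: the tuple layout and the `extract` function are hypothetical, and treating the predicate of a subject noun as a main word is an assumption made so that the result agrees with the example ("whale" and "spray water").

```python
# Each annotated word segment: (text, part of speech, relation to head, head).
SAMPLE = [
    ("whale", "noun", "subject-predicate", "spray water"),
    ("why", "pronoun", "adverbial-head", "spray water"),
    ("can", "verb", "adverbial-head", "spray water"),
    ("spray water", "verb", None, None),
]


def extract(tokens):
    """Split annotated tokens into main words and associated words."""
    main_words, associated_words = [], []
    for text, pos, relation, head in tokens:
        if pos == "noun" and relation == "subject-predicate":
            # Assumed rule: the subject noun and its predicate are main words.
            main_words.append(text)
            if head is not None and head not in main_words:
                main_words.append(head)
        elif pos == "pronoun" and relation == "adverbial-head":
            # Assumed rule: a pronoun in an adverbial-head relation is an associated word.
            associated_words.append(text)
    return main_words, associated_words
```

As the text notes, real extraction rules are not limited to these two, so a fuller version would likely take the rules as data rather than hard-coding them.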
S500: Generate semantic slots according to the part-of-speech tagging and the main words.
Specifically, semantic slots are generated from the part-of-speech tagging and the main words. For example, for the corpus sample "why can whales spray water", in which "whale" is a noun, "why" is a pronoun, "can" is a verb, and "spray water" is a verb, "whale" and "spray water" are extracted as main words. The entity content of the generated noun semantic slot is "whale", and the entity content of the generated verb semantic slot is "spray water".
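One simple way to picture this step is to group the main words by part of speech, so that each part of speech becomes a slot and the main words become its entity contents. The sketch below assumes exactly that; `build_slots` and `POS_TAGS` are hypothetical names, not part of the patent.

```python
from collections import defaultdict


def build_slots(main_words, pos_tags):
    """Group main words into semantic slots keyed by part of speech."""
    slots = defaultdict(list)
    for word in main_words:
        slots[pos_tags[word]].append(word)
    return dict(slots)


# Part-of-speech tags for the sample "why can whales spray water".
POS_TAGS = {"whale": "noun", "why": "pronoun", "can": "verb", "spray water": "verb"}
SLOTS = build_slots(["whale", "spray water"], POS_TAGS)
```

Because only main words are passed in, "can" does not enter the verb slot even though it is tagged as a verb.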
S600: Match the main words against the knowledge graph to obtain the connection relationship between the semantic slots.
Specifically, the main words obtained by the analysis are matched one by one against the established knowledge graph. If a match is found, the connection relationship between the semantic slots corresponding to the main words is obtained from the positions of the matched main words in the knowledge graph.
S700: Generate regular expressions according to the semantic slots, the connection relationship, and the associated words.
Specifically, the regular expression corresponding to a corpus sample is generated from the semantic slots, the connection relationship, and the associated words. For example, take the corpus sample "why can whales spray water", in which "whale" is a noun, "why" is a pronoun, "can" is a verb, and "spray water" is a verb; "whale" and "spray water" are in a subject-predicate relationship, and "why" and "can" are each in an adverbial-head relationship with "spray water". "Whale" and "spray water" are extracted as main words, and "why" is the associated word. The resulting regular expression is: ##noun library##[why][can]##verb library##.
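The template notation ##noun library##[why][can]##verb library## is the patent's own; how it maps onto an executable pattern is not specified. The sketch below shows one possible mapping onto Python's `re` module, in which each slot becomes a named group of alternatives. The `build_pattern` function, the `LIBRARIES` contents, and the whitespace handling are assumptions, and the word order follows the template rather than natural English.

```python
import re


def build_pattern(template, libraries):
    """Turn a slot template into a compiled regular expression.

    Template items are ("slot", name) or ("literal", text).
    """
    parts = []
    for kind, value in template:
        if kind == "slot":
            # Longest alternatives first so longer entries are tried first.
            words = sorted(libraries[value], key=len, reverse=True)
            alternatives = "|".join(re.escape(w) for w in words)
            parts.append(f"(?P<{value}>{alternatives})")
        else:
            parts.append(re.escape(value))
    # Allow optional whitespace between template items.
    return re.compile(r"\s*".join(parts))


# Hypothetical slot libraries; only "whale" and "spray water" come from the example.
LIBRARIES = {"noun": ["whale", "dolphin"], "verb": ["spray water", "swim"]}
TEMPLATE = [("slot", "noun"), ("literal", "why"), ("literal", "can"), ("slot", "verb")]
PATTERN = build_pattern(TEMPLATE, LIBRARIES)
```

With this mapping, `PATTERN.fullmatch("whale why can spray water")` fills the `noun` and `verb` groups, which is one way the later comparison step could read slot contents out of a user corpus.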
S800: Obtain a user corpus.
Specifically, a user corpus is obtained. When a smart device obtains a user corpus, voice input and text input are both mainstream interaction modes, but whatever form the obtained user corpus takes, the system ultimately processes text. Therefore, if speech is received, it must first be converted into text.
S900: Compare the user corpus with the semantic slots and the regular expressions to parse out the corresponding user semantics.
Specifically, the obtained user corpus is analyzed and then compared one by one with the generated semantic slots and regular expressions to obtain the relationships between the words in the user corpus, and the corresponding user semantics are parsed out.
In this embodiment, a knowledge graph is established from knowledge points, semantic slots and regular expressions are obtained from corpus samples in combination with the knowledge graph information, and the user corpus is then parsed according to the semantic slots and regular expressions to obtain the corresponding user semantics.
Another embodiment of the present invention is a preferred embodiment of the above embodiment. As shown in Fig. 2, it includes:
S100: Establish a knowledge graph.
Step S100, establishing the knowledge graph, specifically includes:
S110: Obtain knowledge points and the association relationships corresponding to the knowledge points.
Specifically, knowledge points are obtained. Knowledge points cover all kinds of knowledge, such as learning, general knowledge, and animals, and each category also contains many sub-knowledge points. For example, Chinese includes Tang poetry, Song ci, Yuan qu, classical Chinese, modern poetry, and so on; Tang poetry includes five-character regulated verse, seven-character regulated verse, and so on; and Tang poetry can further be divided into lyrical poems, poems that use objects as metaphors for people, and so on.
A knowledge graph is composed of triples. A triple can be understood simply as (entity, association relationship, entity). If each entity is regarded as a node and each association relationship (including attributes, categories, and so on) as an edge, then a knowledge base containing a large number of triples constitutes a huge knowledge graph.
For example, the relationships between a subject and its knowledge points can be expressed as (Chinese, inclusion, Tang poetry), (Chinese, inclusion, Song ci), (Chinese, inclusion, Yuan qu), (Chinese, inclusion, classical Chinese), and so on. Within the subject of Chinese, these knowledge points are coordinate with one another, but each knowledge point in turn contains multiple sub-knowledge points, for example (Tang poetry, inclusion, five-character regulated verse) and (Tang poetry, inclusion, seven-character regulated verse).
Therefore, to construct the corresponding knowledge graph, the association relationships between all knowledge points must also be obtained. The association relationships include the connection relationships and hierarchical relationships between knowledge points. For example, when two knowledge points are in an inclusion relationship, the containing knowledge point is at a higher level than the contained one; when two knowledge points are in a coordinate relationship, they are at the same level.
S120: Establish the knowledge graph according to the knowledge points and the hierarchical relationship.
Specifically, the corresponding knowledge graph is established according to the knowledge points and the association relationships. Each knowledge point in the knowledge graph is a connecting node, and two knowledge points that have an association relationship are connected by a line. The knowledge graph can therefore express the systematic structure of the knowledge points clearly and accurately, which makes it easy for users to organize and understand them.
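Steps S110 and S120 can be illustrated by building a graph from the triples in the example and deriving each knowledge point's level from the inclusion edges. This is a minimal sketch, assuming an adjacency-list representation and a single root; `build_graph`, `levels`, and the relation label "includes" are hypothetical names.

```python
from collections import defaultdict, deque

# Triples from the example: (entity, association relationship, entity).
TRIPLES = [
    ("Chinese", "includes", "Tang poetry"),
    ("Chinese", "includes", "Song ci"),
    ("Chinese", "includes", "Yuan qu"),
    ("Chinese", "includes", "classical Chinese"),
    ("Tang poetry", "includes", "five-character regulated verse"),
    ("Tang poetry", "includes", "seven-character regulated verse"),
]


def build_graph(triples):
    """Adjacency list: knowledge point -> list of (relation, neighbor)."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
        graph.setdefault(tail, [])  # make sure leaf nodes are present too
    return dict(graph)


def levels(graph, root):
    """Breadth-first depth of every knowledge point reachable via 'includes'."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for relation, child in graph[node]:
            if relation == "includes" and child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth


GRAPH = build_graph(TRIPLES)
LEVELS = levels(GRAPH, "Chinese")
```

A smaller depth value here corresponds to a higher level in the text's sense: the containing knowledge point sits above the contained one, and coordinate knowledge points such as "Tang poetry" and "Song ci" share the same depth.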
S200: Obtain corpus samples.
S300: Perform part-of-speech tagging and syntactic annotation on the corpus samples.
S400: Extract the main words and associated words from the corpus samples according to the part-of-speech tagging and the syntactic annotation.
S500: Generate semantic slots according to the part-of-speech tagging and the main words.
S600: Match the main words against the knowledge graph to obtain the connection relationship between the semantic slots.
S700: Generate regular expressions according to the semantic slots, the connection relationship, and the associated words.
S800: Obtain a user corpus.
S900: Compare the user corpus with the semantic slots and the regular expressions to parse out the corresponding user semantics.
In this embodiment, the corresponding knowledge graph is established according to knowledge points and association relationships. It expresses the systematic structure of the knowledge points clearly and accurately, which makes it easy for users to organize and understand them, and also makes it easy to clarify the logical relationships between user word segments in the user corpus.
Another embodiment of the present invention is a preferred embodiment of the above embodiment. As shown in Fig. 3, it includes:
S100: Establish a knowledge graph.
Step S100, establishing the knowledge graph, specifically includes:
S110: Obtain knowledge points and the association relationships corresponding to the knowledge points.
S120: Establish the knowledge graph according to the knowledge points and the hierarchical relationship.
S200: Obtain corpus samples.
S300: Perform part-of-speech tagging and syntactic annotation on the corpus samples.
S400: Extract the main words and associated words from the corpus samples according to the part-of-speech tagging and the syntactic annotation.
S500: Generate semantic slots according to the part-of-speech tagging and the main words.
S600: Match the main words against the knowledge graph to obtain the connection relationship between the semantic slots.
Step S600, matching the main words against the knowledge graph to obtain the connection relationship between the semantic slots, specifically includes:
S610: Match the main words against the knowledge graph to obtain the corpus knowledge points and the corpus hierarchical relationship corresponding to the main words.
Specifically, the main words obtained by the above analysis are matched one by one against the established knowledge graph. If a match is found, the knowledge point in the knowledge graph that matches a main word is recorded as a corpus knowledge point, and the hierarchical relationship of that corpus knowledge point within the knowledge graph is obtained from the structure of the knowledge graph and recorded as the corpus hierarchical relationship.
S620: Obtain the connection relationship between the semantic slots according to the corpus knowledge points and the corpus hierarchical relationship.
Specifically, the connection relationship between the corresponding main words is obtained from the corpus knowledge points and their corpus hierarchical relationship in the knowledge graph, which in turn gives the connection relationship between the semantic slots corresponding to those main words.
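Steps S610 and S620 can be sketched as a lookup followed by a comparison of levels. The text does not say how main words are matched to knowledge points or how connection relationships are labeled, so the sketch below assumes exact name matching, a parent map, and the hypothetical labels "contains", "contained by", "coordinate", and "other".

```python
# Hypothetical parent map for a small knowledge graph: child -> containing knowledge point.
PARENT = {
    "Tang poetry": "Chinese",
    "Song ci": "Chinese",
    "five-character regulated verse": "Tang poetry",
}
KNOWLEDGE_POINTS = set(PARENT) | set(PARENT.values())


def match(main_words):
    """S610 (assumed exact matching): keep main words that are knowledge points."""
    return [word for word in main_words if word in KNOWLEDGE_POINTS]


def level(point):
    """Number of inclusion steps from the root down to this knowledge point."""
    depth = 0
    while point in PARENT:
        point = PARENT[point]
        depth += 1
    return depth


def connection(a, b):
    """S620 (assumed labels): connection relationship between two corpus knowledge points."""
    if PARENT.get(b) == a:
        return "contains"
    if PARENT.get(a) == b:
        return "contained by"
    if a != b and level(a) == level(b):
        return "coordinate"
    return "other"
```

For instance, `connection("Chinese", "Tang poetry")` yields "contains" and `connection("Tang poetry", "Song ci")` yields "coordinate", which could then be carried over to the semantic slots that hold those main words.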
S700: Generate regular expressions according to the semantic slots, the connection relationship, and the associated words.
S800: Obtain a user corpus.
S900: Compare the user corpus with the semantic slots and the regular expressions to parse out the corresponding user semantics.
In this embodiment, the knowledge points in the established knowledge graph and the association relationships between them are used to obtain, quickly and accurately, the connection relationship between the main words and between the semantic slots in the corpus samples, which facilitates parsing the user corpus.
Another embodiment of the present invention is a preferred embodiment of the above embodiment. As shown in Fig. 4, it includes:
S100: Establish a knowledge graph.
S200: Obtain corpus samples.
S300: Perform part-of-speech tagging and syntactic annotation on the corpus samples.
S400: Extract the main words and associated words from the corpus samples according to the part-of-speech tagging and the syntactic annotation.
S500: Generate semantic slots according to the part-of-speech tagging and the main words.
S600: Match the main words against the knowledge graph to obtain the connection relationship between the semantic slots.
S700: Generate regular expressions according to the semantic slots, the connection relationship, and the associated words.
Step S700, generating regular expressions according to the semantic slots, the connection relationship, and the associated words, specifically includes:
S710: Generate, according to the semantic slots, the connection relationship, and the associated words, multiple regular expressions that differ in sentence pattern but are semantically identical.
Specifically, multiple regular expressions that differ in sentence pattern but are semantically identical are generated from the semantic slots and the connection relationship by transforming the position of the associated word. For example, take the corpus sample "why can whales spray water", in which "whale" is a noun, "why" is a pronoun, "can" is a verb, and "spray water" is a verb; "whale" and "spray water" are in a subject-predicate relationship, and "why" and "can" are each in an adverbial-head relationship with "spray water". "Whale" and "spray water" are extracted as main words, and "why" is the associated word. The resulting regular expression is: ##noun library##[why][can]##verb library##. Moving the associated word "why" to adjust the structure generates another regular expression with a different sentence pattern but the same semantics: [why]##noun library##[can]##verb library##.
S720: Obtain the logical relationship between the semantic slots according to the multiple regular expressions.
Specifically, the logical relationship between the semantic slots is analyzed from the resulting regular expressions. For example, for the corpus sample "why can whales spray water", the resulting regular expression is ##noun library##[why][can]##verb library##, and moving the associated word "why" generates another regular expression with a different sentence pattern but the same semantics: [why]##noun library##[can]##verb library##. Adjusting the position of the associated word does not affect the semantics of the corpus sample, while the relative position of the noun semantic slot and the verb semantic slot stays the same throughout. The logical relationship between the noun semantic slot and the verb semantic slot is obtained from this.
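The reasoning in S710 and S720 can be checked mechanically: generate the variant by moving the associated word, then confirm that the slot placeholders keep the same relative order in every variant. The sketch below represents each template as a list of items in the patent's notation; the function names are hypothetical, and "moving the associated word to the front" is only the one transformation shown in the example.

```python
NOUN_SLOT = "##noun library##"
VERB_SLOT = "##verb library##"


def move_to_front(template, associated):
    """Return a variant of the template with the associated word moved to the front."""
    return [associated] + [item for item in template if item != associated]


def slot_order(template):
    """Keep only the slot placeholders, preserving their order."""
    return [item for item in template if item.startswith("##")]


def stable_slot_order(templates):
    """Return the shared slot order if every template agrees, otherwise None."""
    orders = [slot_order(t) for t in templates]
    return orders[0] if all(order == orders[0] for order in orders) else None


BASE = [NOUN_SLOT, "[why]", "[can]", VERB_SLOT]
VARIANTS = [BASE, move_to_front(BASE, "[why]")]
```

Here `stable_slot_order(VARIANTS)` returns the noun slot followed by the verb slot, which mirrors the observation that the relative position of the two slots is unchanged when the associated word moves.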
S800 obtains user's corpus.
S900 compares user's corpus and the semantic slot, the regular expression, and parsing obtains corresponding user It is semantic.
In the present embodiment, for corpus sample, multiple clause differences is generated by conjunctive word conversion but semanteme is identical just Then then expression formula analyzes to obtain the logical relation in corpus sample between semantic slot according to multiple regular expressions of generation.
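Steps S710 and S720 can be illustrated with a minimal Python sketch. It is not the patented implementation: the template representation (a list of "slot" and "word" items), the contents of SLOT_LIBRARIES, and the helper names to_regex, move_to_front, and match_any are all assumptions made for illustration, and English words stand in for the original Chinese tokens.

```python
import re

# Hypothetical slot libraries; the source does not list their contents.
SLOT_LIBRARIES = {
    "noun": ["whale", "dolphin"],
    "verb": ["spray water", "jump"],
}

def to_regex(template):
    """Compile a template of ("slot", name) / ("word", text) items into a regex."""
    parts = []
    for kind, value in template:
        if kind == "slot":
            options = "|".join(re.escape(w) for w in SLOT_LIBRARIES[value])
            parts.append(f"(?P<{value}>{options})")  # named group per semantic slot
        else:
            parts.append(re.escape(value))           # literal conjunctive/function word
    return re.compile(r"\s+".join(parts))

def move_to_front(template, word):
    """Produce a variant pattern by moving one conjunctive word to the front."""
    item = ("word", word)
    return [item] + [t for t in template if t != item]

# "##noun library## [why] [can] ##verb library##"
base = [("slot", "noun"), ("word", "why"), ("word", "can"), ("slot", "verb")]
# "[why] ##noun library## [can] ##verb library##"
variant = move_to_front(base, "why")

PATTERNS = [to_regex(base), to_regex(variant)]

def match_any(text):
    """Return the filled slots if any pattern matches the whole text, else None."""
    for pattern in PATTERNS:
        m = pattern.fullmatch(text)
        if m:
            return m.groupdict()
    return None
```

Both word orders fill the same two slots, which is the property the embodiment relies on when it says that moving the conjunctive word leaves the semantics unchanged.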
Another embodiment of the present invention is a preferred version of the above embodiment. As shown in Fig. 5, it comprises:
S100: establish a knowledge graph.
S200: obtain corpus samples.
S300: perform part-of-speech tagging and syntactic annotation on the corpus samples.
S400: extract the main body words and conjunctive words in the corpus samples according to the part-of-speech tags and the syntactic annotations.
S500: generate semantic slots according to the part-of-speech tags and the main body words.
S600: match the main body words against the knowledge graph to obtain the connection relationship between the semantic slots.
S700: generate regular expressions according to the semantic slots, the connection relationship, and the conjunctive words.
Step S700 of generating regular expressions according to the semantic slots, the connection relationship, and the conjunctive words specifically comprises:
S710: generate multiple regular expressions with different sentence patterns but the same semantics according to the semantic slots, the connection relationship, and the conjunctive words.
S720: obtain the logical relationship between the semantic slots according to the multiple regular expressions.
S800: obtain a user corpus.
S900: compare the user corpus with the semantic slots and the regular expressions, and parse it to obtain the corresponding user semantics.
The comparing of the user corpus with the semantic slots and the regular expressions, and the parsing to obtain the corresponding user semantics, specifically comprises:
S910: segment the user corpus using a word segmentation technique to obtain the corresponding user tokens and token parts of speech.
Specifically, the user corpus is segmented using a word segmentation technique: the part of speech of each word in every sentence of the user corpus is identified, and each sentence is then divided, according to those parts of speech, into tokens such as characters, words, and phrases. This yields the user tokens contained in the user corpus and their corresponding token parts of speech.
For example, for the user corpus "Xiao Ming not only likes blue, but also likes red", segmentation yields the user tokens "Xiao Ming", "not only", "likes", "blue", "but also", "likes", and "red". The token part of speech of "Xiao Ming", "blue", and "red" is noun, that of "not only" and "but also" is conjunction, and that of "likes" is verb.
S920: compare the user tokens and the token parts of speech with the semantic slots to obtain the token connection relationship between the user tokens.
Specifically, the user tokens and their parts of speech are compared with the semantic slots one by one. If a comparison matches, the token connection relationship between the matched user tokens is obtained from the connection relationship between the semantic slots.
S930: generate a corresponding user regular expression from the user tokens and the token parts of speech, and compare the user regular expression with the regular expressions to obtain the token logical relationship between the user tokens.
Specifically, following the method by which the above embodiment generates regular expressions from corpus samples, a user regular expression corresponding to the user corpus is generated from the user tokens and their parts of speech. The user regular expression is then compared one by one with the regular expressions derived from the corpus samples. If a comparison matches, the token logical relationship between the matched user tokens is obtained from the logical relationship between the semantic slots in the matching regular expression.
S940: parse the user corpus according to the user tokens, the token parts of speech, the token connection relationship, and the token logical relationship to obtain the corresponding user semantics.
Specifically, the user corpus is parsed according to the user tokens, token parts of speech, token connection relationship, and token logical relationship obtained above, yielding the corresponding user semantics.
In this embodiment, the semantic slots and regular expressions obtained from the corpus samples are used to analyze the token connection relationship and token logical relationship between the user tokens in the user corpus, and the corresponding user semantics is parsed from them.
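The flow of S910 to S940 can be sketched as follows. This is only an illustration under stated assumptions: the tokens are already segmented and tagged (a real system would obtain them from a word segmenter with part-of-speech tagging), the tag names, the SAMPLE_PATTERNS table, and the relation label "progression" are hypothetical, and the comparison is reduced to an exact lookup of an abstracted pattern.

```python
# Pre-segmented, pre-tagged user corpus (the result S910 would produce).
user_tokens = [
    ("Xiao Ming", "noun"), ("not only", "conj"), ("likes", "verb"), ("blue", "noun"),
    ("but also", "conj"), ("likes", "verb"), ("red", "noun"),
]

# Parts of speech that have a semantic slot; other tokens stay literal.
SLOT_POS = {"noun", "verb"}

def user_pattern(tokens):
    """S930: abstract the tokens into a pattern of slots and literal words."""
    return tuple(
        f"##{pos}##" if pos in SLOT_POS else f"[{word}]"
        for word, pos in tokens
    )

# Hypothetical patterns learned from corpus samples, each with the logical
# relationship that was derived for its semantic slots.
SAMPLE_PATTERNS = {
    ("##noun##", "[not only]", "##verb##", "##noun##",
     "[but also]", "##verb##", "##noun##"): "progression",
}

def parse(tokens):
    """S920-S940: look up the pattern and return slot fillers plus relation."""
    relation = SAMPLE_PATTERNS.get(user_pattern(tokens))
    if relation is None:
        return None  # no corpus-sample pattern matches this user corpus
    slots = [(word, pos) for word, pos in tokens if pos in SLOT_POS]
    return {"relation": relation, "slots": slots}
```

Calling parse(user_tokens) returns the slot fillers together with the relation attached to the matching pattern; an utterance whose pattern is not in the table yields None.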
One embodiment of the present of invention, as shown in fig. 6, a kind of system 100 of user's corpus semantic understanding, comprising:
Map establishes module 110, establishes knowledge mapping.
Specifically, knowledge mapping is also known as mapping knowledge domains, it is the one of explicit knowledge's development process and structural relation The a variety of different figures of series describe knowledge resource and its carrier, excavation, analysis, building, drafting and display with visualization technique Knowledge and connecting each other between them.
Map establishes module 110 and establishes corresponding knowledge mapping, the corresponding big knowledge point of each node in knowledge mapping Or general knowledge point, knowledge point include many specific physical contents, for example, in the entity for including in knowledge point " trigonometric function " Have trigonometric function concept explanation, the explanation of trigonometric function course, trigonometric function exercise etc. and trigonometric function are related to In the habit stage, for example junior middle school, height are medium.
The graph establishing module 110 specifically comprises:
An acquiring unit 111, which acquires knowledge points and the association relationships and hierarchical relationships corresponding to the knowledge points.
Specifically, the acquiring unit 111 acquires knowledge points. The knowledge points cover many kinds of knowledge, such as study subjects, general knowledge, and animals, and each kind in turn contains many smaller knowledge points. For example, the subject Chinese includes Tang poetry, Song ci, Yuan qu, classical Chinese prose, modern poetry, and so on; Tang poetry includes five-character regulated verse, seven-character regulated verse, and so on; Tang poetry can also be divided into lyrical poems, poems that use objects as metaphors for people, and so on.
A knowledge graph is composed of a number of triples. A triple can be understood simply as (entity, association relationship, entity). If each entity is regarded as a node and each association relationship (including attributes, categories, and so on) as an edge, then a knowledge base containing a large number of triples forms a huge knowledge graph.
For example, the relationships between a subject and its knowledge points can be expressed as (Chinese, inclusion, Tang poetry), (Chinese, inclusion, Song ci), (Chinese, inclusion, Yuan qu), (Chinese, inclusion, classical Chinese prose), and so on. Within the subject Chinese, these knowledge points are in a parallel relationship with one another, but each of them in turn contains multiple smaller knowledge points, for example (Tang poetry, inclusion, five-character regulated verse) and (Tang poetry, inclusion, seven-character regulated verse).
Therefore, to construct the corresponding knowledge graph, the acquiring unit 111 also needs to acquire the association relationships between all knowledge points. An association relationship includes the connection relationship and the hierarchical relationship between knowledge points. For example, when two knowledge points are in an inclusion relationship, the containing knowledge point is at a higher level than the contained one; when two knowledge points are in a parallel relationship, they are at the same level.
A graph establishing unit 112, which establishes the knowledge graph according to the knowledge points and the hierarchical relationships acquired by the acquiring unit 111.
Specifically, the graph establishing unit 112 establishes the corresponding knowledge graph according to the knowledge points and the association relationships. In the knowledge graph, each knowledge point is a node, and two knowledge points that have an association relationship are connected by a line. The knowledge graph can therefore express the structure of the system of knowledge points clearly and accurately, which makes it easier for users to organize and understand them.
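The triple-based construction described for units 111 and 112 can be sketched in Python as follows. The triples reuse the Chinese-literature example from the text; the relation name "inclusion", the parent/children dictionaries, and the depth-based notion of level are illustrative assumptions rather than details taken from the source.

```python
from collections import defaultdict

# (entity, association relationship, entity) triples from the example.
TRIPLES = [
    ("Chinese", "inclusion", "Tang poetry"),
    ("Chinese", "inclusion", "Song ci"),
    ("Chinese", "inclusion", "Yuan qu"),
    ("Tang poetry", "inclusion", "five-character regulated verse"),
    ("Tang poetry", "inclusion", "seven-character regulated verse"),
]

def build_graph(triples):
    """Build child and parent lookups from inclusion triples."""
    children = defaultdict(list)
    parent = {}
    for head, relation, tail in triples:
        if relation == "inclusion":
            children[head].append(tail)  # edge from containing to contained point
            parent[tail] = head
    return children, parent

def level(node, parent):
    """Depth below the root: 0 for a root, larger numbers for lower levels."""
    depth = 0
    while node in parent:
        node = parent[node]
        depth += 1
    return depth

children, parent = build_graph(TRIPLES)
```

With this representation, a containing knowledge point always has a smaller depth than the points it contains, and parallel knowledge points under the same parent share the same depth, which mirrors the level rules stated above.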
A sample acquisition module 120, which obtains corpus samples.
Specifically, the sample acquisition module 120 obtains a large number of corpus samples. A corpus sample may be standard written language, or it may be user speech, audio, and the like, because voice input and text input are both mainstream interaction modes in human-computer interaction.
In addition, because the entire analysis operates on written text, if what is collected is a voice file such as user speech or audio, the voice file must first be converted into recognized text, and the recognized text is then processed accordingly.
A labeling module 130, which performs part-of-speech tagging and syntactic annotation on the corpus samples obtained by the sample acquisition module 120.
Specifically, the labeling module 130 segments the corpus samples using a word segmentation technique, identifies the part of speech of each token in every sentence of the corpus samples and tags it, and then analyzes the syntactic structure of every sentence to obtain the connection relationships between the tokens, thereby performing syntactic annotation.
For example, take the corpus sample "why can whales spray water". Here "whale" is a noun, "why" is a pronoun, "can" is a verb, and "spray water" is a verb. Syntactic analysis shows that "whale" and "spray water" are in a subject-predicate relationship, while "why" and "spray water", and "can" and "spray water", are each in an adverbial-head relationship.
An extraction module 140, which extracts the main body words and conjunctive words in the corpus samples according to the part-of-speech tags and syntactic annotations produced by the labeling module 130.
Specifically, the extraction module 140 combines the part-of-speech tags and the syntactic annotations to extract the main body words and conjunctive words in the corpus samples. For example, nouns annotated as being in a subject-predicate relationship may be extracted as main body words, or pronouns annotated as being in an adverbial-head relationship may be extracted as conjunctive words. These examples are given only to aid understanding; in practice the extraction rules for main body words and conjunctive words are not limited to these two.
For example, take the corpus sample "why can whales spray water". Here "whale" is a noun, "why" is a pronoun, "can" is a verb, and "spray water" is a verb. Syntactic analysis shows that "whale" and "spray water" are in a subject-predicate relationship, while "why" and "spray water", and "can" and "spray water", are each in an adverbial-head relationship. "Whale" and "spray water" are extracted as main body words, and "why" as the conjunctive word.
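The two extraction rules given as examples can be sketched as a small Python function. The token tuple layout (word, part of speech, syntactic relation, head word) and the tag names are assumptions for illustration, and so is the choice to also treat the predicate of a subject-predicate pair as a main body word, which is inferred from the whale example rather than stated as a rule.

```python
# Each token: (word, part_of_speech, syntactic_relation, head_word).
# The annotations mirror the whale example; the format itself is hypothetical.
tokens = [
    ("whale", "noun", "subject-predicate", "spray water"),
    ("why", "pronoun", "adverbial-head", "spray water"),
    ("can", "verb", "adverbial-head", "spray water"),
    ("spray water", "verb", "root", None),
]

def extract(tokens):
    """Return (main body words, conjunctive words) using two example rules."""
    main_words, conjunctive_words = [], []
    for word, pos, relation, head in tokens:
        if pos == "noun" and relation == "subject-predicate":
            # Rule 1: a noun in a subject-predicate relationship is a main body word.
            main_words.append(word)
            # Assumption: its predicate is also kept as a main body word.
            if head is not None and head not in main_words:
                main_words.append(head)
        elif pos == "pronoun" and relation == "adverbial-head":
            # Rule 2: a pronoun in an adverbial-head relationship is a conjunctive word.
            conjunctive_words.append(word)
    return main_words, conjunctive_words

main_words, conjunctive_words = extract(tokens)
```

Here "can" is in an adverbial-head relationship but is a verb, so neither rule selects it, consistent with the example in which only "why" becomes a conjunctive word.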
A semantic slot generation module 150, which generates semantic slots according to the part-of-speech tags produced by the labeling module 130 and the main body words obtained by the extraction module 140.
Specifically, the semantic slot generation module 150 generates semantic slots according to the part-of-speech tags and the main body words. For example, take the corpus sample "why can whales spray water". Here "whale" is a noun, "why" is a pronoun, "can" is a verb, and "spray water" is a verb. "Whale" and "spray water" are extracted as main body words, and two semantic slots are generated: a noun semantic slot whose entity content is "whale" and a verb semantic slot whose entity content is "spray water".
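A semantic slot can be pictured as a container keyed by part of speech. The sketch below assumes that representation (a dictionary from part of speech to a list of entity contents); the function name build_slots and the input format are hypothetical.

```python
def build_slots(main_words, pos_tags):
    """Group main body words into semantic slots keyed by part of speech."""
    slots = {}
    for word in main_words:
        # One slot per part of speech; the word becomes its entity content.
        slots.setdefault(pos_tags[word], []).append(word)
    return slots

# Part-of-speech tags from the whale example.
pos_tags = {"whale": "noun", "why": "pronoun", "can": "verb", "spray water": "verb"}
slots = build_slots(["whale", "spray water"], pos_tags)
```

Only the main body words are placed in slots, so "can", although tagged as a verb, does not appear in the verb semantic slot.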
A matching module 160, which matches the main body words obtained by the extraction module 140 against the knowledge graph established by the graph establishing module 110 to obtain the connection relationship between the semantic slots.
Specifically, the matching module 160 matches the main body words obtained by the analysis against the established knowledge graph one by one. If a match is found, the connection relationship between the semantic slots corresponding to the main body words is obtained from the positions of the matched main body words in the knowledge graph.
The matching module 160 specifically comprises:
A matching unit 161, which matches the main body words obtained by the extraction module 140 against the knowledge graph established by the graph establishing module 110 to obtain the corpus knowledge points and corpus hierarchical relationships corresponding to the main body words.
Specifically, the matching unit 161 matches the main body words obtained by the above analysis against the established knowledge graph one by one. If a match is found, the knowledge point in the knowledge graph that matches the main body word is recorded as a corpus knowledge point, and the hierarchical relationship of that corpus knowledge point within the knowledge graph, obtained from the structure of the knowledge graph, is recorded as the corpus hierarchical relationship.
An analysis unit 162, which obtains the connection relationship between the semantic slots according to the corpus knowledge points and the corpus hierarchical relationships obtained by the matching unit 161.
Specifically, the analysis unit 162 obtains the connection relationship between the corresponding main body words according to the corpus knowledge points and their corpus hierarchical relationships in the knowledge graph, and thereby obtains the connection relationship between the semantic slots corresponding to those main body words.
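How a connection relationship might be read off the graph structure can be sketched as follows. The source does not specify the lookup, so this is only one plausible reading: the knowledge graph is reduced to a child-to-parent map (PARENT, a hypothetical name), and two matched knowledge points are classified as "inclusion", "parallel", or "unrelated".

```python
# Child -> parent map for a small inclusion hierarchy (illustrative data).
PARENT = {
    "Tang poetry": "Chinese",
    "Song ci": "Chinese",
    "five-character regulated verse": "Tang poetry",
}
KNOWN = set(PARENT) | set(PARENT.values())

def ancestors(node):
    """All knowledge points above the given one, nearest first."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain

def connection(a, b):
    """Classify the relationship between two main body words, or None if unmatched."""
    if a not in KNOWN or b not in KNOWN:
        return None                      # a word has no corpus knowledge point
    if a in ancestors(b):
        return ("inclusion", a, b)       # a is at a higher level and contains b
    if b in ancestors(a):
        return ("inclusion", b, a)
    if a in PARENT and PARENT.get(a) == PARENT.get(b):
        return ("parallel", a, b)        # same parent, same level
    return ("unrelated", a, b)
```

The returned tuple would then be attached to the pair of semantic slots that hold the two main body words.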
A regular expression generation module 170, which generates regular expressions according to the semantic slots generated by the semantic slot generation module 150, the connection relationship obtained by the matching module 160, and the conjunctive words obtained by the extraction module 140.
Specifically, the regular expression generation module 170 generates the regular expression corresponding to a corpus sample according to the semantic slots, the connection relationship, and the conjunctive words. For example, take the corpus sample "why can whales spray water" (in the original word order: "whale", "why", "can", "spray water"). Here "whale" is a noun, "why" is a pronoun, "can" is a verb, and "spray water" is a verb. Syntactic analysis shows that "whale" and "spray water" are in a subject-predicate relationship, while "why" and "spray water", and "can" and "spray water", are each in an adverbial-head relationship. "Whale" and "spray water" are extracted as main body words, and "why" as the conjunctive word. The resulting regular expression is: ##noun library## [why] [can] ##verb library##.
The regular expression generation module 170 specifically comprises:
A regular expression generation unit 171, which generates multiple regular expressions with different sentence patterns but the same semantics according to the semantic slots generated by the semantic slot generation module 150, the connection relationship obtained by the matching module 160, and the conjunctive words obtained by the extraction module 140.
Specifically, the regular expression generation unit 171 generates multiple regular expressions with different sentence patterns but the same semantics from the semantic slots, the connection relationship, and transformations of the conjunctive word. For example, take the corpus sample "why can whales spray water". Here "whale" is a noun, "why" is a pronoun, "can" is a verb, and "spray water" is a verb. Syntactic analysis shows that "whale" and "spray water" are in a subject-predicate relationship, while "why" and "spray water", and "can" and "spray water", are each in an adverbial-head relationship. "Whale" and "spray water" are extracted as main body words, and "why" as the conjunctive word. The resulting regular expression is: ##noun library## [why] [can] ##verb library##. Moving the conjunctive word "why" to restructure the sentence yields another regular expression with a different sentence pattern but the same semantics: [why] ##noun library## [can] ##verb library##.
A processing unit 172, which obtains the logical relationship between the semantic slots according to the multiple regular expressions generated by the regular expression generation unit 171.
Specifically, the processing unit 172 analyzes the logical relationship between the semantic slots from the multiple regular expressions obtained. For example, for the corpus sample "why can whales spray water", the regular expression obtained is: ##noun library## [why] [can] ##verb library##. Moving the conjunctive word "why" to restructure the sentence yields another regular expression with a different sentence pattern but the same semantics: [why] ##noun library## [can] ##verb library##. Changing the position of the conjunctive word does not affect the semantics of the corpus sample, and the relative position of the noun semantic slot and the verb semantic slot stays the same throughout. The logical relationship between the noun semantic slot and the verb semantic slot is obtained from this invariance.
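The invariance check performed by the processing unit 172 can be sketched in Python. The template format (lists of "slot" and "word" items) and the function names are assumptions for illustration; the sketch only tests whether the relative order of the semantic slots is the same across all pattern variants and returns that order as the logical relationship.

```python
def slot_order(template):
    """Return the semantic slots of a template in their order of appearance."""
    return tuple(value for kind, value in template if kind == "slot")

def logical_relation(templates):
    """Return the shared slot order if every variant agrees, else None."""
    orders = {slot_order(t) for t in templates}
    return orders.pop() if len(orders) == 1 else None

# "##noun library## [why] [can] ##verb library##"
pattern_a = [("slot", "noun"), ("word", "why"), ("word", "can"), ("slot", "verb")]
# "[why] ##noun library## [can] ##verb library##"
pattern_b = [("word", "why"), ("slot", "noun"), ("word", "can"), ("slot", "verb")]

relation = logical_relation([pattern_a, pattern_b])
```

For the two whale patterns the noun slot always precedes the verb slot, so the function returns that order; a set of variants that disagreed on slot order would produce None instead.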
A corpus obtaining module 180, which obtains a user corpus.
Specifically, the corpus obtaining module 180 obtains the user corpus. On a smart device, voice input and text input are both mainstream ways for the user to provide a user corpus, but whatever form the obtained user corpus takes, what the system ultimately processes is text. Therefore, if the user corpus is obtained in speech form, it must first be converted into text.
A parsing module 190, which compares the user corpus obtained by the corpus obtaining module 180 with the semantic slots generated by the semantic slot generation module 150 and the regular expressions generated by the regular expression generation module 170, and parses it to obtain the corresponding user semantics.
Specifically, the parsing module 190 analyzes the obtained user corpus and compares it one by one with the generated semantic slots and regular expressions to obtain the relationships between the words in the user corpus, and parses it to obtain the corresponding user semantics.
The parsing module 190 specifically comprises:
A word segmentation unit 191, which segments the user corpus obtained by the corpus obtaining module 180 using a word segmentation technique to obtain the corresponding user tokens and token parts of speech.
Specifically, the word segmentation unit 191 segments the user corpus using a word segmentation technique: the part of speech of each word in every sentence of the user corpus is identified, and each sentence is then divided, according to those parts of speech, into tokens such as characters, words, and phrases. This yields the user tokens contained in the user corpus and their corresponding token parts of speech.
For example, for the user corpus "Xiao Ming not only likes blue, but also likes red", segmentation yields the user tokens "Xiao Ming", "not only", "likes", "blue", "but also", "likes", and "red". The token part of speech of "Xiao Ming", "blue", and "red" is noun, that of "not only" and "but also" is conjunction, and that of "likes" is verb.
A comparison unit 192, which compares the user tokens and token parts of speech obtained by the word segmentation unit 191 with the semantic slots generated by the semantic slot generation module 150 to obtain the token connection relationship between the user tokens.
Specifically, the comparison unit 192 compares the user tokens and their parts of speech with the semantic slots one by one. If a comparison matches, the token connection relationship between the matched user tokens is obtained from the connection relationship between the semantic slots.
The comparison unit 192 also generates a corresponding user regular expression from the user tokens and token parts of speech obtained by the word segmentation unit 191, and compares the user regular expression with the regular expressions generated by the regular expression generation module 170 to obtain the token logical relationship between the user tokens.
Specifically, following the method by which the above embodiment generates regular expressions from corpus samples, the comparison unit 192 generates a user regular expression corresponding to the user corpus from the user tokens and their parts of speech. It then compares the user regular expression one by one with the regular expressions derived from the corpus samples. If a comparison matches, the token logical relationship between the matched user tokens is obtained from the logical relationship between the semantic slots in the matching regular expression.
A parsing unit 193, which parses the user corpus according to the user tokens and token parts of speech obtained by the word segmentation unit 191 and the token connection relationship and token logical relationship obtained by the comparison unit 192, to obtain the corresponding user semantics.
Specifically, the parsing unit 193 parses the user corpus according to the user tokens, token parts of speech, token connection relationship, and token logical relationship obtained above, yielding the corresponding user semantics.
In this embodiment, a knowledge graph is established from knowledge points, semantic slots and regular expressions are obtained from corpus samples in combination with the information in the knowledge graph, and the user corpus is then parsed according to the semantic slots and regular expressions to obtain the corresponding user semantics.
In this embodiment, the corresponding knowledge graph is established according to the knowledge points and the association relationships. It expresses the structure of the system of knowledge points clearly and accurately, which makes it easier for users to organize and understand them, and also makes it easier to clarify the logical relationships between the user tokens in the user corpus.
The knowledge points in the established knowledge graph and the association relationships between them are used to obtain, quickly and accurately, the main body words in the corpus samples and the connection relationship between the semantic slots. For a corpus sample, multiple regular expressions with different sentence patterns but the same semantics are generated by transforming the conjunctive word, and the logical relationship between the semantic slots in the corpus sample is then derived from the generated regular expressions. The semantic slots and regular expressions obtained from the corpus samples are used to analyze the token connection relationship and token logical relationship between the user tokens in the user corpus, and the corresponding user semantics is parsed from them.
It should be noted that the above embodiments can be freely combined as needed. The above are only preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make several improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the scope of protection of the present invention.

Claims (10)

1. A method for semantic understanding of a user corpus, characterized by comprising:
Establishing a knowledge graph;
Obtaining corpus samples;
Performing part-of-speech tagging and syntactic annotation on the corpus samples;
Extracting the main body words and conjunctive words in the corpus samples according to the part-of-speech tags and the syntactic annotations;
Generating semantic slots according to the part-of-speech tags and the main body words;
Matching the main body words against the knowledge graph to obtain the connection relationship between the semantic slots;
Generating regular expressions according to the semantic slots, the connection relationship, and the conjunctive words;
Obtaining a user corpus;
Comparing the user corpus with the semantic slots and the regular expressions, and parsing it to obtain the corresponding user semantics.
2. the method for user's corpus semantic understanding according to claim 1, which is characterized in that described establishes knowledge mapping It specifically includes:
Obtain knowledge point and the corresponding incidence relation in the knowledge point;
The knowledge mapping is established according to the knowledge point and described hierarchical relationship.
3. the method for user's corpus semantic understanding according to claim 2, which is characterized in that described by the main body word Language and the knowledge mapping are matched, and the connection relationship obtained between the semantic slot specifically includes:
The main body word and the knowledge mapping are matched, the corresponding corpus knowledge point of the main body word and language are obtained Expect hierarchical relationship;
The connection relationship between the semantic slot is obtained according to the corpus knowledge point and the corpus hierarchical relationship.
4. the method for user's corpus semantic understanding according to claim 1, which is characterized in that described according to the semanteme Slot, the connection relationship and the conjunctive word generate regular expression and specifically include
Multiple clause differences but semantic identical canonical are generated according to the semantic slot, the connection relationship and the conjunctive word Expression formula;
The logical relation between the semantic slot is obtained according to the multiple regular expression.
5. the method for user's corpus semantic understanding according to claim 4, which is characterized in that described by user's language Material and the semantic slot, regular expression comparison, parsing obtain corresponding user semantic and specifically include:
User's corpus is segmented by participle technique, obtains corresponding user's participle and participle part of speech;
It is compared in conjunction with user participle and the participle part of speech and the semantic slot, obtains the user and segment it Between participle connection relationship;
Corresponding user's canonical formula is generated in conjunction with user participle and the participle part of speech, by user's canonical formula and institute It states regular expression to compare, obtains the participle logical relation between user's participle;
Institute is parsed according to user participle, the participle part of speech, the participle connection relationship and the participle logical relation It states user's corpus and obtains the corresponding user semantic.
6. A system for semantic understanding of a user corpus, characterized by comprising:
A graph establishing module, which establishes a knowledge graph;
A sample acquisition module, which obtains corpus samples;
A labeling module, which performs part-of-speech tagging and syntactic annotation on the corpus samples obtained by the sample acquisition module;
An extraction module, which extracts the main body words and conjunctive words in the corpus samples according to the part-of-speech tags and syntactic annotations produced by the labeling module;
A semantic slot generation module, which generates semantic slots according to the part-of-speech tags produced by the labeling module and the main body words obtained by the extraction module;
A matching module, which matches the main body words obtained by the extraction module against the knowledge graph established by the graph establishing module to obtain the connection relationship between the semantic slots;
A regular expression generation module, which generates regular expressions according to the semantic slots generated by the semantic slot generation module, the connection relationship obtained by the matching module, and the conjunctive words obtained by the extraction module;
A corpus obtaining module, which obtains a user corpus;
A parsing module, which compares the user corpus obtained by the corpus obtaining module with the semantic slots generated by the semantic slot generation module and the regular expressions generated by the regular expression generation module, and parses it to obtain the corresponding user semantics.
7. The system for semantic understanding of a user corpus according to claim 6, characterized in that the graph establishing module specifically comprises:
An acquiring unit, which acquires knowledge points and the association relationships corresponding to the knowledge points;
A graph establishing unit, which establishes the knowledge graph according to the knowledge points and the hierarchical relationships acquired by the acquiring unit.
8. The system for semantic understanding of a user corpus according to claim 7, characterized in that the matching module specifically comprises:
A matching unit, which matches the main body words obtained by the extraction module against the knowledge graph established by the graph establishing module to obtain the corpus knowledge points and corpus hierarchical relationships corresponding to the main body words;
An analysis unit, which obtains the connection relationship between the semantic slots according to the corpus knowledge points and the corpus hierarchical relationships obtained by the matching unit.
9. The system for semantic understanding of a user corpus according to claim 6, characterized in that the regular expression generation module specifically comprises:
A regular expression generation unit, which generates multiple regular expressions with different sentence patterns but the same semantics according to the semantic slots generated by the semantic slot generation module, the connection relationship obtained by the matching module, and the conjunctive words obtained by the extraction module;
A processing unit, which obtains the logical relationship between the semantic slots according to the multiple regular expressions generated by the regular expression generation unit.
10. The system for semantic understanding of user corpus according to claim 9, characterized in that the parsing module specifically comprises:
a word segmentation unit, configured to segment, by a word segmentation technique, the user corpus obtained by the corpus acquisition module, to obtain the corresponding user word segments and the parts of speech of the segments;
a comparison unit, configured to compare the user word segments and their parts of speech obtained by the word segmentation unit with the semantic slots generated by the semantic slot generation module, to obtain the connection relationships between the user word segments;
the comparison unit being further configured to generate a corresponding user regular expression from the user word segments and their parts of speech obtained by the word segmentation unit, and to compare the user regular expression with the regular expressions generated by the regular expression generation module, to obtain the logical relationships between the user word segments;
a parsing unit, configured to parse the user corpus and obtain the corresponding user semantics according to the user word segments and their parts of speech obtained by the word segmentation unit and the segment connection relationships and segment logical relationships obtained by the comparison unit.
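The parsing flow of this claim (segment, compare against semantic slots, compare against stored patterns, then resolve the semantics) might be sketched as follows. This is a deliberately simplified stand-in: whitespace splitting replaces a real word segmenter, a tiny hand-written lexicon replaces a part-of-speech tagger, and slot-level token sequences replace the generated regular expressions. All tables and function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical semantic slots: slot name -> words that may fill it.
SLOTS = {"entity": {"triangle", "circle"}, "attribute": {"area", "perimeter"}}
# Hypothetical toy part-of-speech lexicon; unknown words are tagged "x".
POS = {"triangle": "n", "circle": "n", "area": "n", "perimeter": "n", "of": "p"}
# Hypothetical slot-level patterns standing in for the generated regular expressions.
TEMPLATES = {("attribute", "of", "entity"): "query_attribute"}
# Words skipped when building the user pattern (an assumption of this sketch).
IGNORED = {"a", "the", "what", "is"}


def segment(text: str) -> List[Tuple[str, str]]:
    """Word segmentation unit: split on whitespace and attach a part of speech."""
    return [(w, POS.get(w, "x")) for w in text.lower().split()]


def to_slot_sequence(tokens: List[Tuple[str, str]]) -> Tuple[str, ...]:
    """Comparison unit: replace slot-filling nouns by their slot name."""
    sequence = []
    for word, pos in tokens:
        if word in IGNORED:
            continue
        slot = next(
            (name for name, values in SLOTS.items() if pos == "n" and word in values),
            None,
        )
        sequence.append(slot or word)
    return tuple(sequence)


def parse(text: str) -> Optional[Dict[str, str]]:
    """Parsing unit: return the intent plus slot fillers, or None if nothing matches."""
    tokens = segment(text)
    intent = TEMPLATES.get(to_slot_sequence(tokens))
    if intent is None:
        return None
    filled = {
        name: word
        for word, _ in tokens
        for name, values in SLOTS.items()
        if word in values
    }
    return {"intent": intent, **filled}
```

For Chinese input, the whitespace split would need to be replaced by an actual segmentation and tagging library; the sketch only shows how the outputs of the units feed into one another.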
CN201910046978.4A 2019-01-18 2019-01-18 A kind of method and system of user's corpus semantic understanding Pending CN109766453A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910046978.4A CN109766453A (en) 2019-01-18 2019-01-18 A kind of method and system of user's corpus semantic understanding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910046978.4A CN109766453A (en) 2019-01-18 2019-01-18 A kind of method and system of user's corpus semantic understanding

Publications (1)

Publication Number Publication Date
CN109766453A true CN109766453A (en) 2019-05-17

Family

ID=66454192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910046978.4A Pending CN109766453A (en) 2019-01-18 2019-01-18 A kind of method and system of user's corpus semantic understanding

Country Status (1)

Country Link
CN (1) CN109766453A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070090642A (en) * 2006-03-03 2007-09-06 삼성전자주식회사 Apparatus for providing voice dialogue service and method for operating the apparatus
CN107247736A (en) * 2017-05-08 2017-10-13 广州索答信息科技有限公司 The kitchen field answering method and system of a kind of knowledge based collection of illustrative plates
CN109033063A (en) * 2017-06-09 2018-12-18 微软技术许可有限责任公司 The machine inference of knowledge based map
CN107315737A (en) * 2017-07-04 2017-11-03 北京奇艺世纪科技有限公司 A kind of semantic logic processing method and system
CN107885844A (en) * 2017-11-10 2018-04-06 南京大学 Automatic question-answering method and system based on systematic searching
CN108804521A (en) * 2018-04-27 2018-11-13 南京柯基数据科技有限公司 A kind of answering method and agricultural encyclopaedia question answering system of knowledge based collection of illustrative plates

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
黎锦熙: "《新著国语文法》", 31 December 2007, 湖南教育出版社 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765235A (en) * 2019-09-09 2020-02-07 深圳市人马互动科技有限公司 Training data generation method and device, terminal and readable medium
CN110765235B (en) * 2019-09-09 2023-09-05 深圳市人马互动科技有限公司 Training data generation method, device, terminal and readable medium
CN110909123A (en) * 2019-10-23 2020-03-24 深圳价值在线信息科技股份有限公司 Data extraction method and device, terminal equipment and storage medium
CN110909123B (en) * 2019-10-23 2023-08-25 深圳价值在线信息科技股份有限公司 Data extraction method and device, terminal equipment and storage medium
CN110929045A (en) * 2019-12-06 2020-03-27 苏州思必驰信息科技有限公司 Construction method and system of poetry-semantic knowledge map
CN110929045B (en) * 2019-12-06 2022-07-12 思必驰科技股份有限公司 Construction method and system of poetry-semantic knowledge map
CN112380866A (en) * 2020-11-25 2021-02-19 厦门市美亚柏科信息股份有限公司 Text topic label generation method, terminal device and storage medium

Similar Documents

Publication Publication Date Title
CN109766453A (en) A kind of method and system of user's corpus semantic understanding
CN106777275B (en) Entity attribute and property value extracting method based on more granularity semantic chunks
CN106776711B (en) Chinese medical knowledge map construction method based on deep learning
Zanettin Corpus methods for descriptive translation studies
CN110502642B (en) Entity relation extraction method based on dependency syntactic analysis and rules
Falk et al. Classifying French verbs using French and English lexical resources
Gozdz-Roszkowski et al. Legal phraseology today: Corpus-based applications across legal languages and genres
CN109783693B (en) Method and system for determining video semantics and knowledge points
CN111061882A (en) Knowledge graph construction method
CN109271492A (en) A kind of automatic generation method and system of corpus regular expression
CN110609983A (en) Structured decomposition method for policy file
CN109871543A (en) A kind of intention acquisition methods and system
CN112733547A (en) Chinese question semantic understanding method by utilizing semantic dependency analysis
CN109902305A (en) Template generation, search and text generation apparatus and method for based on name Entity recognition
CN113312922A (en) Improved chapter-level triple information extraction method
CN111104803A (en) Semantic understanding processing method, device and equipment and readable storage medium
CN111553160A (en) Method and system for obtaining answers to question sentences in legal field
CN109783819A (en) A kind of generation method and system of regular expression
KR100338806B1 (en) Method and apparatus of language translation based on analysis of target language
CN117216214A (en) Question and answer extraction generation method, device, equipment and medium
CN110008314B (en) Intention analysis method and device
Sriram et al. Validation and normalization of DCS corpus and development of the Sanskrit heritage engine’s segmenter
CN112380877B (en) Construction method of machine translation test set used in discourse-level English translation
CN112148838B (en) Service source object extraction method and device
CN109783820B (en) Semantic parsing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190517)