CN108959240A - A kind of proprietary ontology automatic creation system and method - Google Patents
A kind of proprietary ontology automatic creation system and method
- Publication number
- CN108959240A (application CN201710383135.4A)
- Authority
- CN
- China
- Prior art keywords
- phrase
- sentence
- proprietary
- ontology
- relationship
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a proprietary ontology automatic creation system and method. The system comprises: a text database for storing text data; a natural language understanding module, whose input is connected to the text database, for dividing the text data into several sentences and analyzing each sentence to obtain its syntactic-semantic structure; a phrase analysis module, whose input is connected to the output of the natural language understanding module, for obtaining the corresponding phrases and phrase relations from the syntactic-semantic structure of the sentences; and an identification and suggestion module together with a to-be-established proprietary ontology library, the input of the identification and suggestion module being connected to the phrase analysis module, for identifying the phrases and phrase relations as classes and attributes of the proprietary ontology to be established and placing them in the to-be-established proprietary ontology library.
Description
Technical field
The present invention relates to semantic technology and semantic search in the field of artificial intelligence, and in particular to a proprietary ontology automatic creation system and method.
Background technique
The combination of computers and the Internet has produced so much information that people quickly feel overwhelmed by it. Indeed, while coping with this unprecedented mass of information, we are also constantly producing new information, and the amount of information grows geometrically. People hope that computers can process this mass of information effectively, not only freeing them from the information flood but also letting them make better use of it.
Computer information processing was at first confined to simply structured data: the data volume could be very large, but the structure was relatively simple. As computer hardware rapidly became more capable, computers were applied to more complex problems, and the complexity of data structures increased greatly. As the Internet accumulated data of different kinds, data from different sources began to be brought together, making data processing more complex still. In computer science and artificial intelligence, ontologies and proprietary ontologies emerged to cope with this complexity. Ontologies and proprietary ontologies are the basis of the third-generation Internet, the Semantic Web, and are also the cornerstone of semantic search. The third-generation Internet and semantic search are in turn the basis of big data processing.
Traditionally, proprietary ontologies are written by hand. Using an ontology editor, the ontology author establishes classes (Class), entities (Entity), and attributes (Property) for a proprietary field, and also has to draw on other existing proprietary ontologies and absorb certain components from them. This process is very time-consuming and prone to inconsistency.
Summary of the invention
The object of the present invention is to provide a proprietary ontology automatic creation system and method that process the documents of a proprietary field with natural language understanding technology to obtain a large number of phrases in that field and, from these phrases and the relations between them, automatically learn and establish the proprietary ontology, thereby solving the problems of time consumption and inconsistency.
To achieve the above object, the present invention is realized through the following technical solutions:
A proprietary ontology automatic creation system, characterized by comprising:
a text database for storing text data;
a natural language understanding module, whose input is connected to the text database, for dividing the text data into several sentences and analyzing the sentences to obtain the syntactic-semantic structure of each sentence;
a phrase analysis module, whose input is connected to the output of the natural language understanding module, for obtaining the corresponding phrases and phrase relations according to the syntactic-semantic structure of the sentences;
an identification and suggestion module and a to-be-established proprietary ontology library, the input of the identification and suggestion module being connected to the phrase analysis module, for identifying the phrases and phrase relations as classes and attributes of the proprietary ontology to be established and placing them in the to-be-established proprietary ontology library.
The proprietary ontology automatic creation system further includes other proprietary ontology libraries, connected to the identification and suggestion module, for storing by default the phrases that have already been established.
The natural language understanding module includes:
a sentence cutting unit for cutting the text into several sentences;
a sentence analysis unit for performing syntactic and semantic analysis on the input sentences to obtain the corresponding syntactic-semantic structure of each sentence.
The phrase analysis module includes:
a phrase semantic analysis and filtering unit for extracting all phrases in the syntactic-semantic structure, performing semantic analysis on them, filtering out the phrases that have a counterpart in the other proprietary ontology libraries, and retaining the phrases that have no counterpart in the other proprietary ontology libraries;
an inter-phrase relation analysis unit for analyzing the relations that the retained phrases have, to obtain the phrase relations.
A proprietary ontology automatic generation method, characterized in that the method comprises the following steps:
S1, storing text data;
S2, dividing the text data into several sentences and analyzing the sentences to obtain the syntactic-semantic structure of each sentence;
S3, obtaining the corresponding phrases and phrase relations according to the syntactic-semantic structure of the sentences;
S4, identifying the phrases and phrase relations as classes and attributes of the proprietary ontology to be established and placing them in the to-be-established proprietary ontology library.
The step S2 includes:
S2.1, cutting the text into several sentences;
S2.2, performing syntactic and semantic analysis on the input sentences to obtain the corresponding syntactic-semantic structure of each sentence.
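The sentence cutting of step S2.1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `cut_sentences` and the set of sentence-final punctuation marks are assumptions, since the text does not say how sentence boundaries are detected.

```python
import re

# Assumed delimiter set: Chinese and Western sentence-final punctuation.
# The patent does not specify which marks end a sentence.
_SENTENCE = re.compile(r"[^。！？!?.]+[。！？!?.]?")


def cut_sentences(text: str) -> list:
    """Step S2.1 sketch: split raw text into sentences at terminal punctuation."""
    pieces = (match.group().strip() for match in _SENTENCE.finditer(text))
    return [piece for piece in pieces if piece]


if __name__ == "__main__":
    for sentence in cut_sentences("市民来电咨询。孩子是否需要办理居住证？"):
        print(sentence)
```

A punctuation-based splitter like this breaks on decimal points and abbreviations, so a real system would likely need a more careful boundary detector.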
The step S3 includes:
S3.1, extracting all phrases in the syntactic-semantic structure, performing semantic analysis on them, filtering out the phrases that have a counterpart in the other proprietary ontology libraries, and retaining the phrases that have no counterpart in the other proprietary ontology libraries;
S3.2, analyzing the relations that the retained phrases have, to obtain the phrase relations.
Compared with the prior art, the present invention has the following advantages:
By processing the documents of a proprietary field with natural language understanding technology, a large number of phrases in that field are obtained; from these phrases and the relations between them, the proprietary ontology is learned and established automatically, which solves the problems of time consumption and inconsistency.
Detailed description of the invention
Fig. 1 is a block diagram of the proprietary ontology automatic creation system of the present invention;
Fig. 2 is a flow chart of the proprietary ontology automatic generation method of the present invention.
Specific embodiment
The present invention is further described below through a detailed description of a preferred embodiment with reference to the accompanying drawings.
As shown in Fig. 1, a proprietary ontology automatic creation system includes: a text database 100 for storing text data; a natural language understanding module 200, whose input is connected to the text database 100, for dividing the text data into several sentences and analyzing the sentences to obtain the syntactic-semantic structure of each sentence; a phrase analysis module 300, whose input is connected to the output of the natural language understanding module 200, for obtaining the corresponding phrases and phrase relations according to the syntactic-semantic structure of the sentences; and an identification and suggestion module 400 and a to-be-established proprietary ontology library 500, the input of the identification and suggestion module 400 being connected to the phrase analysis module 300, for identifying the phrases and phrase relations as classes and attributes of the proprietary ontology to be established and placing them in the to-be-established proprietary ontology library 500.
In a specific embodiment, the proprietary ontology automatic creation system further includes other proprietary ontology libraries 600, connected to the identification and suggestion module, for storing by default the phrases that have already been established. When a proprietary ontology is being generated, many proprietary ontologies have already been established; these ontologies strongly influence the ontology to be established, and there is no need to establish a class that already exists a second time.
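The reuse of already established phrases can be sketched as a filter over candidate phrases. This is a simplified stand-in under stated assumptions: the patent speaks of semantic analysis to decide whether a phrase has a counterpart, whereas the hypothetical `filter_new_phrases` below only compares normalized strings.

```python
def filter_new_phrases(candidates, existing_libraries):
    """Keep candidate phrases that have no counterpart in the other libraries.

    Assumption: "counterpart" is approximated by a case-insensitive,
    whitespace-trimmed string match, not by real semantic analysis.
    """
    known = set()
    for library in existing_libraries:
        known.update(phrase.strip().lower() for phrase in library)

    kept, seen = [], set()
    for phrase in candidates:
        key = phrase.strip().lower()
        if key not in known and key not in seen:
            seen.add(key)      # also drop repeats among the candidates
            kept.append(phrase)
    return kept


if __name__ == "__main__":
    other_library = {"medical insurance"}
    print(filter_new_phrases(["Medical insurance", "residence permit"], [other_library]))
```

Replacing the string comparison with a synonym or embedding lookup would bring the sketch closer to the semantic matching the text describes.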
The natural language understanding module 200 includes: a sentence cutting unit 201 for cutting the text into several sentences, the sentence being the basic object of analysis of a natural language understanding system; and a sentence analysis unit 202 for performing syntactic and semantic analysis on the input sentences to obtain the corresponding syntactic-semantic structure of each sentence, which represents both the meaning structure of the sentence and the structure of its constituents. Take the example sentence "A citizen calls to inquire: the parents hold household registration in another province or city and bought medical insurance in Shanghai after the child was born; the citizen asks whether the child needs to apply for a residence permit." After analysis, the syntactic-semantic structure of this sentence is:
[consult] (Agent citizen) (Content
    { [be] (Theme parents) (Situation household registration in another province or city) }
    { [purchase] (Agent citizen) (Object medical insurance) (Time after the child's birth)
        (Location Shanghai) (Recipient child) }
    { [consult] (Content
        { [apply for] (Agent citizen) (Patient residence permit) (Recipient child) }) }
).
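One way to hold such a syntactic-semantic structure in memory is a nested predicate frame with labeled roles. The sketch below is illustrative only: the `Frame` class, the `collect_phrases` helper, and the English role fillers are assumptions, and the role assignments are one reading of the example sentence, not the output of the patent's analyzer.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Frame:
    """A predicate with labeled roles; a role filler is a phrase or a nested Frame."""
    predicate: str
    roles: List[Tuple[str, Union[str, "Frame"]]] = field(default_factory=list)


def collect_phrases(frame: Frame) -> List[str]:
    """Walk the frame depth-first and return every phrase-valued role filler."""
    phrases: List[str] = []
    for _role, filler in frame.roles:
        if isinstance(filler, Frame):
            phrases.extend(collect_phrases(filler))
        else:
            phrases.append(filler)
    return phrases


# Hand-built frame for the example sentence (an assumed reading).
example = Frame("consult", [
    ("Agent", "citizen"),
    ("Content", Frame("be", [
        ("Theme", "parents"),
        ("Situation", "household registration in another province or city"),
    ])),
    ("Content", Frame("purchase", [
        ("Agent", "citizen"),
        ("Object", "medical insurance"),
        ("Time", "after the child's birth"),
        ("Location", "Shanghai"),
        ("Recipient", "child"),
    ])),
    ("Content", Frame("consult", [
        ("Content", Frame("apply for", [
            ("Agent", "citizen"),
            ("Patient", "residence permit"),
            ("Recipient", "child"),
        ])),
    ])),
])

if __name__ == "__main__":
    print(collect_phrases(example))
```

Keeping roles as an ordered list of pairs, instead of a dictionary, lets one role label such as Content carry several nested frames, as the example requires.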
The phrase analysis module 300 includes: a phrase semantic analysis and filtering unit 301 for extracting all phrases in the syntactic-semantic structure, performing semantic analysis on them, filtering out the phrases that have a counterpart in the other proprietary ontology libraries, and retaining the phrases that have no counterpart in the other proprietary ontology libraries; and an inter-phrase relation analysis unit 302 for analyzing the relations that the retained phrases have, to obtain the phrase relations. Take the example sentence "Persons from other provinces who are employed in this city can enjoy this city's medical insurance policies": "persons from other provinces employed in this city" and "this city's medical insurance policies" are two phrases, and "enjoy" is the relation that connects the two phrases.
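The example can be written down as a phrase-relation triple and stored in a library of classes and attributes. This is a hedged sketch: `PhraseRelation` and `OntologyLibrary` are hypothetical names, and mapping each phrase to a class and each relation to an attribute is an assumed reading of how the identification and suggestion module fills the to-be-established library.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass(frozen=True)
class PhraseRelation:
    """Two phrases joined by a relation, e.g. (phrase, "enjoy", phrase)."""
    left: str
    relation: str
    right: str


@dataclass
class OntologyLibrary:
    """Assumed shape of the to-be-established library: classes plus attributes."""
    classes: Set[str] = field(default_factory=set)
    attributes: Set[Tuple[str, str, str]] = field(default_factory=set)

    def add(self, rel: PhraseRelation) -> None:
        # Assumption: phrases become classes, the relation becomes an attribute
        # linking the two classes.
        self.classes.update((rel.left, rel.right))
        self.attributes.add((rel.left, rel.relation, rel.right))


if __name__ == "__main__":
    library = OntologyLibrary()
    library.add(PhraseRelation(
        "persons from other provinces employed in this city",
        "enjoy",
        "this city's medical insurance policies",
    ))
    print(sorted(library.classes))
    print(library.attributes)
```

Using sets makes repeated sightings of the same phrase or relation across many sentences idempotent.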
As shown in Fig. 2, a proprietary ontology automatic generation method comprises the following steps:
S1, storing text data;
S2, dividing the text data into several sentences and analyzing the sentences to obtain the syntactic-semantic structure of each sentence;
S3, obtaining the corresponding phrases and phrase relations according to the syntactic-semantic structure of the sentences;
S4, identifying the phrases and phrase relations as classes and attributes of the proprietary ontology to be established and placing them in the to-be-established proprietary ontology library.
The above step S2 includes:
S2.1, cutting the text into several sentences;
S2.2, performing syntactic and semantic analysis on the input sentences to obtain the corresponding syntactic-semantic structure of each sentence.
The above step S3 includes:
S3.1, extracting all phrases in the syntactic-semantic structure, performing semantic analysis on them, filtering out the phrases that have a counterpart in the other proprietary ontology libraries, and retaining the phrases that have no counterpart in the other proprietary ontology libraries;
S3.2, analyzing the relations that the retained phrases have, to obtain the phrase relations.
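The flow of steps S1 to S4 can be sketched as a small pipeline whose stages are passed in as functions. Everything named here is an assumption for illustration: `generate_ontology`, the toy `toy_cut` and `toy_extract` stand-ins, and the rule that a relation is kept when at least one of its phrases is new. The patent does not specify the analyzers or that rule.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (phrase, relation, phrase)


def generate_ontology(
    text: str,                                     # S1: the stored text data
    cut: Callable[[str], Iterable[str]],
    analyze: Callable[[str], object],
    extract: Callable[[object], Iterable[Triple]],
    known_phrases: Set[str],
) -> Tuple[Set[str], Set[Triple]]:
    """Run S2 to S4 over one text and return (classes, attributes)."""
    classes: Set[str] = set()
    attributes: Set[Triple] = set()
    for sentence in cut(text):                     # S2.1: sentence cutting
        structure = analyze(sentence)              # S2.2: syntactic-semantic analysis
        for left, relation, right in extract(structure):   # S3: phrases and relations
            # S3.1: drop phrases already present in other libraries.
            new = [p for p in (left, right) if p not in known_phrases]
            if not new:
                continue
            classes.update(new)                    # S4: new phrases become classes
            attributes.add((left, relation, right))  # S4: relation becomes an attribute
    return classes, attributes


def toy_cut(text: str) -> list:
    """Toy stand-in for S2.1: split on full stops."""
    return [part for part in text.split(".") if part.strip()]


def toy_extract(structure: object) -> list:
    """Toy stand-in for S3: recognizes only '<phrase> enjoy <phrase>'."""
    left, sep, right = str(structure).partition(" enjoy ")
    return [(left.strip(), "enjoy", right.strip())] if sep else []


if __name__ == "__main__":
    print(generate_ontology(
        "migrant workers enjoy city medical insurance. nothing here.",
        cut=toy_cut, analyze=str.strip, extract=toy_extract,
        known_phrases={"city medical insurance"},
    ))
```

Passing the stages as callables keeps the sketch independent of any particular parser, so a real sentence analyzer could be substituted without changing the control flow.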
In summary, the proprietary ontology automatic creation system and method of the present invention process the documents of a proprietary field with natural language understanding technology to obtain a large number of phrases in that field and, from these phrases and the relations between them, automatically learn and establish the proprietary ontology, solving the problems of time consumption and inconsistency.
Although the content of the present invention has been described in detail through the above preferred embodiment, it should be understood that the above description is not to be regarded as limiting the present invention. Various modifications and substitutions will be apparent to those skilled in the art after reading the above. Therefore, the scope of protection of the present invention should be defined by the appended claims.
Claims (7)
1. A proprietary ontology automatic creation system, characterized by comprising:
a text database for storing text data;
a natural language understanding module, whose input is connected to the text database, for dividing the text data into several sentences and analyzing the sentences to obtain the syntactic-semantic structure of each sentence;
a phrase analysis module, whose input is connected to the output of the natural language understanding module, for obtaining the corresponding phrases and phrase relations according to the syntactic-semantic structure of the sentences;
an identification and suggestion module and a to-be-established proprietary ontology library, the input of the identification and suggestion module being connected to the phrase analysis module, for identifying the phrases and phrase relations as classes and attributes of the proprietary ontology to be established and placing them in the to-be-established proprietary ontology library.
2. The proprietary ontology automatic creation system as claimed in claim 1, characterized by further comprising other proprietary ontology libraries, connected to the identification and suggestion module, for storing by default the phrases that have already been established.
3. The proprietary ontology automatic creation system as claimed in claim 1, characterized in that the natural language understanding module includes:
a sentence cutting unit for cutting the text into several sentences;
a sentence analysis unit for performing syntactic and semantic analysis on the input sentences to obtain the corresponding syntactic-semantic structure of each sentence.
4. The proprietary ontology automatic creation system as claimed in claim 3, characterized in that the phrase analysis module includes:
a phrase semantic analysis and filtering unit for extracting all phrases in the syntactic-semantic structure, performing semantic analysis on them, filtering out the phrases that have a counterpart in the other proprietary ontology libraries, and retaining the phrases that have no counterpart in the other proprietary ontology libraries;
an inter-phrase relation analysis unit for analyzing the relations that the retained phrases have, to obtain the phrase relations.
5. A proprietary ontology automatic generation method, characterized in that the method comprises the following steps:
S1, storing text data;
S2, dividing the text data into several sentences and analyzing the sentences to obtain the syntactic-semantic structure of each sentence;
S3, obtaining the corresponding phrases and phrase relations according to the syntactic-semantic structure of the sentences;
S4, identifying the phrases and phrase relations as classes and attributes of the proprietary ontology to be established and placing them in the to-be-established proprietary ontology library.
6. The proprietary ontology automatic generation method as claimed in claim 5, characterized in that the step S2 includes:
S2.1, cutting the text into several sentences;
S2.2, performing syntactic and semantic analysis on the input sentences to obtain the corresponding syntactic-semantic structure of each sentence.
7. The proprietary ontology automatic generation method as claimed in claim 6, characterized in that the step S3 includes:
S3.1, extracting all phrases in the syntactic-semantic structure, performing semantic analysis on them, filtering out the phrases that have a counterpart in the other proprietary ontology libraries, and retaining the phrases that have no counterpart in the other proprietary ontology libraries;
S3.2, analyzing the relations that the retained phrases have, to obtain the phrase relations.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710383135.4A CN108959240A (en) | 2017-05-26 | 2017-05-26 | A kind of proprietary ontology automatic creation system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710383135.4A CN108959240A (en) | 2017-05-26 | 2017-05-26 | A kind of proprietary ontology automatic creation system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108959240A (en) | 2018-12-07 |
Family
ID=64494529
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710383135.4A (Pending) CN108959240A (en) | 2017-05-26 | 2017-05-26 | A kind of proprietary ontology automatic creation system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108959240A (en) |
-
2017
- 2017-05-26 CN CN201710383135.4A patent/CN108959240A/en active Pending
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101446942A (en) * | 2008-12-10 | 2009-06-03 | 苏州大学 | Semantic character labeling method of natural language sentence |
CN102439590A (en) * | 2009-03-13 | 2012-05-02 | 发明机器公司 | System and method for automatic semantic labeling of natural language texts |
CN101710343A (en) * | 2009-12-11 | 2010-05-19 | 北京中机科海科技发展有限公司 | Body automatic build system and method based on text mining |
CN103119585A (en) * | 2010-12-17 | 2013-05-22 | 北京交通大学 | Device for acquiring knowledge and method thereof |
CN102609402A (en) * | 2012-01-12 | 2012-07-25 | 北京航空航天大学 | Device and method for generation and management of ontology model based on real-time strategy |
CN102609512A (en) * | 2012-02-07 | 2012-07-25 | 北京中机科海科技发展有限公司 | System and method for heterogeneous information mining and visual analysis |
US20140163955A1 (en) * | 2012-12-10 | 2014-06-12 | General Electric Company | System and Method For Extracting Ontological Information From A Body Of Text |
CN106155999A (en) * | 2015-04-09 | 2016-11-23 | 科大讯飞股份有限公司 | Semantics comprehension on natural language method and system |
US20160364377A1 (en) * | 2015-06-12 | 2016-12-15 | Satyanarayana Krishnamurthy | Language Processing And Knowledge Building System |
CN105808525A (en) * | 2016-03-29 | 2016-07-27 | 国家计算机网络与信息安全管理中心 | Domain concept hypernym-hyponym relation extraction method based on similar concept pairs |
Non-Patent Citations (1)
Title |
---|
Li Rongrong: "Research on Patent Ontology Construction Methods Oriented to Complex Semantics", China Doctoral Dissertations Full-text Database * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Siddharthan | Text simplification using typed dependencies: A comparision of the robustness of different generation strategies | |
CN105528410B (en) | The method that the online comment of a kind of pair of hospital is concluded and classified | |
CN107832382A (en) | Method, apparatus, equipment and storage medium based on word generation video | |
JP6676109B2 (en) | Utterance sentence generation apparatus, method and program | |
CN109408811B (en) | Data processing method and server | |
KR20110009205A (en) | Systems and methods for natural language communication with a computer | |
Riefer et al. | Mining process models from natural language text: A state-of-the-art analysis | |
JP2005174330A (en) | Method, system and program for analyzing opinion expressed from text document | |
WO2017198031A1 (en) | Semantic parsing method and apparatus | |
US20120124467A1 (en) | Method for automatically generating descriptive headings for a text element | |
CN104216873B (en) | Method for analyzing network left word emotion fluctuation characteristics of emotional handicap sufferer | |
Miyazaki et al. | Automatic conversion of sentence-end expressions for utterance characterization of dialogue systems | |
CN108614814A (en) | A kind of abstracting method of evaluation information, device and equipment | |
Roy et al. | " Is depression related to cannabis?": A knowledge-infused model for Entity and Relation Extraction with Limited Supervision | |
CN109446337A (en) | A kind of knowledge mapping construction method and device | |
O’Gorman et al. | The new Propbank: Aligning Propbank with AMR through POS unification | |
CN110245361B (en) | Phrase pair extraction method and device, electronic equipment and readable storage medium | |
Keskes et al. | Splitting Arabic texts into elementary discourse units | |
CN105045784B (en) | The access device method and apparatus of English words and phrases | |
Saggion et al. | Simplifying words in context. Experiments with two lexical resources in Spanish | |
CN108959240A (en) | A kind of proprietary ontology automatic creation system and method | |
Peng et al. | Research on tree kernel-based personal relation extraction | |
Skanda et al. | Detecting stance in kannada social media code-mixed text using sentence embedding | |
CN109800219A (en) | A kind of method and apparatus of corpus cleaning | |
JP4033011B2 (en) | Natural language processing system, natural language processing method, and computer program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20181207 |