KR20100070084A - Apparatus and method for in real time retrieving knowledge relevant to user's query from a large-scale ontology - Google Patents
- Publication number
- KR20100070084A
- Authority
- KR
- South Korea
- Prior art keywords
- path
- ontology
- graph
- knowledge
- query
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to an apparatus and method for retrieving, in real time, knowledge related to a user query from a large-scale ontology. A large-scale ontology representing real-world knowledge is converted into a node-arc graph, and the paths obtained for each node pair in the converted ontology graph are managed. When a user query arrives, a partial graph is generated by searching for and integrating the paths related to the query, so that knowledge related to the user query is retrieved from the large-scale ontology in real time.
To this end, the present invention provides a knowledge retrieval apparatus comprising: ontology graph conversion means for converting a large-scale ontology into an ontology graph; path generation means for generating paths from the ontology graph converted by the ontology graph conversion means; path storage means for storing the paths generated by the path generation means; path search means for retrieving paths related to a user query from the path storage means; partial graph generation means for generating a partial graph by processing the paths retrieved by the path search means; and triple conversion means for converting the partial graph generated by the partial graph generation means into a triple set.
Description
The present invention relates to an apparatus and method for retrieving knowledge related to a user query from a large-scale ontology in real time. More particularly, a large-scale ontology that expresses real-world knowledge as triples (subject-predicate-object) is converted into an ontology graph in which the subject and object of each triple are nodes and the predicate is an arc; the paths obtained for each node pair in the converted ontology graph are managed; and, when a user query arrives, a partial graph is generated by searching for and integrating the paths related to the query, thereby retrieving knowledge related to the user query from the large-scale ontology in real time.
The following embodiment describes, as examples, a method of converting an ontology into a graph, a method of generating paths for each node pair in the ontology graph, a method of storing the generated paths, and a method of generating a partial graph by searching for the paths related to a user query. It will be apparent, however, that the present invention is not limited thereto.
Ontology is receiving great attention as a core technology of the Semantic Web, which is emerging as the next-generation web. An ontology is a systematic expression of knowledge and of the semantic relationships among pieces of knowledge. Because an ontology makes it possible to obtain not only the information that is needed but also other related information, it can be used in applications such as intelligent question-answering (QnA) systems and intelligent search engines.
A review of the patents and research papers on using ontologies in search shows that the overall focus is on providing the ontology information that is directly related to the user's search terms.
First, the paper "Multimedia Information Retrieval Using Meaningful Relevance" (Park Chang-Seop, Korean Society for Internet Information, Vol. 8, pp. 67-79) describes a methodology for multimedia retrieval that provides, as search results, a variety of multimedia content for the concepts directly related to the user's search terms. To this end, the paper proposes an algorithm that numerically calculates the semantic relevance between the user's search terms and the concepts included in the ontology.
Next, the Korean patent "Browse system and method for browsing information using ontologies" (Patent Registration No. 10-0820746) describes a browsing system and method that, for the user's search convenience, displays the ontology on the screen as a node-arc graph and, when the user selects a node of the graph, moves the center of the graph to that node and presents the graph to the user again. Because there is a limit to how much of an ontology can be displayed on the screen, this prior patent also shows mainly the nodes directly connected to the node selected by the user.
In recent years, on the other hand, attempts have been made to find not only information that is directly related to a user's search terms but also hidden information, that is, information that is indirectly related. Academia has recently been studying how to find relationships between people who are not acquainted by constructing a human network that expresses people as nodes and the relationships between them as arcs. In the same context, studies have sought the hidden (indirect) association between two nodes after converting an ontology into a node-arc graph. For example, although Student A and Student B, who attend the same university, are not acquainted with each other, the relationships between nodes in the ontology graph can reveal that they are taking different courses taught by Professor C. To find the hidden (indirect) association between any two nodes of an ontology graph, research has been conducted on algorithms for quickly searching for the paths between the two nodes.
However, these studies limit the number of search terms (the number of selected nodes) to two and do not address how to find the partial graph (hidden associations) connecting the search terms when there are three or more of them.
In addition, as an ontology grows larger, the time complexity of such an algorithm makes it almost impossible to run the algorithm when the search terms are entered and still provide the search results to the user in real time.
Therefore, there is an urgent need for an algorithm that finds hidden (indirect) associations among three or more search terms without limiting the number of user search terms, and for a semantic search system that can search knowledge in a large-scale ontology in real time.
As described above, ontology search requires an algorithm that finds hidden (indirect) associations among three or more search terms without limiting the number of user search terms. Moreover, as ontologies, the core technology of the Semantic Web that is attracting attention as the next-generation web environment, grow larger, the problem arises of how to search the knowledge described in an ontology quickly and in real time. It is an object of the present invention to solve these problems and meet these needs.
Accordingly, an object of the present invention is to provide an apparatus and method that convert a large-scale ontology into paths, store the paths, and then retrieve and provide the paths related to a user query in real time, thereby retrieving knowledge related to the user query from the large-scale ontology in real time.
That is, an object of the present invention is to provide an apparatus and method that convert a large-scale ontology representing real-world knowledge into a node-arc graph, manage the paths obtained for each node pair in the converted ontology graph, and, when a user query arrives, generate a partial graph by searching for and integrating the paths related to the query, thereby retrieving knowledge related to the user query from the large-scale ontology in real time.
The objects of the present invention are not limited to the above-mentioned objects, and other objects and advantages of the present invention which are not mentioned above can be understood by the following description, and will be more clearly understood by the embodiments of the present invention. Also, it will be readily appreciated that the objects and advantages of the present invention may be realized by the means and combinations thereof indicated in the claims.
To achieve the above object, an apparatus of the present invention is a knowledge retrieval apparatus comprising: ontology graph conversion means for converting a large-scale ontology into an ontology graph; path generation means for generating paths from the ontology graph converted by the ontology graph conversion means; path storage means for storing the paths generated by the path generation means; path search means for retrieving paths related to a user query from the path storage means; partial graph generation means for generating a partial graph by processing the paths retrieved by the path search means; and triple conversion means for converting the partial graph generated by the partial graph generation means into a triple set.
To achieve the above object, a method of the present invention is a knowledge retrieval method comprising: converting a triple-based ontology into an ontology graph in node-arc form; generating paths for each node pair of the converted ontology graph, and indexing and storing the paths in a path repository; retrieving paths related to a user query from the path repository; generating a partial graph by integrating the retrieved paths or removing meaningless paths; and converting the generated partial graph into a triple set.
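The last two steps of the method (integrating retrieved paths into a partial graph and converting that graph into a triple set) might look roughly like the following Python sketch. It assumes, purely for illustration, that each retrieved path is stored as an alternating sequence of node, predicate, node; the function name `paths_to_triples` and this path representation are hypothetical and are not taken from the patent. Removal of meaningless paths is not modeled here.

```python
def paths_to_triples(paths):
    """Merge paths into one partial graph and return it as a set of triples.

    Assumption: each path alternates node, predicate, node, ... so every
    window of three items that starts at a node is one triple (arc).
    Using a set merges arcs shared by several paths into a single arc.
    """
    triples = set()
    for path in paths:
        for i in range(0, len(path) - 2, 2):
            triples.add((path[i], path[i + 1], path[i + 2]))
    return triples


# Hypothetical search result: three paths that overlap on one arc.
retrieved_paths = [
    ("Kim Dong Gun", "Own", "Ontology Tech", "type", "Company"),
    ("Kim Dong Gun", "Own", "Ontology Tech"),
    ("Ontology Tech", "Homepage", "http://www.ontologytech.co.kr"),
]
partial_graph_triples = paths_to_triples(retrieved_paths)
for triple in sorted(partial_graph_triples):
    print(triple)
```

Because overlapping paths contribute the same arc only once, the resulting triple set describes a single connected partial graph rather than a list of separate paths.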
As described above, the present invention has the effect of making it possible to retrieve, in real time, the knowledge related to a user query from a large-scale ontology.
That is, the present invention converts a large-scale ontology representing real-world knowledge into a node-arc graph and manages the paths obtained for each node pair in the converted ontology graph. By generating a partial graph from the paths related to a user query when the query arrives, it can retrieve the knowledge related to the user query in real time.
In addition, the present invention can find hidden (indirect) associations among three or more search terms without limiting the number of user search terms.
In addition, the present invention can be applied to a variety of applications (e.g., Semantic Web-based search engines) that need to retrieve knowledge in real time from a large-scale ontology, thereby improving the quality of search.
The above and other objects, features, and advantages of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, so that those skilled in the art can easily carry out the present invention. In describing the present invention, a detailed description of known technology related to the present invention is omitted when it is determined that it would unnecessarily obscure the gist of the present invention. Hereinafter, exemplary embodiments of the present invention are described in detail with reference to the accompanying drawings.
First, Semantic Web technology is described in more detail to aid understanding of the present invention.
Tim Berners-Lee first proposed the World Wide Web in 1989, based on a client-server architecture and the HyperText Markup Language (HTML). It allowed users to publish their own information anywhere on the Internet and provided a shared infrastructure for accessing that information through a browser. As a result, a great deal of information has been published and distributed on the Internet. The sharing of this information has promoted social and technological development and, as a result, has led the innovation of the information society.
However, the enormous amount of information means that ever more effort is needed to find the desired information, and the emergence of various applications and services using the web has made it difficult to find and use information effectively.
In particular, existing web search is mainly keyword-based and ranks web documents by word frequency or lexical information, which limits its ability to find the desired web document. It is also very difficult to expand, integrate, and share related web documents. These problems arise because the existing web and its markup languages are human-centered and focus on presentation in a web browser for humans to see and understand. As a result, the existing web is a human-centered information processing technology that does not provide sufficient functions for a computer to extract, interpret, and process the necessary information effectively on its own.
The Semantic Web then appeared as a technology that extends the existing web to realize semantic interoperability based on well-defined meanings that computers can understand, and to build an effective cooperative system between humans and computers.
Tim Berners-Lee defined the Semantic Web not as a new web completely different from the existing one, but as a paradigm that extends the current web by giving well-defined meaning to the information on it, so that computers and people can work cooperatively. The Semantic Web is a web in which not only people but also machines (computers) can understand the meaning of the information on the web, so that intelligent services that meet users' needs, collaboration between people and machines, and automated services become possible.
In other words, the Semantic Web is a next-generation web technology that enables computers to understand the meaning of information resources and to automate, integrate, and reuse them.
1) Ontology
An ontology is a formal specification of a shared conceptualization and provides semantic information about the vocabulary of a domain. An ontology is a form of knowledge representation, and a computer can understand the concepts represented in the ontology and process the knowledge. To support inference, the ontology also needs axioms and a rule system.
2) Semantically annotated web
A semantically annotated web is a web annotated with an ontology, and it serves as a knowledge base. The Semantic Web can build a huge knowledge base that semantically integrates the distributed information resources of the Internet. In a narrower sense, a knowledge base can be built from the information resources of a single company or institution.
3) Agent
An agent collects, retrieves, and reasons over information resources on behalf of a person (user) and exchanges information with other agents. Intelligent agents are the core of Semantic Web-based application systems.
By using ontology and agent technology, the Semantic Web realizes semantic interoperability, enabling a leap from the information-based web to a knowledge-based web.
FIG. 1 is a block diagram of an embodiment of an apparatus for retrieving knowledge related to a user query from a large-scale ontology in real time, and of a semantic search system using the apparatus, according to the present invention.
As shown in FIG. 1, the semantic search system to which the present invention is applied includes a
Next, the components of the ontology knowledge search unit 30 will be described in more detail as follows.
As shown in FIG. 1, the ontology knowledge retrieval unit 30 according to the present invention has an ontology stored in the triple-based
Next, the operation of the ontology knowledge search unit 30 and its specific embodiments will be described in more detail with reference to FIGS. 2 to 4B.
FIG. 2 is an exemplary view of the ontology graph obtained from the ontology graph conversion unit according to the present invention, FIG. 3 is an exemplary view of the table structure of the path storage according to the present invention, FIG. 4A is an exemplary view showing the structure of the path storage according to the present invention, and FIG. 4B is a diagram for describing the operation of the path search unit and the partial graph generation unit according to the present invention.
First, the ontology
FIG. 2 shows an example in which an ontology written in the ontology languages Resource Description Framework (RDF) and RDF Schema is converted into an ontology graph. When there is a correlation between two nodes, it is represented by a bidirectional solid-line arc; otherwise, it is represented by a unidirectional dotted arc.
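As a rough illustration of the conversion step, the following Python sketch turns a list of triples into a node-arc adjacency structure in which subjects and objects become nodes and predicates label the arcs. The triples are taken from the path examples given in this description; the function name `build_ontology_graph` and the dictionary layout are assumptions made for this sketch, and the solid/dotted arc distinction of FIG. 2 is not modeled.

```python
# Triples (subject, predicate, object) drawn from this description's examples.
TRIPLES = [
    ("Kim Dong Gun", "Major", "Business Administration"),
    ("Kim Dong Gun", "Own", "Ontology Tech"),
    ("Kim Dong Gun", "type", "Employer"),
    ("Ontology Tech", "type", "Company"),
    ("Ontology Tech", "Homepage", "http://www.ontologytech.co.kr"),
]


def build_ontology_graph(triples):
    """Return {node: [(predicate, neighbor), ...]}.

    Subjects and objects are nodes; each triple adds one arc labeled
    with its predicate, directed from subject to object.
    """
    graph = {}
    for subject, predicate, obj in triples:
        graph.setdefault(subject, []).append((predicate, obj))
        graph.setdefault(obj, [])  # objects are nodes even with no outgoing arcs
    return graph


ontology_graph = build_ontology_graph(TRIPLES)
for node, arcs in ontology_graph.items():
    print(node, "->", arcs)
```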
When the ontology is converted into an ontology graph in this way, a weight can be given to the arc of each triple. Most simply, all arcs can be given the same weight of 1; alternatively, arcs can be weighted differently according to the importance of their triples. For example, one differential weighting method gives a low weight (penalty) to a triple that has appeared frequently in recent times. Given an arbitrary triple, its arc weight is given by an equation (equation image missing from the source text).
Here, t is a time parameter denoting a date from one day to m days before the present. If only one year is considered, m is 365.
The equation uses two quantities (their defining equations are likewise missing from the source text). The first is defined from the number of all triples that occurred in web documents published t days before the present and the frequency with which the given triple occurs in those documents.
The second is defined from the number of all web documents published t days before the present and the number of those documents that contain the given triple.
On the other hand, because the ontology graph is converted from a large-scale ontology, the number of paths across node pairs may be very large. For example, with 100,000 nodes there are about 10 billion (100,000 × 100,000) node pairs, so there may be 10 billion or more paths. However, not every path between every node pair is meaningful. It is therefore unnecessary to generate paths for every node pair, and it is reasonable to generate paths only for node pairs that are expected to be meaningful. For example, paths may be generated only between each node and the nodes within a certain number of hops of it.
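The hop limit mentioned above can be sketched in Python as a depth-bounded enumeration of simple paths. This is a minimal illustration under assumptions: the graph layout, the function name `generate_paths`, and the choice to forbid revisiting a node within one path are hypothetical and not specified in the text. The first line also checks the node-pair arithmetic.

```python
# 100,000 nodes give 100,000 * 100,000 = 10 billion ordered node pairs.
node_pairs = 100_000 ** 2

# Small example graph: {node: [(predicate, neighbor), ...]}.
GRAPH = {
    "Kim Dong Gun": [
        ("Major", "Business Administration"),
        ("Own", "Ontology Tech"),
        ("type", "Employer"),
    ],
    "Ontology Tech": [
        ("type", "Company"),
        ("Homepage", "http://www.ontologytech.co.kr"),
    ],
}


def generate_paths(graph, max_hops):
    """Enumerate simple paths with 1..max_hops arcs, starting from every node.

    Each path is a tuple alternating node, predicate, node, ...
    """
    paths = []

    def extend(path, visited):
        hops = (len(path) - 1) // 2
        if hops >= 1:
            paths.append(tuple(path))
        if hops == max_hops:
            return  # hop limit reached: do not explore further
        for predicate, neighbor in graph.get(path[-1], []):
            if neighbor not in visited:
                extend(path + [predicate, neighbor], visited | {neighbor})

    for start in graph:
        extend([start], {start})
    return paths


for path in generate_paths(GRAPH, max_hops=2):
    print("-".join(path))
```

Raising `max_hops` trades a larger path store for the ability to answer queries about more distant node pairs.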
In addition, the ontology graph may change as new knowledge is added or changed over time. Then, the
The
(Kim Dong Gun, Business Administration), 1 : Kim Dong Gun-Major-Business Administration
(Kim Dong Gun, age), 1 : Kim Dong Gun-Age-34
(Kim Dong Gun, Ontology Tech), 1 : Kim Dong Gun-Own-Ontology Tech
(Kim Dong Gun, Employer), 1 : Kim Dong Gun-type-Employer
(Kim Dong Gun, Company), 2 : Kim Dong Gun-Own-Ontology Tech-type-Company
(Ontology Tech, http://www.ontologytech.co.kr), 1 : Ontology Tech-Homepage-http://www.ontologytech.co.kr
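The index entries listed above pair a node pair and a hop count with the path itself. A minimal Python sketch of such an index follows; it assumes that each path is held as an alternating node/predicate sequence and that the index key is the (start node, end node) pair. The names `index_paths` and `format_entry` are hypothetical.

```python
from collections import defaultdict

# Paths as alternating node, predicate, node, ... sequences.
PATHS = [
    ("Kim Dong Gun", "Major", "Business Administration"),
    ("Kim Dong Gun", "Own", "Ontology Tech"),
    ("Kim Dong Gun", "Own", "Ontology Tech", "type", "Company"),
    ("Ontology Tech", "Homepage", "http://www.ontologytech.co.kr"),
]


def index_paths(paths):
    """Group paths by (start node, end node) and record each hop count."""
    index = defaultdict(list)
    for path in paths:
        nodes = path[0::2]        # every second item is a node
        hops = len(nodes) - 1     # number of arcs on the path
        index[(nodes[0], nodes[-1])].append((hops, path))
    return index


def format_entry(pair, hops, path):
    """Render one index entry in the notation used in the list above."""
    return f"({pair[0]}, {pair[1]}), {hops} : {'-'.join(path)}"


path_index = index_paths(PATHS)
for pair, entries in path_index.items():
    for hops, path in entries:
        print(format_entry(pair, hops, path))
```

Keying the index by node pair means that all stored paths between two query terms can be fetched with a single lookup.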
The indexed paths may be stored in a path table as shown in FIG. 3, and the table may be distributed across and stored in a plurality of computers as shown in FIG. 4A. As mentioned above, the number of paths can be significantly reduced by generating paths only for node pairs that are expected to be meaningful, but the number of paths may still be large, requiring a large amount of
In addition, as shown in FIG. 4B, when a user query is later received, the same query information is transmitted to each individual computer, and each computer simultaneously searches its own relatively small number of paths for those related to the query and returns its search results, which can significantly reduce the path search time.
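The distributed search can be approximated on a single machine with the following Python sketch, in which worker threads stand in for the individual computers. Several details are assumptions made only for illustration: the round-robin sharding, the rule that a path is "related" to the query when one of its end nodes is a query term, and the function names `shard_paths`, `search_shard`, and `search_all`.

```python
from concurrent.futures import ThreadPoolExecutor

# Path table entries: ((start node, end node), hop count, path string).
PATH_TABLE = [
    (("Kim Dong Gun", "Business Administration"), 1, "Kim Dong Gun-Major-Business Administration"),
    (("Kim Dong Gun", "Ontology Tech"), 1, "Kim Dong Gun-Own-Ontology Tech"),
    (("Kim Dong Gun", "Employer"), 1, "Kim Dong Gun-type-Employer"),
    (("Kim Dong Gun", "Company"), 2, "Kim Dong Gun-Own-Ontology Tech-type-Company"),
    (("Ontology Tech", "http://www.ontologytech.co.kr"), 1, "Ontology Tech-Homepage-http://www.ontologytech.co.kr"),
]


def shard_paths(entries, num_shards):
    """Distribute table entries round-robin over num_shards partitions."""
    shards = [[] for _ in range(num_shards)]
    for i, entry in enumerate(entries):
        shards[i % num_shards].append(entry)
    return shards


def search_shard(shard, query_terms):
    """Return the entries of one shard whose node pair contains a query term."""
    return [entry for entry in shard if query_terms & set(entry[0])]


def search_all(shards, query_terms):
    """Send the same query to every shard concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial_results = list(
            pool.map(search_shard, shards, [query_terms] * len(shards))
        )
    return [entry for partial in partial_results for entry in partial]


shards = shard_paths(PATH_TABLE, num_shards=3)
for pair, hops, path in search_all(shards, {"Ontology Tech"}):
    print(pair, hops, path)
```

Each worker scans only its own partition, so the per-worker search cost shrinks as partitions are added, which is the effect the paragraph above describes.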
The
Next, the
(1) When one noun word is received from the query
(2) When two noun-type words are received from the query
(3) When three noun-type words are received from the query
(4) When four noun words are received from the query
In addition, even when five or more noun-type words are received from the query
The
FIG. 5 is a flowchart illustrating a method for retrieving knowledge related to a user query from a large-scale ontology in real time according to an embodiment of the present invention. Because the specific embodiments are the same as described above, only the technical gist of the operation is briefly described here.
First, the
Thereafter, the
Thereafter, the
Subsequently, the
Thereafter, the
Meanwhile, the method of the present invention as described above can be written as a computer program. The code and code segments constituting the program can be easily inferred by a computer programmer skilled in the art. The written program is stored in a computer-readable recording medium (information storage medium) and is read and executed by a computer to implement the method of the present invention. The recording medium includes any type of computer-readable recording medium.
Those skilled in the art to which the present invention pertains can make various substitutions, modifications, and changes to the present invention described above without departing from its technical spirit, so the invention is not limited by the accompanying drawings.
The present invention solves the problem that it becomes difficult to search the knowledge described in an ontology in real time as the ontology grows large, and can therefore be actively used in various application systems (e.g., Semantic Web-based search engines) that need to retrieve knowledge from a large-scale ontology.
FIG. 1 is a configuration diagram of an apparatus for retrieving knowledge related to a user query from a large-scale ontology in real time, and of a semantic search system using the apparatus, according to the present invention;
FIG. 2 is an exemplary view of an ontology graph obtained from the ontology graph conversion unit according to the present invention;
FIG. 3 is an exemplary diagram of the table structure of the path storage according to the present invention;
FIG. 4A is an exemplary view showing the structure of the path storage according to the present invention;
FIG. 4B is a diagram for describing the operation of the path search unit and the partial graph generation unit according to the present invention;
FIG. 5 is a flowchart illustrating a method for retrieving knowledge related to a user query from a large-scale ontology in real time according to the present invention.
* Explanation of symbols for the main parts of the drawings
10: web document repository
20: triple-based ontology repository
30: ontology knowledge search unit
31: ontology graph conversion unit
32: path generation unit
33: path storage
34: path search unit
35: partial graph generation unit
36: triple conversion unit
40: query information processing unit
50: web document search unit
60: search result processing unit
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020080128681A KR20100070084A (en) | 2008-12-17 | 2008-12-17 | Apparatus and method for in real time retrieving knowledge relevant to user's query from a large-scale ontology |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020080128681A KR20100070084A (en) | 2008-12-17 | 2008-12-17 | Apparatus and method for in real time retrieving knowledge relevant to user's query from a large-scale ontology |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20100070084A true KR20100070084A (en) | 2010-06-25 |
Family
ID=42367959
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020080128681A KR20100070084A (en) | 2008-12-17 | 2008-12-17 | Apparatus and method for in real time retrieving knowledge relevant to user's query from a large-scale ontology |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20100070084A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101226162B1 (en) * | 2012-07-30 | 2013-01-24 | 한국과학기술정보연구원 | Method and apparatus for converting ontology date to graph data |
KR101648011B1 (en) * | 2015-06-30 | 2016-08-12 | 경희대학교 산학협력단 | Method and apparatus for frequent subgraph mining using embedding overlapped relationships |
KR102079289B1 (en) * | 2019-04-23 | 2020-04-07 | 주식회사 비닛 | Wine recommendation system and method |
KR102147854B1 (en) * | 2020-06-08 | 2020-08-25 | 한화시스템(주) | Battlefield situation multiple reasoning system and method |
-
2008
- 2008-12-17 KR KR1020080128681A patent/KR20100070084A/en not_active Application Discontinuation
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WITN | Withdrawal due to no request for examination |