KR102090237B1

KR102090237B1 - Method, system and computer program for knowledge extension based on triple-semantic

Info

Publication number: KR102090237B1
Application number: KR1020180089423A
Authority: KR
Inventors: 김동환; 권유경; 성길제
Original assignee: 주식회사 포티투마루
Priority date: 2018-07-31
Filing date: 2018-07-31
Publication date: 2020-03-17
Also published as: KR20200014047A

Abstract

According to an embodiment of the present invention, there is provided a semantic triple-based knowledge expansion system comprising: a data update unit that updates existing semantic triple data; a query generation module that generates queries by using and combining entity synonyms and attribute synonyms; an actual query acquisition unit that acquires actual user queries based on user logs; a semantic triple extractor that receives as input a query generated by the query generation module or an actual user query, performs search targeting by first selecting a group of candidate passages relevant to the query according to the characteristics of the query, searches for passages related to the query, and derives a unique instant answer based on the retrieved passages and the query data; and a semantic triple conversion module that converts the unique instant answer determined to be correct, together with the query, into a semantic triple in the form of entity, attribute, and instant answer.

Description

Semantic triple-based knowledge expansion system, method, and computer program {METHOD, SYSTEM AND COMPUTER PROGRAM FOR KNOWLEDGE EXTENSION BASED ON TRIPLE-SEMANTIC}

The present invention relates to a semantic triple-based knowledge expansion system, method, and computer program, and more particularly to a semantic triple-based knowledge expansion system, method, and computer program capable of providing highly accurate instant answers to natural language searches.

Human language is rich and complex, with a large vocabulary governed by intricate grammar and contextual meaning, whereas hardware and software applications generally require data to be entered in a specific format or according to specific rules. Natural language input can be used in almost any software application that interacts with people. In a typical recent natural-language question-answering architecture, a natural language processor (NLP) module receives lexical input consisting of text or speech and converts it into a form that a computer can process; a context analyzer analyzes the context of the processed input; a decision maker classifies and determines the content of the answer according to that context; and a response generator produces the lexical output that answers the user according to the determined content.

Meanwhile, with the growing adoption of smart machines, led by voice-recognition speakers, and with advances in artificial intelligence technology, search is shifting from the conventional approach of entering keywords and checking a list of documents toward entering natural-language sentences and receiving specific answers.

KR 10-1851787 B1

One object of the present invention is to provide highly accurate unique instant answers.

The present invention can also automatically generate queries and answers and convert them into semantic triples in the form of entity, attribute, and instant answer.
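As a non-limiting illustration, a semantic triple in the form of entity, attribute, and instant answer could be represented as follows. This is a minimal sketch: the names `SemanticTriple` and `to_triple` and the sample values are hypothetical and do not appear in the specification, and it assumes the entity and attribute of the query are already known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticTriple:
    """One knowledge item: (entity, attribute, instant answer)."""
    entity: str
    attribute: str
    instant_answer: str


def to_triple(entity: str, attribute: str, answer: str) -> SemanticTriple:
    """Build a triple from a query's entity and attribute plus its accepted answer.

    Assumption: the query has already been split into entity and attribute,
    for example because it was generated from an (entity, attribute) pair.
    """
    fields = [entity.strip(), attribute.strip(), answer.strip()]
    if not all(fields):
        raise ValueError("entity, attribute, and answer must all be non-empty")
    return SemanticTriple(*fields)


# Hypothetical sample values, for illustration only.
example = to_triple(" ExampleCity", "mayor ", "Jane Doe")
```

Making the record immutable and hashable (`frozen=True`) lets triples be stored in a set, which keeps duplicate detection simple when new triples are added to an existing collection.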

According to one aspect of the present invention, there is provided a semantic triple-based knowledge expansion system comprising: a data update unit that updates existing semantic triple data; a query generation module that generates queries by using and combining entity synonyms and attribute synonyms; an actual query acquisition unit that acquires actual user queries based on user logs; a semantic triple extractor that receives as input a query generated by the query generation module or an actual user query, performs search targeting by first selecting a group of candidate passages relevant to the query according to the characteristics of the query, searches for passages related to the query, and derives a unique instant answer based on the retrieved passages and the query data; and a semantic triple conversion module that converts the unique instant answer determined to be correct, together with the query, into a semantic triple in the form of entity, attribute, and instant answer.

In this embodiment, the query generation module may look up and combine the entity fields and attribute fields of the entire semantic triple data, link the entity database and the attribute database for each specific relationship category, and use synonym information to expand the number of queries to be generated.
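The combination and synonym expansion described above can be sketched as follows. This is only an illustrative assumption about one possible realization: the function name `generate_queries`, the dictionary layout keyed by relationship category, and the simple "entity attribute" query form are hypothetical and are not specified in this document.

```python
from itertools import product


def generate_queries(entities_by_category, attributes_by_category,
                     entity_synonyms, attribute_synonyms):
    """Combine entities and attributes per category, expanded with synonyms."""
    queries, seen = [], set()
    for category, entities in entities_by_category.items():
        # Link the entity side and the attribute side through the shared category.
        attributes = attributes_by_category.get(category, [])
        for entity, attribute in product(entities, attributes):
            entity_forms = [entity, *entity_synonyms.get(entity, [])]
            attribute_forms = [attribute, *attribute_synonyms.get(attribute, [])]
            for e, a in product(entity_forms, attribute_forms):
                query = f"{e} {a}"  # assumed query form; real templates may differ
                if query not in seen:
                    seen.add(query)
                    queries.append(query)
    return queries


# Hypothetical sample data.
queries = generate_queries(
    {"city": ["Seoul"]},
    {"city": ["population"]},
    {"Seoul": ["Seoul City"]},
    {"population": ["number of residents"]},
)
```

With one entity synonym and one attribute synonym, a single entity-attribute pair yields four query variants, which is how synonym information increases the number of generated queries.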

This embodiment may further include a screening unit that determines which unique instant answer is correct. The screening unit may judge an answer to be correct when multiple unique instant answer results derived from the query data are identical, or when the answer's own confidence is at or above a specific threshold.
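The two acceptance conditions of the screening unit can be illustrated with the following sketch. The function name `screen_answers`, the parameter `min_agreement`, and the default threshold value are assumptions made for illustration; the specification states only that agreement among multiple results or a confidence at or above a threshold is required.

```python
from collections import Counter


def screen_answers(candidates, min_agreement=2, threshold=0.9):
    """Return the accepted answer, or None if no candidate qualifies.

    candidates: list of (answer, confidence) pairs, e.g. one per passage.
    """
    if not candidates:
        return None
    # Condition 1: several results agree on the same answer.
    counts = Counter(answer for answer, _ in candidates)
    answer, votes = counts.most_common(1)[0]
    if votes >= min_agreement:
        return answer
    # Condition 2: a single result is confident enough on its own.
    best_answer, best_confidence = max(candidates, key=lambda c: c[1])
    if best_confidence >= threshold:
        return best_answer
    return None
```

For example, `screen_answers([("A", 0.3), ("A", 0.4), ("B", 0.5)])` accepts "A" by agreement, while `screen_answers([("A", 0.5), ("B", 0.6)])` rejects both candidates.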

According to another aspect of the present invention, there is provided a semantic triple-based knowledge expansion method comprising: a data update step of updating existing semantic triple data; a query generation step of generating queries by using and combining entity synonyms and attribute synonyms; an actual query acquisition step of acquiring actual user queries based on user logs; a semantic triple extraction step of receiving as input a query generated by the query generation module or an actual user query, performing search targeting by first selecting a group of candidate passages relevant to the query according to the characteristics of the query, searching for passages related to the query, and deriving a unique instant answer based on the retrieved passages and the query data; and a semantic triple conversion step of converting the unique instant answer determined to be correct, together with the query, into a semantic triple in the form of entity, attribute, and instant answer.

In this embodiment, the query generation step may look up and combine the entity fields and attribute fields of the entire semantic triple data, link the entity database and the attribute database for each specific relationship category, and use synonym information to expand the number of queries to be generated.

This embodiment may further include a screening step of determining which unique instant answer is correct. The screening step may judge an answer to be correct when multiple unique instant answer results derived from the query data are identical, or when the answer's own confidence is at or above a specific threshold.

According to yet another aspect of the present invention, there is provided a semantic triple-based knowledge expansion system comprising: a query generation module that generates queries by using and combining entity synonyms and attribute synonyms; a semantic triple extractor that derives a unique instant answer for each generated query; a screening unit that evaluates the results of the semantic triple extractor and produces the unique instant answer determined to be correct and its query; and a semantic triple conversion module that converts that unique instant answer and query into a semantic triple in the form of entity, attribute, and instant answer.

In this embodiment, the semantic triple extractor may include: a passage search module that performs search targeting by first selecting a group of candidate passages relevant to the query according to the characteristics of the query, and searches for passages related to the query; and a machine reading comprehension question-answering module that derives unique instant answers based on the retrieved passages and the query data, deriving for each passage a unique instant answer and the confidence of that answer.
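The two-stage structure of the extractor (passage search followed by machine reading comprehension) can be sketched as follows. The token-overlap scoring is a deliberately simple stand-in chosen for illustration and is not the retrieval method of the specification; the names `retrieve_passages`, `extract_answers`, and the `reader` callable are hypothetical, and `reader` stands for whatever reading comprehension model an implementation would use.

```python
from typing import Callable, List, Tuple


def retrieve_passages(query: str, passages: List[str], top_k: int = 3) -> List[str]:
    """Select candidate passages that share at least one token with the query."""
    query_tokens = set(query.lower().split())
    scored = []
    for passage in passages:
        overlap = len(query_tokens & set(passage.lower().split()))
        if overlap > 0:
            scored.append((overlap, passage))
    # The sort is stable, so passages with equal overlap keep their input order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]


def extract_answers(query: str, passages: List[str],
                    reader: Callable[[str, str], Tuple[str, float]]
                    ) -> List[Tuple[str, float]]:
    """Run the reader once per passage, collecting (answer, confidence) pairs."""
    return [reader(query, passage) for passage in passages]


def toy_reader(query: str, passage: str) -> Tuple[str, float]:
    """Placeholder reader: returns the last token with a fixed confidence."""
    return passage.split()[-1], 0.5
```

Producing one (answer, confidence) pair per passage is what allows a later screening stage to compare answers across passages or apply a confidence threshold.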

According to yet another aspect of the present invention, there is provided a semantic triple-based knowledge expansion method comprising: a query generation step of generating queries by using and combining entity synonyms and attribute synonyms; a semantic triple extraction step of deriving a unique instant answer for each generated query; a screening step of evaluating the results of the semantic triple extractor and producing the unique instant answer determined to be correct and its query; and a semantic triple conversion step of converting that unique instant answer and query into a semantic triple in the form of entity, attribute, and instant answer.

In this embodiment, the semantic triple extraction step may include: a passage search step of performing search targeting by first selecting a group of candidate passages relevant to the query according to the characteristics of the query, and searching for passages related to the query; and a machine reading comprehension question-answering step of deriving unique instant answers based on the retrieved passages and the query data, deriving for each passage a unique instant answer and the confidence of that answer.

In this embodiment, the query generation step may look up and combine the entity fields and attribute fields of the entire semantic triple data, link the entity database and the attribute database for each specific relationship category, and use synonym information to expand the number of queries to be generated.

According to yet another aspect of the present invention, there is provided a semantic triple-based knowledge expansion system comprising: a query generation module that generates queries by using and combining entity synonyms and attribute synonyms; and a semantic triple extractor that receives as input a query generated by the query generation module or an actual user query and derives a unique instant answer for that query, wherein the semantic triple extractor includes: a passage search module that performs search targeting by first selecting a group of candidate passages relevant to the query according to the characteristics of the query, and searches for passages related to the query; and a machine reading comprehension question-answering module that derives unique instant answers based on the retrieved passages and the query data, deriving for each passage a unique instant answer and the confidence of that answer.

According to yet another aspect of the present invention, there is provided a computer program recorded on a computer-readable recording medium for executing the method.

According to the present invention, highly accurate unique instant answers can be provided.

In addition, according to the present invention, queries and answers can be generated automatically, converted into semantic triples in the form of entity, attribute, and instant answer, and added to a database.

FIG. 1 is a diagram illustrating an example of a network environment according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating the internal configuration of a user terminal and a server according to an embodiment of the present invention.
FIG. 3 shows the internal configuration of a processor of a server according to an embodiment of the present invention.
FIGS. 4 and 5 are diagrams showing, in time series, a semantic triple-based knowledge expansion method according to an embodiment of the present invention.
FIG. 6 illustrates a semantic triple-based knowledge expansion system according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a semantic triple-based knowledge expansion method according to an embodiment of the present invention.
FIG. 8 shows, in time series, the operation of a query generation module according to an embodiment of the present invention.
FIG. 9 illustrates query expansion according to an embodiment of the present invention.

The following detailed description of the present invention refers to the accompanying drawings, which show by way of example specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It should be understood that the various embodiments of the invention differ from one another but need not be mutually exclusive. For example, a specific shape, structure, or characteristic described herein in connection with one embodiment may be implemented in another embodiment without departing from the spirit and scope of the invention. It should also be understood that the position or arrangement of individual components within each embodiment may be changed without departing from the spirit and scope of the invention. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the invention should be understood to cover the scope of the appended claims and all equivalents thereof. In the drawings, like reference numerals denote the same or similar components throughout the several views.

FIG. 1 is a diagram illustrating an example of a network environment according to an embodiment of the present invention.

The network environment of FIG. 1 is an example that includes a plurality of user terminals (110, 120, 130, 140), a server (150), and a network (160). FIG. 1 is only an example for describing the invention, and the number of user terminals and servers is not limited to what is shown in FIG. 1.

The plurality of user terminals (110, 120, 130, 140) may be fixed or mobile terminals implemented as computer devices. Examples include smartphones, mobile phones, navigation devices, computers, laptops, digital broadcasting terminals, personal digital assistants (PDAs), portable multimedia players (PMPs), and tablet PCs. For example, user terminal 1 (110) may communicate with the other user terminals (120, 130, 140) and/or the server (150) through the network (160) using a wireless or wired communication method.

The communication method is not limited, and may include not only communication methods that use the communication networks the network (160) may include (for example, a mobile communication network, wired Internet, wireless Internet, or a broadcasting network) but also short-range wireless communication between devices. For example, the network (160) may include any one or more of a personal area network (PAN), a local area network (LAN), a campus area network (CAN), a metropolitan area network (MAN), a wide area network (WAN), a broadband network (BBN), and the Internet. The network (160) may also include any one or more network topologies, including a bus network, a star network, a ring network, a mesh network, a star-bus network, and a tree or hierarchical network, but is not limited thereto.

The server (150) may be implemented as a computer device or a plurality of computer devices that communicate with the plurality of user terminals (110, 120, 130, 140) through the network (160) to provide instructions, code, files, content, services, and the like.

For example, the server (150) may provide a file for installing an application to user terminal 1 (110) connected through the network (160). In this case, user terminal 1 (110) may install the application using the file provided by the server (150). User terminal 1 (110) may also connect to the server (150) under the control of its operating system (OS) and at least one program (for example, a browser or the installed application) and receive services or content provided by the server (150). For example, when user terminal 1 (110), under the control of the application, transmits a content viewing request to the server (150) through the network (160), the server (150) may transmit to user terminal 1 (110) a unique instant answer obtained using the semantic triple-based knowledge expansion system, and user terminal 1 (110) may display the unique instant answer under the control of the application. As another example, the server (150) may establish a communication session for data transmission and reception and route data between the plurality of user terminals (110, 120, 130, 140) through the established session.

FIG. 2 is a block diagram illustrating the internal configuration of a user terminal and a server according to an embodiment of the present invention.

FIG. 2 describes the internal configuration of user terminal 1 (110) as an example of a user terminal and of the server (150) as an example of a server. The other user terminals (120, 130, 140) may have the same or a similar internal configuration.

User terminal 1 (110) and the server (150) may each include a memory (211, 221), a processor (212, 222), a communication module (213, 223), and an input/output interface (214, 224). The memory (211, 221) is a computer-readable recording medium and may include random access memory (RAM), read-only memory (ROM), and a permanent mass storage device such as a disk drive. The memory (211, 221) may also store an operating system and at least one program code (for example, code for a browser installed and running on user terminal 1 (110), or for the application described above). These software components may be loaded from a computer-readable recording medium separate from the memory (211, 221) using a drive mechanism. Such a separate recording medium may include a floppy drive, disk, tape, DVD/CD-ROM drive, or memory card. In other embodiments, software components may be loaded into the memory (211, 221) through the communication module (213, 223) rather than from a computer-readable recording medium. For example, at least one program may be loaded into the memory (211, 221) based on a program (for example, the application described above) that is installed from files provided through the network (160) by developers or by a file distribution system (for example, the server (150) described above) that distributes application installation files.

The processor (212, 222) may be configured to process instructions of a computer program by performing basic arithmetic, logic, and input/output operations. Instructions may be provided to the processor (212, 222) by the memory (211, 221) or the communication module (213, 223). For example, the processor (212, 222) may be configured to execute received instructions according to program code stored in a recording device such as the memory (211, 221).

The communication module (213, 223) may provide a function that allows user terminal 1 (110) and the server (150) to communicate with each other through the network (160), as well as a function for communicating with another user terminal (for example, user terminal 2 (120)) or another server (for example, the server (150)). For example, a request generated by the processor (212) of user terminal 1 (110) according to program code stored in a recording device such as the memory (211) may be transmitted to the server (150) through the network (160) under the control of the communication module (213). Conversely, control signals, instructions, content, files, and the like provided under the control of the processor (222) of the server (150) may pass through the communication module (223) and the network (160) and be received by user terminal 1 (110) through its communication module (213). For example, control signals or instructions of the server (150) received through the communication module (213) may be delivered to the processor (212) or the memory (211), and content or files may be stored in a storage medium that user terminal 1 (110) may further include.

The input/output interface (214, 224) may be a means for interfacing with an input/output device (215). For example, an input device may include a keyboard or a mouse, and an output device may include a display for showing a communication session of the application. As another example, the input/output interface (214) may be a means for interfacing with a device in which input and output functions are integrated, such as a touchscreen. As a more specific example, when the processor (212) of user terminal 1 (110) processes instructions of a computer program loaded into the memory (211), a service screen or content composed using data provided by the server (150) or user terminal 2 (120) may be displayed on the display through the input/output interface (214).

In other embodiments, user terminal 1 (110) and the server (150) may include more components than those shown in FIG. 2. However, there is no need to explicitly illustrate most conventional components. For example, user terminal 1 (110) may be implemented to include at least some of the input/output devices (215) described above, or may further include other components such as a transceiver, a Global Positioning System (GPS) module, a camera, various sensors, and a database.

FIG. 3 shows the internal configuration of a processor according to an embodiment of the present invention.

The processor (212) may include a web browser or an application capable of receiving a web page online and outputting it. As shown in FIG. 3, the semantic triple-based knowledge expansion system according to an embodiment of the present invention may include, within the processor (212), a data update unit (310), a query statement generation module (320), an actual query acquisition unit (330), a semantic triple extractor (340), a screening unit (350), a semantic triple conversion module (360), and a semantic triple adding unit (370). Depending on the embodiment, components of the processor (212) may be selectively included in or excluded from the processor (212). Also depending on the embodiment, the components of the processor (212) may be separated or merged to express the functions of the processor (212).

Here, the components of the processor (212) may be representations of different functions that the processor (212) performs according to instructions provided by program code stored in user terminal 1 (110) (for example, instructions provided by a web browser running on user terminal 1 (110)).

The processor (212) and its components may control user terminal 1 (110) to perform steps S1 to S6 of the semantic triple-based knowledge expansion method of FIG. 4. For example, the processor (212) and its components may be implemented to execute instructions according to the code of the operating system and the code of at least one program contained in the memory (211).

FIGS. 4 and 5 show, in time sequence, a semantic triple-based knowledge expansion method according to an embodiment of the present invention. In the following, the semantic triple-based knowledge expansion method, system, and computer program of the present invention are described in detail with reference to FIGS. 3 and 4 together.

To this end, the differences between the semantic triple-based knowledge expansion method of the present invention and existing search engines are examined first. The semantic triple-based knowledge expansion system according to an embodiment of the present invention may provide an accuracy-oriented Unique Instant Answer. The semantic triple-based knowledge expansion method of the present invention may differ from existing search engines in that it provides search results not as documents but as a Unique Instant Answer, that is, a direct answer.

FIG. 6 illustrates a semantic triple-based knowledge expansion system according to an embodiment of the present invention.

Referring to FIG. 6, an existing search engine (As-Is, Search) takes keywords as input, provides a list of documents as the search result, and runs on a PC or mobile platform.

In contrast, the semantic triple-based knowledge expansion system of the present invention (To-Be, Question-Answering) takes natural-language sentences as input, can provide a specific response, namely a unique instant answer, as the search result, and can be implemented on any platform rather than being limited to PC or mobile.

More specifically, whereas existing search engines take keywords as input, the semantic triple-based knowledge expansion system of the present invention accepts natural-language sentences, so that users can look for information as naturally as if asking a person. In addition, by providing a specific answer as the search result, the system spares users the inconvenience of having to find the answer themselves in the document list that an existing search engine returns, and can provide an optimal search result. Furthermore, as a platform the system is not limited to PC or mobile; it has the advantage that information can be searched immediately, anywhere, on smart machines. Below, the detailed configuration of the semantic triple-based knowledge expansion system and method of the present invention is described with reference mainly to FIGS. 3 and 4.

First, the data update unit (310) obtains previously created data in semantic triple form and, when new data or a user query arises, updates the data accordingly (S1). The semantic triple-based knowledge expansion method according to an embodiment of the present invention assumes that data in semantic triple form has already been created as existing data. That is, if no update such as new data or a user query occurs, the data update process of the present invention does not take place.

Various kinds of data updates may trigger the semantic triple-based knowledge expansion method of the present invention. According to an embodiment of the present invention, the data update unit (310) may update the data when new information such as documents or a database (DB) is updated, when users of the question-answering service of the present invention leave new questions, or when the overall data changes. According to an embodiment of the present invention, the data update unit (310) may perform data updates periodically or at a user's request.

Next, the query statement generation module (320) generates query statements by using and combining entity synonyms and attribute synonyms. More specifically, when the semantic triple-based knowledge expansion method according to an embodiment of the present invention is performed, the query statement generation module (320) generates query statements based on the semantic triple data. The semantic triple-based knowledge expansion system of the present invention may be run when data is updated, periodically, or at a user's request.

Alternatively, when a user adds a rule for query statement generation on the administrator page described later, query statements may be generated based on that rule.

FIG. 7 is a diagram for explaining a semantic triple-based knowledge expansion method according to an embodiment of the present invention.

FIG. 7 shows, as a preferred embodiment of the present invention, an example of performing a semantic triple-based search.

The semantic triple database is a special form of knowledge base that mirrors the queries of actual users, and it allows a Unique Instant Answer to be retrieved without a separate inference process. The semantic triple database has the form entity (732) - attribute (734) - instant answer (738).

In FIG. 7, when the user query (710) "How high is Baekdusan?" is received, the user query is analyzed (720), the key words 'Baekdusan' and 'height' are extracted, and 'Baekdusan' is then interpreted as the subject being asked about and 'height' as the intent of the question.

The administrator setting unit checks the semantic triple DB (730), searches for data with entity = "Baekdusan" and attribute = "height", takes the instant answer of the matching entry as the result, and provides the corresponding answer, 2,744m, to the user (750). With a semantic triple database as described above, the optimal answer can be provided without a separate inference process.

To store the optimal answer determined by the screening unit in semantic triple form, the administrator setting unit checks the form of the user query and the unique instant answer, and converts the user query into an entity (732) and an attribute (734) and the unique instant answer into an instant answer (738). This question conversion process includes natural language understanding technology and technology for searching the entity/attribute data of existing semantic triples.

The semantic triple is a special form of knowledge base that mirrors the queries of actual users and can be regarded as the company's own characteristic DB format. The semantic triple DB takes the form entity - attribute - instant answer, and because of this form a Unique Instant Answer can be retrieved without a separate inference process.

For example, given the question "How high is Baekdusan?", query analysis first identifies Baekdusan as the entity and height as the attribute. The semantic triple DB is then looked up for data with entity = "Baekdusan" and attribute = "height", the instant answer of the matching entry is taken as the result, and that answer can be provided to the user.
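
The lookup described above can be sketched in Python as follows. This is a minimal illustration only, assuming an in-memory dictionary as a stand-in for the semantic triple DB; the names `TRIPLE_DB` and `lookup_instant_answer` are hypothetical and do not come from the specification.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory stand-in for the semantic triple DB:
# (entity, attribute) -> instant answer.
TRIPLE_DB: Dict[Tuple[str, str], str] = {
    ("Baekdusan", "height"): "2,744m",
}


def lookup_instant_answer(entity: str, attribute: str) -> Optional[str]:
    """Return the stored instant answer, or None when no triple matches."""
    return TRIPLE_DB.get((entity, attribute))


print(lookup_instant_answer("Baekdusan", "height"))  # prints 2,744m
```

Because the answer is keyed directly by the entity and attribute pair, retrieval is a single lookup and needs no inference step, which is the property the description emphasizes.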

More specifically, the query statement generation module (320) operates by adding a synonym expansion function on top of the semantic triple DB. The detailed operating process of the query statement generation module (320) is described below with reference to FIG. 8.

FIG. 8 shows, in time sequence, the operation of the query statement generation module according to an embodiment of the present invention.

Referring to FIG. 8, the query statement generation module (320) first looks up the entity field and the attribute field in the entire semantic triple data and combines them (S21). For example, if Baekdusan is an entity and last eruption date is an attribute, the entity and attribute are combined to generate the new question "Baekdusan last eruption date". In this case, the number of possible questions is the product of the number of entries in the entity DB and the number of entries in the attribute DB.
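
As a rough sketch of step S21, the combination can be written as a Cartesian product in Python. The entity "Hallasan" and the function name `generate_queries` are illustrative assumptions, not part of the specification.

```python
from itertools import product

# Illustrative entity and attribute fields looked up from the triple data.
entities = ["Baekdusan", "Hallasan"]
attributes = ["height", "last eruption date"]


def generate_queries(entities, attributes):
    """Pair every entity with every attribute to form candidate queries."""
    return [f"{entity} {attribute}" for entity, attribute in product(entities, attributes)]


queries = generate_queries(entities, attributes)
# The number of candidates equals len(entities) * len(attributes).
print(len(queries), queries)
```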

Next, using separate category information, the query statement generation module (320) may link the entity DB and the attribute DB by specific relationship category, rather than simply multiplying the number of entity DB entries by the number of attribute DB entries (S22). According to an embodiment of the present invention, step S22 may be carried out at the same time as step S21.

More specifically, suppose the entity is a person's name. If questions were generated only as in step S21, as the simple product of the entity and attribute data, ineligible data such as [entity: Yi Sun-sin / attribute: release date / generated question: Yi Sun-sin release date] could be produced. To prevent such ineligible data, the query statement generation module (320) according to an embodiment of the present invention may use category information when generating questions, so that only related entity and attribute information is used to generate query statements.
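
One way to picture the category restriction of step S22 is the Python sketch below. The category table, the placeholder entity "Album X", and the function name are assumptions made for illustration; the specification does not define a concrete data layout.

```python
# Hypothetical category table: each category holds its own entities and attributes.
CATEGORIES = {
    "person": {"entities": ["Yi Sun-sin"], "attributes": ["birth date"]},
    "album": {"entities": ["Album X"], "attributes": ["release date"]},
}


def generate_by_category(categories):
    """Combine entities only with attributes that belong to the same category."""
    queries = []
    for group in categories.values():
        for entity in group["entities"]:
            for attribute in group["attributes"]:
                queries.append(f"{entity} {attribute}")
    return queries


queries = generate_by_category(CATEGORIES)
print(queries)
```

With this grouping, the ineligible combination "Yi Sun-sin release date" is never produced, because "release date" is only paired with entities in the album category.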

Next, the query statement generation module (320) additionally uses entity and attribute synonym information to expand the number of query statements to be generated (S23). The total number of query statements may then be, for each related category, (entities + entity synonyms) * (attributes + attribute synonyms).

FIG. 9 illustrates query statement expansion according to an embodiment of the present invention.

Referring to FIG. 9, when the entities belong to the category of country names, [United States, France, United Kingdom, Republic of Korea, ...], the entity synonyms may be [USA, Korea, ...]. Likewise, when the attributes belong to the category of country information, [official language, form of government, capital, largest city, king, ...], the attribute synonyms may be [queen, republic, state capital, ...]. Considering synonyms together in this way expands the number of query statements.
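
The synonym expansion of step S23 might look like the following Python sketch. The synonym dictionaries and the helper name `expand` are hypothetical; only the example terms are taken from the FIG. 9 description.

```python
from itertools import product


def expand(terms, synonyms):
    """Return each term followed by its synonyms, skipping duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term, *synonyms.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# Country-name entities and country-information attributes (one category).
entities = ["United States", "Republic of Korea"]
entity_synonyms = {"United States": ["USA"], "Republic of Korea": ["Korea"]}
attributes = ["capital", "king"]
attribute_synonyms = {"king": ["queen"]}

all_entities = expand(entities, entity_synonyms)        # 4 terms
all_attributes = expand(attributes, attribute_synonyms)  # 3 terms
queries = [f"{e} {a}" for e, a in product(all_entities, all_attributes)]
print(len(queries))  # 4 * 3 = 12
```

The count matches the formula in the description: (entities + entity synonyms) * (attributes + attribute synonyms) within one category.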

Meanwhile, according to an embodiment of the present invention, the query statement generation module (320) may also generate query statements from all entities and attributes without dividing them by category. Even if some of the resulting queries would be judged unsuitable by an administrator, they can be filtered out by the semantic triple extractor and the screening process. For example, if a unique instant answer already exists for a generated query statement, that query may not be fed into the semantic triple extractor.

In addition, the actual query acquisition unit (330) may input actual user queries based on user logs into the semantic triple extractor (340).

Next, the semantic triple extractor (340) takes as input a query generated by the query statement generation module or an actual user query, and derives a unique instant answer for that query (S3). The semantic triple extractor (340) may include a passage search module (341) and an MRC QA module (342).

The passage search module (341) first selects a group of candidate passages relevant to the query according to the characteristics of the query, thereby targeting the search. Next, the passage search module (341) searches for passages related to the query and passes them to the MRC (Machine Reading Comprehension) QA module. The passage search module (341) may extract multiple passages from a single document or multiple passages from multiple documents. The passage search module (341) may derive passages by applying the TF-IDF algorithm commonly used in existing search engines. The passage search module (341) may also pass to the MRC QA module the top N results whose score is at or above a given value.
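
To make the score cutoff and Top N selection concrete, here is a small Python sketch using a simplified TF-IDF score. The whitespace tokenizer, the smoothed IDF formula, and the parameter names `min_score` and `top_n` are assumptions for illustration; the specification only says that the TF-IDF algorithm is applied.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def rank_passages(query, passages, min_score=0.1, top_n=3):
    """Score passages with a simplified TF-IDF and keep the top N above min_score."""
    docs = [Counter(tokenize(p)) for p in passages]
    n_docs = len(docs)
    scored = []
    for passage, counts in zip(passages, docs):
        length = sum(counts.values()) or 1
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for d in docs if term in d)
            if df == 0:
                continue
            idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF
            score += (counts[term] / length) * idf
        if score >= min_score:
            scored.append((score, passage))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]


passages = [
    "Baekdusan is 2,744 m high",
    "Seoul is the capital of Korea",
    "The height of Baekdusan",
]
results = rank_passages("Baekdusan height", passages, min_score=0.1, top_n=2)
for score, passage in results:
    print(round(score, 3), passage)
```

Only the passages that clear the score cutoff are handed on, and at most `top_n` of them, which mirrors the two limits an administrator is later said to configure.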

The MRC QA module (342) receives the passage results and may derive a Unique Instant Answer based on the passage and question data. The MRC QA module (342) may derive a Unique Instant Answer, together with the confidence of that answer, for each passage. The MRC QA module (342) may also be equipped with multiple MRC QA algorithms.

Finally, the semantic triple extractor (340) passes each unique instant answer and confidence derived by the MRC QA module (342) to the screening unit (350).

Next, the screening unit (350) evaluates the results obtained from the semantic triple extractor (340) and provides the unique instant answer judged correct, together with its query, to the semantic triple conversion module (S4). More specifically, the screening unit (350) may check the results received from the semantic triple extractor (340) and determine whether each result is a correct answer. The screening unit (350) judges a result to be correct based on the confidence reported by the MRC QA module (342) itself, or when multiple results for the same question data agree.

More specifically, the screening unit (350) judges an answer to be correct when its confidence is at or above a specific threshold. According to an embodiment of the present invention, the threshold is initially set to a default value and may later be changed automatically in consideration of the actual history of derived correct answers and question patterns. As a concrete example, even if the initial threshold is set to 90%, if the history of derived correct answers shows that answers to country-related questions with a confidence of 85% or higher were nevertheless selected as correct, the screening unit (350) may automatically update the threshold for country-related questions from 90% to 85%.
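
The threshold adjustment could be sketched as below. This is one possible reading only: the specification does not state the update rule, so lowering the per-category threshold to the lowest confidence that was still confirmed correct is an assumption, as are the class and method names.

```python
DEFAULT_THRESHOLD = 0.90  # initial default value from the example above


class ThresholdTable:
    """Per-category confidence thresholds with a shared default."""

    def __init__(self, default=DEFAULT_THRESHOLD):
        self.default = default
        self.by_category = {}

    def get(self, category):
        return self.by_category.get(category, self.default)

    def update_from_history(self, category, confirmed_confidences):
        """Lower the category threshold to the lowest confidence confirmed correct."""
        if not confirmed_confidences:
            return
        lowest = min(confirmed_confidences)
        if lowest < self.get(category):
            self.by_category[category] = lowest


table = ThresholdTable()
# Assumed history: country-related answers at 0.93 and 0.85 were both confirmed correct.
table.update_from_history("country", [0.93, 0.85])
print(table.get("country"), table.get("person"))
```

Categories without any history keep the default, so the 90% starting point still applies to everything except country-related questions in this example.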

In addition, the screening unit (350) may judge an answer to be correct when multiple unique instant answer results for the same question data agree. If the answer whose confidence is at or above the threshold differs from the answer on which multiple results agree, the screening unit (350) may give priority to the answer on which multiple results agree.

The screening unit (350) may regard an answer as incorrect when the confidence of the MRC QA algorithm is below the specific threshold, when no unique instant answer result coincides with at least one other result, and when the same unique instant answer for the question data does not occur multiple times.
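
The screening rules above can be summarized in a short Python sketch. The function name `screen`, the `(answer, confidence)` tuple layout, and the `min_agreement` parameter (how many matching results count as "multiple") are assumptions; ties between equally frequent answers are resolved by first occurrence here, which the specification does not address.

```python
from collections import Counter


def screen(candidates, threshold=0.90, min_agreement=2):
    """Pick the accepted answer from (answer, confidence) pairs, or None.

    Agreement among multiple results takes priority over a single
    high-confidence result, as described for the screening unit.
    """
    if not candidates:
        return None
    counts = Counter(answer for answer, _ in candidates)
    top_answer, top_count = counts.most_common(1)[0]
    if top_count >= min_agreement:
        return top_answer
    best_answer, best_confidence = max(candidates, key=lambda c: c[1])
    if best_confidence >= threshold:
        return best_answer
    return None  # treated as an incorrect answer


agreeing = [("2,744m", 0.70), ("2,744m", 0.60), ("2,750m", 0.95)]
print(screen(agreeing))  # prints 2,744m
```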

After judging answers correct or incorrect, the screening unit (350) passes the data judged correct to the semantic triple conversion module (360). The information received by the semantic triple conversion module (360) is the question data and the unique instant answer.

Meanwhile, according to an embodiment of the present invention, all results that the screening unit (350) judged to be correct are passed to the administrator page and stored, so that a person can later verify them manually and add them to the semantic triple data. The stored results include the passage, question, Unique Instant Answer, confidence, and information on the MRC QA module used.

Next, the semantic triple conversion module (360) may convert the unique instant answer and the query into a semantic triple in the form of entity, attribute, and instant answer. More specifically, the semantic triple conversion module (360) may obtain the data that the screening unit (350) judged to be correct and convert it into the form Entity, Attribute, Unique Instant Answer. The input provided by the screening unit (350) is the Question and the Unique Instant Answer, and the semantic triple conversion module may convert these into the form Entity, Attribute, Unique Instant Answer. NLP (Natural Language Processing) and NLU (Natural Language Understanding) may be used for the conversion.

Just as when providing an actual semantic triple-based search service, the semantic triple conversion module (360) analyzes the question and decomposes it into an entity and an attribute. As a more specific example, for the question "How high is Baekdusan?" as in the example of FIG. 7, NLP (Natural Language Processing) and NLU (Natural Language Understanding) techniques are used to decompose it into Baekdusan as the entity and height as the attribute. The derived Unique Instant Answer is then paired with that entity and attribute and finally stored in semantic triple form.

In addition, the NLP (Natural Language Processing) and NLU (Natural Language Understanding) technology elements used by the semantic triple conversion module (360) may include a morphological analysis dictionary needed for basic natural language understanding, entity and attribute DB lookup, a rule-based sentence structure analyzer, and similar-query mapping through word embeddings using deep learning.
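
As a much-simplified illustration of the conversion, the Python sketch below uses only dictionary lookup of known entities and attributes. It stands in for the entity and attribute DB lookup element alone; morphological analysis, rule-based parsing, and embedding-based mapping are not modeled, and all names and lookup tables are hypothetical.

```python
from typing import NamedTuple, Optional


class SemanticTriple(NamedTuple):
    entity: str
    attribute: str
    instant_answer: str


# Hypothetical lookup tables: surface token -> canonical entity or attribute.
KNOWN_ENTITIES = {"baekdusan": "Baekdusan"}
KNOWN_ATTRIBUTES = {"height": "height", "high": "height", "tall": "height"}


def to_triple(question: str, answer: str) -> Optional[SemanticTriple]:
    """Split a question into entity and attribute and pair them with the answer."""
    tokens = [token.strip("?.,!").lower() for token in question.split()]
    entity = next((KNOWN_ENTITIES[t] for t in tokens if t in KNOWN_ENTITIES), None)
    attribute = next((KNOWN_ATTRIBUTES[t] for t in tokens if t in KNOWN_ATTRIBUTES), None)
    if entity is None or attribute is None:
        return None  # the question could not be decomposed
    return SemanticTriple(entity, attribute, answer)


print(to_triple("How high is Baekdusan?", "2,744m"))
```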

Next, when the semantic triple conversion module (360) generates and delivers an Entity, Attribute, and Unique Instant Answer, the semantic triple adding unit (370) may reflect them in the DB, automatically adding the new or updated semantic triple to the DB.
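
Adding a new or updated triple amounts to an insert-or-overwrite operation, sketched here in Python. The dictionary-based store and the function name `add_triple` are assumptions for illustration.

```python
def add_triple(db, entity, attribute, instant_answer):
    """Insert or overwrite a triple and report whether it was new or updated."""
    key = (entity, attribute)
    status = "updated" if key in db else "new"
    db[key] = instant_answer
    return status


triple_db = {}
first = add_triple(triple_db, "Baekdusan", "height", "2,744m")
second = add_triple(triple_db, "Baekdusan", "height", "2,744m")
print(first, second)  # prints new updated
```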

Meanwhile, according to an embodiment of the semantic triple-based knowledge expansion system of the present invention, an administrator page may additionally be provided. Using the administrator page, an administrator can manage the system as a whole and manually modify, delete, update, or add semantic triples. More specifically, using the administrator page the administrator can adjust the operating cycle and execution of the semantic triple-based knowledge expansion platform (periodic updates, or manual start of the system), check the generated query statements and the queries of actual users, and add rule-based query generation rules to the query statement generation module so that query statements of other patterns can be generated. The administrator can also use the administrator page to specify, for the semantic triple extractor (340), the TF-IDF score of the passage search module and the maximum number of passages to pass on, and to add or delete MRC QA algorithms. In addition, the administrator can use the administrator page to set the initial confidence value of the screening unit (350), review all results, manually add semantic triples, and check other system monitoring information.

The semantic triple-based knowledge expansion system according to an embodiment of the present invention may provide a highly accurate Unique Instant Answer. The semantic triple-based knowledge expansion method of the present invention may differ from existing search engines in that it provides search results not as documents but as a Unique Instant Answer, that is, a direct answer.

In addition, the semantic triple-based knowledge expansion system according to an embodiment of the present invention can build a semantic triple-based knowledge expansion platform for knowledge expansion by combining a special form of knowledge base (KB) called the semantic triple, Machine Reading Comprehension (MRC) technology that finds the answer to a question within a paragraph, and self-developed technology that finds the relevant paragraph using traditional Information Retrieval (IR).

The embodiments of the present invention described above may be implemented in the form of a computer program executable through various components on a computer, and such a computer program may be recorded on a computer-readable medium. The medium may store a computer-executable program continuously, or store it for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or several pieces of hardware combined; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and the like, configured to store program instructions. Other examples include recording or storage media managed by app stores that distribute applications, or by sites and servers that supply or distribute various other software.

Although the present invention has been described above with reference to specific matters such as particular components, and to limited embodiments and drawings, these are provided only to aid a more general understanding of the invention. The present invention is not limited to the above embodiments, and a person of ordinary skill in the art to which the invention pertains can make various modifications and changes from this description.

Accordingly, the spirit of the present invention should not be confined to the embodiments described above; not only the claims set out below but also all scope equal to, or equivalently modified from, those claims falls within the scope of the spirit of the present invention.

110, 120, 130, 140: plurality of user terminals
150: server
160: network
211, 221: memory
212, 222: processor
213, 223: communication module
214, 224: input/output interface

Claims

A semantic triple-based knowledge expansion system, comprising:
a data update unit that updates existing semantic triple data;
a query generation module that generates queries by using and combining entity synonyms and attribute synonyms;
an actual query acquisition unit that acquires actual user queries based on user logs;
a semantic triple extractor that takes as input a query generated by the query generation module or the actual user query, first selects a group of candidate passages that are relevant according to the characteristics of the query so as to target the search, searches for passages related to the query, and derives a unique instant answer based on the retrieved passages and the query data; and
a semantic triple conversion module that converts the unique instant answer that is the correct answer, together with the query, into a semantic triple in the form of entity, attribute, and instant answer.

The semantic triple-based knowledge expansion system of claim 1,
wherein the query generation module
looks up and combines the entity field and the attribute field across the entire semantic triple data, links an entity database and an attribute database for each specific relation category, and expands the number of queries to be generated by using synonym information.
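By way of a non-limiting example, the combination of entity synonyms and attribute synonyms recited above may be sketched as follows. The function name `generate_queries`, the `template` parameter, and the sample synonyms are hypothetical; this specification does not prescribe a particular query template.

```python
from itertools import product


def generate_queries(entity_synonyms, attribute_synonyms,
                     template="{entity} {attribute}"):
    """Combine every entity synonym with every attribute synonym.

    Each added synonym multiplies the number of generated queries, which is
    how synonym information expands the query set. Duplicate strings are
    dropped while the first-seen order is preserved.
    """
    queries = []
    seen = set()
    for entity, attribute in product(entity_synonyms, attribute_synonyms):
        query = template.format(entity=entity, attribute=attribute)
        if query not in seen:
            seen.add(query)
            queries.append(query)
    return queries


# Illustrative usage: 2 entity synonyms x 2 attribute synonyms.
queries = generate_queries(["Eiffel Tower", "Tour Eiffel"],
                           ["height", "how tall"])
```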

The semantic triple-based knowledge expansion system of claim 1,
further comprising a screening unit that identifies the unique instant answer that is the correct answer,
wherein the screening unit
determines an answer to be correct when multiple unique instant answer results obtained for the query data coincide, or when the answer's own confidence is at or above a specific threshold.
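As an illustrative, non-limiting sketch, the screening rule recited above (accept an answer when multiple results coincide, or when its confidence reaches a threshold) could be expressed as follows. The function name `screen`, the parameters `min_agreement` and `confidence_threshold`, and their default values are assumptions; the claim does not fix how many results must agree or what the threshold is.

```python
from collections import Counter


def screen(candidates, min_agreement=2, confidence_threshold=0.9):
    """Return the accepted answer, or None if no candidate passes screening.

    candidates: list of (answer, confidence) pairs, one per passage.
    """
    if not candidates:
        return None

    # Rule 1: several passages yield the same unique instant answer.
    counts = Counter(answer for answer, _ in candidates)
    answer, votes = counts.most_common(1)[0]
    if votes >= min_agreement:
        return answer

    # Rule 2: a single answer whose own confidence meets the threshold.
    best_answer, best_confidence = max(candidates, key=lambda c: c[1])
    if best_confidence >= confidence_threshold:
        return best_answer

    return None
```

In this sketch agreement is checked before confidence, so an answer repeated across passages is accepted even when each individual confidence is low; the opposite ordering would be an equally valid reading of the claim.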

A semantic triple-based knowledge expansion method, comprising:
a data update step of updating existing semantic triple data;
a query generation step of generating queries by using and combining entity synonyms and attribute synonyms;
an actual query acquisition step of acquiring actual user queries based on user logs;
a semantic triple extraction step of taking as input a query generated by the query generation module or the actual user query, first selecting a group of candidate passages that are relevant according to the characteristics of the query so as to target the search, searching for passages related to the query, and deriving a unique instant answer based on the retrieved passages and the query data; and
a semantic triple conversion step of converting the unique instant answer that is the correct answer, together with the query, into a semantic triple in the form of entity, attribute, and instant answer.

The semantic triple-based knowledge expansion method of claim 4,
wherein the query generation step
looks up and combines the entity field and the attribute field across the entire semantic triple data, links an entity database and an attribute database for each specific relation category, and expands the number of queries to be generated by using synonym information.

The semantic triple-based knowledge expansion method of claim 4,
further comprising a screening step of identifying the unique instant answer that is the correct answer,
wherein the screening step
determines an answer to be correct when multiple unique instant answer results obtained for the query data coincide, or when the answer's own confidence is at or above a specific threshold.

A semantic triple-based knowledge expansion system, comprising:
a query generation module that generates queries by using and combining entity synonyms and attribute synonyms;
a semantic triple extractor that derives a unique instant answer for the generated query;
a screening unit that evaluates the results of the semantic triple extractor and produces the unique instant answer that is the correct answer, together with the query; and
a semantic triple conversion module that converts the unique instant answer that is the correct answer, together with the query, into a semantic triple in the form of entity, attribute, and instant answer.

The semantic triple-based knowledge expansion system of claim 7,
wherein the semantic triple extractor comprises:
a passage search module that first selects a group of candidate passages that are relevant according to the characteristics of the query so as to target the search, and searches for passages related to the query; and
a machine reading comprehension question-answering module that derives a unique instant answer based on the retrieved passages and the query data, deriving for each of the passages a unique instant answer and the confidence of that answer.
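For illustration only, the two-stage arrangement recited above (candidate targeting and passage search, followed by a per-passage answer with a confidence value) might be sketched as below. The category-based targeting, the term-overlap ranking, the `reader` callable standing in for the machine reading comprehension model, and all function and field names are assumptions and are not the retrieval or reading technique of this specification.

```python
def select_candidates(query_category, passages):
    """First-stage targeting: keep only passages in the query's category."""
    return [p for p in passages if p["category"] == query_category]


def rank_passages(query, passages, top_k=3):
    """Rank candidates by the number of query terms they share (toy IR score)."""
    query_terms = set(query.lower().split())
    scored = []
    for passage in passages:
        overlap = len(query_terms & set(passage["text"].lower().split()))
        if overlap:
            scored.append((overlap, passage))
    # Stable sort: passages with equal overlap keep their original order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]


def extract_answers(query, passages, reader):
    """Run the reader on each passage; it returns (answer, confidence)."""
    return [reader(query, passage["text"]) for passage in passages]


# Illustrative usage with a stub reader that returns the last two tokens.
passages = [
    {"category": "landmark", "text": "The Eiffel Tower height is 330 m"},
    {"category": "landmark", "text": "The Louvre is a museum"},
    {"category": "food", "text": "Eiffel Tower shaped cake height"},
]
stub_reader = lambda query, text: (" ".join(text.split()[-2:]), 0.5)
candidates = select_candidates("landmark", passages)
answers = extract_answers("Eiffel Tower height",
                          rank_passages("Eiffel Tower height", candidates),
                          stub_reader)
```

The per-passage (answer, confidence) pairs produced here are the kind of input a screening stage would consume.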

The semantic triple-based knowledge expansion system of claim 7,
wherein the query generation module
looks up and combines the entity field and the attribute field across the entire semantic triple data, links an entity database and an attribute database for each specific relation category, and expands the number of queries to be generated by using synonym information.

A semantic triple-based knowledge expansion method, comprising:
a query generation step of generating queries by using and combining entity synonyms and attribute synonyms;
a semantic triple extraction step of deriving a unique instant answer for the generated query;
a screening step of evaluating the results of the semantic triple extractor and producing the unique instant answer that is the correct answer, together with the query; and
a semantic triple conversion step of converting the unique instant answer that is the correct answer, together with the query, into a semantic triple in the form of entity, attribute, and instant answer.

The semantic triple-based knowledge expansion method of claim 10,
wherein the semantic triple extraction step comprises:
a passage search step of first selecting a group of candidate passages that are relevant according to the characteristics of the query so as to target the search, and searching for passages related to the query; and
a machine reading comprehension question-answering step of deriving a unique instant answer based on the retrieved passages and the query data, deriving for each of the passages a unique instant answer and the confidence of that answer.

The semantic triple-based knowledge expansion method of claim 10,
wherein the query generation step
looks up and combines the entity field and the attribute field across the entire semantic triple data, links an entity database and an attribute database for each specific relation category, and expands the number of queries to be generated by using synonym information.

A semantic triple-based knowledge expansion system, comprising:
a query generation module that generates queries by using and combining entity synonyms and attribute synonyms; and
a semantic triple extractor that takes as input a query generated by the query generation module or an actual user query and derives a unique instant answer for the generated query,
wherein the semantic triple extractor comprises:
a passage search module that first selects a group of candidate passages that are relevant according to the characteristics of the query so as to target the search, and searches for passages related to the query; and
a machine reading comprehension question-answering module that derives a unique instant answer based on the retrieved passages and the query data, deriving for each of the passages a unique instant answer and the confidence of that answer.

The semantic triple-based knowledge expansion system of claim 13,
wherein the query generation module
looks up and combines the entity field and the attribute field across the entire semantic triple data, links an entity database and an attribute database for each specific relation category, and expands the number of queries to be generated by using synonym information.

The semantic triple-based knowledge expansion system of claim 13,
further comprising a screening unit that identifies the unique instant answer that is the correct answer,
wherein the screening unit
determines an answer to be correct when multiple unique instant answer results obtained for the query data coincide, or when the answer's own confidence is at or above a specific threshold.

A computer program recorded on a computer-readable recording medium for executing the method according to any one of claims 4 to 6 and claims 10 to 12.