KR20220108971A

KR20220108971A - Method for suggesting promising convergence items and suggesting related company-engineer based on machine learning related to Artificial Intelligence technology

Info

Publication number: KR20220108971A
Application number: KR1020210012096A
Authority: KR
Inventors: 최낙식; 고민철; 강민수
Original assignee: 주식회사 케이프로텍; (주)광개토연구소
Priority date: 2021-01-28
Filing date: 2021-01-28
Publication date: 2022-08-04

Abstract

The present invention relates to a method for discovering convergence items with high future promise for each bio item and recommending related companies and researchers. The method comprises: A) a step of receiving at least one reference item after the following have been performed: a first state in which association data between reference items is generated using individual documents and the reference items extracted from them; a second state in which, for each reference item pair in the set of reference item pairs included in the association data, at least one of target variable values for at least two target variables and explanatory variable values for at least one explanatory variable is generated; and a third state in which a predictive model is established by applying a machine learning algorithm to the explanatory variable values and the target variable values; B) a step of generating recommended item data for at least two predicted items by applying the received reference item to the predictive model; and C) a step of generating at least one piece of metadata that is applied to the recommended item data and can be used to select recommended items. The present invention can improve the future marketability of convergence items.

Description

Machine learning-based method in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers {Method for suggesting promising convergence items and suggesting related company-engineer based on machine learning related to Artificial Intelligence technology}

The present invention relates to a method for discovering convergence items with high future promise for each bio item and recommending related companies and researchers. More specifically, it belongs to the technical field of methods that, based on machine learning in the AI (Artificial Intelligence) technology field, extract, select, and recommend promising convergence items related to an item held by a company from documents such as patents or papers, and that also recommend, from the same kinds of documents, the companies and researchers related to those convergence items.

Applying convergence technologies such as nanotechnology and biotechnology to disease diagnosis and treatment is expected to overcome the technical limitations of current medical technology and to create new functions.

Taking nano-bio convergence technology as one example of convergence technology, the Comprehensive Nanotechnology Development Plan defines 'nano-bio technology' as 'the science and technology of manipulating, analyzing, and controlling, at the nanometer scale, biosystems and the convergence systems in which they are combined with nanostructures'.

As an example of convergence technology, biotechnology combined with nanotechnology is the subject of extensive research and policy support, particularly for application to medical and other human treatment, genetic mapping, biotechnology, microelectromechanical systems, biosensors, and biochip technology. Because of the academic, commercial, and sociocultural importance of nanobiotechnology and the expectations for its future development, governments around the world are accelerating their R&D investment in it. The United States, for example, established the 'National Nanotechnology Initiative' in 2001 and set ten major nanotechnology R&D goals to be achieved by 2015. Four of these concerned nanobiotechnology, specifically (1) early detection, diagnosis, and cure of cancer, (2) establishment of nanoscale drug synthesis and delivery systems, (3) development of nanoscale convergence technologies such as artificial organs, and (4) development of biocompatible materials and systems.

As another example, discovering other items that can be converged with a given substance or element uncovers new technological opportunities for that substance, and the market outlook for future growth businesses built on developing convergence items for that substance is also very bright.

However, it is very difficult for researchers, developers, or groups of them to discover, through collective or individual intelligence, the items that are highly likely to converge with a specific item, and the effort tends to be unproductive and inefficient relative to its cost and time. This creates a strong need to use big data from technology-related documents such as global patent documents, papers, and social sources, which embody collective technical intelligence. Accordingly, an economical and efficient approach would be to use this collective technical intelligence data to discover items with high convergence potential through a recommendation system, and then to apply them to the specific item for industrial use.

However, global patent documents, papers, and social sources, which embody collective technical intelligence, accumulate in vast quantities over time and originate from widely scattered sources. If the discovery of convergence candidates is applied naively, the number of candidates becomes so large that the results lose practical value. It is therefore necessary to screen the recommended items and to tailor recommendations to the company or individual that holds the specific item. In particular, applying a quantified evaluation scheme to item-by-item recommendations would make it possible to pick out, from a large number of recommended items, the convergence items with a high likelihood of actual commercial success.

Moreover, even when a recommended convergence item is rated highly for marketability, technical merit, and business feasibility, the company or individual holding the specific item may lack the practical know-how or proven technology needed to commercialize that promising convergence item. Recommending companies and researchers who could assist in the convergence process, or who could be merged with, acquired, or recruited, can therefore raise the likelihood that the convergence item is actually realized.

At present, there is no technology that can satisfy the various industrial needs described above.

"Method and system for analyzing promising technology groups in the field of geological resources using patent information" (Publication No. 10-2017-0030016, Patent Document 1) exists as related art, but it does not recommend other items to be converged with geological resources, nor related companies and researchers.

Publication No. 10-2017-0030016

The machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention were devised to solve the conventional problems described above, and address the following problems.

First, based on large volumes of patent big data, paper data, social data, and the like, the invention aims to efficiently discover convergence items suited to the technology (substance) held by a company, using a logical and systematic method.

Second, it aims to select, from the discovered convergence data, the convergence items that have future promise.

Third, it aims to recommend, together with the discovered, selected, and recommended convergence items, technical actors who have practical knowledge of those items, so that the commercialization of the convergence items becomes practically achievable.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the following description.

The machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention have the following means for solving the problems described above.

The machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention may comprise: A) a step of receiving at least one reference item, in a state where the following have been performed: a first state in which association data between reference items has been generated using the individual documents that make up a collective technical intelligence document set and the reference items extracted from those individual documents; a second state in which, for reference item pairs selected from the set of reference item pairs included in the association data, at least one of target variable values for at least two target variables and explanatory variable values for at least one explanatory variable has been generated for each reference item pair; and a third state in which a predictive model has been established by applying a machine learning algorithm to the explanatory variable values of each explanatory variable and the target variable values of each target variable; B) a step of generating recommended item data for at least two predicted items by applying the received reference item to the predictive model; and C) a step of processing the generation of at least one piece of metadata that is applied to the recommended item data and can be used to select recommended items.
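Steps A) to C) above can be illustrated with a minimal sketch. The publication does not specify the machine learning algorithm, so the "model" here is only a co-occurrence frequency estimate of P(partner item | reference item) standing in for the trained predictive model. The corpus `DOCS`, the field names, and the functions `fit`, `recommend`, and `with_metadata` are all hypothetical.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy corpus: each document lists its extracted reference items
# and one piece of person information (the applicant).
DOCS = [
    {"id": "P1", "items": {"graphene", "biosensor"}, "applicant": "Acme"},
    {"id": "P2", "items": {"graphene", "biosensor", "glucose"}, "applicant": "Beta"},
    {"id": "P3", "items": {"graphene", "battery"}, "applicant": "Acme"},
]

def fit(docs):
    """Stand-in for the first to third states: count items and ordered item pairs."""
    item_n, pair_n = Counter(), Counter()
    for doc in docs:
        item_n.update(doc["items"])
        pair_n.update(permutations(sorted(doc["items"]), 2))
    return item_n, pair_n

def recommend(model, ref, top_k=2):
    """Step B: score each partner item by its estimated P(partner | ref)."""
    item_n, pair_n = model
    scored = [(other, n / item_n[ref])
              for (first, other), n in pair_n.items() if first == ref]
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scored, key=lambda s: (-s[1], s[0]))[:top_k]

def with_metadata(docs, ref, recs):
    """Step C: attach applicants of documents that contain both items."""
    return [{"item": item, "score": score,
             "applicants": sorted({d["applicant"] for d in docs
                                   if {ref, item} <= d["items"]})}
            for item, score in recs]

model = fit(DOCS)                      # performed before step A
recs = recommend(model, "graphene")    # step A receives "graphene"; step B scores
result = with_metadata(DOCS, "graphene", recs)
print(result)
```

In a fuller implementation the frequency estimate would be replaced by a model trained on the explanatory and target variable values of each reference item pair; the three-stage shape of the flow stays the same.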

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the predictive model may be established using a partition of the set of reference item pairs into at least two reference item partition sets, the training may be performed using the reference item partition sets, and the partition may be a matrix partition of two or more dimensions that applies at least one preset factor series singly, sequentially, in combination, or in compound form.
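The two-dimensional matrix partition described above can be sketched as follows, assuming each reference item pair has already been annotated with values for two factor series. The annotation keys `category` and `org_type`, the sample pairs, and the function `matrix_partition` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical reference item pairs, each annotated with two factor series values.
PAIRS = [
    {"pair": ("graphene", "biosensor"), "category": "material", "org_type": "company"},
    {"pair": ("insulin", "glucose"), "category": "material", "org_type": "university"},
    {"pair": ("stent", "polymer"), "category": "product", "org_type": "company"},
    {"pair": ("catheter", "coating"), "category": "product", "org_type": "company"},
]

def matrix_partition(pairs, row_factor, col_factor):
    """Group pairs into cells keyed by (row factor value, column factor value)."""
    cells = defaultdict(list)
    for record in pairs:
        cells[(record[row_factor], record[col_factor])].append(record["pair"])
    return dict(cells)

cells = matrix_partition(PAIRS, "category", "org_type")
for key, members in sorted(cells.items()):
    # Each cell is one partition set on which a separate model could be trained.
    print(key, len(members))
```

Every pair lands in exactly one cell, so the cells form a partition of the pair set; adding a third factor to the key would extend the same idea to three dimensions.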

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the factor series for the reference item may be at least one of: i) at least one category classification attribute series, ii) at least one person classification attribute series, iii) at least one quantitative value classification attribute series, and iv) at least one evaluation value classification attribute series.

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the category classification attribute series may be a taxonomy that can be assigned in common to at least two reference items, and the category classification may have a hierarchical structure at least one level deep. The hierarchical structure includes at least one of product-part, material-substance, and disease. The material-substance includes at least one of the material-substances listed in (1) below, the product-part includes at least one of the product-parts listed in (2) below, and the disease includes at least one of the Korean Standard Classification of Diseases and Causes of Death (KCD-8) and the International Classification of Diseases, ICD-10 (International Classification of Diseases-10).

(1) Material-substance: FDA-approved substances, pharmaceuticals, genes, biochemical substances, enzymes, chemical substances, polymer materials, medical materials, natural materials, natural substances, building materials, materials for manufactured parts, fuels, fuel additives and lubricants, food and beverages, laboratory reagents and materials, materials for paper or sanitary equipment

(2) Product-part: IT and computers; medical, pharmaceutical and cosmetics; automotive and transportation; electronic components; semiconductors; building and civil engineering; material handling, conditioning and storage; printing, photographic and audiovisual equipment; optics; power generation, distribution and transmission; electrical systems and lighting; heating, cooling and ventilation; filters, pipes and tubes; manufacturing and processing; machine elements or units; safety; weapons; services and finance; laboratory and measurement; agricultural products, food and tobacco; personal, household and office supplies; sports, arts, games, toys and educational materials; goods and parts in general

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the person classification may be at least one of a classification of legal entities and a classification of natural persons. For a patent document, the legal entity is the applicant, current rights holder, plaintiff, defendant, buyer, seller, licensee, or licensor, or another non-natural-person entity named in the patent document (a university industry-academic cooperation foundation, research institute, association, or state); for a paper, it is the organization to which the researcher belongs. The natural person is the inventor in the case of a patent document and the author in the case of a paper. The person classification is derived from the person information generated in relation to the individual documents that contain the reference item. For a legal entity, the person classification is at least one of: the entity's name; its nationality; its organizational attribute (classified as one of company, university, research institute, or association); the group attribute of the group to which it belongs (classified as listed company, externally audited company, or other company); and the classification attribute under which it is classified (an industry classification attribute according to SIC/KSIC, a stock classification attribute by securities market, or a classification attribute from a technology-product-service perspective).

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the quantitative value classification may be a quantitative value that can be measured from the individual documents containing the reference item. The quantitative value is at least one of: the number of documents, per time unit or independent of time; the number of related documents (including one or more of references, forward citations, or similar documents); the number of events related to the documents (the events include litigation, trials, standards, FDA approval, transactions, and license agreements); a processed value of one or more of these counts (including a ratio or calculated value for the rate of increase or decrease); and a processed value related to the person classification (share or concentration rate). The quantitative value classification for each reference item is generated using the quantitative values measurable from the individual documents. The evaluation value classification is either generated for each reference item from evaluation or predicted values obtained by feeding two or more measurements of the individual documents into a model or formula, or is an evaluation or predicted value obtained by feeding measurements taken collectively over the document set corresponding to each reference item into a model or formula.
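Two of the processed quantitative values named above, the rate of increase or decrease in document counts and the share per applicant, can be computed as in the following sketch. The publication gives no formulas, so year-over-year relative change and simple proportion are assumptions, as are the sample records and the function names.

```python
from collections import Counter

# Hypothetical documents that contain one reference item.
DOCS = [
    {"year": 2019, "applicant": "A"},
    {"year": 2019, "applicant": "B"},
    {"year": 2020, "applicant": "A"},
    {"year": 2020, "applicant": "A"},
    {"year": 2020, "applicant": "B"},
    {"year": 2020, "applicant": "C"},
]

def counts_by_year(docs):
    """Number of documents per time unit (here: per year)."""
    return Counter(doc["year"] for doc in docs)

def growth_rate(counts, year):
    """Relative change against the previous year; None if there is no baseline."""
    previous = counts.get(year - 1, 0)
    if previous == 0:
        return None
    return (counts.get(year, 0) - previous) / previous

def applicant_shares(docs):
    """Share of documents held by each applicant (a person-related processed value)."""
    per_applicant = Counter(doc["applicant"] for doc in docs)
    total = sum(per_applicant.values())
    return {name: n / total for name, n in per_applicant.items()}

counts = counts_by_year(DOCS)
shares = applicant_shares(DOCS)
print(growth_rate(counts, 2020), shares)
```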

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the predictive model may be established separately for each prediction period, and in the predictive model the target variable values of each target variable and the explanatory variable values of each explanatory variable may be generated separately for each of at least two sets divided on the basis of the prediction period.
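One way to read the per-period setup above is as a time split: explanatory variables are measured up to a cutoff, and the target variable is measured in the window that follows, once per prediction period. The sketch below builds one such training set per horizon. The cutoff year, the horizons, the single explanatory variable (co-occurrence count up to the cutoff), and the binary target are all illustrative assumptions, not details from the publication.

```python
# Hypothetical data: reference item pair -> years in which a document mentions both items.
PAIR_YEARS = {
    ("a", "b"): [2015, 2016, 2018],
    ("a", "c"): [2016, 2021],
    ("b", "c"): [2014],
}

def build_dataset(pair_years, cutoff, horizon):
    """Return (pair, x, y) rows for one prediction period.

    x: number of co-occurrence documents up to and including the cutoff year.
    y: 1 if the pair co-occurs again within `horizon` years after the cutoff, else 0.
    """
    rows = []
    for pair, years in pair_years.items():
        x = sum(1 for year in years if year <= cutoff)
        y = int(any(cutoff < year <= cutoff + horizon for year in years))
        rows.append((pair, x, y))
    return rows

CUTOFF = 2017
# One training set per prediction period; a separate model would be fit on each.
datasets = {h: build_dataset(PAIR_YEARS, CUTOFF, h) for h in (1, 5)}
for horizon, rows in datasets.items():
    print(horizon, rows)
```

Keeping the explanatory window strictly before the target window avoids leaking future information into the features.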

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the predicted value generated by the predictive model may be, for each reference item, a predicted probability value or a predicted frequency value for the predicted item that forms a reference item pair with it.

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the metadata may be at least one factor series for the recommended item, or at least one factor series value for each of the factor series. The factor series for the recommended item is at least one of: i) at least one category classification attribute series, ii) at least one person classification attribute series, iii) at least one quantitative value classification attribute series, and iv) at least one evaluation value classification attribute series.

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the explanatory variable values of each explanatory variable may be generated for at least one explanatory variable type, and the predictive model may be generated for each explanatory variable type.

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the metadata may be evaluation information on the recommended item. The evaluation information is generated by establishing an evaluation model or predictive model for the group attributes of the document set that makes up the recommended item, measuring the group attributes of the document set corresponding to the recommended item as of a preset point in time, and feeding the measurements into the evaluation model or predictive model. The evaluation model is established through steps including: a) generating, for each reference item pair, target variable values for at least two target variables and explanatory variable values for each explanatory variable of at least one explanatory variable type; and b) applying a machine learning algorithm to the explanatory variable values of each explanatory variable and the target variable values of each target variable.

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, establishing the evaluation model may further include: c) generating validation data for the machine learning algorithm.

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the metadata may be at least one of information on legal entities and information on natural persons related to the recommended item, and the metadata may be generated by at least one of: a first generation method, in which it is provided per group attribute of the document set corresponding to the recommended item; or a second generation method, in which it is generated through network analysis using the set of reference items included in the documents corresponding to the recommended item, the classification of the reference items, the in-document characteristics of the reference items per document, or document features.

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, in the first generation method, the group attribute of the document set may be at least one of: i) the total amount, rate of increase or decrease, and density per time interval of at least one quantitative value of the documents; ii) the total amount, rate of increase or decrease, and density per time interval of documents related to special events including at least one of litigation, trials, transactions, standards, and FDA approval; and iii) the concentration rate or share, per at least one quantitative value, of the specific item relative to the whole or to a higher-level category. The natural persons or legal entities may be provided by country, by nationality, or by actual or estimated classification under at least one classification scheme.

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, in the network analysis of the second generation method: the reference item set may be at least one of keywords and key phrases; or the document features may be at least one of patent classification information such as CPC/IPC, number information such as reference document numbers or forward-citation document numbers, and person information including rights holders or inventors; or the in-document information of the reference item per document may be information on at least one measure of how representative the reference item is of the document, for example TF-IDF (term frequency-inverse document frequency); or, for the reference item set or document features, at least one of total amounts, time-series variation values, and ratios or positions within a distribution, such as share and concentration rate, may be used. The network analysis is performed from the viewpoint of at least one of relatedness, overlap, commonality, complementarity, and mediation.
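TF-IDF is named above as one measure of how representative a reference item is of a document. The sketch below computes one common variant (relative term frequency multiplied by the natural log of N/df); the publication does not say which variant is used, and the sample documents and the function `tf_idf` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical documents, each given as the list of reference items (keywords) it contains.
DOCS = {
    "D1": ["graphene", "biosensor", "graphene"],
    "D2": ["biosensor", "glucose"],
    "D3": ["glucose", "insulin", "glucose"],
}

def tf_idf(docs):
    """Return {doc: {term: score}} with score = (count / doc length) * ln(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for terms in docs.values() for term in set(terms))
    scores = {}
    for doc, terms in docs.items():
        tf = Counter(terms)
        scores[doc] = {term: (count / len(terms)) * math.log(n_docs / df[term])
                       for term, count in tf.items()}
    return scores

scores = tf_idf(DOCS)
# "graphene" is frequent in D1 and rare elsewhere, so it scores above "biosensor".
print(scores["D1"])
```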

In the machine learning-based method and system in the AI technology field for recommending highly promising convergence items for each bio item and related companies and researchers according to the present invention, the association data between reference items may be generated as mapping matrix information of the reference items extracted from each document.
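A document-by-item mapping matrix of the kind described above, and the pair association counts that can be derived from it, might look like the following sketch. The binary encoding, the sample corpus, and the functions `mapping_matrix` and `pair_counts` are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: document id -> reference items extracted from it.
DOCS = {
    "D1": ["graphene", "biosensor", "glucose"],
    "D2": ["graphene", "biosensor"],
    "D3": ["glucose", "insulin"],
}

def mapping_matrix(docs):
    """Return (items, rows); rows[doc] is a 0/1 vector over the sorted item list."""
    items = sorted({item for extracted in docs.values() for item in extracted})
    rows = {doc: [1 if item in extracted else 0 for item in items]
            for doc, extracted in docs.items()}
    return items, rows

def pair_counts(items, rows):
    """Count, for every item pair, the number of documents containing both items."""
    counts = Counter()
    for vector in rows.values():
        present = [items[i] for i, flag in enumerate(vector) if flag]
        counts.update(combinations(present, 2))
    return counts

items, rows = mapping_matrix(DOCS)
pairs = pair_counts(items, rows)
print(items)
print(rows)
print(pairs)
```

Because the item list is sorted, each unordered pair is always recorded in the same alphabetical order, so counts for the same pair never split across two keys.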

The method and system according to the present invention, configured as described above, for recommending highly promising convergence items for each bio-item and related companies and researchers based on machine learning in the AI technology field provide the following effects.

First, machine learning is used to screen, systematically and at scale, convergence items for a technology (material) held by a company, so that recommendations focus on items with high convergence potential for that technology (material).

Second, the predictive model predicts and verifies the future promise of convergence items from trends in past and present data, thereby improving the future marketability, technical viability, and business feasibility of those items.

Third, along with discovering promising future convergence items, the invention provides information on companies and researchers that hold practical knowledge and empirical data on those items, which helps raise the technical maturity of the convergence items and bring forward their commercialization.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 is a block diagram showing the main configuration of a system for recommending, based on machine learning in the AI technology field, highly promising convergence items for each bio-item and related companies and researchers according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the machine learning device of the recommendation system and its sub-components according to an embodiment of the present invention.
FIG. 3 is a block diagram showing the recommendation device of the recommendation system and its sub-components according to an embodiment of the present invention.
FIG. 4 is a block diagram showing the data storage device of the recommendation system and its sub-components according to an embodiment of the present invention.
FIG. 5 is a flowchart of the generation of the first to third states, which are a premise of the recommendation method according to an embodiment of the present invention.
FIG. 6 is a flowchart of the recommendation method according to an embodiment of the present invention, covering receipt of a reference item, generation of recommended items, and generation of metadata for selecting among the recommended items.
FIG. 7 is a conceptual diagram of a matrix process for handling two or more split sets of reference items in order to apply the predictive model in the recommendation method according to an embodiment of the present invention.
FIG. 8 is a conceptual diagram showing how the predictive model is established in the recommendation method according to an embodiment of the present invention: the data is divided by period, machine learning is performed on a specific past period, modeling is applied repeatedly using part of the data from recent years, and the remaining data from recent years is allocated to validation.
FIG. 9 is a conceptual diagram of a process, in the recommendation method and system according to an embodiment of the present invention, that establishes the predictive model from a matrix of factor series for each material and the factor series of candidate items, and trains the model to raise the probability obtained in the validation process.
FIG. 10 is an example display screen on which promising items are presented by an implementation of the recommendation method and system according to an embodiment of the present invention.
FIG. 11 is an example display screen that extends FIG. 10 to show a promise grade for each promising item.
FIG. 12 is a diagram of the entire pool of convergence item candidates before they are output as promising items in FIG. 10.
FIG. 13 is a conceptual diagram showing filtering by promise prediction score to select promising items as in FIG. 10.
FIG. 14 is a conceptual diagram showing, in the recommendation method according to an embodiment of the present invention, machine learning performed with reference explanatory variable values from before a recent specific period and promise variable values from that recent period, and tuning of weight coefficients to improve the accuracy of the final predicted value.
FIG. 15 is an example display screen showing the recommendation of companies and researchers for promising items selected by the recommendation method and system according to an embodiment of the present invention.
FIG. 16 is another example display screen showing the recommendation of companies and researchers for promising items selected by the recommendation method and system according to an embodiment of the present invention.
FIG. 17 is a conceptual diagram showing the recommendation of overseas companies and researchers for a company's material, reflecting multidimensional factors, by the recommendation method and system according to an embodiment of the present invention.
FIG. 18 is a conceptual diagram showing the recommendation of overseas companies and researchers for a researcher's material, reflecting multidimensional factors, by the recommendation method and system according to an embodiment of the present invention.

The method and system according to the present invention for recommending highly promising convergence items for each bio-item and related companies and researchers based on machine learning in the AI technology field may be modified in various ways and may have various embodiments; specific embodiments are therefore illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to a specific embodiment, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its technical spirit and scope.

The method and system according to the present invention for recommending highly promising convergence items for each bio-item and related companies and researchers based on machine learning in the AI technology field may be implemented by the conceptual components shown in FIG. 1.

As shown in FIG. 1, the present invention includes a machine learning device 100, a recommendation device 200, a data storage device 300, and a management device 400. Another device 10 networked with these over a wired or wireless connection, for example a user computer 11, connects to them to enter the requested input and view the required output values.

The recommendation method and system according to the present invention involve the following process. A terminal, server, or database within network reach is accessed, and reference items are extracted from individual documents, which are materials for which access rights have been granted, materials that can be viewed, or publicly disclosed materials. Association data between these reference items is generated. Recommended items are then generated by applying, to each of the reference item pairs, a predictive model for two or more items. Finally, metadata is generated in order to select the optimal convergence items from among the generated recommended items.

Within the above process, the machine learning device 100 performs machine learning on training data, and it includes the processing required for modeling.

As shown in FIG. 2, the machine learning device 100 can be organized into conceptual functional units by sub-function. Its first-level sub-units may include a training data generation unit 110 and a machine learning unit 120.

As shown in FIG. 2, the training data generation unit 110 generates the data on which the machine learning unit 120 performs machine learning. As described above, the machine learning unit 120 may include a convergence prediction model learning unit 121 for recommending promising future convergence candidate items for a material held by a client company, and an item evaluation-prediction model learning unit 122 for the recommended convergence candidate items.

To support these functions of the machine learning unit 120, the training data generation unit 110 may include a training explanatory variable data generation unit 111 and a training target variable data generation unit 112.

As shown in FIG. 3, the recommendation device 200 carries out the mechanism for selecting recommended items. For this function it is conceptually divided into a model-application data generation unit 210, a recommendation data generation unit 220, a metadata generation unit 230, and a recommendation data provision unit 240.

Because the recommendation device 200 generates predicted values using the trained model, the model-application data for the X values, that is, the explanatory variable values, must already have been generated. This is done by the model-application data generation unit 210.

As its sub-units, the model-application data generation unit 210 includes a model-application explanatory variable data generation unit 211 and a model-application target variable data generation unit 212. These are described later.

The recommendation data generation unit 220 generates and prepares recommendation data for each reference item in advance. The recommendation data is pair data that associates a reference item with the items recommended for it. Rather than being generated in real time, the recommendation data is periodically and continuously updated ahead of time, so that when a particular reference item is received, the stored recommendation data is read directly and used to generate the processed data.
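The precompute-then-look-up pattern described above can be sketched as a small in-memory store. The class name `RecommendationStore`, its methods, and the example scores are hypothetical and serve only to illustrate the idea of refreshing recommendations ahead of time and reading them back on request.

```python
class RecommendationStore:
    """Holds precomputed recommendations, keyed by reference item."""

    def __init__(self):
        self._store = {}

    def refresh(self, reference_item, scored_items):
        """Replace the stored recommendations for one reference item.

        scored_items: iterable of (recommended_item, score) tuples,
        stored in descending order of score.
        """
        self._store[reference_item] = sorted(
            scored_items, key=lambda pair: pair[1], reverse=True
        )

    def lookup(self, reference_item, top_k=3):
        """Read stored recommendations without recomputing anything."""
        return self._store.get(reference_item, [])[:top_k]

store = RecommendationStore()
# A periodic batch job would call refresh(); the scores here are made up.
store.refresh("beta-galactosidase", [
    ("biosensor", 0.7),
    ("food additive", 0.9),
    ("packaging film", 0.2),
])
top_two = store.lookup("beta-galactosidase", top_k=2)
```

A lookup for a reference item that has not yet been refreshed returns an empty list in this sketch; how the actual system handles unseen items is not stated in the source.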

The reference items handled by the recommendation data generation unit 220 number about one million or more and are continuously updated.

The metadata generation unit 230 generates metadata. The metadata may be at least one factor series for a recommended item, or at least one factor series value for each of the constituent factor series.

The metadata is the data used to select among recommended items, and it is prepared as 12 types in total across 6 series. The metadata is described in more detail later.

As shown in FIG. 5, the recommendation method according to the present invention conceptually presupposes a first state, a second state, and a third state, and the process of FIG. 6 proceeds on that basis.

As shown in FIG. 5, the first state means that association data between reference items has been prepared using the reference items extracted from the individual documents of a technical collective-intelligence document set.

That is, if the reference items are denoted material 1 (for example, beta-galactosidase) through material n, the first state is the state in which association data has been prepared that links each of material 1 through material n to the candidate materials, that is, items, with which it can be converged.

As shown in FIG. 5, the first state results from extracting keywords, that is, reference items, individually from each document of a very large document set. The association data carries information on the relatedness between materials 1 to n and the items.

As shown in FIG. 5, the second state means that, for the set of reference item pairs contained in the association data between reference items, at least one of the following has been prepared for each reference item pair: target variable values for each of two or more target variables, and explanatory variable values for each of at least one explanatory variable.

As shown in FIG. 5, the third state means that a predictive model has been established by applying a machine learning algorithm to the target variable values for each target variable and the explanatory variable values for each explanatory variable described above.
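To make the relationship between explanatory variables, target variables, and the predictive model concrete, the sketch below fits a small logistic regression by gradient descent on toy reference item pairs. The choice of logistic regression, the two feature meanings, and all numeric values are assumptions for illustration only; the source does not name the machine learning algorithm or the variables it uses.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by full-batch gradient descent on log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, x):
    """Return the predicted probability that a pair is promising."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical explanatory variables per reference item pair, measured
# on an earlier period: [association strength, shared-classification ratio].
X = [[0.9, 0.8], [0.8, 0.6], [0.2, 0.1], [0.1, 0.3]]
# Hypothetical target variable: 1 if the pair proved promising later.
y = [1, 1, 0, 0]

w, b = train_logistic(X, y)
p_high = predict(w, b, [0.85, 0.7])
p_low = predict(w, b, [0.15, 0.2])
```

Measuring the explanatory variables on an earlier period and the target on a later one mirrors the period-based split described for FIG. 8 and FIG. 14, although the exact split used there is not specified here.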

At least one explanatory variable value is generated for each type of explanatory variable, and a predictive model may be generated for each type of explanatory variable.

Once the first to third states are in place, a material, that is, a reference item, is received from an individual document of the client company.

As shown in FIG. 6, once the reference item is received, the predictive model is applied to it to generate recommended item data for at least two predicted items.

Then, as shown in FIG. 6, at least one piece of metadata is generated and processed so that recommended items can be selected from the very large volume of recommended item data.
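The selection step can be pictured as filtering scored candidates by thresholds on their metadata. In the sketch below, the function `select_recommended`, the metadata names `patent_count` and `growth_rate`, and all values are hypothetical; the source states only that metadata is used for selection, not which fields or thresholds apply.

```python
def select_recommended(candidates, min_score, min_metadata):
    """Keep candidates whose score and metadata meet every threshold.

    candidates: list of dicts with 'item', 'score', and 'metadata' keys.
    min_metadata: {metadata_name: minimum value}; names are hypothetical.
    """
    selected = []
    for cand in candidates:
        if cand["score"] < min_score:
            continue
        meta = cand["metadata"]
        if all(meta.get(name, 0) >= floor
               for name, floor in min_metadata.items()):
            selected.append(cand["item"])
    return selected

candidates = [
    {"item": "biosensor", "score": 0.91,
     "metadata": {"patent_count": 40, "growth_rate": 0.30}},
    {"item": "food additive", "score": 0.88,
     "metadata": {"patent_count": 5, "growth_rate": 0.45}},
    {"item": "packaging film", "score": 0.35,
     "metadata": {"patent_count": 60, "growth_rate": 0.10}},
]
shortlist = select_recommended(
    candidates, min_score=0.5, min_metadata={"patent_count": 10}
)
```

Here one candidate is dropped for a low prediction score and another for failing the metadata threshold, leaving a single shortlisted item.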

As shown in FIG. 7, establishing the predictive model mentioned above uses at least two split sets of reference items into which the set of reference item pairs is divided.

For example, in FIG. 7 the rows are biomaterials and the columns are candidate convergence items. The cells of the matrix hold the association data for the corresponding reference item pairs, and this association data serves as a signal indicating the degree of relatedness between a material and an item.

The matrix of pairs of biomaterials and candidate convergence items is obtained as the product of two matrices shown to its right in FIG. 7: Matrix 1, with materials as rows and factor series as columns, and, further to the right, Matrix 2, with factor series as rows and items as columns.
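The product of a material-by-factor matrix and a factor-by-item matrix can be worked through on a tiny example. The matrix values and the helper name `matmul` below are invented for illustration; only the row and column roles follow the description above.

```python
def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), both lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b)))
         for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Matrix 1: rows are materials, columns are factor series (toy values).
material_by_factor = [
    [2, 0],
    [1, 1],
]
# Matrix 2: rows are factor series, columns are candidate items (toy values).
factor_by_item = [
    [1, 3, 0],
    [2, 0, 4],
]
# Result: rows are materials, columns are items; each cell sums, over the
# factor series, the material's factor value times the item's factor value.
material_by_item = matmul(material_by_factor, factor_by_item)
```

Each cell of the result aggregates how strongly a material and an item share the same factor series, which is one way to read the pair-level signal described for FIG. 7.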

The "learning" described above uses these split sets of reference items. As described above, the split may be a partition of a matrix of two or more dimensions obtained by applying at least one preset factor series singly, sequentially, in combination, or in compound form.

For a reference item, a factor series may correspond to at least one of: i) at least one category classification attribute series, ii) at least one person classification attribute series, iii) at least one measured-value classification attribute series, and iv) at least one evaluation-value classification attribute series.

이러한 범주 분류 속성 계열은 적어도 2 이상의 지칭 아이템에 공통적으로 적용될 수 있는 텍사노미(taxonomy)인 것이 바람직하여, 문서셋으로부터 검색되는 용어를 통해 검색된 용어에 대한 범주 분류에 대한 적어도 1 이상의 개념적 뎁스(depth), 개념적 레이어(layer) 혹은 개념적 트리(tree) 구조를 가지는 하이어아키(hierarchy)인 계층 구조를 형성할 수 있음은 물론이다.The category classification attribute series is preferably a taxonomy that can be commonly applied to at least two or more referenced items, and at least one or more conceptual depth ( Of course, it is possible to form a hierarchical structure having a hierarchical structure having a depth, a conceptual layer, or a conceptual tree structure.

나아가 위 계층 구조는 제품-부품과 소재-물질과 질병 중 어느 하나 이상을 포함하는 것이 바람직하다. Furthermore, the above hierarchical structure preferably includes any one or more of product-part, material-substance, and disease.

First, the product-part category preferably includes any one or more of the entries in Table 1 below.

[Table 1: Category classification of the category classification attribute series]
Product-part: IT and computers, medical, pharmaceuticals and cosmetics, automotive and transportation, electronic components, semiconductors, building and civil engineering, material handling, control and storage, printing, photographic and audio-visual equipment, optics, power generation, power distribution and power transmission, electrical systems and lighting, heating, cooling, ventilation, filters, pipes and tubes, manufacturing and processing, machine elements or units, safety, weapons, services and finance, experiments and measurement, agricultural products, food and tobacco, personal, household and office supplies, sports, arts, games, toys and educational materials, goods and parts in general

In addition, the material-substance category preferably includes one or more of the entries in Table 2 below.

[Table 2: Category classification of the category classification attribute series]
Material-substance: FDA-approved substances, pharmaceuticals, genes, biochemical substances, enzymes, chemical substances, polymer materials, medical materials, natural materials, natural substances, building materials, materials for manufactured parts, fuels, fuel additives and lubricants, food and beverages, laboratory reagents and materials, materials for paper or sanitary equipment

In addition, the disease category preferably includes at least one entry from the Korean Standard Classification of Diseases and Causes of Death (KCD-8), the International Classification of Diseases and Causes of Death, and ICD-10 (International Classification of Diseases-10).

For the person classification attribute series, the person classification may be any one or more of the classifications of legal entities and natural persons under domestic law.

First, a legal entity may be i) in a patent document, the applicant, current right holder, plaintiff, defendant, buyer, seller, licensee, or licensor, or another non-natural-person entity named in the patent document (a university industry-academic cooperation foundation, research institute, association, or state), and ii) in a research paper, the organization to which the researcher belongs.

A natural person may be the inventor(s) in the case of a patent document and the author(s) in the case of a paper.

The person classification may be derived from person information generated in relation to the individual documents that contain the reference item. As a person classification, a legal entity may be described by any one or more of: the entity name; nationality; an organization attribute (which may be classified as one of company, university, research institute, association, and the like); a group attribute of the group to which the entity belongs (which may be classified as listed company, externally audited company, other company, and the like); and a classification attribute under which the entity is classified (which may be an industry classification attribute under SIC/KSIC, a stock classification attribute by securities market, a classification attribute from a technology-product-service perspective, and the like).

The metric value classification corresponds to metric values that can be measured from the individual documents that contain the reference item.

A metric value is any one or more of: the number of documents, per time unit or independent of time; the number of related documents (including one or more of references, citing documents, and similar documents); the number of events related to the documents, where the events include litigation, trials, standards, FDA approvals, transactions, and license agreements; a processed value for one or more of these counts (including a ratio or calculated value such as a rate of increase or decrease); and a processed value related to the person classification (a share or concentration rate). The metric value classification for each reference item may be generated using the metric values measurable from the individual documents.
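As a hedged illustration of the metric values listed above, the following sketch computes a per-year document count, a rate of increase or decrease, and an owner's share for one reference item. The record layout (`year`, `owner`, `items`), the function names, and the sample data are assumptions made for illustration only.

```python
from collections import Counter

def yearly_counts(docs, item):
    """Count documents mentioning `item`, grouped by year."""
    return Counter(d["year"] for d in docs if item in d["items"])

def growth_rate(counts, year):
    """Relative change in count versus the previous year, or None if undefined."""
    prev = counts.get(year - 1, 0)
    if prev == 0:
        return None
    return (counts.get(year, 0) - prev) / prev

def share(docs, item, owner):
    """Fraction of the item's documents that belong to `owner`."""
    subset = [d for d in docs if item in d["items"]]
    if not subset:
        return 0.0
    return sum(d["owner"] == owner for d in subset) / len(subset)

# Hypothetical documents; each lists the reference items extracted from it.
docs = [
    {"year": 2019, "owner": "Org1", "items": {"enzyme B"}},
    {"year": 2019, "owner": "Org2", "items": {"enzyme B", "protein C"}},
    {"year": 2020, "owner": "Org1", "items": {"enzyme B"}},
    {"year": 2020, "owner": "Org1", "items": {"enzyme B"}},
    {"year": 2020, "owner": "Org2", "items": {"enzyme B"}},
    {"year": 2020, "owner": "Org2", "items": {"protein C"}},
]

counts = yearly_counts(docs, "enzyme B")
rate_2020 = growth_rate(counts, 2020)
org1_share = share(docs, "enzyme B", "Org1")
print(dict(counts), rate_2020, org1_share)
```

Event counts (litigation, FDA approvals, and so on) could be obtained the same way by adding an event field to each record and filtering on it.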

The evaluation value classification is either generated for each reference item using evaluation values or predicted values obtained by feeding two or more measured values for an individual document into a model or formula, or is an evaluation value or predicted value obtained by feeding measured values, taken collectively over the document set corresponding to each reference item, into a model or formula.

As shown in FIG. 8, the prediction model serves to screen convergence item candidates or recommended promising convergence items.

In the prediction model, as shown in FIG. 8, a specific past period is set. The data from that period is the X-value data, which is used to raise the default value of the machine learning. The Y-value data is used for modeling, that is, to further improve the accuracy of the results obtained by raising the default value with the past data. Of the data after the specific past period, 30% is used to validate the modeling.

That is, as shown in FIG. 8, the machine learning is trained using the past 10 years of data as the X-value data. Of the most recent 3 years of data, 70% is used for modeling on the machine learning results, and the remaining 30% is used to validate that modeling.
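The period split described above can be sketched as follows. This is only an illustration under stated assumptions: the X-value period is taken to end where the Y-value period begins (the text does not say whether the two overlap), the 70/30 division is done by a seeded random shuffle, and the function name, record layout, and sample years are hypothetical.

```python
import random

def split_by_period(records, x_years, y_years, latest_year, model_ratio=0.7, seed=0):
    """Return (X-period data, Y modeling data, Y validation data)."""
    y_start = latest_year - y_years + 1   # first year of the recent Y period
    x_start = y_start - x_years           # first year of the earlier X period
    x_data = [r for r in records if x_start <= r["year"] < y_start]
    y_data = [r for r in records if y_start <= r["year"] <= latest_year]
    shuffled = y_data[:]
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * model_ratio)
    return x_data, shuffled[:cut], shuffled[cut:]

# Hypothetical corpus: 10 documents per year from 2008 through 2020.
records = [{"year": y, "doc_id": f"{y}-{i}"}
           for y in range(2008, 2021) for i in range(10)]

x_data, y_model, y_valid = split_by_period(
    records, x_years=10, y_years=3, latest_year=2020)
print(len(x_data), len(y_model), len(y_valid))
```

Because `x_years`, `y_years`, and `model_ratio` are parameters, the period length and the modeling-to-validation ratio can be tuned per technical field without changing the code.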

How many past years are used for machine learning may of course be set freely in view of the technical field of the demand company's convergence target material, the specific nature of the technology, its life cycle, the rate of change of technology trends, and so on. The period of the X-value data can therefore be chosen selectively and tuned as needed to improve the accuracy of the modeling on the Y-value data. The ratio of modeling data to validation data within the Y-value data may likewise be chosen selectively to improve the precision of the modeling and the reliability of the validation.

A prediction model is established for each prediction period. In the prediction model, the objective variable values for each objective variable and the explanatory variable values for each explanatory variable may be generated separately for each of at least two sets partitioned by prediction period.

As shown in FIG. 9, the predicted value generated by the prediction model may be a predicted probability value or a predicted frequency value for each prediction item that forms a reference item pair with a given reference item.

For example, as shown in FIG. 9, there is a base matrix of the demand company's materials (rows) against factor series, and a candidate item matrix of items (rows) against factor series is prepared in advance in large volume. The candidate items may be organized per factor series, as further lower-level candidate item matrices that follow the hierarchical structure of a factor series, or as numerous further hierarchical lower-level candidate item matrices formed from various combinations of these. The inverse form of such a candidate item matrix can be multiplied with the base matrix (the base matrix on the left and the inverse form of the candidate item matrix on the right), and the resulting matrix product represents the association between the materials and the candidate items.
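A minimal sketch of this scoring step follows. It assumes that the "inverse form" of the candidate item matrix means its transpose, since that is what makes a materials-by-factors matrix and an items-by-factors matrix multipliable; this reading is an assumption, not something the specification states. All names and values are hypothetical.

```python
def score_candidates(base, candidates):
    """Compute base x transpose(candidates).

    base:       materials (rows) x factor series (columns)
    candidates: items (rows) x factor series (columns)
    Returns a materials x items score matrix.
    """
    return [[sum(b * c for b, c in zip(base_row, cand_row))
             for cand_row in candidates]
            for base_row in base]

def top_k(score_row, item_names, k):
    """Return the k highest-scoring (item, score) pairs for one material."""
    ranked = sorted(zip(item_names, score_row), key=lambda p: p[1], reverse=True)
    return ranked[:k]

# Hypothetical base matrix for one material over three factor series.
base = [[1, 0, 2]]
# Hypothetical candidate item matrix over the same three factor series.
item_names = ["item X", "item Y", "item Z"]
candidates = [
    [2, 1, 0],
    [0, 1, 4],
    [1, 1, 1],
]

scores = score_candidates(base, candidates)
best = top_k(scores[0], item_names, k=2)
print(scores, best)
```

The raw scores here are unnormalized; turning them into the predicted probability or frequency values mentioned above would require a further model, which this sketch does not attempt.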

The metadata is at least one factor series for a recommended item, or at least one factor series value for each factor series making up that factor series. The factor series may be any one or more of the following for the recommended item: i) at least one category classification attribute series, ii) at least one person classification attribute series, iii) at least one metric value classification attribute series, and iv) at least one evaluation value classification attribute series.

For example, as in Table 3 below, the recommended items for material A may be provided only from the enzyme series. In this case the enzyme series is the factor series, and the output is screened, that is, filtered, so that it is generated only from the enzyme series. This prevents the convergence items from becoming excessively numerous or from diverging into excessively heterogeneous areas. The recommendation information is thereby narrowed down.

[Table 3: Convergence recommendations for input material A]
Input: Material A
Output: Disease A, Enzyme B, Protein C, Natural product X, Bacteria K, ????
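The narrowing-down by factor series can be sketched as a simple filter over the recommended items of Table 3. The scores, the category lookup table, and the function name are hypothetical; only the item names come from the table.

```python
def filter_by_factor_series(recommendations, category_of, allowed_series):
    """Keep only recommended items whose category is in `allowed_series`.

    The input order (for example, a ranking) is preserved.
    """
    return [(item, score) for item, score in recommendations
            if category_of.get(item) in allowed_series]

# Hypothetical ranked output for input "Material A"; scores are illustrative.
recommendations = [
    ("Disease A", 0.91),
    ("Enzyme B", 0.84),
    ("Protein C", 0.77),
    ("Natural product X", 0.52),
    ("Bacteria K", 0.40),
]

# Hypothetical category classification (factor series) of each item.
category_of = {
    "Disease A": "disease",
    "Enzyme B": "enzyme",
    "Protein C": "protein",
    "Natural product X": "natural product",
    "Bacteria K": "bacteria",
}

enzyme_only = filter_by_factor_series(recommendations, category_of, {"enzyme"})
print(enzyme_only)
```

Passing a larger set such as `{"enzyme", "protein"}` widens the output again, so the same filter covers single and combined factor series.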

Furthermore, the metadata corresponds to evaluation information on the recommended item.

The evaluation information is generated by establishing an evaluation model or prediction model for the group attributes of the document set that makes up a recommended item, measuring the group attributes of the document set corresponding to the recommended item as of a preset point in time, and feeding them into the evaluation model or prediction model.

The evaluation model is preferably established through steps including a) generating, for each reference item pair, objective variable values for each of at least two objective variables and explanatory variable values for each explanatory variable of at least one explanatory variable type, and b) applying a machine learning algorithm to the explanatory variable values for each explanatory variable and the objective variable values for each objective variable.
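Steps a) and b) can be illustrated with a deliberately simple stand-in. The specification does not name a particular machine learning algorithm, so the sketch below substitutes one-variable ordinary least squares, fitted once per objective variable. The variable names (`citations`, `future_docs`, `future_deals`) and all values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("explanatory variable has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Step a) per reference item pair: one explanatory variable value and
# two objective variable values (all numbers are made up).
pairs = {
    ("material A", "enzyme B"):  {"citations": 1.0, "future_docs": 3.0, "future_deals": 0.5},
    ("material A", "protein C"): {"citations": 2.0, "future_docs": 5.0, "future_deals": 1.0},
    ("material A", "disease A"): {"citations": 3.0, "future_docs": 7.0, "future_deals": 1.5},
    ("material B", "enzyme B"):  {"citations": 4.0, "future_docs": 9.0, "future_deals": 2.0},
}

# Step b) fit one model per objective variable.
xs = [v["citations"] for v in pairs.values()]
models = {
    target: fit_line(xs, [v[target] for v in pairs.values()])
    for target in ("future_docs", "future_deals")
}

def predict(target, x):
    """Apply the fitted model for `target` to a new explanatory value."""
    slope, intercept = models[target]
    return slope * x + intercept

print(models, predict("future_docs", 5.0))
```

Any supervised learner could take the place of `fit_line`; the point of the sketch is only the data flow from per-pair variable values to one fitted model per objective variable.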

For example, in Table 4 below, the explanatory variables are publication volume, number of citations received, and the like, and the terms listed as keywords, such as lidar and neural network, are the items.

[Table 4: Trends by country]

[Table 5: Level of interest (citations received)]

[Table 6: Transactions and M&A]

[Table 7: Portfolio (number of overseas family members)]

[Table 8: National R&D]

[Table 9: Risk]

In addition, the metadata may be any one or more items of information about legal entities and natural persons related to the recommended item. In this case the metadata may be generated by either or both of: a first generation method, in which it is provided per group attribute of the document set corresponding to the recommended item; and a second generation method, in which it is generated through network analysis using the set of reference items contained in the documents corresponding to the recommended item, the classification of the reference items, the in-document characteristics of the reference items per document, or document features.

In the first generation method, the group attributes of the document set are preferably any one or more of: i) the total amount, rate of increase or decrease, and density per time interval of at least one metric value for the documents; ii) the total amount, rate of increase or decrease, and density per time interval of documents related to special events including any one or more of litigation, trials, transactions, standards, and FDA approvals; and iii) the concentration rate or share, for at least one metric value, of a specific item relative to the whole or to a higher-level category. Natural persons or legal entities may be provided by country, by nationality, or by actual or presumed classification under at least one classification scheme.

In the second generation method, for the network analysis, the reference item set may be any one or more of keywords and key phrases, and the document features may be any one or more of patent classification information such as CPC/IPC, number information such as reference document numbers or citing document numbers, and person information including right holders and inventors. The in-document information of a reference item per document is, for example, information on at least one measure of how representative the reference item is of the document, such as TF-IDF (term frequency-inverse document frequency). For the reference item set or the document features, any one or more of total amounts, time-series variation values, and distributional ratios or positions such as share and concentration rate may be used. The network analysis may be performed from any one or more of the perspectives of association, overlap, commonality, complementarity, and mediation.
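The TF-IDF measure mentioned above as an example of document representativeness can be sketched as follows. The sketch uses the common variant tf * log(N / df), with term frequency normalized by document length; the specification does not fix a particular variant, and the tokenized sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical documents, already reduced to their extracted keywords.
docs = [
    ["lidar", "sensor", "lidar", "vehicle"],
    ["neural", "network", "sensor"],
    ["neural", "network", "vehicle", "sensor"],
]

weights = tf_idf(docs)
most_representative = max(weights[0], key=weights[0].get)
print(most_representative, weights[0])
```

A term that appears in every document, such as "sensor" here, receives a weight of zero, while a term concentrated in one document receives the highest weight for that document.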

In addition, the association data between reference items may be generated as mapping matrix information that maps each document to the reference items extracted from that document.
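One way to picture this mapping matrix is a binary document-by-item matrix from which pairwise co-occurrence counts are derived. The sketch below assumes that co-occurrence within a document is what links two reference items; the specification does not spell out the exact derivation, and the document identifiers and items are hypothetical.

```python
from collections import Counter
from itertools import combinations

def mapping_matrix(doc_items, vocabulary):
    """Binary document-by-item matrix: 1 if the document mentions the item."""
    return [[1 if item in items else 0 for item in vocabulary]
            for items in doc_items.values()]

def pair_counts(doc_items):
    """For each unordered item pair, count the documents mentioning both."""
    counts = Counter()
    for items in doc_items.values():
        counts.update(combinations(sorted(items), 2))
    return counts

# Hypothetical documents and the reference items extracted from each.
doc_items = {
    "doc1": {"material A", "enzyme B"},
    "doc2": {"material A", "enzyme B", "protein C"},
    "doc3": {"material A", "protein C"},
}
vocabulary = ["material A", "enzyme B", "protein C"]

matrix = mapping_matrix(doc_items, vocabulary)
cooccurrence = pair_counts(doc_items)
print(matrix)
print(dict(cooccurrence))
```

The keys of `cooccurrence` are the reference item pairs, and the counts are one possible raw form of the association data between reference items.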

The scope of the present invention is determined by the matters set forth in the claims. Parentheses used in the claims do not mark optional limitations; they are used to state the elements clearly, and matter inside parentheses must also be construed as an essential element.

10: Other device
11: User computer
100: Machine learning device
110: Training data generation unit
111: Training explanatory variable data generation unit
112: Training objective variable data generation unit
120: Machine learning unit
121: Convergence prediction model learning unit
122: Item evaluation-prediction model learning unit
200: Recommendation device
210: Model application data generation unit
211: Model application explanatory variable data generation unit
212: Model application objective variable data generation unit
220: Recommendation data generation unit
230: Metadata generation unit
240: Recommendation data providing unit
241: Reference item receiving unit
242: Recommended item providing unit
243: Metadata providing unit
300: Data storage device
310: Training data storage unit
320: Recommendation data storage unit
330: Metadata storage unit
400: Management device

Claims

A method for recommending, based on machine learning in the field of AI technology, convergence items with high future promise for each bio item and related companies and researchers, the method comprising:
A) receiving at least one reference item, in a state in which the following have been performed:
a first state in which association data between reference items has been generated using the individual documents constituting a technology collective intelligence document set and the reference items extracted from the individual documents;
a second state in which, for reference item pairs selected from the set of reference item pairs included in the association data between reference items, at least one of objective variable values for each of at least two objective variables and explanatory variable values for each of at least one explanatory variable has been generated for each reference item pair; and
a third state in which a prediction model has been established by applying a machine learning algorithm to the explanatory variable values for each explanatory variable and the objective variable values for each objective variable for each reference item;
B) generating recommended item data for at least two prediction items by applying the received reference item to the prediction model; and
C) processing the generation of at least one item of metadata that is applied to the recommended item data and can be used to select recommended items.

The method of claim 1, wherein
establishing the prediction model uses a division of the set of reference item pairs into at least two reference item partition sets,
the learning is performed using the reference item partition sets, and
the division is a matrix partition of two or more dimensions in which at least one preset factor series is applied singly, sequentially, in combination, or in compound form.

The method of claim 2, wherein
the factor series is any one or more of the following for the reference item:
i) at least one category classification attribute series, ii) at least one person classification attribute series, iii) at least one metric value classification attribute series, and iv) at least one evaluation value classification attribute series.

The method of claim 3, wherein
the category classification attribute series is a taxonomy that can be assigned in common to at least two reference items,
the category classification can have a hierarchy of at least one depth,
the hierarchy includes any one or more of product-part, material-substance, and disease,
the material-substance includes any one or more of the entries in (1) material-substance below,
the product-part includes any one or more of the entries in (2) product-part below, and
the disease includes any one or more of the Korean Standard Classification of Diseases and Causes of Death (KCD-8), the International Classification of Diseases and Causes of Death, and ICD-10 (International Classification of Diseases-10).
((1) Material-substance: FDA-approved substances, pharmaceuticals, genes, biochemical substances, enzymes, chemical substances, polymer materials, medical materials, natural materials, natural substances, building materials, materials for manufactured parts, fuels, fuel additives and lubricants, food and beverages, laboratory reagents and materials, materials for paper or sanitary equipment
(2) Product-part: IT and computers, medical, pharmaceuticals and cosmetics, automotive and transportation, electronic components, semiconductors, building and civil engineering, material handling, control and storage, printing, photographic and audio-visual equipment, optics, power generation, power distribution and power transmission, electrical systems and lighting, heating, cooling, ventilation, filters, pipes and tubes, manufacturing and processing, machine elements or units, safety, weapons, services and finance, experiments and measurement, agricultural products, food and tobacco, personal, household and office supplies, sports, arts, games, toys and educational materials, goods and parts in general)

The method of claim 3, wherein
the person classification is any one or more of the classifications of legal entities and natural persons,
the legal entity is, in the case of a patent document, the applicant, current right holder, plaintiff, defendant, buyer, seller, licensee, or licensor, or another non-natural-person entity named in the patent document (a university industry-academic cooperation foundation, research institute, association, or state), and, in the case of a paper, the organization to which the researcher belongs,
the natural person is the inventor in the case of a patent document and the author in the case of a paper,
the person classification is derived from person information generated in relation to the individual documents that contain the reference item, and
as the person classification, the legal entity is described by any one or more of the entity name, nationality, an organization attribute (the organization attribute is classified as one of company, university, research institute, and association), a group attribute of the group to which the entity belongs (the group attribute is classified as listed company, externally audited company, or other company), and a classification attribute under which the entity is classified (the classification attribute is classified as an industry classification attribute under SIC/KSIC, a stock classification attribute by securities market, or a classification attribute from a technology-product-service perspective).

The method of claim 3, wherein
the metric value classification is a metric value that can be measured from the individual documents that contain the reference item,
the metric value is any one or more of: the number of documents, per time unit or independent of time; the number of related documents (including one or more of references, citing documents, and similar documents); the number of events related to the documents, where the events include litigation, trials, standards, FDA approvals, transactions, and license agreements; a processed value for one or more of these counts (including a ratio or calculated value such as a rate of increase or decrease); and a processed value related to the person classification (a share or concentration rate), the metric value classification for each reference item being generated using the metric values measurable from the individual documents, and
the evaluation value classification is either generated for each reference item using evaluation values or predicted values obtained by feeding two or more measured values for the individual document into a model or formula, or is an evaluation value or predicted value obtained by feeding measured values, taken collectively over the document set corresponding to each reference item, into a model or formula.

The method of claim 1, wherein
the prediction model is established for each prediction period, and
in the prediction model, the objective variable values for each objective variable and the explanatory variable values for each explanatory variable are generated separately for each of at least two sets partitioned by the prediction period.

The method of claim 1, wherein
the predicted value generated by the prediction model is a predicted probability value or a predicted frequency value for each prediction item that forms a reference item pair with a given reference item.

The method of claim 1, wherein
the metadata is at least one factor series for the recommended item, or at least one factor series value for each factor series making up that factor series, and
the factor series is any one or more of the following for the recommended item:
i) at least one category classification attribute series, ii) at least one person classification attribute series, iii) at least one metric value classification attribute series, and iv) at least one evaluation value classification attribute series.

The method of claim 1,
wherein at least one kind of explanatory variable value for each explanatory variable is generated for each explanatory variable type, and the prediction model is generated for each explanatory variable type.

The method of claim 1,
wherein the metadata is evaluation information on the recommended item,
the evaluation information is generated by establishing an evaluation model or a prediction model for group attributes of the document set constituting the recommended item, measuring the group attributes of the document set corresponding to the recommended item as of a preset point in time, and inputting the measured group attributes into the evaluation model or the prediction model, and
the evaluation model is established by a process comprising:
a) generating, for each reference item pair, at least two objective variable values for each objective variable and explanatory variable values for each explanatory variable of at least one explanatory variable type; and
b) applying a machine learning algorithm to the explanatory variable values for each explanatory variable and the objective variable values for each objective variable.

12. The method of claim 11,
wherein establishing the evaluation model further comprises:
c) generating validation data for the machine learning algorithm.

The method of claim 1,
wherein the metadata is at least one of information on legal entities and information on natural persons related to the recommended item, and
the metadata is generated by at least one of:
a first generation method in which the metadata is provided for each group attribute of the document set corresponding to the recommended item; or
a second generation method in which the metadata is generated through network analysis using the set of reference items contained in the documents corresponding to the recommended item, the classification of the reference items, the in-document characteristics of the reference items for each document, or document features.

14. The method of claim 13,
wherein, in the first generation method, the group attribute of the document set is at least one of: i) the total amount, rate of increase or decrease, and density per time interval for each of at least one quantitative value of the documents; ii) the total amount, rate of increase or decrease, and density per time interval of documents related to a special event including at least one of litigation, trial, transaction, standard, and FDA approval; and iii) the concentration rate or share, for each of at least one quantitative value, of the specific item relative to the whole or to a higher-level category, and
the natural persons or legal entities can be provided by country, by nationality, or by actual or estimated classification under at least one classification system.
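A minimal sketch of how such group attributes might be measured is given below, under stated assumptions: the quantitative value is simply the document count, the rate of increase or decrease compares the counts before and after a split year, and the share is taken relative to a higher-level category. The function name `group_attributes` and the sample years are hypothetical.

```python
from collections import Counter


def group_attributes(item_years, category_years, split_year):
    """Measure simple group attributes of the document set for one item.

    item_years:     publication years of documents mapped to the item
    category_years: publication years of all documents in the parent category
    split_year:     boundary between the "before" and "after" windows
    """
    total = len(item_years)
    before = sum(1 for year in item_years if year < split_year)
    after = total - before
    return {
        "total": total,
        # Density per one-year interval.
        "per_year": dict(Counter(item_years)),
        # Rate of increase or decrease between the two windows (None if undefined).
        "growth_rate": (after - before) / before if before else None,
        # Share of the item within the higher-level category.
        "share": total / len(category_years) if category_years else None,
    }


attrs = group_attributes(
    item_years=[2018, 2019, 2020, 2020, 2021, 2021],
    category_years=[2018] * 6 + [2019] * 6 + [2020] * 6 + [2021] * 6,
    split_year=2020,
)
```

The same measurements could be restricted to documents tied to a special event, such as litigation or FDA approval, by filtering the input lists first.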

15. The method of claim 13,
wherein, in the second generation method, for the network analysis,
the set of reference items is at least one of keywords and key phrases, or
the document features are at least one of patent classification information such as CPC/IPC, number information such as reference document numbers or cited document numbers, and person information including right holders or inventors, or
the in-document information of the reference items for each document is information on at least one measure of how well the reference item represents the document, for example TF-IDF (term frequency-inverse document frequency), or
for the set of reference items or the document features, at least one of total amount, time-series variation, and distributional ratio or position information such as share and concentration rate is used, and
the network analysis is performed from the perspective of at least one of relevance, overlap, commonality, complementarity, and mediation.
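The TF-IDF measure named above as an example of document representativeness can be sketched as follows. This is a plain textbook variant (relative term frequency multiplied by the natural log of N over document frequency), chosen here as an assumption; the claims do not specify which TF-IDF weighting is used. The sample documents are hypothetical, and each document is assumed to be a non-empty list of reference items.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {item: score} dict per document.

    tf  = occurrences of the item in the document / document length
    idf = ln(number of documents / number of documents containing the item)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each item once per document

    scores = []
    for tokens in docs:
        term_freq = Counter(tokens)
        scores.append({
            item: (count / len(tokens)) * math.log(n_docs / doc_freq[item])
            for item, count in term_freq.items()
        })
    return scores


# Hypothetical documents, each given as its extracted reference items.
docs = [
    ["crispr", "crispr", "sensor"],
    ["sensor", "imaging"],
    ["sensor", "genome"],
]
scores = tf_idf(docs)
```

An item that appears in every document ("sensor" here) receives a score of zero, while an item concentrated in one document ("crispr") scores high for that document, which is the sense in which the measure reflects representativeness.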

The method of claim 1,
wherein the related data between the reference items
is generated, for each document, as mapping matrix information of the reference items extracted from that document.
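One way to picture the mapping matrix is as a binary document-by-item matrix, from which pairwise relations between reference items follow as co-occurrence counts. The sketch below is an assumption about the representation, not the patent's stated data layout; the helper names and sample documents are hypothetical.

```python
def mapping_matrix(docs, items):
    """Binary matrix: rows are documents, columns are reference items."""
    return [[1 if item in doc else 0 for item in items] for doc in docs]


def cooccurrence(matrix):
    """Item-by-item co-occurrence counts, i.e. the product M^T * M."""
    n_items = len(matrix[0])
    return [
        [sum(row[i] * row[j] for row in matrix) for j in range(n_items)]
        for i in range(n_items)
    ]


# Hypothetical documents, each given as the set of reference items extracted from it.
docs = [
    {"crispr", "deep learning"},
    {"crispr", "biosensor"},
    {"crispr", "deep learning", "biosensor"},
]
items = ["biosensor", "crispr", "deep learning"]

M = mapping_matrix(docs, items)
C = cooccurrence(M)
```

The diagonal of `C` gives the number of documents containing each item, and each off-diagonal entry gives the number of documents in which the two items appear together.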