KR20200047006A - Method and system for constructing meta model based on machine learning

Info

Publication number: KR20200047006A
Application number: KR1020180128941A
Authority: KR
Inventors: 최병열; 이용빈; 최동훈
Original assignee: 주식회사 피도텍
Priority date: 2018-10-26
Filing date: 2018-10-26
Publication date: 2020-05-07
Also published as: KR102170968B1

Abstract

Disclosed is a design expert system comprising: a predictive model generation unit for generating a predictive model by using experimental data collected by a design target experiment and keywords describing the experimental data; a data generation unit for generating characteristic data using an approximate model for identifying characteristics of a design object; a data analysis unit for analyzing the generated characteristic data; and a design optimization unit for performing design optimization of the design target according to an optimal design algorithm, using the various data analysis results based on the predictive model. According to the present invention, an optimized performance improvement can be obtained in optimal design and multidimensional data analysis merely by inputting data and setting a design target.

Description

METHOD AND SYSTEM FOR CONSTRUCTING META MODEL BASED ON MACHINE LEARNING

The present invention relates to a method and system for constructing an approximate model (metamodel) based on machine learning and, more specifically, to a method and system that use engineering big data, determine the most suitable metamodel type through machine-learning-based training, and raise the accuracy of the metamodel by means of hyper-parameters.

Today, big data is processed mainly with data mining or data analytics techniques: classification, clustering, association, sequencing, and forecasting.

Unlike finance or medicine, where the use of big data is already being introduced, in engineering big data should be actively used in the conceptual design of products by predicting quantitative performance indices rather than qualitative indicators.

Engineering big data contains a large share of data obtained under a formalized plan, such as a design of experiments (DOE). The engineering field therefore needs a system that can predict performance in advance, at the conceptual design stage, from big data obtained under such a plan. This requires a data-driven prediction system, for which metamodeling approaches such as regression models or interpolation models are used.

However, conventional data-driven metamodeling searches manually for the metamodel best suited to the given data. Because such a manual search is confined to the given data, no learning effect can be expected from it.

The machine-learning-based method and system for constructing an approximate model according to an embodiment of the present invention are based on big data and are characterized in that, as data accumulates, they find the most suitable metamodel through a learning effect. This feature distinguishes the invention from the prior art and is disclosed to solve the problems of the prior art.

Korean Registered Patent Publication No. 10-0576941 (2006.04.28.)

The problem to be solved by the present invention is to provide a machine-learning-based approximate model construction system that uses engineering big data.

A further problem to be solved by the present invention is to provide an approximate model construction system that determines a highly accurate approximate model by using a technique that automatically extracts the characteristic factors capable of representing the performance of the input data.

A machine-learning-based method for constructing an approximate model according to an embodiment of the present invention comprises: collecting engineering data through experiments conducted according to a design of experiments; a data preprocessing step of processing the collected engineering data to generate a database for machine learning of the approximate model (metamodel); and setting up an approximate model using the preprocessed data and generating the approximate model through repeated hyper-parameter optimization of the approximate model.

Here, the data preprocessing step comprises: extracting characteristic factors (features) for each data item from the collected engineering data; and generating a label for each data item from the collected engineering data.

Here, the engineering data is collected through IoT sensors, simulations, and experiments.

Here, the data preprocessing step further comprises selecting, using the machine-learning database, the main characteristic factors that influence the label from among the characteristic factors.

Here, generating the approximate model comprises: setting up an approximate model using the main factors; optimizing the parameters of the approximate model thus set up; and evaluating the accuracy of the approximate model based on those parameters.

Here, generating the approximate model further comprises additionally generating engineering data according to the design of experiments, taking into account, on the basis of the evaluation result, the error and the error distribution of the approximation function that constitutes the approximate model.

An approximate model construction system according to an embodiment of the present invention comprises: a data collection unit that collects engineering data through experiments conducted according to a design of experiments; a data preprocessing unit that processes the collected engineering data to generate a database for machine learning of the approximate model (metamodel); and an approximate model generation unit that sets up an approximate model using the preprocessed data and generates the approximate model through repeated hyper-parameter optimization of the approximate model.

Here, the data preprocessing unit comprises: a characteristic factor extraction module that extracts characteristic factors for each data item from the collected engineering data; and a label generation module that generates a label for each data item from the collected engineering data.

Here, the data preprocessing unit further comprises a main factor selection module that selects, using the machine-learning database, the main characteristic factors that influence the label from among the characteristic factors.

Here, the approximate model generation unit comprises: an approximate model setting module that sets up an approximate model using the main factors; an approximate model optimization module that optimizes the parameters of the approximate model thus set up; and an approximate model evaluation module that evaluates the accuracy of the approximate model based on those parameters.

Here, the data collection unit additionally generates engineering data according to the design of experiments, taking into account, on the basis of the evaluation result, the error and the error distribution of the approximation function that constitutes the approximate model.

According to the present invention, an approximate model can be constructed from engineering big data through machine-learning-based training.

In addition, an approximate model that determines a highly accurate metamodel can be constructed by using a technique that automatically extracts the characteristic factors capable of representing the performance of the input data.

FIG. 1 is a block diagram of an approximate model construction system according to an embodiment of the present invention.
FIG. 2 is an exemplary diagram of an approximate model construction system according to an embodiment of the present invention.
FIG. 3 is a block diagram of the user terminal of FIG. 1.
FIG. 4 is a flowchart of a method for constructing an approximate model according to an embodiment of the present invention.

Hereinafter, preferred embodiments of a machine-learning-based method and system for constructing an approximate model using engineering big data are described in detail with reference to the accompanying drawings.

The same reference numerals in the drawings denote the same members. The specific structural and functional descriptions of the embodiments are given only to explain embodiments according to the present invention. Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention pertains. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art and, unless explicitly defined in this specification, should not be interpreted in an idealized or excessively formal sense.

Hereinafter, a machine-learning-based approximate model construction system 200 that uses engineering big data according to an embodiment of the present invention is described.

The machine-learning-based approximate model construction system 200 according to an embodiment of the present invention is characterized in that it automatically constructs, based on machine learning, the approximate model that was conventionally constructed by hand.

FIG. 1 is a block diagram of an approximate model construction system according to an embodiment of the present invention.

Referring to FIG. 1, the approximate model construction system 200 includes, as its components, a data collection unit 210, a data preprocessing unit 220, and an approximate model generation unit 230.

The data collection unit 210 collects the training data and evaluation data needed to set up the approximate model through learning, optimize its parameters, and evaluate it.

The training data and evaluation data used here are collected through experiments. For example, engineering experiment data collected through experiments conducted according to a design of experiments or the like is processed, through preprocessing, into data that can be used for machine learning training and evaluation. In addition, the volume of collected data is on the scale of big data. Accordingly, the approximate model construction system 200 according to an embodiment of the present invention overcomes the limits on big data processing in the prior art and can thus use big data for machine learning.

Depending on the evaluation result from the approximate model evaluation module 233, the data collection unit 210 may generate additional engineering experiment data according to the design of experiments to achieve better optimization. Accordingly, the engineering experiment data for machine learning obtained by the design of experiments according to an embodiment of the present invention can be added to continually, depending on the result of parameter optimization.

The data preprocessing unit 220 extracts characteristic factors from the collected engineering experiment data, generates labels, and selects the main factors. The data and main factors processed by the data preprocessing unit 220 are used during machine learning to set up, train, or evaluate the approximate model.

The data preprocessing unit 220 includes, as its components, a characteristic factor extraction module 221, a label generation module 222, and a main factor selection module 223.

The data preprocessing unit 220 collects the various data to be used for training and evaluating the approximate model and preprocesses them. Data preprocessing is the data processing stage that turns the acquired experimental data into a complete object of analysis. Data preprocessing methods include data cleaning, data integration, data reduction, and data transformation.
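
The four preprocessing methods named above can be illustrated with a minimal sketch. The record layout (a list of dictionaries), the function names, and the choice of min-max scaling for the transformation step are assumptions made for illustration; the specification does not prescribe them.

```python
def clean(rows):
    """Data cleaning: drop records that contain a missing (None) value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def integrate(*sources):
    """Data integration: merge records from several sources into one list."""
    merged = []
    for src in sources:
        merged.extend(src)
    return merged


def reduce_columns(rows, keep):
    """Data reduction: keep only the named columns."""
    return [{k: r[k] for k in keep} for r in rows]


def transform(rows, column):
    """Data transformation: min-max scale one column to [0, 1]."""
    if not rows:
        return rows
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant column
    return [{**r, column: (r[column] - lo) / span} for r in rows]


def preprocess(sources, keep, scale):
    """Run integration, cleaning, reduction, and transformation in order."""
    rows = clean(integrate(*sources))
    rows = reduce_columns(rows, keep)
    for col in scale:
        rows = transform(rows, col)
    return rows


# Hypothetical records from a simulation and from a physical experiment.
simulation = [{"x": 0.0, "y": 1.0, "id": 1}, {"x": None, "y": 2.0, "id": 2}]
experiment = [{"x": 10.0, "y": 3.0, "id": 3}]
ready = preprocess([simulation, experiment], keep=["x", "y"], scale=["x"])
```

Here the record with the missing `x` is dropped, the `id` column is removed, and `x` is rescaled so that the remaining two records carry 0.0 and 1.0.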

As one feature of the present invention, the approximate model construction system 200 is based on machine learning technology capable of processing big data. In the prior art, the metamodeling techniques used in engineering were limited in the number of data points they could process; to overcome this, a framework technology is proposed that can generate metamodels using machine learning (ML).

The characteristic factor extraction module 221 extracts characteristic factors from the various input data. Feature extraction is a technique for extracting the characteristic factors (features) that can represent the performance of the input data. It also extracts the characteristics of the data input for training and thereby establishes the input-output relationship. To generate a highly accurate metamodel, the number of input data, the number of experimental points, the experimental method used for the data, the correlations in the data, and so on are learned and turned into knowledge, and the most suitable metamodel type is then determined from that knowledge.
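
As a rough illustration of how learned knowledge might map data characteristics to a metamodel type, the following sketch performs a nearest-neighbor lookup over past cases. The knowledge base `HISTORY`, the three descriptors (number of experimental points, number of inputs, a nonlinearity score), and the type names are all hypothetical; the specification does not state which learner or descriptors are used.

```python
import math

# Hypothetical knowledge base built from earlier projects:
# (n_points, n_inputs, nonlinearity score in [0, 1]) -> best type found.
HISTORY = [
    ((20, 2, 0.1), "polynomial_regression"),
    ((200, 5, 0.6), "kriging"),
    ((5000, 10, 0.8), "neural_network"),
]


def distance(a, b):
    """Distance between two descriptor tuples.

    The point count is compared on a log scale so that large data sets
    do not dominate the other two descriptors.
    """
    return math.sqrt(
        (math.log10(a[0]) - math.log10(b[0])) ** 2
        + (a[1] - b[1]) ** 2
        + (a[2] - b[2]) ** 2
    )


def recommend_type(descriptors, history=HISTORY):
    """Return the metamodel type of the most similar past case (1-NN)."""
    _, best_type = min(history, key=lambda rec: distance(descriptors, rec[0]))
    return best_type


small_case = recommend_type((25, 2, 0.2))
large_case = recommend_type((4000, 9, 0.7))
```

As more cases are appended to the knowledge base, the recommendation changes without any manual search, which is the learning effect the description refers to.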

The label generation module 222 generates a label for each collected data item. In machine learning, it is advantageous to use data for which the target answer is already known. Such data is called labeled data. In supervised machine learning, the algorithm is directed to learn from the labeled examples that the user provides.

The main factor selection module 223 selects main factors from the characteristic factors extracted from the collected data. A main factor is a characteristic factor that can influence the label.
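
One simple way to pick factors that influence the label is to rank them by the absolute Pearson correlation with the label. This is a sketch under that assumption; the specification does not name a selection criterion, and the threshold and the function names are illustrative only.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_a * var_b)


def select_main_factors(features, label, threshold=0.5):
    """Return factor names whose |correlation| with the label meets the
    threshold, strongest first.

    features: dict mapping factor name -> list of values
    label:    list of label values, one per data item
    """
    scores = {name: abs(pearson(vals, label)) for name, vals in features.items()}
    selected = [name for name, score in scores.items() if score >= threshold]
    return sorted(selected, key=lambda name: -scores[name])


# Hypothetical characteristic factors and label.
features = {
    "thickness": [1, 2, 3, 4],
    "noise": [1, -1, -1, 1],
    "width": [4, 3, 2, 1.5],
}
label = [2, 4, 6, 8]
main_factors = select_main_factors(features, label)
```

Correlation only captures linear influence; a factor with a strong nonlinear effect could be missed by this criterion.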

The approximate model generation unit 230 sets up an approximate model using the selected main factors, trains it, and evaluates the trained model, thereby completing the construction of the approximate model. Constructing the approximate model uses the main factors, training data, and evaluation data preprocessed by the data preprocessing unit 220.

The approximate model generation unit 230 includes an approximate model setting module 231, an optimization module 232, an approximate model evaluation module 233, and an approximate model generation module 234.

The approximate model setting module 231 defines the approximate model using the main factors produced by the data preprocessing unit 220 and, based on that definition, sets the initial conditions for generating the approximate model.

The optimization module 232 optimizes the hyper-parameters of the approximate model in order to generate the approximate model best suited to the collected engineering big data. For example, once the most suitable metamodel type has been determined through learning and a metamodel capable of processing big data has been built with machine learning, the hyper-parameter optimization technique improves the accuracy of the metamodel through detailed parameter tuning. This technique maximizes the accuracy of the metamodel by optimizing its hyper-parameters with an optimization processor built on accumulated know-how.
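
Hyper-parameter optimization can be sketched as a search over candidate values scored on held-out data. The example below assumes a k-nearest-neighbor regressor as a stand-in metamodel and an exhaustive grid search over `k`; neither is stated in the specification, and the function names are hypothetical.

```python
import math


def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def validation_rmse(train, validation, k):
    """Root-mean-square error of the k-NN metamodel on held-out data."""
    squared = [(knn_predict(train, x, k) - y) ** 2 for x, y in validation]
    return math.sqrt(sum(squared) / len(squared))


def tune_k(train, validation, candidates):
    """Grid search: score every candidate k and return the best one."""
    scores = {k: validation_rmse(train, validation, k) for k in candidates}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Hypothetical response y = x^2 sampled at even x, validated at odd x.
train = [(x, x * x) for x in range(0, 11, 2)]
validation = [(x, x * x) for x in (1, 3, 5, 7, 9)]
best_k, scores = tune_k(train, validation, candidates=[1, 2, 3])
```

With this data, averaging the two neighbors on either side of each validation point gives the lowest error, so the search settles on `k = 2`. A real metamodel would typically have several hyper-parameters and a less exhaustive search strategy.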

The approximate model evaluation module 233 evaluates the approximate model constructed by the learning function. As described above, the evaluation uses the analysis/experimental data for evaluation contained in the database 300. For example, the difference between the actual value and the approximate value can be calculated by substituting the evaluation data into the generated approximation function. The approximate model construction system 200 receives the level desired by the user through the approximate model evaluation module 233 and evaluates whether the approximate model has reached that level.

If the level desired by the user is met, construction of the approximate model is complete; otherwise, experimental training data is added by means of the design of experiments or the like, and the approximate model is trained further on the added data.
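
The evaluate-then-add-data cycle described above can be sketched as a loop. The example assumes a one-dimensional piecewise-linear interpolant as the metamodel, RMSE as the accuracy measure, and a refinement rule that runs a new experiment at the midpoint of the training interval containing the worst evaluation error. All three are illustrative assumptions, as is the `experiment` callback that stands in for a simulation or physical test.

```python
import math


def interpolate(train, x):
    """Piecewise-linear metamodel through the (x, y) training points."""
    pts = sorted(train)
    if x <= pts[0][0]:
        return pts[0][1]
    if x >= pts[-1][0]:
        return pts[-1][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def enclosing_interval(train, x):
    """Return the neighboring training x-values that bracket x."""
    xs = sorted(px for px, _ in train)
    for x0, x1 in zip(xs, xs[1:]):
        if x0 <= x <= x1:
            return x0, x1
    raise ValueError("x lies outside the sampled range")


def refine(experiment, train, evaluation, target_rmse, max_rounds=20):
    """Add training points until the RMSE on the evaluation data meets the
    user's target or the round limit is reached."""
    train = list(train)
    for round_no in range(max_rounds + 1):
        errors = [(abs(interpolate(train, x) - y), x) for x, y in evaluation]
        rmse = math.sqrt(sum(e * e for e, _ in errors) / len(errors))
        if rmse <= target_rmse or round_no == max_rounds:
            return train, rmse
        # Sample where the error is largest: midpoint of the bracketing interval.
        _, worst_x = max(errors)
        x0, x1 = enclosing_interval(train, worst_x)
        new_x = (x0 + x1) / 2
        train.append((new_x, experiment(new_x)))


# Hypothetical experiment y = x^2 with two initial runs at the domain ends.
final_train, final_rmse = refine(
    experiment=lambda x: x * x,
    train=[(0, 0), (4, 16)],
    evaluation=[(1, 1), (2, 4), (3, 9)],
    target_rmse=0.3,
)
```

In this run three extra experiments are added before the target is met. Keeping the evaluation points separate from the points that are added to the training set avoids scoring the model on data it has already seen.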

The approximate model generation module 234 generates the approximate model using the learning function. That is, the approximate model construction system 200 generates the approximate model through the approximate model generation module 234 by adding the learning function and experimental training data.

Hereinafter, various embodiments of the approximate model construction system 200 are described.

FIG. 2 is an exemplary diagram of an approximate model construction system according to an embodiment of the present invention.

Referring to FIG. 2, the overall environment 1000 for constructing an approximate model includes a user terminal 100, the approximate model construction system 200, a database 300, and a network 400. A user may construct an approximate model by using the approximate model construction system 200 directly, or by connecting to it through the user terminal 100. In both cases, the database 300 is essential.

The user terminal 100 is a device that receives the model construction service provided by the approximate model construction system 200. The user terminal 100, as a device that carries out construction of the approximate model, includes an input device and an output device as its components, and may be a computing device, a terminal, or a wireless terminal.

Embodiments of the wireless terminal may include, but are not limited to, a cellular telephone, a smartphone with a wireless communication function, a personal digital assistant (PDA) with a wireless communication function, a wireless modem, a portable computer with a wireless communication function, an imaging device such as a digital camera with a wireless communication function, a gaming device with a wireless communication function, a music storage and playback appliance with a wireless communication function, an Internet appliance capable of wireless Internet access and browsing, and portable units or terminals that integrate combinations of these functions.

The approximate model construction system 200 is connected to the user terminal 100 and the database 300 through the network and provides the user terminal 100 with a service for constructing approximate models. The approximate model construction system 200 may be implemented as a web server, a cloud server, or a file server.

The database 300 may be built to hold experimental data generated under a specific plan, such as a design of experiments. The experimental data held in the database 300 includes training data used for machine learning and evaluation data used to evaluate the machine-learned approximate model.

The network 400 may be any suitable communication network, including wired and wireless networks, for example the Internet, intranets and extranets, cellular networks such as wireless telephone networks, local area networks (LAN), wide area networks (WAN), WiFi networks, ad hoc networks, and combinations thereof.

The network 400 may include connections of network elements such as hubs, bridges, routers, switches, and gateways. The network 400 may include one or more connected networks, for example a multi-network environment, including public networks such as the Internet and private networks such as a secure corporate network. Access to the network 400 may be provided through one or more wired or wireless access networks.

The user terminal 100 according to an embodiment of the present invention may be implemented as a computing device composed of one or more central processing units (CPUs), memory, mass storage, an input interface device, and an output interface device. The components of the computing device can communicate with one another over a bus.

The hardware platform of the computing device may take many forms, including personal computers, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, programmable consumer electronics, and distributed computing environments that include any of the above systems or devices, such as cloud-based computing systems.

FIG. 3 is a block diagram of a computing device corresponding to the user terminal 100 according to an embodiment of the present invention.

Referring to FIG. 3, the computing device 500 includes an input interface device 510, an output interface device 520, a memory 531, a storage device 532, a power supply 540, a processor 550, a network interface device 560, a wireless communication device 570, and a bus 580.

입력 인터페이스 장치(510)는 사용자의 입력에 따라 문서작성에 필요한 문자 또는 개체를 입력한다. 입력 인터페이스 장치(510)는 키보드(keyboard), 터치스크린(touch screen), 마우스(mouse), 전자펜(stylus pen) 및 펜 태블릿(pen tablet)을 포함하되, 이에 한정되는 것은 아니다.The input interface device 510 inputs a character or an object necessary for writing a document according to a user's input. The input interface device 510 includes, but is not limited to, a keyboard, a touch screen, a mouse, an stylus pen, and a pen tablet.

출력 인터페이스 장치(520)는 문서편집 애플리케이션 모듈 관련 사용자 인터페이스 등을 표시하는 디스플레이(display) 및 문서를 프린트 출력하는 프린터(printer)를 포함한다. 또한, 출력 인터페이스 장치(520)는 문서 내의 문자를 음성합성(text to speech, TTS) 엔진을 이용하여 음성으로 출력하는 스피커(speaker), 헤드폰(head-phone) 및 헤드셋(head-set)을 포함한다.The output interface device 520 includes a display displaying a document editing application module-related user interface and the like, and a printer that prints and outputs the document. In addition, the output interface device 520 includes a speaker, a head-phone, and a head-set that outputs text in a document as voice using a text to speech (TTS) engine. do.

프로세서(550)는 메모리(531) 및/또는 저장 장치(532)에 저장된 본 발명의 일 실시 예에 따른 데이터 백업 방법에 관한 문서편집 애플리케이션/서버 모듈(174/274)이 포함하고 있는 컴퓨터 명령어 셋을 실행할 수 있다. 프로세서(550)는 중앙 처리 장치(central processing unit, CPU), 그래픽 처리 장치(graphics processing unit, GPU) 또는 본 발명에 따른 방법들이 수행되는 전용의 프로세서를 의미할 수 있다. 메모리(531)와 저장 장치(532)는 휘발성 저장 매체 및/또는 비휘발성 저장 매체로 구성될 수 있다. 예를 들어, 메모리(531)는 읽기 전용 메모리(read only memory, ROM) 및/또는 랜덤 액세스 메모리(random access memory, RAM)로 구성될 수 있다.The processor 550 is a computer instruction set included in a document editing application / server module 174/274 related to a data backup method according to an embodiment of the present invention stored in the memory 531 and / or the storage device 532 You can run The processor 550 may refer to a central processing unit (CPU), a graphics processing unit (GPU), or a dedicated processor on which methods according to the present invention are performed. The memory 531 and the storage device 532 may be composed of volatile storage media and / or non-volatile storage media. For example, the memory 531 may be composed of read only memory (ROM) and / or random access memory (RAM).

The wireless communication device 570 includes devices for short-range wireless communication, wireless data communication, and wireless voice communication.

The components included in the computing device 500 are connected by the bus 580 and communicate with one another.

FIG. 4 is a flowchart of a method for constructing an approximate model according to an embodiment of the present invention.

Referring to FIG. 4, the approximate model construction method (S100) includes collecting engineering data (S110), pre-processing the collected data (S120), and constructing an approximate model using the pre-processed data (S130).

First, the approximate model construction system 200 collects engineering experiment data (S110). Engineering experiment data can be collected through experiments, for example experiments planned by a rule-based design of experiments. In an embodiment of the present invention, the engineering experiment data used for setting up the approximate model, optimizing its parameters, and evaluating it may be generated all at once before the approximate model is set up, or may be added sequentially according to the results of parameter optimization.

Various designs of experiments exist, each governed by different rules. From among them, the design of experiments that matches the rules preset for the experiment can be selected automatically. The design of experiments to use can also be determined while evaluating the complexity of the approximate model in light of the experimental results.

Next, the approximate model construction system 200 performs a data pre-processing step (S120), in which it uses the collected engineering experiment data to build a database for machine learning, based on the data needed to train the approximate model.

Next, the approximate model construction system 200 performs a machine learning step (S130), in which it sets up an approximate model using the database for machine learning, optimizes the hyper-parameters of the approximate model, and generates the approximate model through repeated hyper-parameter optimization.

The approximate model construction system 200 according to an embodiment of the present invention is based on engineering big data. The approximate model construction system 200 can pre-process the engineering big data, set up an approximate model, train the approximate model using the data for machine learning, optimize the approximate model during training, evaluate the approximate model, and finally construct the approximate model. The engineering big data used for training and evaluation is used to predict quantitative performance indices rather than qualitative indicators, and is obtained through a standardized method such as a design of experiments.

Experiments in engineering are performed to reproduce the behavior of a part, product, or system. The most important requirement of such experiments is reproducibility: repeated experiments must yield results that do not differ significantly. It would be fortunate if a single experiment were enough to capture the desired behavior, but in many cases several experiments are required to understand how the behavior changes with the design factors.

To understand how behavior changes with the design factors, experiments must be performed while varying each design factor. This raises the question of how many experiments are sufficient and how the design-factor values for each experiment should be chosen. The answer is provided by the design of experiments, commonly abbreviated DOE. Design of experiments is a technique for systematically determining, for a given number of design factors, the minimum number of experiments needed to accurately capture the trend in product behavior, together with the design-factor values for each experiment. The most representative DOE is the orthogonal array, a table that gives the number of experiments and the combination of design-factor levels for each experiment, according to the number of design factors and the number of levels of each factor.
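As an illustration of the orthogonal array described above, the following minimal Python sketch lists the standard L4 array for three two-level factors and checks its balance property, namely that every pair of columns contains each combination of levels equally often. The factor names and level values (`thickness_mm`, `width_mm`, `material`) and the helper functions are hypothetical and are not taken from the embodiment.

```python
from itertools import combinations, product

# Standard L4(2^3) orthogonal array: 4 runs, 3 factors, 2 levels each.
L4 = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]

def is_orthogonal(array, levels=2):
    """Return True if every pair of columns holds each level pair equally often."""
    n_cols = len(array[0])
    expected = len(array) // (levels ** 2)
    for a, b in combinations(range(n_cols), 2):
        counts = {}
        for row in array:
            key = (row[a], row[b])
            counts[key] = counts.get(key, 0) + 1
        for combo in product(range(levels), repeat=2):
            if counts.get(combo, 0) != expected:
                return False
    return True

def to_design(array, factor_levels):
    """Map level indices to design-factor values (names and values are hypothetical)."""
    names = list(factor_levels)
    return [
        {name: factor_levels[name][row[i]] for i, name in enumerate(names)}
        for row in array
    ]

design = to_design(
    L4,
    {"thickness_mm": (1.0, 2.0), "width_mm": (10.0, 20.0), "material": ("A", "B")},
)
for run in design:
    print(run)
```

With three two-level factors a full factorial plan would need 2^3 = 8 runs, whereas the L4 array needs only 4 while keeping every pair of factors balanced.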

DOE helps to investigate simultaneously the effects of input variables (factors) on output variables (responses). Such an experiment consists of a series of runs, or tests, in which the input variables are changed for a specific purpose. Data are collected in each run. Design of experiments is used to identify the process conditions and product components that affect quality, and then to find the factor settings that optimize the results.

In addition, the engineering data is collected through IoT sensors, simulations, and experiments.

Here, the data pre-processing step (S120), which processes data to build the database for machine learning, includes building the database for machine learning (S121). Step S121 includes extracting characteristic factors for each data item using the collected engineering data (S122), and generating a label for each data item using the collected engineering data (S123).

Characteristic factor (feature) extraction is related to feature engineering. Feature engineering is the process of adding information to data by using existing variables. It is one way to make existing data more useful without adding new observations or variables.

Among these methods, scaling is used when the unit of a variable needs to be changed, when the distribution of a variable is skewed, or when the relationship between variables is not readily apparent.

The most frequently used approach is to apply the log function; a similar but less common approach is to take the square root.
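The effect of the two scaling transforms on a skewed variable can be sketched as follows. This is a minimal example on made-up values; the use of `log(1 + v)` instead of a plain logarithm (so that zeros stay defined) and the moment-based `skewness` helper are assumptions of this sketch, not part of the embodiment.

```python
import math

def log_scale(values):
    """Apply log(1 + v); an assumed variant that keeps zero values defined."""
    return [math.log1p(v) for v in values]

def sqrt_scale(values):
    """Apply the square root to each non-negative value."""
    return [math.sqrt(v) for v in values]

def skewness(values):
    """Population skewness: third central moment divided by std cubed."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum((v - mean) ** 3 for v in values) / (n * std ** 3)

# Made-up, strongly right-skewed sample.
raw = [1, 2, 3, 5, 8, 13, 100, 1000]

print("raw :", skewness(raw))
print("sqrt:", skewness(sqrt_scale(raw)))
print("log :", skewness(log_scale(raw)))
```

On this sample both transforms reduce the skewness, and the log transform reduces it more than the square root.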

Binning converts a continuous variable into a categorical variable. For example, if annual salary data is stored as numbers, it is converted into categories such as below 100, 101 to 200, and so on.

Binning follows no fixed rule, so it can be done creatively, guided by the analyst's understanding of the business.
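A minimal sketch of binning the salary example is given below. The bin edges and labels are assumptions chosen for illustration (values up to 100, 101 to 200, and above 200, for integer salaries); the embodiment does not fix them.

```python
from bisect import bisect_left

# Assumed upper edges (inclusive) and labels; one more label than edges.
UPPER_EDGES = [100, 200]
LABELS = ["up to 100", "101 to 200", "over 200"]

def bin_value(value, upper_edges=UPPER_EDGES, labels=LABELS):
    """Return the label of the first bin whose upper edge is >= value."""
    return labels[bisect_left(upper_edges, value)]

salaries = [45, 100, 101, 180, 350]
binned = [bin_value(s) for s in salaries]
print(binned)
```

Changing `UPPER_EDGES` and `LABELS` is all that is needed to try a different binning scheme, which matches the point that the choice is left to the analyst.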

The transform method creates a new variable from the properties of an existing variable. For example, given sales data by date, a variable that classifies each date as a weekday or a weekend can be added; for sports attendance data, a variable indicating whether a particular team played on that day can be added.

Transformation likewise follows no fixed rule, and a variety of variables can be created depending on the analyst's understanding of the business.
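The weekday/weekend example can be sketched in a few lines of Python. The record layout (`date`, `units`) and the derived field name `is_weekend` are hypothetical.

```python
from datetime import date

def add_weekend_flag(records):
    """Return copies of the records with a derived 'is_weekend' variable."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return [{**r, "is_weekend": r["date"].weekday() >= 5} for r in records]

# Hypothetical daily sales records.
sales = [
    {"date": date(2018, 10, 26), "units": 12},  # a Friday
    {"date": date(2018, 10, 27), "units": 30},  # a Saturday
]

enriched = add_weekend_flag(sales)
for row in enriched:
    print(row["date"].isoformat(), row["units"], row["is_weekend"])
```

The original records are left unchanged; the new variable is derived purely from information already present in the date.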

The dummy method, the opposite of binning, is used to convert a categorical variable into a continuous variable. It is used mainly when the intended analysis method requires it.
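One common way to realize the dummy method is to replace a categorical variable with one 0/1 indicator column per category. The sketch below assumes this indicator encoding; the category names are made up.

```python
def dummy_encode(values):
    """Return (categories, rows), where each row holds one 0/1 indicator per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Hypothetical categorical variable.
materials = ["steel", "aluminum", "steel"]
categories, rows = dummy_encode(materials)

print(categories)
for row in rows:
    print(row)
```

Each row contains exactly one 1, so the numeric columns can be passed to analysis methods that accept only numeric input.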

In addition, the data pre-processing step (S120) further includes selecting key characteristic factors using the database for machine learning (S124).

A key characteristic factor is a characteristic factor that affects the label. The selected key characteristic factors are used to set up the approximate model in the machine learning step.
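The embodiment does not state how the influence of a characteristic factor on the label is measured. As one possible illustration, the sketch below ranks factors by the absolute Pearson correlation with the label and keeps those above a threshold. The criterion, the threshold of 0.5, and the data are all assumptions of this sketch.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)

def select_key_features(features, label, threshold=0.5):
    """Keep the names of features whose |correlation| with the label meets the threshold."""
    return [
        name for name, column in features.items()
        if abs(pearson(column, label)) >= threshold
    ]

# Made-up data: the label depends almost linearly on 'thickness' only.
features = {
    "thickness": [1, 2, 3, 4, 5],
    "batch_code": [3, 1, 4, 1, 5],
    "constant": [2, 2, 2, 2, 2],
}
label = [2.1, 3.9, 6.2, 8.0, 9.9]

print(select_key_features(features, label))
```

A linear correlation measure misses non-linear dependencies, so a practical system might use a different importance measure.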

Next, the approximate model generation step (S130) includes: setting up an approximate model using the key factors (S131); optimizing the hyper-parameters of the set approximate model and generating the approximate model through repeated hyper-parameter optimization (S132); evaluating the accuracy of the parameter optimization of the approximate model (S133); and, where applicable, additionally generating engineering data depending on the accuracy of the parameter optimization (S134).

The behavior or performance of the design target that the approximate model represents is influenced by many factors. Some of these factors can vary, while others are fixed at constant values. The factors that can vary are the variables, and the fixed factors are the constraints.

Among the variables, those that have an especially large effect on the performance of the design target are called design variables, and designing the design target amounts to determining these design variables. Note that even for the same design target, the design variables may differ if the target performance differs.

In machine learning, determining the design variables of a design target corresponds to constructing the approximate model: the approximate model is set up, that is, its parameter values are set, and the optimal parameter values are then found through hyper-parameter optimization.

Hyper-parameter optimization is the task of finding the optimal parameters in order to generate the approximate model best suited to the collected engineering big data. The approximate model construction system 200 may perform hyper-parameter optimization many times to find the optimal parameter values.
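The embodiment does not name a metamodel type or a search strategy. As a minimal sketch under stated assumptions, the example below uses a Gaussian kernel smoother as the approximate model, treats its bandwidth as the single hyper-parameter, and selects the bandwidth by grid search on the leave-one-out error. The model, the candidate grid, and the sine-shaped response are all illustrative choices.

```python
import math

def predict(x, xs, ys, bandwidth):
    """Gaussian kernel smoother: weighted average of the training responses."""
    weights = [math.exp(-((x - xi) / bandwidth) ** 2) for xi in xs]
    total = sum(weights)
    if total == 0.0:  # every weight underflowed; fall back to the plain mean
        return sum(ys) / len(ys)
    return sum(w * y for w, y in zip(weights, ys)) / total

def loo_rmse(xs, ys, bandwidth):
    """Leave-one-out root-mean-square error for one bandwidth value."""
    squared = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        squared += (predict(xs[i], rest_x, rest_y, bandwidth) - ys[i]) ** 2
    return math.sqrt(squared / len(xs))

def grid_search(xs, ys, candidates):
    """Evaluate every candidate and return the one with the lowest error."""
    scores = {h: loo_rmse(xs, ys, h) for h in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Made-up experiment data: 13 samples of a sine-shaped response.
xs = [0.5 * i for i in range(13)]
ys = [math.sin(x) for x in xs]
candidates = [0.1, 0.3, 1.0, 3.0]

best, scores = grid_search(xs, ys, candidates)
for h in candidates:
    print(f"bandwidth={h}: LOO RMSE={scores[h]:.3f}")
print("selected bandwidth:", best)
```

Repeating the search with a finer grid around the selected value is one simple way to realize the repeated optimization mentioned above.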

The approximate model trained with the machine learning algorithm is evaluated to determine whether it is the approximate model the user wants. The approximate model construction system 200 evaluates the approximate model through the approximate model evaluation module 233. The user's requirements are entered and the approximate model is evaluated against them; if the trained approximate model meets the user's expectations, the approximate model is finally generated through the approximate model generation module 234.

If the trained approximate model falls short of the entered user expectations, further training follows. Various data, including experimental data generated by a design of experiments for training and evaluating the machine learning model, can be added to the database continuously for further training.
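The evaluate-and-extend cycle described above (steps S131 to S134) can be sketched as a loop that rebuilds the model, compares its error with the user's target, and adds a sample where the error is largest when the target is not met. Everything in this sketch is an assumption made for illustration: the piecewise-linear model, the `run_experiment` stand-in for a real experiment, the dense check grid, and the rule of sampling at the worst point. In practice a held-out validation set would replace the check grid.

```python
import math
from bisect import bisect_right

def run_experiment(x):
    """Hypothetical stand-in for one experiment or simulation run."""
    return math.sin(3 * x) + 0.5 * x

def predict(x, xs, ys):
    """Piecewise-linear approximate model over the sorted sample points xs."""
    j = min(max(bisect_right(xs, x), 1), len(xs) - 1)
    t = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + t * (ys[j] - ys[j - 1])

def build_metamodel(target_error, max_rounds=50):
    """Add samples until the worst error on the check grid meets the target."""
    xs = [0.0, 1.0, 2.0]  # initial design points
    check_points = [2 * i / 100 for i in range(101)]
    worst_error = float("inf")
    for _ in range(max_rounds):
        ys = [run_experiment(x) for x in xs]
        errors = [
            (abs(predict(c, xs, ys) - run_experiment(c)), c) for c in check_points
        ]
        worst_error, worst_x = max(errors)
        if worst_error <= target_error:
            return xs, worst_error, True
        # Target not met: add a new sample where the error is largest.
        xs = sorted(xs + [worst_x])
    return xs, worst_error, False

samples, final_error, converged = build_metamodel(target_error=0.05)
print(len(samples), "samples, worst error", round(final_error, 4), converged)
```

Because new samples are placed where the current error is largest, the sample set grows only in the regions where the model is still inaccurate.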

The components of the approximate model construction system 200 of FIG. 1 are divided by function for convenience of description; in hardware they may be implemented as logical functions processed by a single processor, and the division presented does not limit the present invention.

In addition, although the connections between the components are not all shown, communication and transfer for control or data exchange may also occur between components for which no connecting line is drawn, and the present invention is not limited to what is shown.

The approximate model construction method (S100) according to the embodiment described with reference to the drawings may also be implemented in the form of a recording medium containing a computer-executable instruction set, such as a program module executed by a computer. A computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and non-volatile media as well as removable and non-removable media. A computer-readable medium may also include both computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave, or in another transport mechanism, and include any information delivery medium.

Thus, according to an embodiment of the present invention, an approximate model can be constructed from engineering big data through machine-learning-based training.

The foregoing description of the present invention is given by way of example, and a person of ordinary skill in the art to which the present invention pertains will understand that it can easily be modified into other specific forms without changing the technical idea or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

The scope of the present invention is defined by the claims that follow rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: user terminal
200: approximate model construction system
210: data collection unit
220: data pre-processing unit
221: characteristic factor extraction module
222: label generation module
233: key factor selection module
220: approximate model generation unit
231: approximate model setting module
232: optimization module
233: approximate model evaluation module
234: approximate model generation module

Claims

A method for constructing an approximate model, the method comprising:
collecting engineering data through experiments conducted according to a design of experiments;
a data pre-processing step of processing the collected engineering data to generate a training database for an approximate model (metamodel); and
setting up the approximate model using the pre-processed data, and generating the approximate model through repeated hyper-parameter optimization of the approximate model.

The method according to claim 1,
wherein the data pre-processing step comprises:
extracting characteristic factors for each data item using the collected engineering data; and
generating a label for each data item using the collected engineering data.

The method according to claim 1,
wherein the engineering data is collected through IoT sensors, simulations, and experiments.

The method according to claim 2,
wherein the data pre-processing step further comprises
selecting, from among the characteristic factors, key characteristic factors that influence the label, using a database for machine learning.

The method according to claim 1,
wherein generating the approximate model comprises:
setting up the approximate model using key factors;
optimizing parameters of the set approximate model; and
evaluating the accuracy of the approximate model based on the parameters.

The method according to claim 5,
wherein generating the approximate model further comprises
additionally generating engineering data according to the design of experiments, based on the evaluation result and in consideration of the error and the error distribution of the approximation function constituting the approximate model.

A system for constructing an approximate model, the system comprising:
a data collection unit that collects engineering data through experiments conducted according to a design of experiments;
a data pre-processing unit that processes the collected engineering data to generate a training database for an approximate model (metamodel); and
an approximate model generation unit that sets up the approximate model using the pre-processed data and generates the approximate model through repeated hyper-parameter optimization of the approximate model.

The system according to claim 7,
wherein the data pre-processing unit comprises:
a characteristic factor extraction module that extracts characteristic factors for each data item using the collected engineering data; and
a label generation module that generates a label for each data item using the collected engineering data.

The system according to claim 7,
wherein the engineering data is collected through IoT sensors, simulations, and experiments.

The system according to claim 8,
wherein the data pre-processing unit further comprises
a key factor selection module that selects, from among the characteristic factors, key characteristic factors that influence the label, using a database for machine learning.

The system according to claim 9,
wherein the approximate model generation unit comprises:
an approximate model setting module that sets up the approximate model using key factors;
an approximate model optimization module that optimizes parameters of the set approximate model; and
an approximate model evaluation module that evaluates the accuracy of the approximate model based on the parameters.

The system according to claim 11,
wherein the data collection unit
additionally generates engineering data according to the design of experiments, based on the evaluation result and in consideration of the error and the error distribution of the approximation function constituting the approximate model.