KR102009284B1

KR102009284B1 - Training apparatus for training dynamic recurrent neural networks to predict performance time of last activity in business process

Info

Publication number: KR102009284B1
Application number: KR1020180149916A
Authority: KR
Inventors: 김정연; 윤석준; 이보경
Original assignee: 주식회사 피엠아이지
Priority date: 2018-11-28
Filing date: 2018-11-28
Publication date: 2019-08-09

Abstract

Disclosed are a business process prediction method that uses deep learning in the form of a dynamic recurrent neural network to predict the execution time of the last activity in each case, that is, a final performance indicator, and an apparatus suitable for the method. The business process prediction method comprises: a step of preparing a recurrent neural network whose sequence is unrolled to the maximum process length of the cases to be predicted; a preprocessing step of padding each case in the training data and test data obtained from an event log to the length of the longest case, and adding to each case time category information that categorizes the execution time of its last activity; a hyperparameter setting step of setting hyperparameters for training; a dynamic recurrent neural network training step of training the recurrent neural network using the padded and time-categorized training data and the hyperparameters set in the hyperparameter setting step; and a prediction step of calculating and providing the prediction accuracy of the dynamic recurrent neural network using the test data preprocessed in the preprocessing step.

Description

Training apparatus for training dynamic recurrent neural networks to predict performance time of last activity in business process

The present invention relates to a method and apparatus for predicting a business process, and more particularly to a method of predicting the execution time of the last activity (i.e., event) in each case (i.e., business process instance) using deep learning in the form of a dynamic recurrent neural network, a training method suitable for that prediction method, and an apparatus suitable for both.

Predicting business process behavior is an important aspect of business process management.

Companies that can embed predictive analytics into their business processes can create differentiated value. Accordingly, the ability to predict the future behavior of a process is becoming a core competency for companies.

For example, if a company can predict the late or early receipt of purchased materials, it can take the necessary measures in advance and so prevent production stoppages caused by late receipt and long-term inventory caused by early receipt.

Such business process prediction is based on event logs obtained from the real world, and includes predicting the next activity, predicting the next path of a running case, predicting the remaining cycle time, predicting deadline violations, and so on.

Such business process prediction can be applied to ERP (Enterprise Resource Planning), WMS (Warehouse Management System), ITSM (IT Service Management), and the like.

In conventional business process prediction, process prediction using deep learning in the form of a static recurrent neural network (RNN) has attracted great interest and has produced excellent results. Research on process prediction using deep learning in the form of a static recurrent neural network adopts the approach used when artificial neural networks are applied to natural language processing to predict the next word in a sentence.

Specifically, conventional process prediction using deep learning in the form of a static recurrent neural network maps the event log to a document, a case (process instance) to a sentence, and an event (activity) to a word, and then applies a process prediction approach based on a static recurrent neural network.

Evermann et al. disclosed predicting the next event in a business process using deep learning in the form of a recurrent neural network (RNN). (Joerg Evermann, Jana-Rebecca Rehse, Peter Fettke, "Predicting Process Behaviour using Deep Learning", Memorial University of Newfoundland, St. John's, NL, Canada; German Research Center for Artificial Intelligence, Saarbrücken, Germany; Saarland University, Saarbrücken, Germany; arXiv:1612.04600v2 [cs.LG], 22 Mar 2017)

Process prediction research using deep learning in the form of a static recurrent neural network, as presented by Evermann et al., exploits the process implicitly learned inside the recurrent neural network and assumes a nonlinear relationship between the predictor variables and the target variable. Such research can therefore reduce the constraints imposed on process model representation and improve prediction accuracy. For this reason, process prediction using deep learning is regarded as a new and innovative approach.

Such conventional process prediction methods, however, do not accurately reflect the actual behavior of business processes, and they require considerable effort and time each time they are applied to a specific process.

In particular, because the unit of training and prediction is fixed, these methods are limited in analyzing real-world processes. Specifically, real-world cases have different lengths, that is, different numbers of events, but conventional process prediction assumes that cases have a fixed length and therefore cannot produce sufficient and accurate predictions.

FIG. 1 shows an example of an event log.

Information systems such as ERP (Enterprise Resource Planning), WMS (Workflow Management System), and ITSM (IT Service Management) record the event data generated whenever work is performed in the real world. An event represents an activity performed, a decision made, or the occurrence of a matter of interest.

Each event has a timestamp indicating when it occurred and a case ID attribute indicating that it belongs to a specific process instance (or case). In addition, each event corresponds to a specific activity performed in the real world.

As shown in FIG. 1, an event log consists of a set of cases, each of which is a collection of events that occurred in chronological order. Records of events that occur in the real world are stored in a database as big data, and the event log is created by analyzing this big data.

The event log is a collection of cases (process instances), arranged in chronological order. Each case includes a case ID, activity names, and timestamps.

Referring to FIG. 1, five cases are recorded in the event log. The first case has four events, A, B, C, and D, which occurred in the order A->B->C->D.

The fifth case has three events, which occurred in the order A->B->C.
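The structure described above (events carrying a case ID, an activity name, and a timestamp, grouped into time-ordered cases) can be sketched as follows. This is only an illustrative sketch: the record layout, the `build_cases` helper, and all timestamps are assumptions; only the traces of Case 1 (A, B, C, D) and Case 5 (A, B, C) are taken from the text.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event records: (case_id, activity, timestamp).
# The timestamps are invented for illustration only.
EVENTS = [
    ("1", "B", "2018-01-01 10:00"),
    ("1", "A", "2018-01-01 09:00"),
    ("5", "A", "2018-01-02 09:00"),
    ("1", "C", "2018-01-01 11:00"),
    ("5", "B", "2018-01-02 09:30"),
    ("1", "D", "2018-01-01 12:00"),
    ("5", "C", "2018-01-02 10:15"),
]


def build_cases(events):
    """Group events by case ID and order each case by timestamp."""
    grouped = defaultdict(list)
    for case_id, activity, stamp in events:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        grouped[case_id].append((when, activity))
    # Sorting the (timestamp, activity) pairs yields the trace of each case.
    return {cid: [act for _, act in sorted(rows)] for cid, rows in grouped.items()}


cases = build_cases(EVENTS)
print(cases)
```

The two resulting traces have different lengths (four and three events), which is the property the rest of the description is concerned with.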

A recurrent neural network (RNN), a deep learning technique, has a structure suited to processing sequential data such as speech, video, and documents. However, when the distance between a piece of relevant information and the point where that information is used is large, for example when mutually important pieces of information are often far apart in time, as in natural language processing, the gradient gradually shrinks during backpropagation and the learning ability of the RNN is known to degrade significantly. This is called the vanishing gradient problem.
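The shrinking gradient can be illustrated numerically with a toy model. The sketch below assumes a scalar RNN of the form h_t = tanh(w * h_{t-1} + x_t), which is a simplification chosen for illustration and not the network described in this document; the function name and the values of `w` and the inputs are likewise assumptions.

```python
import math


def gradient_through_time(w, inputs, h0=0.0):
    """Return |d h_T / d h_0| for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad = h0, 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
    return abs(grad)


# With |w| < 1 every factor is below 1, so the product decays with distance.
near = gradient_through_time(0.9, [0.5] * 3)
far = gradient_through_time(0.9, [0.5] * 50)
print(near, far)
```

The gradient that reaches a state 50 steps back is orders of magnitude smaller than the one that reaches a state 3 steps back, so early inputs contribute almost nothing to the weight update.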

LSTM (Long Short-Term Memory) was devised to overcome this problem. LSTM is a kind of RNN that, compared with a plain RNN, can learn how information from the distant past influences the current decision. LSTM is now one of the networks widely used in machine translation and natural language processing.

LSTM is mainly applied to time series problems, which describe phenomena that change over time. It distinguishes the states used to determine the current values from the values observed at earlier time steps by means of the hidden vectors generated at each time step.

FIG. 2 shows an example of an LSTM.

The recurrent neural network shown in FIG. 2 consists of two layers and has LSTM cells unrolled over three steps (that is, the sequence length is 3). GRU (Gated Recurrent Unit) or plain RNN cells may be used instead of LSTM cells. In practice, it is not easy to determine optimal values for hyperparameters such as the number of layers, the sequence length, and the type of recurrent cell (e.g., LSTM, GRU, or RNN). Determining the optimal hyperparameter values requires many repeated experiments. Here, the sequence length (sequence size) means the number of steps over which the RNN model is unrolled.

Recurrent neural networks can be of the one-to-many, many-to-one, or many-to-many type. For example, a recurrent neural network trained on scores composed by Mozart can generate music. In this case, a one-to-many recurrent neural network is applied.

A recurrent neural network that determines whether each word in a sentence is a person's name has the many-to-many type shown in FIG. 2. A recurrent neural network that predicts a rating from a user review consisting of many words has the many-to-one type.

Most process prediction research has focused on predicting process outcomes (e.g., the time remaining until completion) using a variety of techniques. More recently, some studies have attempted to predict the next step of a process without using deep learning. These studies are based on explicit model representations and use linear techniques such as regression trees.

In contrast, process prediction research using deep learning in the form of a static recurrent neural network exploits the process implicitly learned inside the recurrent neural network and assumes a nonlinear relationship between the predictor variables and the target variable. It can therefore reduce the constraints imposed on (process) model representation and improve prediction accuracy. For this reason, process prediction using deep learning in the form of a static recurrent neural network is regarded as a new and innovative approach.

FIG. 3 shows an example of data conversion for prediction and training using deep learning in the form of a static recurrent neural network.

Suppose that the last event (i.e., activity) of each case is to be predicted.

To perform training and prediction, a process prediction method using deep learning in the form of a static recurrent neural network converts the event log for the five cases shown in FIG. 1 into data for training and prediction with a static recurrent neural network, as shown in FIG. 3.

If the recurrent neural network structure of FIG. 2 is used, the sequence length is 3, so the three events performed before the last event, together with the last event to be predicted, are used for training and prediction. The event records of Cases 1 and 3 can therefore be used in full.

In Case 2, however, the first event A must be dropped, because only the events B, C, and D that fit the sequence length (3) can be used to predict the last event D.

In Cases 4 and 5, once the event to be predicted (D or C) is excluded, the remaining events do not match the sequence length, so the entire event records of Cases 4 and 5 must be left out of training and prediction.
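The data loss caused by a fixed sequence length can be sketched as follows. This is a hedged illustration rather than the conversion of FIG. 3 itself: the function name `to_static_samples` and the trace `long_case` are hypothetical, and only the traces A, B, C, D (Case 1) and A, B, C (Case 5) come from the text.

```python
def to_static_samples(cases, seq_len=3):
    """Convert traces into fixed-length (inputs, target) pairs.

    The target is the last event of a trace and the inputs are the
    seq_len events immediately before it. Traces with fewer than
    seq_len preceding events cannot be used and are dropped.
    """
    samples, dropped = {}, []
    for case_id, trace in cases.items():
        prefix, target = trace[:-1], trace[-1]
        if len(prefix) < seq_len:
            dropped.append(case_id)      # whole case is lost
            continue
        samples[case_id] = (prefix[-seq_len:], target)  # older events are cut off
    return samples, dropped


cases = {
    "case_1": ["A", "B", "C", "D"],          # from the text
    "case_5": ["A", "B", "C"],               # from the text
    "long_case": ["A", "B", "C", "D", "E"],  # hypothetical longer trace
}
samples, dropped = to_static_samples(cases, seq_len=3)
print(samples)
print(dropped)
```

With a sequence length of 3, the four-event case is used in full, the longer case loses its first event, and the three-event case is discarded entirely.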

FIG. 4 shows another example of data conversion for prediction and training using deep learning in the form of a static recurrent neural network.

Existing research converts the event records for the five cases shown in FIG. 1 as shown in FIG. 4 and then uses them for training and prediction. Even in this case, the entire event record of Case 5 must be left out of training and prediction. Moreover, because the unit of training and prediction is an arbitrarily chosen sequence length rather than the case, which is the unit of process execution, this data conversion method can degrade performance in tasks such as last-event prediction. Experiments confirmed that the accuracy of last-event prediction drops significantly when the data conversion method of FIG. 4 is used.

Even with the first data conversion method (see FIG. 3), which gives the higher prediction accuracy, the entire event records of Cases 4 and 5 and part of the event record of Case 2 could not be included in the training data when training the static recurrent neural network model. A model trained on such data is therefore limited in that it does not accurately reflect actual process behavior in the real world. This limitation of process prediction models based on static recurrent neural networks ultimately affects prediction accuracy as well.

Joerg Evermann, Jana-Rebecca Rehse, Peter Fettke, "Predicting Process Behaviour using Deep Learning", Memorial University of Newfoundland, St. John's, NL, Canada; German Research Center for Artificial Intelligence, Saarbrücken, Germany; Saarland University, Saarbrücken, Germany; arXiv:1612.04600v2 [cs.LG], 22 Mar 2017

The present invention was devised to solve the above problems, and its object is to provide a business process prediction method using deep learning in the form of a dynamic recurrent neural network.

Another object of the present invention is to provide a business process prediction method capable of predicting the execution time of the last activity in a business process.

Still another object of the present invention is to provide a training method suitable for the above business process prediction method.

Still another object of the present invention is to provide a business process prediction apparatus suitable for the above prediction method.

Still another object of the present invention is to provide a business process training apparatus suitable for the above training method.

To achieve the above object, a business process prediction method according to the present invention, which predicts the execution time of the last activity of a business process using deep learning in the form of a dynamic recurrent neural network, comprises:

a step of preparing a recurrent neural network whose sequence is unrolled to the maximum process length of the cases (business processes) to be predicted;

a preprocessing step of padding each case in the training data and test data obtained from an event log to the length of the case with the maximum process length, and adding to each case time category information that categorizes the execution time of its last activity;

a hyperparameter setting step of setting hyperparameters for training;

a dynamic recurrent neural network training step of training the recurrent neural network using the training data padded and time-categorized in the preprocessing step and the hyperparameters set in the hyperparameter setting step; and

a prediction step of predicting the execution time of the last activity in each case using the trained dynamic recurrent neural network and an event log generated in real time.
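The preprocessing step listed above (padding every case to the maximum process length and attaching a time category for the last activity) can be sketched as follows. The text does not specify the padding token, the padding side, or how the execution time is binned, so the `<PAD>` token, padding at the end, and fixed-width bins over the time elapsed since the first event of the case are all assumptions made for illustration.

```python
from datetime import datetime

PAD = "<PAD>"  # hypothetical padding token


def pad_cases(traces, max_len=None):
    """Pad every trace at the end up to the longest trace (or max_len)."""
    if max_len is None:
        max_len = max(len(t) for t in traces)
    return [t + [PAD] * (max_len - len(t)) for t in traces]


def time_category(case_start, last_activity_time, bin_hours=24):
    """Map the elapsed time until the last activity to a bin index.

    Assumption: category k covers [k * bin_hours, (k + 1) * bin_hours).
    """
    elapsed_hours = (last_activity_time - case_start).total_seconds() / 3600
    return int(elapsed_hours // bin_hours)


padded = pad_cases([["A", "B", "C", "D"], ["A", "B", "C"]])
category = time_category(datetime(2018, 1, 1, 9), datetime(2018, 1, 3, 10))
print(padded)
print(category)
```

Because short cases are padded rather than dropped, no case has to be excluded from training for being shorter than the sequence length.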

Here, the recurrent neural network is an LSTM whose sequence is unrolled to the maximum process length of the cases to be predicted.

Here, the preprocessing step converts the event log into data for training and prediction with the recurrent neural network.

Here, the preprocessing step removes cases whose process length is 1.

Here, the preprocessing step determines activity names by word embedding.

Here, the hyperparameters include the batch size, the number of epochs, the activation function, and the optimization function.
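The four hyperparameters named above could be collected in a small configuration object, as in the sketch below. The class name `HyperParams` and every default value are hypothetical; the text names the hyperparameters but gives no values for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperParams:
    """Training settings named in the text; all defaults are illustrative."""
    batch_size: int = 32
    epochs: int = 50
    activation: str = "tanh"
    optimizer: str = "adam"

    def __post_init__(self):
        # Reject settings that cannot describe a valid training run.
        if self.batch_size <= 0 or self.epochs <= 0:
            raise ValueError("batch_size and epochs must be positive")


params = HyperParams(batch_size=64)
print(params)
```

Keeping the settings in one immutable object makes it straightforward to record which combination was used in each of the repeated experiments that hyperparameter tuning requires.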

Here, the training step divides the data parsed from the event log into training data and test data, trains on the training data, uses mini-batches, and uses cross-validation to prevent overfitting.
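The data handling in this training step (a training/test split, cross-validation folds, and mini-batches) can be sketched in plain Python as follows. The function names, the 80/20 split ratio, and the interleaved fold assignment are assumptions; the text does not state how the split or the folds are formed.

```python
import random


def train_test_split(items, test_ratio=0.2, seed=0):
    """Shuffle the items and split them into training and test lists."""
    idx = list(range(len(items)))
    random.Random(seed).shuffle(idx)
    cut = int(len(items) * (1 - test_ratio))
    return [items[i] for i in idx[:cut]], [items[i] for i in idx[cut:]]


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs; each index is validated exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


def mini_batches(indices, batch_size):
    """Yield consecutive slices of at most batch_size indices."""
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]


train, test = train_test_split(list(range(10)))
folds = list(k_fold_indices(len(train), k=4))
batches = list(mini_batches(list(range(5)), batch_size=2))
print(len(train), len(test), batches)
```

Each fold serves once as validation data while the model is fitted on the remaining folds in mini-batches; the test data stay untouched until the final accuracy is measured.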

Here, the training step indexes the input data and target data for cross-validation.

Here, the training step calculates the loss using cross-entropy and corrects errors using the backpropagation algorithm.
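For a single prediction over time categories, the cross-entropy loss and the gradient that backpropagation starts from can be sketched as follows. This assumes a softmax output layer over the categories, which the text does not state explicitly; the function names are illustrative.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the true category."""
    return -math.log(softmax(logits)[target_index])


def grad_wrt_logits(logits, target_index):
    """Gradient of the loss w.r.t. the logits: softmax(logits) - one_hot(target)."""
    probs = softmax(logits)
    return [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]


uniform_loss = cross_entropy([0.0, 0.0, 0.0], target_index=1)
confident_loss = cross_entropy([0.0, 5.0, 0.0], target_index=1)
grad = grad_wrt_logits([0.0, 5.0, 0.0], target_index=1)
print(uniform_loss, confident_loss, grad)
```

The loss falls as the network puts more probability on the correct time category, and the gradient is negative for the true category and positive for the others, which is the error signal propagated back through the unrolled network.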

Here, the training step optimizes the interval values used for categorization by training repeatedly while varying them.

Here, the method further comprises a training verification step of calculating and providing the prediction accuracy of the dynamic recurrent neural network using the test data preprocessed in the preprocessing step.

Here, when real-time data contains a case longer than the maximum process length used in training, the prediction step extends the sequence of the recurrent neural network module by the difference and applies padding.
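The handling of a real-time case that is longer than the maximum length seen in training can be sketched as follows. Only the data side is shown; how the network itself is extended is not covered here, and the helper name `fit_to_sequence` and the `<PAD>` token are hypothetical.

```python
PAD = "<PAD>"  # hypothetical padding token


def fit_to_sequence(trace, seq_len):
    """Return (padded_trace, new_seq_len).

    Shorter traces are padded up to seq_len. A longer trace is kept whole
    and the sequence length grows by the difference, so later traces are
    padded to the new length.
    """
    new_len = max(seq_len, len(trace))
    return trace + [PAD] * (new_len - len(trace)), new_len


seq_len = 4  # maximum process length seen during training (illustrative)
first, seq_len = fit_to_sequence(["A", "B", "C"], seq_len)
second, seq_len = fit_to_sequence(["A", "B", "C", "D", "E", "F"], seq_len)
third, seq_len = fit_to_sequence(["A", "B"], seq_len)
print(first, second, third, seq_len)
```

Unlike the fixed-length conversion, no event of the longer case is discarded; the cost is that subsequent shorter cases carry more padding.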

Here, the method further comprises a process mining step of searching for key predictor variables and hidden activities with reference to the event log,

and the preprocessing step optimizes the event log with reference to the search results of the process mining step.

To achieve another of the above objects, a business process training method according to the present invention, which trains a dynamic recurrent neural network module to predict the execution time of the last activity of a business process using deep learning in the form of a dynamic recurrent neural network, comprises:

a hyperparameter setting step of setting hyperparameters for training; and

a dynamic recurrent neural network training step of training the recurrent neural network using the training data padded and time-categorized in the preprocessing step and the hyperparameters set in the hyperparameter setting step.

Here, the recurrent neural network is an LSTM whose sequence is unrolled to the maximum process length.

Here, the preprocessing step converts the event log into data for training and prediction with the recurrent neural network.

Here, the preprocessing step removes cases whose process length is 1.

Here, the preprocessing step determines activity names by word embedding.

Here, the training step divides the data parsed from the event log into training data and test data, trains on the training data, uses mini-batches, and uses cross-validation to prevent overfitting.

Here, the training step optimizes the interval values used for categorization by training repeatedly while varying them.

To achieve still another of the above objects, a business process training apparatus according to the present invention, which trains a dynamic recurrent neural network module to predict the execution time of the last activity of a business process using deep learning in the form of a dynamic recurrent neural network, comprises:

a recurrent neural network module whose sequence is unrolled to the maximum process length of the cases (business processes) to be predicted;

a preprocessing module that parses the event log to obtain training data and test data, pads each case in the training data and test data to the length of the case with the maximum process length, and adds to each case a time category variable that categorizes the execution time of its last activity;

a hyperparameter setting module that sets hyperparameters for training; and

a dynamic recurrent neural network training module that trains the recurrent neural network using the training data padded and time-categorized by the preprocessing module and the hyperparameters set by the hyperparameter setting module.

Here, the recurrent neural network module is an LSTM whose sequence is unrolled to the maximum process length.

Here, the preprocessing module removes cases whose process length is 1.

Here, the training module trains using mini-batches and uses cross-validation to prevent overfitting.

Here, the training module indexes the input data and target data for cross-validation.

Here, the training module calculates the loss using cross-entropy and corrects errors using the backpropagation algorithm.

Here, the training module optimizes the interval values used for categorization by training repeatedly while varying them.

Here, the apparatus further comprises a training verification module that calculates and provides the prediction accuracy of the dynamic recurrent neural network using the test data padded and time-categorized by the preprocessing module.

Here, the apparatus further comprises a process mining module that searches for key predictor variables and hidden activities with reference to the event log,

and the preprocessing module optimizes the event log with reference to the search results of the process mining module.

A training apparatus that trains a dynamic recurrent neural network to predict the execution time of the last activity of a business process using deep learning in the form of a dynamic recurrent neural network comprises:

a preprocessing module that parses the event log to obtain training data and test data, pads each case in the training data and test data to the length of the case with the maximum process length, and adds to each case a time category variable that categorizes the execution time of its last activity; and

Here, the recurrent neural network module is an LSTM whose sequence is unrolled to the maximum process length of the cases to be predicted.

을 더 포함하는 것을 특징으로 한다.It characterized in that it further comprises.

Here, the apparatus further includes a process mining module that searches the event log for key predictor variables and hidden activities,

The process prediction method and apparatus according to the present invention were verified using real data recording the sales process of a domestic small or medium-sized enterprise. In the task of predicting the last event (i.e., activity) of a given process instance, the prediction accuracy of the proposed approach (96.29%) was 1.62 percentage points higher on test data than that of the existing approach. Given that the existing approach already achieved 94.67%, a further improvement of 1 to 2 percentage points by the approach of the present invention is considered an excellent result.

The present invention achieved better prediction accuracy than state-of-the-art approaches based on static recurrent neural networks. It was also verified that much less effort and time are required to predict process behavior.

The training method and apparatus according to the present invention can be applied to training for predicting various behaviors of any process whose work is carried out on an information system. Accordingly, the process prediction method and apparatus according to the present invention can become a core technology of the Fourth Industrial Revolution, which includes smart factories and digital finance.

FIG. 1 shows an example of an event log.
FIG. 2 shows an example of an LSTM.
FIG. 3 shows an example of data transformation in prediction and training using deep learning in the form of a static recurrent neural network.
FIG. 4 shows another example of data transformation in prediction and training using deep learning in the form of a static recurrent neural network.
FIG. 5 shows the structure of the dynamic recurrent neural network applied to the present invention.
FIG. 6 shows an example of an environment for building the dynamic recurrent neural network shown in FIG. 5.
FIG. 7 is a flowchart showing the business process prediction method according to the present invention.
FIG. 8 shows an example of categorizing activity execution times.
FIG. 9 shows an example of a warehouse management process map to which the business process prediction method according to the present invention is applied.
FIG. 10 shows the composition of the training data and test data.
FIG. 11 schematically illustrates the 10-fold cross-validation technique.
FIG. 12 shows the configuration of the business process prediction apparatus according to the present invention.
FIG. 13 shows an example of the dynamic recurrent neural network applied to the present invention.
FIG. 14 shows the prediction rate under 10-fold cross-validation.
FIG. 15 shows the cost under 10-fold cross-validation.

The present invention can be modified in various ways and can have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to particular embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. Like reference numerals are used for like elements in describing the drawings.

Terms such as first, second, A, and B may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be named a second component, and similarly a second component may be named a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of the plurality of related listed items.

When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between. In contrast, when a component is said to be "directly connected" or "directly coupled" to another component, it should be understood that no other component exists in between.

The terms used in this application are used only to describe particular embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to indicate the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this application.

Hereinafter, the configuration and operation of the present invention will be described in detail with reference to the accompanying drawings.

As described above, process prediction using deep learning in the form of a static recurrent neural network is regarded as a new and innovative approach, but it has the limitation that it does not accurately reflect the behavior of real processes. To overcome this limitation, the present invention proposes and implements a structure that uses deep learning in the form of a dynamic recurrent neural network.

FIG. 5 shows the structure of the dynamic recurrent neural network applied to the present invention.

The dynamic recurrent neural network shown in FIG. 5 outputs the predicted last event after linearly projecting it onto a vector whose dimension equals the number of activity types in the target process.

During training, the predicted last event is output, a loss value is calculated using cross-entropy, and errors are corrected using the backpropagation algorithm.

Referring to FIG. 5, the recurrent neural network is composed of LSTMs. A padded event log is provided as the input, and the predicted event is obtained as the output.

FIG. 6 shows an example of an environment for building the dynamic recurrent neural network shown in FIG. 5.

As shown in FIG. 6, the dynamic recurrent neural network shown in FIG. 5 is implemented in the Python programming language on top of TensorFlow (see https://www.tensorflow.org/), an open-source deep learning framework.

TensorFlow computes the gradients of all tensors in the graph with respect to the loss function and provides a variety of optimization functions. The goal of training is to optimize the loss function.

An application built on TensorFlow uses tensor operations and an optimization function to reduce the loss function. The loss function is the node of the graph to be minimized. For example, the computed output is compared with the actual value of the target variable, using a measure such as cross-entropy for categorical variables, to calculate the loss (error), and operations are performed to reduce it.

FIG. 7 is a flowchart showing the business process prediction method according to the present invention.

Referring to FIG. 7, the business process prediction method (700) according to the present invention broadly consists of a process of preparing a recurrent neural network model (S702), a preprocessing process that preprocesses event log data in CSV format (S704), a hyperparameter configuration process (S706), a dynamic recurrent neural network training process (S708), a training verification process (S710), and a prediction process (S712).

CSV (comma-separated values) is text data, or a text file, in which fields are separated by commas (,). The file extension is .csv.

The preprocessing process (S704) converts the event log into data for training and prediction with the recurrent neural network. The data is divided into data to be used for training and data to be excluded. For example, each case must have at least one input event and one target event to be used for prediction, so its process length must be at least 2. A case with a process length of 1 therefore cannot be used for training or prediction. In addition, cases shorter than the maximum process length are padded (see FIG. 5).

Padding means extending a case that is shorter than the maximum process length to the maximum process length and assigning a null value to each empty activity slot.
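The filtering and padding steps described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the `PAD` marker, the function name `prepare_cases`, and the sample activity names are hypothetical. The text only specifies that cases of length 1 are excluded and that shorter cases are filled with null values up to the maximum process length.

```python
from typing import Dict, List, Optional

PAD: Optional[str] = None  # hypothetical null marker for empty activity slots


def prepare_cases(cases: Dict[str, List[str]]) -> Dict[str, List[Optional[str]]]:
    """Drop cases shorter than 2 events, then pad the rest to the maximum length."""
    # A case needs at least one input event and one target event.
    usable = {cid: acts for cid, acts in cases.items() if len(acts) >= 2}
    if not usable:
        return {}
    max_len = max(len(acts) for acts in usable.values())
    # Append null values so that every case has exactly max_len slots.
    return {cid: acts + [PAD] * (max_len - len(acts)) for cid, acts in usable.items()}


cases = {
    "c1": ["receive", "inspect", "store", "ship"],
    "c2": ["receive", "ship"],
    "c3": ["receive"],  # length 1: cannot be used for training or prediction
}
padded = prepare_cases(cases)
```

In this sketch the padding is appended at the end of each case; whether padding is placed before or after the real events is a design choice the text does not state.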

The preprocessing process (S704) adds to each case time category information obtained by categorizing the execution time of its last activity. Here, categorization means classifying a given execution time into one of several time categories.

The time difference between activities is calculated in days and converted into a categorical variable with reference to the time difference category information (numerical data -> categorical data). When the data is preprocessed in this way, the model predicts a "time range" instead of the "time difference" between events. This yields a prediction model with a high prediction rate.

Specifically, the difference between the longest and shortest of the activity execution times in the event log is computed and divided into a predetermined number of intervals. The interval into which the execution time of each case's last activity falls is then determined, and the corresponding interval number is added as the time category information.

FIG. 8 shows an example of categorizing activity execution times.

For example, if the interval value is "3 days", so that 0 to 3 days maps to "1", 4 to 6 days maps to "2", and so on, the times shown at the top of FIG. 8 are expressed as "1", "1", and "2", respectively. The time category value is added as a new column (attribute) next to the original execution time.
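The 3-day mapping in this example can be written as a small function. The sketch below is illustrative only: the function name `time_category` and the sample inputs are assumptions, and the rule that a difference of exactly 0 days belongs to category 1 follows the "0 to 3 days" wording above.

```python
import math


def time_category(days: float, interval: float = 3.0) -> int:
    """Map a time difference in days to a 1-based category number."""
    if days < 0:
        raise ValueError("time difference must be non-negative")
    # 0..interval falls in category 1, (interval, 2*interval] in category 2, ...
    return max(1, math.ceil(days / interval))


# Hypothetical time differences in days (not the values shown in FIG. 8).
categories = [time_category(d) for d in [0, 2, 3, 4, 6, 7]]
```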

Because activity names are also strings, it is preferable to apply a word embedding technique, which converts words into vectors, in the preprocessing process (S704).
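One common way to realize such an embedding is to assign each distinct activity name an integer id and look the id up in a table of vectors. The sketch below shows only that lookup mechanism. The function names, the reserved padding id 0, the vector dimension, and the random initialization are all assumptions made for illustration; in a trained model the table entries would be learned rather than left random.

```python
import random
from typing import Dict, List


def build_vocab(cases: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id to each distinct activity name; id 0 is reserved for padding."""
    vocab: Dict[str, int] = {}
    for case in cases:
        for name in case:
            if name not in vocab:
                vocab[name] = len(vocab) + 1
    return vocab


def init_embeddings(vocab_size: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Create a randomly initialised embedding table (row 0 is the padding vector)."""
    rng = random.Random(seed)
    table = [[0.0] * dim]  # padding row stays at zero
    table += [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]
    return table


cases = [["receive", "inspect", "ship"], ["receive", "ship"]]
vocab = build_vocab(cases)
table = init_embeddings(len(vocab), dim=4)
vectors = [table[vocab[name]] for name in cases[0]]  # one vector per activity
```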

The hyperparameter configuration process (S706) sets the batch size, the number of epochs, the activation function, the optimization function, and so on.

A hyperparameter is not one of the main variables tuned or optimized through neural network training. It is a variable, such as the learning rate or a generalization parameter, that is set by people on the basis of prior knowledge or set automatically by an external model mechanism.

Training results can vary greatly depending on the hyperparameter settings. Finding the most effective settings therefore requires an iterative process.

The dynamic recurrent neural network training process (S708) trains the network using the padded and time-categorized training data.

The training process (S708) divides the padded and time-categorized data into training data (70%) and test data (30%), trains on the training data, uses mini-batches, and uses cross-validation to prevent overfitting.

To use cross-validation, the input data and target data are indexed at the same time.

The input data and target data to be indexed are the input activities and the target activity, respectively.

The predicted last event (i.e., activity) is output after being linearly projected onto a vector with as many dimensions as there are activities in the target process, and a loss value is calculated using cross-entropy. After the loss value is calculated, training proceeds while errors are corrected using the backpropagation algorithm.

Here, the vector output in multiple dimensions is converted into a linear vector with as many entries as there are activities, and the entry with the highest value is selected as the result. A dimension represents a feature of the data and the degree of its representation. The higher the dimensionality of the data, the more features represent it.

Linear projection means converting data expressed in multiple dimensions into one-dimensional linear data.
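The output stage described above (linear projection, cross-entropy loss, and selection of the highest value) can be illustrated with plain Python. This is a sketch under assumptions, not the code of the invention: the hidden state `h`, the weights `W`, the bias `b`, and the use of softmax to turn scores into probabilities are illustrative choices, and the numbers are made up.

```python
import math
from typing import List


def linear_projection(h: List[float], W: List[List[float]], b: List[float]) -> List[float]:
    """Project a hidden vector h onto one score (logit) per activity type."""
    return [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i for row, b_i in zip(W, b)]


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: List[float], target: int) -> float:
    """Loss for a single example whose true class index is `target`."""
    return -math.log(probs[target])


h = [0.5, -1.0, 2.0]                  # hypothetical last hidden state (3 units)
W = [[0.1, 0.0, 0.2],                 # one row per activity type (4 types)
     [0.0, 0.3, 0.1],
     [0.5, -0.2, 0.4],
     [-0.1, 0.1, 0.0]]
b = [0.0, 0.0, 0.0, 0.0]

logits = linear_projection(h, W, b)
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)  # index of the highest value
loss = cross_entropy(probs, target=2)
```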

In addition, the training process (S708) optimizes the interval value by repeating training while varying the interval value used for categorization.

To raise the prediction rate, the number of time categories is varied, which fine-tunes the interval covered by each time category, until the optimal number of time categories is found.

For example, across the time differences between events belonging to each case in the data used for prediction, the overall maximum (e.g., '100') and minimum (e.g., '0') are found, giving the maximum time difference (i.e., '100').

The prediction rate differs between setting the number of time categories to 4 (category 1: 0 to 25, category 2: 26 to 50, category 3: 51 to 75, category 4: 76 to 100) and setting it to 10 (category 1: 0 to 10, category 2: 11 to 20, ..., category 10: 91 to 100).

With fewer categories the prediction rate improves, but each predicted category covers a wider interval. Conversely, with more categories the prediction rate may fall, but each predicted category covers a narrower interval. It is therefore necessary to find, through training, the optimal number of time categories for the given prediction task.
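The effect of the category count on interval width can be seen with a small helper that divides the range from 0 to the maximum time difference into equal-width categories. The function name `category_for` and the sample value of 26 are assumptions for illustration; the category boundaries reproduce the 4-category and 10-category examples above.

```python
import math


def category_for(value: float, max_diff: float, n_categories: int) -> int:
    """Return the 1-based equal-width category of `value` within [0, max_diff]."""
    width = max_diff / n_categories
    return min(n_categories, max(1, math.ceil(value / width)))


# The same 26-day time difference under the two settings from the text.
with_4 = category_for(26, max_diff=100, n_categories=4)    # width 25
with_10 = category_for(26, max_diff=100, n_categories=10)  # width 10
```

A search for the best category count would retrain the model for each candidate count and compare prediction rates; that loop is omitted here because it depends on the model.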

To test the trained model, the training verification process (S710) calculates and provides its accuracy using the 30% of the data held out as test data and not used in training.

Finally, the prediction process (S712) predicts the execution time of the last activity in each case using the trained dynamic recurrent neural network and an event log generated in real time.

The business process prediction method according to the present invention predicts, for each case, the execution time of the last activity (event), that is, the final performance indicator.

In FIG. 7, the process of preparing a recurrent neural network model (S702), the preprocessing process (S704), the hyperparameter configuration process (S706), the dynamic recurrent neural network training process (S708), and the training verification process (S710) correspond to the business process training method in the summary of the present invention. A separate description of the business process training method is therefore omitted.

FIG. 9 shows an example of a warehouse management process map to which the business process prediction method according to the present invention is applied.

The warehouse management process map shown in FIG. 9 was obtained by analyzing the warehouse management process of N, a domestic cosmetics manufacturer, and contains a total of 81,457 cases. This number excludes cases with a single activity.

In FIG. 9, each rectangular box represents an activity that makes up the warehouse management process, and the blue boxes represent the main activities. The arrows between activities are paths representing the connections between them, and the numbers indicate how many times each path occurred.

The bold arrows indicate the main cases.

FIG. 10 shows the composition of the training data and test data.

The training process (S708) uses mini-batches and uses cross-validation to prevent overfitting.

The training process (S708) divides the data obtained from the warehouse management process map of FIG. 9 in a 70:30 ratio for use as training data and test data, respectively. Here, the training data is the data used to train the dynamic recurrent neural network, and the test data is the data used to test the trained dynamic recurrent neural network. Because the test data was collected from real cases, it is useful for evaluating the test results.

The batch size for the training data was set to 20, and the number of epochs was set to 30 so that the 57,019 cases are trained on 30 times.
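The figures above can be checked with a few lines of arithmetic. The case count, split ratio, batch size, and epoch count come from the text; truncating the 70% share to a whole number (which reproduces the stated 57,019) and counting a final partial batch as one step are assumptions of this sketch.

```python
import math

total_cases = 81_457                   # cases with at least two activities (from the text)
train_cases = total_cases * 7 // 10    # 70% for training, truncated to a whole case
test_cases = total_cases - train_cases
batch_size = 20
epochs = 30

# Number of mini-batch updates, assuming the last partial batch is also used.
steps_per_epoch = math.ceil(train_cases / batch_size)
total_steps = steps_per_epoch * epochs
```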

In addition, 10-fold cross-validation was applied to prevent overfitting.
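In k-fold cross-validation the samples are split into k parts, and each part serves once as the validation set while the remaining parts are used for training. A minimal index generator might look like the sketch below. The function name `k_fold_indices` and the use of contiguous, unshuffled folds are assumptions; the text does not say how the folds were formed.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


folds = list(k_fold_indices(25, k=10))
```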

FIG. 11 schematically illustrates the 10-fold cross-validation technique.

A neural network uses a loss function as the indicator (criterion) for determining the optimal weight parameters through training.

The goal of neural network training is to make the value of the loss function (the error) as small as possible, and training is the process of adjusting the weight parameters to reduce that value. The derivative (gradient) is what is consulted to decide in which direction and by how much each weight parameter should be adjusted so that the value of the loss function decreases.

For example, let x be a weight parameter and y the loss function. If the derivative is negative, the slope is negative, so y decreases when x is increased by h; the value of the loss function can therefore be reduced by increasing that weight parameter.

Conversely, if the derivative is positive, the slope is positive, so the value of the loss function can be reduced by decreasing the weight parameter.
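The sign rule described above is what a gradient descent update implements. The sketch below applies it to a one-parameter toy loss; the quadratic loss, the learning rate, and the numerical derivative are assumptions chosen for illustration and are unrelated to the network of the invention.

```python
def numerical_gradient(f, x: float, h: float = 1e-4) -> float:
    """Central-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


def loss(w: float) -> float:
    return (w - 3.0) ** 2  # hypothetical loss with its minimum at w = 3


w = 0.0
learning_rate = 0.1
for _ in range(100):
    grad = numerical_gradient(loss, w)
    # Negative gradient -> w increases; positive gradient -> w decreases.
    w -= learning_rate * grad
```

Starting to the left of the minimum, the derivative is negative and the update raises `w`; starting to the right, it would lower `w`.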

The loss functions most commonly used are mean squared error (MSE) and cross-entropy error (CEE).

MSE is used as the loss function paired with the identity function in regression. CEE is used as the loss function paired with the softmax function in classification.
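For reference, the two loss functions are commonly written as below. The notation is an assumption introduced here: $y_k$ is the $k$-th network output, $t_k$ the corresponding target (one-hot in classification), $a_k$ the input to the softmax, and $K$ the number of outputs. Some texts scale MSE by $\tfrac{1}{2}$ instead of $\tfrac{1}{K}$.

```latex
\mathrm{MSE} = \frac{1}{K} \sum_{k=1}^{K} (y_k - t_k)^2
\qquad
\mathrm{CEE} = -\sum_{k=1}^{K} t_k \log y_k,
\quad \text{where } y_k = \frac{e^{a_k}}{\sum_{j=1}^{K} e^{a_j}}
```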

The loss function must be applied to all training data and averaged. One could add up the individual outputs and divide by their number, but it is better to modify the function itself so that it processes multiple training samples and returns their average. This is called the average loss function.

The loss is computed for each sample in the batch, summed, and normalized by dividing by the number of samples.

Computing the loss function over all training data takes a long time.

In that case, only a portion of the data is selected and used as an approximation of the whole. This portion is called a mini-batch, and this way of training is called mini-batch training.
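The average loss and the mini-batch approximation can be sketched as follows. The function names, the sample probabilities, the batch size, and sampling without replacement are assumptions for illustration; the loss of the sampled batch only approximates the loss over the full data.

```python
import math
import random
from typing import List, Sequence


def average_cross_entropy(probs: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Mean cross-entropy over a batch of predicted distributions."""
    losses = [-math.log(p[t]) for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)


def sample_mini_batch(n_samples: int, batch_size: int, rng: random.Random) -> List[int]:
    """Pick batch_size distinct sample indices at random."""
    return rng.sample(range(n_samples), batch_size)


# Hypothetical predictions for 6 samples over 3 classes, with their true classes.
all_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4],
             [0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.1, 0.8]]
all_targets = [0, 1, 2, 0, 1, 2]

rng = random.Random(0)
idx = sample_mini_batch(len(all_probs), batch_size=3, rng=rng)
batch_loss = average_cross_entropy([all_probs[i] for i in idx], [all_targets[i] for i in idx])
full_loss = average_cross_entropy(all_probs, all_targets)
```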

The prediction process (S712) performs process prediction, taking as input the trained prediction model and an event log generated in real time.

Because training and prediction were performed on cases that completed normally from "receiving" to "shipping", the predicted value of the last cell of the prediction model is the "shipping time range".

Accordingly, adding the predicted time range to the timestamp of the record immediately preceding the last record of the actual data yields the shipping date that the prediction model was intended to predict.

Consequently, subtracting the actual shipping date from the predicted shipping date yields the shipping-date error of the prediction model. Dividing the sum of these errors by the number of cases used in prediction yields the mean error of the model.
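The error calculation above can be sketched with the standard `datetime` module. The text does not say how a predicted time range is turned into a number of days, so this sketch assumes the upper bound of the range with a 3-day interval. The function names and the sample dates are also hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple


def predicted_ship_date(prev_event: date, category: int, interval_days: int = 3) -> date:
    """Assumption: a category is converted to days using the upper bound of its range."""
    return prev_event + timedelta(days=category * interval_days)


def mean_error_days(rows: List[Tuple[date, int, date]]) -> float:
    """rows: (time of the record before the last, predicted category, actual ship date)."""
    errors = [(predicted_ship_date(prev, cat) - actual).days for prev, cat, actual in rows]
    return sum(errors) / len(errors)


rows = [
    (date(2018, 11, 1), 1, date(2018, 11, 3)),    # predicted 11-04, error +1
    (date(2018, 11, 5), 2, date(2018, 11, 11)),   # predicted 11-11, error 0
    (date(2018, 11, 10), 1, date(2018, 11, 11)),  # predicted 11-13, error +2
]
mean_error = mean_error_days(rows)
```

Because the errors are signed, early and late predictions can cancel in the mean; averaging absolute errors is a common alternative when that is undesirable.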

FIG. 12 shows the configuration of the business process prediction apparatus according to the present invention.

Referring to FIG. 12, the business process prediction apparatus (1200) according to the present invention includes a process mining module (1202), a preprocessing module (1204), a hyperparameter configuration module (1206), a dynamic recurrent neural network training module (1208), a dynamic recurrent neural network module (1210), a training verification module (1212), a prediction module (1214), and a prediction result storage module (1216).

For example, big data obtained from the warehouse management process map shown in FIG. 9 is prepared, and an event log can be generated by analyzing this big data.

The process mining module (1202) searches the event log for key predictor variables and hidden activities. The preprocessing module (1204) optimizes the event log with reference to the search results of the process mining module (1202).

Process mining analysis can uncover key predictor variables and hidden process steps, for example an abandonment stage, that help improve understanding of the target process and prediction accuracy. It makes it possible to understand the processes to be predicted (the sales and warehouse management processes) and to discover the list of activities that can serve as prediction inputs and result activities, as well as hidden activities (e.g., an "abandonment stage").

In addition, the insight into the past behavior of the process gained from process mining analysis contributes to improved prediction accuracy.

The preprocessing module (1204) removes unnecessary cases from the event log and performs padding to match the maximum process length. It also adds to each case a time category variable obtained by categorizing the execution time of the last activity.

To generate the time category variable, the time differences between the last events of adjacent cases are first calculated. At the same time, the entire data set is scanned, and the time category information is generated on the basis of the largest time difference.

This time category information serves as the criterion for calculating the time differences between the last events of adjacent cases and classifying them into categories, and it is applied identically to all cases. When the data is preprocessed in this way, the model predicts a "time range" instead of the "time difference" between events. This makes it possible to build a prediction model with a high prediction rate.

하이퍼 파라미터 설정 모듈(1206)은 하이퍼 파라미터를 설정한다.The hyper parameter setting module 1206 sets hyper parameters.

동적 순환신경망 학습 모듈(1208)은 전처리 모듈(1204)에 의해 패딩 및 시간 범주 처리된 학습 데이터를 사용하여 학습을 수행한다.The dynamic cyclic neural network learning module 1208 performs the learning using the learning data padded and time-categorized by the preprocessing module 1204.

학습은 동적 순환신경망 모듈(1210)을 사용하여 수행된다. 동적 순환신경망 모듈(1210)은 예측하고자 하는 케이스의 최대 프로세스 길이만큼 펼쳐진 시퀀스를 가지는 2층 n시퀀스 순환신경망 모델이다.Learning is performed using the dynamic circulatory neural network module 1210. The dynamic cyclic neural network module 1210 is a two-layer n-sequence cyclic neural network model having a sequence spread by the maximum process length of a case to be predicted.

The learning verification module 1212 verifies the training using the test data processed by the preprocessing module 1204.

The prediction module 1214 performs prediction using real-time data and the prediction model, that is, the trained dynamic recurrent neural network module 1210. In addition, the prediction results of the prediction module 1214 may be stored in the prediction result DB 1216 and used to improve the learning algorithm and the performance of the prediction model.

When the prediction module 1214 performs prediction, if the process length of the real-time data is greater than the maximum process length in the event log used for training, the sequence of the recurrent neural network module 1210 is extended to that length and the input data is padded.
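A minimal sketch of this length check at prediction time, under stated assumptions: cases are lists of integer activity ids, padding is appended on the right with the value 0, and the function name `prepare_for_prediction` is hypothetical. How the network itself is extended is not shown here.

```python
def prepare_for_prediction(case, trained_max_len, pad=0):
    """Return (sequence_length, padded_case) for one real-time case.

    The sequence length grows beyond `trained_max_len` only when the
    incoming case is longer than any case seen during training.
    """
    seq_len = max(trained_max_len, len(case))
    return seq_len, case + [pad] * (seq_len - len(case))

short_len, short_case = prepare_for_prediction([4, 2, 7], trained_max_len=5)
long_len, long_case = prepare_for_prediction([1] * 7, trained_max_len=5)
```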

Adopting the approach of the present invention, which uses deep learning in the form of a dynamic RNN, removes the need for a separate effort to find the optimal sequence length, so the developed technique can be applied very efficiently to predicting the behavior of various processes.

In FIG. 12, the process mining module 1202, the preprocessing module 1204, the hyperparameter setting module 1206, the dynamic recurrent neural network learning module 1208, the recurrent neural network module 1210, and the learning verification module 1212 correspond to the business process learning apparatus in the summary of the invention, so a separate description of the business process learning apparatus is omitted.

FIG. 13 shows an example of a dynamic recurrent neural network applied to the present invention.

Referring to FIG. 13, the recurrent neural network is composed of LSTM cells. The dynamic recurrent neural network consists of two layers, an input layer and an output layer. The event log is provided as the input, and the predicted event is obtained as the output.

An event log is a set of cases, and each case is a set of activities (events). Each activity (event) has an execution time (end time). In addition to the activity execution time, the present invention adds and uses a time category variable that categorizes the activity execution time.

In FIG. 13, the upward direction corresponds to layers and the rightward direction to the sequence. FIG. 13 shows a recurrent neural network with two layers, but the invention is not limited to this. As the number of layers increases, the prediction rate rises, but so does the computational complexity.

The maximum sequence length is determined by the case with the maximum process length in the event log. That is, if the case with the maximum process length consists of n events, the number of sequence steps is also n.

Cases are sorted in order of occurrence, and events are sorted in order of occurrence within the case they belong to, regardless of the order in which the cases occur.
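The event log structure and the two orderings described above can be sketched as follows. This is an illustration under assumptions: events are `(case_id, activity, end_time)` tuples, a case "occurs" at its earliest event time, and the case ids and activity names are hypothetical.

```python
from collections import defaultdict

def build_cases(events):
    """Group (case_id, activity, end_time) tuples into ordered cases."""
    by_case = defaultdict(list)
    for case_id, activity, end_time in events:
        by_case[case_id].append((end_time, activity))
    # Events are ordered within their own case, independent of other cases.
    cases = {cid: sorted(evs) for cid, evs in by_case.items()}
    # Cases are ordered by occurrence (assumed here: earliest event time).
    order = sorted(cases, key=lambda cid: cases[cid][0][0])
    return [(cid, [activity for _, activity in cases[cid]]) for cid in order]

events = [
    ("c2", "Order", 5),
    ("c1", "Ship", 9),
    ("c1", "Order", 1),
    ("c2", "Ship", 7),
]
ordered = build_cases(events)
```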

The input layer consists of n LSTM cells, and the output layer likewise consists of n LSTM cells.

Each of the n LSTM cells in the input layer corresponds to one of up to n activities in a case. That is, the first activity of the case is input to the first LSTM cell of the input layer, and so on, until the last activity of the case is input to the last LSTM cell of the input layer.

Each case is padded to the maximum process length, that is, the maximum number of activities. Null, that is, '0', is used as the padding value.
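A minimal sketch of this padding step, assuming cases are lists of integer activity ids in which 0 is reserved for padding and padding is appended on the right (the specification states the padding value but not these encoding details). The function name `pad_cases` is hypothetical.

```python
PAD = 0  # padding value named in the text

def pad_cases(cases, max_len=None):
    """Right-pad every case (a list of activity ids) with PAD to one length."""
    if max_len is None:
        max_len = max(len(case) for case in cases)
    if any(len(case) > max_len for case in cases):
        raise ValueError("a case is longer than max_len")
    return [case + [PAD] * (max_len - len(case)) for case in cases]

cases = [[3, 1, 4], [2, 7], [5, 9, 2, 6]]
padded = pad_cases(cases)
```

After padding, every case has the length of the longest case, so the whole log can be fed to a network unrolled to that length.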

Feeding a padding value into an LSTM cell means that no computation is performed in that state and processing moves on to the next state.
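The pass-through behavior at padded steps can be illustrated with a toy recurrence. This sketch deliberately uses a single scalar tanh unit rather than an LSTM cell, and the weights `w_in` and `w_rec` are arbitrary assumed values; only the masking logic is the point.

```python
import math

def run_masked_rnn(inputs, pad=0, w_in=0.5, w_rec=0.8):
    """Toy scalar recurrence that carries the state unchanged over padding."""
    state = 0.0
    states = []
    for x in inputs:
        if x != pad:
            state = math.tanh(w_in * x + w_rec * state)
        # Padded step: no computation, the previous state passes through.
        states.append(state)
    return states

states = run_masked_rnn([1, 2, 0, 0])
unpadded = run_masked_rnn([1, 2])
```

With this masking, the final state of a padded case equals the final state of the same case without padding, so padding does not alter what the network has computed.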

The output layer outputs the predicted activity and, in particular, includes the time-categorized value of the predicted activity execution time.

A categorized time value is obtained by dividing the time between the shortest and the longest execution time equally into m intervals, assigning an interval number to each interval, determining the number of the interval to which an execution time belongs, and representing the execution time by that interval number. The number of intervals used for categorization is optimized in the training process (S708).
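The equal-width categorization described above can be sketched as follows. Numbering intervals from 1, treating intervals as closed on the left, and assigning the longest value to the last interval are assumptions of this sketch. The search over m performed during training is not shown.

```python
def time_category(value, shortest, longest, m):
    """Map `value` to an interval number in 1..m using m equal-width intervals."""
    if longest <= shortest:
        raise ValueError("longest must exceed shortest")
    width = (longest - shortest) / m
    index = int((value - shortest) // width) + 1
    # Clamp so that the longest value falls in interval m, not m + 1.
    return min(max(index, 1), m)

# Hypothetical range of 0 to 100 time units split into m = 4 intervals.
categories = [time_category(v, 0, 100, 4) for v in (0, 25, 99, 100)]
```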

Hereinafter, an embodiment of the present invention is described.

To verify the business process prediction method and apparatus according to the present invention, the applicant used sales process data from Company I, a domestic cosmetics manufacturer.

Table 1 summarizes this sales process data.

Characteristics | Value | Characteristics | Value
Total number of events | 70,240 | Total number of cases | 11,632
Number of activity types | 9 | Number of cases excluded | 299
Max. number of events in cases | 51 | Min. number of events in cases | 2
Start time | 2016/05/19 14:31:29 | Completion time | 2017/10/30 11:21:03

The data shown in Table 1 contains 70,240 events and 11,632 cases.

To remove seasonal effects and the like, data recording more than one year of sales process behavior was used in the present invention. The 299 cases containing only a single event were excluded, so the data actually used contains 11,333 cases. These cases contain between 2 and 51 events (see Table 1). They were randomly divided into training data (70%, 7,933 cases) and test data (30%, 3,400 cases).
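The case counts above can be reproduced with a short sketch. The synthetic case list merely stands in for the real log, which is not available here; the fixed random seed and the rounding down of the 70% share are assumptions.

```python
import random

def filter_and_split(cases, train_ratio=0.7, seed=0):
    """Drop single-event cases, shuffle, and split into train and test lists."""
    kept = [case for case in cases if len(case) >= 2]
    rng = random.Random(seed)
    rng.shuffle(kept)
    n_train = int(len(kept) * train_ratio)
    return kept[:n_train], kept[n_train:]

# Synthetic stand-in with the same counts as Table 1:
# 299 single-event cases and 11,333 cases with at least two events.
cases = [[1]] * 299 + [[1, 2]] * 11_333
train_cases, test_cases = filter_and_split(cases)
# len(train_cases) is 7933 and len(test_cases) is 3400
```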

As in previous studies, a word embedding technique was used to input the activity names that make up a case into each cell of the recurrent neural network. Word embedding maps a word to an R-dimensional vector.

Because activity names are also character strings, it is desirable to apply a word embedding technique that converts words into vectors.
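As a rough illustration of word embedding for activity names, the sketch below assigns each name an integer id and looks up an R-dimensional vector in a table. The activity names, the dimension of 4, and the random initialization are assumptions; in practice the table entries would be learned during training, which is not shown.

```python
import random

def build_vocab(cases):
    """Assign each activity name an integer id, reserving 0 for padding."""
    vocab = {}
    for case in cases:
        for name in case:
            vocab.setdefault(name, len(vocab) + 1)
    return vocab

def init_embeddings(vocab_size, dim, seed=0):
    """Create a (vocab_size + 1) x dim table; row 0 is the all-zero padding row."""
    rng = random.Random(seed)
    table = [[0.0] * dim]
    table += [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]
    return table

cases = [["Order", "Pick", "Ship"], ["Order", "Cancel"]]
vocab = build_vocab(cases)
table = init_embeddings(len(vocab), dim=4)
# One R-dimensional vector (here R = 4) per activity of the first case.
vectors = [table[vocab[name]] for name in cases[0]]
```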

Next, the batch size was set to 20, and the number of epochs was set to 100 so that the training data of 7,933 cases was learned 100 times. Accuracy varied little with batch size, but the best prediction accuracy was obtained with a batch size of 20.
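To make these settings concrete, the sketch below iterates over mini-batches and counts the resulting parameter updates. It assumes one update per mini-batch and that the final, smaller batch of each epoch is kept; neither detail is stated in the specification.

```python
import math

def iter_batches(items, batch_size):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

n_cases, batch_size, epochs = 7_933, 20, 100
batches_per_epoch = math.ceil(n_cases / batch_size)
total_updates = batches_per_epoch * epochs
batch_sizes = [len(b) for b in iter_batches(list(range(n_cases)), batch_size)]
```

Under these assumptions, each epoch consists of 397 mini-batches (396 of size 20 and one of size 13), giving 39,700 updates over 100 epochs.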

Table 2 shows the prediction accuracy by batch size.

Batch size | Training prediction (%)
10 | 96.411
20 | 96.441
40 | 96.290
60 | 96.209
80 | 95.000
100 | 95.700
Average | 96.002

Finally, to prevent overfitting, dropout and 10-fold cross-validation were applied. Specifically, in the first fold of cross-validation, the 7,933 training cases were randomly divided into 10 subsets (Sub1 to Sub10); the first nine subsets (Sub1 to Sub9) were used for training, and the last subset (Sub10) was used to validate the trained model.

In the second fold, the subset previously used for validation (Sub10) and subsets Sub1 to Sub8 were used for training, and subset Sub9 was used to validate the trained model. Repeating this procedure, training and validation of the prediction model were carried out 10 times in total, yielding a training accuracy and a validation accuracy for each fold.
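The fold rotation described above can be sketched as an index generator. This is an illustration under assumptions: the subsets are formed by striding over a shuffled index list, the seed is arbitrary, and model training inside each fold is not shown.

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs, one per fold."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Split into k nearly equal subsets (Sub1 .. Subk).
    subsets = [indices[i::k] for i in range(k)]
    # Fold 1 validates on the last subset, fold 2 on the one before it, and so on.
    for fold in range(k):
        val_pos = k - 1 - fold
        val_idx = subsets[val_pos]
        train_idx = [i for pos, s in enumerate(subsets) if pos != val_pos for i in s]
        yield train_idx, val_idx

folds = list(k_fold_indices(7_933))
```

Each case is used for validation exactly once across the 10 folds and for training in the other nine.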

FIG. 14 shows the prediction rate under 10-fold cross-validation, and FIG. 15 shows the cost under 10-fold cross-validation.

Table 3 shows the training prediction rate for each fold.

As shown in Table 3 and FIG. 14, the average prediction accuracy on the training data was 95.79%, and the prediction accuracy on the test data was 96.29%.

Meanwhile, referring to FIG. 15, the cost of each fold decreases toward approximately 0.2. From this, the applicant judged that training had been carried out properly.
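This passage does not define the cost plotted in FIG. 15. The claims mention a loss computed by cross-entropy, so the following sketch shows a softmax cross-entropy for a single example as one plausible reading. The logits and the number of time categories are hypothetical.

```python
import math

def softmax_cross_entropy(logits, target):
    """Cross-entropy of one example; `target` is the index of the true category."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]

# Hypothetical scores over four time categories, true category at index 2.
uniform_loss = softmax_cross_entropy([0.0, 0.0, 0.0, 0.0], target=2)
confident_loss = softmax_cross_entropy([0.0, 0.0, 5.0, 0.0], target=2)
```

If the plotted cost were a mean cross-entropy with natural logarithms, a value of 0.2 would correspond to an average probability of roughly exp(-0.2), about 0.82, assigned to the true category.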

By contrast, with the approach proposed in previous studies, which uses deep learning in the form of a static recurrent neural network, the average prediction accuracy on the training data was 94.49% and the average prediction accuracy on the test data was 94.67%.

Table 3 shows the training prediction by fold.

Fold | Training prediction (%)
1 | 97.5
2 | 99.5
3 | 100.0
4 | 95.499
5 | 90.999
6 | 92.5
7 | 96.499
8 | 93.0
9 | 93.5
10 | 98.999
Average | 95.79

The present invention is the first in the world to successfully implement and validate a process prediction approach that uses deep learning in the form of a dynamic recurrent neural network. On the test data, the prediction accuracy of this approach was 1.62 percentage points higher than that of the existing approach using deep learning in the form of a static recurrent neural network. Given that the static recurrent neural network approach had already reached a prediction accuracy of 94.67%, improving on it by a further 1.62 percentage points is judged to be a very good result.
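The stated improvement can be checked with simple arithmetic. The sketch below distinguishes the difference in percentage points from the relative gain over the static baseline; the variable names are chosen for this illustration only.

```python
# Test accuracies (%) reported in the text.
dynamic_test, static_test = 96.29, 94.67

# Absolute difference, in percentage points.
point_gain = round(dynamic_test - static_test, 2)

# Relative gain, as a percentage of the static baseline.
relative_gain = round(100 * (dynamic_test - static_test) / static_test, 2)
```

The 1.62 figure in the text is thus the absolute difference between the two test accuracies; expressed relative to the baseline, the gain is about 1.71%.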

The method of the present invention may be programmed in a memory. "Memory" refers to any non-transitory medium that stores data and/or instructions that cause a machine to operate in a specific manner. Such storage media may include non-volatile media and/or volatile media. Non-volatile media include, for example, optical or magnetic disks. Volatile media include, for example, dynamic memory. Common forms of storage media include, for example, floppy disks, flexible disks, hard disks, solid-state drives, magnetic tape or any other magnetic data storage medium, CD-ROMs, any other optical data storage medium, any physical medium with patterns of holes, RAM, PROM, EPROM, FLASH-EPROM, NVRAM, and any other memory chip or cartridge.

In this specification, "an embodiment" means that the particular feature, structure, or characteristic described is included in at least one embodiment. Such phrases may therefore refer to one or more embodiments. Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

However, as those skilled in the art will recognize, the present invention may be practiced without one or more of the specific details, or with other methods, resources, schemes, and the like. In other instances, well-known structures, resources, or operations are not shown or described, simply to avoid obscuring aspects of the present invention.

As described above, the present invention has been explained with reference to specific matters such as concrete components and to limited embodiments and drawings, but these are provided only to aid an overall understanding of the invention. The present invention is not limited to the above embodiments, and those with ordinary knowledge in the field to which the invention belongs can make various modifications and variations from this description. Accordingly, the spirit of the present invention should not be confined to the described embodiments; not only the claims set out below but also everything that is equal or equivalent to those claims falls within the scope of the spirit of the invention.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as "… unit", "… device", and "module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software.

Although the invention has been described with reference to the above embodiments, those skilled in the art will understand that it can be modified and changed in various ways without departing from the spirit and scope of the invention set forth in the claims below.

1200 ... Business process prediction apparatus
1202 ... Process mining module
1204 ... Preprocessing module
1206 ... Hyperparameter setting module
1208 ... Dynamic recurrent neural network learning module
1210 ... Recurrent neural network module
1212 ... Learning verification module
1214 ... Prediction module
1216 ... Prediction result storage module

Claims

(Deleted)

A business process learning apparatus for training a dynamic recurrent neural network to predict the execution time of the last activity of a business process using deep learning in the form of a dynamic recurrent neural network, the apparatus comprising:
a dynamic recurrent neural network module having a sequence unrolled to the maximum process length of the cases (business processes) to be predicted;
a preprocessing module that parses an event log to obtain training data and test data, and pads each case in the training data and the test data with reference to the case having the maximum process length;
a hyperparameter setting module that sets hyperparameters for training; and
a dynamic recurrent neural network learning module that trains the recurrent neural network using the training data padded and time-categorized by the preprocessing module and the hyperparameters set in the hyperparameter setting process,
wherein the preprocessing module adds to each case a time category variable that categorizes the execution time of the last activity,
the categorized time category variable is obtained by dividing the time between the shortest execution time and the longest execution time equally into m intervals, assigning an interval number to each interval, determining the number of the interval to which an execution time belongs, and representing the execution time by that interval number, and
the learning module performs training while varying the interval value used for categorization, thereby optimizing the interval value.

The business process learning apparatus of claim 36, wherein the recurrent neural network module is an LSTM (Long Short-Term Memory) network having a sequence unrolled to the maximum process length of the cases to be predicted.

The business process learning apparatus of claim 36, wherein the preprocessing module removes cases having a process length of 1.

The business process learning apparatus of claim 36, wherein the hyperparameters include a batch size, a number of epochs, an activation function, and an optimization function.

The business process learning apparatus of claim 36, wherein the learning module trains in a mini-batch manner and uses a cross-validation technique to prevent overfitting.

The business process learning apparatus of claim 40, wherein the learning module indexes input data and target data for cross-validation.

The business process learning apparatus of claim 41, wherein the learning module calculates a loss value using cross-entropy and corrects errors using a backpropagation algorithm.

(Deleted)

The business process learning apparatus of claim 36, further comprising a learning verification module that calculates and provides the prediction accuracy of the dynamic recurrent neural network using the test data padded and time-categorized by the preprocessing module.

The business process learning apparatus of claim 36, further comprising a process mining module that searches for key predictor variables and hidden activities by referring to the event log,
wherein the preprocessing module optimizes the event log by referring to the search results of the process mining module.