KR20220060976A

KR20220060976A - Deep Learning Method and Apparatus for Emotion Recognition based on Efficient Multimodal Feature Groups and Model Selection

Info

Publication number: KR20220060976A
Application number: KR1020210049818A
Authority: KR
Inventors: 김덕환; 맹준호; 강동현
Original assignee: 인하대학교 산학협력단
Priority date: 2020-11-05
Filing date: 2021-04-16
Publication date: 2022-05-12
Also published as: KR102646257B1

Abstract

Proposed are a deep learning method for selecting effective models and feature groups in emotion recognition using various databases of Asians and an apparatus thereof. The deep learning apparatus for selecting effective models and feature groups in emotion recognition using various databases of Asians proposed in the present invention comprises: a feature extraction unit which extracts electroencephalogram (EEG) features in a time domain, a frequency domain, and a time-frequency domain to select active features for emotion recognition; an LSTM model selection unit which selects an LSTM model to be applied to the extracted EEG features using a genetic algorithm (GA); and a feature set selection unit for selecting a feature set for the selected LSTM model using the GA. The present invention can further improve the results of emotion recognition.

Description

Deep Learning Method and Apparatus for Emotion Recognition based on Efficient Multimodal Feature Groups and Model Selection

The present invention relates to a deep learning method and apparatus for selecting effective models and feature groups in emotion recognition using various databases of Asians.

Developing the field of Human Computing Interface (HCI) for human-robot interaction requires technology that recognizes human emotions. Emotion recognition is also attracting attention in other fields such as artificial intelligence. Recent emotion recognition work can be divided into two areas according to the type of data used. One area is external emotion recognition based on voice or images. Because voice information is sparse, there is a fundamental limit to extracting continuous emotions from it. Recently, emotion recognition using conversational context and voice tone has been studied [1, 2]. Emotion recognition based on image information, however, has shown the best results. Emotions are mainly classified by recognizing changes in facial expression from features obtained from facial images [3, 4]. More recently, mechanisms that classify emotions end to end with a convolutional neural network, without a separate feature extraction step, have been developed and achieve high accuracy [5, 6].

In contrast, internal emotion recognition uses changes in biosignals to recognize emotions without considering external changes. Emotion recognition is performed using the electroencephalogram (EEG), a recording of the electrical signals generated in the human brain [7, 8]. Frequency-domain features such as power spectral density (PSD) are extracted, and emotions are recognized with a general learning algorithm. Recent techniques have improved accuracy by extracting asymmetry features between EEG signal channels and applying deep learning algorithms [9, 10, 36]. However, data for biosignal-based emotion recognition are difficult to acquire. Public datasets such as DEAP [11] and MAHNOB-HCI [12] are frequently used. The type of data strongly influences the results of biosignal-based emotion recognition studies. Previous techniques used Western-based datasets and various EEG features [13, 19].

The technical problem to be solved by the present invention is to introduce MERTI-Apps, a new, diverse dataset based on the physiological signals of Asians, and to provide a deep learning method and apparatus that effectively select a model and a feature group in emotion recognition through a GA (Genetic Algorithm)-LSTM (Long Short-Term Memory) model for deriving the active feature group for emotion recognition.

In one aspect, the deep learning method proposed in the present invention for selecting effective models and feature groups in emotion recognition using various databases of Asians includes: extracting electroencephalogram (EEG) features in the time domain, the frequency domain, and the time-frequency domain in order to select active features for emotion recognition; selecting, using a genetic algorithm (GA), an LSTM model to be applied to the extracted EEG features; and selecting, using the genetic algorithm, a feature set for the selected LSTM model.

The step of extracting EEG features in the time domain, the frequency domain, and the time-frequency domain uses the MERTI-Apps dataset, which is built from an Asian-based database. Time-domain features represent the change of the EEG signal over time, so that changes in emotion over time can be recognized. Frequency-domain features are extracted separately for the slow alpha, alpha, beta, and gamma bands. Time-frequency-domain features are obtained by decomposing the signal over time into bits with the discrete wavelet transform (DWT). The EEG features, comprising the extracted time-domain, frequency-domain, and time-frequency-domain features and the brain lateralization features, are converted into a one-dimensional vector and used as input data.
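The extraction step described above can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the band edges in `BANDS`, the three time-domain descriptors, the use of a Haar wavelet, and all function names are assumptions chosen for illustration, since the text names the domains and bands but not their exact definitions. The brain lateralization features are omitted because their formula is not given in this passage.

```python
import numpy as np

# Assumed band edges in Hz; the source names the bands but not their limits.
BANDS = {
    "slow_alpha": (8.0, 10.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}

def time_features(x):
    """Simple time-domain descriptors of one channel (illustrative choice)."""
    return np.array([x.mean(), x.std(), np.abs(np.diff(x)).mean()])

def band_powers(x, fs):
    """Mean periodogram power in each band of BANDS for one channel."""
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2 / x.size
    return np.array([power[(freqs >= lo) & (freqs < hi)].mean()
                     for lo, hi in BANDS.values()])

def haar_dwt_energies(x, levels=3):
    """Energy of the detail coefficients at each level of a Haar DWT."""
    approx = np.asarray(x, dtype=float)
    energies = []
    for _ in range(levels):
        if approx.size % 2:              # drop a trailing sample if odd length
            approx = approx[:-1]
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)
        approx = (even + odd) / np.sqrt(2.0)
        energies.append(float(np.sum(detail ** 2)))
    return np.array(energies)

def feature_vector(eeg, fs):
    """Flatten per-channel features into one 1-D vector.

    eeg has shape (channels, samples); fs is the sampling rate in Hz.
    """
    parts = []
    for channel in eeg:
        parts.append(time_features(channel))
        parts.append(band_powers(channel, fs))
        parts.append(haar_dwt_energies(channel))
    return np.concatenate(parts)
```

For a 12-channel recording this yields 12 × (3 + 4 + 3) = 120 values. The window must be long enough, and the sampling rate high enough, for every band in `BANDS` to contain at least one frequency bin.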

In the step of selecting, using a genetic algorithm, the LSTM model to be applied to the extracted EEG features, random parent objects are generated for the EEG features extracted from the MERTI-Apps dataset and trained. The top parent objects that meet a predetermined criterion are selected and carried over to the next generation of models, and child objects are generated through a genetic algorithm process including selection, mutation, and crossover. Training is repeated until a predetermined number of next-generation models has been generated or the RMSE (Root-Mean-Square Error) of the current model no longer improves, after which the LSTM model to be applied to the extracted EEG features is output.
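The loop described above (random parents, selection of the top parents, crossover and mutation, and two stopping conditions) can be sketched as follows. This is a hedged illustration only: the search space in `SEARCH_SPACE`, the population sizes, and the `patience` rule are hypothetical, and `evaluate_rmse` stands in for training an LSTM with the given parameters and returning its validation RMSE, which the sketch does not do.

```python
import random

# Hypothetical search space; the source does not list the LSTM parameters searched.
SEARCH_SPACE = {
    "hidden_units": [16, 32, 64, 128],
    "num_layers": [1, 2, 3],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def random_individual(rng):
    """Draw one candidate model configuration at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {name: (a[name] if rng.random() < 0.5 else b[name])
            for name in SEARCH_SPACE}

def mutate(individual, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {name: (rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value)
            for name, value in individual.items()}

def select_model(evaluate_rmse, pop_size=8, n_parents=4,
                 max_generations=20, patience=3, seed=0):
    """Return (best configuration, its RMSE) found by the GA."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best, best_rmse, stale = None, float("inf"), 0
    for _ in range(max_generations):          # stop 1: generation budget
        parents = sorted(population, key=evaluate_rmse)[:n_parents]
        generation_rmse = evaluate_rmse(parents[0])
        if generation_rmse < best_rmse:
            best, best_rmse, stale = parents[0], generation_rmse, 0
        else:
            stale += 1
            if stale >= patience:             # stop 2: RMSE no longer improves
                break
        children = [mutate(crossover(*rng.sample(parents, 2), rng), rng)
                    for _ in range(pop_size - n_parents)]
        population = parents + children       # parents survive (elitism)
    return best, best_rmse
```

Because the parents are carried over unchanged, the best RMSE never gets worse from one generation to the next. In practice `evaluate_rmse` would be expensive, so caching its results per configuration would be a sensible addition.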

In the step of selecting, using the genetic algorithm, a feature set for the selected LSTM model, random parent objects are generated and trained with the selected LSTM model. The top parent objects that meet a predetermined criterion are selected and carried over to the next generation, and child objects are generated through a genetic algorithm process including selection, mutation, and crossover. The dominant feature set that meets a predetermined criterion is then selected through the selected LSTM model. No initial model or feature set is fixed in advance: all models, all feature groups, and all channels are evaluated by the GA, and the optimal model and feature set are selected by removing the models, features, and channels that interfere with emotion recognition.
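Feature-set selection with a GA is commonly encoded as a boolean mask over the candidate features, and the step above can be sketched that way. The encoding, the one-point crossover, and every parameter value below are assumptions for illustration; `evaluate_rmse` is a hypothetical callback that would train the already selected LSTM on the masked features and return its RMSE.

```python
import random

def select_features(evaluate_rmse, n_features, pop_size=10, n_parents=4,
                    generations=15, mutation_rate=0.1, seed=0):
    """Return the boolean feature mask with the lowest RMSE found by the GA.

    A mask entry of True keeps the feature (or channel); False removes it.
    Requires n_features >= 2 so that a crossover point exists.
    """
    rng = random.Random(seed)
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the masks with the lowest RMSE as parents.
        parents = sorted(population, key=evaluate_rmse)[:n_parents]
        children = []
        while len(children) < pop_size - n_parents:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)        # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [(not gene) if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = parents + children               # parents survive
    return min(population, key=evaluate_rmse)
```

Features whose bit is False in the returned mask are the ones the search judged to hinder recognition. The same mask idea extends to channels by adding one bit per channel.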

In another aspect, the deep learning apparatus proposed in the present invention for selecting effective models and feature groups in emotion recognition using various databases of Asians includes: a feature extraction unit that extracts electroencephalogram (EEG) features in the time domain, the frequency domain, and the time-frequency domain in order to select active features for emotion recognition; an LSTM model selection unit that selects, using a genetic algorithm (GA), an LSTM model to be applied to the extracted EEG features; and a feature set selection unit that selects, using the genetic algorithm, a feature set for the selected LSTM model.

According to embodiments of the present invention, GA and LSTM are combined to select an effective network model and to find valid EEG channels and features: an effective LSTM model is selected first, and features that interfere with emotion recognition are then removed. With the proposed GA-LSTM architecture, the Asian physiological signals (MERTI-Apps) dataset is established, providing valence and arousal annotation labeling and various feature sets (time, frequency, time-frequency domain, and brain lateralization features) for emotion recognition based on EEG signals. In addition, the brain lateralization features in the proposed GA-LSTM architecture can further improve emotion recognition results, and the architecture performs well in both regression performance and classification performance (accuracy).

FIG. 1 is a diagram illustrating biosignal types and locations of the CNS and PNS according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the MERTI-Apps data collection process according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining an example of MERTI-Apps data collection according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating an annotation labeling program for observers according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating a deep learning method for selecting effective models and feature groups in emotion recognition using various databases of Asians according to an embodiment of the present invention.
FIG. 6 is a diagram showing the configuration of a deep learning apparatus for selecting effective models and feature groups in emotion recognition using various databases of Asians according to an embodiment of the present invention.
FIG. 7 is a diagram for explaining the brain lateralization features of EEG signals according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating the EEG feature weights and channel weights of GA-LSTM on the MAHNOB-HCI dataset according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating the structure of an LSTM-FC model according to an embodiment of the present invention.

Emotion recognition is essential for advanced interaction between humans and computer systems. The present invention introduces MERTI-Apps, a new, diverse dataset based on the physiological signals of Asians, and proposes a GA (Genetic Algorithm)-LSTM (Long Short-Term Memory) model for deriving the active feature group for emotion recognition. An annotation labeling program was developed with which observers tag a subject's emotions in terms of arousal and valence during dataset creation. In the learning phase, the GA was used to select effective LSTM model parameters and to determine the active feature group from 37 features and 25 brain lateralization features extracted from the time, frequency, and time-frequency domains of the electroencephalogram (EEG). According to the experimental results, the proposed model achieved a Root-Mean-Square Error (RMSE) of 0.0156 in valence regression on the MAHNOB-HCI dataset (a 71% improvement over [17]). On the in-house MERTI-Apps dataset, which uses 12-channel EEG data specific to the Asian population and adds brain lateralization (BL) features, it achieved RMSEs of 0.0579 and 0.0287 in valence and arousal regression, and 65.7% and 88.3% in valence and arousal accuracy. In addition, owing to the effective model selection of the genetic algorithm, it shows accuracies of 91.3% and 94.8% in the valence and arousal domains on the DEAP dataset. Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

An object of the present invention is to improve the accuracy of emotion classification by searching for the models, features, and channels used for emotion recognition, using a Korean dataset instead of a public dataset to perform emotion recognition based on EEG signals. Features were extracted in the time, frequency, and time-frequency domains for comparison. A genetic algorithm (GA) was used to select valid features, and the learning algorithm was trained on the basis of LSTM (Long Short-Term Memory). From the results of the LSTM model, every feature was assigned a weight through the GA's repeated feature evaluation. To evaluate all features consistently, training used the maximum number of iterations with random shuffling, to prevent features from being eliminated by the GA. Once the proposed GA-LSTM process is complete, the model parameters and feature weights indicate which models, channels, and features of the EEG signal are valid for emotion recognition. In experiments on the MAHNOB-HCI dataset with continuous-time annotation, and in further experiments on the DEAP dataset, the proposed method outperformed the continuous emotion recognition method using the representative PSD features [17] and recent emotion classification methods [41, 42, 43]. For evaluation on a dataset focused on Asians, the Asian physiological signals (MERTI-Apps) dataset, which is similar to the MAHNOB-HCI dataset, was constructed with various emotions for robot interfaces. Applying the proposed model to the MERTI-Apps dataset verified that the proposed technique can provide effective model selection and feature group selection.

According to an embodiment of the present invention, a genetic algorithm for deep learning networks is proposed in order to select an effective network model and to find valid EEG channels and features. The GA is combined with LSTM to first select an effective LSTM model and then remove features that interfere with emotion recognition. The proposed GA-LSTM architecture outperforms the prior art [17, 41, 42, 43] in terms of regression performance (RMSE) and classification performance (accuracy).
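The two metrics named above can be written out as a short sketch. RMSE follows its standard definition. The accuracy function is an assumption: it treats valence or arousal as a two-class problem by thresholding continuous values at `threshold`, because the text does not state how classes are derived from the annotations.

```python
import math

def rmse(predictions, targets):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))

def binary_accuracy(predictions, targets, threshold=0.0):
    """Fraction of samples whose prediction falls on the same side of
    `threshold` as the target (assumed high/low class split)."""
    hits = sum((p > threshold) == (t > threshold)
               for p, t in zip(predictions, targets))
    return hits / len(targets)
```

For example, `rmse([0.0, 0.0], [3.0, 4.0])` is the square root of (9 + 16) / 2 = 12.5, and `binary_accuracy([0.2, -0.1, 0.5, -0.3], [0.1, 0.4, 0.9, -0.2])` is 0.75 because three of the four predictions share the sign of their targets.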

In the present invention, the Asian physiological signals (MERTI-Apps) dataset is established, providing valence and arousal annotation labeling and various feature sets (time, frequency, time-frequency domain, and brain lateralization features) with which emotion recognition based on EEG signals can be performed. In addition, the brain lateralization features according to an embodiment of the present invention can further improve emotion recognition results.

According to an embodiment of the present invention, a Human Computing Interface (HCI) that uses public datasets for biosignal and emotion recognition is used. Human emotions have long been identified with facial expressions, so emotion recognition methods based on facial images have been developed intensively. Tong et al. detected the regions of a facial image in which movement is activated and determined action units (AUs) according to the position and shape of the face [15]. Facial expressions were classified according to AU types and combinations. As the performance of landmark extraction algorithms improved dramatically, however, a new way of determining facial expressions emerged [16]. In this context, landmarks are defined as the positions of important points in a facial image, and a facial expression is characterized by the position of each landmark and the distances between landmarks. In a real environment, human emotions may differ from those in a dataset. For example, the same emotion may be expressed in different ways, and an emotion may change without being revealed in the facial expression. For accurate emotion recognition in a real environment, it is therefore preferable to use information such as biosignals in addition to facial images.

EEG is a recording of the electrical signals generated by the human brain and can be used for internal emotion recognition. The emotional state can be estimated by analyzing the location and waveform characteristics of the EEG. In 2012, the DEAP and MAHNOB-HCI datasets, which measure EEG in the context of human emotion, were released. Representative EEG-based emotion recognition techniques were developed by Soleymani et al. [17] and Kim et al. [18]. Soleymani et al. proposed a method for regressing positive and negative valence values in real time using the MAHNOB-HCI dataset annotated in the continuous time domain. Kim et al. found that the channels in which EEG signals are activated differ according to the emotional state and to important features extracted using the connectivity between EEG channels. A convolutional LSTM was applied to the extracted features to classify emotional states in the valence and arousal domains [7, 20]. Recent prior art [19] shows that classifier studies predominate. Classifier research [41, 42, 43] remains active, and emotion recognition performs best with video, followed by audio and then EEG. The accuracy of EEG is still lower than that of video and audio, which suggests a need for regression and for fusion with other biosignals. Beyond emotion recognition, the field of brain-machine interface systems also analyzes EEG signals to improve performance. Technology that identifies the driver's state and applies it to future automotive technology is very important. The study by Zhongke Gao [40] measures driver fatigue and applies features including PSD to a model to improve performance. Because EEG is difficult to analyze, research into effective features and channels for emotion recognition is active [38, 39, 43]. Hao Chao [38] improved performance by using Maximal Information Coefficient (MIC) features. Peiyang Li [39] adopted a multi-feature fusion approach that combines compensatory activation and connectivity information for emotion recognition; after the combinations of feature sets are determined, the optimal feature subset is selected from among them. Wang [43] selected effective channels using EEG spectrograms and normalized mutual information. In the present invention, no initial model or feature group is fixed in advance; all models, feature groups, and channels are evaluated by the GA to select the optimal model and feature subset. Removing the models, features, and channels that interfere with emotion recognition through the GA yields better results.

FIG. 1 is a diagram illustrating biosignal types and locations of the CNS and PNS according to an embodiment of the present invention.

As deep learning technology advances rapidly in the field of emotion recognition, research that classifies emotions by combining EEG with other biosignals is being actively conducted. As shown in FIG. 1, central nervous system (CNS) signals refer to the EEG signals generated in the brain. Peripheral nervous system (PNS) signals are the electrooculogram (EOG) and electromyogram (EMG), which are electrical signals generated by eye and muscle movement, together with the photoplethysmogram (PPG) and the galvanic skin response (GSR). EEG is the most important signal for internal emotion recognition. However, because the EEG signal is very weak, artifacts can arise from data collection, noise, and temperature, and individual differences are large. Unlike EEG signals, PNS signals can be generalized. Because it is difficult to classify emotions from EEG signals alone, PNS signals are often used as an aid. One prior art [21] recognized emotion by using heart rate information obtained from the electrocardiogram (ECG) and evaluating facial muscle movement through EMG signals. Another prior art [22] read changes in emotion by measuring facial expressions through the posterior muscles. In these prior arts, PSD features are used for the EEG signal. Few prior arts define useful features other than PSD for emotion recognition.

Collecting biosignal data, including EEG signals, requires specialized equipment, and it is somewhat difficult to create experimental conditions in which emotions are actually felt. Most prior art therefore uses open datasets.

Table 1 compares physiological datasets for emotion recognition. As shown in Table 1, the datasets are listed according to research purpose. However, because most of the datasets, including MAHNOB-HCI, are Western-based, it is not certain that the experimental results apply equally to Asian populations.

<Table 1>

According to an embodiment of the present invention, three datasets were used: the MAHNOB-HCI dataset [12], an open dataset with observer tags for continuous emotion recognition; the DEAP dataset [11], a multimodal dataset for analyzing people's discrete emotions; and the MERTI-Apps dataset proposed in the present invention.

In the MAHNOB-HCI dataset [12] (hereinafter, the MAHNOB dataset), 20 videos were used to induce continuous emotions. In a preliminary study, participants helped select the videos by reporting their emotions through self-assessment. The 20 video excerpts cover emotions such as disgust, amusement, joy, fear, sadness, and neutrality. Each video was approximately 34 to 117 seconds long. The participants were 27 healthy subjects (11 male, 16 female). EEG signals were obtained using a Biosemi Active II system [27] with 32 active electrodes placed according to the international 10-20 system [28]. The participants' faces were recorded as 720 × 580 video at 60 Hz, synchronized with the EEG signals. Details of the dataset creation are given in Table 2(a).

<Table 2(a)>

A total of 239 records were produced, each with its corresponding label information. In addition, five trained annotators provided continuous annotations of the participants' facial expressions, rating the valence of the frontal facial expressions using FEELTRACE [29] and a joystick.

The DEAP dataset [11] is a multimodal dataset for analyzing people's discrete emotions. It contains the EEG and peripheral physiological signals of 32 participants, recorded while a total of 40 music videos were viewed as one-minute excerpts. As shown in Table 2(b), the participants were 32 subjects (16 male, 16 female).

<Table 2(b)>

Participants rated each video in terms of arousal, valence, like/dislike, dominance, and familiarity. The data collected were EEG, EMG, EOG, GSR, BVP, temperature, and respiration. Each video was cut to a one-minute excerpt through enhanced detection. EEG data were collected from 32 electrodes at a sampling rate of 512 Hz, and PNS biosignals were collected from 13 electrodes: 4 EOG, 4 EMG, 2 GSR, 1 BVP, temperature, and respiration.

The dataset according to an embodiment of the present invention collected multiple recordings of participants' responses to 15 emotional videos. First, a classification scheme was built for producing the set of experimental videos, based on eight emotions spanning arousal and valence: happiness, excitement, sadness, boredom, disgust, anger, calmness, and comfort. Candidate videos for inducing emotions were collected by randomly searching Internet videos using emotion-vocabulary keywords. Five to seven research assistants reviewed the collected videos and classified, by search keyword, those judged capable of inducing the intended emotion. To select the videos best suited to emotion induction, the content validity of emotion-induction suitability (emotion type, intensity, etc.) was then checked through a field test. As a result, videos with a CVI of 2.5 or lower were excluded and the final emotion-induction videos were selected. A final 32 videos were chosen from the four valence-arousal quadrants (HAPV, HANV, LAPV, LANV) and the neutral domain. Of these, 15 were selected in consideration of the participants' attention span. Each video was between 60 and 206 seconds long, with an average length of 114.6 seconds. An inter-expert CVI of .92 was obtained for the recruitment process, consent, laboratory environment, emotion-inducing videos, participant treatment, aftercare, and other measurement items. After validity was secured through the expert CVI measurement, a pilot test was conducted. Table 2(c) shows the three experiments.

<Table 2(c)>

Experiment 1 consisted of data obtained only from signals other than EEG, because the EEG cap pressed tightly on the participant's head. Experiment 2 considered only emotion recognition using CNS signals such as EEG; the EEG, EMG, and EOG signals were measured together, and the EMG and EOG signals were used to remove noise caused by head and facial muscle movement. This revealed the problem of EMG electrodes covering the face. In Experiment 3, therefore, the EMG signal was dropped and the PNS signals (PPG and GSR) were synchronized with the EEG signal, so that different nervous-system signals (CNS and PNS) were measured in combination.

The participants were 62 healthy subjects, 28 male and 34 female, aged 21 to 29.

FIG. 2 is a diagram illustrating the MERTI-Apps data collection process according to an embodiment of the present invention.

EEG signals were acquired with a BIOPAC MP150 instrument using 12 active electrodes placed according to the international 10-20 system. Together with the EEG signals, video of the participant's face was recorded at 1080p and 30 Hz. Because the video run time was short, the EOG channel was used to remove eye-blink artifacts. In addition, data that did not agree with the participant's self-assessment questionnaire and evaluation were excluded from the records. The MERTI-Apps dataset contains 283 valid records for Experiment 1, 312 for Experiment 2, and 236 for Experiment 3. Initially, 320 records were generated in each experiment using 62 participants and 5 videos. Records with severe artifacts in the induced emotion were then excluded. In Experiment 3, which was designed so that facial movement was not hindered, 236 records were used for annotation labeling. Each record contained the corresponding valence and arousal label information. As with the MAHNOB dataset, trained annotators continuously rated the participants' facial expressions, and a program was used to evaluate the valence and arousal of frontal facial expressions. The measurement program used for annotation labeling recorded data as shown in FIG. 3.

FIG. 3 is a diagram for explaining an example of MERTI-Apps data collection according to an embodiment of the present invention.

In this example of MERTI-Apps data collection, FIG. 3(a) shows the video, FIG. 3(b) the face video, FIG. 3(c) the biosignals, FIG. 3(d) the valence annotation labeling, and FIG. 3(e) the arousal annotation labeling.

Annotation labeling according to an embodiment of the present invention was performed by observers in order to evaluate the emotion reflected in the biosignals of the MERTI-Apps dataset. Each video induced only one emotion, and it was confirmed that the induced emotion matched the participant's own emotion. The participant's valence and arousal were evaluated using the recorded face video. After watching a video, the participant produced a self-assessment label describing the emotion felt. Data that did not agree with the participant's self-assessment questionnaire and evaluation were excluded from the experiments. Valence is the affective quality that refers to the attractiveness or aversiveness of an event, object, or situation, and it ranges from negative to positive. Because it is difficult to subdivide emotions further by valence alone, arousal was also rated, from high to low. Valence and arousal were both rated on a scale from -100 to 100, and the ratings were collected at 0.25-second intervals in line with the biosignal data.

FIG. 4 is a diagram illustrating the annotation labeling program for observers according to an embodiment of the present invention.

The most important consideration in annotation labeling is that the observers' evaluations be as objective as possible. For this reason, the in-house annotation labeling program shown in FIG. 4 was used. The observers were five men and women aged 22 to 25. Labeling began in earnest after the observers had been trained and had aligned their judgments of the participants' emotions using pilot experiment videos. When the observers disagreed, two additional observers excluded the outlying values, and the mean valence or arousal of the remaining observers was used. To align the video with the labeling data, the biosignal data and marker points are displayed at the start and end of each participant's face video. The observers watched the face video and rated valence and arousal at 0.25-second intervals using the scroll bar on the right side of the program. The five observers recorded the labeling data (see FIG. 3(d)) that was used as target data. To obtain data for the same emotion, it was checked whether the emotion targeted by the inducing video and the emotion actually felt by the participant were the same, and only data with matching labels were used in the experiments.

FIG. 5 is a flowchart illustrating a deep learning method for selecting effective models and feature groups in emotion recognition using various databases of Asians according to an embodiment of the present invention.

The proposed deep learning method for selecting effective models and feature groups in emotion recognition using various databases of Asians includes: step 510 of extracting electroencephalogram (EEG) features in the time domain, the frequency domain, and the time-frequency domain in order to select active features for emotion recognition; step 520 of selecting, using a genetic algorithm (GA), an LSTM model to be applied to the extracted EEG features; and step 530 of selecting a feature set for the selected LSTM model using the genetic algorithm.

In step 510, EEG features are extracted in the time domain, the frequency domain, and the time-frequency domain in order to select active features for emotion recognition.

From the MERTI-Apps dataset, which is built on an Asian-population database, the features extracted in the time domain are expressed as changes in the EEG signal over time so that changes in emotion over time can be recognized; the features in the frequency domain are extracted separately for the slow alpha, alpha, beta, and gamma bands; and the features in the time-frequency domain are obtained by decomposing the signal into sub-band components over time with the discrete wavelet transform (DWT). The EEG features, comprising the extracted time-domain, frequency-domain, time-frequency-domain, and brain lateralization features, are converted into a one-dimensional vector and used as input data.

In step 520, an LSTM model to be applied to the extracted EEG features is selected using the genetic algorithm.

Random parent individuals are generated for the EEG features extracted from the MERTI-Apps dataset and trained. The top-ranked parent individuals that meet a predetermined criterion are then selected and carried over to the next-generation models, and child individuals are generated through the genetic algorithm operations of selection, mutation, and crossover. Training is repeated until a predetermined number of next-generation models has been generated or the root-mean-square error (RMSE) of the current model no longer improves, after which the LSTM model to be applied to the extracted EEG features is output.

In step 530, a feature set is selected for the selected LSTM model using the genetic algorithm.

Random parent individuals are generated and trained using the selected LSTM model. The top-ranked parent individuals that meet a predetermined criterion are then selected and carried over to the next generation, child individuals are generated through the genetic algorithm operations of selection, mutation, and crossover, and a dominant feature set that meets a predetermined criterion is selected through the selected LSTM model.

Rather than fixing an initial model and feature set in advance, all models, all feature groups, and all channels are evaluated by the GA, and the optimal model and feature set are selected by removing the models, features, and channels that hinder emotion recognition.

According to an embodiment of the present invention, the features of the EEG signal were extracted as shown in Table 3, and effective feature groups for emotion recognition were derived through deep learning.

<Table 3>

FIG. 6 is a diagram showing the configuration of a deep learning apparatus for selecting effective models and feature groups in emotion recognition using various databases of Asians according to an embodiment of the present invention.

The proposed deep learning model combines the GA with an LSTM (GA-LSTM), as shown in FIG. 6. The initialization step was performed randomly so that the valid features of the GA are passed on successively.

The proposed deep learning apparatus for selecting effective models and feature groups for emotion recognition includes a feature extraction unit 610, an LSTM model selection unit 620, and a feature set selection unit 630.

The feature extraction unit 610 extracts electroencephalogram (EEG) features in the time domain, the frequency domain, and the time-frequency domain in order to select active features for emotion recognition.

The EEG features used to select active features for emotion recognition are extracted from the three domains summarized in Table 3: the time domain, the frequency domain, and the time-frequency domain. Through the feature weights, the proposed method can identify which channels and features of the EEG signal are effective for emotion recognition. This motivates the use of diverse feature sets in the present invention, whereas the prior art [17] uses only PSD features. Thirty-seven EEG features were extracted per channel, some of which have been used in addition to PSD in other studies [30, 31]. Various factors, such as small movements of the participants, sweat, body temperature, and tension, act as noise in the EEG signal.

In the present invention, a notch filter on the BIOPAC MP150 equipment was used to remove the very-low-frequency band and the frequency band at and above 50 Hz. Eye-blink patterns were also removed using the EOG signal. In the frequency domain, the fast Fourier transform (FFT) was used to divide the EEG signal into slow alpha, alpha, beta, and gamma waves. The extracted features were converted into a one-dimensional vector and used as input data. The final feature dimension is (number of channels) × (number of features). For the MAHNOB dataset the feature dimension was 32 × 37 = 1184; for the MERTI-Apps dataset according to an embodiment of the present invention it was 12 × 37 = 444. In addition, brain lateralization features in the 1:5, 5:1, 3:3, and 5:5 configurations were used. Because the data correspond to continuous-time annotations, features are extracted from the preceding 2-second data stream.
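The feature-dimension arithmetic above can be illustrated with a minimal sketch. It assumes the per-channel features are held in a (channels × features) array; the function name `flatten_features` and the zero-filled placeholder arrays are hypothetical and only serve to show the shapes.

```python
import numpy as np

def flatten_features(per_channel: np.ndarray) -> np.ndarray:
    """Flatten a (channels, features) matrix into a 1-D input vector."""
    if per_channel.ndim != 2:
        raise ValueError("expected a 2-D (channels, features) array")
    return per_channel.reshape(-1)

# Placeholder feature matrices: 37 features per channel, as stated in the text.
mahnob = flatten_features(np.zeros((32, 37)))  # 32 EEG channels
merti = flatten_features(np.zeros((12, 37)))   # 12 EEG channels
print(mahnob.shape, merti.shape)  # (1184,) (444,)
```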

Time-domain features can be expressed as changes in the EEG signal over time. In the time domain, the mean, minimum, maximum, first difference, and normalized first difference are used to recognize changes in emotion over time [25].
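As an illustration only, the five time-domain statistics could be computed per channel window as follows. The text does not define the first difference precisely; this sketch assumes the common definition (mean absolute sample-to-sample change) and assumes normalization by the standard deviation. The function name is hypothetical.

```python
import numpy as np

def time_domain_features(x) -> dict:
    """Five time-domain statistics for one EEG channel window."""
    x = np.asarray(x, dtype=float)
    # Assumed definition: mean absolute difference between consecutive samples.
    first_diff = np.mean(np.abs(np.diff(x)))
    std = np.std(x)
    return {
        "mean": float(np.mean(x)),
        "min": float(np.min(x)),
        "max": float(np.max(x)),
        "first_diff": float(first_diff),
        # Assumed normalization: divide by the window's standard deviation.
        "norm_first_diff": float(first_diff / std) if std > 0 else 0.0,
    }

feats = time_domain_features([1.0, 3.0, 2.0, 6.0])
```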

As for frequency-domain features, the prior art has used frequency-domain components, citing the excellent spatial resolution of EEG. The power appearing in different frequency bands is a good indicator of different emotional states. In the present invention, PSD features are extracted separately for four bands: slow alpha (8-10 Hz), alpha (8-12.9 Hz), beta (13-29.9 Hz), and gamma (30-50 Hz). The mean, maximum, and integral values of the PSD in each band were compared.
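A minimal sketch of the band-wise PSD statistics follows. The band edges are those listed above; everything else is an assumption made for illustration: the PSD is estimated with a plain FFT periodogram, the integral uses a rectangle rule, and the sampling rate of 256 Hz and the 10 Hz test tone are hypothetical.

```python
import numpy as np

BANDS = {  # Hz, as listed in the text
    "slow_alpha": (8.0, 10.0),
    "alpha": (8.0, 12.9),
    "beta": (13.0, 29.9),
    "gamma": (30.0, 50.0),
}

def psd_band_features(x, fs):
    """Mean, max and integral of a periodogram PSD in each band."""
    x = np.asarray(x, dtype=float)
    n = x.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Simple periodogram estimate (assumption; the text only mentions the FFT).
    psd = (np.abs(np.fft.rfft(x)) ** 2) / (fs * n)
    df = freqs[1] - freqs[0]
    out = {}
    for name, (lo, hi) in BANDS.items():
        band = psd[(freqs >= lo) & (freqs <= hi)]
        out[name] = {
            "mean": float(band.mean()),
            "max": float(band.max()),
            "integral": float(band.sum() * df),  # rectangle-rule band power
        }
    return out

fs = 256                           # hypothetical sampling rate in Hz
t = np.arange(2 * fs) / fs         # one 2-second window
x = np.sin(2 * np.pi * 10.0 * t)   # 10 Hz test tone, inside the alpha band
feats = psd_band_features(x, fs)
```

With this test tone, essentially all of the power falls in the slow alpha and alpha bands, and the beta and gamma statistics are close to zero.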

The time-frequency-domain features were divided into four frequency ranges, and five features were selected. The discrete wavelet transform (DWT) decomposes a signal into sub-band components over time. Such features are used in speech processing [34] and in emotion recognition [35]. For each frequency range of the DWT, the mean, maximum, and absolute values were used, together with the log and absolute-log values.
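The following sketch shows one way such sub-band statistics could be computed. The text names neither the mother wavelet nor the exact statistics, so the Haar wavelet and the five definitions in `subband_features` are assumptions, and both function names are hypothetical. The example uses two decomposition levels so that the coefficients are easy to check by hand; four detail sub-bands would correspond to `levels=4`.

```python
import numpy as np

def haar_dwt(x, levels):
    """Multi-level Haar DWT; returns [cD1, cD2, ..., cD_levels, cA_levels]."""
    approx = np.asarray(x, dtype=float)
    coeffs = []
    for _ in range(levels):
        if approx.size % 2:  # pad odd lengths by repeating the last sample
            approx = np.append(approx, approx[-1])
        even, odd = approx[0::2], approx[1::2]
        coeffs.append((even - odd) / np.sqrt(2.0))  # detail (high-pass) part
        approx = (even + odd) / np.sqrt(2.0)        # approximation (low-pass) part
    coeffs.append(approx)
    return coeffs

def subband_features(c, eps=1e-12):
    """Five statistics of one sub-band (assumed definitions)."""
    energy = c ** 2
    log_mean = np.log(energy.mean() + eps)
    return {
        "mean": float(energy.mean()),
        "max": float(energy.max()),
        "abs": float(np.abs(c).mean()),
        "log": float(log_mean),
        "abs_log": float(abs(log_mean)),
    }

x = np.array([4.0, 2.0, 6.0, 8.0, 1.0, 1.0, 3.0, 5.0])
bands = haar_dwt(x, levels=2)
features = [subband_features(c) for c in bands]
```

Because the Haar transform is orthonormal, the summed squared coefficients equal the signal energy, which is a convenient sanity check on any implementation.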

In the brain lateralization feature, the values of two electrodes are expressed as power spectra, and the lateralization value between the two electrodes lies in the range 0 to 1. A value close to 1 means that the two electrodes are strongly coupled, and a value close to 0 means that the correlation between the two electrodes is small.

FIG. 7 is a diagram for explaining the brain lateralization features of EEG signals according to an embodiment of the present invention.

In the present invention, brain lateralization features were extracted in four ways, as shown in FIG. 7. FIG. 7(a) shows the 1:5 configuration, which measures lateralization between one left-hemisphere central electrode and five right-hemisphere electrodes when lateralization shifts from the left-hemisphere center toward the right hemisphere. FIG. 7(b) shows the 5:1 configuration, which measures lateralization between five right-hemisphere central electrodes and one left-hemisphere central electrode when lateralization shifts from the right-hemisphere center toward the left hemisphere. FIG. 7(c) and FIG. 7(d) show the 3:3 and 5:5 configurations, respectively, which measure lateralization between equal numbers of electrodes in the left and right hemispheres when lateralization shifts between the two hemispheres.

FIG. 7 is based on the international 10-20 system and is simplified, because the figure is hard to read when all electrodes are shown.

The LSTM model selection unit 620 selects, using a genetic algorithm (GA), an LSTM model to be applied to the extracted EEG features.

The GA is a heuristic search algorithm that mimics biological evolution as a problem-solving strategy. First, the genetic algorithm 621 is used, as shown in FIG. 6, to select an effective LSTM model suited to the data extracted by the feature extraction unit 610. Forty percent of the population is randomly selected to create parent individuals, and after training, the top 20% of the parent individuals are selected and carried over to the next-generation models. Child individuals are then created using selection, mutation, and crossover. This process is repeated until the 10th-generation model is produced or the RMSE of the current model no longer improves. In this embodiment the mutation probability was set to 10%, and the mutation process updates only the epoch, LSTM_cell, and drop_out parameters of the model. The epoch, LSTM_cell, drop_out, activation, and optimization parameters of the model are assumed to be set randomly at the start. After iterating until the RMSE of the current model no longer improves, the parameters are optimized and the selected LSTM model 622 is output together with the test data.
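The model-selection loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented implementation: the search ranges in `SPACE`, the uniform crossover, the population size, and the stand-in fitness `toy_rmse` (used here in place of actually training an LSTM and measuring validation RMSE) are all hypothetical, and the initial 40% random sampling step is omitted. The top-20% survival, the 10% mutation rate restricted to epoch, LSTM_cell, and drop_out, and the two stopping conditions follow the text.

```python
import random

SPACE = {  # hypothetical search ranges
    "epoch": [10, 20, 50, 100],
    "LSTM_cell": [32, 64, 128, 256],
    "drop_out": [0.1, 0.2, 0.3, 0.5],
    "activation": ["tanh", "relu"],
    "optimizer": ["adam", "rmsprop"],
}
MUTABLE = ("epoch", "LSTM_cell", "drop_out")  # only these are mutated, per the text

def random_model(rng):
    return {k: rng.choice(v) for k, v in SPACE.items()}

def crossover(a, b, rng):
    # Uniform crossover (assumption): each gene comes from either parent.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SPACE}

def mutate(model, rng, rate=0.10):
    child = dict(model)
    for k in MUTABLE:
        if rng.random() < rate:
            child[k] = rng.choice(SPACE[k])
    return child

def select_model(rmse_fn, pop_size=20, max_gen=10, seed=0):
    rng = random.Random(seed)
    population = [random_model(rng) for _ in range(pop_size)]
    best, best_rmse = None, float("inf")
    for _ in range(max_gen):                       # at most 10 generations
        ranked = sorted(population, key=rmse_fn)
        gen_rmse = rmse_fn(ranked[0])
        if gen_rmse >= best_rmse:                  # RMSE no longer improves
            break
        best, best_rmse = ranked[0], gen_rmse
        parents = ranked[: max(2, pop_size // 5)]  # keep the top 20 %
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return best, best_rmse

def toy_rmse(m):
    # Stand-in for "train the LSTM, return validation RMSE" (hypothetical).
    return abs(m["LSTM_cell"] - 128) / 256 + abs(m["drop_out"] - 0.2) + 1.0 / m["epoch"]

best, best_rmse = select_model(toy_rmse)
```

Keeping the surviving parents in the next population means the best RMSE can never get worse from one generation to the next, which makes the "no further improvement" stopping rule well defined.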

The feature set selection unit 630 selects a feature set for the selected LSTM model using a genetic algorithm. The genetic algorithm 631 is applied again to select the dominant feature set through the LSTM model chosen by the LSTM model selection unit 620. This step randomly selects 25-50% of the features from the entire record.

FIG. 8 is a diagram showing the EEG feature weights and channel weights of the GA-LSTM on the MAHNOB-HCI dataset according to an embodiment of the present invention.

If all features are used for LSTM training, poor features can confuse the trained model. As FIG. 8 shows, about 20% of all MAHNOB-HCI features perform well. If every well-performing feature is already in the set, it is difficult to add and evaluate new features. The present invention therefore uses 25-50% of the features as the dominant feature set. To create the next generation of features, the GA applies three main operations: selection, crossover, and mutation. A feature set is an array of integers, where each integer represents a weight. Selection picks the high-weight features from the dominant features and passes them into the next dominant feature set. Crossover selects high-weight features from among the unused features and then swaps them with the remaining features. Mutation picks features at random from the dominant set and replaces them with new features, which guards against cases where random selection provides too little variety.

After training, the top 20% of the parent individuals are selected and carried over to the next-generation feature group. When features are selected during crossover to create child individuals for the next generation, the crossover ratio of the inherited features is 8:2. The mutation probability was set to 10% so that features not yet selected are not lost, preserving genetic diversity. This process is repeated until the 10th-generation feature group is produced or the RMSE of the current model no longer improves. Choosing the right fitness function is important for the effectiveness and efficiency of the GA.
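The feature-set search described above might be sketched as follows. The initial 25-50% random subset, top-20% survival, 8:2 crossover ratio, 10% mutation rate, and stopping conditions follow the text; representing a feature set as a set of indices (rather than an array of integer weights), the population size, and the stand-in fitness `toy_rmse` are simplifying assumptions, and all function names are hypothetical.

```python
import random

def random_subset(n_features, rng):
    # Start from 25-50 % of all features, as described in the text.
    k = rng.randint(n_features // 4, n_features // 2)
    return frozenset(rng.sample(range(n_features), k))

def crossover(p1, p2, rng, ratio=0.8):
    # Take about 80 % of the child's features from p1 and the rest from p2 (8:2).
    size = len(p1)
    from_p1 = rng.sample(sorted(p1), round(ratio * size))
    pool = sorted(p2 - set(from_p1))
    from_p2 = rng.sample(pool, min(size - len(from_p1), len(pool)))
    return frozenset(from_p1) | frozenset(from_p2)

def mutate(subset, n_features, rng, rate=0.10):
    # With probability `rate`, swap a feature for one that is currently unused.
    result = set(subset)
    for f in sorted(subset):
        unused = sorted(set(range(n_features)) - result)
        if unused and rng.random() < rate:
            result.discard(f)
            result.add(rng.choice(unused))
    return frozenset(result)

def select_features(rmse_fn, n_features, pop_size=20, max_gen=10, seed=0):
    rng = random.Random(seed)
    population = [random_subset(n_features, rng) for _ in range(pop_size)]
    best, best_rmse = None, float("inf")
    for _ in range(max_gen):
        ranked = sorted(population, key=rmse_fn)
        gen_rmse = rmse_fn(ranked[0])
        if gen_rmse >= best_rmse:                  # RMSE no longer improves
            break
        best, best_rmse = ranked[0], gen_rmse
        parents = ranked[: max(2, pop_size // 5)]  # keep the top 20 %
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), n_features, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return best, best_rmse

INFORMATIVE = frozenset(range(10))  # pretend only the first 10 features help

def toy_rmse(subset):
    # Stand-in for "train the selected LSTM on this subset" (hypothetical).
    return 1.0 - 0.05 * len(subset & INFORMATIVE) + 0.01 * len(subset - INFORMATIVE)

best, best_rmse = select_features(toy_rmse, n_features=40)
```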

$$V = \beta P_1 + (1 - \beta) P_2 \qquad (1)$$

Here, $\beta$ is a random number between -0.25 and 1.25, $P_1$ and $P_2$ are the parent gene values, and $V$ is the child gene value. If the two parents' genes are similar, many similar features are evaluated each time over repeated generations. Randomness is therefore introduced whenever a new child gene is created. According to Equation (1), each parent gene is selected for the new child gene, the features are randomly reduced and selected at crossover, and the child gene is determined through the GA.
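As a hedged illustration, the stated range of β (-0.25 to 1.25) matches the standard intermediate-recombination operator, in which each child gene is V = β·P1 + (1 − β)·P2. The sketch below assumes that form; the function name and the example parent values are hypothetical.

```python
import random

def intermediate_crossover(p1, p2, rng, lo=-0.25, hi=1.25):
    """Child gene V = beta * p1 + (1 - beta) * p2, with beta drawn per gene."""
    child = []
    for g1, g2 in zip(p1, p2):
        beta = rng.uniform(lo, hi)  # random weight in [-0.25, 1.25]
        child.append(beta * g1 + (1.0 - beta) * g2)
    return child

rng = random.Random(0)
parent1 = [0.0, 10.0, 4.0]
parent2 = [1.0, 20.0, 4.0]
child = intermediate_crossover(parent1, parent2, rng)
```

Because β may fall slightly outside [0, 1], a child gene can land up to 25% of the parents' distance beyond either parent, which keeps some diversity even when the parents are similar; when both parents carry the same value, the child inherits it unchanged.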

FIG. 9 is a diagram illustrating the structure of the LSTM-FC model according to an embodiment of the present invention.

The feature extraction step according to an embodiment of the present invention extracts the three-domain features from each EEG signal as one-dimensional features. Regression according to an embodiment of the present invention is performed by the GA-LSTM. The LSTM fully connected (LSTM-FC) model used in the GA-LSTM consists of three LSTM layers and two FC layers, as shown in FIGS. 9 and 6. The output layer is an FC layer with a single neuron and a tanh activation function; it produces a value between -0.1 and 0.1 to predict one valence value. As LSTM input, the EEG data refined by feature extraction were fed in 2-second segments at 0.25-second intervals. The model is the same as a general LSTM model, except that it includes a recurrent network that can maintain state, for example by carrying the state learned in one batch over to the next batch. The LSTM is stacked in three layers so that it can perform deeper inference than a single layer. Dropout layers were added between the LSTM layers to prevent overfitting.
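The 2-second segments taken every 0.25 seconds can be produced with a simple sliding window. This is a minimal sketch: the sampling rate of 256 Hz, the 10-second placeholder recording, and the function name `sliding_windows` are assumptions, not values from the text.

```python
import numpy as np

def sliding_windows(signal, fs, window_s=2.0, step_s=0.25):
    """Split (channels, samples) EEG into overlapping fixed-length windows."""
    signal = np.asarray(signal)
    win = int(round(window_s * fs))   # samples per 2-second window
    step = int(round(step_s * fs))    # samples per 0.25-second step
    starts = range(0, signal.shape[1] - win + 1, step)
    # Result shape: (n_windows, channels, win)
    return np.stack([signal[:, s:s + win] for s in starts])

fs = 256                          # hypothetical sampling rate in Hz
eeg = np.zeros((12, 10 * fs))     # 12 channels, 10 seconds of placeholder data
windows = sliding_windows(eeg, fs)
print(windows.shape)  # (33, 12, 512)
```

Each window would then be passed through feature extraction, giving one feature vector per 0.25-second annotation step.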

Based on the one-dimensional features, the initial feature group used as the LSTM input was selected randomly by the GA. One valence value is regressed from the one-dimensional features extracted sequentially from each image. Before being processed by the LSTM, the feature group is adjusted by the GA, and the hidden state vector output by the LSTM can recruit a new feature group before the next LSTM. The final output of the GA-LSTM is a valence value. Both the valence and the arousal experiments were performed using the MERTI-Apps dataset.

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications executed on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, the processing device is sometimes described as a single device, but a person of ordinary skill in the art will recognize that the processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, computer storage medium, or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiments may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the described components of a system, structure, apparatus, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

<References>

1. M. S. Sinith, E. Aswathi, T. M. Deepa, C. P. Shameema, and S. Rajan; Emotion recognition from audio signals using Support Vector Machine. In Intelligent Computational Systems (RAICS), IEEE Recent Advances in, 2005, 139-144.

2. C. Busso, Z. Deng, S. Yildirim, M. Bulut, C. M. Lee, A. Kazemzadeh, ... and S. Narayanan; Analysis of emotion recognition using facial expressions, speech, and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces, 2004, 205-211.

3. H. Yang, U. Ciftci, and L. Yin; Facial Expression Recognition by De-Expression Residue Learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, 2168-2177.

4. H. Jung, S. Lee, J. Yim, S. Park, and J. Kim; Joint fine-tuning in deep neural networks for facial expression recognition. In Proceedings of the IEEE International Conference on Computer Vision, 2015, 2983-2991.

5. H. Ranganathan, S. Chakraborty, S. Panchanathan; Multimodal emotion recognition using deep learning architectures. 2016 IEEE Winter conference on Applications of Computer Vision (WACV), 2016.

6. M. Valstar, and M. Pantic; Induced disgust, happiness, and surprise: an addition to the mmi facial expression database. In Proc. 3rd Intern. Workshop on EMOTION (satellite of LREC): Corpora for Research on Emotion and Affect, 2010, 65-70.

7. X-H. Wang, T. Zhang, X-M. Xu, L. Chen, X-F. Xing, C. L. Philip Chen; EEG Emotion Recognition Using Dynamical Graph Convolutional Neural Networks and Broad Learning System. IEEE Conf. on Bioinformatics and Biomedicine (BIBM), 2018, 1240-1244.

8. A. Mavratzakis, C. Herbert, and P. Walla; Emotional facial expressions evoke faster-orienting responses, but weaker emotional responses at neural and behavioral levels compared to scenes: A simultaneous EEG and facial EMG study. Neuroimage, 2016, 124, 931-946.

9. R. Jenke, A. Peer, and M. Buss; Feature extraction and selection for emotion recognition from EEG. IEEE Transactions on Affective Computing, 2014, 5(3), 327-339.

10. C. Busso, Z. Deng, S. Yildirim, M. Bulut, C. M. Lee, A. Kazemzadeh, and S. Narayanan; Analysis of emotion recognition using facial expressions, speech, and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces, 2004, 205-211.

11. S. Koelstra, C. Muhl, M. Soleymani, J. S. Lee, A. Yazdani, T. Ebrahimi, ... and I. Patras; DEAP: A database for emotion analysis; using physiological signals. IEEE Transactions on Affective Computing, 2012, 3(1), 18-31.

12. M. Soleymani, J. Lichtenauer, T. Pun, and M. Pantic; A multimodal database for affect recognition and implicit tagging. IEEE Transactions on Affective Computing, 2012, 3(1), 42-55.

13. Soraia M. A and Manuel J. F; Emotions Recognition Using EEG Signals: A Survey. IEEE Transactions on Affective Computing, 2017, 10(3), 374-393.

14. N. Ketkar; Introduction to PyTorch. In Deep Learning with Python, Apress, Berkeley, CA, 2017, 195-208.

15. P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar and I. Matthews; The extended Cohn-Kanade dataset (ck+): A complete dataset for action unit and emotion-specified expression. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2010, 94-101.

16. L. S. Granados, M. M. Organero, G. R. Gonzalez, E. Abdulhay, and N. Arunkumar; Using Deep Convolutional Neural Network for Emotion Detection on a Physiological Signals Dataset (AMIGOS). IEEE Access, 2018, 7, 57-67.

17. M. Soleymani, S. Asghari-Esfeden, Y. Fu and M. Pantic; Analysis of EEG signals and facial expressions for continuous emotion detection. IEEE Transactions on Affective Computing, 2016, 1, 17-28.

18. B. H. Kim and S. Jo; Deep Physiological Affect Network for the Recognition of Human Emotions. IEEE Transactions on Affective Computing, 2020, 11(2), 230-243.

19. E. Maria, L. Matthias and H. Sten; Emotion Recognition from Physiological Signal Analysis: A Review, Electronic Notes in Theoretical Computer Science. 2019, 343, 35-55.

20. T. Song, W. Zheng, P. Song, and Z. Cui; EEG Emotion Recognition Using Dynamical Graph Convolutional Neural Networks. IEEE Transactions on Affective Computing, 2020, 11(3), 532-541.

21. S. Katsigiannis and N. Ramzan; DREAMER: A Database for Emotion Recognition through EEG and ECG Signals From Wireless Low-cost Off-the-Shelf Devices. IEEE Journal of Biomedical and Health Informatics. 2017, 22(1), 98-107.

22. S. A. Mithbavkar and M. S. Shah; Recognition of Emotion Through Facial Expressions Using EMG Signal. 2019 International Conference on Nascent Technologies in Engineering (ICNTE). 2019.

23. J. A. Healey and R. W. Picard; Detecting Stress During Real-World Driving Tasks Using Physiological Sensors. IEEE Trans. Intelligent Transportation Systems, 2005, 6(2), 156-166.

24. M. Grimm, K. Kroschel, and S. Narayanan; The Vera am Mittag German Audio-Visual Emotional Speech Database. Proc. IEEE Int'l Conf. Multimedia and Expo, 2008, 865-868.

25. Firgan Nihatov Feradov and Todor Dimitrov Ganchev; Ranking of EEG Time-domain Features on the Negative Emotions Recognition Task. Annual Journal of Electronics. 2015.

26. G. McKeown, M. F. Valstar, R. Cowie, and M. Pantic; The SEMAINE Corpus of Emotionally Coloured Character Interactions. Proc. IEEE Int'l Conf. Multimedia and Expo, July 2010, 1079-1084.

27. Biosemi Active II system: https://www.biosemi.com/products.htm

28. U. Herwing, P. Satrapi, and C. Ronfeldt-Lecuona; Using the International 10-20 EEG system for Positioning of Transcranial Magnetic stimulation, Brain Topography, 2003, 16(2).

29. R. Cowie, E. Douglas-Cowie, S. Savvidou, E. McMahon, M. Sawey, and M. Schröder; 'FEELTRACE': An instrument for recording perceived emotion in real-time. In ISCA tutorial and research workshop (ITRW) on speech and emotion. 2000.

30. R. Jenke, A. Peer, and M. Buss; Feature extraction and selection for emotion recognition from EEG. IEEE Transactions on Affective Computing, 2014, 5(3), 327-339.

31. Z. Yin, M. Zhao, Y. Wang, J. Yang, and J. Zhang; Recognition of emotions using multimodal physiological signals and an ensemble deep learning model. Computer Methods and Programs in Biomedicine, 2017, 140, 93-110.

32. Gerard E. Bruder, Jorge Alvarenga, Karen Abraham, Jamie Skipper, Virginia Warner, Daniel Voyer, Bradley S. Peterson & Myrna M. Weissman; Brain laterality, depression, and anxiety disorders: New findings for emotional and verbal dichotic listening in individuals at risk for depression, Laterality: Asymmetries of Body, Brain, and Cognition, 2016, 21:4-6, 525-548.

33. Fiona Kumfor, Ramon Landin-Romero, Emma Devenney, Rosalind Hutchings, Roberto Grasso, John R. Hodges, Olivier Piguet; On the right side? A longitudinal study of left- versus right-lateralized semantic dementia. Brain: A Journal of Neurology, March 2016, 139(3), 986-998.

34. Hazrat Ali, Nasir Ahmad, Xianwei Zhou, Khalid Iqbal, and Sahibzada Muhammad Ali; DWT features performance analysis for automatic speech recognition of Urdu, SpringerPlus, 2014, 3, 204.

35. H. Candra, M. Yuwono, R. Chai, H. T. Nguyen, and S. Su; EEG emotion recognition using reduced channel wavelet entropy and average wavelet coefficient features with normal Mutual Information method, IEEE Engineering in Medicine and Biology Society (EMBC). 2017.

36. Y. Ding, X. Hu, Z. Xia, Y. J. Liu and D. Zhang; Inter-brain EEG feature Extraction and Analysis for Continuous Implicit Emotion Tagging during Video Watching, IEEE Transactions on Affective Computing, Early Access, 2018 June 22.

37. D. Mehta, M. F. H. Siddiqui and A. Y. Javaid; Recognition of Emotion Intensities Using Machine Learning Algorithms: A Comparative Study, Sensors, April 2019, 21.

38. H. Chao, L. Dong, Y. Liu and B. Lu; Improved Deep Feature Learning by Synchronization Measurements for Multi-Channel EEG Emotion Recognition, Hindawi Complexity, 2020.

39. P. Li, H. Liu, Y. Si, C. Li, F. Li, X. Zhu, X. Huang, Y. Zeng, D. Yao, Y. Zhang and Peng Xu; EEG Based Emotion Recognition by Combining Functional Connectivity Network and Local Activations, IEEE Transactions on Biomedical Engineering, Oct. 2019, 66(10).

40. Z. Gao, X. Wang, Y. Yang, C. Mu, Q. Cai, W. Dang and Siyang Zuo; EEG-Based Spatio-Temporal Convolutional Neural Network for Driver Fatigue Evaluation, IEEE Transactions on Neural Networks and Learning Systems, September 2019, 30(9).

41. S. Alhagry, A. A. Fahmy and R. A. El-Khoribi; Emotion recognition based on EEG using LSTM recurrent neural network, Int. J. Adv. Comput. Sci. Appl. 2017, 8(10), 8-11.

42. E. S. Salama, R. A. El-Khoribi, M. E. Shoman and M. A. Wahby Shalaby; EEG-based emotion recognition using 3D convolutional neural networks, Int. J. Adv. Comput. Sci. Appl., 2018, 9(8), 329-337.

43. Z. M. Wang, S. Y. Hu, H. Song; Channel selection method for EEG emotion recognition using normalized mutual information, IEEE Access, 2019, 7, 143303-143311.

Claims

A deep learning method for emotion recognition based on model selection, the method comprising:
extracting electroencephalogram (EEG) features in a time domain, a frequency domain, and a time-frequency domain in order to select active features for emotion recognition;
selecting, using a genetic algorithm (GA), an LSTM model to be applied to the extracted EEG features; and
selecting, using the genetic algorithm, a feature set for the selected LSTM model.

The method of claim 1,
wherein the extracting of the EEG features in the time domain, the frequency domain, and the time-frequency domain in order to select active features for emotion recognition comprises:
from the MERTI-Apps data set, which is built on an Asian-based database, representing the features extracted in the time domain as changes in the EEG signal over time so as to recognize changes in emotion over time; extracting the features in the frequency domain separately for the slow alpha, alpha, beta, and gamma bands; representing the features extracted in the time-frequency domain by decomposing the signal into bits over time through a discrete wavelet transform (DWT); and converting the EEG features, which include the extracted time-domain features, frequency-domain features, time-frequency-domain features, and brain function differentiation features, into a one-dimensional vector for use as input data.
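
As an illustration only, the extraction step recited above can be sketched in plain Python. The band edges, the choice of the Haar wavelet, the three-level decomposition, and all function names are assumptions made for this sketch and are not taken from the claims; the brain function differentiation features are omitted for brevity.

```python
import cmath
import math
import statistics

# Band edges in Hz are illustrative assumptions, not values from the claims.
BANDS = {"slow_alpha": (8, 10), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def time_features(x):
    """Mean, standard deviation, and mean absolute first difference."""
    diffs = [abs(b - a) for a, b in zip(x, x[1:])]
    return [statistics.fmean(x), statistics.pstdev(x), statistics.fmean(diffs)]

def band_powers(x, fs):
    """Average spectral power per band, from a naive O(n^2) DFT."""
    n = len(x)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2 / n)
    out = []
    for lo, hi in BANDS.values():
        sel = [p for k, p in enumerate(power) if lo <= k * fs / n < hi]
        out.append(statistics.fmean(sel) if sel else 0.0)
    return out

def haar_dwt_energies(x, levels=3):
    """Energy of the Haar detail coefficients at each decomposition level."""
    approx, energies = list(x), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies

def extract_features(x, fs):
    """Concatenate all feature groups into one flat (1-D) vector."""
    return time_features(x) + band_powers(x, fs) + haar_dwt_energies(x)

# Synthetic 1-second, 128 Hz signal dominated by a 10 Hz (alpha-range) sine.
fs = 128
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
features = extract_features(signal, fs)
```

The resulting list has ten entries (three time-domain values, four band powers, three wavelet energies). A practical implementation would more likely rely on an FFT routine and a wavelet library than on the naive loops used here.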

The method of claim 1,
wherein the selecting, using the genetic algorithm, of the LSTM model to be applied to the extracted EEG features comprises:
generating random parent objects for the EEG features extracted from the MERTI-Apps data set and training them; selecting the top parent objects that meet or exceed a predetermined criterion and carrying them over to the next-generation model; generating child objects through a genetic algorithm process that includes selection, mutation, and crossover; repeating the training until a predetermined number of next-generation models has been generated or the root-mean-square error (RMSE) of the current model no longer improves; and then outputting the LSTM model to be applied to the extracted EEG features.
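
The loop recited above (random parents, carry-over of the best parents, crossover and mutation, and two stopping conditions) might look roughly like the following sketch. The hyperparameter search space, the population sizes, the patience value, and the `toy_rmse` function, which stands in for actually training an LSTM and measuring its RMSE, are all hypothetical.

```python
import random

# Hypothetical search space; the claims do not list which LSTM settings vary.
SPACE = {"layers": [1, 2, 3], "hidden": [32, 64, 128, 256], "dropout": [0.0, 0.2, 0.5]}

def toy_rmse(genome):
    """Stand-in for training an LSTM and returning its validation RMSE."""
    target = {"layers": 2, "hidden": 128, "dropout": 0.2}
    return 0.05 + sum(0.1 for k in SPACE if genome[k] != target[k])

def random_genome(rng):
    return {k: rng.choice(v) for k, v in SPACE.items()}

def crossover(a, b, rng):
    """Uniform crossover: each gene comes from one of the two parents."""
    return {k: rng.choice((a[k], b[k])) for k in SPACE}

def mutate(genome, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {k: rng.choice(SPACE[k]) if rng.random() < rate else v
            for k, v in genome.items()}

def select_model(evaluate, pop_size=12, n_parents=4, max_gen=30, patience=5, seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    best, best_rmse, stale = None, float("inf"), 0
    for _ in range(max_gen):                       # stop 1: generation budget
        ranked = sorted(population, key=evaluate)
        parents = ranked[:n_parents]               # top parents carried over
        top_rmse = evaluate(parents[0])
        if top_rmse < best_rmse:
            best, best_rmse, stale = parents[0], top_rmse, 0
        else:
            stale += 1
            if stale >= patience:                  # stop 2: RMSE stopped improving
                break
        children = [mutate(crossover(*rng.sample(parents, 2), rng), rng)
                    for _ in range(pop_size - n_parents)]
        population = parents + children
    return best, best_rmse

best_model, best_rmse = select_model(toy_rmse)
```

Because the best parents are always carried over unchanged, the best RMSE never gets worse from one generation to the next. In a real system each call to `evaluate` would train a model, so caching scores per genome would matter far more than it does in this toy.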

The method of claim 1,
wherein the selecting, using the genetic algorithm, of the feature set for the selected LSTM model comprises:
generating random parent objects and training them using the selected LSTM model; selecting the top parent objects that meet or exceed a predetermined criterion and carrying them over to the next-generation model; generating child objects through a genetic algorithm process that includes selection, mutation, and crossover; and selecting, through the selected LSTM model, a dominant feature set that meets or exceeds a predetermined criterion,
wherein, without fixing an initial model or feature set in advance, all models, all feature groups, and all channels are evaluated by the GA, and the optimal model and feature set are selected by removing the models, features, and channels that interfere with emotion recognition.
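
One common way to realize this kind of selection is to encode a candidate feature set as a bit mask and let the GA evolve the mask. The sketch below assumes that encoding; the feature-group names, the split into helpful and interfering groups, and the `toy_rmse` score (a stand-in for training the selected LSTM on the masked features) are hypothetical and serve only to make the example runnable.

```python
import random

# Hypothetical feature-group names; a 1 in the mask keeps the group.
GROUPS = ["time", "slow_alpha", "alpha", "beta", "gamma", "dwt", "noise_a", "noise_b"]
HELPFUL = {"time", "alpha", "beta", "dwt"}  # assumption used only by the toy score

def toy_rmse(mask):
    """Stand-in for the selected LSTM's RMSE when trained on the masked features."""
    kept = {g for g, bit in zip(GROUPS, mask) if bit}
    missed = len(HELPFUL - kept)      # useful groups that were dropped
    harmful = len(kept - HELPFUL)     # interfering groups that were kept
    return 0.05 + 0.04 * missed + 0.02 * harmful

def select_features(evaluate, pop_size=20, n_parents=6, generations=40, seed=1):
    rng = random.Random(seed)
    n = len(GROUPS)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(population, key=evaluate)[:n_parents]
        children = []
        while len(children) < pop_size - n_parents:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)                      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < 0.1) for bit in child]  # bit-flip mutation
            children.append(child)
        population = parents + children
    return min(population, key=evaluate)

best_mask = select_features(toy_rmse)
selected = [g for g, bit in zip(GROUPS, best_mask) if bit]
```

The same mask idea extends to EEG channels by giving each channel its own bit, which is one way to read the claim's statement that interfering channels are removed along with interfering features.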

A deep learning apparatus for emotion recognition based on model selection, the apparatus comprising:
a feature extraction unit configured to extract electroencephalogram (EEG) features in a time domain, a frequency domain, and a time-frequency domain in order to select active features for emotion recognition;
an LSTM model selection unit configured to select, using a genetic algorithm (GA), an LSTM model to be applied to the extracted EEG features; and
a feature set selection unit configured to select, using the genetic algorithm, a feature set for the selected LSTM model.

The apparatus of claim 5,
wherein the feature extraction unit is configured to:
from the MERTI-Apps data set, which is built on an Asian-based database, represent the features extracted in the time domain as changes in the EEG signal over time so as to recognize changes in emotion over time; extract the features in the frequency domain separately for the slow alpha, alpha, beta, and gamma bands; represent the features extracted in the time-frequency domain by decomposing the signal into bits over time through a discrete wavelet transform (DWT); and convert the EEG features, which include the extracted time-domain features, frequency-domain features, time-frequency-domain features, and brain function differentiation features, into a one-dimensional vector for use as input data.

The apparatus of claim 5,
wherein the LSTM model selection unit is configured to:
generate random parent objects for the EEG features extracted from the MERTI-Apps data set and train them; select the top parent objects that meet or exceed a predetermined criterion and carry them over to the next-generation model; generate child objects through a genetic algorithm process that includes selection, mutation, and crossover; repeat the training until a predetermined number of next-generation models has been generated or the root-mean-square error (RMSE) of the current model no longer improves; and then output the LSTM model to be applied to the extracted EEG features.

The apparatus of claim 5,
wherein the feature set selection unit is configured to:
generate random parent objects and train them using the selected LSTM model; select the top parent objects that meet or exceed a predetermined criterion and carry them over to the next-generation model; generate child objects through a genetic algorithm process that includes selection, mutation, and crossover; and select, through the selected LSTM model, a dominant feature set that meets or exceeds a predetermined criterion,
wherein, without fixing an initial model or feature set in advance, all models, all feature groups, and all channels are evaluated by the GA, and the optimal model and feature set are selected by removing the models, features, and channels that interfere with emotion recognition.