KR20180058175A - DNN Based Object Learning and Recognition System

Info

Publication number: KR20180058175A
Application number: KR1020170057911A
Authority: KR
Inventors: 하영국; 정혁준; 이명재; 최수용
Original assignee: 건국대학교 산학협력단
Priority date: 2016-11-23
Filing date: 2017-05-10
Publication date: 2018-05-31

Abstract

In a DNN-based object learning and recognition system according to the present invention, a method for learning and recognizing an object using a deep neural network (DNN) in a server includes the steps of: receiving a plurality of images collected by a crawling process; training a DNN model based on the collected images; generating a DNN model profile from the trained DNN model; and transmitting the generated DNN model profile to an on-board computer. Accordingly, the present invention can recognize objects quickly and accurately and reduce the error rate.

Description

DNN-Based Object Learning and Recognition System

The present invention relates to a DNN-based object learning and recognition system. More specifically, it relates to a system that collects images using a crawling process, trains on the collected images at the server level using a deep neural network (DNN) algorithm to generate a DNN model profile, and uses that profile on board to recognize objects, thereby enabling fast and accurate object recognition.

Visual elements provide a great deal of information. Even commonly seen images contain countless pieces of information. However, a suitable algorithm is needed to extract the desired information from a wide variety of images. Accordingly, research on quickly obtaining desired information from images is accelerating.

To obtain information about an object in an image, the image must first be analyzed. Earlier image processing techniques treated an image as a two-dimensional signal and applied signal processing techniques to it. This approach can also be used to recognize objects in an image, and a representative example is a license plate recognition system (Prior Art Document 1). However, because these earlier techniques pass through several processing stages, extraction takes a long time and information in the original image may be lost, so they fall short for real-time object recognition.

To solve these problems, research using deep learning has recently been conducted. Deep learning refers to deep neural network (DNN) algorithms and the methods used to train them. Initially its use was limited because training was time-consuming and prone to overfitting the training data. However, with the advent of GPUs capable of parallel computation and of new algorithms, object recognition in images has improved dramatically. The error rate is now very low and close to the human recognition error rate (Prior Art Documents 2 and 3).

However, generating a DNN model with such a low error rate requires collecting a number of images on the scale of big data, and a very large number of reinforcement learning iterations must be performed on the collected images, which consumes considerable time and resources. Although this has improved considerably compared with the past, the time and resources required remain a significant burden when a DNN model is used to recognize objects on board.

Prior Art Documents

Prior Art Document 1. Byung-Tae Jeon, Ho-Sub Yoon. (1993.7). A Method to Extract Vehicle Number Plates by Applying Signal Processing Technique. The Institute of Electronics Engineers of Korea - B, 30(7), 728-737.

Prior Art Document 2. ByungIn Yoo, Wonjun Hwang, Seungju Han, Seon-Min Rhee, Jung-Bae Kim, Jae-Joon Han. (2015.9). Trends in Deep Learning-Based Image Recognition Approaching Human Level. Communications of the Korean Institute of Information Scientists and Engineers, 33(9), 32-41.

Prior Art Document 3. Je-Kang Park, Yong-Kyu Park, Han-Ik On, Dong-Joong Kang. (2015.12). Object Recognition Technique in Images Using Deep Learning. 제어로봇시스템학회지 (Journal of Control, Robotics and Systems), 21(4), 21-26.

The present invention is intended to solve the problems described above, and its object is to provide a DNN-based object learning and recognition system capable of fast and accurate object recognition.

Another object of the present invention is to provide a DNN-based object learning and recognition system with a reduced error rate, achieved by collecting images on a big-data scale using a crawling process and training on them.

Other objects of the present invention will become clearer from the following detailed description and the accompanying drawings.

To this end, a DNN-based object learning and recognition method according to an embodiment of the present invention is a method for learning and recognizing an object in a server using a deep neural network (DNN), the method comprising: receiving a plurality of images collected by a crawling process; training a DNN model based on the collected images; generating a DNN model profile from the trained DNN model; and transmitting the generated DNN model profile to an on-board computer.

The DNN-based object learning and recognition method may further comprise: loading, at the on-board computer, the DNN model profile from the server; building a DNN model from the DNN model profile; and recognizing an object included in an input image by applying the DNN model to the input image.

The step of collecting the plurality of images may comprise: building an ontology for classifying images; collecting a plurality of images for each instance according to the built ontology; and storing the collected images.

The ontology may further include, with each instance as a parent instance, a plurality of sub-instances for that parent instance. In that case, the step of collecting a plurality of images for each instance may comprise collecting a plurality of images for the plurality of sub-instances, and the step of storing the collected images may comprise storing the images for the sub-instances under the parent instance.

The step of training the DNN model may be characterized by repeatedly training on the images until the recognition rate for objects included in the images falls within a predetermined error rate range.

The step of training the DNN model may be characterized by repeatedly training on the images until the number of training iterations for the images reaches a predetermined number.
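The two stopping conditions described above (an error rate threshold and a fixed iteration count) can be sketched together as follows. This is a minimal illustration only: the specification presents the conditions as alternative embodiments, and combining them in one loop is an assumption made here. The names `train_until` and `make_fake_step` are hypothetical, and the fake step merely halves an error value in place of a real training pass.

```python
from typing import Callable, Tuple


def train_until(
    train_step: Callable[[], float],
    target_error: float,
    max_iterations: int,
) -> Tuple[int, float]:
    """Repeat training until the error is low enough or the budget is spent.

    `train_step` is assumed to run one training pass and return the
    current error rate. Returns (iterations performed, last error rate).
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    error = float("inf")
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        error = train_step()
        if error <= target_error:  # error rate is within the target range
            break
    return iteration, error


def make_fake_step(start: float = 0.5, decay: float = 0.5) -> Callable[[], float]:
    """Hypothetical stand-in for a real training pass: error shrinks each call."""
    state = {"error": start}

    def step() -> float:
        state["error"] *= decay
        return state["error"]

    return step
```

With the fake step, `train_until(make_fake_step(), 0.1, 100)` stops on the error threshold, while `train_until(make_fake_step(), 0.0, 5)` stops on the iteration limit.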

The DNN model may be a convolutional neural network model.

The DNN model profile may include the momentum, learning rate, weights, and batch size of the DNN model.
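A profile holding these four items could be represented and serialized for transmission along the following lines. This is a sketch under stated assumptions: the specification does not define a file format, so the JSON encoding, the class name `DnnModelProfile`, and the per-layer weight layout are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class DnnModelProfile:
    """The four items the text lists for a DNN model profile."""
    momentum: float
    learning_rate: float
    batch_size: int
    weights: List[List[float]]  # assumed layout: one flat list per layer


def profile_to_json(profile: DnnModelProfile) -> str:
    """Serialize the profile so a server could send it to an on-board computer."""
    return json.dumps(asdict(profile), sort_keys=True)


def profile_from_json(text: str) -> DnnModelProfile:
    """Rebuild the profile on the receiving side."""
    return DnnModelProfile(**json.loads(text))
```

A round trip through `profile_to_json` and `profile_from_json` returns an equal profile object, which is the property the receiving side depends on.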

A computer-readable storage medium according to another embodiment of the present invention includes instructions that, when executed by a computing device, cause the computing device to perform the DNN-based object learning and recognition method described above.

A DNN-based object learning and recognition system according to yet another embodiment of the present invention is a system for learning and recognizing objects using a deep neural network (DNN), the system comprising: a crawling server that collects a plurality of images over a network; a DNN learning server that trains on the images collected by the crawling server to generate a DNN model and extracts a DNN model profile from the generated DNN model; and an on-board computer that recognizes an object included in an input image using the DNN model profile from the DNN learning server.

The on-board computer includes a DNN model building module that builds the DNN model from the DNN model profile. The DNN model building module may include: a profile downloader that downloads the DNN model profile from the DNN learning server; a profile interpreter that interprets the DNN model profile downloaded by the profile downloader; and a DNN model generator that generates a DNN model based on the interpreted profile.
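The downloader, interpreter, and generator described above can be pictured as three small functions chained together. The sketch below is illustrative only and makes several assumptions: the profile is JSON text, the transport is an injected `fetch` callable rather than a real network client, and the generated "model" is a toy linear scorer with one weight vector per class label rather than a CNN. All function names are hypothetical.

```python
import json
from typing import Callable, Dict, List


def download_profile(fetch: Callable[[str], str], url: str) -> str:
    """Profile downloader: `fetch` is an injected transport, e.g. an HTTP GET."""
    return fetch(url)


def interpret_profile(raw: str) -> Dict:
    """Profile interpreter: parse the text and check the expected fields."""
    profile = json.loads(raw)
    for key in ("momentum", "learning_rate", "batch_size", "weights"):
        if key not in profile:
            raise ValueError(f"profile is missing field: {key}")
    return profile


def generate_model(profile: Dict) -> Callable[[List[float]], str]:
    """Model generator: builds a toy linear scorer from the stored weights."""
    weights: Dict[str, List[float]] = profile["weights"]

    def recognize(features: List[float]) -> str:
        # Score each class by a dot product and return the best label.
        scores = {
            label: sum(w * x for w, x in zip(vector, features))
            for label, vector in weights.items()
        }
        return max(scores, key=scores.get)

    return recognize
```

Keeping the transport injectable means the same three steps can be exercised without a network, which is convenient when testing on-board software off the vehicle.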

The crawling server may include: an ontology building module that builds an ontology including instances for classifying images; an image collection module that collects a plurality of images for each instance according to the ontology built by the ontology building module; and a storage module that stores the collected images.

The ontology may further include, with each instance as a parent instance, a plurality of sub-instances for that parent instance; the image collection module may collect a plurality of images for the plurality of sub-instances, and the storage module may store the images for the sub-instances under the parent instance.

The DNN learning server may repeatedly train on the images until the recognition rate for objects included in the images falls within a predetermined error rate range.

The DNN learning server may repeatedly train on the images until the number of training iterations for the images reaches a predetermined number.

The DNN model profile may include the momentum, learning rate, weights, and batch size of the DNN model.

With this configuration, the DNN-based object learning and recognition system of the present invention can recognize objects quickly and accurately.

Another effect of the present invention is that the error rate can be reduced by collecting images on a big-data scale using a crawling process and training on them.

FIG. 1 is an overall system configuration diagram schematically illustrating a DNN-based object learning and recognition system according to an embodiment of the present invention.
FIG. 2 is a flowchart schematically illustrating a DNN-based object learning and recognition method according to an embodiment of the present invention.
FIG. 3 is a block diagram illustrating the crawler structure of a DNN-based object learning and recognition system according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating an example of an ontology built according to an embodiment of the present invention.
FIG. 5 is an example screen showing images that can be found using a single instance in Google image search.
FIG. 6 is an example screen showing images that can be found using a plurality of sub-instances belonging to the instance of FIG. 5.
FIG. 7 is a block diagram illustrating the configuration of the DNN learning server of a DNN-based object learning and recognition system according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating a convolutional neural network used for DNN-based object learning configured according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating the configuration of the on-board computer of a DNN-based object learning and recognition system according to an embodiment of the present invention.
FIG. 10 shows image object recognition results of a DNN-based object learning and recognition system according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention are described in more detail with reference to the accompanying FIGS. 2 to 6. The embodiments may be modified in various forms, and the scope of the present invention should not be construed as limited to the embodiments described below. These embodiments are provided to explain the present invention more fully to a person of ordinary skill in the art to which the invention pertains. Accordingly, the shapes of the elements shown in the drawings may be exaggerated for clarity.

In one embodiment of the present invention, DNN technology is used with the Caffe framework for training. In addition, fast computation across multiple GPUs is used to analyze a variety of image data so that objects can ultimately be recognized. The system thus enables fast and accurate object recognition.

1. Design

FIG. 1 is an overall system configuration diagram schematically illustrating a DNN-based object learning and recognition system according to an embodiment of the present invention. Referring to FIG. 1, the DNN-based object learning and recognition system (100) includes: a crawling server (110) that collects and stores images to be learned via a network (150); a DNN learning server (130) that trains on the images collected by the crawling server (110) to generate a DNN model profile; and an on-board computer (170) that loads the DNN model profile from the DNN learning server (130) via the network, builds a DNN model from it, and recognizes objects using that model. This embodiment is described as further including a search server (190) through which the crawling server (110) collects images over the network. However, the search server (190) is not an essential component of the present invention, and the crawling server (110) may search for images on the network using any suitable method other than the search server (190).

FIG. 2 is a flowchart schematically illustrating a DNN-based object learning and recognition method according to an embodiment of the present invention. The method is broadly divided into three stages according to the executing entity, and in FIG. 2 each stage consists of two sub-steps. In the first stage, performed by the crawling server (110), a prototype ontology is built (S210). The crawling server (110) then builds an automated object collection system, that is, a crawling process, per instance according to the built ontology and collects images (S220). The crawling process, in which the crawling server (110) builds the ontology and uses it to collect images, is described in more detail in Section 2.1 below.

Next, the DNN learning server (130) trains on the plurality of images collected by the crawling server (110) in order to recognize objects (S230). The DNN learning server (130) generates a DNN model profile as the result of the training performed in step S230 and transmits it to the on-board computer (170) (S240). The specific way in which the DNN learning server (130) learns objects and generates the DNN model profile is described in more detail in Section 2.2 below.

The on-board computer (170) loads the DNN model profile from the DNN learning server (130) via the network (150) and uses it to build a DNN model (S250). Finally, the on-board computer (170) recognizes objects in an input image using the built DNN model (S260). The way in which the on-board computer (170) recognizes objects is described in more detail in Section 2.3 below.

2.1 Ontology Construction and Crawling Process

Recognizing objects requires object learning across a variety of objects, and such learning requires a large volume of object image data on the scale of big data. The following therefore describes in detail an object image collection method for gathering a large amount of image data for each object.

FIG. 3 is a block diagram illustrating the configuration of a crawling server according to an embodiment of the present invention. Referring to FIG. 3, the crawling server (110) includes an ontology building module (111) for building an ontology for the objects to be searched, an image collection module (113) for collecting images according to the built ontology, and an image storage module (115) for storing the collected images. For convenience of explanation, this embodiment describes recognizing automobiles as the object, but the invention is not limited to this, and the crawling server may collect images containing any suitable object for training.

The ontology building module (111) builds a tree-structured ontology for the objects to be collected. FIG. 4 shows an example of an ontology built according to an embodiment of the present invention. As demand for unmanned autonomous driving systems has grown, the design of object recognition and learning systems for autonomous driving has become necessary. This embodiment therefore takes as its example the construction of an ontology for the object learning and recognition needed for autonomous driving. Object learning itself, however, was subsequently carried out only for automobiles.

Referring to FIG. 4, a tree-structured ontology (400) of road objects (Road Objects) is built for autonomous driving. In this example, each node of the ontology is called an instance. An instance in a higher layer of the tree is called a parent instance, and an instance in the layer below a given instance is called a sub-instance; these are relative terms. For a system that recognizes automobiles as objects, the parent instance is road vehicles (Road Vehicles), and its sub-instances include small-size car (Small-size Car), mid-size car (Mid-size Car), ambulance (Ambulance), and so on. Each sub-instance may in turn serve as a parent instance with its own sub-instances. For example, the sub-instances of the mid-size car may include Kia K5 and Hyundai Sonata. In this embodiment, a Google crawler is used for image collection, so each instance is named in English.
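The tree of parent instances and sub-instances described for FIG. 4 can be modeled with a small recursive data structure. The following is a minimal sketch; the class name `Instance` and the helper `build_example_ontology` are hypothetical, and the tree contains only the nodes named in the text rather than the full ontology of FIG. 4.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Instance:
    """One node of the ontology; parent and sub-instance are relative roles."""
    name: str
    children: List["Instance"] = field(default_factory=list)

    def add(self, name: str) -> "Instance":
        """Attach a sub-instance and return it so the tree can be built fluently."""
        child = Instance(name)
        self.children.append(child)
        return child

    def leaves(self) -> Iterator["Instance"]:
        """Yield the most specific instances, which make useful search keywords."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.leaves()


def build_example_ontology() -> Instance:
    """Build only the part of the road-object tree that the text names."""
    root = Instance("Road Objects")
    vehicles = root.add("Road Vehicles")
    vehicles.add("Small-size Car")
    mid_size = vehicles.add("Mid-size Car")
    mid_size.add("Kia K5")
    mid_size.add("Hyundai Sonata")
    vehicles.add("Ambulance")
    return root
```

Walking `leaves()` on this tree yields "Small-size Car", "Kia K5", "Hyundai Sonata", and "Ambulance" in that order.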

The image collection module (113) receives an instance list from the ontology building module (111) and collects images corresponding to the instances on that list. In this embodiment the images are collected using a Google crawler, but the invention is not limited to this, and a crawler of any suitable structure may be used. A Google image search is used here because its result pages share the same structure, so the crawling service can be designed around that page structure.

The image collection module (113) performs image searches using the instances in the instance list as keywords. Because Google image search currently returns at most 400 items per search term, the number of search results can be expanded by defining sub-instances for a given object, so that enough training data for that object is secured. As described above, small car, mid-size car, large car, fire truck, and the like can be used as search terms (instances) to search for images of automobile objects, and Kia K5, Hyundai Sonata, Ford Taurus, BMW 5, and the like can be used as sub-instances of the mid-size car. The images returned by searching with each sub-instance can be stored under the parent instance, in which case the parent instance is defined as a class for classification.

FIG. 5 is an example screen showing images that can be found using a single instance in Google image search, and FIG. 6 is an example screen showing images that can be found using a plurality of sub-instances belonging to the instance of FIG. 5. As FIGS. 5 and 6 show, when 'mid size car' is used as the search term to collect images containing mid-size cars, at most 400 images can be retrieved. When the list of mid-size car models is used as sub-instances, up to 400 images can be retrieved for each model, so more image data can be collected for training.
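The arithmetic behind this expansion is simple: with a cap of 400 results per keyword, each additional sub-instance keyword raises the upper bound by another 400. The sketch below illustrates this; the function names and the dictionary-based ontology are hypothetical, and the figure returned is only an upper bound, since different keywords can return the same image.

```python
from typing import Dict, List

MAX_RESULTS_PER_QUERY = 400  # per-keyword cap stated in the text


def expand_queries(ontology: Dict[str, List[str]], instance: str) -> List[str]:
    """Return the sub-instance keywords of `instance`, or the instance itself."""
    return ontology.get(instance) or [instance]


def max_collectable(queries: List[str], cap: int = MAX_RESULTS_PER_QUERY) -> int:
    """Upper bound on images retrievable with the given distinct keywords."""
    return cap * len(set(queries))
```

For the four mid-size car models named in the text, the bound rises from 400 images for the single keyword to 1,600 images for the expanded keyword list.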

The image collection module (113) fetches and analyzes the source code of the web page containing the images found by the image search, obtains the original URL of each image through web parsing, and stores the URLs in the image storage module (115).
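The web parsing step can be illustrated with Python's standard `html.parser` module. This is a sketch under a strong simplifying assumption: that each image URL appears directly in the `src` attribute of an `img` tag. The structure of a real search result page is not described in the specification and would need its own extraction rules. The names `ImageUrlParser` and `extract_image_urls` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class ImageUrlParser(HTMLParser):
    """Collect absolute http(s) URLs found in <img src="..."> attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        # Skip inline data URIs and tags without a usable src attribute.
        if src and src.startswith(("http://", "https://")):
            self.urls.append(src)


def extract_image_urls(page_source: str) -> List[str]:
    """Parse page source text and return image URLs in document order."""
    parser = ImageUrlParser()
    parser.feed(page_source)
    return parser.urls
```

Fetching the page source itself is left out so that the parsing logic stays independent of any particular HTTP client.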

The image storage module (115) receives the original URLs of the images collected by the image collection module (113), uses MongoDB to check for duplicate URLs, generates the images from the URLs, and stores them in HDFS (Hadoop Distributed File System), the Hadoop-based file system. The image storage module (115) may store the images for each sub-instance under its parent instance (or under the parent of that parent instance) according to the classification used for object learning. This embodiment is described as using MongoDB for the image storage module (115), but the invention is not limited to this, and any suitable database system may be used. HDFS used together with MongoDB is nevertheless useful because it processes large-scale data in a distributed manner, which ensures high speed, and its scalability makes a large amount of storage space available.
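The duplicate check and the storing of sub-instance images under a parent class can be sketched with an in-memory stand-in. The class below is not MongoDB or HDFS code; it uses a plain Python set and dictionary to show the logic only, and the name `UrlStore` is hypothetical.

```python
from typing import Dict, List, Set


class UrlStore:
    """In-memory stand-in for the URL database: rejects URLs already seen."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.by_class: Dict[str, List[str]] = {}

    def add(self, url: str, class_name: str) -> bool:
        """Record `url` under `class_name`; return False if it is a duplicate."""
        if url in self._seen:
            return False
        self._seen.add(url)
        self.by_class.setdefault(class_name, []).append(url)
        return True
```

Passing the parent instance name, for example "Mid-size Car", as `class_name` when adding results found with the keywords "Kia K5" or "Hyundai Sonata" groups all of those images under one training class.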

2.2 Object Image Learning Server

Next, the DNN learning server is described with reference to FIG. 7. FIG. 7 is a block diagram illustrating the configuration of the DNN learning server of a DNN-based object learning and recognition system according to an embodiment of the present invention.

As shown in FIG. 7, the DNN learning server (130) according to an embodiment of the present invention includes an image loading module (131), an image learning module (133), and a profile generation module (135). These modules correspond to the three broad stages into which the object image learning process is divided, with each module performing one stage.

First, the image loading module (131) performs an image loading step that retrieves the image data stored in HDFS for training. That is, it retrieves from HDFS the images crawled by the crawling server (110). The images are classified by class (that is, by parent instance), and the image loading module (131) loads the image data set for each class for training.

Second, the image learning module (133) performs an image learning step that trains on the image data sets loaded from HDFS. The image learning module (133) trains on the loaded images divided into predetermined classes (instances). Various libraries can be applied for image learning. In one embodiment the Caffe framework was used for training, but the invention is not limited to this, and any suitable learning library may be used. Caffe is a library designed with expressiveness, speed, and modularity in mind; it can be used directly from C++, and it also has well-implemented Python and MATLAB interfaces.

FIG. 8 illustrates the DNN structure and principle of a DNN-based object learning and recognition system according to an embodiment of the present invention. A DNN (Deep Neural Network) is a form of deep learning: an artificial neural network (ANN) with a plurality of hidden layers between the input layer and the output layer. Like ordinary artificial neural networks, deep neural networks can model complex non-linear relationships. DNN-based models may include CNN (Convolutional Neural Network), DCNN (Deep Convolutional Neural Network), RNN (Recurrent Neural Network), and RDNN (Recurrent Deep Neural Network) models. This embodiment is described using a CNN model as an example, but the invention is not limited to this, and any suitable modeling method may be used.

FIG. 9 is a diagram illustrating a convolutional neural network (CNN) used for DNN-based object learning, configured according to an embodiment of the present invention. As shown in FIG. 9, the CNN model according to an embodiment of the present invention may include 25 layers. The layers as a whole comprise an input layer 910, convolution layers 1 and 2 (920), convolution layers 3 and 4 (930), convolution layer 5 (940), FC (Fully Connected) layers 1 and 2 (950), and an output layer 960.

Convolution layers 1 and 2 (920) each include a convolution (Conv) layer 921, a ReLU layer 923, a pooling (Pool) layer 925, and an LRN (Local Response Normalization) layer 927; the same convolution block is repeated twice.

The convolution (Conv) layer 921 multiplies a small filter, used for analyzing the image, against a few pixels of the image at a time. It is defined by the filter size and by a stride value indicating the interval at which the filter is applied. The ReLU layer 923 multiplies pixels that play an important role by a value proportional to their importance, thereby giving each pixel a different weight. The pooling layer 925 merges the pixels of a given region, either by selecting the most important (highest-weighted) pixel in the region as its representative value or by taking the average of the pixels in the region; it reduces the feature map and thus makes the network lighter. The LRN layer 927 is a normalization layer that prevents overflow when a network is built from many layers (commonly described as a deep network), as with convolution layer 1 (920).
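The convolution, ReLU, and pooling operations described above can be sketched roughly as follows. This is a minimal illustration that assumes the common textbook definitions (a filter slid over the image with a stride and no padding, ReLU as max(0, x), and non-overlapping max pooling); the function names, the 2x2 filter, and the toy 5x5 input are hypothetical and are not taken from the patent or from Caffe.

```python
def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` with the given stride (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(0, len(image) - kh + 1, stride):
        row = []
        for c in range(0, len(image[0]) - kw + 1, stride):
            # Multiply the filter with the patch under it and sum the products.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


def relu(feature_map):
    """Keep positive responses and set negative responses to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Replace each non-overlapping size x size region by its largest value."""
    out = []
    for r in range(0, len(feature_map) - size + 1, size):
        out.append([max(feature_map[r + i][c + j]
                        for i in range(size) for j in range(size))
                    for c in range(0, len(feature_map[0]) - size + 1, size)])
    return out


# Toy 5x5 "image" and a hypothetical 2x2 filter.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 0],
    [2, 1, 0, 1, 1],
    [0, 3, 1, 0, 2],
]
kernel = [[1, 0], [0, -1]]
features = max_pool(relu(conv2d(image, kernel, stride=1)), size=2)
```

A larger stride makes the filter skip positions, so the convolution output shrinks; pooling then shrinks the feature map again, which is the reduction in network size that the paragraph describes.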

The layers that make up convolution layers 3 and 4 (930) and convolution layer 5 (940) are the same as above, so their description is omitted.

FC layers 1 and 2 (950) each include an FC layer 951, a ReLU layer 953, and a dropout layer 955. The FC layer 951 applies the matching-and-multiplying operation of the convolution to all pixels of the image. The dropout layer 955 excludes some nodes from training so that the training result does not become biased toward a single outcome. For example, suppose there are images A, B, C, and D of a car, where A, B, and C are used for training and D is used for recognition. Without a dropout layer, images A, B, and C are recognized as a car, but D, which was not trained on, may fail to be recognized as a car. With a dropout layer, D can also be recognized as a car.
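The idea of excluding some nodes during training can be sketched as below. The patent does not state which dropout formulation is used; this sketch assumes the widely used "inverted dropout" variant, in which each activation is zeroed with probability `rate` during training and the survivors are rescaled, while recognition passes activations through unchanged. The function and parameter names are illustrative.

```python
import random


def dropout(activations, rate, training, rng=random):
    """Inverted dropout over a flat list of activations (illustrative sketch).

    rate     -- probability of dropping each node, 0 <= rate < 1
    training -- True while learning, False while recognizing
    rng      -- any object with a random() method returning a float in [0, 1)
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At recognition time every node participates.
        return list(activations)
    keep = 1.0 - rate
    # Dropped nodes output 0; kept nodes are scaled so the expected sum is unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed only to make the example repeatable
train_out = dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, training=True, rng=rng)
infer_out = dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, training=False)
```

Because a different random subset of nodes is silenced on every training pass, no single node can dominate the result, which is the effect the car example above attributes to the dropout layer.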

Finally, the output layer 960 includes an FC layer 961, a softmax layer 963, and an output layer 965. The softmax layer 963 extracts the label (object name) whose recognition result has the highest value.
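As a rough sketch of how a softmax stage can turn raw scores into a single label, the following assumes the standard softmax definition followed by picking the highest-probability class. The label names and score values are hypothetical and are not taken from the patent.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(scores, labels):
    """Return the label with the highest softmax probability and that probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = ["car", "bus", "bicycle"]  # hypothetical class names
label, confidence = top_label([2.0, 0.5, -1.0], labels)
```

Here `label` is the class with the largest score, and `confidence` is its softmax probability, which can serve as the reported recognition accuracy for that image.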

The image learning module 133 generates the final DNN model by training on images with the 25-layer network described above. The image learning module 133 may train on the images repeatedly until the recognition rate for the objects in the training images falls within a predetermined error-rate range. For example, training may be repeated until the recognition error rate for objects in the images is 3% or less, that is, until the recognition rate reaches 97% or more. As another example, the image learning module 133 may train on the images repeatedly until the number of training iterations reaches a predetermined count. For example, the image learning module 133 may strengthen the training by iterating the 25-stage network 100,000 times.
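The two stopping criteria described above (a target error rate or a fixed number of iterations) can be sketched as a simple loop. This is an assumption-laden illustration: `train_step` is a hypothetical callable standing in for one training iteration that returns the current error rate, and `toy_step` is a made-up error curve, not a measurement from the patent.

```python
def train_until(train_step, target_error=None, max_iterations=None):
    """Repeat training until the error target or the iteration limit is reached.

    Returns (iterations_run, last_error, reason).
    """
    if target_error is None and max_iterations is None:
        raise ValueError("at least one stopping criterion is required")
    iteration = 0
    error = float("inf")
    while True:
        if max_iterations is not None and iteration >= max_iterations:
            return iteration, error, "max_iterations"
        error = train_step(iteration)  # one pass of learning; returns error rate
        iteration += 1
        if target_error is not None and error <= target_error:
            return iteration, error, "target_error"


def toy_step(i):
    """Stand-in for a real training iteration: the error shrinks as i grows."""
    return 1.0 / (i + 2)


by_error = train_until(toy_step, target_error=0.03)    # stop at <= 3% error
by_count = train_until(toy_step, max_iterations=10)    # stop after 10 iterations
```

Both criteria can also be passed together, in which case training stops at whichever is reached first.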

Finally, once training is complete, the profile generation module 135 generates a DNN model profile so that the model can be applied to the image recognition system. The DNN model profile is a bundle of parameter values for constructing the network that training found to give the best result. It includes, for example, the number of nodes in the input layer, the number of hidden layers, the number of nodes in each hidden layer, the number of nodes in the output layer, the momentum, the learning rate, the weights, and the batch size. The DNN model profile may also include version information for the distributed profile, so that the profile can be updated. The DNN model profile generated in this way is distributed so that multiple on-board systems can perform object recognition.
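One way to picture such a profile is as a small record that can be serialized for distribution. The patent does not specify a file format, so the JSON encoding, the class name `DnnModelProfile`, the field names, and every numeric value below are assumptions made purely for illustration.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class DnnModelProfile:
    """Hypothetical container for the parameters listed in the description."""
    version: str
    input_nodes: int
    hidden_layer_nodes: list      # one entry per hidden layer
    output_nodes: int
    momentum: float
    learning_rate: float
    batch_size: int
    weights: list = field(default_factory=list)

    def to_json(self):
        """Serialize the profile for distribution to on-board computers."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text):
        """Rebuild a profile from its serialized form."""
        return cls(**json.loads(text))


# All values here are made up for the example.
profile = DnnModelProfile(
    version="1.0.0",
    input_nodes=4,
    hidden_layer_nodes=[3, 2],
    output_nodes=2,
    momentum=0.9,
    learning_rate=0.01,
    batch_size=64,
    weights=[[0.1, -0.2], [0.3, 0.4]],
)
restored = DnnModelProfile.from_json(profile.to_json())
```

Carrying the version string inside the profile lets a receiver compare it against the profile it already holds before rebuilding its model.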

2.3 On-Board Object Recognition

Next, a method of recognizing an object on the on-board computer is described with reference to FIG. 9. FIG. 9 is a diagram illustrating the configuration of the on-board computer of the DNN-based object learning and recognition system according to an embodiment of the present invention.

The on-board computer 170 of the present invention includes a DNN model construction module 171 that builds a DNN model from the DNN model profile generated by the DNN learning server 130, an object recognition module 173 that recognizes objects in an input image using the constructed DNN model, and an object recognition result logging module 175 that logs object recognition results so that they can be fed back.

The DNN model construction module 171 in turn includes a profile downloader 176, a profile interpreter 177, and a DNN model generator 178. The profile downloader 176 downloads the DNN model parameters from the DNN learning server 130 over the network 150. In one embodiment, the profile downloader 176 may download the DNN model parameters using the TCP/IP protocol, but the invention is not limited thereto, and the DNN model parameters may be downloaded from the DNN learning server 130 using any suitable communication network and wired or wireless communication.

The DNN model profile that arrives at the on-board computer over TCP/IP passes through the profile downloader 176 to the profile interpreter 177. The profile interpreter 177 analyzes the DNN model profile downloaded by the profile downloader 176 and adopts the values contained in the profile as network parameters. The DNN model generator 178 assembles the network according to the node counts and layer order determined by the profile interpreter 177, and applies the momentum, learning rate, batch size, weights, and so on as the parameters needed for image analysis (object recognition), thereby generating a DNN model that analyzes images and recognizes objects.
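The interpret-then-generate flow described above might look roughly like the following. This is only a sketch: it assumes the profile arrives as JSON text with the hypothetical field names shown, and it represents the "model" as a plain description of layer shapes rather than a real network built with Caffe.

```python
import json

# Hypothetical field names; the patent lists the parameters but not their encoding.
REQUIRED_KEYS = ("input_nodes", "hidden_layer_nodes", "output_nodes",
                 "momentum", "learning_rate", "batch_size")


def interpret_profile(profile_text):
    """Parse a downloaded profile and check that the expected fields exist."""
    params = json.loads(profile_text)
    missing = [k for k in REQUIRED_KEYS if k not in params]
    if missing:
        raise ValueError(f"profile is missing fields: {missing}")
    return params


def generate_model(params):
    """Lay out the layers in order, using the node counts from the profile."""
    sizes = [params["input_nodes"], *params["hidden_layer_nodes"],
             params["output_nodes"]]
    layers = [{"name": f"layer{i}", "inputs": n_in, "outputs": n_out}
              for i, (n_in, n_out) in enumerate(zip(sizes, sizes[1:]), start=1)]
    return {
        "layers": layers,
        "momentum": params["momentum"],
        "learning_rate": params["learning_rate"],
        "batch_size": params["batch_size"],
        "weights": params.get("weights", []),
    }


# Made-up profile text standing in for what the downloader would deliver.
downloaded = json.dumps({
    "input_nodes": 4, "hidden_layer_nodes": [3, 2], "output_nodes": 2,
    "momentum": 0.9, "learning_rate": 0.01, "batch_size": 64,
})
model = generate_model(interpret_profile(downloaded))
```

Separating interpretation from generation means a malformed or incomplete profile is rejected before any model is built from it.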

The object recognition module 173 of the on-board computer 170 then uses the DNN model to recognize each image arriving in real time from the camera as an object. Objects in an image are recognized in units of the classes defined by the object image learning server, and the accuracy and recognition speed are recorded by the object recognition result logging module 175 so that they can later be used as feedback for analyzing and evaluating the recognition results.
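Recording accuracy and recognition speed for later feedback could be sketched as follows. The class `RecognitionLog`, the function `recognize_and_log`, and the idea of passing the model as a callable that returns a (label, confidence) pair are all assumptions for illustration; the patent does not describe the logging format.

```python
import time


class RecognitionLog:
    """Hypothetical in-memory log of recognition results."""

    def __init__(self):
        self.records = []

    def record(self, label, confidence, elapsed_ms):
        self.records.append(
            {"label": label, "confidence": confidence, "elapsed_ms": elapsed_ms})

    def mean_latency_ms(self):
        """Average recognition time over all records (0.0 if the log is empty)."""
        if not self.records:
            return 0.0
        return sum(r["elapsed_ms"] for r in self.records) / len(self.records)


def recognize_and_log(model, image, log, clock=time.perf_counter):
    """Run `model` on `image`, timing the call and logging the outcome.

    `model` is any callable returning (label, confidence); `clock` returns seconds.
    """
    start = clock()
    label, confidence = model(image)
    elapsed_ms = (clock() - start) * 1000.0
    log.record(label, confidence, elapsed_ms)
    return label


log = RecognitionLog()
# A trivial stand-in model that always answers "car" with 0.98 confidence.
result = recognize_and_log(lambda img: ("car", 0.98), image=None, log=log)
```

Passing the clock in as a parameter keeps the timing logic testable without depending on real elapsed time.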

3. Example

In this specification, as one example, an Intel Xeon E5 was used as the CPU of the DNN learning server, as shown in Table 1 below. For the GPU, which is the key to training performance, four GeForce GTX 1080 cards were used, and 128GB of RAM was installed to maximize performance. Training was carried out on the Ubuntu 14.04 operating system using the Caffe library.

The on-board computer that performs object recognition used an i7-class CPU and, likewise, a GeForce GTX 1080 GPU. It had 16GB of RAM, and the object recognition experiments were run in an Ubuntu 14.04 environment.

Table 1

DNN learning server                  On-board computer
CPU      Intel Xeon E5               CPU   Intel Core i7
GPU      GeForce GTX 1080 x 4        GPU   GeForce GTX 1080
RAM      128GB                       RAM   16GB
OS       Ubuntu 14.04                SSD   225GB
Library  CUDA, OpenCV, Caffe         OS    Ubuntu 14.04

3.1 Object Learning Experiment on Image Big Data

In this embodiment, about 65,000 image files collected in HDFS for 17 classes were used for training. The training images of the collected instances ranged from 640x480 to 1920x1080, resolutions that are in common use. Training was strengthened by running 100,000 iterations. The convolution network was configured with 25 stages, and a 25-stage convolution network was able to classify up to 500 classes.

3.2 Image Recognition Experiment

FIG. 10 shows image object recognition results of the DNN-based object learning and recognition system according to an embodiment of the present invention. In this embodiment, an experiment was performed in which the DNN model generated by the DNN-based object learning and recognition method was used to recognize objects in arbitrary images retrieved from the Internet. As FIG. 10 shows, objects were recognized with an accuracy of 98% or more, and the recognition time was about 50 ms (20 FPS).
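The relationship between the reported per-image recognition time and the frame rate is a simple reciprocal, sketched here for reference. The helper name is illustrative.

```python
def fps_from_latency_ms(latency_ms):
    """Frames per second achievable if each frame takes `latency_ms` milliseconds."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


fps = fps_from_latency_ms(50)  # 1000 ms per second / 50 ms per frame
```

With the 50 ms figure reported above, this gives 20 frames per second, matching the value stated in the experiment.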

4. Conclusion

In one embodiment of the present invention, various images were collected by ontology category through a Chrome-based test automation system. For efficient training, deep learning (DNN) technology on a Caffe-based framework was proposed, together with a method for analyzing diverse image data and finally recognizing objects. Building and testing the system confirmed, through the experimental results, that object recognition via a model profile using the Caffe library, as proposed in the present invention, is both accurate and fast.

As described above, according to the present invention, a DNN model for object recognition is trained at the server level, a DNN model profile is generated from the trained DNN model and distributed, and the on-board computer uses this DNN model profile to build a DNN model and then applies input images to the DNN model to recognize objects, so that objects can be recognized in images quickly and accurately. In addition, by constructing an ontology and collecting images according to it, images can be gathered at big-data scale and used as the basis for training, so that the error rate of object recognition can be reduced.

110: Crawling server
130: DNN learning server
150: Network
170: On-board computer
190: Search server

Claims

A method for learning and recognizing an object using a deep neural network (DNN) in a server, the method comprising:
receiving a plurality of images collected by a crawling process;
training a DNN model based on the collected images;
generating a DNN model profile from the trained DNN model; and
transmitting the generated DNN model profile to an on-board computer.

The method of claim 1, further comprising:
loading, at the on-board computer, the DNN model profile from the server;
constructing a DNN model from the DNN model profile; and
recognizing an object included in an input image by applying the DNN model to the input image.

The method of claim 1, wherein collecting the plurality of images comprises:
constructing an ontology for classifying images;
collecting a plurality of images for each instance according to the constructed ontology; and
storing the collected images.

The method of claim 3, wherein:
the ontology further includes, with each instance taken as an upper instance, a plurality of lower instances for the upper instance;
collecting the plurality of images for each instance comprises collecting a plurality of images for the plurality of lower instances; and
storing the collected images comprises storing the plurality of images for the lower instances in the upper instance.

The method of claim 1, wherein training the DNN model comprises training on the images repeatedly until the recognition rate for the objects included in the images falls within a predetermined error-rate range.

The method of claim 1, wherein training the DNN model comprises training on the images repeatedly until the number of training iterations for the images reaches a predetermined count.

The method of claim 1, wherein the DNN model is a convolutional neural network model.

The method of claim 1, wherein the DNN model profile includes the momentum, learning rate, weights, and batch size of the DNN model.

A computer-readable storage medium comprising instructions that, when executed by a computing device, cause the computing device to perform the DNN-based object learning and recognition method according to any one of claims 1 to 9.

A system for learning and recognizing an object using a deep neural network (DNN), the system comprising:
a crawling server that collects a plurality of images over a network;
a DNN learning server that trains on the images collected by the crawling server to generate a DNN model and extracts a DNN model profile from the generated DNN model; and
an on-board computer that recognizes an object included in an input image using the DNN model profile from the DNN learning server.

The system of claim 10, wherein the on-board computer includes a DNN model construction module that builds the DNN model from the DNN model profile, the DNN model construction module comprising:
a profile downloader that downloads the DNN model profile from the DNN learning server;
a profile interpreter that interprets the DNN model profile downloaded by the profile downloader; and
a DNN model generator that generates a DNN model based on the interpreted profile.

The system of claim 10, wherein the crawling server comprises:
an ontology construction module that constructs an ontology including instances for classifying images;
an image collection module that collects a plurality of images for each instance according to the ontology constructed by the ontology construction module; and
an image storage module that stores the collected images.

The system of claim 12, wherein:
the ontology further includes, with each instance taken as an upper instance, a plurality of lower instances for the upper instance;
the image collection module collects a plurality of images for the plurality of lower instances; and
the storage module stores the plurality of images for the lower instances in the upper instance.

The system of claim 10, wherein the DNN learning server trains on the images repeatedly until the recognition rate for the objects included in the images falls within a predetermined error-rate range.

The system of claim 10, wherein the DNN learning server trains on the images repeatedly until the number of training iterations for the images reaches a predetermined count.

The system of claim 10, wherein the DNN model profile includes the momentum, learning rate, weights, and batch size of the DNN model.