KR20200066125A - Method and device for estimating similarity of video

Info

Publication number: KR20200066125A
Application number: KR1020190025744A
Authority: KR
Inventors: 이영구; 김태연; 아즈헐 우딘 엠디; 이창주
Original assignee: 경희대학교 산학협력단
Priority date: 2018-11-30
Filing date: 2019-03-06
Publication date: 2020-06-09
Also published as: KR102158095B1

Abstract

According to the present invention, a method for predicting video similarity comprises: receiving, from a video database, first video data and reference video data to be compared with the first video data; distributing and storing the first video data as a plurality of pieces of second video data; a preprocessing step of extracting a first frame from the second video data; detecting an edge mask having a preset size in the first frame; extracting features from the area of the edge mask to generate first feature vectors; combining one or more of the first feature vectors generated in the first frame to generate a second feature vector corresponding to the first frame; identifying, in the reference video data, a second frame corresponding to the first frame and third and fourth frames adjacent to the second frame; and comparing the first frame with each of the second to fourth frames to measure similarity.

Description

Method and Device for Estimating Similarity of Video

The present invention relates to a method and apparatus for predicting the similarity of videos, and more particularly to a method and apparatus that predict similarity more quickly by extracting feature vectors from video using a distributed processing system.

With the recent development of Internet technology and the rapid spread of smart devices, vast amounts of video have become easier to access. Techniques for detecting and recognizing objects in video have been steadily applied to video surveillance systems and have emerged as a fundamental problem in computer vision and machine understanding. Because a video generally consists of surfaces and objects arranged in many different ways, separating only the objects from a video can be a key issue in applications such as map production, robot navigation, and place recommendation.

Various feature extraction methods are used to detect and recognize feature vectors in video, but most of them recognize objects in still images, and only some can be applied to video. A representative existing object detection method is LBP (Local Binary Pattern). Many LBP variants have been studied to improve its robustness, discriminative power, and applicability. However, they share a critical weakness: the binary code can change completely with pixel changes caused by lighting, viewpoint, and the like, so no consistent pattern is produced and the method is vulnerable to noise.

Republic of Korea Registered Patent No. 10-1716646, publication date July 18, 2014

The present invention is intended to solve the problems described above, and one object is to extract feature vectors of video data in order to predict the similarity between the video data and reference video data.

Another object of the present invention is to speed up computation by using a distributed processing system when predicting similarity in video data.

Another object of the present invention is to improve the accuracy of similarity prediction by using a plurality of frames included in the reference video data as the reference frames against which the similarity of a first frame extracted from the video data is computed.

Another object of the present invention is to improve the accuracy of feature vector extraction by using an edge mask when extracting the feature vector of the first frame extracted from the video data.

To achieve these objects, the present invention is characterized by comprising: receiving, from a video database, first video data and reference video data to be compared with the first video data; distributing and storing the first video data as a plurality of pieces of second video data; a preprocessing step of extracting a first frame from the second video data; detecting an edge mask having a preset size in the first frame; generating a first feature vector by extracting features from the area of the edge mask; generating a second feature vector corresponding to the first frame by combining one or more of the first feature vectors generated in the first frame; identifying, in the reference video data, a second frame corresponding to the first frame and third and fourth frames adjacent to the second frame; and measuring similarity by comparing the first frame with each of the second to fourth frames.

The preprocessing step is characterized by comprising: extracting a first frame from the second video data; converting the first frame to grayscale; resizing the first frame; and extracting the foreground from the first frame.

Furthermore, a Gaussian normal distribution is used to extract the foreground from the first frame.

The edge mask is characterized by comprising a center pixel located at the center of the edge mask and a plurality of neighboring pixels adjacent to the center pixel.

Furthermore, generating the first feature vector is characterized by comprising: computing difference values between the center pixel and the neighboring pixels of the edge mask; computing the average of the difference values; and comparing each difference value with the average, assigning 1 if the difference value is greater than the average and 0 otherwise, to generate the first feature vector.

The second to fourth frames are characterized by being consecutive with one another.

Furthermore, measuring the similarity is characterized by applying cosine similarity to the first frame and the second to fourth frames.

The present invention is also characterized by comprising: a video receiving unit that receives, from a video database, first video data and reference video data to be compared with the first video data; a data partitioning unit that distributes and stores the first video data as a plurality of pieces of second video data; a preprocessing unit that extracts a first frame from the second video data; a control unit that detects an edge mask having a preset size in the first frame, generates a first feature vector by extracting features from the area of the edge mask, and generates a second feature vector corresponding to the first frame by combining one or more of the first feature vectors generated in the first frame; and a similarity measurement unit that identifies, in the reference video data, a second frame corresponding to the first frame and third and fourth frames adjacent to the second frame, and measures similarity by comparing the first frame with each of the second to fourth frames.

Furthermore, the preprocessing unit is characterized by extracting a first frame from the second video data, converting the first frame to grayscale, resizing the first frame, and extracting the foreground from the first frame.

A Gaussian normal distribution is used to extract the foreground from the first frame.

Furthermore, the edge mask is characterized by comprising a center pixel located at the center of the edge mask and a plurality of neighboring pixels adjacent to the center pixel.

The control unit is characterized by comprising a first feature vector generation unit that computes difference values between the center pixel and the neighboring pixels of the edge mask, computes the average of the difference values, and compares each difference value with the average, assigning 1 if the difference value is greater than the average and 0 otherwise, to generate the first feature vector.

Furthermore, the second to fourth frames are characterized by being consecutive with one another.

The similarity measurement unit is characterized by applying cosine similarity to the first frame and the second to fourth frames.

According to the present invention as described above, feature vectors of video data can be extracted in order to predict the similarity between the video data and reference video data.

The present invention can also speed up computation by using a distributed processing system when predicting similarity in video data.

The present invention can also improve the accuracy of similarity prediction by using a plurality of frames included in the reference video data as the reference frames against which the similarity of the first frame extracted from the video data is computed.

The present invention can also improve the accuracy of feature vector extraction by using an edge mask when extracting the feature vector of the first frame extracted from the video data.

FIG. 1 is a diagram showing the overall operation of a similarity prediction method according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the configuration of a similarity prediction apparatus according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining a similarity prediction method according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating a method of extracting a first feature vector according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a frame comparison method according to an embodiment of the present invention.
FIG. 6 is a graph showing computation speed as a function of the number of compared frames according to an embodiment of the present invention.
FIG. 7 shows sample video data for each category of the UCF YouTube data set.
FIG. 8 is a graph showing the accuracy of the similarity measured on the UCF YouTube data set using the similarity prediction apparatus.
FIG. 9 is a graph comparing the feature extraction algorithm used in the similarity prediction apparatus according to an embodiment of the present invention with existing feature extraction algorithms.
FIG. 10 is a graph showing the time required as a function of the number of nodes in a Spark cluster according to an embodiment of the present invention.

The above objects, features, and advantages are described in detail below with reference to the accompanying drawings, so that a person of ordinary skill in the art to which the present invention pertains can readily practice its technical idea. In describing the present invention, detailed descriptions of related known technologies are omitted where they would unnecessarily obscure the gist of the invention.

In the drawings, the same reference numerals denote the same or similar elements, and all combinations described in the specification and claims may be combined in any manner. Unless otherwise specified, a reference to the singular may include one or more, and a singular expression may also include the plural.

The terminology used herein is for describing specific exemplary embodiments only and is not intended to be limiting. Singular expressions used herein may also include plural meanings unless the sentence clearly indicates otherwise. The term "and/or" includes any one of, and all combinations of, the associated listed items. The terms "comprises", "comprising", "including", "having", and the like are inclusive: they specify the stated features, integers, steps, operations, elements, and/or components, and do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The steps, processes, and operations of the methods described herein are not to be construed as necessarily requiring performance in the particular order discussed or illustrated, unless an order of performance is specifically identified. It should also be understood that additional or alternative steps may be used.

In addition, each component may be implemented as its own hardware processor, the components may be integrated into a single hardware processor, or the components may be combined and implemented across a plurality of hardware processors.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

The present invention can extract feature vectors from video data in order to evaluate video similarity. A representative existing method for extracting feature vectors from images is LBP (Local Binary Pattern). LBP is an algorithm developed to classify image textures and is also used in various image recognition fields such as object recognition. An LBP value is computed for every pixel in an image; it is an index value that expresses, as a binary number, the relative brightness changes in the 3x3 region around the pixel. In other words, LBP computes a local binary pattern.

Specifically, within a 3x3 region LBP compares the center pixel with each of its eight neighboring pixels and assigns 1 if the neighboring pixel's value is greater than the center pixel's value and 0 otherwise, yielding a binary value. For example, if the center pixel has a value of 50 and its neighboring pixels have values {65, 90, 40, 125, 35, 15, 70, 5}, the LBP binary values are {1, 1, 0, 1, 0, 0, 1, 0}, so the final binary pattern is 11010010 in base 2.
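The LBP comparison described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text; the function name `lbp_code` and the convention of reading the bits left to right as a binary number are assumptions made for the example.

```python
def lbp_code(center, neighbors):
    """Compute the LBP bits and their integer value for one 3x3 neighborhood.

    `neighbors` holds the eight neighboring pixel values in a fixed order.
    A neighbor strictly greater than the center yields 1, otherwise 0.
    """
    bits = [1 if n > center else 0 for n in neighbors]
    # Read the bits left to right as a base-2 number (assumed convention).
    value = int("".join(str(b) for b in bits), 2)
    return bits, value


# The worked example from the text: center 50 and its eight neighbors.
bits, value = lbp_code(50, [65, 90, 40, 125, 35, 15, 70, 5])
print(bits, format(value, "08b"))
```

For the values in the text this reproduces the pattern 11010010.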

Because LBP encodes values relative to the pixels in the image, it yields a stable value regardless of overall brightness. However, the binary values can change completely depending on the intensities of the neighboring pixels, so LBP cannot extract a consistent binary pattern and is therefore vulnerable to noise.

FIG. 1 shows the overall operation of a similarity prediction method according to an embodiment of the present invention. Referring to FIG. 1, the similarity prediction method may use HDFS to store video data. HDFS (Hadoop Distributed File System) is Hadoop's data storage system. Hadoop is a Java-based open-source framework for distributed processing of large volumes of data: it stores data across multiple servers and processes the data concurrently on the servers where it is stored.

To extract the features contained in the video data, a Spark RDD may load the video data stored in HDFS. A Spark RDD (Resilient Distributed Dataset) serves as the storage within Spark. Spark is a general-purpose distributed cluster computing platform that performs computation across multiple distributed nodes. Spark is similar to Hadoop, but whereas Hadoop performs MapReduce operations on disk, which degrades performance, Spark performs them in memory and can therefore be faster than Hadoop.

In the present invention, when the Spark RDD loads the first video data from HDFS, the first video data may be divided into a plurality of pieces of second video data and stored in a distributed manner on the nodes available to Spark. Each node that stores second video data may preprocess it, extract feature vectors from the preprocessed video data, and store the extracted feature vectors in HDFS. Objects in the video data may then be identified by comparing the feature vectors stored in HDFS with the feature vectors of previously stored video data against which the video data is compared.
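The split-then-process-in-parallel idea can be illustrated without a cluster. The sketch below is not Spark code: it stands in for the nodes with a local thread pool, and `split_into_chunks`, `process_chunk`, and `process_distributed` are hypothetical names introduced only for this example. The per-chunk work is a placeholder for the preprocessing and feature extraction described in the text.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(frames, num_nodes):
    """Split the first video data (a list of frames) into contiguous chunks,
    one per node, with sizes differing by at most one frame."""
    base, extra = divmod(len(frames), num_nodes)
    chunks, start = [], 0
    for i in range(num_nodes):
        size = base + (1 if i < extra else 0)
        chunks.append(frames[start:start + size])
        start += size
    return chunks


def process_chunk(chunk):
    """Placeholder for the per-node work (preprocessing, feature extraction)."""
    return [f"features({frame})" for frame in chunk]


def process_distributed(frames, num_nodes):
    """Process every chunk concurrently and merge the results in frame order."""
    chunks = split_into_chunks(frames, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        results = list(pool.map(process_chunk, chunks))
    return [item for part in results for item in part]


print(split_into_chunks(list(range(10)), 3))
print(process_distributed(["f0", "f1", "f2"], 2))
```

In an actual deployment the chunks would be partitions of an RDD and the per-chunk function would run on the cluster nodes; the thread pool here only mimics that structure.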

Hereinafter, a similarity prediction apparatus according to an embodiment of the present invention is described with reference to FIG. 2. Referring to FIG. 2, the similarity prediction apparatus may include a video receiving unit (100), a preprocessing unit (200), a control unit (300), a data partitioning unit (400), a similarity measurement unit (500), and a storage unit (600). The control unit (300) may further include an edge detection unit (310) and a feature vector generation unit (330), and the storage unit (600) may further include a video data storage unit (not shown) and a feature vector storage unit (not shown).

The video receiving unit (100) may receive the first video data from HDFS. The data partitioning unit (400) may divide the first video data received by the video receiving unit (100) into a plurality of pieces of second video data and store them in a distributed manner in the video data storage unit (600). The video data storage unit (600) follows Spark RDD and consists of a plurality of nodes, so distributing the second video data across the nodes allows each piece of second video data to be processed concurrently. The modules that process the second video data at each node are described below.

The preprocessing unit (200) may preprocess the second video data stored in a distributed manner in the video data storage unit (600). More specifically, the preprocessing unit (200) may extract one or more first frames from the second video data. Because the first frame is in RGB, the preprocessing unit (200) may convert the first frame to grayscale and then resize it to a preset size. The target size may be 720 x 404, but other sizes may also be used.
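The grayscale conversion and resizing steps can be sketched as follows. The text does not say which conversion weights or which interpolation method are used, so the luma weights (0.299, 0.587, 0.114) and nearest-neighbor resizing below are assumptions chosen for simplicity; `to_grayscale` and `resize_nearest` are hypothetical helper names.

```python
def to_grayscale(frame):
    """Convert an RGB frame (rows of (r, g, b) tuples) to grayscale.

    Uses the common luma weights 0.299/0.587/0.114 (an assumption; the
    source does not specify the conversion).
    """
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in frame]


def resize_nearest(gray, new_w, new_h):
    """Resize a grayscale frame to new_w x new_h by nearest-neighbor sampling."""
    old_h, old_w = len(gray), len(gray[0])
    return [[gray[y * old_h // new_h][x * old_w // new_w]
             for x in range(new_w)]
            for y in range(new_h)]


# A tiny 2x2 RGB frame, converted and then enlarged to 4x4.
frame = [[(255, 255, 255), (0, 0, 0)],
         [(0, 0, 0), (255, 255, 255)]]
gray = to_grayscale(frame)
print(resize_nearest(gray, 4, 4))
```

For the size mentioned in the text, the call would be `resize_nearest(gray, 720, 404)`.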

Furthermore, the preprocessing unit (200) may remove the background in order to extract the foreground from the first frame, and may use a Gaussian normal distribution to do so. The Gaussian normal distribution is the most widely used continuous probability distribution and is commonly known simply as the normal distribution. It approximates the distribution of the collected data and is computed through Equation 1.

In Equation 1, C is the number of background images, and the other two terms are the intensity value of pixel (x, y) in the video data and the intensity value of pixel (x, y) in the background extracted from the video data.
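Since Equation 1 itself is not reproduced here, the following is not the patent's formula. It is a minimal sketch of one common way to use a per-pixel Gaussian model for foreground extraction: estimate a mean and standard deviation for each pixel from C background images, then mark a pixel as foreground when it deviates from the mean by more than `k` standard deviations. The threshold `k`, the floor `min_std`, and both function names are assumptions.

```python
import math


def background_model(backgrounds):
    """Per-pixel mean and standard deviation over C background images."""
    c = len(backgrounds)
    h, w = len(backgrounds[0]), len(backgrounds[0][0])
    mean = [[sum(b[y][x] for b in backgrounds) / c for x in range(w)]
            for y in range(h)]
    std = [[math.sqrt(sum((b[y][x] - mean[y][x]) ** 2 for b in backgrounds) / c)
            for x in range(w)]
           for y in range(h)]
    return mean, std


def foreground_mask(frame, mean, std, k=2.5, min_std=1.0):
    """Return 1 where the pixel deviates from the background model, else 0.

    `k` and `min_std` are illustrative values, not taken from the source.
    """
    return [[1 if abs(frame[y][x] - mean[y][x]) > k * max(std[y][x], min_std) else 0
             for x in range(len(frame[0]))]
            for y in range(len(frame))]


backgrounds = [[[10, 10], [10, 10]],
               [[12, 12], [12, 12]]]
mean, std = background_model(backgrounds)
print(foreground_mask([[11, 200], [12, 11]], mean, std))
```

Only the bright pixel at the top right is flagged as foreground in this toy input.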

The control unit (300) may detect an edge mask having a preset size in the preprocessed first frame and use the edge mask to generate a feature vector corresponding to the first frame. More specifically, the control unit (300) may include an edge detection unit (310) and a feature vector generation unit (330).

The edge detection unit (310) may use the Sobel operator to detect the edge mask in the first frame. The Sobel operator is a first-order derivative algorithm that detects the amount of change at a pixel by comparing the neighboring values around the center value of a 3x3 matrix. Using the Sobel operator, the edge detection unit (310) may detect edges, that is, the parts of the first frame where pixel brightness changes abruptly.

Because the first frame extracted from the video data is two-dimensional, the Sobel operator may be applied separately in the horizontal and vertical directions of the first frame. The edge detection unit (310) may detect the edge mask in the first frame through Equation 2.
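Equation 2 is not reproduced in this text, so the sketch below falls back on the standard 3x3 Sobel kernels, applied horizontally and vertically as the paragraph describes. Combining the two responses into a gradient magnitude, and leaving the one-pixel border at zero, are assumptions made for the example; `sobel_magnitude` is a hypothetical name.

```python
import math

# Standard Sobel kernels for the horizontal (x) and vertical (y) directions.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def sobel_magnitude(gray):
    """Return the gradient magnitude of a grayscale frame.

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out


# A frame with a sharp vertical edge between the second and third columns.
edge_frame = [[0, 0, 255, 255] for _ in range(3)]
print(sobel_magnitude(edge_frame)[1])
```

Pixels with a large magnitude mark abrupt brightness changes; a threshold on this value would be one way to decide where edge masks are taken.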

The edge mask generated by the edge detection unit (310) may include a center pixel located at the center of the edge mask and a plurality of neighboring pixels adjacent to the center pixel.

The feature vector generation unit (330) may generate a feature vector for the first frame using the edge mask extracted by the edge detection unit (310). The feature vector generation unit (330) may generate a first feature vector for an edge mask of a preset region and size in the first frame.

The feature vector generation unit (330) may compute the difference values between the center pixel and the neighboring pixels of the edge mask. When the edge mask contains a center pixel a and neighboring pixels {a1, a2, a3, a4, a5, a6, a7, a8}, the feature vector generation unit (330) may compute the difference values as {a1-a, a2-a, a3-a, a4-a, a5-a, a6-a, a7-a, a8-a}. Which neighboring pixel corresponds to a1 may vary with the user's settings; in this description, a1 is the neighboring pixel at position (1, 1) of the edge mask.

An embodiment is described with reference to FIG. 4. If the center pixel of the edge mask has a value of 32 and its neighboring pixels have values {80, 85, 95, 31, 51, 23, 22, 35}, the feature vector generation unit (330) may compute the difference values between the center pixel and the neighboring pixels as {48, 53, 63, -1, 19, -9, -10, 3}.

The feature vector generation unit (330) may compute the average of the difference values. More specifically, it may compute the average of the absolute values of the difference values. In the example above, the absolute values are {48, 53, 63, 1, 19, 9, 10, 3}, whose average is 25.75; the feature vector generation unit (330) may round this down to obtain 25.

The feature vector generation unit (330) may generate the first feature vector by comparing the computed average with each difference value, assigning 1 if the difference value is greater than the average and 0 otherwise.

In the example above, the feature vector generation unit (330) compares the average, 25, with the difference values {48, 53, 63, -1, 19, -9, -10, 3} and generates the first feature vector {1, 1, 1, 0, 0, 0, 0, 0}. The first feature vector generated by the feature vector generation unit (330) may differ depending on which neighboring pixel corresponds to a1.
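The first-feature-vector rule walked through above (signed differences, floored mean of their absolute values, then thresholding) can be checked with a short Python sketch. The function name `first_feature_vector` is introduced here for illustration only.

```python
import math


def first_feature_vector(center, neighbors):
    """Build the first feature vector for one edge mask.

    1. Signed differences between each neighbor and the center pixel.
    2. Mean of the absolute differences, rounded down.
    3. Bit is 1 where the signed difference exceeds that mean, else 0.
    """
    diffs = [n - center for n in neighbors]
    threshold = math.floor(sum(abs(d) for d in diffs) / len(diffs))
    return [1 if d > threshold else 0 for d in diffs]


# The FIG. 4 example: center 32 with its eight neighbors.
print(first_feature_vector(32, [80, 85, 95, 31, 51, 23, 22, 35]))
```

Unlike plain LBP, the threshold here adapts to the local contrast, so small fluctuations around the center value (such as 35 versus 32) do not flip a bit.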

The feature vector generator 330 may generate a second feature vector corresponding to the entire area of the first frame by using the one or more first feature vectors, each extracted for one region of the first frame. In other words, the feature vector generator 330 may merge the one or more first feature vectors extracted from the first frame to generate a second feature vector corresponding to the whole first frame.
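The description does not state how the first feature vectors are merged. As one possible reading, the sketch below simply concatenates the per-region vectors in order; both the concatenation rule and the name `second_feature_vector` are assumptions made for illustration.

```python
def second_feature_vector(first_vectors):
    """Merge per-region first feature vectors into one frame-level vector.

    Assumption: merging means concatenation in region order. The patent text
    only says the vectors are merged/combined.
    """
    merged = []
    for vec in first_vectors:
        merged.extend(vec)
    return merged


# Two hypothetical 8-bit region vectors from the same frame.
regions = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 1],
]
frame_vector = second_feature_vector(regions)
print(len(frame_vector))  # 16
```

Other merge rules, such as a histogram over the region codes, would fit the text equally well.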

The similarity measurement unit 500 may measure the similarity of the second image data using the second feature vector generated by the control unit 300, with the reference image data serving as the basis for comparison. Among the one or more frames included in the reference image data, the similarity measurement unit 500 may identify a second frame corresponding to a first frame included in the second image data. Furthermore, the similarity measurement unit 500 may identify, in the reference image data, third and fourth frames that neighbor the second frame. The second, third, and fourth frames may be consecutive frames. Comparing the first frame with the second to fourth frames, rather than with the second frame alone, prevents the similarity between the first frame and the second frame from being underestimated when the image data and the reference image data play at different speeds.

When the second frame is the first frame of the reference image data, the third frame may be the frame following the second frame, and the fourth frame may be the frame following the third frame. When the second frame is located in the middle of the reference image data, the third frame may be the frame preceding the second frame, and the fourth frame may be the frame following the second frame. When the second frame is the last frame of the reference image data, the third frame may be the frame preceding the second frame, and the fourth frame may be the frame preceding the third frame.
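The three cases above (first, middle, last) can be written as a small index-selection helper. The name `comparison_frames`, the zero-based indexing, and the requirement of at least three reference frames are assumptions of this sketch.

```python
def comparison_frames(second_idx, num_frames):
    """Return zero-based indices (second, third, fourth) in the reference video.

    Assumes the reference video has at least three frames.
    """
    if num_frames < 3:
        raise ValueError("reference video needs at least three frames")
    if not 0 <= second_idx < num_frames:
        raise IndexError("second_idx out of range")

    if second_idx == 0:
        # First frame: take the two following frames.
        third, fourth = 1, 2
    elif second_idx == num_frames - 1:
        # Last frame: take the two preceding frames.
        third, fourth = second_idx - 1, second_idx - 2
    else:
        # Middle frame: take the preceding and the following frame.
        third, fourth = second_idx - 1, second_idx + 1
    return second_idx, third, fourth


print(comparison_frames(0, 10))  # (0, 1, 2)
print(comparison_frames(4, 10))  # (4, 3, 5)
print(comparison_frames(9, 10))  # (9, 8, 7)
```

In every case the three selected indices form a run of consecutive frames, which matches the statement that the second to fourth frames are continuous.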

Referring to FIG. 5, the similarity measurement unit 500 may compute the similarity between the first frame of the image data and each of the second to fourth frames of the reference image data. The similarity measurement unit 500 may use cosine similarity for this purpose. Cosine similarity is a measure of how similar two vectors are, given by the cosine of the angle between them. The similarity measurement unit 500 may use the cosine similarity of Equation 3 to measure the similarity between the second feature vector of the first frame and the feature vector of each of the second to fourth frames.
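Equation 3 itself is not reproduced in this text. For reference, the standard definition of cosine similarity between two n-dimensional vectors is given below; the symbols A (second feature vector of the first frame) and B (feature vector of a reference frame) are chosen here for illustration and may differ from the notation of the original equation.

```latex
\mathrm{sim}(A, B) = \cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}} \; \sqrt{\sum_{i=1}^{n} B_i^{2}}}
```

For binary feature vectors such as those described above, the value lies between 0 and 1, with 1 indicating identical direction.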

The similarity measurement unit 500 may take, as the final similarity, the maximum of the similarities between the second feature vector of the first frame and the feature vectors of the second to fourth frames.
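A minimal sketch of this max-over-three-frames rule follows. The function names, the sample vectors, and the convention of returning 0.0 when either vector is all zeros are assumptions of the sketch; the patent does not address the zero-vector case.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for all-zero vectors
    return dot / (norm_a * norm_b)


def final_similarity(query_vec, reference_vecs):
    """Maximum cosine similarity between the query frame and the reference frames."""
    return max(cosine_similarity(query_vec, ref) for ref in reference_vecs)


# Hypothetical feature vectors: one query frame, three reference frames.
query = [1, 1, 1, 1, 0, 0, 0, 0]
references = [
    [1, 1, 0, 0, 0, 0, 0, 0],  # second frame
    [1, 1, 1, 1, 0, 0, 0, 0],  # third frame (identical to the query)
    [0, 0, 0, 0, 1, 1, 0, 0],  # fourth frame (no overlap with the query)
]
best = final_similarity(query, references)
print(round(best, 6))
```

Taking the maximum means a query frame that lines up with a neighboring reference frame, for example because of a playback-speed difference, still receives a high score.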

In addition, the similarity measurement unit 500 identifies the second to fourth frames, that is, three frames, as the frames to be compared with the first frame because the computation speed is best optimized when the number of comparison frames is three. This can be confirmed in FIG. 6.

Hereinafter, a similarity prediction method according to an embodiment of the present invention will be described with reference to FIG. 3. In describing the method, details that overlap with the similarity prediction system described above may be omitted.

FIG. 3 is a view for explaining a similarity prediction method according to an embodiment of the present invention. The similarity prediction apparatus that performs the method may be implemented as a server and is hereinafter referred to as the server for convenience of description.

Referring to FIG. 3, the server may load, from HDFS, first image data and reference image data to be compared with the first image data (S100). The server may divide the loaded first image data into a plurality of second image data (S200) and store them in a distributed manner across the nodes of the Spark RDD. Each node of the Spark RDD includes a Mapper and a Reducer. Hereinafter, for convenience of description, a node on which divided second image data is stored is referred to as a first node.
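The text does not specify how the first image data is divided. The sketch below illustrates one plausible scheme in plain Python, without Spark: splitting a frame sequence into contiguous, near-equal chunks, one per node. The function name `split_into_chunks` and the contiguous-chunk rule are assumptions.

```python
def split_into_chunks(frames, num_nodes):
    """Split a frame sequence into num_nodes contiguous chunks of near-equal size.

    Assumption: a contiguous, balanced split. In the described system this
    role is played by partitioning across Spark RDD nodes.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    base, extra = divmod(len(frames), num_nodes)
    chunks, start = [], 0
    for node in range(num_nodes):
        # The first `extra` nodes receive one additional frame.
        size = base + (1 if node < extra else 0)
        chunks.append(frames[start:start + size])
        start += size
    return chunks


# Ten frame indices stand in for the first image data; three nodes as in the experiment.
chunks = split_into_chunks(list(range(10)), 3)
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```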

The server may perform preprocessing (S300) in which it extracts a first frame from the second image data stored on each first node, converts the first frame to grayscale, resizes the converted first frame, and extracts the foreground of the first frame.

In the first frame extracted from the preprocessed second image data, the server may detect an edge mask having a preset position and size (S400). The server may use the Sobel operator to detect the edge mask.
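For readers unfamiliar with the Sobel operator, the sketch below computes the gradient magnitude of a small grayscale image using the standard 3x3 Sobel kernels. How the patent derives an edge mask of preset position and size from the Sobel response is not stated, so this shows only the operator itself; the function name and the zero-valued border are assumptions.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude of a 2-D grayscale image given as a list of rows.

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# A 4x4 image with a vertical edge between the second and third columns.
image = [[0, 0, 100, 100] for _ in range(4)]
edges = sobel_magnitude(image)
print(edges[1][1], edges[1][2])  # 400.0 400.0
```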

The server may generate a first feature vector using the detected edge mask (S500). The server may generate the first feature vector from the value of the center pixel of the edge mask and the values of the neighboring pixels adjacent to it. More specifically, the server may compute the difference between the center pixel value and each neighboring pixel value, and then compute the mean of the absolute values of those differences. The server may generate the first feature vector by comparing the computed mean with each of the differences.

The server may generate a second feature vector corresponding to the first frame using the one or more first feature vectors extracted from the first frame (S600). A first feature vector is a vector value for a specific region of the first frame, whereas the second feature vector is a vector value for the first frame as a whole.

To compute the similarity for the second feature vector of the first frame, the server may identify, among the one or more frames of the reference image data against which the first image data is compared, a second frame corresponding to the first frame. Furthermore, the server may identify third and fourth frames adjacent to the second frame in the reference image data. When the second frame is the first frame of the reference image data, the third frame may be the frame following the second frame, and the fourth frame may be the frame following the third frame. When the second frame is located in the middle of the reference image data, the third frame may be the frame preceding the second frame, and the fourth frame may be the frame following the second frame. When the second frame is the last frame of the reference image data, the third frame may be the frame preceding the second frame, and the fourth frame may be the frame preceding the third frame.

The server may compute the similarity between the second feature vector of the first frame and the feature vector of each of the second to fourth frames, and then set the maximum value as the similarity for the first frame (S700).

Hereinafter, experiments comparing the performance of the similarity prediction apparatus according to an embodiment of the present invention with existing methods, and their results, will be described with reference to FIGS. 7 to 10.

Hadoop and Spark clusters may be used for the experiments on the similarity prediction apparatus. The Spark cluster may include three nodes, each having four 3.0 GHz cores with 32 GB. The experiments use Hadoop version 2.7.1 and Spark version 1.6.2. The data used in the experiments is the UCF YouTube data set, which is divided into 11 categories: basketball shooting, biking, driving, golf swinging, horse-back riding, soccer juggling, swinging, tennis swinging, trampoline jumping, volleyball spiking, and walking with a dog. The UCF data set may include 125 image data items per category, for a total of 1,375 image data items. FIG. 7 shows sample image data for each category.

FIG. 8 is a graph showing the accuracy (mAP, mean average precision) of the similarity measured on the UCF YouTube data set using the similarity prediction apparatus. Referring to FIG. 8, the horse-back riding category shows the highest accuracy at 0.89, followed by the biking and tennis swinging categories at 0.85 and 0.84, respectively. The basketball shooting category shows the lowest accuracy at 0.72.

FIG. 9 is a graph comparing the feature extraction algorithm used in the similarity prediction apparatus with existing feature extraction algorithms. The existing algorithms used for comparison are LBP, MBP, LTP, MTP, Improved LBP, GLTP, DRLTP, Gradient LBP, and Gradient direction and LBP. Referring to FIG. 9, the feature extraction algorithm used in the similarity prediction apparatus achieves an accuracy of 0.806, which compares favorably with the existing algorithms. The DRLTP algorithm, however, may yield the most accurate result because it includes both edge information and texture information. Among the algorithms that do not include texture information, the feature extraction algorithm used in the similarity prediction apparatus is by far the most accurate.

FIG. 10 is a graph showing the time required as a function of the number of nodes in the Spark cluster. The present invention uses three nodes in the Spark cluster; as FIG. 10 shows, feature extraction takes the least time with three nodes.

The embodiments of the present invention disclosed in this specification and the drawings merely present specific examples to explain the technical content of the present invention and to aid understanding of it, and are not intended to limit the scope of the present invention. It will be apparent to those of ordinary skill in the art to which the present invention pertains that other modifications based on the technical idea of the present invention can be practiced in addition to the embodiments disclosed herein.

Claims

A similarity prediction method comprising:
receiving, from an image database, first image data and reference image data to be compared with the first image data;
distributing and storing the first image data as a plurality of second image data;
a preprocessing step of extracting a first frame from the second image data;
detecting, in the first frame, an edge mask having a preset size;
generating a first feature vector by extracting features of a region of the edge mask;
generating a second feature vector corresponding to the first frame by combining one or more of the first feature vectors generated in the first frame;
identifying, in the reference image data, a second frame corresponding to the first frame and third and fourth frames adjacent to the second frame; and
comparing the first frame with each of the second to fourth frames to measure similarities, and taking the maximum of the similarities as the final similarity.

The method of claim 1,
wherein the preprocessing step comprises:
extracting the first frame from the second image data;
converting the first frame to grayscale;
resizing the first frame; and
extracting a foreground from the first frame.

The method of claim 2,
wherein a Gaussian normal distribution is used to extract the foreground from the first frame.

The method of claim 1,
wherein the edge mask comprises:
a center pixel located at the center of the edge mask; and
a plurality of neighboring pixels adjacent to the center pixel.

The method of any one of claims 1 to 4,
wherein generating the first feature vector comprises:
computing difference values between the center pixel and the neighboring pixels of the edge mask;
computing a mean of the difference values; and
comparing each difference value with the mean and generating the first feature vector by assigning a value of 1 if the difference value is greater than the mean and a value of 0 otherwise.

The method of claim 1,
wherein the second to fourth frames are consecutive with one another.

The method of claim 1,
wherein measuring the similarity comprises
applying cosine similarity to the first frame and the second to fourth frames.

A similarity prediction system comprising:
an image receiving unit that receives, from an image database, first image data and reference image data to be compared with the first image data;
a data dividing unit that distributes and stores the first image data as a plurality of second image data;
a preprocessing unit that extracts a first frame from the second image data;
a control unit that detects, in the first frame, an edge mask having a preset size, generates a first feature vector by extracting features of a region of the edge mask, and generates a second feature vector corresponding to the first frame by combining one or more of the first feature vectors generated in the first frame; and
a similarity measurement unit that identifies, in the reference image data, a second frame corresponding to the first frame and third and fourth frames adjacent to the second frame, compares the first frame with each of the second to fourth frames to measure similarities, and takes the maximum of the similarities as the final similarity.

The system of claim 8,
wherein the preprocessing unit
extracts the first frame from the second image data, converts the first frame to grayscale, resizes the first frame, and extracts a foreground from the first frame.

The system of claim 9,
wherein a Gaussian normal distribution is used to extract the foreground from the first frame.

The system of claim 8,
wherein the edge mask comprises a center pixel located at the center of the edge mask and a plurality of neighboring pixels adjacent to the center pixel.

The system of any one of claims 8 to 11,
wherein the control unit comprises
a first feature vector generator that computes difference values between the center pixel and the neighboring pixels of the edge mask, computes a mean of the difference values, compares each difference value with the mean, and generates the first feature vector by assigning a value of 1 if the difference value is greater than the mean and a value of 0 otherwise.

The system of claim 8,
wherein the second to fourth frames are consecutive with one another.

The system of claim 8,
wherein the similarity measurement unit
applies cosine similarity to the first frame and the second to fourth frames.