KR20200052400A

KR20200052400A - Advanced system and method for video classification

Info

Publication number: KR20200052400A
Application number: KR1020180126790A
Authority: KR
Inventors: 오병태; 홍진형
Original assignee: 한국항공대학교산학협력단
Priority date: 2018-10-23
Filing date: 2018-10-23
Publication date: 2020-05-15

Abstract

The present invention relates to an improved image classification system and method. The system comprises: a preprocessing device that analyzes the features of one or more input images and selects a subset of the features that can be found in common; and an image classification device that performs image classification using the images selected by the preprocessing device as input. According to the present invention, only part of the coding information of a compressed video is used for learning and classification. When images compressed with various parameters in various environments are learned and applied to machine learning, training does not have to be carried out separately for each parameter, so system complexity can be greatly reduced.

Description

Improved video classification system and method {ADVANCED SYSTEM AND METHOD FOR VIDEO CLASSIFICATION}

The present invention relates to an improved image classification system and method, and more particularly to an improved image classification system and method that, rather than using all of the various features of an image for classification, extracts only some images or some features so that the classification can be applied to a variety of situations.

The content described in this section merely provides background information on the present embodiment and does not constitute prior art.

Along with the rapidly developing information technology society, images and videos are used in many ways in everyday life. As images and videos have become closely tied to the daily lives of ordinary consumers, a wide variety of digital images are being acquired and used.

Image analysis technology refers to extracting and interpreting the necessary information by applying various kinds of image processing to the information contained in image data. As imaging technology has recently come into use across industry, demand for image analysis technology is steadily increasing. In particular, as image classification grows in importance, so do techniques for extracting image features.

Image classification is a technique that makes the content of an image easy to understand by analyzing the characteristics of each pixel and assigning it to one of several classes; it lets a computer, rather than human vision, determine what the actual objects in an image are. Image classification is performed by numerical analysis of a digital image, including relationships among one or more spatial and spectral bands that contain categories of information about particular characteristics.

A video platform needs to classify images and videos in order to provide services tailored to viewers' needs and tastes, and a robot needs to classify the images and videos obtained from its input devices in order to interact with humans. Image and video classification is also needed to analyze a team's tactics and strategy in international sporting events or to determine whether the opposing team has committed a foul. In addition, a security camera that records surveillance footage may continuously classify images and videos so that it can detect various situations in real time and send a warning signal to the user.

As described above, the use of image and video classification is growing rapidly with the rapid development of information technology and the diverse requirements of society, and the technology needs to advance in various fields.

In conventional video classification, when videos are classified for a specific purpose, the differently compressed standard formats must be distinguished and handled for each video, and all of the compression information must be used. As a result, a large amount of data has to be processed and the system becomes complex.

Accordingly, the present invention proposes an improved image classification system and method capable of overcoming the technical limitations described above.

Korean Registered Patent No. 10-1782339, published on August 8, 2017 (Title: Image analysis system for multi-object image analysis and provision of the results)
Korean Registered Patent Publication No. 10-1247220, published on September 19, 2012 (Title: Image processing method and apparatus using repetitive patterns)
Korean Laid-open Patent Publication No. 10-2012-0103284, published on September 19, 2012 (Title: Image processing method and apparatus using repetitive patterns)

(Non-patent document 1) SHANABLEH, Tamer. Detection of frame deletion for digital video forensics. Digital Investigation, 2013, 10.4: 350-360.
(Non-patent document 2) YU, Liyang, et al. Exposing frame deletion by detecting abrupt changes in video streams. Neurocomputing, 2016, 205: 84-91.

The present invention has been proposed to solve the problems of the prior art described above. Its main object is to provide an improved image classification system and method that uses only part of the coding information of a compressed video for learning and classification, so that when images compressed with various parameters in various environments are learned and applied to machine learning, training need not be performed separately for each parameter and system complexity can be greatly reduced.

Another object of the present invention is to provide an improved image classification system and method that, when images compressed with various compression standards are to be classified, can learn and apply in an integrated manner the parts used in common by most video compression schemes, thereby enabling generalized and unified video classification.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood from the following description by those of ordinary skill in the art to which the present invention pertains.

To achieve the above objects, one aspect of the present invention provides an improved image classification system comprising: a preprocessing device that analyzes the features of one or more input images and selects a subset of the features that can be found in common; and an image classification device that performs image classification using the images selected by the preprocessing device as input.

The image classification device may generate feature vectors by applying a machine learning technique to the images selected by the preprocessing device.

The preprocessing device may include: an image feature analysis unit that analyzes and extracts feature information of the images to be classified; and an image selection unit that, from among the features extracted by the image feature analysis unit that can be found in common across various kinds of videos, selects some images to be used for learning and classification in the image classification device.

The image feature analysis unit may analyze and extract features of the compression structure of the images to be classified.

The image feature analysis unit may classify the compression structure of the images to be classified as low-delay mode or random access mode, according to the compression structure within one GoP that can be configured when the video is compressed.

When the compression structure within the GoP is low-delay mode, the image selection unit may select some images according to the temporal order of the images.

When the compression structure within the GoP is random access mode, the image selection unit may select some images in consideration of the hierarchical structure of the images under the coding order.

The image feature analysis unit may extract, from the compression information generated when a video is encoded, the compression information of the basic unit blocks of the video encoding process, and the image selection unit may select only some of the partition depths in the compression information of the basic unit blocks.

When the applied compression scheme is HEVC, some images may be selected using the partition depth information of one or more of the CU (coding unit), PU (prediction unit), and TU (transform unit) from among the partition depth information in the compression information.
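
As a rough illustration of keeping only some partition depths, the following Python sketch turns a per-block CU depth map into one binary mask per retained depth. The depth-map layout, the function name `select_depths`, and the mask representation are assumptions made for illustration; the text does not specify how the selected depths are represented. Depth values 0 to 3 are assumed, as with a 64x64 coding tree unit split down to 8x8 CUs.

```python
def select_depths(depth_map, keep_depths):
    """Build one binary mask per retained CU depth.

    depth_map   : 2-D list of integer split depths (hypothetical layout,
                  one entry per minimum-size block).
    keep_depths : iterable of depths to retain; all others are discarded.
    Returns a dict mapping each retained depth to a 0/1 mask of the same shape.
    """
    masks = {}
    for depth in sorted(set(keep_depths)):
        masks[depth] = [[1 if value == depth else 0 for value in row]
                        for row in depth_map]
    return masks


if __name__ == "__main__":
    # Toy 4x4 depth map; the values are made up for the example.
    ctu_depths = [
        [0, 0, 1, 1],
        [0, 0, 1, 2],
        [2, 2, 3, 3],
        [2, 2, 3, 3],
    ]
    for depth, mask in select_depths(ctu_depths, {1, 3}).items():
        print(depth, mask)
```

Keeping separate masks lets a classifier take only the retained depths as input channels; other encodings, such as clipping the depth values, would serve equally well.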

Another aspect of the present invention provides an improved image classification method comprising: receiving images for image classification; analyzing and extracting features of the compression structure of the input images; selecting, according to the analyzed and extracted image features, a subset of the features that can be found in common across various kinds of videos; and performing learning and classification of videos using the selected images as input to a neural network.

The feature of the compression structure of the input images may be a classification into low-delay mode or random access mode according to the compression structure within one GoP that can be configured when the video is compressed, or may be the compression information of the basic unit blocks of the video encoding process from among the compression information generated when the video is encoded.

When the compression structure of the GoP is low-delay mode, some images may be selected according to the temporal order of the images.

When the compression structure of the GoP is random access mode, some images may be selected in consideration of the hierarchical structure of the images under the coding order.

When the feature of the compression structure of the input images is the compression information of the basic unit blocks of the video encoding process, only some of the partition depths in that compression information may be selected.

Another aspect of the present invention provides a computer-readable recording medium on which a program for executing the improved image classification method is recorded.

According to the improved image classification system and method of the present invention, only part of the coding information of a compressed video is used for learning and classification. Consequently, when images compressed with various parameters in various environments are learned and applied to machine learning, training need not be performed separately for each parameter, and system complexity can be greatly reduced.

In addition, when images compressed with various compression standards are to be classified, the parts used in common by most video compression schemes can be learned and applied in an integrated manner, which enables generalized and unified video classification.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the following description by those of ordinary skill in the art to which the present invention pertains.

The accompanying drawings, which are included as part of the detailed description to aid understanding of the present invention, provide embodiments of the present invention and, together with the detailed description, explain its technical features.
FIG. 1 is a diagram illustrating the configuration of an improved image classification system according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the configuration of a preprocessing device according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating the reference structure when the GoP is in low-delay mode.
FIG. 4 is a diagram illustrating the reference structure when the GoP is in random access mode.
FIG. 5 is a diagram illustrating a method of selecting some images in low-delay mode according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating a method of selecting some images in random access mode according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating an improved image classification method according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. The detailed description set forth below, together with the accompanying drawings, is intended to describe exemplary embodiments of the present invention and is not intended to represent the only embodiments in which the present invention may be practiced. The following detailed description includes specific details to provide a thorough understanding of the present invention. However, those of ordinary skill in the art to which the present invention pertains will appreciate that the present invention may be practiced without these specific details.

In some cases, to avoid obscuring the concept of the present invention, well-known structures and devices may be omitted or shown in block diagram form centered on the core functions of each structure and device.

Throughout the specification, when a part is said to "comprise" or "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. Terms such as "... unit", "...er", and "module" used in the specification denote a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software. Furthermore, "a" or "an", "one", "the", and similar terms may be used, in the context of describing the present invention (particularly in the context of the following claims), to cover both the singular and the plural unless otherwise indicated herein or clearly contradicted by context.

In describing the embodiments of the present invention, detailed descriptions of well-known functions or configurations are omitted where they would unnecessarily obscure the gist of the present invention. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary according to the intention or practice of users and operators. Their definitions should therefore be based on the content of this specification as a whole.

The components in the drawings of the present invention are shown independently to represent the different characteristic functions of the improved image classification system and method; this does not mean that each component consists of separate hardware or of a single software unit. That is, the components are listed separately for convenience of description: at least two of them may be combined into a single component, or one component may be divided into several components that together perform its function. Such integrated and separated embodiments of the components are also included in the scope of the present invention as long as they do not depart from its essence.

In addition, some components may not be essential components that perform the essential functions of the present invention but merely optional components for improving performance. The present invention may be implemented with only the components essential to realizing its essence, excluding those used merely to improve performance, and a structure that includes only the essential components and excludes the optional performance-improving components is also included in the scope of the present invention.

Hereinafter, embodiments of the present invention are described with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating the configuration of an improved image classification system according to an embodiment of the present invention.

The preprocessing device 100 receives a variety of videos as input, analyzes their features, and selects an appropriate subset of images so that the image classification device 200 can classify videos in more general situations.

Videos are used in many fields and take different forms depending on the situation. With the prior art, however, it is difficult to classify the many kinds of videos that arise in these situations. The preprocessing device 100 extracts and selects only a subset of the features that can be found in common across various kinds of videos, so that videos can be classified even in varied situations.

The detailed configuration of the preprocessing device 100 is described with reference to FIG. 2.

The image classification device 200 is a device that extracts features from the input images and classifies them.

In the image classification device 200 of the present invention, performance can be further improved by applying an algorithm based on a machine learning technique, such as deep learning, to form the feature vectors used for image classification. Machine learning algorithms include decision trees, Bayesian networks, support vector machines (SVMs), artificial neural networks, and deep learning. The support vector machine is a statistical learning method widely used for pattern classification; it is a supervised learning model for pattern recognition and data analysis, used mainly for classification and regression. In other words, a decision function is estimated from a training set of data and category labels using the probability distribution obtained during learning, and new data are then classified into one of two classes according to this function. Supervised learning is a method in which the computer is first taught from labeled information and then classifies information based on what it has learned. Deep learning is a machine learning technology built on artificial neural networks (ANNs) so that a computer can learn on its own from large amounts of data, much as a person does; it trains the machine to distinguish objects by imitating the way the human brain discovers patterns in large amounts of data and then tells objects apart. With deep learning, a computer can recognize, infer, and judge on its own even if a person does not specify every decision criterion.

In principle, a video is always compressed before it is recorded. The features that arise in the compression process can therefore be analyzed and used for video classification. In the prior art, all of the compression information of the video to be classified must be extracted, and the video is learned and classified by a classifier such as a neural network. According to an embodiment of the present invention, however, the preprocessing device 100 effectively reduces the amount of input image data, or combines features, before passing them to the image classification device 200 as input, which makes more efficient and effective image classification possible.

FIG. 2 is a diagram illustrating the configuration of a preprocessing device according to an embodiment of the present invention.

The image feature analysis unit 110 is a functional unit that analyzes and extracts feature information of the images to be classified, typically the features of their compression structure.

A video may be encoded with various GoP (Group of Pictures) sizes depending on its field of use.

A GoP is a unit of video data that bundles several consecutive pictures together, and it is the unit of video that allows access to and playback from an arbitrary point in time. A GoP can serve as the unit of video editing, and the compression ratio of a video is determined per GoP.

The GoP determines how many I-frames, P-frames, and B-frames are used when a video is compressed, and it strongly affects video quality, bit rate, and file size. An I-frame, also called a key frame, is the base frame of the compression and is a complete image in itself, whereas P-frames and B-frames carry only the information about the parts that have changed (moved) relative to the I-frame. P-frames and B-frames are therefore meaningless without an I-frame. A larger number of I-frames improves picture quality but raises the bit rate and increases the file size. Consequently, a video with a lot of motion and frequent scene changes has many I-frames when compressed, and so has a higher bit rate and a larger file size than a video with almost no motion (few I-frames). When setting the GoP size, a video with a lot of action can use a small GoP (few P-frames and B-frames per I-frame), and a video with little motion can use a large GoP (many P-frames and B-frames per I-frame).

To classify videos with such varying GoP sizes on the basis of GoP information, more general information needs to be constructed. For example, to classify a video with a GoP size of 12 and a video with a GoP size of 18 using the same algorithm or classifier, features of the GoP that both videos share must be used. In this case, the image feature analysis unit 110 may extract, within one GoP, some common images that can be used for both videos and use them for learning and classification.
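
The GoP 12 versus GoP 18 example can be sketched as follows. This is one possible reading only: it assumes that "common images" means picture offsets within a GoP that exist for every GoP size, capped at a chosen count. The function names `common_gop_offsets` and `frame_indices` and the parameter `max_frames` are hypothetical and do not come from the text.

```python
def common_gop_offsets(gop_sizes, max_frames):
    """Offsets within a GoP that are valid for every GoP size.

    Assumption: the first pictures of each GoP are the shared ones, so the
    result is limited by the shortest GoP and by max_frames.
    """
    shortest = min(gop_sizes)
    return list(range(min(shortest, max_frames)))


def frame_indices(num_frames, gop_size, offsets):
    """Map per-GoP offsets to absolute frame indices of one video."""
    return [start + offset
            for start in range(0, num_frames, gop_size)
            for offset in offsets
            if start + offset < num_frames]


if __name__ == "__main__":
    offsets = common_gop_offsets([12, 18], max_frames=4)
    print(frame_indices(36, 12, offsets))  # video encoded with GoP size 12
    print(frame_indices(36, 18, offsets))  # video encoded with GoP size 18
```

Both videos then contribute the same number of pictures per GoP, so a single classifier can take either as input.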

The image selection unit 130 is a functional unit that, from among the features extracted by the image feature analysis unit 110 that can be found in common across various kinds of videos, selects some images to be used for learning and classification in the image classification device 200.

According to an embodiment of the present invention, the image selection unit 130 may distinguish at least two cases, low-delay mode and random access mode, according to the compression structure within one GoP that can be configured when the video is compressed.

In the case of the HEVC (H.265) codec according to an embodiment of the present invention, the GoP can be configured in all-intra mode, low-delay mode, or random access mode depending on the video service application.

FIG. 3 is a diagram illustrating the reference structure when the GoP is in low-delay mode.

In low-delay mode, a single I picture is inserted only at the start of the video, and the decoding order of the subsequent reference pictures is kept the same as their output order. This mode is used in interactive applications such as video conferencing, which require low delay and do not need random access.

FIG. 4 illustrates the reference structure when the GoP is in random access mode.

Random access mode enables random access by periodically inserting IRAP (Intra Random Access Point) frames into the bitstream. Random access mode uses a hierarchical B-picture structure as its GoP structure.

As shown in FIG. 4, coding efficiency improves because B pictures in the GoP are allowed to reference other B pictures. The higher the layer, the more pictures a B picture can reference, so coding efficiency improves further. On the other hand, the hierarchical B-picture structure can introduce structural delay, because the coding order of the pictures differs from their output order.
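The difference between coding order and output order can be illustrated with a short sketch. It assumes a dyadic hierarchical-B GoP whose size is a power of two and a depth-first ordering in which the two anchor pictures are coded first and each midpoint is coded before the pictures around it. The function name `coding_order` and this particular ordering are illustrative assumptions and are not taken from FIG. 4.

```python
from typing import List


def coding_order(gop_size: int) -> List[int]:
    """Return output-order positions 0..gop_size in an assumed coding order.

    Assumption: dyadic hierarchy, so gop_size must be a power of two.
    """
    if gop_size < 1 or gop_size & (gop_size - 1) != 0:
        raise ValueError("gop_size must be a positive power of two")

    # The two anchor pictures that bound the GoP are coded first.
    order = [0, gop_size]

    def visit(lo: int, hi: int) -> None:
        # Code the midpoint of an interval, then recurse into both halves.
        if hi - lo < 2:
            return
        mid = (lo + hi) // 2
        order.append(mid)
        visit(lo, mid)
        visit(mid, hi)

    visit(0, gop_size)
    return order


print(coding_order(8))  # [0, 8, 4, 2, 1, 3, 6, 5, 7]
```

In this ordering, picture 8 has to be coded before pictures 1 to 7 even though it is output last, which is where the structural delay comes from.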

I pictures, P pictures, and B pictures are distinguished by the characteristics of each picture in the GoP. The compression ratios are 10:1 to 20:1 for I pictures, 20:1 to 30:1 for P pictures, and 30:1 to 50:1 for B pictures.

FIG. 5 illustrates a method of selecting some images in low-latency mode according to one embodiment of the present invention.

When the video is compressed in low-latency mode, some images can be selected according to the temporal order of the images.

The criterion for selecting some images is whether high performance can be achieved with little information, which can be determined from whether many bits of the compressed video were allocated to the frame in question. That is, relative to a predetermined bit rate, a frame above that rate can be included and a frame below it can be excluded.
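The threshold rule can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Frame` record, the function `select_by_bits`, the per-frame bit counts, and the threshold value are all hypothetical, and the comparison is made on bits per frame as a stand-in for the predetermined bit rate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """Hypothetical per-frame record; the field names are assumptions."""
    index: int       # position in output order
    frame_type: str  # "I", "P", or "B"
    bits: int        # bits the encoder allocated to this frame


def select_by_bits(frames: List[Frame], threshold_bits: int) -> List[Frame]:
    """Keep frames allocated more bits than the threshold; drop the rest."""
    return [f for f in frames if f.bits > threshold_bits]


# Made-up numbers for illustration only.
gop = [
    Frame(0, "I", 48_000),
    Frame(1, "P", 9_000),
    Frame(2, "P", 7_500),
    Frame(3, "P", 3_000),
]

selected = select_by_bits(gop, threshold_bits=5_000)
print([f.index for f in selected])  # [0, 1, 2]
```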

In video compression, the I frame is compressed by intra-picture coding and carries the most information among the picture types. As a result, its bit rate is the highest.

In low-latency mode, the error grows with distance from the I frame, so images temporally far from the I frame degrade performance when classifying the type of video. It is therefore preferable to select the images in the GoP that are temporally closest to the I frame.

That is, one or more frames temporally close to the I frame are selected within the GoP, and one or more frames are excluded, starting from the frame temporally farthest from the I frame.
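A minimal sketch of this selection follows. It assumes the GoP is given as a list of picture types in output order with exactly one I frame, and that the I frame itself is kept. The function name `select_near_i_frame` and the parameter `keep` are hypothetical.

```python
from typing import List


def select_near_i_frame(frame_types: List[str], keep: int) -> List[int]:
    """Return the positions of the `keep` frames closest in time to the I frame.

    Assumptions: frame_types is in output order and contains one "I" entry;
    at least one frame is kept and at least one is excluded.
    """
    if not 1 <= keep < len(frame_types):
        raise ValueError("keep must be at least 1 and smaller than the GoP size")
    i_pos = frame_types.index("I")
    # Sort positions by temporal distance to the I frame; ties go to the earlier frame.
    by_distance = sorted(range(len(frame_types)), key=lambda k: (abs(k - i_pos), k))
    return sorted(by_distance[:keep])


gop_12 = ["I"] + ["P"] * 11
gop_18 = ["I"] + ["P"] * 17

print(select_near_i_frame(gop_12, keep=4))  # [0, 1, 2, 3]
print(select_near_i_frame(gop_18, keep=4))  # [0, 1, 2, 3]
```

With the same `keep`, GoPs of size 12 and 18 yield the same positions, so one classifier input layout can serve both.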

FIG. 6 illustrates a method of selecting some images in random access mode according to one embodiment of the present invention.

When the video is compressed in random access mode, the images form a hierarchical structure according to the coding order, so some images can be selected with this hierarchical structure taken into account.

In random access mode as well, the criterion for selecting some images is the same as in low-latency mode: whether high performance can be achieved with little information, as determined from whether many bits of the compressed video were allocated to the frame in question.

In random access mode, owing to the hierarchical structure, images in the 'lower layers', which are generally allocated more bits, can be selected.

That is, one or more frames are included starting from the lowest layer, and one or more frames are excluded starting from the highest layer.
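The layer-based selection can be sketched as below. The sketch assumes a dyadic hierarchical-B structure in which the GoP size is a power of two and the layer of a picture follows from its position within the GoP. The helpers `temporal_layer` and `select_low_layers` and the cut-off `max_layer` are hypothetical names introduced for illustration.

```python
from typing import List


def temporal_layer(position: int, gop_size: int) -> int:
    """Layer of a picture in an assumed dyadic hierarchy (0 is the lowest layer)."""
    if gop_size < 1 or gop_size & (gop_size - 1) != 0:
        raise ValueError("gop_size must be a positive power of two")
    depth = gop_size.bit_length() - 1  # log2(gop_size)
    offset = position % gop_size
    if offset == 0:
        return 0  # anchor pictures sit in the lowest layer
    trailing_zeros = (offset & -offset).bit_length() - 1
    return depth - trailing_zeros


def select_low_layers(positions: List[int], gop_size: int, max_layer: int) -> List[int]:
    """Keep pictures whose layer is at most max_layer; drop the higher layers."""
    return [p for p in positions if temporal_layer(p, gop_size) <= max_layer]


positions = list(range(9))  # output-order positions 0..8, GoP size 8
print([temporal_layer(p, 8) for p in positions])      # [0, 3, 2, 3, 1, 3, 2, 3, 0]
print(select_low_layers(positions, 8, max_layer=1))   # [0, 4, 8]
```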

According to another embodiment of the present invention, the image feature analysis unit 110 extracts, from the compression information generated when a video is encoded, the compression information of the basic unit block of the video encoding process, and the image selection unit 130 may classify videos using only part of the partition depths of the block compression information.

The depth limit of block partitioning during encoding may differ depending on the intended use of the video. Therefore, if the block compression depth information is constructed using only some of the partition depths of the block compression information and is used for training and classification, a wider variety of videos can be classified. For example, if the training and classification data are built using the partition depth of the unit block only up to depth 2 (16 X 16), they can be used to classify both videos that use only up to depth 2 and videos partitioned down to depth 3 (8 X 8).
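One way to read the depth example is as a clipping step applied to a per-block depth map before it is fed to the classifier. The sketch below assumes a 64 X 64 root block, which is consistent with depth 2 being 16 X 16 and depth 3 being 8 X 8. The depth maps, `block_size`, and `clip_depths` are hypothetical.

```python
from typing import List

ROOT_SIZE = 64  # assumed size of the unsplit unit block (depth 0)


def block_size(depth: int) -> int:
    """Side length of a block at the given quadtree depth."""
    return ROOT_SIZE >> depth


def clip_depths(depth_map: List[List[int]], max_depth: int) -> List[List[int]]:
    """Limit every partition depth to max_depth so that differently
    encoded videos share one value range."""
    return [[min(d, max_depth) for d in row] for row in depth_map]


# Made-up depth maps: video A was encoded with depths up to 2, video B up to 3.
video_a = [[0, 1], [2, 2]]
video_b = [[0, 1], [3, 2]]

print(block_size(2), block_size(3))    # 16 8
print(clip_depths(video_a, max_depth=2))  # [[0, 1], [2, 2]]
print(clip_depths(video_b, max_depth=2))  # [[0, 1], [2, 2]]
```

After clipping, both maps use only depths 0 to 2, so a single classifier trained on that range can take either video as input.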

In HEVC, a slice is divided into largest coding units (LCUs), which can range in size from a minimum of 8 X 8 to a maximum of 64 X 64, and each LCU can be split freely and recursively in quadtree form. A unit obtained by this splitting is called a coding unit (CU), and CUs are further split into prediction units (PUs). After intra prediction or inter prediction is performed, the residual signal, which is the difference between the original signal and the prediction signal, is transformed. In H.264/AVC, the same transform size is applied to the entire macroblock (MB) according to the partition mode, whereas HEVC performs a further recursive split starting from the CU and applies a transform of the resulting transform unit (TU) size to the residual signal.

In embodiments of the present invention, the image feature analysis unit 110 uses the partition depth information of the CU, PU, or TU, and the image selection unit 130 can select videos using only part of that CU, PU, or TU partition depth information.

The embodiments above concerning data construction in the preprocessing device 100 for video training and classification are described independently, but they can be combined to improve image classification performance.

FIG. 7 illustrates an improved image classification method according to one embodiment of the present invention.

When an image to be classified is input (S701), feature information of the input image, typically the features of its compression structure, is analyzed and extracted (S702).

The image feature analysis may distinguish low-latency mode from random access mode according to the compression structure within one GoP that can be set when the video is compressed, or it may analyze the video using the compression information of the basic unit block of the video encoding process, taken from the compression information generated when the video is encoded.

According to the analyzed image features, only a subset of the features commonly found across various types of videos in various situations is selected, rather than using all of the compression information of the image to be classified (S705).

When the video is compressed in low-latency mode, some images can be selected according to the temporal order of the images.

Alternatively, videos may be classified using only part of the partition depths of the compression information of the basic unit block of the video encoding process, taken from the compression information generated when the video is encoded.

These methods of constructing the video data to be classified can be used alone or in combination to improve classification performance.

The subset of images selected in S705 is used as the input to a neural network to perform training and classification of the video (S707).

Although FIG. 7 describes steps S701 to S707 as being executed sequentially, this merely illustrates the technical idea of the present embodiment. A person of ordinary skill in the art to which this embodiment belongs could apply various modifications and variations, such as changing the order described in FIG. 7 or executing one or more of steps S701 to S707 in parallel, without departing from the essential characteristics of the embodiment. FIG. 7 is therefore not limited to a chronological order.

Combinations of the blocks in the block diagrams and the steps in the flowcharts attached to this specification may be carried out by computer program instructions. These computer program instructions may be loaded onto a processor of a general-purpose computer, special-purpose computer, or other programmable data processing equipment, so that the instructions executed by the processor create means for performing the functions described in each block of the block diagram or each step of the flowchart. The instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in that memory produce an article of manufacture containing instruction means that perform the functions described in each block of the block diagram or each step of the flowchart. The instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on the equipment to create a computer-executed process, and the instructions that run on the equipment provide steps for executing the functions described in each block of the block diagram and each step of the flowchart.

In addition, each block or step may represent a module, segment, or portion of code that includes one or more executable instructions for carrying out the specified logical function(s). It should also be noted that in some alternative embodiments the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially simultaneously, or they may sometimes be performed in reverse order, depending on the function involved.

The above description merely illustrates the technical idea of the present invention, and a person of ordinary skill in the art to which the present invention pertains could make various modifications and variations without departing from its essential characteristics. Accordingly, the embodiments disclosed herein are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims below, and all technical ideas within an equivalent scope should be construed as falling within the scope of rights of the present invention.

According to the improved image classification system and method of the present invention, only part of the coded information of a compressed video is used for training and classification. When images compressed with various parameters in various environments are learned and applied to machine learning, machine learning therefore does not have to be carried out separately for each parameter, and system complexity can be greatly reduced. Because the invention overcomes the limits of existing technology in this way, devices to which it is applied have sufficient prospects of being marketed or sold, beyond mere use of the related technology, and the invention can clearly be practiced in reality. It is therefore an invention with industrial applicability.

100: preprocessing device 110: image feature analysis unit 130: image selection unit 200: image classification device

Claims

An improved image classification system comprising:
a preprocessing device that analyzes features of one or more input images and selects some of the features commonly found in the input images; and
an image classification device that performs image classification processing using, as input, the images selected by the preprocessing device.

The improved image classification system according to claim 1,
wherein the image classification device generates a feature vector by applying a machine learning technique to the images selected by the preprocessing device as input.

The improved image classification system according to claim 1,
wherein the preprocessing device comprises:
an image feature analysis unit that analyzes and extracts feature information of the one or more input images to be classified; and
an image selection unit that selects, from among the features commonly found across the various types of videos extracted by the image feature analysis unit, some images to be used for training and classification in the image classification device.

The improved image classification system according to claim 3,
wherein the image feature analysis unit analyzes and extracts features of the compression structure of the image to be classified.

The improved image classification system according to claim 3,
wherein the image selection unit selects a frame if the image compressed in that frame is above a predetermined bit rate and excludes the frame if it is below that bit rate.

The improved image classification system according to claim 4,
wherein the image feature analysis unit classifies the features of the compression structure of the image to be classified into low-latency mode or random access mode, according to the compression structure within the GoP that can be set when a video is compressed.

The improved image classification system according to claim 6,
wherein, when the compression structure within the GoP is low-latency mode,
the image selection unit selects some images according to the temporal order of the input images.

The improved image classification system according to claim 7,
wherein the image selection unit selects one or more frames temporally close to the I frame within the GoP and excludes one or more frames starting from the frame temporally farthest from the I frame.

The improved image classification system according to claim 6,
wherein, when the compression structure within the GoP is random access mode,
the image selection unit selects some images in consideration of the hierarchical structure of the images according to the coding order of the input images.

The improved image classification system according to claim 9,
wherein the image selection unit includes one or more frames starting from the lowest layer and excludes one or more frames starting from the highest layer.

The improved image classification system according to claim 4,
wherein the image feature analysis unit extracts, from the compression information generated when the input image is encoded, the compression information of the basic unit block of the video encoding process, and
the image selection unit selects only part of the partition depths of the compression information of the basic unit block.

The improved image classification system according to claim 11,
wherein, when the applied compression scheme is HEVC, some images are selected using the partition depth information of one or more of the CU (coding unit), PU (prediction unit), and TU (transform unit) from among the partition depth information of the compression information.

An improved image classification method comprising:
receiving one or more images for image classification;
analyzing and extracting features of the compression structure of the input images;
selecting, according to the analyzed and extracted image features, some of the features commonly found across various types of videos; and
performing training and classification of videos using the selected images as the input to a neural network.

The improved image classification method according to claim 13,
wherein the feature of the compression structure of the input images is either a classification into low-latency mode or random access mode according to the compression structure within the GoP that can be set when a video is compressed, or the compression information of the basic unit block of the video encoding process from among the compression information generated when a video is encoded.

The improved image classification method according to claim 14,
wherein, when the compression structure of the GoP is low-latency mode,
some images are selected according to the temporal order of the input images.

The improved image classification method according to claim 14,
wherein, when the compression structure of the GoP is random access mode,
some images are selected in consideration of the hierarchical structure of the images according to the coding order of the input images.

The improved image classification method according to claim 14,
wherein, when the feature of the compression structure of the input images is the compression information of the basic unit block of the video encoding process from among the compression information generated when a video is encoded,
only part of the partition depths of that compression information of the basic unit block is selected.

A computer-readable recording medium on which is recorded a program for executing the improved image classification method according to any one of claims 13 to 17.