KR102424814B1

KR102424814B1 - Apparatus and method for encoding Kinect video data

Info

Publication number: KR102424814B1
Application number: KR1020180005957A
Authority: KR
Inventors: 오중선; 김연우; 김태암; 조재형
Original assignee: 한국전력공사; (유)아홉
Priority date: 2018-01-17
Filing date: 2018-01-17
Publication date: 2022-07-26
Also published as: KR20190087764A

Abstract

The present invention relates to an apparatus and method for encoding Kinect image data. An apparatus for encoding Kinect image data according to an embodiment of the present invention includes: an offset correction unit for correcting an offset arising from differences in the range of depth image data input from a Kinect; a normalization processing unit for normalizing the depth values of the corrected depth image data so that they fall within the maximum representable bit range; and a depth image encoder for encoding the normalized depth image data and outputting a depth bit stream.

Description

Apparatus and method for encoding Kinect video data

The present invention relates to an apparatus and method for encoding Kinect image data and, more particularly, to an apparatus and method that reduce data loss by separating the Kinect image data obtained from a Kinect (i.e., RGB image data and depth image data) and applying an encoding scheme suited to each data type.

3D video is attracting attention as a next-generation multimedia content format and is expected to replace 2D video. For 3D video, depth information can be obtained directly from objects using the Kinect, an active-sensor-based device.

'Kinect' refers to a peripheral device connected to the Xbox 360 that lets users experience games and entertainment with their bodies, without a controller.

The Kinect represents objects in three dimensions with the center of the infrared camera as the origin. The Z axis is perpendicular to the image plane. The X axis is perpendicular to the Z axis and points from the infrared camera toward the laser projector. The Y axis is perpendicular to both the Z axis and the X axis.

The Kinect consists of an RGB camera, an infrared sensor, an infrared projector, and four microphones. The RGB camera acquires color information. The infrared projector emits infrared light, pixel by pixel, toward objects in front of the device, and the infrared sensor receives the reflected light to acquire distance information.

The sensors can provide a color view, a depth view representing the depth information of the image, and a skeleton view representing the skeleton of an object. The sensors detect 47 parts of the human body 30 times per second.

Depth image data represents, for each pixel, the relative distance between the Kinect and the object; this information expressed in image form is called a depth map. Pixels close to the camera are bright, that is, they have high values, and values decrease as the distance increases.

Referring to FIG. 1, which shows a frame of depth image data, each depth sample is represented by 16 bits in total: 3 bits form the player index, which is information for detecting human shapes, and 13 bits are depth bits. Of the 13 depth bits, 12 bits carry the depth of each pixel and 1 bit indicates whether the depth could not be measured.
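
As an illustration only, the 16-bit layout described above can be unpacked as sketched below. The text does not say which bit positions hold the player index, so the sketch assumes it occupies the 3 low-order bits and the 13 depth bits occupy the rest; the function name `unpack_depth_sample` is hypothetical.

```python
PLAYER_INDEX_BITS = 3   # stated in the text
DEPTH_BITS = 13         # stated in the text (12 depth bits + 1 validity bit)


def unpack_depth_sample(raw: int) -> tuple[int, int]:
    """Split one 16-bit sample into (player_index, depth_bits).

    Assumption: the player index sits in the 3 low-order bits and the
    13 depth bits sit above it. The source does not fix the bit order.
    """
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("raw sample must fit in 16 bits")
    player_index = raw & ((1 << PLAYER_INDEX_BITS) - 1)
    depth_bits = raw >> PLAYER_INDEX_BITS
    return player_index, depth_bits


# Build a sample with depth field 1234 and player index 2, then unpack it.
sample = (1234 << PLAYER_INDEX_BITS) | 2
print(unpack_depth_sample(sample))
```

The position of the single "measurement impossible" bit inside the 13 depth bits is not given, so the sketch leaves those 13 bits together.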

Depth maps play an important role in 3D video synthesis. Compressing them efficiently saves bits and consequently improves quality when video is transmitted, stored, and played back.

However, because 2D video codecs are not designed with algorithms that account for depth image data, no standardized method for encoding and decoding depth image data has yet been systematically established.

Accordingly, there is a need for a method of encoding and decoding 3D video that includes depth image data.

Korean Patent Registration No. 10-1603467 (registered March 8, 2016)

An object of the present invention is to provide an apparatus and method for encoding/decoding Kinect image data that reduce data loss by separating the Kinect image data obtained from a Kinect (i.e., RGB image data and depth image data) and applying an encoding scheme suited to each data type.

An apparatus for encoding Kinect image data according to an embodiment of the present invention may include: an offset correction unit for correcting an offset arising from differences in the range of depth image data input from a Kinect; a normalization processing unit for normalizing the depth values of the corrected depth image data so that they fall within the maximum representable bit range; and a depth image encoder for encoding the normalized depth image data and outputting a depth bit stream.

According to an embodiment, the apparatus may further include an RGB image encoder for performing encryption on the RGB image data input from the Kinect.

The depth image encoder may perform encoding using the H.265/HEVC codec.

The RGB image encoder may perform encoding using the H.264 codec.

The offset correction unit may set the minimum value of the range of the depth image data to zero.

The maximum representable bit depth may be 12 bits.

In addition, a method for encoding Kinect image data according to an embodiment of the present invention may include: correcting an offset arising from differences in the range of depth image data input from a Kinect; normalizing the depth values of the corrected depth image data so that they fall within the maximum representable bit range; and encoding the normalized depth image data and outputting a depth bit stream.

According to an embodiment, the method may further include performing encryption on the RGB image data input from the Kinect.

The present invention can reduce data loss by separating the Kinect image data obtained from a Kinect (i.e., RGB image data and depth image data) and applying an encoding scheme suited to each data type.

In addition, the present invention can compress depth image data without data loss by encoding it with the H.265/HEVC codec.

In addition, because the present invention enables active-sensor-based 3D image production, high-performance 3D images can be generated and stored at low cost, and 3D images can be synthesized from data.

In addition, the present invention compresses 2D video using the H.265/HEVC codec and additionally stores depth image data in an extended profile, so it can be widely used for data encoding in 3D video production.

In addition, the present invention can be used on mobile devices, where the storage overhead of 3D rendering is expected to be a limiting factor.

In addition, by adopting Monochrome 12, an extended profile of the H.265/HEVC codec, the present invention can hold depth image data without wasted space.

In addition, the present invention can be used efficiently in data-centric systems that extract and learn features using machine learning.

In addition, when mainly motion is processed, the present invention can represent three dimensions while reducing the amount of data to be processed, which makes data storage and processing easier to implement.

In addition, because the H.265/HEVC codec offers a better compression ratio than other codecs and can handle high-resolution video at a small data size, the present invention is advantageous when processing 3D video that includes depth image data.

FIG. 1 is a diagram showing a frame of depth image data;
FIG. 2 is a diagram showing a Kinect image data encoding apparatus according to an embodiment of the present invention;
FIG. 3 is a diagram showing a Kinect image data decoding apparatus according to an embodiment of the present invention;
FIG. 4 is a diagram showing a method for encoding/decoding Kinect image data according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. Detailed descriptions of well-known functions or configurations that could obscure the gist of the present invention are omitted from the following description and the drawings. Note also that identical components are denoted by the same reference numerals throughout the drawings wherever possible.

The terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. Based on the principle that an inventor may appropriately define terms in order to describe the invention in the best way, they should be interpreted with meanings and concepts consistent with the technical idea of the present invention.

Therefore, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent all of its technical ideas. It should be understood that various equivalents and modifications that could replace them may exist at the time of filing.

In the accompanying drawings, some components are exaggerated, omitted, or shown schematically, and the size of each component does not fully reflect its actual size. The present invention is not limited by the relative sizes or spacings drawn in the accompanying drawings.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless stated otherwise. When a part is said to be "connected" to another part, this covers not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element in between.

A singular expression includes the plural unless the context clearly indicates otherwise. Terms such as "include" or "have" specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood not to exclude in advance the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

The term "unit" as used in the specification means software or a hardware component such as an FPGA or ASIC, and a "unit" performs certain roles. However, a "unit" is not limited to software or hardware. A "unit" may be configured to reside in an addressable storage medium or to run on one or more processors. Thus, as an example, a "unit" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The functionality provided by components and "units" may be combined into fewer components and "units" or further separated into additional components and "units".

Embodiments of the present invention are described in detail below with reference to the accompanying drawings so that a person of ordinary skill in the art can readily practice them. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described here. To describe the present invention clearly, parts unrelated to the description are omitted from the drawings, and similar parts are given similar reference numerals throughout the specification.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings.

FIG. 2 is a diagram showing a Kinect image data encoding apparatus according to an embodiment of the present invention, and FIG. 3 is a diagram showing a Kinect image data decoding apparatus according to an embodiment of the present invention.

As shown in FIG. 2, the Kinect image data encoding apparatus 100a according to an embodiment of the present invention can reduce data loss by separating the Kinect image data obtained from the Kinect 10 (i.e., RGB image data and depth image data) and applying an encoding scheme suited to each data type.

As shown in FIG. 1, depth image data can be represented with values of up to 12 bits, and one way to compress it is to encode it with a video codec.

The Kinect 10 can generate RGB image data and a depth map at 640×480 pixels. The depth map is represented as data and must be encoded before it can be delivered as a stream or signal.
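
To give a sense of why encoding is needed, the following back-of-the-envelope sketch computes the uncompressed size of the depth map. The 640×480 resolution and the 16-bit sample size come from the text; the rate of 30 frames per second is an assumption taken from the sensing rate mentioned in the background, and the function names are hypothetical.

```python
WIDTH, HEIGHT = 640, 480   # depth map resolution stated in the text
BITS_PER_SAMPLE = 16       # 16-bit samples per the frame layout of FIG. 1
FRAMES_PER_SECOND = 30     # assumption: frame rate equals the 30 Hz sensing rate


def raw_depth_bytes_per_frame() -> int:
    """Uncompressed size of one depth frame in bytes."""
    return WIDTH * HEIGHT * BITS_PER_SAMPLE // 8


def raw_depth_bytes_per_second() -> int:
    """Uncompressed depth data rate in bytes per second."""
    return raw_depth_bytes_per_frame() * FRAMES_PER_SECOND


print(raw_depth_bytes_per_frame())    # 614400
print(raw_depth_bytes_per_second())   # 18432000
```

Under these assumptions the raw depth stream alone is roughly 18.4 MB per second, before the RGB data is counted.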

In general, video information is encoded using a codec. A codec is software or hardware that encodes and decodes a data stream.

The case described here encodes the RGB image data with the H.264 codec and the depth image data with the H.265/HEVC (High Efficiency Video Coding) codec. The invention is not limited to these codecs; MPEG-1, MPEG-2, MPEG-4, H.264/AVC, MVC, SVC, and others may also be applied.

In particular, the H.265/HEVC codec is a next-generation video compression technology developed jointly by ISO/IEC MPEG and the ITU-T video coding experts group, which also developed the earlier H.264 codec. H.265/HEVC defines many profiles: version 1 mainly comprises Main, Main 10, and Main Still Picture, and version 2 added 21 extended profiles. The extended profiles cover elements such as bit depth, 4:2:2/4:4:4 chroma sampling, multiview video coding (MVC), and scalable video coding (SVC).

Accordingly, the 12-bit depth image data is represented using an H.265/HEVC codec with the version 2 Monochrome 12 profile or a higher profile.

This is possible because the input value range of the depth image data can be represented within 12 bits.

In other words, compressing depth image data with the H.264 codec, which encodes 8 bits per pixel, can cause information loss. The usability of depth image data, however, depends on whether it can be decoded without loss after encoding, so a compression method without information loss is required. For example, a sign language recognition system can read hand gestures more reliably when it can decode precise depth image data, so it needs a compression method that is as close to lossless as possible.
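
The loss caused by squeezing 12-bit depth into 8 bits can be illustrated with the sketch below. It is not how any particular H.264 encoder treats its input; it simply assumes the 12-bit value is reduced to 8 bits by dropping the 4 low-order bits, and the function names are hypothetical.

```python
def to_8bit(depth12: int) -> int:
    """Reduce a 12-bit depth value to 8 bits by dropping the 4 low-order bits (assumed scheme)."""
    return depth12 >> 4


def back_to_12bit(depth8: int) -> int:
    """Expand an 8-bit value back to the 12-bit scale; the dropped bits cannot be recovered."""
    return depth8 << 4


original = 1234
restored = back_to_12bit(to_8bit(original))
print(original, restored)   # 1234 1232

# All 4096 possible 12-bit values collapse onto only 256 distinct 8-bit codes.
distinct_codes = len({to_8bit(v) for v in range(4096)})
print(distinct_codes)       # 256
```

Each 8-bit code stands for 16 different 12-bit depths, which is the kind of detail a gesture recognizer may need.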

For this reason, the depth image data is not encoded with the H.264 codec in the same way as the RGB image data. Instead, it is separated from the RGB image data and encoded with the H.265/HEVC codec.

Referring again to FIG. 2, to encode the depth image data with the H.265/HEVC codec, the Kinect image data encoding apparatus 100a must first bring the data into the format the codec expects. To this end, the apparatus 100a performs preprocessing that removes the offset, which depends on the Kinect version, and converts the data to fit 12 bits.

The Kinect image data encoding apparatus 100a includes an RGB image encoder 110a, a depth image preprocessor 210a, and a depth image encoder 220a.

The RGB image encoder 110a encodes the RGB image data input from the Kinect 10 and outputs an RGB bit stream. The RGB image encoder 110a applies the H.264 codec for this encoding.

The depth image preprocessor 210a preprocesses the depth image data input from the Kinect 10 so that it can be encoded with the H.265/HEVC codec. The depth image preprocessor 210a includes an offset correction unit 211 and a normalization processing unit 212.

The offset correction unit 211 corrects the offset arising from differences in the range of the depth image data, which depends on the Kinect version. That is, when the depth image data spans 0 to 4096 or 500 to 4500 depending on the Kinect version, the offset correction unit 211 shifts the offset to the zero point.

When the depth image data has a minimum depth value of 500, as in the range 500 to 4500, the offset correction unit 211 performs offset correction according to Equation 1 so that the depth values are referenced to zero.
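
The offset correction described above can be sketched as a simple subtraction of the range minimum. Equation 1 itself is not reproduced in this text, so the form below (subtract the minimum, clamp negative results to zero) is an assumption consistent with "setting the minimum value to zero"; `correct_offset` is a hypothetical name.

```python
def correct_offset(depths: list[int], range_min: int) -> list[int]:
    """Shift depth values so that range_min maps to 0.

    Assumption: values below range_min (for example, invalid samples)
    are clamped to 0; the source does not say how they are handled.
    """
    return [max(d - range_min, 0) for d in depths]


# A Kinect version whose depth range is 500 to 4500 becomes 0 to 4000.
print(correct_offset([500, 1500, 4500], range_min=500))   # [0, 1000, 4000]
```

For a version whose range already starts at 0, the same function with `range_min=0` leaves the values unchanged.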

After the offset correction unit 211 has performed the offset correction, the normalization processing unit 212 normalizes the corrected depth values so that they fall within the 12-bit range (i.e., 4096 values).

That is, the normalization processing unit 212 uses the values as they are when the maximum value does not exceed the 12-bit range (i.e., 4096 values), and normalizes them when the maximum value exceeds that range.

The normalization processing unit 212 normalizes the depth values according to Equation 2.
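
A minimal sketch of the normalization rule follows. Equation 2 is not reproduced in this text, so the linear rescaling used here is an assumption, not the patent's formula; only the pass-through condition (values that already fit in 12 bits are left unchanged) comes from the description. The name `normalize_depth` and its parameters are hypothetical.

```python
def normalize_depth(depths: list[int], range_max: int, bit_depth: int = 12) -> list[int]:
    """Fit offset-corrected depth values into `bit_depth` bits.

    If range_max already fits, values pass through unchanged (as the text states).
    Otherwise they are rescaled linearly with rounding, which is an ASSUMED
    stand-in for Equation 2.
    """
    max_code = (1 << bit_depth) - 1          # 4095 for 12 bits
    if range_max <= max_code:
        return list(depths)
    # Integer arithmetic with round-half-up avoids floating-point surprises.
    return [(d * max_code + range_max // 2) // range_max for d in depths]


print(normalize_depth([0, 1000, 4000], range_max=4000))   # fits: unchanged
print(normalize_depth([0, 2048, 4096], range_max=4096))   # rescaled into 0..4095
```

With the ranges given in the text, an offset-corrected 0 to 4000 range passes through untouched, while a 0 to 4096 range needs rescaling because 4096 is one step beyond the largest 12-bit code, 4095.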

The depth image encoder 220a encodes the depth image data normalized by the normalization processing unit 212 and outputs a depth bit stream. The depth image encoder 220a applies the H.265/HEVC codec for this encoding.

The RGB bit stream and the depth bit stream produced in this way can be stored as files or transmitted over a network.

Referring to FIG. 3, the Kinect image data decoding apparatus 100b decodes the RGB bit stream and the depth bit stream. That is, to play back the RGB bit stream and the depth bit stream or to extract information from them, the Kinect image data decoding apparatus 100b performs the process of the Kinect image data encoding apparatus 100a described above in reverse.

The Kinect image data decoding apparatus 100b includes an RGB image decoder 110b, a depth image decoder 220b, and a depth image post-processor 210b.
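
The text says only that the decoder reverses the encoder's steps. One way the depth image post-processor might undo the preprocessing is sketched below, under the assumption that the encoder subtracted the range minimum and, when needed, rescaled linearly to 12 bits; neither formula is given in this text, and `restore_depth` is a hypothetical name.

```python
def restore_depth(codes: list[int], range_min: int, range_max: int,
                  bit_depth: int = 12) -> list[int]:
    """Map decoded 12-bit codes back to the sensor's original depth range.

    Assumed inverse of: subtract range_min, then linearly rescale only if
    the corrected span exceeds the largest `bit_depth`-bit code.
    """
    max_code = (1 << bit_depth) - 1          # 4095 for 12 bits
    span = range_max - range_min
    restored = []
    for code in codes:
        if span > max_code:
            # Undo the assumed linear rescaling (rounded to nearest integer).
            code = (code * span + max_code // 2) // max_code
        restored.append(code + range_min)    # undo the offset correction
    return restored


print(restore_depth([0, 1000, 4000], range_min=500, range_max=4500))
print(restore_depth([0, 4095], range_min=0, range_max=4096))
```

When rescaling was applied, this inverse is only approximate, because several original depths can share one 12-bit code.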

The RGB bit stream can be decoded and played back on a screen, and the depth bit stream can be decoded and displayed as a grayscale image. For display, each pixel is converted to a value from 0 to 255.
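
The conversion to 0 to 255 for display can be sketched as a linear mapping from the 12-bit code range. The text does not specify the mapping, so linear scaling with rounding is an assumption, and `to_display_gray` is a hypothetical name.

```python
def to_display_gray(depth12: int) -> int:
    """Map a 12-bit depth code (0..4095) to an 8-bit gray level (0..255).

    Assumption: linear scaling with round-to-nearest; the source only says
    each pixel is converted to a value from 0 to 255.
    """
    if not 0 <= depth12 <= 4095:
        raise ValueError("expected a 12-bit value")
    return (depth12 * 255 + 2047) // 4095


print([to_display_gray(v) for v in (0, 2048, 4095)])   # [0, 128, 255]
```

This reduction is for viewing only; analysis such as feature extraction would work on the full 12-bit values.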

A sign language recognition system can decode the depth bit stream, extract feature vectors from it, and use them for machine learning.

FIG. 4 is a diagram showing a method for encoding/decoding Kinect image data according to an embodiment of the present invention.

The Kinect image data encoding apparatus 100a encodes the RGB image data and outputs an RGB bit stream (S101). The apparatus 100a uses the H.264 codec for this step.

At the same time, the Kinect image data encoding apparatus 100a performs offset correction on the depth image data (S201). In this step, the apparatus 100a shifts the minimum value of the depth image data range to the zero point.

Next, the Kinect image data encoding apparatus 100a normalizes the corrected depth image data (S202). In this step, the apparatus 100a normalizes the corrected depth values so that they fall within the maximum representable bit range (i.e., 12 bits).

The Kinect image data encoding apparatus 100a then encodes the depth image data and outputs a depth bit stream (S203). The apparatus 100a uses the H.265/HEVC codec for this step.
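
Before step S203, the normalized samples have to be handed to an encoder in some raw format. The text does not specify one; the sketch below assumes a common convention for high-bit-depth raw video, in which each 12-bit sample is stored in a 16-bit little-endian word, one monochrome plane per frame. The function name `pack_frame_16le` is hypothetical.

```python
import struct


def pack_frame_16le(samples: list[int], width: int, height: int) -> bytes:
    """Serialize one monochrome frame of 12-bit samples as 16-bit little-endian words.

    Assumption: the encoder accepts 12-bit samples in 16-bit containers;
    the source does not describe the encoder's input format.
    """
    if len(samples) != width * height:
        raise ValueError("sample count does not match frame size")
    if any(not 0 <= s <= 4095 for s in samples):
        raise ValueError("samples must be 12-bit values (run S201/S202 first)")
    return struct.pack(f"<{len(samples)}H", *samples)


# A tiny 2x2 frame for illustration; a real Kinect frame would be 640x480.
frame = pack_frame_16le([0, 4095, 256, 1], width=2, height=2)
print(frame.hex())   # 0000ff0f00010100
```

The range check makes the dependency on the preprocessing explicit: a sample above 4095 means offset correction or normalization was skipped.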

The method according to some embodiments may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known to and usable by those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like.

Although the above description has focused on the novel features of the present invention as applied to various embodiments, a person skilled in the art will understand that various deletions, substitutions, and changes in the form and details of the apparatus and method described above are possible without departing from the scope of the present invention. Accordingly, the scope of the present invention is defined by the appended claims rather than by the above description. All modifications within the range of equivalency of the claims fall within the scope of the present invention.

10: Kinect 110a: RGB image encoding unit
210a: depth image preprocessing unit 211: offset correction unit
212: normalization processing unit 220a: depth image encoding unit
110b: RGB image decoding unit 210b: depth image post-processing unit
220b: depth image decoding unit

Claims

A Kinect image data encoding apparatus comprising:
an offset correction unit configured to correct an offset for a difference in the range of depth image data input from a Kinect, the range being preset according to the Kinect version;
a normalization processing unit configured to perform normalization processing so that the depth values of the corrected depth image data fall within a maximum representable bit range; and
a depth image encoding unit configured to output a depth bitstream by performing an encoding process on the normalized depth image data,
wherein the offset correction unit sets the minimum value of the range of the depth image data to zero,
wherein the maximum representable bit depth is 12 bits,
wherein the normalization processing unit uses the corrected depth value as is when it does not exceed 12 bits, and performs the normalization processing when the depth value exceeds 12 bits, and
wherein the normalization processing is performed using the equation
[equation image not reproduced in this text]
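
The preprocessing recited in the claim above (offset correction followed by conditional normalization to 12 bits) can be illustrated with a minimal sketch. The claimed normalization equation is not reproduced in this text, so the linear rescaling below is an assumption made only for illustration and is not the patented formula. The function names, the per-range (rather than per-pixel) test against the 12-bit limit, and the sample range values are likewise hypothetical.

```python
MAX_BITS = 12
MAX_VALUE = (1 << MAX_BITS) - 1  # largest value representable in 12 bits (4095)


def correct_offset(depth, range_min):
    """Shift depth values so the minimum of the sensor range maps to zero."""
    return [d - range_min for d in depth]


def normalize(depth, span):
    """Fit offset-corrected depth values into 12 bits.

    If the corrected range already fits in 12 bits, the values are used as is.
    Otherwise they are rescaled linearly (ASSUMED formula, integer arithmetic).
    """
    if span <= MAX_VALUE:
        return list(depth)
    return [d * MAX_VALUE // span for d in depth]


def preprocess(depth, range_min, range_max):
    """Offset correction followed by conditional normalization."""
    shifted = correct_offset(depth, range_min)
    return normalize(shifted, range_max - range_min)


# Hypothetical sensor ranges, chosen only to exercise both branches.
print(preprocess([500, 4500], 500, 4500))  # span 4000 fits in 12 bits: [0, 4000]
print(preprocess([500, 8000], 500, 8000))  # span 7500 is rescaled: [0, 4095]
```

In this sketch the output of `preprocess` would then be handed to the depth image encoding unit; the encoder itself is outside the scope of the example.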

The apparatus of claim 1, further comprising:
an RGB image encoding unit configured to perform encryption on RGB image data input from the Kinect.

The apparatus of claim 1,
wherein the depth image encoding unit performs the encoding process using the H.265/HEVC codec.

The apparatus of claim 2,
wherein the RGB image encoding unit performs an encoding process using the H.264 codec.

(Deleted)

A Kinect image data encoding method comprising:
correcting an offset for a difference in the range of depth image data input from a Kinect, the range being preset according to the Kinect version;
performing normalization processing so that the depth values of the corrected depth image data fall within a maximum representable bit range; and
outputting a depth bitstream by performing an encoding process on the normalized depth image data,
wherein the correcting sets the minimum value of the range of the depth image data to zero,
wherein the maximum representable bit depth is 12 bits,
wherein a normalization processing unit uses the corrected depth value as is when it does not exceed 12 bits, and performs the normalization processing when the depth value exceeds 12 bits, and
wherein the normalization processing is performed using the equation
[equation image not reproduced in this text]

The method of claim 7, further comprising:
performing encryption on RGB image data input from the Kinect.