KR102413098B1

KR102413098B1 - Image processing method and image player using thereof

Info

Publication number: KR102413098B1
Application number: KR1020190063678A
Authority: KR
Inventors: 정준영; 윤국진
Original assignee: 한국전자통신연구원
Priority date: 2018-05-31
Filing date: 2019-05-30
Publication date: 2022-06-24
Also published as: KR20190136992A

Abstract

The present invention discloses a method and a playback apparatus for outputting an overlay that displays additional information on a 360-degree video. The image processing method according to the present invention may include decoding an overlay, and rendering the decoded overlay on a 360-degree video based on overlay-related information. The overlay-related information may include information indicating the number of overlays and information indicating a unique identifier assigned to each overlay. When a plurality of overlays exist, the identifiers assigned to the respective overlays may differ from one another.

Description

Image processing method and image reproducing apparatus using the same {IMAGE PROCESSING METHOD AND IMAGE PLAYER USING THEREOF}

The present invention relates to a method for outputting an overlay that displays additional information on a 360-degree video, and to a playback apparatus.

A 360-degree video may be defined as a video having a rotational degree of freedom about at least one reference axis. For example, a 360-degree video may have a rotational degree of freedom with respect to at least one of yaw, roll, or pitch. Because a 360-degree video contains considerably more image information than a 2D video, its file size is also significantly larger. Accordingly, video services that remotely play or stream a 360-degree video stored at a remote location are expected to be favored over video services that store the 360-degree video on a local device. Network-based processing of 360-degree video is therefore being actively discussed.

An object of the present invention is to provide a method and apparatus for rendering an overlay on a 360-degree video.

The present invention proposes a detailed structure of overlay-related metadata for rendering an overlay on a 360-degree video.

The present invention proposes various overlay-related information for rendering an overlay on a 360-degree video.

The technical problems to be solved by the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

The image processing method and image reproducing apparatus according to the present invention may decode an overlay and render the decoded overlay on a 360-degree video based on overlay-related information. The overlay-related information may include information indicating the number of overlays and information indicating a unique identifier assigned to each overlay. When a plurality of overlays exist, the identifiers assigned to the respective overlays may differ from one another.

In the image processing method and image reproducing apparatus according to the present invention, the overlay-related information may include region information for determining at least one of the size or the position of the region in which the overlay is to be rendered.

In the image processing method and image reproducing apparatus according to the present invention, the region information may include at least one of size information indicating the size of the region relative to the size of the current viewport, or position information indicating the position of the region within the current viewport.

In the image processing method and image reproducing apparatus according to the present invention, the overlay-related information may include information indicating the priority of the overlay.

In the image processing method and image reproducing apparatus according to the present invention, whether to decode the overlay may be determined based on the priority.

In the image processing method and image reproducing apparatus according to the present invention, the overlay-related information may include depth information indicating the depth of the region in which the overlay is to be rendered.

In the image processing method and image reproducing apparatus according to the present invention, the depth information may indicate the distance from the sphere surface to the region or the distance from the center point of the sphere to the region.

The features briefly summarized above are merely exemplary aspects of the detailed description of the invention that follows, and do not limit the scope of the invention.

According to the present invention, a method and apparatus for rendering an overlay on a 360-degree video can be provided.

According to the present invention, an overlay can be effectively rendered on a 360-degree video using overlay-related metadata.

According to the present invention, overlay output can be controlled based on various overlay-related information.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

FIG. 1 is a block diagram of a content generating apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram of a content reproducing apparatus according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a data processing procedure according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating how an overlay is output on a sphere surface.
FIG. 5 is a diagram for explaining an example in which a decoding order is determined based on overlay priority.
FIG. 6 is a diagram illustrating an example in which the depth of an overlay and the depth of a 360-degree video are set differently.

Since the present invention may be modified in various ways and may have various embodiments, specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. In describing the drawings, like reference numerals are used for like elements.

Terms such as first and second may be used to describe various elements, but the elements should not be limited by these terms. The terms are used only to distinguish one element from another. For example, without departing from the scope of the present invention, a first element may be referred to as a second element, and similarly, a second element may be referred to as a first element. The term "and/or" includes a combination of a plurality of related listed items or any one of a plurality of related listed items.

When an element is described or shown as being "connected" or "coupled" to another element, it should be understood that it may be directly connected or coupled to that other element, but that other elements may also be present in between.

The terms used in the present application are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In the present application, terms such as "comprise" or "have" are intended to specify the presence of the features, numbers, steps, operations, elements, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the accompanying drawings. The same reference numerals are used for the same elements in the drawings, and repeated descriptions of the same elements are omitted.

The content generating apparatus and the content reproducing apparatus described in the present invention are apparatuses that process data related to 360-degree video.

FIG. 1 is a block diagram of a content generating apparatus according to an embodiment of the present invention.

Referring to FIG. 1, the content generating apparatus 100 may include an image preprocessing unit 110, an audio encoding unit 120, a video encoding unit 130, and an encapsulation unit 140.

When audio and video of the real world are captured by an audio sensor and a camera device, a video signal and an audio signal are input to the content generating apparatus 100. The camera device captures 360-degree video, and the 360-degree video may cover all directions around a center point. To generate a 360-degree video, the video signal may include a plurality of images for a plurality of directions.

The image preprocessing unit 110 generates a two-dimensional image on which the video encoding unit 130 performs video encoding. Specifically, the image preprocessing unit 110 may stitch the plurality of images and project the stitched image onto a sphere. The image preprocessing unit 110 may then generate a projected picture by unfolding the spherical image data into two dimensions based on a projection format. The projection format may include at least one of equirectangular projection (ERP), cubemap projection, truncated pyramid projection (TPP), or segmented sphere projection (SSP).
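As a minimal illustration of what unfolding the sphere into two dimensions means for the equirectangular format, the sketch below maps a direction on the sphere to a position in the projected picture. The function name `sphere_to_erp` and the angle convention (yaw increasing to the left, pitch increasing upward, direction (0, 0) at the picture center) are assumptions made for this example and are not taken from the patent.

```python
from typing import Tuple


def sphere_to_erp(yaw_deg: float, pitch_deg: float,
                  width: int, height: int) -> Tuple[float, float]:
    """Map a sphere direction to equirectangular picture coordinates.

    Assumed convention (hypothetical, not from the source):
    yaw in [-180, 180] grows to the left, pitch in [-90, 90] grows upward,
    and the direction (0, 0) lands on the picture center.
    """
    u = (0.5 - yaw_deg / 360.0) * width     # horizontal position in samples
    v = (0.5 - pitch_deg / 180.0) * height  # vertical position in samples
    return u, v
```

Under this convention the whole yaw range spans the picture width and the whole pitch range spans the picture height, which is why an equirectangular projected picture typically has a 2:1 aspect ratio.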

The image preprocessing unit 110 may convert the projected picture into a packed picture. The packed picture may be rectangular image data. To convert the projected picture into a packed picture, the image preprocessing unit 110 may partition the projected picture into at least one face. At least one of the number, shape, or size of the faces may be determined based on the projection format. The image preprocessing unit 110 may then convert the projected picture into a packed picture through region-wise packing. Region-wise packing involves at least one of resizing, warping, or rearranging the faces. When region-wise packing is not performed, the image preprocessing unit 110 may set the packed picture to be identical to the projected picture.
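The resizing and rearranging steps of region-wise packing can be sketched as copying one rectangle of the projected picture into a differently sized rectangle of the packed picture. The helper `pack_region`, the `(x, y, w, h)` rectangle layout, and the nearest-neighbour resampling are assumptions chosen for brevity; warping is not covered, and the patent does not prescribe a resampling method.

```python
def pack_region(projected, src, dst, packed):
    """Copy rectangle `src` of `projected` into rectangle `dst` of `packed`.

    Pictures are lists of rows; rectangles are (x, y, w, h) tuples.
    The region is resized with nearest-neighbour sampling (an assumption
    for this sketch) and relocated to the destination position.
    """
    sx, sy, sw, sh = src
    dx, dy, dw, dh = dst
    for j in range(dh):
        for i in range(dw):
            # Pick the source sample that corresponds to destination (i, j).
            packed[dy + j][dx + i] = projected[sy + j * sh // dh][sx + i * sw // dw]
    return packed


# Usage: downscale a 4x4 face to 2x2.
face = [[r * 4 + c for c in range(4)] for r in range(4)]
small = pack_region(face, (0, 0, 4, 4), (0, 0, 2, 2), [[0, 0], [0, 0]])
```

A packer would apply such a copy once per face, which is also how a packed picture can end up covering only part of the sphere.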

The projected picture covers the entire area of the sphere, whereas this restriction does not apply to the packed picture. For example, a packed picture covering only part of the sphere may be generated from the projected picture.

In addition, the image preprocessing described above may be performed repeatedly on the same source image. Performing image preprocessing multiple times on the same source image can generate a plurality of packed pictures. In this case, the areas on the sphere covered by the respective packed pictures may differ.

The image preprocessing unit 110 may generate metadata related to the projected picture and/or the packed picture. The metadata may be used by the content reproducing apparatus to render the decoded picture onto a sphere. The metadata may include at least one of projection format information of the projected picture, information on the sphere surface area covered by the packed picture, or region-wise packing information.

The audio encoding unit 120 may encode the audio signal. An audio bitstream may be output as a result of encoding the audio signal.

The video encoding unit 130 may encode the packed picture. A video bitstream may be output as a result of encoding the video signal.

The encapsulation unit 140 may combine the audio bitstream and the video bitstream into a single media file according to a specific media container file format. Alternatively, the encapsulation unit 140 may organize the audio file and the video file into a segment sequence for streaming according to a specific media container file format. The media container file format may be the ISO Base Media File Format (ISO BMFF). The file or segment may contain metadata for 3D rendering of the decoded picture.

FIG. 2 is a block diagram of a content reproducing apparatus according to an embodiment of the present invention. The content reproducing apparatus may be a terminal device such as a head-mounted display (HMD), a smartphone, a laptop, a tablet, a PC, a wearable device, or a TV.

Referring to FIG. 2, the content reproducing apparatus 200 may include a decapsulation unit 210, an audio decoder unit 220, a video decoder unit 230, a rendering unit 240, and a user tracking unit 250.

The decapsulation unit 210 may extract an audio bitstream and a video bitstream from a media file or segment sequence received from the content generating apparatus. The decapsulation unit 210 may also parse, from the media file or segment, metadata for 3D rendering of the decoded picture.

The user tracking unit 250 determines the viewing direction of the user and generates metadata indicating the viewing direction. The user tracking unit 250 may determine the viewing direction based on head tracking or eye tracking.

The audio decoder unit 220 decodes the audio bitstream. A decoded audio signal may be output as a result of decoding the audio bitstream.

The video decoder unit 230 decodes the video bitstream. A decoded picture may be output as a result of decoding the video bitstream.

The video bitstream may include a plurality of sub-picture bitstreams or a plurality of tracks. To render the portion that matches the current viewing direction, the video decoder unit 230 may use information indicating the current viewing direction. For example, the video decoder unit 230 may decode, from the video bitstream, at least one sub-picture bitstream that includes the viewport for the current viewing direction, or at least one track that includes the viewport for the current viewing direction.
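One simple way to picture viewport-dependent decoding is to keep only the tracks whose coverage overlaps the current viewport. The sketch below considers yaw only and describes each track by a hypothetical `(centre_yaw_deg, yaw_range_deg)` pair; these names, the track layout, and the overlap test are assumptions for illustration, not the selection rule of the patent.

```python
def angular_distance(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def select_tracks(tracks, view_yaw_deg: float, view_hfov_deg: float):
    """Return names of tracks whose yaw coverage overlaps the viewport.

    `tracks` maps a track name to (centre_yaw_deg, yaw_range_deg).
    Two arcs on a circle overlap when the distance between their centres
    is smaller than the sum of their half-widths.
    """
    return [
        name
        for name, (centre, yaw_range) in tracks.items()
        if angular_distance(centre, view_yaw_deg) < (yaw_range + view_hfov_deg) / 2.0
    ]


# Usage: four 90-degree tracks, viewer looking 30 degrees to the left.
tracks = {"front": (0, 90), "right": (-90, 90), "back": (180, 90), "left": (90, 90)}
to_decode = select_tracks(tracks, view_yaw_deg=30, view_hfov_deg=90)
```

A full implementation would also test pitch and would refresh the selection whenever the user tracking unit reports a new viewing direction.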

The rendering unit 240 may convert the decoded picture into a 360-degree image based on at least one of metadata indicating the viewing direction or viewport, or metadata parsed from the file or segment. The rendering unit 240 may also render the decoded audio signal according to the current viewing direction.

FIG. 3 is a flowchart illustrating a data processing procedure according to an embodiment of the present invention. The left side of FIG. 3 shows the data processing procedure in the content generating apparatus, and the right side of FIG. 3 shows the data processing procedure in the content reproducing apparatus.

When a video signal is input to the content generating apparatus, the image preprocessing unit may perform image preprocessing so that the input video signal can be fed to the video encoder (S310). Image preprocessing may involve at least one of image stitching, rotation, projection, or region-wise packing. A projected image and a packed image may be generated through image preprocessing, and information related to the generation of the projected image and the packed image may be generated as metadata.

Video encoding may then be performed on the packed image (S320).

When an audio signal is input to the content generating apparatus, the audio encoding unit may encode the input audio signal (S330).

When the video bitstream and the audio bitstream have been generated, they may be combined to generate a media file (S340). Alternatively, the media file may be divided into segments for streaming to generate a segment sequence. The metadata generated during image preprocessing may be included in the media file or the segments.

The video signal and the audio signal may be processed in parallel.

When a media file or segment sequence is received from the content generating apparatus, the content reproducing apparatus may parse the video bitstream, the audio bitstream, and the metadata from the received media file or segment sequence (S350).

The content reproducing apparatus may decode the parsed video bitstream (S360). In doing so, the content reproducing apparatus may determine the bitstream to be decoded in consideration of the viewing direction of the user or the current viewport. The content reproducing apparatus may then render the decoded image using the metadata (S370). The viewing direction of the user or the current viewport may further be considered when rendering the decoded image.

The content reproducing apparatus may decode the parsed audio bitstream (S380) and render the decoded audio (S390). The viewing direction of the user or the current viewport may be considered when rendering the decoded audio.

To improve the user experience, outputting additional information on the 360-degree video may be considered. For example, outputting on the 360-degree video a logo, a sign language translation, an ERP-based preview window corresponding to the entire area of the 360-degree video, a guide line indicating the current viewport on the preview window, and/or a thumbnail of a recommended viewport can provide convenience to the user and improve the viewing experience. Such an additional image output on the 360-degree video may be defined as an overlay.

FIG. 4 is a diagram illustrating how an overlay is output on a sphere surface.

To render an overlay on a 360-degree video, overlay-related metadata must be defined. When the video decoder decodes an overlay image, the rendering unit may output the overlay on the 360-degree video by referring to the overlay-related metadata. The overlay-related metadata is described in detail below.

The overlay-related metadata may include at least one of the number of overlays, an overlay identifier, overlay region information, overlay depth information, overlay priority, or other information such as playback time change information, layering order information, transparency, or user interaction information.

The number of overlays indicates whether an overlay exists and, if so, how many overlays exist. The overlay-related metadata may include the syntax element 'num_overlays' for indicating the number of overlays. For example, a num_overlays value of 0 indicates that no overlay exists. A num_overlays value greater than 0 indicates that as many overlays exist as the value of num_overlays.

The overlay identifier is a unique identifier assigned to an overlay. For example, the metadata may include the syntax element 'overlay_id' for indicating the unique identifier of an overlay. The same identifier cannot be assigned to two or more overlays. That is, when a plurality of overlays exist, the identifiers assigned to the overlays differ from one another.

Table 1 illustrates a syntax structure for assigning an overlay identifier to an overlay.

The OverlayStruct function represents the overlay-related metadata. SingleOverlayStruct represents the metadata for a single overlay.

The OverlayStruct function may include the num_overlays syntax element. Whether an overlay exists and the number of overlays may be determined through num_overlays. When a plurality of overlays exist, an identifier may be assigned to each overlay through overlay_id.

In Table 1, the syntax element overlay_id is illustrated as being included in the OverlayStruct function. As another example, the syntax element overlay_id may be defined in the SingleOverlayStruct function, which represents the metadata for a single overlay.
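The relationship between num_overlays and overlay_id can be modelled as a small container that rejects duplicate identifiers. The sketch below is not the syntax of Table 1; the Python classes `OverlayStruct` and `SingleOverlay` and the `validate` method are illustrative assumptions that only mirror the two constraints stated in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SingleOverlay:
    """Metadata of one overlay (only the identifier is modelled here)."""
    overlay_id: int


@dataclass
class OverlayStruct:
    """Container for overlay-related metadata (illustrative model only)."""
    overlays: List[SingleOverlay] = field(default_factory=list)

    @property
    def num_overlays(self) -> int:
        # 0 means no overlay exists; otherwise the number of overlays.
        return len(self.overlays)

    def validate(self) -> None:
        # The same identifier must not be assigned to two or more overlays.
        ids = [o.overlay_id for o in self.overlays]
        if len(ids) != len(set(ids)):
            raise ValueError("overlay_id values must be unique")


# Usage: two overlays with distinct identifiers pass validation.
meta = OverlayStruct([SingleOverlay(1), SingleOverlay(2)])
meta.validate()
```

Deriving num_overlays from the list length keeps the count and the identifier list from drifting apart, which is a design choice of this sketch rather than something the source specifies.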

The overlay region information indicates the position and/or size of the region in which the overlay is to be rendered. The output position and/or size of the overlay on the 360-degree video may be determined from the overlay region information.

Table 2 illustrates a syntax structure representing overlay position information.

As in the example shown in Table 2, the SphereRegionStruct function, which indicates the region on the sphere covered by the overlay, may be added to the OverlayStruct function.

When a plurality of overlays exist, the SphereRegionStruct function may be called for each overlay. Accordingly, the positions of the rendering regions for the plurality of overlays may be determined individually.

In Table 2, the SphereRegionStruct function is illustrated as a sub-function of the OverlayStruct function. As another example, the SphereRegionStruct function may be defined as a sub-function of the SingleOverlayStruct function, which represents the metadata for a single overlay.

To determine the size of the region in which the overlay is to be rendered, syntax elements indicating the width and height of the region may be defined. In addition, to determine the position of the region in which the overlay is to be rendered, syntax elements indicating the position of the region within the projected picture, within the packed picture, or within the viewport may be defined.

Tables 3 and 4 illustrate syntax related to the overlay region information.

In Table 3, proj_reg_width and proj_reg_height indicate the width and height, respectively, of the region in which the overlay is to be rendered. proj_reg_top and proj_reg_left indicate the top-left coordinates, within the projected picture, of the region in which the overlay is to be rendered.
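To see how a rectangle given by proj_reg_left, proj_reg_top, proj_reg_width, and proj_reg_height relates to the sphere, the sketch below converts it to yaw and pitch bounds. It assumes an equirectangular projected picture and the same hypothetical angle convention as earlier (yaw growing to the left, pitch growing upward, picture center at direction (0, 0)); the function name and the sample units are also assumptions.

```python
def proj_region_to_sphere_range(proj_reg_left, proj_reg_top,
                                proj_reg_width, proj_reg_height,
                                pic_width, pic_height):
    """Convert a projected-picture rectangle to (yaw, pitch) bounds in degrees.

    Assumes an equirectangular projected picture (hypothetical convention):
    the left picture edge is yaw +180, the top picture edge is pitch +90.
    """
    yaw_left = (0.5 - proj_reg_left / pic_width) * 360.0
    yaw_right = (0.5 - (proj_reg_left + proj_reg_width) / pic_width) * 360.0
    pitch_top = (0.5 - proj_reg_top / pic_height) * 180.0
    pitch_bottom = (0.5 - (proj_reg_top + proj_reg_height) / pic_height) * 180.0
    return (yaw_left, yaw_right), (pitch_top, pitch_bottom)


# Usage: the central quarter of a 3840x1920 projected picture.
yaw_bounds, pitch_bounds = proj_region_to_sphere_range(960, 480, 1920, 960, 3840, 1920)
```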

As another example, packed_reg_top and packed_reg_left, which indicate the top-left coordinates of the region in which the overlay is to be rendered within the packed picture (that is, the decoded picture), may be used.

In Table 4, rect_left_percent and rect_top_percent indicate the top-left coordinates of the region in which the overlay is to be rendered. Specifically, rect_left_percent indicates the relative position of the left edge of the region when the left boundary of the current viewport is taken as 0% and the right boundary as 100%. rect_top_percent indicates the relative position of the top edge of the region when the top boundary of the current viewport is taken as 0% and the bottom boundary as 100%. rect_width_percent indicates the ratio of the width of the region to the width of the current viewport, and rect_height_percent indicates the ratio of the height of the region to the height of the current viewport.
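The viewport-relative fields translate directly into a pixel rectangle once the viewport size is known. The sketch below assumes the four fields carry plain percentage values between 0 and 100; the actual bit width and units of the fields in Table 4 are not stated in the text, and the function name is hypothetical.

```python
def overlay_rect_in_pixels(viewport_w, viewport_h,
                           rect_left_percent, rect_top_percent,
                           rect_width_percent, rect_height_percent):
    """Convert viewport-relative percentages to a pixel rectangle.

    Returns (left, top, width, height) measured from the top-left corner
    of the current viewport. Values are assumed to be plain percentages.
    """
    left = viewport_w * rect_left_percent / 100.0
    top = viewport_h * rect_top_percent / 100.0
    width = viewport_w * rect_width_percent / 100.0
    height = viewport_h * rect_height_percent / 100.0
    return left, top, width, height


# Usage: an overlay placed 10% from the left and 20% from the top of a
# 1920x1080 viewport, occupying 25% of its width and 50% of its height.
rect = overlay_rect_in_pixels(1920, 1080, 10, 20, 25, 50)
```

Because the rectangle is expressed relative to the viewport, the overlay keeps its on-screen placement when the viewing direction changes.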

The overlay priority indicates the relative priority among overlays. The content playback apparatus may determine the decoding order of a plurality of overlays based on this priority. For example, rendering a plurality of overlays requires the content playback apparatus to have a plurality of decoders. However, when the number of decoders in the content playback apparatus is insufficient to decode the bitstreams of all overlays, the apparatus must be informed which overlay to decode first. Based on the information indicating the priority among overlays, the content playback apparatus may determine which overlay is to be decoded in preference to the others.

FIG. 5 is a diagram illustrating an example in which the decoding order is determined based on overlay priority.

When the user's viewport includes three regions for overlay rendering but the content playback apparatus has only one decoder, the three overlays cannot be output on the viewport simultaneously. The content playback apparatus may determine the decoding target based on the priorities of the three overlays. That is, it may decode the overlay with the highest priority among the three and render the decoded overlay on the viewport.

Table 5 illustrates a syntax structure that carries the overlay priority information.

As in the example shown in Table 5, the syntax overlay_priority indicating the priority of an overlay may be added to the OverlayStruct function. When a plurality of overlays exist, the syntax overlay_priority may be parsed for each overlay.

Table 5 illustrates the case in which the syntax overlay_priority is included in the OverlayStruct function. As another example, the syntax overlay_priority may be defined in the SingleOverlayStruct function, which represents the metadata of a single overlay.

The same priority cannot be assigned to two or more overlays. That is, when a plurality of overlays exist, each overlay is assigned a different priority. An overlay priority of 0 indicates that the corresponding overlay must be output.
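
The priority-based selection described above can be sketched as follows. This is an illustrative sketch only: it assumes that a smaller overlay_priority value means a higher priority, which is suggested by the value 0 marking an overlay that must be output but is not stated explicitly in the text. The function name and the dictionary input are hypothetical.

```python
def select_overlays_to_decode(priorities, num_decoders):
    """Pick which overlays to decode when decoders are scarce.

    priorities: dict mapping an overlay identifier to its overlay_priority.
    num_decoders: number of decoders available for overlays.

    Assumption of this sketch: a smaller overlay_priority value means a
    higher priority.
    """
    values = list(priorities.values())
    # The text requires every overlay to have a distinct priority.
    if len(set(values)) != len(values):
        raise ValueError("overlay_priority values must be unique")
    ranked = sorted(priorities, key=priorities.get)
    return ranked[:max(num_decoders, 0)]
```

With three overlays and a single decoder, as in the FIG. 5 scenario, only the overlay with the best-ranked priority is returned for decoding.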

The overlay depth information indicates the depth of the region in which the overlay is to be rendered. Placing the overlay region at a distance from the sphere surface, rather than displaying the overlay image on the sphere surface itself, may give the user better visibility.

FIG. 6 is a diagram illustrating an example in which the depth of the overlay and the depth of the 360-degree video are set differently.

The rendering unit may use the overlay depth information to determine the depth of the region in which the overlay is to be output. The overlay depth information may indicate the distance from the center point to the region, or the distance from the sphere surface to the region.

Table 6 illustrates a syntax structure that carries the overlay depth information.

As in the example shown in Table 6, the syntax overlay_depth indicating the overlay depth may be defined. The syntax overlay_depth may be included in the OverlayDepth function. It may indicate the distance from the center point to the region in which the overlay is to be rendered, or the distance from the sphere surface to that region.

The OverlayDepth function may be defined as a sub-function of OverlayStruct or SingleOverlayStruct.
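
The two possible interpretations of overlay_depth can be reconciled by converting both to a distance from the sphere center. The sketch below is an assumption-laden illustration: the function name, the reference parameter, and the unit sphere default are hypothetical, and it assumes that a distance measured from the sphere surface points inward, toward the viewer at the center.

```python
def overlay_radius(overlay_depth, sphere_radius=1.0, reference="center"):
    """Return the distance from the sphere center to the overlay region.

    reference="center": overlay_depth is measured from the center point.
    reference="surface": overlay_depth is measured from the sphere surface,
    assumed here to point inward toward the center.
    """
    if reference == "center":
        radius = overlay_depth
    elif reference == "surface":
        radius = sphere_radius - overlay_depth
    else:
        raise ValueError("reference must be 'center' or 'surface'")
    # Assumption: the overlay sits between the viewer and the sphere surface.
    if not 0 < radius <= sphere_radius:
        raise ValueError("overlay region must lie inside the sphere")
    return radius
```

A renderer could then place the overlay plane at the returned radius along the viewing direction of the overlay region.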

The playback time change information indicates whether the playback time of the overlay may be set differently from that of the 360-degree video. The syntax timeline_change_flag may be defined to signal this.

For example, a timeline_change_flag value of 1 indicates that the playback time of the overlay must be set to match that of the 360-degree video. That is, even if playback of the overlay was paused because the user's viewport moved away from the overlay, once the viewport returns, the overlay may be played from the same playback time as the 360-degree video.

In contrast, a timeline_change_flag value of 0 indicates that the playback time of the overlay may be set differently from that of the 360-degree video. That is, if playback of the overlay was paused because the user's viewport moved away from the overlay, then after the viewport returns, the overlay may resume from the point at which it was paused, regardless of the playback time of the 360-degree video.
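
The two resume behaviors can be summarized in a few lines. This is a minimal sketch, assuming both playback times are expressed in seconds on comparable timelines; the function and parameter names are hypothetical.

```python
def overlay_resume_time(timeline_change_flag, video_time, overlay_pause_time):
    """Return the time from which the overlay resumes after the viewport returns.

    video_time: current playback time of the 360-degree video.
    overlay_pause_time: overlay playback time at the moment it was paused.
    """
    if timeline_change_flag == 1:
        # Overlay must stay in sync with the 360-degree video.
        return video_time
    if timeline_change_flag == 0:
        # Overlay continues from where it stopped, independent of the video.
        return overlay_pause_time
    raise ValueError("timeline_change_flag must be 0 or 1")
```

For example, if the overlay was paused at 30 seconds and the video has since advanced to 42 seconds, the overlay resumes at 42 seconds when the flag is 1 and at 30 seconds when the flag is 0.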

The layering order information indicates the layering order of the overlays. An overlay with a smaller layering order may be placed in front, and an overlay with a larger layering order may be placed behind. The syntax layering_order may be defined to indicate the layering order information.

The transparency information indicates the transparency of the overlay. The syntax opacity may be defined to indicate it. The value may be expressed as a number from 0 to 100, where 0 indicates a fully transparent state and 100 indicates a fully opaque state.
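
The combined effect of layering_order and opacity can be illustrated for a single pixel. The sketch below assumes conventional alpha blending with opacity / 100 as the blend factor, which is a common choice and not something this section specifies; the function name and the dictionary keys other than layering_order and opacity are hypothetical.

```python
def composite_pixel(background, overlays):
    """Blend overlay colors onto one background pixel.

    background: (r, g, b) tuple of the 360-degree video pixel.
    overlays: list of dicts with keys "layering_order", "opacity" (0..100)
    and "color" ((r, g, b) tuple).
    """
    result = tuple(float(c) for c in background)
    # Draw back to front: a larger layering_order lies further behind,
    # so it is drawn first and may be covered by later overlays.
    ordered = sorted(overlays, key=lambda o: o["layering_order"], reverse=True)
    for overlay in ordered:
        alpha = overlay["opacity"] / 100  # 0 = transparent, 1 = opaque
        result = tuple(alpha * c + (1 - alpha) * b
                       for c, b in zip(overlay["color"], result))
    return result
```

With two fully opaque overlays covering the same pixel, the one with the smaller layering_order determines the final color, matching the front/behind rule in the text.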

The user interaction information indicates whether overlay-related settings can be changed by user input. Here, the overlay-related settings may include the output position of the overlay, the overlay depth, overlay on/off, transparency, the overlay size, the overlay layer order, the overlay priority, and the overlay source (or overlay media). Flags indicating whether each overlay-related setting can be changed by user input may be defined.

The overlay-related information listed above may be implemented as control structure functions. A control structure function is a sub-function defined in the metadata of a single overlay (i.e., SingleOverlayStruct). A different index is assigned to each control structure function, and the metadata of a single overlay may include flags indicating whether each control structure function is present.

Table 7 shows an example of index assignment for each control structure function.

Table 8 illustrates a syntax structure indicating whether each control structure function is present.

In Table 8, num_flag_bytes indicates the total number of bits allocated to the control structure flags overlay_control_flag. overlay_control_flag indicates whether a control structure function is present. For example, an overlay_control_flag[i] value of 0 indicates that the control structure function with index i is not present. In contrast, an overlay_control_flag[i] value of 1 indicates that the control structure function with index i is present.

When the value of overlay_control_flag[i] is 1, the control structure function corresponding to index i may be called to obtain the overlay-related information.
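
The flag-driven dispatch can be sketched as follows. This illustration assumes the flags arrive packed into bytes with index 0 in the most significant bit of the first byte; the actual packing and the index assignment of Table 7 are not reproduced in this section. The function names and the handler table are hypothetical, and real handlers would read their fields from the bitstream rather than return constants.

```python
def present_control_indices(flag_bytes):
    """List the indices i for which overlay_control_flag[i] equals 1.

    Assumption of this sketch: flags are packed most significant bit first.
    """
    indices = []
    for byte_pos, byte in enumerate(flag_bytes):
        for bit_pos in range(8):
            if byte & (0x80 >> bit_pos):
                indices.append(byte_pos * 8 + bit_pos)
    return indices


def parse_controls(flag_bytes, handlers):
    """Call the handler of every control structure flagged as present.

    handlers: dict mapping a control structure index to a callable that
    returns the parsed overlay-related information (hypothetical).
    """
    parsed = {}
    for index in present_control_indices(flag_bytes):
        handler = handlers.get(index)
        if handler is not None:
            parsed[index] = handler()
    return parsed
```

Keeping presence flags separate from the control structures lets a parser skip structures that are absent without knowing their internal layout.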

In addition, the metadata of a single overlay may include a flag indicating whether the content playback apparatus is required to parse a control structure function.

Table 9 illustrates a syntax structure indicating whether a control structure function must be parsed.

In Table 9, overlay_control_essential_flag indicates whether a control structure function must be parsed. For example, an overlay_control_essential_flag[i] value of 0 indicates that the content playback apparatus is not required to parse the control structure function with index i. In contrast, an overlay_control_essential_flag[i] value of 1 indicates that the content playback apparatus must parse the control structure function with index i.
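
One way a player might act on the essential flags is sketched below. The text only states that an essential control structure must be parsed; treating an overlay as not presentable when the player cannot handle an essential control structure is an assumption of this sketch, and the function and parameter names are hypothetical.

```python
def can_present_overlay(control_flags, essential_flags, supported_indices):
    """Decide whether a player can handle everything an overlay requires.

    control_flags[i]: overlay_control_flag[i] (1 if structure i is present).
    essential_flags[i]: overlay_control_essential_flag[i].
    supported_indices: set of control structure indices the player can parse.

    Assumption of this sketch: the overlay is rejected when a present,
    essential control structure is not supported by the player.
    """
    for index, (present, essential) in enumerate(
            zip(control_flags, essential_flags)):
        if present and essential and index not in supported_indices:
            return False
    return True
```

Non-essential control structures that the player does not support are simply ignored under this policy.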

Although the embodiments above are described on the basis of a series of steps or a flowchart, this does not limit the invention to a particular chronological order; the steps may be performed simultaneously or in a different order as needed. In addition, each of the components (e.g., units, modules) constituting the block diagrams in the embodiments above may be implemented as a hardware device or as software, or a plurality of components may be combined and implemented as a single hardware device or as software. The embodiments above may be implemented in the form of program instructions executable by various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. Examples of the computer-readable recording medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. The hardware device may be configured to operate as one or more software modules in order to perform the processing according to the present invention, and vice versa.

Claims

1. An image processing method comprising:
decoding an overlay; and
rendering the decoded overlay on a 360-degree video based on overlay-related information,
wherein the overlay is an image additionally output on the 360-degree video,
wherein the overlay-related information includes information indicating the number of overlays and information on a single overlay,
wherein the information on the single overlay includes information indicating a unique identifier assigned to each single overlay and a control structure flag, and
wherein the control structure flag indicates, based on index information assigned according to the type of control structure information, whether each single overlay includes specific control structure information.

2. The image processing method of claim 1,
wherein the control structure information includes region information for determining at least one of a size or a position of a region in which the overlay is to be rendered.

3. The image processing method of claim 2,
wherein the region information includes at least one of size information indicating the size of the region relative to the size of a current viewport, or position information indicating the position of the region within the current viewport.

4. The image processing method of claim 1,
wherein the control structure information includes information indicating a priority of the overlay.

5. The image processing method of claim 4,
wherein whether to decode the overlay is determined based on the priority.

6. The image processing method of claim 1,
wherein the control structure information includes depth information indicating a depth of a region in which the overlay is to be rendered.

7. The image processing method of claim 6,
wherein the depth information indicates a distance from the sphere surface to the region or a distance from the sphere center point to the region.

8. An image playback apparatus comprising:
an image decoding unit that decodes an overlay; and
a rendering unit that renders the decoded overlay on a 360-degree video based on overlay-related information,
wherein the overlay is an image additionally output on the 360-degree video,
wherein the overlay-related information includes information indicating the number of overlays and information on a single overlay,
wherein the information on the single overlay includes information indicating a unique identifier assigned to each single overlay and a control structure flag, and
wherein the control structure flag indicates, based on index information assigned according to the type of control structure information, whether each single overlay includes specific control structure information.

9. The image playback apparatus of claim 8,
wherein the control structure information includes region information for determining at least one of a size or a position of a region in which the overlay is to be rendered.

10. The image playback apparatus of claim 9,
wherein the region information includes at least one of size information indicating the size of the region relative to the size of a current viewport, or position information indicating the position of the region within the current viewport.

11. The image playback apparatus of claim 8,
wherein the control structure information includes information indicating a priority of the overlay.

12. The image playback apparatus of claim 11,
wherein whether to decode the overlay is determined based on the priority.

13. The image playback apparatus of claim 8,
wherein the control structure information includes depth information indicating a depth of a region in which the overlay is to be rendered.

14. The image playback apparatus of claim 13,
wherein the depth information indicates a distance from the sphere surface to the region or a distance from the sphere center point to the region.