KR20160106703A

KR20160106703A - Selection of motion vector precision

Info

Publication number: KR20160106703A
Application number: KR1020167021658A
Authority: KR
Inventors: Gary J. Sullivan; You Zhou; Ming-Chieh Lee; Chih-Lung Lin
Original assignee: Microsoft Technology Licensing, LLC
Priority date: 2014-01-08
Filing date: 2014-12-19
Publication date: 2016-09-12
Also published as: JP6498679B2; US9900603B2; US20200329247A1; EP3075154A1; US11095904B2; CN110177274B; RU2016127410A; BR112016015854A2; BR112016015854B1; CN110099278B; US20200177887A1; AU2014376190A1; WO2015105662A1; RU2682859C1; US11638016B2; US20210337214A1; KR102570202B1; CA3118603A1; CN110149513A; US10313680B2

Abstract

Approaches to selecting motion vector ("MV") precision during video encoding are presented. These approaches can facilitate compression that is effective in terms of rate-distortion performance and/or computational efficiency. For example, a video encoder determines an MV precision for a unit of video from among multiple MV precisions, which include one or more fractional-sample MV precisions and integer-sample MV precision. The video encoder can identify a set of MV values having a fractional-sample MV precision, then select the MV precision for the unit based at least in part on the prevalence of MV values (within the set) having a fractional part of zero. Or, the video encoder can perform rate-distortion analysis, where the rate-distortion analysis is biased towards the integer-sample MV precision. Or, the video encoder can collect information about the video and select the MV precision for the unit based at least in part on the collected information.

Description

Selection of Motion Vector Precision

Engineers use compression (also called source coding or source encoding) to reduce the bit rate of digital video. Compression decreases the cost of storing and transmitting video information by converting the information into a lower bit rate form. Decompression (also called decoding) reconstructs a version of the original information from the compressed form. A "codec" is an encoder/decoder system.

Over the last two decades, various video codec standards have been adopted, including the ITU-T H.261, H.262 (MPEG-2 or ISO/IEC 13818-2), H.263 and H.264 (MPEG-4 AVC or ISO/IEC 14496-10) standards, the MPEG-1 (ISO/IEC 11172-2) and MPEG-4 Visual (ISO/IEC 14496-2) standards, and the SMPTE 421M (VC-1) standard. More recently, the HEVC standard (ITU-T H.265 or ISO/IEC 23008-2) has been approved. Extensions to the HEVC standard (e.g., for scalable video coding/decoding, for coding/decoding of video with higher fidelity in terms of sample bit depth or chroma sampling rate, or for multi-view coding/decoding) are currently under development. A video codec standard typically defines options for the syntax of an encoded video bitstream, detailing the parameters in the bitstream when particular features are used in encoding and decoding. In many cases, a video codec standard also provides details about the decoding operations a decoder must perform to achieve conforming results in decoding. Aside from codec standards, various proprietary codec formats define other options for the syntax of an encoded video bitstream and the corresponding decoding operations.

In general, video compression techniques include "intra-picture" compression and "inter-picture" compression. Intra-picture compression techniques compress individual pictures, and inter-picture compression techniques compress pictures with reference to a preceding and/or following picture (often called a reference or anchor picture) or pictures.

Inter-picture compression techniques often use motion estimation and motion compensation to reduce bit rate by exploiting temporal redundancy in a video sequence. Motion estimation is a process for estimating motion between pictures. In one common technique, an encoder using motion estimation attempts to match a current block of sample values in a current picture with a candidate block of the same size in a search area in another picture, the reference picture. When the encoder finds an exact or "close enough" match in the search area in the reference picture, the encoder parameterizes the change in position between the current block and the candidate block as motion data (such as a motion vector ("MV")). An MV is conventionally a two-dimensional value, having a horizontal MV component that indicates left or right spatial displacement and a vertical MV component that indicates up or down spatial displacement. In general, motion compensation is a process of reconstructing pictures from reference picture(s) using motion data.

An MV can indicate a spatial displacement in terms of an integer number of sample grid positions, starting from the co-located position in a reference picture for a current block. For example, for a current block at position (32, 16) in a current picture, the MV (-3, 1) indicates position (29, 17) in the reference picture. Or, an MV can indicate a spatial displacement in terms of a fractional number of sample grid positions from the co-located position in a reference picture for a current block. For example, for a current block at position (32, 16) in a current picture, the MV (-3.5, 1.25) indicates position (28.5, 17.25) in the reference picture. To determine sample values at fractional offsets in the reference picture, the encoder typically interpolates between sample values at integer-sample positions. Such interpolation can be computationally intensive. During motion compensation, a decoder also performs the interpolation as needed to compute sample values at fractional offsets in reference pictures.
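
The arithmetic in the two examples above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and bilinear filtering is assumed as a simple stand-in for whatever interpolation filter a given codec standard actually specifies.

```python
import math


def reference_position(block_pos, mv):
    """Add the MV to the co-located position of the current block."""
    return (block_pos[0] + mv[0], block_pos[1] + mv[1])


def bilinear_sample(ref, x, y):
    """Interpolate a sample value at a possibly fractional position (x, y).

    `ref` is a 2-D list indexed as ref[row][column]. Bilinear filtering is
    an assumption made for illustration, not the filter of any standard.
    """
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    # Clamp the neighbouring positions so that edge samples are reused.
    x1 = min(x0 + 1, len(ref[0]) - 1)
    y1 = min(y0 + 1, len(ref) - 1)
    top = (1 - fx) * ref[y0][x0] + fx * ref[y0][x1]
    bottom = (1 - fx) * ref[y1][x0] + fx * ref[y1][x1]
    return (1 - fy) * top + fy * bottom


# Integer-sample MV: no interpolation is needed at the resulting position.
print(reference_position((32, 16), (-3, 1)))
# Fractional-sample MV: the resulting position falls between grid samples.
print(reference_position((32, 16), (-3.5, 1.25)))
```

With an integer-sample MV the interpolation weights collapse to 0 and 1, so the sample value is read directly from the grid; the extra multiplications are only incurred for fractional offsets, which is the computational cost the paragraph above refers to.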

Different video codec standards and formats have used MVs with different MV precisions. For integer-sample MV precision, an MV component indicates an integer number of sample grid positions for spatial displacement. For a fractional-sample MV precision such as 1/2-sample MV precision or 1/4-sample MV precision, an MV component can indicate an integer number of sample grid positions or a fractional number of sample grid positions for spatial displacement. For example, if the MV precision is 1/4-sample MV precision, an MV component can indicate a spatial displacement of 0 samples, 0.25 samples, 0.5 samples, 0.75 samples, 1.0 samples, 1.25 samples, and so on. Some video codec standards and formats support switching of MV precision during encoding. In certain encoding scenarios, however, encoder-side decisions about which MV precision to use are not made effectively.
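
One common way to think about these precisions is to store each MV component as an integer count of 1/N-sample units (N = 1, 2, or 4). The sketch below illustrates that idea; the function name and the use of exact fractions are assumptions made for illustration and are not taken from the source.

```python
from fractions import Fraction


def mv_component_to_units(displacement, precision_denominator):
    """Express a displacement (in samples) as an integer number of
    1/precision_denominator-sample units.

    Raises ValueError when the displacement cannot be represented at the
    requested MV precision. Float inputs must be exactly representable
    (0.25, 0.5, 0.75 and so on are).
    """
    units = Fraction(displacement) * precision_denominator
    if units.denominator != 1:
        raise ValueError("displacement is not representable at this MV precision")
    return units.numerator


# At 1/4-sample precision, 1.25 samples is stored as 5 quarter-sample units.
print(mv_component_to_units(1.25, 4))
# At integer-sample precision, only whole displacements are representable.
print(mv_component_to_units(-3, 1))
```

Under this representation, switching from 1/4-sample to integer-sample precision means every stored component is a whole number of samples, so fewer distinct values need to be signaled and no interpolation is required.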

In summary, the detailed description presents innovations in encoder-side operations for selecting motion vector ("MV") precision. For example, when a video encoder encodes video, the video encoder determines an MV precision for some unit of the video.

According to one aspect of the innovations described herein, when the video encoder determines the MV precision for the unit, the video encoder can identify a set of MV values having a fractional-sample MV precision. The video encoder can select the MV precision for the unit based at least in part on the prevalence, within the set of MV values, of MV values having a fractional part of zero.
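
A minimal sketch of this prevalence-based selection is shown below. It assumes MV values stored in quarter-sample units, a choice between only two precisions, and a hypothetical threshold of 0.9; none of these specifics come from the source, which leaves the exact decision rule open at this point.

```python
def select_mv_precision(mv_values_quarter_units, threshold=0.9):
    """Choose 'integer' or 'quarter' MV precision for a unit of video.

    mv_values_quarter_units: iterable of (mv_x, mv_y) pairs expressed in
        quarter-sample units, e.g. gathered by motion estimation at
        1/4-sample precision.
    threshold: hypothetical cutoff on the share of MV values whose
        fractional part is zero (an assumption, not a value from the source).
    """
    mvs = list(mv_values_quarter_units)
    if not mvs:
        # Nothing to measure; falling back to fractional precision is an assumption.
        return "quarter"
    # A component has a zero fractional part when it is a multiple of 4 units.
    integer_count = sum(1 for mv_x, mv_y in mvs if mv_x % 4 == 0 and mv_y % 4 == 0)
    prevalence = integer_count / len(mvs)
    return "integer" if prevalence >= threshold else "quarter"


# Every MV is a whole number of samples, so integer precision is chosen.
print(select_mv_precision([(4, 0), (8, -4), (0, 0)]))
# Only one of three MVs is integer-valued, so fractional precision is kept.
print(select_mv_precision([(4, 0), (5, 2), (1, 3)]))
```

The design intuition is that when nearly all MV values found at fractional precision already land on integer positions, the fractional bits carry little information and integer-sample precision can be signaled for the unit instead.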

According to another aspect of the innovations described herein, when the video encoder determines the MV precision for the unit, the video encoder can perform rate-distortion analysis to decide between multiple MV precisions, which include one or more fractional-sample MV precisions and integer-sample MV precision. The rate-distortion analysis is biased towards the integer-sample MV precision by (a) scaling a distortion cost, (b) adding a penalty to the distortion cost, (c) scaling a bit rate cost, (d) adding a penalty to the bit rate cost, and/or (e) adjusting a Lagrangian multiplier factor.
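
The biasing options listed above can be illustrated with a conventional Lagrangian cost of the form J = D + λR. The sketch below applies two of the listed mechanisms (scaling the distortion cost and adding a penalty to the bit rate cost) to the fractional-sample candidates, which has the effect of favoring integer-sample precision. The function name, parameter names, and numeric values are hypothetical and chosen only for illustration.

```python
def biased_rd_choice(candidates, lagrangian=1.0,
                     fractional_distortion_scale=1.0,
                     fractional_bit_penalty=0.0):
    """Pick the MV precision with the lowest biased rate-distortion cost.

    candidates: dict mapping a precision name to (distortion, bits).
        The name "integer" denotes integer-sample MV precision.
    fractional_distortion_scale, fractional_bit_penalty: hypothetical bias
        knobs applied only to non-integer precisions.
    Returns (best_precision_name, best_cost).
    """
    best_name, best_cost = None, float("inf")
    for name, (distortion, bits) in candidates.items():
        if name != "integer":
            distortion = distortion * fractional_distortion_scale  # option (a)
            bits = bits + fractional_bit_penalty                   # option (d)
        cost = distortion + lagrangian * bits  # J = D + lambda * R
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


costs = {"integer": (100.0, 20.0), "quarter": (95.0, 22.0)}
# Without bias the quarter-sample candidate is marginally cheaper.
print(biased_rd_choice(costs, lagrangian=2.0))
# A small bit penalty on fractional precision tips the choice to integer.
print(biased_rd_choice(costs, lagrangian=2.0, fractional_bit_penalty=1.0))
```

With the bias in place, fractional-sample precision is selected only when its rate-distortion advantage exceeds the margin set by the scale and penalty.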

According to another aspect of the innovations described herein, when the video encoder determines the MV precision for the unit, the video encoder can collect information about the video and select the MV precision for the unit, from among multiple MV precisions, based at least in part on the collected information. The multiple MV precisions include one or more fractional-sample MV precisions and integer-sample MV precision.

The innovations for encoder-side options for selecting MV precision can be implemented as part of a method, as part of a computing device configured to perform the method, or as part of a tangible computer-readable medium storing computer-executable instructions for causing a computing device to perform the method. The various innovations can be used in combination or separately.

The foregoing and other objects, features, and advantages of the invention will become more apparent from the following detailed description, which proceeds with reference to the accompanying figures.

Brief Description of the Drawings

Figure 1 is a diagram of an example computing system in which some described embodiments can be implemented.
Figures 2a and 2b are diagrams of example network environments in which some described embodiments can be implemented.
Figure 3 is a diagram of an example encoder system in which some described embodiments can be implemented.
Figures 4a and 4b are diagrams of an example video encoder in which some described embodiments can be implemented.
Figure 5 is a diagram of a computer desktop environment with content that may provide input for screen capture.
Figure 6 is a diagram of mixed-content video with natural video content and artificial video content.
Figures 7a and 7b are diagrams of motion compensation with MV values having an integer-sample spatial displacement and a fractional-sample spatial displacement, respectively.
Figure 8 is a flowchart of a generalized technique for adapting MV precision during encoding.
Figure 9 is a flowchart of an example technique for adapting MV precision during encoding using a low-complexity approach.
Figure 10 is a diagram of different regions of a picture according to some variations of the low-complexity approach.

The detailed description presents innovations in the selection of motion vector ("MV") precision during encoding. These approaches can facilitate compression that is effective in terms of rate-distortion performance and/or computational efficiency. For example, a video encoder determines an MV precision for some unit of video from among multiple MV precisions, which include one or more fractional-sample MV precisions and integer-sample MV precision. The video encoder can identify a set of MV values having a fractional-sample MV precision, then select the MV precision for the unit based at least in part on the prevalence of MV values (within the set) having a fractional part of zero. Or, the video encoder can perform rate-distortion analysis, where the rate-distortion analysis is biased towards the integer-sample MV precision. Or, the video encoder can collect information about the video and select the MV precision for the unit based at least in part on the collected information. Or, the video encoder can determine the MV precision for some unit of video in some other way.

Although operations described herein are in places described as being performed by a video encoder, in many cases the operations can be performed by another type of media processing tool.

Some of the innovations described herein are illustrated with reference to syntax elements and operations specific to the HEVC standard. The innovations described herein can also be implemented for other standards or formats.

More generally, various alternatives to the examples described herein are possible. For example, some of the methods described herein can be altered by changing the ordering of the method acts described, or by splitting, repeating, or omitting certain method acts. The various aspects of the disclosed technology can be used in combination or separately. Different embodiments use one or more of the described innovations. Some of the innovations described herein address one or more of the problems noted in the background. Typically, a given technique/tool does not solve all such problems.

I. Example Computing System

Figure 1 illustrates a generalized example of a suitable computing system (100) in which several of the described innovations may be implemented. The computing system (100) is not intended to suggest any limitation as to scope of use or functionality, as the innovations may be implemented in diverse computing systems, including special-purpose computing systems configured for video encoding.

With reference to Figure 1, the computing system (100) includes one or more processing units (110, 115) and memory (120, 125). The processing units (110, 115) execute computer-executable instructions. A processing unit can be a central processing unit ("CPU"), a processor in an application-specific integrated circuit ("ASIC"), or any other type of processor. In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power. For example, Figure 1 shows a central processing unit (110) as well as a graphics processing unit or co-processing unit (115). The tangible memory (120, 125) may be volatile memory (e.g., registers, cache, RAM), non-volatile memory (e.g., ROM, EEPROM, flash memory, etc.), or some combination of the two, accessible by the processing unit(s). The memory (120, 125) stores software (180) implementing one or more innovations for selection of MV precision during encoding, in the form of computer-executable instructions suitable for execution by the processing unit(s).

A computing system may have additional features. For example, the computing system (100) includes storage (140), one or more input devices (150), one or more output devices (160), and one or more communication connections (170). An interconnection mechanism (not shown) such as a bus, controller, or network interconnects the components of the computing system (100). Typically, operating system software (not shown) provides an operating environment for other software executing in the computing system (100), and coordinates the activities of the components of the computing system (100).

The tangible storage (140) may be removable or non-removable, and includes magnetic disks, magnetic tapes or cassettes, CD-ROMs, DVDs, or any other medium that can be used to store information and that can be accessed within the computing system (100). The storage (140) stores instructions for the software (180) implementing one or more innovations for selection of MV precision during encoding.

The input device(s) (150) may be a touch input device such as a keyboard, mouse, pen, or trackball, a voice input device, a scanning device, or another device that provides input to the computing system (100). For video, the input device(s) (150) may be a camera, video card, TV tuner card, screen capture module, or similar device that accepts video input in analog or digital form, or a CD-ROM or CD-RW that reads video input into the computing system (100). The output device(s) (160) may be a display, printer, speaker, CD-writer, or another device that provides output from the computing system (100).

The communication connection(s) (170) enable communication over a communication medium to another computing entity. The communication medium conveys information such as computer-executable instructions, audio or video input or output, or other data in a modulated data signal. A modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media can use an electrical, optical, RF, or other carrier.

The innovations can be described in the general context of computer-readable media. Computer-readable media are any available tangible media that can be accessed within a computing environment. By way of example, and not limitation, with the computing system (100), computer-readable media include memory (120, 125), storage (140), and combinations of any of the above.

The innovations can be described in the general context of computer-executable instructions, such as those included in program modules, being executed in a computing system on a target real or virtual processor. Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or split between program modules as desired in various embodiments. Computer-executable instructions for program modules may be executed within a local or distributed computing system.

The terms "system" and "device" are used interchangeably herein. Unless the context clearly indicates otherwise, neither term implies any limitation on a type of computing system or computing device. In general, a computing system or computing device can be local or distributed, and can include any combination of special-purpose hardware and/or hardware with software implementing the functionality described herein.

The disclosed methods can also be implemented using specialized computing hardware configured to perform any of the disclosed methods. For example, the disclosed methods can be implemented by an integrated circuit (e.g., an ASIC such as an ASIC digital signal process unit ("DSP"), a graphics processing unit ("GPU"), or a programmable logic device ("PLD") such as a field programmable gate array ("FPGA")) specially designed or configured to implement any of the disclosed methods.

For the sake of presentation, the detailed description uses terms like "determine" and "use" to describe computer operations in a computing system. These terms are high-level abstractions for operations performed by a computer, and should not be confused with acts performed by a human being. The actual computer operations corresponding to these terms vary depending on implementation. As used herein, the term "optimiz*" (including variations such as optimization and optimizing) refers to a choice among options under a given scope of decision, and does not imply that an optimized choice is the "best" or "optimum" choice for an expanded scope of decisions.

II. Example Network Environments

Figures 2a and 2b show example network environments (201, 202) that include video encoders (220) and video decoders (270). The encoders (220) and decoders (270) are connected over a network (250) using an appropriate communication protocol. The network (250) can include the Internet or another computer network.

In the network environment (201) shown in Figure 2a, each real-time communication ("RTC") tool (210) includes both an encoder (220) and a decoder (270) for bidirectional communication. A given encoder (220) can produce output compliant with a variation or extension of the HEVC standard (also known as H.265), the SMPTE 421M standard, the ISO/IEC 14496-10 standard (also known as H.264 or AVC), another standard, or a proprietary format, with a corresponding decoder (270) accepting encoded data from the encoder (220). The bidirectional communication can be part of a video conference, video telephone call, or other two-party or multi-party communication scenario. Although the network environment (201) in Figure 2a includes two real-time communication tools (210), the network environment (201) can instead include three or more real-time communication tools (210) that participate in multi-party communication.

A real-time communication tool (210) manages encoding by an encoder (220). Figure 3 shows an example encoder system (300) that can be included in the real-time communication tool (210). Alternatively, the real-time communication tool (210) uses another encoder system. A real-time communication tool (210) also manages decoding by a decoder (270).

In the network environment (202) shown in Figure 2b, an encoding tool (212) includes an encoder (220) that encodes video for delivery to multiple playback tools (214), which include decoders (270). The unidirectional communication can be provided for a video surveillance system, web camera monitoring system, screen capture module, remote desktop conferencing presentation, or other scenario in which video is encoded and sent from one location to one or more other locations. Although the network environment (202) in Figure 2b includes two playback tools (214), the network environment (202) can include more or fewer playback tools (214). In general, a playback tool (214) communicates with the encoding tool (212) to determine a stream of video for the playback tool (214) to receive. The playback tool (214) receives the stream, buffers the received encoded data for an appropriate period, and begins decoding and playback.

Figure 3 shows an example encoder system (300) that can be included in the encoding tool (212). Alternatively, the encoding tool (212) uses another encoder system. The encoding tool (212) can also include server-side controller logic for managing connections with one or more playback tools (214). A playback tool (214) can also include client-side controller logic for managing connections with the encoding tool (212).

III. Example Encoder System

FIG. 3 is a block diagram of an example encoder system (300) in which some described embodiments may be implemented. The encoder system (300) can be a general-purpose encoding tool capable of operating in any of multiple encoding modes, such as a low-latency encoding mode for real-time communication, a transcoding mode, and a higher-latency encoding mode for producing media for playback from a file or stream, or it can be a special-purpose encoding tool adapted for one such encoding mode. The encoder system (300) can be implemented as an operating system module, as part of an application library, or as a standalone application. Overall, the encoder system (300) receives a sequence of source video frames (311) from a video source (310) and produces encoded data as output to a channel (390). The encoded data output to the channel can include content encoded using a selected MV precision.

The video source (310) can be a camera, tuner card, storage medium, screen capture module, or other digital video source. The video source (310) produces a sequence of video frames at a frame rate of, for example, 30 frames per second. As used herein, the term "frame" generally refers to source, coded, or reconstructed image data. For progressive-scan video, a frame is a progressive-scan video frame. For interlaced video, in example embodiments, an interlaced video frame may be de-interlaced prior to encoding. Alternatively, two complementary interlaced video fields are encoded together as a single video frame or encoded as two separately encoded fields. Aside from indicating a progressive-scan video frame or interlaced-scan video frame, the term "frame" or "picture" can indicate a single non-paired video field, a complementary pair of video fields, a video object plane that represents a video object at a given time, or a region of interest in a larger image. The video object plane or region can be part of a larger image that includes multiple objects or regions of a scene.

An arriving source frame (311) is stored in a source frame temporary memory storage area (320) that includes multiple frame buffer storage areas (321, 322, ..., 32n). A frame buffer (321, 322, etc.) holds one source frame in the source frame storage area (320). After one or more of the source frames (311) have been stored in frame buffers (321, 322, etc.), a frame selector (330) selects an individual source frame from the source frame storage area (320). The order in which frames are selected by the frame selector (330) for input to the encoder (340) may differ from the order in which the frames are produced by the video source (310). For example, the encoding of some frames may be delayed in order, so as to allow some later frames to be encoded first and thus facilitate temporally backward prediction. Before the encoder (340), the encoder system (300) can include a pre-processor (not shown) that performs pre-processing (e.g., filtering) of the selected frame (331) before encoding. The pre-processing can also include color space conversion into primary (e.g., luma) and secondary (e.g., chroma differences toward red and toward blue) components, as well as resampling processing (e.g., to reduce the spatial resolution of chroma components). Typically, before encoding, video has been converted to a color space such as YUV, in which sample values of a luma (Y) component represent brightness or intensity values, and sample values of chroma (U, V) components represent color-difference values. The chroma sample values may be sub-sampled to a lower chroma sampling rate (e.g., for YUV 4:2:0 format or YUV 4:2:2 format), or the chroma sample values may have the same resolution as the luma sample values (e.g., for YUV 4:4:4 format). In YUV 4:2:0 format, chroma components are downsampled by a factor of two horizontally and by a factor of two vertically. In YUV 4:2:2 format, chroma components are downsampled by a factor of two horizontally. Or, the video can be encoded in another format (e.g., RGB 4:4:4 format).
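The downsampling factors just described determine the size of each chroma plane relative to the luma plane. The following is a minimal Python sketch of that arithmetic. The table name, the function name, and the choice to round odd dimensions up are illustrative assumptions, not details taken from this description.

```python
# Sketch only: chroma plane size for the chroma sampling formats named above.
# CHROMA_SUBSAMPLING and chroma_plane_size are hypothetical names.

# (horizontal divisor, vertical divisor) applied to the luma dimensions
CHROMA_SUBSAMPLING = {
    "4:4:4": (1, 1),  # chroma has the same resolution as luma
    "4:2:2": (2, 1),  # downsampled by a factor of two horizontally only
    "4:2:0": (2, 2),  # downsampled by two horizontally and vertically
}


def chroma_plane_size(luma_width, luma_height, fmt):
    """Return (width, height) of each chroma plane for a given luma plane."""
    div_x, div_y = CHROMA_SUBSAMPLING[fmt]
    # Assumption: round up so that odd luma dimensions are still covered.
    width = (luma_width + div_x - 1) // div_x
    height = (luma_height + div_y - 1) // div_y
    return (width, height)


print(chroma_plane_size(1920, 1080, "4:2:0"))  # (960, 540)
```

For a 1920x1080 luma plane, the sketch gives 960x540 chroma planes in 4:2:0, 960x1080 in 4:2:2, and 1920x1080 in 4:4:4.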

The encoder (340) encodes the selected frame (331) to produce a coded frame (341), and also produces memory management control operation ("MMCO") signals (342) or reference picture set ("RPS") information. If the current frame is not the first frame that has been encoded, then when performing its encoding process the encoder (340) may use one or more previously encoded/decoded frames (369) that have been stored in a decoded frame temporary memory storage area (360). Such stored decoded frames (369) are used as reference frames for inter-frame prediction of the content of the current source frame (331). The MMCO/RPS information (342) indicates to a decoder which reconstructed frames may be used as reference frames and hence should be stored in a frame storage area.

Generally, the encoder (340) includes multiple encoding modules that perform encoding tasks such as partitioning into tiles, intra prediction estimation and prediction, motion estimation and compensation, frequency transforms, quantization, and entropy coding. The exact operations performed by the encoder (340) can vary depending on the compression format. The format of the output encoded data can be a variation or extension of HEVC format (H.265), Windows Media Video (WMV) format, VC-1 format, MPEG-x format (e.g., MPEG-1, MPEG-2, or MPEG-4), H.26x format (e.g., H.261, H.262, H.263, H.264), or another format.

The encoder (340) can partition a frame into multiple tiles of the same size or different sizes. For example, the encoder (340) splits the frame along tile rows and tile columns that, together with the frame boundaries, define the horizontal and vertical boundaries of tiles within the frame, where each tile is a rectangular region. Tiles are often used to provide options for parallel processing. A frame can also be organized as one or more slices, where a slice can be an entire frame or a region of the frame. A slice can be decoded independently of other slices in a frame, which improves error resilience. The content of a slice or tile is further partitioned into blocks or other sets of sample values for purposes of encoding and decoding.
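As an illustration of the tile partitioning described above, the following Python sketch derives rectangular tiles from interior tile column and tile row boundaries combined with the frame boundaries. The function name and the use of sample positions for the boundaries are assumptions made for the example; an actual codec may express boundaries in other units.

```python
# Sketch only: derive rectangular tiles from tile column/row boundaries.
# tile_rectangles is a hypothetical helper; boundaries are given here as
# interior split positions in samples (an assumption for this example).

def tile_rectangles(frame_width, frame_height, column_boundaries, row_boundaries):
    """Return tiles as (x, y, width, height) tuples in raster order."""
    # The frame boundaries close off the first and last tile in each direction.
    xs = [0] + sorted(column_boundaries) + [frame_width]
    ys = [0] + sorted(row_boundaries) + [frame_height]
    tiles = []
    for y0, y1 in zip(ys, ys[1:]):
        for x0, x1 in zip(xs, xs[1:]):
            tiles.append((x0, y0, x1 - x0, y1 - y0))
    return tiles


# Tiles may have different sizes: one column split at x=1280, one row split at y=540.
for tile in tile_rectangles(1920, 1080, [1280], [540]):
    print(tile)
```

Because every tile is an independent rectangle, each entry of the returned list could be handed to a separate worker, which is the parallel-processing option the text mentions.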

For syntax according to the HEVC standard, the encoder splits the content of a frame (or slice or tile) into coding tree units. A coding tree unit ("CTU") includes luma sample values organized as a luma coding tree block ("CTB") and corresponding chroma sample values organized as two chroma CTBs. The size of a CTU (and its CTBs) is selected by the encoder. A luma CTB can contain, for example, 64x64, 32x32, or 16x16 luma sample values. A CTU includes one or more coding units. A coding unit ("CU") has a luma coding block ("CB") and two corresponding chroma CBs. For example, a CTU with a 64x64 luma CTB and two 64x64 chroma CTBs (YUV 4:4:4 format) can be split into four CUs, with each CU including a 32x32 luma CB and two 32x32 chroma CBs, and with each CU possibly being split further into smaller CUs. Or, as another example, a CTU with a 64x64 luma CTB and two 32x32 chroma CTBs (YUV 4:2:0 format) can be split into four CUs, with each CU including a 32x32 luma CB and two 16x16 chroma CBs, and with each CU possibly being split further into smaller CUs. The smallest allowable size of CU (e.g., 8x8, 16x16) can be signaled in the bitstream.
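The recursive splitting of a CTU into CUs can be sketched as a quadtree. The Python example below is illustrative only: the function names are hypothetical, and the split decision is supplied by the caller as a callback, because this description leaves that decision to the encoder.

```python
# Sketch only: quadtree split of a CTU into CUs, as in the examples above.
# split_ctu, chroma_cb_size, and the should_split callback are hypothetical.

def split_ctu(x, y, size, min_cu_size, should_split):
    """Return the CUs of a square luma block as (x, y, size) tuples."""
    if size > min_cu_size and should_split(x, y, size):
        half = size // 2
        cus = []
        # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                cus.extend(split_ctu(x + dx, y + dy, half, min_cu_size, should_split))
        return cus
    return [(x, y, size)]


def chroma_cb_size(luma_cb_size, fmt):
    """Chroma CB size for the two formats used in the examples above."""
    if fmt == "4:4:4":
        return luma_cb_size
    if fmt == "4:2:0":
        return luma_cb_size // 2
    raise ValueError("format not covered by this sketch: " + fmt)


# Split a 64x64 CTU once, giving four 32x32 CUs.
cus = split_ctu(0, 0, 64, 8, lambda x, y, size: size == 64)
print(cus)
print(chroma_cb_size(32, "4:2:0"))  # 16
```

With the callback shown, the 64x64 CTU yields four 32x32 CUs, and in YUV 4:2:0 format each of those CUs has 16x16 chroma CBs, matching the second example in the text.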

Generally, a CU has a prediction mode such as inter or intra. A CU includes one or more prediction units for purposes of signaling prediction information (such as prediction mode details, displacement values, etc.) and/or prediction processing. A prediction unit ("PU") has a luma prediction block ("PB") and two chroma PBs. For an intra-predicted CU, the PU has the same size as the CU, unless the CU has the smallest size (e.g., 8x8). In that case, the CU can be split into four smaller PUs (e.g., each 4x4 if the smallest CU size is 8x8), or the PU can have the smallest CU size, as indicated by a syntax element for the CU. A CU also has one or more transform units for purposes of residual coding/decoding, where a transform unit ("TU") has a luma transform block ("TB") and two chroma TBs. A PU in an intra-predicted CU may contain a single TU (equal in size to the PU) or multiple TUs. The encoder decides how to partition video into CTUs, CUs, PUs, TUs, etc. In the context of the H.264/AVC standard, the term "macroblock" indicates a block-shaped region similar to that of a CTU for the H.265/HEVC standard, and the term "sub-macroblock partition" indicates a block-shaped region similar to that of a CU or PU. As used herein, the term "block" can indicate a CB, PB, TB, CTU, CU, PU, TU, macroblock, sub-macroblock partition, or other set of sample values, depending on context.

Returning to FIG. 3, the encoder represents an intra-coded block of a source frame (331) in terms of prediction from other, previously reconstructed sample values in the source frame (331). For intra block copy ("BC") prediction, an intra-picture estimator estimates the displacement of a block with respect to the other, previously reconstructed sample values. An intra-frame prediction reference region (or intra-prediction region, for short) is a region of samples in the frame that is used to generate BC-prediction values for the block. The intra-frame prediction region can be indicated with a block vector ("BV") value (determined in BV estimation). For intra spatial prediction for a block, the intra-picture estimator estimates the extrapolation of the neighboring reconstructed sample values into the block. The intra-picture estimator can output prediction information (such as BV values for intra BC prediction, or prediction mode (direction) for intra spatial prediction), which is entropy coded. An intra-frame prediction predictor applies the prediction information to determine intra prediction values.

The encoder (340) represents an inter-frame coded, predicted block of a source frame (331) in terms of prediction from reference frames. A motion estimator estimates the motion of the block with respect to one or more reference frames (369). The motion estimator can select a motion vector ("MV") precision (e.g., integer-sample MV precision, 1/2-sample MV precision, or 1/4-sample MV precision) as described herein, then use the selected MV precision during motion estimation. When multiple reference frames are used, the multiple reference frames can be from different temporal directions or the same temporal direction. A motion-compensated prediction reference region is a region of samples in the reference frame(s) that is used to generate motion-compensated prediction values for a block of samples of the current frame. The motion estimator outputs motion information, such as MV information, which is entropy coded. A motion compensator applies MV values having the selected MV precision to the reference frames (369) to determine motion-compensated prediction values for inter-frame prediction.
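One of the selection approaches summarized in the abstract bases the choice of MV precision on how prevalent MV values with a zero fractional part are within a set of fractional-sample MV values. The Python sketch below illustrates that idea under stated assumptions: MV components are stored as integers in quarter-sample units, and the function name, the 0.9 threshold, and the fallback for an empty set are hypothetical choices for the example, not values given in this description.

```python
# Sketch only: choose MV precision from the prevalence of MV values whose
# fractional part is zero. The threshold is an illustrative assumption.

def select_mv_precision(mvs_quarter_sample, threshold=0.9):
    """mvs_quarter_sample: list of (mvx, mvy) integers in 1/4-sample units."""
    if not mvs_quarter_sample:
        return "quarter"  # assumption: keep fractional precision by default
    # In 1/4-sample units, a zero fractional part means a multiple of 4.
    zero_fraction = sum(
        1 for mvx, mvy in mvs_quarter_sample if mvx % 4 == 0 and mvy % 4 == 0
    )
    prevalence = zero_fraction / len(mvs_quarter_sample)
    return "integer" if prevalence >= threshold else "quarter"


# All three MVs land on integer sample positions.
print(select_mv_precision([(4, 0), (-8, 12), (0, 0)]))
# Only half of these MVs have a zero fractional part.
print(select_mv_precision([(4, 0), (1, 2), (3, 3), (0, 0)]))
```

The first call selects integer-sample precision and the second keeps quarter-sample precision, since only two of its four MV values have a zero fractional part.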

The encoder can determine the differences (if any) between a block's prediction values (intra or inter) and the corresponding original values. These prediction residual values are further encoded using a frequency transform (if the frequency transform is not skipped), quantization, and entropy encoding. For example, the encoder (340) sets values for a quantization parameter ("QP") for a picture, tile, slice, and/or other portion of video, and quantizes transform coefficients accordingly. The entropy coder of the encoder (340) compresses quantized transform coefficient values as well as certain side information (e.g., MV information, selected MV precision, BV values, QP values, mode decisions, parameter choices). Typical entropy coding techniques include Exponential-Golomb coding, Golomb-Rice coding, arithmetic coding, differential coding, Huffman coding, run length coding, variable-length-to-variable-length ("V2V") coding, variable-length-to-fixed-length ("V2F") coding, Lempel-Ziv ("LZ") coding, dictionary coding, probability interval partitioning entropy coding ("PIPE"), and combinations of the above. The entropy coder can use different coding techniques for different kinds of information, can apply multiple techniques in combination (e.g., by applying Golomb-Rice coding followed by arithmetic coding), and can choose from among multiple code tables within a particular coding technique. In some implementations, the frequency transform can be skipped. In this case, prediction residual values can be quantized and entropy coded.
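To make one of the listed techniques concrete, the following Python sketch implements order-0 Exponential-Golomb coding on bit strings. It is an illustration rather than the entropy coder of any particular embodiment: the function names are hypothetical, and the signed-to-unsigned mapping shown is the convention commonly used for signed syntax elements in H.264 and HEVC, included here as an assumption about how signed values such as MV differences might be prepared for coding.

```python
# Sketch only: order-0 Exponential-Golomb coding on bit strings.

def exp_golomb_encode(value):
    """Return the codeword for a non-negative integer as a string of bits."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]  # binary form of value + 1
    # Prefix: one leading zero for each bit after the first bit of the suffix.
    return "0" * (len(binary) - 1) + binary


def exp_golomb_decode(bits):
    """Decode a single codeword produced by exp_golomb_encode."""
    leading_zeros = len(bits) - len(bits.lstrip("0"))
    suffix = bits[leading_zeros:2 * leading_zeros + 1]
    return int(suffix, 2) - 1


def signed_to_unsigned(value):
    """Map signed values to code numbers: 0, 1, -1, 2, -2 -> 0, 1, 2, 3, 4."""
    return 2 * value - 1 if value > 0 else -2 * value


for v in range(5):
    print(v, exp_golomb_encode(v))
```

Small values get short codewords ("1" for 0, "010" for 1, "011" for 2, "00100" for 3), which suits data such as small MV differences that cluster near zero.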

An adaptive deblocking filter is included within the motion compensation loop in the encoder (340) (that is, "in-loop" filtering) to smooth discontinuities across block boundary rows and/or columns in a decoded frame. Other filtering (such as de-ringing filtering, adaptive loop filtering ("ALF"), or sample-adaptive offset ("SAO") filtering; not shown) can alternatively or additionally be applied as in-loop filtering operations.

The coded frames (341) and MMCO/RPS information (342) (or information equivalent to the MMCO/RPS information (342), since the dependencies and ordering structures for frames are already known at the encoder (340)) are processed by a decoding process emulator (350). The decoding process emulator (350) implements some of the functionality of a decoder, for example, decoding tasks to reconstruct reference frames. In a manner consistent with the MMCO/RPS information (342), the decoding process emulator (350) determines whether a given coded frame (341) needs to be reconstructed and stored for use as a reference frame in inter-frame prediction of subsequent frames to be encoded. If a coded frame (341) needs to be stored, the decoding process emulator (350) models the decoding process that would be conducted by a decoder that receives the coded frame (341) and produces a corresponding decoded frame (351). In doing so, when the encoder (340) has used decoded frame(s) (369) that have been stored in the decoded frame storage area (360), the decoding process emulator (350) also uses the decoded frame(s) (369) from the storage area (360) as part of the decoding process.

The decoded frame temporary memory storage area (360) includes multiple frame buffer storage areas (361, 362, ..., 36n). In a manner consistent with the MMCO/RPS information (342), the decoding process emulator (350) manages the contents of the storage area (360) in order to identify any frame buffers (361, 362, etc.) holding frames that are no longer needed by the encoder (340) for use as reference frames. After modeling the decoding process, the decoding process emulator (350) stores a newly decoded frame (351) in a frame buffer (361, 362, etc.) that has been identified in this manner.

The coded frames (341) and MMCO/RPS information (342) are buffered in a temporary coded data area (370). The coded data that is aggregated in the coded data area (370) contains, as part of the syntax of an elementary coded video bitstream, encoded data for one or more pictures. The coded data that is aggregated in the coded data area (370) can also include media metadata relating to the coded video data (e.g., as one or more parameters in one or more supplemental enhancement information ("SEI") messages or video usability information ("VUI") messages).

The aggregated data (371) from the temporary coded data area (370) is processed by a channel encoder (380). The channel encoder (380) can packetize and/or multiplex the aggregated data for transmission or storage as a media stream (e.g., according to a media program stream or transport stream format such as ITU-T H.222.0 | ISO/IEC 13818-1, or an Internet real-time transport protocol format such as IETF RFC 3550), in which case the channel encoder (380) can add syntax elements as part of the syntax of the media transmission stream. Or, the channel encoder (380) can organize the aggregated data for storage as a file (e.g., according to a media container format such as ISO/IEC 14496-12), in which case the channel encoder (380) can add syntax elements as part of the syntax of the media storage file. Or, more generally, the channel encoder (380) can implement one or more media system multiplexing protocols or transport protocols, in which case the channel encoder (380) can add syntax elements as part of the syntax of the protocol(s). The channel encoder (380) provides output to a channel (390), which represents storage, a communications connection, or another channel for the output. The channel encoder (380) or channel (390) may also include other elements (not shown), e.g., for forward-error correction ("FEC") encoding and analog signal modulation.

IV. Example Video Encoder

FIGS. 4a and 4b are a block diagram of a generalized video encoder (400) in which some described embodiments may be implemented. The encoder (400) receives a sequence of video pictures including a current picture as an input video signal (405) and produces encoded data in a coded video bitstream (495) as output.

The encoder (400) is block-based and uses a block format that depends on implementation. Blocks may be further sub-divided at different stages, e.g., at the prediction, frequency transform, and/or entropy encoding stages. For example, a picture can be divided into 64x64 blocks, 32x32 blocks, or 16x16 blocks, which can in turn be divided into smaller blocks of sample values for coding and decoding. In implementations of encoding for the HEVC standard, the encoder partitions a picture into CTUs (CTBs), CUs (CBs), PUs (PBs), and TUs (TBs).

The encoder (400) compresses pictures using intra-picture coding and/or inter-picture coding. Many of the components of the encoder (400) are used for both intra-picture coding and inter-picture coding. The exact operations performed by those components can vary depending on the type of information being compressed.

A tiling module (410) optionally partitions a picture into multiple tiles of the same size or different sizes. For example, the tiling module (410) splits the picture along tile rows and tile columns that, together with the picture boundaries, define the horizontal and vertical boundaries of tiles within the picture, where each tile is a rectangular region.

The general encoding control (420) receives pictures for the input video signal (405) as well as feedback (not shown) from various modules of the encoder (400). Overall, the general encoding control (420) provides control signals (not shown) to other modules (such as the tiling module (410), transformer/scaler/quantizer (430), scaler/inverse transformer (435), intra-picture estimator (440), motion estimator (450), and intra/inter switch) to set and change coding parameters during encoding. In particular, in conjunction with the motion estimator (450), the general encoding control (420) can determine MV precision during encoding. The general encoding control (420) can also evaluate intermediate results during encoding, for example, by performing rate-distortion analysis. The general encoding control (420) produces general control data (422) that indicates decisions made during encoding, so that a corresponding decoder can make consistent decisions. The general control data (422) is provided to the header formatter/entropy coder (490).
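The abstract mentions a rate-distortion analysis that is biased towards integer-sample MV precision. A minimal Python sketch of such a decision follows, using the common Lagrangian cost J = D + λR. The function name, the candidate data layout, and the multiplicative form of the bias are all assumptions made for illustration; this description does not specify how the bias is applied.

```python
# Sketch only: pick an MV precision by Lagrangian rate-distortion cost,
# with an optional bias favoring integer-sample precision.
# The multiplicative bias is a hypothetical choice for this example.

def choose_mv_precision_rd(candidates, lam, integer_bias=1.0):
    """candidates: dict mapping precision name -> (distortion, rate_bits).

    integer_bias below 1.0 scales down the cost of the "integer" candidate,
    so integer-sample precision wins when the costs are close.
    """
    def biased_cost(item):
        precision, (distortion, rate_bits) = item
        cost = distortion + lam * rate_bits  # J = D + lambda * R
        if precision == "integer":
            cost *= integer_bias
        return cost

    best_precision, _ = min(candidates.items(), key=biased_cost)
    return best_precision


measured = {"integer": (100.0, 10), "quarter": (95.0, 14)}
print(choose_mv_precision_rd(measured, lam=1.0))                     # quarter
print(choose_mv_precision_rd(measured, lam=1.0, integer_bias=0.95))  # integer
```

In the example, the unbiased costs are 110 for integer-sample precision and 109 for quarter-sample precision, so the unbiased choice is quarter-sample; a bias of 0.95 lowers the integer-sample cost to 104.5 and flips the decision.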

If the current picture is predicted using inter-picture prediction, a motion estimator (450) estimates the motion of blocks of sample values of the current picture of the input video signal (405) with respect to one or more reference pictures. The motion estimator (450) can select a motion vector ("MV") precision (e.g., integer-sample MV precision, 1/2-sample MV precision, or 1/4-sample MV precision) as described herein, then use the selected MV precision during motion estimation. The decoded picture buffer (470) buffers one or more reconstructed previously coded pictures for use as reference pictures. When multiple reference pictures are used, the multiple reference pictures can be from different temporal directions or the same temporal direction. The motion estimator (450) produces, as side information, motion data (452) such as MV data, merge mode index values, and reference picture selection data, as well as side information that indicates the selected MV precision. The side information including the motion data (452) is provided to the header formatter/entropy coder (490) as well as the motion compensator (455).

The motion compensator (455) applies MV values having the selected MV precision to the reconstructed reference picture(s) from the decoded picture buffer (470). When the chroma data for a picture has the same resolution as the luma data (e.g., when the format is YUV 4:4:4 format or RGB 4:4:4 format), the MV value that is applied for a chroma block may be the same as the MV value applied for the luma block. On the other hand, when the chroma data for a picture has reduced resolution relative to the luma data (e.g., when the format is YUV 4:2:0 format or YUV 4:2:2 format), the MV value that is applied for a chroma block may be an MV value that has been scaled down and possibly rounded to adjust for the difference in chroma resolution. For YUV 4:2:0 format, this is done by dividing the vertical and horizontal components of the MV value by two and truncating or rounding them to the precision used for the chroma motion compensation process. For YUV 4:2:2 format, this is done by dividing the horizontal component of the MV value by two and truncating or rounding it to the precision used for the chroma motion compensation process. The motion compensator (455) produces motion-compensated predictions for the current picture.
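The chroma MV adjustment described above can be sketched as follows. This Python example assumes, for illustration only, that luma and chroma MV components are both stored as integers in the same fractional-sample unit, so that dividing by two can leave a half unit that must be truncated or rounded. The function names and the rounding convention (halves rounded away from zero) are hypothetical.

```python
# Sketch only: scale a luma MV for use with reduced-resolution chroma.
# Assumes integer MV components in a shared fractional-sample unit.

def _halve(component, mode):
    """Divide an integer MV component by two, truncating or rounding."""
    sign = -1 if component < 0 else 1
    magnitude = abs(component)
    if mode == "truncate":
        return sign * (magnitude // 2)        # toward zero
    if mode == "round":
        return sign * ((magnitude + 1) // 2)  # halves away from zero
    raise ValueError("mode must be 'truncate' or 'round'")


def chroma_mv(luma_mv, fmt, mode="round"):
    """Return the MV applied to a chroma block for the given format."""
    mvx, mvy = luma_mv
    if fmt == "4:4:4":
        return (mvx, mvy)                                # same resolution
    if fmt == "4:2:2":
        return (_halve(mvx, mode), mvy)                  # horizontal only
    if fmt == "4:2:0":
        return (_halve(mvx, mode), _halve(mvy, mode))    # both components
    raise ValueError("unsupported format: " + fmt)


print(chroma_mv((5, -3), "4:2:0", "truncate"))  # (2, -1)
print(chroma_mv((5, -3), "4:2:0", "round"))     # (3, -2)
print(chroma_mv((5, -3), "4:2:2", "round"))     # (3, -3)
```

Note that when the selected precision is integer-sample, halving can still produce a position between chroma samples, which is why the text ties the truncation or rounding to the precision of the chroma motion compensation process.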

In a separate path within the encoder (400), the intra-picture estimator (440) determines how to perform intra-picture prediction for blocks of sample values of the current picture of the input video signal (405). The current picture can be coded entirely or partially using intra-picture coding. For intra spatial prediction, using values of the reconstruction (438) of the current picture, the intra-picture estimator (440) determines how to spatially predict the sample values of a current block of the current picture from neighboring, previously reconstructed sample values of the current picture. Or, for intra block copy (BC) prediction using block vector (BV) values, the intra-picture estimator (440) estimates the displacement of the sample values of the current block to different candidate regions within the current picture.

The intra-picture estimator (440) produces, as side information, intra prediction data (442) such as information indicating whether intra prediction uses spatial prediction or intra BC prediction (e.g., a flag value per intra block), the prediction mode direction (for intra spatial prediction), and BV values (for intra BC prediction). The intra prediction data (442) is provided to the header formatter/entropy coder (490) as well as to the intra-picture predictor (445).

According to the intra prediction data (442), the intra-picture predictor (445) spatially predicts the sample values of a current block of the current picture from neighboring, previously reconstructed sample values of the current picture. Or, for intra BC prediction, the intra-picture predictor (445) predicts the sample values of the current block using previously reconstructed sample values of an intra-prediction region, which is indicated by the BV value for the current block.

The intra/inter switch selects values of a motion-compensated prediction or an intra-picture prediction for use as the prediction (458) for a given block. When residual coding is not skipped, the difference (if any) between a block of the prediction (458) and the corresponding part of the original current picture of the input video signal (405) provides values of the residual (418). During reconstruction of the current picture, when residual values have been encoded/signaled, the reconstructed residual values are combined with the prediction (458) to produce a reconstruction (438) of the original content from the video signal (405). In lossy compression, however, some information is still lost from the video signal (405).

In the transformer/scaler/quantizer (430), when the frequency transform is not skipped, a frequency transformer converts spatial-domain video data into frequency-domain (i.e., spectral, transform) data. For block-based video coding, the frequency transformer applies a discrete cosine transform (DCT), an integer approximation thereof, or another type of forward block transform (e.g., a discrete sine transform or an integer approximation thereof) to blocks of prediction residual data (or sample value data if the prediction (458) is null), producing blocks of frequency transform coefficients. The encoder (400) may also be able to indicate that such a transform step is skipped. The scaler/quantizer scales and quantizes the transform coefficients. For example, the quantizer applies dead-zone scalar quantization to the frequency-domain data with a quantization step size that varies on a frame-by-frame basis, tile-by-tile basis, slice-by-slice basis, block-by-block basis, frequency-specific basis, or other basis. The quantized transform coefficient data (432) is provided to the header formatter/entropy coder (490). If the frequency transform is skipped, the scaler/quantizer can scale and quantize the blocks of prediction residual data (or sample value data if the prediction (458) is null), producing quantized values that are provided to the header formatter/entropy coder (490).
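
As a rough illustration of dead-zone scalar quantization, the sketch below uses a common textbook formulation: the magnitude is divided by the step size and a rounding offset below 0.5 is added before taking the floor, which widens the interval mapped to zero. The function names and the default offset of 1/3 are assumptions chosen for illustration; the source does not specify them.

```python
import math


def deadzone_quantize(coeff, step, rounding_offset=1 / 3):
    """Quantize one coefficient; an offset below 0.5 widens the zero bin."""
    sign = -1 if coeff < 0 else 1
    level = math.floor(abs(coeff) / step + rounding_offset)
    return sign * level


def dequantize(level, step):
    """Inverse scaling: reconstruct an approximate coefficient value."""
    return level * step
```

With a step size of 4, a coefficient of 2 is quantized to 0 with the default offset, whereas an offset of 0.5 (ordinary rounding) would map it to 1.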

In the scaler/inverse transformer (435), a scaler/inverse quantizer performs inverse scaling and inverse quantization on the quantized transform coefficients. An inverse frequency transformer performs an inverse frequency transform, producing blocks of reconstructed prediction residual values or reconstructed sample values. If the transform stage has been skipped, the inverse frequency transform is also skipped. In that case, the scaler/inverse quantizer can perform inverse scaling and inverse quantization on blocks of prediction residual data (or sample value data), producing reconstructed values. When residual values have been encoded/signaled, the encoder (400) combines the reconstructed residual values with values of the prediction (458) (e.g., motion-compensated prediction values, intra-picture prediction values) to form the reconstruction (438). When residual values have not been encoded/signaled, the encoder (400) uses the values of the prediction (458) as the reconstruction (438).

For intra-picture prediction, the values of the reconstruction (438) can be fed back to the intra-picture estimator (440) and the intra-picture predictor (445). The values of the reconstruction (438) can also be used for motion-compensated prediction of subsequent pictures. The values of the reconstruction (438) can be further filtered. For a given picture of the video signal (405), a filtering control (460) determines how to perform deblocking filtering and sample adaptive offset (SAO) filtering on values of the reconstruction (438). The filtering control (460) produces filter control data (462), which is provided to the header formatter/entropy coder (490) and the merger/filter(s) (465).

In the merger/filter(s) (465), the encoder (400) merges content from different tiles into a reconstructed version of the picture. The encoder (400) selectively performs deblocking filtering and SAO filtering according to the filter control data (462), so as to adaptively smooth discontinuities across boundaries in the frames. Other filtering (such as de-ringing filtering or ALF, not shown) can alternatively or additionally be applied. Tile boundaries can be selectively filtered or not filtered at all, depending on the settings of the encoder (400), and the encoder (400) can provide syntax within the coded bitstream to indicate whether or not such filtering was applied. The decoded picture buffer (470) buffers the reconstructed current picture for use in subsequent motion-compensated prediction.

The header formatter/entropy coder (490) formats and/or entropy codes the general control data (422), quantized transform coefficient data (432), intra prediction data (442), motion data (452), and filter control data (462). MV values can be predictively coded. For example, after MV prediction, the header formatter/entropy coder (490) uses Exponential-Golomb coding for entropy coding of various syntax elements, such as syntax elements for differential MV values.
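
To make the Exponential-Golomb step concrete, here is a minimal sketch of order-0 Exp-Golomb codes for unsigned and signed integers, using the conventional signed mapping (positive k maps to 2k - 1, non-positive k maps to -2k). The helper names `ue_bits` and `se_bits` are hypothetical, and the choice of order 0 is an assumption; the exact binarization of differential MV values depends on the codec.

```python
def ue_bits(code_num):
    """Order-0 Exp-Golomb code for an unsigned integer, as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    binary = bin(code_num + 1)[2:]  # binary digits of code_num + 1
    # Prefix with as many zeros as there are digits after the leading 1.
    return "0" * (len(binary) - 1) + binary


def se_bits(value):
    """Order-0 Exp-Golomb code for a signed integer, as a bit string."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue_bits(code_num)
```

A zero difference costs a single bit (`"1"`), and code length grows by two bits each time the magnitude roughly doubles.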

The header formatter/entropy coder (490) provides the encoded data in the coded video bitstream (495). The format of the coded video bitstream (495) can be a variation or extension of the HEVC format, Windows Media Video (WMV) format, VC-1 format, MPEG-x format (e.g., MPEG-1, MPEG-2, or MPEG-4), H.26x format (e.g., H.261, H.262, H.263, H.264), or another format.

Depending on the implementation and the type of compression desired, modules of the encoder can be added, omitted, split into multiple modules, combined with other modules, and/or replaced with like modules. In alternative embodiments, encoders with different modules and/or other configurations of modules perform one or more of the described techniques. Specific embodiments of encoders typically use a variation or supplemented version of the encoder (400). The relationships shown between modules within the encoder (400) indicate general flows of information in the encoder; other relationships are not shown for the sake of simplicity.

V. Selection of MV Precision During Encoding

This section presents various approaches to selecting motion vector (MV) precision during encoding. These approaches can facilitate compression that is effective in terms of rate-distortion performance and/or computational efficiency of encoding and decoding.

The approaches described herein for selecting MV precision can be applied when encoding any type of video. In particular, however, selecting MV precision as described herein can improve performance when encoding certain artificially created video content such as screen capture content.

A. Types of Video

In general, screen capture video (also called screen content video or screen capture content) represents the output of a graphics rendering process that generates content for a computer screen or other display. This contrasts with natural video, which refers to video imagery captured from a camera sensor view of real-world objects, or to video having similar characteristics. Screen capture video typically contains rendered text, computer graphics, animation-generated content, or other similar types of content captured from the output of a rendering process for a computer display, as opposed to (or in addition to) camera-captured video content only. Common scenarios for encoding/decoding of screen capture content include remote desktop conferencing and encoding/decoding of graphical or text overlays on natural video or other "mixed content" video. Several of the innovations described herein are adapted for encoding screen capture video or other artificially created video. These innovations can also be used for natural video, but may not be as effective. Other innovations described herein are effective for encoding natural video or artificially created video.

FIG. 5 shows a computer desktop environment (510) with content that can provide input for screen capture. For example, screen capture video can represent a series of images of the entire computer desktop (511). Or, screen capture video can represent a series of images for one of the windows of the computer desktop environment, such as an app window (513) containing game content, a browser window (512) with web page content, or a window (514) with word processor content.

As computer-generated, artificially created video content, screen capture content tends to have relatively few discrete sample values, compared with natural video content captured using a video camera. For example, a region of screen capture content often contains a single uniform color, whereas a region in natural video content more likely contains colors that vary gradually. Also, screen capture content typically includes distinct structures (e.g., graphics, text characters) that are exactly repeated from frame to frame, even if the content may be spatially displaced (e.g., due to scrolling). Screen capture content is usually encoded in a format with high chroma sampling resolution (e.g., YUV 4:4:4 or RGB 4:4:4), although it can also be encoded in a format with lower chroma sampling resolution (e.g., YUV 4:2:0, YUV 4:2:2).

FIG. 6 shows mixed content video (620) that includes some natural video (621) and some artificially created video content. The artificially created video content includes a graphic (622) beside the natural video (621) and a ticker (623) running below the natural video (621). Like the screen capture content shown in FIG. 5, the artificially created video content shown in FIG. 6 tends to have relatively few discrete sample values. It also tends to have distinct structures (e.g., graphics, text characters) that are exactly repeated from frame to frame, even if displaced (e.g., due to scrolling).

Screen capture video or mixed content video can be periodically read from an output buffer for a display device, or from one or more other buffers that store frames. Or, screen capture video can be provided from a screen capture module (which may periodically read values from an output buffer for a display device, intercept display commands from an operating system module, or otherwise capture the sample values to be displayed). Screen capture video or mixed content video can come from a "live" stream or from a stream previously recorded in storage.

B. Different MV Precisions

In many encoding scenarios, when encoding screen capture video or other artificially created video content, most MV values represent integer-sample spatial displacements, and very few MV values represent fractional-sample spatial displacements. This provides opportunities to reduce MV precision in order to improve overall performance.

FIG. 7a shows motion compensation with an MV (720) having an integer-sample spatial displacement. The MV (720) indicates a spatial displacement of four samples to the left and one sample up, relative to the co-located position (710) in a reference picture for a current block. For example, for a 4x4 current block at position (64, 96) in a current picture, the MV (720) indicates a 4x4 prediction region (730) whose position is (60, 95) in the reference picture. The prediction region (730) includes reconstructed sample values at integer-sample positions in the reference picture. An encoder or decoder need not perform interpolation to determine the values of the prediction region (730).

FIG. 7b shows motion compensation with an MV (721) having a fractional-sample spatial displacement. The MV (721) indicates a spatial displacement of 3.75 samples to the left and 0.5 samples up, relative to the co-located position (710) in a reference picture for a current block. For example, for a 4x4 current block at position (64, 96) in a current picture, the MV (721) indicates a 4x4 prediction region (731) whose position is (60.25, 95.5) in the reference picture. The prediction region (731) includes interpolated sample values at fractional-sample positions in the reference picture. An encoder or decoder performs interpolation to determine the sample values of the prediction region (731). When fractional-sample spatial displacements are allowed, there are more candidate prediction regions that may match a current block, and thus the quality of motion-compensated prediction usually improves, at least for some types of video content (e.g., natural video).
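
The arithmetic behind the two figures can be checked with a short sketch. It assumes, for illustration only, that an MV is stored as a pair of integers in units of 1/`units_per_sample` of a sample, with negative values meaning left or up; the function name `prediction_region` is hypothetical.

```python
from fractions import Fraction


def prediction_region(block_pos, mv_units, units_per_sample):
    """Return the prediction region's top-left position and whether
    interpolation is needed (any coordinate is non-integer)."""
    x, y = block_pos
    dx = Fraction(mv_units[0], units_per_sample)
    dy = Fraction(mv_units[1], units_per_sample)
    ref_pos = (x + dx, y + dy)
    needs_interpolation = any(c.denominator != 1 for c in ref_pos)
    return ref_pos, needs_interpolation
```

For the block at (64, 96), an integer-sample MV of (-4, -1) lands on (60, 95) with no interpolation, while a quarter-sample MV of (-15, -2), i.e. 3.75 samples left and 0.5 samples up, lands on (60.25, 95.5) and requires interpolation.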

When MV precision is integer-sample precision for a unit of video, all MV values for blocks in the unit indicate integer-sample spatial displacements. When MV precision is a fractional-sample precision for a unit of video, an MV value for a block in the unit can indicate either a fractional-sample spatial displacement or an integer-sample spatial displacement. That is, when MV precision is a fractional-sample precision for a unit of video, some MV values for blocks in the unit can indicate fractional-sample spatial displacements, while other MV values for blocks in the unit indicate integer-sample spatial displacements.

When encoding a block using motion estimation and motion compensation, an encoder often computes the sample-by-sample differences (also called residual values or error values) between the sample values of the block and its motion-compensated prediction. The residual values may then be encoded. For the residual values, encoding efficiency depends on the complexity of the residual values and on how much loss or distortion is introduced as part of the compression process. In general, a good motion-compensated prediction closely approximates a block, so that the residual values are small-amplitude differences that can be encoded efficiently. On the other hand, a poor motion-compensated prediction often yields residual values that include larger-amplitude values, which are more difficult to encode efficiently. Encoders typically spend a large proportion of encoding time performing motion estimation, attempting to find good matches and thereby improve rate-distortion performance.
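
The residual computation can be illustrated with a small sketch. Blocks are represented here as lists of rows of integers, and the sum of absolute differences (SAD) is used as a simple stand-in for "how good the match is"; the source does not name a specific matching metric, so SAD is an assumption, as are the function names.

```python
def residual(block, prediction):
    """Sample-by-sample differences between a block and its prediction."""
    return [
        [sample - predicted for sample, predicted in zip(block_row, pred_row)]
        for block_row, pred_row in zip(block, prediction)
    ]


def sad(block, prediction):
    """Sum of absolute differences: lower values indicate a closer match."""
    return sum(abs(v) for row in residual(block, prediction) for v in row)
```

A prediction that matches the block exactly gives an all-zero residual and a SAD of 0, which is the common case for exactly repeated structures in screen content.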

When a codec uses MV values with integer-sample MV precision, an encoder and decoder need not perform interpolation operations between sample values of reference pictures for motion compensation, since the MV values indicate integer-sample spatial displacements. When a codec uses MV values with fractional-sample MV precision, an encoder and decoder may perform interpolation operations between sample values of reference pictures for motion compensation (adding computational complexity, at least for MV values that indicate fractional-sample spatial displacements), but motion-compensated predictions tend to approximate blocks more closely (leading to residual values with fewer significant values), compared with integer-sample MV precision.

C. Representation of MV Values

MV values are typically represented using integer values whose meaning depends on the associated MV precision. For integer-sample MV precision, for example, an integer value of 1 indicates a spatial displacement of 1 sample, an integer value of 2 indicates a spatial displacement of 2 samples, and so on. For 1/4-sample MV precision, for example, an integer value of 1 indicates a spatial displacement of 0.25 samples. Integer values of 2, 3, 4, and 5 indicate spatial displacements of 0.5, 0.75, 1.0, and 1.25 samples, respectively. Regardless of MV precision, the integer value can indicate the magnitude of the spatial displacement, and a separate flag value can indicate whether the displacement is negative or positive. The horizontal MV component and vertical MV component of a given MV value can be represented using two integer values. Thus, the meaning of the two integer values representing an MV value depends on MV precision. For example, for an MV value having a 2-sample horizontal displacement and no vertical displacement, if MV precision is 1/4-sample MV precision, the MV value is represented as (8, 0). If MV precision is integer-sample MV precision, however, the MV value is represented as (2, 0).
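
The dependence of the integer representation on MV precision can be sketched as a pair of conversions. The parameter `units_per_sample` (1 for integer-sample precision, 4 for 1/4-sample precision) and both function names are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction


def to_mv_integers(displacement, units_per_sample):
    """Convert a displacement in samples to the integer MV representation."""
    result = []
    for component in displacement:
        scaled = Fraction(component) * units_per_sample
        if scaled.denominator != 1:
            raise ValueError(
                f"{component} is not representable at 1/{units_per_sample} precision"
            )
        result.append(int(scaled))
    return tuple(result)


def to_samples(mv_integers, units_per_sample):
    """Interpret integer MV components as displacements in samples."""
    return tuple(Fraction(v, units_per_sample) for v in mv_integers)
```

A 2-sample horizontal displacement becomes (8, 0) at 1/4-sample precision and (2, 0) at integer-sample precision, and a displacement such as 0.5 samples cannot be represented at integer-sample precision at all.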

MV values in a bitstream of encoded video data are typically entropy coded (e.g., on an MV-component-wise basis). An MV value may also be differentially encoded relative to a predicted MV value (e.g., on an MV-component-wise basis). In many cases, the MV value equals the predicted MV value, so the differential MV value is zero, which can be encoded very efficiently. A differential MV value (or the MV value itself, if MV prediction is not used) can be entropy encoded using Exponential-Golomb coding, context-adaptive binary arithmetic coding, or another form of entropy coding. Although the exact relationship between an MV value (or differential MV value) and the encoded bits depends on the form of entropy coding used, in general, smaller values are encoded more efficiently (that is, using fewer bits) because they are more common, and larger values are encoded less efficiently (that is, using more bits) because they are less common.
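
The effect of MV magnitude on bit cost can be estimated with a sketch that counts bits for a signed order-0 Exp-Golomb code. This is an assumption made for illustration; an arithmetic coder would give different numbers, and the names `se_bit_length` and `mvd_cost` are hypothetical.

```python
def se_bit_length(value):
    """Bits used by a signed order-0 Exp-Golomb code for the given value."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    # A code for code_num has (n - 1) prefix zeros plus n digits,
    # where n is the number of binary digits of code_num + 1.
    return 2 * (code_num + 1).bit_length() - 1


def mvd_cost(mv, predicted_mv):
    """Total bits for the differential MV, coded per component."""
    return sum(se_bit_length(c - p) for c, p in zip(mv, predicted_mv))
```

Under these assumptions, with a zero predictor the same 2-sample horizontal displacement costs 10 bits when represented as (8, 0) at 1/4-sample precision and 6 bits when represented as (2, 0) at integer-sample precision; an MV equal to its predictor costs 2 bits.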

D. Adaptive MV Precision: Introduction

To summarize the preceding three sections, using MV values with integer-sample MV precision tends to reduce the bit rate associated with signaling MV values and to reduce the computational complexity of encoding and decoding (by avoiding interpolation of sample values at fractional-sample positions in reference pictures), but may reduce the quality of motion-compensated prediction and thus increase the amplitude of residual values, at least for some types of video content. On the other hand, using MV values with fractional-sample MV precision tends to increase the bit rate associated with signaling MV values and to increase the computational complexity of encoding and decoding (by including interpolation of sample values at fractional-sample positions in reference pictures), but may improve the quality of motion-compensated prediction and reduce the amplitude of residual values, at least for some types of video content. In general, computational complexity, the bit rate for signaling MV values, and the quality of motion-compensated prediction all increase as MV precision increases (e.g., from integer-sample to 1/2-sample, or from 1/2-sample to 1/4-sample), up to a point of diminishing returns. At the same time, although increased MV precision tends to increase the bit rate needed to signal MV values, when encoding natural content the associated improvement in the quality of motion-compensated prediction may reduce the bit rate needed to send an adequate approximation of the residual values, and thereby reduce the total bit rate needed to encode the video content with adequate picture quality.

When encoding screen capture video or other artificially created video content, the added costs of fractional-sample MV precision (in terms of bit rate and computational complexity) may be unjustified. For example, if most MV values represent integer-sample spatial displacements and very few MV values represent fractional-sample spatial displacements, the added costs of fractional-sample MV precision are not warranted. The encoder can skip searching at fractional-sample positions during motion estimation (and skip the interpolation operations that determine sample values at those positions). For such content, bit rate and computational complexity can be reduced, without a significant penalty to the quality of motion-compensated prediction, by using MV values with integer-sample MV precision.
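
One way an encoder might act on this observation is to measure how many MV values in a set have a zero fractional part and to prefer integer-sample precision when that share is high. The sketch below assumes MV values are stored in 1/4-sample units; the threshold of 0.9 and the function names are hypothetical and are not taken from the source.

```python
def fraction_integer_mvs(mvs, units_per_sample=4):
    """Share of MV values whose components both have a zero fractional part."""
    if not mvs:
        raise ValueError("need at least one MV value")
    integer_count = sum(
        1
        for mv_x, mv_y in mvs
        if mv_x % units_per_sample == 0 and mv_y % units_per_sample == 0
    )
    return integer_count / len(mvs)


def choose_precision(mvs, threshold=0.9, units_per_sample=4):
    """Pick integer-sample precision when integer displacements dominate."""
    share = fraction_integer_mvs(mvs, units_per_sample)
    return "integer" if share >= threshold else "quarter"
```

A set such as `[(-16, -4), (8, 0), (0, 0)]` contains only integer-sample displacements and would select integer-sample precision under this rule.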

Since fractional-sample MV precision may still be useful for other types of video content (e.g., natural video captured by a camera), an encoder and decoder can be adapted to switch between MV precisions. For example, an encoder and decoder can use integer-sample MV precision for screen capture video, but use a fractional-sample MV precision (such as 1/4-sample MV precision) for natural video. Approaches that an encoder may follow when selecting MV precision are described in the next section. The encoder can signal the selected MV precision to the decoder using one or more syntax elements in the bitstream.

In one approach to signaling MV precision, when adaptive selection of MV precision is enabled, the encoder selects the MV precision on a slice-by-slice basis. A flag value in a sequence parameter set (SPS), picture parameter set (PPS) or other syntax structure indicates whether adaptive selection of MV precision is enabled. If so, one or more syntax elements in the slice header for a given slice indicate the selected MV precision for the blocks of that slice. For example, a flag value of 0 indicates 1/4-sample MV precision, and a flag value of 1 indicates integer-sample MV precision.

In another approach to signaling MV precision, the encoder selects the MV precision on a picture-by-picture or slice-by-slice basis. A syntax element in the PPS indicates one of three MV precision modes: (0) 1/4-sample MV precision for the MV values of the slice(s) of a picture associated with the PPS, (1) integer-sample MV precision for the MV values of the slice(s) of a picture associated with the PPS, or (2) slice-adaptive MV precision that depends on a flag value signaled per slice header, where the flag value in the slice header of a slice can indicate 1/4-sample MV precision or integer-sample MV precision for the MV values of that slice. For additional details about this approach in one implementation, see JCTVC-P0277.
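The three-mode PPS signaling described above can be illustrated with a minimal Python sketch. The function name, the string constants, and the polarity of the slice-header flag in mode 2 (0 for 1/4-sample, 1 for integer-sample, mirroring the slice-level example given earlier) are assumptions made for illustration; they are not syntax element names from any standard or from JCTVC-P0277.

```python
QUARTER_SAMPLE = "quarter"   # hypothetical label for 1/4-sample MV precision
INTEGER_SAMPLE = "integer"   # hypothetical label for integer-sample MV precision


def slice_mv_precision(pps_mode, slice_flag=None):
    """Resolve the MV precision that applies to one slice.

    pps_mode   : 0 = 1/4-sample for all slices of the picture,
                 1 = integer-sample for all slices of the picture,
                 2 = slice-adaptive (decided by a flag in the slice header).
    slice_flag : consulted only in mode 2; assumed 0 = 1/4-sample, 1 = integer.
    """
    if pps_mode == 0:
        return QUARTER_SAMPLE
    if pps_mode == 1:
        return INTEGER_SAMPLE
    if pps_mode == 2:
        if slice_flag not in (0, 1):
            raise ValueError("mode 2 needs a slice-header flag of 0 or 1")
        return INTEGER_SAMPLE if slice_flag == 1 else QUARTER_SAMPLE
    raise ValueError("unknown MV precision mode: %r" % (pps_mode,))
```

In modes 0 and 1 the slice header carries no precision flag in this sketch, which is why `slice_flag` is optional.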

In yet another approach to signaling MV precision, when adaptive selection of MV precision is enabled, the encoder selects the MV precision on a CU-by-CU basis. One or more syntax elements in the structure for a given CU indicate the selected MV precision for the blocks of that CU. For example, a flag value in the CU syntax structure for a CU indicates whether the MV values for all PUs associated with the CU have integer-sample MV precision or 1/4-sample MV precision. For additional details about this approach in one implementation, see JCTVC-P0283.

In any of these approaches, the encoder and decoder can use different MV precisions for the horizontal and vertical MV components. This can be useful when encoding screen capture video that has been scaled horizontally or vertically (using integer-sample MV precision in the unscaled dimension and fractional-sample MV precision in the scaled dimension). In some implementations, if rate control cannot be achieved through adjustment of QP values alone, the encoder may resize screen capture video horizontally or vertically to reduce bit rate, then encode the resized video. At the decoder side, the video is scaled back to its original dimensions after decoding. The encoder can signal to the decoder the MV precision for horizontal MV components (e.g., with a first flag value or syntax element) and also signal the MV precision for vertical MV components (e.g., with a second flag value or syntax element).

More generally, when adaptive selection of MV precision is enabled, the encoder selects an MV precision in some way and signals the selected MV precision. For example, a flag value in an SPS, PPS or other syntax structure can indicate whether adaptive selection of MV precision is enabled. When adaptive MV precision is enabled, one or more syntax elements in sequence-layer syntax, group-of-pictures-layer syntax ("GOP-layer syntax"), picture-layer syntax, slice-layer syntax, tile-layer syntax, block-layer syntax or another syntax structure can indicate the selected MV precision for MV values. Or, one or more syntax elements in sequence-layer syntax, GOP-layer syntax, picture-layer syntax, slice-header-layer syntax, slice-data-layer syntax, tile-layer syntax, block-layer syntax or another syntax structure can indicate the selected MV precisions for different MV components. When there are two available MV precisions, a flag value can indicate a selection between the two. When there are more available MV precisions, a flag value can indicate a selection among those MV precisions.

Aside from modifications to signal/parse the syntax elements that indicate the selected MV precision(s), decoding can be modified to change how signaled MV values are interpreted depending on the selected MV precision. The details of how MV values are encoded and reconstructed can vary depending on MV precision. For example, when the MV precision is integer-sample precision, predicted MV values can be rounded to the nearest integer, and differential MV values can indicate integer-sample offsets. Or, when the MV precision is 1/4-sample precision, predicted MV values can be rounded to the nearest 1/4-sample offset, and differential MV values can indicate 1/4-sample offsets. Or, MV values can be signaled in some other way. When MV values have integer-sample MV precision and the video uses 4:2:2 or 4:2:0 chroma sampling, chroma MV values can be derived by scaling, etc., which may result in 1/2-sample displacements for chroma. Or, chroma MV values can be rounded to integer values.
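As an illustration of the precision-dependent reconstruction described above, the following Python sketch assumes that MV components are stored internally in quarter-sample units, that a differential MV value is carried in whole-sample units when integer-sample precision is selected, and that ties in rounding go away from zero. All three conventions, and the function names, are assumptions for this sketch rather than details stated in the text.

```python
from fractions import Fraction


def round_to_integer_sample(mv_q):
    """Round an MV component given in quarter-sample units to the nearest
    whole sample; the result is still expressed in quarter-sample units.
    Ties (a fractional part of exactly 1/2) round away from zero (assumption)."""
    sign = -1 if mv_q < 0 else 1
    return sign * ((abs(mv_q) + 2) // 4) * 4


def reconstruct_mv(pred_q, mvd, integer_precision):
    """Rebuild one MV component (in quarter-sample units) from its predictor
    and the signaled differential value."""
    if integer_precision:
        # Predictor snapped to a whole sample; differential counts whole samples.
        return round_to_integer_sample(pred_q) + 4 * mvd
    # Quarter-sample precision: differential already counts quarter samples.
    return pred_q + mvd


def chroma_displacement_420(luma_whole_samples):
    """With 4:2:0 sampling the chroma planes have half the luma resolution,
    so a whole-sample luma displacement maps to a multiple of 1/2 chroma sample."""
    return Fraction(luma_whole_samples, 2)
```

For example, a luma displacement of 3 whole samples corresponds to 3/2 chroma samples, which is the 1/2-sample chroma displacement mentioned above unless the chroma MV is rounded to an integer.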

E. Approaches to Selecting MV Precision

When MV precision can be adapted during video encoding and the encoder selects an MV precision for a unit of video, the encoder can select the MV precision(s) to use based on hints from the video source (see approach 1 below). For example, the video source can indicate that the video is screen capture content or natural video (captured from a camera). Or, the encoder can select the MV precision(s) based on exhaustive evaluation of the various MV precisions (see approach 2 below). Or, the encoder can select the MV precision(s) based on analysis of statistical data from previous units and/or statistical data for the current unit being encoded (see approaches 3 and 4 below).

Some of the approaches to selecting MV precision are adapted for screen capture encoding scenarios. Other approaches apply more generally when encoding any type of video content.

In some examples described in this section, the encoder selects between using 1/4-sample MV precision and using integer-sample MV precision. More generally, the encoder selects among multiple available MV precisions, which can include integer-sample MV precision, 1/2-sample MV precision, 1/4-sample MV precision and/or another MV precision.

When the encoder selects an MV precision for a unit of video, the unit of video can be a sequence, GOP, picture, slice, tile, CU, PU, other block or other type of unit of video. Depending on the desired tradeoff between complexity and flexibility, it may be appropriate to select the MV precision on a highly local basis (e.g., CU by CU), on a larger region basis (e.g., tile by tile or slice by slice), on a whole-picture basis, or on a more global basis (e.g., per encoding session, per sequence, per GOP, or per series of pictures between detected scene changes).

1. Approaches that Use Hints from an Application, Operating System or Video Source

The encoder can select the MV precision based on a hint signaled by an application, operating system or video source. For example, the hint can indicate that the video content to be encoded was rendered by a particular application, such as a word processor, spreadsheet application or Web browser (without an embedded video region, which may be natural video content). Rendering by such an application tends to produce integer-sample spatial displacements of the content. Based on such a hint, the encoder can select integer-sample MV precision. For content rendered by a word processor, spreadsheet application, Web browser or other application that does not usually render natural video content, integer-sample MV precision is likely to be preferable to fractional-sample MV precision. (Fractional-sample MV precision may be preferable, however, if the video has been resized.)

Or, the hint can indicate that the video content was delivered by a screen capture module or other video source that typically delivers artificially created video content. For such content, integer-sample MV precision is likely to be preferable to fractional-sample MV precision, so the encoder selects integer-sample MV precision. (Fractional-sample MV precision may be preferable, however, if the video has been resized.)

On the other hand, if the hint indicates that the video content was delivered by a camera, a DVD or other disk, or a tuner card, or was rendered by a video player, the encoder can select a fractional-sample MV precision. For such content, fractional-sample MV precision is likely to be preferable to integer-sample MV precision.

A hint can apply to an encoding session, to a series of frames, to a single video frame, or to part of a video frame (such as an area corresponding to a window associated with an application).

In some cases, the encoder may not receive, or may be unable to interpret, a hint provided by a video source, operating system or application about the nature of the video content. Or, the hint may be incorrect or misleading (e.g., for mixed-content video that includes both natural video content and artificially created video content, or for video that has been resized). In such cases, the encoder can use another approach to determine which MV precision(s) should be selected.
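The hint-driven selection in this subsection can be summarized in a short Python sketch. The hint strings, the `resized` argument and the convention of returning `None` for a missing or unrecognized hint (so that the caller can fall back on another approach) are hypothetical and chosen only for illustration.

```python
# Hypothetical hint labels; a real system would define its own identifiers.
ARTIFICIAL_SOURCES = {"word_processor", "spreadsheet", "web_browser", "screen_capture"}
NATURAL_SOURCES = {"camera", "dvd", "tuner_card", "video_player"}


def precision_from_hint(hint, resized=False):
    """Map a source hint to an MV precision, or None when the hint is
    missing or not understood."""
    if hint in NATURAL_SOURCES:
        return "fractional"
    if hint in ARTIFICIAL_SOURCES:
        # Resized artificial content may no longer move by whole samples.
        return "fractional" if resized else "integer"
    return None  # caller should use a different selection approach
```

A hint can also be wrong (for example, for mixed content), so an encoder might treat the result as a starting point rather than a final decision.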

2. Brute-Force Encoding Approaches

In another set of approaches to selecting MV precision, the encoder encodes a unit of video multiple times using different MV precisions (e.g., once with integer-sample MV precision and once with 1/4-sample MV precision). The encoder selects the MV precision that provides the best performance, and uses the selected MV precision when encoding the unit for output. The unit of video can be a block, PU, CU, slice, tile, picture, GOP, sequence or other type of unit of video. Typically, the encoder performs multiple passes of encoding in such approaches.

To evaluate which MV precision provides the best performance, the encoder can determine the rate-distortion cost when each of the different MV precisions is used during encoding of the unit, and select the option with the lowest rate-distortion cost. A rate-distortion cost has a distortion cost (D) and a bit rate cost (R), with a factor λ (often called a Lagrangian multiplier) that weights the bit rate cost relative to the distortion cost (D+λR) or vice versa (R+λD). The bit rate cost can be an estimated or actual bit rate cost. In general, the distortion cost is based on a comparison of original samples to reconstructed samples. The distortion cost can be measured as a sum of absolute differences (SAD), sum of absolute Hadamard-transformed differences (SAHD) or other sum of absolute transformed differences (SATD), sum of squared errors (SSE), mean squared error (MSE), mean variance or another distortion metric. The factor λ can vary during encoding (e.g., increasing the relative weight of the bit rate cost when the quantization step size is larger). Rate-distortion cost usually provides the most accurate assessment of the performance of the different MV precision options, but it also has the highest computational complexity.
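A minimal Python sketch of the selection by lowest rate-distortion cost, using the D+λR form, follows. The dictionary layout and the numeric values in the usage note are made up for illustration; how D and R are actually measured is left to the encoder.

```python
def rd_cost(distortion, rate_bits, lam):
    """Rate-distortion cost in the D + lambda * R form."""
    return distortion + lam * rate_bits


def pick_precision(candidates, lam):
    """Return the MV precision with the lowest rate-distortion cost.

    candidates maps a precision label to a (distortion, rate_bits) pair
    obtained by encoding the unit once with that precision.
    """
    return min(candidates, key=lambda p: rd_cost(candidates[p][0], candidates[p][1], lam))
```

With hypothetical results `{"integer": (1000, 200), "quarter": (900, 260)}`, a multiplier of 2 favors integer-sample precision (cost 1400 versus 1420), while a multiplier of 1 favors 1/4-sample precision (cost 1200 versus 1160), which shows how a larger λ shifts the decision toward the cheaper-to-signal option.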

The encoder can vary one or more of the terms of the rate-distortion cost function to bias the rate-distortion analysis towards the integer-sample MV precision option. For example, when rate-distortion analysis is used to decide among multiple MV precisions for a unit of video, the analysis is biased towards integer-sample MV precision by scaling the distortion cost, adding a penalty to the distortion cost, scaling the bit rate cost, adding a penalty to the bit rate cost, and/or adjusting the Lagrangian multiplier factor. When evaluating a fractional-sample MV precision, the encoder can scale up the distortion cost (e.g., by a factor greater than 1), scale up the bit rate cost (by a factor greater than 1), add a distortion penalty, add a bit rate penalty, and/or use a larger Lagrangian multiplier factor. Or, when evaluating the integer-sample MV precision, the encoder can scale down the distortion cost (e.g., by a factor less than 1), scale down the bit rate cost (by a factor less than 1), and/or use a smaller Lagrangian multiplier factor.
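The biasing options listed above can be sketched as a single cost function in Python. This sketch applies the bias only when a fractional-sample option is evaluated; the parameter names and default values are assumptions, and any subset of the knobs could be used.

```python
def biased_rd_cost(distortion, rate_bits, lam, is_fractional,
                   d_scale=1.0, r_scale=1.0,
                   d_penalty=0.0, r_penalty=0.0, lam_scale=1.0):
    """D + lambda * R, with the fractional-sample option made to look more
    expensive so that the decision leans towards integer-sample precision.

    d_scale, r_scale, lam_scale : factors >= 1 applied to fractional options
    d_penalty, r_penalty        : additive penalties for fractional options
    """
    if is_fractional:
        distortion = distortion * d_scale + d_penalty
        rate_bits = rate_bits * r_scale + r_penalty
        lam = lam * lam_scale
    return distortion + lam * rate_bits
```

With all knobs at their defaults the function reduces to the unbiased D+λR cost, so the degree of bias can be raised or lowered during encoding by changing the arguments.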

The encoder can vary the degree of bias towards or against integer-sample MV precision during encoding. For example, the encoder can adjust the bias towards integer-sample MV precision depending on its level of confidence that integer-sample MV values are likely to be more appropriate for encoding the video content (e.g., increasing the bias towards integer-sample MV precision if the video content is likely to be artificially created content). Or, the encoder can adjust the bias towards integer-sample MV precision depending on the computational capacity available for encoding and/or decoding (e.g., increasing the bias towards integer-sample MV precision if the available computational capacity is lower).

Alternatively, the encoder can use another approach to evaluate which MV precision provides the best performance. For example, the encoder measures which MV precision results in the fewest bits of encoded data for a given quantization step size. Or, the encoder evaluates only the distortion for encoding with the different MV precisions. Or, the encoder uses a simpler measure, such as the distortion-reduction benefit of fractional-sample MV precision compared to integer-sample MV precision, which may be simple enough to determine in a single pass of encoding. For example, the encoder examines the amount of distortion reduction (in terms of SAD, SATD, TSE, MSE or another distortion metric) when a fractional-sample MV precision is used, compared to when integer-sample MV precision is used.

Brute-force encoding approaches can be computationally intensive. Compared to encoding with a fixed MV precision, they potentially involve significant additional computation, additional memory storage, and additional memory read and write operations.

3. Approaches that Use Content Analysis

In another set of approaches to selecting MV precision, the encoder selects the MV precision for a unit of video based on analysis of input video content and/or encoded video content. The unit of video can be a block, PB, PU, CU, CTU, sub-macroblock partition, macroblock, slice, tile, picture, GOP, sequence or other type of unit of video.

FIG. 8 shows a technique (800) for adapting MV precision during encoding. The technique (800) can be performed by an encoder such as one described with reference to FIG. 3 or FIGS. 4a and 4b, or by another encoder. According to the technique (800), during encoding of video, the encoder determines an MV precision from among multiple MV precisions for units of the video. The multiple MV precisions can include one or more fractional-sample MV precisions as well as integer-sample MV precision. For example, the multiple MV precisions can include integer-sample MV precision and 1/4-sample MV precision. Or, the multiple MV precisions can include integer-sample MV precision, 1/2-sample MV precision and 1/4-sample MV precision.

Specifically, when encoding a unit of video, the encoder determines (810) whether to change the MV precision. At the start of encoding, the encoder can initially set the MV precision according to a default value, or proceed as if changing the MV precision. For later units of video, the encoder may use the current MV precision (which was used for one or more previously encoded units) or change the MV precision. For example, the encoder can decide to change the MV precision upon the occurrence of a defined event (e.g., after encoding of a threshold number of units, after a scene change, or after a determination that the type of video has changed).

To change the MV precision, the encoder collects (820) information about the video. In general, the collected information can be characteristics of input video or characteristics of encoded video. The collected information can relate to the current unit being encoded and/or to previously encoded units of the video. (When the collected information relates to one or more previously encoded units of the video, the collection (820) of such information can happen before, during or after the encoding of the previous unit(s). This collection (820) differs from the timing shown in FIG. 8 and happens regardless of the decision (810) about changing the MV precision.) The encoder then selects (830) the MV precision for the unit of the video based at least in part on the collected information.

As one example, the encoder can collect sample values for the current unit. The presence of a small number of discrete sample values tends to indicate screen capture content, and hence suggests that integer-sample MV precision should be selected. On the other hand, the presence of a large number of discrete sample values tends to indicate natural video, and hence suggests that fractional-sample MV precision should be selected. The sample values can be organized as a histogram. Sample values can be collected from only luma (Y) samples in a YUV color space, from luma as well as chroma (U, V) samples in a YUV color space, from R, G and B samples in an RGB color space, or from only G (or R or B) samples in an RGB color space. For example, when selecting the MV precision, the encoder determines a count of distinct sample values among the collected sample values. The encoder compares the count to a threshold. If the count is lower than the threshold, the encoder selects integer-sample MV precision. If the count is higher than the threshold, the encoder selects a fractional-sample MV precision. The boundary condition (count equals the threshold) can be handled using either option, depending on implementation. Or, the encoder otherwise considers statistics from the collected sample values. For example, the encoder determines whether the x most common collected sample values account for more than y% of the sample values. If so, the encoder selects integer-sample MV precision; otherwise, the encoder selects a fractional-sample MV precision. The values of x and y depend on implementation. The value of x can be 10 or some other count. The value of y can be 80, 90 or some other percentage less than 100.
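The two sample-value tests described above can be sketched in Python as follows. The function names, the flat list of sample values as input, and the choice to resolve the boundary condition in favor of fractional-sample precision are assumptions for illustration; the text leaves the boundary handling and the values of the threshold, x and y to the implementation.

```python
from collections import Counter


def select_by_distinct_values(samples, threshold):
    """Few distinct sample values suggests screen content (integer-sample MVs)."""
    distinct = len(set(samples))
    # Boundary case (distinct == threshold) is resolved as fractional here.
    return "integer" if distinct < threshold else "fractional"


def select_by_dominant_values(samples, x=10, y_percent=90.0):
    """Check whether the x most common values cover more than y% of samples."""
    counts = Counter(samples)
    covered = sum(count for _, count in counts.most_common(x))
    # Compare 100 * covered against y% of the total without dividing,
    # which also avoids a division by zero for an empty input.
    return "integer" if 100.0 * covered > y_percent * len(samples) else "fractional"
```

For instance, a unit whose samples take only two values (such as black text on a white background) passes both tests, while a unit with an even spread over 256 values fails both.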

As another example, the encoder can collect distortion measures for blocks of the current unit encoded with the respective MV precisions. For example, the encoder records the improvement (reduction) in distortion when using a fractional-sample MV precision, compared to integer-sample MV precision. When selecting the MV precision, the encoder determines whether the reduction in distortion justifies the increase in MV precision.

As another example, the encoder can collect MV values (having a fractional-sample MV precision) for one or more previous units. The collected MV values can be organized according to the value of their fractional parts, e.g., for MV values with 1/4-sample MV precision, in a histogram with a bin for MV values having a fractional part of zero, a bin for MV values having a fractional part of 0.25, a bin for MV values having a fractional part of 0.5, and a bin for MV values having a fractional part of 0.75. A low-complexity variation of this approach is described in the next section.
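The four-bin histogram described above can be sketched in Python. This sketch assumes that MV components are integers in quarter-sample units, that horizontal and vertical components are counted separately, and that the fractional part of a negative value follows the floor convention (so -1, i.e. -0.25 sample, lands in the 0.75 bin); none of these conventions is fixed by the text.

```python
def fractional_histogram(mv_components_q):
    """Count MV components by fractional part.

    mv_components_q : iterable of integers in quarter-sample units.
    Returns [n_0, n_025, n_05, n_075] for fractional parts 0, 0.25, 0.5, 0.75.
    """
    bins = [0, 0, 0, 0]
    for component in mv_components_q:
        # Python's % always yields 0..3 for a positive modulus.
        bins[component % 4] += 1
    return bins


def zero_fraction_share(mv_components_q):
    """Share of components with a fractional part of zero (0.0 if empty)."""
    bins = fractional_histogram(mv_components_q)
    total = sum(bins)
    return bins[0] / total if total else 0.0
```

A high share in the first bin indicates that most motion in the previous unit(s) was by whole samples, which an encoder could weigh when choosing integer-sample MV precision.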

As another example, the encoder can collect information about the count of encoded bits for MV data (differential MV values) for blocks encoded using a fractional-sample MV precision. A low average number of bits for differential MV values indicates regular (predictable) motion and is more common when integer-sample MV precision would be appropriate. A high average number of bits for differential MV values is more common when fractional-sample MV precision would be appropriate. When selecting the MV precision, the encoder measures the average (or median) of the counts of encoded bits for differential MV values. The encoder compares the measure to a threshold. If the measure is lower than the threshold, the encoder selects integer-sample MV precision. If the measure is higher than the threshold, the encoder selects a fractional-sample MV precision. The boundary condition (measure equals the threshold) can be handled using either option, depending on implementation.
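The threshold test on differential-MV bit counts can be sketched in Python as follows. The function name, the per-block list of bit counts as input, and the handling of the boundary case (treated as fractional) are assumptions for illustration, and the threshold value is implementation dependent.

```python
from statistics import mean, median


def select_by_mvd_bits(bits_per_block, threshold, use_median=False):
    """Choose an MV precision from the bits spent on differential MV values.

    bits_per_block : non-empty list with the number of encoded bits used for
                     the differential MV value(s) of each block.
    """
    measure = median(bits_per_block) if use_median else mean(bits_per_block)
    # Boundary case (measure == threshold) is resolved as fractional here.
    return "integer" if measure < threshold else "fractional"
```

The median variant is less sensitive to a few blocks with very expensive MV differences than the mean variant, which is one reason an implementation might prefer it.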

As another example, when encoding a unit, the encoder evaluates multiple MV precisions per block (e.g., PU) of the unit and collects, per block, information indicating which MV precision provides the best performance for that block. The encoder can determine the rate-distortion cost (e.g., D+λR) when a block is encoded using integer-sample MV precision, and also determine the rate-distortion cost (e.g., D+λR) when the block is encoded using a fractional-sample MV precision. The encoder determines how many times each of the multiple MV precisions is best for the respective blocks within the unit, and selects the MV precision with the largest count. For example, for each of the blocks in a picture, the encoder determines the rate-distortion cost when the block is encoded using integer-sample MV precision, and also determines the rate-distortion cost when the block is encoded using 1/4-sample MV precision. The encoder counts the number of times integer-sample MV precision is better and the number of times 1/4-sample MV precision is better, then picks the one with the higher count. Alternatively, the encoder determines a count of how many times integer-sample MV precision is best for the blocks of the unit, then selects integer-sample MV precision only if the count is higher than a threshold percentage of the number of blocks in the unit. In some implementations, the encoder considers blocks with MVs of any value. In other implementations, the encoder considers only blocks with non-zero-value MVs. This block-wise evaluation of multiple MV precisions can be performed for the blocks of a given unit in order to select the MV precision for one or more subsequent units, regardless of the MV precision mode used for the given unit. Or, the block-wise evaluation of multiple MV precisions can be performed for a given unit in order to select the MV precision for that same unit.
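The block-wise counting described above can be sketched as follows. The input format (a dictionary per block holding a (D, R) pair for each precision), the function names, and the treatment of per-block ties as wins for integer-sample precision are assumptions made for illustration only.

```python
def rd_cost(distortion, rate_bits, lam):
    """Rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits

def vote_precision(blocks, lam, min_integer_share=None):
    """Select an MV precision for a unit by counting per-block winners.

    blocks: iterable of dicts with keys "int" and "frac", each a
    (distortion, rate_bits) tuple (hypothetical format).
    min_integer_share: if given, integer-sample precision is selected only
    when its share of wins exceeds this fraction of the blocks.
    """
    integer_wins = 0
    total = 0
    for block in blocks:
        total += 1
        # Assumption: a per-block tie counts for integer-sample precision.
        if rd_cost(*block["int"], lam) <= rd_cost(*block["frac"], lam):
            integer_wins += 1
    if total == 0:
        return "fractional"  # assumption: default when nothing was evaluated
    if min_integer_share is not None:
        return "integer" if integer_wins / total > min_integer_share else "fractional"
    return "integer" if integer_wins > total - integer_wins else "fractional"
```

With `min_integer_share` left unset the function implements the simple majority count; setting it implements the threshold-percentage variant.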

Alternatively, the encoder uses another approach to collecting information and selecting the MV precision based at least in part on the collected information.

Returning to FIG. 8, whether or not the MV precision has changed, the encoder encodes (840) the unit using the selected MV precision. MV values for blocks (e.g., PUs, macroblocks, or other blocks) within the unit of the video have the selected MV precision. The encoder outputs encoded data for the current unit, e.g., in a bitstream. The encoded data can include syntax elements that indicate the selected MV precision.

The encoder decides (850) whether to continue with the next unit. If so, the encoder decides (810) whether to change the MV precision for the next unit. In this way, MV precision can be selected for each unit (e.g., per segment, per GOP, per picture, per slice, per CTU, per CU, per PU, per PB, per macroblock, or per sub-macroblock partition). Or, to reduce complexity, the MV precision for a unit can be changed from time to time (e.g., periodically or upon the occurrence of a defined event), then repeated for one or more subsequent units.

When the encoder uses the same pattern of tiles from picture to picture, the encoder can repeat per-tile MV precisions from picture to picture. Co-located tiles from picture to picture can use the same MV precision. Similarly, co-located slices from picture to picture can use the same MV precision. For example, suppose the video depicts a computer desktop, and part of the desktop has a window that displays natural video content. A fractional-sample MV precision may be used within that region of the desktop from picture to picture, while other areas that show text or other rendered content are encoded using integer-sample MV precision.

In this set of approaches, the encoder can use single-pass encoding. For the current unit of video being encoded, the selected MV precision for the current unit can depend at least in part on information collected from one or more previous units of the video (previous in encoding order, which is also called decoding order or bitstream order, not in input order, which is also called temporal order, output order, or display order).

Alternatively, in this set of approaches, the encoder can use multi-pass encoding or encoding with a short look-ahead window (sometimes called 1.5-pass encoding). For the current unit of video being encoded, the selected MV precision depends at least in part on information collected from the current unit. The selected MV precision for the current unit can also depend at least in part on information collected from one or more previous units of the video (previous in encoding order, not input order).

In this set of approaches, the encoder can adjust the amount of bias towards or against integer-sample MV precision based at least in part on a degree of confidence that integer-sample MV precision is appropriate. The encoder can also adjust the amount of bias towards or against integer-sample MV precision based at least in part on the computational capacity available for encoding and/or decoding (favoring integer-sample MV precision to reduce computational complexity when less computational capacity is available). For example, to favor selection of integer-sample MV precision, the encoder can adjust the thresholds used in comparison operations so that integer-sample MV precision is more likely to be selected.

In this set of approaches, the selected MV precision can be for horizontal MV components and/or vertical MV components of the MV values of blocks within the unit of the video, where the horizontal MV components and vertical MV components are permitted to have different MV precisions. Or, the selected MV precision can be for both horizontal MV components and vertical MV components of the MV values of blocks within the unit of the video, where the horizontal MV components and vertical MV components have the same MV precision.

In this set of approaches, the encoded video (e.g., in the bitstream) includes one or more syntax elements that indicate the selected MV precision for the unit. Alternatively, the encoded video can lack any syntax elements that indicate the selected MV precision for the unit (see the section on non-normative approaches below). For example, even if the bitstream supports signaling of MV values with a fractional-sample MV precision, the encoder can constrain motion estimation for the unit of the video to use only MV values with fractional parts of zero. This can reduce the computational complexity of encoding and decoding by avoiding interpolation operations.

4. Approaches That Use Low-Complexity Content Analysis

To simplify the decision-making process, the encoder can consider a smaller set of data before selecting the MV precision, or use simpler decision logic when selecting the MV precision, thereby avoiding multiple encoding passes.

FIG. 9 shows a technique (900) for adapting MV precision during encoding using a low-complexity approach. The technique (900) can be performed by an encoder such as one described with reference to FIG. 3 or FIGS. 4A and 4B, or by another encoder. The technique (900) details one approach to collecting information about the video and selecting the MV precision based at least in part on the collected information, as described with reference to FIG. 8.

According to the technique (900), during encoding of video, the encoder determines an MV precision for a unit of the video. When determining the MV precision for the unit, the encoder identifies (910) a set of MV values having a fractional-sample MV precision. The set of MV values can be permitted to include zero-value MVs and non-zero-value MVs. Or, the set of MV values can be constrained to include only non-zero-value MVs. Or, the set of MV values can be further constrained to include only non-zero-value MVs from blocks of a certain block size or larger.

The encoder selects (920) the MV precision for the unit based at least in part on the prevalence, within the set of MV values, of MV values having a fractional part of zero. The prevalence can be measured as the fraction of the set of MV values having a fractional part of zero. For example, for a picture, the encoder can determine the percentage of MV values having a fractional part of zero. Or, for a region or set of regions that uses the set of MV values, the prevalence can be measured as the fraction of that region or set of regions that uses MV values having a fractional part of zero. If the fraction exceeds a threshold, the selected MV precision for the unit is integer-sample MV precision. If the fraction does not exceed the threshold, the selected MV precision for the unit is a fractional-sample MV precision. The boundary condition (fraction equal to the threshold) can be handled using either option, depending on implementation.
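The prevalence test can be sketched as follows, assuming MV components are stored as integers in quarter-sample units, so that a component is integer-valued exactly when it is a multiple of 4. The function names, the storage convention, and the default threshold of 0.75 are illustrative assumptions.

```python
QUARTER = 4  # MV units per sample at 1/4-sample precision (assumed storage format)

def has_zero_fraction(mv, units_per_sample=QUARTER):
    """True if both MV components have a fractional part of zero."""
    mvx, mvy = mv
    return mvx % units_per_sample == 0 and mvy % units_per_sample == 0

def select_precision_by_prevalence(mvs, threshold=0.75):
    """Select integer-sample precision when enough MVs are integer-valued."""
    if not mvs:
        return "fractional"  # assumption: default when the set is empty
    share = sum(has_zero_fraction(mv) for mv in mvs) / len(mvs)
    # Assumption: the boundary case (share == threshold) goes to fractional.
    return "integer" if share > threshold else "fractional"
```

Python's `%` operator returns a non-negative result for a positive modulus, so the zero-fraction check also works for negative MV components.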

The selection (920) of the MV precision for the unit can also be based at least in part on the prevalence of non-zero-value MVs, such that switching to integer-sample MV precision is permitted if there is a threshold amount of non-zero-value MVs. The prevalence of non-zero-value MVs can be measured as the fraction of MV values that are non-zero-value MVs, as the count of blocks that use non-zero-value MVs, or as the fraction of a region or set of regions that uses non-zero-value MVs. In this case, the set of MV values having a fractional-sample MV precision can be identified from among the non-zero-value MVs of the region or set of regions. In this way, the encoder can consider the prevalence of non-zero-value MVs having a fractional part of zero within the set of MVs that are non-zero-value MVs. For example, the encoder switches to integer-sample MV precision if two conditions are satisfied: (1) a sufficiently large amount of non-zero-value MVs is detected, and (2) within that set of non-zero-value MVs, sufficiently many have a fractional part of zero (or, alternatively, sufficiently few have a non-zero fractional part). The prevalence of non-zero-value MVs and the prevalence of MV values having a fractional part of zero can be determined by counting MV values (regardless of their associated block sizes) or by taking the associated block sizes into account (e.g., because some MV values apply to larger blocks than others).
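The two-condition test, with optional weighting by block area, can be sketched as follows. The tuple layout, parameter names, and default thresholds are assumptions for illustration; MV components are again assumed to be stored in quarter-sample units.

```python
def should_switch_to_integer(blocks, min_nonzero_share=0.05,
                             min_integer_share=0.75, weight_by_area=True):
    """Apply the two-condition test to the blocks of a unit.

    blocks: list of (mvx, mvy, width, height) tuples, MV components in
    quarter-sample units (hypothetical format).
    """
    def weight(block):
        # Weight by covered area, or count every MV equally.
        return block[2] * block[3] if weight_by_area else 1

    total = sum(weight(b) for b in blocks)
    nonzero = [b for b in blocks if (b[0], b[1]) != (0, 0)]
    nonzero_weight = sum(weight(b) for b in nonzero)
    if total == 0 or nonzero_weight == 0:
        return False
    # Condition (1): enough of the unit uses non-zero-value MVs.
    if nonzero_weight / total <= min_nonzero_share:
        return False
    # Condition (2): enough of those MVs have a fractional part of zero.
    integer_weight = sum(weight(b) for b in nonzero
                         if b[0] % 4 == 0 and b[1] % 4 == 0)
    return integer_weight / nonzero_weight > min_integer_share
```

Setting `weight_by_area=False` gives the plain counting variant, in which every MV value contributes equally regardless of block size.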

The encoder encodes the unit using the selected MV precision for the unit. MV values for blocks (e.g., PUs, macroblocks, or other blocks) within the unit of the video have the selected MV precision for the unit. The encoder outputs encoded data for the current unit, e.g., in a bitstream. The encoded data can include syntax elements that indicate the selected MV precision for the unit.

To reduce the amount of time the encoder spends setting MV precision, after integer-sample MV precision is selected for a unit, the selected MV precision can be used for subsequent units of the video until an event causes the MV precision to switch back to a fractional-sample MV precision. For example, the event can be encoding of a defined number of units, a scene change, or a determination, based on observations during encoding, that switching back to the fractional-sample MV precision would be beneficial.

In one example implementation, the encoder encodes a unit of video (e.g., picture, tile, slice or CU) only once. To start, the encoder encodes a unit using 1/4-sample MV precision. During encoding, the encoder determines whether the fractional parts of MV values are zero or not. For example, the encoder measures what fraction of the MV values have non-zero fractional parts. Or, since some MV values affect larger picture regions than others, the encoder measures what fraction of the inter-picture-predicted region(s) uses MV values with non-zero fractional parts (measuring area, not a count of MV values). If the fraction exceeds a threshold (which depends on implementation and is, for example, 75%), the encoder switches to integer-sample MV precision for one or more subsequent units of the video.

In this example implementation, after the encoder switches to integer-sample MV precision, the encoder can keep the integer-sample MV precision indefinitely or until a defined event triggers a switch back to a fractional-sample MV precision, at least temporarily. The event can be, for example, encoding of a particular number of units (e.g., 100 units). Or, the event can be a scene change. Or, the event can be a determination, based on statistics collected during encoding, that a switch back to the fractional-sample MV precision is likely to be beneficial. (Such statistics can be collected during encoding of some limited amount of area, to decide whether a fractional-sample MV precision would have worked better for that area, and then applied to switch the MV precision for one or more units.)
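The hold-and-switch-back behavior can be sketched as a small state machine. The class name, the `update` interface (called once after each encoded unit and returning the precision for the next unit), and the reduction of events to a unit count and a scene-change flag are assumptions made for illustration.

```python
class PrecisionController:
    """Tracks the MV precision to use for the next unit (illustrative sketch)."""

    def __init__(self, hold_units=100):
        self.hold_units = hold_units  # units to keep integer precision
        self.mode = "fractional"
        self.units_in_integer = 0

    def update(self, switch_to_integer=False, scene_change=False):
        """Call once after encoding a unit; returns the mode for the next unit."""
        if self.mode == "integer":
            self.units_in_integer += 1
            # Defined events: enough units encoded, or a scene change.
            if scene_change or self.units_in_integer >= self.hold_units:
                self.mode = "fractional"
                self.units_in_integer = 0
        elif switch_to_integer:
            # Statistics from the unit just encoded favor integer precision.
            self.mode = "integer"
        return self.mode
```

While the controller is in integer mode, the `switch_to_integer` argument is ignored, which reflects that no per-unit MV statistics need to be gathered until a switch-back event occurs.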

Whether video content is natural video content or artificially created video content, large portions of the video may be still. For example, the still portions could be a stationary background in natural video or stationary content in screen capture content. Still portions of video have zero-value MVs, which have fractional parts of zero when the MV precision is a fractional-sample MV precision. The presence of a significant number of zero-value MVs can confound decision logic that considers the fraction of MV values with non-zero fractional parts.

Therefore, the encoder can leave zero-value MVs out of consideration. FIG. 10 shows a picture (1000) that includes one non-moving portion (1001) with (mostly) zero-value MVs and two moving portions (1002, 1003) with (mostly) non-zero-value MVs. The encoder considers the non-zero-value MVs in the moving portions (1002, 1003), but does not consider the MV values of the non-moving portion (1001). The encoder can switch to integer-sample MV precision when the fraction of non-zero-value MVs (in the moving portions (1002, 1003)) having a fractional part of zero exceeds a threshold (or when the fraction of the picture, in terms of area, that uses non-zero-value MVs having a fractional part of zero exceeds a threshold).

The encoder can also check whether the number of non-zero-value MVs that are evaluated exceeds a threshold amount, so that decisions are not made based on a small number of MV values. This can make the decision-making process more robust.

In another example implementation, the encoder encodes a given unit of video (e.g., picture, tile, slice or CU) using 1/4-sample MV precision. The encoder switches to integer-sample MV precision for one or more subsequent units of the video if (1) more than x% of the unit uses inter-picture prediction with non-zero-value MVs, and (2) more than y% of the part of the unit that uses non-zero-value MVs has integer-value MVs (fractional parts of zero). The values of x and y depend on implementation and can be, for example, 5 and 75, respectively.

In a similar example implementation, the encoder encodes a given unit of video (e.g., picture, tile, slice or CU) using 1/4-sample MV precision. The encoder switches to integer-sample MV precision for one or more subsequent units of the video if (1) more than z PUs of the unit have non-zero-value MVs, and (2) more than y% of those PUs have integer-value MVs (fractional parts of zero). The values of z and y depend on implementation and can be, for example, 100 and 75, respectively.

MV values for larger regions may be more reliable than MV values for smaller regions. The encoder can limit which MV values are evaluated. For example, the encoder can evaluate only MV values for blocks of a certain block size or larger (e.g., 16x16 or larger).

In another example implementation, the encoder encodes a given unit of video (e.g., picture, tile, slice or CU) using 1/4-sample MV precision. The encoder switches to integer-sample MV precision for one or more subsequent units of the video if (1) more than z PUs of the unit are w x w or larger and have non-zero-value MVs, and (2) more than y% of those PUs have integer-value MVs (fractional parts of zero). The values of w, z and y depend on implementation and can be, for example, 16, 100 and 75, respectively.
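The rule with parameters w, z and y can be sketched as follows. The `PU` record and the function name are hypothetical, and MV components are assumed to be stored in quarter-sample units; the defaults follow the example values 16, 100 and 75 given above.

```python
from typing import NamedTuple

class PU(NamedTuple):
    """Hypothetical record for a prediction unit."""
    width: int
    height: int
    mvx: int  # horizontal MV component, quarter-sample units
    mvy: int  # vertical MV component, quarter-sample units

def switch_by_pu_count(pus, w=16, z=100, y=75.0):
    """Return True if the encoder should switch to integer-sample precision."""
    # Keep only PUs that are w x w or larger and have a non-zero-value MV.
    eligible = [p for p in pus
                if p.width >= w and p.height >= w and (p.mvx, p.mvy) != (0, 0)]
    # Condition (1): more than z such PUs.
    if len(eligible) <= z:
        return False
    # Condition (2): more than y% of them have integer-value MVs.
    integer_valued = sum(1 for p in eligible
                         if p.mvx % 4 == 0 and p.mvy % 4 == 0)
    return 100.0 * integer_valued / len(eligible) > y
```

Setting `w=1` reduces this to the variant with only z and y, since every PU then passes the size check.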

5. Non-Normative Approaches

In most of the preceding examples, the encoder signals one or more syntax elements that indicate the selected MV precision in the encoded data, e.g., in the bitstream. A decoder parses the syntax element(s) that indicate the selected MV precision and interprets MV values according to the selected MV precision.

Alternatively, in a non-normative approach, the encoder does not signal any syntax elements that indicate the MV precision selected by the encoder. For example, the encoder selects between integer-sample MV precision and a fractional-sample MV precision, but always encodes MV values at the fractional-sample MV precision. A decoder reconstructs and applies the MV values at the fractional-sample MV precision.

When the encoder selects integer-sample MV precision, the encoder can simplify motion estimation by avoiding interpolation of sample values at fractional-sample offsets and by evaluating candidate prediction regions only at integer-sample offsets. Also, if MV prediction produces a fractional value (e.g., using temporal MV prediction), the encoder can consider only those MV differences that result in integer values when the MV difference is added to the fractional-valued MV prediction (e.g., from the temporal MV prediction). During decoding, motion compensation can be simplified by avoiding interpolation of sample values at fractional-sample offsets.
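The constraint on MV differences can be sketched as follows, assuming MV values are signaled in quarter-sample units. The function names and the square search window are illustrative assumptions. The encoder enumerates only integer-sample candidates; the MV difference it signals then absorbs any fractional part of the MV prediction.

```python
def integer_mv_candidates(search_range_samples, units_per_sample=4):
    """Candidate MVs at integer-sample offsets, in fractional-sample units."""
    r = search_range_samples
    return [(x * units_per_sample, y * units_per_sample)
            for y in range(-r, r + 1)
            for x in range(-r, r + 1)]

def mvd_for(mv, mv_pred):
    """MV difference to signal so that mv_pred + mvd equals mv."""
    return (mv[0] - mv_pred[0], mv[1] - mv_pred[1])

# Usage: a fractional-valued MV prediction, e.g. from temporal MV prediction.
mv_pred = (5, -3)
mvd = mvd_for((8, 0), mv_pred)
reconstructed = (mv_pred[0] + mvd[0], mv_pred[1] + mvd[1])
```

In this usage the signaled difference is itself fractional-valued, yet the MV that the decoder reconstructs lands on an integer-sample offset, so no interpolation is needed for that block.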

Certain approaches described in the preceding sections (e.g., using a scaled rate-distortion cost by scaling the distortion cost and/or bit rate cost, adding a distortion cost penalty or bit rate cost penalty, or adjusting a weight factor) can also be adapted for a non-normative approach. The encoder can vary the extent of the bias towards or against integer-sample MV precision during encoding. Through the scaling, penalties and/or weight factor, the encoder can adjust the bias towards integer-sample MV precision depending on a degree of confidence that integer-sample MV values are likely to be more appropriate for encoding the video content, or depending on the computational capacity available for encoding or decoding.
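One way to picture such a bias is a rate-distortion comparison in which the fractional-sample option is handicapped. In the sketch below, the choice to apply the scale factor and penalty to the fractional-sample cost (rather than discounting the integer-sample cost), the parameter names, and the tie-breaking rule are all assumptions made for illustration.

```python
def biased_choice(d_int, r_int, d_frac, r_frac, lam,
                  frac_distortion_scale=1.0, frac_rate_penalty=0.0):
    """Compare RD costs with an optional bias towards integer-sample precision.

    frac_distortion_scale > 1 or frac_rate_penalty > 0 makes the
    fractional-sample option look more expensive (assumed form of bias).
    """
    j_int = d_int + lam * r_int
    j_frac = frac_distortion_scale * d_frac + lam * (r_frac + frac_rate_penalty)
    # Assumption: ties go to integer-sample precision.
    return "integer" if j_int <= j_frac else "fractional"
```

Raising the scale factor or the penalty when decoder resources are limited, or when confidence in integer-valued motion is high, strengthens the bias; setting them to 1.0 and 0.0 removes it.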

6. Alternatives and Variations

In some usage scenarios, the encoding order of pictures (also called decoding order or decoded order) differs from the temporal order at input/camera capture and display (also called display order). The encoder can take such reordering into account when selecting MV precision. For example, the encoder can select MV precision(s) based on the temporal order of pictures rather than on the encoding order of the pictures.

In many of the examples described herein, intra BC prediction and motion compensation are implemented in separate components or processes, and BV estimation and motion estimation are implemented in separate components or processes. Alternatively, intra BC prediction can be implemented as a special case of motion compensation, and BV estimation can be implemented as a special case of motion estimation, for which the current picture is used as a reference picture. In such implementations, a BV value can be signaled as an MV value but used for intra BC prediction (within the current picture) rather than inter-picture prediction. As the term is used herein, "intra BC prediction" indicates prediction within a current picture, whether that prediction is provided using an intra-picture prediction module, a motion compensation module, or some other module. Similarly, a BV value can be represented using an MV value or using a distinct type of parameter or syntax element, and BV estimation can be provided using an intra-picture estimation module, a motion estimation module, or some other module. The approaches described herein for selecting MV precision can be applied to determine the precision of MV values that will be used as BV values for intra BC prediction (that is, with the current picture as the reference picture).

VI. Innovative Features

In addition to the claims presented below, innovative features described herein include, but are not limited to, the following.

#    Feature

A1   A computing device comprising:

means for encoding video; and

means for outputting the encoded video,

wherein the means for encoding the video includes means for determining an MV (motion vector) precision for a unit of the video, MV values for blocks within the unit of the video having the MV precision for the unit, and wherein the means for determining the MV precision for the unit includes:

means for identifying a set of MV values having a fractional-sample MV precision; and

means for selecting the MV precision for the unit based at least in part on prevalence, within the set of MV values, of MV values having a fractional part of zero.

B1   A computing device comprising:

means for encoding video; and

means for outputting the encoded video,

wherein the means for encoding the video includes means for determining an MV (motion vector) precision for a unit of the video, MV values for blocks within the unit of the video having the MV precision for the unit, wherein the means for determining includes means for performing rate-distortion analysis to decide between multiple MV precisions, the multiple MV precisions including one or more fractional-sample MV precisions and integer-sample MV precision, and wherein the rate-distortion analysis is biased towards integer-sample MV precision by (a) scaling a distortion cost, (b) adding a penalty to the distortion cost, (c) scaling a bit rate cost, (d) adding a penalty to the bit rate cost, and/or (e) adjusting a Lagrangian multiplier factor.

C1   A computing device comprising:

means for encoding video; and

means for outputting the encoded video,

wherein the means for encoding the video includes means for determining, from among multiple MV (motion vector) precisions, an MV precision for a unit of the video, the multiple MV precisions including one or more fractional-sample MV precisions and integer-sample MV precision, MV values for blocks within the unit of the video having the MV precision for the unit, and wherein the means for determining includes:

means for collecting information about the video; and

means for selecting the MV precision for the unit based at least in part on the collected information.

In view of the many possible embodiments to which the principles of the disclosed invention may be applied, it should be recognized that the illustrated embodiments are only preferred examples of the invention and should not be taken as limiting the scope of the invention. Rather, the scope of the invention is defined by the following claims. We therefore claim as our invention all that comes within the scope and spirit of these claims.

Claims

A method in a computing device with a video encoder, the method comprising:
encoding video; and
outputting the encoded video,
wherein encoding the video includes:
determining a motion vector (MV) precision for a unit of the video, wherein MV values for blocks within the unit of the video have the MV precision for the unit,
wherein determining the MV precision for the unit includes:
identifying a set of MV values having a fractional-sample MV precision; and
selecting the MV precision for the unit based at least in part on prevalence, within the set of MV values, of MV values having a fractional part of zero.

The method of claim 1, wherein:
the prevalence is measured as the fraction of the set of MV values that have a fractional part of zero; or
a region or set of regions uses the set of MV values, and the prevalence is measured as the fraction of the region or set of regions that uses one of the MV values having a fractional part of zero.

The method of claim 2, wherein, if the fraction exceeds a threshold, the selected MV precision for the unit is integer-sample MV precision, and, if the fraction does not exceed the threshold, the selected MV precision for the unit is a fractional-sample MV precision.
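
As an illustration only, and not part of the claims, the threshold test on the fraction of zero-fractional-part MV values might be sketched as follows. The sketch assumes that MV components are stored as integers in quarter-sample units; the names `has_zero_fraction` and `select_mv_precision`, the default threshold of 0.75, and the fallback for an empty set are all hypothetical choices made for this example.

```python
from typing import Iterable, Tuple

# Assumption: an MV is (horizontal, vertical), each in quarter-sample units.
MV = Tuple[int, int]


def has_zero_fraction(mv: MV, units_per_sample: int = 4) -> bool:
    """True if both components land exactly on integer sample positions."""
    return mv[0] % units_per_sample == 0 and mv[1] % units_per_sample == 0


def select_mv_precision(mvs: Iterable[MV],
                        threshold: float = 0.75,  # hypothetical value
                        units_per_sample: int = 4) -> str:
    """Pick 'integer' when the fraction of zero-fraction MVs exceeds the threshold."""
    mv_list = list(mvs)
    if not mv_list:
        # Assumption: with no evidence, stay at fractional-sample precision.
        return "fractional"
    zero_count = sum(1 for mv in mv_list if has_zero_fraction(mv, units_per_sample))
    fraction = zero_count / len(mv_list)
    return "integer" if fraction > threshold else "fractional"
```

A fraction exactly equal to the threshold does not exceed it, so the sketch keeps fractional-sample precision in that case.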

The method of claim 1, wherein the selected MV precision for the unit is integer-sample MV precision, and the selected MV precision for the unit is also used for subsequent units of the video until an event causes a switch back to the fractional-sample MV precision, the event being:
encoding of a defined number of units;
a scene change; or
a determination, based on observations during encoding, that switching back to the fractional-sample MV precision would be beneficial.
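
As an illustration only, and not part of the claims, the three switch-back events could be tracked with a small piece of state. The class name `IntegerPrecisionHold`, the default limit of 30 units, and the two boolean inputs are hypothetical; how a scene change is detected, or how an encoder decides that fractional-sample precision would be beneficial, is outside this sketch.

```python
from dataclasses import dataclass


@dataclass
class IntegerPrecisionHold:
    """Keeps integer-sample precision for later units until an event occurs."""
    max_units: int = 30      # hypothetical "defined number of units"
    units_encoded: int = 0   # units encoded at integer precision so far

    def precision_for_next_unit(self,
                                scene_change: bool = False,
                                fractional_seems_beneficial: bool = False) -> str:
        # Any of the three events causes a switch back to fractional precision.
        if (scene_change
                or fractional_seems_beneficial
                or self.units_encoded >= self.max_units):
            self.units_encoded = 0
            return "fractional"
        self.units_encoded += 1
        return "integer"
```

After a switch back, an encoder would presumably rerun its precision selection rather than reuse this object as is; the sketch models only the hold period.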

The method of claim 1, wherein the set of MV values is:
allowed to include zero-value MVs and non-zero-value MVs;
constrained to include only non-zero-value MVs; or
constrained to include only non-zero-value MVs from blocks of at least a particular block size.

The method of claim 1, wherein selecting the MV precision for the unit is also based at least in part on prevalence of non-zero-value MVs, such that switching to integer-sample MV precision is permitted if there is a threshold amount of non-zero-value MVs, wherein the prevalence of non-zero-value MVs is measured as (a) the fraction of MV values that are non-zero-value MVs, (b) the count of blocks that use non-zero-value MVs, or (c) the fraction of a region or set of regions that uses non-zero-value MVs, and wherein the set of MV values having the fractional-sample MV precision is identified from among the non-zero-value MVs of the region or set of regions.

A method in a computing device with a video encoder, the method comprising:
encoding video; and
outputting the encoded video,
wherein encoding the video includes determining, from among a plurality of motion vector (MV) precisions, an MV precision for a unit of the video, the plurality of MV precisions including one or more fractional-sample MV precisions and integer-sample MV precision, wherein MV values for blocks within the unit of the video have the MV precision for the unit,
wherein the determining includes:
collecting information about the video; and
selecting the MV precision for the unit based at least in part on the collected information.

The method of claim 7, wherein the collected information includes sample values, the sample values being organized as a histogram, and wherein selecting the MV precision for the unit includes:
determining a count of distinct sample values among the collected information; and
comparing the count to a threshold,
wherein, if the count is lower than the threshold, the integer-sample MV precision is selected, and, if the count is higher than the threshold, one of the one or more fractional-sample MV precisions is selected.
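
As an illustration only, and not part of the claims, counting distinct sample values through a histogram might look like the sketch below. The function name `select_precision_from_samples` and the default threshold of 64 are hypothetical. The claim does not say what happens when the count equals the threshold; the sketch assumes fractional-sample precision in that case.

```python
from collections import Counter
from typing import Iterable


def select_precision_from_samples(samples: Iterable[int],
                                  threshold: int = 64) -> str:  # hypothetical value
    """Choose MV precision from the number of distinct sample values."""
    histogram = Counter(samples)        # sample value -> occurrence count
    distinct_values = len(histogram)    # number of non-empty histogram bins
    if distinct_values < threshold:
        return "integer"
    # Assumption: a count equal to the threshold is treated like a higher count.
    return "fractional"
```

Content with few distinct values, such as rendered text or user interface graphics, would tend toward integer-sample precision under this test, while camera-captured content with many distinct values would tend toward fractional-sample precision.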

The method of claim 7, wherein the collected information includes distortion measures for blocks encoded with the respective MV precisions of the plurality of MV precisions, and wherein selecting the MV precision for the unit includes:
determining whether a reduction in distortion justifies an increase in MV precision.

The method of claim 7, wherein the collected information includes MV values at one of the one or more fractional-sample MV precisions, the collected MV values being organized according to the values of their fractional parts.

The method of claim 1 or 7, wherein:
the encoding is single-pass encoding, the unit of the video is a current unit of the video, and the selected MV precision for the current unit depends at least in part on one or more previous units of the video; or
the encoding is multi-pass encoding, the unit of the video is a current unit of the video, and the selected MV precision for the current unit depends at least in part on the current unit of the video.

The method of claim 1 or 7, further comprising:
adjusting an amount of bias toward or against integer-sample MV precision based at least in part on (a) a degree of confidence that integer-sample MV precision is appropriate and/or (b) computational capacity of encoding and/or decoding.

The method of claim 1 or 7, wherein the selected MV precision for the unit is for horizontal MV components and/or vertical MV components of the MV values for the blocks within the unit of the video.

The method of claim 1 or 7, wherein the unit is selected from the group consisting of a sequence, a series of pictures between scene changes, a group of pictures, a picture, a tile, a slice, a coding tree unit, and a coding unit, and wherein the blocks are prediction blocks, prediction units, macroblocks, or sub-macroblock partitions.

A method in a computing device with a video encoder, the method comprising:
encoding video; and
outputting the encoded video,
wherein encoding the video includes determining a motion vector (MV) precision for a unit of the video, wherein MV values for blocks within the unit of the video have the MV precision for the unit, wherein the determining includes performing rate-distortion analysis to decide among a plurality of MV precisions, the plurality of MV precisions including one or more fractional-sample MV precisions and integer-sample MV precision, and wherein the rate-distortion analysis is biased toward the integer-sample MV precision by (a) scaling a distortion cost, (b) adding a penalty to the distortion cost, (c) scaling a bit rate cost, (d) adding a penalty to the bit rate cost, and/or (e) adjusting a Lagrangian multiplier factor.
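
As an illustration only, and not part of the claims, a biased rate-distortion comparison could be sketched with the conventional cost J = D + λR. The sketch assumes that the bias is applied by inflating the cost of the fractional-sample candidates, which is one possible reading of options (a) and (d); the names `Candidate`, `rd_cost` and `choose_precision`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Candidate:
    precision: str      # e.g. "integer" or "quarter"
    distortion: float   # D, e.g. sum of squared errors for the unit
    rate_bits: float    # R, bits needed to code the unit


def rd_cost(c: Candidate, lam: float,
            frac_distortion_scale: float = 1.0,
            frac_rate_penalty: float = 0.0) -> float:
    """Lagrangian cost J = D + lam * R, biased against non-integer precision."""
    d, r = c.distortion, c.rate_bits
    if c.precision != "integer":
        d *= frac_distortion_scale   # option (a): scale the distortion cost
        r += frac_rate_penalty       # option (d): add a bit rate penalty
    return d + lam * r


def choose_precision(cands: Iterable[Candidate], lam: float, **bias: float) -> str:
    """Return the precision of the candidate with the lowest biased cost."""
    return min(cands, key=lambda c: rd_cost(c, lam, **bias)).precision
```

With no bias the candidate with the lowest plain cost wins; a scale factor above 1.0 or a positive penalty means a fractional-sample precision has to beat integer-sample precision by some margin before it is chosen.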