TWI736923B - Extended merge mode - Google Patents

Extended merge mode

Info

Publication number
TWI736923B
Authority
TW
Taiwan
Prior art keywords
candidate
motion
emm
candidates
patent application
Prior art date
Application number
TW108123158A
Other languages
Chinese (zh)
Other versions
TW202002650A (en)
Inventor
劉鴻彬
張莉
張凱
王悅
Original Assignee
大陸商北京字節跳動網絡技術有限公司
美商字節跳動有限公司
Priority date
Filing date
Publication date
Application filed by 大陸商北京字節跳動網絡技術有限公司 and 美商字節跳動有限公司
Publication of TW202002650A
Application granted
Publication of TWI736923B


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the region being a block, e.g. a macroblock
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/184 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N19/521 Processing of motion vectors for estimating the reliability of the determined motion vectors or motion vector field, e.g. for smoothing the motion vector field or for correcting motion vectors
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/523 Motion estimation or motion compensation with sub-pixel accuracy

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Methods, devices and systems for using an extended merge mode (EMM) in video coding are described. An exemplary method of video processing includes constructing a list of EMM candidates; determining, based on a first set of bits in a bitstream representation of a current block, a motion information inherited by the current block from the list; determining, based on a second set of bits in the bitstream representation, a signaled motion information of the current block; and performing, based on the list of EMM candidates and the signaled motion information, a conversion between the current block and the bitstream representation.

Description

Extended Merge mode

This document relates to video encoding and decoding technology. [Cross-reference to related applications] Under the applicable patent law and/or the rules of the Paris Convention, the present invention claims the priority of and benefit from International Patent Application No. PCT/CN2018/093646, filed on June 29, 2018. The entire disclosure of International Patent Application No. PCT/CN2018/093646 is incorporated by reference as a part of the disclosure of the present invention.

Digital video accounts for the largest share of bandwidth use on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, the bandwidth demand for digital video is expected to continue to grow.

The disclosed techniques can be used by video decoder or encoder embodiments to implement an extended merge mode in which some motion information is inherited and some motion information is signaled.

In one example aspect, a video processing method is disclosed. The method includes constructing an extended Merge mode (EMM) candidate list; determining, based on a first set of bits in a bitstream representation of a current block, the motion information inherited by the current block from the list; determining, based on a second set of bits in the bitstream representation, the signaled motion information of the current block; and performing, based on the EMM candidate list and the signaled motion information, a conversion between the current block and the bitstream representation.

In another example aspect, the above method may be implemented by a video decoder apparatus that includes a processor.

In another example aspect, the above method may be implemented by a video encoder apparatus that includes a processor for decoding encoded video during the video encoding process.

In yet another example aspect, these methods may be embodied in the form of processor-executable instructions and stored on a computer-readable program medium.

These and other aspects are further described in this document.

This document provides various techniques that can be used by a decoder of a video bitstream to improve the quality of decompressed or decoded digital video. Furthermore, a video encoder may also implement these techniques during the encoding process in order to reconstruct decoded frames used for further encoding.

Section headings are used in this document for ease of understanding, and do not limit the embodiments and techniques to the corresponding sections. As such, embodiments from one section can be combined with embodiments from other sections.

2. Technical framework

Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. The ITU-T produced the H.261 and H.263 standards, ISO/IEC produced the MPEG-1 and MPEG-4 Visual standards, and the two organizations jointly produced the H.262/MPEG-2 Video standard, the H.264/MPEG-4 Advanced Video Coding (AVC) standard, and the H.265/HEVC standard. Since H.262, video coding standards have been based on a hybrid video coding structure in which temporal prediction plus transform coding is utilized. To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded jointly by VCEG and MPEG in 2015. Since then, JVET has adopted many new methods and incorporated them into a reference software named the Joint Exploration Model (JEM). In April 2018, the Joint Video Expert Team (JVET) between VCEG (Q6/16) and ISO/IEC JTC1 SC29/WG11 (MPEG) was created to work on the VVC standard, targeting a 50% bitrate reduction compared with HEVC.

2.1 Inter prediction in HEVC/H.265

Each inter-predicted PU has motion parameters for one or two reference picture lists. The motion parameters include motion vectors and reference picture indices. The use of one of the two reference picture lists may also be signaled using inter_pred_idc. Motion vectors may be explicitly coded as deltas relative to predictors.
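As an illustration only, the motion parameters described above can be collected in a small data structure; the class, field, and mode names below are hypothetical and are not taken from the HEVC specification.

```python
# Illustrative sketch of per-PU motion parameters (names are hypothetical).
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class MotionInfo:
    mv_l0: Optional[Tuple[int, int]] = None   # motion vector for list 0 (quarter-sample units)
    ref_idx_l0: Optional[int] = None          # reference picture index in list 0
    mv_l1: Optional[Tuple[int, int]] = None   # motion vector for list 1
    ref_idx_l1: Optional[int] = None          # reference picture index in list 1

    @property
    def inter_pred_idc(self) -> str:
        """Derived here for illustration; in the real syntax inter_pred_idc
        is a coded element, not a derived value."""
        if self.mv_l0 is not None and self.mv_l1 is not None:
            return "PRED_BI"
        return "PRED_L0" if self.mv_l0 is not None else "PRED_L1"
```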

When a CU is coded with skip mode, one PU is associated with the CU, and there are no significant residual coefficients, no coded motion vector delta, and no reference picture index. A Merge mode is specified whereby the motion parameters for the current PU are obtained from neighbouring PUs, including spatial and temporal candidates. The Merge mode can be applied to any inter-predicted PU, not only to skip mode. The alternative to Merge mode is the explicit transmission of motion parameters, where the motion vector (more precisely, the motion vector difference compared with a motion vector predictor), the corresponding reference picture index for each reference picture list, and the reference picture list usage are signaled explicitly per PU. Such a mode is named advanced motion vector prediction (AMVP) in this document.

When signaling indicates that one of the two reference picture lists is to be used, the PU is produced from one block of samples. This is referred to as "uni-prediction". Uni-prediction is available for both P slices and B slices.

When signaling indicates that both of the reference picture lists are to be used, the PU is produced from two blocks of samples. This is referred to as "bi-prediction". Bi-prediction is available only for B slices.

The following text provides details of the inter prediction modes specified in HEVC. The description will start with the Merge mode.

2.1.1 Merge mode

2.1.1.1 Derivation of candidates for Merge mode

When a PU is predicted using Merge mode, an index pointing to an entry in the merge candidates list is parsed from the bitstream and used to retrieve the motion information. The construction of this list is specified in the HEVC standard and can be summarized according to the following sequence of steps:

Step 1: Initial candidate derivation

Step 1.1: Spatial candidate derivation

Step 1.2: Redundancy check for spatial candidates

Step 1.3: Temporal candidate derivation

Step 2: Additional candidate insertion

Step 2.1: Creation of bi-predictive candidates

Step 2.2: Insertion of zero motion candidates

These steps are also schematically depicted in Fig. 1. For spatial Merge candidate derivation, a maximum of four Merge candidates are selected among candidates located in five different positions. For temporal Merge candidate derivation, at most one Merge candidate is selected between two candidates. Since a constant number of candidates per PU is assumed at the decoder, additional candidates are generated when the number of candidates obtained from step 1 does not reach the maximum number of Merge candidates (MaxNumMergeCand) signaled in the slice header. Since the number of candidates is constant, truncated unary (TU) binarization is used to encode the index of the best Merge candidate. If the size of the CU is equal to 8, all PUs of the current CU share a single Merge candidate list, which is identical to the Merge candidate list of the 2N×2N prediction unit.
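The overall sequence of steps above can be sketched as follows. This is a simplified illustration that omits step 2.1 (combined bi-predictive candidates) and assumes candidates are supplied already in derivation order; all names are hypothetical.

```python
# Simplified sketch of the merge-list construction order (step 2.1 omitted).
def build_merge_list(spatial, temporal, max_num_merge_cand):
    """spatial/temporal: candidate lists already in derivation order."""
    cand_list = []
    # Step 1.1 / 1.2: spatial candidates with redundancy check (max four)
    for c in spatial:
        if c not in cand_list and len(cand_list) < 4:
            cand_list.append(c)
    # Step 1.3: at most one temporal candidate
    if temporal and len(cand_list) < max_num_merge_cand:
        cand_list.append(temporal[0])
    # Step 2.2: pad with zero-motion candidates up to MaxNumMergeCand;
    # the reference index grows per added zero candidate (the cap at the
    # number of reference frames is ignored in this sketch).
    zero_ref = 0
    while len(cand_list) < max_num_merge_cand:
        cand_list.append(((0, 0), zero_ref))
        zero_ref += 1
    return cand_list[:max_num_merge_cand]
```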

In the following, the operations associated with the aforementioned steps are described in detail.

2.1.1.2 Spatial candidate derivation

In the derivation of spatial Merge candidates, a maximum of four Merge candidates are selected among candidates located in the positions depicted in Fig. 2. The order of derivation is A1, B1, B0, A0 and B2. Position B2 is considered only when any PU of positions A1, B1, B0, A0 is not available (e.g., because it belongs to another slice or tile) or is intra-coded. After the candidate at position A1 is added, the addition of the remaining candidates is subject to a redundancy check, which ensures that candidates with the same motion information are excluded from the list, thereby improving coding efficiency. To reduce computational complexity, not all possible candidate pairs are considered in the mentioned redundancy check. Instead, only the pairs linked with an arrow in Fig. 3 are considered, and a candidate is added to the list only if the corresponding candidate used for the redundancy check does not have the same motion information. Another source of duplicate motion information is the "second PU" associated with partitions other than 2N×2N. As an example, Figs. 4A-4B depict the second PU for the N×2N and 2N×N cases, respectively. When the current PU is partitioned as N×2N, the candidate at position A1 is not considered for list construction; indeed, adding this candidate would lead to two prediction units having the same motion information, which is redundant to having just one PU inside the coding unit. Similarly, position B1 is not considered when the current PU is partitioned as 2N×N.
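The pair-limited redundancy check described above can be sketched as follows. The candidate pairs listed correspond to the arrow-linked pairs of Fig. 3 as commonly described for HEVC, and all function and variable names are illustrative, not spec routines.

```python
# Sketch of the pair-limited redundancy check for spatial merge candidates.
# pu_motion maps 'A1','B1','B0','A0','B2' to motion info, or None when the
# PU at that position is unavailable or intra-coded.
CHECK_PAIRS = {"B1": ["A1"], "B0": ["B1"], "A0": ["A1"], "B2": ["A1", "B1"]}

def spatial_merge_candidates(pu_motion):
    cands = []
    for pos in ("A1", "B1", "B0", "A0", "B2"):
        mi = pu_motion.get(pos)
        if mi is None or len(cands) == 4:   # at most four spatial candidates
            continue
        if pos == "B2" and all(pu_motion.get(p) is not None
                               for p in ("A1", "B1", "B0", "A0")):
            continue  # B2 is used only when one of the first four is missing
        if any(pu_motion.get(q) == mi for q in CHECK_PAIRS.get(pos, [])):
            continue  # duplicate of an arrow-paired candidate
        cands.append(mi)
    return cands
```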

2.1.1.3 Temporal candidate derivation

In this step, only one candidate is added to the list. Particularly, in the derivation of this temporal Merge candidate, a scaled motion vector is derived based on the co-located PU belonging to the picture that has the smallest POC difference with the current picture within the given reference picture list. The reference picture list to be used for derivation of the co-located PU is explicitly signaled in the slice header. The scaled motion vector for the temporal Merge candidate is obtained as illustrated by the dashed line in Fig. 5, scaled from the motion vector of the co-located PU using the POC distances tb and td, where tb is defined as the POC difference between the reference picture of the current picture and the current picture, and td is defined as the POC difference between the reference picture of the co-located picture and the co-located picture. The reference picture index of the temporal Merge candidate is set equal to zero. The actual realization of the scaling process is described in the HEVC specification. For a B slice, two motion vectors, one for reference picture list 0 and the other for reference picture list 1, are obtained and combined to make the bi-predictive Merge candidate.
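As a concrete illustration of the tb/td scaling, the following sketch follows the fixed-point scaling procedure described in the HEVC specification; the clipping of tb and td to [-128, 127] performed there is omitted for brevity, and the integer division matches the spec only for td > 0 (the spec truncates toward zero while Python floors).

```python
# Sketch of HEVC-style fixed-point MV scaling by the POC-distance ratio tb/td.
def clip3(lo, hi, v):
    return max(lo, min(hi, v))

def scale_mv(mv, tb, td):
    """Scale one MV component; tb, td are the POC distances defined above."""
    tx = (16384 + (abs(td) >> 1)) // td               # ~16384/td in fixed point
    dist_scale = clip3(-4096, 4095, (tb * tx + 32) >> 6)
    prod = dist_scale * mv
    sign = 1 if prod >= 0 else -1
    return clip3(-32768, 32767, sign * ((abs(prod) + 127) >> 8))
```

For example, scaling an MV of 8 quarter-samples from a distance of td = 4 down to tb = 2 halves it.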

As shown in Fig. 6, the position for the temporal candidate is selected between candidates C0 and C1 in the co-located PU (Y) belonging to the reference frame. If the PU at position C0 is not available, is intra-coded, or is outside of the current CTU row, position C1 is used. Otherwise, position C0 is used in the derivation of the temporal Merge candidate.

2.1.1.4 Additional candidate insertion

Besides spatial and temporal Merge candidates, there are two additional types of Merge candidates: combined bi-predictive Merge candidates and zero Merge candidates. Combined bi-predictive Merge candidates are generated by utilizing spatial and temporal Merge candidates, and are used for B slices only. A combined bi-predictive candidate is generated by combining the first-reference-picture-list motion parameters of an initial candidate with the second-reference-picture-list motion parameters of another candidate. If these two tuples provide different motion hypotheses, they form a new bi-predictive candidate. As an example, Fig. 7 depicts the case where two candidates in the original list (on the left), which have mvL0 and refIdxL0 or mvL1 and refIdxL1, are used to create a combined bi-predictive Merge candidate that is added to the final list (on the right). There are numerous rules regarding the combinations that are considered to generate these additional Merge candidates.

Zero motion candidates are inserted to fill the remaining entries in the Merge candidate list and thus reach the MaxNumMergeCand capacity. These candidates have zero spatial displacement and a reference picture index that starts from zero and increases every time a new zero motion candidate is added to the list. The number of reference frames used by these candidates is one for uni-prediction and two for bi-prediction, respectively. Finally, no redundancy check is performed on these candidates.
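The two additional candidate types can be sketched together as follows. This is a simplified illustration: the pair order used for combining candidates and the exact handling of reference indices for zero candidates are assumptions, not the precise HEVC rules.

```python
# Sketch of step 2: combined bi-predictive candidates (B slices only),
# followed by zero-motion padding. A candidate is a tuple
# (mvL0, refIdxL0, mvL1, refIdxL1), with None marking an unused list.
def add_additional_candidates(cands, max_num, is_b_slice, num_ref=2):
    out = list(cands)
    if is_b_slice:
        for i, a in enumerate(cands):
            for j, b in enumerate(cands):
                if len(out) >= max_num:
                    break
                if i == j or a[0] is None or b[2] is None:
                    continue  # need L0 from one candidate and L1 from the other
                combined = (a[0], a[1], b[2], b[3])
                if combined not in out:
                    out.append(combined)
    # Zero-motion padding: reference index starts at zero and increases,
    # capped here at the number of available reference frames.
    zero_ref = 0
    while len(out) < max_num:
        ref = min(zero_ref, num_ref - 1)
        out.append(((0, 0), ref,
                    ((0, 0) if is_b_slice else None),
                    (ref if is_b_slice else None)))
        zero_ref += 1
    return out
```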

2.1.1.5 Motion estimation regions for parallel processing

To speed up the encoding process, motion estimation can be performed in parallel, whereby the motion vectors for all prediction units inside a given region are derived simultaneously. The derivation of Merge candidates from the spatial neighbourhood may interfere with parallel processing, because one prediction unit cannot derive motion parameters from an adjacent PU until its associated motion estimation is completed. To mitigate the trade-off between coding efficiency and processing latency, HEVC defines a motion estimation region (MER), whose size is signaled in the picture parameter set using the "log2_parallel_merge_level_minus2" syntax element. When a MER is defined, Merge candidates falling into the same region are marked as unavailable and are therefore not considered in the list construction.
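A minimal sketch of the resulting availability test: two positions fall in the same MER exactly when their coordinates match after a right shift by the MER size, where the log2 of the MER size is log2_parallel_merge_level_minus2 + 2 (the helper name is hypothetical).

```python
# Sketch: MER membership test used to mark merge candidates unavailable.
def same_mer(xa, ya, xb, yb, log2_parallel_merge_level_minus2):
    shift = log2_parallel_merge_level_minus2 + 2  # log2 of the MER size
    return (xa >> shift) == (xb >> shift) and (ya >> shift) == (yb >> shift)
```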

2.1.2 AMVP2.1.2 AMVP

AMVP exploits the spatio-temporal correlation of a motion vector with neighbouring PUs, which is used for the explicit transmission of motion parameters. For each reference picture list, a motion vector candidate list is constructed by first checking the availability of left, above, and temporally neighbouring PU positions, removing redundant candidates, and adding a zero vector to make the candidate list a constant length. The encoder can then select the best predictor from the candidate list and transmit the corresponding index indicating the chosen candidate. Similarly to the Merge index signaling, the index of the best motion vector candidate is encoded using truncated unary. The maximum value to be encoded in this case is 2 (see Fig. 8). In the following sections, details about the derivation process of motion vector prediction candidates are provided.

2.1.2.1 Derivation of AMVP candidates

Figure 8 summarizes the derivation process for motion vector prediction candidates.

In motion vector prediction, two types of motion vector candidates are considered: spatial motion vector candidates and temporal motion vector candidates. For spatial motion vector candidate derivation, two motion vector candidates are eventually derived based on the motion vectors of each PU located in the five different positions depicted in Fig. 2.

For temporal motion vector candidate derivation, one motion vector candidate is selected from two candidates derived based on two different co-located positions. After the first list of spatio-temporal candidates is made, duplicated motion vector candidates in the list are removed. If the number of potential candidates is larger than two, motion vector candidates whose reference picture index within the associated reference picture list is larger than 1 are removed from the list. If the number of spatio-temporal motion vector candidates is smaller than two, additional zero motion vector candidates are added to the list.
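The list-trimming rules above can be sketched as follows. This simplified illustration omits the reference-index-based removal and assumes candidates are supplied already in derivation order; the function name is hypothetical.

```python
# Sketch of AMVP list assembly: duplicate removal, truncation to two
# candidates, then zero-MV padding to a constant length of two.
def build_amvp_list(spatial, temporal):
    cands = []
    for mv in spatial + temporal:      # spatial candidates first, then temporal
        if mv is not None and mv not in cands:
            cands.append(mv)
    cands = cands[:2]                  # keep at most two candidates
    while len(cands) < 2:
        cands.append((0, 0))           # zero motion vector padding
    return cands
```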

2.1.2.2 Spatial motion vector candidates

In the derivation of spatial motion vector candidates, a maximum of two candidates are considered among five potential candidates, which are derived from PUs located in the positions depicted in Fig. 2; those positions are the same as those of the motion merge. The order of derivation for the left side of the current PU is defined as A0, A1, scaled A0, scaled A1. The order of derivation for the above side of the current PU is defined as B0, B1, B2, scaled B0, scaled B1, scaled B2. For each side there are therefore four cases that can be used as motion vector candidates, with two cases not requiring spatial scaling and two cases where spatial scaling is used. The four different cases are summarized as follows.

No spatial scaling

(1) Same reference picture list, and same reference picture index (same POC)

(2) Different reference picture list, but same reference picture index (same POC)

Spatial scaling

(3) Same reference picture list, but different reference picture index (different POC)

(4) Different reference picture list, and different reference picture index (different POC)

The no-spatial-scaling cases are checked first, followed by the spatial-scaling cases. Spatial scaling is considered when the POC differs between the reference picture of the neighbouring PU and that of the current PU, regardless of the reference picture list. If all PUs of the left candidates are not available or are intra-coded, scaling for the above motion vector is allowed to help the parallel derivation of left and above MV candidates. Otherwise, spatial scaling is not allowed for the above motion vector.
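The case analysis above reduces to a simple test, made explicit in the following sketch (all helper names are hypothetical): scaling is needed exactly when the POCs differ, regardless of the list.

```python
# Illustrative sketch of the four spatial-scaling cases enumerated above.
def needs_spatial_scaling(neighbour_ref_poc, current_ref_poc):
    # Regardless of the reference picture list, scaling is required when
    # the neighbouring PU and the current PU reference different POCs.
    return neighbour_ref_poc != current_ref_poc

def classify_case(same_list, same_poc):
    # Returns the case number (1 to 4) as enumerated above.
    if same_poc:
        return 1 if same_list else 2   # no spatial scaling
    return 3 if same_list else 4       # spatial scaling
```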

As shown in Fig. 9, in a spatial scaling process, the motion vector of the neighbouring PU is scaled in a similar manner as for temporal scaling. The main difference is that the reference picture list and index of the current PU are given as input; the actual scaling process is the same as that of temporal scaling.

2.1.2.3 Temporal motion vector candidates

Apart from the reference picture index derivation, all processes for the derivation of temporal Merge candidates are the same as for the derivation of spatial motion vector candidates (see Fig. 6). The reference picture index is signaled to the decoder.

2.2 New inter prediction methods in JEM

2.2.1 Adaptive motion vector difference resolution

In HEVC, motion vector differences (MVDs, between the motion vector of a PU and the predicted motion vector) are signalled in units of quarter luma samples when use_integer_mv_flag is equal to 0 in the slice header. In JEM, a locally adaptive motion vector resolution (LAMVR) is introduced. In JEM, an MVD can be coded in units of quarter luma samples, integer luma samples or four luma samples. The MVD resolution is controlled at the coding unit (CU) level, and an MVD resolution flag is conditionally signalled for each CU that has at least one non-zero MVD component.

For a CU that has at least one non-zero MVD component, a first flag is signalled to indicate whether quarter-luma-sample MV precision is used in the CU. When the first flag (equal to 1) indicates that quarter-luma-sample MV precision is not used, another flag is signalled to indicate whether integer-luma-sample MV precision or four-luma-sample MV precision is used.

When the first MVD resolution flag of a CU is zero, or is not coded for the CU (meaning that all MVDs in the CU are zero), quarter-luma-sample MV resolution is used for the CU. When a CU uses integer-luma-sample MV precision or four-luma-sample MV precision, the MVPs in the AMVP candidate list of the CU are rounded to the corresponding precision.
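For illustration, rounding an AMVP predictor to the CU's MVD resolution can be sketched as below. The quarter-sample storage unit is taken from the text; the use of Python's round() is a simplification of JEM's fixed-point rounding.

```python
def round_mvp(mv, resolution):
    # mv is stored in quarter-luma-sample units; 'integer' and 'four'
    # resolutions correspond to multiples of 4 and 16 such units.
    step = {'quarter': 1, 'integer': 4, 'four': 16}[resolution]
    return tuple(step * round(c / step) for c in mv)
```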

At the encoder, CU-level RD checks are used to determine which MVD resolution is to be used for a CU. That is, the CU-level RD check is performed three times, once for each MVD resolution. To accelerate the encoder, the following coding schemes are applied in JEM.

During the RD check of a CU with the normal quarter-luma-sample MVD resolution, the motion information of the current CU (at integer-luma-sample accuracy) is stored. The stored motion information (after rounding) is used as the starting point for a further small-range motion vector refinement during the RD check for the same CU with integer-luma-sample and four-luma-sample MVD resolutions, so that the time-consuming motion estimation process is not duplicated three times.

The RD check of a CU with four-luma-sample MVD resolution is invoked conditionally. For a CU, when the RD cost of the integer-luma-sample MVD resolution is much larger than that of the quarter-luma-sample MVD resolution, the RD check of the four-luma-sample MVD resolution for the CU is skipped.

2.2.2 Higher motion vector storage accuracy

In HEVC, motion vector accuracy is one-quarter pel (one-quarter luma sample and one-eighth chroma sample for 4:2:0 video). In JEM, the accuracy for internal motion vector storage and for Merge candidates is increased to 1/16 pel. The higher motion vector accuracy (1/16 pel) is used in motion-compensated inter prediction for CUs coded in skip/Merge mode. For CUs coded in the normal AMVP mode, either integer-pel or quarter-pel motion is used, as described in Section 2.2.1.

SHVC upsampling interpolation filters, which have the same filter length and normalization factor as the HEVC motion-compensation interpolation filters, are used as motion-compensation interpolation filters for the additional fractional pel positions. The chroma component motion vector accuracy is 1/32 sample in JEM; the additional interpolation filters for the 1/32-pel fractional positions are derived by using the average of the filters of the two neighbouring 1/16-pel fractional positions.
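A sketch of how a 1/32-pel chroma filter could be derived by averaging the two neighbouring 1/16-pel filters, per the passage above. The tap values in the test and the rounding offset are illustrative assumptions, not JEM's actual coefficients.

```python
def derive_1_32_pel_filter(f16_lo, f16_hi):
    # Average corresponding taps of the two neighbouring 1/16-pel filters;
    # the +1 rounding offset before the shift is an assumption.
    return [(a + b + 1) >> 1 for a, b in zip(f16_lo, f16_hi)]
```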

2.2.3 Local illumination compensation

Local illumination compensation (LIC) is based on a linear model for illumination changes, using a scaling factor a and an offset b. It is enabled or disabled adaptively for each inter-mode coded coding unit (CU).

When LIC applies to a CU, a least-squares-error method is employed to derive the parameters a and b by using the neighbouring samples of the current CU and their corresponding reference samples. More specifically, as illustrated in FIG. 10, subsampled (2:1 subsampling) neighbouring samples of the CU and the corresponding samples in the reference picture (identified by the motion information of the current CU or sub-CU) are used. The IC parameters are derived and applied for each prediction direction separately.
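The least-squares derivation of a and b can be sketched as follows. This is a floating-point illustration; JEM uses an integer implementation, and the two sample arrays here stand in for the 2:1-subsampled neighbouring samples and their reference counterparts.

```python
def lic_params(cur_neigh, ref_neigh):
    # Fit cur = a * ref + b in the least-squares sense over the sample pairs.
    n = len(cur_neigh)
    sx, sy = sum(ref_neigh), sum(cur_neigh)
    sxx = sum(x * x for x in ref_neigh)
    sxy = sum(x * y for x, y in zip(ref_neigh, cur_neigh))
    denom = n * sxx - sx * sx
    if denom == 0:
        return 1.0, (sy - sx) / n  # flat neighbourhood: offset-only model
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b
```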

When a CU is coded in Merge mode, the LIC flag is copied from neighbouring blocks, in a way similar to the motion information copying in Merge mode; otherwise, an LIC flag is signalled for the CU to indicate whether LIC applies.

When LIC is enabled for a picture, an additional CU-level RD check is needed to determine whether LIC is applied for a CU. When LIC is enabled for a CU, the mean-removed sum of absolute differences (MR-SAD) and the mean-removed sum of absolute Hadamard-transformed differences (MR-SATD) are used, instead of SAD and SATD, for the integer-pel motion search and the fractional-pel motion search, respectively.

To reduce the encoding complexity, the following coding scheme is applied in JEM.

When there is no obvious illumination change between the current picture and its reference pictures, LIC is disabled for the entire picture. To identify this situation, histograms of the current picture and of every reference picture of the current picture are calculated at the encoder. If the histogram difference between the current picture and every reference picture of the current picture is smaller than a given threshold, LIC is disabled for the current picture; otherwise, LIC is enabled for the current picture.

2.2.4 Affine motion compensation prediction

In HEVC, only a translational motion model is applied for motion compensation prediction (MCP), while in the real world there are many kinds of motion, e.g. zoom in/out, rotation, perspective motions and other irregular motions. In JEM, a simplified affine transform motion compensation prediction is applied. As shown in FIG. 11, the affine motion field of a block is described by two control-point motion vectors.

The motion vector field (MVF) of a block is described by the following equation:

$$\begin{cases}
v_x = \dfrac{(v_{1x}-v_{0x})}{w}\,x - \dfrac{(v_{1y}-v_{0y})}{w}\,y + v_{0x}\\[4pt]
v_y = \dfrac{(v_{1y}-v_{0y})}{w}\,x + \dfrac{(v_{1x}-v_{0x})}{w}\,y + v_{0y}
\end{cases}\qquad(1)$$

where (v0x, v0y) is the motion vector of the top-left corner control point and (v1x, v1y) is the motion vector of the top-right corner control point.

To further simplify motion compensation prediction, sub-block based affine transform prediction is applied. The sub-block size M×N is derived as in equation (2), where MvPre is the motion vector fractional accuracy (1/16 in JEM) and (v2x, v2y) is the motion vector of the bottom-left control point, calculated according to equation (1).

$$\begin{cases}
M = \mathrm{clip3}\!\left(4,\ w,\ \dfrac{w\cdot MvPre}{\max\left(\lvert v_{1x}-v_{0x}\rvert,\ \lvert v_{1y}-v_{0y}\rvert\right)}\right)\\[6pt]
N = \mathrm{clip3}\!\left(4,\ h,\ \dfrac{h\cdot MvPre}{\max\left(\lvert v_{2x}-v_{0x}\rvert,\ \lvert v_{2y}-v_{0y}\rvert\right)}\right)
\end{cases}\qquad(2)$$

After being derived by equation (2), M and N should be adjusted downward, if necessary, to make them divisors of w and h, respectively.

As shown in FIG. 12, to derive the motion vector of each M×N sub-block, the motion vector of the centre sample of each sub-block is calculated according to equation (1) and rounded to 1/16 fractional accuracy. Then, motion compensation interpolation filters are applied to generate the prediction of each sub-block with the derived motion vector.
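The per-sub-block MV derivation of equation (1), evaluated at each sub-block centre, can be sketched as follows. This is a floating-point sketch; JEM rounds the result to 1/16-sample accuracy.

```python
def affine_subblock_mvs(v0, v1, w, h, M, N):
    # v0, v1: top-left and top-right control-point MVs; w, h: block size;
    # M, N: sub-block size. Returns {(x0, y0): (vx, vy)} for each sub-block.
    dx = ((v1[0] - v0[0]) / w, (v1[1] - v0[1]) / w)  # per-pixel MV gradient
    mvs = {}
    for y0 in range(0, h, N):
        for x0 in range(0, w, M):
            x, y = x0 + M / 2, y0 + N / 2  # sub-block centre sample
            vx = dx[0] * x - dx[1] * y + v0[0]
            vy = dx[1] * x + dx[0] * y + v0[1]
            mvs[(x0, y0)] = (vx, vy)
    return mvs
```

For a purely translational motion (v0 == v1), every sub-block inherits the same MV, as expected from the model.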

After MCP, the high-accuracy motion vector of each sub-block is rounded and saved with the same accuracy as the normal motion vector.

In JEM, there are two affine motion modes: AF_INTER mode and AF_MERGE mode. For CUs with both width and height larger than 8, AF_INTER mode can be applied. An affine flag at the CU level is signalled in the bitstream to indicate whether AF_INTER mode is used. In this mode, a candidate list with motion vector pairs {(v0, v1) | v0 = {vA, vB, vC}, v1 = {vD, vE}} is constructed using the neighbouring blocks. As shown in FIG. 13, v0 is selected from the motion vectors of block A, B or C. The motion vector from a neighbouring block is scaled according to the reference list and to the relationship among the POC of the reference for the neighbouring block, the POC of the reference for the current CU and the POC of the current CU. The approach for selecting v1 from the neighbouring blocks D and E is similar. If the number of candidates in the candidate list is smaller than 2, the list is padded with motion vector pairs composed by duplicating each of the AMVP candidates. When the candidate list is larger than 2, the candidates are first sorted according to the consistency of the neighbouring motion vectors (the similarity of the two motion vectors in a pair candidate) and only the first two candidates are kept. An RD cost check is used to determine which motion vector pair candidate is selected as the control point motion vector prediction (CPMVP) of the current CU, and an index indicating the position of the CPMVP in the candidate list is signalled in the bitstream. After the CPMVP of the current affine CU is determined, affine motion estimation is applied and the control point motion vector (CPMV) is found. Then the difference between the CPMV and the CPMVP is signalled in the bitstream.

When a CU is applied in AF_MERGE mode, it gets the first block coded in affine mode from the valid neighbouring reconstructed blocks. As shown in FIG. 14A, the selection order for the candidate block is from left, above, above-right, below-left to above-left. As shown in FIG. 14B, if the neighbouring below-left block A is coded in affine mode, the motion vectors v2, v3 and v4 of the top-left corner, above-right corner and bottom-left corner of the CU that contains block A are derived. The motion vector v0 of the top-left corner of the current CU is calculated according to v2, v3 and v4; then the motion vector v1 of the above-right corner of the current CU is calculated.

After the CPMVs v0 and v1 of the current CU are derived, the MVF of the current CU is generated according to the simplified affine motion model of equation (1). In order to identify whether the current CU is coded with AF_MERGE mode, an affine flag is signalled in the bitstream when there is at least one neighbouring block coded in affine mode.

2.2.5 Pattern-matched motion vector derivation

The pattern-matched motion vector derivation (PMMVD) mode is a special Merge mode based on frame-rate up-conversion (FRUC) techniques. With this mode, the motion information of a block is not signalled but derived at the decoder side.

A FRUC flag is signalled for a CU when its Merge flag is true. When the FRUC flag is false, a Merge index is signalled and the regular Merge mode is used. When the FRUC flag is true, an additional FRUC mode flag is signalled to indicate which method (bilateral matching or template matching) is to be used to derive the motion information of the block.

At the encoder side, the decision on whether to use FRUC Merge mode for a CU is based on RD cost selection, as is done for normal Merge candidates. That is, the two matching modes (bilateral matching and template matching) are both checked for the CU by using RD cost selection. The mode leading to the minimal cost is further compared to other CU modes. If a FRUC matching mode is the most efficient one, the FRUC flag is set to true for the CU and the related matching mode is used.

The motion derivation process in FRUC Merge mode has two steps: a CU-level motion search is performed first, followed by a sub-CU-level motion refinement. At the CU level, an initial motion vector is derived for the whole CU based on bilateral matching or template matching. First, a list of MV candidates is generated, and the candidate leading to the minimum matching cost is selected as the starting point for further CU-level refinement. Then a local search based on bilateral matching or template matching around the starting point is performed, and the MV resulting in the minimum matching cost is taken as the MV for the whole CU. Subsequently, the motion information is further refined at the sub-CU level with the derived CU motion vectors as the starting points.

For example, the following derivation process is performed for the motion information derivation of a W×H CU. At the first stage, the MV for the whole W×H CU is derived. At the second stage, the CU is further split into M×M sub-CUs. The value of M is calculated as in equation (3), where D is a predefined splitting depth that is set to 3 by default in JEM. Then the MV for each sub-CU is derived.

$$M = \max\left\{4,\ \min\left\{\frac{W}{2^{D}},\ \frac{H}{2^{D}}\right\}\right\}\qquad(3)$$

As shown in FIG. 15, bilateral matching is used to derive the motion information of the current CU by finding the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures. Under the assumption of a continuous motion trajectory, the motion vectors MV0 and MV1 pointing to the two reference blocks shall be proportional to the temporal distances between the current picture and the two reference pictures, i.e. TD0 and TD1. As a special case, when the current picture is temporally between the two reference pictures and the temporal distances from the current picture to the two reference pictures are the same, the bilateral matching becomes a mirror-based bidirectional MV.

As shown in FIG. 16, template matching is used to derive the motion information of the current CU by finding the closest match between a template (top and/or left neighbouring blocks of the current CU) in the current picture and a block (of the same size as the template) in a reference picture. In addition to the aforementioned FRUC Merge mode, template matching is also applied to AMVP mode. In JEM, as in HEVC, AMVP has two candidates. With the template matching method, a new candidate is derived. If the candidate newly derived by template matching is different from the first existing AMVP candidate, it is inserted at the very beginning of the AMVP candidate list and then the list size is set to two (meaning that the second existing AMVP candidate is removed). When applied to AMVP mode, only the CU-level search is applied.

2.2.5.1 CU-level MV candidate set

The MV candidate set at the CU level consists of:

the original AMVP candidates if the current CU is in AMVP mode,

all Merge candidates,

several MVs in the interpolated MV field,

the top and left neighbouring motion vectors.

When bilateral matching is used, each valid MV of a Merge candidate is used as an input to generate an MV pair with the assumption of bilateral matching. For example, one valid MV of a Merge candidate is (MVa, refa) in reference list A. Then the reference picture refb of its paired bilateral MV is found in the other reference list B so that refa and refb are temporally on different sides of the current picture. If such a refb is not available in reference list B, refb is determined as a reference that is different from refa and whose temporal distance to the current picture is the minimal one in list B. After refb is determined, MVb is derived by scaling MVa based on the temporal distances between the current picture and refa and refb.
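Deriving the paired bilateral MV by POC-distance scaling, as described above, can be sketched as follows (a floating-point stand-in for the fixed-point scaling process):

```python
def bilateral_mv_pair(mva, poc_cur, poc_refa, poc_refb):
    # Scale MVa by the ratio of temporal (POC) distances so that
    # (MVa, MVb) lie on one linear motion trajectory through the
    # current picture.
    scale = (poc_cur - poc_refb) / (poc_cur - poc_refa)
    return (mva[0] * scale, mva[1] * scale)
```

When refa and refb are equidistant on opposite sides of the current picture, the scale is -1 and the pair is exactly the mirror-based bidirectional MV mentioned earlier.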

Four MVs from the interpolated MV field are also added to the CU-level candidate list. More specifically, the interpolated MVs at the positions (0, 0), (W/2, 0), (0, H/2) and (W/2, H/2) of the current CU are added.

When FRUC is applied in AMVP mode, the original AMVP candidates are also added to the CU-level MV candidate set.

At the CU level, up to 15 MVs for AMVP CUs and up to 13 MVs for Merge CUs are added to the candidate list.

2.2.5.2 Sub-CU-level MV candidate set

The MV candidate set at the sub-CU level consists of:

the MV determined from the CU-level search,

the top, left, top-left and top-right neighbouring MVs,

scaled versions of collocated MVs from reference pictures,

up to 4 ATMVP candidates,

up to 4 STMVP candidates.

The scaled MVs from reference pictures are derived as follows. All the reference pictures in both lists are traversed, and the MVs at the collocated position of the sub-CU in a reference picture are scaled to the reference of the starting CU-level MV.

ATMVP and STMVP candidates are limited to the first four.

At the sub-CU level, up to 17 MVs are added to the candidate list.

2.2.5.3 Generation of the interpolated MV field

Before coding a frame, an interpolated motion field is generated for the whole picture based on unilateral ME. The motion field may then be used later as CU-level or sub-CU-level MV candidates.

First, the motion field of each reference picture in both reference lists is traversed at the 4×4 block level. For each 4×4 block, if the motion associated with the block passes through a 4×4 block in the current picture (as shown in FIG. 17) and the block has not been assigned any interpolated motion, the motion of the reference block is scaled to the current picture according to the temporal distances TD0 and TD1 (in the same way as the MV scaling of TMVP in HEVC), and the scaled motion is assigned to the block in the current frame. If no scaled MV is assigned to a 4×4 block, the block's motion is marked as unavailable in the interpolated motion field.

2.2.5.4 Interpolation and matching cost

When a motion vector points to a fractional sample position, motion-compensated interpolation is needed. To reduce complexity, bilinear interpolation, instead of the regular 8-tap HEVC interpolation, is used for both bilateral matching and template matching.

The calculation of the matching cost differs slightly between steps. When a candidate is selected from the candidate set at the CU level, the matching cost is the sum of absolute differences (SAD) of bilateral matching or template matching. After the starting MV is determined, the matching cost C of the bilateral matching at the sub-CU-level search is calculated as follows:

$$C = \mathrm{SAD} + w\cdot\left(\lvert MV_x - MV_x^{s}\rvert + \lvert MV_y - MV_y^{s}\rvert\right)\qquad(4)$$

where w is a weighting factor that is empirically set to 4, and MV = (MVx, MVy) and MVs = (MVxs, MVys) indicate the current MV and the starting MV, respectively. SAD is still used as the matching cost of template matching at the sub-CU-level search.
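The sub-CU-level bilateral matching cost of equation (4) translates directly into code; the weighting factor w = 4 is the empirical value given in the text.

```python
def bilateral_matching_cost(sad, mv, mv_start, w=4):
    # SAD of the matched blocks plus an MV-difference regularization term
    # that penalizes drifting away from the starting MV.
    return sad + w * (abs(mv[0] - mv_start[0]) + abs(mv[1] - mv_start[1]))
```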

In FRUC mode, the MV is derived by using luma samples only. The derived motion will be used for both luma and chroma in MC inter prediction. After the MV is decided, the final MC is performed using the 8-tap interpolation filter for luma and the 4-tap interpolation filter for chroma.

2.2.5.5 MV refinement

MV refinement is a pattern-based MV search with the criterion of bilateral matching cost or template matching cost. In JEM, two search patterns are supported: an unrestricted centre-biased diamond search (UCBDS) and an adaptive cross search, for MV refinement at the CU level and the sub-CU level, respectively. For both CU-level and sub-CU-level MV refinement, the MV is directly searched at quarter-luma-sample MV accuracy, and this is followed by one-eighth-luma-sample MV refinement. The search range of MV refinement for the CU step and the sub-CU step is set equal to 8 luma samples.

2.2.5.6 Selection of prediction direction in template matching FRUC Merge mode

In the bilateral matching Merge mode, bi-prediction is always applied because the motion information of a CU is derived based on the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures. There is no such limitation for the template matching Merge mode. In the template matching Merge mode, the encoder can choose among uni-prediction from list0, uni-prediction from list1 or bi-prediction for a CU. The selection is based on a template matching cost, as follows:

If costBi <= factor * min(cost0, cost1),

bi-prediction is used;

otherwise, if cost0 <= cost1,

uni-prediction from list0 is used;

otherwise,

uni-prediction from list1 is used;

where cost0 is the SAD of the list0 template matching, cost1 is the SAD of the list1 template matching and costBi is the SAD of the bi-prediction template matching. The value of factor is equal to 1.25, which means that the selection process is biased toward bi-prediction.
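The selection rule can be sketched as follows; since factor > 1, bi-prediction wins even when its cost is somewhat above the better uni-prediction cost.

```python
def select_prediction_direction(cost0, cost1, cost_bi, factor=1.25):
    # cost0/cost1: SAD of list0/list1 template matching;
    # cost_bi: SAD of bi-prediction template matching.
    if cost_bi <= factor * min(cost0, cost1):
        return 'bi'
    return 'uni_list0' if cost0 <= cost1 else 'uni_list1'
```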

The inter prediction direction selection is only applied to the CU-level template matching process.

2.2.6 Decoder-side motion vector refinement

In a bi-prediction operation, for the prediction of one block region, two prediction blocks, formed using an MV of list0 and an MV of list1 respectively, are combined to form a single prediction signal. In the decoder-side motion vector refinement (DMVR) method, the two motion vectors of the bi-prediction are further refined by a bilateral template matching process. The bilateral template matching is applied in the decoder to perform a distortion-based search between a bilateral template and the reconstructed samples in the reference pictures, in order to obtain a refined MV without transmission of additional motion information.

As shown in FIG. 18, in DMVR a bilateral template is generated as the weighted combination (i.e. average) of the two prediction blocks, from the initial MV0 of list0 and MV1 of list1, respectively. The template matching operation consists of calculating cost measures between the generated template and the sample region (around the initial prediction block) in the reference picture. For each of the two reference pictures, the MV that yields the minimum template cost is considered as the updated MV of that list, replacing the original one. In JEM, nine MV candidates are searched for each list. The nine MV candidates include the original MV and eight surrounding MVs with a one-luma-sample offset to the original MV in the horizontal direction, the vertical direction, or both. Finally, as shown in FIG. 18, the two new MVs, i.e. MV0' and MV1', are used for generating the final bi-prediction results. A sum of absolute differences (SAD) is used as the cost measure. Note that when calculating the cost of a prediction block generated by one surrounding MV, the rounded MV (to integer pel) is actually used to obtain the prediction block, instead of the real MV.
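The nine-candidate search pattern (the original MV plus eight one-luma-sample offsets) can be enumerated as:

```python
def dmvr_candidates(mv):
    # Original MV plus the 8 neighbours offset by one luma sample in the
    # horizontal direction, the vertical direction, or both.
    return [(mv[0] + dx, mv[1] + dy)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
```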

DMVR is applied for the Merge mode of bi-prediction with one MV from a reference picture in the past and another from a reference picture in the future, without the transmission of additional syntax elements. In JEM, DMVR is not applied when LIC, affine motion, FRUC, or sub-CU Merge candidates are enabled for a CU.

2.3 Non-adjacent Merge candidates

In J0021, Qualcomm proposed to derive additional spatial Merge candidates from non-adjacent neighbouring positions, which are marked as 6 to 49 in FIG. 19. The derived candidates are added after the TMVP candidate in the Merge candidate list.

In J0058, Tencent proposed to derive additional spatial Merge candidates from positions in an outer reference area that has an offset of (-96, -96) to the current block.

As shown in FIG. 20, the positions are marked as A(i, j), B(i, j), C(i, j), D(i, j) and E(i, j). Each candidate B(i, j) or C(i, j) has an offset of 16 in the vertical direction compared to its previous B or C candidate. Each candidate A(i, j) or D(i, j) has an offset of 16 in the horizontal direction compared to its previous A or D candidate. Each E(i, j) has an offset of 16 in both the horizontal and vertical directions compared to its previous E candidate. The candidates are checked from the inside to the outside, and the order of the candidates is A(i, j), B(i, j), C(i, j), D(i, j) and E(i, j). Whether the number of Merge candidates can be further reduced is under further study. The candidates are added after the TMVP candidate in the Merge candidate list.

In J0059, the extended spatial positions from 6 to 27 in FIG. 21 are checked according to their numerical order after the temporal candidate. To save the MV line buffer, all the spatial candidates are restricted to within two CTU lines.

2.4 Related methods

The ultimate motion vector expression (UMVE) in J0024 can be used for skip mode or direct (or Merge) mode with the proposed motion vector expression method that uses neighboring motion information. Like the skip and Merge modes in HEVC, UMVE also builds a candidate list from neighboring motion information. Among the candidates in the list, an MV candidate is selected and further expanded by the new motion vector expression method.

Figure 22 shows an example of the UMVE search process, and Figure 23 shows an example of UMVE search points.

UMVE provides a new motion vector expression with simplified signaling. The expression method consists of a starting point, a motion magnitude and a motion direction.

The base candidate index defines the starting point. The base candidate index indicates the best candidate among the candidates in the list, as follows (the table was rendered as an image in the original; the values shown are those of the J0024 UMVE proposal):

Base candidate IDX    0        1        2        3
N-th MVP              1st MVP  2nd MVP  3rd MVP  4th MVP

The distance index is the motion magnitude information. The distance index indicates a predefined distance from the starting-point information. The predefined distances are as follows ("pel" denotes pixels; the table was rendered as an image in the original, and the values shown are those of the J0024 UMVE proposal):

Distance IDX      0        1        2      3      4      5      6       7
Pixel distance    1/4-pel  1/2-pel  1-pel  2-pel  4-pel  8-pel  16-pel  32-pel

The direction index indicates the direction of the MVD relative to the starting point. The direction index can represent the four directions shown below (the table was rendered as an image in the original; the values shown are those of the J0024 UMVE proposal):

Direction IDX    00    01    10    11
x-axis           +     -     N/A   N/A
y-axis           N/A   N/A   +     -
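As a minimal sketch, a decoder could turn the signaled distance and direction indices into an MVD as below, assuming the distance and direction tables of the J0024 UMVE proposal and motion vectors stored in quarter-sample units; the function name is illustrative, not normative syntax.

```python
# Hypothetical decode of a UMVE expression into an MVD, assuming the
# J0024 distance/direction tables. MVs are in quarter-sample units,
# so a 1-pel distance corresponds to 4 units.
DISTANCE_PEL = [1/4, 1/2, 1, 2, 4, 8, 16, 32]      # indexed by distance IDX
DIRECTION = [(+1, 0), (-1, 0), (0, +1), (0, -1)]   # IDX 00, 01, 10, 11

def umve_mvd(distance_idx, direction_idx):
    step = int(DISTANCE_PEL[distance_idx] * 4)     # quarter-sample units
    sx, sy = DIRECTION[direction_idx]
    return (sx * step, sy * step)

# Distance IDX 2 (1-pel) in direction IDX 11 (negative y):
print(umve_mvd(2, 3))   # -> (0, -4)
```

Note that exactly one component of the result is non-zero, which matches the limitation of UMVE discussed in Section 3.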

3. Discussion of the shortcomings of existing implementations

In Merge mode, the motion information of a Merge candidate is inherited by the current block, including the motion vector, reference picture, prediction direction, LIC flag, etc. Only the Merge index is signaled, which is efficient in many cases. However, the inherited motion information, especially the motion vector, may not be good enough.

In AMVP mode, on the other hand, all motion information is signaled, including the motion vector (i.e., MVP index and MVD), reference picture (i.e., reference index), prediction direction, LIC flag, MVD precision, etc., which consumes bits.

In the UMVE proposed in J0024, an additional MVD is coded. However, the MVD can have a non-zero component only in the horizontal direction or the vertical direction, not in both directions. Meanwhile, MVD information, i.e., the distance index or motion magnitude information, is also signaled.

4. Methods for an extended Merge mode (EMM) based on the disclosed technology

Video encoder and decoder embodiments can use the techniques disclosed in this document to implement an extended Merge mode (EMM), in which only a small amount of information is signaled and there are no special restrictions on the MVD.

The detailed inventions below should be considered as examples explaining the general concepts and should not be interpreted in a narrow way. Furthermore, these inventions can be combined in any manner.

It is proposed to divide the motion information (such as the prediction direction, reference index/picture, motion vector, LIC flag, affine flag, intra block copy (IBC) flag, MVD precision and MVD value) into two parts. The first part is directly inherited, and the second part is explicitly signaled with or without predictive coding.

It is proposed to build an EMM list and signal an index to indicate which candidate's first part of motion information is inherited by the current block (e.g., PU/CU). Meanwhile, additional information such as the MVD (i.e., the second part of the motion information) is further signaled.
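The split between the inherited first part and the signaled second part can be sketched as below. This is an illustrative model, not the normative syntax: the field names, and the choice of an MVD as the only signaled second part, are assumptions for the sketch.

```python
from dataclasses import dataclass, replace as dc_replace

@dataclass(frozen=True)
class EmmCandidate:
    # First part: inherited wholesale from the selected list entry.
    pred_dir: str          # 'L0', 'L1' or 'BI'
    ref_idx: tuple         # reference index per list
    mv: tuple              # ((mvx0, mvy0), (mvx1, mvy1))
    lic_flag: bool
    mvd_precision: int     # in quarter-sample units

def decode_emm_block(emm_list, emm_index, signaled_mvd):
    """Inherit the first part via the signaled index, then apply the
    signaled second part (here, one MVD per reference list)."""
    cand = emm_list[emm_index]
    new_mv = tuple((mx + dx, my + dy)
                   for (mx, my), (dx, dy) in zip(cand.mv, signaled_mvd))
    return dc_replace(cand, mv=new_mv)

emm_list = [EmmCandidate("BI", (0, 0), ((3, -2), (1, 5)), True, 4)]
refined = decode_emm_block(emm_list, 0, ((1, 1), (0, -2)))
print(refined.mv)   # -> ((4, -1), (1, 3))
```

Everything except the MVD (prediction direction, reference indices, LIC flag, MVD precision) is carried over unchanged from the inherited candidate.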

The first part of the motion information includes all or some of the following: prediction direction, reference picture, motion vector, LIC flag, MVD precision, etc.

The second part can be coded using predictive coding.

It is proposed to construct the motion information candidate list by inserting the motion information of spatially neighboring blocks, temporally neighboring blocks, or non-adjacent blocks.

In one example, the candidate list is constructed in the same way as in Merge mode.

Alternatively or additionally, the motion information of non-adjacent blocks is inserted into the candidate list.

Alternatively or additionally, PU/CU-based FRUC candidates are inserted into the candidate list.

For FRUC candidates, the MVD precision is set to 1/4 or any other valid MVD precision, and the LIC flag is set to false.

Alternatively or additionally, unidirectional candidates (if not available) are generated from bidirectional candidates (if available) and inserted into the candidate list. The LIC flag and MVD precision are copied from the corresponding bidirectional candidate.
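Deriving a unidirectional candidate from a bidirectional one then amounts to keeping one list's motion while copying the LIC flag and MVD precision unchanged. A hedged sketch, with illustrative candidate fields that are not taken from the text:

```python
# Hypothetical derivation of a unidirectional candidate from a
# bidirectional one: keep list X's motion, copy the LIC flag and
# MVD precision as described above. Field names are illustrative.
def make_unidirectional(bi_cand, list_x):
    assert bi_cand["pred_dir"] == "BI"
    return {
        "pred_dir": "L0" if list_x == 0 else "L1",
        "mv": bi_cand["mv"][list_x],
        "ref_idx": bi_cand["ref_idx"][list_x],
        "lic_flag": bi_cand["lic_flag"],            # copied
        "mvd_precision": bi_cand["mvd_precision"],  # copied
    }

bi = {"pred_dir": "BI", "mv": ((3, -2), (1, 5)), "ref_idx": (0, 1),
      "lic_flag": True, "mvd_precision": 4}
print(make_unidirectional(bi, 0))
```

Both the L0- and the L1-restricted candidate can be generated this way and appended when the corresponding unidirectional candidate is missing from the list.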

Alternatively or additionally, an L1-X direction candidate (if not available) is generated by scaling the MV of the LX direction candidate (if available). The LIC flag and MVD precision are copied from the corresponding LX direction candidate.

In one example, the first entry of the L1-X reference picture list is selected as the reference picture in the L1-X direction.

In one example, the symmetric reference picture of the LX reference picture (if available) is selected as the reference picture in the L1-X direction.
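One way to realize the scaling mentioned above is the usual POC-distance-based linear scaling familiar from TMVP in HEVC; that this is the intended scaling is an assumption, and the sketch below uses floating point for clarity rather than the fixed-point arithmetic a codec would use.

```python
def scale_mv_to_other_list(mv, poc_cur, poc_ref_src, poc_ref_dst):
    """Scale an LX motion vector to an L(1-X) reference picture,
    assuming linear (constant-velocity) motion over POC distances."""
    td = poc_cur - poc_ref_src      # distance to the source reference
    tb = poc_cur - poc_ref_dst      # distance to the target reference
    s = tb / td
    return (round(mv[0] * s), round(mv[1] * s))

# Current picture at POC 8, L0 reference at POC 4, symmetric L1
# reference at POC 12: the scaled MV is simply mirrored.
print(scale_mv_to_other_list((6, -2), 8, 4, 12))   # -> (-6, 2)
```

With a symmetric reference picture the scale factor is exactly -1, which is why the symmetric choice in the example above is attractive: no rounding error is introduced.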

Combined bi-predictive candidates and/or zero candidates are also inserted.
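Filling the tail of the list with combined bi-predictive and then zero candidates can follow the familiar HEVC Merge pattern; the sketch below is a simplified, hedged version (it pairs only purely unidirectional candidates, and the field names are illustrative):

```python
def fill_candidate_list(cands, max_cands):
    """Append combined bi-predictive candidates (L0 motion of one
    candidate paired with L1 motion of another), then zero candidates,
    until the list reaches max_cands. A simplified sketch."""
    out = list(cands)
    uni = [c for c in out if c["pred_dir"] in ("L0", "L1")]
    for a in uni:
        for b in uni:
            if len(out) >= max_cands:
                return out
            if a["pred_dir"] == "L0" and b["pred_dir"] == "L1":
                out.append({"pred_dir": "BI",
                            "mv": (a["mv"], b["mv"]),
                            "ref_idx": (a["ref_idx"], b["ref_idx"])})
    ref = 0
    while len(out) < max_cands:   # zero candidates as a last resort
        out.append({"pred_dir": "BI", "mv": ((0, 0), (0, 0)),
                    "ref_idx": (ref, ref)})
        ref += 1
    return out

cands = [{"pred_dir": "L0", "mv": (2, 0), "ref_idx": 0},
         {"pred_dir": "L1", "mv": (0, 3), "ref_idx": 1}]
print(fill_candidate_list(cands, 4))
```

A production implementation would also prune duplicates and bound the number of combined pairs, as HEVC does.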

Alternatively, the prediction direction is not inherited but explicitly signaled. In this case, it is proposed to construct two or more motion information candidate lists.

For each prediction direction (reference picture list), a motion information candidate list is constructed, in which the first part of the motion information (excluding the reference picture list index, in contrast to the embodiments and examples above) can be inherited from one entry of the motion information candidate list. In one example, the first part of the motion information may include all or some of the following: reference picture, motion vector, LIC flag, MVD precision, etc.

Alternatively, only one motion information candidate list is constructed, as described in the embodiments and examples above. However, two indexes may be signaled to indicate which candidates are inherited for each reference picture list in the bi-prediction case.

The proposed methods may be applied to certain block sizes/shapes and/or certain sub-block sizes.

The proposed methods may be applied to certain modes, such as conventional translational motion (i.e., with affine mode disabled).

The above examples may be incorporated in the context of the method described below (e.g., method 2400), which may be implemented at a video decoder or a video encoder.

Figure 24 is a flowchart of an example method 2400 of processing a video bitstream. The method 2400 includes constructing (2402) an extended Merge mode (EMM) candidate list; determining (2404), based on a first set of bits in a bitstream representation of a current block, motion information inherited by the current block from the list; determining (2406), based on a second set of bits in the bitstream representation, signaled motion information for the current block; and performing (2408) a conversion between the current block and the bitstream representation based on the EMM candidate list and the signaled motion information.

The examples listed below provide embodiments that can address the technical problems described in this document, among others.

1. A video processing method, including: constructing an extended Merge mode (EMM) candidate list; determining, based on a first set of bits in a bitstream representation of a current block, motion information inherited by the current block from the list; determining, based on a second set of bits in the bitstream representation, signaled motion information for the current block; and performing a conversion between the current block and the bitstream representation based on the EMM candidate list and the signaled motion information.

2. The method of example 1, wherein the motion information inherited by the current block includes at least one of the following motion information of another block: prediction direction, reference picture, motion vector, local illumination compensation (LIC) flag, and motion vector difference (MVD) precision.

3. The method of example 1 or 2, wherein the signaled motion information includes predicted motion information of the current block or motion vector difference (MVD) information of the current block.

4. The method of example 1 or 2, wherein the second set of bits is coded using predictive coding.

5. The method of any of examples 1 to 4, wherein constructing the EMM candidate list includes inserting motion candidates from spatially neighboring blocks into the EMM candidate list.

6. The method of any of examples 1 to 4, wherein constructing the EMM candidate list includes inserting motion candidates from temporally neighboring blocks into the EMM candidate list.

7. The method of any of examples 1 to 4, wherein constructing the EMM candidate list includes inserting motion candidates from non-adjacent blocks into the EMM candidate list.

8. The method of any of examples 1 to 4, wherein constructing the EMM candidate list includes inserting frame rate up-conversion (FRUC) candidates into the EMM candidate list.

9. The method of example 8, wherein for the FRUC candidates, the MVD precision is set to 1/4 and the LIC flag is set to false.

10. The method of any of examples 1 to 4, wherein constructing the EMM candidate list includes inserting unidirectional candidates into the EMM candidate list.

11. The method of example 10, wherein the unidirectional candidates are generated from bidirectional candidates.

12. The method of example 11, wherein the MVD precision and the LIC flag of a unidirectional candidate are copied from the bidirectional candidate.

13. The method of any of examples 1 to 4, wherein constructing the EMM candidate list includes inserting an LY-direction candidate into the EMM candidate list, wherein the LY-direction candidate is generated from scaled motion vectors of an LX-direction candidate, where X = {0, 1} and Y = 1 - X, and where L0 and L1 denote reference picture lists.

14. The method of example 13, wherein a symmetric reference picture of the LX reference picture is selected as the reference picture in the LY direction.

15. The method of any of examples 1 to 14, wherein constructing the EMM candidate list includes inserting combined bi-predictive candidates or zero candidates into the EMM candidate list.

16. The method of any of examples 1 to 4, wherein the prediction direction is not inherited but is included in the signaled motion information, and wherein the method further includes constructing multiple motion information candidate lists, wherein one of the multiple motion information candidate lists includes multiple candidates from the same prediction direction, and wherein the motion information inherited by the current block is inherited from one of the multiple motion information candidate lists.

17. The method of example 16, wherein the motion information inherited by the current block identifies at least one of a reference picture, a motion vector, a local illumination compensation (LIC) flag, and a motion vector difference (MVD) precision.

18. The method of any of examples 1 to 15, wherein two indexes are used to indicate which candidates are inherited for each reference picture list, for bi-predictive coding of the current block.

19. The method of any of examples 1 to 4, wherein the motion information inherited by the current block includes a motion vector difference (MVD) precision.

20. The method of any of examples 1 to 19, wherein the method is selectively used based on a coding characteristic of the current block, and wherein the coding characteristic includes the use of a translational motion model.

21. An apparatus in a video system, including a processor and a non-transitory memory with instructions stored thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any of examples 1 to 20.

22. A computer program product stored on a non-transitory computer-readable medium, the computer program product including program code for implementing the method of any of examples 1 to 20.

5. References

[1] ITU-T and ISO/IEC, "High efficiency video coding", Rec. ITU-T H.265 | ISO/IEC 23008-2 (current edition).

[2] C. Rosewarne, B. Bross, M. Naccari, K. Sharman, G. Sullivan, "High Efficiency Video Coding (HEVC) Test Model 16 (HM 16) Improved Encoder Description Update 7", JCTVC-Y1002, October 2016.

[3] J. Chen, E. Alshina, G. J. Sullivan, J.-R. Ohm, J. Boyce, "Algorithm description of Joint Exploration Test Model 7 (JEM7)", JVET-G1001, August 2017.

[4] JEM-7.0: https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/tags/HM-16.6-JEM-7.0.

[5] A. Alshin, E. Alshina, et al., "Description of SDR, HDR and 360° video coding technology proposal by Samsung, Huawei, GoPro, and HiSilicon – mobile application scenario", JVET-J0024, April 2018.

6. Embodiments of the disclosed technology

Figure 25 is a block diagram of a video processing apparatus 2500. The apparatus 2500 may be used to implement one or more of the methods described herein. The apparatus 2500 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, etc. The apparatus 2500 may include one or more processors 2502, one or more memories 2504, and video processing hardware 2506. The processor(s) 2502 may be configured to implement one or more of the methods described in this document (including but not limited to method 2400). The memory (memories) 2504 may be used to store data and code used to implement the methods and techniques described herein. The video processing hardware 2506 may be used to implement, in hardware circuitry, some of the techniques described in this document.

In some embodiments, the video coding method may be implemented using an apparatus implemented on a hardware platform as described with respect to Figure 25.

The disclosed and other solutions, examples, embodiments, modules and functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer-readable medium, for execution by, or to control the operation of, a data processing apparatus. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing apparatus" encompasses all apparatus, devices and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. In addition to hardware, the apparatus can include code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical or electromagnetic signal, that is generated to encode information for transmission to a suitable receiver apparatus.

A computer program (also known as a program, software, software application, script or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special-purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general- and special-purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical or optical disks, to receive data from, or transfer data to, the one or more mass storage devices, or both. However, a computer need not have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special-purpose logic circuitry.

While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown, or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

Only a few implementations and examples are described, and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.

tb, td‧‧‧distance
TD0, TD1‧‧‧temporal distance
2400‧‧‧method
2402, 2404, 2406, 2408‧‧‧steps
2500‧‧‧apparatus
2502‧‧‧processor
2504‧‧‧memory
2506‧‧‧video processing hardware

Figure 1 shows an example of the derivation process for Merge candidate list construction.
Figure 2 shows example positions of spatial Merge candidates.
Figure 3 shows an example of candidate pairs considered for the redundancy check of spatial Merge candidates.
Figures 4A and 4B show example positions of the second PU for N×2N and 2N×N partitions.
Figure 5 is an example illustration of motion vector scaling for the temporal Merge candidate.
Figure 6 shows examples of candidate positions C0 and C1 for the temporal Merge candidate.
Figure 7 shows an example of a combined bi-predictive Merge candidate.
Figure 8 shows an example derivation process for motion vector prediction candidates.
Figure 9 is an example illustration of motion vector scaling for spatial motion vector candidates.
Figure 10 shows an example of neighboring samples used to derive IC parameters.
Figure 11 shows an example of a simplified affine motion model.
Figure 12 shows an example of the affine MVF per sub-block.
Figure 13 shows an example of the MVP for AF_INTER.
Figures 14A and 14B show examples of candidates for AF_MERGE.
Figure 15 shows an example of bilateral matching.
Figure 16 shows an example of template matching.
Figure 17 shows an example of unilateral ME in FRUC.
Figure 18 shows an example of DMVR based on bilateral template matching.
Figure 19 shows an example of non-adjacent Merge candidates.
Figure 20 shows an example of non-adjacent Merge candidates.
Figure 21 shows an example of non-adjacent Merge candidates.
Figures 22 and 23 depict examples of the ultimate motion vector expression technique for video coding.
Figure 24 is a flowchart of an example video bitstream processing method.
Figure 25 is a block diagram of an example video processing apparatus.

2400‧‧‧method

2402, 2404, 2406, 2408‧‧‧steps

Claims (21)

1. A video processing method, comprising: constructing an extended Merge mode (EMM) candidate list; determining, based on a first set of bits in a bitstream representation of a current block, motion information inherited by the current block from the list; determining, based on a second set of bits in the bitstream representation, signaled motion information for the current block; and performing a conversion between the current block and the bitstream representation based on the EMM candidate list and the signaled motion information, wherein the signaled motion information comprises predicted motion information of the current block or motion vector difference (MVD) information of the current block.

2. The method of claim 1, wherein the motion information inherited by the current block comprises at least one of the following motion information of another block: a prediction direction, a reference picture, a motion vector, a local illumination compensation (LIC) flag, and a motion vector difference (MVD) precision.

3. The method of claim 1, wherein the second set of bits is coded using predictive coding.

4. The method of any of claims 1 to 3, wherein constructing the EMM candidate list comprises: inserting motion candidates from spatially neighboring blocks into the EMM candidate list.
5. The method of any of claims 1 to 3, wherein constructing the EMM candidate list comprises: inserting motion candidates from temporally neighboring blocks into the EMM candidate list.

6. The method of any of claims 1 to 3, wherein constructing the EMM candidate list comprises: inserting motion candidates from non-adjacent blocks into the EMM candidate list.

7. The method of any of claims 1 to 3, wherein constructing the EMM candidate list comprises: inserting frame rate up-conversion (FRUC) candidates into the EMM candidate list.

8. The method of claim 7, wherein for the FRUC candidates, the MVD precision is set to 1/4 and the LIC flag is set to false.

9. The method of any of claims 1 to 3, wherein constructing the EMM candidate list comprises: inserting unidirectional candidates into the EMM candidate list.

10. The method of claim 9, wherein the unidirectional candidates are generated from bidirectional candidates.

11. The method of claim 10, wherein the MVD precision and the LIC flag of a unidirectional candidate are copied from the bidirectional candidate.
12. The method of any one of claims 1 to 3, wherein constructing the EMM candidate list includes: inserting an LY-direction candidate into the EMM candidate list, wherein the LY-direction candidate is generated from a scaled motion vector of an LX-direction candidate, wherein X = {0, 1} and Y = 1 - X, and wherein L0 and L1 denote reference picture lists. 13. The method of claim 12, wherein a symmetric reference picture of the LX reference picture is selected as the reference picture in the LY direction. 14. The method of any one of claims 1 to 3, wherein constructing the EMM candidate list includes inserting combined bi-predictive candidates or zero candidates into the EMM candidate list. 15. The method of any one of claims 1 to 3, wherein the prediction direction is not inherited but is included in the signaled motion information, and the method further includes: constructing multiple motion information candidate lists, wherein one of the multiple motion information candidate lists includes multiple candidates from a same prediction direction, and wherein the motion information inherited by the current block is inherited from one of the multiple motion information candidate lists.
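Claim 12 generates the LY-direction candidate by scaling the motion vector of an LX-direction candidate, and claim 13 prefers a reference picture symmetric to the LX reference. A simplified sketch of picture-order-count (POC) distance scaling in the spirit of HEVC/JEM temporal MV scaling; real codecs use clipped fixed-point arithmetic, and this float version is only illustrative:

```python
def scale_mv(mv, cur_poc, src_ref_poc, dst_ref_poc):
    """Scale a motion vector by the ratio of POC distances between the
    current picture and the source (LX) / destination (LY) references."""
    td = cur_poc - src_ref_poc   # distance to the LX reference picture
    tb = cur_poc - dst_ref_poc   # distance to the chosen LY reference picture
    if td == 0:
        return mv
    return (round(mv[0] * tb / td), round(mv[1] * tb / td))

# Symmetric reference choice (as in claim 13): the LY reference mirrors the
# LX reference around the current picture, so the scaled MV is the negation.
mv_ly = scale_mv((8, -4), cur_poc=16, src_ref_poc=12, dst_ref_poc=20)
# mv_ly == (-8, 4)
```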
16. The method of claim 15, wherein the motion information inherited by the current block identifies at least one of a reference picture, a motion vector, a local illumination compensation (LIC) flag, and a motion vector difference (MVD) precision. 17. The method of any one of claims 1 to 3, wherein two indices are used to indicate which candidates are inherited for each reference picture list, for bi-predictive coding of the current block. 18. The method of any one of claims 1 to 3, wherein the motion information inherited by the current block includes a motion vector difference (MVD) precision. 19. The method of any one of claims 1 to 3, wherein the method is selectively applied based on a coding characteristic of the current block, and wherein the coding characteristic includes use of a translational motion model. 20. An apparatus in a video system, comprising a processor and a non-transitory memory having instructions stored thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any one of claims 1 to 3. 21. A computer program product stored on a non-transitory computer-readable medium, the computer program product comprising program code for implementing the method of any one of claims 1 to 3.
TW108123158A 2018-06-29 2019-07-01 Extended merge mode TWI736923B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2018093646 2018-06-29
WOPCT/CN2018/093646 2018-06-29

Publications (2)

Publication Number Publication Date
TW202002650A TW202002650A (en) 2020-01-01
TWI736923B true TWI736923B (en) 2021-08-21

Family

ID=67253944

Family Applications (3)

Application Number Title Priority Date Filing Date
TW108123158A TWI736923B (en) 2018-06-29 2019-07-01 Extended merge mode
TW108123171A TWI731362B (en) 2018-06-29 2019-07-01 Interaction between emm and other tools
TW108123159A TWI722467B (en) 2018-06-29 2019-07-01 Video processing method, video system apparatus, and computer program product

Family Applications After (2)

Application Number Title Priority Date Filing Date
TW108123171A TWI731362B (en) 2018-06-29 2019-07-01 Interaction between emm and other tools
TW108123159A TWI722467B (en) 2018-06-29 2019-07-01 Video processing method, video system apparatus, and computer program product

Country Status (3)

Country Link
CN (3) CN110662041B (en)
TW (3) TWI736923B (en)
WO (3) WO2020003276A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11051025B2 (en) * 2018-07-13 2021-06-29 Tencent America LLC Method and apparatus for video coding
KR20230146108A (en) * 2018-08-28 2023-10-18 에프쥐 이노베이션 컴퍼니 리미티드 Device and method for coding video data
JP7372443B2 (en) 2019-08-10 2023-10-31 北京字節跳動網絡技術有限公司 Signaling within the video bitstream that depends on subpictures
MX2022003765A (en) 2019-10-02 2022-04-20 Beijing Bytedance Network Tech Co Ltd Syntax for subpicture signaling in a video bitstream.
EP4032290A4 (en) 2019-10-18 2022-11-30 Beijing Bytedance Network Technology Co., Ltd. Syntax constraints in parameter set signaling of subpictures
WO2021139806A1 (en) * 2020-01-12 2021-07-15 Beijing Bytedance Network Technology Co., Ltd. Constraints for video coding and decoding
CN115335817A (en) * 2020-03-27 2022-11-11 科乐美数码娱乐株式会社 Video distribution system, video distribution control method, and computer program
US20240205425A1 (en) * 2021-04-09 2024-06-20 Beijing Bytedance Network Technology Co., Ltd. Method, device, and medium for video processing
CN117581539A (en) * 2021-04-10 2024-02-20 抖音视界有限公司 GPM motion refinement

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180098087A1 (en) * 2016-09-30 2018-04-05 Qualcomm Incorporated Frame rate up-conversion coding mode

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013009104A2 (en) * 2011-07-12 2013-01-17 한국전자통신연구원 Inter prediction method and apparatus for same
KR101210892B1 (en) * 2011-08-29 2012-12-11 주식회사 아이벡스피티홀딩스 Method for generating prediction block in amvp mode
US9357214B2 (en) * 2012-12-07 2016-05-31 Qualcomm Incorporated Advanced merge/skip mode and advanced motion vector prediction (AMVP) mode for 3D video
KR101854003B1 (en) * 2013-07-02 2018-06-14 경희대학교 산학협력단 Video including multi layers encoding and decoding method
CN103561263B (en) * 2013-11-06 2016-08-24 北京牡丹电子集团有限责任公司数字电视技术中心 Based on motion vector constraint and the motion prediction compensation method of weighted motion vector
WO2015149698A1 (en) * 2014-04-01 2015-10-08 Mediatek Inc. Method of motion information coding
US10958927B2 (en) * 2015-03-27 2021-03-23 Qualcomm Incorporated Motion information derivation mode determination in video coding
US10812791B2 (en) * 2016-09-16 2020-10-20 Qualcomm Incorporated Offset vector identification of temporal motion vector predictor
US10750190B2 (en) * 2016-10-11 2020-08-18 Lg Electronics Inc. Video decoding method and device in video coding system
CN107396106A (en) * 2017-06-26 2017-11-24 深圳市亿联智能有限公司 A kind of Video Encryption Algorithm based on H.265 coding standard
CN107396102B (en) * 2017-08-30 2019-10-08 中南大学 A kind of inter-frame mode fast selecting method and device based on Merge technological movement vector
EP3468194A1 (en) * 2017-10-05 2019-04-10 Thomson Licensing Decoupled mode inference and prediction

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180098087A1 (en) * 2016-09-30 2018-04-05 Qualcomm Incorporated Frame rate up-conversion coding mode

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Algorithm description of Joint Exploration Test Model 2 (JEM2)", 114th MPEG meeting, 22-26 February 2016, San Diego (Motion Picture Expert Group or ISO/IEC JTC1/SC29/WG11), 4 April 2016, XP030022739 *
"Algorithm description of Joint Exploration Test Model 2 (JEM2)", 114th MPEG meeting, 22-26 February 2016, San Diego (Motion Picture Expert Group or ISO/IEC JTC1/SC29/WG11), no. N16066, 4 April 2016 *
"Algorithm description of Joint Exploration Test Model 2 (JEM2)", 114th MPEG meeting, 22-26 February 2016, San Diego (Motion Picture Expert Group or ISO/IEC JTC1/SC29/WG11), no. N16066, 4 April 2016, XP030022739

Also Published As

Publication number Publication date
CN110662055B (en) 2022-07-05
TW202002650A (en) 2020-01-01
CN110662041B (en) 2022-07-29
CN110662046B (en) 2022-03-25
WO2020003276A1 (en) 2020-01-02
CN110662041A (en) 2020-01-07
WO2020003273A1 (en) 2020-01-02
WO2020003281A1 (en) 2020-01-02
TW202017370A (en) 2020-05-01
TWI731362B (en) 2021-06-21
TW202002651A (en) 2020-01-01
TWI722467B (en) 2021-03-21
CN110662055A (en) 2020-01-07
CN110662046A (en) 2020-01-07

Similar Documents

Publication Publication Date Title
TWI727338B (en) Signaled mv precision
TWI706670B (en) Generalized mvd resolutions
TWI736923B (en) Extended merge mode
TW202025781A (en) Mode dependent adaptive motion vector resolution for affine mode coding
TW201933866A (en) Improved decoder-side motion vector derivation
CN113424538A (en) Selective application of decoder-side refinement tools
TWI709332B (en) Motion prediction based on updated motion vectors
TWI719522B (en) Symmetric bi-prediction mode for video coding
CN113545076A (en) Enabling BIO based on information in picture header
CN113424535A (en) History update based on motion vector prediction table