WO2020047807A1 - Inter prediction method and apparatus, and codec - Google Patents


Info

Publication number
WO2020047807A1
WO2020047807A1 (application PCT/CN2018/104430, CN2018104430W)
Authority
WO
WIPO (PCT)
Prior art keywords
motion information
candidate motion
inter prediction
prediction mode
image block
Prior art date
Application number
PCT/CN2018/104430
Other languages
English (en)
Chinese (zh)
Inventor
徐巍炜 (Xu Weiwei)
赵寅 (Zhao Yin)
杨海涛 (Yang Haitao)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2020047807A1

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 - Incoming video signal characteristics or properties
    • H04N19/137 - Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 - Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability

Definitions

  • the present application relates to the field of video image coding and decoding, and particularly to an inter prediction method, an inter prediction device, and a corresponding encoder and decoder.
  • Video compression techniques are specified in standards such as MPEG-1 Video, MPEG-2 Video, ITU-T H.262/MPEG-2, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), and the ITU-T H.266/Versatile Video Coding (VVC) standard under development.
  • AVC Advanced Video Coding
  • HEVC High Efficiency Video Coding
  • VVC Versatile Video Coding
  • devices can efficiently transmit and receive digital video information.
  • images of a video sequence are divided into image blocks for encoding or decoding.
  • the inter prediction mode may include, but is not limited to, the skip/merge mode and non-skip/merge modes (such as the advanced motion vector prediction (AMVP) mode), and both perform inter prediction using a multi-candidate motion information contention method.
  • a candidate motion information list (referred to as a candidate list) including multiple sets of motion information (also referred to as multiple candidate motion information) is introduced.
  • the encoder can select a suitable set of candidate motion information from the candidate list to predict the motion information (such as a motion vector) of the current coded image block, and then obtain the best reference image block (that is, the prediction block) of the current coded image block.
  • a candidate list including multiple sets of motion information (also called block vectors) is sometimes introduced.
  • the embodiments of the present application provide an inter prediction method, an inter prediction device, and a corresponding video encoder and video decoder, so as to improve the inter prediction efficiency and thus the encoding and decoding performance.
  • an embodiment of the present application provides an inter prediction method, where the method includes:
  • the using the historical candidate list to construct or update the candidate motion information list of the current image block includes: checking, in order from the tail to the head of the historical candidate list, whether the target historical candidate motion information is the same as the existing candidate motion information in the candidate motion information list; and if they are different, adding the target historical candidate motion information to the candidate motion information list (for example, adding the target historical candidate motion information to the tail of the candidate motion information list);
  • the target historical candidate motion information is historical candidate motion information other than the Q historical candidate motion information arranged in order from the tail to the head of the historical candidate list (for example, the Q historical candidate motion information nearest the tail of the historical candidate list are skipped), where Q is a positive integer.
  • alternatively, the Q1 historical candidate motion information nearest the tail of the historical candidate list may not be skipped; instead, historical candidate motion information ranked, in tail-to-head order, after the aforementioned Q1 entries is skipped.
  • during the encoding process, the current image block (referred to as the current block) can be understood as the current encoding image block (coding block); during the decoding process, the current image block (referred to as the current block) can be understood as the current decoding image block.
  • when the candidate motion information in the candidate motion information list has not reached a preset number, the target historical candidate motion information is checked, in order from the tail to the head of the historical candidate list, against the existing candidate motion information in the candidate motion information list to determine whether they are the same;
  • the historical candidate list includes L historical candidate motion information (also referred to as historical candidates), where the historical candidate motion information is motion information of a previously encoded or previously decoded image block (for example, a previously decoded CU), and L is a positive integer greater than 0;
  • Q is a positive integer (for example, 1, 2, 3, 4, or 5) and 0 < Q < L. It should be understood that if the number of historical candidates in the historical candidate list has reached the maximum allowed, L can be understood as the length of the historical candidate list; otherwise, L can be understood as the number of historical candidates currently included in the historical candidate list.
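  • The construction procedure above can be sketched as follows; this is a minimal illustration in which motion information is reduced to motion-vector tuples, and all function and variable names are hypothetical:

```python
# Minimal sketch of building a candidate motion information list from a
# historical candidate list, skipping the Q entries nearest the tail (the
# most recently added ones). Motion information is reduced to motion-vector
# tuples; all names and data shapes are illustrative assumptions.

def build_candidate_list(candidates, history, q, max_len):
    """Append history entries (tail to head, skipping the newest q) to
    `candidates` until it reaches `max_len`, avoiding duplicates."""
    for cand in reversed(history[:len(history) - q]):
        if len(candidates) >= max_len:
            break
        if cand not in candidates:      # duplicate ("pruning") check
            candidates.append(cand)     # new entries go to the list's tail
    return candidates

# head ... tail; (5, 2) and (4, 1) are the two most recent entries
history = [(1, 0), (2, 0), (3, 1), (4, 1), (5, 2)]
merged = build_candidate_list([(3, 1)], history, q=2, max_len=4)
# skips (5, 2) and (4, 1); (3, 1) is a duplicate; adds (2, 0), then (1, 0)
```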
  • the historical candidate list may be slice-level, or the historical candidate list may be image-level, or the historical candidate list may be at the level of several CTUs (CTU lines)
  • the historical candidate list may be CTU-level; the historical candidate list may be accessed or updated in a first-in-first-out manner.
  • the historical candidate motion information is motion information of a previously decoded image block in a slice to which the current image block belongs; for another example, the historical candidate motion information is a previously decoded image in an image to which the current image block belongs.
  • the historical candidate list is generated and updated during the encoding or decoding process of the slice to which the current image block belongs; and for example, the historical candidate list is the encoding or decoding process for the image to which the current image block belongs. Generated and updated; and for example, the historical candidate list is generated and updated during the encoding or decoding process of a CTU group (eg, one or more CTUs) to which the current image block belongs;
  • a CTU group eg, one or more CTUs
  • the candidate motion information list is different from the historical candidate list: the candidate motion information list is at the image block level (different image blocks each have their own candidate motion information list), whereas the historical candidate list is at the level of the current slice, the current image, or a current group of one or more coding tree units (CTUs). For example, the same continuously updated historical candidate list can be used during the encoding or decoding of multiple image blocks in the same slice.
  • the candidate motion information list includes motion information of one or more spatial domain reference blocks of the current image block (also referred to as spatial domain candidates) and / or one or Motion information of multiple time-domain reference blocks (also referred to as time-domain candidates);
  • the candidate motion information list includes motion vectors of one or more spatial domain reference blocks of the current image block (also referred to as spatial domain candidates) and/or motion vectors of one or more time domain reference blocks of the current image block (also referred to as time domain candidates).
  • the spatial domain reference block herein may include one or more spatial domain neighboring blocks (also referred to as spatial domain neighboring positions) adjacent to the current image block in the image where the current image block is located.
  • a fourth spatial domain neighboring block A0 located on the lower left side of the current image block
  • a first spatial domain neighboring block A1 located on the left side of the current image block
  • a third spatial domain neighboring block B0 located on the upper right side of the current image block
  • the second spatial domain neighboring block B1 located on the upper side of the current image block, or the fifth spatial domain neighboring block B2 located on the upper left side of the current image block. It should be understood that a spatial domain neighboring block here may be a 4×4 pixel block.
  • the time-domain reference block herein may include one or more spatial-domain reference blocks adjacent to a co-located block in the reference image, and/or one or more sub-blocks of the co-located block, where the co-located block is an image block in the reference image with the same size, shape, and coordinates as the current image block, or an image block in the reference image with the same size and shape as the current image block and a specified position offset from it.
  • the reference image here refers to a reconstructed image. Specifically, the reference image here refers to a reference image in one or more reference image lists.
  • the time-domain reference block includes: the lower-right spatial-domain neighboring block H of the co-located block of the current image block, the upper-left middle block C0 of the co-located block, the lower-right middle block C3 of the co-located block, the upper-left block TL of the co-located block, or the lower-right block BR of the co-located block.
  • the execution subject of the method in the embodiment of the present application may be an inter prediction device, for example, a video encoder, a video decoder, or an electronic device with a video codec function; more specifically, it may be an inter prediction unit in a video encoder, or an inter prediction unit (such as a motion compensation unit) in a video decoder.
  • during the process of using the historical candidate list to construct or update the candidate motion information list of the current image block, the latest Q historical candidate motion information entries are skipped. For example, if the inter prediction mode of the currently decoded image block is the skip/merge mode, then when the historical candidate list is used to construct the fusion motion information candidate list, the Q historical candidates arranged in order from the tail to the head are skipped; if the inter prediction mode of the currently decoded image block is the inter MVP mode, then when the historical candidate list is used to construct the motion vector prediction candidate list, the Q historical candidates arranged in order from the tail to the head are skipped. By contrast, using the historical candidate list (assume its length is L) as in the prior art to construct a fusion motion information candidate list (assume its length is K) and a motion vector prediction candidate list (assume its length is J) requires up to L × K and L × J duplicate checks, respectively.
  • In this way, the number of duplicate check operations performed when the historical candidate list is added to the fusion motion information candidate list or the motion vector prediction candidate list is reduced, and the encoding and decoding time is reduced, thereby helping to improve inter prediction efficiency and thus encoding and decoding performance.
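  • As a worked example of the savings described above (the values of L, K, J, and Q below are illustrative assumptions, not numbers from the patent):

```python
# Illustrative worst-case duplicate-comparison counts; all numbers are
# assumptions chosen for the example, not values mandated by the patent.
L, K, J, Q = 16, 6, 2, 4   # history length, merge list, MVP list, skipped entries
baseline = L * K + L * J             # every history entry checked against both lists
reduced = (L - Q) * K + (L - Q) * J  # the Q newest entries are skipped
print(baseline, reduced)             # 128 vs 96 comparisons
```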
  • the target historical candidate motion information is the X-th historical candidate motion information among the historical candidate motion information other than the Q historical candidate motion information arranged in order from the tail to the head of the historical candidate list (for example, after skipping the Q historical candidate motion information arranged in order from the tail to the head of the historical candidate list), where the reference frame index corresponding to the X-th historical candidate motion information is the same as the target reference frame index.
  • inter MVP inter motion vector prediction
  • the method is used to decode a current image block, and the target reference frame index is a reference frame index of the current image block parsed from the code stream.
  • the reference frame index corresponding to some historical candidate MVs is different from the target reference frame index.
  • the reference frame index corresponding to some historical candidate MVs is the same as the target reference frame index. Only the same historical candidate MV involves the repetitive check with the candidate motion information in the candidate motion information list.
  • the embodiment of the present application can thus further reduce the number of duplicate check operations performed when the historical candidate list is added to the fusion motion information candidate list or the motion vector prediction candidate list.
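  • A minimal sketch of this reference-index filtering, with motion information reduced to (motion vector, reference index) pairs; all names and data shapes are assumptions:

```python
# Sketch of the reference-index filter for the MVP list: only history
# entries whose reference index equals the current block's target reference
# index are duplicate-checked; the rest are skipped outright. Motion
# information is reduced to (motion vector, reference index) pairs, and all
# names are illustrative assumptions.

def build_mvp_list(mvp_list, history, q, target_ref_idx, max_len):
    for mv, ref_idx in reversed(history[:len(history) - q]):
        if len(mvp_list) >= max_len:
            break
        if ref_idx != target_ref_idx:
            continue                    # different reference frame: skip entirely
        if mv not in mvp_list:          # duplicate check only when indexes match
            mvp_list.append(mv)
    return mvp_list

history = [((0, 1), 0), ((2, 2), 1), ((3, 0), 0)]   # head ... tail
mvps = build_mvp_list([(0, 1)], history, q=0, target_ref_idx=0, max_len=2)
# ((3, 0), 0) matches the target index and is appended; ((2, 2), 1) is skipped
```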
  • the checking whether the target historical candidate motion information is the same as existing candidate motion information in the candidate motion information list includes:
  • if the inter prediction mode of the image block where the current candidate motion information is located is the first inter prediction mode, the duplicate check against the current candidate motion information is skipped, and it is checked whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode;
  • if the inter prediction mode of the image block where the current candidate motion information is located is not the first inter prediction mode, it is checked whether the target historical candidate motion information is the same as the current candidate motion information. It should be understood that after the duplicate check between the target historical candidate motion information and the current candidate motion information is completed, if the result indicates that they are different, it is further checked whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode.
  • the first inter prediction mode is an inter prediction mode that is not a fusion mode or a skip mode.
  • the first inter prediction mode is an inter motion vector prediction (inter MVP) mode, such as the advanced motion vector prediction mode.
  • inter MVP inter motion vector prediction
  • when the historical candidate list is used to construct or update the candidate motion information list (for example, the fusion motion information candidate list or the motion vector prediction candidate list), reducing the number of duplicate check operations reduces codec time, which helps to improve inter prediction efficiency and thus codec performance.
  • the checking whether the target historical candidate motion information is the same as existing candidate motion information in the candidate motion information list includes:
  • if the inter prediction mode of the image block where the target historical candidate motion information is located is the first inter prediction mode, the target historical candidate motion information is added to the candidate motion information list. It should be understood that the duplicate check between the target historical candidate motion information and each existing candidate motion information in the candidate motion information list (for example, the motion information at position A1 and the motion information at position B0) is skipped, and the target historical candidate motion information is added to the candidate motion information list directly;
  • if the inter prediction mode of the image block where the target historical candidate motion information is located is not the first inter prediction mode, it is checked whether the target historical candidate motion information is the same as existing candidate motion information in the candidate motion information list.
  • the first inter prediction mode is an inter prediction mode that is not a fusion mode or a skip mode.
  • the first inter prediction mode is an inter motion vector prediction (inter MVP) mode, such as the advanced motion vector prediction mode.
  • inter MVP inter motion vector prediction
  • a candidate motion information list such as a fusion motion information candidate list or a motion vector prediction candidate list
  • selectively skipping the duplicate check process for some historical candidates can further reduce the number of duplicate check operations performed when the historical candidate list is added to the fusion motion information candidate list or the motion vector prediction candidate list, reducing encoding and decoding time and thereby helping to improve inter prediction efficiency and thus encoding and decoding performance.
  • the checking whether the target historical candidate motion information is the same as existing candidate motion information in the candidate motion information list includes:
  • if the inter prediction mode of the image block where the target historical candidate motion information is located is the first inter prediction mode, the target historical candidate motion information is added to the candidate motion information list. It should be understood that the duplicate check between the target historical candidate motion information and each existing candidate motion information in the candidate motion information list (for example, the motion information at position A1 and the motion information at position B0) is skipped, and the target historical candidate motion information is added directly;
  • if the inter prediction mode of the image block where the target historical candidate motion information is located is not the first inter prediction mode and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is the first inter prediction mode, it is checked whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode. It should be understood that the duplicate check between the target historical candidate motion information and the current candidate motion information is skipped, and the process proceeds directly to checking the next candidate motion information;
  • if the inter prediction mode of the image block where the target historical candidate motion information is located is not the first inter prediction mode and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is not the first inter prediction mode, it is checked whether the target historical candidate motion information is the same as the current candidate motion information. It should be understood that after the duplicate check between the target historical candidate motion information and the current candidate motion information is completed, if the result indicates that they are different, it is further checked whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode.
  • the first inter prediction mode is an inter prediction mode that is not a fusion mode or a skip mode.
  • the first inter prediction mode is an inter motion vector prediction (inter MVP) mode, such as the advanced motion vector prediction mode.
  • inter MVP inter motion vector prediction
  • when the historical candidate list is used to construct or update the candidate motion information list (such as a fusion motion information candidate list or a motion vector prediction candidate list), selectively skipping the duplicate check process for some historical candidates can further reduce the number of duplicate check operations performed when the historical candidate list is added to the fusion motion information candidate list or the motion vector prediction candidate list, reducing codec time and thereby helping to improve inter prediction efficiency and thus codec performance.
  • the checking whether the target historical candidate motion information is the same as existing candidate motion information in the candidate motion information list includes:
  • if the inter prediction mode of the image block where the target historical candidate motion information is located and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located are both the first inter prediction mode, the duplicate check between the target historical candidate motion information and the current candidate motion information is skipped, and the process proceeds directly to checking whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode;
  • if at least one of the inter prediction mode of the image block where the target historical candidate motion information is located and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is not the first inter prediction mode, it is checked whether the target historical candidate motion information is the same as the current candidate motion information.
  • the first inter prediction mode is an inter prediction mode that is not a fusion mode or a skip mode.
  • the first inter prediction mode is an inter motion vector prediction (inter MVP) mode, such as the advanced motion vector prediction mode.
  • inter MVP inter motion vector prediction
  • when the historical candidate list is used to construct or update the candidate motion information list (for example, a fusion motion information candidate list or a motion vector prediction candidate list), the duplicate check process for some historical candidates can be selectively skipped, further reducing the number of duplicate check operations performed when the historical candidate list is added to the fusion motion information candidate list or the motion vector prediction candidate list; this reduces encoding and decoding time, which helps to improve inter prediction efficiency and thus encoding and decoding performance.
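  • The mode-based shortcuts in the variants above can be sketched together as follows. Treating the inter MVP (AMVP) mode as the "first inter prediction mode" is one reading of the text, and the mode names and all identifiers are illustrative assumptions:

```python
# Sketch combining the mode-based duplicate-check shortcuts described above.
# Treating the inter MVP (AMVP) mode as the "first inter prediction mode" is
# an assumption; mode names and all identifiers are illustrative.

FIRST_MODE = "amvp"   # an inter prediction mode that is neither merge nor skip

def try_append(candidates, modes, hist_mi, hist_mode, max_len):
    """candidates[i] came from a block coded in modes[i]; returns True if
    hist_mi was appended to the candidate list."""
    if len(candidates) >= max_len:
        return False
    if hist_mode == FIRST_MODE:
        # history entry's block used the first mode: add without any check
        candidates.append(hist_mi)
        modes.append(hist_mode)
        return True
    for mi, mode in zip(candidates, modes):
        if mode == FIRST_MODE:
            continue                    # skip comparison with this entry
        if mi == hist_mi:
            return False                # duplicate found: do not add
    candidates.append(hist_mi)
    modes.append(hist_mode)
    return True

cands, modes = [(1, 1)], ["merge"]
try_append(cands, modes, (1, 1), "merge", 4)   # duplicate: rejected
try_append(cands, modes, (1, 1), "amvp", 4)    # AMVP-coded: added unchecked
```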
  • the method is used to encode the current image block, and the performing inter prediction on the current image block based on the candidate motion information list includes:
  • determining the target candidate motion information from the candidate motion information list according to a rate-distortion cost criterion; for example, the target candidate motion information is the candidate with which encoding the current image block yields the lowest rate-distortion cost;
  • the candidate motion information list is a merge motion information candidate list (merge candidate list); accordingly, predicting the motion information of the current image block based on the target candidate motion information means determining the target candidate motion information (for example, the best candidate motion information) as the motion information of the currently encoded image block;
  • the candidate motion information list is a motion vector prediction candidate list (MVP candidate list, such as an AMVP candidate list); accordingly, the target candidate motion information is a motion vector predictor (MVP) of the current image block, and predicting the motion information of the current image block based on the target candidate motion information includes determining a motion vector difference (MVD) of the current image block based on the MVP and a motion vector of the currently encoded image block obtained through motion estimation.
  • the method is used to decode the current image block, and the performing inter prediction on the current image block based on the candidate motion information list includes:
  • determining the target candidate motion information from the candidate motion information list; for example, determining the target candidate motion information from the candidate motion information list according to an index identifier parsed from the code stream, where the index identifier is used to indicate the target candidate motion information in the candidate motion information list. It should be understood that if the length of the candidate motion information list is one, the index identifier does not need to be parsed, and the only candidate motion information is determined as the target candidate motion information;
  • the candidate motion information list is a merge motion information candidate list (merge candidate list); accordingly, predicting the motion information of the current image block based on the target candidate motion information means using the target candidate motion information as the motion information of the current image block.
  • the candidate motion information list is a motion vector prediction candidate list (MVP candidate list, such as an AMVP candidate list); accordingly, the target candidate motion information is a motion vector predictor (MVP), and predicting the motion information of the current image block according to the target candidate motion information includes determining the motion information of the current image block based on the motion vector predictor and the motion vector difference (MVD) of the current image block parsed from the code stream; for example, the sum of the motion vector predictor and the motion vector difference is used as the motion vector of the current image block.
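  • The decoder-side reconstruction mirrors the encoder: the parsed MVD is added back to the MVP. A trivial sketch with assumed tuple-valued motion vectors:

```python
# Decoder-side sketch mirroring the encoder: the MV is reconstructed as
# MVP + MVD, with the MVD parsed from the code stream. Tuple-valued vectors
# are an assumption here.

def reconstruct_mv(mvp, mvd):
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

mv = reconstruct_mv((4, -1), (1, -2))   # MV = MVP + MVD = (5, -3)
```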
  • the method further includes:
  • a reconstructed image of the current image block is obtained. It should be understood that if the predicted image of the current image block is the same as the original image of the current image block, the current image block has no residual image (that is, no residual values), and the reconstructed image of the current image block is obtained from the predicted image (that is, the predicted pixel values) produced by the inter prediction process.
  • the method further includes: updating the historical candidate list using motion information of a current image block.
  • the historical candidate list is continuously updated during the image encoding or decoding process, which helps to improve the inter prediction efficiency and thus the encoding and decoding performance.
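  • The first-in-first-out update of the historical candidate list can be sketched as follows; the duplicate-removal step is a common HMVP design choice and an assumption here, not a requirement stated above:

```python
# Sketch of the first-in-first-out history update: after a block is coded,
# its motion information is appended at the tail; when the list exceeds its
# maximum length, the oldest entry at the head is evicted. Removing an
# identical existing entry first is a common HMVP design choice and an
# assumption here.
from collections import deque

def update_history(history, new_mi, max_len):
    if new_mi in history:
        history.remove(new_mi)    # optional: drop the duplicate first
    history.append(new_mi)        # the newest entry goes to the tail
    while len(history) > max_len:
        history.popleft()         # evict the oldest entry at the head

h = deque([(1, 0), (2, 0)])
update_history(h, (3, 1), max_len=3)   # appended at the tail
update_history(h, (2, 0), max_len=3)   # duplicate moved to the tail
```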
  • the embodiments of the present application continuously update the historical candidate list during image encoding or decoding, and add historical candidate motion information to the candidate motion information list of the current image block (such as the fusion motion information candidate list or the motion vector prediction candidate list), thereby increasing the number of candidate motion information entries (such as merge/skip fusion motion information candidates or inter-mode motion vector prediction candidates) and improving prediction efficiency.
  • the candidate motion information list is a merged motion information candidate list (merge candidate list)
  • the candidate motion information list is a motion vector prediction candidate list (MVP candidate list, such as AMVP candidate list).
  • inter MVP inter motion vector prediction
  • AMVP candidate list motion vector prediction candidate list
  • the embodiments of the present application are not only applicable to the merge/skip mode and/or the advanced motion vector prediction (AMVP) mode, but also to other modes that predict the motion information of the current image block from the motion information of spatial and/or temporal reference blocks, thereby improving codec performance.
  • AMVP advanced motion vector prediction mode
  • an embodiment of the present application provides an inter prediction method, where the method includes:
  • using the historical candidate list to construct or update the candidate motion information list of the current image block includes:
  • M is an integer greater than or equal to 0.
  • during the encoding process, the current image block (referred to as the current block) can be understood as the current encoding image block (coding block); during the decoding process, the current image block (referred to as the current block) can be understood as the current decoding image block.
  • the historical candidate list may be slice-level, or the historical candidate list may be image-level, or the historical candidate list may be at the level of several CTUs (CTU lines)
  • the historical candidate list may be CTU-level; the historical candidate list may be accessed or updated in a first-in-first-out manner.
  • the historical candidate motion information is motion information of a previously decoded image block in a slice to which the current image block belongs; for another example, the historical candidate motion information is a previously decoded image in an image to which the current image block belongs.
  • the historical candidate list is generated and updated during the encoding or decoding process of the slice to which the current image block belongs; and for example, the historical candidate list is the encoding or decoding process for the image to which the current image block belongs. Generated and updated; and for example, the historical candidate list is generated and updated during the encoding or decoding process of a CTU group (eg, one or more CTUs) to which the current image block belongs;
  • a CTU group eg, one or more CTUs
  • the candidate motion information list is different from the historical candidate list: the candidate motion information list is at the image block level (different image blocks each have their own candidate motion information list), whereas the historical candidate list is at the level of the current slice, the current image, or a current group of one or more coding tree units (CTUs). For example, the same continuously updated historical candidate list can be used during the encoding or decoding of multiple image blocks in the same slice.
  • the candidate motion information list includes motion information of one or more spatial domain reference blocks of the current image block (also referred to as spatial domain candidates) and / or one or Motion information of multiple time-domain reference blocks (also referred to as time-domain candidates);
  • the candidate motion information list includes motion vectors of one or more spatial domain reference blocks of the current image block (also referred to as spatial domain candidates) and / or motion vectors of one or more time domain reference blocks of the current image block (also referred to as time domain candidates).
  • the spatial domain reference block herein may include one or more spatial domain blocks adjacent to the current image block in the image where the current image block is located.
  • for example, the second spatial domain neighboring block B1 located on the upper side of the current image block, or the fifth spatial domain neighboring block B2 located on the upper left side of the current image block. It should be understood that a spatial domain neighboring block here may be a 4×4 pixel block.
  • the time-domain reference block herein may include one or more spatial-domain reference blocks adjacent to a co-located block in the reference image, and / or one or more sub-blocks of the co-located block, where the co-located block is an image block in the reference image having the same size, shape, and coordinates as the current image block, or an image block in the reference image having the same size and shape as the current image block and a specified position offset from it.
  • the reference image here refers to a reconstructed image. Specifically, the reference image here refers to a reference image in one or more reference image lists.
  • for example, the time-domain reference block includes: the lower-right spatial-domain neighboring block H of the co-located block of the current image block, the upper-left middle block C0 of the co-located block, the lower-right middle block C3 of the co-located block, the upper-left block TL of the co-located block, or the lower-right block BR of the co-located block.
  • the execution subject of the method in the embodiment of the present application may be an inter prediction device, for example, it may be a video encoder or a video decoder or an electronic device with a video codec function.
  • for example, an inter prediction unit in a video encoder, or an inter prediction unit (such as a motion compensation unit) in a video decoder, may serve as the inter prediction device.
  • the historical candidate list is used to construct or update the fused motion information candidate list according to the inter prediction modes of the image blocks where the historical candidates, the fused motion information candidates, or the motion vector prediction candidates are located.
  • the duplicate-item check for some historical candidates can be selectively skipped, which further reduces the number of duplicate-item check operations when historical candidates are added to the fused motion information candidate list or the motion vector prediction candidate list, reducing codec time. This helps to improve the efficiency of inter prediction and thereby improve codec performance.
  • determining, according to the inter prediction mode of the image block where the current historical candidate motion information HMVP in the historical candidate list is located and / or the inter prediction mode of the image block where an existing candidate in the candidate motion information list is located, whether to perform a repetition check, and determining, based on the repetition check results of the current historical candidate motion information HMVP against M existing candidate motion information in the candidate motion information list, whether to add the current historical candidate motion information HMVP to the candidate motion information list includes:
  • in a case that the inter prediction mode of the image block where the current candidate motion information is located is the first inter prediction mode, the check of whether the current historical candidate motion information is the same as the current candidate motion information is skipped;
  • if the inter prediction mode of the image block where the current candidate motion information is located is not the first inter prediction mode, it is checked whether the current historical candidate motion information is the same as the current candidate motion information.
  • determining, according to the inter prediction mode of the image block where the current historical candidate motion information HMVP in the historical candidate list is located and / or the inter prediction mode of the image block where an existing candidate in the candidate motion information list is located, whether to perform a repetition check, and determining, based on the repetition check results of the current historical candidate motion information HMVP against M existing candidate motion information in the candidate motion information list, whether to add the current historical candidate motion information HMVP to the candidate motion information list includes:
  • in a case that the inter prediction mode of the image block where the current historical candidate motion information is located is not the first inter prediction mode and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is the first inter prediction mode, it is checked whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode. It should be understood that if the number of existing candidate motion information in the candidate motion information list is M1, then M < M1, because at least the repetition check between the current historical candidate motion information HMVP and the current candidate motion information in the candidate motion information list is skipped;
  • in a case that the inter prediction mode of the image block where the current historical candidate motion information is located is not the first inter prediction mode and the inter prediction mode of the image block where the current candidate motion information is located is not the first inter prediction mode, it is checked whether the current historical candidate motion information is the same as the current candidate motion information.
  • determining, according to the inter prediction mode of the image block where the current historical candidate motion information HMVP in the historical candidate list is located and / or the inter prediction mode of the image block where an existing candidate in the candidate motion information list is located, whether to perform a repetition check, and determining, based on the repetition check results of the current historical candidate motion information HMVP against M existing candidate motion information in the candidate motion information list, whether to add the current historical candidate motion information HMVP to the candidate motion information list includes:
  • in a case that both the inter prediction mode of the image block where the current historical candidate motion information is located and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located are the first inter prediction mode, the repetition check between them is skipped. It should be understood that if the number of existing candidate motion information in the candidate motion information list is M1, then M < M1, because at least the repetition check between the current historical candidate motion information HMVP and the current candidate motion information in the candidate motion information list is skipped;
  • in a case that at least one of the inter prediction mode of the image block where the current historical candidate motion information is located and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is not the first inter prediction mode, it is checked whether the current historical candidate motion information is the same as the current candidate motion information.
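The mode-dependent skipping of the repetition check can be sketched as below, using the variant in which the check is skipped only when both source blocks used the first inter prediction mode. The names `Candidate`, `FIRST_MODE`, and `should_add` are hypothetical, and the mode labels are placeholders:

```python
# Sketch of one variant of the mode-dependent repetition check:
# the pairwise check is skipped only when both source blocks used
# the first inter prediction mode. All names are illustrative.
from collections import namedtuple

Candidate = namedtuple("Candidate", ["motion_info", "block_mode"])

FIRST_MODE = "mode_a"  # stands in for "the first inter prediction mode"

def should_add(hmvp, existing):
    """Return True if the historical candidate hmvp may be appended.

    hmvp is rejected only when an actually-checked existing candidate
    carries identical motion information.
    """
    for cand in existing:
        if hmvp.block_mode == FIRST_MODE and cand.block_mode == FIRST_MODE:
            continue  # repetition check skipped for this pair
        if hmvp.motion_info == cand.motion_info:
            return False  # duplicate found by an actual check
    return True
```

Skipping pairs in this way reduces the number of comparisons (M < M1 in the notation above) at the cost of possibly admitting a duplicate whose check was skipped.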
  • the method is used to encode the current image block, and performing inter prediction on the current image block based on the candidate motion information list includes:
  • determining target candidate motion information from the candidate motion information list according to a rate-distortion cost criterion; for example, the target candidate motion information is the candidate that encodes the current image block with the lowest rate-distortion cost;
  • in a case that the candidate motion information list is a merge candidate list of fused motion information, predicting the motion information of the current image block based on the target candidate motion information includes: determining the target candidate motion information (for example, the best candidate motion information) as the motion information of the currently encoded image block;
  • in a case that the candidate motion information list is a motion vector prediction candidate list (MVP candidate list, such as an AMVP candidate list), the target candidate motion information is the motion vector prediction value MVP of the current image block, and predicting the motion information of the current image block based on the target candidate motion information includes: determining the motion vector difference MVD of the current image block based on the MVP and the motion vector of the currently encoded image block obtained through motion estimation.
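The encoder-side steps above (selecting the target candidate by rate-distortion cost, then deriving the MVD in the inter MVP mode) can be illustrated with a toy sketch; `rd_cost` stands in for a real rate-distortion evaluation and both function names are hypothetical:

```python
def select_target_candidate(candidates, rd_cost):
    """Return the index of the candidate with the lowest
    rate-distortion cost; rd_cost is a caller-supplied function
    standing in for a real rate-distortion evaluation."""
    return min(range(len(candidates)), key=lambda i: rd_cost(candidates[i]))

def motion_vector_difference(mv, mvp):
    """In the inter MVP (e.g. AMVP) mode the encoder signals
    MVD = MV - MVP component-wise, where MV comes from motion
    estimation and MVP is the selected predictor."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])
```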
  • the method is used to decode the current image block, and the performing inter prediction on the current image block based on the candidate motion information list includes:
  • determining the target candidate motion information from the candidate motion information list, for example, according to an index identifier parsed from a code stream, where the index identifier is used to indicate the target candidate motion information in the candidate motion information list; it should be understood that if the length of the candidate motion information list is one, the index identifier does not need to be parsed, and the only candidate motion information is determined as the target candidate motion information;
  • in a case that the candidate motion information list is a merge candidate list of fused motion information, predicting the motion information of the current image block based on the target candidate motion information includes: using the target candidate motion information as the motion information of the current image block.
  • in a case that the candidate motion information list is a motion vector prediction candidate list (MVP candidate list, such as an AMVP candidate list), the target candidate motion information is a motion vector prediction value MVP, and predicting the motion information of the current image block according to the target candidate motion information includes: determining the motion information of the current image block based on the motion vector prediction value MVP and the motion vector residual value MVD of the current image block parsed from the code stream; for example, the sum of the motion vector prediction value and the motion vector residual value is used as the motion vector of the current image block.
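The decoder-side derivation above can be sketched as follows, covering the merge case (the candidate is used directly), the inter MVP case (MV = MVP + MVD), and the single-candidate shortcut where no index is parsed. The function name is hypothetical and motion information is reduced to a bare motion vector for illustration:

```python
def decode_motion_vector(candidate_list, index, mvd=None):
    """Decoder-side reconstruction sketch.

    In merge/skip mode (mvd is None) the indexed candidate is used
    directly as the motion vector; in the inter MVP mode the
    candidate is an MVP and MV = MVP + MVD component-wise. When the
    list holds a single candidate no index needs to be parsed.
    """
    if len(candidate_list) == 1:
        index = 0  # the index identifier need not be parsed
    mvp = candidate_list[index]
    if mvd is None:
        return mvp  # merge/skip: candidate is the motion vector
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```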
  • the method further includes:
  • a reconstructed image of the current image block is obtained. It should be understood that if the predicted image of the current image block is the same as the original image of the current image block, there is no residual image (that is, no residual values) of the current image block, and the reconstructed image of the current image block is obtained based on the predicted image (that is, the predicted pixel values) of the current image block obtained through the inter prediction process.
  • in a case that the method is used to encode the current image block and the inter prediction mode of the current image block is the merge mode or the skip mode, the method further includes:
  • in a case that the inter prediction mode of the current image block is the inter MVP mode, the method further includes:
  • An index number, a reference frame index, and the motion vector difference MVD corresponding to the target candidate motion information (that is, the target candidate motion vector prediction value MVP) are coded into a code stream.
  • the method further includes: updating the historical candidate list using motion information of a current image block.
  • the historical candidate list is continuously updated during the image encoding or decoding process, which helps to improve the inter prediction efficiency and thus the encoding and decoding performance.
  • the embodiments of the present application continuously update the historical candidate list during the image encoding or decoding process, and increase the number of candidate motion information (such as merge / skip fused motion information candidates or inter-mode motion vector prediction candidates) by adding historical candidate motion information (history candidates) to the candidate motion information list of the current image block (such as the fused motion information candidate list or the motion vector prediction candidate list), thereby improving prediction efficiency.
  • the candidate motion information list is a merged motion information candidate list
  • the candidate motion information list is a motion vector prediction candidate list (MVP candidate list, such as AMVP candidate list).
  • the embodiments of the present application are applicable not only to the merge / skip mode and / or the advanced motion vector prediction (AMVP) mode, but also to other modes that predict the motion information of the current image block using the motion information of spatial-domain reference blocks and / or time-domain reference blocks, thereby improving codec performance.
  • an embodiment of the present application provides an inter prediction apparatus, including several functional units for implementing any one of the methods in the first aspect.
  • the inter prediction device may include:
  • the candidate motion information list determination unit is configured to use the historical candidate list to construct or update the candidate motion information list of the current image block, wherein the using the historical candidate list to construct or update the candidate motion information list of the current image block includes:
  • in order from the tail to the head of the historical candidate list, checking whether the target historical candidate motion information is the same as existing candidate motion information in the candidate motion information list; if they are different, adding the target historical candidate motion information to the candidate motion information list;
  • the target historical candidate motion information is: historical candidate motion information other than the Q historical candidate motion information arranged in order from the tail to the head in the historical candidate list, where Q is a positive integer;
  • An inter prediction processing unit is configured to perform inter prediction on the current image block based on the candidate motion information list.
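The tail-to-head traversal that skips the Q historical candidates counted from the tail can be sketched as follows. This is an illustrative reading of the description above; the list-size cap and the exact-match pruning are assumptions, and the function name is hypothetical:

```python
def add_history_candidates(history, candidate_list, q, max_list_size):
    """Traverse the history list from tail to head, skipping the Q
    candidates nearest the tail, and append each remaining candidate
    that is not already present, until the candidate list reaches
    its maximum size."""
    # Drop the q tail entries, then walk the rest from tail to head.
    for hmvp in reversed(history[:len(history) - q]):
        if len(candidate_list) >= max_list_size:
            break  # candidate list is full
        if hmvp not in candidate_list:
            candidate_list.append(hmvp)  # passed the repetition check
    return candidate_list
```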
  • an embodiment of the present application provides an inter prediction apparatus, including several functional units for implementing any one of the methods in the second aspect.
  • the inter prediction device may include:
  • a candidate motion information list determining unit, configured to use the historical candidate list to construct or update the candidate motion information list of the current image block, which includes: determining, according to the inter prediction mode of the image block where the current historical candidate motion information HMVP in the historical candidate list is located and / or the inter prediction mode of the image block where an existing candidate in the candidate motion information list is located, whether to perform a repetition check, and determining, based on the repetition check results of the current historical candidate motion information HMVP against M existing candidate motion information in the candidate motion information list, whether to add the current historical candidate motion information HMVP to the candidate motion information list, where M is an integer greater than or equal to 0;
  • An inter prediction processing unit is configured to perform inter prediction on the current image block based on the candidate motion information list.
  • an embodiment of the present application provides a video encoder for encoding an image block, including: the inter prediction device according to the third aspect or the fourth aspect, where the inter prediction device is configured to predict motion information of a currently encoded image block based on target candidate motion information, and determine a predicted pixel value of the currently encoded image block based on the motion information of the currently encoded image block;
  • An entropy encoding module configured to encode an index identifier of the target candidate motion information into a code stream, where the index identifier indicates the target candidate motion information for the currently encoded image block;
  • a reconstruction module configured to reconstruct the image block based on the predicted pixel value.
  • an embodiment of the present application provides a video decoder, which is used to decode an image block from a code stream, and includes:
  • An entropy decoding module configured to decode an index identifier from a code stream, where the index identifier is used to indicate target candidate motion information of a currently decoded image block;
  • the inter prediction device is configured to predict motion information of a currently decoded image block based on the target candidate motion information indicated by the index identifier, and determine a predicted pixel value of the currently decoded image block based on the motion information of the currently decoded image block;
  • a reconstruction module configured to reconstruct the image block based on the predicted pixel value.
  • an embodiment of the present application provides a video decoding device, where the device includes:
  • a memory configured to store video data in the form of a bitstream, where the video data includes one or more image blocks;
  • a video decoder, configured to construct or update a candidate motion information list of a current image block using a historical candidate list, and perform inter prediction on the current image block based on the candidate motion information list; where constructing or updating the candidate motion information list of the current image block using the historical candidate list includes: in order from the tail to the head of the historical candidate list, checking whether the target historical candidate motion information is the same as existing candidate motion information in the candidate motion information list, and if they are different, adding the target historical candidate motion information to the candidate motion information list; the target historical candidate motion information is historical candidate motion information other than the Q historical candidate motion information arranged in order from the tail to the head in the historical candidate list, where Q is a positive integer.
  • an embodiment of the present application provides a video decoding device, where the device includes:
  • a memory configured to store video data in the form of a bitstream, where the video data includes one or more image blocks;
  • a video decoder, configured to: determine, according to the inter prediction mode of the image block where the current historical candidate motion information HMVP in the historical candidate list is located and / or the inter prediction mode of the image block where an existing candidate in the candidate motion information list is located, whether to perform a repetition check; and determine, based on the repetition check results of the current historical candidate motion information HMVP against M existing candidate motion information in the candidate motion information list, whether to add the current historical candidate motion information HMVP to the candidate motion information list, where M is an integer greater than or equal to 0.
  • a ninth aspect of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores instructions that, when run on a computer, cause the computer to execute the method of the first aspect or the second aspect.
  • a tenth aspect of the present application provides a computer program product containing instructions, which when executed on a computer, causes the computer to perform the method described in the first aspect or the second aspect above.
  • an eleventh aspect of the present application provides an electronic device, including the video encoder according to the fifth aspect, or the video decoder according to the sixth aspect, or the inter prediction apparatus according to the third or fourth aspect.
  • a twelfth aspect of the present application provides an encoding device, including a non-volatile memory and a processor coupled to each other, where the processor calls program code stored in the memory to execute some or all of the steps of any method of the first aspect or the second aspect.
  • a thirteenth aspect of the present application provides a decoding device, including a non-volatile memory and a processor coupled to each other, where the processor calls program code stored in the memory to execute some or all of the steps of any method of the first aspect or the second aspect.
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system according to an embodiment of the present application
  • FIG. 2 is a schematic block diagram of a video encoder according to an embodiment of the present application.
  • FIG. 3 is a schematic block diagram of a video decoder according to an embodiment of the present application.
  • 4A is an exemplary flowchart of an encoding method performed by a video encoder in a merge mode according to an embodiment of the present application
  • 4B is an exemplary flowchart of an encoding method performed by a video encoder in an advanced motion vector prediction mode according to an embodiment of the present application
  • FIG. 5 is an exemplary flowchart of motion compensation performed by a video decoder in an embodiment of the present application
  • FIG. 6 is an exemplary schematic diagram of a current image block and a spatial domain reference block and a time domain reference block associated therewith in an embodiment of the present application;
  • FIG. 7 is a flowchart of an image encoding method according to an embodiment of the present application.
  • FIG. 8 is a flowchart of an image decoding method according to an embodiment of the present application.
  • FIG. 9 is an exemplary schematic diagram of a historical candidate list before and after an update in an embodiment of the present application.
  • FIG. 10 is another exemplary schematic diagram of a historical candidate list before and after an update in an embodiment of the present application.
  • 11A is an exemplary schematic diagram of adding historical candidate motion information to a candidate list of fused motion information according to an embodiment of the present application
  • FIG. 11B is an exemplary schematic diagram of three historical candidates arranged in order from the tail to the head in the skip history candidate list in the embodiment of the present application;
  • FIG. 11C is a schematic flowchart of another exemplary process of adding historical candidate motion information to a candidate list of fused motion information according to an embodiment of the present application.
  • FIG. 12 is a schematic block diagram of an inter prediction apparatus according to an embodiment of the present application.
  • FIG. 13 is a schematic block diagram of another inter prediction apparatus according to an embodiment of the present application.
  • FIG. 14 is a schematic block diagram of another encoding device or decoding device according to an embodiment of the present application.
  • Intra-prediction coding A coding method that uses surrounding pixel values to predict the current pixel value and then encodes the prediction error.
  • Coded picture A coded representation of a picture that contains all the coding tree units of the picture.
  • Motion vector A two-dimensional vector used for inter prediction, which provides an offset from coordinates in a decoded picture to coordinates in a reference picture.
  • Prediction block A rectangular M×N sample block on which the same prediction is applied.
  • Prediction process Use the predicted value to provide an estimate of the data element (eg, sample value or motion vector) that is currently being decoded.
  • Predicted value A specified value or a combination of previously decoded data elements (for example, sample values or motion vectors) used during subsequent data element decoding.
  • Reference frame A picture or frame used as a short-term reference picture or a long-term reference picture.
  • the reference frame contains samples that can be used in the decoding order for inter prediction in the decoding process of subsequent pictures.
  • Inter prediction A predicted image of the current block is generated from pixels in a reference frame of the current block, with the motion vector indicating the position in the reference frame of the pixels used for prediction.
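The relationship between motion vector, reference frame, and predicted block in this definition can be illustrated with a toy integer-pel sketch; real codecs additionally perform sub-pixel interpolation and reference-frame padding, which are omitted here, and the function name is hypothetical:

```python
def predict_block(ref_frame, x, y, mv, w, h):
    """Copy a w-by-h block from the reference frame at the position
    of the current block offset by the motion vector (integer-pel
    only; no interpolation or border padding in this sketch)."""
    rx, ry = x + mv[0], y + mv[1]  # motion vector offsets the position
    return [row[rx:rx + w] for row in ref_frame[ry:ry + h]]
```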
  • Bidirectional prediction (B) slice A slice that can be decoded using intra prediction or inter prediction to predict the sample value of each block with up to two motion vectors and reference indexes.
  • CTU coding tree unit (coding tree unit).
  • An image is composed of multiple CTUs.
  • a CTU usually corresponds to a square image area, which contains the luma pixels and chroma pixels in this image area (or it may contain only luma pixels, or only chroma pixels); the CTU also contains syntax elements, which indicate how to divide the CTU into at least one coding unit (coding unit, CU), and a method of decoding each coding unit to obtain a reconstructed image.
  • a coding unit corresponds to an A×B rectangular area in the image, including A×B luma pixels or / and the corresponding chroma pixels, where A is the width of the rectangle and B is the height; A and B can be the same or different.
  • the values of A and B are usually integer powers of 2, such as 128, 64, 32, 16, 8, and 4.
  • a coding unit includes a predicted image and a residual image, and the predicted image and the residual image are added to obtain a reconstructed image of the coding unit.
  • the predicted image is generated by intra prediction or inter prediction, and the residual image is generated by inverse quantization and inverse transform processing of the transform coefficients.
  • VTM New codec reference software developed by the JVET organization.
  • the transmitted motion information includes: inter prediction direction (forward, backward, or bidirectional), reference frame index, motion vector prediction value index, and motion vector residual value.
  • the motion vector residual value is the difference between the MVP and the actual motion vector (motion vector difference, MVD), which is transmitted to the decoder.
  • the motion vector prediction may include multiple prediction values.
  • the motion vector prediction candidate list (MVP candidate list) is constructed in the same way at the encoding end and the decoding end, and the motion vector prediction index (MVP index) is passed to the decoder.
  • the merged motion information candidate list is constructed in the same way at the encoder and decoder, and the merge index is passed to the decoder.
  • the decoder can select the corresponding fusion candidate from the merge candidate list according to the merge index, and use the motion information of the fusion candidate, or scaled motion information of the fusion candidate, as the motion information of the current block.
  • Motion information of the current block is usually obtained from its spatial-domain neighboring blocks or the time-domain blocks in the reference frame.
  • the candidate motion information obtained from the motion information of image blocks spatially adjacent to the current block is called a spatial candidate, and the motion information of the corresponding-position image block in the reference image obtained for the current image block is called a temporal candidate.
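The sourcing of spatial, temporal, and (in the embodiments of this application) historical candidates into one candidate motion information list can be sketched as follows; the ordering, the exact-match pruning, and the helper name are illustrative assumptions:

```python
def build_candidate_list(spatial, temporal, history, max_size):
    """Assemble a candidate motion information list from spatial
    candidates, then temporal candidates, then history candidates,
    pruning exact duplicates and capping the list length."""
    candidates = []
    for source in (spatial, temporal, history):
        for mi in source:
            if len(candidates) >= max_size:
                return candidates  # list is full
            if mi not in candidates:
                candidates.append(mi)  # new, non-duplicate candidate
    return candidates
```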
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system 10 according to an embodiment of the present application.
  • the system 10 includes a source device 12 that generates encoded video data to be decoded by the destination device 14 at a later time.
  • Source device 12 and destination device 14 may include any of a wide range of devices, including desktop computers, notebook computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, so-called "smart" pads, TVs, cameras, displays, digital media players, video game consoles, video streaming devices, or the like.
  • the source device 12 and the destination device 14 may be equipped for wireless communication.
  • the destination device 14 may receive the encoded video data to be decoded via the link 16.
  • the link 16 may include any type of media or device capable of moving the encoded video data from the source device 12 to the destination device 14.
  • the link 16 may include a communication medium that enables the source device 12 to directly transmit the encoded video data to the destination device 14 in real time.
  • the encoded video data may be modulated according to a communication standard (eg, a wireless communication protocol) and transmitted to the destination device 14.
• Communication media may include any wireless or wired communication media, such as the radio frequency spectrum or one or more physical transmission lines. Communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network such as the Internet.
  • the communication medium may include a router, a switch, a base station, or any other equipment that may be used to facilitate communication from the source device 12 to the destination device 14.
  • the encoded data may be output from the output interface 22 to the storage device 24.
  • the encoded data can be accessed from the storage device 24 by an input interface.
• the storage device 24 may include any of a variety of distributed or locally-accessed data storage media, such as a hard drive, Blu-ray disc, DVD, CD-ROM, flash memory, volatile or non-volatile memory, or any other suitable digital storage medium for storing encoded video data.
  • the storage device 24 may correspond to a file server or another intermediate storage device that may hold the encoded video produced by the source device 12.
  • the destination device 14 may access the stored video data from the storage device 24 via streaming or download.
  • the file server may be any type of server capable of storing encoded video data and transmitting this encoded video data to the destination device 14.
  • the file server includes a web server, a file transfer protocol server, a network attached storage device, or a local disk drive.
  • the destination device 14 may access the encoded video data via any standard data connection including an Internet connection.
  • This data connection may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., a cable modem, etc.), or a combination of both, suitable for accessing the encoded video data stored on the file server.
  • the transmission of the encoded video data from the storage device 24 may be a streaming transmission, a download transmission, or a combination of the two.
  • the techniques of this application are not necessarily limited to wireless applications or settings.
• the technology can be applied to video decoding to support any of a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding of digital video for storage on a data storage medium, decoding of digital video stored on a data storage medium, or other applications.
  • the system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and / or video telephony.
  • the source device 12 includes a video source 18, a video encoder 20, and an output interface 22.
  • the output interface 22 may include a modulator / demodulator (modem) and / or a transmitter.
  • the video source 18 may include a source such as a video capture device (e.g., a video camera), a video archive containing previously captured video, a video feed interface to receive video from a video content provider , And / or a computer graphics system for generating computer graphics data as a source video, or a combination of these sources.
  • the source device 12 and the destination device 14 may form a so-called camera phone or video phone.
  • the techniques described in this application may be exemplarily applicable to video decoding, and may be applicable to wireless and / or wired applications.
  • Captured, pre-captured, or computer-generated video may be encoded by video encoder 20.
  • the encoded video data may be transmitted directly to the destination device 14 via the output interface 22 of the source device 12.
  • the encoded video data may also be (or alternatively) stored on the storage device 24 for later access by the destination device 14 or other device for decoding and / or playback.
  • the destination device 14 includes an input interface 28, a video decoder 30, and a display device 32.
  • the input interface 28 may include a receiver and / or a modem.
  • the input interface 28 of the destination device 14 receives the encoded video data via the link 16.
• the encoded video data communicated or provided on the storage device 24 via the link 16 may include various syntax elements generated by the video encoder 20 for use by the video decoder 30 in decoding the video data. These syntax elements may be included with the encoded video data transmitted on a communication medium, stored on a storage medium, or stored on a file server.
  • the display device 32 may be integrated with or external to the destination device 14.
  • the destination device 14 may include an integrated display device and also be configured to interface with an external display device.
  • the destination device 14 may be a display device.
  • the display device 32 displays the decoded video data to a user, and may include any of a variety of display devices, such as a liquid crystal display, a plasma display, an organic light emitting diode display, or another type of display device.
  • Video encoder 20 and video decoder 30 may operate according to, for example, the next-generation video codec compression standard (H.266) currently under development and may conform to the H.266 test model (JEM).
• the video encoder 20 and the video decoder 30 may operate based on, for example, the ITU-T H.265 standard, also referred to as the high-efficiency video decoding standard, or other proprietary or industry standards such as the ITU-T H.264 standard, or extensions of these standards.
  • the ITU-TH.264 standard is alternatively referred to as MPEG-4 Part 10, also known as advanced video coding (AVC).
  • the techniques of this application are not limited to any particular decoding standard.
  • Other possible implementations of the video compression standard include MPEG-2 and ITU-TH.263.
  • video encoder 20 and video decoder 30 may each be integrated with an audio encoder and decoder, and may include an appropriate multiplexer-demultiplexer (MUX-DEMUX) unit or other hardware and software to handle encoding of both audio and video in a common or separate data stream.
  • the MUX-DEMUX unit may conform to the ITUH.223 multiplexer protocol or other protocols such as the User Datagram Protocol (UDP).
  • Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable encoder circuits, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), Field Programmable Gate Array (FPGA), discrete logic, software, hardware, firmware, or any combination thereof.
  • the device may store the software's instructions in a suitable non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this application.
  • Each of the video encoder 20 and the video decoder 30 may be included in one or more encoders or decoders, and any of them may be integrated as a combined encoder / decoder (CODEC) in a corresponding device. part.
  • the present application may, by way of example, relate to video encoder 20 "signaling" certain information to another device, such as video decoder 30. It should be understood, however, that video encoder 20 may signal information by associating specific syntax elements with various encoded portions of video data. That is, video encoder 20 may "signal" the data by storing specific syntax elements to the header information of various encoded portions of the video data. In some applications, these syntax elements may be encoded and stored (eg, stored to storage system 34 or file server 36) before being received and decoded by video decoder 30.
• the term “signaling” may exemplarily refer to the transmission of syntax or other data used to decode compressed video data, regardless of whether this transmission occurs in real time, near real time, or over a span of time, such as when a syntax element is stored to a medium at encoding time and can then be retrieved by the decoding device at any time after being stored on this medium.
  • the latest standard document of H.265 can be obtained from http://www.itu.int/rec/T-REC-H.265.
  • the latest version of the standard document is H.265 (12/16).
• the standard document is incorporated herein by reference in its entirety.
  • HM assumes that video decoding devices have several additional capabilities over existing algorithms of ITU-TH.264 / AVC. For example, H.264 provides 9 intra-prediction encoding modes, while HM provides up to 35 intra-prediction encoding modes.
• The H.266 test model is an evolution model of the video decoding device.
  • the algorithm description of H.266 can be obtained from http://phenix.int-evry.fr/jvet. The latest algorithm description is included in JVET-F1001-v2.
  • the algorithm description document is incorporated herein by reference in its entirety.
  • the reference software for the JEM test model can be obtained from https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/, which is also incorporated herein by reference in its entirety.
  • HM can divide a video frame or image into a sequence of tree blocks or maximum coding units (LCUs) containing both luminance and chrominance samples.
  • LCUs are also known as CTUs.
  • the tree block has a similar purpose as the macro block of the H.264 standard.
  • a slice contains several consecutive tree blocks in decoding order.
  • a video frame or image can be split into one or more slices.
  • Each tree block can be split into coding units according to a quadtree. For example, a tree block that is a root node of a quad tree may be split into four child nodes, and each child node may be a parent node and split into another four child nodes.
  • the final indivisible child nodes that are leaf nodes of the quadtree include decoding nodes, such as decoded video blocks.
  • the syntax data associated with the decoded codestream can define the maximum number of times a tree block can be split, and can also define the minimum size of a decoding node.
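The recursive quadtree splitting of a tree block into leaf decoding nodes described above can be sketched as follows. The `should_split` callback stands in for the encoder's actual mode decision and is purely illustrative; the function and parameter names are hypothetical.

```python
def quadtree_leaves(x, y, size, min_size, should_split):
    """Recursively split a square tree block into four equal child nodes;
    return the leaf nodes (decoding nodes) as (x, y, size) rectangles."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]            # indivisible: this is a decoding node
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            # each child may in turn become a parent node and split again
            leaves += quadtree_leaves(x + dx, y + dy, half, min_size, should_split)
    return leaves
```

The `min_size` argument plays the role of the syntax-defined minimum decoding-node size, and limiting recursion depth would correspond to the maximum number of times a tree block can be split.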
  • the coding unit includes a decoding node and a prediction unit (PU) and a transformation unit (TU) associated with the decoding node.
  • the size of the CU corresponds to the size of the decoding node and the shape must be square.
• the size of the CU can range from 8×8 pixels up to a maximum 64×64 pixels or larger tree block size.
  • Each CU may contain one or more PUs and one or more TUs.
  • the syntax data associated with a CU may describe a case where a CU is partitioned into one or more PUs.
  • the division mode may be different between cases where the CU is skipped or coded by direct mode, intra prediction mode, or inter prediction mode.
  • the PU can be divided into non-square shapes.
  • the syntax data associated with a CU may also describe a case where a CU is partitioned into one or more TUs according to a quadtree.
  • the shape of the TU can be square or non-square.
  • the HEVC standard allows transformation based on the TU, which can be different for different CUs.
  • the TU is usually sized based on the size of the PUs within a given CU defined for the partitioned LCU, but this may not always be the case.
  • the size of the TU is usually the same as or smaller than the PU.
  • a quad-tree structure called “residual quad tree” (RQT) can be used to subdivide the residual samples corresponding to the CU into smaller units.
  • the leaf node of RQT may be called TU.
  • the pixel difference values associated with the TU may be transformed to produce a transformation coefficient, which may be quantized.
  • the PU contains data related to the prediction process.
  • the PU may include data describing the intra-prediction mode of the PU.
  • the PU may include data defining a motion vector of the PU.
  • the data defining the motion vector of the PU may describe the horizontal component of the motion vector, the vertical component of the motion vector, the resolution of the motion vector (e.g., quarter-pixel accuracy or eighth-pixel accuracy), motion vector The reference image pointed to, and / or the reference image list of the motion vector (eg, list 0, list 1 or list C).
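The motion-vector data fields listed above can be gathered into a simple structure, sketched below. The class and field names are hypothetical; quarter-pel units are used to illustrate the quarter-pixel accuracy mentioned in the text.

```python
from dataclasses import dataclass

@dataclass
class PUMotionInfo:
    """Illustrative container for the PU motion data described above."""
    mv_x_qpel: int   # horizontal motion vector component, in quarter-pel units
    mv_y_qpel: int   # vertical motion vector component, in quarter-pel units
    ref_list: str    # reference image list of the motion vector: 'L0', 'L1', or 'LC'
    ref_idx: int     # index of the reference image pointed to, within that list

    def mv_in_pixels(self):
        # quarter-pixel accuracy: four units per full pixel
        return (self.mv_x_qpel / 4, self.mv_y_qpel / 4)
```

Eighth-pixel accuracy would simply use eight units per pixel instead of four.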
  • TU uses transform and quantization processes.
  • a given CU with one or more PUs may also contain one or more TUs.
  • video encoder 20 may calculate a residual value corresponding to the PU. Residual values include pixel differences, which can be transformed into transform coefficients, quantized, and scanned using TU to generate serialized transform coefficients for entropy decoding.
  • This application generally uses the term "video block" to refer to the decoding node of a CU.
  • the term “video block” may also be used in this application to refer to a tree block including a decoding node and a PU and a TU, such as an LCU or a CU.
  • a video sequence usually contains a series of video frames or images.
• a group of pictures (GOP) exemplarily includes a series of one or more video pictures.
  • the GOP may include syntax data in the header information of the GOP, the header information of one or more of the pictures, or elsewhere, and the syntax data describes the number of pictures included in the GOP.
  • Each slice of the image may contain slice syntax data describing the coding mode of the corresponding image.
  • Video encoder 20 typically operates on video blocks within individual video slices to encode video data.
  • a video block may correspond to a decoding node within a CU.
  • Video blocks may have fixed or varying sizes, and may differ in size according to a specified decoding standard.
• HM supports prediction with various PU sizes. Assuming that the size of a specific CU is 2N×2N, HM supports intra prediction with PU sizes of 2N×2N or N×N, and inter prediction with symmetric PU sizes of 2N×2N, 2N×N, N×2N, or N×N. HM also supports asymmetric partitioning for inter prediction with PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N. In asymmetric partitioning, one direction of the CU is not partitioned, while the other direction is partitioned into 25% and 75%.
• 2N×nU refers to a horizontally partitioned 2N×2N CU, with a 2N×0.5N PU at the top and a 2N×1.5N PU at the bottom.
• “N×N” and “N by N” are used interchangeably to refer to the pixel size of a video block in terms of vertical and horizontal dimensions, for example, 16×16 pixels or 16 by 16 pixels.
• an N×N block has N pixels in the vertical direction and N pixels in the horizontal direction, where N represents a non-negative integer value.
  • Pixels in a block can be arranged in rows and columns.
  • the block does not necessarily need to have the same number of pixels in the horizontal direction as in the vertical direction.
• a block may include N×M pixels, where M is not necessarily equal to N.
  • the video encoder 20 may calculate the residual data of the TU of the CU.
• a PU may include pixel data in a spatial domain (also referred to as a pixel domain), and a TU may include coefficients in the transform domain after a transform (e.g., a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform) is applied to the residual video data.
  • the residual data may correspond to a pixel difference between a pixel of an uncoded image and a prediction value corresponding to a PU.
  • Video encoder 20 may form a TU containing residual data of the CU, and then transform the TU to generate transform coefficients for the CU.
  • video encoder 20 may perform quantization of the transform coefficients.
  • Quantization exemplarily refers to the process of quantizing coefficients to possibly reduce the amount of data used to represent the coefficients to provide further compression.
  • the quantization process may reduce the bit depth associated with some or all of the coefficients. For example, an n-bit value may be rounded down to an m-bit value during quantization, where n is greater than m.
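The rounding-down of an n-bit value to an m-bit value during quantization can be illustrated with a simple right shift. This sketches only the bit-depth reduction described above, not a full quantizer; the function name is hypothetical.

```python
def reduce_bit_depth(value, n_bits, m_bits):
    """Round an n-bit non-negative value down to an m-bit value (n > m),
    illustrating the bit-depth reduction step of quantization."""
    assert 0 <= value < (1 << n_bits) and n_bits > m_bits
    # Right-shifting by (n - m) bits discards the low-order bits,
    # i.e., rounds the value down to m-bit precision.
    return value >> (n_bits - m_bits)
```

For example, reducing the 8-bit value 255 to 4 bits yields 15, the largest 4-bit value, so the amount of data needed to represent the coefficient shrinks at the cost of precision.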
  • the JEM model further improves the coding structure of video images.
  • a block coding structure called "Quad Tree Combined with Binary Tree” (QTBT) is introduced.
  • a CU can be square or rectangular.
  • a CTU first performs a quadtree partition, and the leaf nodes of the quadtree further perform a binary tree partition.
• there are two partitioning modes in binary tree partitioning: symmetrical horizontal partitioning and symmetrical vertical partitioning.
  • the leaf nodes of a binary tree are called CUs.
  • JEM's CUs cannot be further divided during the prediction and transformation process, which means that JEM's CU, PU, and TU have the same block size.
• the maximum size of the CTU is 256×256 luminance pixels.
  • video encoder 20 may utilize a predefined scan order to scan the quantized transform coefficients to generate a serialized vector that can be entropy encoded.
• video encoder 20 may perform adaptive scanning. After scanning the quantized transform coefficients to form a one-dimensional vector, video encoder 20 may entropy decode the one-dimensional vector using context-adaptive variable-length decoding (CAVLC), context-adaptive binary arithmetic decoding (CABAC), syntax-based context-adaptive binary arithmetic decoding (SBAC), probability interval partitioning entropy (PIPE) decoding, or another entropy decoding method.
  • Video encoder 20 may also entropy encode syntax elements associated with the encoded video data for use by video decoder 30 to decode the video data.
  • video encoder 20 may assign a context within a context model to a symbol to be transmitted. Context can be related to whether adjacent values of a symbol are non-zero.
  • video encoder 20 may select a variable length code of a symbol to be transmitted. Codewords in Variable Length Decoding (VLC) may be constructed such that relatively short codes correspond to more likely symbols and longer codes correspond to less likely symbols. In this way, the use of VLC can achieve the goal of saving code rates relative to using equal length codewords for each symbol to be transmitted.
  • the probability in CABAC can be determined based on the context assigned to the symbol.
  • the video encoder may perform inter prediction to reduce temporal redundancy between images.
  • a CU may have one or more prediction units PU according to the provisions of different video compression codec standards.
  • multiple PUs may belong to a CU, or PUs and CUs are the same size.
• in this application, the partitioning mode of the CU is either no further partitioning or partitioning into a single PU, and the term PU is used uniformly in the description.
• the video encoder may signal motion information for the PU to the video decoder.
  • the motion information of the PU may include: a reference image index, a motion vector, and a prediction direction identifier.
  • a motion vector may indicate a displacement between an image block (also called a video block, a pixel block, a pixel set, etc.) of a PU and a reference block of the PU.
  • the reference block of the PU may be a part of the reference picture similar to the image block of the PU.
  • the reference block may be located in a reference image indicated by a reference image index and a prediction direction identifier.
• the video encoder may generate a candidate motion information list (hereinafter referred to as a candidate list) for each of the PUs according to the merge prediction mode or the advanced motion vector prediction mode process.
  • the motion information may include motion vector MV and reference image indication information.
  • the motion information may also include only one or both of them.
  • the motion information may only include a motion vector.
  • the motion information represented by some candidates in the candidate list may be based on the motion information of other PUs.
• such a candidate represents motion information of an image block at one of the spatial candidate positions or one of the temporal candidate positions; this application may refer to such candidates as “original” candidate motion information.
• For example, for a merge mode, also referred to herein as a merge prediction mode, there may be five original spatial candidate positions and one original temporal candidate position.
• the video encoder may also generate additional candidate motion information by some means, such as inserting zero motion vectors as candidate motion information. This additional candidate motion information is not considered original candidate motion information and may be referred to in this application as artificially generated candidate motion information.
  • the techniques of this application generally relate to a technique for generating a candidate list at a video encoder and a technique for generating the same candidate list at a video decoder.
  • Video encoders and video decoders can produce the same candidate list by implementing the same techniques used to build the candidate list. For example, both a video encoder and a video decoder may build a list with the same number of candidates (e.g., five candidates).
  • Video encoders and decoders may first consider spatial candidates (e.g., neighboring blocks in the same image), then consider temporal candidates (e.g., candidates in different images), and finally consider artificially generated candidates Until the desired number of candidates are added to the list.
• a pruning operation may be utilized during candidate list construction for certain types of candidate motion information to remove duplicates from the candidate list, while for other types of candidates pruning may not be used, in order to reduce decoder complexity.
  • a pruning operation may be performed to exclude candidates with duplicate motion information from the list of candidates.
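The pruning of duplicate motion information from the candidate list can be sketched as below; the representation of motion information as hashable tuples and the function name are illustrative assumptions.

```python
def prune_candidates(candidates):
    """Remove candidates whose motion information duplicates an earlier
    entry, preserving the order of first occurrence."""
    seen, pruned = set(), []
    for mi in candidates:
        if mi not in seen:          # keep only the first copy of each entry
            seen.add(mi)
            pruned.append(mi)
    return pruned
```

Skipping this step for some candidate types, as the text notes, trades a slightly longer list for lower decoder complexity, since each membership check costs work per candidate.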
  • the video encoder may select candidate motion information from the candidate list and output an index identifier representing the selected candidate motion information in a code stream.
  • the selected candidate motion information may be motion information having a prediction block that produces the closest match to the PU being decoded.
  • the aforementioned index identifier may indicate a position of the candidate motion information selected in the candidate list.
  • the video encoder may also generate a prediction block for the PU based on a reference block indicated by the motion information of the PU.
  • the motion information of the PU may be determined based on the selected candidate motion information. For example, in the merge mode, it is determined that the selected candidate motion information is the motion information of the PU.
  • motion information of a PU may be determined based on a motion vector difference of the PU and selected candidate motion information.
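The two ways of deriving the PU's motion vector from the selected candidate, described in the two bullets above, can be sketched as follows (motion vectors as `(x, y)` tuples; names hypothetical).

```python
def derive_motion_vector(mode, selected_candidate_mv, mvd=None):
    """Merge mode: the selected candidate's motion vector is reused directly.
    Advanced motion vector prediction: the candidate serves as a predictor,
    and the signalled motion vector difference (MVD) is added to it."""
    if mode == 'merge':
        return selected_candidate_mv
    if mode == 'amvp':
        return (selected_candidate_mv[0] + mvd[0],
                selected_candidate_mv[1] + mvd[1])
    raise ValueError('unknown inter prediction mode: %r' % mode)
```

In merge mode only the candidate index is signalled; in the AMVP case the index plus the MVD is signalled, so both sides arrive at the same motion vector.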
  • the video encoder may generate one or more residual image blocks (referred to as residual blocks) for the CU based on the predictive image blocks (referred to as prediction blocks) of the PU of the CU and the original image blocks for the CU.
  • the video encoder may then encode one or more residual blocks and output a code stream.
  • the code stream may include data for identifying selected candidate motion information in a candidate list of the PU.
  • the video decoder may determine the motion information of the PU based on the selected candidate motion information in the candidate list of the PU.
  • the video decoder may identify one or more reference blocks for the PU based on the motion information of the PU. After identifying one or more reference blocks of the PU, the video decoder may generate a prediction block for the PU based on the one or more reference blocks of the PU.
  • the video decoder may reconstruct an image block for a CU based on a prediction block of the PU for the CU and one or more residual blocks for the CU.
  • the present application may describe a position or an image block as having various spatial relationships with a CU or a PU. This description can be interpreted to mean that the position or image block and the image block associated with the CU or PU have various spatial relationships.
  • a PU currently being decoded by a video decoder may be referred to as a current PU, and may also be referred to as a current image block to be processed.
  • This application may refer to the CU that the video decoder is currently decoding as the current CU.
• This application may refer to the image currently being decoded by the video decoder as the current image. It should be understood that this application is applicable to the case where the PU and the CU have the same size, or where the PU is the CU; the PU is used uniformly in the description.
  • video encoder 20 may use inter prediction to generate prediction blocks and motion information for a PU of a CU.
  • the motion information of a PU may be the same or similar to the motion information of one or more neighboring PUs (ie, PUs whose image blocks are spatially or temporally near the PU's image blocks). Because neighboring PUs often have similar motion information, video encoder 20 may refer to the motion information of neighboring PUs to encode the motion information of the PU. Encoding the motion information of the PU with reference to the motion information of the neighboring PU can reduce the number of encoding bits required in the code stream to indicate the motion information of the PU.
  • Video encoder 20 may refer to the motion information of a neighboring PU in various ways to encode the motion information of that PU. For example, video encoder 20 may indicate that the motion information of the PU is the same as the motion information of nearby PUs. This application may use a merge mode to refer to indicating that the motion information of the PU is the same as that of the neighboring PU or may be derived from the motion information of the neighboring PU. In another feasible implementation manner, the video encoder 20 may calculate a Motion Vector Difference (MVD) for the PU. MVD indicates the difference between the motion vector of this PU and the motion vector of neighboring PUs. Video encoder 20 may include MVD instead of the motion vector of the PU in the motion information of the PU.
  • This application may use an advanced motion vector prediction mode to refer to using the MVD and an index value identifying a candidate (ie, candidate motion information) to notify the decoding end of the PU's motion information.
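The encoder side of this scheme, choosing a predictor candidate and computing the MVD to be written to the code stream, can be sketched as below. The choice of the predictor minimizing MVD magnitude is one plausible criterion for illustration; a real encoder would weigh rate-distortion cost.

```python
def encode_amvp(mv, predictor_candidates):
    """Pick the predictor whose motion vector is closest to mv
    (smallest L1 difference), and return the pair to signal:
    (candidate index, motion vector difference)."""
    best_idx = min(
        range(len(predictor_candidates)),
        key=lambda i: abs(mv[0] - predictor_candidates[i][0]) +
                      abs(mv[1] - predictor_candidates[i][1]))
    pred = predictor_candidates[best_idx]
    # MVD = actual motion vector minus the chosen predictor
    return best_idx, (mv[0] - pred[0], mv[1] - pred[1])
```

A small MVD costs fewer bits to encode than the full motion vector, which is the saving the text attributes to coding motion information with reference to neighboring PUs.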
  • the video encoder 20 may generate a candidate list for the PU.
  • the candidate list may include one or more candidates (ie, one or more sets of candidate motion information).
  • Each candidate in the candidate list for this PU represents a set of motion information.
  • a set of motion information may include a motion vector, a reference image list, and a reference image index corresponding to the reference image list.
• the video encoder 20 may select one of a plurality of candidates from the candidate list for the PU. For example, the video encoder may evaluate each candidate for the PU being decoded and may select the candidate with the desired rate-distortion cost. Video encoder 20 may output a candidate index for the PU. The candidate index identifies the position of the selected candidate in the candidate list.
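Selection by rate-distortion cost, commonly written J = D + λ·R, can be sketched as follows. The distortion and rate callbacks are placeholders for the encoder's actual measurements (e.g., prediction error of the candidate's reference block, and bits to code the candidate index).

```python
def select_candidate(candidates, distortion_fn, rate_fn, lam):
    """Return the index of the candidate with the lowest
    rate-distortion cost J = D + lambda * R."""
    costs = [distortion_fn(c) + lam * rate_fn(i)
             for i, c in enumerate(candidates)]
    return min(range(len(candidates)), key=costs.__getitem__)
```

The lambda multiplier trades distortion against rate: a larger lambda biases the choice toward candidates that are cheap to signal even if their prediction is slightly worse.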
  • the video encoder 20 may generate a prediction block for a PU based on a reference block indicated by the motion information of the PU.
  • the motion information of the PU may be determined based on the selected candidate motion information from the candidate list for the PU.
  • video decoder 30 may generate a candidate list for each of the PUs of the CU.
  • the candidate list generated by video decoder 30 for a PU may be the same as the candidate list generated by video encoder 20 for a PU.
  • the syntax element parsed from the code stream may indicate the position of the candidate motion information selected in the candidate list of the PU.
  • video decoder 30 may generate a prediction block for the PU based on one or more reference blocks indicated by the motion information of the PU.
  • Video decoder 30 may determine motion information of a PU based on candidate motion information selected in a candidate list for the PU.
  • Video decoder 30 may reconstruct an image block for a CU based on a prediction block for a PU and a residual block for a CU.
• the construction of the candidate list and the parsing, from the code stream, of the position of the selected candidate in the candidate list are independent of each other, and may be performed in any order or in parallel.
• In another possible implementation, the position of the selected candidate in the candidate list is first parsed from the code stream, and the candidate list is then constructed according to the parsed position.
• In this case, the entire candidate list does not need to be constructed; only the candidate list up to the parsed position needs to be constructed, that is, enough to determine the candidate at that position.
• For example, when the parsed index identifier is 3, only the candidates with indices 0 through 3 need to be constructed, and the candidate with index 3 can then be determined.
• This can achieve the technical effect of reducing complexity and improving decoding efficiency.
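The parse-first strategy can be sketched as below: the decoder consumes candidates lazily from a generator and stops as soon as the parsed position is reached, never building the rest of the list. The names are illustrative.

```python
def candidate_at_index(parsed_index, candidate_generator):
    """Construct the candidate list only up to the position parsed from
    the code stream, then return the candidate at that position."""
    partial = []
    for cand in candidate_generator:
        partial.append(cand)
        if len(partial) == parsed_index + 1:
            return partial[-1]       # candidates beyond this are never built
    raise IndexError("parsed index exceeds the number of available candidates")
```

When the index identifier is 3, only four candidates (indices 0 through 3) are ever materialized, which is the complexity reduction described above.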
  • FIG. 2 is a schematic block diagram of a video encoder 20 according to an embodiment of the present application.
  • Video encoder 20 may perform intra-frame decoding and inter-frame decoding of video blocks within a video slice.
  • Intra decoding relies on spatial prediction to reduce or remove the spatial redundancy of a video within a given video frame or image.
  • Inter-frame decoding relies on temporal prediction to reduce or remove temporal redundancy of video within adjacent frames of a video sequence or video.
  • the intra mode (I mode) may refer to any of several space-based compression modes.
  • Inter-modes such as unidirectional prediction (P mode) or bidirectional prediction (B mode) may refer to any of several time-based compression modes.
  • the video encoder 20 includes a segmentation unit 35, a prediction unit 41, a reference image memory 64, a summer 50, a transformation processing unit 52, a quantization unit 54, and an entropy encoding unit 56.
  • the prediction unit 41 includes an inter prediction unit 43 and an intra prediction unit 46.
  • the inter prediction unit 43 may include a motion estimation unit 42 and a motion compensation unit 44.
  • the video encoder 20 may also include an inverse quantization unit 58, an inverse transform unit 60, and a summer (also referred to as a reconstructor) 62.
  • a deblocking filter (not shown in Figure 2) may also be included to filter block boundaries to remove block effect artifacts from the reconstructed video. When needed, the deblocking filter will typically filter the output of the summer 62. In addition to the deblocking filter, additional loop filters (in-loop or post-loop) can be used.
  • the video encoder 20 receives video data, and the segmentation unit 35 divides the data into video blocks.
  • This segmentation may also include segmentation into slices, image blocks, or other larger units, and video block segmentation, for example, based on the quad-tree structure of the LCU and CU.
   • Video encoder 20 exemplarily illustrates the components that encode video blocks within a video slice to be encoded. In general, a slice can be divided into multiple video blocks (and possibly into sets of video blocks called image blocks).
   • the prediction unit 41 may select one of a plurality of possible decoding modes for the current video block, such as one of a plurality of intra decoding modes or one of a plurality of inter decoding modes, based on encoding quality and a cost calculation result (for example, a rate-distortion cost, RDcost).
   • the prediction unit 41 may provide the obtained intra-decoded or inter-decoded block to the summer 50 to generate residual block data, and provide the obtained intra-decoded or inter-decoded block to the summer 62 to reconstruct the encoded block for use as a reference image.
   • the inter prediction unit 43 (such as the motion estimation unit 42 and the motion compensation unit 44) within the prediction unit 41 performs inter-predictive decoding of the current video block with respect to one or more prediction blocks in one or more reference images to provide temporal compression.
  • the motion estimation unit 42 is configured to determine an inter prediction mode of a video slice according to a predetermined mode of a video sequence.
  • the predetermined mode can specify the video slices in the sequence as P slices, B slices, or GPB slices.
  • the motion estimation unit 42 and the motion compensation unit 44 can be highly integrated, and are described separately here for the convenience of understanding the concept.
   • Motion estimation, performed by the motion estimation unit 42, is a process of estimating a motion vector of a video block (also referred to as an image block).
  • a motion vector may indicate a displacement of a PU of a video block within a current video frame or image relative to a predicted block within a reference image.
   • the prediction block is a block that is found to closely match, in terms of pixel difference, the PU of the video block to be decoded, and the pixel difference may be determined by a sum of absolute differences (SAD), a sum of squared differences (SSD), or another difference metric.
  • the video encoder 20 may calculate a value of a sub-integer pixel position of a reference image stored in the reference image memory 64. For example, video encoder 20 may interpolate values of quarter pixel positions, eighth pixel positions, or other fractional pixel positions of the reference image. Therefore, the motion estimation unit 42 may perform a motion search with respect to the full pixel position and the fractional pixel position and output a motion vector having a fractional pixel accuracy.
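The block-matching criterion and integer-pixel search described above can be sketched as follows. This is a toy exhaustive search under assumed list-of-lists pixel blocks; real encoders use fast search patterns and add the fractional-pixel refinement mentioned above, which is omitted here.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized pixel blocks."""
    return sum((a - b) ** 2 for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_integer_mv(current, reference, block_y, block_x, size, search_range):
    """Exhaustive integer-pixel motion search minimising SAD (a toy sketch)."""
    best, best_cost = (0, 0), float("inf")
    cur = [row[block_x:block_x + size] for row in current[block_y:block_y + size]]
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = block_y + dy, block_x + dx
            if ry < 0 or rx < 0 or ry + size > len(reference) or rx + size > len(reference[0]):
                continue  # candidate block falls outside the reference picture
            cand = [row[rx:rx + size] for row in reference[ry:ry + size]]
            cost = sad(cur, cand)
            if cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best, best_cost
```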
  • the motion estimation unit 42 calculates a motion vector of the PU of the video block in the inter-decoded slice by comparing the position of the PU with the position of the prediction block of the reference image.
  • Reference images can be selected from the first reference image list (List 0) or the second reference image list (List 1), each of the lists identifying one or more reference images stored in the reference image memory 64.
  • the motion estimation unit 42 sends the calculated motion vector to the entropy encoding unit 56 and the motion compensation unit 44.
  • Motion compensation performed by the motion compensation unit 44 may involve extracting or generating a prediction block based on a motion vector determined by motion estimation. After receiving the motion vector of the PU of the current video block, the motion compensation unit 44 can locate the prediction block pointed to by the motion vector in one of the reference image lists.
  • Video encoder 20 forms a residual video block by subtracting the pixel value of the prediction block from the pixel value of the current video block being decoded, thereby forming a pixel difference value.
  • the pixel difference values form the residual data of the block, and may include both luminance and chrominance difference components.
  • the summer 50 represents one or more components that perform this subtraction operation.
  • Motion compensation unit 44 may also generate syntax elements associated with video blocks and video slices for use by video decoder 30 for decoding video blocks of video slices.
  • the picture containing the PU may be associated with two reference picture lists called "List 0" and "List 1".
  • an image containing B bands may be associated with a list combination that is a combination of list 0 and list 1.
   • the motion estimation unit 42 may perform unidirectional prediction or bidirectional prediction for the PU. In some feasible implementations, bidirectional prediction is prediction based respectively on the reference image lists list 0 and list 1; in other feasible implementations, bidirectional prediction is prediction based respectively on a reconstructed future frame and a reconstructed past frame of the current frame in display order.
  • the motion estimation unit 42 may search a reference image of List 0 or List 1 for a reference block for the PU.
  • the motion estimation unit 42 may then generate a reference frame index indicating a reference image containing a reference block in List 0 or List 1 and a motion vector indicating a spatial displacement between the PU and the reference block.
  • the motion estimation unit 42 may output a reference frame index, a prediction direction identifier, and a motion vector as motion information of the PU.
   • the prediction direction identifier may indicate whether the reference frame index indicates a reference picture in list 0 or list 1. For example, prediction direction identifier 1 indicates list0, prediction direction identifier 2 indicates list1, and prediction direction identifier 3 indicates bidirectional prediction, that is, both list0 and list1.
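The identifier mapping in the example above can be written out as a small lookup. This is only an illustration of the example values given above, not a normative mapping.

```python
def lists_used(prediction_direction_id):
    """Map the example prediction direction identifier to the reference
    picture lists it selects: 1 -> list0, 2 -> list1, 3 -> bidirectional."""
    mapping = {1: ("list0",), 2: ("list1",), 3: ("list0", "list1")}
    try:
        return mapping[prediction_direction_id]
    except KeyError:
        raise ValueError("unknown prediction direction identifier")
```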
  • the motion compensation unit 44 may generate a predictive image block of the PU based on a reference block indicated by the motion information of the PU.
  • the motion estimation unit 42 may search for a reference block for the PU in the reference image in list 0 and may also search for another for the PU in the reference image in list 1 Reference block.
   • the motion estimation unit 42 may then generate reference indexes indicating the reference images in list 0 and list 1 that contain the reference blocks, and motion vectors indicating the spatial displacements between the reference blocks and the PU.
  • the motion estimation unit 42 may output a reference index and a motion vector of the PU as motion information of the PU.
  • the motion compensation unit 44 may generate a predictive image block of the PU based on a reference block indicated by the motion information of the PU.
   • the motion estimation unit 42 does not output the complete set of motion information for the PU to the entropy encoding unit 56. Instead, the motion estimation unit 42 may signal the motion information of the PU with reference to the motion information of another PU. For example, the motion estimation unit 42 may determine that the motion information of the PU is sufficiently similar to the motion information of a neighboring PU. In this embodiment, the motion estimation unit 42 may indicate, in the syntax structure associated with the PU, an indication value that indicates to the video decoder 30 that the PU has the same motion information as the neighboring PU, or has motion information that can be derived from the neighboring PU.
  • the motion estimation unit 42 may identify candidates and motion vector differences (MVD) associated with neighboring PUs in a syntax structure associated with the PU.
  • MVD indicates the difference between the motion vector of the PU and the indicated candidate associated with the neighboring PU.
  • Video decoder 30 may use the indicated candidate and MVD to determine the motion vector of the PU.
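The MVD relationship described above amounts to a difference at the encoder and a sum at the decoder. A minimal sketch, assuming motion vectors are represented as (x, y) integer tuples:

```python
def motion_vector_difference(mv, candidate_mv):
    """Encoder side: MVD = actual motion vector minus the indicated candidate."""
    return (mv[0] - candidate_mv[0], mv[1] - candidate_mv[1])

def reconstruct_motion_vector(candidate_mv, mvd):
    """Decoder side: motion vector = indicated candidate plus signalled MVD."""
    return (candidate_mv[0] + mvd[0], candidate_mv[1] + mvd[1])

mv, predictor = (6, 1), (5, -2)
mvd = motion_vector_difference(mv, predictor)        # (1, 3)
assert reconstruct_motion_vector(predictor, mvd) == mv
```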
  • the prediction unit 41 may generate a candidate list for each PU of the CU.
   • In an embodiment of the present application, a history candidate is added to a candidate list (for example, a fusion motion information candidate list and/or a motion vector prediction candidate list).
   • the intra prediction unit 46 within the prediction unit 41 may perform intra-predictive decoding of the current video block relative to one or more neighboring blocks in the same image or slice as the current block to be decoded, to provide spatial compression. Therefore, as an alternative to the inter prediction performed by the motion estimation unit 42 and the motion compensation unit 44 (as described above), the intra prediction unit 46 may perform intra prediction on the current block. Specifically, the intra prediction unit 46 may determine an intra prediction mode used to encode the current block. In some feasible implementations, the intra prediction unit 46 may encode the current block using various intra prediction modes during separate encoding passes, and the intra prediction unit 46 (or, in some feasible implementations, the mode selection unit 40) may select an appropriate intra prediction mode to use from the tested modes.
  • the video encoder 20 forms a residual video block by subtracting the prediction block from the current video block.
  • the residual video data in the residual block may be included in one or more TUs and applied to the transform processing unit 52.
  • the transform processing unit 52 transforms the residual video data into residual transform coefficients using a transform such as a discrete cosine transform (DCT) or a conceptually similar transform (eg, a discrete sine transform DST).
  • the transform processing unit 52 may transform the residual video data from a pixel domain to a transform domain (for example, a frequency domain).
  • the transformation processing unit 52 may send the obtained transformation coefficient to the quantization unit 54.
  • the quantization unit 54 quantizes the transform coefficients to further reduce the code rate.
  • the quantization process may reduce the bit depth associated with some or all of the coefficients.
  • the degree of quantization can be modified by adjusting the quantization parameters.
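The relationship between the quantization parameter, the quantization step, and the bit-rate/precision trade-off described above can be illustrated with a simplified uniform scalar quantizer. This is a hedged model: HEVC-style codecs derive the step from the QP so that it roughly doubles every 6 QP units, and the rounding used here is a simplification.

```python
def qstep_from_qp(qp):
    """Approximate relation: the quantization step doubles every 6 QP units."""
    return 2 ** ((qp - 4) / 6)

def quantize(coeff, qstep):
    """Uniform scalar quantization: a larger step gives coarser levels and a
    lower code rate, at the cost of larger reconstruction error."""
    return int(round(coeff / qstep))

def dequantize(level, qstep):
    """Inverse quantization (dequantization) of a coefficient level."""
    return level * qstep
```

For example, a coefficient of 101 reconstructs to 100 with step 4 (error 1) but to 104 with step 8 (error 3), illustrating how adjusting the quantization parameter trades precision for rate.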
  • the quantization unit 54 may then perform a scan of a matrix containing the quantized transform coefficients.
  • the entropy encoding unit 56 may perform scanning.
  • the entropy encoding unit 56 may entropy encode the quantized transform coefficients.
  • the entropy encoding unit 56 may perform context adaptive variable length decoding (CAVLC), context adaptive binary arithmetic decoding (CABAC), syntax-based context adaptive binary arithmetic decoding (SBAC), probability interval partitioning entropy (PIPE) decoding or another entropy coding method or technique.
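The scan of the quantized coefficient matrix mentioned above orders the (typically non-zero) low-frequency coefficients first so that entropy coding is efficient. A zig-zag/diagonal scan is one common choice; the sketch below assumes a square matrix and is illustrative, as the actual scan pattern depends on the codec.

```python
def diagonal_scan(matrix):
    """Zig-zag scan of an n-by-n quantized coefficient matrix: walk the
    anti-diagonals from the DC coefficient, alternating direction."""
    n = len(matrix)
    order = []
    for s in range(2 * n - 1):
        coords = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            coords.reverse()  # alternate traversal direction per anti-diagonal
        order.extend(matrix[i][j] for i, j in coords)
    return order

# Coefficients placed so the scan reads them in increasing order:
print(diagonal_scan([[1, 2, 6],
                     [3, 5, 7],
                     [4, 8, 9]]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```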
  • the entropy encoding unit 56 may also entropy encode the motion vector and other syntax elements of the current video slice being decoded.
  • the encoded code stream may be transmitted to the video decoder 30 or archived for later transmission or retrieved by the video decoder 30.
  • the entropy encoding unit 56 may encode information indicating a selected intra prediction mode according to the techniques of the present application.
   • Video encoder 20 may include, in the transmitted bitstream configuration data, definitions of encoding contexts for various blocks and, for each of the contexts, an indication of the MPM, the intra prediction mode index table, and the modified intra prediction mode index table. The configuration data may include multiple intra prediction mode index tables and multiple modified intra prediction mode index tables (also known as codeword mapping tables).
  • the inverse quantization unit 58 and the inverse transform unit 60 respectively apply inverse quantization and inverse transform to reconstruct a residual block in the pixel domain for later use as a reference block of a reference image.
  • the motion compensation unit 44 may calculate a reconstructed block by adding a residual block to a prediction block of one of the reference images within one of the reference image lists.
  • the motion compensation unit 44 may also apply one or more interpolation filters to the reconstructed residual block to calculate sub-integer pixel values for motion estimation.
  • the summer 62 adds the reconstructed residual block and the motion-compensated prediction block generated by the motion compensation unit 44 to generate a reconstruction block, which is used as a reference block for storage in the reference image memory 64.
  • the reference block can be used as a reference block by the motion estimation unit 42 and the motion compensation unit 44 to inter-predict a block in a subsequent video frame or image.
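The summer 50 / summer 62 roles described above form the encoder's reconstruction loop: subtract the prediction to get the residual, then add the (possibly transformed, quantized, and inverse-processed) residual back to the prediction to obtain the reference block. A minimal sketch of the two pixel-wise sums, assuming list-of-lists blocks and omitting transform and quantization (as in the lossless path mentioned below):

```python
def residual_block(current, prediction):
    """Summer 50: pixel-wise difference forming the residual block."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]

def reconstruct_block(residual, prediction):
    """Summer 62 (reconstructor): residual plus motion-compensated prediction,
    yielding the block stored in the reference image memory."""
    return [[r + p for r, p in zip(res_row, pred_row)]
            for res_row, pred_row in zip(residual, prediction)]
```

With the residual passed through unchanged, reconstruction is exact; with quantization in between, the reconstructed reference block matches what the decoder will produce, keeping encoder and decoder in sync.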
   • the video encoder 20 may directly quantize the residual signal without processing by the transform unit 52 and, correspondingly, without processing by the inverse transform unit 60; or, for some image blocks or image frames, the video encoder 20 does not generate residual data and accordingly does not need processing by the transform unit 52, the quantization unit 54, the inverse quantization unit 58, and the inverse transform unit 60; or, the video encoder 20 may store the reconstructed image blocks directly as reference blocks without processing by a filter unit; alternatively, the quantization unit 54 and the inverse quantization unit 58 in the video encoder 20 may be merged together.
  • the loop filtering unit is optional, and in the case of lossless compression coding, the transform unit 52, the quantization unit 54, the inverse quantization unit 58, and the inverse transform unit 60 are optional. It should be understood that, according to different application scenarios, the inter prediction unit and the intra prediction unit may be selectively enabled, and in this case, the inter prediction unit is enabled.
  • FIG. 3 is a schematic block diagram of a video decoder 30 in an embodiment of the present application.
   • the video decoder 30 includes an entropy decoding unit 80, a prediction unit 81, an inverse quantization unit 86, an inverse transform unit 88, a summer 90 (i.e., a reconstructor), and a reference image memory 92.
  • the reference image memory 92 may also be provided outside the video decoder 30.
  • the prediction unit 81 includes an inter prediction unit 82 and an intra prediction unit 84.
  • the inter prediction unit 82 may be, for example, a motion compensation unit 82.
   • the video decoder 30 may perform a decoding process that is exemplarily reciprocal to the encoding process described with respect to the video encoder 20 in FIG. 4A or FIG. 4B.
  • video decoder 30 receives from video encoder 20 an encoded video code stream representing video blocks of the encoded video slice and associated syntax elements.
   • the entropy decoding unit 80 of the video decoder 30 entropy decodes the bitstream to generate quantized coefficients, motion vectors, and other syntax elements.
   • the entropy decoding unit 80 forwards the motion vectors and other syntax elements to the prediction unit 81.
  • Video decoder 30 may receive syntax elements at a video slice level and / or a video block level.
   • When a video slice is decoded as an intra-decoded (I) slice, the intra prediction unit 84 of the prediction unit 81 may generate prediction data for a video block of the current video slice based on a signaled intra prediction mode and data from previously decoded blocks of the current frame or image.
   • When the video image is decoded as an inter-decoded (e.g., B, P, or GPB) slice, the motion compensation unit 82 of the prediction unit 81 generates a prediction block for a video block of the current video image based on the motion vectors and other syntax elements received from the entropy decoding unit 80.
  • the prediction block may be generated from one of the reference pictures within one of the reference picture lists.
  • the video decoder 30 may construct a reference image list (List 0 and List 1) using a default construction technique based on the reference image stored in the reference image memory 92.
   • the motion compensation unit 82 determines the prediction information for a video block of the current video slice by parsing the motion vectors and other syntax elements, and uses the prediction information to generate the prediction block for the current video block being decoded. For example, the motion compensation unit 82 uses some of the received syntax elements to determine the prediction mode (e.g., intra prediction or inter prediction) used to decode the video blocks of the video slice, the inter prediction slice type (e.g., B slice, P slice, or GPB slice), construction information for one or more of the slice's reference image lists, the motion vector of each inter-encoded video block of the slice, the inter prediction status of each inter-decoded video block of the slice, and other information used to decode the video blocks in the current video slice.
  • the motion compensation unit 82 may also perform interpolation based on an interpolation filter.
  • the motion compensation unit 82 may use an interpolation filter as used by the video encoder 20 during encoding of the video block to calculate the interpolation value of the sub-integer pixels of the reference block.
  • the motion compensation unit 82 may determine an interpolation filter used by the video encoder 20 from the received syntax elements and use the interpolation filter to generate a prediction block.
  • the motion compensation unit 82 may generate a candidate list for the PU.
  • the codestream may include data identifying the position of the selected candidate in the candidate list of the PU.
  • the motion compensation unit 82 may generate a predictive image block for the PU based on one or more reference blocks indicated by the motion information of the PU.
  • a reference block of a PU may be in a temporal image different from the PU.
  • the motion compensation unit 82 may determine the motion information of the PU based on the selected motion information in the candidate list of the PU.
  • the embodiment of the present application adds a history candidate to a candidate list (such as a fusion motion information candidate list and / or a motion vector prediction candidate list).
   • the inverse quantization unit 86 performs inverse quantization (i.e., dequantization) on the quantized transform coefficients that are provided in the code stream and decoded by the entropy decoding unit 80.
  • the inverse quantization process may include determining the degree of quantization using the quantization parameters calculated by video encoder 20 for each video block in the video slice, and similarly determining the degree of inverse quantization that should be applied.
  • the inverse transform unit 88 applies an inverse transform (e.g., inverse DCT, inverse integer transform, or conceptually similar inverse transform process) to the transform coefficients to generate a residual block in the pixel domain.
   • After the motion compensation unit 82 generates the prediction block for the current video block based on the motion vectors and other syntax elements, the video decoder 30 sums the residual block from the inverse transform unit 88 with the corresponding prediction block generated by the motion compensation unit 82 to form a decoded video block.
  • a deblocking filter may also be applied to filter the decoded blocks in order to remove block effect artifacts.
  • Other loop filters (in the decoding loop or after the decoding loop) can also be used to smooth pixel transitions or otherwise improve video quality.
  • the decoded video block in a given frame or image is then stored in a reference image memory 92, which stores a reference image for subsequent motion compensation.
  • the reference image memory 92 also stores decoded video for later presentation on a display device such as the display device 32 of FIG. 1.
  • the techniques of this application exemplarily involve inter-frame decoding. It should be understood that the techniques of this application may be performed by any of the video decoders described in this application.
   • The video decoder includes, for example, the video encoder 20 and the video decoder 30 shown and described with respect to FIGS. 1-3. That is, in a feasible implementation manner, the prediction unit 41 described with respect to FIG. 2 may perform a specific technique described below when performing inter prediction during encoding of a block of video data. In another feasible implementation manner, the prediction unit 81 described with respect to FIG. 3 may perform a specific technique described below when performing inter prediction during decoding of a block of video data.
  • a reference to a generic "video encoder" or "video decoder” may include video encoder 20, video decoder 30, or another video encoding or coding unit.
  • video decoder 30 may be used to decode the encoded video bitstream.
   • the video decoder 30 may generate an output video stream without processing by a filtering unit; or, for certain image blocks or image frames, the entropy decoding unit 80 of the video decoder 30 does not decode quantized coefficients, and accordingly processing by the inverse quantization unit 86 and the inverse transform unit 88 is not needed.
  • the loop filtering unit is optional; and in the case of lossless compression, the inverse quantization unit 86 and the inverse transform unit 88 are optional.
  • the inter prediction unit and the intra prediction unit may be selectively enabled, and in this case, the inter prediction unit is enabled.
  • FIG. 4A is an exemplary flowchart of a merge mode in an embodiment of the present application.
   • A video encoder (for example, the video encoder 20) may perform the merge operation 200.
  • the video encoder may perform a merge operation different from the merge operation 200.
  • the video encoder may perform a merge operation, where the video encoder performs more or fewer steps than the merge operation 200 or steps different from the merge operation 200.
  • the video encoder may perform the steps of the merge operation 200 in a different order or in parallel.
  • the encoder may also perform a merge operation 200 on a PU encoded in a skip mode.
  • the video encoder may generate a candidate list for the current PU (202).
  • the video encoder may generate a candidate list for the current PU in various ways. For example, the video encoder may generate a candidate list for the current PU according to one of the example techniques described below with respect to FIGS. 7-11C.
  • the candidate list for the current PU may include temporal candidate motion information (referred to as a time domain candidate).
  • the temporal candidate motion information may indicate motion information of a co-located PU in the time domain.
  • a co-located PU may be spatially in the same position in the image frame as the current PU, but in a reference picture instead of the current picture.
   • a reference picture that includes the co-located PU may be referred to as a related reference picture.
  • a reference image index of a related reference image may be referred to as a related reference image index in this application.
  • the current image may be associated with one or more reference image lists (eg, list 0, list 1, etc.).
  • the reference image index may indicate the reference image by indicating the position of the reference image in a certain reference image list.
  • the current image may be associated with a combined reference image list.
  • the related reference picture index is the reference picture index of the PU covering the reference index source position associated with the current PU.
  • the reference index source location associated with the current PU is adjacent to the left of the current PU or above the current PU. In this application, if an image block associated with a PU includes a specific location, the PU may "cover" the specific location.
  • the reference index source location associated with the current PU is within the current CU.
  • the PU may need to access motion information of another PU of the current CU in order to determine a reference picture containing a co-located PU. Therefore, these video encoders may use motion information (ie, a reference picture index) of a PU belonging to the current CU to generate a time domain candidate for the current PU. In other words, these video encoders can use the motion information of the PUs that belong to the current CU to generate time-domain candidates. Therefore, the video encoder cannot generate a candidate list for a current PU and a PU covering a reference index source position associated with the current PU in parallel.
  • the video encoder may explicitly set the relevant reference picture index without referring to the reference picture index of any other PU. This may enable the video encoder to generate candidate lists for the current PU and other PUs of the current CU in parallel. Because the video encoder explicitly sets the relevant reference picture index, the relevant reference picture index is not based on the motion information of any other PU of the current CU. In some feasible implementations where the video encoder explicitly sets the relevant reference picture index, the video encoder may always set the relevant reference picture index to a fixed, predefined preset reference picture index (eg, 0). In this way, the video encoder may generate time-domain candidates based on the motion information of the co-located PU in the reference frame indicated by the preset reference picture index, and may include the time-domain candidates in the candidate list of the current CU .
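The fixed preset reference picture index described above can be sketched as follows. This is a minimal illustration assuming a dictionary-based motion field for the co-located reference frame; motion vector scaling by picture distance, which real codecs apply to temporal candidates, is omitted.

```python
PRESET_REF_IDX = 0  # fixed, predefined; not derived from any other PU's motion information

def temporal_candidate(colocated_motion_field, pu_position):
    """Build the time-domain candidate from the co-located PU's motion vector in
    the reference frame indicated by the preset reference picture index. Because
    no other PU of the current CU is consulted, candidate lists for different
    PUs can be generated in parallel."""
    mv = colocated_motion_field.get(pu_position)
    if mv is None:
        return None  # no co-located motion information available
    return {"mv": mv, "ref_idx": PRESET_REF_IDX}
```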
   • the video encoder may explicitly signal the related reference picture index in a syntax structure (e.g., an image header, a slice header, an APS, or another syntax structure).
   • the video encoder may signal the relevant reference picture index to the decoder for each LCU (i.e., CTU), CU, PU, TU, or other type of sub-block. For example, the video encoder may signal that the relevant reference picture index for each PU of the CU is equal to "1".
  • the relevant reference image index may be set implicitly rather than explicitly.
   • the video encoder may use the motion information of co-located PUs in the reference pictures indicated by the reference picture indexes of PUs covering locations outside the current CU to generate a temporal candidate for each PU of the current CU, even if these locations are not strictly adjacent to the current PU.
  • the video encoder may generate predictive image blocks associated with the candidates in the candidate list (204).
   • the video encoder may generate the predictive image block associated with a candidate by determining the motion information of the current PU based on the motion information of the indicated candidate, and then generating the predictive image block based on one or more reference blocks indicated by the motion information of the current PU.
   • the video encoder may select one of the candidates from the candidate list (206).
  • Video encoders can select candidates in various ways. For example, a video encoder may select one of the candidates based on a code rate-distortion cost analysis of each of the predictive image blocks associated with the candidate.
  • the video encoder may output the candidate's index (208).
  • the index may indicate the position of the selected candidate in the candidate list. In some feasible implementations, this index may be represented as "merge_idx".
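Steps 202-208 above can be sketched end to end: given the candidate list, evaluate the rate-distortion cost of the predictive image block associated with each candidate and output the index of the best one. The `rd_cost` callable is a hypothetical stand-in for the cost analysis; only the index (e.g., "merge_idx") is signalled, not the motion information itself.

```python
def merge_mode_encode(candidates, rd_cost):
    """Select the merge candidate with the lowest rate-distortion cost and
    return its position in the candidate list (the signalled merge index)."""
    best_idx = min(range(len(candidates)),
                   key=lambda i: rd_cost(candidates[i]))
    return best_idx

# Hypothetical candidates and RD costs for illustration only.
costs = {"A1": 7.0, "B1": 3.5, "T": 4.2}
merge_idx = merge_mode_encode(["A1", "B1", "T"], costs.__getitem__)
print(merge_idx)  # 1
```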
  • FIG. 4B is an exemplary flowchart of an advanced motion vector prediction (AMVP) mode in an embodiment of the present application.
   • A video encoder (for example, the video encoder 20) may perform the AMVP operation 210.
  • the video encoder may generate one or more motion vectors for the current PU (211).
  • the video encoder may perform integer motion estimation and fractional motion estimation to generate motion vectors for the current PU.
  • the current image may be associated with two reference image lists (List 0 and List 1).
  • the video encoder may generate a list 0 motion vector or a list 1 motion vector for the current PU.
  • the list 0 motion vector may indicate a spatial displacement between an image block of the current PU and a reference block in a reference image in list 0.
  • the list 1 motion vector may indicate a spatial displacement between an image block of the current PU and a reference block in a reference image in list 1.
  • the video encoder may generate a list 0 motion vector and a list 1 motion vector for the current PU.
  • the video encoder may generate predictive image blocks (referred to simply as prediction blocks) for the current PU (212).
  • the video encoder may generate predictive image blocks for the current PU based on one or more reference blocks indicated by one or more motion vectors for the current PU.
  • the video encoder may generate a candidate list for the current PU (213).
   • the video encoder can generate the list of candidate prediction motion vectors for the current PU in various ways.
  • the video encoder may generate a candidate list for the current PU according to one or more of the possible implementations described below with respect to FIGS. 6 to 11C.
  • the candidate prediction motion vector list may include two or three candidate prediction motion vectors.
  • the list of candidate prediction motion vectors may include more candidate prediction motion vectors (eg, five or seven candidate prediction motion vectors).
   • the video encoder may generate one or more motion vector prediction residual values (also known as motion vector differences, MVDs) for each candidate prediction motion vector in the candidate list (214).
  • the video encoder may generate a motion vector difference for the candidate prediction motion vector by determining a difference between the motion vector indicated by the candidate prediction motion vector and a corresponding motion vector of the current PU.
   • If the current PU is unidirectionally predicted, the video encoder may generate a single MVD for each candidate prediction motion vector. If the current PU is bidirectionally predicted, the video encoder may generate two MVDs for each candidate prediction motion vector.
  • the first MVD may indicate a difference between the motion vector of the candidate prediction motion vector and the list 0 motion vector of the current PU.
  • the second MVD may indicate a difference between the motion vector of the candidate prediction motion vector and the list 1 motion vector of the current PU.
  • the video encoder may select one or more of the candidate prediction motion vectors from the candidate prediction motion vector list (215).
  • the video encoder may select one or more candidate prediction motion vectors in various ways. For example, a video encoder may select a candidate prediction motion vector with an associated motion vector that matches the motion vector to be encoded with minimal error, which may reduce the number of bits required to represent the motion vector difference for the candidate prediction motion vector.
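The selection criterion above, picking the candidate predictor whose MVD is cheapest to code, can be sketched as follows. The bit-cost model is an assumption for illustration (cost grows with MVD magnitude, as with exp-Golomb-style codes); it is not the codec's actual entropy model.

```python
def mvd_bits(mvd):
    """Rough bit-cost proxy for coding an MVD: larger components cost more."""
    return sum(abs(component).bit_length() + 1 for component in mvd)

def select_amvp_predictor(mv, candidate_predictors):
    """Pick the candidate prediction motion vector minimising the MVD coding
    cost (215), and return its index plus the MVD to be signalled (214/216)."""
    best_idx = min(
        range(len(candidate_predictors)),
        key=lambda i: mvd_bits((mv[0] - candidate_predictors[i][0],
                                mv[1] - candidate_predictors[i][1])),
    )
    best = candidate_predictors[best_idx]
    return best_idx, (mv[0] - best[0], mv[1] - best[1])
```

For a bidirectionally predicted PU, the same selection would be run once per reference list, yielding two indexes and two MVDs.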
• the video encoder may output one or more reference image indexes for the current PU, one or more candidate prediction motion vector indexes, and one or more motion vector differences of the selected candidate prediction motion vectors (216).
• if the current PU is uni-predicted, the video encoder may output a reference picture index for list 0 ("ref_idx_l0") or a reference picture index for list 1 ("ref_idx_l1").
• the video encoder may also output a candidate prediction motion vector index ("mvp_l0_flag") indicating the position, in the candidate prediction motion vector list, of the selected candidate prediction motion vector for the list 0 motion vector of the current PU, or a candidate prediction motion vector index ("mvp_l1_flag") indicating the position of the selected candidate prediction motion vector for the list 1 motion vector of the current PU.
• the video encoder may also output the MVD of the list 0 motion vector or the list 1 motion vector for the current PU.
• if the current PU is bi-predicted, the video encoder may output the reference picture index for list 0 ("ref_idx_l0") and the reference picture index for list 1 ("ref_idx_l1").
• the video encoder may also output a candidate prediction motion vector index ("mvp_l0_flag") indicating the position, in the candidate prediction motion vector list, of the selected candidate prediction motion vector for the list 0 motion vector of the current PU, and a candidate prediction motion vector index ("mvp_l1_flag") indicating the position of the selected candidate prediction motion vector for the list 1 motion vector of the current PU.
  • the video encoder may also output the MVD of the list 0 motion vector for the current PU and the MVD of the list 1 motion vector for the current PU.
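The two signaling cases above can be summarized in a small sketch. The function name is hypothetical and a real encoder entropy-codes these elements rather than returning a list; the element names follow the text (with `lX` standing for whichever of list 0 or list 1 is used in the uni-predicted case):

```python
def amvp_motion_syntax(bi_predicted: bool) -> list:
    # Motion-related syntax elements output for one PU in AMVP-style signaling.
    if not bi_predicted:
        # uni-prediction: one reference index, one MVP index, one MVD
        return ["ref_idx_lX", "mvp_lX_flag", "mvd_lX"]
    # bi-prediction: elements for both list 0 and list 1
    return ["ref_idx_l0", "ref_idx_l1",
            "mvp_l0_flag", "mvp_l1_flag",
            "mvd_l0", "mvd_l1"]
```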
  • FIG. 5 is an exemplary flowchart of motion compensation performed by a video decoder (such as video decoder 30) in an embodiment of the present application.
  • the video decoder may receive an indication of a selected candidate for the current PU (222). For example, the video decoder may receive a candidate index indicating the position of the selected candidate within the candidate list of the current PU.
  • the video decoder may receive the first candidate index and the second candidate index.
  • the first candidate index indicates the position of the selected candidate for the list 0 motion vector of the current PU in the candidate list.
  • the second candidate index indicates the position of the selected candidate for the list 1 motion vector of the current PU in the candidate list.
  • a single syntax element may be used to identify two candidate indexes.
  • the video decoder may generate a candidate list for the current PU (224).
  • the video decoder can generate this candidate list for the current PU in various ways. For example, the video decoder may use the techniques described below with reference to FIGS. 6 to 11 to generate a candidate list for the current PU.
• the video decoder may explicitly or implicitly set a reference picture index identifying the reference picture that includes a co-located PU, as described above with reference to FIG. 4A or FIG. 4B.
• the video decoder may determine the motion information of the current PU based on the motion information indicated by one or more selected candidates in the candidate list for the current PU (225). For example, if the motion information of the current PU is encoded using merge mode, the motion information of the current PU may be the same as the motion information indicated by the selected candidate. If the motion information of the current PU is encoded using AMVP mode, the video decoder may reconstruct one or more motion vectors of the current PU using the one or more motion vectors indicated by the selected candidate or candidates and the one or more MVDs indicated in the bitstream.
  • the reference image index and prediction direction identifier of the current PU may be the same as the reference image index and prediction direction identifier of the one or more selected candidates.
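A minimal sketch of the decoder-side motion recovery just described, using plain `(x, y)` tuples as a stand-in for real motion vector structures:

```python
def reconstruct_motion_vector(mode, candidate_mv, mvd=None):
    # merge mode: the motion vector is copied from the selected candidate.
    # AMVP mode: motion vector = candidate predictor + MVD parsed from the bitstream.
    if mode == "merge":
        return candidate_mv
    if mode == "amvp":
        return (candidate_mv[0] + mvd[0], candidate_mv[1] + mvd[1])
    raise ValueError("unknown inter prediction mode: " + mode)

print(reconstruct_motion_vector("merge", (4, -7)))          # (4, -7)
print(reconstruct_motion_vector("amvp", (4, -7), (1, 2)))   # (5, -5)
```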
  • the video decoder may generate a predictive image block for the current PU based on one or more reference blocks indicated by the motion information of the current PU (226).
• FIG. 6 is an exemplary schematic diagram of a current image block (such as a coding unit CU), and of the spatial neighboring image blocks and temporal neighboring image blocks associated with the current image block, in an embodiment of the present application, illustrating CU 600 and schematic candidate positions associated with CU 600.
• Candidate positions 1 to 5 represent spatial candidates in the same image as CU 600.
  • Candidate position 1 is positioned to the left of CU600.
  • Candidate position 2 is positioned above CU600.
  • Candidate position 3 is located at the upper right of CU600.
  • Candidate position 4 is positioned at the bottom left of CU600.
  • Candidate position 5 is located at the upper left of CU600.
• Candidate positions 6 to 7 represent temporal candidates associated with the co-located block 602 of CU 600, where the co-located block is the image block in the reference image (i.e., an adjacent coded image) that has the same size, shape, and coordinates as CU 600.
  • the candidate position 6 is located in the lower right corner of the co-located block 602.
  • the candidate position 7 is positioned at the lower right middle position of the co-located block 602, or at the upper left middle position of the co-located block 602.
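As a rough illustration only, the seven positions can be expressed as sample coordinates relative to a block at `(x, y)` of size `w` × `h`. The exact sample offsets used by a real codec are not specified here; these are one plausible reading of the figure:

```python
def candidate_positions(x, y, w, h):
    # Illustrative coordinates for candidate positions 1-7 of FIG. 6.
    return {
        1: (x - 1, y + h - 1),        # 1: left of the block
        2: (x + w - 1, y - 1),        # 2: above the block
        3: (x + w, y - 1),            # 3: upper right
        4: (x - 1, y + h),            # 4: bottom left
        5: (x - 1, y - 1),            # 5: upper left
        6: (x + w, y + h),            # 6: lower right corner of the co-located block
        7: (x + w // 2, y + h // 2),  # 7: center of the co-located block
    }
```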
• FIG. 6 schematically shows exemplary candidate positions from which an inter prediction unit (for example, the motion estimation unit 42 or the motion compensation unit 82) may generate a candidate list.
• candidate positions 1 to 5 of FIG. 6 are exemplary candidate positions from which an intra prediction unit may generate a candidate list.
• the spatial and temporal candidate positions in FIG. 6 are merely illustrative; candidate positions include, but are not limited to, these.
• the positions of the spatial candidates may further include, for example, positions that are within a preset distance from the image block to be processed but are not adjacent to it.
• the embodiment of the present application is applicable not only to the merge prediction mode (Merge) and/or the advanced motion vector prediction mode (AMVP), but also to other modes that predict the motion information of the current image block using spatial reference blocks and/or temporal reference blocks.
  • FIG. 7 is an exemplary flowchart of an image coding method according to an embodiment of the present application.
  • the method may be performed by a video encoder (eg, video encoder 20) or an electronic device (eg, devices 1200, 1300) with video encoding functions.
  • the method may include the following steps:
  • an inter prediction mode with the lowest rate-distortion cost is selected from the set of candidate inter prediction modes as the inter prediction mode of the currently encoded image block;
• if the inter prediction mode of the currently coded image block is merge or skip mode, the candidate motion information list corresponding to the inter prediction mode of the currently coded image block is a merge candidate list (fused motion information candidate list); if the inter prediction mode of the currently coded image block is an inter MVP mode (for example, AMVP mode), the candidate motion information list corresponding to the inter prediction mode of the currently coded image block is a motion vector prediction candidate list (MVP candidate list). The historical candidate list is different from the candidate motion information list.
• using the historical candidate list to construct or update the candidate motion information list of the currently coded image block includes: in order from the tail to the head of the historical candidate list, checking whether the target historical candidate motion information is the same as the existing candidate motion information in the candidate motion information list; and, if different, adding the target historical candidate motion information to the candidate motion information list. The target historical candidate motion information is the historical candidate motion information other than the Q historical candidate motion information entries arranged in order from the tail to the head of the historical candidate list, where Q is a positive integer.
• alternatively, using the historical candidate list to construct or update the candidate motion information list of the currently coded image block includes: determining, based on the inter prediction mode of the image block where the current historical candidate motion information (HMVP) in the historical candidate list is located and/or the inter prediction mode of the image block where a candidate motion information entry in the candidate motion information list is located, whether to perform a repetitive check; and determining, according to the result of the repetitive check between the current historical candidate motion information HMVP and the M existing candidate motion information entries in the candidate motion information list, whether to add the current historical candidate motion information HMVP to the candidate motion information list, where M is an integer greater than or equal to 0.
• alternatively, using the historical candidate list to construct or update the candidate motion information list of the currently coded image block includes: checking whether the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is the first inter prediction mode; if it is the first inter prediction mode, checking whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode; and, if the inter prediction mode of the image block where the current candidate motion information is located is not the first inter prediction mode, checking whether the current historical candidate motion information is the same as the current candidate motion information.
• alternatively, using the historical candidate list to construct or update the candidate motion information list of the currently coded image block includes: checking whether the inter prediction mode of the image block where the current historical candidate motion information is located is the first inter prediction mode; if it is the first inter prediction mode, adding the current historical candidate motion information to the candidate motion information list; and, if it is not the first inter prediction mode, checking whether the current historical candidate motion information is the same as the existing candidate motion information in the candidate motion information list.
• alternatively, using the historical candidate list to construct or update the candidate motion information list of the currently coded image block includes: checking whether the inter prediction mode of the image block where the current historical candidate motion information is located is the first inter prediction mode; if it is, adding the current historical candidate motion information to the candidate motion information list; if the inter prediction mode of the image block where the current historical candidate motion information is located is not the first inter prediction mode and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is the first inter prediction mode, checking whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode; and, if neither the inter prediction mode of the image block where the current historical candidate motion information is located nor the inter prediction mode of the image block where the current candidate motion information is located is the first inter prediction mode, checking whether the current historical candidate motion information is the same as the current candidate motion information.
• alternatively, using the historical candidate list to construct or update the candidate motion information list of the currently coded image block includes: checking whether the inter prediction mode of the image block where the current historical candidate motion information is located is the first inter prediction mode, and whether the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is the first inter prediction mode; if both are the first inter prediction mode, checking whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode; and, if at least one of the two is not the first inter prediction mode, checking whether the current historical candidate motion information is the same as the current candidate motion information.
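The mode-dependent variants above share one idea: the repetitive check is skipped when the source block of the historical candidate (or of an existing candidate) used the first inter prediction mode. A sketch of one such variant follows; `FIRST_MODE` is a placeholder name, since the text does not name a concrete mode, and the tuple layout is illustrative:

```python
FIRST_MODE = "first_mode"  # placeholder; the text only says "first inter prediction mode"

def try_add_hmvp(hmvp_mi, hmvp_mode, candidates, max_size):
    # candidates: list of (motion_info, source_block_mode) tuples.
    if len(candidates) >= max_size:
        return False
    if hmvp_mode == FIRST_MODE:
        # an HMVP from a first-mode block is added without any repetitive check
        candidates.append((hmvp_mi, hmvp_mode))
        return True
    for cand_mi, cand_mode in candidates:
        if cand_mode == FIRST_MODE:
            continue            # first-mode candidates are skipped in the comparison
        if cand_mi == hmvp_mi:
            return False        # duplicate found: do not add
    candidates.append((hmvp_mi, hmvp_mode))
    return True
```

The effect is that comparisons are only performed between candidates whose source blocks used a comparable prediction mode, reducing the number of repetitive checks.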
  • the historical candidate list is updated by using the motion information of the currently encoded image block
• S707: Encode a syntax element used to indicate the inter prediction mode of the currently coded image block into the bitstream.
• if the inter prediction mode of the current image block is the merge mode or the skip mode, a merge index number corresponding to the target candidate motion information is also encoded into the bitstream;
• if the inter prediction mode of the current image block is the inter MVP mode, an index number corresponding to the target candidate motion information (that is, the target candidate motion vector predictor MVP), a reference frame index, and the motion vector difference MVD are also encoded into the bitstream.
  • the embodiment of the present invention may further include:
  • performing inter prediction on the current image block based on the candidate motion information list in step S703 may include:
• target candidate motion information is determined from the candidate motion information list corresponding to the inter prediction mode of the currently coded image block; the target candidate motion information is the candidate that encodes the currently coded image block with the lowest rate-distortion cost.
• the target candidate motion information is the motion information of the currently coded image block (for example, in merge mode); or the motion vector difference MVD of the currently coded image block is determined based on the target candidate motion information and a motion vector of the currently coded image block obtained through motion estimation, in which case the target candidate motion information is the motion vector predictor of the currently coded image block (for example, in AMVP mode);
  • Inter prediction is performed on the current coded image block according to the motion information of the current coded image block to obtain a predicted image (that is, a predicted pixel value) of the current coded image block.
  • step S707 may be performed after step S705, or may be performed before step S705; the above step S702 may be performed after step S701, or may be performed before step S701; the remaining steps are not illustrated here one by one.
• the candidate motion information list (for example, the merge candidate list) includes motion information of a spatial reference block of the currently coded image block (the spatial reference block includes an adjacent block spatially neighboring the currently coded image block) and/or motion information of a temporal reference block of the currently coded image block (the temporal reference block includes a neighboring block at the lower right corner of the co-located block located at the same position as the currently coded image block in a reference frame, or a block at the center position of the co-located block);
  • the candidate motion information list (for example, MVP candidate list) includes a motion vector of a spatial reference block of the current coded image block and / or a motion vector of a time domain reference block of the current coded image block;
• the syntax elements included in the bitstream also include an index number used to indicate the target candidate motion information of the currently coded image block; in other words, an index number corresponding to the target candidate motion information may also be encoded into the bitstream;
• alternatively, the syntax elements included in the bitstream further include an index number used to indicate the target candidate motion information of the currently coded image block and the motion vector difference MVD; in other words, an index number corresponding to the target candidate motion information and the motion vector difference MVD may also be encoded into the bitstream, where the target candidate motion information is the motion vector predictor MVP of the currently coded image block.
  • the historical candidate motion information (indicated by HMVP) in the historical candidate list is accessed or updated in a first-in-first-out manner (as shown in FIG. 9 or 10).
  • the encoding method in the embodiment of the present application may further include:
• a reconstructed image of the currently coded image block is obtained based on the residual image (that is, the residual values) of the currently coded image block and the predicted image (that is, the predicted pixel values) of the currently coded image block obtained through the inter prediction process.
• in the embodiment of the present application, during the process of using the historical candidate list to construct or update the candidate motion information list of the current image block, the latest Q historical candidate motion information entries are skipped. For example, if the inter prediction mode of the currently coded image block is skip/merge mode, the Q historical candidates arranged in order from the tail to the head are skipped when the historical candidate list is used to construct the merge candidate list; if the inter prediction mode of the currently coded image block is the inter MVP mode, the Q historical candidates arranged in order from the tail to the head are skipped when the historical candidate list is used to construct the motion vector prediction candidate list. By contrast, using a historical candidate list (assuming length L) in the prior art to construct a merge candidate list (assuming length K) and a motion vector prediction candidate list (assuming length J) requires L×K and L×J duplicate checks, respectively.
• therefore, the number of duplicate-entry checks performed when historical candidates are added to the merge candidate list or the motion vector prediction candidate list is reduced, and the encoding and decoding time is reduced, which helps improve the efficiency of inter prediction and thus the coding and decoding performance.
  • FIG. 8 is an exemplary flowchart of an image decoding method according to the first embodiment of the present application.
  • the method may be performed by a video decoder (for example, video decoder 30) or an electronic device (for example, devices 1200, 1300) with video decoding functions.
  • the method may include the following steps:
• syntax elements such as skip_flag, merge_flag, and pred_mode are parsed from the bitstream; one or more of skip_flag, merge_flag, and pred_mode are used to indicate the inter prediction mode of the currently decoded image block.
• a cu_skip_flag value of 0 indicates that the inter prediction mode of the current image block is not the skip mode;
• a cu_skip_flag value of 1 indicates that the inter prediction mode of the current image block is the skip mode;
• a pred_mode_flag value of 0 indicates that the prediction mode of the current image block is an inter prediction mode;
• a pred_mode_flag value of 1 indicates that the prediction mode of the current image block is an intra prediction mode;
• a merge_flag value of 0 indicates that the inter prediction mode of the current image block is not the merge mode;
• a merge_flag value of 1 indicates that the inter prediction mode of the current image block is the merge mode.
• if the inter prediction mode of the currently decoded image block is merge or skip mode, the candidate motion information list corresponding to the inter prediction mode of the currently decoded image block is a merge candidate list (fused motion information candidate list); if the inter prediction mode of the currently decoded image block is an inter MVP mode (for example, AMVP mode), the candidate motion information list corresponding to the inter prediction mode of the currently decoded image block is a motion vector prediction candidate list (MVP candidate list). The historical candidate list is different from the candidate motion information list.
• using the historical candidate list to construct or update the candidate motion information list of the currently decoded image block includes: in order from the tail to the head of the historical candidate list, checking whether the target historical candidate motion information is the same as the existing candidate motion information in the candidate motion information list; and, if different, adding the target historical candidate motion information to the candidate motion information list. The target historical candidate motion information is the historical candidate motion information other than the Q historical candidate motion information entries arranged in order from the tail to the head of the historical candidate list, where Q is a positive integer.
• the historical candidate list may be introduced during the process of constructing the candidate motion information list, with the repetitive check of some historical candidates selectively skipped; or, after the candidate motion information list has been generated (for example, in a conventional manner), the historical candidate list may be used to update the candidate motion information list of the current image block. The latter is explained in detail below.
  • generating a candidate list of fused motion information may include:
• spatial and temporal candidates associated with the current block are added to the fused motion information candidate list of the current block.
  • the spatial fusion candidates include A0, A1, B0, B1, and B2, and the time domain fusion candidates include T0 and T1.
  • the candidates for time-domain fusion also include candidates provided by the adaptive time-domain motion vector prediction (ATMVP) technology.
  • the embodiment of the present application does not limit the process related to generating a candidate list of fused motion information. This process may be performed by using a method in HEVC or VTM, or other methods of generating a candidate list of fused motion information.
  • the motion vector prediction candidate list may be generated by using a method in HEVC or VTM, or other methods of generating a motion vector prediction candidate list may be adopted, which is not limited in this embodiment of the present application.
  • Updating the candidate motion information list (fused motion information candidate list or motion vector prediction candidate list) of the current image block using the historical candidate list may include one of the following methods:
• the most recent Q historical candidates are skipped, where the preset number Q is a positive integer greater than 0.
• each remaining historical candidate is checked against the fused motion information candidates in the fused motion information candidate list obtained in S8031; if different, it is added to the fused motion information candidate list, and if the same, the next historical candidate in the historical candidate list is checked.
  • the preset number P is a positive integer greater than 0.
• the most recent Q historical candidates are skipped, where the preset number Q is a positive integer greater than 0.
  • the historical candidates in the historical candidate list are not added to the motion vector prediction candidate list.
  • the historical candidates in the historical candidate list are not added to the fusion motion information candidate list.
  • the preset number P is a positive integer greater than 0.
• the historical candidates in the historical candidate list are added to the fused motion information candidate list and the motion vector prediction candidate list, and may be placed after the temporal motion information candidates (temporal motion vector candidates) or at other positions; the present application does not limit their positions in the two lists. It should be understood that the historical candidates may be placed after the temporal motion information candidates during the construction of the candidate motion information list, or they may be placed after the existing spatial and temporal motion information candidates after the construction of the candidate motion information list is completed; this application does not limit this.
• after the historical candidates are added to the fused motion information candidate list, other types of fusion candidates may be added, such as bi-predictive merge candidates and zero motion vector merge candidates.
• the embodiment of the present application does not limit the process of adding other types of fusion candidates; this process may be performed using a method in HEVC or VTM, or other methods.
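Putting S8031 and S8032 together, one possible construction order (spatial, temporal, then historical candidates with the Q newest skipped, then zero-MV padding) can be sketched as follows; candidate values are plain tuples for illustration, and real codecs use more elaborate pruning:

```python
def build_merge_list(spatial, temporal, historical, q_skip, max_size):
    merge_list = []
    # 1) spatial then temporal candidates (duplicates pruned)
    for cand in spatial + temporal:
        if cand not in merge_list and len(merge_list) < max_size:
            merge_list.append(cand)
    # 2) historical candidates, tail to head, skipping the Q most recent entries
    for cand in list(reversed(historical))[q_skip:]:
        if len(merge_list) >= max_size:
            break
        if cand not in merge_list:
            merge_list.append(cand)
    # 3) pad with zero motion vector candidates
    while len(merge_list) < max_size:
        merge_list.append((0, 0))
    return merge_list

ml = build_merge_list(spatial=[(1, 0)], temporal=[(2, 0)],
                      historical=[(3, 0), (1, 0), (4, 0)], q_skip=1, max_size=5)
print(ml)   # [(1, 0), (2, 0), (3, 0), (0, 0), (0, 0)]
```

Here the newest historical entry `(4, 0)` is skipped (`q_skip=1`) and `(1, 0)` is pruned as a duplicate of a spatial candidate, illustrating how skipping the Q newest entries reduces duplicate checks.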
• S8033: Determine target candidate motion information from the candidate motion information list, predict the motion information of the currently decoded image block (referred to as the current block) based on the target candidate motion information, and perform motion compensation based on the motion information of the currently decoded image block to obtain the predicted pixel values of the current image block; in other words, perform inter prediction on the currently decoded image block according to its motion information to obtain the predicted image (that is, the predicted pixel values) of the currently decoded image block.
• the target candidate motion information is determined from the candidate motion information list according to an index identifier (such as a merge index or a motion vector predictor index) parsed from the bitstream; the index identifier is used to indicate the target candidate motion information in the candidate motion information list.
• if the current block is in merge/skip mode, the target candidate motion information indicated by the merge index carried in the bitstream is the motion information of the currently decoded image block; or, if the current block is in the inter MVP mode, the target candidate motion information indicated by the motion vector predictor index carried in the bitstream is a motion vector predictor, and the motion information of the currently decoded image block is determined based on the motion vector predictor and the motion vector residual value MVD of the current image block parsed from the bitstream (together with the inter prediction direction, the reference frame index, and so on).
• if the current block has a residual, the residual information and the predicted image are added to obtain a reconstructed image of the current block; if the current block has no residual, the predicted image is the reconstructed image of the current block.
  • the embodiment of the present invention may further include:
• obtaining the reconstructed image of the currently decoded image block; for example, the predicted image and the residual image are added to obtain the reconstructed image of the current block.
  • the embodiment of the present invention may further include:
• S802: During the decoding process of the currently decoded image block, load a historical candidate list, so as to use the historical candidate list to construct or update the candidate motion information list of the current image block; the length of the historical candidate list is L, where L is a positive integer greater than 0.
• for the initialization process of the historical candidate list, refer to the prior art; for example, at the beginning of a slice, the historical candidate list may be emptied. Other methods for initializing the historical candidate list may also be adopted, which is not limited in this application.
  • step S802 may be performed after step S801, or may be performed before step S801; the remaining steps are not illustrated here one by one.
  • updating the historical candidate list using the motion information of the currently decoded image block as described in step S805 includes:
• if the motion information of the currently decoded image block (referred to as the current block) is the same as the Xth historical candidate motion information in the historical candidate list, the Xth historical candidate motion information is removed from the historical candidate list, and the motion information of the currently decoded image block is added to the historical candidate list as the latest historical candidate motion information;
• if the motion information of the currently decoded image block is different from all of the historical candidate motion information entries in the historical candidate list, the motion information of the currently decoded image block is added to the historical candidate list as the latest historical candidate motion information.
• it should be noted that the method for judging whether the motion information of the current block is the same as a given historical candidate motion information entry in the historical candidate list is not limited: the two pieces of motion information may be required to be completely identical, or to be identical after some processing, for example, identical after the two motion vectors are each shifted right by 2 bits.
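The relaxed comparison mentioned above (equality after a 2-bit right shift) can be sketched as follows, with motion vectors as plain `(x, y)` tuples:

```python
def same_motion(mv_a, mv_b, shift=0):
    # shift=0: exact equality; shift=2: components compared after a
    # 2-bit right shift, as in the example in the text.
    return (mv_a[0] >> shift, mv_a[1] >> shift) == (mv_b[0] >> shift, mv_b[1] >> shift)

print(same_motion((13, 7), (14, 4)))            # False: not exactly equal
print(same_motion((13, 7), (14, 4), shift=2))   # True: 13>>2 == 14>>2 and 7>>2 == 4>>2
```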
  • updating the historical candidate list using the motion information of the currently decoded image block as described in step S805 includes:
• if the current size of the historical candidate list has reached a preset list size, removing the earliest added historical candidate motion information from the historical candidate list, and adding the motion information of the currently decoded image block to the historical candidate list as the latest historical candidate motion information (the last candidate).
• for example, the motion information of the current block is first compared with the historical candidates in the historical candidate list; if a historical candidate is the same as the motion information of the current block, that historical candidate is removed from the historical candidate list. The size of the historical candidate list is then checked; if the list size exceeds a preset size, the historical candidate at the head of the list is removed. Finally, the motion information of the current block is added to the historical candidate list.
• alternatively, the size of the historical candidate list is checked first. If the size of the historical candidate list does not exceed a preset list size (also called a list length or table size), the motion information of the current block is added to the tail of the historical candidate list as the latest historical candidate motion information; if the size of the historical candidate list has reached the preset list size, the historical candidate motion information at the head of the historical candidate list is removed, and the motion information of the current block is added to the tail of the historical candidate list as the latest historical candidate motion information.
• the historical candidate list is then used to construct or update the candidate motion information list of the current image block.
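The update procedure described above (duplicate removal, FIFO eviction at a preset list size, and a relaxed equality test) can be sketched roughly as follows. This is an illustrative sketch only, not the normative algorithm: the list size of 6, the tuple representation of motion information, and the function names are assumptions.

```python
HMVP_LIST_SIZE = 6  # preset list size (list length / table size); value assumed

def same_motion(a, b):
    # Relaxed comparison mentioned above: two motion vectors are considered
    # the same if they match after a right shift by 2 bits.
    return (a[0] >> 2, a[1] >> 2) == (b[0] >> 2, b[1] >> 2)

def update_hmvp_list(hmvp_list, current_mv):
    # Remove an identical historical candidate, if any, so the list stays unique.
    hmvp_list = [c for c in hmvp_list if not same_motion(c, current_mv)]
    # FIFO eviction: the head of the list holds the earliest-added candidate.
    if len(hmvp_list) >= HMVP_LIST_SIZE:
        hmvp_list.pop(0)
    # The current block's motion information becomes the latest (tail) candidate.
    hmvp_list.append(current_mv)
    return hmvp_list
```

Here `same_motion` implements the optional relaxed comparison described above; an exact comparison could be substituted without changing the list-maintenance logic.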
• in some embodiments, in the process of using the historical candidate list to construct or update the candidate motion information list of the current image block, the latest Q historical candidate motion information entries are skipped. For example, if the inter prediction mode of the currently decoded image block is the skip/merge mode, when the fusion motion information candidate list is constructed using the historical candidate list, the Q historical candidates arranged in order from the tail to the head are skipped;
• if the inter prediction mode of the currently decoded image block is the inter MVP mode, when the motion vector prediction candidate list is constructed using the historical candidate list, the Q historical candidates arranged in order from the tail to the head are likewise skipped.
• in the prior art, when the historical candidate list (assuming length L) is used to construct a fusion motion information candidate list (assuming length K) and a motion vector prediction candidate list (assuming length J), L×K and L×J duplicate item searches are required. Compared with the prior art, the number of duplicate check operations when the historical candidate list is added to the fusion motion information candidate list or the motion vector prediction candidate list is reduced, and the encoding and decoding time is reduced, thereby helping to improve the efficiency of inter prediction and thus the encoding and decoding performance.
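The tail-skipping behavior described above might be sketched as follows; `MERGE_LIST_SIZE`, `Q`, and the exact-match duplicate check are illustrative assumptions, not values fixed by this description.

```python
MERGE_LIST_SIZE = 6  # assumed maximum merge list length
Q = 2                # assumed number of latest (tail) historical candidates to skip

def add_hmvp_to_merge_list(merge_list, hmvp_list):
    # Traverse the historical candidate list from tail to head, but skip the
    # Q latest entries: they are the most likely to duplicate the candidates
    # already in the list, so skipping them saves duplicate checks.
    usable = hmvp_list[:max(len(hmvp_list) - Q, 0)]
    for cand in reversed(usable):
        if len(merge_list) >= MERGE_LIST_SIZE:
            break
        if cand not in merge_list:   # duplicate check against existing entries
            merge_list.append(cand)  # add at the tail of the merge list
    return merge_list
```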
• the main difference between the second embodiment of the present application and the previous embodiment is that the coding mode of the coding block where each candidate in the fusion motion information candidate list or the motion vector prediction candidate list is located is used to skip the corresponding duplicate check.
• specifically, the coding mode of the coding block (CU) where each fusion motion information candidate is located is recorded.
• for brevity, the coding mode of the coding block where a fusion motion information candidate is located is called the coding mode of that fusion motion information candidate.
• the coding mode of a fusion motion information candidate in this embodiment refers to the inter prediction mode of the image block where the fusion motion information candidate is located.
• similarly, the coding mode of the coding block (CU) where each motion vector prediction candidate is located is recorded.
• the coding mode of the coding block where a motion vector prediction candidate is located is called the coding mode of that motion vector prediction candidate.
• the coding mode of a motion vector prediction candidate in this embodiment refers to the inter prediction mode of the image block where the motion vector prediction candidate is located.
  • Updating the candidate motion information list (fused motion information candidate list or motion vector prediction candidate list) of the current image block using the historical candidate list may include one of the following methods:
• in the order from the tail to the head of the historical candidate list, a preset number Q of historical candidates are first skipped, where Q is an integer greater than or equal to 0. Then, for each of a preset number B of historical candidates, the following operations are performed, where B is a positive integer greater than 0:
• before checking whether the historical candidate is the same as a certain fusion motion information candidate in the fusion motion information candidate list obtained in step S8031 (that is, before performing a duplicate check), first check whether the coding mode of the fusion motion information candidate is the inter MVP mode. If the coding mode of the fusion motion information candidate is the inter MVP mode, do not check whether the fusion motion information candidate is the same as the historical candidate, or consider the fusion motion information candidate to be different from the historical candidate; if the coding mode of the fusion motion information candidate is not the inter MVP mode, check whether the fusion motion information candidate and the historical candidate are the same.
• if the duplicate check between the historical candidate and the fusion motion information candidate list indicates that the historical candidate is the same as some fusion motion information candidate, no operation is performed; otherwise, the historical candidate is added to the fusion motion information candidate list, for example, to the tail of the fusion motion information candidate list.
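As a rough illustration of this first method, assume each entry of the fusion motion information candidate list is a (motion information, coding mode) pair and `INTER_MVP` is a hypothetical mode label; the mode-aware duplicate check might then look like:

```python
INTER_MVP = "inter_mvp"  # hypothetical label for the inter MVP coding mode

def hist_is_duplicate(hist_mv, merge_list):
    for mv, mode in merge_list:
        if mode == INTER_MVP:
            # The fusion candidate was coded in inter MVP mode: skip the
            # comparison and treat the two candidates as different.
            continue
        if mv == hist_mv:
            return True
    return False

def add_historical_candidate(merge_list, hist_mv, hist_mode):
    # Append the historical candidate to the tail only if the mode-aware
    # duplicate check found no identical fusion candidate.
    if not hist_is_duplicate(hist_mv, merge_list):
        merge_list.append((hist_mv, hist_mode))
    return merge_list
```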
• in the order from the tail to the head of the historical candidate list, a preset number Q of historical candidates are first skipped, where Q is an integer greater than or equal to 0. Then, for each of a preset number B of historical candidates, the following operations are performed, where B is a positive integer greater than 0:
• check whether the coding mode of the historical candidate is the inter MVP mode. If it is, directly add the historical candidate to the fusion motion information candidate list (that is, there is no need to check whether the historical candidate is the same as any fusion motion information candidate, or the historical candidate is considered different from the fusion motion information candidates). If it is not, continue to the next step: before checking whether the historical candidate is the same as a certain fusion motion information candidate in the fusion motion information candidate list obtained in step S8031 (that is, before performing a duplicate check), first check whether the coding mode of the fusion motion information candidate is the inter MVP mode.
• if the coding mode of the historical candidate is not the inter MVP mode and the coding mode of the fusion motion information candidate is the inter MVP mode, there is no need to check whether the fusion motion information candidate is the same as the current historical candidate, or the two candidates are considered different.
• if the coding mode of the historical candidate is not the inter MVP mode and the coding mode of the fusion motion information candidate is not the inter MVP mode, check whether the fusion motion information candidate is the same as the historical candidate. Note that after the motion information of the current image block is predicted, it will also be used as a historical candidate.
• during decoding, the decoder parses the coding mode of the current image block from the code stream.
• accordingly, both the motion information of the current block and the coding mode information of the current block are stored as information of the historical candidate.
• if the duplicate check between the historical candidate and the fusion motion information candidate list indicates that the historical candidate is the same as some fusion motion information candidate, no operation is performed; otherwise, the historical candidate is added to the fusion motion information candidate list, for example, to the tail of the fusion motion information candidate list.
• in this way, when the inter prediction mode of the image block where the current historical candidate is located, or the inter prediction mode of the image block where the current candidate in the candidate motion information list is located, is the inter MVP mode, the two (that is, the current historical candidate and the current candidate in the candidate motion information list) are highly likely to be different from each other, and the duplicate check between them is skipped.
• in the order from the tail to the head of the historical candidate list, a preset number Q of historical candidates are first skipped, where Q is an integer greater than or equal to 0. Then, for each of a preset number B of historical candidates, the following operations are performed, where B is a positive integer greater than 0:
• check whether the coding mode of the historical candidate is the inter MVP mode. If it is, directly add the historical candidate to the fusion motion information candidate list (the historical candidate is considered different from the fusion motion information candidates by default); if it is not, continue to the next step, that is, check whether the historical candidate is the same as a certain fusion motion information candidate in the fusion motion information candidate list obtained in step S8031 (that is, perform a duplicate check).
• if the duplicate check between the historical candidate and the fusion motion information candidate list indicates that the historical candidate is the same as some fusion motion information candidate, no operation is performed; otherwise, the historical candidate is added to the fusion motion information candidate list, for example, to the tail of the fusion motion information candidate list.
• in this way, when the inter prediction mode of the image block where the current historical candidate is located is the inter MVP mode, the two (that is, the current historical candidate and the current candidate in the candidate motion information list) are highly likely to be different from each other, and the duplicate check between them is skipped.
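A sketch of this variant, which inspects only the historical candidate's own coding mode (same hypothetical (motion information, coding mode) representation and `INTER_MVP` label as before):

```python
INTER_MVP = "inter_mvp"  # hypothetical label for the inter MVP coding mode

def add_historical_candidate_by_hist_mode(merge_list, hist_mv, hist_mode):
    if hist_mode == INTER_MVP:
        # The historical candidate was coded in inter MVP mode: add it
        # directly, treating it as different from all fusion candidates.
        merge_list.append((hist_mv, hist_mode))
        return merge_list
    # Otherwise perform an ordinary duplicate check before adding.
    if all(mv != hist_mv for mv, _ in merge_list):
        merge_list.append((hist_mv, hist_mode))
    return merge_list
```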
• in the order from the tail to the head of the historical candidate list, a preset number Q of historical candidates are first skipped, where Q is an integer greater than or equal to 0.
• then, for each of a preset number B of historical candidates, the following operations are performed, where B is a positive integer greater than 0:
• before checking whether the historical candidate is the same as a certain fusion motion information candidate in the fusion motion information candidate list obtained in step S8031 (that is, before performing a duplicate check), first check whether the coding modes of the fusion motion information candidate and the historical candidate are both the inter MVP mode. If both coding modes are the inter MVP mode, there is no need to check whether the fusion motion information candidate and the historical candidate are the same, or the two are considered different in the duplicate check. Conversely, if at least one of the two coding modes is not the inter MVP mode, check whether the fusion motion information candidate is the same as the historical candidate.
• if the duplicate check between the historical candidate and the fusion motion information candidate list indicates that the historical candidate is the same as some fusion motion information candidate, no operation is performed; otherwise, the historical candidate is added to the fusion motion information candidate list, for example, to the tail of the fusion motion information candidate list.
• in this way, when both the inter prediction mode of the image block where the current historical candidate is located and the inter prediction mode of the image block where the current candidate in the candidate motion information list is located are the inter MVP mode, the two (that is, the current historical candidate and the current candidate in the candidate motion information list) are considered different from each other, and the duplicate check between them is skipped.
• here, the coding mode of a fusion motion information candidate is the inter prediction mode of the image block where the fusion motion information candidate is located;
• the coding mode of a historical candidate is the inter prediction mode of the image block where the historical candidate is located.
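The last variant skips the comparison only when both candidates come from blocks coded in the inter MVP mode; a sketch under the same assumed representation and hypothetical `INTER_MVP` label:

```python
INTER_MVP = "inter_mvp"  # hypothetical label for the inter MVP coding mode

def hist_is_duplicate_both_modes(hist_mv, hist_mode, merge_list):
    for mv, mode in merge_list:
        if hist_mode == INTER_MVP and mode == INTER_MVP:
            # Both the historical candidate and the fusion candidate come
            # from inter-MVP-coded blocks: treat them as different and
            # skip the comparison.
            continue
        if mv == hist_mv:
            return True
    return False
```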
• when using the historical candidate list to update the motion vector prediction candidate list of the current image block, in addition to performing a similar process on the historical candidates whose reference frame index is consistent with the target reference frame index, one of the following methods may also be included:
• in the order from the tail to the head of the historical candidate list, a preset number P of historical candidates are checked and added, where P is a positive integer greater than 0;
• alternatively, in the order from the tail to the head of the historical candidate list, a preset number Q of historical candidates are first skipped, where Q is an integer greater than or equal to 0, and then a preset number P of historical candidates are checked and added, where P is a positive integer greater than 0;
• alternatively, the historical candidates in the historical candidate list are not added to the motion vector prediction candidate list.
• when the historical candidates in the historical candidate list are added to the fusion motion information candidate list and the motion vector prediction candidate list, they can be placed after the temporal motion information candidate (temporal motion vector candidate), or at other positions;
• the embodiments of the present application do not limit their positions in the two lists.
• for the fusion motion information candidate list, after the historical candidates are added, other types of fusion candidates may also be added, such as the bi-predictive merge candidate and the zero motion vector merge candidate.
• the present invention does not restrict the process of adding other types of fusion candidates; the process may be performed by a method in HEVC or VTM, or by other methods.
• for example, the motion information of the current block is first compared with the historical candidates in the historical candidate list; if a historical candidate is the same as the motion information of the current block, that historical candidate is removed from the historical candidate list. Then, the size of the historical candidate list is checked; if the list size exceeds a preset size, the historical candidate at the head of the list is removed. Finally, the motion information of the current block is added to the historical candidate list.
• in addition, when a new historical candidate is added, the inter prediction mode of the image block where the new historical candidate is located needs to be recorded as well. In other words, each historical candidate added to the historical candidate list records not only the MV (motion information) but also the inter prediction mode of the image block where the historical candidate is located.
• the method for determining whether the motion information of the current block is the same as a certain historical candidate in the historical candidate list is not limited here. The two pieces of motion information may be required to be completely identical, or to be identical after some processing; for example, the two motion vectors are considered the same if their results after a right shift by 2 bits are the same.
• in this embodiment, the historical candidate list is used to construct or update the fusion motion information candidate list and the motion vector prediction candidate list according to the inter prediction modes of the image blocks where the historical candidates, the fusion motion information candidates, and the motion vector prediction candidates are located,
• so that the duplicate check process for some historical candidates can be selectively skipped. This further reduces the number of duplicate check operations when the historical candidate list is added to the fusion motion information candidate list or the motion vector prediction candidate list, reducing codec time and thereby helping to improve the efficiency of inter prediction and the codec performance.
• FIG. 11A and FIG. 11B illustrate a method of adding historical candidates to the fusion motion information candidate list to increase the number of merge/skip fusion motion information candidates and improve prediction efficiency.
• the construction method of the fusion motion information candidate list with historical candidates added is as follows:
• steps 1111 and 1113: add the spatial candidates adjacent to the current block and the temporal candidates to the fusion motion information candidate list of the current block.
• a fusion motion information candidate list is generated; if the current CU or currently decoded image block is in inter mode, a motion vector prediction candidate list is generated. The historical candidates in the historical candidate list are then added to the fusion motion information candidate list or the motion vector prediction candidate list.
  • generating a candidate list of fused motion information specifically includes:
• the spatial and temporal candidates of the current block are added to the fusion motion information candidate list of the current block; the method is the same as that in HEVC.
• the spatial fusion candidates include A0, A1, B0, B1, and B2, and the temporal fusion candidates include T0 and T1.
• the temporal fusion candidates may also include candidate motion information provided by the adaptive temporal motion vector prediction (ATMVP) technology.
  • the process of generating the candidate list of fused motion information may be performed by a method in HEVC or VTM, or other methods of generating the candidate list of fused motion information, which are not limited in this application.
• the motion vector prediction candidate list may be generated by using the method in HEVC (High Efficiency Video Coding) or VTM, or by other methods of generating a motion vector prediction candidate list.
• Step 1131: when the historical candidate list is added to the fusion motion information candidate list or the motion vector prediction candidate list, the preset number Q of historical candidates at the tail of the historical candidate list are skipped, or the duplicate check process is skipped according to the inter prediction modes of the historical candidates, the fusion motion information candidates, and the motion vector prediction candidates,
• thereby reducing the number of duplicate check operations when the historical candidate list is added to the fusion motion information candidate list or the motion vector prediction candidate list.
• Step 1135: add other types of fusion motion information candidates, such as bi-predictive candidates and zero motion vector candidates.
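The overall flow of steps 1111 to 1135 can be summarized in a non-normative sketch; the list size, the value of Q, and the use of plain zero motion vectors as padding in step 1135 are assumptions for illustration:

```python
MERGE_LIST_SIZE = 6  # assumed maximum merge list length
Q = 2                # assumed number of tail historical candidates to skip

def build_merge_list(spatial, temporal, hmvp_list):
    merge_list = []
    # Steps 1111 and 1113: spatial then temporal candidates, with duplicate checks.
    for cand in spatial + temporal:
        if len(merge_list) < MERGE_LIST_SIZE and cand not in merge_list:
            merge_list.append(cand)
    # Step 1131: historical candidates, tail to head, skipping the Q latest.
    for cand in reversed(hmvp_list[:max(len(hmvp_list) - Q, 0)]):
        if len(merge_list) >= MERGE_LIST_SIZE:
            break
        if cand not in merge_list:
            merge_list.append(cand)
    # Step 1135: pad with other candidate types; zero motion vectors here
    # stand in for the bi-predictive and zero-MV candidates of the text.
    while len(merge_list) < MERGE_LIST_SIZE:
        merge_list.append((0, 0))
    return merge_list
```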
• it should be noted that the candidate motion information list is at the image block level (different image blocks have their own corresponding candidate motion information lists), while the historical candidate list is at the level of the current slice, the current image, or the current one or more image coding tree units (CTUs); in other words, during the encoding or decoding of multiple image blocks within the same slice, this continuously updated historical candidate list can be used.
• FIG. 12 is a schematic block diagram of an inter prediction apparatus 1200 according to an embodiment of the present application. It should be noted that the inter prediction apparatus 1200 is applicable both to the inter prediction of decoded video images and to the inter prediction of encoded video images. It should be understood that the inter prediction apparatus 1200 here may correspond to the inter prediction unit 43 in FIG. 2 or to the inter prediction unit 82 in FIG. 3, and the inter prediction apparatus 1200 may include:
• the candidate motion information list determining unit 1201 is configured to use the historical candidate list to construct or update the candidate motion information list of the current image block, wherein using the historical candidate list to construct or update the candidate motion information list of the current image block includes: checking, in the order from the tail to the head of the historical candidate list, whether the target historical candidate motion information is the same as existing candidate motion information in the candidate motion information list; and if they are different, adding the target historical candidate motion information to the candidate motion information list; the target historical candidate motion information is historical candidate motion information other than the Q historical candidate motion information entries arranged in order from the tail to the head in the historical candidate list, where Q is a positive integer;
  • the inter prediction processing unit 1202 is configured to perform inter prediction on the current image block based on the candidate motion information list.
  • the inter prediction processing unit 1202 here may correspond to the motion estimation unit 42 and the motion compensation unit 44 in FIG. 2, or may correspond to the motion compensation unit 82 in FIG. 3.
• the target historical candidate motion information is the Xth historical candidate motion information other than the Q historical candidate motion information entries arranged in order from the tail to the head in the historical candidate list, and the reference frame index corresponding to the Xth historical candidate motion information is the same as the target reference frame index.
  • the device is configured to decode a current image block, and the target reference frame index is a reference frame index of the current image block parsed from a code stream.
• the candidate motion information list determining unit 1201 is specifically configured to:
• in the case where the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is the first inter prediction mode, skipping the check of whether the target historical candidate motion information is the same as the current candidate motion information (or considering the two to be different); and
• in the case where the inter prediction mode of the image block where the current candidate motion information is located is not the first inter prediction mode, checking whether the target historical candidate motion information is the same as the current candidate motion information.
• the candidate motion information list determining unit 1201 is specifically configured to:
• in the case where the inter prediction mode of the image block where the target historical candidate motion information is located is the first inter prediction mode, adding the target historical candidate motion information to the candidate motion information list;
• in the case where the inter prediction mode of the image block where the target historical candidate motion information is located is not the first inter prediction mode, checking whether the target historical candidate motion information is the same as existing candidate motion information in the candidate motion information list.
• the candidate motion information list determining unit 1201 is specifically configured to:
• in the case where the inter prediction mode of the image block where the target historical candidate motion information is located is the first inter prediction mode, adding the target historical candidate motion information to the candidate motion information list;
• in the case where the inter prediction mode of the image block where the target historical candidate motion information is located is not the first inter prediction mode and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is the first inter prediction mode, checking whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode;
• in the case where the inter prediction mode of the image block where the target historical candidate motion information is located is not the first inter prediction mode and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is not the first inter prediction mode, checking whether the target historical candidate motion information is the same as the current candidate motion information.
• the candidate motion information list determining unit 1201 is specifically configured to:
• in the case where both the inter prediction mode of the image block where the target historical candidate motion information is located and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located are the first inter prediction mode, checking whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode;
• in the case where at least one of the inter prediction mode of the image block where the target historical candidate motion information is located and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is not the first inter prediction mode, checking whether the target historical candidate motion information is the same as the current candidate motion information.
  • the apparatus is configured to encode the current image block
  • the inter prediction processing unit 1202 is specifically configured to:
  • the apparatus is configured to decode the current image block
  • the inter prediction processing unit 1202 is specifically configured to:
  • the device further includes:
  • An update unit (not shown) is used to update the historical candidate list by using the motion information of the current image block.
• the candidate motion information list is a fusion motion information candidate list (merge candidate list).
  • the candidate motion information list is a motion vector prediction candidate list (MVP candidate list, for example, AMVP candidate list).
• in the foregoing solution, in the process of using the historical candidate list to construct or update the candidate motion information list of the current image block, the latest Q historical candidate motion information entries are skipped. For example, if the inter prediction mode of the currently decoded image block is the skip/merge mode, when the fusion motion information candidate list is constructed using the historical candidate list, the Q historical candidates arranged in order from the tail to the head are skipped;
• if the inter prediction mode of the currently decoded image block is the inter MVP mode, when the motion vector prediction candidate list is constructed using the historical candidate list, the Q historical candidates arranged in order from the tail to the head are likewise skipped. In the prior art, when the historical candidate list (assuming length L) is used to construct a fusion motion information candidate list (assuming length K) and a motion vector prediction candidate list (assuming length J), L×K and L×J duplicate item searches are required.
• compared with the prior art, the number of duplicate check operations when the historical candidate list is added to the fusion motion information candidate list or the motion vector prediction candidate list is reduced, and the encoding and decoding time is reduced, thereby helping to improve the efficiency of inter prediction and thus the encoding and decoding performance.
• FIG. 13 is a schematic block diagram of an inter prediction apparatus 1300 according to an embodiment of the present application. It should be noted that the inter prediction apparatus 1300 is applicable both to the inter prediction of decoded video images and to the inter prediction of encoded video images. It should be understood that the inter prediction apparatus 1300 here may correspond to the inter prediction unit 43 in FIG. 2 or to the inter prediction unit 82 in FIG. 3, and the inter prediction apparatus 1300 may include:
• the candidate motion information list determining unit 1301 is configured to use the historical candidate list to construct or update the candidate motion information list of the current image block, wherein using the historical candidate list to construct or update the candidate motion information list of the current image block includes: determining, according to the inter prediction mode of the image block where the current historical candidate motion information (HMVP) in the historical candidate list is located and/or the inter prediction mode of the image block where existing candidate motion information in the candidate motion information list is located, whether to perform a duplicate check; and determining, according to the duplicate check result between the current historical candidate motion information and M existing candidate motion information entries in the candidate motion information list, whether to add the current historical candidate motion information to the candidate motion information list, where M is an integer greater than or equal to 0;
  • the inter prediction processing unit 1302 is configured to perform inter prediction on the current image block based on the candidate motion information list.
• the inter prediction processing unit 1302 here may correspond to the motion estimation unit 42 and the motion compensation unit 44 in FIG. 2, or may correspond to the motion compensation unit 82 in FIG. 3.
  • the candidate motion information list determination unit 1301 is specifically configured to:
• in the case where the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is the first inter prediction mode, skipping the check of whether the current historical candidate motion information is the same as the current candidate motion information (or considering the two to be different); and
• in the case where the inter prediction mode of the image block where the current candidate motion information is located is not the first inter prediction mode, checking whether the current historical candidate motion information is the same as the current candidate motion information.
  • the candidate motion information list determination unit 1301 is specifically configured to:
• in the case where the inter prediction mode of the image block where the current historical candidate motion information is located is the first inter prediction mode, adding the current historical candidate motion information to the candidate motion information list;
• in the case where the inter prediction mode of the image block where the current historical candidate motion information is located is not the first inter prediction mode, checking whether the current historical candidate motion information is the same as existing candidate motion information in the candidate motion information list.
  • the candidate motion information list determination unit 1301 is specifically configured to:
• in the case where the inter prediction mode of the image block where the current historical candidate motion information is located is the first inter prediction mode, adding the current historical candidate motion information to the candidate motion information list;
• in the case where the inter prediction mode of the image block where the current historical candidate motion information is located is not the first inter prediction mode and the inter prediction mode of the image block where the current candidate motion information in the candidate motion information list is located is the first inter prediction mode, checking whether the inter prediction mode of the image block where the next candidate motion information in the candidate motion information list is located is the first inter prediction mode;
• in the case where the inter prediction mode of the image block where the current historical candidate motion information is located is not the first inter prediction mode and the inter prediction mode of the image block where the current candidate motion information is located is not the first inter prediction mode, checking whether the current historical candidate motion information is the same as the current candidate motion information.
  • the candidate motion information list determination unit 1301 is specifically configured to:
  • If both the inter prediction mode of the image block in which the current historical candidate motion information is located and the inter prediction mode of the image block in which the current candidate motion information in the candidate motion information list is located are the first inter prediction mode, it is checked whether the inter prediction mode of the image block in which the next candidate motion information in the candidate motion information list is located is the first inter prediction mode;
  • If at least one of the inter prediction mode of the image block in which the current historical candidate motion information is located and the inter prediction mode of the image block in which the current candidate motion information in the candidate motion information list is located is not the first inter prediction mode, it is checked whether the current historical candidate motion information is the same as the current candidate motion information.
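The mode-dependent rule in the fragments above can be sketched in Python. This is an illustrative sketch only, not the claimed implementation: the `Candidate` class, the `FIRST_MODE` value (`"affine"` is used purely as a stand-in for the "first inter prediction mode"), and the tuple representation of motion information are all assumptions introduced for the example.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the "first inter prediction mode" of the claims.
FIRST_MODE = "affine"

@dataclass(frozen=True)
class Candidate:
    motion_info: tuple  # e.g. (mv_x, mv_y, ref_idx) -- assumed layout
    inter_mode: str     # inter prediction mode of the source image block

def add_historical_candidate(cand_list, hist, max_len):
    """Append a historical candidate to the candidate motion information list,
    skipping duplicate checks where the described rule allows it."""
    if len(cand_list) >= max_len:
        return False
    # If the historical candidate's block used the first inter prediction
    # mode, it is added directly, with no duplicate check.
    if hist.inter_mode == FIRST_MODE:
        cand_list.append(hist)
        return True
    # Otherwise, compare only against entries whose source block was NOT
    # coded in the first inter prediction mode; entries that were can be
    # skipped without an equality check.
    for c in cand_list:
        if c.inter_mode != FIRST_MODE and c.motion_info == hist.motion_info:
            return False  # duplicate found, do not add
    cand_list.append(hist)
    return True
```

In this sketch, every entry coded in the first inter prediction mode is exempt from comparison, so the number of equality tests falls with the share of such entries in the list.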
  • the apparatus is configured to encode the current image block
  • the inter prediction processing unit 1302 is specifically configured to:
  • the apparatus is configured to decode the current image block
  • the inter prediction processing unit 1302 is specifically configured to:
  • the device further includes:
  • An update unit (not shown) is used to update the historical candidate list by using the motion information of the current image block.
  • the candidate motion information list is a merge motion information candidate list (merge candidate list)
  • the candidate motion information list is a motion vector prediction candidate list (MVP candidate list, for example, AMVP candidate list).
  • a merge motion information candidate list (of assumed length K) and a motion vector prediction candidate list (of assumed length J) are constructed.
  • L × K and L × J duplicate-item checks are required, where L is the length of the historical candidate list.
  • the inter prediction mode of the image block in which the candidate is located is recorded for historical candidates, merge motion information candidates, and/or motion vector prediction candidates (for example, for inter MVP mode, the motion vector of a surrounding block is used as the predicted value, and the MVD is added to the predicted value to obtain the motion vector).
  • the historical candidate list is used to construct or update the merge motion information candidate list.
  • the duplicate-item check for some historical candidates can be selectively skipped, which further reduces the number of duplicate-item check operations performed when the historical candidate list is added to the merge motion information candidate list or the motion vector prediction candidate list, reducing codec time and helping to improve inter prediction efficiency and coding performance.
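The tail-to-head traversal with a reduced number of duplicate checks (as summarized in the abstract) can be sketched as follows. This is a hedged illustration of the idea, not the patented method: the function name, the tuple representation of motion information, and the assumption that the Q check-exempt entries are appended without a check are all introduced here for the example.

```python
def add_history_to_merge_list(hist_list, merge_list, q, max_len):
    """Add historical candidates to a merge candidate list, traversing the
    historical list from tail (most recently added) to head.  The duplicate
    check is applied only to "target" candidates, i.e. entries other than Q
    of them counted from the tail; how the Q exempt entries are handled is
    not specified by the abstract, and this sketch simply appends them."""
    out = list(merge_list)
    for i, hist in enumerate(reversed(hist_list)):  # tail -> head
        if len(out) >= max_len:
            break
        is_target = i >= q  # beyond the Q entries at the tail
        if is_target and hist in out:
            continue  # duplicate among existing candidates: do not add
        out.append(hist)
    return out
```

Skipping the check for Q of the L historical entries removes up to Q × K of the L × K comparisons mentioned above; the trade-off is that an exempt entry is admitted without verifying uniqueness.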
  • FIG. 14 is a schematic block diagram of an implementation manner of an encoding device or a decoding device (referred to as a decoding device 1400) according to an embodiment of the present application.
  • the decoding device 1400 may include a processor 1410, a memory 1430, and a bus system 1450.
  • the processor and the memory are connected through a bus system, the memory is used to store instructions, and the processor is used to execute the instructions stored in the memory.
  • the memory of the encoding device stores program code, and the processor may invoke the program code stored in the memory to perform the various video encoding or decoding methods described in this application, in particular the video encoding or decoding methods in the various inter prediction modes or intra prediction modes, and the methods for predicting motion information in the various inter or intra prediction modes. To avoid repetition, details are not described here again.
  • the processor 1410 may be a central processing unit (CPU), or the processor 1410 may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory 1430 may include a read only memory (ROM) device or a random access memory (RAM) device. Any other suitable type of storage device may also be used as the memory 1430.
  • the memory 1430 may include code and data 1431 accessed by the processor 1410 using the bus 1450.
  • the memory 1430 may further include an operating system 1433 and an application program 1435.
  • the application program 1435 includes at least one program that allows the processor 1410 to perform the video encoding or decoding method described in this application (especially the inter prediction method described in this application).
  • the application programs 1435 may include applications 1 to N, which further include a video encoding or decoding application (referred to as a video decoding application) that executes the video encoding or decoding method described in this application.
  • the bus system 1450 may include a data bus, a power bus, a control bus, and a status signal bus. However, for the sake of clarity, various buses are marked as the bus system 1450 in the figure.
  • the decoding device 1400 may further include one or more output devices, such as a display 1470.
  • the display 1470 may be a touch-sensitive display or a touch display, which combines a display with a touch-sensitive unit operable to sense a touch input.
  • the display 1470 may be connected to the processor 1410 via a bus 1450.
  • Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another (e.g., according to a communication protocol).
  • computer-readable media may generally correspond to (1) tangible computer-readable storage media that is non-transitory, or (2) a communication medium such as a signal or carrier wave.
  • a data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and / or data structures used to implement the techniques described in this application.
  • the computer program product may include a computer-readable medium.
  • such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • any connection is properly termed a computer-readable medium.
  • if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber-optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
  • the computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other temporary media, but are instead directed to non-transitory tangible storage media.
  • disks and discs, as used herein, include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), and Blu-ray discs, where disks typically reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • the instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuits.
  • the term "processor” as used herein may refer to any of the aforementioned structures or any other structure suitable for implementing the techniques described herein.
  • the functions described by the various illustrative logical blocks, units, and steps described herein may be provided within dedicated hardware and/or software units configured for encoding and decoding, or incorporated into a combined codec.
  • the techniques can be fully implemented in one or more circuits or logic elements.
  • the various illustrative logical blocks and units in the video encoder 20 and the video decoder 30 may be understood as corresponding circuit devices or logic elements.
  • the techniques of this application may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset).
  • Various components, modules, or units are described in this application to emphasize functional aspects of apparatuses configured to perform the disclosed techniques, but they do not necessarily need to be realized by different hardware units.
  • the various units may be combined in a codec hardware unit, or provided by interoperable hardware units (including one or more processors as described above) in combination with suitable software and/or firmware.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to an inter prediction method and related products. The method comprises the steps of: using a historical candidate list to construct or update a candidate motion information list of the current image block; performing inter prediction on the current image block based on the candidate motion information list; checking, in a sequence from the tail to the head of the historical candidate list, whether target historical candidate motion information is identical to existing candidate motion information in the candidate motion information list; and, if not, adding the target historical candidate motion information to the candidate motion information list. The target historical candidate motion information is historical candidate motion information other than the Q items of historical candidate motion information of the historical candidate list ordered from the tail to the head, Q being a positive integer. The present method makes it possible to reduce the number of duplicate-item check operations performed when the historical candidate list is added to a merge motion information candidate list or a motion vector prediction candidate list.
PCT/CN2018/104430 2018-09-05 2018-09-06 Procédé et appareil d'interprédiction et codec WO2020047807A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811034294.4 2018-09-05
CN201811034294 2018-09-05

Publications (1)

Publication Number Publication Date
WO2020047807A1 true WO2020047807A1 (fr) 2020-03-12

Family

ID=69722114

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/104430 WO2020047807A1 (fr) 2018-09-05 2018-09-06 Procédé et appareil d'interprédiction et codec

Country Status (1)

Country Link
WO (1) WO2020047807A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11070796B2 (en) 2018-09-28 2021-07-20 Qualcomm Incorporated Ultimate motion vector expression based pruning for video coding
US11140412B2 (en) * 2019-02-17 2021-10-05 Beijing Bytedance Network Technology Co., Ltd. Motion candidate list construction for intra block copy (IBC) mode and non-IBC inter mode
US11368706B2 (en) * 2018-12-06 2022-06-21 Lg Electronics Inc. Method and device for processing video signal on basis of inter prediction

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101605255A (zh) * 2008-06-12 2009-12-16 华为技术有限公司 一种视频编解码的方法及装置
CN104079944A (zh) * 2014-06-30 2014-10-01 华为技术有限公司 视频编码的运动矢量列表构建方法和***
WO2017176092A1 (fr) * 2016-04-08 2017-10-12 한국전자통신연구원 Procédé et dispositif pour induire des informations de prédiction de mouvement
CN107615765A (zh) * 2015-06-03 2018-01-19 联发科技股份有限公司 视频编解码***中在帧内块复制模式和帧间预测模式之间的资源共享的方法和装置
CN107743239A (zh) * 2014-09-23 2018-02-27 清华大学 一种视频数据编码、解码的方法及装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101605255A (zh) * 2008-06-12 2009-12-16 华为技术有限公司 一种视频编解码的方法及装置
CN104079944A (zh) * 2014-06-30 2014-10-01 华为技术有限公司 视频编码的运动矢量列表构建方法和***
CN107743239A (zh) * 2014-09-23 2018-02-27 清华大学 一种视频数据编码、解码的方法及装置
CN107615765A (zh) * 2015-06-03 2018-01-19 联发科技股份有限公司 视频编解码***中在帧内块复制模式和帧间预测模式之间的资源共享的方法和装置
WO2017176092A1 (fr) * 2016-04-08 2017-10-12 한국전자통신연구원 Procédé et dispositif pour induire des informations de prédiction de mouvement

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11070796B2 (en) 2018-09-28 2021-07-20 Qualcomm Incorporated Ultimate motion vector expression based pruning for video coding
US11368706B2 (en) * 2018-12-06 2022-06-21 Lg Electronics Inc. Method and device for processing video signal on basis of inter prediction
US11695948B2 (en) * 2018-12-06 2023-07-04 Lg Electronics Inc. Method and device for processing video signal on basis of inter prediction
US11140412B2 (en) * 2019-02-17 2021-10-05 Beijing Bytedance Network Technology Co., Ltd. Motion candidate list construction for intra block copy (IBC) mode and non-IBC inter mode

Similar Documents

Publication Publication Date Title
TWI791723B (zh) 圖像預測方法、裝置以及視訊編碼器、視訊解碼器
WO2019120305A1 (fr) Procédé de prédiction d'informations de mouvement d'un bloc d'image, dispositif et codec
JP5869122B2 (ja) ビデオコーディングにおける予測データのバッファリング
US9854234B2 (en) Reference picture status for video coding
US20130272409A1 (en) Bandwidth reduction in video coding through applying the same reference index
JP6151446B2 (ja) ビデオコーディングのための高精度明示的重み付け予測
JP6239609B2 (ja) ビデオコーディングのための長期参照ピクチャをシグナリングすること
US20130070855A1 (en) Hybrid motion vector coding modes for video coding
JP2018530246A (ja) ビデオコーディングのために位置依存の予測組合せを使用する改善されたビデオイントラ予測
KR20130126688A (ko) 모션 벡터 예측
US11563949B2 (en) Motion vector obtaining method and apparatus, computer device, and storage medium
US20210067796A1 (en) Video coding method and apparatus
WO2020103593A1 (fr) Procédé et appareil de prédiction inter-trame
WO2020047807A1 (fr) Procédé et appareil d'interprédiction et codec
JP6224851B2 (ja) 低複雑度符号化および背景検出のためのシステムおよび方法
US11394996B2 (en) Video coding method and apparatus
WO2020043111A1 (fr) Procédés de codage et de décodage d'image basés sur une liste de candidats historiques et codec correspondant
TW201921938A (zh) 具有在用於視訊寫碼之隨機存取組態中之未來參考訊框之可調適圖像群組結構
US11197018B2 (en) Inter-frame prediction method and apparatus
CN110896485B (zh) 一种预测运动信息的解码方法及装置
WO2020042758A1 (fr) Procédé et dispositif de prédiction intertrames
WO2019084776A1 (fr) Procédé et dispositif d'obtention d'informations de mouvement candidates d'un bloc d'image, et codec
WO2020038232A1 (fr) Procédé et appareil de prédiction d'informations de mouvement d'un bloc d'image
WO2020052653A1 (fr) Procédé et dispositif de décodage pour informations de mouvement prédites
WO2020024275A1 (fr) Procédé et dispositif de prédiction entre trames

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18932959

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18932959

Country of ref document: EP

Kind code of ref document: A1