WO2020181504A1 - Method and device for video encoding, and method and device for video decoding - Google Patents

Method and device for video encoding, and method and device for video decoding

Info

Publication number
WO2020181504A1
WO2020181504A1 (PCT application PCT/CN2019/077882; CN application CN2019077882W)
Authority
WO
WIPO (PCT)
Prior art keywords
type
offset value
current block
candidate list
frame
Prior art date
Application number
PCT/CN2019/077882
Other languages
English (en)
French (fr)
Inventor
郑萧桢
孟学苇
王苫社
马思伟
Original Assignee
深圳市大疆创新科技有限公司
北京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 and 北京大学
Priority to CN201980005231.2A (CN111264061B)
Priority to PCT/CN2019/077882 (WO2020181504A1)
Publication of WO2020181504A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: using adaptive coding
    • H04N 19/134: characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136: Incoming video signal characteristics or properties
    • H04N 19/137: Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N 19/139: Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N 19/157: Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N 19/159: Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N 19/50: using predictive coding
    • H04N 19/503: involving temporal prediction
    • H04N 19/51: Motion estimation or motion compensation
    • H04N 19/513: Processing of motion vectors
    • H04N 19/517: Processing of motion vectors by encoding
    • H04N 19/52: Processing of motion vectors by encoding by predictive encoding

Definitions

  • This application relates to the field of video encoding and video decoding, and more specifically, to a method and device for video encoding, and a method and device for video decoding.
  • The basic principle of video coding is to exploit the correlation in the spatial domain, the temporal domain, and between codewords to remove as much redundancy as possible, saving transmission bandwidth or storage space.
  • The current common practice is to adopt a block-based hybrid video coding framework, realizing video compression through steps such as prediction (including intra-frame prediction and inter-frame prediction), transform, quantization, and entropy coding.
  • Among these, motion estimation and motion compensation are key technologies affecting coding performance and decoding performance.
  • The frame to be encoded can be divided into several image blocks; for each image block, its position in an adjacent frame is searched out, and the relative offset between the two spatial positions is obtained. This relative offset is the motion vector (MV).
  • The process of obtaining the motion vector is called motion estimation (ME). Inter-frame redundancy can be removed through motion estimation, reducing the bit overhead of video transmission.
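The block-matching idea above can be illustrated with a toy Python sketch. This is not the patent's procedure; it is a minimal exhaustive search, and the function names and the SAD cost are illustrative assumptions.

```python
# Toy block-matching motion estimation: for a block of the frame to be
# encoded, find the position in a reference frame with the smallest sum of
# absolute differences (SAD). The relative offset between the matched
# position and the block's own position is the motion vector (MV).

def sad(block, ref, x, y):
    """Sum of absolute differences between `block` and `ref` at (x, y)."""
    return sum(
        abs(block[i][j] - ref[y + i][x + j])
        for i in range(len(block))
        for j in range(len(block[0]))
    )

def best_match(block, ref):
    """Exhaustively search `ref` for the position minimising SAD."""
    h, w = len(block), len(block[0])
    positions = [
        (x, y)
        for y in range(len(ref) - h + 1)
        for x in range(len(ref[0]) - w + 1)
    ]
    return min(positions, key=lambda p: sad(block, ref, *p))
```

A real encoder restricts the search to a window around the block's position and adds sub-pixel refinement; exhaustive full-frame search is shown only for brevity.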
  • This application provides a method and device for video encoding, and a method and device for video decoding, which can reduce the complexity of encoding and the complexity of decoding.
  • In a first aspect, this application provides a video encoding method, including: determining the reference motion information of the current block according to the motion information candidate list of the current block; determining a target offset value from the offset value candidate set corresponding to the type of the frame to which the current block belongs, where the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; determining a search starting point according to the reference motion information; and using the target offset value as the search step at the search starting point to search for the reference block of the current block.
  • A video encoding device applying the above method uses different offset value sets for inter-frame prediction of different types of frames. For frames with more complex motion, the device can use a set containing more offset values, so that a smaller search step can be selected to accurately search for the optimal motion vector; for frames with relatively simple motion, the device can use a set with fewer offset values (that is, a subset of the larger set), so that the optimal motion vector can be found quickly. In addition, since the device selects the target offset value from a preset set for each frame type, it no longer needs to determine whether to shift the offset value based on the frame type, thereby reducing the complexity of video encoding.
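The type-dependent selection described above can be sketched as follows. This is an illustrative sketch only, not the patent's normative sets: the candidate values, the frame-type encoding, and the cost function are all assumptions.

```python
# Hypothetical offset value candidate sets (units are arbitrary here).
# The first type's set is a subset of the second type's set, as required.
OFFSETS_TYPE2 = [1, 2, 4, 8, 16, 32, 64, 128]  # frames with complex motion
OFFSETS_TYPE1 = [1, 2, 4, 8]                   # frames with simple motion

def offset_candidates(frame_type):
    """Return the preset offset value candidate set for a frame type."""
    return OFFSETS_TYPE1 if frame_type == 1 else OFFSETS_TYPE2

def pick_target_offset(frame_type, cost):
    """Choose the candidate offset (search step) with the lowest cost."""
    return min(offset_candidates(frame_type), key=cost)
```

Because both sets are preset and one is a subset of the other, no frame-type-dependent shifting of offset values is needed, which is the complexity reduction the passage describes.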
  • In a second aspect, the present application provides another video encoding method, including: determining the reference motion information of the current block according to the motion information candidate list of the current block; determining a target offset value from the same offset value candidate list according to the type of the frame to which the current block belongs; determining a search starting point according to the reference motion information; and using the target offset value as the search step at the search starting point to search for the reference block of the current block.
  • A video encoding device applying the above method uses different offset value sets for inter-frame prediction of different types of frames. For frames with more complex motion, the device can use a set containing more offset values, so that a smaller search step can be selected to accurately search for the optimal motion vector; for frames with relatively simple motion, the device can use a set with fewer offset values, so that the optimal motion vector can be found quickly. In addition, since the device selects the target offset value from a preset list for each frame type, it no longer needs to determine whether to shift the offset value based on the frame type, thereby reducing the complexity of video encoding.
  • In a third aspect, this application provides yet another video encoding method, including: determining the reference motion information of the current block according to the motion information candidate list of the current block; selecting the target offset value of the current block from an offset value candidate list, where video frames of screen content and of non-screen content use the same offset value candidate list; determining the motion information of the current block according to the reference motion information and the target offset value; and encoding the current block according to the motion information of the current block to obtain a code stream, where the code stream includes the index number of the target offset value in the offset value candidate list.
  • A video encoding device applying the above method uses different offset value sets for inter-frame prediction of different types of frames. For frames with more complex motion, the device can use a set containing more offset values, so that a smaller search step can be selected to accurately search for the optimal motion vector; for frames with relatively simple motion, the device can use a set with fewer offset values, so that the optimal motion vector can be found quickly.
  • Since the video encoding device selects the target offset value from a preset list for each frame type, it no longer needs to determine whether to shift the offset value based on the frame type, thereby reducing the complexity of video encoding.
  • Since the index numbers of the target offset values corresponding to different types of frames follow the same arrangement, the code stream does not need to carry indication information for the type of the frame, thereby reducing the bit overhead of the code stream.
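The shared-arrangement signalling can be sketched as below: because all frame types index the same arrangement, only the index number needs to be written to the code stream, with no frame-type flag. The list contents are assumptions, not values from the patent.

```python
# A single offset value candidate list shared by all frame types; the index
# numbers are arranged identically regardless of type.
OFFSET_LIST = [1, 2, 4, 8, 16, 32, 64, 128]

def encode_offset_index(target_offset):
    """Encoder side: the index number written into the code stream."""
    return OFFSET_LIST.index(target_offset)

def decode_offset(index_number):
    """Decoder side: recover the target offset value from the index alone."""
    return OFFSET_LIST[index_number]
```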
  • In a fourth aspect, the present application provides a video decoding method, including: determining the reference motion information of the current block according to the motion information candidate list of the current block; determining a target offset value, where the target offset value is an offset value in the offset value candidate set corresponding to the type of the frame to which the current block belongs, the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; determining a search starting point according to the reference motion information; and using the target offset value as the search step at the search starting point to search for the reference block of the current block.
  • Since the offset value sets corresponding to different types of frames are all preset offset value sets, the video decoding device does not need to determine whether to shift the offset value according to the type of the frame. Therefore, a video decoding device applying the above method reduces the complexity of decoding.
  • In a fifth aspect, the present application provides another video decoding method, including: determining the reference motion information of the current block according to the motion information candidate list of the current block; determining the target offset value from an offset value candidate list, where image blocks in different types of frames use the same offset value candidate list to determine the target offset value; determining a search starting point according to the reference motion information; and using the target offset value as the search step at the search starting point to search for the reference block of the current block.
  • Since the offset value sets corresponding to different types of frames all belong to the preset offset value list, the video decoding device does not need to determine whether to perform shift processing on the offset value according to the type of the frame. Therefore, a video decoding device applying the above method reduces the complexity of decoding.
  • In a sixth aspect, this application provides yet another video decoding method, including: receiving a code stream, the code stream including an index number; selecting the target offset value of the current block from an offset value candidate list according to the index number, where video frames of screen content and of non-screen content use the same offset value candidate list; determining the reference motion information of the current block according to the motion information candidate list of the current block; determining the motion information of the current block according to the reference motion information and the target offset value; and decoding the current block according to the motion information of the current block to obtain the decoded current block.
  • Since the video decoding device can determine the target offset value based on the index number in the code stream, it does not need to determine whether to perform shift processing on the offset value according to the frame type. Therefore, a video decoding device applying the above method reduces the complexity of decoding. In addition, such a device does not need to receive indication information indicating the type of the frame, and the code stream does not need to carry that indication information, thereby reducing the bit overhead of the code stream.
  • the present application provides an encoding device.
  • The encoding device includes a processing unit configured to: determine the reference motion information of the current block according to the motion information candidate list of the current block; determine a target offset value from the offset value candidate set corresponding to the type of the frame to which the current block belongs, where the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; determine a search starting point according to the reference motion information; and use the target offset value as the search step at the search starting point to search for the reference block of the current block.
  • the present application provides an encoding device.
  • the encoding device includes a memory and a processor.
  • the memory is used to store instructions.
  • The processor is used to execute the instructions stored in the memory, and execution of those instructions causes the processor to perform the method provided in the first aspect.
  • The present application provides a chip including a processing module and a communication interface; the processing module is used to control the communication interface to communicate with the outside, and is also used to implement the method provided in the first aspect.
  • the present application provides a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a computer, the computer implements the method provided in the first aspect.
  • the present application provides a computer program product containing instructions that when executed by a computer cause the computer to implement the method provided in the first aspect.
  • The present application provides an encoding device, the encoding device including a processing unit configured to: determine the reference motion information of the current block according to the motion information candidate list of the current block; determine the target offset value from the same offset value candidate list according to the type of the frame to which the current block belongs; determine a search starting point according to the reference motion information; and use the target offset value as the search step at the search starting point to search for the reference block of the current block.
  • The present application provides an encoding device, the encoding device including a memory and a processor; the memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of the stored instructions causes the processor to perform the method provided in the second aspect.
  • The present application provides a chip including a processing module and a communication interface; the processing module is used to control the communication interface to communicate with the outside, and is also used to implement the method provided in the second aspect.
  • This application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a computer, the computer implements the method provided in the second aspect.
  • This application provides a computer program product containing instructions which, when executed by a computer, cause the computer to implement the method provided in the second aspect.
  • The present application provides an encoding device, the encoding device including a processing unit configured to: determine the reference motion information of the current block according to the motion information candidate list of the current block; select the target offset value of the current block from an offset value candidate list, where video frames of screen content and of non-screen content use the same offset value candidate list; determine the motion information of the current block according to the reference motion information and the target offset value; and encode the current block according to the motion information of the current block to obtain a code stream, where the code stream contains the index number of the target offset value in the offset value candidate list.
  • The present application provides an encoding device, the encoding device including a memory and a processor; the memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of the stored instructions causes the processor to perform the method provided in the third aspect.
  • This application provides a chip including a processing module and a communication interface; the processing module is used to control the communication interface to communicate with the outside, and is also used to implement the method provided in the third aspect.
  • This application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a computer, the computer implements the method provided in the third aspect.
  • This application provides a computer program product containing instructions which, when executed by a computer, cause the computer to implement the method provided in the third aspect.
  • The present application provides a decoding device, the decoding device including a processing unit configured to: determine the reference motion information of the current block according to the motion information candidate list of the current block; determine the target offset value, where the target offset value is an offset value in the offset value candidate set corresponding to the type of the frame to which the current block belongs, the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; determine a search starting point according to the reference motion information; and use the target offset value as the search step at the search starting point to search for the reference block of the current block.
  • The present application provides a decoding device, the decoding device including a memory and a processor; the memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of those instructions causes the processor to perform the method provided in the fourth aspect.
  • The present application provides a chip including a processing module and a communication interface; the processing module is used to control the communication interface to communicate with the outside, and is also used to implement the method provided in the fourth aspect.
  • the present application provides a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a computer, the computer implements the method provided in the fourth aspect.
  • The present application provides a computer program product containing instructions which, when executed by a computer, cause the computer to implement the method provided in the fourth aspect.
  • The present application provides a decoding device, the decoding device including a processing unit configured to: determine the reference motion information of the current block according to the motion information candidate list of the current block; determine the target offset value from an offset value candidate list, where image blocks in different types of frames use the same offset value candidate list to determine the target offset value; determine a search starting point according to the reference motion information; and use the target offset value as the search step at the search starting point to search for the reference block of the current block.
  • The present application provides a decoding device, the decoding device including a memory and a processor; the memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of those instructions causes the processor to perform the method provided in the fifth aspect.
  • The present application provides a chip including a processing module and a communication interface; the processing module is used to control the communication interface to communicate with the outside, and is also used to implement the method provided in the fifth aspect.
  • the present application provides a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a computer, the computer implements the method provided in the fifth aspect.
  • the present application provides a computer program product containing instructions that, when executed by a computer, cause the computer to implement the method provided in the fifth aspect.
  • The present application provides a decoding device, the decoding device including a receiving unit and a processing unit. The receiving unit is configured to receive a code stream, the code stream including an index number. The processing unit is configured to: select the target offset value of the current block from an offset value candidate list according to the index number, where video frames of screen content and of non-screen content use the same offset value candidate list; determine the reference motion information of the current block according to the motion information candidate list of the current block; determine the motion information of the current block according to the reference motion information and the target offset value; and decode the current block according to the motion information of the current block to obtain the decoded current block.
  • The present application provides a decoding device, the decoding device including a memory and a processor; the memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of those instructions causes the processor to perform the method provided in the sixth aspect.
  • The present application provides a chip including a processing module and a communication interface; the processing module is used to control the communication interface to communicate with the outside, and is also used to implement the method provided in the sixth aspect.
  • the present application provides a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a computer, the computer implements the method provided in the sixth aspect.
  • the present application provides a computer program product containing instructions that when executed by a computer cause the computer to implement the method provided in the sixth aspect.
  • Fig. 1 is a schematic diagram of a video encoding method suitable for the present application.
  • Fig. 2 is a schematic diagram of a video decoding method suitable for the present application.
  • Fig. 3 is a schematic diagram of a method for constructing a motion information candidate list applicable to the present application.
  • Fig. 4 is a schematic diagram of a method for determining an optimal motion vector suitable for the present application.
  • Fig. 5 is a schematic diagram of an interpolation method suitable for the present application.
  • Fig. 6 is a schematic diagram of a video encoding method provided by the present application.
  • Fig. 7 is a schematic diagram of a video decoding method provided by the present application.
  • Fig. 8 is a schematic diagram of a video encoder provided by the present application.
  • Fig. 9 is a schematic diagram of a video decoder provided by the present application.
  • Fig. 10 is a schematic diagram of an inter-frame prediction apparatus provided by the present application.
  • Fig. 11 is a schematic diagram of a video encoding device or a video decoding device provided by the present application.
  • Figure 1 shows a schematic diagram of a video encoding method suitable for the present application.
  • The video coding method includes intra prediction, inter prediction, transform, quantization, entropy coding, and in-loop filtering. After the image is divided into coding blocks, intra-frame or inter-frame prediction is performed; after the residual is obtained, transform and quantization are performed; finally, entropy coding is performed and a code stream is output.
  • The coding block is an M×N array of pixels (M may or may not be equal to N), and the pixel value of each pixel is known.
  • P represents the predicted value
  • Dn represents the residual
  • uFn' represents the reconstruction value (before filtering)
  • Dn' represents the reconstructed residual.
  • Intra-frame prediction refers to predicting the pixel value of the pixel in the current coding block by using the pixel value of the pixel in the reconstructed area in the current image.
  • Inter-frame prediction finds, in a reconstructed image, a matching reference block for the current coding block of the current image.
  • The pixel values of the pixels in the reference block are used as the prediction information or predicted values of the pixel values of the pixels in the current coding block (information and value are not distinguished hereinafter); this process is motion estimation.
  • The motion information of the current coding block includes indication information of the prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), one or two motion vectors pointing to the reference block(s), and indication information of the image where the reference block is located (usually the reference frame index).
  • Forward prediction means that the current coding block selects at least one reference image from the forward reference image set to obtain at least one reference block.
  • Backward prediction means that the current coding block selects at least one reference image from the backward reference image set to obtain at least one reference block.
  • Bidirectional prediction refers to selecting at least one reference image from each of the forward and backward reference image sets to obtain at least two reference blocks. When bidirectional prediction is used, the current coding block has at least two reference blocks, each of which needs a motion vector and a reference frame index to be indicated; the predicted values of the pixel values of the pixels in the current block are then determined according to the pixel values of the pixels in the at least two reference blocks.
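As a concrete illustration of bidirectional prediction, the sketch below averages the two reference blocks per pixel. Rounded averaging is one common way to combine the two predictions; the text above does not prescribe a particular combination.

```python
def bi_predict(ref_block0, ref_block1):
    """Per-pixel rounded average of two reference blocks (lists of rows)."""
    return [
        [(a + b + 1) // 2 for a, b in zip(row0, row1)]
        for row0, row1 in zip(ref_block0, ref_block1)
    ]
```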
  • the motion estimation process needs to search multiple reference blocks in the reference image for the current coding block.
  • Rate-distortion optimization (RDO) or other methods can be used to determine which reference block or several reference blocks are ultimately used for prediction.
  • After the prediction information is obtained, the residual information can be derived from the pixel values of the pixels in the current coding block and the corresponding prediction information; for example, the residual information is obtained by directly subtracting the pixel values of the reference block from the pixel values of the current coding block, or it can be obtained by other possible methods.
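The direct-subtraction case mentioned above amounts to a per-pixel difference between the current coding block and its prediction:

```python
def residual(current_block, prediction):
    """Residual information: current pixel values minus predicted values."""
    return [
        [c - p for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(current_block, prediction)
    ]
```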
  • The residual information can then be transformed, for example using a discrete cosine transform (DCT).
  • The prediction information and residual information can also be combined and filtered to obtain reconstruction information, which is used as reference information for subsequent encoding.
  • FIG. 2 shows a schematic flowchart of the code stream decoding method applicable to the present application.
  • At the decoding end, the residual information is obtained through operations such as entropy decoding, inverse quantization, and inverse transform.
  • The decoding end first obtains the prediction mode of the current block to be decoded by parsing the code stream. For intra-frame prediction, the pixel values of pixels in the reconstructed area around the current block to be decoded are used to construct the prediction information. For inter-frame prediction, the motion information of the current block must be obtained; that motion information is used to determine the reference block in a reconstructed image, and the pixel values of the pixels in the reference block are used as the prediction information.
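The decode-side branch can be summarised as below. The helper functions are placeholders for the behaviour the passage describes, not APIs from the patent.

```python
def intra_predict(reconstructed_neighbours):
    # Placeholder: e.g. predict from the mean of neighbouring reconstructed
    # pixel values in the current image.
    return sum(reconstructed_neighbours) / len(reconstructed_neighbours)

def fetch_reference_block(motion_info):
    # Placeholder: a real decoder uses the motion information to locate a
    # reference block inside a reconstructed image.
    return motion_info["reference_pixels"]

def predict_block(mode, reconstructed_neighbours=None, motion_info=None):
    """Build prediction information for the current block to be decoded."""
    if mode == "intra":
        return intra_predict(reconstructed_neighbours)
    return fetch_reference_block(motion_info)
```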
  • Using the prediction information (also called the prediction block) and the residual information (also called the residual block), the reconstruction information (also called the reconstruction block) of the current block to be decoded can be obtained.
  • the encoding end constructs a motion information candidate list before predicting the current image block, and predicts the current image block according to the candidate motion information selected in the motion information candidate list.
  • the current image block is the image block to be encoded (or decoded).
  • the image frame where the current image block is located is called the current frame.
  • the current image block is a coding unit (CU) or a decoding unit (DU) in some video standards.
  • the motion information mentioned in this article may include motion vectors, or include motion vectors and reference frame information.
  • the motion information candidate list refers to a set of candidate motion information of the current block, and each candidate motion information in the motion information candidate list can be stored in the same buffer or in different buffers. There is no restriction here.
  • The index of the motion information in the motion information candidate list mentioned below can be the index of the motion information in the full candidate motion information set of the current block, or the index of the buffer where the motion information is located; there is no restriction here.
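A minimal sketch of such a candidate list is shown below; a single Python list stands in for whatever buffer layout an implementation chooses, since the text leaves that open. The sample entries are invented.

```python
from collections import namedtuple

# Motion information: a motion vector plus optional reference frame info.
MotionInfo = namedtuple("MotionInfo", ["mv", "ref_idx"])

candidate_list = [
    MotionInfo(mv=(4, -2), ref_idx=0),  # e.g. from a spatial neighbour
    MotionInfo(mv=(0, 0), ref_idx=1),   # e.g. a zero-MV fallback
]

def reference_motion(index):
    """Select the candidate addressed by an index signalled in the stream."""
    return candidate_list[index]
```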
  • the encoding of the current image block can be completed through the following steps.
  • the current image block can be decoded through the following steps.
  • the preset method can be consistent with the method of constructing the motion information candidate list by the encoder.
  • the motion vector of the current image block is equal to the predicted MV (motion vector predictor, MVP) (for example, the motion vector MV1 mentioned above).
  • this mode includes the merge mode.
  • the merge mode includes a normal merge mode and/or an affine merge mode.
  • the motion vector of the current image block is equal to the predicted MV (motion vector predictor, MVP) (for example, the motion vector MV1 mentioned above).
  • the MVP may be a candidate in the motion information candidate list, or may be obtained after processing (for example, after scaling) according to one of the candidates in the motion information candidate list.
  • the motion vector of the current image block can be obtained by further optimizing on the basis of the MVP of the current image block. For example, the MVP is taken as the base MV, a search is performed around it with a fixed search step, and the best MV is selected from the search results.
  • the default fixed directions (for example, up, down, left, and right)
  • the offset candidate list is used at the encoding end.
  • the default number of step offsets (for example, one offset) is used, and then the index number of the offset value and the index number of the direction corresponding to the optimal search result for the image block are written into the code stream.
  • the decoding end receives the index number of the reference MV, the index number of the offset value, and the index number of the direction; then, starting from the block pointed to by the reference MV, one offset of the value corresponding to the offset index number is applied in the direction corresponding to the direction index number, and the result is the reference block of the current image block.
  • this mode may also be called merge mode with motion vector difference (merge mode with MVD, MMVD).
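As a hedged illustration (the list contents, direction order, and names below are ours, not taken from the patent text), the MMVD-style reconstruction just described — take the reference MV, then apply one signalled offset in one signalled direction — can be sketched as:

```python
# Illustrative sketch of MMVD-style MV reconstruction at the decoding end.
# The candidate list and direction order below are assumptions for the example.

OFFSETS = [0.25, 0.5, 1, 2, 4, 8, 16, 32]          # offset value candidate list
DIRECTIONS = [(0, -1), (0, 1), (-1, 0), (1, 0)]    # up, down, left, right as (dx, dy)

def mmvd_final_mv(base_mv, offset_idx, direction_idx):
    """Offset the reference (base) MV once by the signalled step and direction."""
    dx, dy = DIRECTIONS[direction_idx]
    step = OFFSETS[offset_idx]
    return (base_mv[0] + dx * step, base_mv[1] + dy * step)

# Example: base MV (3, -1), offset index 3 (step 2), direction index 3 ("right")
print(mmvd_final_mv((3, -1), 3, 3))  # -> (5, -1)
```

The block pointed to by this final MV is then used as the reference block of the current image block.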
  • Figures 3 and 4 show the method of determining the optimal MV in MMVD.
  • the encoder can select several (for example, two) reference MVs from the motion vector candidate list.
  • one MV is the MV adjacent to the coded block in the spatial domain (for example, the MV of one of the coded blocks in the image blocks 1 to 5 in Figure 3)
  • the other MV is the MV temporally adjacent to the coded block (for example, the MV of a block at the corresponding position in another coded frame)
  • the pixels corresponding to the two reference MVs are shown in the two dotted circles in FIG. 4, where L0 represents the image frame pointed to by the first reference MV, and L1 represents the image frame pointed to by the second reference MV.
  • the encoding end respectively uses the selected reference MV as a starting point to search along a fixed direction (for example, four directions up, down, left, and right).
  • the step size used in the search is an offset value in the offset value candidate list.
  • the offset value candidate list contains eight offset values, namely {1/4, 1/2, 1, 2, 4, 8, 16, 32}, and the corresponding search step index numbers are 0 to 7.
  • the encoding end selects an offset value from the offset value candidate list as the search step to search, and determines the optimal MV according to RDO or other rules.
  • the filled circles and hollow circles represent the pixels found based on the two search steps.
  • the MV of the object between two adjacent frames may not be exactly an integer number of pixels. However, there are no fractional pixels (for example, 1/4 pixel and 1/2 pixel) in digital video, that is, there are no other pixels between two pixels.
  • the values of these sub-pixels can be approximated by interpolation, that is, the row direction and column direction of the reference frame are interpolated, and the search is performed in the reference frame after the interpolation.
  • interpolation: the row direction and the column direction of the reference frame are interpolated, and the search is performed in the interpolated reference frame.
  • the pixels in the current block and the pixels in the adjacent area need to be used. The following examples illustrate the interpolation method.
  • a0,0 and d0,0 are 1/4 pixels
  • b0,0 and h0,0 are 1/2 pixels (also called "half pixels")
  • c0,0 and n0,0 are 3/4 pixels.
  • the coding block is a 2x2 block surrounded by A0,0 to A1,0 and A0,0 to A0,1.
  • some pixels outside the coding block are needed, including 3 pixels on the left side, 4 pixels on the right side, 3 pixels on the top, and 4 pixels on the bottom of the coding block.
  • the following formula gives the external pixels used by some interpolation pixels in the above coding block.
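The formula itself is not reproduced above. Purely as an illustration of why 3 integer pixels to the left and 4 to the right are needed, a half-pixel sample can be computed with an HEVC-style 8-tap filter; the coefficients below are the HEVC half-pel luma taps and are an assumption for this sketch — the patent's actual filter may differ:

```python
# Hedged sketch of computing one half-pixel sample b(x,0) between the integer
# pixels row[x] and row[x+1], using HEVC-style 8-tap half-pel luma filter taps.

HALF_PEL_TAPS = [-1, 4, -11, 40, 40, -11, 4, -1]   # taps sum to 64

def half_pel(row, x):
    """Interpolate the half-pixel between row[x] and row[x+1].

    Uses row[x-3] .. row[x+4]: 3 integer pixels to the left of the pair and
    4 to the right, matching the border-pixel counts described in the text.
    """
    acc = sum(t * row[x - 3 + i] for i, t in enumerate(HALF_PEL_TAPS))
    return (acc + 32) >> 6                          # round and normalise by 64

# A flat row of value 100 interpolates to 100 (the taps sum to 64).
row = [100] * 8
print(half_pel(row, 3))  # -> 100
```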
  • the method 600 includes:
  • S610 Determine a reference MV of the current block according to the motion information candidate list of the current block.
  • the method for determining the reference MV is as described above, and will not be repeated here.
  • S620 Determine a target offset value from a corresponding offset value candidate set according to the type of the frame to which the current block belongs, where the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type.
  • the degree of content change of the first-type frame is simpler than that of the second-type frame.
  • the first type is, for example, screen content
  • the second type is, for example, non-screen content.
  • the screen content may be a frame obtained by recording a screen
  • the non-screen content may be a frame obtained by shooting a natural object.
  • the movement of objects in screen content (for example, the translation of text) is simpler than the movement of natural objects in non-screen content (for example, the rotation of athletes). Therefore, the first type of frames may also be called simple motion frames, and the second type of frames may also be called complex motion frames.
  • the encoder can use a longer search step to search for the optimal MV to improve coding efficiency; for complex motion frames, the encoder can use a shorter search step to search for the optimal MV.
  • if the type of the frame to which the current block belongs is the first type, the encoder can select a value from the offset value candidate set {1, 2, 4, 8, 16, 32} corresponding to the first type as the target offset value of the current block; if the type of the frame to which the current block belongs is the second type, the encoder can select a value from the offset value candidate set {1/4, 1/2, 1, 2, 4, 8, 16, 32} corresponding to the second type as the target offset value of the current block.
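A minimal sketch of this per-type selection, assuming the example sets above (the frame-type labels are illustrative, not the patent's identifiers):

```python
# Illustrative selection of the offset value candidate set by frame type.
# The first-type set is a subset of the second-type set, as described above.

SECOND_TYPE_OFFSETS = [0.25, 0.5, 1, 2, 4, 8, 16, 32]   # e.g. non-screen content
FIRST_TYPE_OFFSETS = SECOND_TYPE_OFFSETS[2:]            # {1, 2, ..., 32}, e.g. screen content

def candidate_offsets(frame_type):
    """Return the offset value candidate set for the type of the current block's frame."""
    return FIRST_TYPE_OFFSETS if frame_type == "first" else SECOND_TYPE_OFFSETS

# The first-type set is indeed a subset of the second-type set:
assert set(candidate_offsets("first")) <= set(candidate_offsets("second"))
```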
  • the two offset value candidate sets may be combined and stored. That is, the offset value candidate set corresponding to the first type and the offset value candidate set corresponding to the second type are located in the same offset value candidate list.
  • the encoding end may write the index number of the selected offset value in the same offset value candidate list into the code stream.
  • the offset value candidate set corresponding to the first type and the offset value candidate set corresponding to the second type are stored in the same buffer.
  • the offset value candidate set corresponding to the first type can also be {1, 2, 4, 8, 16, 32, 64}
  • the offset value candidate set corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  • the offset value candidate set corresponding to the first type and the offset value candidate set corresponding to the second type may be located in the same offset value candidate list.
  • S620 can also be replaced with the following steps:
  • the target offset value is determined from the same offset value candidate list according to the type of the frame to which the current block belongs.
  • S630 can also be executed before S620, or together with S620.
  • S630 Determine a search starting point according to the reference MV.
  • S640 Use the target offset value as the search step at the search starting point to search for the reference block of the current block.
  • if the search step is a fractional pixel length (for example, 1/4 or 1/2), the encoder needs to interpolate the reference frame and search for the reference block in the interpolated reference frame; if the search step is an integer pixel length (for example, 1 or 2), the encoder does not need to interpolate the reference frame and can search for the reference block directly in the reference frame.
  • the method of interpolation and the method of searching for reference blocks have been described in detail above, and will not be repeated here.
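The interpolate-or-not decision above reduces to checking whether the search step is fractional; a trivial sketch (illustrative only):

```python
def needs_interpolation(search_step):
    """A fractional step (e.g. 1/4 or 1/2) requires the reference frame to be
    interpolated first; an integer step (e.g. 1 or 2) can be searched directly."""
    return search_step != int(search_step)

print(needs_interpolation(0.25))  # -> True
print(needs_interpolation(2))     # -> False
```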
  • the offset value candidate list used for screen content is {1, 2, 4, 8, 16, 32, 64, 128}
  • the offset value candidate list used for non-screen content is {1/4, 1/2, 1, 2, 4, 8, 16, 32}; after encoding an image block, the encoding end needs to add an identifier to the code stream to indicate whether the frame type is screen content or non-screen content, and add to the code stream the index number of the selected offset value in the corresponding offset value candidate list.
  • after receiving the code stream, the decoder determines whether the type of the image frame is screen content or non-screen content according to the identifier in the code stream, then selects the corresponding offset value candidate list, and, according to the index number in the code stream, selects the corresponding offset value from that list as the search step for searching.
  • in order to reduce the amount of storage occupied, when storing the offset value candidate list the codec stores only the offset value candidate list used for non-screen content, {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  • the offset value candidate list {1/4, 1/2, 1, 2, 4, 8, 16, 32} is shifted to obtain {1, 2, 4, 8, 16, 32, 64, 128}, and the corresponding offset value is then selected, according to the index number, from the offset value candidate list obtained after the shift operation and used as the search step size.
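That shift is equivalent to multiplying every stored offset by 4 (a left shift by 2 in quarter-pixel units); a sketch under that reading:

```python
# Sketch of the shift operation: the single stored (non-screen-content) list is
# scaled by 4 to recover the screen-content list, so only one list is stored.

STORED = [0.25, 0.5, 1, 2, 4, 8, 16, 32]   # stored non-screen-content list

def shifted_list(stored):
    """Multiply each offset by 4 -- a left shift by 2 in quarter-pel units."""
    return [v * 4 for v in stored]

assert shifted_list(STORED) == [1, 2, 4, 8, 16, 32, 64, 128]
```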
  • the optimal MV obtained by using these two search step sizes may be only a locally optimal MV, which may have a negative impact on coding performance. Therefore, the encoding method provided in this application does not need to use 64 or 128 as the target offset value, which improves coding performance.
  • the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; or, when determining the search step size of image blocks in different types of frames, the target offset value is determined from the same offset value candidate list. Therefore, the codec does not need to perform shift operations in this process, which reduces coding complexity.
  • the encoding end may not need to add to the code stream an identifier indicating whether the image frame belongs to the first type or the second type; that is, the code stream contains no identifier indicating whether the image frame is of the first type or of the second type, which reduces byte overhead.
  • the encoding end can encode the current block to obtain a code stream, where the code stream contains the index number of the target offset value in the offset value candidate list.
  • the values in the offset value candidate list are {1/4, 1/2, 1, 2, 4, 8, 16, 32}
  • the index numbers in the code stream can be 0 to 7, corresponding respectively to the eight values in the offset value candidate list.
  • for an image block in a first-type frame, the index number used to indicate the search step in the code stream is a number from 2 to 7; for an image block in a second-type frame, the index number used to indicate the search step in the code stream is a number from 0 to 7.
  • the values in the offset value candidate list are {1/4, 1/2, 1, 2, 4, 8, 16, 32, 64}, and the index numbers in the code stream can be 0 to 8, corresponding respectively to the nine values in the offset value candidate list.
  • for an image block in a first-type frame, the index number used to indicate the search step in the code stream is a number from 2 to 8; for an image block in a second-type frame, the index number used to indicate the search step in the code stream is a number from 0 to 8.
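Under this shared-list scheme, the only per-type difference is the range of legal index numbers; an illustrative mapping (names are ours) using the nine-value list:

```python
# Illustrative index-to-step mapping for a shared offset value candidate list.
# First-type frames use indices 2..8 (steps >= 1); second-type frames use 0..8.

SHARED_LIST = [0.25, 0.5, 1, 2, 4, 8, 16, 32, 64]

def step_from_index(idx, frame_type):
    """Look up the search step; reject indices outside the type's legal range."""
    lo = 2 if frame_type == "first" else 0
    if not (lo <= idx < len(SHARED_LIST)):
        raise ValueError("index not allowed for this frame type")
    return SHARED_LIST[idx]

assert step_from_index(2, "first") == 1      # smallest first-type step
assert step_from_index(0, "second") == 0.25  # second-type frames may use sub-pel steps
```

Because both types index the same list, no shift of the offset values is needed at either end.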
  • the encoding end may write the index number of the target offset value in the offset value candidate list into the code stream, so that the decoding end can search based on the search step corresponding to the index number. Since different types of frames share one set of offset value index numbers, the decoder does not need to shift the offset values according to the frame type; therefore, the encoder does not need to write frame-type indication information into the code stream, thereby reducing the information overhead of encoding.
  • the decoding end can perform decoding based on the method shown in FIG. 7.
  • the method shown in Figure 7 includes:
  • S710 Determine a reference MV of the current block according to the motion information candidate list of the current block.
  • the decoding end can construct the motion information candidate list of the current block according to the method shown by the encoding end. Subsequently, the decoder selects the reference MV from the motion information candidate list and executes the following steps.
  • S720 Determine a target offset value, where the target offset value is an offset value in the offset value candidate set corresponding to the type of the frame to which the current block belongs, and the types include a first type and a second type.
  • the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type.
  • the decoding end can use one offset value candidate list, that is, determine the offset value candidate sets corresponding to different types of frames from that one list; the decoding end can also use multiple offset value candidate lists, that is, select the target offset value from the offset value candidate list corresponding to each type.
  • the offset value candidate set corresponding to the first type is {1, 2, 4, 8, 16, 32}
  • the offset value candidate set corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}
  • the two sets can be located in one list, which contains the offset values {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  • the index number used to indicate the search step size in the code stream received by the decoding end corresponds to different offset values in the same offset value candidate list.
  • the decoding end may use an offset value candidate list, that is, the decoding end determines the offset value candidate sets corresponding to different types of frames from the offset value candidate list.
  • the offset value candidate set corresponding to the first type is {1, 2, 4, 8, 16, 32, 64}
  • the offset value candidate set corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}
  • the two sets can be located in one list, which contains the offset values {1/4, 1/2, 1, 2, 4, 8, 16, 32, 64}.
  • the decoding end may determine the search step size by itself.
  • the decoding end may determine the index number from the code stream corresponding to the current block, and determine the target offset value from one offset value candidate list according to the index number. Since different types of frames share one set of offset value index numbers, the decoder does not need to shift the offset values according to the frame type.
  • the encoding end does not need to write the indication information of the frame type in the code stream, thereby reducing the information overhead of encoding.
  • the decoder does not need to distinguish the type of the frame. No matter which type the image frame belongs to, it will determine the search step size from the same offset candidate list after extracting the index number from the code stream.
  • S730 can also be executed before S720, or together with S720.
  • S730 Determine a search starting point according to the reference MV.
  • S740 Use the target offset value as the search step at the search starting point to search for the reference block of the current block.
  • if the search step is a fractional pixel length (for example, 1/4 or 1/2), the decoder needs to interpolate the reference frame and search for the reference block in the interpolated reference frame; if the search step is an integer pixel length (for example, 1 or 2), the decoder does not need to interpolate the reference frame and can search for the reference block directly in the reference frame.
  • the method of interpolation and the method of searching for reference blocks have been described in detail above, and will not be repeated here.
  • the optimal MV obtained by using these two search steps may only be a local optimal MV, which may have a negative impact on the decoding performance.
  • the decoding method provided in this application does not use 64 or 128 as the target offset value, which improves decoding performance.
  • Fig. 8 is a schematic diagram of a video encoder provided by this application.
  • the video encoder 100 is used to output the video to the post-processing entity 41.
  • the post-processing entity 41 represents an example of a video entity that can process encoded video data from the video encoder 100, for example, a media-aware network element (MANE) or a splicing/editing device.
  • the post-processing entity 41 and the video encoder 100 are independent devices from each other, or the post-processing entity 41 may be integrated into the video encoder 100.
  • the video encoder 100 may perform inter prediction of image blocks according to the method proposed in this application.
  • the video encoder 100 includes a prediction processing unit 108, a filter 106, a coded picture buffer (CPB) 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103.
  • the prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109.
  • the video encoder 100 further includes an inverse quantizer 104, an inverse transformer 105, and a summer 111.
  • the filter 106 represents one or more loop filters, such as a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter.
  • the filter 106 in Fig. 8 may be an in-loop filter or a post-loop filter.
  • the video encoder 100 may further include a video data memory (not shown in the figure).
  • the video data storage may store video data to be encoded by the video encoder 100.
  • the video data memory can also be used as a reference image memory to store reference video data when the video encoder 100 encodes video data in the intra-frame decoding mode and/or the inter-frame decoding mode.
  • the video data memory and the CPB 107 can be formed by any of a variety of memory devices, for example, synchronous dynamic random access memory (SDRAM), dynamic random access memory (DRAM), magnetoresistive random access memory (MRAM), resistive random access memory (RRAM), or other types of memory.
  • SDRAM synchronous dynamic random access memory
  • DRAM dynamic random access memory
  • MRAM Magnetoresistive random access memory
  • RRAM resistive random access memory
  • the video data memory and the CPB 107 can be provided by the same storage device or by separate storage devices.
  • the video data memory may also be integrated with other components of the video encoder 100 on a chip, or may be provided separately from other components.
  • the video encoder 100 receives video data and stores the video data in a video data storage.
  • the dividing unit divides the video data (frame) into several image blocks, and these image blocks can be further divided into smaller blocks, for example, based on a quad-tree structure or a binary tree structure.
  • the result of the above segmentation may be a slice, a tile, or other larger units, and the slice may be divided into a plurality of image blocks or a set of image blocks called "slices".
  • the prediction processing unit 108 (for example, the inter prediction unit 110) may determine the motion information candidate list of the current image block, determine the target motion information from the motion information candidate list according to the screening rules, and then perform prediction on the current image block according to the target motion information.
  • the prediction processing unit 108 may provide the current image block that has undergone intra-frame decoding and/or inter-frame decoding to the summer 112 to generate a residual block, and may further provide the inter-decoded current image block to the summer 111 so that a coded block can be reconstructed as a reference image block.
  • the prediction processing unit 108 (for example, the inter prediction unit 110) may send the index information of the target motion information to the entropy encoder 103, so that the entropy encoder 103 can encode the index information of the target motion information into the code stream.
  • the intra predictor 109 in the prediction processing unit 108 may perform intra predictive coding on the current image block to remove spatial redundancy.
  • the inter predictor 110 in the prediction processing unit 108 may perform inter predictive encoding on the current image block to remove temporal redundancy.
  • the inter predictor 110 is used to determine target motion information for inter prediction, predict the motion information of one or more basic motion compensation units in the current image block according to the target motion information, and obtain or generate the prediction block of the current image block using the motion information of the one or more basic motion compensation units.
  • the inter predictor 110 may calculate the RDO values of various motion information in the motion information candidate list, and select motion information with the best RDO characteristics therefrom.
  • the RDO characteristic is usually used to measure the degree of distortion (or error) between an encoded image block and an unencoded image block.
  • the inter predictor 110 may determine that the motion information with the smallest RDO cost for encoding the current image block in the motion information candidate list is the target motion information for inter prediction of the current image block.
  • the inter predictor 110 may also generate syntax elements associated with image blocks and slices for use by the video decoder 200 when decoding the image blocks in the slice.
  • the inter predictor 110 may send the index of the target motion information of the current image block to the entropy encoder 103 so that the entropy encoder 103 can encode the index.
  • the intra predictor 109 is used to determine target motion information for intra prediction, and perform intra prediction on the current image block based on the target motion information. For example, the intra predictor 109 may calculate the RDO value of each candidate motion information, select the motion information with the least RDO cost as the target motion information for intra prediction of the current image block, and select the frame with the best RDO characteristics based on the RDO value Intra prediction mode. After selecting the target motion information, the intra predictor 109 may send the index of the target motion information of the current image block to the entropy encoder 103 so that the entropy encoder 103 encodes the index.
  • the video encoder 100 subtracts the prediction block from the current image block to be encoded to form a residual image block (residual block).
  • the summer 112 represents one or more components that perform this subtraction operation.
  • the residual video data in the residual block may be included in one or more transform units (TU) and applied to the transformer 101.
  • the transformer 101 converts the residual video data into residual transform coefficients using a method such as the discrete cosine transform (DCT).
  • the transformer 101 can transform the residual video data from the pixel value domain to the transform domain, such as the frequency domain.
  • the transformer 101 may send the resulting transform coefficient to the quantizer 102.
  • the quantizer 102 quantizes the transform coefficients to further reduce the bit rate.
  • the quantizer 102 may then perform a scan of the matrix containing the quantized transform coefficients.
  • the entropy encoder 103 may perform scanning.
  • after quantization, the entropy encoder 103 performs entropy encoding on the quantized transform coefficients. For example, the entropy encoder 103 can perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or other entropy coding methods.
  • the entropy encoder 103 may transmit the code stream to the video decoder 200 after performing entropy encoding.
  • the entropy encoder 103 may also perform entropy encoding on the syntax elements of the current image block to be encoded, for example, encoding the target motion information into the code stream.
  • the inverse quantizer 104 and the inverse transformer 105 respectively apply inverse quantization and inverse transformation to reconstruct a residual block (for example, an image block used as a reference image) in the pixel domain.
  • the summer 111 adds the reconstructed residual block to the prediction block generated by the inter predictor 110 or the intra predictor 109 to generate a reconstructed image block.
  • the filter 106 may be used to process the reconstructed image block to reduce distortion, for example, to reduce block artifacts.
  • the reconstructed image block may be stored in the image buffer 107 and used as a reference block for the inter predictor 110 to perform inter prediction on a block in a subsequent video frame or image.
  • the processing procedure of the video encoder 100 described above is only an example, and the video encoder 100 may also perform video encoding based on other processing procedures.
  • the video encoder 100 can directly quantize the residual signal without processing by the transformer 101, and correspondingly without processing by the inverse transformer 105; or, for some image blocks or image frames, the video encoder 100 does not generate residual data, and accordingly does not need processing by the transformer 101, quantizer 102, inverse quantizer 104, and inverse transformer 105; or, the video encoder 100 may store the reconstructed image block directly as a reference block without processing by the filter 106; alternatively, the quantizer 102 and the inverse quantizer 104 in the video encoder 100 may be merged together.
  • FIG. 9 is a schematic diagram of a video decoder provided by this application.
  • the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter 206, and a decoded image buffer 207.
  • the prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209.
  • video decoder 200 may perform a decoding process that is substantially reciprocal of the encoding process described by video encoder 100.
  • the video decoder 200 receives a code stream including image blocks and associated syntax elements from the video encoder 100.
  • the video decoder 200 may receive video data from the network entity 42, and optionally, may also store the video data in a video data storage (not shown in the figure).
  • the video data memory can be used as a decoded picture buffer (DPB) for storing code streams. Therefore, although the video data storage is not shown in FIG. 9, the video data storage and the DPB 207 may be the same storage or separate storages.
  • the video data memory and DPB 207 can be formed by any of a variety of memory devices, for example, including SDRAM, DRAM, MRAM, RRAM or other types of memory.
  • the video data memory may be integrated with other components of the video decoder 200 on a chip, or may be provided separately from those components.
  • the network entity 42 may be, for example, a server, a MANE, or a video editor/splicer.
  • the network entity 42 may or may not include a video encoder, such as the video encoder 100.
  • the network entity 42 and the video decoder 200 may be independent devices.
  • the network entity 42 and the video decoder 200 may be integrated into one device.
  • the entropy decoder 203 of the video decoder 200 entropy decodes the code stream to generate quantized coefficients and syntax elements.
  • the entropy decoder 203 forwards the syntax element to the prediction processing unit 208.
  • the video decoder 200 may receive syntax elements at the video slice level and/or the image block level.
  • the syntax element here may include target motion information related to the current image block.
  • the intra-frame predictor 209 of the prediction processing unit 208 can generate the prediction block of the image block of the current video slice based on the intra-frame prediction mode indicated in the code stream and decoded image blocks from the current frame.
  • the inter predictor 210 of the prediction processing unit 208 may determine, based on the syntax elements received from the entropy decoder 203, the target motion information for decoding the current image block of the current video slice, and decode the current image block (for example, perform inter-frame prediction) based on that target motion information.
  • the inter predictor 210 may determine whether to use a new inter prediction method to predict the current image block of the current video slice, for example, whether to use the method of the present application to determine the target offset value. If the syntax elements indicate that a new inter-frame prediction method is used to predict the current image block, the motion information of the current image block is predicted based on the new inter-frame prediction method (for example, using the method of this application to determine the target offset value), and the prediction block of the current image block is generated through a motion compensation process using the predicted motion information of the current image block.
  • the motion information here may include reference image information and motion vectors, where the reference image information may include, but is not limited to, one-way/two-way prediction information, a reference image list number, and a reference image index corresponding to the reference image list.
  • the video decoder 200 may construct a reference image list based on the reference images stored in the DPB 207.
  • the inter prediction process of using the method 700 to predict the motion information of the current image block has been described in detail.
  • the inverse quantizer 204 inverse quantizes (that is, dequantizes) the quantized transform coefficients decoded by the entropy decoder 203.
  • the inverse quantization process may include determining the degree of quantization that should be applied using the quantization parameter calculated by the video encoder 100 for each image block in the video slice, and determining the degree of inverse quantization that should be applied according to the degree of quantization.
  • the inverse transformer 205 performs inverse transform processing on the transform coefficients, for example, inverse DCT, inverse integer transform, or other inverse transform processes to generate residual blocks in the pixel domain.
  • after the inter predictor 210 generates the prediction block of the current image block or of a sub-block of the current image block, the video decoder 200 sums the residual block from the inverse transformer 205 and the prediction block from the inter predictor 210 to obtain the reconstructed block.
  • the summer 211 represents the component that performs this summing operation.
  • filter 206 (in or after the decoding loop) can also be used to smooth pixel transitions or to improve video quality in other ways.
  • the filter 206 may be one or more loop filters, such as deblocking filters, ALF and SAO filters.
  • the filter 206 is adapted to reconstruct the block to reduce block distortion, and output the result as a decoded video stream.
  • the decoded image blocks in a given frame or image can also be stored in the DPB 207 to be used as reference images for subsequent motion compensation. The DPB 207 can also store the decoded video for later presentation on a display device.
  • the processing procedure of the video decoder 200 described above is only an example, and the video decoder 200 may also perform video decoding based on other processing procedures.
  • the video decoder 200 may output a video stream without processing by the filter 206; or, for some image blocks or image frames, the entropy decoder 203 of the video decoder 200 does not decode quantized coefficients, and those image blocks or image frames accordingly do not need to be processed by the inverse quantizer 204 and the inverse transformer 205.
  • FIG. 10 is a schematic block diagram of an inter-frame prediction apparatus 1000 in an embodiment of the application. It should be noted that the inter-frame prediction apparatus 1000 is suitable both for inter-frame prediction of decoded video images and for inter-frame prediction of encoded video images. It should be understood that the inter-frame prediction apparatus 1000 here may correspond to the inter predictor 110 in FIG. 8, or may correspond to the inter predictor 210 in FIG. 9.
  • the inter-frame prediction apparatus 1000 may include:
  • the inter prediction processing unit 1001 is configured to determine the reference MV of the current block according to the motion information candidate list of the current block.
  • the offset value selection unit 1002 is configured to determine the target offset value from the corresponding offset value candidate set according to the type of the frame to which the current block belongs, where the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type.
  • alternatively, it is configured to determine the target offset value from the same offset value candidate list according to the type of the frame to which the current block belongs.
  • the inter-frame prediction processing unit 1001 is further configured to: determine the search starting point according to the reference MV; use the target offset value at the search starting point to search for the reference block of the current block.
  • the inter prediction apparatus 1000 uses different offset value sets to perform inter prediction for different types of frames.
  • for frames with relatively complex motion, the inter-frame prediction device 1000 can use a set containing more offset values, so that a smaller search step can be selected to search accurately for the optimal MV;
  • for frames with relatively simple motion, the inter-frame prediction device 1000 can use a set containing fewer offset values (i.e., a subset of the set containing more offset values), so that the optimal MV can be found quickly.
  • the two sets can be located in a list and stored in a buffer, thereby reducing the storage space consumed by video encoding.
  • the apparatus 1000 may include:
  • the inter-frame prediction processing unit 1001 is configured to determine the reference MV of the current block according to the motion information candidate list of the current block.
  • the offset value selection unit 1002 is used to determine the target offset value.
  • the target offset value is an offset value in the offset value candidate set corresponding to the type of the frame to which the current block belongs, the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; or, image blocks in different types of frames use the same offset value candidate list to determine the target offset value.
  • the inter-frame prediction processing unit 1001 is further configured to: determine the search starting point according to the reference MV; use the target offset value at the search starting point to search for the reference block of the current block.
  • since the offset value sets corresponding to different types of frames are all preset sets, the device 1000 can determine the target offset value based on the index number in the code stream without needing to determine, according to the type of the frame, whether to apply shift processing to the offset values; therefore, the apparatus 1000 reduces the complexity of decoding.
  • each module in the inter-frame prediction apparatus in the embodiments of the present application is a functional body that implements the various steps in the method embodiments of the present application; for details, refer to the introduction of the inter-frame prediction method in the method embodiments herein, which is not repeated here.
  • FIG. 11 is a schematic block diagram of an implementation manner of an encoding device or a decoding device (referred to as a decoding device 1100 for short) provided by this application.
  • the decoding device 1100 may include a processor 1110, a memory 1130, and a bus system 1150.
  • the processor and the memory are connected through a bus system, the memory is used to store instructions, and the processor is used to execute instructions stored in the memory.
  • the memory of the encoding device stores program codes, and the processor can call the program codes stored in the memory to execute various video encoding or decoding methods described in this application, especially the inter-frame prediction method described in this application. To avoid repetition, it will not be described in detail here.
  • the processor 1110 may be a CPU, and the processor 1110 may also be other general-purpose processors, DSP, ASIC, FPGA or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc. .
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 1130 may include ROM or RAM. Any other suitable type of storage device can also be used as the memory 1130.
  • the memory 1130 may include code and data 1131 accessed by the processor 1110 using the bus 1150.
  • the memory 1130 may further include an operating system 1133 and an application program 1135.
  • the application program 1135 includes at least one program that allows the processor 1110 to execute the video encoding or decoding method described in this application (especially the inter-frame prediction method described in this application).
  • the application program 1135 may include applications 1 to N, which further include a video encoding or decoding application (referred to as a video decoding application) that executes the video encoding or decoding method described in this application.
  • in addition to a data bus, the bus system 1150 may also include a power bus, a control bus, and a status signal bus. However, for clarity of description, the various buses are all marked as the bus system 1150 in the figure.
  • the decoding device 1100 may further include one or more output devices, such as a display 1170.
  • the display 1170 may be a touch-sensitive display, which merges the display with a touch-sensitive unit operable to sense touch input.
  • the display 1170 may be connected to the processor 1110 via the bus 1150.
  • this application also provides a computer-readable storage medium covering the above-mentioned encoding method and decoding method. The computer-readable storage medium contains a computer program product that can be read by one or more processors to retrieve the instructions, code, and/or data structures for implementing the techniques described in this application.
  • Computer-readable storage media may include intangible media and tangible media.
  • the intangible medium is, for example, a signal or carrier wave.
  • tangible media may include magnetic media such as floppy disks, hard disks, and magnetic tape, optical media such as DVDs, and semiconductor media such as solid state disks (SSDs).
  • in addition, a connection may also be referred to as a computer-readable medium. For example, if coaxial cable, optical fiber, twisted pair, digital subscriber line (DSL), infrared, radio, or microwave is used to transmit instructions from a website, server, or other remote source, then coaxial cable, optical fiber, twisted pair, DSL, infrared, radio, and microwave are included in the definition of media.
  • the technology of the present application can be implemented in a wide variety of devices or apparatuses, including wireless handsets, integrated circuits (ICs), or a set of ICs (for example, a chipset).
  • various components, modules, or units are described in this application to emphasize that the apparatuses can implement the functions of the disclosed technology, but these functions do not necessarily need to be implemented by different hardware units.
  • in fact, the various units can be integrated, in combination with appropriate software and/or firmware, into a hardware unit of the encoder or decoder.


Abstract

Provided is a video encoding method, comprising: determining reference motion information of a current block according to a motion information candidate list of the current block; determining a target offset value from a corresponding offset value candidate set according to the type of the frame to which the current block belongs, wherein the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; determining a search starting point according to the reference motion information; and, at the search starting point, searching for a reference block of the current block using the target offset value as the search step. With this method, for frames with complex motion, the video encoding apparatus can use a set containing more offset values, so that a smaller search step can be selected to search accurately for the optimal motion vector; for frames with simple motion, the video encoding apparatus can use a set containing fewer offset values, so that the optimal motion vector can be found quickly.

Description

Method and apparatus for video encoding, and method and apparatus for video decoding
Copyright Notice
The disclosure of this patent document contains material subject to copyright protection. The copyright belongs to the copyright owner. The copyright owner does not object to the reproduction by anyone of this patent document or this patent disclosure as it appears in the official records and files of the Patent and Trademark Office.
Technical Field
This application relates to the fields of video encoding and video decoding, and more specifically to a method and apparatus for video encoding, and a method and apparatus for video decoding.
Background
The basic principle of video coding is to exploit correlations in the spatial domain, the temporal domain, and between codewords to remove redundancy as far as possible, saving transmission bandwidth or storage space. A commonly used approach is the block-based hybrid video coding framework, which achieves video compression through steps such as prediction (including intra prediction and inter prediction), transform, quantization, and entropy coding. Among the various video encoding and decoding schemes, motion estimation and motion compensation are key techniques that affect encoding and decoding performance.
Because objects in neighboring frames of a video are correlated to some degree, a frame to be encoded can be divided into image blocks, the position of each image block can be searched for in a neighboring frame, and the relative spatial offset between the two can be derived. The resulting relative offset is the motion vector (MV), and the process of obtaining it is called motion estimation (ME). Motion estimation can remove inter-frame redundancy and reduce the bit overhead of video transmission.
Because objects move in complex and varied ways in different types of video, the encoder and the decoder spend considerable time on motion estimation; how to improve the efficiency of motion estimation is a problem that currently needs to be solved.
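The block-matching motion estimation described in the background can be sketched as a full search minimizing the sum of absolute differences (SAD). This is a minimal illustrative sketch; the function names, the SAD metric, and the full-search strategy are assumptions for illustration, not taken from the patent text.

```python
# Illustrative sketch of block-based motion estimation (full search with SAD).
# All names here are assumptions for illustration only.

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def motion_estimate(cur_block, ref_frame, top, left, search_range):
    """Search ref_frame around (top, left) and return the best MV (dy, dx) and its cost."""
    h, w = len(cur_block), len(cur_block[0])
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > len(ref_frame) or x + w > len(ref_frame[0]):
                continue  # candidate block falls outside the reference frame
            cand = [row[x:x + w] for row in ref_frame[y:y + h]]
            cost = sad(cur_block, cand)
            if cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost
```

The returned relative offset (dy, dx) is exactly the MV the background paragraph describes; real encoders replace the full search with faster patterns.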
Summary
This application provides a method and apparatus for video encoding, and a method and apparatus for video decoding, which can reduce the complexity of encoding and the complexity of decoding.
In a first aspect, this application provides a video encoding method, comprising: determining reference motion information of a current block according to a motion information candidate list of the current block; determining a target offset value from a corresponding offset value candidate set according to the type of the frame to which the current block belongs, wherein the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; determining a search starting point according to the reference motion information; and, at the search starting point, searching for a reference block of the current block using the target offset value as the search step.
A video encoding apparatus applying the above method uses different offset value sets for different types of frames when performing inter-frame prediction. For frames with relatively complex motion, the apparatus can use a set containing more offset values, so that a smaller search step can be selected to search accurately for the optimal motion vector; for frames with relatively simple motion, the apparatus can use a set containing fewer offset values (i.e., a subset of the set containing more offset values), so that the optimal motion vector can be found quickly. Furthermore, because the apparatus selects the target offset value from a preset set for frames of all types, it no longer needs to determine, based on the type, whether to apply shift processing to the offset values, which reduces the complexity of video encoding.
In a second aspect, this application provides another video encoding method, comprising: determining reference motion information of a current block according to a motion information candidate list of the current block; determining a target offset value from the same offset value candidate list according to the type of the frame to which the current block belongs; determining a search starting point according to the reference motion information; and, at the search starting point, searching for a reference block of the current block using the target offset value as the search step.
A video encoding apparatus applying the above method uses different offset value sets for different types of frames when performing inter-frame prediction. For frames with relatively complex motion, the apparatus can use a set containing more offset values, so that a smaller search step can be selected to search accurately for the optimal motion vector; for frames with relatively simple motion, the apparatus can use a set containing fewer offset values, so that the optimal motion vector can be found quickly. Furthermore, because the apparatus selects the target offset value from a preset list for frames of all types, it no longer needs to determine, based on the type, whether to apply shift processing to the offset values, which reduces the complexity of video encoding.
In a third aspect, this application provides yet another video encoding method, comprising: determining reference motion information of the current block according to a motion information candidate list of the current block; selecting a target offset value of the current block from an offset value candidate list, wherein video frames of screen content and of non-screen content use the same offset value candidate list; determining motion information of the current block according to the reference motion information and the target offset value; and encoding the current block according to the motion information of the current block to obtain a bitstream, wherein the bitstream contains an index number of the target offset value in the offset value candidate list.
A video encoding apparatus applying the above method uses different offset value sets for different types of frames when performing inter-frame prediction. For frames with relatively complex motion, the apparatus can use a set containing more offset values, so that a smaller search step can be selected to search accurately for the optimal motion vector; for frames with relatively simple motion, the apparatus can use a set containing fewer offset values, so that the optimal motion vector can be found quickly. Furthermore, because the apparatus selects the target offset value from a preset list for frames of all types, it no longer needs to determine, based on the type, whether to apply shift processing to the offset values, which reduces the complexity of video encoding. In addition, because the index numbers of the target offset values corresponding to different frame types share one common numbering, the bitstream does not need to carry information indicating the frame type, which reduces the bit overhead of the bitstream.
In a fourth aspect, this application provides a video decoding method, comprising: determining reference motion information of a current block according to a motion information candidate list of the current block; determining a target offset value, wherein the target offset value is one offset value in the offset value candidate set corresponding to the type of the frame to which the current block belongs, wherein the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; determining a search starting point according to the reference motion information; and, at the search starting point, searching for a reference block of the current block using the target offset value as the search step.
Because the offset value sets corresponding to frames of different types are all preset sets, the video decoding apparatus does not need to determine, according to the frame type, whether to apply shift processing to the offset values; therefore, a video decoding apparatus applying the above method reduces the complexity of decoding.
In a fifth aspect, this application provides another video decoding method, comprising: determining reference motion information of a current block according to a motion information candidate list of the current block; determining a target offset value from one offset value candidate list, wherein image blocks in frames of different types use the same offset value candidate list to determine the target offset value; determining a search starting point according to the reference motion information; and, at the search starting point, searching for a reference block of the current block using the target offset value as the search step.
Because the offset value sets corresponding to frames of different types belong to a preset offset value list, the video decoding apparatus does not need to determine, according to the frame type, whether to apply shift processing to the offset values; therefore, a video decoding apparatus applying the above method reduces the complexity of decoding.
In a sixth aspect, this application provides yet another video decoding method, comprising: receiving a bitstream containing an index number; selecting a target offset value of the current block from an offset value candidate list according to the index number, wherein video frames of screen content and of non-screen content use the same offset value candidate list; determining reference motion information of the current block according to a motion information candidate list of the current block; determining motion information of the current block according to the reference motion information and the target offset value; and decoding the current block according to the motion information of the current block to obtain the decoded current block.
Because the offset value sets corresponding to frames of different types belong to a preset offset value list, the video decoding apparatus can determine the target offset value based on the index number in the bitstream, without determining, according to the frame type, whether to apply shift processing to the offset values; therefore, a video decoding apparatus applying the above method reduces the complexity of decoding. In addition, such an apparatus does not need to receive information indicating the frame type, and the bitstream does not need to carry that information, which reduces the bit overhead of the bitstream.
In a seventh aspect, this application provides an encoding apparatus comprising a processing unit configured to: determine reference motion information of a current block according to a motion information candidate list of the current block; determine a target offset value from a corresponding offset value candidate set according to the type of the frame to which the current block belongs, wherein the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; determine a search starting point according to the reference motion information; and, at the search starting point, search for a reference block of the current block using the target offset value as the search step.
In an eighth aspect, this application provides an encoding apparatus comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, execution of which causes the processor to perform the method provided in the first aspect.
In a ninth aspect, this application provides a chip comprising a processing module and a communication interface, the processing module being configured to control the communication interface to communicate with the outside, and further configured to implement the method provided in the first aspect.
In a tenth aspect, this application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a computer, causes the computer to implement the method provided in the first aspect.
In an eleventh aspect, this application provides a computer program product containing instructions which, when executed by a computer, cause the computer to implement the method provided in the first aspect.
In a twelfth aspect, this application provides an encoding apparatus comprising a processing unit configured to: determine reference motion information of a current block according to a motion information candidate list of the current block; determine a target offset value from the same offset value candidate list according to the type of the frame to which the current block belongs; determine a search starting point according to the reference motion information; and, at the search starting point, search for a reference block of the current block using the target offset value as the search step.
In a thirteenth aspect, this application provides an encoding apparatus comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, execution of which causes the processor to perform the method provided in the second aspect.
In a fourteenth aspect, this application provides a chip comprising a processing module and a communication interface, the processing module being configured to control the communication interface to communicate with the outside, and further configured to implement the method provided in the second aspect.
In a fifteenth aspect, this application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a computer, causes the computer to implement the method provided in the second aspect.
In a sixteenth aspect, this application provides a computer program product containing instructions which, when executed by a computer, cause the computer to implement the method provided in the second aspect.
In a seventeenth aspect, this application provides an encoding apparatus comprising a processing unit configured to: determine reference motion information of the current block according to a motion information candidate list of the current block; select a target offset value of the current block from an offset value candidate list, wherein video frames of screen content and of non-screen content use the same offset value candidate list; determine motion information of the current block according to the reference motion information and the target offset value; and encode the current block according to the motion information of the current block to obtain a bitstream, wherein the bitstream contains an index number of the target offset value in the offset value candidate list.
In an eighteenth aspect, this application provides an encoding apparatus comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, execution of which causes the processor to perform the method provided in the third aspect.
In a nineteenth aspect, this application provides a chip comprising a processing module and a communication interface, the processing module being configured to control the communication interface to communicate with the outside, and further configured to implement the method provided in the third aspect.
In a twentieth aspect, this application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a computer, causes the computer to implement the method provided in the third aspect.
In a twenty-first aspect, this application provides a computer program product containing instructions which, when executed by a computer, cause the computer to implement the method provided in the third aspect.
In a twenty-second aspect, this application provides a decoding apparatus comprising a processing unit configured to: determine reference motion information of a current block according to a motion information candidate list of the current block; determine a target offset value, wherein the target offset value is one offset value in the offset value candidate set corresponding to the type of the frame to which the current block belongs, wherein the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; determine a search starting point according to the reference motion information; and, at the search starting point, search for a reference block of the current block using the target offset value as the search step.
In a twenty-third aspect, this application provides a decoding apparatus comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, execution of which causes the processor to perform the method provided in the fourth aspect.
In a twenty-fourth aspect, this application provides a chip comprising a processing module and a communication interface, the processing module being configured to control the communication interface to communicate with the outside, and further configured to implement the method provided in the fourth aspect.
In a twenty-fifth aspect, this application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a computer, causes the computer to implement the method provided in the fourth aspect.
In a twenty-sixth aspect, this application provides a computer program product containing instructions which, when executed by a computer, cause the computer to implement the method provided in the fourth aspect.
In a twenty-seventh aspect, this application provides a decoding apparatus comprising a processing unit configured to: determine reference motion information of a current block according to a motion information candidate list of the current block; determine a target offset value from one offset value candidate list, wherein image blocks in frames of different types use the same offset value candidate list to determine the target offset value; determine a search starting point according to the reference motion information; and, at the search starting point, search for a reference block of the current block using the target offset value as the search step.
In a twenty-eighth aspect, this application provides a decoding apparatus comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, execution of which causes the processor to perform the method provided in the fifth aspect.
In a twenty-ninth aspect, this application provides a chip comprising a processing module and a communication interface, the processing module being configured to control the communication interface to communicate with the outside, and further configured to implement the method provided in the fifth aspect.
In a thirtieth aspect, this application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a computer, causes the computer to implement the method provided in the fifth aspect.
In a thirty-first aspect, this application provides a computer program product containing instructions which, when executed by a computer, cause the computer to implement the method provided in the fifth aspect.
In a thirty-second aspect, this application provides a decoding apparatus comprising a receiving unit and a processing unit, the receiving unit being configured to receive a bitstream containing an index number, and the processing unit being configured to: select a target offset value of the current block from an offset value candidate list according to the index number, wherein video frames of screen content and of non-screen content use the same offset value candidate list; determine reference motion information of the current block according to a motion information candidate list of the current block; determine motion information of the current block according to the reference motion information and the target offset value; and decode the current block according to the motion information of the current block to obtain the decoded current block.
In a thirty-third aspect, this application provides a decoding apparatus comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, execution of which causes the processor to perform the method provided in the sixth aspect.
In a thirty-fourth aspect, this application provides a chip comprising a processing module and a communication interface, the processing module being configured to control the communication interface to communicate with the outside, and further configured to implement the method provided in the sixth aspect.
In a thirty-fifth aspect, this application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a computer, causes the computer to implement the method provided in the sixth aspect.
In a thirty-sixth aspect, this application provides a computer program product containing instructions which, when executed by a computer, cause the computer to implement the method provided in the sixth aspect.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a video encoding method applicable to this application.
FIG. 2 is a schematic diagram of a video decoding method applicable to this application.
FIG. 3 is a schematic diagram of a method of constructing a motion information candidate list applicable to this application.
FIG. 4 is a schematic diagram of a method of determining an optimal motion vector applicable to this application.
FIG. 5 is a schematic diagram of an interpolation method applicable to this application.
FIG. 6 is a schematic diagram of a video encoding method provided by this application.
FIG. 7 is a schematic diagram of a video decoding method provided by this application.
FIG. 8 is a schematic diagram of a video encoder provided by this application.
FIG. 9 is a schematic diagram of a video decoder provided by this application.
FIG. 10 is a schematic diagram of an inter-frame prediction apparatus provided by this application.
FIG. 11 is a schematic diagram of a video encoding device or video decoding device provided by this application.
Detailed Description
To facilitate understanding of this application, the technical features that may be involved in the technical solutions provided herein are described first.
FIG. 1 shows a schematic diagram of a video encoding method applicable to this application.
The video encoding method includes stages such as intra prediction, inter prediction, transform, quantization, entropy encoding, and in-loop filtering. After an image is partitioned into coding blocks, intra prediction or inter prediction is performed; after the residual is obtained, transform and quantization are performed; finally, entropy encoding is performed and the bitstream is output. Here, a coding block is an M×N array of pixels (M may or may not equal N) whose pixel values are known. In FIG. 1, P denotes the predicted value, Dn denotes the residual, uFn' denotes the reconstructed value (before filtering), and Dn' denotes the residual.
Intra prediction refers to predicting the pixel values of pixels in the current coding block using the pixel values of pixels in the already-reconstructed region of the current image.
Inter prediction searches the reconstructed images for a reference block that matches the current coding block in the current image. The pixel values of the pixels in the reference block are used as prediction information or predicted values (hereafter no distinction is made between information and values) of the pixel values of the pixels in the current coding block; this process is motion estimation.
It should be noted that the motion information of the current coding block includes indication information of the prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), one or two motion vectors pointing to the reference block(s), and indication information of the image in which the reference block is located (usually a reference frame index).
Forward prediction means the current coding block selects at least one reference image from a forward reference image set to obtain at least one reference block. Backward prediction means the current coding block selects at least one reference image from a backward reference image set to obtain at least one reference block. Bidirectional prediction means selecting at least one reference image from each of the forward and backward reference image sets to obtain at least one reference block from each. When bidirectional prediction is used, the current coding block has at least two reference blocks, each indicated by its own motion vector and reference frame index; the predicted values of the pixel values of the pixels in the current block are then determined from the pixel values of the pixels in the at least two reference blocks.
The motion estimation process needs to search the reference image for multiple reference blocks for the current coding block; rate-distortion optimization (RDO) or other methods can be used to determine which reference block or blocks are finally used for prediction.
After the prediction information is obtained by intra prediction or inter prediction, residual information can be obtained from the pixel values of the pixels in the current coding block and the corresponding prediction information. For example, the residual may be obtained by directly subtracting the pixel values of the reference block from those of the current coding block, or in other possible ways. The residual information is then transformed using methods such as the discrete cosine transform (DCT), and the transformed residual is quantized and entropy-encoded, finally yielding the bitstream so that the decoder can decode it. In the encoder-side processing, a filtering operation may also be applied to the prediction information and residual information to obtain reconstruction information, which serves as reference information for subsequent encoding.
The decoder's processing of the bitstream resembles the inverse of the encoder's processing of the image. FIG. 2 shows a schematic flow of a bitstream decoding method applicable to this application.
As shown in FIG. 2, residual information is first obtained through entropy decoding, dequantization, and inverse transform; the decoder first parses the bitstream to obtain the prediction mode of the current block to be decoded. For intra prediction, the prediction information is constructed using the pixel values of pixels in the reconstructed region around the current block to be decoded. For inter prediction, the motion information of the current decoding block must be obtained and used to determine a reference block in the reconstructed images; the pixel values of the pixels in the reference block serve as the prediction information. Using the prediction information (also called the prediction block) and the residual information (also called the residual block), and after a filtering operation, the reconstruction information (also called the reconstructed block) of the current block to be decoded is obtained, yielding a further part of the reconstructed image.
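The final reconstruction step of this decoding flow (prediction block plus residual block) can be sketched as a per-pixel sum with clipping to the valid pixel range. The function names and the 8-bit clipping range are illustrative assumptions.

```python
# Minimal sketch of the reconstruction step in the FIG. 2 decoding flow: the prediction
# block and the residual block are summed pixel-by-pixel, clipped to the 8-bit range,
# to form the reconstructed block. Names are illustrative assumptions.

def clip8(v):
    """Clip a value to the valid 8-bit pixel range [0, 255]."""
    return max(0, min(255, v))

def reconstruct(pred_block, residual_block):
    """Reconstructed block = prediction + residual, clipped per pixel."""
    return [[clip8(p + r) for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(pred_block, residual_block)]
```

A real decoder would additionally apply the loop filters mentioned above before the block is displayed or used as a reference.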
In some implementations, before predicting the current image block, the encoder (or decoder) constructs a motion information candidate list and predicts the current image block according to candidate motion information selected from that list. The current image block is the image block to be encoded (or decoded); the image frame containing the current image block is called the current frame. For example, in some video standards the current image block is a coding unit (CU) or a decoding unit (DU).
The motion information mentioned herein may include a motion vector, or a motion vector together with reference frame information. The motion information candidate list refers to the set of candidate motion information of the current block; the candidates in the list may be stored in the same buffer or in different buffers, without limitation here. The index of motion information in the candidate list mentioned below may be the index of the motion information within the full set of candidate motion information of the current block, or the index within the buffer in which the motion information resides, without limitation here.
There are several modes of constructing the motion information candidate list. One of these modes is illustrated below by way of example.
As an example, at the encoder, after the motion information candidate list is constructed, encoding of the current image block can be completed through the following steps:
1) Select the optimal motion information from the motion information candidate list, determine the motion vector MV1 of the current image block according to it, and obtain the index of the selected motion information in the motion information candidate list.
2) According to the motion vector MV1 of the current image block, determine the predicted image block of the current image block in the reference image (i.e., the reference frame); that is, determine the position of the predicted image block of the current image block in the reference frame.
3) Obtain the residual between the current image block and the predicted image block.
4) Send the index obtained in step 1) and the residual obtained in step 3) to the decoder.
As an example, the decoder can decode the current image block through the following steps:
1) Receive the residual and the index from the encoder.
2) Construct the motion information candidate list using a preset method, which may be identical to the method used by the encoder.
3) According to the index, select the motion information in the motion information candidate list and determine the motion vector MV1 of the current image block from it.
4) According to the motion vector MV1, obtain the predicted image block of the current image block, and combine it with the residual to decode the current image block.
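The index-based signalling in the steps above can be sketched as a small round trip: the encoder sends only the index of the selected candidate, and the decoder, having built the identical list, recovers MV1 by that index. The function names are illustrative assumptions.

```python
# Sketch of the merge-style signalling in steps 1) to 4): only the index of the chosen
# candidate is transmitted; the decoder rebuilds the same candidate list and looks the
# motion vector up by that index. Names are illustrative assumptions.

def encoder_select(candidate_list, best_mv):
    """Encoder side: return the index of the selected candidate motion vector."""
    return candidate_list.index(best_mv)

def decoder_recover(candidate_list, index):
    """Decoder side: the identically constructed list yields MV1 from the index."""
    return candidate_list[index]
```

The scheme only works because both sides construct the candidate list with the same preset method, which is exactly what step 2) of the decoder requires.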
That is, in this mode the motion vector of the current image block equals the motion vector prediction (MVP) (e.g., the motion vector MV1 mentioned above). In some video coding standards, this mode includes the merge mode. In some video coding standards, the merge mode includes a normal merge mode and/or an affine merge mode.
In this mode, the motion vector of the current image block equals the MVP (e.g., the motion vector MV1 mentioned above). In some examples, the MVP may be one candidate in the motion information candidate list, or may be obtained by processing (e.g., scaling) one of the candidates in that list.
In addition, the motion vector of the current image block can also be obtained by further refining the MVP. For example, using the MVP as the reference MV, a search is performed around it with fixed search steps, and the optimal MV is selected from the search results. In one example, at the encoder, after the reference MV is determined, each search step in the offset value candidate list is tried in default fixed directions (e.g., up, down, left, right); during the search, the offset is applied by the step a default number of times (e.g., once), and the index number of the offset value and the index number of the direction used by the image block corresponding to the best search result are written into the bitstream. The decoder receives the index number of the reference MV, the index number of the offset value, and the index number of the direction, and then, starting from the block pointed to by the reference MV, applies the offset corresponding to the offset index once in the direction corresponding to the direction index, obtaining the reference block of the current image block. In some video standards, this mode may also be called merge mode with MVD (MMVD).
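The MMVD-style refinement just described can be sketched as follows: every offset in the candidate list is tried once in each of four fixed directions, and only the (offset index, direction index) pair of the best candidate is signalled. The direction ordering, the cost function, and all names are illustrative assumptions, not taken from the patent or any specific standard.

```python
# Hedged sketch of MMVD-style refinement: base MV plus one offset applied once in one
# of four fixed directions; the encoder signals the winning pair of indices and the
# decoder re-applies it. Names and ordering are illustrative assumptions.

DIRECTIONS = [(0, -1), (0, 1), (-1, 0), (1, 0)]  # assumed order: up, down, left, right

def mmvd_candidates(base_mv, offsets):
    """Yield (offset_idx, dir_idx, candidate_mv) for every offset/direction pair."""
    bx, by = base_mv
    for oi, off in enumerate(offsets):
        for di, (dx, dy) in enumerate(DIRECTIONS):
            yield oi, di, (bx + off * dx, by + off * dy)

def mmvd_select(base_mv, offsets, cost):
    """Encoder side: pick the (offset_idx, dir_idx) with the lowest cost."""
    return min(mmvd_candidates(base_mv, offsets), key=lambda c: cost(c[2]))[:2]

def mmvd_apply(base_mv, offsets, offset_idx, dir_idx):
    """Decoder side: apply the signalled offset once in the signalled direction."""
    dx, dy = DIRECTIONS[dir_idx]
    off = offsets[offset_idx]
    return (base_mv[0] + off * dx, base_mv[1] + off * dy)
```

Only two small indices reach the bitstream, which is what keeps this refinement cheap to signal.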
FIGS. 3 and 4 show how the optimal MV is determined in MMVD.
In MMVD, the encoder may select several (e.g., two) reference MVs from the motion vector candidate list. For example, one MV is the MV of a spatially neighboring encoded block (e.g., the MV of one of the encoded blocks among image blocks 1 to 5 in FIG. 3), and the other MV is the MV of a temporally neighboring encoded block (e.g., the MV of the image block in another image frame at the position corresponding to image block 6 or image block 7 in FIG. 3). The pixels corresponding to these two reference MVs are shown as the two dashed circles in FIG. 4, where L0 denotes the image frame pointed to by the first reference MV and L1 denotes the image frame pointed to by the second reference MV.
In one example, the encoder searches from each selected reference MV as a starting point, along fixed directions (e.g., the four directions up, down, left, and right). In one example, the step used during the search is one of the offset values in the offset value candidate list. In one example, the offset value candidate list contains eight offset values, namely {1/4, 1/2, 1, 2, 4, 8, 16, 32}, with corresponding search-step index numbers 0 to 7. The encoder selects one offset value from the offset value candidate list as the search step, and determines the optimal MV according to RDO or other rules. The solid circles and hollow circles represent the pixels searched with the two search steps.
Because of the continuity of object motion, the MV of an object between two adjacent frames is not necessarily an integer number of pixels. However, fractional pixels (e.g., quarter-pixels and half-pixels) do not exist in digital video; that is, there are no other pixels between two pixels. To improve the precision of the MV search, the values at these fractional pixel positions can be approximated by interpolation: the reference frame is interpolated in the row and column directions, and the search is performed in the interpolated reference frame. Interpolating the current block requires the pixels of the current block and the pixels of its neighboring region. An interpolation method is illustrated below.
a(0,j) = ( ∑_{i=-3..3} A(i,j)·qfilter[i] ) >> (B-8)
b(0,j) = ( ∑_{i=-3..4} A(i,j)·hfilter[i] ) >> (B-8)
c(0,j) = ( ∑_{i=-2..4} A(i,j)·qfilter[1-i] ) >> (B-8)
d(0,0) = ( ∑_{j=-3..3} A(0,j)·qfilter[j] ) >> (B-8)
h(0,0) = ( ∑_{j=-3..4} A(0,j)·hfilter[j] ) >> (B-8)
n(0,0) = ( ∑_{j=-2..4} A(0,j)·qfilter[1-j] ) >> (B-8)
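The half-pixel filtering of the b(0,j) form above can be sketched as an 8-tap weighted sum over the integer pixels A(i,j), i = -3..4, followed by a right shift. The tap values below are the HEVC-style half-pel luma filter and are stated here as an assumption; the patent text itself does not list the coefficients.

```python
# Minimal sketch of half-pixel interpolation in the b(0,j) form: an 8-tap filter over
# integer pixels, then a shift by (B - 8). The taps are the HEVC-style half-pel luma
# filter, given as an assumption; with B = 8 no shift is applied and the result stays
# at 64x scale, as in a real codec's intermediate values.

HFILTER = [-1, 4, -11, 40, 40, -11, 4, -1]  # taps for i = -3..4 (assumed values)

def half_pel(row, x, bit_depth=8):
    """Interpolate the half-pixel between row[x] and row[x+1] (result at 64x scale)."""
    acc = sum(row[x - 3 + k] * HFILTER[k] for k in range(8))
    return acc >> (bit_depth - 8)
```

On a constant row the result is 64 times the pixel value (the taps sum to 64), and on a linear ramp it lands exactly halfway between two integer samples, which is what makes it a half-pel filter.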
After interpolation is complete, the optimal MV can be searched for using fractional search steps such as 1/4 or 1/2.
Because objects move differently in different types of frames, different search steps need to be selected according to the frame to which the current block belongs.
This application provides a video encoding method that can improve encoding performance. As shown in FIG. 6, the method 600 includes:
S610: Determine the reference MV of the current block according to the motion information candidate list of the current block.
The method of determining the reference MV is as described above and is not repeated here.
S620: Determine a target offset value from the corresponding offset value candidate set according to the type of the frame to which the current block belongs, where the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type.
The content of a first-type frame varies more simply than the content of a second-type frame. The first type is, for example, screen content, and the second type is, for example, non-screen content. Screen content may be frames obtained by screen recording, and non-screen content may be frames obtained by shooting natural objects. In general, the way objects move in screen content (e.g., the translation of text) is simpler than the way natural objects move in non-screen content (e.g., the rotation of an athlete). Therefore, first-type frames may also be called simple-motion frames, and second-type frames may also be called complex-motion frames.
For simple-motion frames, the encoder can use longer search steps to search for the optimal MV, improving encoding efficiency; for complex-motion frames, the encoder can use shorter search steps to search for the optimal MV.
For example, if the frame to which the current block belongs is of the first type, the encoder may select a value from the offset value candidate set {1, 2, 4, 8, 16, 32} corresponding to the first type as the target offset value of the current block; if the frame is of the second type, the encoder may select a value from the offset value candidate set {1/4, 1/2, 1, 2, 4, 8, 16, 32} corresponding to the second type as the target offset value of the current block.
Because the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type, as an optional implementation the two candidate sets may be stored merged; that is, the offset value candidate set corresponding to the first type and the offset value candidate set corresponding to the second type are located in the same offset value candidate list. When encoding image blocks in frames of the first type and of the second type, the encoder can write into the bitstream the index number of the selected offset value in that same offset value candidate list. Optionally, the two offset value candidate sets are stored in the same buffer.
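The merged storage just described can be sketched as one shared list with per-type index ranges, so that the index written to the bitstream always refers to the single shared list and no per-type shifting is needed. The names and the specific index ranges used here are illustrative assumptions consistent with the example sets above.

```python
# Sketch of the merged storage described above: both candidate sets live in one offset
# value candidate list, and each frame type just selects a sub-range of it. Names and
# index ranges are illustrative assumptions.

OFFSET_LIST = [0.25, 0.5, 1, 2, 4, 8, 16, 32]   # shared offset value candidate list
FIRST_TYPE_INDICES = range(2, 8)                 # yields {1, 2, 4, 8, 16, 32}
SECOND_TYPE_INDICES = range(0, 8)                # yields {1/4, 1/2, 1, 2, 4, 8, 16, 32}

def candidate_offsets(frame_type):
    """Return the offset candidate set for a frame type as a view of the shared list."""
    idxs = FIRST_TYPE_INDICES if frame_type == "first" else SECOND_TYPE_INDICES
    return [OFFSET_LIST[i] for i in idxs]
```

Because the first type's indices are a sub-range of the second type's, the subset relationship in S620 holds by construction.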
The above examples are only illustrative; the offset value candidate sets applicable to this application are not limited to them. For example, the offset value candidate set corresponding to the first type may also be {1, 2, 4, 8, 16, 32, 64}, with the offset value candidate set corresponding to the second type being {1/4, 1/2, 1, 2, 4, 8, 16, 32}. In that case, the offset value candidate set corresponding to the first type and the offset value candidate set corresponding to the second type may still be located in the same offset value candidate list.
In one example, S620 may also be replaced by the following step:
Determine a target offset value from the same offset value candidate list according to the type of the frame to which the current block belongs.
After the encoder determines the target offset value, the following steps can be executed. Optionally, S630 may also be executed before S620, or together with S620.
S630: Determine the search starting point according to the reference MV.
S640: At the search starting point, search for the reference block of the current block using the target offset value as the search step.
If the search step is a non-integer pixel length such as 1/4 or 1/2, the encoder must interpolate the reference frame and search for the reference block in the interpolated reference frame; if the search step is an integer pixel length such as 1 or 2, the encoder does not need to interpolate the reference frame and can search for the reference block directly in the reference frame. The interpolation method and the method of searching for the reference block were described in detail above and are not repeated here.
In some examples, the offset value candidate list used for screen content is {1, 2, 4, 8, 16, 32, 64, 128}, and the offset value candidate list used for non-screen content is {1/4, 1/2, 1, 2, 4, 8, 16, 32}. After encoding an image block, the encoder must add to the bitstream a flag indicating whether the frame type is screen content or non-screen content, as well as the index number of the selected offset value in the corresponding offset value candidate list. Upon receiving the bitstream, the decoder determines from the flag whether the image frame is screen content or non-screen content, selects the corresponding offset value candidate list, and, according to the index number in the bitstream, selects the corresponding offset value from that list as the search step. In some examples, to reduce the storage occupied, the encoder and decoder store only the non-screen-content offset value candidate list {1/4, 1/2, 1, 2, 4, 8, 16, 32}; when the flag indicates that the image frame is screen content, a shift operation is applied to the list {1/4, 1/2, 1, 2, 4, 8, 16, 32} to obtain {1, 2, 4, 8, 16, 32, 64, 128}, and the corresponding offset value is then selected by index number from the shifted offset value candidate list as the search step.
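The legacy shift-based scheme just described can be sketched as follows: only the non-screen-content list is stored, and when the signalled flag says the frame is screen content, the whole list is scaled by 4 (a left shift by two) before the index lookup. The names are illustrative assumptions.

```python
# Sketch of the legacy scheme described above: one stored list, shifted (multiplied by 4,
# i.e. a left shift by two) whenever the frame-type flag indicates screen content.
# Names are illustrative assumptions.

STORED_LIST = [0.25, 0.5, 1, 2, 4, 8, 16, 32]  # only the non-screen-content list is kept

def legacy_lookup(is_screen_content, index):
    """Return the search step: shift the stored list first if the frame is screen content."""
    table = [v * 4 for v in STORED_LIST] if is_screen_content else STORED_LIST
    return table[index]
```

This is exactly the per-type shift processing that the method of this application avoids by sharing one index numbering across frame types.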
In some examples of this application, because a search step of 64 or 128 is too large, the optimal MV obtained with either of these search steps may only be a locally optimal MV, which may have a negative impact on encoding performance; therefore, the encoding method provided by this application does not need to use 64 or 128 as the target offset value, which improves encoding performance. Moreover, the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type, or the target offset value for image blocks in frames of different types is determined from the same offset value candidate list; therefore, in this process neither the encoder nor the decoder needs to perform a shift operation, reducing encoding complexity. In some examples of this application, the encoder also does not need to add to the bitstream a flag indicating whether the image frame is of the first type or the second type; that is, the bitstream contains no flag indicating whether the image frame is of the first type or the second type, which reduces byte overhead.
After determining the reference block, the encoder can encode the current block to obtain a bitstream, where the bitstream contains the index number of the target offset value in the offset value candidate list.
In one example, the values in the offset value candidate list are {1/4, 1/2, 1, 2, 4, 8, 16, 32}, and the index numbers in the bitstream may be 0 to 7, corresponding to the eight values in the list. For an image block in a first-type frame, the index number indicating the search step in the bitstream is a number from 2 to 7; for an image block in a second-type frame, the index number indicating the search step in the bitstream is a number from 0 to 7.
In one example, the values in the offset value candidate list are {1/4, 1/2, 1, 2, 4, 8, 16, 32, 64}, and the index numbers in the bitstream may be 0 to 8, corresponding to the nine values in the list. For an image block in a first-type frame, the index number indicating the search step in the bitstream is a number from 2 to 8; for an image block in a second-type frame, the index number indicating the search step in the bitstream is a number from 0 to 8.
The encoder can write the index number of the target offset value in the offset value candidate list into the bitstream so that the decoder can search with the search step corresponding to the index number. Because frames of different types correspond to one set of offset value index numbers, the decoder does not need to shift the offset values according to the frame type; therefore, the encoder does not need to write frame-type indication information into the bitstream, reducing the signalling overhead of encoding.
The decoder can decode based on the method shown in FIG. 7. The method shown in FIG. 7 includes:
S710: Determine the reference MV of the current block according to the motion information candidate list of the current block.
The decoder can construct the motion information candidate list of the current block according to the method described for the encoder. Subsequently, the decoder selects the reference MV from the motion information candidate list and performs the following steps.
S720: Determine the target offset value.
The target offset value is one offset value in the offset value candidate set corresponding to the type of the frame to which the current block belongs, where the types include a first type and a second type.
In one example, the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type. The decoder may use one offset value candidate list; that is, the decoder determines the offset value candidate sets corresponding to frames of different types from one offset value candidate list. The decoder may also use multiple offset value candidate lists; that is, the decoder selects the target offset value from the offset value candidate list corresponding to each type.
For example, the offset value candidate set corresponding to the first type is {1, 2, 4, 8, 16, 32}, and the offset value candidate set corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}; the two sets may be located in one list, and the offset values contained in that list are {1/4, 1/2, 1, 2, 4, 8, 16, 32}. The index numbers indicating the search step in the bitstream received by the decoder correspond to different offset values in the same offset value candidate list.
In one example, the offset value candidate set corresponding to the first type and the offset value candidate set corresponding to the second type intersect. The decoder can use one offset value candidate list; that is, the decoder determines the offset value candidate sets corresponding to frames of different types from one offset value candidate list.
For example, the offset value candidate set corresponding to the first type is {1, 2, 4, 8, 16, 32, 64}, and the offset value candidate set corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}; the two sets may be located in one list, and the offset values contained in that list are {1/4, 1/2, 1, 2, 4, 8, 16, 32, 64}.
The decoder may determine the search step itself; optionally, the decoder may determine the index number from the bitstream corresponding to the current block, and determine the target offset value from one offset value candidate list according to that index number. Because frames of different types correspond to one set of offset value index numbers, the decoder does not need to shift the offset values according to the frame type. Optionally, the encoder does not need to write frame-type indication information into the bitstream, reducing the signalling overhead of encoding. When decoding, the decoder does not need to distinguish frame types: whatever type the image frame belongs to, the decoder parses the index number from the bitstream and then determines the search step from the same offset value candidate list.
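The decoder-side behaviour in S720 and S740 can be sketched as a single lookup: one index parsed from the bitstream resolves against the same shared list regardless of frame type, and interpolation is only needed when the resulting step is fractional. The names and the helper returning an interpolation flag are illustrative assumptions.

```python
# Decoder-side sketch of S720/S740: the parsed index resolves against one shared offset
# value candidate list with no frame-type flag and no shift operation; the reference
# frame only needs interpolation when the step is fractional. Names are assumptions.

SHARED_OFFSET_LIST = [0.25, 0.5, 1, 2, 4, 8, 16, 32]

def decode_search_step(parsed_index):
    """Return (step, needs_interpolation) for an index parsed from the bitstream."""
    step = SHARED_OFFSET_LIST[parsed_index]
    return step, step != int(step)
```

Note that the function takes no frame-type argument at all, mirroring the point that the decoder need not distinguish frame types.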
After the decoder determines the target offset value, the following steps can be executed. Optionally, S730 may also be executed before S720, or together with S720.
S730: Determine the search starting point according to the reference MV.
S740: At the search starting point, search for the reference block of the current block using the target offset value as the search step.
If the search step is a non-integer pixel length such as 1/4 or 1/2, the decoder must interpolate the reference frame and search for the reference block in the interpolated reference frame; if the search step is an integer pixel length such as 1 or 2, the decoder does not need to interpolate the reference frame and can search for the reference block directly in the reference frame. The interpolation method and the method of searching for the reference block were described in detail above and are not repeated here.
Because a search step of 64 or 128 is too large, the optimal MV obtained with either of these search steps may only be a locally optimal MV, which may have a negative impact on decoding performance; in some examples, the decoding method provided by this application does not use 64 or 128 as the target offset value, which improves coding performance.
The video encoding method and video decoding method provided by this application have been described in detail above. Below, the video encoding apparatus and video decoding apparatus provided by this application are described clearly and completely with reference to the drawings.
FIG. 8 is a schematic diagram of a video encoder provided by this application. The video encoder 100 is configured to output video to a post-processing entity 41. The post-processing entity 41 represents an example of a video entity that can process the encoded video data from the video encoder 100, for example, a media-aware network element (MANE) or a splicing/editing device. The post-processing entity 41 and the video encoder 100 may be mutually independent devices, or the post-processing entity 41 may be integrated into the device that includes the video encoder 100. The video encoder 100 can perform inter-frame prediction of image blocks according to the method proposed in this application.
In the example of FIG. 8, the video encoder 100 includes a prediction processing unit 108, a filter 106, a coded picture buffer (CPB) 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103. The prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109. For image block reconstruction, the video encoder 100 also includes an inverse quantizer 104, an inverse transformer 105, and a summer 111. The filter 106 represents one or more loop filters, such as a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter. The filter 106 in FIG. 8 may be an in-loop filter or a post-loop filter. In one example, the video encoder 100 may further include a video data memory (not shown in the figure).
The video data memory can store video data to be encoded by the video encoder 100. It can also serve as a reference picture memory, storing the reference video data used when the video encoder 100 encodes video data in intra coding mode and/or inter coding mode. The video data memory and the CPB 107 can be formed by any of a variety of memory devices, such as synchronous dynamic random access memory (SDRAM), dynamic random access memory (DRAM), magnetoresistive random access memory (MRAM), resistive random access memory (RRAM), or other types of memory. The video data memory and the CPB 107 may be provided by the same memory device or by separate memory devices. The video data memory may also be integrated on-chip with other components of the video encoder 100, or arranged separately from other components.
As shown in FIG. 8, the video encoder 100 receives video data and stores it in the video data memory. A partitioning unit partitions the video data (frames) into several image blocks, which may be further partitioned into smaller blocks, for example, based on a quadtree structure or a binary tree structure. The result of the above partitioning may be slices, tiles, or other larger units; a slice may be divided into multiple image blocks or sets of image blocks called tiles. The prediction processing unit 108 (e.g., the inter prediction unit 110) can determine the motion information candidate list of the current image block, determine target motion information from that candidate list according to a filtering rule, and then perform inter-frame prediction on the current image block according to the target motion information. The prediction processing unit 108 can provide the intra-coded and/or inter-coded current image block to the summer 112 to generate a residual block, and can also provide it to the summer 111 to reconstruct the coded block used as a reference image block. In addition, the prediction processing unit 108 (e.g., the inter prediction unit 110) can send the index information of the target motion information to the entropy encoder 103 so that the entropy encoder 103 can encode the index information of the target motion information into the bitstream.
The intra predictor 109 within the prediction processing unit 108 can perform intra-predictive coding of the current image block to remove spatial redundancy. The inter predictor 110 within the prediction processing unit 108 can perform inter-predictive coding of the current image block to remove temporal redundancy.
The inter predictor 110 is configured to determine the target motion information for inter-frame prediction, predict the motion information of one or more basic motion compensation units of the current image block according to the target motion information, and use the motion information of the one or more basic motion compensation units of the current image block to obtain or generate the prediction block of the current image block.
For example, the inter predictor 110 can compute the RDO values of the various motion information in the motion information candidate list and select the motion information with the best RDO characteristics. RDO characteristics are usually used to measure the degree of distortion (or error) between an encoded image block and the unencoded image block. For example, the inter predictor 110 can determine the motion information in the candidate list whose RDO cost of encoding the current image block is smallest to be the target motion information for inter-frame prediction of the current image block.
The inter predictor 110 can also generate syntax elements associated with image blocks and slices for use by the video decoder 200 when decoding the image blocks of a slice.
After selecting the target motion information for the current image block, the inter predictor 110 can send the index of the target motion information of the current image block to the entropy encoder 103 so that the entropy encoder 103 can encode the index.
The intra predictor 109 is configured to determine the target motion information for intra-frame prediction and perform intra-frame prediction on the current image block based on it. For example, the intra predictor 109 can compute the RDO values of the candidate motion information, select the motion information with the smallest RDO cost as the target motion information for intra-frame prediction of the current image block, and select the intra prediction mode with the best RDO characteristics based on the RDO values. After selecting the target motion information, the intra predictor 109 can send the index of the target motion information of the current image block to the entropy encoder 103 so that the entropy encoder 103 can encode the index.
After the prediction processing unit 108 generates the prediction block of the current image block through inter-frame prediction and/or intra-frame prediction, the video encoder 100 forms a residual image block (residual block) by subtracting the prediction block from the current image block to be encoded. The summer 112 represents the one or more components that perform this subtraction. The residual video data in the residual block may be contained in one or more transform units (TUs) and applied to the transformer 101. The transformer 101 transforms the residual video data into residual transform coefficients using, for example, a DCT. The transformer 101 can convert the residual video data from the pixel-value domain to a transform domain, such as the frequency domain.
The transformer 101 can send the resulting transform coefficients to the quantizer 102. The quantizer 102 quantizes the transform coefficients to further reduce the bit rate. In some examples, the quantizer 102 can then perform a scan of the matrix containing the quantized transform coefficients. Alternatively, the entropy encoder 103 can perform the scan.
After quantization, the entropy encoder 103 entropy-encodes the quantized transform coefficients. For example, the entropy encoder 103 can perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or other entropy coding methods. After entropy encoding, the entropy encoder 103 can transmit the bitstream to the video decoder 200. The entropy encoder 103 can also entropy-encode the syntax elements of the current image block to be encoded, for example, encoding the target motion information into the bitstream.
The inverse quantizer 104 and the inverse transformer 105 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain (for example, for an image block used as a reference image). The summer 111 adds the reconstructed residual block to the prediction block produced by the inter predictor 110 or the intra predictor 109 to produce a reconstructed image block. The filter 106 can be used to process the reconstructed image block to reduce distortion, such as block artifacts. The reconstructed image block can be stored in the picture buffer 107 and used by the inter predictor 110 as a reference block for inter-frame prediction of blocks in subsequent video frames or images.
It should be understood that the above processing flow of the video encoder 100 is only an example, and the video encoder 100 can also perform video encoding based on other processing flows. For example, for some image blocks or image frames, the video encoder 100 can quantize the residual signal directly without processing by the transformer 101, and correspondingly without processing by the inverse transformer 105; or, for some image blocks or image frames, the video encoder 100 produces no residual data and correspondingly needs no processing by the transformer 101, the quantizer 102, the inverse quantizer 104, or the inverse transformer 105; or, the video encoder 100 can store the reconstructed image block directly as a reference block without processing by the filter 106; or, the quantizer 102 and the inverse quantizer 104 in the video encoder 100 can be merged together.
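The RDO-based candidate selection attributed to the inter predictor above can be sketched with the conventional cost model J = D + λ·R (distortion plus rate weighted by a Lagrange multiplier). The cost model and all names here are conventional assumptions for illustration, not taken verbatim from the patent.

```python
# Hedged sketch of RDO-based selection: each candidate's cost is modelled as
# J = D + lam * R, and the candidate with the smallest cost becomes the target
# motion information. The cost model and names are conventional assumptions.

def rdo_select(candidates, distortion, rate, lam):
    """Return (index, cost) of the candidate minimizing J = D + lam * R."""
    costs = [distortion(c) + lam * rate(i) for i, c in enumerate(candidates)]
    best = min(range(len(candidates)), key=lambda i: costs[i])
    return best, costs[best]
```

Here `distortion` stands in for the prediction error of a candidate and `rate` for the bits needed to signal its index; a real encoder derives both from the actual coding process.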
FIG. 9 is a schematic diagram of a video decoder provided by this application. In the example of FIG. 9, the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter 206, and a decoded picture buffer 207. The prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209. In some examples, the video decoder 200 can perform a decoding process that is generally the inverse of the encoding process described for the video encoder 100.
During decoding, the video decoder 200 receives from the video encoder 100 a bitstream comprising image blocks and associated syntax elements. The video decoder 200 can receive video data from a network entity 42 and, optionally, can also store the video data in a video data memory (not shown in the figure). The video data memory can serve as the decoded picture buffer (DPB) storing the bitstream. Therefore, although the video data memory is not illustrated in FIG. 9, the video data memory and the DPB 207 may be the same memory, or may be separately arranged memories. The video data memory and the DPB 207 can be formed by any of a variety of memory devices, for example, SDRAM, DRAM, MRAM, RRAM, or other types of memory. In various examples, the video data memory can be integrated on-chip with the other components of the video decoder 200, or arranged separately from the other components.
The network entity 42 may be, for example, a server, a MANE, or a video editor/splicer. The network entity 42 may or may not include a video encoder, such as the video encoder 100. The network entity 42 and the video decoder 200 may be mutually independent devices; optionally, the network entity 42 may also be integrated with the video decoder 200 in one device.
The entropy decoder 203 of the video decoder 200 entropy-decodes the bitstream to produce quantized coefficients and syntax elements. The entropy decoder 203 forwards the syntax elements to the prediction processing unit 208. The video decoder 200 can receive syntax elements at the video slice level and/or the image block level. In this application, in one example, the syntax elements here may include target motion information related to the current image block.
When the video slice is decoded as an intra-decoded slice (I slice), the intra predictor 209 of the prediction processing unit 208 can generate the prediction block of the image block of the current video slice based on the intra prediction mode signaled in the bitstream and decoded image blocks from the current frame. When the video slice is decoded as an inter-decoded slice (B slice or P slice), the inter predictor 210 of the prediction processing unit 208 can, based on the syntax elements received from the entropy decoder 203, determine the target motion information for decoding the current image block of the current video slice, and decode the current image block based on that target motion information (for example, perform inter-frame prediction).
The inter predictor 210 can determine whether to use a new inter prediction method to predict the current image block of the current video slice, for example, whether to use the method of this application to determine the target offset value. If the syntax elements indicate that a new inter-frame prediction method is used to predict the current image block, the motion information of the current image block is predicted based on the new inter-frame prediction method (for example, using the method of this application to determine the target offset value), and the prediction block of the current image block is generated through the motion compensation process using the predicted motion information of the current image block. The motion information here may include reference image information and motion vectors, where the reference image information may include, but is not limited to, unidirectional/bidirectional prediction information, a reference image list number, and a reference image index corresponding to the reference image list.
The video decoder 200 can construct the reference image list based on the reference images stored in the DPB 207. The inter prediction process of predicting the motion information of the current image block using the method 700 has been described in detail in the method embodiments above.
The inverse quantizer 204 inverse-quantizes (i.e., dequantizes) the quantized transform coefficients decoded by the entropy decoder 203. The inverse quantization process may include: using the quantization parameter computed by the video encoder 100 for each image block of the video slice to determine the degree of quantization that was applied, and determining accordingly the degree of inverse quantization to apply. The inverse transformer 205 applies an inverse transform to the transform coefficients, for example, an inverse DCT, an inverse integer transform, or another inverse transform process, to produce residual blocks in the pixel domain.
After the inter predictor 210 generates the prediction block of the current image block or of a sub-block of the current image block, the video decoder 200 sums the residual block from the inverse transformer 205 and the prediction block from the inter predictor 210 to obtain the reconstructed block. The summer 211 represents the component that performs this summation. When needed, the filter 206 (within the decoding loop or after it) can also be used to smooth pixel transitions or otherwise improve video quality. The filter 206 may be one or more loop filters, such as a deblocking filter, an ALF, and an SAO filter. In one example, the filter 206 is applied to the reconstructed block to reduce block distortion, and the result is output as the decoded video stream. Moreover, the decoded image blocks of a given frame or image can also be stored in the DPB 207 as reference images for subsequent motion compensation. The DPB 207 can also store decoded video for later presentation on a display device.
It should be understood that the above processing flow of the video decoder 200 is only an example, and the video decoder 200 can also perform video decoding based on other processing flows. For example, the video decoder 200 can output the video stream without processing by the filter 206; or, for some image blocks or image frames, the entropy decoder 203 of the video decoder 200 does not decode quantized coefficients, and those image blocks or image frames accordingly need no processing by the inverse quantizer 204 and the inverse transformer 205.
FIG. 10 is a schematic block diagram of an inter-frame prediction apparatus 1000 in an embodiment of this application. It should be noted that the inter-frame prediction apparatus 1000 is suitable both for inter-frame prediction when decoding video images and for inter-frame prediction when encoding video images. It should be understood that the inter-frame prediction apparatus 1000 here may correspond to the inter predictor 110 in FIG. 8, or may correspond to the inter predictor 210 in FIG. 9.
When the apparatus 1000 is used to encode video images, the inter-frame prediction apparatus 1000 may include:
an inter-frame prediction processing unit 1001, configured to determine the reference MV of the current block according to the motion information candidate list of the current block; and
an offset value selection unit 1002, configured to determine the target offset value from the corresponding offset value candidate set according to the type of the frame to which the current block belongs, where the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type;
or configured to determine the target offset value from the same offset value candidate list according to the type of the frame to which the current block belongs.
The inter-frame prediction processing unit 1001 is further configured to: determine the search starting point according to the reference MV; and, at the search starting point, search for the reference block of the current block using the target offset value as the search step.
It can thus be seen that the inter-frame prediction apparatus 1000 uses different offset value sets for inter-frame prediction of different types of frames. For frames with relatively complex motion, the inter-frame prediction apparatus 1000 can use a set containing more offset values, so that a smaller search step can be selected to search accurately for the optimal MV; for frames with relatively simple motion, the inter-frame prediction apparatus 1000 can use a set containing fewer offset values (i.e., a subset of the set containing more offset values), so that the optimal MV can be found quickly. The two sets can be located in one list and stored in one buffer, reducing the storage space consumed by video encoding.
When the apparatus 1000 is used to decode video images, the apparatus 1000 may include:
an inter-frame prediction processing unit 1001, configured to determine the reference MV of the current block according to the motion information candidate list of the current block; and
an offset value selection unit 1002, configured to determine the target offset value,
where the target offset value is one offset value in the offset value candidate set corresponding to the type of the frame to which the current block belongs, the types include a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type; or, image blocks in frames of different types use the same offset value candidate list to determine the target offset value.
The inter-frame prediction processing unit 1001 is further configured to: determine the search starting point according to the reference MV; and, at the search starting point, search for the reference block of the current block using the target offset value as the search step.
Because the offset value sets corresponding to frames of different types are all preset sets, the apparatus 1000 can determine the target offset value based on the index number in the bitstream without determining, according to the frame type, whether to apply shift processing to the offset values; therefore, the apparatus 1000 reduces the complexity of decoding.
It should be noted that the modules of the inter-frame prediction apparatus in the embodiments of this application are the functional bodies that implement the various steps in the method embodiments of this application; for details, refer to the introduction of the inter-frame prediction method in the method embodiments herein, which is not repeated here.
图11为本申请提供的编码设备或解码设备(简称为译码设备1100)的一种实现方式的示意性框图。其中,译码设备1100可以包括处理器1110、存储器1130和总线***1150。其中,处理器和存储器通过总线***相连,该存储器用于存储指令,该处理器用于执行该存储器存储的指令。编码设备的存储器存储程序代码,且处理器可以调用存储器中存储的程序代码执行本申请描述的各种视频编码或解码方法,尤其是本申请描述的帧间预测方法。 为避免重复,这里不再详细描述。
在本申请实施例中,该处理器1110可以是CPU,该处理器1110还可以是其他通用处理器、DSP、ASIC、FPGA或者其它可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
该存储器1130可以包括ROM或者RAM。任何其他适宜类型的存储设备也可以用作存储器1130。存储器1130可以包括由处理器1110使用总线1150访问的代码和数据1131。存储器1130可以进一步包括操作***1133和应用程序1135,该应用程序1135包括允许处理器1110执行本申请描述的视频编码或解码方法(尤其是本申请描述的帧间预测方法)的至少一个程序。例如,应用程序1135可以包括应用1至N,其进一步包括执行在本申请描述的视频编码或解码方法的视频编码或解码应用(简称视频译码应用)。
该总线***1150除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都标为总线***1150。
可选的,译码设备1100还可以包括一个或多个输出设备,诸如显示器1170。在一个示例中,显示器1170可以是触感显示器,其将显示器与可操作地感测触摸输入的触感单元合并。显示器1170可以经由总线1150连接到处理器1110。
本领域技术人员能够理解,结合本文公开描述的各种说明性逻辑框、模块和算法步骤所描述的功能可以硬件、软件、固件或其任何组合来实施。如果以软件来实施,那么各种说明性逻辑框、模块和步骤描述的功能可作为一或多个指令或代码在计算机可读媒体上存储或传输,且由基于硬件的处理单元执行。
本申请还提供了一种包含上述编码方法和解码方法的计算机可读存储介质,计算机可读存储介质包含计算机程序产品,该计算机程序产品可由一个或多个处理器读取以检索用于实施本申请中描述的技术的指令、代码和/或数据结构。
计算机可读存储介质可包含无形介质和有形介质。其中,无形介质例如是信号或者载波。作为示例而非限制,有形介质可包括软盘、硬盘、磁带等磁性介质,以及DVD等光介质,以及固态硬盘(solid state disk,SSD)等 半导体介质。此外,连接也可称作计算机可读介质。举例来说,如果使用同轴缆线、光纤、双绞线、数字订户线(digital subscriber line,DSL)、红外线、无线电和微波从网站、服务器或其它远程源传输指令,那么同轴缆线、光纤、双绞线、DSL、红外线、无线电和微波包含在介质的定义中。
本申请的技术可在各种各样的装置或设备中实施,包含无线手持机、集成电路(integrated circuit,IC)或一组IC(例如,芯片组)。本申请中描述各种组件、模块或单元是为了强调装置能够实现所揭示的技术的功能,但未必需要由不同硬件单元实现。实际上,如上文所描述,各种单元可结合合适的软件和/或固件集成在编码器或解码器的硬件单元中。
Those of ordinary skill in the art will recognize that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the particular application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.
The above are merely exemplary specific embodiments of the present application, but the protection scope of the present application is not limited thereto; any variation or replacement readily conceivable by any person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (56)

  1. A video encoding method, comprising:
    determining base motion information of a current block according to a motion information candidate list of the current block;
    determining a target offset value from a corresponding offset value candidate set according to a type of a frame to which the current block belongs, wherein the type includes a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type;
    determining a search start point according to the base motion information; and
    searching for a reference block of the current block from the search start point with the target offset value as a search step.
  2. The method according to claim 1, wherein
    the degree of content variation of frames of the first type is simpler than the degree of content variation of frames of the second type.
  3. The method according to claim 1 or 2, wherein
    the first type is screen content; and/or,
    the second type is non-screen content.
  4. The method according to any one of claims 1 to 3, wherein
    the offset value candidate set corresponding to the first type is {1, 2, 4, 8, 16, 32}; and,
    the offset value candidate set corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  5. The method according to any one of claims 1 to 4, wherein the offset value candidate set corresponding to the first type and the offset value candidate set corresponding to the second type are located in the same offset value candidate list.
  6. The method according to any one of claims 1 to 5, further comprising:
    encoding the current block to obtain a bitstream, wherein the bitstream contains an index number of the target offset value in the offset value candidate list.
  7. The method according to claim 6, wherein the bitstream does not contain an identifier indicating that the frame to which the current block belongs is of the first type or the second type.
  8. A video encoding method, comprising:
    determining base motion information of a current block according to a motion information candidate list of the current block;
    determining a target offset value from the same offset value candidate list according to a type of a frame to which the current block belongs;
    determining a search start point according to the base motion information; and
    searching for a reference block of the current block from the search start point with the target offset value as a search step.
  9. The method according to claim 8, wherein the determining a target offset value from the same offset value candidate list according to the type of the frame to which the current block belongs comprises:
    determining the target offset value from a corresponding offset value set in the same offset value candidate list according to the type of the frame to which the current block belongs, wherein the type includes a first type and a second type, and the offset value set in the offset value candidate list corresponding to the first type is a subset of the offset value set in the offset value candidate list corresponding to the second type.
  10. The method according to claim 9, wherein
    the degree of content variation of frames of the first type is simpler than the degree of content variation of frames of the second type.
  11. The method according to claim 9 or 10, wherein
    the first type is screen content; and/or,
    the second type is non-screen content.
  12. The method according to any one of claims 9 to 11, wherein
    the set of offset values in the offset value candidate list corresponding to the first type is {1, 2, 4, 8, 16, 32}; and,
    the set of offset values in the offset value candidate list corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  13. The method according to any one of claims 8 to 12, further comprising:
    encoding the current block to obtain a bitstream, wherein the bitstream contains an index number of the target offset value in the offset value candidate list.
  14. The method according to claim 13, wherein the bitstream does not contain an identifier indicating that the frame to which the current block belongs is of a first type or a second type.
  15. A video decoding method, comprising:
    determining base motion information of a current block according to a motion information candidate list of the current block;
    determining a target offset value, wherein the target offset value is one offset value in an offset value candidate set corresponding to a type of a frame to which the current block belongs, the type includes a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type;
    determining a search start point according to the base motion information; and
    searching for a reference block of the current block from the search start point with the target offset value as a search step.
  16. The method according to claim 15, wherein
    the degree of content variation of frames of the first type is simpler than the degree of content variation of frames of the second type.
  17. The method according to claim 15 or 16, wherein
    the first type is screen content; and/or,
    the second type is non-screen content.
  18. The method according to any one of claims 15 to 17, wherein
    the offset value candidate set corresponding to the first type is {1, 2, 4, 8, 16, 32}; and,
    the offset value candidate set corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  19. The method according to any one of claims 15 to 18, wherein the offset value candidate set corresponding to the first type and the offset value candidate set corresponding to the second type are located in the same offset value candidate list.
  20. The method according to any one of claims 15 to 19, wherein the determining a target offset value comprises:
    receiving a bitstream corresponding to the current block, wherein the bitstream contains an index number; and
    determining the target offset value in a preset offset value candidate list according to the index number.
  21. The method according to claim 20, wherein the bitstream does not contain an identifier indicating that the frame to which the current block belongs is of the first type or the second type.
  22. A video decoding method, comprising:
    determining base motion information of a current block according to a motion information candidate list of the current block;
    determining a target offset value from one offset value candidate list, wherein image blocks in frames of different types use the same offset value candidate list to determine the target offset value;
    determining a search start point according to the base motion information; and
    searching for a reference block of the current block from the search start point with the target offset value as a search step.
  23. The method according to claim 22, wherein
    the type of the frame to which the current block belongs includes a first type and a second type, and the set of offset values in the offset value candidate list corresponding to the first type is a subset of the set of offset values in the offset value candidate list corresponding to the second type.
  24. The method according to claim 23, wherein
    the degree of content variation of frames of the first type is simpler than the degree of content variation of frames of the second type.
  25. The method according to claim 23 or 24, wherein
    the first type is screen content; and/or,
    the second type is non-screen content.
  26. The method according to any one of claims 23 to 25, wherein
    the set of offset values in the offset value candidate list corresponding to the first type is {1, 2, 4, 8, 16, 32}; and,
    the set of offset values in the offset value candidate list corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  27. The method according to any one of claims 22 to 26, wherein the determining a target offset value from one offset value candidate list comprises:
    determining an index number from a bitstream corresponding to the current block; and
    determining the target offset value from the offset value candidate list according to the index number.
  28. The method according to claim 27, wherein the bitstream does not contain an identifier indicating that the frame to which the current block belongs is of a first type or a second type.
  29. A video encoding apparatus, comprising:
    a memory configured to store code; and
    a processor configured to read the code in the memory to perform the following operations:
    determining base motion information of a current block according to a motion information candidate list of the current block;
    determining a target offset value from a corresponding offset value candidate set according to a type of a frame to which the current block belongs, wherein the type includes a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type;
    determining a search start point according to the base motion information; and
    searching for a reference block of the current block from the search start point with the target offset value as a search step.
  30. The apparatus according to claim 29, wherein
    the degree of content variation of frames of the first type is simpler than the degree of content variation of frames of the second type.
  31. The apparatus according to claim 29 or 30, wherein
    the first type is screen content; and/or,
    the second type is non-screen content.
  32. The apparatus according to any one of claims 29 to 31, wherein
    the offset value candidate set corresponding to the first type is {1, 2, 4, 8, 16, 32}; and,
    the offset value candidate set corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  33. The apparatus according to any one of claims 29 to 32, wherein the offset value candidate set corresponding to the first type and the offset value candidate set corresponding to the second type are located in the same offset value candidate list.
  34. The apparatus according to any one of claims 29 to 33, wherein the processor is further configured to:
    encode the current block to obtain a bitstream, wherein the bitstream contains an index number of the target offset value in the offset value candidate list.
  35. The apparatus according to claim 34, wherein the bitstream does not contain an identifier indicating that the frame to which the current block belongs is of the first type or the second type.
  36. A video encoding apparatus, comprising:
    a memory configured to store code; and
    a processor configured to read the code in the memory to perform the following operations:
    determining base motion information of a current block according to a motion information candidate list of the current block;
    determining a target offset value from the same offset value candidate list according to a type of a frame to which the current block belongs;
    determining a search start point according to the base motion information; and
    searching for a reference block of the current block from the search start point with the target offset value as a search step.
  37. The apparatus according to claim 36, wherein the determining a target offset value from the same offset value candidate list according to the type of the frame to which the current block belongs comprises:
    determining the target offset value from a corresponding offset value set in the same offset value candidate list according to the type of the frame to which the current block belongs, wherein the type includes a first type and a second type, and the offset value set in the offset value candidate list corresponding to the first type is a subset of the offset value set in the offset value candidate list corresponding to the second type.
  38. The apparatus according to claim 37, wherein
    the degree of content variation of frames of the first type is simpler than the degree of content variation of frames of the second type.
  39. The apparatus according to claim 37 or 38, wherein
    the first type is screen content; and/or,
    the second type is non-screen content.
  40. The apparatus according to any one of claims 37 to 39, wherein
    the set of offset values in the offset value candidate list corresponding to the first type is {1, 2, 4, 8, 16, 32}; and,
    the set of offset values in the offset value candidate list corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  41. The apparatus according to any one of claims 36 to 40, wherein the processor is further configured to perform the following operation:
    encoding the current block to obtain a bitstream, wherein the bitstream contains an index number of the target offset value in the offset value candidate list.
  42. The apparatus according to claim 41, wherein the bitstream does not contain an identifier indicating that the frame to which the current block belongs is of a first type or a second type.
  43. A video decoding apparatus, comprising:
    a memory configured to store code; and
    a processor configured to read the code in the memory to perform the following operations:
    determining base motion information of a current block according to a motion information candidate list of the current block;
    determining a target offset value, wherein the target offset value is one offset value in an offset value candidate set corresponding to a type of a frame to which the current block belongs, the type includes a first type and a second type, and the offset value candidate set corresponding to the first type is a subset of the offset value candidate set corresponding to the second type;
    determining a search start point according to the base motion information; and
    searching for a reference block of the current block from the search start point with the target offset value as a search step.
  44. The apparatus according to claim 43, wherein
    the degree of content variation of frames of the first type is simpler than the degree of content variation of frames of the second type.
  45. The apparatus according to claim 43 or 44, wherein
    the first type is screen content; and/or,
    the second type is non-screen content.
  46. The apparatus according to any one of claims 43 to 45, wherein
    the offset value candidate set corresponding to the first type is {1, 2, 4, 8, 16, 32}; and,
    the offset value candidate set corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  47. The apparatus according to any one of claims 43 to 46, wherein the offset value candidate set corresponding to the first type and the offset value candidate set corresponding to the second type are located in the same offset value candidate list.
  48. The apparatus according to any one of claims 43 to 47, wherein the determining a target offset value comprises:
    receiving a bitstream corresponding to the current block, wherein the bitstream contains an index number; and
    determining the target offset value in a preset offset value candidate list according to the index number.
  49. The apparatus according to claim 48, wherein the bitstream does not contain an identifier indicating that the frame to which the current block belongs is of the first type or the second type.
  50. A video decoding apparatus, comprising:
    a memory configured to store code; and
    a processor configured to read the code in the memory to perform the following operations:
    determining base motion information of a current block according to a motion information candidate list of the current block;
    determining a target offset value from one offset value candidate list, wherein image blocks in frames of different types use the same offset value candidate list to determine the target offset value;
    determining a search start point according to the base motion information; and
    searching for a reference block of the current block from the search start point with the target offset value as a search step.
  51. The apparatus according to claim 50, wherein
    the type of the frame to which the current block belongs includes a first type and a second type, and the set of offset values in the offset value candidate list corresponding to the first type is a subset of the set of offset values in the offset value candidate list corresponding to the second type.
  52. The apparatus according to claim 51, wherein
    the degree of content variation of frames of the first type is simpler than the degree of content variation of frames of the second type.
  53. The apparatus according to claim 51 or 52, wherein
    the first type is screen content; and/or,
    the second type is non-screen content.
  54. The apparatus according to any one of claims 51 to 53, wherein
    the set of offset values in the offset value candidate list corresponding to the first type is {1, 2, 4, 8, 16, 32}; and,
    the set of offset values in the offset value candidate list corresponding to the second type is {1/4, 1/2, 1, 2, 4, 8, 16, 32}.
  55. The apparatus according to any one of claims 50 to 54, wherein the determining a target offset value from one offset value candidate list comprises:
    determining an index number from a bitstream corresponding to the current block; and
    determining the target offset value from the offset value candidate list according to the index number.
  56. The apparatus according to claim 55, wherein the bitstream does not contain an identifier indicating that the frame to which the current block belongs is of a first type or a second type.