WO2019223480A1 - Video data decoding method and apparatus - Google Patents

Video data decoding method and apparatus (视频数据解码方法及装置)

Info

Publication number
WO2019223480A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
residual
processed
adjustment factor
pixels
Application number
PCT/CN2019/083848
Other languages
English (en)
French (fr)
Inventor
赵寅 (Zhao Yin)
杨海涛 (Yang Haitao)
陈建乐 (Chen Jianle)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2019223480A1

Classifications

    • H: ELECTRICITY
      • H04: ELECTRIC COMMUNICATION TECHNIQUE
        • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
            • H04N 19/10: using adaptive coding
              • H04N 19/102: characterised by the element, parameter or selection affected or controlled by the adaptive coding
                • H04N 19/124: Quantisation
              • H04N 19/169: characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
                • H04N 19/17: the unit being an image region, e.g. an object
                  • H04N 19/176: the region being a block, e.g. a macroblock
            • H04N 19/44: Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
            • H04N 19/50: using predictive coding
            • H04N 19/60: using transform coding
              • H04N 19/61: transform coding in combination with predictive coding

Definitions

  • the present application relates to the technical field of video coding and decoding, and in particular, to a method and a device for acquiring residuals.
  • current video coding technology encompasses a variety of video coding standards, such as H.264/AVC, H.265/HEVC, and the Audio Video Coding Standard (AVS).
  • the above video coding standards usually use a hybrid coding framework.
  • the hybrid coding framework may include prediction, transformation, quantization, entropy coding and other links.
  • the prediction link uses reconstructed pixels of already-encoded regions to generate predicted pixels for the original pixels of the image block currently being encoded.
  • the difference in pixel values between the original pixels and the predicted pixels is called the residual.
  • the residuals are usually transformed first, transformed into transform coefficients, and then the transform coefficients are quantized.
  • the quantized transform coefficients and the syntax elements (for example, indication information such as the coded image block size, prediction mode, and motion vector) are then converted into a code stream by entropy coding.
  • Video decoding is a process of converting a code stream into a video image, and may include links such as entropy decoding, prediction, dequantization, inverse transform, and the like.
  • the code stream is parsed through an entropy decoding process to obtain syntax elements and quantized transform coefficients.
  • the predicted pixels are obtained based on the syntax elements and the previously decoded reconstructed pixels;
  • the quantized transform coefficients are inverse quantized to obtain dequantized transform coefficients, and the dequantized transform coefficients are then inverse transformed to obtain the reconstructed residual.
  • the reconstructed residual and the predicted pixels are added to obtain reconstructed pixels, thereby recovering the video image.
  • the reconstructed pixel may be different from the original pixel, and the difference in value between the two is called distortion. Due to the existence of various visual masking effects, such as the brightness masking effect and the contrast masking effect, the intensity of distortion observed by the human eye is closely related to the characteristics of the background in which the distortion is located.
  • the embodiments of the present application use the spatial neighborhood pixel information of the current block to be processed (that is, the block to be decoded, e.g., a transform block) as a proxy for the original pixel information corresponding to the current block to be processed.
  • based on this spatial neighborhood pixel information, an adjustment factor for the current block to be processed (that is, the transform block) is adaptively derived, and the residual block corresponding to the current block to be processed is adjusted based on the adaptively derived adjustment factor.
  • in this way, the residual bits of blocks with a strong visual masking effect are reduced, and the residual bits of blocks with a weak visual masking effect are increased, making the encoding of the actual residual more consistent with human visual perception and thereby improving codec performance.
  • a first aspect of the embodiments of the present application provides a method for acquiring residuals in video decoding, including: parsing a bitstream to obtain transform coefficients of a block to be processed; converting the transform coefficients into a first residual of the block to be processed; determining an adjustment factor of the block to be processed according to pixel information in a preset spatial neighborhood of the block to be processed; and adjusting the first residual based on the adjustment factor to obtain a second residual of the block to be processed. A concrete sketch of this flow follows.
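  • The following minimal C++ sketch illustrates this four-step flow. The function names, the fixed-point constants, and the neutral stub for the adjustment-factor derivation are assumptions for illustration, not values taken from the patent.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-point parameters for the residual scaling; the patent only says these
// are preset constants, so the values here are illustrative placeholders.
constexpr int32_t kOffset = 1 << 7;
constexpr int kShift = 8;

// Step 3: derive the adjustment factor QC from pixel information (e.g., mean
// and dispersion) in the preset spatial neighborhood. A real implementation
// would apply the mapping relationships described below; this stub returns a
// neutral factor so that the second residual equals the first.
int32_t deriveAdjustmentFactor(const std::vector<int32_t>& neighborhoodPixels) {
    (void)neighborhoodPixels;
    return 1 << kShift;
}

// Steps 2 and 4: given the first residual (obtained by inverse quantization
// and inverse transform of the parsed coefficients), scale it by QC to obtain
// the second residual of the block to be processed.
std::vector<int32_t> acquireSecondResidual(const std::vector<int32_t>& firstResidual,
                                           int32_t qc) {
    std::vector<int32_t> second(firstResidual.size());
    for (std::size_t i = 0; i < firstResidual.size(); ++i) {
        second[i] = (firstResidual[i] * qc + kOffset) >> kShift;
    }
    return second;
}
```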
  • in a feasible implementation, before determining the adjustment factor of the block to be processed according to the pixel information in the preset spatial neighborhood of the block to be processed, the method further includes: calculating the pixel information in the preset spatial neighborhood of the block to be processed from the pixel values in that neighborhood.
  • calculating the pixel information in the preset spatial neighborhood of the block to be processed includes: obtaining one or more pixel sets in the preset spatial neighborhood; and calculating the mean and/or dispersion of the pixels in the one or more pixel sets to obtain the pixel information in the preset spatial neighborhood.
  • the pixel information of the pixels around the block to be processed is used instead of the pixel information of the block itself, so that the decoding end can derive the pixel information adaptively, which saves the bits otherwise needed to transmit pixel information and improves coding efficiency.
  • the dispersion includes: a mean square error sum, an average absolute error sum, a variance or a standard deviation.
  • in a feasible implementation, before the acquiring of the one or more pixel sets in the preset spatial neighborhood, the method further includes: determining that all pixels in each pixel set of the one or more pixel sets have been reconstructed. A sketch of the computation follows.
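  • A sketch of this computation, assuming the neighborhood arrives as one or more sets of already-reconstructed sample values; mean absolute deviation stands in here for the dispersion, which per the text could equally be a mean square error sum, a variance, or a standard deviation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct PixelInfo {
    double mean = 0.0;
    double dispersion = 0.0;  // here: mean absolute deviation from the mean
};

// Compute pixel information over one or more pixel sets from the preset
// spatial neighborhood. Only sets whose pixels have been reconstructed
// should be passed in, per the text above.
PixelInfo computePixelInfo(const std::vector<std::vector<int>>& pixelSets) {
    PixelInfo info;
    std::size_t count = 0;
    for (const auto& set : pixelSets)
        for (int p : set) { info.mean += p; ++count; }
    if (count == 0) return info;
    info.mean /= static_cast<double>(count);
    for (const auto& set : pixelSets)
        for (int p : set) info.dispersion += std::abs(p - info.mean);
    info.dispersion /= static_cast<double>(count);
    return info;
}
```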
  • when the pixel information is the mean, determining the adjustment factor of the block to be processed according to the pixel information in the preset spatial neighborhood of the block to be processed includes: determining the adjustment factor according to the mean and a first mapping relationship between the mean and the adjustment factor, where the first mapping relationship satisfies one or more of the following conditions: when the mean is less than a first threshold, the adjustment factor decreases as the mean increases; when the mean is greater than a second threshold, the adjustment factor increases as the mean increases, where the first threshold is less than or equal to the second threshold; and when the mean is greater than or equal to the first threshold and less than or equal to the second threshold, the adjustment factor is a first preset constant.
  • when the pixel information is the dispersion, determining the adjustment factor of the block to be processed according to the pixel information in the preset spatial neighborhood includes: determining the adjustment factor according to the dispersion and a second mapping relationship between the dispersion and the adjustment factor, where the second mapping relationship satisfies one or more of the following conditions: when the dispersion is greater than a third threshold, the adjustment factor increases as the dispersion increases; and when the dispersion is less than or equal to the third threshold, the adjustment factor is a second preset constant.
  • when the pixel information is the mean and the dispersion, determining the adjustment factor of the block to be processed according to the pixel information in the preset spatial neighborhood includes: determining a first parameter according to the mean and the first mapping relationship; determining a second parameter according to the dispersion and the second mapping relationship; and using the product or a weighted sum of the first parameter and the second parameter as the adjustment factor. A sketch of both mappings follows.
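  • The two mappings can be pictured as a U-shaped curve over the mean (stronger luminance masking against very dark and very bright backgrounds) and a non-decreasing curve over the dispersion (stronger contrast masking against busier backgrounds). In the sketch below, all thresholds, slopes, and constants are chosen purely for illustration; in an integer implementation this factor would be scaled (e.g., by 256) to the fixed-point QC used in the formulas further below.

```cpp
// First mapping: adjustment parameter as a function of the neighborhood mean.
// Thresholds T1 <= T2 and the slopes are illustrative, not from the patent.
double firstMapping(double mean) {
    const double T1 = 64.0, T2 = 192.0, c1 = 1.0;
    if (mean < T1) return c1 + (T1 - mean) * 0.004;  // decreases as the mean rises
    if (mean > T2) return c1 + (mean - T2) * 0.004;  // increases as the mean rises
    return c1;                                       // first preset constant
}

// Second mapping: adjustment parameter as a function of the dispersion.
double secondMapping(double dispersion) {
    const double T3 = 16.0, c2 = 1.0;
    if (dispersion > T3) return c2 + (dispersion - T3) * 0.01;  // increases
    return c2;                                                  // second preset constant
}

// Combined adjustment factor: product (alternatively a weighted sum) of the two.
double adjustmentFactor(double mean, double dispersion) {
    return firstMapping(mean) * secondMapping(dispersion);
}
```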
  • different indicators can be selected to determine the adjustment factor, achieving a balance between performance and complexity.
  • in a feasible implementation, the method further includes: performing weighted adjustment on the adjustment factor to obtain an adjusted adjustment factor; correspondingly, determining the adjustment factor of the block to be processed includes: using the adjusted adjustment factor as the adjustment factor of the block to be processed.
  • the adjustment factors are optimized and the encoding efficiency is further improved.
  • in a feasible implementation, the method further includes: updating the adjustment factor according to a quantization parameter of the block to be processed; correspondingly, adjusting the first residual based on the adjustment factor to obtain the second residual of the block to be processed includes: adjusting the first residual based on the updated adjustment factor to obtain the second residual of the block to be processed.
  • the adjustment factor is adjusted in the following manner:
  • QC represents the adjustment factor
  • QP represents the quantization parameter
  • N, M, and X are preset constants.
  • the number of acquired transformation coefficients of the block to be processed is the same as the number of pixels of the block to be processed.
  • in a feasible implementation, the method further includes: arranging the transform coefficients of the block to be processed into a transform coefficient block according to a preset positional relationship; correspondingly, converting the transform coefficients into the first residual of the block to be processed includes: converting the transform coefficient block into a first residual block of the block to be processed; and correspondingly, adjusting the first residual based on the adjustment factor to obtain the second residual of the block to be processed includes: adjusting the first residual block based on the adjustment factor to obtain a second residual block of the block to be processed.
  • the first residual block includes a first luminance residual block of the luminance component of the block to be processed, and the luminance residual pixels in the first luminance residual block correspond one-to-one to the pixels of the luminance component of the block to be processed.
  • the second residual block includes a second luminance residual block of the luminance component of the block to be processed.
  • adjusting the first residual block based on the adjustment factor to obtain the second residual block of the block to be processed includes: adjusting the luminance residual pixels in the first luminance residual block based on the adjustment factor to obtain the luminance residual pixels in the second luminance residual block of the block to be processed.
  • the luminance residual pixels in the second luminance residual block are obtained in the following manner:
  • Res2_Y(i) = (Res1_Y(i) × QC + offset_Y) >> shift_Y
  • where QC represents the adjustment factor, Res1_Y(i) represents the i-th luminance residual pixel in the first luminance residual block, Res2_Y(i) represents the i-th luminance residual pixel in the second luminance residual block, offset_Y and shift_Y are preset constants, and i is a natural number.
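  • A minimal fixed-point implementation of this formula; the chrominance case further below is identical with offset_C and shift_C. The constant values here are illustrative assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Res2_Y(i) = (Res1_Y(i) * QC + offset_Y) >> shift_Y
// With shift_Y = 8, QC = 256 leaves the residual unchanged, and offset_Y = 128
// provides rounding; all three values are assumed placeholders.
std::vector<int32_t> scaleLumaResidual(const std::vector<int32_t>& res1Y, int32_t qc) {
    const int32_t offsetY = 128;
    const int shiftY = 8;
    std::vector<int32_t> res2Y(res1Y.size());
    for (std::size_t i = 0; i < res1Y.size(); ++i) {
        res2Y[i] = (res1Y[i] * qc + offsetY) >> shiftY;
    }
    return res2Y;
}
```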
  • in a feasible implementation, the first residual block includes a first chrominance residual block of the chrominance component of the block to be processed, where the chrominance residual pixels correspond one-to-one to the pixels of the chrominance component of the block to be processed.
  • the second residual block includes a second chrominance residual block of the chrominance component of the block to be processed.
  • adjusting the first residual block based on the adjustment factor to obtain the second residual block of the block to be processed includes: adjusting the chrominance residual pixels in the first chrominance residual block based on the adjustment factor to obtain the chrominance residual pixels in the second chrominance residual block of the block to be processed.
  • the chroma residual pixels in the second chroma residual block are obtained in the following manner:
  • Res2_C(i) = (Res1_C(i) × QC + offset_C) >> shift_C
  • where QC represents the adjustment factor, Res1_C(i) represents the i-th chrominance residual pixel in the first chrominance residual block, Res2_C(i) represents the i-th chrominance residual pixel in the second chrominance residual block, offset_C and shift_C are preset constants, and i is a natural number.
  • in a feasible implementation, the bit-width precision of the luminance residual pixels in the first luminance residual block is higher than that of the luminance residual pixels in the second luminance residual block.
  • likewise, the bit-width precision of the chrominance residual pixels in the first chrominance residual block is higher than that of the chrominance residual pixels in the second chrominance residual block.
  • using higher bit-width precision for the intermediate values improves the accuracy of the operation and the coding efficiency.
  • converting the transform coefficient block into the first residual block of the block to be processed includes: performing inverse quantization on each transform coefficient in the transform coefficient block to obtain an inverse-quantized transform coefficient block; and performing an inverse transform on the inverse-quantized transform coefficient block to obtain the first residual block of the block to be processed. A simplified sketch follows.
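  • A simplified sketch of the inverse-quantization step, assuming a precomputed QP-derived scale and normalization shift. Real codecs (e.g., HEVC) additionally use per-frequency scaling lists, and the 2-D inverse transform that follows is elided here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified scalar dequantization: each parsed level is multiplied by a
// QP-derived scale with rounding. The result would then be passed to a 2-D
// inverse transform (e.g., inverse DCT) to obtain the first residual block.
std::vector<int32_t> inverseQuantize(const std::vector<int32_t>& levels,
                                     int32_t scale,  // QP-derived, assumed given
                                     int shift) {    // normalization, assumed given
    std::vector<int32_t> coeffs(levels.size());
    const int32_t round = (shift > 0) ? (1 << (shift - 1)) : 0;
    for (std::size_t i = 0; i < levels.size(); ++i) {
        coeffs[i] = (levels[i] * scale + round) >> shift;
    }
    return coeffs;
}
```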
  • in a feasible implementation, the method further includes: adding the residual pixels in the second residual to the predicted pixels at the corresponding positions in the block to be processed to obtain reconstructed pixels at those positions.
  • these two steps respectively precede and follow the acquisition of the residual, so that the benefit of adjusting the residual can be combined with other prediction, transform, and quantization techniques. A sketch of the reconstruction step follows.
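  • A sketch of this reconstruction step for 8-bit samples; the clip to [0, 255] is standard practice and an assumption here, since the text above does not spell it out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reconstruct pixels: add each second-residual sample to the co-located
// predicted sample and clip to the valid 8-bit range [0, 255].
void reconstruct(const std::vector<int32_t>& secondResidual,
                 const std::vector<uint8_t>& prediction,
                 std::vector<uint8_t>& reconstruction) {
    reconstruction.resize(prediction.size());
    for (std::size_t i = 0; i < prediction.size(); ++i) {
        int32_t v = static_cast<int32_t>(prediction[i]) + secondResidual[i];
        reconstruction[i] = static_cast<uint8_t>(std::min(255, std::max(0, v)));
    }
}
```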
  • a second aspect of the embodiments of the present application discloses a device for acquiring residuals in video decoding, including: a parsing module, configured to parse a bitstream to obtain transform coefficients of a block to be processed; a transform module, configured to convert the transform coefficients into a first residual of the block to be processed; a calculation module, configured to determine an adjustment factor of the block to be processed according to pixel information in a preset spatial neighborhood of the block to be processed; and an adjustment module, configured to adjust the first residual based on the adjustment factor to obtain a second residual of the block to be processed.
  • in a feasible implementation, the calculation module is further configured to calculate, based on pixel values in the preset spatial neighborhood of the block to be processed, the pixel information in that neighborhood.
  • the calculation module is specifically configured to: obtain one or more pixel sets in the preset spatial neighborhood; and calculate the mean and/or dispersion of the pixels in the one or more pixel sets to obtain the pixel information within the preset spatial neighborhood.
  • the dispersion includes: a mean square error sum, an average absolute error sum, a variance or a standard deviation.
  • the calculation module is further configured to determine that all pixels in each pixel set in the one or more pixel sets have been reconstructed.
  • when the pixel information is the mean, the calculation module is specifically configured to determine the adjustment factor according to the mean and a first mapping relationship between the mean and the adjustment factor, where the first mapping relationship satisfies one or more of the following conditions: when the mean is less than a first threshold, the adjustment factor decreases as the mean increases; when the mean is greater than a second threshold, the adjustment factor increases as the mean increases, where the first threshold is less than or equal to the second threshold; and when the mean is greater than or equal to the first threshold and less than or equal to the second threshold, the adjustment factor is a first preset constant.
  • when the pixel information is the dispersion, the calculation module is specifically configured to determine the adjustment factor according to the dispersion and a second mapping relationship between the dispersion and the adjustment factor, where the second mapping relationship satisfies one or more of the following conditions: when the dispersion is greater than a third threshold, the adjustment factor increases as the dispersion increases; and when the dispersion is less than or equal to the third threshold, the adjustment factor is a second preset constant.
  • when the pixel information is the mean and the dispersion, the calculation module is specifically configured to: determine a first parameter according to the mean and the first mapping relationship; determine a second parameter according to the dispersion and the second mapping relationship; and use the product or a weighted sum of the first parameter and the second parameter as the adjustment factor.
  • in a feasible implementation, the calculation module is further configured to: perform weighted adjustment on the adjustment factor to obtain an adjusted adjustment factor; and use the adjusted adjustment factor as the adjustment factor of the block to be processed.
  • in a feasible implementation, the calculation module is further configured to: update the adjustment factor according to the quantization parameter of the block to be processed; correspondingly, the adjustment module is specifically configured to: adjust the first residual based on the updated adjustment factor to obtain the second residual of the block to be processed.
  • the adjustment factor is adjusted in the following manner:
  • QC represents the adjustment factor
  • QP represents the quantization parameter
  • N, M, and X are preset constants.
  • in a feasible implementation, the number of acquired transform coefficients of the block to be processed is the same as the number of pixels of the block to be processed, and the transform module is further configured to: arrange the transform coefficients of the block to be processed into a transform coefficient block according to a preset positional relationship, and convert the transform coefficient block into a first residual block of the block to be processed; correspondingly, the adjustment module is specifically configured to: adjust the first residual block based on the adjustment factor to obtain a second residual block of the block to be processed.
  • the first residual block includes a first luminance residual block of the luminance component of the block to be processed, and the luminance residual pixels in the first luminance residual block correspond one-to-one to the pixels of the luminance component of the block to be processed.
  • the second residual block includes a second luminance residual block of the luminance component of the block to be processed.
  • the adjustment module is specifically configured to: adjust the luminance residual pixels in the first luminance residual block based on the adjustment factor to obtain the luminance residual pixels in the second luminance residual block of the block to be processed.
  • the luminance residual pixels in the second luminance residual block are obtained in the following manner:
  • Res2_Y(i) = (Res1_Y(i) × QC + offset_Y) >> shift_Y
  • where QC represents the adjustment factor, Res1_Y(i) represents the i-th luminance residual pixel in the first luminance residual block, Res2_Y(i) represents the i-th luminance residual pixel in the second luminance residual block, offset_Y and shift_Y are preset constants, and i is a natural number.
  • in a feasible implementation, the first residual block includes a first chrominance residual block of the chrominance component of the block to be processed, where the chrominance residual pixels correspond one-to-one to the pixels of the chrominance component of the block to be processed.
  • the second residual block includes a second chrominance residual block of the chrominance component of the block to be processed.
  • the adjustment module is specifically configured to: adjust the chrominance residual pixels in the first chrominance residual block based on the adjustment factor to obtain the chrominance residual pixels in the second chrominance residual block of the block to be processed.
  • the chroma residual pixels in the second chroma residual block are obtained in the following manner:
  • Res2_C(i) = (Res1_C(i) × QC + offset_C) >> shift_C
  • where QC represents the adjustment factor, Res1_C(i) represents the i-th chrominance residual pixel in the first chrominance residual block, Res2_C(i) represents the i-th chrominance residual pixel in the second chrominance residual block, offset_C and shift_C are preset constants, and i is a natural number.
  • the bit-width precision of the luminance residual pixels in the first luminance residual block is higher than that of the luminance residual pixels in the second luminance residual block.
  • likewise, the bit-width precision of the chrominance residual pixels in the first chrominance residual block is higher than that of the chrominance residual pixels in the second chrominance residual block.
  • the conversion module is specifically configured to: inverse quantize each transform coefficient in the transform coefficient block to obtain an inverse quantized transform coefficient block;
  • the inverse-quantized transform coefficient block is inversely transformed to obtain a first residual block of the block to be processed.
  • in a feasible implementation, the apparatus further includes: a reconstruction unit, configured to add a residual pixel in the second residual to a predicted pixel at the corresponding position in the block to be processed to obtain a reconstructed pixel at that position.
  • a third aspect of the present application provides a device for acquiring residuals. The device may be applied to an encoding side or a decoding side.
  • the device includes a processor and a memory, and the processor and the memory are connected (for example, through a bus).
  • the device may further include a transceiver, connected to the processor and the memory, for receiving/sending data.
  • the memory is used to store program code and video data.
  • the processor may be configured to read the program code stored in the memory and execute the method described in the first aspect.
  • a fourth aspect of the present application provides a video codec system, which includes a source device and a destination device.
  • the source device can communicate with the destination device.
  • the source device generates encoded video data; therefore, the source device may be referred to as a video encoding device or a video encoding apparatus.
  • the destination device can decode the encoded video data generated by the source device; therefore, the destination device may be referred to as a video decoding device or a video decoding apparatus.
  • the source device and the destination device may be examples of video codec devices or video codec apparatuses, to which the method described in the first aspect is applied.
  • a fifth aspect of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores instructions that, when run on a computer, cause the computer to execute the method described in the first aspect above.
  • a sixth aspect of the present application provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method described in the first aspect above.
  • FIG. 1 is an exemplary block diagram of a video encoding and decoding system that can be configured for use in an embodiment of the present application
  • FIG. 2 is an exemplary system block diagram of a video encoder that can be configured for use in an embodiment of the present application
  • FIG. 3 is an exemplary system block diagram of a video decoder that can be configured for use in an embodiment of the present application
  • FIG. 4 is a schematic flowchart of an exemplary method for acquiring residuals for video data decoding according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of pixels in a spatial neighborhood of a block to be processed in an exemplary embodiment of the present application
  • FIG. 6 is a system block diagram of an exemplary hardware pipeline design in an embodiment of the present application.
  • FIG. 7 is a system block diagram of a residual acquisition device for video data decoding according to an exemplary embodiment of the present application.
  • FIG. 8 is a system block diagram of a residual acquisition device for video data decoding according to an exemplary embodiment of the present application.
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system 10 according to an embodiment of the present application.
  • the system 10 includes a source device 12 that generates encoded video data to be decoded by the destination device 14 at a later time.
  • Source device 12 and destination device 14 may comprise any of a wide range of devices, including desktop computers, notebook computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, so-called "smart" pads, televisions, cameras, display devices, digital media players, video game consoles, video streaming devices, or the like.
  • the source device 12 and the destination device 14 may be equipped for wireless communication.
  • the destination device 14 may receive the encoded video data to be decoded via the link 16.
  • the link 16 may include any type of media or device capable of moving the encoded video data from the source device 12 to the destination device 14.
  • the link 16 may include a communication medium that enables the source device 12 to directly transmit the encoded video data to the destination device 14 in real time.
  • the encoded video data may be modulated according to a communication standard (eg, a wireless communication protocol) and transmitted to the destination device 14.
  • Communication media may include any wireless or wired communication media, such as a radio frequency spectrum or one or more physical transmission lines. Communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network such as the Internet.
  • the communication medium may include a router, a switch, a base station, or any other equipment that may be used to facilitate communication from the source device 12 to the destination device 14.
  • the encoded data may be output from the output interface 22 to the storage device 24.
  • the encoded data can be accessed from the storage device 24 by an input interface.
  • the storage device 24 may include any of a variety of distributed or locally-accessed data storage media, such as a hard drive, Blu-ray disc, DVD, CD-ROM, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data.
  • the storage device 24 may correspond to a file server or another intermediate storage device that may hold the encoded video produced by the source device 12.
  • the destination device 14 may access the stored video data from the storage device 24 via streaming or download.
  • the file server may be any type of server capable of storing encoded video data and transmitting this encoded video data to the destination device 14.
  • the file server includes a web server, a file transfer protocol server, a network attached storage device, or a local disk drive.
  • the destination device 14 may access the encoded video data via any standard data connection including an Internet connection.
  • This data connection may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., a cable modem, etc.), or a combination of both, suitable for accessing encoded video data stored on a file server.
  • the transmission of the encoded video data from the storage device 24 may be a streaming transmission, a download transmission, or a combination of the two.
  • the techniques of this application are not necessarily limited to wireless applications or settings.
  • the technology can be applied to video decoding to support any of a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding digital video for storage on a data storage medium, decoding digital video stored on a data storage medium, or other applications.
  • the system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and / or video telephony.
  • the source device 12 includes a video source 18, a video encoder 20, and an output interface 22.
  • the output interface 22 may include a modulator / demodulator (modem) and / or a transmitter.
  • the video source 18 may include a source such as a video capture device (e.g., a video camera), a video archive containing previously captured video, a video feed interface to receive video from a video content provider , And / or a computer graphics system for generating computer graphics data as a source video, or a combination of these sources.
  • the source device 12 and the destination device 14 may form a so-called camera phone or video phone.
  • the techniques described in this application are exemplarily applicable to video decoding, and may be applicable to wireless and/or wired applications.
  • Captured, pre-captured, or computer-generated video may be encoded by video encoder 20.
  • the encoded video data may be transmitted directly to the destination device 14 via the output interface 22 of the source device 12.
  • the encoded video data may also (or alternatively) be stored on the storage device 24 for later access by the destination device 14 or other device for decoding and / or playback.
  • the destination device 14 includes an input interface 28, a video decoder 30, and a display device 32.
  • the input interface 28 may include a receiver and / or a modem.
  • the input interface 28 of the destination device 14 receives the encoded video data via the link 16.
  • the encoded video data communicated via the link 16, or provided on the storage device 24, may include various syntax elements generated by the video encoder 20 for use by the video decoder 30 in decoding the video data. These syntax elements may be included with the encoded video data transmitted on a communication medium, stored on a storage medium, or stored on a file server.
  • the display device 32 may be integrated with or external to the destination device 14.
  • the destination device 14 may include an integrated display device and also be configured to interface with an external display device.
  • the destination device 14 may be a display device.
  • the display device 32 displays the decoded video data to a user, and may include any of a variety of display devices, such as a liquid crystal display, a plasma display, an organic light emitting diode display, or another type of display device.
  • Video encoder 20 and video decoder 30 may operate according to, for example, the next-generation video codec compression standard (H.266) currently under development and may conform to the H.266 test model (JEM).
  • the video encoder 20 and video decoder 30 may alternatively operate according to other proprietary or industry standards, for example, the ITU-T H.265 standard (also referred to as the high-efficiency video decoding standard), or the ITU-T H.264 standard or extensions of these standards; the ITU-T H.264 standard is alternatively referred to as MPEG-4 Part 10, also known as Advanced Video Coding (AVC).
  • the techniques of this application are not limited to any particular decoding standard.
  • Other video compression standards include MPEG-2 and ITU-T H.263.
  • video encoder 20 and video decoder 30 may each be integrated with an audio encoder and decoder, and may include an appropriate multiplexer-demultiplexer (MUX-DEMUX) unit or other hardware and software to handle encoding of both audio and video in a common or separate data stream.
  • the MUX-DEMUX unit may conform to the ITU-T H.223 multiplexer protocol or other protocols such as the User Datagram Protocol (UDP).
  • Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable encoder circuits, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), Field Programmable Gate Array (FPGA), discrete logic, software, hardware, firmware, or any combination thereof.
  • the device may store the software's instructions in a suitable non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this application.
  • Each of the video encoder 20 and the video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a corresponding device.
  • video encoder 20 may signal information by associating specific syntax elements with various encoded portions of video data. That is, video encoder 20 may "signal" the data by storing specific syntax elements to the header information of various encoded portions of the video data. In some applications, these syntax elements may be encoded and stored (eg, stored to storage system 34 or file server 36) before being received and decoded by video decoder 30.
  • the term "signaling" may exemplarily refer to the communication of syntax or other data used to decode compressed video data, regardless of whether this communication occurs in real time, in near real time, or over a span of time, such as when syntax elements are stored to a medium at encoding time and can then be retrieved by a decoding device at any time after being stored on this medium.
  • H.265 is also referred to as High Efficiency Video Coding (HEVC), and its reference test model is referred to as the HEVC test model (HM).
  • the latest standard document of H.265 can be obtained from http://www.itu.int/rec/T-REC-H.265.
  • the latest version of the standard document is H.265 (12/16), and that standard document is incorporated herein by reference in its entirety.
  • HM assumes that video decoding devices have several additional capabilities relative to the existing algorithms of ITU-T H.264/AVC. For example, H.264 provides 9 intra-prediction coding modes, while HM provides up to 35 intra-prediction coding modes.
  • the H.266 test model (JEM) is an evolution model of the video decoding device used in the H.266 standardization effort.
  • the algorithm description of H.266 can be obtained from http://phenix.int-evry.fr/jvet. The latest algorithm description is included in JVET-G1001-v1.
  • the algorithm description document is incorporated herein by reference in its entirety.
  • reference software for the JEM test model can be obtained from https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/, which is also incorporated herein by reference in its entirety.
  • HM can divide a video frame or image into a sequence of tree blocks or maximum coding units (LCUs) containing both luminance and chrominance samples. LCUs are also called CTUs.
  • the tree block has a similar purpose as the macro block of the H.264 standard.
  • a slice contains several consecutive tree blocks in decoding order.
  • a video frame or image can be split into one or more slices.
  • Each tree block can be split into coding units according to a quadtree. For example, a tree block that is the root node of a quadtree can be split into four child nodes, and each child node can be a parent node and split into another four child nodes.
  • the final indivisible child nodes that are leaf nodes of the quadtree include decoding nodes, such as decoded video blocks.
  • the syntax data associated with the decoded codestream can define the maximum number of times a tree block can be split, and can also define the minimum size of a decoding node.
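  • As an illustration of this recursive partitioning, the sketch below enumerates the leaf decoding nodes of a tree block. The split-decision callback stands in for the parsed split flags, and all names are hypothetical; the maximum depth and minimum node size mirror the limits the syntax data can define.

```cpp
#include <vector>

struct Node { int x, y, size; };

// Recursively split a tree block into decoding nodes. `shouldSplit` stands in
// for the split flags signaled in the codestream.
void splitQuadtree(const Node& n, int depth, int maxDepth, int minSize,
                   bool (*shouldSplit)(const Node&, int),
                   std::vector<Node>& leaves) {
    if (depth >= maxDepth || n.size <= minSize || !shouldSplit(n, depth)) {
        leaves.push_back(n);  // indivisible child node: a decoding node
        return;
    }
    int h = n.size / 2;       // split into four equal child nodes
    splitQuadtree({n.x,     n.y,     h}, depth + 1, maxDepth, minSize, shouldSplit, leaves);
    splitQuadtree({n.x + h, n.y,     h}, depth + 1, maxDepth, minSize, shouldSplit, leaves);
    splitQuadtree({n.x,     n.y + h, h}, depth + 1, maxDepth, minSize, shouldSplit, leaves);
    splitQuadtree({n.x + h, n.y + h, h}, depth + 1, maxDepth, minSize, shouldSplit, leaves);
}
```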
  • a coding unit (CU) includes a decoding node, as well as a prediction unit (PU) and a transform unit (TU) associated with the decoding node.
  • the size of the CU corresponds to the size of the decoding node and the shape must be square.
  • the size of the CU can range from 8×8 pixels up to a maximum block size of 64×64 pixels or larger.
  • Each CU may contain one or more PUs and one or more TUs.
  • the syntax data associated with a CU may describe a case where a CU is partitioned into one or more PUs.
  • the partition mode may be different between cases where the CU is skipped or is coded in direct mode, intra prediction mode, or inter prediction mode.
  • the PU can be divided into non-square shapes.
  • the syntax data associated with a CU may also describe a case where a CU is partitioned into one or more TUs according to a quadtree.
  • the shape of the TU can be square or non-square.
  • the HEVC standard allows transformation based on the TU, which can be different for different CUs.
  • the size of a TU is usually set based on the size of the PUs within a given CU defined for the partitioned LCU, although this may not always be the case.
  • the size of the TU is usually the same as or smaller than the PU.
  • a quad-tree structure called "residual quad tree" (RQT) can be used to subdivide the residual samples corresponding to the CU into smaller units.
  • the leaf node of the RQT may be called a TU.
  • the pixel difference values associated with the TUs may be transformed to produce transform coefficients, which may be quantized.
  • the PU contains data related to the prediction process.
  • the PU may include data describing the intra-prediction mode of the PU.
  • the PU may include data defining a motion vector of the PU.
  • the data defining the motion vector of the PU may describe the horizontal component of the motion vector, the vertical component of the motion vector, the resolution of the motion vector (e.g., quarter-pixel precision or eighth-pixel precision), the reference image to which the motion vector points, and/or the reference image list of the motion vector (e.g., List 0, List 1, or List C). A struct-level sketch follows.
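  • The motion data enumerated above can be pictured as a small structure; the field names and the quarter-pel unit convention are illustrative assumptions, not a layout from the text.

```cpp
#include <cstdint>

// Motion information carried for a PU, per the enumeration above.
// Quarter-pixel units are assumed for the vector components.
struct MotionInfo {
    int16_t mvX;      // horizontal component, quarter-pel units
    int16_t mvY;      // vertical component, quarter-pel units
    uint8_t refIdx;   // index of the reference image within the list
    uint8_t refList;  // 0 = List 0, 1 = List 1 (2 could denote List C)
};
```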
  • TU uses transform and quantization processes.
  • a given CU with one or more PUs may also contain one or more TUs.
  • video encoder 20 may calculate residual values corresponding to the PU. The residual values comprise pixel differences, which can be transformed into transform coefficients, quantized, and scanned using the TUs to generate serialized transform coefficients for entropy decoding.
  • This application generally uses the term "video block" to refer to the decoding node of a CU.
  • the term “video block” may also be used in this application to refer to a tree block including a decoding node and a PU and a TU, such as an LCU or a CU.
  • a video sequence usually contains a series of video frames or images.
  • a group of pictures (GOP) exemplarily comprises a series of one or more video pictures.
  • the GOP may include syntax data in the header information of the GOP, the header information of one or more of the pictures, or elsewhere, and the syntax data describes the number of pictures included in the GOP.
  • Each slice of the image may contain slice syntax data describing the coding mode of the corresponding image.
  • Video encoder 20 typically operates on video blocks within individual video slices to encode video data.
  • a video block may correspond to a decoding node within a CU.
  • Video blocks may have fixed or varying sizes, and may differ in size according to a specified decoding standard.
  • HM supports prediction with various PU sizes. Assuming that the size of a specific CU is 2N×2N, HM supports intra prediction with PU sizes of 2N×2N or N×N, and inter prediction with symmetric PU sizes of 2N×2N, 2N×N, N×2N, or N×N. HM also supports asymmetric partitioning for inter prediction with PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N. In asymmetric partitioning, one direction of the CU is not partitioned, while the other direction is partitioned into 25% and 75%.
  • for example, 2N×nU refers to a horizontally partitioned 2N×2N CU with a 2N×0.5N PU at the top and a 2N×1.5N PU at the bottom. The PU sizes implied by each mode are sketched below.
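  • For concreteness, the following sketch computes the PU dimensions implied by each partition mode of a CU of side 2N; the enum and function names are illustrative.

```cpp
#include <utility>
#include <vector>

enum class PartMode { P2Nx2N, P2NxN, PNx2N, PNxN, P2NxnU, P2NxnD, PnLx2N, PnRx2N };

// Return the (width, height) of each PU for a CU of side s (s = 2N).
// Asymmetric modes split one direction 25%/75%, as described above.
std::vector<std::pair<int, int>> puSizes(PartMode m, int s) {
    switch (m) {
        case PartMode::P2Nx2N: return {{s, s}};
        case PartMode::P2NxN:  return {{s, s / 2}, {s, s / 2}};
        case PartMode::PNx2N:  return {{s / 2, s}, {s / 2, s}};
        case PartMode::PNxN:   return {{s / 2, s / 2}, {s / 2, s / 2},
                                       {s / 2, s / 2}, {s / 2, s / 2}};
        case PartMode::P2NxnU: return {{s, s / 4}, {s, 3 * s / 4}};  // top 2Nx0.5N
        case PartMode::P2NxnD: return {{s, 3 * s / 4}, {s, s / 4}};
        case PartMode::PnLx2N: return {{s / 4, s}, {3 * s / 4, s}};
        case PartMode::PnRx2N: return {{3 * s / 4, s}, {s / 4, s}};
    }
    return {};
}
```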
  • "N×N" and "N by N" are used interchangeably to refer to the pixel size of a video block in terms of its vertical and horizontal dimensions, for example, 16×16 pixels or 16 by 16 pixels.
  • an N×N block has N pixels in the vertical direction and N pixels in the horizontal direction, where N represents a non-negative integer value.
  • Pixels in a block can be arranged in rows and columns.
  • the block does not necessarily need to have the same number of pixels in the horizontal direction as in the vertical direction.
  • a block may include N×M pixels, where M is not necessarily equal to N.
  • the video encoder 20 may calculate the residual data of the TU of the CU.
  • the PU may include pixel data in the spatial domain (also referred to as the pixel domain), while the TU may include coefficients in the transform domain after a transform (e.g., a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform) is applied to the residual video data.
  • the residual data may correspond to a pixel difference between a pixel of an uncoded image and a prediction value corresponding to a PU.
  • Video encoder 20 may form a TU containing residual data of the CU, and then transform the TU to generate transform coefficients for the CU.
  • video encoder 20 may perform quantization of the transform coefficients.
  • Quantization exemplarily refers to the process of quantizing coefficients to possibly reduce the amount of data used to represent the coefficients to provide further compression.
  • the quantization process may reduce the bit depth associated with some or all of the coefficients. For example, n-bit values may be rounded down to m-bit values during quantization, where n is greater than m.
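  • As a minimal illustration of the bit-depth reduction just described (assuming a nonnegative n-bit value):

```cpp
#include <cstdint>

// Reduce an n-bit value to m bits (n > m), as in quantization. The text above
// describes rounding down, which a plain right shift implements; adding
// (1 << (n - m - 1)) before the shift would give round-to-nearest instead.
int32_t reduceBitDepth(int32_t value, int n, int m) {
    return value >> (n - m);  // keep the m most significant bits
}
```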
  • the JEM model further improves the coding structure of video images.
  • a block coding structure called "quadtree and binary tree" (QTBT) is introduced.
  • the QTBT structure abandons the concepts of CU, PU, and TU in HEVC, and supports more flexible CU division shapes.
  • a CU can be square or rectangular.
  • a CTU first performs a quadtree partition, and the leaf nodes of the quadtree further perform a binary tree partition.
  • there are two partitioning modes in binary tree partitioning: symmetric horizontal partitioning and symmetric vertical partitioning.
  • the leaf nodes of a binary tree are called CUs.
  • in JEM, a CU is not further divided during prediction and transform, which means that the CU, PU, and TU in JEM have the same block size.
  • the maximum size of the CTU is 256×256 luminance pixels.
  • video encoder 20 may utilize a predefined scan order to scan the quantized transform coefficients to generate a serialized vector that can be entropy encoded.
  • in some feasible implementations, video encoder 20 may perform adaptive scanning. After scanning the quantized transform coefficients to form a one-dimensional vector, video encoder 20 may entropy-decode the one-dimensional vector using context-adaptive variable-length decoding (CAVLC), context-adaptive binary arithmetic decoding (CABAC), syntax-based context-adaptive binary arithmetic decoding (SBAC), probability interval partitioning entropy (PIPE) decoding, or another entropy decoding method.
  • Video encoder 20 may also entropy encode syntax elements associated with the encoded video data for use by video decoder 30 to decode the video data.
  • video encoder 20 may assign a context within a context model to a symbol to be transmitted. Context can be related to whether adjacent values of a symbol are non-zero.
  • video encoder 20 may select a variable length code of a symbol to be transmitted. Codewords in Variable Length Decoding (VLC) can be constructed such that relatively short codes correspond to more likely symbols and longer codes correspond to less likely symbols. In this way, the use of VLC can achieve the goal of saving code rates relative to using equal length codewords for each symbol to be transmitted.
  • the probability in CABAC can be determined based on the context assigned to the symbol.
  • FIG. 2 is a schematic block diagram of a video encoder 20 according to an embodiment of the present application.
  • Video encoder 20 may perform intra-frame decoding and inter-frame decoding of video blocks within a video slice.
  • Intra decoding relies on spatial prediction to reduce or remove the spatial redundancy of a video within a given video frame or image.
  • Inter-frame decoding relies on temporal prediction to reduce or remove temporal redundancy of video within adjacent frames of a video sequence or video.
  • the intra mode (I mode) may refer to any of several space-based compression modes.
  • Inter-modes such as unidirectional prediction (P mode) or bidirectional prediction (B mode) may refer to any of several time-based compression modes.
  • the video encoder 20 includes a segmentation unit 35, a prediction unit 41, a reference image memory 64, a summer 50, a transformation processing unit 52, a quantization unit 54, and an entropy encoding unit 56.
  • the prediction unit 41 includes a motion estimation unit 42, a motion compensation unit 44, and an intra prediction module 46.
  • the video encoder 20 also includes an inverse quantization unit 58, an inverse transform unit 60, and a summer 62.
  • a deblocking filter (not shown in Figure 2) may also be included to filter block boundaries to remove block effect artifacts from the reconstructed video. When needed, the deblocking filter will typically filter the output of the summer 62. In addition to the deblocking filter, additional loop filters (in-loop or post-loop) can be used.
  • the video encoder 20 receives video data, and the segmentation unit 35 divides the data into video blocks.
  • This segmentation may also include segmentation into slices, image blocks, or other larger units, and video block segmentation, for example, based on the quad-tree structure of the LCU and CU.
  • Video encoder 20 exemplarily illustrates the components of a video block encoded within a video slice to be encoded. In general, a slice can be divided into multiple video blocks (and possibly into a collection of video blocks called image blocks).
  • the prediction unit 41 may select one of a plurality of possible decoding modes of the current video block, such as a plurality of intra decoding modes, based on a calculation result of coding quality and cost (for example, code rate-distortion cost, RDcost, also called rate distortion cost).
  • The prediction unit 41 may provide the resulting intra-decoded or inter-decoded block to the summer 50 to generate residual block data, and provide it to the summer 62 to reconstruct the encoded block for use as a reference image.
  • the motion estimation unit 42 and the motion compensation unit 44 within the prediction unit 41 perform inter-predictive decoding of a current video block with respect to one or more predictive blocks in one or more reference images to provide temporal compression.
  • the motion estimation unit 42 may be configured to determine an inter prediction mode of a video slice according to a predetermined pattern of a video sequence. The predetermined mode can specify the video slices in the sequence as P slices, B slices, or GPB slices.
  • the motion estimation unit 42 and the motion compensation unit 44 may be highly integrated, but are described separately for conceptual purposes.
  • the motion estimation performed by the motion estimation unit 42 is a process of generating a motion vector of an estimated video block.
  • a motion vector may indicate a displacement of a PU of a video block within a current video frame or image relative to a predictive block within a reference image.
  • a predictive block is a block that is found to closely match the PU of the video block to be decoded according to the pixel difference.
  • the pixel difference can be determined by the sum of absolute differences (SAD), sum of squared differences (SSD), or other differences.
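  • The two matching criteria mentioned here can be sketched as follows for row-major blocks with a stride; the function names are illustrative.

```cpp
#include <cstdint>
#include <cstdlib>

// Sum of absolute differences between a candidate predictive block and the
// current block (both width x height, row-major with the given strides).
int64_t sad(const uint8_t* cur, int curStride,
            const uint8_t* ref, int refStride, int width, int height) {
    int64_t acc = 0;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            acc += std::abs(int(cur[y * curStride + x]) - int(ref[y * refStride + x]));
    return acc;
}

// Sum of squared differences: same loop, squared error instead of |.|.
int64_t ssd(const uint8_t* cur, int curStride,
            const uint8_t* ref, int refStride, int width, int height) {
    int64_t acc = 0;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            int d = int(cur[y * curStride + x]) - int(ref[y * refStride + x]);
            acc += int64_t(d) * d;
        }
    return acc;
}
```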
  • the video encoder 20 may calculate a value of a sub-integer pixel position of a reference image stored in the reference image memory 64. For example, video encoder 20 may interpolate values of quarter pixel positions, eighth pixel positions, or other fractional pixel positions of the reference image. Therefore, the motion estimation unit 42 may perform a motion search with respect to the full pixel position and the fractional pixel position and output a motion vector with a fractional pixel accuracy.
  • the motion estimation unit 42 calculates the motion vector of the PU of the video block in the inter-decoded slice by comparing the position of the PU with the position of the predictive block of the reference image.
  • Reference images can be selected from the first reference image list (List 0) or the second reference image list (List 1), each of the lists identifying one or more reference images stored in the reference image memory 64.
  • the motion estimation unit 42 sends the calculated motion vector to the entropy encoding unit 56 and the motion compensation unit 44.
  • Motion compensation performed by the motion compensation unit 44 may involve extracting or generating a predictive block based on a motion vector determined by motion estimation, possibly performing interpolation to sub-pixel accuracy. After receiving the motion vector of the PU of the current video block, the motion compensation unit 44 can locate the predictive block pointed to by the motion vector in one of the reference image lists.
  • Video encoder 20 forms a residual video block by subtracting the pixel value of the predictive block from the pixel value of the current video block being decoded, thereby forming a pixel difference value.
  • the pixel difference values form the residual data of the block, and may include both luminance and chrominance difference components.
  • the summer 50 represents one or more components that perform this subtraction operation.
  • Motion compensation unit 44 may also generate syntax elements associated with video blocks and video slices for use by video decoder 30 to decode video blocks of video slices.
  • the picture containing the PU can be associated with two reference picture lists called "List 0" and "List 1".
  • an image containing B bands may be associated with a list combination that is a combination of list 0 and list 1.
  • the motion estimation unit 42 may perform unidirectional prediction or bidirectional prediction for the PU. In some feasible implementations, bidirectional prediction is prediction based on the List 0 and List 1 reference image lists, respectively; in other feasible implementations, bidirectional prediction is prediction based on a reconstructed future frame and a reconstructed past frame, in display order, of the current frame, respectively.
  • the motion estimation unit 42 may search a reference image of List 0 or List 1 for a reference block for the PU.
  • the motion estimation unit 42 may then generate a reference index indicating a reference image containing a reference block in List 0 or List 1 and a motion vector indicating a spatial displacement between the PU and the reference block.
  • the motion estimation unit 42 may output a reference index, a prediction direction identifier, and a motion vector as motion information of the PU.
  • the prediction direction identifier may indicate whether the reference index indicates a reference image in List 0 or in List 1.
  • the motion compensation unit 44 may generate a predictive image block of the PU based on a reference block indicated by the motion information of the PU.
  • when performing bidirectional prediction for the PU, the motion estimation unit 42 may search the reference images in List 0 for a reference block for the PU, and may also search the reference images in List 1 for another reference block for the PU.
  • the motion estimation unit 42 may then generate reference indexes indicating the reference images in List 0 and List 1 that contain the reference blocks, and motion vectors indicating the spatial displacements between the reference blocks and the PU.
  • the motion estimation unit 42 may output the reference indexes and the motion vectors of the PU as the motion information of the PU.
  • the motion compensation unit 44 may generate a predictive image block of the PU based on a reference block indicated by the motion information of the PU.
  • the motion estimation unit 42 does not output the complete set of motion information for the PU to the entropy encoding module 56. Instead, the motion estimation unit 42 may signal the motion information of the PU with reference to the motion information of another PU. For example, the motion estimation unit 42 may determine that the motion information of the PU is sufficiently similar to the motion information of a neighboring PU. In this implementation, the motion estimation unit 42 may indicate an indication value in the syntax structure associated with the PU, which indicates to the video decoder 30 that the PU has the same motion information as the neighboring PU or has motion information that can be derived from a neighboring PU.
  • the motion estimation unit 42 may identify candidate prediction motion vectors and motion vector differences (MVDs) associated with neighboring PUs in a syntax structure associated with the PUs.
  • MVD indicates the difference between the motion vector of the PU and the indicated candidate prediction motion vector associated with the neighboring PU.
  • Video decoder 30 may use the indicated candidate predicted motion vector and MVD to determine the motion vector of the PU.
  • the prediction module 41 may generate a list of candidate prediction motion vectors for each PU of the CU.
  • One or more of the candidate prediction motion vector lists may include one or more original candidate prediction motion vectors and one or more additional candidate prediction motion vectors derived from the original candidate prediction motion vectors.
  • the intra prediction unit 46 within the prediction unit 41 may perform intra-predictive decoding of the current video block relative to one or more neighboring blocks in the same image or slice as the current block to be decoded, to provide spatial compression. Therefore, instead of the inter prediction performed by the motion estimation unit 42 and the motion compensation unit 44 (as described above), the intra prediction unit 46 may intra-predict the current block. In particular, the intra prediction unit 46 may determine an intra prediction mode to use to encode the current block. In some feasible implementations, the intra prediction unit 46 may, for example, encode the current block using various intra prediction modes during separate encoding passes, and the intra prediction unit 46 (or, in some feasible implementations, the mode selection unit 40) may select an appropriate intra prediction mode to use from the tested modes.
  • the video encoder 20 forms a residual video block by subtracting the predictive block from the current video block.
  • the residual video data in the residual block may be included in one or more TUs and applied to the transform processing unit 52.
  • the transform processing unit 52 transforms the residual video data into residual transform coefficients using a transform such as a discrete cosine transform (DCT) or a conceptually similar transform (eg, a discrete sine transform DST).
  • the transform processing unit 52 may transform the residual video data from a pixel domain to a transform domain (for example, a frequency domain).
  • the transformation processing unit 52 may send the obtained transformation coefficient to the quantization unit 54.
  • the quantization unit 54 quantizes the transform coefficients to further reduce the code rate.
  • the quantization process may reduce the bit depth associated with some or all of the coefficients.
  • the degree of quantization can be modified by adjusting the quantization parameters.
  • the quantization unit 54 may then perform a scan of the matrix containing the quantized transform coefficients.
  • the entropy encoding unit 56 may perform scanning.
  • the entropy encoding unit 56 may entropy encode the quantized transform coefficients.
  • the entropy encoding unit 56 may perform context adaptive variable length decoding (CAVLC), context adaptive binary arithmetic decoding (CABAC), syntax-based context adaptive binary arithmetic decoding (SBAC), probability interval partitioning entropy (PIPE) decoding or another entropy coding method or technique.
  • the entropy encoding unit 56 may also entropy encode the motion vector and other syntax elements of the current video slice being decoded.
  • the encoded code stream may be transmitted to the video decoder 30, or archived for later transmission or retrieval by the video decoder 30.
  • the entropy encoding unit 56 may encode information indicating a selected intra prediction mode according to the technique of the present application.
  • Video encoder 20 may include, in the transmitted bitstream configuration data, definitions of the encoding contexts of the various blocks and, for each of the contexts, an indication of the most probable mode (MPM), an intra prediction mode index table, and a modified intra prediction mode index table; the configuration data may include multiple intra prediction mode index tables and multiple modified intra prediction mode index tables (also known as codeword mapping tables).
  • the inverse quantization unit 58 and the inverse transform unit 60 respectively apply inverse quantization and inverse transform to reconstruct a residual block in the pixel domain for later use as a reference block of a reference image.
  • the motion compensation unit 44 may calculate a reference block by adding a residual block to a predictive block of one of the reference pictures within one of the reference picture lists.
  • the motion compensation unit 44 may also apply one or more interpolation filters to the reconstructed residual block to calculate sub-integer pixel values for motion estimation.
  • the summer 62 adds the reconstructed residual block and the motion-compensated prediction block generated by the motion compensation unit 44 to generate a reference block for storage in the reference image memory 64.
  • the reference block may be used by the motion estimation unit 42 and the motion compensation unit 44 as a reference block to inter-predict a block in a subsequent video frame or image.
  • a scaling factor may be calculated according to the reconstructed spatial-neighborhood pixel information around the current block to be encoded, and the obtained residual is scaled using the scaling factor, to obtain residual data for subsequent reconstruction of reference blocks or reference pixels.
  • FIG. 3 is a schematic block diagram of a video decoder 30 in an embodiment of the present application.
  • the video decoder 30 includes an entropy decoding unit 80, a prediction unit 81, an inverse quantization unit 86, an inverse transform unit 88, a summer 90, and a reference image memory 92.
  • the prediction unit 81 includes a motion compensation unit 82 and an intra prediction unit 84.
  • the video decoder 30 may perform a decoding flow that is substantially reciprocal to the encoding flow described with respect to the video encoder 20.
  • video decoder 30 receives from video encoder 20 an encoded video codestream representing video blocks of the encoded video slice and associated syntax elements.
  • the entropy decoding unit 80 of the video decoder 30 entropy decodes the bitstream to generate quantized coefficients, motion vectors, and other syntax elements.
  • the entropy decoding unit 80 forwards the motion vectors and other syntax elements to the prediction unit 81.
  • Video decoder 30 may receive syntax elements at a video slice level and / or a video block level.
  • when the video slice is decoded as an intra-decoded (I) slice, the intra prediction unit 84 of the prediction unit 81 may generate prediction data for a video block of the current video slice based on a signaled intra prediction mode and data from previously decoded blocks of the current frame or image.
  • when the video image is decoded into an inter-decoded (e.g., B, P, or GPB) slice, the motion compensation unit 82 of the prediction unit 81 generates predictive blocks for the video blocks of the current video image based on the motion vectors and other syntax elements received from the entropy decoding unit 80.
  • a predictive block may be generated from one of the reference pictures within one of the reference picture lists.
  • the video decoder 30 may construct a reference image list (List 0 and List 1) using a default construction technique based on the reference image stored in the reference image memory 92.
  • the motion compensation unit 82 determines prediction information for a video block of the current video slice by parsing the motion vector and other syntax elements, and uses the prediction information to generate the predictive block of the current video block being decoded. For example, the motion compensation unit 82 uses some of the received syntax elements to determine the prediction mode (e.g., intra prediction or inter prediction) used to decode the video blocks of the video slice, the inter prediction slice type (e.g., B slice, P slice, or GPB slice), construction information for one or more of the slice's reference picture lists, the motion vector of each inter-encoded video block of the slice, the inter prediction status of each inter-decoded video block of the slice, and other information used to decode the video blocks in the current video slice.
  • the motion compensation unit 82 may also perform interpolation based on the interpolation filter.
  • the motion compensation unit 82 may calculate an interpolation value of the sub-integer pixels of the reference block using an interpolation filter as used by the video encoder 20 during encoding of the video block.
  • the motion compensation unit 82 may determine an interpolation filter used by the video encoder 20 from the received syntax elements and use the interpolation filter to generate a predictive block.
  • the motion compensation unit 82 may generate a list of candidate prediction motion vectors for the PU.
  • the codestream may include data identifying the position of the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU.
  • the motion compensation unit 82 may generate a predictive image block for the PU based on one or more reference blocks indicated by the motion information of the PU.
  • a reference block of a PU may be in a different temporal image from the PU.
  • the motion compensation unit 82 may determine the motion information of the PU based on the selected motion information in the candidate prediction motion vector list of the PU.
  • the inverse quantization unit 86 performs inverse quantization (that is, dequantization) on the quantized transform coefficients provided in the code stream and decoded by the entropy decoding unit 80.
  • the inverse quantization process may include determining the degree of quantization using quantization parameters calculated by video encoder 20 for each video block in the video slice, and similarly determining the degree of inverse quantization that should be applied.
  • the inverse transform unit 88 applies an inverse transform (for example, an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process) to the transform coefficients to generate a residual block in the pixel domain.
  • a scaling factor can be calculated according to the reconstructed spatial-neighborhood pixel information around the current block to be decoded, and the scaling factor is used to scale the obtained residual, to obtain residual data for subsequent reconstruction of the block to be decoded.
  • the video decoder 30 sums the residual block from the inverse transform unit 88 with the corresponding predictive block generated by the motion compensation unit 82, to form decoded video blocks.
  • the summer 90 represents one or more components that perform this summing operation.
  • a deblocking filter may also be applied to filter the decoded blocks in order to remove block effect artifacts.
  • Other loop filters (in or after the decoding loop) can also be used to smooth pixel transitions or otherwise improve video quality.
  • the decoded video blocks in a given frame or image are then stored in a reference image memory 92, which stores a reference image for subsequent motion compensation.
  • the reference image memory 92 also stores decoded video for later presentation on a display device such as the display device 32 of FIG. 1.
  • the video decoder here includes, for example, the video encoder 20 and the video decoder 30 shown and described with respect to FIGS. 1-3. That is, in a feasible implementation, the inverse transform unit 60 described with reference to FIG. 2, or another newly added functional unit, may perform the specific techniques described below after performing the inverse transform during encoding of a block of video data. In another feasible implementation, the inverse transform unit 88, or another newly added functional unit, described with respect to FIG. 3 may perform the specific techniques described below during decoding of a block of video data. Thus, a reference to a generic "video encoder" or "video decoder" may include the video encoder 20, the video decoder 30, or another video encoding or decoding unit.
  • FIG. 4 schematically illustrates a flowchart of a method for acquiring a residual according to an embodiment of the present application.
  • the method may be performed by the video decoder 30.
  • the video decoding method is described as a series of steps or operations. It should be understood that the method may be performed in various orders and/or occur simultaneously, and is not limited to the execution order shown in FIG. 4. Assume that a video decoder is used for a video data stream having multiple video frames; the following steps are performed to decode the current image block to be processed of the current video frame.
  • the acquisition of the adjustment factor depends on the pixel information in the preset spatial neighborhood of the current block to be processed; since the pixel information in the preset spatial neighborhood of the current block to be processed is the same for the encoding end and the decoding end, the adjustment factors are the same, and the adjustment of the residual data corresponds.
  • encoding is an inverse process corresponding to decoding. Therefore, the technical solution embodied in the embodiment of the present application can also be executed by the video encoder 20 at the encoding end, and will not be described repeatedly.
  • This step belongs to entropy decoding technology. Specifically, according to a preset parsing rule, a syntax element represented in bit form (binary value) in a code stream is parsed into an actual value corresponding to the syntax element.
  • the parsing involves transform coefficients. Specifically, the binary representation of the transform coefficients in the code stream is parsed into specific values of the transform coefficients through the parsing rules of the transform coefficients. It should be understood that the multiple transform coefficients of the block to be processed are parsed sequentially. Generally, the number of obtained transform coefficients of the block to be processed is the same as the number of pixels of the block to be processed, and the parsed transform coefficients of the block to be processed are arranged into a transform coefficient block according to a preset positional relationship.
  • the preset positional relationship includes a fixed mapping position of a preset transform coefficient, and also a mapping position of a transform coefficient determined according to a preset rule, such as a mapping position of a transform coefficient determined according to an intra prediction mode (also referred to as Scan mode of transform coefficients in intra prediction mode).
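  • As an illustration of arranging parsed coefficients by a preset positional relationship, the following minimal Python sketch assumes a simple raster (row-major) scan; the helper name and the scan choice are assumptions for illustration only (real codecs select, e.g., diagonal scans depending on the intra prediction mode).

```python
def arrange_coefficients(coeffs, width, height):
    """Place parsed transform coefficients into a 2-D coefficient block.

    Assumes a raster (row-major) scan as the preset positional relationship.
    len(coeffs) must equal width * height, matching the statement that the
    number of coefficients equals the number of pixels of the block.
    """
    assert len(coeffs) == width * height
    return [coeffs[r * width:(r + 1) * width] for r in range(height)]
```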
  • Typical entropy decoding techniques can be found in the H.265 standard (Rec. ITU-T H.265 v4), pages 201 to 243, section 9.3. JEM has also improved the CABAC technology; for details, refer to JVET-G1001-v1, pages 41 to 43, section 2.6. The embodiment of the present application does not limit which entropy decoding technology is used.
  • converting the transform coefficients into the first residual of the block to be processed includes: converting the transform coefficient block into a first residual block of the block to be processed; correspondingly, adjusting the first residual based on the adjustment factor to obtain the second residual of the block to be processed includes: adjusting the first residual block based on the adjustment factor to obtain a second residual block of the block to be processed.
  • this step can be divided into two sub-steps:
  • inverse quantization is performed on the quantized transform coefficient A(i) to obtain a reconstructed transform coefficient R(i), which can be described as:
  • R(i) = sign{A(i)} × round{A(i) × Qs(i) + o2(i)}
  • where the quantization step size Qs(i) can be a floating-point number, and o2(i) is a rounding offset.
  • integer addition and shifting are used to approximate and replace the floating-point multiplication.
  • H.265/HEVC approximates the inverse quantization process expressed by the above formula as:
  • R(i) = sign{A(i)} × (A(i) × Qs'(i) + (1 << (bdshift - 1))) >> bdshift
  • where bdshift is a shift parameter, Qs'(i) is an integer, and Qs'(i)/2^bdshift approximates the quantization step size Qs(i) in the above formula; this corresponds to o2(i) = 0.5, with round{} implemented as rounding down.
  • Qs'(i) is determined by the level scale l(i) and the scaling factor m(i).
  • when the product of the length and width of a transform block is equal to an odd power of 2, R(i) can also be obtained by the following formula:
  • R(i) = sign{A(i)} × (A(i) × Qs'(i) × a + (1 << (bdshift - 1 + s))) >> (bdshift + s)
  • This step is generally called inverse quantization or scaling.
  • scalar quantization is used to perform inverse quantization.
  • For details of scalar quantization, refer to JCTVC-M1002-v1 (available from http://phenix.int-evry.fr/jct/), page 20, section 3.5.5, or the H.265 standard, pages 173 to 177, section 8.6; the details are not repeated here.
  • vector quantization can also be used for inverse quantization.
  • the embodiment of the present application does not limit which inverse quantization technology is used.
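  • The following Python sketch illustrates the fixed-point scalar inverse quantization formula above; the example values of Qs'(i) and bdshift are assumptions for illustration only.

```python
def inverse_quantize(A, Qs_prime, bdshift):
    """Reconstruct a transform coefficient R(i) from a quantized level A(i):
    R(i) = sign{A(i)} * ((|A(i)| * Qs'(i) + (1 << (bdshift - 1))) >> bdshift),
    i.e., an integer multiply-add-shift approximating A(i) * Qs(i) with
    Qs(i) ~ Qs'(i) / 2**bdshift."""
    sign = -1 if A < 0 else 1
    return sign * ((abs(A) * Qs_prime + (1 << (bdshift - 1))) >> bdshift)

# Example: level 7 with Qs'(i) = 40 and bdshift = 6 (step ~ 40/64 = 0.625)
print(inverse_quantize(7, 40, 6))  # -> 4
```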
  • This step is generally called an inverse transform.
  • Typical inverse transform techniques include the Inverse Discrete Cosine Transform (IDCT) or Inverse Discrete Sine Transform (IDST) in H.265, more specifically a DCT-II-type or DST-VII-type inverse transform; it can also be a DCT-VIII-type or DST-I-type inverse transform. Alternatively, an inverse transform is determined by the transform mode information of the transform block, and inverse transform processing is performed using the determined inverse transform, such as the Adaptive Multiple Core Transform (AMT) in JEM.
  • the inverse transform processing may also include performing a non-separable secondary transform on part of the inverse-quantized transform coefficients to obtain a new set of transform coefficients, such as the NSST (Non-Separable Secondary Transform) processing in JEM, and then inverse transforming this new set of transform coefficients using an inverse transform based on the discrete cosine transform or the discrete sine transform.
  • For details, refer to the introduction of transform technology in JCTVC-M1002-v1, pages 18 to 20, section 3.5. JEM has also improved the transform and inverse transform technologies; for details, refer to JVET-G1001-v1, pages 28 to 35, section 2.4, which will not be repeated here. The embodiment of the present application does not limit which inverse transform technology is used.
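  • As a sketch of an inverse transform step, the following applies a 2-D orthonormal IDCT (DCT-II inverse) to a dequantized coefficient block using SciPy; codec integer transforms use different scaling and fixed-point arithmetic, so this is only a floating-point illustration.

```python
import numpy as np
from scipy.fft import idct

def inverse_transform_2d(coeff_block):
    """2-D inverse DCT-II: apply the 1-D IDCT along columns, then rows,
    mapping a dequantized coefficient block back to a pixel-domain
    (first) residual block."""
    c = np.asarray(coeff_block, dtype=float)
    return idct(idct(c, axis=0, norm='ortho'), axis=1, norm='ortho')
```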
  • this step only needs the pixel information in the preset spatial neighborhood of the block to be processed, and does not need to wait for the completion of steps S401 and S402; similarly, steps S401 and S402 need not wait for the completion of step S403. That is, there is no required sequential relationship among them.
  • this step can be divided into two sub-steps:
  • the pixels in the spatial neighborhood of the current to-be-processed (to-be-decoded) image block refer to pixels on the same frame as the current to-be-processed image block.
  • the spatial neighborhood pixels of the current image block to be processed may include: reconstruction values of at least one pixel in the spatial neighborhood Y of the image block X (also referred to as the image region X).
  • the spatial neighborhood pixels may include M pixels, where M is a positive integer, and several alternative examples of the spatial neighborhood Y include:
  • In one example, the image block X (indicated by the solid line) corresponds to a w × h coding unit (that is, the width of the coding unit is w pixels and the height is h pixels; at the decoding end, this unit can also be called a decoding unit, a decoding block, etc.).
  • the spatial neighborhood Y (indicated by a dotted line) is constituted in one of the following four ways:
  • In another example, the image block X corresponds to a w × h region in a wc × hc coding unit C (indicated by a dotted line), and the spatial neighborhood Y is constituted, for example, in one of the following two ways:
  • Manner 1: the wc × n pixels above the coding unit C to which X belongs and the m × hc pixels to the left of C, as shown in FIG. 5(e).
  • the spatial neighborhood pixels may be all pixels in the spatial neighborhood Y, or may be a part of pixels sampled from the spatial neighborhood Y, which is not limited in the present invention.
  • before the obtaining one or more pixel sets in the preset spatial neighborhood, the method further includes: determining that all pixels in each of the one or more pixel sets have been reconstructed.
  • the pixels in the spatial neighborhood have been reconstructed, and the brightness values of the reconstructed pixels in the spatial neighborhood are obtained.
  • For the spatial neighborhood structure shown in FIG. 5(b), it is checked whether the pixels on the left and upper sides of the image region X have been reconstructed, to obtain the brightness values of the pixels that have been reconstructed in these regions.
  • For the spatial neighborhood structure shown in FIG. 5(c), it is checked whether the pixels on the left, upper, and right sides of the image region X have been reconstructed, to obtain the brightness values of the pixels that have been reconstructed in these regions.
  • the entire Y region can be understood as one preset spatial neighborhood, with the pixels on the left, upper, and right sides of X each constituting a pixel set; alternatively, the left side, upper side, and right side of X can each be treated as a preset spatial neighborhood. It should be understood that a pixel set may include only one pixel, or all pixels in a preset spatial neighborhood.
  • the adjustment factor may be set to a preset constant, and S4032 and S404 need not be performed.
  • the threshold is, for example, 16, or 1/4 of the number of pixels included in the spatial neighborhood Y.
  • the spatial neighborhood pixel information of the current block to be processed (ie, the transform block) is used to simulate the original pixel information corresponding to the current block to be processed.
  • the statistical characteristics of spatial neighborhood pixel information refer to the numerical results obtained by statistically analyzing the pixel values of multiple pixels in the spatial neighborhood pixel information.
  • the statistical characteristics of the spatial neighborhood pixel information may include at least the pixel mean P_avg and/or the pixel dispersion P_con.
  • the statistical characteristics of the pixel information in the spatial neighborhood reflect the characteristics of the background area where the current image block is located to some extent (such as background brightness and background contrast).
  • For example, the average of the brightness values (that is, the luminance components) of K1 pixels in the spatial neighborhood pixel information is P_avg, simply referred to as the pixel mean, that is: P_avg = (1/K1) × Σ_{k=1..K1} P(k), where P(k) is the brightness value (that is, the luminance component) of a pixel in the spatial neighborhood.
  • For example, the mean absolute difference (Mean Absolute Difference, MAD) between the brightness values of K2 pixels in the spatial neighborhood pixel information and the pixel mean P_avg can be used as a representation of the dispersion P_con, that is: P_con = (1/K2) × Σ_{k=1..K2} |P(k) - P_avg|.
  • K1 and K2 are both positive integers less than or equal to M.
  • the dispersion can also be expressed by other means such as the mean square error sum, the variance or standard deviation, and the correlation between pixels, without limitation.
  • the pixel information in the preset spatial neighborhood can also be represented by other physical quantities related to the pixel values in the spatial neighborhood besides the mean and the dispersion, without limitation.
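  • A minimal sketch of computing the two statistics named above from reconstructed neighborhood luminance values (the mean P_avg and the dispersion P_con as the mean absolute difference); the function name is an assumption for illustration.

```python
def neighborhood_statistics(pixels):
    """Return (P_avg, P_con) over reconstructed spatial-neighborhood
    luminance values: P_avg is the pixel mean, P_con is the mean absolute
    difference (MAD) of the pixels from that mean."""
    k = len(pixels)
    p_avg = sum(pixels) / k
    p_con = sum(abs(p - p_avg) for p in pixels) / k
    return p_avg, p_con

# A flat, bright neighborhood: high mean, low dispersion.
print(neighborhood_statistics([200, 202, 198, 201]))  # -> (200.25, 1.25)
```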
  • the specific representation of the pixel information calculated in step S403 is consistent with that used in step S404. For example, when only the pixel mean is used to determine the adjustment factor of the block to be processed, only the pixel mean needs to be calculated in step S403, and the pixel dispersion does not need to be calculated.
  • when the pixel information is the mean, determining the adjustment factor of the block to be processed according to the pixel information in the preset spatial neighborhood of the block to be processed includes: determining the adjustment factor according to the mean and a first mapping relationship between the mean and the adjustment factor, where the first mapping relationship satisfies one or more of the following conditions: when the mean is less than a first threshold, the adjustment factor decreases as the mean increases; when the mean is greater than a second threshold, the adjustment factor increases as the mean increases, where the first threshold is less than or equal to the second threshold; when the mean is greater than or equal to the first threshold and less than or equal to the second threshold, the adjustment factor is a first preset constant.
  • For example, the adjustment factor is calculated from the pixel mean P_avg according to a first piecewise function f1(P_avg), where the pixel mean P_avg is obtained from step S403.
  • f1(P_avg) is a U-shaped function of P_avg: when P_avg is less than a threshold T1, the first derivative of f1(P_avg) is less than 0; when P_avg is greater than a threshold T2, the first derivative of f1(P_avg) is greater than 0.
  • Otherwise, f1(P_avg) is equal to a constant C0, where T1 ≥ 0, T2 ≥ 0, and T2 ≥ T1; T1 is, for example, 0, 60, 64, or 128; T2 is, for example, 0, 80, 128, or 170; C0 is a positive real number, for example, 0.5, 0.75, 1, 1.5, 8, 16, 256, or 1024. More specifically, the f1(P_avg) function is, for example, a piecewise function constructed from these thresholds and constants.
  • the first mapping relationship may be, as described above, the first piecewise function f1(P_avg) with the mean as the independent variable and the adjustment factor as the dependent variable, or may be a preset correspondence between the mean and the adjustment factor; specifically, the preset correspondence between the mean and the adjustment factor can be hard-coded at both the encoding and decoding ends.
  • when the mean is obtained, a table lookup can be used to determine the corresponding adjustment factor.
  • the table-lookup method reduces the computational complexity and is more amenable to hardware implementation, while obtaining the adjustment factor through calculation gives more accurate results and does not need to store the above correspondence table.
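  • The sketch below illustrates one possible U-shaped first mapping f1(P_avg) and its table-lookup variant; T1 = 60, T2 = 80, and C0 = 1 are example values taken from the lists above, while the slopes (1/64) are purely illustrative assumptions.

```python
T1, T2, C0 = 60, 80, 1.0  # example threshold/constant values from the text

def f1(p_avg):
    """U-shaped first mapping from the pixel mean to the adjustment factor:
    decreasing below T1, constant C0 on [T1, T2], increasing above T2."""
    if p_avg < T1:
        return C0 + (T1 - p_avg) / 64.0  # first derivative < 0
    if p_avg > T2:
        return C0 + (p_avg - T2) / 64.0  # first derivative > 0
    return C0

# Table-lookup variant: precompute f1 for all 8-bit mean values, then index.
F1_TABLE = [f1(v) for v in range(256)]
```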
  • when the pixel information is the dispersion, determining the adjustment factor of the block to be processed according to the pixel information in the preset spatial neighborhood of the block to be processed includes: determining the adjustment factor according to the dispersion and a second mapping relationship between the dispersion and the adjustment factor, where the second mapping relationship satisfies one or more of the following conditions: when the dispersion is greater than a third threshold, the adjustment factor increases as the dispersion increases; when the dispersion is less than or equal to the third threshold, the adjustment factor is a second preset constant.
  • For example, the adjustment factor is calculated from the dispersion P_con according to a second piecewise function f2(P_con), where the dispersion P_con is obtained from step S403.
  • f2(P_con) is a monotonic function of P_con: when P_con is less than the threshold T3, f2(P_con) is a constant C3; when P_con is greater than or equal to the threshold T3, the first derivative of f2(P_con) is greater than 0.
  • the second mapping relationship may be, as described above, the second piecewise function f2(P_con) with the dispersion as the independent variable and the adjustment factor as the dependent variable, or may be a preset correspondence between the dispersion and the adjustment factor; specifically, the preset correspondence between the dispersion and the adjustment factor can be hard-coded at both the encoding and decoding ends. When the dispersion is obtained, a table lookup can be used to determine the corresponding adjustment factor.
  • when the pixel information is the mean and the dispersion, the adjustment factor of the block to be processed is determined according to the pixel information in the preset spatial neighborhood of the block to be processed by: determining a first parameter according to the mean and the first mapping relationship; determining a second parameter according to the dispersion and the second mapping relationship; and using a product or a weighted sum of the first parameter and the second parameter as the adjustment factor.
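  • Continuing the sketch (and reusing f1 from above), the following shows a possible second mapping f2(P_con) and the combination of the two parameters as a product or weighted sum; T3, C3, the slope, and the weights are illustrative assumptions.

```python
T3, C3 = 4.0, 1.0  # assumed example threshold/constant for the second mapping

def f2(p_con):
    """Second mapping from the dispersion to the adjustment factor:
    constant C3 up to T3, then increasing (first derivative > 0)."""
    return C3 if p_con <= T3 else C3 + (p_con - T3) / 32.0

def adjustment_factor(p_avg, p_con, w1=0.5, w2=0.5, use_product=True):
    """Combine the first parameter f1(P_avg) and the second parameter
    f2(P_con) as a product or as a weighted sum."""
    a, b = f1(p_avg), f2(p_con)
    return a * b if use_product else w1 * a + w2 * b
```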
  • T1, T2, T3, C0, C3, C4 and the other related coefficients may be preset constants, may be calculated adaptively according to the statistical characteristics of the video image, or may be extracted from the video bitstream.
  • the method further includes: performing weighted adjustment on the adjustment factor to obtain an adjusted adjustment factor; correspondingly, determining the adjustment factor of the block to be processed includes: using the adjusted adjustment factor as the adjustment factor of the block to be processed.
  • the weighting coefficient s may be obtained by parsing a Sequence Parameter Set (SPS), or may be obtained by parsing a slice header.
  • the method further includes: updating the adjustment factor according to a quantization parameter of the block to be processed; correspondingly, adjusting the first residual based on the adjustment factor to obtain the second residual of the block to be processed includes: adjusting the first residual based on the updated adjustment factor to obtain the second residual of the block to be processed.
  • the adjustment factor is updated as a function of the quantization parameter, where QC represents the adjustment factor, QP represents the quantization parameter, and N, M, and X are preset constants.
  • a video image includes a luminance component (Y) and a chrominance component (Cb, Cr, or U, V).
  • the block to be processed includes the luminance component and the chrominance component of the block to be processed; the first residual block includes a first luminance residual block and a first chrominance residual block, and the second residual block includes a second luminance residual block and a second chrominance residual block.
  • the chrominance residual block can be further divided into a residual block of the Cb component and a residual block of the Cr component, or a residual block of the U component and a residual block of the V component.
  • In one feasible implementation, this step includes: adjusting only the first luminance residual based on the adjustment factor to obtain the second luminance residual of the block to be processed; for the chrominance component, the second residual is the first residual.
  • In another feasible implementation, this step includes: adjusting only the first chrominance residual based on the adjustment factor to obtain the second chrominance residual of the block to be processed; for the luminance component, the second residual is the first residual.
  • In yet another feasible implementation, this step includes: adjusting the first luminance residual based on the adjustment factor to obtain the second luminance residual of the block to be processed, and adjusting the first chrominance residual based on the adjustment factor to obtain the second chrominance residual of the block to be processed.
  • the adjustment factor for adjusting the luminance residual and the adjustment factor for adjusting the chrominance residual may be the same or different.
  • the adjustment factor for adjusting the chrominance residual can be obtained by calculating the luminance pixel information of the preset spatial neighborhood of the block to be processed, or can be obtained by similar methods from the chrominance pixel information of the preset spatial neighborhood of the block to be processed, or can be obtained by comprehensively considering the luminance and chrominance pixel information of the preset spatial neighborhood of the block to be processed; this is not limited.
  • the first residual block includes a first luminance residual block of the luminance component of the block to be processed, and the luminance residual pixels in the first luminance residual block correspond one-to-one with the pixels of the luminance component of the block to be processed; correspondingly, the second residual block includes a second luminance residual block of the luminance component of the block to be processed, and adjusting the first residual block based on the adjustment factor to obtain the second residual block of the block to be processed includes: adjusting the luminance residual pixels in the first luminance residual block based on the adjustment factor to obtain the luminance residual pixels in the second luminance residual block of the block to be processed.
  • the luminance residual pixels in the second luminance residual block are obtained in the following manner:
  • Res2_Y(i) = (Res1_Y(i) × QC + offset_Y) >> shift_Y
  • QC represents the adjustment factor
  • Res1_Y(i) represents the i-th luminance residual pixel in the first luminance residual block, and Res2_Y(i) represents the i-th luminance residual pixel in the second luminance residual block; offset_Y and shift_Y are preset constants, and i is a natural number.
  • the first residual block includes a first chrominance residual block of the chrominance component of the block to be processed, and the chrominance residual pixels in the first chrominance residual block correspond one-to-one with the pixels of the chrominance component of the block to be processed; correspondingly, the second residual block includes a second chrominance residual block of the chrominance component of the block to be processed, and adjusting the first residual block based on the adjustment factor to obtain the second residual block of the block to be processed includes: adjusting the chrominance residual pixels in the first chrominance residual block based on the adjustment factor to obtain the chrominance residual pixels in the second chrominance residual block of the block to be processed.
  • the chroma residual pixels in the second chroma residual block are obtained in the following manner:
  • Res2_C(i) = (Res1_C(i) × QC + offset_C) >> shift_C
  • QC represents the adjustment factor
  • Res1_C(i) represents the i-th chrominance residual pixel in the first chrominance residual block, and Res2_C(i) represents the i-th chrominance residual pixel in the second chrominance residual block; offset_C and shift_C are preset constants, and i is a natural number.
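  • A minimal sketch of the residual adjustment formulas above, applicable to both the luma pair (offset_Y, shift_Y) and the chroma pair (offset_C, shift_C); QC is assumed to be a fixed-point integer factor, and the example values are illustrative.

```python
def scale_residual(res1, qc, offset, shift):
    """Apply Res2(i) = (Res1(i) * QC + offset) >> shift to each residual
    pixel. In Python, >> on negative integers rounds toward minus
    infinity, matching an arithmetic right shift."""
    return [(r * qc + offset) >> shift for r in res1]

# Example: QC = 80 with shift = 6 scales by 80/64 = 1.25.
print(scale_residual([16, -8, 4, 0], 80, 1 << 5, 6))  # -> [20, -10, 5, 0]
```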
  • the first residual is subjected to scaling processing to obtain a second residual.
  • the first residual is an intermediate processing result, whose numerical form can adopt a higher-precision bit width (also known as bit depth); the value of the first residual is stored with this higher-precision bit width.
  • For example, if the bit width of the pixels of the block to be processed is D, the first residual can be saved with D + E bits, where D can be 8, 10, or 12, and E can be 1, 2, 3, or 4.
  • the bit width of the second residual is processed to be the same as the bit width of the pixels of the block to be processed; in this step, the bit-width precision is reduced when obtaining the second residual.
  • For example, the right shift by shift_Y or shift_C described above then includes a right shift by E bits.
  • the bit width accuracy of the residual pixels in the first residual block is higher than the bit width accuracy of the residual pixels in the second residual block.
  • For the luminance and chrominance components, this can be described as: the bit-width precision of the luminance residual pixels in the first luminance residual block is higher than the bit-width precision of the luminance residual pixels in the second luminance residual block, and the bit-width precision of the chrominance residual pixels in the first chrominance residual block is higher than the bit-width precision of the chrominance residual pixels in the second chrominance residual block.
  • In another feasible implementation, the first chrominance residual is not stored with a higher-precision bit width, and there is no bit-width reduction step when obtaining the second chrominance residual.
  • the predicted pixels are generally generated by an intra prediction technique or an inter prediction technique.
  • For intra prediction and inter prediction techniques, refer to the H.265 standard (Rec. ITU-T H.265 v4), pages 125 to 172, section 8.4 (intra prediction) and section 8.5 (inter prediction).
  • JEM has also made many improvements to intra prediction and inter prediction techniques; for details, see JVET-G1001-v1, pages 6 to 28, section 2.2 (intra prediction improvements) and section 2.3 (inter prediction improvements), which will not be repeated here.
  • the embodiment of the present application does not limit what kind of prediction technology is used.
  • the sum is also clipped to an interval, for example, limited to the allowed value range of the pixels of the block to be processed; correspondingly, the clipped sum is used as the reconstructed pixel at the corresponding position in the block to be processed.
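  • A sketch of the reconstruction-and-clipping step described above; bit_depth = 8 is an illustrative default.

```python
def reconstruct(pred, res2, bit_depth=8):
    """Add second-residual pixels to co-located predicted pixels and clip
    each sum to the allowed pixel range [0, 2**bit_depth - 1]."""
    hi = (1 << bit_depth) - 1
    return [min(max(p + r, 0), hi) for p, r in zip(pred, res2)]

print(reconstruct([250, 3, 128], [10, -5, 4]))  # -> [255, 0, 132]
```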
  • optionally, a filtering process may be performed on the reconstructed pixels, such as the bilateral filtering proposed in JEM.
  • whether a block to be processed needs to be filtered may be determined by syntax elements obtained through decoding.
  • the encoding process corresponding to the present invention is, for example: for a block to be encoded, an adjustment factor is calculated according to the pixels in the spatial neighborhood, the reciprocal of the adjustment factor is used to scale the prediction residual of this coding block, the scaled prediction residual is transformed and quantized to obtain quantized transform coefficients, and the quantized transform coefficients are encoded into the code stream by the entropy coding unit.
  • Alternatively, an adjustment factor calculated according to the pixels in the spatial neighborhood is used to scale the quantization step of this coding block; the prediction residual is transformed, and the transform coefficients are quantized using the scaled quantization step size to obtain the quantized transform coefficients.
  • the quantized transform coefficients are encoded into the code stream by the entropy coding unit.
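  • As a sketch of the first encoder-side variant (scaling the prediction residual by the reciprocal of the adjustment factor before transform and quantization); the fixed-point representation and rounding here are simplifying assumptions.

```python
def encoder_scale_residual(pred_res, qc_fixed, shift=6):
    """Scale the prediction residual by the reciprocal of the adjustment
    factor. qc_fixed is QC as a fixed-point integer with 'shift'
    fractional bits, so dividing by it realizes multiplication by 1/QC."""
    return [round(r * (1 << shift) / qc_fixed) for r in pred_res]

# With QC = 80/64 = 1.25, residuals shrink before transform/quantization.
print(encoder_scale_residual([20, -10, 5, 0], 80))  # -> [16, -8, 4, 0]
```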
  • the spatial pixel information of the current block to be processed (ie, the block to be decoded and the transform block) is used to simulate the original pixel information corresponding to the current block to be processed.
  • an adjustment factor for the current block to be processed (that is, a transform block) is adaptively derived; it reflects the strength of the visual masking effect produced by the background region of the current block, and the residual block corresponding to the current block to be processed is adjusted based on the adaptively derived adjustment factor. During video encoding or decoding, this reduces the residual bits of processing blocks with a strong visual masking effect and increases the residual bits of processing blocks with a weak visual masking effect, making the coding of the actual residual better match human visual perception and thereby improving coding and decoding performance.
  • the pipeline design is a method of systematically dividing the combinational logic, inserting registers between each part (hierarchical), and temporarily storing intermediate data.
  • the purpose is to decompose a large operation into several small operations; each small operation takes less time, so the clock frequency can be increased, and the small operations can be executed in parallel, improving the data throughput rate (that is, the processing speed).
  • each small operation is called a pipeline stage.
  • steps S402 and S406 belong to different pipeline stages. For example, they can be called “inverse quantization and inverse transform pipeline stages” and “reconstruction pipeline stages”. There is a buffer between the two pipeline stages.
  • the "inverse quantization inverse transform pipeline stage” does not depend on the data produced by the "reconstruction pipeline stage”.
  • FIG. 7 schematically illustrates a block diagram of a residual acquisition device according to an embodiment of the present application, including:
  • An apparatus 700 for obtaining residuals in video decoding includes: a parsing module 701, configured to parse a bitstream to obtain transform coefficients of a block to be processed; and a conversion module 702, configured to convert the transform coefficients into a first residual of the block to be processed;
  • a calculation module 703 is configured to determine an adjustment factor of the block to be processed according to pixel information in a preset spatial neighborhood of the block to be processed; an adjustment module 704 is configured to adjust the first residual based on the adjustment factor To obtain a second residual of the block to be processed.
  • the calculation module 703 is further configured to calculate pixel information in a preset spatial neighborhood of the block to be processed based on pixel values in a preset spatial neighborhood of the block to be processed.
  • the calculation module 703 is specifically configured to: obtain one or more pixel sets in the preset spatial neighborhood; and calculate a mean and/or a dispersion of the pixels in the one or more pixel sets to obtain the pixel information within the preset spatial neighborhood.
  • the dispersion includes: a mean square error sum, an average absolute error sum, a variance or a standard deviation.
  • the calculation module 703 is further configured to determine that all pixels in each pixel set in the one or more pixel sets have completed reconstruction.
  • the pixel information is the average value
  • the calculation module 703 is specifically configured to determine the adjustment factor according to the mean and a first mapping relationship between the mean and the adjustment factor, where the first mapping relationship satisfies one or more of the following conditions: when the mean is less than a first threshold, the adjustment factor decreases as the mean increases; when the mean is greater than a second threshold, the adjustment factor increases as the mean increases, where the first threshold is less than or equal to the second threshold; when the mean is greater than or equal to the first threshold and less than or equal to the second threshold, the adjustment factor is a first preset constant.
  • the calculation module 703 is specifically configured to determine the adjustment factor according to the dispersion and a second mapping relationship between the dispersion and the adjustment factor, where the second mapping relationship satisfies one or more of the following conditions: when the dispersion is greater than a third threshold, the adjustment factor increases as the dispersion increases; when the dispersion is less than or equal to the third threshold, the adjustment factor is a second preset constant.
  • the pixel information is the mean and the dispersion
  • the calculation module 703 is specifically configured to: determine a first parameter according to the mean and the first mapping relationship; determine a second parameter according to the dispersion and the second mapping relationship; and use a product or a weighted sum of the first parameter and the second parameter as the adjustment factor.
  • the calculation module 703 is further configured to: perform weighted adjustment on the adjustment factor to obtain an adjusted adjustment factor; and use the adjusted adjustment factor as the adjustment factor of the block to be processed.
  • the calculation module 703 is further configured to: update the adjustment factor according to the quantization parameter of the block to be processed; correspondingly, the adjustment module 704 is specifically configured to: adjust the first residual based on the updated adjustment factor to obtain the second residual of the block to be processed.
  • the adjustment factor is updated as a function of the quantization parameter, where QC represents the adjustment factor, QP represents the quantization parameter, and N, M, and X are preset constants.
  • the number of obtained transform coefficients of the block to be processed is the same as the number of pixels of the block to be processed, and the conversion module 702 is further configured to: arrange the transform coefficients of the block to be processed into a transform coefficient block according to a preset positional relationship, and convert the transform coefficient block into a first residual block of the block to be processed;
  • the adjustment module 704 is specifically configured to: adjust the first residual block based on the adjustment factor to obtain a second residual block of the block to be processed.
  • the first residual block includes a first luminance residual block of the luminance component of the block to be processed, and the luminance residual pixels in the first luminance residual block correspond one-to-one with the pixels of the luminance component of the block to be processed; correspondingly, the second residual block includes a second luminance residual block of the luminance component of the block to be processed.
  • the adjustment module 704 is specifically configured to: adjust the luminance residual pixels in the first luminance residual block based on the adjustment factor to obtain the luminance residual pixels in the second luminance residual block of the block to be processed.
  • the luminance residual pixels in the second luminance residual block are obtained in the following manner:
  • Res2_Y(i) = (Res1_Y(i) × QC + offset_Y) >> shift_Y
  • QC represents the adjustment factor
  • Res1_Y (i) represents the i-th brightness residual pixel in the first luminance residual block
  • Res2_Y(i) represents the i-th luminance residual pixel in the second luminance residual block; offset_Y and shift_Y are preset constants, and i is a natural number.
  • the first residual block includes a first chrominance residual block of the chrominance component of the block to be processed, and the chrominance residual pixels in the first chrominance residual block correspond one-to-one with the pixels of the chrominance component of the block to be processed.
  • the second residual block includes a second chrominance residual block of the chrominance component of the block to be processed.
  • the adjustment module 704 is specifically configured to adjust the chrominance residual pixels in the first chrominance residual block based on the adjustment factor to obtain the chrominance residual pixels in the second chrominance residual block of the block to be processed.
  • the chroma residual pixels in the second chroma residual block are obtained in the following manner:
  • Res2_C(i) = (Res1_C(i) × QC + offset_C) >> shift_C
  • QC represents the adjustment factor
  • Res1_C (i) represents the i-th chroma residual pixel in the first chroma residual block
  • Res2_C(i) represents the i-th chrominance residual pixel in the second chrominance residual block; offset_C and shift_C are preset constants, and i is a natural number.
  • the bit width accuracy of the luminance residual pixels in the first luminance residual block is higher than the bit width accuracy of the luminance residual pixels in the second luminance residual block.
  • the bit-width precision of the chrominance residual pixels in the first chrominance residual block is higher than the bit-width precision of the chrominance residual pixels in the second chrominance residual block.
  • the conversion module 702 is specifically configured to: inverse quantize each transform coefficient in the transform coefficient block to obtain an inverse quantized transform coefficient block; and perform inverse quantization The inverse transform of the transform coefficient block is performed to obtain a first residual block of the block to be processed.
  • the apparatus 700 further includes: a reconstruction unit 705, configured to add a residual pixel in the second residual to a predicted pixel at a corresponding position in the block to be processed, to obtain a reconstructed pixel at the corresponding position in the block to be processed.
  • the spatial pixel information of the current block to be processed (ie, the block to be decoded and the transform block) is used to simulate the original pixel information corresponding to the current block to be processed.
  • an adjustment factor for the current block to be processed (that is, a transform block) is adaptively derived; it reflects the strength of the visual masking effect produced by the background region of the current block, and the residual block corresponding to the current block to be processed is adjusted based on the adaptively derived adjustment factor. During video encoding or decoding, this reduces the residual bits of processing blocks with a strong visual masking effect and increases the residual bits of processing blocks with a weak visual masking effect, making the coding of the actual residual better match human visual perception and thereby improving coding and decoding performance.
  • FIG. 8 is a schematic block diagram of a video decoding device according to an embodiment of the present application.
  • the device 800 may be applied to an encoding side or a decoding side.
  • the device 800 includes a processor 801 and a memory 802, and the processor 801 and the memory 802 are connected to each other (for example, through a bus 804).
  • Optionally, the device 800 may further include a transceiver 803; the transceiver 803 is connected to the processor 801 and the memory 802 and is configured to receive/transmit data.
  • the memory 802 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a compact disc read-only memory (CD-ROM).
  • the processor 801 may be one or more central processing units (CPUs). When the processor 801 is a CPU, the CPU may be a single-core CPU or a multi-core CPU.
  • the processor 801 is configured to read the program code stored in the memory 802 and execute operations of the implementation manner corresponding to FIG. 4 and various feasible implementation manners thereof.
  • an embodiment of the present application further provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to execute the operations of the implementation corresponding to FIG. 4 and its various feasible implementations.
  • the embodiment of the present application further provides a computer program product containing instructions, which when executed on a computer, causes the computer to execute the operations corresponding to the implementation manner shown in FIG. 4 and various feasible implementation manners thereof.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a network site, computer, server, or data center to another network site, computer, server, or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a DVD), or a semiconductor medium (such as a solid-state drive), and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This application discloses a method and device for obtaining a residual in video decoding, including: parsing a code stream to obtain transform coefficients of a block to be processed; converting the transform coefficients into a first residual of the block to be processed; determining an adjustment factor of the block to be processed according to pixel information in a preset spatial neighborhood of the block to be processed; and adjusting the first residual based on the adjustment factor to obtain a second residual of the block to be processed. This achieves, in the video decoding process, flexible adjustment of residual processing while stabilizing the slice bit rate, so that the residual better matches human visual perception, thereby improving coding and decoding performance.

Description

Video data decoding method and apparatus
This application claims priority to Chinese Patent Application No. 201810508090.3, filed with the China National Intellectual Property Administration on May 24, 2018 and entitled "Video data decoding method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of video coding and decoding technologies, and in particular, to a method and apparatus for obtaining a residual.
Background
Current video coding technologies include multiple video coding standards, such as H.264/AVC, H.265/HEVC, and the Audio Video coding Standard (AVS). These standards generally adopt a hybrid coding framework, which may include stages such as prediction, transform, quantization, and entropy coding. In the prediction stage, reconstructed pixels of already-coded regions are used to generate predicted pixels for the original pixels corresponding to the current coding block. The pixel-value difference between the original pixels and the predicted pixels is called the residual. To improve the coding efficiency of the residual, the residual is usually first transformed into transform coefficients, and the transform coefficients are then quantized. The quantized transform coefficients and the syntax elements (for example, indication information such as the coding block size, the prediction mode, and the motion vectors) are then converted into a code stream through entropy coding.
Video decoding is the process of converting a code stream into video images, and may include stages such as entropy decoding, prediction, dequantization, and inverse transform. First, the code stream is parsed through entropy decoding to obtain the syntax elements and the quantized transform coefficients. Then, on one hand, the predicted pixels are obtained based on the syntax elements and previously decoded reconstructed pixels; on the other hand, the quantized transform coefficients are dequantized to obtain dequantized transform coefficients, and the dequantized transform coefficients are inverse transformed to obtain the reconstructed residual. Finally, the reconstructed residual and the predicted pixels are added to obtain the reconstructed pixels, thereby recovering the video images.
For lossy coding, the reconstructed pixels may differ from the original pixels, and the numerical difference between the two is called distortion. Owing to various visual masking effects, such as the luminance masking effect and the contrast masking effect, the intensity of the distortion perceived by the human eye is closely related to the characteristics of the background in which the distortion is located.
Summary
Embodiments of this application use the spatial neighborhood pixel information of the current block to be processed (that is, the block to be decoded, the transform block) to simulate the original pixel information corresponding to the current block to be processed. According to the spatial neighborhood pixel information, an adjustment factor for the current block to be processed (that is, the transform block) is adaptively derived, and the residual block corresponding to the current block to be processed is adjusted based on the adaptively derived adjustment factor. During video encoding or decoding, this reduces the residual bits of processing blocks with a strong visual masking effect and increases the residual bits of processing blocks with a weak visual masking effect, so that the coding of the actual residual better matches human visual perception, thereby improving coding and decoding performance.
A first aspect of the embodiments of this application provides a method for obtaining a residual in video decoding, including: parsing a code stream to obtain transform coefficients of a block to be processed; converting the transform coefficients into a first residual of the block to be processed; determining an adjustment factor of the block to be processed according to pixel information in a preset spatial neighborhood of the block to be processed; and adjusting the first residual based on the adjustment factor to obtain a second residual of the block to be processed.
The spatial neighborhood pixel information of the current block to be processed is used to simulate the original pixel information corresponding to the current block to be processed, an adjustment factor for the current block to be processed is adaptively derived, and the residual block corresponding to the current block to be processed is adjusted based on the adaptively derived adjustment factor, so that the actual residual better matches human visual perception, thereby improving coding and decoding performance.
In a feasible implementation of the first aspect, before the determining an adjustment factor of the block to be processed according to pixel information in a preset spatial neighborhood of the block to be processed, the method further includes: calculating the pixel information in the preset spatial neighborhood of the block to be processed based on pixel values in the preset spatial neighborhood of the block to be processed.
In a feasible implementation of the first aspect, the calculating the pixel information in the preset spatial neighborhood of the block to be processed includes: obtaining one or more pixel sets in the preset spatial neighborhood; and calculating a mean and/or a dispersion of the pixels in the one or more pixel sets to obtain the pixel information in the preset spatial neighborhood.
Using the pixel information of the pixels surrounding the block to be processed in place of the pixel information of the block to be processed enables the decoding end to derive the pixel information adaptively, which saves the bits needed to transmit the pixel information and improves coding efficiency.
In a feasible implementation of the first aspect, the dispersion includes: a mean square error sum, a mean absolute error sum, a variance, or a standard deviation.
Different indicators can be selected as the representation of the dispersion based on different scenarios and implementation complexity requirements, achieving a balance between performance and complexity.
In a feasible implementation of the first aspect, before the obtaining one or more pixel sets in the preset spatial neighborhood, the method further includes: determining that all pixels in each of the one or more pixel sets have been reconstructed.
Selecting reconstructed pixels to calculate the pixel information ensures the accuracy of the pixel information used for calculating the adjustment factor.
In a feasible implementation of the first aspect, the pixel information is the mean, and the determining an adjustment factor of the block to be processed according to the pixel information in the preset spatial neighborhood of the block to be processed includes: determining the adjustment factor according to the mean and a first mapping relationship between the mean and the adjustment factor, where the first mapping relationship satisfies one or more of the following conditions: when the mean is less than a first threshold, the adjustment factor decreases as the mean increases; when the mean is greater than a second threshold, the adjustment factor increases as the mean increases, where the first threshold is less than or equal to the second threshold; and when the mean is greater than or equal to the first threshold and less than or equal to the second threshold, the adjustment factor is a first preset constant.
In a feasible implementation of the first aspect, the pixel information is the dispersion, and the determining an adjustment factor of the block to be processed according to the pixel information in the preset spatial neighborhood of the block to be processed includes:
determining the adjustment factor according to the dispersion and a second mapping relationship between the dispersion and the adjustment factor, where the second mapping relationship satisfies one or more of the following conditions:
when the dispersion is greater than a third threshold, the adjustment factor increases as the dispersion increases;
when the dispersion is less than or equal to the third threshold, the adjustment factor is a second preset constant.
In a feasible implementation of the first aspect, the pixel information is the mean and the dispersion, and the determining an adjustment factor of the block to be processed according to the pixel information in the preset spatial neighborhood of the block to be processed includes:
determining a first parameter according to the mean and the first mapping relationship;
determining a second parameter according to the dispersion and the second mapping relationship;
using a product or a weighted sum of the first parameter and the second parameter as the adjustment factor.
Different indicators can be selected to determine the adjustment factor based on different scenarios and implementation complexity requirements, achieving a balance between performance and complexity.
在第一方面的一种可行的实施方式中,在所述将所述第一参数和所述第二参数的乘积或加权和作为所述调节因子之后,还包括:将所述调节因子进行加权调整,以获得调整后的调节因子;对应的,所述确定所述待处理块的调节,包括:将所述调整后的调节因子作为所述待处理块的调节因子。
通过对调节因子的进一步加权计算,使调节因子更加优化,进一步提高编码效率。
在第一方面的一种可行的实施方式中,在所述确定所述待处理块的调节因子之后,还包括:根据所述待处理块的量化参数,对所述调节因子进行更新;对应的,所述基于所述调节因子调整所述第一残差,以得到所述待处理块的第二残差,包括:基于所述更新后的调节因子调整所述第一残差,以得到所述待处理块的第二残差。
在第一方面的一种可行的实施方式中,所述调节因子通过如下方式进行调节:
Figure PCTCN2019083848-appb-000001
其中,QC表示所述调节因子,QP表示所述量化参数,N,M,X为预设常数。
引入量化参数来对调节因子进行将进一步的优化,进一步提高编码效率。
在第一方面的一种可行的实施方式中,获取的所述待处理块的变换系数的个数与所述待处理块的像素点的个数相同,在所述获取待处理块的变换系数之后,还包括:将所述待处理块的变换系数按照预设位置关系排列为变换系数块;对应的,所述将所述变换系数转换为所述待处理块的第一残差包括:将所述变换系数块转换为所述待处理块的第一残差块;对应的,所述基于所述调节因子调整所述第一残差,以获得所述待处理块的第二残差,包括:基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块。
在第一方面的一种可行的实施方式中,所述第一残差块包括所述待处理块的亮度分量的第一亮度残差块,所述第一亮度残差块中的亮度残差像素与所述待处理块的亮度分量的像素一一对应,对应的,所述第二残差块包括所述待处理块的亮度分量的第二亮度残差块,所述基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块,包括:基于所述调节因子调整所述第一亮度残差块中的亮度残差像素,以获得所述待处理块的第二亮度残差块中的亮度残差像素。
在第一方面的一种可行的实施方式中,所述第二亮度残差块中的亮度残差像素通过如下方式获得:
Res2_Y(i)=(Res1_Y(i)×QC+offset_Y)>>shift_Y
其中,QC表示所述调节因子,Res1_Y(i)表示所述第一亮度残差块中的第i个亮度残差像素,Res2_Y(i)表示所述第二亮度残差块中的第i个亮度残差像素,offset_Y和shift_Y为预设常数,i为自然数。
在第一方面的一种可行的实施方式中,所述第一残差块包括所述待处理块的色度分量的第一色度残差块,所述第一色度残差块中的色度残差像素与所述待处理块的色度分量的像素一一对应,对应的,所述第二残差块包括所述待处理块的色度分量的第二色度残差块,所述基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块,包括:基于所述调节因子调整所述第一色度残差块中的色度残差像素,以获得所述待处理块的第二色度残差块中的色度残差像素。
在第一方面的一种可行的实施方式中,所述第二色度残差块中的色度残差像素通过如下方式获得:
Res2_C(i)=(Res1_C(i)×QC+offset_C)>>shift_C
其中,QC表示所述调节因子,Res1_C(i)表示所述第一色度残差块中的第i个色度残差像素,Res2_C(i)表示所述第二色度残差块中的第i个色度残差像素,offset_C和shift_C为预设常数,i为自然数。
将亮度残差和色度残差分开处理,进一步平衡了性能和复杂度的关系。
在第一方面的一种可行的实施方式中,所述第一亮度残差块中的亮度残差像素的比特位宽精度高于所述第二亮度残差块中的亮度残差像素的比特位宽精度。
在第一方面的一种可行的实施方式中,所述第一色度残差块中的色度残差像素的比特位宽精度高于所述第二色度残差块中的色度残差像素的比特位宽精度。
对于中间过程产生的第一残差采用高精度的比特位宽精度,可以提高运算的精度,提高编码效率。
在第一方面的一种可行的实施方式中,所述将所述变换系数块转换为所述待处理块的第一残差块,包括:对所述变换系数块中的每一个变换系数进行反量化,以获得反量化后的变换系数块;对所述反量化后的变换系数块进行反变换,以获得所述待处理块的第一残差块。
在第一方面的一种可行的实施方式中,在所述基于所述调节因子调整所述第一残差,以获得所述待处理块的第二残差之后,还包括:将所述第二残差中的残差像素和所述待处理块中对应位置的预测像素相加,以获得所述待处理块中所述对应位置的重构像素。
上述两个步骤分别是获取残差的前序步骤和后续步骤，使得对残差进行调节的有益效果可以和其他预测、变换、量化等技术进行叠加。
本申请实施例的第二方面公开了一种视频解码中残差的获取装置,包括:解析模块,用于解析码流,以获取待处理块的变换系数;转换模块,用于将所述变换系数转换为所述待处理块的第一残差;计算模块,用于根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子;调节模块,用于基于所述调节因子调整所述第一残差,以获得所述待处理块的第二残差。
在第二方面的一种可行的实施方式中,所述计算模块还用于:基于所述待处理块的预设空间邻域内的像素值,计算所述待处理块的预设空间邻域内的像素信息。
在第二方面的一种可行的实施方式中,所述计算模块具体用于:获取所述预设空间邻域内的一个或多个像素集合;计算所述一个或多个像素集合内像素的均值和/或离散度,以获得所述预设空间邻域内的像素信息。
在第二方面的一种可行的实施方式中,所述离散度包括:均方误差和,平均绝对误差和,方差或标准差。
在第二方面的一种可行的实施方式中,所述计算模块还用于:确定所述一个或多个像素集合中的每个像素集合中的全部像素已完成重构。
在第二方面的一种可行的实施方式中,所述像素信息为所述均值,所述计算模块具体用于:根据所述均值以及所述均值和所述调节因子的第一映射关系,确定所述调节因子,其中,所述第一映射关系满足如下一个或多个条件:当所述均值小于第一阈值时,所述调节因子随所述均值的增大而减小;当所述均值大于第二阈值时,所述调节因子随所述均值的增大而增大,其中,所述第一阈值小于或等于所述第二阈值;当所述均值大于或等于所述第一阈值,且小于或等于所述第二阈值时,所述调节因子为第一预设常数。
在第二方面的一种可行的实施方式中,所述计算模块具体用于:根据所述离散度以及所述离散度和所述调节因子的第二映射关系,确定所述调节因子,其中,所述第二映射关系满足如下一个或多个条件:当所述离散度大于第三阈值时,所述调节因子随所述离散度的增大而增大;当所述离散度小于或等于所述第三阈值时,所述调节因子为第二预设常数。
在第二方面的一种可行的实施方式中,所述像素信息为所述均值和所述离散度,所述计算模块具体用于:根据所述均值和所述第一映射关系确定第一参数;根据所述离散度和所述第二映射关系确定第二参数;将所述第一参数和所述第二参数的乘积或加权和作为所述调节因子。
在第二方面的一种可行的实施方式中,所述计算模块还用于:将所述调节因子进行加权调整,以获得调整后的调节因子;将所述调整后的调节因子作为所述待处理块的调节因子。
在第二方面的一种可行的实施方式中,所述计算模块还用于:根据所述待处理块的量化参数,对所述调节因子进行更新;对应的,所述调节模块具体用于:基于所述更新后的调节因子调整所述第一残差,以得到所述待处理块的第二残差。
在第二方面的一种可行的实施方式中,所述调节因子通过如下方式进行调节:
Figure PCTCN2019083848-appb-000002
其中,QC表示所述调节因子,QP表示所述量化参数,N,M,X为预设常数。
在第二方面的一种可行的实施方式中,获取的所述待处理块的变换系数的个数与所述待处理块的像素点的个数相同,所述转换模块还用于:将所述待处理块的变换系数按照预设位置关系排列为变换系数块;将所述变换系数块转换为所述待处理块的第一残差块;对应的,所述调节模块具体用于:基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块。
在第二方面的一种可行的实施方式中,所述第一残差块包括所述待处理块的亮度分量的第一亮度残差块,所述第一亮度残差块中的亮度残差像素与所述待处理块的亮度分量的像素一一对应,对应的,所述第二残差块包括所述待处理块的亮度分量的第二亮度残差块,所述调节模块具体用于:基于所述调节因子调整所述第一亮度残差块中的亮度残差像素,以获得所述待处理块的第二亮度残差块中的亮度残差像素。
在第二方面的一种可行的实施方式中,所述第二亮度残差块中的亮度残差像素通过如下方式获得:
Res2_Y(i)=(Res1_Y(i)×QC+offset_Y)>>shift_Y
其中,QC表示所述调节因子,Res1_Y(i)表示所述第一亮度残差块中的第i个亮度残差像素,Res2_Y(i)表示所述第二亮度残差块中的第i个亮度残差像素,offset_Y和shift_Y为预设常数,i为自然数。
在第二方面的一种可行的实施方式中,所述第一残差块包括所述待处理块的色度分量的第一色度残差块,所述第一色度残差块中的色度残差像素与所述待处理块的色度分量的像素一一对应,对应的,所述第二残差块包括所述待处理块的色度分量的第二色度残差块,所述调节模块具体用于:基于所述调节因子调整所述第一色度残差块中的色度残差像素,以获得所述待处理块的第二色度残差块中的色度残差像素。
在第二方面的一种可行的实施方式中,所述第二色度残差块中的色度残差像素通过如下方式获得:
Res2_C(i)=(Res1_C(i)×QC+offset_C)>>shift_C
其中,QC表示所述调节因子,Res1_C(i)表示所述第一色度残差块中的第i个色度残差像素,Res2_C(i)表示所述第二色度残差块中的第i个色度残差像素,offset_C和shift_C为预设常数,i为自然数。
在第二方面的一种可行的实施方式中,所述第一亮度残差块中的亮度残差像素的比特位宽精度高于所述第二亮度残差块中的亮度残差像素的比特位宽精度。
在第二方面的一种可行的实施方式中,所述第一色度残差块中的色度残差像素的比特位宽精度高于所述第二色度残差块中的色度残差像素的比特位宽精度。
在第二方面的一种可行的实施方式中,所述转换模块具体用于:对所述变换系数块中的每一个变换系数进行反量化,以获得反量化后的变换系数块;对所述反量化后的变换系数块进行反变换,以获得所述待处理块的第一残差块。
在第二方面的一种可行的实施方式中,所述装置还包括:重构单元,用于将所述第二残差中的残差像素和所述待处理块中对应位置的预测像素相加,以获得所述待处理块中所述对应位置的重构像素。
本申请的第三方面提供一种获取残差的设备，该设备可以应用于编码侧，也可以应用于解码侧。该设备包括处理器、存储器，所述处理器和存储器相连接(如通过总线相互连接)，在可能的实施方式中，该设备还可包括收发器，收发器连接处理器和存储器，用于接收/发送数据。存储器用于存储程序代码和视频数据。处理器可用于读取所述存储器中存储的程序代码，执行第一方面所描述的方法。
本申请的第四方面提供一种视频编解码***，该视频编解码***包括源装置及目的地装置。源装置与目的地装置可进行通信连接。源装置产生经编码视频数据。因此，源装置可被称作视频编码装置或视频编码设备。目的地装置可解码由源装置产生的经编码视频数据。因此，目的地装置可被称作视频解码装置或视频解码设备。源装置及目的地装置可为视频编解码装置或视频编解码设备的实例。第一方面所描述的方法会应用在该视频编解码装置或视频编解码设备。
本申请的第五方面提供一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行上述第一方面所述的方法。
本申请的第六方面提供一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述第一方面所述的方法。
应理解,本申请第二至六方面与本申请第一方面对应的实施例发明目的相同,技术特征相似,获得的有益技术效果也相同,不再赘述。
附图说明
为了更清楚地说明本申请实施例或背景技术中的技术方案,下面将对本申请实施例或背景技术中所需要使用的附图进行说明。
图1为示例性的可通过配置以用于本申请实施例的一种视频编码及解码的***框图;
图2为示例性的可通过配置以用于本申请实施例的一种视频编码器的***框图;
图3为示例性的可通过配置以用于本申请实施例的一种视频解码器的***框图;
图4为示例性的本申请实施例中一种用于视频数据解码的残差获取方法的流程示意图;
图5为示例性的本申请实施例中待处理块的空间邻域像素的示意图;
图6为示例性的本申请实施例中硬件流水线设计的***框图;
图7为示例性的本申请实施例中一种用于视频数据解码的残差获取装置的***框图;
图8为示例性的本申请实施例中一种用于视频数据解码的残差获取设备的***框图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述。
图1为本申请实施例中视频编码及解码***10的一种示意性框图。如图1中所展示,***10包含源装置12,源装置12产生将在稍后时间由目的地装置14解码的经编码视频数据。源装置12及目的地装置14可包括广泛范围的装置中的任一者,包含桌上型计算机、笔记型计算机、平板计算机、机顶盒、例如所谓的“智能”电话的电话手机、所谓的“智能”触控板、电视、摄影机、显示装置、数字媒体播放器、视频游戏控制台、视频流式传输装置或类似者。在一些应用中,源装置12及目的地装置14可经装备以用于无线通信。
目的地装置14可经由链路16接收待解码的经编码视频数据。链路16可包括能够将经编码视频数据从源装置12移动到目的地装置14的任何类型的媒体或装置。在一个可行的实施方式中，链路16可包括使源装置12能够实时将经编码视频数据直接传输到目的地装置14的通信媒体。可根据通信标准(例如，无线通信协议)调制经编码视频数据且将其传输到目的地装置14。通信媒体可包括任何无线或有线通信媒体，例如射频频谱或一个或多个物理传输线。通信媒体可形成基于包的网络(例如，局域网、广域网或因特网的全球网络)的部分。通信媒体可包含路由器、交换器、基站或可用于促进从源装置12到目的地装置14的通信的任何其它装备。
替代地，可将经编码数据从输出接口22输出到存储装置24。类似地，可由输入接口从存储装置24存取经编码数据。存储装置24可包含多种分散式或本地存取的数据存储媒体中的任一者，例如，硬盘驱动器、蓝光光盘、DVD、CD-ROM、快闪存储器、易失性或非易失性存储器或用于存储经编码视频数据的任何其它合适的数字存储媒体。在另一可行的实施方式中，存储装置24可对应于文件服务器或可保持由源装置12产生的经编码视频的另一中间存储装置。目的地装置14可经由流式传输或下载从存储装置24存取所存储视频数据。文件服务器可为能够存储经编码视频数据且将此经编码视频数据传输到目的地装置14的任何类型的服务器。在可行的实施方式中，文件服务器包含网站服务器、文件传送协议服务器、网络附接存储装置或本地磁盘机。目的地装置14可经由包含因特网连接的任何标准数据连接存取经编码视频数据。此数据连接可包含适合于存取存储于文件服务器上的经编码视频数据的无线信道(例如，Wi-Fi连接)、有线连接(例如，缆线调制解调器等)或两者的组合。经编码视频数据从存储装置24的传输可为流式传输、下载传输或两者的组合。
本申请的技术不必限于无线应用或设定。技术可应用于视频解码以支持多种多媒体应用中的任一者,例如,空中电视广播、有线电视传输、***传输、流式传输视频传输(例如,经由因特网)、编码数字视频以用于存储于数据存储媒体上、解码存储于数据存储媒体上的数字视频或其它应用。在一些可行的实施方式中,***10可经配置以支持单向或双向视频传输以支持例如视频流式传输、视频播放、视频广播和/或视频电话的应用。
在图1的可行的实施方式中,源装置12包含视频源18、视频编码器20及输出接口22。在一些应用中,输出接口22可包含调制器/解调制器(调制解调器)和/或传输器。在源装置12中,视频源18可包含例如以下各者的源:视频捕获装置(例如,摄像机)、含有先前捕获的视频的视频存档、用以从视频内容提供者接收视频的视频馈入接口,和/或用于产生计算机图形数据作为源视频的计算机图形***,或这些源的组合。作为一种可行的实施方式,如果视频源18为摄像机,那么源装置12及目的装置14可形成所谓的摄影机电话或视频电话。本申请中所描述的技术可示例性地适用于视频解码,且可适用于无线和/或有线应用。
可由视频编码器20来编码所捕获、预捕获或计算机产生的视频。经编码视频数据可经由源装置12的输出接口22直接传输到目的地装置14。经编码视频数据也可(或替代地)存储到存储装置24上以供稍后由目的地装置14或其它装置存取以用于解码和/或播放。
目的地装置14包含输入接口28、视频解码器30及显示装置32。在一些应用中，输入接口28可包含接收器和/或调制解调器。目的地装置14的输入接口28经由链路16接收经编码视频数据。经由链路16传达或提供于存储装置24上的经编码视频数据可包含由视频编码器20产生以供视频解码器30使用以解码视频数据的多种语法元素。这些语法元素可与在通信媒体上传输、存储于存储媒体上或存储于文件服务器上的经编码视频数据包含在一起。
显示装置32可与目的地装置14集成或在目的地装置14外部。在一些可行的实施方式中,目的地装置14可包含集成显示装置且也经配置以与外部显示装置接口连接。在其它可行的实施方式中,目的地装置14可为显示装置。一般来说,显示装置32向用户显示经解码视频数据,且可包括多种显示装置中的任一者,例如液晶显示器、等离子显示器、有机发光二极管显示器或另一类型的显示装置。
视频编码器20及视频解码器30可根据例如目前在开发中的下一代视频编解码压缩标准(H.266)操作且可遵照H.266测试模型(JEM)。替代地，视频编码器20及视频解码器30可根据例如ITU-T H.265标准，也称为高效率视频解码标准，或者，ITU-T H.264标准的其它专属或工业标准或这些标准的扩展而操作，ITU-T H.264标准替代地被称为MPEG-4第10部分，也称高级视频编码(advanced video coding,AVC)。然而，本申请的技术不限于任何特定解码标准。视频压缩标准的其它可行的实施方式包含MPEG-2和ITU-T H.263。
尽管未在图1中展示，但在一些方面中，视频编码器20及视频解码器30可各自与音频编码器及解码器集成，且可包含适当多路复用器-多路分用器(MUX-DEMUX)单元或其它硬件及软件以处置共同数据流或单独数据流中的音频及视频两者的编码。如果适用，那么在一些可行的实施方式中，MUX-DEMUX单元可遵照ITU H.223多路复用器协议或例如用户数据报协议(UDP)的其它协议。
视频编码器20及视频解码器30各自可实施为多种合适编码器电路中的任一者,例如,一个或多个微处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现场可编程门阵列(FPGA)、离散逻辑、软件、硬件、固件或其任何组合。在技术部分地以软件实施时,装置可将软件的指令存储于合适的非暂时性计算机可读媒体中且使用一个或多个处理器以硬件执行指令,以执行本申请的技术。视频编码器20及视频解码器30中的每一者可包含于一个或多个编码器或解码器中,其中的任一者可在相应装置中集成为组合式编码器/解码器(CODEC)的部分。
本申请示例性地可涉及视频编码器20将特定信息“用信号发送”到例如视频解码器30的另一装置。然而,应理解,视频编码器20可通过将特定语法元素与视频数据的各种经编码部分相关联来用信号发送信息。即,视频编码器20可通过将特定语法元素存储到视频数据的各种经编码部分的头信息来“用信号发送”数据。在一些应用中,这些语法元素可在通过视频解码器30接收及解码之前经编码及存储(例如,存储到存储***34或文件服务器36)。因此,术语“用信号发送”示例性地可指语法或用于解码经压缩视频数据的其它数据的传达,而不管此传达是实时或近实时地发生或在时间跨度内发生,例如可在编码时将语法元素存储到媒体时发生,语法元素接着可在存储到此媒体之后的任何时间通过解码装置检索。
JCT-VC开发了H.265(HEVC)标准。HEVC标准化基于称作HEVC测试模型(HM)的视频解码装置的演进模型。H.265的最新标准文档可从http://www.itu.int/rec/T-REC-H.265获得，最新版本的标准文档为H.265(12/16)，该标准文档以全文引用的方式并入本文中。HM假设视频解码装置相对于ITU-T H.264/AVC的现有算法具有若干额外能力。例如，H.264提供9种帧内预测编码模式，而HM可提供多达35种帧内预测编码模式。
JVET致力于开发H.266标准。H.266标准化的过程基于称作H.266测试模型的视频解码装置的演进模型进行。H.266的算法描述可从http://phenix.int-evry.fr/jvet获得,其中最新的算法描述包含于JVET-G1001-v1中,该算法描述文档以全文引用的方式并入本文中。同时,可从https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/获得JEM测试模型的参考软件,同样以全文引用的方式并入本文中。
一般来说,HM的工作模型描述可将视频帧或图像划分成包含亮度及色度样本两者的树块或最大编码单元(largest coding unit,LCU)的序列,LCU也被称为CTU。树块具有与H.264标准的宏块类似的目的。条带包含按解码次序的数个连续树块。可将视频帧或图像分割成一个或多个条带。可根据四叉树将每一树块***成编码单元。例如,可将作为四叉树的根节点的树块***成四个子节点,且每一子节点可又为母节点且被***成另外四个子节点。作为四叉树的叶节点的最终不可***的子节点包括解码节点,例如,经解码视频块。与经解码码流相关联的语法数据可定义树块可***的最大次数,且也可定义解码节点的最小大小。
编码单元包含解码节点及预测单元(prediction unit,PU)以及与解码节点相关联的变换单元(transform unit,TU)。CU的大小对应于解码节点的大小且形状必须为正方形。CU的大小的范围可为8×8像素直到最大64×64像素或更大的树块的大小。每一CU可含有一个或多个PU及一个或多个TU。例如,与CU相关联的语法数据可描述将CU分割成一个或多个PU的情形。分割模式在CU是被跳过或经直接模式编码、帧内预测模式编码或帧间预测模式编码的情形之间可为不同的。PU可经分割成形状为非正方形。例如,与CU相关联的语法数据也可描述根据四叉树将CU分割成一个或多个TU的情形。TU的形状可为正方形或非正方形。
HEVC标准允许根据TU进行变换，TU对于不同CU来说可为不同的。TU通常基于针对经分割LCU定义的给定CU内的PU的大小而设定大小，但情况可能并非总是如此。TU的大小通常与PU相同或小于PU。在一些可行的实施方式中，可使用称作“残差四叉树”(residual quadtree,RQT)的四叉树结构将对应于CU的残差样本再分成较小单元。RQT的叶节点可被称作TU。可变换与TU相关联的像素差值以产生变换系数，变换系数可被量化。
一般来说,PU包含与预测过程有关的数据。例如,在PU经帧内模式编码时,PU可包含描述PU的帧内预测模式的数据。作为另一可行的实施方式,在PU经帧间模式编码时,PU可包含界定PU的运动矢量的数据。例如,界定PU的运动矢量的数据可描述运动矢量的水平分量、运动矢量的垂直分量、运动矢量的分辨率(例如,四分之一像素精确度或八分之一像素精确度)、运动矢量所指向的参考图像,和/或运动矢量的参考图像列表(例如,列表0、列表1或列表C)。
一般来说，TU使用变换及量化过程。具有一个或多个PU的给定CU也可包含一个或多个TU。在预测之后，视频编码器20可计算对应于PU的残差值。残差值包括像素差值，像素差值可变换成变换系数、经量化且使用TU扫描以产生串行化变换系数以用于熵解码。本申请通常使用术语“视频块”来指CU的解码节点。在一些特定应用中，本申请也可使用术语“视频块”来指包含解码节点以及PU及TU的树块，例如，LCU或CU。
视频序列通常包含一系列视频帧或图像。图像群组(group of picture,GOP)示例性地包括一系列、一个或多个视频图像。GOP可在GOP的头信息中、图像中的一者或多者的头信息中或在别处包含语法数据,语法数据描述包含于GOP中的图像的数目。图像的每一条带可包含描述相应图像的编码模式的条带语法数据。视频编码器20通常对个别视频条带内的视频块进行操作以便编码视频数据。视频块可对应于CU内的解码节点。视频块可具有固定或变化的大小,且可根据指定解码标准而在大小上不同。
作为一种可行的实施方式，HM支持各种PU大小的预测。假定特定CU的大小为2N×2N，HM支持2N×2N或N×N的PU大小的帧内预测，及2N×2N、2N×N、N×2N或N×N的对称PU大小的帧间预测。HM也支持2N×nU、2N×nD、nL×2N及nR×2N的PU大小的帧间预测的不对称分割。在不对称分割中，CU的一方向未分割，而另一方向分割成25%及75%。对应于25%区段的CU的部分由“n”后跟着“上(Up)”、“下(Down)”、“左(Left)”或“右(Right)”的指示来指示。因此，例如，“2N×nU”指水平分割的2N×2N CU，其中2N×0.5N PU在上部且2N×1.5N PU在底部。
在本申请中，“N×N”与“N乘N”可互换使用以指依照垂直维度及水平维度的视频块的像素尺寸，例如，16×16像素或16乘16像素。一般来说，16×16块将在垂直方向上具有16个像素(y=16)，且在水平方向上具有16个像素(x=16)。同样地，N×N块一般在垂直方向上具有N个像素，且在水平方向上具有N个像素，其中N表示非负整数值。可将块中的像素排列成行及列。此外，块未必需要在水平方向上与在垂直方向上具有相同数目个像素。例如，块可包括N×M个像素，其中M未必等于N。
在使用CU的PU的帧内预测性或帧间预测性解码之后,视频编码器20可计算CU的TU的残差数据。PU可包括空间域(也称作像素域)中的像素数据,且TU可包括在将变换(例如,离散余弦变换(discrete cosine transform,DCT)、整数变换、小波变换或概念上类似的变换)应用于残差视频数据之后变换域中的系数。残差数据可对应于未经编码图像的像素与对应于PU的预测值之间的像素差。视频编码器20可形成包含CU的残差数据的TU,且接着变换TU以产生CU的变换系数。
在任何变换以产生变换系数之后,视频编码器20可执行变换系数的量化。量化示例性地指对系数进行量化以可能减少用以表示系数的数据的量从而提供进一步压缩的过程。量化过程可减少与系数中的一些或全部相关联的位深度。例如,可在量化期间将n位值降值舍位到m位值,其中n大于m。
JEM模型对视频图像的编码结构进行了进一步的改进，具体的，被称为“四叉树结合二叉树”(QTBT)的块编码结构被引入进来。QTBT结构摒弃了HEVC中的CU，PU，TU等概念，支持更灵活的CU划分形状，一个CU可以是正方形，也可以是长方形。一个CTU首先进行四叉树划分，该四叉树的叶节点进一步进行二叉树划分。同时，在二叉树划分中存在两种划分模式，对称水平分割和对称竖直分割。二叉树的叶节点被称为CU，JEM的CU在预测和变换的过程中都不可以被进一步划分，也就是说JEM的CU，PU，TU具有相同的块大小。在现阶段的JEM中，CTU的最大尺寸为256×256亮度像素。
在一些可行的实施方式中，视频编码器20可利用预定义扫描次序来扫描经量化变换系数以产生可经熵编码的串行化向量。在其它可行的实施方式中，视频编码器20可执行自适应性扫描。在扫描经量化变换系数以形成一维向量之后，视频编码器20可根据上下文自适应性可变长度解码(CAVLC)、上下文自适应性二进制算术解码(CABAC)、基于语法的上下文自适应性二进制算术解码(SBAC)、概率区间分割熵(PIPE)解码或其他熵解码方法来熵解码一维向量。视频编码器20也可熵编码与经编码视频数据相关联的语法元素以供视频解码器30用于解码视频数据。
为了执行CABAC,视频编码器20可将上下文模型内的上下文指派给待传输的符号。上下文可与符号的相邻值是否为非零有关。为了执行CAVLC,视频编码器20可选择待传输的符号的可变长度码。可变长度解码(VLC)中的码字可经构建以使得相对较短码对应于可能性较大的符号,而较长码对应于可能性较小的符号。以这个方式,VLC的使用可相对于针对待传输的每一符号使用相等长度码字达成节省码率的目的。基于指派给符号的上下文可以确定CABAC中的概率。
图2为本申请实施例中视频编码器20的一种示意性框图。视频编码器20可执行视频条带内的视频块的帧内解码和帧间解码。帧内解码依赖于空间预测来减少或去除给定视频帧或图像内的视频的空间冗余。帧间解码依赖于时间预测来减少或去除视频序列的邻近帧或图像内的视频的时间冗余。帧内模式(I模式)可指若干基于空间的压缩模式中的任一者。例如单向预测(P模式)或双向预测(B模式)等帧间模式可指若干基于时间的压缩模式中的任一者。
在图2的可行的实施方式中,视频编码器20包含分割单元35、预测单元41、参考图像存储器64、求和器50、变换处理单元52、量化单元54和熵编码单元56。预测单元41包含运动估计单元42、运动补偿单元44和帧内预测模块46。对于视频块重构建,视频编码器20也包含反量化单元58、反变换单元60和求和器62。也可包含解块滤波器(图2中未展示)以对块边界进行滤波从而从经重构建视频中去除块效应伪影。在需要时,解块滤波器将通常对求和器62的输出进行滤波。除了解块滤波器之外,也可使用额外环路滤波器(环路内或环路后)。
如图2中所展示,视频编码器20接收视频数据,且分割单元35将数据分割成视频块。此分割也可包含分割成条带、图像块或其它较大单元,以及(例如)根据LCU及CU的四叉树结构进行视频块分割。视频编码器20示例性地说明编码在待编码的视频条带内的视频块的组件。一般来说,条带可划分成多个视频块(且可能划分成称作图像块的视频块的集合)。
预测单元41可基于编码质量与代价计算结果(例如,码率-失真代价,RDcost,也称率失真代价)选择当前视频块的多个可能解码模式中的一者,例如多个帧内解码模式中的一者或多个帧间解码模式中的一者。预测单元41可将所得经帧内解码或经帧间解码块提供到求和器50以产生残差块数据且将所得经帧内解码或经帧间解码块提供到求和器62以重构建经编码块从而用作参考图像。
预测单元41内的运动估计单元42及运动补偿单元44执行相对于一个或多个参考图像中的一个或多个预测性块的当前视频块的帧间预测性解码以提供时间压缩。运动估计单元42可经配置以根据视频序列的预定模式确定视频条带的帧间预测模式。预定模式可将序列中的视频条带指定为P条带、B条带或GPB条带。运动估计单元42及运动补偿单元44可高度集成，但为概念目的而分别说明。通过运动估计单元42所执行的运动估计为产生估计视频块的运动矢量的过程。例如，运动矢量可指示当前视频帧或图像内的视频块的PU相对于参考图像内的预测性块的位移。
预测性块为依据像素差而被发现为紧密匹配待解码的视频块的PU的块,像素差可通过绝对差和(SAD)、平方差和(SSD)或其它差度量确定。在一些可行的实施方式中,视频编码器20可计算存储于参考图像存储器64中的参考图像的子整数(sub-integer)像素位置的值。例如,视频编码器20可内插参考图像的四分之一像素位置、八分之一像素位置或其它分数像素位置的值。因此,运动估计单元42可执行相对于全像素位置及分数像素位置的运动搜索且输出具有分数像素精确度的运动矢量。
运动估计单元42通过比较PU的位置与参考图像的预测性块的位置而计算经帧间解码条带中的视频块的PU的运动矢量。可从第一参考图像列表(列表0)或第二参考图像列表(列表1)选择参考图像,列表中的每一者识别存储于参考图像存储器64中的一个或多个参考图像。运动估计单元42将经计算运动矢量发送到熵编码单元56及运动补偿单元44。
由运动补偿单元44执行的运动补偿可涉及基于由运动估计所确定的运动矢量提取或产生预测性块,可能执行到子像素精确度的内插。在接收当前视频块的PU的运动矢量后,运动补偿单元44即可在参考图像列表中的一者中定位运动矢量所指向的预测性块。视频编码器20通过从正经解码的当前视频块的像素值减去预测性块的像素值来形成残差视频块,从而形成像素差值。像素差值形成块的残差数据,且可包含亮度及色度差分量两者。求和器50表示执行此减法运算的一个或多个组件。运动补偿单元44也可产生与视频块及视频条带相关联的语法元素以供视频解码器30用于解码视频条带的视频块。
如果PU位于B条带中,则含有PU的图像可与称作“列表0”和“列表1”的两个参考图像列表相关联。在一些可行的实施方式中,含有B条带的图像可与为列表0和列表1的组合的列表组合相关联。
此外,如果PU位于B条带中,则运动估计单元42可针对PU执行单向预测或双向预测,其中,在一些可行的实施方式中,双向预测为分别基于列表0和列表1的参考图像列表的图像进行的预测,在另一些可行的实施方式中,双向预测为分别基于当前帧在显示顺序上的已重建的未来帧和已重建的过去帧进行的预测。当运动估计单元42针对PU执行单向预测时,运动估计单元42可在列表0或列表1的参考图像中搜索用于PU的参考块。运动估计单元42可接着产生指示列表0或列表1中的含有参考块的参考图像的参考索引和指示PU与参考块之间的空间位移的运动矢量。运动估计单元42可输出参考索引、预测方向标识和运动矢量作为PU的运动信息。预测方向标识可指示参考索引指示列表0或列表1中的参考图像。运动补偿单元44可基于由PU的运动信息指示的参考块产生PU的预测性图像块。
当运动估计单元42针对PU执行双向预测时，运动估计单元42可在列表0中的参考图像中搜索用于PU的参考块且还可在列表1中的参考图像中搜索用于PU的另一参考块。运动估计单元42可接着产生指示列表0和列表1中的含有参考块的参考图像的参考索引和指示参考块与PU之间的空间位移的运动矢量。运动估计单元42可输出PU的参考索引和运动矢量作为PU的运动信息。运动补偿单元44可基于由PU的运动信息指示的参考块产生PU的预测性图像块。
在一些可行的实施方式中,运动估计单元42不向熵编码模块56输出用于PU的运动信息的完整集合。而是,运动估计单元42可参考另一PU的运动信息来用信号通知PU的运动信息。举例来说,运动估计单元42可确定PU的运动信息充分类似于相邻PU的运动信息。在此实施方式中,运动估计单元42可在与PU相关联的语法结构中指示一个指示值,所述指示值向视频解码器30指示PU具有与相邻PU相同的运动信息或具有可从相邻PU导出的运动信息。在另一实施方式中,运动估计单元42可在与PU相关联的语法结构中识别与相邻PU相关联的候选预测运动矢量和运动矢量差(MVD)。MVD指示PU的运动矢量和与相邻PU相关联的所指示候选预测运动矢量之间的差。视频解码器30可使用所指示候选预测运动矢量和MVD来确定PU的运动矢量。
如前文所描述,预测模块41可产生用于CU的每一PU的候选预测运动矢量列表。候选预测运动矢量列表中的一或多者可包括一或多个原始候选预测运动矢量和从原始候选预测运动矢量导出的一或多个额外候选预测运动矢量。
预测单元41内的帧内预测单元46可执行相对于在与待解码的当前块相同的图像或条带中的一个或多个相邻块的当前视频块的帧内预测性解码以提供空间压缩。因此,作为通过运动估计单元42及运动补偿单元44执行的帧间预测(如前文所描述)的替代,帧内预测单元46可帧内预测当前块。明确地说,帧内预测单元46可确定用以编码当前块的帧内预测模式。在一些可行的实施方式中,帧内预测单元46可(例如)在单独编码遍历期间使用各种帧内预测模式来编码当前块,且帧内预测单元46(或在一些可行的实施方式中,模式选择单元40)可从经测试模式选择使用的适当帧内预测模式。
在预测单元41经由帧间预测或帧内预测产生当前视频块的预测性块之后,视频编码器20通过从当前视频块减去预测性块而形成残差视频块。残差块中的残差视频数据可包含于一个或多个TU中且应用于变换处理单元52。变换处理单元52使用例如离散余弦变换(DCT)或概念上类似的变换的变换(例如,离散正弦变换DST)将残差视频数据变换成残差变换系数。变换处理单元52可将残差视频数据从像素域转换到变换域(例如,频域)。
变换处理单元52可将所得变换系数发送到量化单元54。量化单元54对变换系数进行量化以进一步减小码率。量化过程可减少与系数中的一些或全部相关联的比特深度。可通过调整量化参数来修改量化的程度。在一些可行的实施方式中,量化单元54可接着执行包含经量化变换系数的矩阵的扫描。替代地,熵编码单元56可执行扫描。
在量化之后,熵编码单元56可熵编码经量化变换系数。例如,熵编码单元56可执行上下文自适应性可变长度解码(CAVLC)、上下文自适应性二进制算术解码(CABAC)、基于语法的上下文自适应性二进制算术解码(SBAC)、概率区间分割熵(PIPE)解码或另一熵编码方法或技术。熵编码单元56也可熵编码正经解码的当前视频条带的运动矢量及其它语法元素。在通过熵编码单元56进行熵编码之后,可将经编码码流传输到视频解码器30或存档以供稍后传输或由视频解码器30检索。
熵编码单元56可编码根据本申请的技术指示选定帧内预测模式的信息。视频编码器20可在可包含多个帧内预测模式索引表和多个经修改帧内预测模式索引表(也称作码字映射表)的所传输码流配置数据中包含各种块的编码上下文的定义及用于上下文中的每一者的MPM、帧内预测模式索引表和经修改帧内预测模式索引表的指示。
反量化单元58及反变换单元60分别应用反量化及反变换,以在像素域中重构建残差块以供稍后用作参考图像的参考块。运动补偿单元44可通过将残差块与参考图像列表中的一者内的参考图像中的一者的预测性块相加来计算参考块。运动补偿单元44也可将一个或多个内插滤波器应用于经重构建残差块以计算子整数像素值以用于运动估计。求和器62将经重构建残差块与通过运动补偿单元44所产生的经运动补偿的预测块相加以产生参考块以供存储于参考图像存储器64中。参考块可由运动估计单元42及运动补偿单元44用作参考块以帧间预测后续视频帧或图像中的块。
在本申请实施例中,经反变换单元60处理,得到残差数据以后,可以根据当前待编码块周围已重建的空间邻域像素信息计算缩放因子,并使用缩放因子对得到的残差进行缩放处理,以得到用于后续重构参考块或者参考像素的残差数据。
图3为本申请实施例中视频解码器30的一种示意性框图。在图3的可行的实施方式中，视频解码器30包含熵编码单元80、预测单元81、反量化单元86、反变换单元88、求和器90和参考图像存储器92。预测单元81包含运动补偿单元82和帧内预测单元84。在一些可行的实施方式中，视频解码器30可执行与关于图2的视频编码器20所描述的编码流程示例性地互逆的解码流程。
在解码过程期间,视频解码器30从视频编码器20接收表示经编码视频条带的视频块及相关联的语法元素的经编码视频码流。视频解码器30的熵编码单元80熵解码码流以产生经量化系数、运动矢量及其它语法元素。熵编码单元80将运动矢量及其它语法元素转递到预测单元81。视频解码器30可在视频条带层级和/或视频块层级处接收语法元素。
在视频条带经解码为经帧内解码(I)条带时,预测单元81的帧内预测单元84可基于用信号发送的帧内预测模式及来自当前帧或图像的先前经解码块的数据而产生当前视频条带的视频块的预测数据。
在视频图像经解码为经帧间解码(例如,B、P或GPB)条带时,预测单元81的运动补偿单元82基于从熵编码单元80所接收的运动矢量及其它语法元素而产生当前视频图像的视频块的预测性块。预测性块可从参考图像列表中的一者内的参考图像中的一者产生。视频解码器30可基于存储于参考图像存储器92中的参考图像使用默认构建技术来构建参考图像列表(列表0及列表1)。
运动补偿单元82通过解析运动矢量及其它语法元素来确定当前视频条带的视频块的预测信息,且使用预测信息来产生正经解码的当前视频块的预测性块。例如,运动补偿单元82使用所接收的语法元素中的一些来确定用以解码视频条带的视频块的预测模式(例如,帧内预测或帧间预测)、帧间预测条带类型(例如,B条带、P条带或GPB条带)、条带的参考图像列表中的一者或多者的构建信息、条带的每一经帧间编码视频块的运动矢量、条带的每一经帧间解码视频块的帧间预测状态及用以解码当前视频条带中的视频块的其它信息。
运动补偿单元82也可基于内插滤波器执行内插。运动补偿单元82可使用如由视频编码器20在视频块的编码期间所使用的内插滤波器来计算参考块的子整数像素的内插值。在此应用中,运动补偿单元82可从所接收的语法元素确定由视频编码器20使用的内插滤波器且使用内插滤波器来产生预测性块。
如果PU是使用帧间预测而编码,则运动补偿单元82可产生用于PU的候选预测运动矢量列表。码流中可包括识别选定候选预测运动矢量在PU的候选预测运动矢量列表中的位置的数据。在产生用于PU的候选预测运动矢量列表之后,运动补偿单元82可基于由PU的运动信息指示的一或多个参考块产生用于PU的预测性图像块。PU的参考块可在与所述PU不同的时间图像中。运动补偿单元82可基于由PU的候选预测运动矢量列表中的选定的运动信息确定PU的运动信息。
反量化单元86对码流中所提供且通过熵编码单元80所解码的经量化变换系数进行反量化(例如,解量化)。反量化过程可包含使用通过视频编码器20针对视频条带中的每一视频块所计算的量化参数确定量化的程度,且同样地确定应应用的反量化的程度。反变换单元88将反变换(例如,反DCT、反整数变换或概念上类似的反变换过程)应用于变换系数以便在像素域中产生残差块。和编码端对应的,在本申请实施例中,经反变换单元88处理,得到残差数据以后,可以根据当前待解码块周围已重建的空间邻域像素信息计算缩放因子,并使用缩放因子对得到的残差进行缩放处理,以得到用于后续重构待解码块的残差数据。
在运动补偿单元82基于运动矢量及其它语法元素产生当前视频块的预测性块之后,视频解码器30通过将来自反变换单元88的残差块与通过运动补偿单元82产生的对应预测性块求和来形成经解码视频块。求和器90表示执行此求和运算的一个或多个组件。在需要时,也可应用解块滤波器来对经解码块进行滤波以便去除块效应伪影。其它环路滤波器(在解码环路中或在解码环路之后)也可用以使像素转变平滑,或以其它方式改进视频质量。给定帧或图像中的经解码视频块接着存储于参考图像存储器92中,参考图像存储器92存储供后续运动补偿所使用的参考图像。参考图像存储器92也存储供稍后呈现于例如图1的显示装置32的显示装置上的经解码视频。
应理解，本申请的技术可通过本申请中所描述的视频编解码器中的任一者进行，视频编解码器包含(例如)如关于图1到3所展示及描述的视频编码器20及视频解码器30。即，在一种可行的实施方式中，在视频数据的块的编码期间，关于图2所描述的反变换单元60在执行反变换之后，可由反变换单元60或者其他新增功能性单元执行下文中所描述的特定技术。在另一可行的实施方式中，关于图3所描述的反变换单元88或者其他新增功能性单元可在视频数据的块的解码期间执行下文中所描述的特定技术。因此，对一般性“视频编码器”或“视频解码器”的引用可包含视频编码器20、视频解码器30或另一视频编码或解码单元。
图4示意性的给出了本申请实施例的一种残差获取方法的流程图。示例性的，该方法可以由视频解码器30执行。该视频解码方法被描述为一系列的步骤或操作，应理解，该方法可以以各种顺序执行和/或同时发生，不限于图4所示的执行顺序。假设正在使用视频解码器处理具有多个视频帧的视频数据流，则可执行包括如下步骤的过程来解码当前视频帧的当前待处理图像块。还应理解，在本申请实施例中，调节因子的获取和当前待处理块的预设空间邻域内的像素信息相关，而对于编码端和解码端来说，当前待处理块的预设空间邻域内的像素信息是相同的，因此调节因子是相同的，对残差数据的调整也是对应的。本领域技术人员可以理解，一般情况下，编码是与解码对应的逆过程，因此本申请实施例所体现的技术方案也可以由视频编码器20在编码端执行，不再重复描述。
S401、解析码流,以获取待处理块的变换系数。
该步骤属于熵解码技术,具体的,按照预设的解析规则,将码流中以比特形式(二进制数值)表示的语法元素,解析为该语法元素对应的实际数值。涉及到变换系数的解析,具体的,将码流中表示变换系数的二进制表示,通过变换系数的解析规则,解析为变换系数的具体数值。应理解,所述待处理块的多个变换系数被依次解析出来。一般的,获取的所述待处理块的变换系数的个数与所述待处理块的像素点的个数相同,被解析出来的待处理块的变换系数会被按照预设的位置关系排列为变换系数块,这一过程通常被称为反扫描或扫描的过程。其中,预设的位置关系包括预设的变换系数的固定映射位置,也包括根据预设规则确定的变换系数的映射位置,比如根据帧内预测模式确定的变换系数的映射位置(也称为基于帧内预测模式的变换系数的扫描方式)。
典型的熵解码技术,包括前文所提到的CABAC,可以参考H.265标准(Rec.ITU-T H.265v4)中,第201页到第243页,第9.3节的介绍。JEM同样对CABAC技术进行了改进,具体的可以参见JVET-G1001-v1中,第41页到第43页,第2.6节的介绍,不再赘述。本申请实施例中对使用何种熵解码技术不做限定。
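作为上述反扫描过程的一个简化示意，下面给出一段Python代码，演示如何把按扫描顺序解析出的一维变换系数按预设位置关系排列为4×4的变换系数块；其中的对角扫描表DIAG_SCAN_4X4仅是为说明而假设的示例，并非任何标准实际使用的扫描表：

    # 示意性假设的4x4对角扫描表：元素为(行,列)，顺序即系数被解析出的顺序
    DIAG_SCAN_4X4 = [
        (0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2), (3, 0), (2, 1),
        (1, 2), (0, 3), (3, 1), (2, 2), (1, 3), (3, 2), (2, 3), (3, 3),
    ]

    def coeffs_to_block(coeffs):
        # 反扫描：把16个按扫描顺序解析出的变换系数放回4x4变换系数块
        assert len(coeffs) == 16
        block = [[0] * 4 for _ in range(4)]
        for idx, (r, c) in enumerate(DIAG_SCAN_4X4):
            block[r][c] = coeffs[idx]
        return block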
S402、将所述变换系数转换为所述待处理块的第一残差。
应理解,当待处理块的变换系数被排列成变换系数块时,对应的,所述将所述变换系数转换为所述待处理块的第一残差包括:将所述变换系数块转换为所述待处理块的第一残差块;对应的,所述基于所述调节因子调整所述第一残差,以获得所述待处理块的第二残差,包括:基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块。
一般的,该步骤可以分成两个子步骤:
S4021、对所述变换系数块中的每一个变换系数进行反量化,以获得反量化后的变换系数块。
在一种示例下,对量化后的变换系数A(i)进行反量化,以得到重建的变换系数R(i),可以描述为:
R(i)=sign{A(i)}·round{A(i)·Qs(i)+o2(i)}
其中量化步长Qs(i)可以为浮点数,o2(i)为舍入偏置。在一些可行的实施方式下,为了避免使用浮点数运算,采用整数加法和移位的方式来近似替代浮点数乘法,例如H.265/HEVC将上式表述的反量化过程近似为:
R(i)=sign{A(i)}·(A(i)·Qs'(i)+(1<<(bdshift-1)))>>bdshift
其中，bdshift为移位参数，Qs'(i)为整数，Qs'(i)/2^bdshift近似于上式中的量化步长Qs(i)，此时o2(i)=0.5，取整方式为向下取整。
在一种示例下,Qs'(i)由电平尺度(levelScale)l(i)和缩放因子(scaling factor)m(i)共同决定,
Qs'(i)=m(i)·l(i)
而l(i)为量化参数(Quantization Parameter,QP)的函数,即有
l(i)=levelScale[QP(i)%6]<<⌊QP(i)/6⌋
其中,电平尺度表levelScale[k]={40,45,51,57,64,72},k=0,1,...,5;
其中，⌊QP(i)/6⌋表示对QP(i)除6取整，%为取余操作。
特别的,当一个变换块(transform block)的长和宽的乘积等于2的奇数次幂时,R(i)还可以通过如下公式获得:
R(i)=sign{A(i)}·(A(i)·Qs'(i)·a+(1<<(bdshift-1+s)))>>(bdshift+s)
其中,a和s为预设常数,且
a≈2^s/√2(即a/2^s近似于1/√2)，
例如a=181,s=8。
应理解,在本文中,符号<<表示左移位操作,符号>>表示右移位操作,不再赘述。
该步骤一般被称为反量化或者缩放(scaling)，在H.265标准中采用标量量化的方式进行反量化，具体的可以参见JCTVC-M1002-v1(可以从http://phenix.int-evry.fr/jct/获取)中第20页，第3.5.5节的介绍，或者H.265标准中第173页到177页，第8.6节的介绍，不再赘述。同时，也应理解，还可以采用矢量量化的方式进行反量化。本申请实施例中对使用何种反量化技术不做限定。
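结合上述公式，这里用Python给出一个与H.265风格标量反量化一致的整数实现示意；levelScale表取自上文，缩放因子m与移位参数bdshift和块大小、位深有关，此处作为入参由调用者给定，属于示意性简化：

    LEVEL_SCALE = [40, 45, 51, 57, 64, 72]  # 上文的电平尺度表levelScale[k]，k=0..5

    def dequantize(a, qp, m, bdshift):
        # 标量反量化：R = sign(A)*(|A|*Qs' + (1<<(bdshift-1))) >> bdshift，
        # 其中 Qs' = m*levelScale[QP%6] << (QP//6)，与上文l(i)的定义一致；
        # 变换块长宽乘积为2的奇数次幂时的附加常数a、s(见上文)此处从略
        qs = m * (LEVEL_SCALE[qp % 6] << (qp // 6))
        sign = -1 if a < 0 else 1
        return sign * ((abs(a) * qs + (1 << (bdshift - 1))) >> bdshift)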
S4022、对所述反量化后的变换系数块进行反变换,以获得所述待处理块的第一残差块。
该步骤一般被称为反变换,典型的反变换技术包括H.265中的反离散余弦变换(Inverse Discrete Cosine Transform,IDCT)或反离散正弦变换(Inverse Discrete Sine Transform,IDST),更具体的,如DCT-II型反变换或DST-VII型反变换,还可以为DCT-VIII型反变换或DST-I型反变换;又例如,由变换块的变换模式信息确定一种反变换,使用上述确定的反变换进行反变换处理,如JEM中的自适应多核心变换(Adaptive Multiple core Transform,AMT)。反变换处理还可以包括先对部分反量化后的变换系数进行不可分离的第二重变换得到一组新的变换系数,如JEM中的NSST(Non-Separable Secondary Transform)处理,再使用基于离散余弦变换或离散正弦变换的反变换对这组新的变换系数进行反变换。具体的,可以参见JCTVC-M1002-v1中第18页到第20页,第3.5节中关于变换技术的介绍。JEM同样对变换、反变换技术进行了改进,具体的可以参见JVET-G1001-v1中,第28页到第35页,第2.4节的介绍,不再赘述。本申请实施例中对使用何种反变换技术不做限定。
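反变换环节的数据流可用下面的Python代码作原理性示意(依赖numpy与scipy)；实际标准使用定点化的整数变换核，这里以浮点二维反DCT近似，并非任何标准变换核的精确实现，仅用于说明“反量化后的系数块到第一残差块”的转换：

    import numpy as np
    from scipy.fft import idctn

    def inverse_transform(dequant_block):
        # 对反量化后的系数块做二维反DCT(DCT-II的逆)，得到第一残差块的浮点近似
        return idctn(np.asarray(dequant_block, dtype=np.float64), norm="ortho")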
应理解,在一些可行的实施方式中,和编码端处理相对应的,在将所述变换系数转换为所述待处理块的第一残差的过程中,只存在反量化(此时,变换系数实际为量化后残差值)或只存在反变换,本申请实施例对此不做限定。
S403、基于所述待处理块的预设空间邻域内的像素值,计算所述待处理块的预设空间邻域内的像素信息。
应理解,该步骤只需要利用待处理块的预设空间邻域内的像素信息即可进行,不需要等待步骤S401和S402的完成,同理,步骤S401和S402也不需要等待步骤S403的完成,即没有时序上的先后关系。
具体的,该步骤可以分为两个子步骤:
S4031、获取所述预设空间邻域内的一个或多个像素集合。
首先介绍空间邻域的概念:当前待处理(待解码)图像块的空间邻域像素是指与当前待处理图像块在同一帧解码图像上的像素。结合图5所示,当前待处理图像块的空间邻域像素可以包括:图像块X(亦称为图像区域X)的空间邻域Y中至少一个像素的重建值,具体地,空间邻域像素可以包括M个像素,M为正整数,其中空间邻域Y的几种可选示例包括:
如图5(a)-5(d)所示,图像块X(由实线指示)对应于一个w×h编码单元(即编码单元的宽为w个像素,高为h个像素,在解码端也可称为解码单元,解码块等),空间邻域Y(由虚线指示)的构成方式例如以下4种之一:
1)方式一:X上方的w×n个像素、X左方的m×h个像素、X左上方的m×n个像素,如图5(a)所示,此时M=w×n+m×h+m×n。
2)方式二:X上方的w×n个像素、X左方的m×h个像素,如图5(b)所示。
3)方式三:X上方的w×n个像素、X左方的m×h个像素、X右方的m×h个像素,如图5(c)所示。
4)方式四:X上方的w×n个像素、X下方的w×n个像素、X左方的m×h个像素、X右方的m×h个像素,如图5(d)所示。
如图5(e)-5(f)所示,图像块X对应于一个wc×hc编码单元C(由点线指示)中的一个w×h区域,空间邻域Y的构成方式例如以下2种之一:
1)方式一:X所属编码单元C上方的wc×n个像素、C左方的m×hc个像素,如图5(e)所示。
2)方式二:X所属编码单元C上方的wc×n个像素、C左方的m×hc个像素、C右方的m×hc个像素,如图5(f)所示。
其中,m和n为预设常数,例如m=n=1,或m=n=2,或m=2、n=1,或m=1,n=2。m和n还可以与图像块X的大小有关,例如当图像块X的宽小于或等于第一阈值(例如8)时,n=2;当图像块X的宽大于第一阈值(例如8)时,n=1。空间邻域像素可以为空间邻域Y中的所有像素,也可以是从空间邻域Y中抽样的一部分像素,本发明对此不作限定。
在一些可行的实施方式中,在所述获取所述预设空间邻域内的一个或多个像素集合之前,还包括:确定所述一个或多个像素集合中的每个像素集合中的全部像素已完成重构。
具体的,检查空间邻域中像素是否已经重建,并获取空间邻域中已重建像素的亮度值。例如,对于图5(b)所示的空间邻域构成方式,分别检查图像区域X的左侧、上侧的像素是否已经重建,以获取这些区域中已经重建的像素的亮度值。又例如,对于图5(c)所示的空间邻域构成方式,分别检查图像区域X的左侧、上侧、右侧的像素是否已经重建,以获取这些区域中已经重建的像素的亮度值。又例如,对于图5(c)所示的空间邻域构成方式,分别检查图像区域X的左侧、上侧、右侧的像素是否已经重建,如果左侧和右侧的像素均已重建,但上侧的像素没有重建,则获取左右两侧的像素的亮度值;如果三侧像素均已重建,则获取左侧和上侧的像素的亮度值;如果左侧和上侧的像素均已重建,但右侧的像素没有重建,则获取左侧和上侧的像素的亮度值。
对于获取所述预设空间邻域内的一个或多个像素集合,可以将整个Y区域理解为预设空间邻域,X的左侧、上侧、右侧的像素分别构成一个像素集合,也可以将X的左侧、上侧、右侧各自作为一个预设空间领域。应理解,一个像素集合可以只包含一个像素,也可以包含预设空间邻域内的全部像素。
可选的,如果空间邻域Y中已重建像素的个数小于阈值,则调节因子可设置为预设常数,不需要执行S4032和S404。所述阈值例如为16,又例如为空间邻域Y包含像素数目的1/4。
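下面的Python代码按图5(b)的构成方式(X上方的w×n个像素、X左方的m×h个像素)给出收集空间邻域像素的简化示意；其中recon表示当前帧已重建的亮度平面，并简化地假设位于当前块上方、左方且未越界的像素均已完成重构(实际实现需按上文所述逐区域检查可用性)：

    def gather_neighbor_pixels(recon, x0, y0, w, h, m=1, n=1, min_count=16):
        # 收集待处理块(x0, y0, w, h)上方w×n与左方m×h区域内的已重建像素值；
        # recon[y][x]为已重建亮度平面，越界位置跳过；
        # 可用像素数小于min_count时返回None，表示调节因子直接取预设常数
        pixels = []
        for dy in range(1, n + 1):          # 上方 w×n
            y = y0 - dy
            if y >= 0:
                pixels.extend(recon[y][x0:x0 + w])
        for dx in range(1, m + 1):          # 左方 m×h
            x = x0 - dx
            if x >= 0:
                for y in range(y0, y0 + h):
                    pixels.append(recon[y][x])
        return pixels if len(pixels) >= min_count else None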
S4032、计算所述一个或多个像素集合内像素的均值和/或离散度,以获得所述预设空间邻域内的像素信息。
在本发明实施例中，为了实现自适应地调节残差的效果，利用当前待处理块(即变换块)的空间邻域像素信息来模拟当前待处理块对应的原始像素信息。空间邻域像素信息的统计特性是指对空间邻域像素信息中多个像素的像素值经过统计分析得到的数值结果，空间邻域像素信息的统计特性至少可以包括像素均值P_avg和/或像素离散度P_con。空间邻域像素信息的统计特性一定程度上反映了当前图像块所处背景区域的特征(例如背景亮度和背景对比度)。
其中，空间邻域像素信息中K1个像素的亮度值(即亮度分量)的均值P_avg，简称像素均值，即：
P_avg=(P(1)+P(2)+...+P(K1))/K1
其中P(k)为空间邻域中一个像素的亮度值(即亮度分量)，K1为小于或等于M的正整数，例如K1=M/2或M，其中空间邻域像素包括M个像素。
空间邻域像素信息中K2个像素的亮度值与像素均值P_avg的平均绝对误差和(Mean Absolute Difference,MAD)，可以作为离散度P_con的一种表示方式，即：
P_con=(|P(1)-P_avg|+|P(2)-P_avg|+...+|P(K2)-P_avg|)/K2
其中K1,K2均为小于等于M的正整数,K1可与K2相等,也可以K1>K2,例如K1=M/2或M,K2=M/4或M。
应理解,离散度还可以通过均方误差和,方差或标准差以及像素间的相关性等其它方式来表示,不做限定。同时,预设空间邻域内的像素信息也可以通过除均值和离散度以外的其它和空间邻域像素值有关的物理量来表示,不做限定。
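与上述统计量对应的计算过程可用Python示意如下，这里取K1=K2=全部可用邻域像素数，是一种简化选择：

    def pixel_stats(pixels):
        # 计算邻域像素的均值P_avg，以及平均绝对误差和(MAD)形式的离散度P_con
        k = len(pixels)
        p_avg = sum(pixels) / k
        p_con = sum(abs(p - p_avg) for p in pixels) / k
        return p_avg, p_con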
S404、根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子。
应理解,步骤S403中所计算的像素信息的具体表征方式是和步骤S404保持一致的。示例性的,当仅采用像素均值来确定待处理块的调节因子时,在步骤S403中则仅需要计算像素均值,而不需要计算像素的离散度。
在一种可行的实施方式中，所述像素信息为所述均值，所述根据所述待处理块的预设空间邻域内的像素信息，确定所述待处理块的调节因子，包括：根据所述均值以及所述均值和所述调节因子的第一映射关系，确定所述调节因子，其中，所述第一映射关系满足如下一个或多个条件：当所述均值小于第一阈值时，所述调节因子随所述均值的增大而减小；当所述均值大于第二阈值时，所述调节因子随所述均值的增大而增大，其中，所述第一阈值小于或等于所述第二阈值；当所述均值大于或等于所述第一阈值，且小于或等于所述第二阈值时，所述调节因子为第一预设常数。
具体的，根据像素均值P_avg的第一分段函数f1(P_avg)计算所述调节因子；其中，所述像素均值P_avg由步骤S403获得。
调节因子QC由关于所述像素均值P_avg的第一分段函数f1(P_avg)决定，即QC=f1(P_avg)^β，其中β>0，例如β=1或0.5。f1(P_avg)为关于P_avg的U形函数，f1(P_avg)满足当P_avg小于阈值T1时f1(P_avg)的一阶导数小于0，当P_avg大于阈值T2时f1(P_avg)的一阶导数大于0，P_avg在阈值T1和T2之间时f1(P_avg)等于常数C0；其中，T1≥0，T2≥0，T2≥T1，T1例如为0、60、64或128，T2例如为0、80、128或170；C0为正实数，例如为0.5、0.75、1、1.5、8、16、256或1024。更具体的，f1(P_avg)函数例如
Figure PCTCN2019083848-appb-000008
其中η1为正实数，例如η1=150或200.8；η2为正实数，例如η2=425或485.5。f1(P_avg)函数又例如
Figure PCTCN2019083848-appb-000009
其中η3为正实数，例如η3=425或256或135.1。
应理解，所述第一映射关系可以为如上文所述以所述均值为自变量、所述调节因子为因变量的第一分段函数f1(P_avg)，也可以为所述均值和所述调节因子的预设的对应关系。具体的，可以将所述均值和所述调节因子的预设的对应关系固化于编解码两端，当所述均值获得时，可以通过查表的方法，确定对应的所述调节因子。查表的方法降低了计算复杂度，更有利于硬件的实现；而通过计算获得所述调节因子的方法能得到更加精确的结果，并且不需要存储上述对应关系表。
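由于f1的具体公式在原文中以图像形式给出、此处无法恢复，下面仅按正文所述的U形约束(小于T1时递减、T1与T2之间为常数C0、大于T2时递增)构造一个满足该约束的假设性示例(Python)；其中的阈值与斜率参数均为示意取值，并非原专利公式本身：

    def f1(p_avg, t1=60.0, t2=170.0, c0=1.0, eta1=200.8, eta2=485.5):
        # 满足正文U形约束的一个假设性第一映射f1(P_avg)
        if p_avg < t1:
            return c0 + (t1 - p_avg) / eta1   # 递减段：背景越暗，亮度掩蔽越强，调节因子越大
        if p_avg > t2:
            return c0 + (p_avg - t2) / eta2   # 递增段：背景越亮，亮度掩蔽越强，调节因子越大
        return c0                             # 中间段为常数C0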
在一种可行的实施方式中,所述像素信息为所述离散度,所述根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子,包括:根据所述离散度以及所述离散度和所述调节因子的第二映射关系,确定所述调节因子,其中,所述第二映射关系满足如下一个或多个条件:当所述离散度大于第三阈值时,所述调节因子随所述离散度的增大而增大;当所述离散度小于或等于所述第三阈值时,所述调节因子为第二预设常数。
具体的，根据离散度P_con的第二分段函数f2(P_con)计算所述调节因子；其中，所述离散度P_con由步骤S403获得。
调节因子QC由关于所述离散度P_con的第二分段函数f2(P_con)决定，即QC=f2(P_con)^γ，其中γ>0，例如γ=1或者0.8。f2(P_con)为关于P_con的单调函数，f2(P_con)满足当(P_con)^α小于阈值T3时，f2(P_con)为常数C3，当(P_con)^α大于等于阈值T3时，f2(P_con)的一阶导数大于0。其中，T3≥0，T3例如为0、3、5或10；α>0，例如α=1/2或1；C3为正实数，例如0.5、0.8、1、16、32或256。更具体的，f2(P_con)函数例如
Figure PCTCN2019083848-appb-000010
其中η4为正实数，例如η4=10、20、35.5、80或100。
应理解，所述第二映射关系可以为如上文所述以所述离散度为自变量、所述调节因子为因变量的第二分段函数f2(P_con)，也可以为所述离散度和所述调节因子的预设的对应关系。具体的，可以将所述离散度和所述调节因子的预设的对应关系固化于编解码两端，当所述离散度获得时，可以通过查表的方法，确定对应的所述调节因子。
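同样地，f2的原始公式以图像给出，下面按正文所述的单调约束((P_con)^α小于T3时为常数C3，否则随离散度增大而增大)给出一个假设性示例(Python)，参数取值仅为示意：

    def f2(p_con, t3=5.0, c3=1.0, alpha=1.0, eta4=35.5):
        # 满足正文单调约束的一个假设性第二映射f2(P_con)
        x = p_con ** alpha
        return c3 if x < t3 else c3 + (x - t3) / eta4   # 离散度越大，对比度掩蔽越强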
在一种可行的实施方式中,所述像素信息为所述均值和所述离散度,所述根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子,包括:根据所述均值和所述第一映射关系确定第一参数;根据所述离散度和所述第二映射关系确定第二参数;将所述第一参数和所述第二参数的乘积或加权和作为所述调节因子。
具体的，调节因子QC由关于所述像素均值P_avg的第一分段函数f1(P_avg)和关于所述离散度P_con的第二分段函数f2(P_con)联合决定，例如QC=f1(P_avg)^β·f2(P_con)^γ，其中β,γ>0，例如β=1，γ=1，或者β=0.5，γ=1.5，或者β=2，γ=1；或者，例如QC=f1(P_avg)·k1+f2(P_con)·k2，其中k1和k2为正实数，例如k1=k2=0.5，或者k1=0.25，k2=0.75，或者k1=0.2，k2=0.7。
需要说明的是，上述参数T1、T2、T3、C0、C3、C4、η1、η2、η3、η4可以为预先设定的常数，也可以根据视频图像的统计特性自适应计算得到，也可以从视频码流中提取得到。
在一种可行的实施方式中，在所述将所述第一参数和所述第二参数的乘积或加权和作为所述调节因子之后，还包括：将所述调节因子进行加权调整，以获得调整后的调节因子；对应的，所述确定所述待处理块的调节因子，包括：将所述调整后的调节因子作为所述待处理块的调节因子。
具体的，调节因子QC由关于所述像素均值P_avg的第一分段函数f1(P_avg)和关于所述离散度P_con的第二分段函数f2(P_con)以及加权系数s联合决定，例如QC=(f1(P_avg)^β·f2(P_con)^γ·s+offset)>>shift，其中offset和shift为预设常数，如offset=1<<(shift-1)，shift=8或12或16。所述加权系数s可以通过解析序列参数集(Sequence Parameter Set,SPS)得到，或者也可以通过解析条带头(slice header)得到。
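将两个映射的输出与加权系数s按定点方式合成调节因子的过程可示意如下(Python)；取s=1<<16、shift=8时，f1·f2≈1.0对应QC≈256，可与后文残差缩放中右移8位的定点基相衔接，该取值组合同样只是示意性假设：

    def combine_qc(f1_val, f2_val, s=1 << 16, shift=8):
        # QC = (f1(P_avg)·f2(P_con)·s + offset) >> shift 的定点化示意；
        # f1_val、f2_val为前文示例中f1、f2映射的输出(浮点)
        offset = 1 << (shift - 1)
        return (int(round(f1_val * f2_val * s)) + offset) >> shift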
在一种可行的实施方式中，在所述确定所述待处理块的调节因子之后，还包括：根据所述待处理块的量化参数，对所述调节因子进行更新；对应的，所述基于所述调节因子调整所述第一残差，以得到所述待处理块的第二残差，包括：基于所述更新后的调节因子调整所述第一残差，以得到所述待处理块的第二残差。
具体的,所述调节因子通过如下方式进行调节:
Figure PCTCN2019083848-appb-000011
其中,QC表示所述调节因子,QP表示所述量化参数,N,M,X为预设常数,例如N=256或128、M=30或32、X=32或24。
S405、基于所述调节因子调整所述第一残差,以获得所述待处理块的第二残差。
应理解,一般的,视频图像包括亮度分量(Y)和色度分量(Cb、Cr,或者U、V)。对应的,待处理块包括待处理块的亮度分量和色度分量,第一残差块包括第一亮度残差块和第一色度残差块,第二残差块包括第二亮度残差块和第二色度残差块,其中,在一些实施例中,色度残差块又可以分为Cb分量的残差块和Cr分量的残差块,或者分为U分量的残差块和V分量的残差块。
在一种可行的实施方式中,该步骤包括:仅基于所述调节因子调整所述第一亮度残差,以获得所述待处理块的第二亮度残差。在此情况下,不对色度残差做调整,即对于待处理块的色度分量,第二残差即为第一残差。
在另一种可行的实施方式中,该步骤包括:仅基于所述调节因子调整所述第一色度残差,以获得所述待处理块的第二色度残差。在此情况下,不对亮度残差做调整,即对于待处理块的亮度分量,第二残差即为第一残差。
在另一种可行的实施方式中,该步骤包括:基于所述调节因子调整所述第一亮度残差,以获得所述待处理块的第二亮度残差,同时,基于所述调节因子调整所述第一色度残差,以获得所述待处理块的第二色度残差。应理解,调整亮度残差的调节因子和调整色度残差的调节因子可以相同,也可以不同。调整色度残差的调节因子可以通过计算所述待处理块的预设空间邻域的亮度像素信息获得,也可以通过类似的方法通过计算所述待处理块的预设空间邻域的色度像素信息获得,或者综合考虑所述待处理块的预设空间邻域的亮度和色度像素信息获得,不做限定。
下面对亮度残差和色度残差的情况分别进行描述:
所述第一残差块包括所述待处理块的亮度分量的第一亮度残差块,所述第一亮度残差块中的亮度残差像素与所述待处理块的亮度分量的像素一一对应,对应的,所述第二残差块包括所述待处理块的亮度分量的第二亮度残差块,所述基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块,包括:基于所述调节因子调整所述第一亮度残差块中的亮度残差像素,以获得所述待处理块的第二亮度残差块中的亮度残差像素。其中,所述第二亮度残差块中的亮度残差像素通过如下方式获得:
Res2_Y(i)=(Res1_Y(i)×QC+offset_Y)>>shift_Y
其中，QC表示所述调节因子，Res1_Y(i)表示所述第一亮度残差块中的第i个亮度残差像素，Res2_Y(i)表示所述第二亮度残差块中的第i个亮度残差像素，offset_Y和shift_Y为预设常数，i为自然数。示例性的，shift_Y=8或10或12，offset_Y=1<<(shift_Y-1)。
所述第一残差块包括所述待处理块的色度分量的第一色度残差块，所述第一色度残差块中的色度残差像素与所述待处理块的色度分量的像素一一对应，对应的，所述第二残差块包括所述待处理块的色度分量的第二色度残差块，所述基于所述调节因子调整所述第一残差块，以获得所述待处理块的第二残差块，包括：基于所述调节因子调整所述第一色度残差块中的色度残差像素，以获得所述待处理块的第二色度残差块中的色度残差像素。其中，所述第二色度残差块中的色度残差像素通过如下方式获得：
Res2_C(i)=(Res1_C(i)×QC+offset_C)>>shift_C
其中，QC表示所述调节因子，Res1_C(i)表示所述第一色度残差块中的第i个色度残差像素，Res2_C(i)表示所述第二色度残差块中的第i个色度残差像素，offset_C和shift_C为预设常数，i为自然数。示例性的，shift_C=8或10或12，offset_C=1<<(shift_C-1)。
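亮度与色度残差的调节可统一为同一逐像素整数缩放操作，用Python示意如下；QC按定点理解(如256表示1.0)，offset与shift对应上文的offset_Y/shift_Y或offset_C/shift_C，第一残差可按下文所述以更高比特位宽保存：

    def scale_residual(res1, qc, shift=8):
        # Res2(i) = (Res1(i)×QC + offset) >> shift 的逐像素实现；
        # Python的>>为算术右移，对负残差同样按向下取整处理
        offset = 1 << (shift - 1)
        return [(r * qc + offset) >> shift for r in res1]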
应理解,在本步骤中,第一残差经过了缩放处理,得到了第二残差。为了提高处理过程中的运算精度,第一残差作为中间处理结果,可以采用较高精度的比特位宽(也称为比特深度,bit depth)的数值形式,以及采用较高精度的比特位宽存储第一残差的数值,例如待处理块的像素的比特位宽为D位时,第一残差可保存为D+E位,示例性的,D可以为8、10或12,E可以为1、2、3或4。一般的,第二残差的比特位宽被处理为和待处理块的像素的比特位宽相同。在本步骤中,当获取第二残差时,会将比特位宽的精度降低,结合本段中的例子,即上述右移shift_Y或shift_C中,示例性的,包含右移E位的操作。所述第一残差块中的残差像素的比特位宽精度高于所述第二残差块中的残差像素的比特位宽精度。对于亮度和色度分量,则可以分别描述为:所述第一亮度残差块中的亮度残差像素的比特位宽精度高于所述第二亮度残差块中的亮度残差像素的比特位宽精度,以及所述第一色度残差块中的色度残差像素的比特位宽精度高于所述第二色度残差块中的色度残差像素的比特位宽精度。显然,当不对色度分量进行调节时,一般的,不采用较高精度的比特位宽存储第一色度残差的数值,也不存在获取第二色度残差时降低比特位宽的步骤。
S406、将所述第二残差中的残差像素和所述待处理块中对应位置的预测像素相加,以获得所述待处理块中所述对应位置的重构像素。
预测像素一般通过帧内预测技术或者帧间预测技术产生。典型的帧内预测技术和帧间预测技术,可以参考H.265标准(Rec.ITU-T H.265 v4)中,第125页到第172页,第8.4节帧内预测以及第8.5节帧间预测的介绍。JEM同样对帧内预测和帧间预测技术进行了大量改进,具体的可以参见JVET-G1001-v1中,第6页到第28页,第2.2节帧内预测技术的改进以及第2.3节帧间预测技术的改进的介绍,不再赘述。本申请实施例中对使用何种预测技术不做限定。
另外，在一些可行的实施方式中，在将所述第二残差中的残差像素和所述待处理块中对应位置的预测像素相加之后，还会将相加值限制在一个区间之内，比如限制在待处理块的像素的可允许的取值范围之内，对应的，将限幅后的相加值作为所述待处理块中所述对应位置的重构像素。
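重构环节(第二残差与预测像素逐点相加并限制到像素取值范围)可用Python示意如下，位深D作为入参：

    def reconstruct(pred, res2, bit_depth=8):
        # 重构像素 = Clip3(0, 2^D - 1, 预测像素 + 第二残差)
        max_val = (1 << bit_depth) - 1
        return [min(max(p + r, 0), max_val) for p, r in zip(pred, res2)]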
在一些可行的实施方式中,在获得所述待处理块中所述对应位置的重构像素之后,还可以包括对重构像素进行滤波处理,比如JEM中提出的双边滤波处理。在一些可行的实施方式中,待处理块是否需要进行滤波处理通过解码获得的语法元素来确定。
本发明对应的编码处理例如:对一个待编码块,按照空间邻域像素计算得到的调节因子,使用调节因子的倒数对这个编码块的预测残差进行缩放,将缩放后的预测残差进行变换和量化,得到量化后的变换系数,量化后的变换系数被熵编码单元编码成码流。应理解,如前文所述,一般的,编码和解码是可逆的过程,因此当解码端使用调节因子进行残差处理时,对应的,编码端使用调节因子的倒数进行残差处理。
本发明对应的另一种编码处理,对一个待编码块,按照空间邻域像素计算得到的调节因子,使用调节因子对这个编码块的量化步长进行缩放,对预测残差进行变换,并使用缩放后的量化步长对变换系数进行量化,得到量化后的变换系数,量化后的变换系数被熵编码单元编码成码流。
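与解码端相对应，编码端用调节因子的倒数缩放预测残差(或等价地用调节因子缩放量化步长)的定点示意如下(Python)；其中用(1<<(2·shift))//QC近似1/QC的定点表示，这一具体写法属于示意性假设：

    def encoder_scale_residual(res, qc, shift=8):
        # 编码端按 res' = (res×scale + offset) >> shift 用QC的倒数缩放残差；
        # scale = (1<<(2*shift)) // QC ≈ 1/QC 的定点表示(QC的定点基为1<<shift)
        scale = (1 << (2 * shift)) // max(qc, 1)
        offset = 1 << (shift - 1)
        return [(r * scale + offset) >> shift for r in res]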
本申请实施例所提供的方案,在解码端利用当前待处理块(即待解码块、变换块)的空间邻域像素信息模拟当前待处理块对应的原始像素信息。根据空间邻域像素信息,自适应地推导用于当前待处理块(即变换块)的调节因子,其反映了当前块背景区域产生的视觉掩蔽效应的强度,并基于自适应推导的调节因子来调节当前待处理块对应的残差块,实现了在视频编码或解码过程中,降低了视觉掩蔽效应较强的处理块的残差比特,提高了视觉掩蔽效应较弱的处理块的残差比特,使得实际残差的编码更符合人眼视觉感知,从而提高了编解码性能。
同时,本申请实施例在流水设计方面也体现了显著的有益效果。
首先对流水设计或流水线(pipeline)设计做一个简单介绍。流水线设计就是将组合逻辑***地分割，并在各个部分(分级)之间***寄存器以暂存中间数据的方法。目的是将一个大操作分解成若干个小操作，每一步小操作的耗时较短，所以能提高时钟频率；各小操作能并行执行，所以能提高数据吞吐率(提高处理速度)。一般的，每一个小操作被称为一个流水级。
在一种典型的硬件解码器的流水设计中,步骤S402和步骤S406分属于不同的流水级,示例性的,可以称为“反量化反变换流水级”和“重建流水级”。这两个流水级中间设有缓存。“反量化反变换流水级”不依赖于“重建流水级”产生的数据。
在本申请实施例中，由于对残差块的调节处理可以放到重建流水级中完成，如图6所示，没有破坏原有的流水线设计，并且由于调节因子的计算以及对残差的缩放处理复杂度都比较低，不会显著增加重建流水级的复杂度，提高了解码器模块间的并行度，有利于高性能解码器的实现。
图7示意性的给出了本申请实施例的一种残差获取装置的框图,包括:
一种视频解码中残差的获取装置700,包括:解析模块701,用于解析码流,以获取待处理块的变换系数;转换模块702,用于将所述变换系数转换为所述待处理块的第一残差;
计算模块703,用于根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子;调节模块704,用于基于所述调节因子调整所述第一残差,以获得所述待处理块的第二残差。
在一种可行的实施方式中,所述计算模块703还用于:基于所述待处理块的预设空间邻域内的像素值,计算所述待处理块的预设空间邻域内的像素信息。
在一种可行的实施方式中,所述计算模块703具体用于:获取所述预设空间邻域内的一个或多个像素集合;计算所述一个或多个像素集合内像素的均值和/或离散度,以获得所述预设空间邻域内的像素信息。
在一种可行的实施方式中,所述离散度包括:均方误差和,平均绝对误差和,方差或标准差。
在一种可行的实施方式中,所述计算模块703还用于:确定所述一个或多个像素集合中的每个像素集合中的全部像素已完成重构。
在一种可行的实施方式中,所述像素信息为所述均值,所述计算模块703具体用于:根据所述均值以及所述均值和所述调节因子的第一映射关系,确定所述调节因子,其中,所述第一映射关系满足如下一个或多个条件:当所述均值小于第一阈值时,所述调节因子随所述均值的增大而减小;当所述均值大于第二阈值时,所述调节因子随所述均值的增大而增大,其中,所述第一阈值小于或等于所述第二阈值;当所述均值大于或等于所述第一阈值,且小于或等于所述第二阈值时,所述调节因子为第一预设常数。
在一种可行的实施方式中,所述计算模块703具体用于:根据所述离散度以及所述离散度和所述调节因子的第二映射关系,确定所述调节因子,其中,所述第二映射关系满足如下一个或多个条件:当所述离散度大于第三阈值时,所述调节因子随所述离散度的增大而增大;当所述离散度小于或等于所述第三阈值时,所述调节因子为第二预设常数。
在一种可行的实施方式中,所述像素信息为所述均值和所述离散度,所述计算模块703具体用于:根据所述均值和所述第一映射关系确定第一参数;根据所述离散度和所述第二映射关系确定第二参数;将所述第一参数和所述第二参数的乘积或加权和作为所述调节因子。
在一种可行的实施方式中,所述计算模块703还用于:将所述调节因子进行加权调整,以获得调整后的调节因子;将所述调整后的调节因子作为所述待处理块的调节因子。
在一种可行的实施方式中,所述计算模块703还用于:根据所述待处理块的量化参数,对所述调节因子进行更新;对应的,所述调节模块704具体用于:基于所述更新后的调节因子调整所述第一残差,以得到所述待处理块的第二残差。
在一种可行的实施方式中,所述调节因子通过如下方式进行调节:
Figure PCTCN2019083848-appb-000012
其中,QC表示所述调节因子,QP表示所述量化参数,N,M,X为预设常数。
在一种可行的实施方式中,获取的所述待处理块的变换系数的个数与所述待处理块的像素点的个数相同,所述转换模块702还用于:将所述待处理块的变换系数按照预设位置关系排列为变换系数块;将所述变换系数块转换为所述待处理块的第一残差块;对应的,所述调节模块704具体用于:基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块。
在一种可行的实施方式中,所述第一残差块包括所述待处理块的亮度分量的第一亮度残差块,所述第一亮度残差块中的亮度残差像素与所述待处理块的亮度分量的像素一一对应,对应的,所述第二残差块包括所述待处理块的亮度分量的第二亮度残差块,所述调节模块704具体用于:基于所述调节因子调整所述第一亮度残差块中的亮度残差像素,以获得所述待处理块的第二亮度残差块中的亮度残差像素。
在一种可行的实施方式中,所述第二亮度残差块中的亮度残差像素通过如下方式获得:
Res2_Y(i)=(Res1_Y(i)×QC+offset_Y)>>shift_Y
其中,QC表示所述调节因子,Res1_Y(i)表示所述第一亮度残差块中的第i个亮度残差像素,Res2_Y(i)表示所述第二亮度残差块中的第i个亮度残差像素,offset_Y和shift_Y为预设常数,i为自然数。
在一种可行的实施方式中,所述第一残差块包括所述待处理块的色度分量的第一色度残差块,所述第一色度残差块中的色度残差像素与所述待处理块的色度分量的像素一一对应,对应的,所述第二残差块包括所述待处理块的色度分量的第二色度残差块,所述调节模块704具体用于:基于所述调节因子调整所述第一色度残差块中的色度残差像素,以获得所述待处理块的第二色度残差块中的色度残差像素。
在一种可行的实施方式中,所述第二色度残差块中的色度残差像素通过如下方式获得:
Res2_C(i)=(Res1_C(i)×QC+offset_C)>>shift_C
其中,QC表示所述调节因子,Res1_C(i)表示所述第一色度残差块中的第i个色度残差像素,Res2_C(i)表示所述第二色度残差块中的第i个色度残差像素,offset_C和shift_C为预设常数,i为自然数。
在一种可行的实施方式中,所述第一亮度残差块中的亮度残差像素的比特位宽精度高于所述第二亮度残差块中的亮度残差像素的比特位宽精度。
在一种可行的实施方式中,所述第一色度残差块中的色度残差像素的比特位宽精度高于所述第二色度残差块中的色度残差像素的比特位宽精度。
在一种可行的实施方式中,所述转换模块702具体用于:对所述变换系数块中的每一个变换系数进行反量化,以获得反量化后的变换系数块;对所述反量化后的变换系数块进行反变换,以获得所述待处理块的第一残差块。
在一种可行的实施方式中,所述装置700还包括:重构单元705,用于将所述第二残差中的残差像素和所述待处理块中对应位置的预测像素相加,以获得所述待处理块中所述对应位置的重构像素。
本申请实施例所提供的方案,在解码端利用当前待处理块(即待解码块、变换块)的空间邻域像素信息模拟当前待处理块对应的原始像素信息。根据空间邻域像素信息,自适应地推导用于当前待处理块(即变换块)的调节因子,其反映了当前块背景区域产生的视觉掩蔽效应的强度,并基于自适应推导的调节因子来调节当前待处理块对应的残差块,实现了在视频编码或解码过程中,降低了视觉掩蔽效应较强的处理块的残差比特,提高了视觉掩蔽效应较弱的处理块的残差比特,使得实际残差的编码更符合人眼视觉感知,从而提高了编解码性能。
同时，在本申请实施例中，由于对残差块的调节处理可以放到重建流水级中完成，没有破坏原有的流水线设计，并且由于调节因子的计算以及对残差的缩放处理复杂度都比较低，不会显著增加重建流水级的复杂度，提高了解码器模块间的并行度，有利于高性能解码器的实现。
图8为本申请实施例的视频解码设备的一种示意性框图,该设备800可以是应用于编码侧,也可以是应用于解码侧。设备800包括处理器801、存储器802,所述处理器801、存储器802相连接(如通过总线804相互连接),在可能的实施方式中,设备800还可包括收发器803,收发器803连接处理器801和存储器802,用于接收/发送数据。
存储器802包括但不限于是随机存储记忆体(random access memory,RAM)、只读存储器(read-only memory,ROM)、可擦除可编程只读存储器(erasable programmable read only memory,EPROM)、或便携式只读存储器(compact disc read-only memory,CD-ROM),该存储器802用于存储相关程序代码及视频数据。
处理器801可以是一个或多个中央处理器(central processing unit,CPU)，在处理器801是一个CPU的情况下，该CPU可以是单核CPU，也可以是多核CPU。该处理器801用于读取存储器802中存储的程序代码，执行图4所对应的实施方案及其各种可行的实施方式的操作。
示例性的,本申请实施例还提供一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行图4所对应的实施方案及其各种可行的实施方式的操作。
示例性的,本申请实施例还提供一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行图4所对应的实施方案及其各种可行的实施方式的操作。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的***、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在上述实施例中，可以全部或部分地通过软件、硬件、固件或者任意组合来实现。当使用软件实现时，可以全部或者部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令，在计算机上加载和执行所述计算机程序指令时，全部或部分地产生按照本发明实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络或其他可编程装置。所述计算机指令可存储在计算机可读存储介质中，或者从一个计算机可读存储介质向另一个计算机可读存储介质传输，例如，所述计算机指令可以从一个网络站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线)或无线(例如红外、微波等)方式向另一个网络站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质，也可以是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如软盘、硬盘、磁带等)、光介质(例如DVD等)、或者半导体介质(例如固态硬盘)等等。
在上述实施例中,对各个实施例的描述各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
以上所述,仅为本发明的具体实施方式,但本发明的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本发明揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本发明的保护范围之内。因此,本发明的保护范围应以所述权利要求的保护范围为准。

Claims (40)

  1. 一种视频解码中残差的获取方法,其特征在于,包括:
    解析码流,以获取待处理块的变换系数;
    将所述变换系数转换为所述待处理块的第一残差;
    根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子;
    基于所述调节因子调整所述第一残差,以获得所述待处理块的第二残差。
  2. 根据权利要求1所述的方法,其特征在于,在所述根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子之前,还包括:
    基于所述待处理块的预设空间邻域内的像素值,计算所述待处理块的预设空间邻域内的像素信息。
  3. 根据权利要求2所述的方法,其特征在于,所述计算所述待处理块的预设空间邻域内的像素信息,包括:
    获取所述预设空间邻域内的一个或多个像素集合;
    计算所述一个或多个像素集合内像素的均值和/或离散度,以获得所述预设空间邻域内的像素信息。
  4. 根据权利要求3所述的方法,其特征在于,所述离散度包括:均方误差和,平均绝对误差和,方差或标准差。
  5. 根据权利要求3或4所述的方法,其特征在于,在所述获取所述预设空间邻域内的一个或多个像素集合之前,还包括:
    确定所述一个或多个像素集合中的每个像素集合中的全部像素已完成重构。
  6. 根据权利要求3至5任一项所述的方法,其特征在于,所述像素信息为所述均值,所述根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子,包括:
    根据所述均值以及所述均值和所述调节因子的第一映射关系,确定所述调节因子,其中,所述第一映射关系满足如下一个或多个条件:
    当所述均值小于第一阈值时,所述调节因子随所述均值的增大而减小;
    当所述均值大于第二阈值时,所述调节因子随所述均值的增大而增大,其中,所述第一阈值小于或等于所述第二阈值;
    当所述均值大于或等于所述第一阈值,且小于或等于所述第二阈值时,所述调节因子为第一预设常数。
  7. 根据权利要求3至6任一项所述的方法,其特征在于,所述像素信息为所述离散度,所述根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子,包括:
    根据所述离散度以及所述离散度和所述调节因子的第二映射关系,确定所述调节因子,其中,所述第二映射关系满足如下一个或多个条件:
    当所述离散度大于第三阈值时,所述调节因子随所述离散度的增大而增大;
    当所述离散度小于或等于所述第三阈值时,所述调节因子为第二预设常数。
  8. 根据权利要求7所述的方法,其特征在于,所述像素信息为所述均值和所述离散度,所述根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子,包括:
    根据所述均值和所述第一映射关系确定第一参数;
    根据所述离散度和所述第二映射关系确定第二参数;
    将所述第一参数和所述第二参数的乘积或加权和作为所述调节因子。
  9. 根据权利要求8所述的方法,其特征在于,在所述将所述第一参数和所述第二参数的乘积或加权和作为所述调节因子之后,还包括:
    将所述调节因子进行加权调整,以获得调整后的调节因子;
    对应的,所述确定所述待处理块的调节因子,包括:
    将所述调整后的调节因子作为所述待处理块的调节因子。
  10. 根据权利要求1至9任一项所述的方法,其特征在于,在所述确定所述待处理块的调节因子之后,还包括:
    根据所述待处理块的量化参数,对所述调节因子进行更新;
    对应的,所述基于所述调节因子调整所述第一残差,以得到所述待处理块的第二残差,包括:
    基于所述更新后的调节因子调整所述第一残差,以得到所述待处理块的第二残差。
  11. 根据权利要求10所述的方法,其特征在于,所述调节因子通过如下方式进行调节:
    Figure PCTCN2019083848-appb-100001
    其中,QC表示所述调节因子,QP表示所述量化参数,N,M,X为预设常数。
  12. 根据权利要求1至11任一项所述的方法,其特征在于,获取的所述待处理块的变换系数的个数与所述待处理块的像素点的个数相同,在所述获取待处理块的变换系数之后,还包括:将所述待处理块的变换系数按照预设位置关系排列为变换系数块;
    对应的,所述将所述变换系数转换为所述待处理块的第一残差包括:将所述变换系数块转换为所述待处理块的第一残差块;
    对应的,所述基于所述调节因子调整所述第一残差,以获得所述待处理块的第二残差,包括:基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块。
  13. 根据权利要求12所述的方法,其特征在于,所述第一残差块包括所述待处理块的亮度分量的第一亮度残差块,所述第一亮度残差块中的亮度残差像素与所述待处理块的亮度分量的像素一一对应,对应的,所述第二残差块包括所述待处理块的亮度分量的第二亮度残差块,所述基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块,包括:
    基于所述调节因子调整所述第一亮度残差块中的亮度残差像素,以获得所述待处理块的第二亮度残差块中的亮度残差像素。
  14. 根据权利要求13所述的方法,其特征在于,所述第二亮度残差块中的亮度残差像素通过如下方式获得:
    Res2_Y(i)=(Res1_Y(i)×QC+offset_Y)>>shift_Y
    其中,QC表示所述调节因子,Res1_Y(i)表示所述第一亮度残差块中的第i个亮度残差像素,Res2_Y(i)表示所述第二亮度残差块中的第i个亮度残差像素,offset_Y和shift_Y为预设常数,i为自然数。
  15. 根据权利要求12所述的方法,其特征在于,所述第一残差块包括所述待处理块的色度分量的第一色度残差块,所述第一色度残差块中的色度残差像素与所述待处理块的色度分量的像素一一对应,对应的,所述第二残差块包括所述待处理块的色度分量的第二色度残差块,所述基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块,包括:
    基于所述调节因子调整所述第一色度残差块中的色度残差像素,以获得所述待处理块的第二色度残差块中的色度残差像素。
  16. 根据权利要求15所述的方法,其特征在于,所述第二色度残差块中的色度残差像素通过如下方式获得:
    Res2_C(i)=(Res1_C(i)×QC+offset_C)>>shift_C
    其中,QC表示所述调节因子,Res1_C(i)表示所述第一色度残差块中的第i个色度残差像素,Res2_C(i)表示所述第二色度残差块中的第i个色度残差像素,offset_C和shift_C为预设常数,i为自然数。
  17. 根据权利要求13或14所述的方法,其特征在于,所述第一亮度残差块中的亮度残差像素的比特位宽精度高于所述第二亮度残差块中的亮度残差像素的比特位宽精度。
  18. 根据权利要求15至17任一项所述的方法,其特征在于,所述第一色度残差块中的色度残差像素的比特位宽精度高于所述第二色度残差块中的色度残差像素的比特位宽精度。
  19. 根据权利要求12至18任一项所述的方法,其特征在于,所述将所述变换系数块转换为所述待处理块的第一残差块,包括:
    对所述变换系数块中的每一个变换系数进行反量化,以获得反量化后的变换系数块;
    对所述反量化后的变换系数块进行反变换,以获得所述待处理块的第一残差块。
  20. 根据权利要求1至19任一项所述的方法,其特征在于,在所述基于所述调节因子调整所述第一残差,以获得所述待处理块的第二残差之后,还包括:
    将所述第二残差中的残差像素和所述待处理块中对应位置的预测像素相加,以获得所述待处理块中所述对应位置的重构像素。
  21. 一种视频解码中残差的获取装置,其特征在于,包括:
    解析模块,用于解析码流,以获取待处理块的变换系数;
    转换模块,用于将所述变换系数转换为所述待处理块的第一残差;
    计算模块,用于根据所述待处理块的预设空间邻域内的像素信息,确定所述待处理块的调节因子;
    调节模块,用于基于所述调节因子调整所述第一残差,以获得所述待处理块的第二残差。
  22. 根据权利要求21所述的装置,其特征在于,所述计算模块还用于:
    基于所述待处理块的预设空间邻域内的像素值,计算所述待处理块的预设空间邻域内的像素信息。
  23. 根据权利要求22所述的装置,其特征在于,所述计算模块具体用于:
    获取所述预设空间邻域内的一个或多个像素集合;
    计算所述一个或多个像素集合内像素的均值和/或离散度,以获得所述预设空间邻域内的像素信息。
  24. 根据权利要求23所述的装置,其特征在于,所述离散度包括:均方误差和,平均绝对误差和,方差或标准差。
  25. 根据权利要求23或24所述的装置,其特征在于,所述计算模块还用于:
    确定所述一个或多个像素集合中的每个像素集合中的全部像素已完成重构。
  26. 根据权利要求23至25任一项所述的装置,其特征在于,所述像素信息为所述均值,所述计算模块具体用于:
    根据所述均值以及所述均值和所述调节因子的第一映射关系,确定所述调节因子,其中,所述第一映射关系满足如下一个或多个条件:
    当所述均值小于第一阈值时,所述调节因子随所述均值的增大而减小;
    当所述均值大于第二阈值时,所述调节因子随所述均值的增大而增大,其中,所述第一阈值小于或等于所述第二阈值;
    当所述均值大于或等于所述第一阈值,且小于或等于所述第二阈值时,所述调节因子为第一预设常数。
  27. 根据权利要求23至26任一项所述的装置,其特征在于,所述计算模块具体用于:
    根据所述离散度以及所述离散度和所述调节因子的第二映射关系,确定所述调节因子,其中,所述第二映射关系满足如下一个或多个条件:
    当所述离散度大于第三阈值时,所述调节因子随所述离散度的增大而增大;
    当所述离散度小于或等于所述第三阈值时,所述调节因子为第二预设常数。
  28. 根据权利要求27所述的装置,其特征在于,所述像素信息为所述均值和所述离散度,所述计算模块具体用于:
    根据所述均值和所述第一映射关系确定第一参数;
    根据所述离散度和所述第二映射关系确定第二参数;
    将所述第一参数和所述第二参数的乘积或加权和作为所述调节因子。
  29. 根据权利要求28所述的装置,其特征在于,所述计算模块还用于:
    将所述调节因子进行加权调整,以获得调整后的调节因子;
    将所述调整后的调节因子作为所述待处理块的调节因子。
  30. 根据权利要求21至29任一项所述的装置,其特征在于,所述计算模块还用于:
    根据所述待处理块的量化参数,对所述调节因子进行更新;
    对应的,所述调节模块具体用于:
    基于所述更新后的调节因子调整所述第一残差,以得到所述待处理块的第二残差。
  31. 根据权利要求30所述的装置,其特征在于,所述调节因子通过如下方式进行调节:
    Figure PCTCN2019083848-appb-100002
    其中,QC表示所述调节因子,QP表示所述量化参数,N,M,X为预设常数。
  32. 根据权利要求21至31任一项所述的装置,其特征在于,获取的所述待处理块的变换系数的个数与所述待处理块的像素点的个数相同,所述转换模块还用于:将所述待处理块的变换系数按照预设位置关系排列为变换系数块;
    将所述变换系数块转换为所述待处理块的第一残差块;
    对应的,所述调节模块具体用于:基于所述调节因子调整所述第一残差块,以获得所述待处理块的第二残差块。
  33. 根据权利要求32所述的装置，其特征在于，所述第一残差块包括所述待处理块的亮度分量的第一亮度残差块，所述第一亮度残差块中的亮度残差像素与所述待处理块的亮度分量的像素一一对应，对应的，所述第二残差块包括所述待处理块的亮度分量的第二亮度残差块，所述调节模块具体用于：
    基于所述调节因子调整所述第一亮度残差块中的亮度残差像素,以获得所述待处理块的第二亮度残差块中的亮度残差像素。
  34. 根据权利要求33所述的装置,其特征在于,所述第二亮度残差块中的亮度残差像素通过如下方式获得:
    Res2_Y(i)=(Res1_Y(i)×QC+offset_Y)>>shift_Y
    其中,QC表示所述调节因子,Res1_Y(i)表示所述第一亮度残差块中的第i个亮度残差像素,Res2_Y(i)表示所述第二亮度残差块中的第i个亮度残差像素,offset_Y和shift_Y为预设常数,i为自然数。
  35. 根据权利要求32所述的装置,其特征在于,所述第一残差块包括所述待处理块的色度分量的第一色度残差块,所述第一色度残差块中的色度残差像素与所述待处理块的色度分量的像素一一对应,对应的,所述第二残差块包括所述待处理块的色度分量的第二色度残差块,所述调节模块具体用于:
    基于所述调节因子调整所述第一色度残差块中的色度残差像素,以获得所述待处理块的第二色度残差块中的色度残差像素。
  36. 根据权利要求35所述的装置,其特征在于,所述第二色度残差块中的色度残差像素通过如下方式获得:
    Res2_C(i)=(Res1_C(i)×QC+offset_C)>>shift_C
    其中,QC表示所述调节因子,Res1_C(i)表示所述第一色度残差块中的第i个色度残差像素,Res2_C(i)表示所述第二色度残差块中的第i个色度残差像素,offset_C和shift_C为预设常数,i为自然数。
  37. 根据权利要求33或34所述的装置,其特征在于,所述第一亮度残差块中的亮度残差像素的比特位宽精度高于所述第二亮度残差块中的亮度残差像素的比特位宽精度。
  38. 根据权利要求35至37任一项所述的装置,其特征在于,所述第一色度残差块中的色度残差像素的比特位宽精度高于所述第二色度残差块中的色度残差像素的比特位宽精度。
  39. 根据权利要求32至38任一项所述的装置,其特征在于,所述转换模块具体用于:
    对所述变换系数块中的每一个变换系数进行反量化,以获得反量化后的变换系数块;
    对所述反量化后的变换系数块进行反变换,以获得所述待处理块的第一残差块。
  40. 根据权利要求21至39任一项所述的装置,其特征在于,还包括:
    重构单元,用于将所述第二残差中的残差像素和所述待处理块中对应位置的预测像素相加,以获得所述待处理块中所述对应位置的重构像素。
PCT/CN2019/083848 2018-05-24 2019-04-23 视频数据解码方法及装置 WO2019223480A1 (zh)



Also Published As

Publication number Publication date
CN110536133B (zh) 2021-11-19
CN110536133A (zh) 2019-12-03

