WO2022257130A1 - Encoding and decoding method, code stream, encoder, decoder, system and storage medium - Google Patents

Encoding and decoding method, code stream, encoder, decoder, system and storage medium

Info

Publication number
WO2022257130A1
WO2022257130A1 · PCT/CN2021/099813 · CN2021099813W
Authority
WO
WIPO (PCT)
Prior art keywords
network model
current block
loop filter
candidate
chroma
Prior art date
Application number
PCT/CN2021/099813
Other languages
English (en)
French (fr)
Inventor
戴震宇
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority to CN202180099002.9A: CN117461316A
Priority to EP21944637.4A: EP4354873A1
Priority to PCT/CN2021/099813: WO2022257130A1
Publication of WO2022257130A1
Priority to US18/529,318: US20240107073A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117Filters, e.g. for pre-processing or post-processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop

Definitions

  • The embodiments of the present application relate to the technical field of image processing, and in particular to an encoding and decoding method, code stream, encoder, decoder, system and storage medium.
  • Loop filters are used to improve the subjective and objective quality of reconstructed images.
  • In the loop filtering part, although there are some model selection schemes, most of them select the better-performing model by calculating the rate-distortion cost of each model, which has high complexity; moreover, for the selected model, it is still necessary to decide through the rate-distortion cost whether to turn the model switch on, and to write frame-level, block-level and other switch information into the code stream, resulting in additional bit overhead.
  • Embodiments of the present application provide an encoding and decoding method, code stream, encoder, decoder, system, and storage medium, which can not only reduce complexity, but also avoid additional bit overhead, improve encoding performance, and further improve encoding and decoding efficiency.
  • the embodiment of the present application provides a decoding method applied to a decoder, and the method includes:
  • At least two output values are determined according to the preset selection network model of the current block, where the at least two output values include a first value corresponding to each of at least one candidate loop filter network model when the current block uses a loop filter network model, and a second value for the case where the current block does not use a loop filter network model;
  • according to the at least two output values, the target loop filter network model is determined when the current block uses a loop filter network model, or it is determined that the current block does not use a loop filter network model; when the current block uses a loop filter network model, the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block.
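The decision step in the bullets above can be sketched as follows. This is a minimal illustration under assumptions, not the patent's exact rule: the patent does not fix how the output values are compared, so the function name and the take-the-maximum comparison here are hypothetical.

```python
def decide_target_model(output_values):
    """Pick the target loop filter network model from the selection
    network's output values.

    output_values lists one first value per candidate loop filter
    network model, followed by the second value for the case where the
    current block does not use any loop filter network model (last entry).
    """
    best = max(range(len(output_values)), key=lambda i: output_values[i])
    if best == len(output_values) - 1:
        return None  # current block does not use a loop filter network model
    return best      # index of the target loop filter network model

# Two candidate models plus the "no filter" option.
assert decide_target_model([0.15, 0.70, 0.15]) == 1   # candidate model 1 wins
assert decide_target_model([0.10, 0.20, 0.70]) is None
```

Because both encoder and decoder run the same selection network and the same comparison, no switch flag or model index needs to be written into the code stream.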
  • the embodiment of the present application provides an encoding method applied to an encoder, and the method includes:
  • At least two output values are determined according to the preset selection network model of the current block, where the at least two output values include a first value corresponding to each of at least one candidate loop filter network model when the current block uses a loop filter network model, and a second value for the case where the current block does not use a loop filter network model;
  • according to the at least two output values, the target loop filter network model is determined when the current block uses a loop filter network model, or it is determined that the current block does not use a loop filter network model; when the current block uses a loop filter network model, the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block.
  • The embodiment of the present application provides a code stream, which is generated by performing bit coding on information to be encoded, where the information to be encoded includes a value of first syntax element identification information, and the first syntax element identification information is used to indicate whether the current block is allowed to use a preset selection network model for model decision-making.
  • an embodiment of the present application provides an encoder, the encoder includes a first determination unit, a first decision-making unit, and a first filtering unit; wherein,
  • the first determining unit is configured to determine the value of the first syntax element identification information
  • the first decision-making unit is configured to: when the first syntax element identification information indicates that the current block is allowed to use the preset selection network model for model decision-making, determine at least two output values according to the preset selection network model of the current block, where the at least two output values include a first value corresponding to each of at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for the case where the current block does not use a loop filter network model; and, according to the at least two output values, determine the target loop filter network model when the current block uses a loop filter network model, or determine that the current block does not use a loop filter network model;
  • the first filtering unit is configured to use the target loop filtering network model to filter the current block to obtain a first reconstructed image block of the current block when the current block uses the loop filtering network model.
  • the embodiment of the present application provides an encoder, where the encoder includes a first memory and a first processor; wherein,
  • a first memory for storing a computer program capable of running on the first processor
  • the first processor is configured to execute the method as described in the second aspect when running the computer program.
  • the embodiment of the present application provides a decoder, which includes an analysis unit, a second decision-making unit, and a second filtering unit; wherein,
  • the parsing unit is configured to parse the code stream and determine the value of the first syntax element identification information
  • the second decision-making unit is configured to: when the first syntax element identification information indicates that the current block is allowed to use the preset selection network model for model decision-making, determine at least two output values according to the preset selection network model of the current block, where the at least two output values include a first value corresponding to each of at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for the case where the current block does not use a loop filter network model; and, according to the at least two output values, determine the target loop filter network model when the current block uses a loop filter network model, or determine that the current block does not use a loop filter network model;
  • the second filtering unit is configured to use the target loop filtering network model to filter the current block to obtain the first reconstructed image block of the current block when the current block uses the loop filtering network model.
  • the embodiment of the present application provides a decoder, where the decoder includes a second memory and a second processor; wherein,
  • a second memory for storing a computer program capable of running on the second processor
  • the second processor is configured to execute the method as described in the first aspect when running the computer program.
  • an embodiment of the present application provides a codec system, the codec system includes the encoder according to the fourth aspect or the fifth aspect and the decoder according to the sixth aspect or the seventh aspect.
  • The embodiment of the present application provides a computer storage medium, which stores a computer program; when the computer program is executed, the method described in the first aspect or the method described in the second aspect is implemented.
  • the embodiment of the present application provides a codec method, code stream, encoder, decoder, system, and storage medium.
  • In these embodiments, the value of the first syntax element identification information is determined; when the first syntax element identification information indicates that the current block is allowed to use the preset selection network model for model decision-making, at least two output values are determined according to the preset selection network model of the current block, where the at least two output values include a first value corresponding to each of at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for the case where the current block does not use a loop filter network model; according to the at least two output values, the target loop filter network model is determined when the current block uses a loop filter network model, or it is determined that the current block does not use a loop filter network model; when the current block uses a loop filter network model, the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block.
  • The preset selection network model determines at least two output values, where the at least two output values include the respective first values of at least one candidate loop filter network model when the current block uses a loop filter network model and the second value for the case where the current block does not use a loop filter network model; according to the at least two output values, the target loop filter network model is determined when the current block uses a loop filter network model, or it is determined that the current block does not use a loop filter network model; when the current block uses a loop filter network model, the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block.
  • In this way, the target loop filter network model is determined when the current block uses a loop filter network model, or it is determined that the current block does not use a loop filter network model; if the current block uses a loop filter network model, the target loop filter network model can then be used to filter the current block, which not only reduces complexity but also avoids additional bit overhead and improves coding performance.
  • Thus the encoding and decoding efficiency can be improved; in addition, the first reconstructed image block finally output can be made closer to the original image block, improving the video image quality.
  • FIG. 1 is a schematic diagram of the application of a coding framework provided by the embodiment of the present application
  • FIG. 2 is a schematic diagram of the application of another encoding framework provided by the embodiment of the present application.
  • FIG. 3A is a schematic diagram of a detailed framework of a video coding system provided by an embodiment of the present application.
  • FIG. 3B is a detailed schematic diagram of a video decoding system provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a decoding method provided in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of the application of another encoding framework provided by the embodiment of the present application.
  • FIG. 6A is a schematic diagram of the network structure composition of a luma loop filter network model provided by the embodiment of the present application.
  • FIG. 6B is a schematic diagram of the network structure composition of a chroma loop filter network model provided by the embodiment of the present application.
  • FIG. 7A is a schematic diagram of the network structure composition of another luma loop filter network model provided by the embodiment of the present application.
  • FIG. 7B is a schematic diagram of the network structure composition of another chroma loop filter network model provided by the embodiment of the present application.
  • FIG. 8 is a schematic diagram of a network structure composition of a residual block provided by an embodiment of the present application.
  • FIG. 9A is a schematic diagram of the composition and structure of a preset selection network model provided by the embodiment of the present application.
  • FIG. 9B is a schematic diagram of the composition and structure of another preset selection network model provided by the embodiment of the present application.
  • FIG. 10 is a schematic diagram of the overall framework of a network model based on preset selection provided by the embodiment of the present application.
  • FIG. 11 is a schematic flowchart of another decoding method provided by the embodiment of the present application.
  • FIG. 12 is a schematic flowchart of an encoding method provided in an embodiment of the present application.
  • FIG. 13 is a schematic diagram of the composition and structure of an encoder provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of a specific hardware structure of an encoder provided in an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of a decoder provided in an embodiment of the present application.
  • FIG. 16 is a schematic diagram of a specific hardware structure of a decoder provided in an embodiment of the present application.
  • FIG. 17 is a schematic diagram of the composition and structure of an encoding and decoding system provided by an embodiment of the present application.
  • References to "some embodiments" describe a subset of all possible embodiments; it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other where there is no conflict.
  • The terms "first", "second" and "third" involved in the embodiments of the present application are only used to distinguish similar objects and do not represent a specific ordering of objects. Understandably, the specific order or sequence of "first", "second" and "third" can be interchanged where allowed, so that the embodiments of the application described herein can be implemented in an order other than that illustrated or described herein.
  • JVET Joint Video Experts Team
  • VVC Versatile Video Coding
  • VTM VVC Test Model, VVC's reference software test platform
  • AVS Audio Video Coding Standard
  • HPM High-Performance Model
  • HPM-ModAI High Performance-Modular Artificial Intelligence Model
  • DBF Deblocking Filter
  • SAO Sample Adaptive Offset
  • ALF Adaptive Loop Filter
  • QP Quantization Parameter
  • Digital video compression technology mainly compresses huge amounts of digital image and video data, so as to facilitate transmission and storage.
  • Although the existing digital video compression standards can already save a lot of video data, it is still necessary to pursue better digital video compression technology to reduce the amount of digital video data.
  • In the process of digital video encoding, the encoder reads different numbers of pixels, including luminance and chrominance components, for original video sequences in different color formats; that is, the encoder reads a black-and-white or color image, then divides the image into blocks and passes the block data to the encoder for encoding.
  • The encoder usually uses a hybrid frame coding mode, which generally includes operations such as intra prediction and inter prediction, transform/quantization, inverse quantization/inverse transform, loop filtering, and entropy coding; the specific processing flow can be seen in FIG. 1.
  • Intra-frame prediction only refers to information of the same frame image and predicts pixel information within the current divided block, in order to eliminate spatial redundancy;
  • inter-frame prediction can include motion estimation and motion compensation, which can refer to image information of different frames; motion estimation searches for the motion vector information that best matches the current divided block, in order to eliminate temporal redundancy;
  • transform converts the predicted image block into the frequency domain and redistributes its energy; combined with quantization, it can remove information that the human eye is not sensitive to, in order to eliminate visual redundancy;
  • entropy coding can eliminate character redundancy according to the current context model and the probability information of the binary code stream;
  • loop filtering mainly processes the inversely transformed and inversely quantized pixels to compensate for distortion information, providing a better reference for subsequently encoded pixels.
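The hybrid coding steps listed above (prediction, quantization of the residual, inverse operations, reconstruction) can be illustrated with a deliberately simplified scalar sketch. All names here are hypothetical, and the transform stage is omitted for brevity; this is not the codec's actual arithmetic.

```python
def encode_block_step(block, prediction, qstep):
    """Toy version of one hybrid-coding step: form the residual against
    the prediction, quantize it, and reconstruct exactly the way the
    decoder will (so encoder and decoder stay in sync)."""
    residual = [x - p for x, p in zip(block, prediction)]
    levels = [round(r / qstep) for r in residual]                  # quantization
    reconstructed = [p + lv * qstep for p, lv in zip(prediction, levels)]
    return levels, reconstructed

block = [105, 101, 99, 100]
prediction = [100, 100, 100, 100]
levels, rec = encode_block_step(block, prediction, qstep=4)
assert levels == [1, 0, 0, 0]        # small residuals quantize to zero
assert rec == [104, 100, 100, 100]   # reconstruction, used as later reference
```

The loop filter described in the surrounding text operates on `rec`-like reconstructed samples to reduce the distortion that quantization introduced.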
  • The traditional loop filtering module mainly includes a deblocking filter (hereinafter referred to as DBF), a sample adaptive offset filter (hereinafter referred to as SAO) and an adaptive loop filter (hereinafter referred to as ALF).
  • DBF deblocking filter
  • SAO sample adaptive offset filter
  • ALF adaptive loop filter
  • the existing neural network loop filter technology often trains a variety of candidate models for characteristics such as frame type, QP, and color component type.
  • During encoding, either a model is selected artificially and switch information such as frame-level and CTU-level flags is encoded into the code stream, or the model with better performance is selected by calculating the rate-distortion cost of each model and the model index serial number is written into the code stream.
  • A model adaptive selection technology scheme based on deep learning can be proposed, which can optimize the model selection operation of the neural network loop filter; but for the selected model, it is still necessary to decide through the rate-distortion cost method whether to turn the model switch on, and to write switch information such as frame-level and CTU-level flags into the code stream, resulting in additional bit overhead.
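For contrast, the baseline rate-distortion decision that this scheme seeks to avoid can be sketched as minimizing the cost J = D + λ·R over every candidate (and the "off" case). The function name, candidate names and numbers below are illustrative only; the point is that every candidate must be evaluated, and the winning choice must still be signalled in the code stream.

```python
def rd_select(candidates, lam):
    """Baseline model selection by rate-distortion cost J = D + lambda * R.

    candidates: list of (name, distortion, rate_bits) tuples; the tuple
    with minimal cost wins. Evaluating every candidate this way is what
    makes the baseline expensive."""
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]

candidates = [("model_0", 120.0, 8), ("model_1", 100.0, 16), ("off", 150.0, 1)]
assert rd_select(candidates, lam=1.0) == "model_1"   # 116 < 128 < 151
assert rd_select(candidates, lam=10.0) == "off"      # high lambda penalizes rate
```

The preset selection network replaces this exhaustive cost evaluation with a single forward pass, and because the decoder can run the same pass, no switch or index bits are needed.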
  • An embodiment of the present application provides an encoding method.
  • In the encoding method, the value of the first syntax element identification information is determined; when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, at least two output values are determined according to the preset selection network model of the current block, where the at least two output values include a first value corresponding to each of at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for the case where the current block does not use a loop filter network model; according to the at least two output values, the target loop filter network model is determined when the current block uses a loop filter network model, or it is determined that the current block does not use a loop filter network model; when the current block uses a loop filter network model, the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block.
  • the embodiment of the present application also provides a decoding method.
  • The code stream is parsed to determine the value of the first syntax element identification information; when the first syntax element identification information indicates that the current block is allowed to use the preset selection network model for model decision-making, at least two output values are determined according to the preset selection network model of the current block, where the at least two output values include a first value corresponding to each of at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for the case where the current block does not use a loop filter network model;
  • according to the at least two output values, the target loop filter network model is determined when the current block uses a loop filter network model, or it is determined that the current block does not use a loop filter network model; when the current block uses a loop filter network model, the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block.
  • In this way, the target loop filter network model is determined when the current block uses a loop filter network model, or it is determined that the current block does not use a loop filter network model; if the current block uses a loop filter network model, the target loop filter network model can then be used to filter the current block, which not only reduces complexity but also avoids additional bit overhead and improves coding performance.
  • Thus the encoding and decoding efficiency can be improved; in addition, the first reconstructed image block finally output can be made closer to the original image block, improving the video image quality.
  • The video coding system 10 includes a transform and quantization unit 101, an intra-frame estimation unit 102, an intra-frame prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, a decoded image buffer unit 110, and so on, where the filtering unit 108 can implement DBF filtering/SAO filtering/ALF filtering, and the encoding unit 109 can implement header information encoding and Context-based Adaptive Binary Arithmetic Coding (CABAC).
  • filtering unit 108 can implement DBF filtering/SAO filtering/ALF filtering
  • the encoding unit 109 can implement header information encoding and Context-based Adaptive Binary Arithmetic Coding (CABAC).
  • A video coding block can be obtained by dividing a coding tree unit (Coding Tree Unit, CTU), and then the residual pixel information obtained after intra-frame or inter-frame prediction is transformed by the transform and quantization unit 101, including transforming the residual information from the pixel domain to the transform domain, and the obtained transform coefficients are quantized to further reduce the bit rate;
  • the intra-frame estimation unit 102 and the intra-frame prediction unit 103 are used to perform intra-frame prediction on the video coding block; specifically, they are used to determine the intra-frame prediction mode to be used to encode the video coding block;
  • the motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-frame predictive encoding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information;
  • the motion estimation performed by the motion estimation unit 105 is the process of generating motion vectors, and the motion vectors can estimate the motion of the video coding block;
  • the context content can be based on adjacent coding blocks and can be used to encode the information indicating the determined intra-frame prediction mode, outputting the code stream of the video signal; the decoded image buffer unit 110 is used to store reconstructed video coding blocks for prediction reference. As video image encoding progresses, new reconstructed video coding blocks are continuously generated, and these reconstructed video coding blocks are stored in the decoded image buffer unit 110.
  • the video decoding system 20 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra prediction unit 203, a motion compensation unit 204, a filtering unit 205, and a decoded image buffer unit 206, etc., wherein the decoding unit 201 can implement header information decoding and CABAC decoding, and filtering unit 205 can implement DBF filtering/SAO filtering/ALF filtering.
  • the decoding unit 201 can implement header information decoding and CABAC decoding
  • filtering unit 205 can implement DBF filtering/SAO filtering/ALF filtering.
  • After the code stream of the video signal is output, the code stream is input into the video decoding system 20 and first passes through the decoding unit 201 to obtain decoded transform coefficients; the transform coefficients are processed by the inverse transform and inverse quantization unit 202 to generate a residual block in the pixel domain; the intra prediction unit 203 is operable to generate prediction data for the current video decoding block based on the determined intra prediction mode and data from previously decoded blocks of the current frame or picture;
  • the motion compensation unit 204 determines prediction information for the video decoding block by parsing motion vectors and other associated syntax elements, and uses the prediction information to generate the predictive block of the video decoding block being decoded; a decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 202 with the corresponding predictive block produced by the intra prediction unit 203 or the motion compensation unit 204; the decoded video signal passes through the filtering unit 205 in order to remove blocking artifacts and improve video quality; the decoded video blocks are then stored in the decoded picture buffer unit 206, which stores reference pictures for subsequent intra prediction or motion compensation and is also used for output of the video signal, that is, the restored original video signal is obtained.
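The decoder-side reconstruction just described (residual plus prediction, then clipping to the sample range, then buffering for reference) can be sketched as follows; the function and variable names are illustrative, not the patent's.

```python
def reconstruct_block(residual, prediction, bit_depth=8):
    """Sum the residual and prediction samples, then clip to the legal
    sample range, as a decoder does before in-loop filtering."""
    lo, hi = 0, (1 << bit_depth) - 1
    return [min(hi, max(lo, r + p)) for r, p in zip(residual, prediction)]

decoded_picture_buffer = []  # stores reference pictures for later prediction
rec = reconstruct_block([300, -5, 0], [10, 2, 128])
assert rec == [255, 0, 128]  # out-of-range sums are clipped to [0, 255]
decoded_picture_buffer.append(rec)
```

In the real decoder the clipped block would additionally pass through the filtering unit (DBF/SAO/ALF, and optionally the loop filter network model) before entering the buffer.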
  • The method provided in the embodiment of the present application can be applied to the filtering unit 108 as shown in FIG. 3A (indicated by a bold black box), and can also be applied to the filtering unit 205 as shown in FIG. 3B (indicated by a bold black box). That is to say, the method in the embodiment of the present application can be applied not only to a video encoding system (referred to as an "encoder" for short), but also to a video decoding system (referred to as a "decoder" for short), and can even be applied to both a video encoding system and a video decoding system; there is no limitation here.
  • When the embodiment of the present application is applied to an encoder, the "current block" specifically refers to the block currently to be encoded in the video image (also referred to as a "coding block" for short); when the embodiment of the present application is applied to a decoder, the "current block" specifically refers to the block currently to be decoded in the video image (also referred to as a "decoding block" for short).
  • FIG. 4 shows a schematic flowchart of a decoding method provided in an embodiment of the present application.
  • the method may include:
  • S401: Parse the code stream, and determine the value of the first syntax element identification information.
  • In a video image, each decoding block may include a first image component, a second image component and a third image component; the current block is the decoding block on which loop filtering of the first image component, the second image component or the third image component is currently to be performed, and this decoding block may be a CTU, or a CU, or even a block smaller than a CU, which is not limited in this embodiment of the present application.
  • The embodiment of the present application can divide these image components into two types of color components, namely the luminance component and the chrominance component.
  • If the current block performs luminance component prediction, inverse transform and inverse quantization, loop filtering and other operations, the current block can also be called a luma block; or, if the current block performs chrominance component prediction, inverse transform and inverse quantization, loop filtering and other operations, the current block may also be called a chroma block.
  • The embodiment of the present application specifically provides a loop filtering method, especially an adaptive decision-making method based on a deep learning loop filtering network model, which is applied to a part such as the filtering unit 205 shown in FIG. 3B.
  • The filtering unit 205 may include a deblocking filter (DBF), a sample adaptive offset filter (SAO), a residual-neural-network-based loop filter (CNNLF) and an adaptive loop filter (ALF).
  • The CNNLF model in the filtering unit 205 can be adaptively determined by using the method described in the embodiment of the present application, so as to determine the target model when the current block uses a CNNLF model, or to determine that the current block does not use a CNNLF model.
  • the embodiment of the present application proposes a model adaptive decision-making module based on deep learning, which is used to make an adaptive decision on whether to use a loop filtering network model (such as a CNNLF model) and improve coding performance.
  • The loop filter can also include a Model Adaptive Decision (MAD) module, and the model adaptive decision-making module is located between the SAO filter and the CNNLF filter.
  • The use of the model adaptive decision-making module does not depend on the flag bits of DBF, SAO, CNNLF and ALF; it is simply placed before CNNLF.
  • The model adaptive decision-making module can be regarded as a preset selection network model composed of a multi-layer convolutional neural network and a multi-layer fully connected neural network, used to decide whether the current block uses a CNNLF model; specifically, this means determining the target model when the current block uses a CNNLF model, or determining that the current block does not use a CNNLF model.
  • First syntax element identification information can be set, and the decision is then made according to the value of the first syntax element identification information obtained through decoding.
  • the method may also include:
  • If the value of the first syntax element identification information is the first identification value, it is determined that the first syntax element identification information indicates that the current block is allowed to use the preset selection network model for model decision-making; or, if the value of the first syntax element identification information is the second identification value, it is determined that the first syntax element identification information indicates that the current block is not allowed to use the preset selection network model for model decision-making.
  • The first identification value and the second identification value are different, and they may be in parameter form or in digital form.
  • The first syntax element identification information may be a parameter written in a profile, or the value of a flag, which is not limited in this embodiment of the present application.
  • For the two kinds of identification values: the first identification value may be set to 1 and the second identification value to 0; or the first identification value may be set to true and the second to false; or the first identification value may be set to 0 and the second to 1; or the first identification value may be set to false and the second to true.
  • For a flag, generally the first identification value may be 1 and the second identification value may be 0, but there is no limitation thereto.
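  • As an illustrative sketch (not part of the standard text), the mapping from the decoded flag value to the decision of whether model adaptive decision-making is allowed might look as follows in Python; the function name and the default identification values (1 for enabled, 0 for disabled) are assumptions based on the description above:

```python
def model_adaptive_decision_enabled(flag_value, first_id=1, second_id=0):
    """Interpret model_adaptive_decision_enable_flag for the current block.

    first_id / second_id correspond to the "first identification value"
    and "second identification value" described above (commonly 1 and 0).
    """
    if flag_value == first_id:
        return True   # preset selection network model may be used
    if flag_value == second_id:
        return False  # model decision-making is not allowed
    raise ValueError("unexpected identification value: %r" % flag_value)
```

The same sketch works unchanged if the identification values are swapped (first_id=0, second_id=1), matching the alternative assignments described above.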
  • Since the preset selection network model can be regarded as a neural network, the first syntax element identification information can be regarded as an enabling flag for model adaptive decision-making based on the neural network, which can be denoted by model_adaptive_decision_enable_flag here.
  • model_adaptive_decision_enable_flag can be used to indicate whether the current block is allowed to use the preset selection network model for model adaptive decision-making.
  • The preset selection network model can be selected from several candidate preset selection network models according to the color component type, the quantization parameter, and the frame type of the frame to which the current block belongs.
  • Here, the at least two output values may include a first value corresponding to each of at least one candidate loop filter network model when the current block uses a loop filter network model, and a second value when the current block does not use a loop filter network model. The first values may be used to reflect the probability distribution over the at least one candidate loop filter network model when the current block uses a loop filter network model, and the second value may be used to reflect the probability that the current block does not use a loop filter network model.
  • Both the first values and the second value may be represented by probability values; that is, the at least two output values determined according to the preset selection network model may be at least two probability values.
  • In addition, the first values and the second value may also be used to reflect the weight distribution over the at least one candidate loop filter network model when the current block uses a loop filter network model and over the case where the current block does not use a loop filter network model; that is, the first values and the second value may also be referred to as weight values, which is not limited in this embodiment of the present application.
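  • To make the role of the first and second values concrete, the following hedged Python sketch picks the target option from the output values (whether probabilities or weights): each candidate loop filter network model contributes one first value, and the final entry is the second value for not using any model. The function and label names are illustrative assumptions, not terms from the standard text:

```python
def decide_target_model(first_values, second_value, model_ids):
    """Return the chosen candidate model id, or None for "model off".

    first_values: one output value per candidate loop filter network model.
    second_value: the output value for not using a loop filter network model.
    """
    outputs = list(first_values) + [second_value]
    labels = list(model_ids) + [None]
    # Choose the option with the largest probability/weight value.
    best_index = max(range(len(outputs)), key=lambda i: outputs[i])
    return labels[best_index]
```

If the second value dominates, the sketch returns None, i.e. the current block would not use any loop filter network model.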
  • For different color component types, the preset selection network models here are different.
  • the preset selection network model corresponding to the luma component may be called a luma selection network model
  • the preset selection network model corresponding to the chroma component may be called a chroma selection network model. Therefore, in some embodiments, the determining the preset selection network model of the current block may include:
  • if the color component type of the current block is the luminance component (that is, the current block is a luma block), determining the luma selection network model of the current block; or, if the color component type of the current block is the chrominance component (that is, the current block is a chroma block), determining the chroma selection network model of the current block.
  • For different color component types, the candidate loop filter network models are also different.
  • One or more candidate loop filter network models corresponding to the luma component may be referred to as candidate luma loop filter network models, and one or more candidate loop filter network models corresponding to the chroma component may be referred to as candidate chroma loop filter network models. Therefore, in some embodiments, the determining at least two output values according to the preset selection network model of the current block may include:
  • if the color component type of the current block is a brightness component, at least two brightness output values are determined according to the brightness selection network model; wherein the at least two brightness output values include a first value corresponding to each of at least one candidate brightness loop filter network model when the current block uses a brightness loop filter network model, and a second value when the current block does not use a brightness loop filter network model; or,
  • if the color component type of the current block is a chroma component, at least two chroma output values are determined according to the chroma selection network model; wherein the at least two chroma output values include a first value corresponding to each of at least one candidate chroma loop filter network model when the current block uses a chroma loop filter network model, and a second value when the current block does not use a chroma loop filter network model.
  • Here, the color component type may include a luma component and a chrominance component.
  • If the color component type of the current block is the brightness component, the brightness selection network model of the current block needs to be determined; then, according to the brightness selection network model, the probability that the current block does not use a brightness loop filter model can be determined, and the probability corresponding to each of the at least one candidate brightness loop filter network model when the current block uses a brightness loop filter model can also be determined. The same applies when the color component type of the current block is the chroma component.
  • Regarding the frame type, it may include I frames, P frames and B frames.
  • The I frame is an intra-coded picture frame (Intra-coded Picture); an I frame is a key frame, which can be understood as a complete retention of this frame. The P frame is a forward predictive coded picture frame (Predictive-coded Picture); a P frame records the difference between this frame and the previous key frame (I frame). The B frame is a bidirectionally predicted picture frame (Bidirectionally predicted picture); a B frame is a bidirectional difference frame, that is, it records the differences between the current frame and both the previous frame and the subsequent frame.
  • the frame type may include the first type and the second type.
  • For different frame types, the preset selection network models here are also different.
  • For example, the first type may be an I frame, and the second type may be a non-I frame; it should be noted that there is no specific limitation here.
  • The brightness selection network model corresponding to the first type may be called the first brightness selection network model, and the brightness selection network model corresponding to the second type may be called the second brightness selection network model. Therefore, in some embodiments, when the color component type of the current block is a brightness component, the determining the brightness selection network model of the current block may include:
  • if the frame type of the frame to which the current block belongs is the first type, determining the first brightness selection network model of the current block; or, if the frame type of the frame to which the current block belongs is the second type, determining the second brightness selection network model of the current block.
  • the candidate luma loop filter network models are also different according to different frame types.
  • One or more candidate brightness loop filter network models corresponding to the first type may be referred to as candidate first brightness loop filter network models, and one or more candidate brightness loop filter network models corresponding to the second type may be referred to as candidate second brightness loop filter network models. Therefore, in some embodiments, the determining at least two brightness output values according to the brightness selection network model may include:
  • if the frame type of the frame to which the current block belongs is the first type, at least two brightness output values are determined according to the first brightness selection network model; wherein the at least two brightness output values include a first value corresponding to each of at least one candidate first luma loop filter network model when the current block uses a first luma loop filter network model, and a second value when the current block does not use a first luma loop filter network model; or,
  • if the frame type of the frame to which the current block belongs is the second type, at least two brightness output values are determined according to the second brightness selection network model; wherein the at least two brightness output values include a first value corresponding to each of at least one candidate second luma loop filter network model when the current block uses a second luma loop filter network model, and a second value when the current block does not use a second luma loop filter network model.
  • For the candidate loop filter network models corresponding to the luminance component (which may be simply referred to as "candidate luminance loop filter network models"), whether the at least one candidate first luminance loop filter network model corresponding to the first type or the at least one candidate second luminance loop filter network model corresponding to the second type, these candidate loop filter network models are obtained through model training.
  • the method may also include:
  • determining a first training set; wherein the first training set includes at least one first training sample and at least one second training sample, the frame type of the first training sample is the first type, the frame type of the second training sample is the second type, and both the first training samples and the second training samples are obtained according to at least one quantization parameter; training the first neural network structure using the luminance component of the at least one first training sample to obtain at least one candidate first luminance loop filter network model; and training the first neural network structure using the luminance component of the at least one second training sample to obtain at least one candidate second luminance loop filter network model.
  • the first neural network structure includes at least one of the following: a convolutional layer, an activation layer, a residual block, and a skip connection layer.
  • In other words, the at least one candidate first luma loop filter network model and the at least one candidate second luma loop filter network model are determined by performing model training on the first neural network structure according to at least one training sample, and they have a corresponding relationship with the frame type, the color component type and the quantization parameter.
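  • The corresponding relationship between trained candidate models and (frame type, color component type, quantization parameter) can be illustrated with a simple lookup sketch; the table contents, key names and the closest-QP matching rule below are hypothetical, chosen only to show how such a correspondence might be queried:

```python
# Hypothetical model table keyed by (frame_type, component, training_qp).
CANDIDATE_MODELS = {
    ("I",     "luma", 27): "luma_I_qp27",
    ("I",     "luma", 32): "luma_I_qp32",
    ("non-I", "luma", 27): "luma_nonI_qp27",
    ("non-I", "luma", 32): "luma_nonI_qp32",
}

def select_model(frame_type, component, qp, table=CANDIDATE_MODELS):
    """Pick the candidate loop filter network model whose training QP is
    closest to the QP of the current block (an assumed matching rule)."""
    keys = [k for k in table if k[0] == frame_type and k[1] == component]
    if not keys:
        raise KeyError("no candidate model for %s/%s" % (frame_type, component))
    best = min(keys, key=lambda k: abs(k[2] - qp))
    return table[best]
```

Chroma models would be looked up the same way with component set to "chroma".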
  • Similarly, the chroma selection network model corresponding to the first type may be called the first chroma selection network model, and the chroma selection network model corresponding to the second type may be called the second chroma selection network model. Therefore, in some embodiments, when the color component type of the current block is a chroma component, the determining the chroma selection network model of the current block may include:
  • if the frame type of the frame to which the current block belongs is the first type, determining the first chroma selection network model of the current block; or, if the frame type of the frame to which the current block belongs is the second type, determining the second chroma selection network model of the current block.
  • the candidate chroma loop filter network models are also different according to different frame types.
  • One or more candidate chroma loop filter network models corresponding to the first type may be referred to as candidate first chroma loop filter network models, and one or more candidate chroma loop filter network models corresponding to the second type may be referred to as candidate second chroma loop filter network models. Therefore, in some embodiments, the determining at least two chroma output values according to the chroma selection network model may include:
  • if the frame type of the frame to which the current block belongs is the first type, at least two chroma output values are determined according to the first chroma selection network model; wherein the at least two chroma output values include a first value corresponding to each of at least one candidate first chroma loop filter network model when the current block uses a first chroma loop filter network model, and a second value when the current block does not use a first chroma loop filter network model; or,
  • if the frame type of the frame to which the current block belongs is the second type, at least two chroma output values are determined according to the second chroma selection network model; wherein the at least two chroma output values include a first value corresponding to each of at least one candidate second chroma loop filter network model when the current block uses a second chroma loop filter network model, and a second value when the current block does not use a second chroma loop filter network model.
  • For the candidate in-loop filter network models corresponding to the chroma component (which may be referred to simply as "candidate chroma in-loop filter network models"), whether the at least one candidate first chroma in-loop filter network model corresponding to the first type or the at least one candidate second chroma in-loop filter network model corresponding to the second type, these candidate in-loop filter network models are obtained through model training.
  • the method may also include:
  • determining a first training set; wherein the first training set includes at least one first training sample and at least one second training sample, the frame type of the first training sample is the first type, the frame type of the second training sample is the second type, and both the first training samples and the second training samples are obtained according to at least one quantization parameter; training the second neural network structure using the chroma component of the at least one first training sample to obtain at least one candidate first chroma loop filter network model; and training the second neural network structure using the chroma component of the at least one second training sample to obtain at least one candidate second chroma loop filter network model.
  • the second neural network structure includes at least one of the following: a sampling layer, a convolutional layer, an activation layer, a residual block, a pooling layer, and a skip connection layer.
  • In other words, the at least one candidate first chroma loop filter network model and the at least one candidate second chroma loop filter network model are determined by performing model training on the second neural network structure according to at least one training sample, and they have a corresponding relationship with the frame type, the color component type and the quantization parameter.
  • the first neural network structure may include a first convolution module, a first residual module, a second convolution module and a first connection module.
  • the input of the first neural network structure is the reconstructed brightness frame, and the output is the original brightness frame;
  • The first neural network structure includes: a first convolution module 601, a first residual module 602, a second convolution module 603 and a first connection module 604.
  • The first convolution module 601, the first residual module 602, the second convolution module 603 and the first connection module 604 are connected in sequence, and the first connection module 604 is also connected to the input of the first convolution module 601.
  • Further, the first convolution module may consist of one convolution layer and one activation layer; the second convolution module may consist of two convolution layers and one activation layer; the first connection module may consist of a skip connection layer; the first residual module may include several residual blocks, and each residual block may consist of two convolution layers and one activation layer.
  • the second neural network structure may include an upsampling module, a third convolution module, a fourth convolution module, a fusion module, a second residual module, a fifth convolution module and a second connection module.
  • the input of the second neural network structure is the reconstructed luma frame and the reconstructed chrominance frame, and the output is the original chrominance frame;
  • The second neural network structure includes: an upsampling module 605, a third convolution module 606, a fourth convolution module 607, a fusion module 608, a second residual module 609, a fifth convolution module 610 and a second connection module 611.
  • The input of the upsampling module 605 is the reconstructed chroma frame, and the upsampling module 605 is connected to the third convolution module 606; the input of the fourth convolution module 607 is the reconstructed luma frame; both the third convolution module 606 and the fourth convolution module 607 are connected to the fusion module 608; the fusion module 608, the second residual module 609, the fifth convolution module 610 and the second connection module 611 are connected in sequence, and the second connection module 611 is also connected to the input of the upsampling module 605.
  • Further, the third convolution module may consist of one convolution layer and one activation layer; the fourth convolution module may consist of one convolution layer and one activation layer; the fifth convolution module may consist of two convolution layers, one activation layer and one pooling layer; the second connection module may consist of a skip connection layer; the second residual module may include several residual blocks, and each residual block may consist of two convolution layers and one activation layer.
  • CNNLF designs different network structures for the luma component and the chrominance component respectively.
  • For the luma component, a first neural network structure is designed, see FIG. 7A for details; for the chroma component, a second neural network structure is designed, see FIG. 7B for details.
  • the entire network structure can be composed of convolutional layers, activation layers, residual blocks, and skip connection layers.
  • The convolution kernel of the convolution layer can be 3×3, that is, it can be represented by 3×3 Conv;
  • The activation layer can use a rectified linear unit (Rectified Linear Unit, ReLU), also called a modified linear unit, which is a commonly used activation function in artificial neural networks and usually refers to the nonlinear function represented by the ramp function and its variants.
  • the network structure of the residual block is shown in the dotted box in Figure 8, which can be composed of a convolutional layer (Conv), an activation layer (ReLU), and a jump connection layer.
  • the jump connection layer refers to a global jump connection from input to output included in the network structure, which enables the network to focus on learning residuals and accelerates the convergence process of the network.
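  • The effect of the global skip connection can be sketched in a few lines of NumPy: the network branch only predicts a residual, which is added back to the reconstructed input, so an all-zero residual leaves the input unchanged. The toy residual functions below are assumptions for illustration, not the patent's trained network:

```python
import numpy as np

def loop_filter_with_skip(rec, residual_branch):
    """Global skip connection: output = input + learned residual."""
    return rec + residual_branch(rec)

rec = np.array([[100.0, 102.0], [98.0, 101.0]])

# An untrained/zero residual branch leaves the reconstruction untouched.
identity_out = loop_filter_with_skip(rec, lambda x: np.zeros_like(x))

# A toy residual branch nudging each sample toward the mean (illustrative only).
smooth_out = loop_filter_with_skip(rec, lambda x: 0.5 * (x.mean() - x))
```

Because the branch learns only the (typically small) residual, training converges faster than learning the full mapping, which is the motivation stated above.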
  • the luminance component is introduced as one of the inputs to guide the filtering of the chrominance component.
  • For the chroma component, the entire network structure can be composed of convolution layers, activation layers, residual blocks, pooling layers, skip connection layers and other components. Due to the inconsistency of resolution between luma and chroma, the chroma component needs to be upsampled first. In order to avoid introducing other noise during the upsampling process, the resolution can be expanded by directly copying adjacent pixels to obtain an enlarged chroma frame (Enlarged chroma frame).
  • In addition, a pooling layer (such as an average pooling layer, represented by 2×2 AvgPool) is also used to complete the downsampling of the chroma component.
  • For example, 4 I-frame luminance component models, 4 non-I-frame luminance component models, 4 chroma U component models and 4 chroma V component models can be trained offline, for a total of 16 candidate loop filter network models.
  • For different color component types, the corresponding preset selection network models are also different.
  • the preset selection network model corresponding to the luma component may be called a luma selection network model
  • the preset selection network model corresponding to the chroma component may be called a chroma selection network model.
  • Further, in some embodiments, the determining the brightness selection network model of the current block may include: determining at least one candidate brightness selection network model, the candidate brightness selection network models including candidate first brightness selection network models and/or candidate second brightness selection network models; if the frame type is the first type, determining at least one candidate first brightness selection network model corresponding to the first type from the at least one candidate brightness selection network model, and determining the first brightness selection network model of the current block from the at least one candidate first brightness selection network model according to the quantization parameter; or, if the frame type is the second type, determining at least one candidate second brightness selection network model corresponding to the second type from the at least one candidate brightness selection network model, and determining the second brightness selection network model of the current block from the at least one candidate second brightness selection network model according to the quantization parameter.
  • Further, in some embodiments, the determining the chroma selection network model of the current block may include: determining at least one candidate chroma selection network model, the candidate chroma selection network models including candidate first chroma selection network models and/or candidate second chroma selection network models; if the frame type is the first type, determining at least one candidate first chroma selection network model corresponding to the first type from the at least one candidate chroma selection network model, and determining the first chroma selection network model of the current block from the at least one candidate first chroma selection network model according to the quantization parameter; or, if the frame type is the second type, determining at least one candidate second chroma selection network model corresponding to the second type from the at least one candidate chroma selection network model, and determining the second chroma selection network model of the current block from the at least one candidate second chroma selection network model according to the quantization parameter.
  • the preset selection network model of the current block is not only related to quantization parameters, but also related to frame type and color component type.
  • different color component types correspond to different preset selection network models.
  • For the luma component, the preset selection network model can be a brightness selection network model related to the luma component; for the chroma component, the preset selection network model can be a chroma selection network model related to the chroma component.
  • different frame types have different corresponding preset selection network models.
  • the brightness selection network model corresponding to the first type can be called the first brightness selection network model
  • the brightness selection network model corresponding to the second type can be called the second brightness selection network model
  • For the chroma selection network model related to the chroma component, the chroma selection network model corresponding to the first type may be called the first chroma selection network model, and the chroma selection network model corresponding to the second type may be called the second chroma selection network model.
  • In the embodiment of the present application, at least one candidate brightness selection network model (including candidate first brightness selection network models and/or candidate second brightness selection network models) and at least one candidate chroma selection network model (including candidate first chroma selection network models and/or candidate second chroma selection network models) can be trained in advance.
  • Assuming the frame type is an I frame, at least one candidate I-frame brightness selection network model corresponding to the I-frame type can be determined from the at least one candidate brightness selection network model; according to the quantization parameter of the current block, the I-frame brightness selection network model corresponding to the quantization parameter can be selected from the at least one candidate I-frame brightness selection network model, that is, the brightness selection network model of the current block. Or, assuming the frame type is a non-I frame, at least one candidate non-I-frame brightness selection network model corresponding to the non-I-frame type can be determined from the at least one candidate brightness selection network model; according to the quantization parameter of the current block, the non-I-frame brightness selection network model corresponding to the quantization parameter can be selected from the at least one candidate non-I-frame brightness selection network model, that is, the brightness selection network model of the current block.
  • The determination method of the chroma selection network model is the same as that of the luma selection network model.
  • the method may further include:
  • determining a second training set; wherein the second training set includes at least one training sample, and the training samples are obtained according to at least one quantization parameter; training the third neural network structure using the luma components of the training samples in the second training set to obtain at least one candidate brightness selection network model; and training the third neural network structure using the chroma components of the training samples in the second training set to obtain at least one candidate chroma selection network model.
  • In other words, the at least one candidate brightness selection network model is determined by performing model training on the third neural network structure according to at least one training sample, and it has a corresponding relationship with the frame type, the color component type and the quantization parameter; similarly, the at least one candidate chroma selection network model is also determined by performing model training on the third neural network structure according to at least one training sample, and it likewise has such a corresponding relationship.
  • the third neural network structure may include at least one of the following: a convolutional layer, a pooling layer, a fully connected layer, and an activation layer.
  • the third neural network structure may include a sixth convolution module and a fully connected module, and the sixth convolution module and the fully connected module are connected in sequence.
  • The sixth convolution module may include several convolution sub-modules, and each convolution sub-module may consist of one convolution layer and one pooling layer; the fully connected module may include several fully connected sub-modules, and each fully connected sub-module may consist of one fully connected layer and one activation layer.
  • The preset selection network model can be composed of a multi-layer convolutional neural network and a multi-layer fully connected neural network; deep learning is then performed with the training samples to obtain the preset selection network model of the current block, such as the brightness selection network model or the chroma selection network model.
  • deep learning is a kind of machine learning, and machine learning is the necessary path to realize artificial intelligence.
  • the concept of deep learning originates from the research of artificial neural networks, and a multi-layer perceptron with multiple hidden layers is a deep learning structure.
  • Deep learning can discover the distributed feature representation of data by combining low-level features to form more abstract high-level representation attribute categories or features.
  • Among them, a Convolutional Neural Network (CNN) is a class of feedforward neural networks that involves convolution computation and has a deep structure, and it is one of the representative algorithms of deep learning.
  • the preset selection network model here may be a convolutional neural network structure.
  • the embodiment of the present application also designs a third neural network structure, as shown in FIG. 9A and FIG. 9B .
  • The input of the third neural network structure is the reconstructed frame, and the output is the probability distribution over each candidate loop filter network model when the current block uses a loop filter network model together with the probability that the current block does not use a loop filter network model.
  • the third neural network structure includes: a sixth convolution module 901 and a fully connected module 902, and the sixth convolution module 901 and the fully connected module 902 are connected in sequence.
  • The sixth convolution module 901 may include several convolution sub-modules, each of which may consist of one convolution layer and one pooling layer; the fully connected module 902 may include several fully connected sub-modules, each of which may consist of one fully connected layer and one activation layer.
  • the third neural network structure may consist of a multi-layer convolutional neural network and a multi-layer fully connected neural network.
  • The network structure may include K convolution layers, M pooling layers, L fully connected layers and N activation layers, where K, M, L and N are all integers greater than or equal to 1.
  • Taking the network structure shown in FIG. 9B as an example, it can be composed of 3 convolution layers and 2 fully connected layers, with each convolution layer followed by a pooling layer. The convolution kernel of the convolution layer can be 3×3, that is, it can be represented by 3×3 Conv; the pooling layer can use a max pooling layer, represented by 2×2 MaxPool. In addition, an activation layer is set after each fully connected layer; here, the activation layer can be a linear or nonlinear activation function, such as ReLU or Softmax.
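  • The 2×2 pooling operations mentioned above (MaxPool in the selection network, AvgPool for chroma downsampling) can be sketched generically with NumPy block reshaping; this is an illustration of the pooling operation itself, not the patent's exact implementation:

```python
import numpy as np

def pool_2x2(x, op=np.max):
    """2x2 non-overlapping pooling of an (H, W) array, H and W even.

    op is the reduction applied to each 2x2 block, e.g. np.max (MaxPool)
    or np.mean (AvgPool).
    """
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    return op(blocks, axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
max_pooled = pool_2x2(x, np.max)    # 2x2 MaxPool, as in the selection network
avg_pooled = pool_2x2(x, np.mean)   # 2x2 AvgPool, as in chroma downsampling
```

Both calls halve each spatial dimension, which is why a single pooling layer suffices to bring an upsampled chroma frame back to its native resolution.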
  • a loss function can also be used for model training.
  • the method may also include:
  • the second training set includes at least one training sample, and the training sample is obtained according to at least one quantization parameter;
  • the third neural network structure is trained by using the chroma components of the training samples in the second training set, and at least one candidate chroma selection network model is obtained when the loss value of the preset loss function converges to a loss threshold.
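The "train until the loss value converges to a loss threshold" stopping rule described above can be sketched generically as follows; the step and loss callables are placeholders, since the text does not specify the optimizer or network details:

```python
def train_until_threshold(params, loss_fn, step_fn, threshold, max_iters=10000):
    """Run training steps until the loss drops to or below the given
    loss threshold, mirroring the stopping rule in the text.
    Returns the trained parameters and the final loss value."""
    loss = loss_fn(params)
    for _ in range(max_iters):
        if loss <= threshold:
            break
        params = step_fn(params)   # one optimisation step (placeholder)
        loss = loss_fn(params)
    return params, loss
```

For example, with a toy quadratic loss and a step that halves the parameter, the loop stops once the loss crosses the threshold.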
• the embodiment of the present application further provides a method for performing model training with a weighted loss function, specifically as shown in the following formula:
• Loss = Mean((Clip(Wa·reca + Wb·recb + … + Wn·recn + Woff·rec0) − orig)²)
• Wa, Wb, …, Wn, Woff respectively represent the outputs of the preset selection network model, that is, the probability values of the at least one candidate loop filter network model a, b, …, n and of not using the loop filter network model (that is, of turning the model off).
• reca, recb, …, recn represent the reconstructed images output after using the candidate loop filter network models a, b, …, n, respectively, and rec0 represents the reconstructed image output after DBF and SAO.
  • the Clip function limits the value between 0 and N.
  • N represents the maximum value of the pixel value, for example, for a 10bit YUV image, N is 1023; orig represents the original image.
• The at least two output probability values of the preset selection network model are used as the weights of the reconstructed images output by the at least one candidate CNNLF model and output when the CNNLF model is not used; the mean square error between the weighted result and the original image orig then gives the loss function value.
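Based on the symbol definitions above, a minimal sketch of the weighted loss computation might look as follows (the function name and array-based interface are illustrative, not from the source):

```python
import numpy as np

def weighted_cnnlf_loss(weights, recs, rec0, orig, n_max=1023):
    """Weighted loss sketched from the text: blend the candidate-model
    reconstructions (reca..recn) and the model-off reconstruction (rec0)
    with the selection network's output probabilities (Wa..Wn, Woff),
    clip to the valid pixel range [0, N], then take the MSE against orig."""
    w_models, w_off = weights[:-1], weights[-1]
    blended = w_off * rec0
    for w, rec in zip(w_models, recs):
        blended = blended + w * rec
    blended = np.clip(blended, 0, n_max)  # Clip(., 0, N); N=1023 for 10-bit
    return float(np.mean((blended - orig) ** 2))
```

Here `weights` is assumed to be the softmax output of the selection network, so its entries sum to one.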
  • the embodiment of the present application also provides a technical solution of applying the cross-entropy loss function commonly used in classification networks to the embodiment of the present application. Specifically as shown in the following formula,
• label(i) = argmin((reca − orig)², (recb − orig)², …, (recn − orig)², (rec0 − orig)²)
• label(i) is the index i corresponding to the minimum of the mean square errors computed between the original image and each of the reconstructed images output by the at least one candidate loop filter network model a, b, …, n and output after DBF and SAO.
• Wa, Wb, …, Wn, Woff respectively represent the outputs of the preset selection network model, that is, the probability values of the at least one candidate loop filter network model a, b, …, n and of not using the loop filter network model (that is, of turning the model off).
• Wi represents the probability value whose index matches label(i). The softmax of Wi is then calculated and multiplied with label(i) to obtain the cross-entropy loss value.
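A minimal sketch of this cross-entropy scheme, assuming the usual one-hot form in which only the softmax probability at index label(i) contributes to the loss (function and variable names are illustrative):

```python
import numpy as np

def cnnlf_cross_entropy(outputs, recs, rec0, orig):
    """Cross-entropy loss sketched from the text: the label index i is the
    argmin of the per-reconstruction MSE against the original image (the
    candidate models a..n plus the model-off reconstruction rec0), and
    the loss is -log(softmax(outputs)[i])."""
    candidates = recs + [rec0]
    mses = [np.mean((rec - orig) ** 2) for rec in candidates]
    i = int(np.argmin(mses))                   # label(i)
    probs = np.exp(outputs - np.max(outputs))  # numerically stable softmax
    probs = probs / probs.sum()
    return i, float(-np.log(probs[i]))
```

The loss is minimised when the selection network assigns the highest probability to the reconstruction closest to the original image.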
• In this way, the probability distribution over each candidate loop filter network model and over the current block not using the loop filter network model can be obtained.
  • the determining at least two output values according to the preset selection network model of the current block may include:
  • the second reconstructed image block is input to the preset selection network model to obtain at least two output values.
  • the at least two output values may include first values corresponding to at least one candidate loop filtering network model when the current block uses the loop filtering network model and second values when the current block does not use the loop filtering network model.
  • the loop filtering network model may refer to the aforementioned CNNLF model.
• the second reconstructed image block is used as the input of the preset selection network model, and the output of the preset selection network model is the probability distribution over the at least one candidate CNNLF model and over the current block not using the CNNLF model (including: the first value corresponding to each of the at least one candidate CNNLF model and the second value when the current block does not use the CNNLF model).
  • S403 According to at least two output values, determine a target loop filtering network model when the current block uses the loop filtering network model or the current block does not use the loop filtering network model.
• That is, according to the at least two output values, it is determined either which candidate loop filter network model is the target loop filter network model or that the current block does not use the loop filter network model.
• the determining the target loop filter network model when the current block uses the loop filter network model, or determining that the current block does not use the loop filter network model, may include: determining a target value from the at least two output values;
• if the target value is a first value, determining that the current block uses a loop filter network model, and using the candidate loop filter network model corresponding to the target value as the target loop filter network model; or,
• if the target value is the second value, determining that the current block does not use the loop filter network model.
  • the determining the target value from at least two output values may include: selecting a maximum value from at least two output values, and using the maximum value as the target value.
• In short, whether for the luma loop filter network model or the chroma loop filter network model, model training is first performed to obtain several candidate luma loop filter network models or several candidate chroma loop filter network models, and the preset selection network model is then used for the model decision. If the second value among the at least two output values is the maximum value, it can be determined that the current block does not use the loop filter network model; if the second value is not the maximum value, the candidate loop filter network model corresponding to the maximum first value is determined as the target loop filter network model, so that the target loop filter network model is used to filter the current block.
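The decision rule described above can be sketched as follows, assuming the output vector lists the first values for the candidate models followed by the second ("model off") value in the last position (this ordering is an assumption for illustration):

```python
def decide_cnnlf(outputs):
    """Model decision sketched from the text: the last output value is the
    probability that the current block does not use any loop filter
    network model ('off'); otherwise the index of the maximum first value
    selects the target candidate model."""
    off_index = len(outputs) - 1
    best = max(range(len(outputs)), key=lambda k: outputs[k])
    if best == off_index:
        return None   # current block does not use the CNNLF model
    return best       # index of the target loop filter network model
```

Returning `None` corresponds to passing the second reconstructed image block through unfiltered.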
• the preset selection network model includes a luma selection network model and a chroma selection network model; correspondingly, the second reconstructed image block may include an input reconstructed luma image block and an input reconstructed chroma image block.
  • the determining at least two output values according to the preset selection network model of the current block may include:
• the at least two luma output values may include the first values corresponding to at least one candidate luma loop filter network model when the current block uses the luma loop filter network model and the second value when the current block does not use the luma loop filter network model.
• the method may further include: selecting the maximum probability value from the at least two luma output values; if the maximum probability value is a first value, determining that the current block uses a luma loop filter network model, and using the candidate luma loop filter network model corresponding to the maximum probability value as the target luma loop filter network model; or, if the maximum probability value is the second value, determining that the current block does not use a luma loop filter network model.
  • the determining at least two output values according to the preset selection network model of the current block may include:
• the at least two chroma output values may include the first values corresponding to at least one candidate chroma loop filter network model when the current block uses the chroma loop filter network model and the second value when the current block does not use the chroma loop filter network model.
• the method may further include: selecting a maximum probability value from the at least two chroma output values; if the maximum probability value is a first value, determining that the current block uses the chroma loop filter network model, and using the candidate chroma loop filter network model corresponding to the maximum probability value as the target chroma loop filter network model; or, if the maximum probability value is the second value, determining that the current block does not use the chroma loop filter network model.
• the selected target loop filter network model can then be used to perform filtering on the current block.
• performing filtering processing on the current block by using the target loop filter network model to obtain the first reconstructed image block of the current block may include:
  • the second reconstructed image block is input to the target loop filtering network model to obtain the first reconstructed image block of the current block.
  • the method may further include: determining the second reconstructed image block as the first reconstructed image block of the current block.
• If the maximum value among the at least two output values is determined to be the second value, it means that the rate-distortion cost of the current block without using the loop filter network model is the smallest; it can then be determined that the current block does not use the loop filter network model, that is, the second reconstructed image block is directly determined as the first reconstructed image block of the current block. If the maximum value is determined to be a certain first value, it means that the rate-distortion cost of the current block using the loop filter network model is the smallest; the candidate loop filter network model corresponding to that first value can then be determined as the target loop filter network model, and the second reconstructed image block is input into the target loop filter network model to obtain the first reconstructed image block of the current block.
• the second reconstructed image block can be obtained after filtering processing by a deblocking filter and a sample adaptive compensation filter.
  • the loop filtering network model described in the embodiment of the present application may be a CNNLF model.
• By using the selected CNNLF model to perform CNNLF filtering on the current block, the first reconstructed image block of the current block can be obtained.
  • the method may further include: after determining the first reconstructed image block of the current block, performing filtering processing on the first reconstructed image block by using an adaptive correction filter.
  • FIG. 10 shows a schematic diagram of an overall framework of using a preset selection network model provided by an embodiment of the present application.
  • the input of the network structure is the input reconstructed luminance image block or the input reconstructed chrominance image block of the CNNLF model
• the output of the network structure is the probability value corresponding to each of the at least one CNNLF model and the probability value that the current block does not use the CNNLF model (that is, the decision to turn off the CNNLF model).
• If the largest output probability value corresponds to a CNNLF model, that CNNLF model can be selected to perform CNNLF filtering processing on the input reconstructed luma image block or the input reconstructed chroma image block; if the largest output probability value corresponds to the decision to turn off the CNNLF model, it is not necessary to use neural network filtering.
• Exemplarily, the second reconstructed image block is obtained after filtering through a deblocking filter (DBF) and a sample adaptive compensation filter (SAO), and the first reconstructed image block obtained after the second reconstructed image block passes through the model adaptive selection module and the CNNLF model can also be input into an adaptive correction filter (ALF) to continue filtering processing.
• This embodiment provides a decoding method applied to a decoder. The value of the first syntax element identification information is determined by parsing the code stream; when the first syntax element identification information indicates that the current block allows the use of a preset selection network model for model decision-making, at least two output values are determined according to the preset selection network model of the current block, where the at least two output values include the first values corresponding to at least one candidate loop filter network model when the current block uses the loop filter network model and the second value when the current block does not use the loop filter network model; according to the at least two output values, the target loop filter network model when the current block uses the loop filter network model is determined, or it is determined that the current block does not use the loop filter network model; and when the current block uses the loop filter network model, the target loop filter network model is used to perform filtering processing on the current block to obtain the first reconstructed image block of the current block.
• After the model decision, the target loop filter network model can be used to filter the current block, which can not only reduce complexity but also avoid additional bit overhead and improve coding performance, thereby improving coding and decoding efficiency; in addition, the finally output first reconstructed image block can be made closer to the original image block, which can improve video image quality.
  • FIG. 11 shows a schematic flowchart of another decoding method provided by the embodiment of the present application. As shown in Figure 11, the method may include:
  • S1101 Parse the code stream, and determine the value of the first syntax element identification information.
  • S1104 If the identification information of the loop filter network model is the index number of the loop filter network model, then determine the target loop filter network used by the current block from at least one candidate loop filter network model according to the index number of the loop filter network model Model.
  • S1105 Perform filtering processing on the current block by using the target loop filtering network model to obtain a first reconstructed image block of the current block.
• In this way, first syntax element identification information can be set, and the determination can then be made according to the value of the first syntax element identification information obtained by decoding.
  • the first syntax element identification information may be represented by model_adaptive_decision_enable_flag.
• If the value of model_adaptive_decision_enable_flag is the first identification value, it can be determined that the current block allows the use of a preset selection network model for model decision-making; or, if the value of model_adaptive_decision_enable_flag is the second identification value, it can be determined that the current block does not allow the use of the preset selection network model for model decision-making.
  • the first identification value may be 1, and the second identification value may be 0, but there is no limitation here.
• the embodiment of the present application can also set loop filter network model identification information, which is used to determine either the index number of the loop filter network model when the current block uses the loop filter network model or that the current block does not use the loop filter network model.
• the model adaptive decision-making module on the decoder side can, based on the loop filter network model identification information determined by the model adaptive decision-making module on the encoder side and obtained by decoding, determine whether the current block does not use the loop filter network model or determine the index number of the loop filter network model used by the current block.
  • the target loop filter network model used by the current block can be determined according to the index number of the loop filter network model, and then CNNLF filter processing is performed on the current block according to the target loop filter network model, thereby reducing the complexity of the decoder.
• the number of convolutional layers, the number of fully connected layers, and the nonlinear activation function can all be adjusted.
• in addition to the CNNLF model, the loop filter network model targeted by the model adaptive decision-making module can also be another efficient neural network filter model for adaptive decision-making, which is not limited in this embodiment of the present application.
  • the embodiment of this application proposes a model adaptive decision-making module based on deep learning, which is used to make adaptive decisions on the use of the CNNLF model, eliminating the need to calculate rate-distortion costs and transmit frame-level, CTU-level, etc. Switch information to avoid extra bit overhead and improve encoding performance.
• the model adaptive decision-making module can be regarded as a preset selection network model composed of a multi-layer convolutional neural network and a multi-layer fully connected neural network; its input is the second reconstructed image block of the current block (that is, the input reconstructed image block of the CNNLF model), and its output is the probability distribution over each CNNLF model and over the decision to turn off the CNNLF model.
  • the position of the model adaptive decision-making module in the encoder/decoder is shown in Figure 5.
• the use of the model adaptive selection module does not depend on the flag bits of DBF, SAO, ALF, and CNNLF, but it is positioned before CNNLF.
  • the decoder acquires and parses the code stream, and when it is parsed to the loop filter module, it will be processed according to the preset filter order.
  • the preset filter sequence is DBF filter---->SAO filter---->model adaptive decision-making module---->CNNLF filter---->ALF filter.
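The preset filter order can be sketched as a simple composition; all filter callables below are placeholders standing in for the real DBF, SAO, CNNLF, and ALF stages, not actual codec filters:

```python
def loop_filter_pipeline(block, dbf, sao, decide, cnnlf_models, alf):
    """Preset filter order sketched from the text:
    DBF -> SAO -> model adaptive decision -> CNNLF (if selected) -> ALF.
    'decide' returns a candidate-model index, or None for 'model off'."""
    rec = sao(dbf(block))        # second reconstructed image block
    choice = decide(rec)
    if choice is not None:
        rec = cnnlf_models[choice](rec)   # CNNLF filtering with the target model
    return alf(rec)              # continue filtering with ALF
```

With identity placeholders for every stage and a decision of "model off", the block passes through unchanged.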
• First judge, according to the decoded model_adaptive_decision_enable_flag, whether the current block is allowed to use the model adaptive decision-making module for model decisions. If model_adaptive_decision_enable_flag is "1", try to process the current block with the model adaptive decision-making module and jump to (b); if model_adaptive_decision_enable_flag is "0", jump to (e);
• The input reconstructed luma image block of the CNNLF model is used as the input of the model adaptive decision-making module, and the output is the probability distribution over each luma CNNLF model and over the decision to turn off the luma CNNLF model. If the output with the largest probability value is the decision to turn off the luma CNNLF model, jump to (e); if the output with the largest probability value is the index number of a luma CNNLF model, select that model to perform CNNLF filtering processing on the current luma image block to obtain the final output reconstructed luma image block;
• The input reconstructed chroma image block of the CNNLF model is used as the input of the model adaptive decision-making module, and the output is the probability distribution over each chroma CNNLF model and over the decision to turn off the chroma CNNLF model. If the output with the largest probability value is the decision to turn off the chroma CNNLF model, jump to (e); if the output with the largest probability value is the index number of a chroma CNNLF model, select that model to perform CNNLF filtering processing on the current chroma image block to obtain the final output reconstructed chroma image block;
  • the enable flag used by the neural network-based model adaptive decision can be represented by model_adaptive_decision_enable_flag.
  • Image-level neural network filtering enable flag picture_nn_filter_enable_flag[compIdx]
• In short, this embodiment introduces a model adaptive decision-making technology based on deep learning: the second reconstructed image block of the current block (that is, the input reconstructed image block of the CNNLF model) is input into a neural network structure of multiple convolutional layers plus multiple fully connected layers, which outputs the probability distribution over each CNNLF model and over the decision to turn off the CNNLF model, so that an appropriate CNNLF model is adaptively selected for the second reconstructed image block, or the CNNLF model is not used. It is then no longer necessary to calculate rate-distortion costs or transmit frame-level, CTU-level and other switch information, avoiding additional bit overhead and improving coding performance.
  • FIG. 12 shows a schematic flowchart of an encoding method provided in an embodiment of the present application. As shown in Figure 12, the method may include:
  • S1201 Determine a value of the first syntax element identification information.
• each coding block may include a first image component, a second image component, and a third image component; the current block is the coding block in the video image on which loop filtering processing is currently to be performed for the first image component, the second image component, or the third image component.
  • the current block here may be a CTU, or a CU, or even a block smaller than a CU, which is not limited in this embodiment of the present application.
• the embodiment of the present application can divide them into two types of color components: the luma component and the chroma component.
• If the current block performs operations such as luma component prediction, inverse transform and inverse quantization, and loop filtering, the current block can also be called a luma block; or, if the current block performs operations such as chroma component prediction, inverse transform and inverse quantization, and loop filtering, the current block can also be called a chroma block.
• the embodiment of the present application specifically provides a loop filtering method, especially an adaptive decision-making method based on a deep learning loop filter network model, which is applied, for example, to part of the filtering unit 108 shown in FIG. 3A.
  • the filtering unit 108 may include a deblocking filter (DBF), a sample adaptive compensation filter (SAO), a residual neural network based loop filter (CNNLF) and an adaptive correction filter (ALF).
  • the CNNLF model in the filtering unit 108 can be adaptively determined by using the method described in the embodiment of the present application, so as to determine the target model when the current block uses the CNNLF model or the current block does not use CNNLF model.
  • the embodiment of the present application proposes a model adaptive decision-making module based on deep learning, see the model adaptive selection module shown in Figure 5 for details, which can be used for loop filtering network models (such as CNNLF model) Whether to use and which CNNLF model to use for adaptive decision-making, thereby improving encoding performance.
  • the model adaptive decision-making module determines whether the current block is allowed to use the preset selection network model for model decision-making.
  • the determining the value of the identification information of the first syntax element includes:
• if the current block allows the use of a preset selection network model for model decision-making, then determine that the value of the identification information of the first syntax element is the first identification value; and/or,
  • the method further includes: encoding the value of the identification information of the first syntax element, and writing the encoded bits into the code stream.
  • a first syntax element identification information may be set to indicate whether the current block is allowed to use a preset selection network model for model decision-making.
• if the current block allows the use of the preset selection network model for model decision-making, it can be determined that the value of the first syntax element identification information is the first identification value; if the current block does not allow the use of the preset selection network model for model decision-making, it can be determined that the value of the first syntax element identification information is the second identification value.
• for the encoder, after the value of the first syntax element identification information is determined, the value of the first syntax element identification information is written into the code stream for transmission to the decoder, so that the decoder can obtain, by parsing the code stream, whether the current block allows the use of the preset selection network model for model decision-making.
  • the first identification value and the second identification value are different, and the first identification value and the second identification value may be in the form of parameters or numbers.
  • the first syntax element identification information may be a parameter written in a profile (profile), or a value of a flag (flag), which is not limited in this embodiment of the present application.
• the first identification value can be set to 1 and the second identification value to 0; alternatively, the first identification value can be set to true and the second identification value to false; or, the first identification value can be set to 0 and the second identification value to 1; or, the first identification value can be set to false and the second identification value to true.
• for a flag, generally, the first identification value may be 1 and the second identification value may be 0, but there is no limitation thereto.
  • the preset selection network model can be regarded as a neural network
  • the identification information of the first syntax element can be regarded as an enabling flag for model adaptive decision-making based on the neural network, which can be represented by model_adaptive_decision_enable_flag here.
  • model_adaptive_decision_enable_flag can be used to indicate whether the current block allows the adaptive decision of the model using the preset selection network model.
• For example, when model_adaptive_decision_enable_flag is 1, the current block allows the use of the preset selection network model for model decision-making; when model_adaptive_decision_enable_flag is 0, the current block does not allow the use of the preset selection network model for model decision-making.
  • S1202 When the first syntax element identification information indicates that the current block allows the use of the preset selection network model for model decision-making, determine at least two output values according to the preset selection network model of the current block; where the at least two output values include the current block The respective first values corresponding to at least one candidate loop filtering network model when the loop filtering network model is used and the second values when the current block does not use the loop filtering network model.
• Further, in the embodiment of the present application, the preset selection network model used by the current block can be determined from several candidate preset selection network models according to the color component type, the quantization parameter, and the frame type of the frame to which the current block belongs; then, according to the preset selection network model, the probability distribution over at least one candidate loop filter network model when the current block uses the loop filter network model and over the case where the current block does not use the loop filter network model is determined.
• the at least two output values may include the respective first values of the at least one candidate loop filter network model when the current block uses the loop filter network model and the second value when the current block does not use the loop filter network model.
• the first values may be used to reflect the probability distribution over the at least one candidate loop filter network model when the current block uses the loop filter network model, and the second value may be used to reflect the probability that the current block does not use the loop filter network model.
  • both the first value and the second value may be represented by probability values; that is, according to the preset selection network model, the determined at least two output values may be at least two probability values.
• the first values and the second value may also be used to reflect the weight distribution over the at least one candidate loop filter network model and over the current block not using the loop filter network model when the current block uses the loop filter network model; that is, the first values and the second value may also be referred to as weight values, which is not limited in this embodiment of the present application.
• the color component types may include the luma component and the chroma component.
  • the preset selection network models here are not the same.
  • the preset selection network model corresponding to the luma component may be called a luma selection network model
  • the preset selection network model corresponding to the chroma component may be called a chroma selection network model. Therefore, in some embodiments, the determining the preset selection network model of the current block may include:
• if the color component type of the current block is a luma component (that is, the current block is a luma block), the luma selection network model of the current block is determined; or, if the color component type of the current block is a chroma component (that is, the current block is a chroma block), the chroma selection network model of the current block is determined.
  • the candidate loop filtering network models are also different.
  • one or more candidate loop filter network models corresponding to the luma component may be referred to as a candidate luma loop filter network model
  • one or more candidate loop filter network models corresponding to the chroma component may be referred to as Candidate chroma loop filter network models. Therefore, in some embodiments, the determining at least two output values according to the preset selection network model of the current block may include:
• if the color component type of the current block is a luma component, at least two luma output values are determined according to the luma selection network model; wherein the at least two luma output values include the first values corresponding to at least one candidate luma loop filter network model when the current block uses the luma loop filter network model and the second value when the current block does not use the luma loop filter network model; or,
• if the color component type of the current block is a chroma component, at least two chroma output values are determined according to the chroma selection network model; wherein the at least two chroma output values include the first values corresponding to at least one candidate chroma loop filter network model when the current block uses the chroma loop filter network model and the second value when the current block does not use the chroma loop filter network model.
  • the frame type may include I frame, P frame and B frame.
  • the frame type may include the first type and the second type.
  • the preset selection network models here are also different.
• the first type may be an I frame, and the second type may be a non-I frame, although there is no specific limitation here.
• the luma selection network model corresponding to the first type may be called the first luma selection network model, and the luma selection network model corresponding to the second type may be called the second luma selection network model. Therefore, in some embodiments, when the color component type of the current block is a luma component, the determining the luma selection network model of the current block may include:
• if the frame type of the frame to which the current block belongs is the first type, the first luma selection network model of the current block is determined; or,
• if the frame type of the frame to which the current block belongs is the second type, the second luma selection network model of the current block is determined.
• the candidate luma loop filter network models also differ according to frame type.
• the candidate luma loop filter network model corresponding to the first type may be called a candidate first luma loop filter network model
• the candidate luma loop filter network model corresponding to the second type may be called a candidate second luma loop filter network model. Therefore, in some embodiments, the determining at least two luma output values according to the luma selection network model may include:
• if the frame type of the frame to which the current block belongs is the first type,
• at least two luma output values are determined according to the first luma selection network model; wherein the at least two luma output values include first values corresponding to at least one candidate first luma loop filter network model when the current block uses the first luma loop filter network model and a second value when the current block does not use the first luma loop filter network model; or,
• if the frame type of the frame to which the current block belongs is the second type,
• at least two luma output values are determined according to the second luma selection network model; wherein the at least two luma output values include first values corresponding to at least one candidate second luma loop filter network model when the current block uses the second luma loop filter network model and a second value when the current block does not use the second luma loop filter network model.
• for the candidate loop filter network model corresponding to the luma component (which may be simply referred to as a "candidate luma loop filter network model"), whether it is the at least one candidate first luma loop filter network model corresponding to the first type or the at least one candidate second luma loop filter network model corresponding to the second type, these candidate loop filter network models are obtained through model training.
  • the method may also include:
• acquiring a first training set; wherein the first training set includes at least one first training sample and at least one second training sample, the frame type of the first training sample is the first type, the frame type of the second training sample is the second type, and both the first training sample and the second training sample are obtained according to at least one quantization parameter;
• the first neural network structure is trained by using the luma component of the at least one first training sample to obtain at least one candidate first luma loop filter network model; and the first neural network structure is trained by using the luma component of the at least one second training sample to obtain at least one candidate second luma loop filter network model.
  • the first neural network structure includes at least one of the following: a convolutional layer, an activation layer, a residual block, and a skip connection layer.
• That is, at least one candidate first luma loop filter network model and at least one candidate second luma loop filter network model are determined by performing model training on the first neural network structure according to at least one training sample, and the at least one candidate first luma loop filter network model and the at least one candidate second luma loop filter network model have correspondences with frame type, color component type and quantization parameter.
• the chroma selection network model corresponding to the first type may be called the first chroma selection network model, and the chroma selection network model corresponding to the second type may be called the second chroma selection network model. Therefore, in some embodiments, when the color component type of the current block is a chroma component, the determining the chroma selection network model of the current block may include:
• if the frame type of the frame to which the current block belongs is the first type, the first chroma selection network model of the current block is determined; or,
• if the frame type of the frame to which the current block belongs is the second type, the second chroma selection network model of the current block is determined.
• the candidate chroma loop filter network models also differ according to frame type.
• one or more candidate chroma loop filter network models corresponding to the first type may be referred to as candidate first chroma loop filter network models
• one or more candidate chroma loop filter network models corresponding to the second type may be referred to as candidate second chroma loop filter network models. Therefore, in some embodiments, the determining at least two chroma output values according to the chroma selection network model may include:
• if the frame type of the frame to which the current block belongs is the first type,
• at least two chroma output values are determined according to the first chroma selection network model; wherein the at least two chroma output values include first values corresponding to at least one candidate first chroma loop filter network model when the current block uses the first chroma loop filter network model and a second value when the current block does not use the first chroma loop filter network model; or,
• if the frame type of the frame to which the current block belongs is the second type, at least two chroma output values are determined according to the second chroma selection network model; wherein the at least two chroma output values include first values corresponding to at least one candidate second chroma loop filter network model when the current block uses the second chroma loop filter network model and a second value when the current block does not use the second chroma loop filter network model.
• for the candidate loop filter network model corresponding to the chroma component (which may be referred to simply as a "candidate chroma loop filter network model"), whether it is the at least one candidate first chroma loop filter network model corresponding to the first type or the at least one candidate second chroma loop filter network model corresponding to the second type, these candidate loop filter network models are obtained through model training.
  • the method may also include:
• acquiring a first training set; wherein the first training set includes at least one first training sample and at least one second training sample, the frame type of the first training sample is the first type, the frame type of the second training sample is the second type, and both the first training sample and the second training sample are obtained according to at least one quantization parameter;
• the second neural network structure is trained by using the chroma component of the at least one first training sample to obtain at least one candidate first chroma loop filter network model; and the second neural network structure is trained by using the chroma component of the at least one second training sample to obtain at least one candidate second chroma loop filter network model.
  • the second neural network structure includes at least one of the following: a sampling layer, a convolutional layer, an activation layer, a residual block, a pooling layer, and a skip connection layer.
• That is, at least one candidate first chroma loop filter network model and at least one candidate second chroma loop filter network model are determined by performing model training on the second neural network structure according to at least one training sample, and the at least one candidate first chroma loop filter network model and the at least one candidate second chroma loop filter network model have correspondences with frame type, color component type and quantization parameter.
  • the first neural network structure may include a first convolution module, a first residual module, a second convolution module and a first connection module.
• the first convolution module can be composed of one convolutional layer and one activation layer
• the second convolution module can be composed of two convolutional layers and one activation layer
• the first connection module can be composed of a skip connection layer
• the first residual module can include several residual blocks
• each residual block can be composed of two convolutional layers and one activation layer.
  • the second neural network structure may include an upsampling module, a third convolution module, a fourth convolution module, a fusion module, a second residual module, a fifth convolution module and a second connection module.
• the third convolution module can be composed of one convolutional layer and one activation layer
• the fourth convolution module can be composed of one convolutional layer and one activation layer
• the fifth convolution module can be composed of two convolutional layers, one activation layer and one pooling layer
• the second connection module can be composed of a skip connection layer
• the second residual module can include several residual blocks, and each residual block can be composed of two convolutional layers and one activation layer.
• CNNLF designs different network structures for the luma component and the chroma component respectively.
• for the luma component, it designs the first neural network structure; see Figure 6A and Figure 7A for details
• for the chroma component, it designs the second neural network structure; see Figure 6B and Figure 7B for details.
  • the entire network structure can be composed of convolutional layers, activation layers, residual blocks, and skip connection layers.
• the convolution kernel of the convolutional layer can be 3×3, that is, it can be represented by 3×3 Conv;
• the activation layer can use the rectified linear unit (ReLU), also called the modified linear unit, which is a commonly used activation function in artificial neural networks and usually refers to the nonlinear function represented by the ramp function and its variants.
• the network structure of the residual block is shown in the dotted box in Figure 8, which can be composed of convolutional layers (Conv), an activation layer (ReLU), and a skip connection layer.
• the skip connection layer refers to a global skip connection from input to output included in the network structure, which enables the network to focus on learning residuals and accelerates the convergence process of the network.
• For the chroma component, take Figure 7B as an example, where the luma component is introduced as one of the inputs to guide the filtering of the chroma component.
• the entire network structure can be composed of convolutional layers, activation layers, residual blocks, pooling layers, and skip connection layers. Because the resolutions of the two components are inconsistent, the chroma component needs to be up-sampled first. In order to avoid introducing other noise during the upsampling process, the resolution can be expanded by directly copying adjacent pixels to obtain an enlarged chroma frame.
• a pooling layer (such as 2×2 AvgPool) is also used to complete the downsampling of the chroma component.
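The adjacent-pixel copying and the 2×2 average pooling mentioned above can be sketched as follows. This is a minimal plain-Python illustration with invented function names, not the patent's implementation: enlarging by replicating each pixel introduces no new sample values (and hence no extra noise), and average pooling brings the result back to chroma resolution.

```python
def upsample_copy(chroma):
    """Double width and height by replicating each pixel into a 2x2 patch."""
    out = []
    for row in chroma:
        wide = []
        for p in row:
            wide.extend([p, p])      # copy horizontally
        out.append(wide)
        out.append(list(wide))       # copy vertically
    return out

def avg_pool_2x2(img):
    """2x2 average pooling with stride 2 (the AvgPool step)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4
             for x in range(0, w, 2)] for y in range(0, h, 2)]

enlarged = upsample_copy([[10, 20], [30, 40]])   # enlarged chroma frame
restored = avg_pool_2x2(enlarged)                # back to chroma resolution
```

Because the copy is exact, pooling an unmodified enlarged frame recovers the original block; in the real network the filtering happens between these two steps.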
  • CNNLF can contain two stages of offline training and inference testing.
• 4 I-frame luma component models, 4 non-I-frame luma component models, 4 chroma U component models and 4 chroma V component models, 16 models in total, can be trained offline.
• a preset image data set such as DIV2K may be used, which has 1000 high-definition images (2K resolution), of which 800 are used for training, 100 for validation, and 100 for testing
• the images are converted from RGB to single-frame video sequences in YUV 4:2:0 format as label data.
• HPM-ModAI sets a frame-level switch flag and a CTU-level flag for the luma component to control whether to enable the CNNLF model, and sets a frame-level switch flag for the chroma component to control whether to enable the CNNLF model.
  • flag bits can usually be represented by flag.
  • a CTU-level flag is set to control whether CNNLF is turned on or not.
  • the CTU-level flag bit is determined by formula (2).
• the encoder can determine whether the current frame or the current block uses the CNNLF model for filtering through the rate-distortion cost method, but in that case the frame-level and CTU-level switch information needs to be written into the bitstream, which incurs additional bit overhead.
• the embodiment of the present application therefore proposes a preset selection network model based on deep learning, which can make adaptive decisions on the use of the CNNLF model; at this point it is no longer necessary to calculate the rate-distortion cost or to encode the frame-level and CTU-level switch information.
  • the corresponding preset selection network models are also different.
  • the preset selection network model corresponding to the luma component may be called a luma selection network model
  • the preset selection network model corresponding to the chroma component may be called a chroma selection network model.
  • the determining the brightness selection network model of the current block may include:
• determining at least one candidate luma selection network model; wherein the candidate luma selection network models comprise candidate first luma selection network models and/or candidate second luma selection network models;
• at least one candidate first luma selection network model corresponding to the first type is determined from the at least one candidate luma selection network model, and the first luma selection network model of the current block is determined from the at least one candidate first luma selection network model according to the quantization parameter; or,
• at least one candidate second luma selection network model corresponding to the second type is determined from the at least one candidate luma selection network model, and the second luma selection network model of the current block is determined from the at least one candidate second luma selection network model according to the quantization parameter.
  • the determining the chroma selection network model of the current block may include:
• determining at least one candidate chroma selection network model; wherein the candidate chroma selection network models comprise candidate first chroma selection network models and/or candidate second chroma selection network models;
• at least one candidate first chroma selection network model corresponding to the first type is determined from the at least one candidate chroma selection network model, and the first chroma selection network model of the current block is determined from the at least one candidate first chroma selection network model according to the quantization parameter; or,
• at least one candidate second chroma selection network model corresponding to the second type is determined from the at least one candidate chroma selection network model, and the second chroma selection network model of the current block is determined from the at least one candidate second chroma selection network model according to the quantization parameter.
  • the preset selection network model of the current block is not only related to quantization parameters, but also related to frame type and color component type.
• different color component types correspond to different preset selection network models.
• the preset selection network model may be a luma selection network model related to the luma component, or a chroma selection network model related to the chroma component.
  • different frame types have different corresponding preset selection network models.
• for the luma selection network model related to the luma component, the luma selection network model corresponding to the first type can be called the first luma selection network model
• the luma selection network model corresponding to the second type can be called the second luma selection network model
• for the chroma selection network model related to the chroma component, the chroma selection network model corresponding to the first type may be called the first chroma selection network model
• the chroma selection network model corresponding to the second type may be called the second chroma selection network model.
• At least one candidate luma selection network model (including candidate first luma selection network models and/or candidate second luma selection network models) and at least one candidate chroma selection network model (including candidate first chroma selection network models and/or candidate second chroma selection network models) can be trained in advance.
• Assuming the frame type is I frame, at least one candidate I-frame luma selection network model corresponding to the I-frame type can be determined from the at least one candidate luma selection network model; according to the quantization parameter of the current block, the I-frame luma selection network model corresponding to the quantization parameter can be selected from the at least one candidate I-frame luma selection network model, that is, the luma selection network model of the current block. Or, assuming the frame type is non-I frame, at least one candidate non-I-frame luma selection network model corresponding to the non-I-frame type can be determined from the at least one candidate luma selection network model; according to the quantization parameter of the current block, the non-I-frame luma selection network model corresponding to the quantization parameter can be selected from the at least one candidate non-I-frame luma selection network model, that is, the luma selection network model of the current block.
• the determination method of the chroma selection network model is the same as that of the luma selection network model.
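The two-step lookup above (first by frame type and color component, then by quantization parameter) can be sketched as a simple keyed table. The model names, key format, and QP values below are illustrative assumptions for the sketch only, not the patent's identifiers.

```python
# Hypothetical table of pre-trained candidate selection network models,
# indexed by (frame_type, component, qp). Entries are illustrative.
CANDIDATE_MODELS = {
    ("I",    "luma",   27): "sel_luma_I_qp27",
    ("I",    "luma",   32): "sel_luma_I_qp32",
    ("nonI", "luma",   27): "sel_luma_nonI_qp27",
    ("nonI", "chroma", 32): "sel_chroma_nonI_qp32",
}

def pick_selection_model(frame_type, component, qp):
    """Return the candidate selection network model matching the current
    block's frame type, color component type and quantization parameter."""
    return CANDIDATE_MODELS[(frame_type, component, qp)]

model = pick_selection_model("I", "luma", 32)
```

The same lookup applies unchanged to the chroma selection network model by passing the chroma component key.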
  • the method may further include:
• acquiring a second training set; wherein the second training set includes at least one training sample, and the training sample is obtained according to at least one quantization parameter
• the third neural network structure is trained by using the luma components of the training samples in the second training set to obtain at least one candidate luma selection network model; and the third neural network structure is trained by using the chroma components of the training samples in the second training set to obtain at least one candidate chroma selection network model.
• That is, at least one candidate luma selection network model is determined by performing model training on the third neural network structure according to at least one training sample, and the at least one candidate luma selection network model has correspondences with frame type, color component type and quantization parameter.
• similarly, at least one candidate chroma selection network model is also determined by performing model training on the third neural network structure according to at least one training sample, and the at least one candidate chroma selection network model has correspondences with frame type, color component type and quantization parameter.
  • the third neural network structure may include at least one of the following: a convolutional layer, a pooling layer, a fully connected layer, and an activation layer.
  • the third neural network structure may include a sixth convolution module and a fully connected module, and the sixth convolution module and the fully connected module are connected in sequence.
• the sixth convolution module can include several convolution sub-modules, and each convolution sub-module can be composed of one convolutional layer and one pooling layer;
• the fully connected module can include several fully connected sub-modules, and each fully connected sub-module can be composed of one fully connected layer and one activation layer.
• That is, the preset selection network model can be composed of a multi-layer convolutional neural network and multi-layer fully connected layers, which are then trained on the training samples through deep learning to obtain the preset selection network model of the current block, such as the luma selection network model or the chroma selection network model.
• Exemplarily, the third neural network structure can be composed of 3 convolutional layers and 2 fully connected layers, with each convolutional layer followed by a pooling layer; wherein
• the convolution kernel of the convolutional layer can be 3×3, that is, it can be represented by 3×3 Conv;
• the pooling layer can use a maximum pooling layer, represented by 2×2 MaxPool;
• an activation layer is set after each fully connected layer; here, the activation layer can use an activation function such as ReLU or Softmax.
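As a rough illustration of this exemplary structure, the layer sequence can be written out as a schematic list. The tuple shorthand below is this sketch's own notation (not the patent's), and layer dimensions are deliberately omitted.

```python
# Schematic layer list for the exemplary third neural network structure:
# 3 convolutional layers (3x3 kernels), each followed by a 2x2 max
# pooling layer, then 2 fully connected layers, each followed by an
# activation (the last one producing the probability-like outputs).
THIRD_NET = [
    ("conv", "3x3"), ("maxpool", "2x2"),
    ("conv", "3x3"), ("maxpool", "2x2"),
    ("conv", "3x3"), ("maxpool", "2x2"),
    ("fc", "ReLU"),
    ("fc", "Softmax"),
]

n_conv = sum(1 for kind, _ in THIRD_NET if kind == "conv")
n_fc = sum(1 for kind, _ in THIRD_NET if kind == "fc")
```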
  • a loss function can also be used for model training.
  • the method may also include:
  • the second training set includes at least one training sample, and the training sample is obtained according to at least one quantization parameter;
  • the third neural network structure is trained by using the chroma components of the training samples in the second training set, and at least one candidate chroma selection network model is obtained when the loss value of the preset loss function converges to a loss threshold.
• the embodiment of the present application further provides a method for performing model training with a weighted loss function, as shown in the following formula:
• Loss = MSE(Clip(Wa·reca + Wb·recb + … + Wn·recn + Woff·rec0), orig)
• wherein Wa, Wb, …, Wn, Woff respectively represent the outputs of the preset selection network model, that is, the probability values of at least one candidate loop filter network model a, b, …, n and of not using the loop filter network model (that is, the model is off).
  • reca, recb,..., recn represent the output reconstructed image after using the candidate loop filter network models a, b,...,n, respectively, and rec0 represents the output reconstructed image after DBF and SAO.
  • the Clip function limits the value between 0 and N.
  • N represents the maximum value of the pixel value, for example, for a 10bit YUV image, N is 1023; orig represents the original image.
• That is, the at least two output probability values of the preset selection network model are used as weights for the reconstructed images output by the at least one candidate CNNLF model and by the path that does not use the CNNLF model; the mean square error between the weighted result and the original image orig is then calculated to obtain the loss function value.
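A minimal plain-Python sketch of this weighted loss, under the definitions above (weights Wa…Wn, Woff; reconstructions reca…recn, rec0; Clip to [0, N] with N = 1023 for 10-bit video). Function names and the flat-list image representation are this sketch's own conventions.

```python
def clip(v, lo, hi):
    """Clip a value into [lo, hi], matching the Clip function above."""
    return max(lo, min(hi, v))

def weighted_loss(weights, recs, orig, n_max=1023):
    """weights: [Wa, ..., Wn, Woff]; recs: matching reconstructed images
    [reca, ..., recn, rec0] as flat pixel lists; orig: original image.
    Returns the mean squared error of the clipped weighted mixture."""
    loss = 0.0
    for idx in range(len(orig)):
        mix = clip(sum(w * r[idx] for w, r in zip(weights, recs)), 0, n_max)
        loss += (mix - orig[idx]) ** 2
    return loss / len(orig)

# Two candidate models plus "model off" (rec0) on a 2-pixel image:
loss = weighted_loss([0.5, 0.25, 0.25],
                     [[100, 200], [104, 196], [96, 204]],
                     [100, 200])
```

Because every term is differentiable (apart from the clip boundaries), this loss lets the selection network and its weighting be trained end to end.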
  • the embodiment of the present application also provides a technical solution of applying the cross-entropy loss function commonly used in classification networks to the embodiment of the present application. Specifically as shown in the following formula,
• label(i) = argmin((reca−orig)², (recb−orig)², …, (recn−orig)², (rec0−orig)²)
• wherein label(i) means that the mean square error with the original image is calculated for each of the reconstructed images output by the at least one candidate loop filter network model a, b, …, n and for the reconstructed image output after DBF and SAO, and the index i corresponding to the minimum error is taken.
  • Wa, Wb,...,Wn,Woff respectively represent the output of the preset selection network model, representing at least one candidate loop filter network model a, b,...,n, and not using the loop filter network model (that is, the model is closed) probability value.
• Wi represents the probability value with the same index as label(i). The softmax of Wi is then calculated and multiplied with label(i) to obtain the cross-entropy loss value.
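The label-and-softmax computation above can be sketched in plain Python as follows: the label is the index of the reconstruction (candidates a…n plus the DBF/SAO output for "model off") with the smallest mean square error to the original, and the loss is the negative log of the softmax probability at that index. Function names and example values are this sketch's own.

```python
import math

def mse(rec, orig):
    """Mean square error between two flat pixel lists."""
    return sum((r - o) ** 2 for r, o in zip(rec, orig)) / len(orig)

def cross_entropy_loss(outputs, recs, orig):
    """outputs: [Wa, ..., Wn, Woff]; recs: [reca, ..., recn, rec0].
    Returns (loss, label) where label is the argmin-MSE index."""
    label = min(range(len(recs)), key=lambda i: mse(recs[i], orig))
    exps = [math.exp(w) for w in outputs]
    softmax_label = exps[label] / sum(exps)
    return -math.log(softmax_label), label

loss, label = cross_entropy_loss(
    [2.0, 0.5, 0.1],                     # selection network outputs
    [[100, 200], [90, 210], [80, 220]],  # candidates a, b, and rec0
    [101, 199])                          # original image
```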
• In this way, the output of the preset selection network model is the probability distribution over each candidate loop filter network model and over the case where the current block does not use the loop filter network model.
  • the determining at least two output values according to the preset selection network model of the current block may include:
  • the second reconstructed image block is input to the preset selection network model to obtain at least two output values.
  • the at least two output values may include first values corresponding to at least one candidate loop filtering network model when the current block uses the loop filtering network model and second values when the current block does not use the loop filtering network model.
  • the loop filtering network model may refer to the aforementioned CNNLF model.
• That is, the second reconstructed image block is used as the input of the preset selection network model, and the output of the preset selection network model is the probability distribution over at least one candidate CNNLF model and over the current block not using the CNNLF model (including: the first value corresponding to each of the at least one candidate CNNLF model and the second value when the current block does not use the CNNLF model).
• S1203: According to the at least two output values, determine a target loop filter network model when the current block uses the loop filter network model, or determine that the current block does not use the loop filter network model.
• That is, according to the at least two output values, it is determined either which candidate loop filter network model is the target loop filter network model or that the current block does not use the loop filter network model.
  • determining the target loop filtering network model when the current block uses the loop filtering network model or the current block does not use the loop filtering network model may include:
• if the target value is a first value, determining that the current block uses a loop filter network model, and using the candidate loop filter network model corresponding to the target value as the target loop filter network model; or,
• if the target value is the second value, determining that the current block does not use the loop filter network model.
  • the determining the target value from at least two output values may include: selecting a maximum value from at least two output values, and using the maximum value as the target value.
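This decision rule can be sketched in a few lines of plain Python: take the maximum of the output values; if the "model off" (second) value wins, the current block skips the loop filter network model, otherwise the candidate whose first value is the maximum becomes the target model. Names are invented for the sketch; tie-breaking in favor of "model off" is an assumption the text leaves open.

```python
def decide(first_values, second_value):
    """first_values: one probability per candidate loop filter model;
    second_value: probability that the current block does not use one.
    Returns the index of the target model, or None for "model off"."""
    best = max(first_values)
    if second_value >= best:
        return None                   # loop filter network model not used
    return first_values.index(best)   # index of the target model

target = decide([0.2, 0.7, 0.1], 0.05)   # candidate model 1 is selected
skipped = decide([0.2, 0.3, 0.1], 0.4)   # model off wins
```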
• In other words, whether it is a luma loop filter network model or a chroma loop filter network model, several candidate luma loop filter network models or several candidate chroma loop filter network models are first obtained through model training, and then the preset selection network model is used for the model decision: if the second value among the at least two output values is the maximum value, it can be determined that the current block does not use the loop filter network model; if the second value among the at least two output values is not the maximum value, the candidate loop filter network model corresponding to the maximum value among the first values is determined as the target loop filter network model, so that the target loop filter network model is used to filter the current block.
• the preset selection network model includes a luma selection network model and a chroma selection network model; accordingly, the second reconstructed image block may also include an input reconstructed luma image block and an input reconstructed chroma image block.
• the determining at least two output values according to the preset selection network model of the current block may include: inputting the reconstructed luma image block into the luma selection network model to obtain at least two luma output values.
• the at least two luma output values may include first values corresponding to at least one candidate luma loop filter network model when the current block uses the luma loop filter network model and a second value when the current block does not use the luma loop filter network model.
• Further, the method may further include: selecting the maximum probability value from the at least two luma output values; if the maximum probability value is a first value, determining that the current block uses a luma loop filter network model, and using the candidate luma loop filter network model corresponding to the maximum probability value as the target luma loop filter network model; or, if the maximum probability value is the second value, determining that the current block does not use the luma loop filter network model.
• the determining at least two output values according to the preset selection network model of the current block may include: inputting the reconstructed chroma image block into the chroma selection network model to obtain at least two chroma output values.
• the at least two chroma output values may include first values corresponding to at least one candidate chroma loop filter network model when the current block uses the chroma loop filter network model and a second value when the current block does not use the chroma loop filter network model.
  • the method may further include: selecting a maximum probability value from at least two chroma output values; if the maximum probability value is the first value, then Determine that the current block uses the chroma loop filter network model, and use the candidate chroma loop filter network model corresponding to the maximum probability value as the target chroma loop filter network model; or, if the maximum probability value is the second value, then determine The current block does not use the chroma loop filter network model.
  • the selected target loop filter network model can be used to filter the current block.
• Specifically, in a possible implementation manner, when the current block uses a loop filter network model, performing filtering processing on the current block by using the target loop filter network model to obtain the first reconstructed image block of the current block may include:
  • the second reconstructed image block is input to the target loop filtering network model to obtain the first reconstructed image block of the current block.
  • the method may further include: determining the second reconstructed image block as the first reconstructed image block of the current block.
• That is, if the maximum value determined from the at least two output values is the second value, it means that the rate-distortion cost of the current block not using the loop filter network model is the smallest, so it can be determined that the current block does not use the loop filter network model, that is, the second reconstructed image block is directly determined as the first reconstructed image block of the current block; if the maximum value is determined to be a certain first value, it means that the rate-distortion cost of the current block using the loop filter network model is the smallest, so the candidate loop filter network model corresponding to that first value can be determined as the target loop filter network model, and the second reconstructed image block is then input into the target loop filter network model to obtain the first reconstructed image block of the current block.
• the second reconstructed image block can be obtained after filtering processing by the deblocking filter and the sample adaptive compensation filter.
  • the loop filtering network model described in the embodiment of the present application may be a CNNLF model.
  • by using the selected CNNLF model to perform CNNLF filtering on the current block, the first reconstructed image block of the current block can be obtained.
  • the method may further include: after determining the first reconstructed image block of the current block, performing filtering processing on the first reconstructed image block by using an adaptive loop filter.
  • the second reconstructed image block is obtained after filtering through a deblocking filter (DBF) and a sample adaptive offset filter (SAO); the second reconstructed image block is then passed through the model adaptive selection module combined with the CNNLF model, and the first reconstructed image block obtained in this way can also be input into an adaptive loop filter (ALF) to continue filtering processing.
  • the method may further include:
  • the determining the identification information of the loop filtering network model may include:
  • the current block uses a loop filter network model, determine the index number of the loop filter network model corresponding to the target loop filter network model as the identification information of the loop filter network model; and/or,
  • the model closure information is determined as the identification information of the loop filtering network model.
  • if the current block uses the loop filter network model, the index number of the loop filter network model corresponding to the target loop filter network model can be determined as the identification information of the loop filter network model; if the current block does not use the loop filter network model, the model-off information can be determined as the identification information of the loop filter network model. The identification information of the loop filter network model is then encoded and written into the code stream; in this way, the decoder can directly determine, according to the decoded identification information of the loop filter network model, either that the current block does not use the loop filter network model or the index number of the loop filter network model used by the current block, thereby reducing the complexity of the decoder.
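A toy round trip may illustrate this signalling (the sentinel `MODEL_OFF` and both function names are invented for illustration; the actual bitstream syntax is defined by the tables referenced later):

```python
MODEL_OFF = -1  # illustrative stand-in for the "model off" identification information

def encode_model_id(use_model, model_index):
    # Encoder side: write the model's index number if the block uses a
    # loop filter network model, otherwise write the model-off information.
    return model_index if use_model else MODEL_OFF

def decode_model_id(model_id):
    # Decoder side: no rate-distortion search is needed; the parsed
    # identifier alone says whether to apply a model and which one.
    if model_id == MODEL_OFF:
        return False, None
    return True, model_id
```

The point of the scheme is visible in `decode_model_id`: the decoder's decision is a single table lookup rather than a re-run of the selection process.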
  • the number of convolutional layers, the number of fully connected layers, the nonlinear activation function, etc. can be adjusted.
  • besides the CNNLF model targeted by the model adaptive selection module here, model adaptive selection can also be performed for other efficient neural network filter models, which is not limited here.
  • the embodiment of this application proposes a model adaptive decision module based on deep learning, which is used to make adaptive decisions on the use of the CNNLF model; this eliminates the need to calculate rate-distortion costs and to transmit frame-level and CTU-level switch information, thereby avoiding extra bit overhead and improving encoding performance.
  • the model adaptive decision module can be regarded as a preset selection network model composed of a multi-layer convolutional neural network and a multi-layer fully connected neural network; its input is the second reconstructed image block of the current block (i.e., the input reconstructed image block of the CNNLF model), and its output is the probability distribution over each CNNLF model together with the decision to turn off the CNNLF model.
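The probability distribution mentioned here would typically come from a softmax over the final fully connected layer's scores; a numerically stable stdlib-only version, offered as an assumption about the output stage rather than the specified design, might look like:

```python
import math

def softmax(logits):
    """Turn raw scores (one per candidate CNNLF model, plus one for the
    off decision) into the probability distribution the module outputs."""
    m = max(logits)                              # subtract the max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

The subsequent decision then simply selects the entry with the largest probability.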
  • the position of the model adaptive decision-making module in the encoder/decoder is shown in Figure 5.
  • the use of the model adaptive selection module does not depend on the flag bits of DBF, SAO, ALF, or CNNLF, but it is placed before CNNLF in the filtering order.
  • the preset filter order is: DBF filter -> SAO filter -> model adaptive decision module -> CNNLF filter -> ALF filter.
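The preset order can be sketched as a chain of callables (a simplification, assuming each filter stage behaves as a pure function on the reconstructed block; all names are illustrative):

```python
def loop_filter_chain(block, dbf, sao, decide, cnnlf_models, alf):
    """Apply DBF -> SAO -> model-adaptive decision -> CNNLF -> ALF."""
    x = sao(dbf(block))              # the second reconstructed image block
    use_model, idx = decide(x)       # the model adaptive decision module
    if use_model:
        x = cnnlf_models[idx](x)     # the first reconstructed image block
    return alf(x)                    # ALF continues filtering either way
```

Note that when the decision is "off", the second reconstructed image block flows to ALF unchanged, matching the bypass described above.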
  • first judge, according to model_adaptive_decision_enable_flag, whether the model adaptive decision module is allowed for the current block to make model decisions. If model_adaptive_decision_enable_flag is "1", then try to process the current block with the model adaptive decision module and jump to (b); if model_adaptive_decision_enable_flag is "0", then jump to (e);
  • the reconstructed luma image block input to the CNNLF model is used as the input of the model adaptive decision module, and the output is the probability distribution over each luma CNNLF model together with the decision to turn off the luma CNNLF model. If the output with the largest probability value is the decision to turn off the luma CNNLF model, then jump to (e); if the output with the largest probability value is the index number of a luma CNNLF model, then select that model to perform CNNLF filtering on the current luma image block to obtain the final output reconstructed luma image block;
  • the reconstructed chroma image block input to the CNNLF model is used as the input of the model adaptive decision module, and the output is the probability distribution over each chroma CNNLF model together with the decision to turn off the chroma CNNLF model. If the output with the largest probability value is the decision to turn off the chroma CNNLF model, then jump to (e); if the output with the largest probability value is the index number of a certain chroma CNNLF model, then select that model to perform CNNLF filtering on the current chroma image block to obtain the final output reconstructed chroma image block;
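The steps above can be condensed into one hedged sketch per color component (function and parameter names are hypothetical; the real module runs the selection network to get `probs`):

```python
def maybe_apply_cnnlf(enable_flag, probs, off_index, models, block):
    """Flag-gated model-adaptive CNNLF filtering for one component."""
    # If model_adaptive_decision_enable_flag is 0, skip the decision module
    # and the CNNLF step entirely (the "jump to (e)" branch).
    if enable_flag == 0:
        return block
    # Otherwise take the output with the largest probability: a model index
    # selects that CNNLF model, while the off decision leaves the block as-is.
    best = max(range(len(probs)), key=probs.__getitem__)
    if best == off_index:
        return block
    return models[best](block)
```

The same routine would be invoked once for the luma block and once for the chroma block, each with its own model list and selection network outputs.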
  • its syntax elements are modified as follows: for the sequence header definition, the modification of the syntax elements is shown in Table 1; for the intra-frame prediction picture header definition, in Table 2; for the inter-frame prediction picture header definition, in Table 3; and for the slice definition, in Table 4.
  • This embodiment provides an encoding method, which is applied to an encoder: determine the value of the first syntax element identification information; when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, determine at least two output values according to the preset selection network model of the current block, where the at least two output values include first values corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value when the current block does not use a loop filter network model; determine, according to the at least two output values, the target loop filter network model when the current block uses a loop filter network model, or that the current block does not use a loop filter network model; and, when the current block uses a loop filter network model, use the target loop filter network model to filter the current block to obtain the first reconstructed image block of the current block.
  • in this way, the target loop filter network model when the current block uses the loop filter network model, or the fact that the current block does not use the loop filter network model, can be determined; if the current block uses a loop filter network model, the target loop filter network model can then be used to filter the current block, which not only reduces complexity but also avoids additional bit overhead and improves coding performance.
  • the encoding and decoding efficiency can be improved; in addition, the first reconstructed image block finally output can be made closer to the original image block, and the video image quality can be improved.
  • FIG. 13 shows a schematic structural diagram of an encoder 130 provided by the embodiment of the present application.
  • the encoder 130 may include: a first determining unit 1301, a first decision-making unit 1302, and a first filtering unit 1303; wherein,
  • the first determining unit 1301 is configured to determine the value of the first syntax element identification information
  • the first decision-making unit 1302 is configured to determine at least two output values according to the preset selection network model of the current block when the first syntax element identification information indicates that the current block allows the use of a preset selection network model for model decision-making; where at least two The output values include first values corresponding to at least one candidate loop filter network model when the current block uses the loop filter network model and second values when the current block does not use the loop filter network model; and according to at least two output values , determine the target loop filter network model when the current block uses the loop filter network model or the current block does not use the loop filter network model;
  • the first filtering unit 1303 is configured to, when the current block uses a loop filtering network model, use the target loop filtering network model to filter the current block to obtain a first reconstructed image block of the current block.
  • the first determining unit 1301 is further configured to determine the second reconstructed image block of the current block
  • the first filtering unit 1303 is further configured to input the second reconstructed image block into the target loop filtering network model to obtain the first reconstructed image block of the current block.
  • the first filtering unit 1303 is further configured to determine the second reconstructed image block as the first reconstructed image block of the current block.
  • the first decision-making unit 1302 is further configured to determine a target value from at least two output values; and if the target value is the first value, determine that the current block uses a loop filter network model, and set the target value to The corresponding candidate loop filter network model is used as the target loop filter network model; or, if the target value is the second value, it is determined that the current block does not use the loop filter network model.
  • the first decision-making unit 1302 is further configured to select a maximum value from at least two values, and use the maximum value as the target value.
  • the first determining unit 1301 is further configured to determine that the value of the first syntax element identification information is the first identification value if the current block allows the use of a preset selection network model for model decision-making; and/or, If the current block is not allowed to use the preset selection network model for model decision-making, then determine that the value of the first syntax element identification information is a second identification value.
  • the encoder 130 may further include an encoding unit 1304 configured to encode the value of the first syntax element identification information, and write the encoded bits into the code stream.
  • the first determining unit 1301 is further configured to determine the brightness selection network model of the current block if the color component type of the current block is a brightness component; or, if the color component type of the current block is a chroma component, Then determine the chroma selection network model of the current block;
  • the first decision-making unit 1302 is further configured to determine at least two brightness output values according to the brightness selection network model if the color component type of the current block is a brightness component, wherein the at least two brightness output values include first values corresponding to at least one candidate luma loop filter network model when the current block uses the luma loop filter network model and a second value when the current block does not use the luma loop filter network model; or, if the color component type of the current block is a chroma component, determine at least two chroma output values according to the chroma selection network model, wherein the at least two chroma output values include first values corresponding to at least one candidate chroma loop filter network model when the current block uses the chroma loop filter network model and a second value when the current block does not use the chroma loop filter network model.
  • the first determining unit 1301 is further configured to, when the color component type of the current block is a brightness component, determine the first brightness selection network model of the current block if the frame type of the frame to which the current block belongs is the first type; or, if the frame type of the frame to which the current block belongs is the second type, determine the second brightness selection network model of the current block;
  • the first decision-making unit 1302 is further configured to determine at least two brightness output values according to the first brightness selection network model if the frame type of the frame to which the current block belongs is the first type, wherein the at least two brightness output values include first values corresponding to at least one candidate first luma loop filter network model when the current block uses the first luma loop filter network model and a second value when the current block does not use the first luma loop filter network model; or, if the frame type of the frame to which the current block belongs is the second type, determine at least two brightness output values according to the second brightness selection network model, wherein the at least two brightness output values include first values corresponding to at least one candidate second luma loop filter network model when the current block uses the second luma loop filter network model and a second value when the current block does not use the second luma loop filter network model.
  • At least one candidate first luminance loop filter network model and at least one candidate second luminance loop filter network model are determined by performing model training on the first neural network structure according to at least one training sample, and at least one The candidate first luma loop filter network model and at least one candidate second luma loop filter network model have a corresponding relationship with frame type, color component type and quantization parameter.
  • the first neural network structure includes a first convolution module, a first residual module, a second convolution module and a first connection module; the first convolution module, the first residual module, the second convolution module and the first connection module are connected in sequence, and the first connection module is also connected to the input of the first convolution module.
  • the first convolutional module consists of one convolutional layer and one activation layer
  • the second convolutional module consists of two convolutional layers and one activation layer
  • the connection module consists of a skip connection layer
  • the first residual module includes several residual blocks
  • the residual block consists of two convolutional layers and one activation layer.
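The skip connections described here (one inside each residual block, plus a global connection from the module input to the first connection module) can be sketched over plain lists, with `conv1`, `conv2` and `body` standing in for the actual convolution and activation stacks (a structural illustration, not the specified network):

```python
def residual_block(x, body):
    """A residual block adds its input back onto the body's output
    (the body being two convolutional layers and an activation layer)."""
    return [xi + yi for xi, yi in zip(x, body(x))]

def first_network(x, conv1, res_blocks, conv2):
    """conv -> residual blocks -> conv, with the first connection module
    adding the original input back at the end (the global skip)."""
    y = conv1(x)
    for body in res_blocks:
        y = residual_block(y, body)
    y = conv2(y)
    return [xi + yi for xi, yi in zip(x, y)]  # first connection module
```

Such skip connections let the network learn a residual correction to the reconstructed block rather than the full signal.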
  • the first determining unit 1301 is further configured to, when the color component type of the current block is a chroma component, determine the first chroma selection network model of the current block if the frame type of the frame to which the current block belongs is the first type; or, if the frame type of the frame to which the current block belongs is the second type, determine the second chroma selection network model of the current block;
  • the first decision-making unit 1302 is further configured to determine at least two chroma output values according to the first chroma selection network model if the frame type of the frame to which the current block belongs is the first type, wherein the at least two chroma output values include first values corresponding to at least one candidate first chroma loop filter network model when the current block uses the first chroma loop filter network model and a second value when the current block does not use the first chroma loop filter network model; or, if the frame type of the frame to which the current block belongs is the second type, determine at least two chroma output values according to the second chroma selection network model, wherein the at least two chroma output values include first values corresponding to at least one candidate second chroma loop filter network model when the current block uses the second chroma loop filter network model and a second value when the current block does not use the second chroma loop filter network model.
  • At least one candidate first chroma loop filter network model and at least one candidate second chroma loop filter network model are determined by performing model training on the second neural network structure according to at least one training sample, and There is a corresponding relationship between at least one candidate first chroma loop filter network model and at least one candidate second chroma loop filter network model and frame type, color component type and quantization parameter.
  • the second neural network structure includes an upsampling module, a third convolution module, a fourth convolution module, a fusion module, a second residual module, a fifth convolution module and a second connection module; the upsampling module is connected to the third convolution module, the third convolution module and the fourth convolution module are both connected to the fusion module, the fusion module, the second residual module, the fifth convolution module and the second connection module are connected in sequence, and the second connection module is also connected to the input of the upsampling module.
  • the third convolution module consists of a convolution layer and an activation layer
  • the fourth convolution module consists of a convolution layer and an activation layer
  • the fifth convolution module consists of two convolutional layers, one activation layer and one pooling layer
  • the connection module consists of a skip connection layer
  • the second residual module includes several residual blocks
  • the residual block consists of two convolutional layers and one activation layer.
  • the first determining unit 1301 is further configured to determine at least one candidate brightness selection network model when the color component type of the current block is a brightness component, where the candidate brightness selection network models include candidate first brightness selection network models and/or candidate second brightness selection network models; and determine the frame type and quantization parameter of the frame to which the current block belongs; if the frame type is the first type, determine at least one candidate first brightness selection network model corresponding to the first type from the at least one candidate brightness selection network model, and determine the first brightness selection network model of the current block from the at least one candidate first brightness selection network model according to the quantization parameter; or, if the frame type is the second type, determine at least one candidate second brightness selection network model corresponding to the second type from the at least one candidate brightness selection network model, and determine the second brightness selection network model of the current block from the at least one candidate second brightness selection network model according to the quantization parameter.
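Because each candidate selection network model corresponds to a frame type and a quantization parameter, this selection reduces to a table lookup keyed on both; the table contents, model names and QP threshold below are invented placeholders, since the real correspondence is fixed offline during training:

```python
# Hypothetical pre-trained model table keyed by (frame_type, qp_range).
CANDIDATE_LUMA_MODELS = {
    ("I", "low_qp"):  "luma_selection_I_lowqp",
    ("I", "high_qp"): "luma_selection_I_highqp",
    ("B", "low_qp"):  "luma_selection_B_lowqp",
    ("B", "high_qp"): "luma_selection_B_highqp",
}

def select_luma_model(frame_type, qp, qp_threshold=32):
    """First narrow the candidates by frame type, then pick by QP."""
    qp_range = "low_qp" if qp < qp_threshold else "high_qp"
    return CANDIDATE_LUMA_MODELS[(frame_type, qp_range)]
```

The chroma case described next follows the same pattern with its own table of candidate chroma selection network models.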
  • the first determining unit 1301 is further configured to determine at least one candidate chroma selection network model when the color component type of the current block is a chroma component, where the candidate chroma selection network models include candidate first chroma selection network models and/or candidate second chroma selection network models; and determine the frame type and quantization parameter of the frame to which the current block belongs; if the frame type is the first type, determine at least one candidate first chroma selection network model corresponding to the first type from the at least one candidate chroma selection network model, and determine the first chroma selection network model of the current block from the at least one candidate first chroma selection network model according to the quantization parameter; or, if the frame type is the second type, determine at least one candidate second chroma selection network model corresponding to the second type from the at least one candidate chroma selection network model, and determine the second chroma selection network model of the current block from the at least one candidate second chroma selection network model according to the quantization parameter.
  • At least one candidate brightness selection network model is determined by performing model training on the third neural network structure according to at least one training sample, and at least one candidate brightness selection network model is related to frame type, color component type and quantization parameter There is a corresponding relationship between them.
  • At least one candidate chroma selection network model is determined by performing model training on the third neural network structure according to at least one training sample, and at least one candidate chroma selection network model is related to the frame type, color component type and quantization There is a correspondence between the parameters.
  • the third neural network structure includes a sixth convolution module and a fully connected module, and the sixth convolution module and the fully connected module are connected in sequence; wherein, the sixth convolution module includes several convolution sub-modules, The convolutional sub-module consists of a convolutional layer and a pooling layer; the fully-connected module includes several fully-connected sub-modules, and the fully-connected sub-module consists of a fully-connected layer and an activation layer.
  • the first determining unit 1301 is further configured to determine identification information of the loop filter network model
  • the encoding unit 1304 is further configured to encode the identification information of the loop filter network model, and write the encoded bits into the code stream.
  • the first determining unit 1301 is further configured to determine the index number of the loop filtering network model corresponding to the target loop filtering network model as the loop filtering network model if the current block uses the loop filtering network model Identification information; and/or, if the current block does not use the loop filtering network model, determine the model closure information as the identification information of the loop filtering network model.
  • the loop filtering network model is a CNNLF model.
  • the first decision unit 1302 is further configured to determine a second reconstructed image block of the current block; and input the second reconstructed image block into a preset selection network model to obtain at least two output values.
  • the second reconstructed image block is obtained after filtering through a deblocking filter and a sample adaptive offset filter.
  • the first filtering unit 1303 is further configured to, after determining the first reconstructed image block, use an adaptive loop filter to perform filtering processing on the first reconstructed image block.
  • a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc., of course it may also be a module, or it may be non-modular.
  • each component in this embodiment may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software function modules.
  • the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or the whole or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor execute all or part of the steps of the method described in this embodiment.
  • the aforementioned storage medium includes various media that can store program codes, such as a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
  • the embodiment of the present application provides a computer storage medium, which is applied to the encoder 130; the computer storage medium stores a computer program, and when the computer program is executed by the first processor, the method described in any one of the preceding embodiments is implemented.
  • FIG. 14 shows a schematic diagram of a specific hardware structure of the encoder 130 provided by the embodiment of the present application.
  • it may include: a first communication interface 1401 , a first memory 1402 and a first processor 1403 ; each component is coupled together through a first bus system 1404 .
  • the first bus system 1404 is used to realize connection and communication between these components.
  • the first bus system 1404 includes not only a data bus but also a power bus, a control bus and a status signal bus. However, for clarity of illustration, the various buses are labeled as the first bus system 1404 in FIG. 14.
  • the first communication interface 1401 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
  • the first memory 1402 is used to store computer programs that can run on the first processor 1403;
  • the first processor 1403 is configured to, when running the computer program, execute:
  • at least two output values are determined according to the preset selection network model of the current block, wherein the at least two output values include first values corresponding to at least one candidate loop filter network model when the current block uses the loop filter network model and a second value when the current block does not use the loop filter network model; and the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block.
  • the first memory 1402 in the embodiment of the present application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memories.
  • the non-volatile memory can be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or a flash memory.
  • the volatile memory can be Random Access Memory (RAM), which acts as external cache memory.
  • By way of example but not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), Synchlink Dynamic Random Access Memory (SLDRAM) and Direct Rambus Random Access Memory (Direct Rambus RAM, DR RAM).
  • the first memory 1402 of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
  • the first processor 1403 may be an integrated circuit chip, which has signal processing capabilities. In the implementation process, each step of the above method may be implemented by an integrated logic circuit of hardware in the first processor 1403 or an instruction in the form of software.
  • the above-mentioned first processor 1403 may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a ready-made programmable gate array (Field Programmable Gate Array, FPGA) Or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components.
  • Various methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, register.
  • the storage medium is located in the first memory 1402, and the first processor 1403 reads the information in the first memory 1402, and completes the steps of the above method in combination with its hardware.
  • the embodiments described in this application may be implemented by hardware, software, firmware, middleware, microcode or a combination thereof.
  • the processing unit can be implemented in one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processor (Digital Signal Processing, DSP), digital signal processing device (DSP Device, DSPD), programmable Logic device (Programmable Logic Device, PLD), Field-Programmable Gate Array (Field-Programmable Gate Array, FPGA), general-purpose processor, controller, microcontroller, microprocessor, other devices for performing the functions described in this application electronic unit or its combination.
  • the techniques described herein can be implemented through modules (eg, procedures, functions, and so on) that perform the functions described herein.
  • Software codes can be stored in memory and executed by a processor. Memory can be implemented within the processor or external to the processor.
  • the first processor 1403 is further configured to execute the method described in any one of the foregoing embodiments when running the computer program.
  • This embodiment provides an encoder, and the encoder may include a first determining unit, a first decision-making unit, and a first filtering unit.
  • FIG. 15 shows a schematic diagram of the composition and structure of a decoder 150 provided in the embodiment of the present application.
  • the decoder 150 may include: an analysis unit 1501, a second decision-making unit 1502, and a second filtering unit 1503; wherein,
  • the parsing unit 1501 is configured to parse the code stream and determine the value of the first syntax element identification information
  • the second decision-making unit 1502 is configured to determine at least two output values according to the preset selection network model of the current block when the first syntax element identification information indicates that the current block allows the use of a preset selection network model for model decision-making; where at least two The output values include first values corresponding to at least one candidate loop filter network model when the current block uses the loop filter network model and second values when the current block does not use the loop filter network model; and according to at least two output values , determine the target loop filter network model when the current block uses the loop filter network model or the current block does not use the loop filter network model;
  • the second filtering unit 1503 is configured to use the target loop filtering network model to filter the current block to obtain the first reconstructed image block of the current block when the current block uses the loop filtering network model.
  • the decoder 150 may further include a second determining unit 1504, which determines the second reconstructed image block of the current block;
  • the second filtering unit 1503 is further configured to input the second reconstructed image block into the target loop filtering network model to obtain the first reconstructed image block of the current block.
  • the second filtering unit 1503 is further configured to determine the second reconstructed image block as the first reconstructed image block of the current block when the current block does not use a loop filtering network model.
  • the second decision unit 1502 is further configured to determine a target value from the at least two output values; and, if the target value is a first value, to determine that the current block uses a loop filter network model and to take the candidate loop filter network model corresponding to the target value as the target loop filter network model; or, if the target value is the second value, to determine that the current block does not use a loop filter network model.
  • the second decision unit 1502 is further configured to select a maximum value from the at least two output values and use the maximum value as the target value.
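As a concrete illustration of the decision rule above, the following sketch picks the target by taking the maximum of the output values. The helper name and the convention that the last output is the "do not use a loop filter network model" option are assumptions for illustration, not from the patent text:

```python
# Hypothetical sketch: the preset selection network emits one score per
# candidate loop filter network model (first values) plus one score for
# "do not filter" (second value, assumed here to be the last entry);
# the target is simply the index of the maximum score.

def decide_model(output_values):
    """output_values: scores; indices 0..N-1 are candidate loop filter
    network models, index N is the no-filter option."""
    target_index = max(range(len(output_values)), key=lambda i: output_values[i])
    no_filter_index = len(output_values) - 1
    if target_index == no_filter_index:
        return None  # current block does not use a loop filter network model
    return target_index  # index of the target loop filter network model
```

When the no-filter score wins, no model index needs to be signalled; both encoder and decoder reach the same conclusion by running the same network.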
  • the second determination unit 1504 is further configured to determine, if the value of the first syntax element identification information is a first identification value, that the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making; or, if the value of the first syntax element identification information is a second identification value, to determine that the first syntax element identification information indicates that the current block is not allowed to use the preset selection network model for model decision-making.
  • the second determination unit 1504 is further configured to determine the brightness selection network model of the current block if the color component type of the current block is a brightness component; or, if the color component type of the current block is a chroma component, to determine the chroma selection network model of the current block;
  • the second decision-making unit 1502 is further configured to determine at least two brightness output values according to the brightness selection network model if the color component type of the current block is a brightness component, where the at least two brightness output values include first values respectively corresponding to at least one candidate brightness loop filter network model when the current block uses a brightness loop filter network model and a second value for when the current block does not use a brightness loop filter network model; or, if the color component type of the current block is a chroma component, to determine at least two chroma output values according to the chroma selection network model, where the at least two chroma output values include first values respectively corresponding to at least one candidate chroma loop filter network model when the current block uses a chroma loop filter network model and a second value for when the current block does not use a chroma loop filter network model.
  • the second determining unit 1504 is further configured to, when the color component type of the current block is a brightness component, determine the first brightness selection network model of the current block if the frame type of the frame to which the current block belongs is a first type; or, if the frame type of the frame to which the current block belongs is a second type, to determine the second brightness selection network model of the current block;
  • the second decision-making unit 1502 is further configured to determine at least two brightness output values according to the first brightness selection network model if the frame type of the frame to which the current block belongs is the first type, where the at least two brightness output values include first values respectively corresponding to at least one candidate first brightness loop filter network model when the current block uses a first brightness loop filter network model and a second value for when the current block does not use a first brightness loop filter network model; or, if the frame type of the frame to which the current block belongs is the second type, to determine at least two brightness output values according to the second brightness selection network model, where the at least two brightness output values include first values respectively corresponding to at least one candidate second brightness loop filter network model when the current block uses a second brightness loop filter network model and a second value for when the current block does not use a second brightness loop filter network model.
  • At least one candidate first luminance loop filter network model and at least one candidate second luminance loop filter network model are determined by performing model training on the first neural network structure according to at least one training sample, and the at least one candidate first luminance loop filter network model and the at least one candidate second luminance loop filter network model have a corresponding relationship with frame type, color component type and quantization parameter.
  • the first neural network structure includes a first convolution module, a first residual module, a second convolution module and a first connection module; the first convolution module, the first residual module, the second convolution module and the first connection module are connected in sequence, and the first connection module is also connected to the input of the first convolution module.
  • the first convolution module consists of one convolutional layer and one activation layer;
  • the second convolution module consists of two convolutional layers and one activation layer;
  • the first connection module consists of a skip connection layer;
  • the first residual module includes several residual blocks; and
  • each residual block consists of two convolutional layers and one activation layer.
  • the second determining unit 1504 is further configured to, when the color component type of the current block is a chroma component, determine the first chroma selection network model of the current block if the frame type of the frame to which the current block belongs is the first type; or, if the frame type of the frame to which the current block belongs is the second type, to determine the second chroma selection network model of the current block;
  • the second decision-making unit 1502 is further configured to determine at least two chroma output values according to the first chroma selection network model if the frame type of the frame to which the current block belongs is the first type, where the at least two chroma output values include first values respectively corresponding to at least one candidate first chroma loop filter network model when the current block uses a first chroma loop filter network model and a second value for when the current block does not use a first chroma loop filter network model; or, if the frame type of the frame to which the current block belongs is the second type, to determine at least two chroma output values according to the second chroma selection network model, where the at least two chroma output values include first values respectively corresponding to at least one candidate second chroma loop filter network model when the current block uses a second chroma loop filter network model and a second value for when the current block does not use a second chroma loop filter network model.
  • At least one candidate first chroma loop filter network model and at least one candidate second chroma loop filter network model are determined by performing model training on the second neural network structure according to at least one training sample, and the at least one candidate first chroma loop filter network model and the at least one candidate second chroma loop filter network model have a corresponding relationship with frame type, color component type and quantization parameter.
  • the second neural network structure includes an upsampling module, a third convolution module, a fourth convolution module, a fusion module, a second residual module, a fifth convolution module and a second connection module; the upsampling module is connected to the third convolution module, the third convolution module and the fourth convolution module are both connected to the fusion module, the fusion module, the second residual module, the fifth convolution module and the second connection module are connected in sequence, and the second connection module is also connected to the input of the upsampling module.
  • the third convolution module consists of one convolution layer and one activation layer;
  • the fourth convolution module consists of one convolution layer and one activation layer;
  • the fifth convolution module consists of two convolutional layers, one activation layer and one pooling layer;
  • the second connection module consists of a skip connection layer;
  • the second residual module includes several residual blocks; and
  • each residual block consists of two convolutional layers and one activation layer.
  • the second determination unit 1504 is further configured to determine at least one candidate brightness selection network model when the color component type of the current block is a brightness component, the candidate brightness selection network models including candidate first brightness selection network models and/or candidate second brightness selection network models; and to determine the frame type and quantization parameter of the frame to which the current block belongs; if the frame type is the first type, to determine at least one candidate first brightness selection network model corresponding to the first type from the at least one candidate brightness selection network model, and to determine the first brightness selection network model of the current block from the at least one candidate first brightness selection network model according to the quantization parameter; or, if the frame type is the second type, to determine at least one candidate second brightness selection network model corresponding to the second type from the at least one candidate brightness selection network model, and to determine the second brightness selection network model of the current block from the at least one candidate second brightness selection network model according to the quantization parameter.
  • the second determination unit 1504 is further configured to determine at least one candidate chroma selection network model when the color component type of the current block is a chroma component, the candidate chroma selection network models including candidate first chroma selection network models and/or candidate second chroma selection network models; and to determine the frame type and quantization parameter of the frame to which the current block belongs; if the frame type is the first type, to determine at least one candidate first chroma selection network model corresponding to the first type from the at least one candidate chroma selection network model, and to determine the first chroma selection network model of the current block from the at least one candidate first chroma selection network model according to the quantization parameter; or, if the frame type is the second type, to determine at least one candidate second chroma selection network model corresponding to the second type from the at least one candidate chroma selection network model, and to determine the second chroma selection network model of the current block from the at least one candidate second chroma selection network model according to the quantization parameter.
  • At least one candidate brightness selection network model is determined by performing model training on the third neural network structure according to at least one training sample, and the at least one candidate brightness selection network model has a corresponding relationship with frame type, color component type and quantization parameter.
  • At least one candidate chroma selection network model is determined by performing model training on the third neural network structure according to at least one training sample, and the at least one candidate chroma selection network model has a corresponding relationship with frame type, color component type and quantization parameter.
  • the third neural network structure includes a sixth convolution module and a fully connected module, and the sixth convolution module and the fully connected module are connected in sequence; where the sixth convolution module includes several convolution sub-modules, each convolution sub-module consisting of one convolutional layer and one pooling layer, and the fully connected module includes several fully connected sub-modules, each fully connected sub-module consisting of one fully connected layer and one activation layer.
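For intuition, the following sketch traces tensor sizes through such a structure. The 3x3 stride-1 padded convolutions and 2x2 pooling are assumptions for illustration, since the text does not fix the layer hyper-parameters:

```python
# Illustrative only: assumes each convolution sub-module keeps the spatial
# size through a 3x3, stride-1, padded convolution and then halves it with
# a 2x2 pooling layer; the fully connected module maps the flattened
# features to the output scores (one per candidate model plus no-filter).

def selection_net_shapes(input_size, channels, num_conv_submodules, num_outputs):
    h = w = input_size
    for _ in range(num_conv_submodules):
        h //= 2  # pooling halves height
        w //= 2  # pooling halves width
    flattened = h * w * channels  # input size of the fully connected module
    return flattened, num_outputs
```

For example, a 64x64 block with 32 feature channels and three convolution sub-modules would hand a 2048-dimensional vector to the fully connected module.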
  • the parsing unit 1501 is further configured to parse the code stream and determine the identification information of the loop filter network model when the first syntax element identification information indicates that the current block allows the use of a preset selection network model for model decision-making;
  • the second determining unit 1504 is further configured to determine that the current block does not use a loop filter network model if the identification information of the loop filter network model is model-off information; or, if the identification information of the loop filter network model is a loop filter network model index number, to determine the target loop filter network model used by the current block from at least one candidate loop filter network model according to the loop filter network model index number;
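A minimal sketch of this decoder-side branch, assuming a placeholder MODEL_OFF value for the "model closing information" (the actual signalled values are not specified here):

```python
MODEL_OFF = -1  # hypothetical encoding of the model-off information

def pick_target_model(model_id, candidate_models):
    """model_id: parsed identification information of the loop filter
    network model; candidate_models: the candidate loop filter network
    models, indexed by the signalled index number."""
    if model_id == MODEL_OFF:
        return None  # current block does not use a loop filter network model
    # otherwise the identification information is an index number
    return candidate_models[model_id]
```

This is the fallback signalling path: instead of running the selection network, the decoder simply reads the choice from the bitstream.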
  • the second filtering unit 1503 is further configured to use the target in-loop filtering network model to perform filtering processing on the current block to obtain a first reconstructed image block of the current block.
  • the loop filtering network model is a CNNLF model.
  • the second determining unit 1504 is further configured to determine the second reconstructed image block of the current block
  • the second decision unit 1502 is further configured to input the second reconstructed image block into the preset selection network model to obtain at least two output values.
  • the second reconstructed image block is obtained after filtering through a deblocking filter (DBF) and a sample adaptive offset (SAO) filter.
  • the second filtering unit 1503 is further configured to, after determining the first reconstructed image block, use an adaptive correction filter to perform filtering processing on the first reconstructed image block.
  • a "unit” may be a part of a circuit, a part of a processor, a part of a program or software, etc., of course it may also be a module, or it may be non-modular.
  • each component in this embodiment may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software function modules.
  • if the integrated units are implemented in the form of software function modules and are not sold or used as independent products, they can be stored in a computer-readable storage medium.
  • this embodiment provides a computer storage medium applied to the decoder 150; the computer storage medium stores a computer program, and when the computer program is executed by the second processor, the method described in any one of the preceding embodiments is implemented.
  • FIG. 16 shows a schematic diagram of a specific hardware structure of the decoder 150 provided by the embodiment of the present application.
  • it may include: a second communication interface 1601 , a second memory 1602 and a second processor 1603 ; all components are coupled together through a second bus system 1604 .
  • the second bus system 1604 is used to realize connection and communication between these components.
  • the second bus system 1604 includes not only a data bus, but also a power bus, a control bus and a status signal bus. However, for clarity of illustration, the various buses are labeled as the second bus system 1604 in FIG. 16. Wherein,
  • the second communication interface 1601 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
  • the second memory 1602 is used to store computer programs that can run on the second processor 1603;
  • the second processor 1603 is configured to, when running the computer program, execute:
  • at least two output values are determined according to the preset selection network model of the current block; wherein the at least two output values include first values respectively corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for when the current block does not use a loop filter network model;
  • the target loop filtering network model is used to filter the current block to obtain the first reconstructed image block of the current block.
  • the second processor 1603 is further configured to execute the method described in any one of the foregoing embodiments when running the computer program.
  • the hardware function of the second memory 1602 is similar to that of the first memory 1402, and the hardware function of the second processor 1603 is similar to that of the first processor 1403; details will not be described here.
  • This embodiment provides a decoder, which may include an analysis unit, a second decision unit, and a second filter unit.
  • FIG. 17 shows a schematic diagram of the composition and structure of a codec system provided by the embodiment of the present application.
  • a codec system 170 may include the encoder 130 described in any one of the foregoing embodiments and the decoder 150 described in any one of the foregoing embodiments.
  • the embodiment of the present application further provides a code stream, which is generated by performing bit coding according to information to be coded; wherein the information to be coded includes the value of the first syntax element identification information, and the first syntax element identification information is used to indicate whether the current block is allowed to use a preset selection network model for model decision-making.
  • the information to be encoded here may also include the identification information of the loop filter network model; wherein the identification information of the loop filter network model is used to determine the index number of the loop filter network model when the current block uses a loop filter network model, or that the current block does not use a loop filter network model.
  • the encoder 130 may transmit the code stream to the decoder 150 .
  • the decoder 150 can obtain the value of the first syntax element identification information by parsing the code stream, so as to determine whether the current block is allowed to use the preset selection network model for model decision-making.
  • the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model, can be determined; if the current block uses a loop filter network model, the target loop filter network model can also be used to filter the current block, which can not only reduce complexity but also avoid additional bit overhead and improve coding performance, thereby improving codec efficiency; in addition, it can make the finally output first reconstructed image block closer to the original image block, improving video image quality.
  • in this way, whether at the encoder or the decoder, after the value of the first syntax element identification information is determined, when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, at least two output values are determined according to the preset selection network model of the current block, the at least two output values including first values respectively corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for when the current block does not use a loop filter network model; according to the at least two output values, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model, is determined; and when the current block uses a loop filter network model, the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block.
  • the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model, can be determined; if the current block uses a loop filter network model, the target loop filter network model can also be used to filter the current block, which not only reduces complexity but also avoids additional bit overhead and improves coding performance, so that codec efficiency can be improved; in addition, the finally output first reconstructed image block can be made closer to the original image block, improving video image quality.


Abstract

The embodiments of the present application disclose an encoding and decoding method, a bitstream, an encoder, a decoder, a system and a storage medium. The method includes: parsing the bitstream and determining the value of first syntax element identification information; when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, determining at least two output values according to the preset selection network model of the current block, the at least two output values including first values respectively corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for when the current block does not use a loop filter network model; determining, according to the at least two output values, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model; and, when the current block uses a loop filter network model, filtering the current block with the target loop filter network model to obtain a first reconstructed image block of the current block.

Description

Encoding and Decoding Method, Bitstream, Encoder, Decoder, System and Storage Medium
Technical Field
The embodiments of the present application relate to the technical field of image processing, and in particular to an encoding and decoding method, a bitstream, an encoder, a decoder, a system and a storage medium.
Background
In video codec systems, loop filters are used to improve the subjective and objective quality of reconstructed pictures. In the loop filtering part, although some model selection schemes currently exist, most of them select a better-performing model by computing the rate-distortion cost of each model, which has high complexity; moreover, for the selected model, the rate-distortion cost is also needed to decide whether to switch the model on, and frame-level, block-level and other switch information has to be written into the bitstream, causing additional bit overhead.
Summary
The embodiments of the present application provide an encoding and decoding method, a bitstream, an encoder, a decoder, a system and a storage medium, which can not only reduce complexity but also avoid additional bit overhead and improve coding performance, thereby improving codec efficiency.
The technical solutions of the embodiments of the present application may be implemented as follows:
In a first aspect, an embodiment of the present application provides a decoding method applied to a decoder, the method including:
parsing the bitstream and determining the value of first syntax element identification information;
when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, determining at least two output values according to the preset selection network model of the current block; where the at least two output values include first values respectively corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for when the current block does not use a loop filter network model;
determining, according to the at least two output values, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model;
when the current block uses a loop filter network model, filtering the current block with the target loop filter network model to obtain a first reconstructed image block of the current block.
In a second aspect, an embodiment of the present application provides an encoding method applied to an encoder, the method including:
determining the value of first syntax element identification information;
when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, determining at least two output values according to the preset selection network model of the current block; where the at least two output values include first values respectively corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for when the current block does not use a loop filter network model;
determining, according to the at least two output values, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model;
when the current block uses a loop filter network model, filtering the current block with the target loop filter network model to obtain a first reconstructed image block of the current block.
In a third aspect, an embodiment of the present application provides a bitstream generated by bit-encoding information to be encoded; where the information to be encoded includes the value of first syntax element identification information, and the first syntax element identification information is used to indicate whether the current block is allowed to use a preset selection network model for model decision-making.
In a fourth aspect, an embodiment of the present application provides an encoder including a first determining unit, a first decision unit and a first filtering unit; wherein,
the first determining unit is configured to determine the value of first syntax element identification information;
the first decision unit is configured to, when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, determine at least two output values according to the preset selection network model of the current block, where the at least two output values include first values respectively corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for when the current block does not use a loop filter network model; and to determine, according to the at least two output values, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model;
the first filtering unit is configured to, when the current block uses a loop filter network model, filter the current block with the target loop filter network model to obtain a first reconstructed image block of the current block.
In a fifth aspect, an embodiment of the present application provides an encoder including a first memory and a first processor; wherein,
the first memory is used to store a computer program capable of running on the first processor;
the first processor is used to execute the method described in the second aspect when running the computer program.
In a sixth aspect, an embodiment of the present application provides a decoder including a parsing unit, a second decision unit and a second filtering unit; wherein,
the parsing unit is configured to parse the bitstream and determine the value of first syntax element identification information;
the second decision unit is configured to, when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, determine at least two output values according to the preset selection network model of the current block, where the at least two output values include first values respectively corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for when the current block does not use a loop filter network model; and to determine, according to the at least two output values, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model;
the second filtering unit is configured to, when the current block uses a loop filter network model, filter the current block with the target loop filter network model to obtain a first reconstructed image block of the current block.
In a seventh aspect, an embodiment of the present application provides a decoder including a second memory and a second processor; wherein,
the second memory is used to store a computer program capable of running on the second processor;
the second processor is used to execute the method described in the first aspect when running the computer program.
In an eighth aspect, an embodiment of the present application provides a codec system including the encoder described in the fourth or fifth aspect and the decoder described in the sixth or seventh aspect.
In a ninth aspect, an embodiment of the present application provides a computer storage medium storing a computer program which, when executed, implements the method described in the first aspect or the method described in the second aspect.
The embodiments of the present application provide an encoding and decoding method, a bitstream, an encoder, a decoder, a system and a storage medium. At the encoder side, the value of the first syntax element identification information is determined; when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, at least two output values are determined according to the preset selection network model of the current block, where the at least two output values include first values respectively corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for when the current block does not use a loop filter network model; according to the at least two output values, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model, is determined; when the current block uses a loop filter network model, the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block. At the decoder side, the bitstream is parsed and the value of the first syntax element identification information is determined; the same model decision and filtering process is then performed as at the encoder side. In this way, by introducing deep-learning-based neural network technology to make an adaptive decision on the loop filter network model, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model, can be determined; if the current block uses a loop filter network model, the target loop filter network model can then be used to filter the current block. This not only reduces complexity but also avoids additional bit overhead and improves coding performance, thereby improving codec efficiency; in addition, it can make the finally output first reconstructed image block closer to the original image block, improving video image quality.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the application of an encoding framework provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of the application of another encoding framework provided by an embodiment of the present application;
FIG. 3A is a detailed framework schematic diagram of a video encoding system provided by an embodiment of the present application;
FIG. 3B is a detailed framework schematic diagram of a video decoding system provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of a decoding method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of the application of yet another encoding framework provided by an embodiment of the present application;
FIG. 6A is a schematic diagram of the network structure of a luma loop filter network model provided by an embodiment of the present application;
FIG. 6B is a schematic diagram of the network structure of a chroma loop filter network model provided by an embodiment of the present application;
FIG. 7A is a schematic diagram of the network structure of another luma loop filter network model provided by an embodiment of the present application;
FIG. 7B is a schematic diagram of the network structure of another chroma loop filter network model provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the network structure of a residual block provided by an embodiment of the present application;
FIG. 9A is a schematic diagram of the composition and structure of a preset selection network model provided by an embodiment of the present application;
FIG. 9B is a schematic diagram of the composition and structure of another preset selection network model provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of the overall framework based on a preset selection network model provided by an embodiment of the present application;
FIG. 11 is a schematic flowchart of another decoding method provided by an embodiment of the present application;
FIG. 12 is a schematic flowchart of an encoding method provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of the composition and structure of an encoder provided by an embodiment of the present application;
FIG. 14 is a schematic diagram of a specific hardware structure of an encoder provided by an embodiment of the present application;
FIG. 15 is a schematic diagram of the composition and structure of a decoder provided by an embodiment of the present application;
FIG. 16 is a schematic diagram of a specific hardware structure of a decoder provided by an embodiment of the present application;
FIG. 17 is a schematic diagram of the composition and structure of a codec system provided by an embodiment of the present application.
Detailed Description
In order to understand the features and technical content of the embodiments of the present application in more detail, the implementation of the embodiments of the present application is described below with reference to the accompanying drawings, which are provided for reference and illustration only and are not intended to limit the embodiments of the present application.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the present application. The terms used herein are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
In the following description, reference to "some embodiments" describes a subset of all possible embodiments; it is to be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict. It should also be noted that the terms "first/second/third" in the embodiments of the present application are only used to distinguish similar objects and do not denote a particular ordering of the objects; it is to be understood that "first/second/third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described herein can be implemented in an order other than that illustrated or described herein.
Before describing the embodiments of the present application in further detail, the names and terms involved in the embodiments of the present application are explained; they are to be interpreted as follows:
Joint Video Experts Team (JVET)
H.266/Versatile Video Coding (VVC), the new-generation video coding standard
VVC reference software test platform (VVC Test Model, VTM)
Audio Video coding Standard (AVS)
High-Performance test Model of AVS (High-Performance Model, HPM)
High Performance-Modular Artificial Intelligence Model of AVS (HPM-ModAI)
Convolutional Neural Network based in-Loop Filter (CNNLF)
DeBlocking Filter (DBF)
Sample Adaptive Offset (SAO)
Adaptive Loop Filter (ALF)
Quantization Parameter (QP)
Coding Unit (CU)
Coding Tree Unit (CTU)
It can be understood that digital video compression technology mainly compresses huge amounts of digital image and video data to facilitate transmission and storage. With the surge of Internet video and people's increasing requirements for video definition, although existing digital video compression standards can already save a lot of video data, better digital video compression technology is still needed to reduce the bandwidth and traffic pressure of digital video transmission.
In the digital video encoding process, the encoder reads unequal numbers of pixels for original video sequences of different color formats, including a luma component and a chroma component; that is, the encoder reads a black-and-white or color picture. The picture is then partitioned into blocks, and the block data is handed to the encoder for encoding. Today's encoders usually adopt a hybrid coding framework, which generally includes operations such as intra prediction and inter prediction, transform/quantization, inverse quantization/inverse transform, loop filtering and entropy coding; the processing flow is shown in FIG. 1. Here, intra prediction refers only to information of the same picture and predicts the pixel information within the current partitioned block, eliminating spatial redundancy; inter prediction may include motion estimation and motion compensation, which can refer to picture information of different frames and use motion estimation to search for the motion vector information that best matches the current partitioned block, eliminating temporal redundancy; the transform converts the predicted picture block into the frequency domain, redistributing the energy, and combined with quantization can remove information to which the human eye is insensitive, eliminating visual redundancy; entropy coding can eliminate character redundancy according to the current context model and the probability information of the binary bitstream; loop filtering mainly processes the pixels after inverse transform and inverse quantization to compensate for distortion information and provide a better reference for subsequently encoded pixels.
For video coding standards, in the loop filtering part, a conventional loop filtering module mainly includes a deblocking filter (hereinafter referred to as DBF), a sample adaptive offset filter (hereinafter referred to as SAO) and an adaptive loop filter (hereinafter referred to as ALF). In the application of HPM-ModAI, a convolutional-neural-network-based in-loop filter (hereinafter referred to as CNNLF) is also adopted as the baseline scheme of the intelligent loop filtering module and is placed between SAO filtering and ALF filtering, as shown in FIG. 2. In encoding tests, according to the common test conditions for intelligent coding, for the All Intra configuration, ALF is turned on and DBF and SAO are turned off; for the Random Access and Low Delay configurations, DBF for I frames is turned on, ALF is turned on, and SAO is turned off.
In practical applications, especially in HPM-ModAI, the QP range is divided into 4 intervals of 27 to 31, 32 to 37, 38 to 44, and 45 to 50, and a total of 16 candidate CNNLF models are trained: 4 I-frame luma component models, 4 non-I-frame luma component models, 4 chroma U component models and 4 chroma V component models. During encoding, according to characteristics such as frame type, QP and color component type, one corresponding CNNLF model needs to be manually selected from these candidate CNNLF models; for example, whether to invoke the CNNLF model can be decided by the rate-distortion cost, and frame-level, CTU-level and other switch information is written into the bitstream. For configurations such as Random Access and Low Delay, the QP of each frame fluctuates to some extent relative to the initial QP during encoding, so the selected CNNLF model is not necessarily the model that gives the best filtering effect for that frame.
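A minimal sketch of this kind of lookup over the 16 candidate models; the interval boundaries follow the text, while the clamping of QPs outside 27 to 50 and the index layout are assumptions for illustration:

```python
# 16 candidate CNNLF models indexed by (model kind, QP interval):
# 4 I-frame luma, 4 non-I-frame luma, 4 chroma-U and 4 chroma-V models.

QP_INTERVALS = [(27, 31), (32, 37), (38, 44), (45, 50)]

def qp_interval_index(qp):
    for i, (lo, hi) in enumerate(QP_INTERVALS):
        if lo <= qp <= hi:
            return i
    return 0 if qp < 27 else 3  # assumed clamping outside the listed ranges

def cnnlf_model_index(is_intra, component, qp):
    """component: 'Y', 'U' or 'V'; returns an index into the 16 candidates."""
    if component == 'Y':
        kind = 0 if is_intra else 1  # I-frame vs non-I-frame luma models
    else:
        kind = 2 if component == 'U' else 3  # chroma U vs chroma V models
    return kind * 4 + qp_interval_index(qp)
```

This makes the drawback described above concrete: the lookup is keyed on the initial QP, so when the per-frame QP drifts the indexed model may no longer be the best one for the frame.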
That is to say, existing neural network loop filter technologies often train multiple candidate models for characteristics such as frame type, QP and color component type. During encoding, either a model is selected manually and frame-level, CTU-level and other switch information is encoded into the bitstream, or a model is selected by the rate-distortion cost and the frame-level, CTU-level and other switch information, together with the model index number, is written into the bitstream. Although a deep-learning-based model adaptive selection technical solution can be proposed to optimize the model selection operation of the neural network loop filter, for the selected model it is still necessary to decide whether to switch the model on by means of the rate-distortion cost and to write the frame-level, CTU-level and other switch information into the bitstream, causing additional bit overhead.
An embodiment of the present application provides an encoding method. At the encoder side, the value of the first syntax element identification information is determined; when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, at least two output values are determined according to the preset selection network model of the current block, where the at least two output values include first values respectively corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for when the current block does not use a loop filter network model; according to the at least two output values, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model, is determined; when the current block uses a loop filter network model, the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block.
An embodiment of the present application further provides a decoding method. At the decoder side, the bitstream is parsed and the value of the first syntax element identification information is determined; when the first syntax element identification information indicates that the current block is allowed to use a preset selection network model for model decision-making, at least two output values are determined according to the preset selection network model of the current block, where the at least two output values include first values respectively corresponding to at least one candidate loop filter network model when the current block uses a loop filter network model and a second value for when the current block does not use a loop filter network model; according to the at least two output values, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model, is determined; when the current block uses a loop filter network model, the target loop filter network model is used to filter the current block to obtain the first reconstructed image block of the current block.
In this way, by introducing deep-learning-based neural network technology to make an adaptive decision on the loop filter network model, the target loop filter network model for when the current block uses a loop filter network model, or that the current block does not use a loop filter network model, can be determined; if the current block uses a loop filter network model, the target loop filter network model can then be used to filter the current block. This not only reduces complexity but also avoids additional bit overhead and improves coding performance, thereby improving codec efficiency; in addition, it can make the finally output first reconstructed image block closer to the original image block, improving video image quality.
The embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to FIG. 3A, which shows a detailed framework schematic diagram of a video encoding system provided by an embodiment of the present application. As shown in FIG. 3A, the video encoding system 10 includes a transform and quantization unit 101, an intra estimation unit 102, an intra prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, a decoded picture buffer unit 110 and the like, where the filtering unit 108 can implement DBF filtering/SAO filtering/ALF filtering, and the encoding unit 109 can implement header information encoding and Context-based Adaptive Binary Arithmetic Coding (CABAC). For an input original video signal, a video coding block can be obtained by partitioning into Coding Tree Units (CTUs); then the residual pixel information obtained after intra or inter prediction is transformed by the transform and quantization unit 101, including transforming the residual information from the pixel domain into the transform domain, and the resulting transform coefficients are quantized to further reduce the bit rate. The intra estimation unit 102 and the intra prediction unit 103 are used to perform intra prediction on the video coding block; specifically, they are used to determine the intra prediction mode to be used to encode the video coding block. The motion compensation unit 104 and the motion estimation unit 105 are used to perform inter prediction coding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information; the motion estimation performed by the motion estimation unit 105 is a process of generating motion vectors that can estimate the motion of the video coding block, and the motion compensation unit 104 then performs motion compensation based on the motion vectors determined by the motion estimation unit 105. After determining the intra prediction mode, the intra prediction unit 103 is also used to provide the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 also sends the calculated motion vector data to the encoding unit 109. In addition, the inverse transform and inverse quantization unit 106 is used for reconstruction of the video coding block, reconstructing a residual block in the pixel domain; the reconstructed residual block has blocking artifacts removed by the filter control analysis unit 107 and the filtering unit 108, and is then added to a predictive block in a frame of the decoded picture buffer unit 110 to generate a reconstructed video coding block. The encoding unit 109 is used to encode various coding parameters and the quantized transform coefficients; in a CABAC-based coding algorithm, the context content may be based on neighboring coding blocks and may be used to encode information indicating the determined intra prediction mode, outputting the bitstream of the video signal. The decoded picture buffer unit 110 is used to store the reconstructed video coding blocks for prediction reference. As video picture encoding proceeds, new reconstructed video coding blocks are continuously generated, and these reconstructed video coding blocks are all stored in the decoded picture buffer unit 110.
Referring to FIG. 3B, which shows a detailed framework schematic diagram of a video decoding system provided by an embodiment of the present application. As shown in FIG. 3B, the video decoding system 20 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra prediction unit 203, a motion compensation unit 204, a filtering unit 205, a decoded picture buffer unit 206 and the like, where the decoding unit 201 can implement header information decoding and CABAC decoding, and the filtering unit 205 can implement DBF filtering/SAO filtering/ALF filtering. After the input video signal undergoes the encoding process of FIG. 3A, the bitstream of the video signal is output; the bitstream is input into the video decoding system 20 and first passes through the decoding unit 201 to obtain the decoded transform coefficients. The transform coefficients are processed by the inverse transform and inverse quantization unit 202 so as to generate a residual block in the pixel domain. The intra prediction unit 203 may be used to generate prediction data of the current video decoding block based on the determined intra prediction mode and data from previously decoded blocks of the current frame or picture. The motion compensation unit 204 determines prediction information for the video decoding block by parsing motion vectors and other associated syntax elements, and uses the prediction information to generate a predictive block of the video decoding block being decoded. A decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 202 and the corresponding predictive block generated by the intra prediction unit 203 or the motion compensation unit 204. The decoded video signal passes through the filtering unit 205 to remove blocking artifacts, which can improve video quality; the decoded video blocks are then stored in the decoded picture buffer unit 206, which stores reference pictures for subsequent intra prediction or motion compensation and is also used for output of the video signal, i.e., the restored original video signal is obtained.
It should be noted that the method provided by the embodiments of the present application can be applied to the filtering unit 108 part shown in FIG. 3A (indicated by the bold black box), and can also be applied to the filtering unit 205 part shown in FIG. 3B (indicated by the bold black box). That is, the method in the embodiments of the present application can be applied to a video encoding system (referred to as an "encoder" for short), to a video decoding system (referred to as a "decoder" for short), or even to both a video encoding system and a video decoding system at the same time, without any limitation here.
It should also be noted that, when an embodiment of the present application is applied to an encoder, the "current block" specifically refers to the block currently to be encoded in the video picture (which may also be referred to as a "coding block" for short); when an embodiment of the present application is applied to a decoder, the "current block" specifically refers to the block currently to be decoded in the video picture (which may also be referred to as a "decoding block" for short).
在本申请的一实施例中,参见图4,其示出了本申请实施例提供的一种解码方法的流程示意图。如图4所示,该方法可以包括:
S401:解析码流,确定第一语法元素标识信息的取值。
需要说明的是,视频图像可以划分为多个图像块,每个当前待解码的图像块可以称为解码块。这里,每个解码块可以包括第一图像分量、第二图像分量和第三图像分量;而当前块即为视频图像中当前待进行第一图像分量、第二图像分量或者第三图像分量环路滤波处理的解码块。其中,这里的当前块可以为CTU,也可以为CU,甚至还可以是比CU更小的块,本申请实施例不作任何限定。
在这里,针对第一图像分量、第二图像分量和第三图像分量,从颜色划分角度,本申请实施例可以将其划分为亮度分量和色度分量等两种颜色分量类型。在这种情况下,如果当前块进行亮度分量的预测、反变换与反量化、环路滤波等操作,那么当前块也可以称为亮度块;或者,如果当前块进行色度分量的预测、反变换与反量化、环路滤波等操作,那么当前块也可以称为色度块。
还需要说明的是,在解码器侧,本申请实施例具体提供了一种环路滤波方法,尤其是一种基于深度学习的环路滤波网络模型使用的自适应决策方法,该方法应用在如图3B所示的滤波单元205部分。在这里,滤波单元205可以包括去块滤波器(DBF)、样值自适应补偿滤波器(SAO)、基于残差神经网络的环路滤波器(CNNLF)和自适应修正滤波器(ALF)。对于该滤波单元205来说,利用本申请实施例所述的方法可以对该滤波单元205中的CNNLF模型进行自适应地决策,以便决策出当前块使用CNNLF模型时的目标模型或者当前块不使用CNNLF模型。
更具体地,本申请实施例提出了一种基于深度学习的模型自适应决策使用模块,用于对环路滤波网络模型(比如CNNLF模型)是否使用进行自适应决策,提升编码性能。如图5所示,环路滤波器除了包括DBF、SAO、CNNLF和ALF之外,还可以包括模型自适应决策使用模块(Model Adaptive Decision,MAD),且模型自适应决策使用模块位于SAO滤波和CNNLF滤波之间。另外,模型自适应决策使用模块的使用不依赖于DBF、SAO、CNNLF和ALF的标志位,只是在位置上置于CNNLF之前。需要说明的是,模型自适应决策使用模块可以看作是由多层卷积神经网络和多层全连接神经网络组成的预设选择网络模型,以便决策出当前块是否使用CNNLF模型,具体可以是指当前块使用CNNLF模型时的目标模型或者当前块不使用CNNLF模型。
在这里,为了方便解码器能够确定当前块是否允许使用预设选择网络模型进行模型决策,可以设置一个第一语法元素标识信息,然后根据解码获得的第一语法元素标识信息的取值来确定。在一些实施例中,该方法还可以包括:
若第一语法元素标识信息的取值为第一标识值,则确定第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策;或者,
若第一语法元素标识信息的取值为第二标识值,则确定第一语法元素标识信息指示当前块不允许使用预设选择网络模型进行模型决策。
需要说明的是,第一标识值和第二标识值不同,而且第一标识值和第二标识值可以是参数形式,也可以是数字形式。具体地,第一语法元素标识信息可以是写入在概述(profile)中的参数,也可以是一个标志(flag)的取值,本申请实施例对此不作任何限定。
以第一语法元素标识信息为一个flag为例,这时候对于第一标识值和第二标识值而言,第一标识值可以设置为1,第二标识值可以设置为0;或者,第一标识值还可以设置为true,第二标识值还可以设置为false;或者,第一标识值还可以设置为0,第二标识值还可以设置为1;或者,第一标识值还可以设置为false,第二标识值还可以设置为true。示例性地,对于flag而言,一般情况下,第一标识值可以为1,第二标识值可以为0,但是并不作任何限定。
还需要说明的是,预设选择网络模型可以看作是一个神经网络,而第一语法元素标识信息可以看作是一个基于神经网络的模型自适应决策的允许标志,这里可以用model_adaptive_decision_enable_flag表示。具体来说,model_adaptive_decision_enable_flag可以用于指示当前块是否允许使用预设选择网络模型进行模型自适应决策。
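上述标志位的语义可以用如下示意性的纯Python代码表示。其中函数名与取值约定(第一标识值取1、第二标识值取0)均为示例性假设,并非标准文本规定的实际接口:

```python
# 示意性代码:按第一标识值/第二标识值解释 model_adaptive_decision_enable_flag。
FIRST_FLAG_VALUE = 1   # 第一标识值:允许使用预设选择网络模型进行模型决策
SECOND_FLAG_VALUE = 0  # 第二标识值:不允许使用

def model_adaptive_decision_allowed(flag_value: int) -> bool:
    """根据解码得到的标志位取值,判断当前块是否允许进行模型自适应决策。"""
    if flag_value == FIRST_FLAG_VALUE:
        return True
    if flag_value == SECOND_FLAG_VALUE:
        return False
    raise ValueError("model_adaptive_decision_enable_flag 取值非法: %r" % flag_value)
```

如正文所述,第一标识值与第二标识值也可以互换为0/1或true/false,此处仅取最常见的一种约定作示意。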
S402:当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值;其中,至少两个输出值包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值。
需要说明的是,如果当前块允许使用预设选择网络模型进行模型决策,那么这时候可以根据当前块的颜色分量类型、量化参数和所属帧的帧类型等,从若干个候选的预设选择网络模型中确定出当前块使用的预设选择网络模型,然后根据该预设选择网络模型确定当前块使用环路滤波网络模型时的至少一个候选环路滤波网络模型和当前块不使用环路滤波网络模型的概率分布情况。具体地,在本申请实施例中,这至少两个输出值可以包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值。
在一种更具体的示例中,第一值可以用于反映当前块使用环路滤波网络模型时这至少一个候选环路滤波网络模型的概率分布情况,第二值可以用于反映当前块不使用环路滤波网络模型时的概率分布情况。换言之,第一值和第二值均可以用概率值表示;即根据预设选择网络模型,所确定的至少两个输出值可以为至少两个概率值。或者,第一值和第二值还可以用于反映当前块使用环路滤波网络模型时这至少一个候选环路滤波网络模型和当前块不使用环路滤波网络模型的权重分配情况;即第一值和第二值也可以称为权重值,本申请实施例不作任何限定。
可以理解地,针对不同的颜色分量类型,这里的预设选择网络模型并不相同。在本申请实施例中,亮度分量对应的预设选择网络模型可以称为亮度选择网络模型,色度分量对应的预设选择网络模型可以称为色度选择网络模型。因此,在一些实施例中,所述确定当前块的预设选择网络模型,可以包括:
若当前块的颜色分量类型为亮度分量(即当前块为亮度块时),则确定当前块的亮度选择网络模型;或者,
若当前块的颜色分量类型为色度分量(即当前块为色度块时),则确定当前块的色度选择网络模型。
相应地,针对不同的颜色分量类型,这里的候选环路滤波网络模型也是不同的。在本申请实施例中,亮度分量对应的一个或多个候选环路滤波网络模型可以称为候选亮度环路滤波网络模型,色度分量对应的一个或多个候选环路滤波网络模型可以称为候选色度环路滤波网络模型。因此,在一些实施例中,所述根据当前块的预设选择网络模型确定至少两个输出值,可以包括:
若当前块的颜色分量类型为亮度分量,则根据亮度选择网络模型确定至少两个亮度输出值;其中,至少两个亮度输出值包括当前块使用亮度环路滤波网络模型时至少一个候选亮度环路滤波网络模型各自对应的第一值和当前块不使用亮度环路滤波网络模型时的第二值;或者,
若当前块的颜色分量类型为色度分量,则根据色度选择网络模型确定至少两个色度输出值;其中, 至少两个色度输出值包括当前块使用色度环路滤波网络模型时至少一个候选色度环路滤波网络模型各自对应的第一值和当前块不使用色度环路滤波网络模型时的第二值。
也就是说,以概率值为例,对于颜色分量类型而言,其可以包括亮度分量和色度分量。在本申请实施例中,如果当前块的颜色分量类型为亮度分量,那么需要确定当前块的亮度选择网络模型,然后根据亮度选择网络模型,不仅可以确定当前块不使用亮度环路滤波网络模型的概率分布情况,还可以确定当前块使用亮度环路滤波网络模型时这至少一个候选亮度环路滤波网络模型各自对应的概率分布情况。如果当前块的颜色分量类型为色度分量,那么需要确定当前块的色度选择网络模型,然后根据色度选择网络模型,不仅可以确定当前块不使用色度环路滤波网络模型的概率分布情况,还可以确定当前块使用色度环路滤波网络模型时这至少一个候选色度环路滤波网络模型各自对应的概率分布情况。
进一步地,对于帧类型来说,其可以包括I帧、P帧和B帧。其中,I帧,即帧内编码图像帧(Intra-coded Picture);I帧表示关键帧,可以理解为这一帧画面的完整保留。P帧,即前向预测编码图像帧(Predictive-coded Picture);P帧表示的是这一帧跟之前的一个关键帧(I帧)的差别。B帧,即双向预测编码图像帧(Bidirectionally predicted Picture);B帧是双向差别帧,也就是B帧记录的是本帧与前帧和后帧的差别。
在本申请实施例中,帧类型可以包括第一类型和第二类型。针对不同的帧类型,这里的预设选择网络模型也是不同的。
在一种具体的示例中,第一类型可以为I帧,第二类型可以为非I帧。需要注意的是,这里并不作具体限定。
在一种可能的实施方式中,对于亮度选择网络模型而言,第一类型对应的亮度选择网络模型可以称为第一亮度选择网络模型,第二类型对应的亮度选择网络模型可以称为第二亮度选择网络模型。因此,在一些实施例中,在当前块的颜色分量类型为亮度分量的情况下,所述确定当前块的亮度选择网络模型,可以包括:
若当前块所属帧的帧类型为第一类型,则确定当前块的第一亮度选择网络模型;或者,
若当前块所属帧的帧类型为第二类型,则确定当前块的第二亮度选择网络模型。
相应地,对于候选亮度环路滤波网络模型来说,根据不同的帧类型,候选亮度环路滤波网络模型也是不同的。具体地,第一类型对应的一个或多个候选亮度环路滤波网络模型可以称为候选第一亮度环路滤波网络模型,第二类型对应的一个或多个候选亮度环路滤波网络模型可以称为候选第二亮度环路滤波网络模型。因此,在一些实施例中,所述根据亮度选择网络模型确定至少两个亮度输出值,可以包括:
若当前块所属帧的帧类型为第一类型,则根据第一亮度选择网络模型确定至少两个亮度输出值;其中,这至少两个亮度输出值包括当前块使用第一亮度环路滤波网络模型时至少一个候选第一亮度环路滤波网络模型各自对应的第一值和当前块不使用第一亮度环路滤波网络模型时的第二值;或者,
若当前块所属帧的帧类型为第二类型,则根据第二亮度选择网络模型确定至少两个亮度输出值;其中,这至少两个亮度输出值包括当前块使用第二亮度环路滤波网络模型时至少一个候选第二亮度环路滤波网络模型各自对应的第一值和当前块不使用第二亮度环路滤波网络模型时的第二值。
进一步地,在本申请实施例中,对于亮度分量对应的一个或多个候选环路滤波网络模型(可简称为“候选亮度环路滤波网络模型”),无论是第一类型对应的至少一个候选第一亮度环路滤波网络模型,还是第二类型对应的至少一个候选第二亮度环路滤波网络模型,这些候选环路滤波网络模型都是通过模型训练得到的。
在一些实施例中,该方法还可以包括:
确定第一训练集;其中,第一训练集包括至少一个第一训练样本和至少一个第二训练样本,第一训练样本的帧类型为第一类型,第二训练样本的帧类型为第二类型,且第一训练样本和第二训练样本均是根据至少一种量化参数得到的;
利用至少一个第一训练样本的亮度分量对第一神经网络结构进行训练,得到至少一个候选第一亮度环路滤波网络模型;以及
利用至少一个第二训练样本的亮度分量对第一神经网络结构进行训练,得到至少一个候选第二亮度环路滤波网络模型。
在这里,第一神经网络结构包括下述至少之一:卷积层、激活层、残差块和跳转连接层。
也就是说,至少一个候选第一亮度环路滤波网络模型和至少一个候选第二亮度环路滤波网络模型是根据至少一个训练样本对第一神经网络结构进行模型训练确定的,且至少一个候选第一亮度环路滤波网络模型和至少一个候选第二亮度环路滤波网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在另一种可能的实施方式中,对于色度选择网络模型而言,第一类型对应的色度选择网络模型可以 称为第一色度选择网络模型,第二类型对应的色度选择网络模型可以称为第二色度选择网络模型。因此,在一些实施例中,在当前块的颜色分量类型为色度分量的情况下,所述确定当前块的色度选择网络模型,可以包括:
若当前块所属帧的帧类型为第一类型,则确定当前块的第一色度选择网络模型;或者,
若当前块所属帧的帧类型为第二类型,则确定当前块的第二色度选择网络模型。
相应地,对于候选色度环路滤波网络模型来说,根据不同的帧类型,候选色度环路滤波网络模型也是不同的。具体地,第一类型对应的一个或多个候选色度环路滤波网络模型可以称为候选第一色度环路滤波网络模型,第二类型对应的一个或多个候选色度环路滤波网络模型可以称为候选第二色度环路滤波网络模型。因此,在一些实施例中,所述根据色度选择网络模型确定至少两个色度输出值,可以包括:
若当前块所属帧的帧类型为第一类型,则根据第一色度选择网络模型确定至少两个色度输出值;其中,这至少两个色度输出值包括当前块使用第一色度环路滤波网络模型时至少一个候选第一色度环路滤波网络模型各自对应的第一值和当前块不使用第一色度环路滤波网络模型时的第二值;或者,
若当前块所属帧的帧类型为第二类型,则根据第二色度选择网络模型确定至少两个色度输出值;其中,这至少两个色度输出值包括当前块使用第二色度环路滤波网络模型时至少一个候选第二色度环路滤波网络模型各自对应的第一值和当前块不使用第二色度环路滤波网络模型时的第二值。
进一步地,在本申请实施例中,对于色度分量对应的一个或多个候选环路滤波网络模型(可简称为“候选色度环路滤波网络模型”),无论是第一类型对应的至少一个候选第一色度环路滤波网络模型,还是第二类型对应的至少一个候选第二色度环路滤波网络模型,这些候选环路滤波网络模型都是通过模型训练得到的。
在一些实施例中,该方法还可以包括:
确定第一训练集;其中,第一训练集包括至少一个第一训练样本和至少一个第二训练样本,第一训练样本的帧类型为第一类型,第二训练样本的帧类型为第二类型,且第一训练样本和第二训练样本均是根据至少一种量化参数得到的;
利用至少一个第一训练样本的色度分量对第二神经网络结构进行训练,得到至少一个候选第一色度环路滤波网络模型;以及
利用至少一个第二训练样本的色度分量对第二神经网络结构进行训练,得到至少一个候选第二色度环路滤波网络模型。
在这里,第二神经网络结构包括下述至少之一:采样层、卷积层、激活层、残差块、池化层和跳转连接层。
也就是说,至少一个候选第一色度环路滤波网络模型和至少一个候选第二色度环路滤波网络模型是根据至少一个训练样本对第二神经网络结构进行模型训练确定的,且至少一个候选第一色度环路滤波网络模型和至少一个候选第二色度环路滤波网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在一种具体的示例中,第一神经网络结构可以包括第一卷积模块、第一残差模块、第二卷积模块和第一连接模块。
在这里,如图6A所示,第一神经网络结构的输入是重建亮度帧,输出是原始亮度帧;该第一神经网络结构包括有:第一卷积模块601、第一残差模块602、第二卷积模块603和第一连接模块604。其中,在图6A中,第一卷积模块601、第一残差模块602、第二卷积模块603和第一连接模块604顺次连接,且第一连接模块604还与第一卷积模块601的输入连接。
在一种更具体的示例中,对于第一神经网络结构而言,第一卷积模块可以由一层卷积层和一层激活层组成,第二卷积模块可以由两层卷积层和一层激活层组成,连接模块可以由跳转连接层组成,第一残差模块可以包括若干个残差块,且每一个残差块可以由两层卷积层和一层激活层组成。
在另一种具体的示例中,第二神经网络结构可以包括上采样模块、第三卷积模块、第四卷积模块、融合模块、第二残差模块、第五卷积模块和第二连接模块。
在这里,如图6B所示,第二神经网络结构的输入是重建亮度帧和重建色度帧,输出是原始色度帧;该第二神经网络结构包括有:上采样模块605、第三卷积模块606、第四卷积模块607、融合模块608、第二残差模块609、第五卷积模块610和第二连接模块611。其中,在图6B中,上采样模块605的输入是重建色度帧,上采样模块605和第三卷积模块606连接;第四卷积模块607的输入是重建亮度帧,第三卷积模块606和第四卷积模块607与融合模块608连接,融合模块608、第二残差模块609、第五卷积模块610和第二连接模块611顺次连接,且第二连接模块611还与上采样模块605的输入连接。
在一种更具体的示例中,对于第二神经网络结构而言,第三卷积模块可以由一层卷积层和一层激活层组成,第四卷积模块可以由一层卷积层和一层激活层组成,第五卷积模块可以由两层卷积层、一层激活层和一层池化层组成,连接模块可以由跳转连接层组成,第二残差模块可以包括若干个残差块,且每一个残差块可以由两层卷积层和一层激活层组成。
示例性地,以环路滤波网络模型为CNNLF为例,CNNLF对于亮度分量和色度分量分别设计了不同的网络结构。其中,对于亮度分量,其设计了第一神经网络结构,具体参见图7A;对于色度分量,其设计了第二神经网络结构,具体参见图7B。
对于亮度分量,如图7A所示,整个网络结构可以由卷积层、激活层、残差块、跳转连接层等部分组成。这里,卷积层的卷积核可以为3×3,即可以用3×3Conv表示;激活层可以为线性激活函数,即可以用线性整流函数(Rectified Linear Unit,ReLU)表示,又可称为修正线性单元,是一种人工神经网络中常用的激活函数,通常指代以斜坡函数及其变种为代表的非线性函数。残差块(ResBlock)的网络结构如图8中的虚线框所示,可以由卷积层(Conv)、激活层(ReLU)和跳转连接层等组成。在网络结构中,跳转连接层(Concat)是指网络结构中所包括的一条从输入到输出的全局跳转连接,能够使网络能够专注于学习残差,加速了网络的收敛过程。
对于色度分量,如图7B所示,这里引入了亮度分量作为输入之一来指导色度分量的滤波,整个网络结构可以由卷积层、激活层、残差块、池化层、跳转连接层等部分组成。由于分辨率的不一致性,色度分量首先需要进行上采样。为了避免在上采样过程中引入其他噪声,可以通过直接拷贝邻近像素来完成分辨率的扩大,以得到放大色度帧(Enlarged chroma frame)。另外,在网络结构的末端,还使用了池化层(如平均值池化层,用2×2AvgPool表示)来完成色度分量的下采样。具体地,在HPM-ModAI的应用中,亮度分量网络的残差块数量可设置为N=20,色度分量网络的残差块数量可设置为N=10。
这样,在模型训练阶段,可以离线地训练出4个I帧亮度分量模型,4个非I帧亮度分量模型,4个色度U分量模型,4个色度V分量模型等共16种候选环路滤波网络模型。
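上述16种候选模型与(帧类型/分量、QP区间)的对应关系,可以用如下示意性的纯Python代码枚举。其中键名为示例性假设,QP区间沿用正文中27~31、32~37、38~44、45~50的划分:

```python
# 示意性代码:枚举 4 类分量 × 4 个 QP 区间 = 16 种候选 CNNLF 模型的键。
QP_RANGES = [(27, 31), (32, 37), (38, 44), (45, 50)]

def build_model_keys():
    """返回形如 (分量类别, QP下限, QP上限) 的键列表,共 16 项。"""
    keys = []
    for comp in ("I_luma", "nonI_luma", "chroma_U", "chroma_V"):
        for qp_lo, qp_hi in QP_RANGES:
            keys.append((comp, qp_lo, qp_hi))
    return keys
```

实际实现中,每个键对应一个离线训练好的模型文件,这里仅示意索引结构。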
还可以理解,针对不同的颜色分量类型,其对应的预设选择网络模型也不相同。在这里,亮度分量对应的预设选择网络模型可以称为亮度选择网络模型,色度分量对应的预设选择网络模型可以称为色度选择网络模型。
在一种可能的实施方式中,在当前块的颜色分量类型为亮度分量的情况下,所述确定当前块的亮度选择网络模型,可以包括:
确定至少一个候选亮度选择网络模型,候选亮度选择网络模型包括候选第一亮度选择网络模型和/或候选第二亮度选择网络模型;
确定当前块所属帧的帧类型和量化参数;
若帧类型为第一类型,则从至少一个候选亮度选择网络模型中确定第一类型对应的至少一个候选第一亮度选择网络模型,并根据量化参数从至少一个候选第一亮度选择网络模型中确定当前块的第一亮度选择网络模型;或者,
若帧类型为第二类型,则从至少一个候选亮度选择网络模型中确定第二类型对应的至少一个候选第二亮度选择网络模型,并根据量化参数从至少一个候选第二亮度选择网络模型中确定当前块的第二亮度选择网络模型。
在另一种可能的实施方式中,在当前块的颜色分量类型为色度分量的情况下,所述确定当前块的色度选择网络模型,可以包括:
确定至少一个候选色度选择网络模型,候选色度选择网络模型包括候选第一色度选择网络模型和/或候选第二色度选择网络模型;
确定当前块所属帧的帧类型和量化参数;
若帧类型为第一类型,则从至少一个候选色度选择网络模型中确定第一类型对应的至少一个候选第一色度选择网络模型,并根据量化参数从至少一个候选第一色度选择网络模型中确定当前块的第一色度选择网络模型;或者,
若帧类型为第二类型,则从至少一个候选色度选择网络模型中确定第二类型对应的至少一个候选第二色度选择网络模型,并根据量化参数从至少一个候选第二色度选择网络模型中确定当前块的第二色度选择网络模型。
需要说明的是,当前块的预设选择网络模型不仅和量化参数有关,而且还和帧类型、颜色分量类型有关。其中,不同的颜色分量类型,对应有不同的预设选择网络模型,比如对于亮度分量来说,预设选择网络模型可以是与亮度分量相关的亮度选择网络模型;对于色度分量来说,预设选择网络模型可以是与色度分量相关的色度选择网络模型。而且,不同的帧类型,其对应的预设选择网络模型也是不同的。对于与亮度分量相关的亮度选择网络模型,第一类型对应的亮度选择网络模型可以称为第一亮度选择网络模型,第二类型对应的亮度选择网络模型可以称为第二亮度选择网络模型;对于与色度分量相关的色度选择网络模型,第一类型对应的色度选择网络模型可以称为第一色度选择网络模型,第二类型对应的 色度选择网络模型可以称为第二色度选择网络模型。
还需要说明的是,在本申请实施例中,根据不同的量化参数,比如QP的取值为27~31、32~37、38~44、45~50等,以及不同的帧类型,比如第一类型和第二类型等,预先可以训练出至少一个候选亮度选择网络模型(包括候选第一亮度选择网络模型和/或候选第二亮度选择网络模型)以及至少一个候选色度选择网络模型(包括候选第一色度选择网络模型和/或候选第二色度选择网络模型)。
这样,对于亮度分量,在确定出当前块的帧类型后,假定帧类型为I帧,可以从至少一个候选亮度选择网络模型中确定出I帧类型对应的至少一个候选I帧亮度选择网络模型;根据当前块的量化参数,可以从至少一个候选I帧亮度选择网络模型中选取出该量化参数对应的I帧亮度选择网络模型,即当前块的亮度选择网络模型;或者,假定帧类型为非I帧,可以从至少一个候选亮度选择网络模型中确定出非I帧类型对应的至少一个候选非I帧亮度选择网络模型;根据当前块的量化参数,可以从至少一个候选非I帧亮度选择网络模型中选取出该量化参数对应的非I帧亮度选择网络模型,即当前块的亮度选择网络模型。另外,对于色度分量,其色度选择网络模型的确定方式与亮度分量相同,这里不再详述。
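按帧类型和量化参数从候选亮度选择网络模型中选取当前块的亮度选择网络模型,可以用如下示意性的纯Python代码表示。模型以字符串占位,函数名与返回值均为示例性假设:

```python
# 示意性代码:根据帧类型('I' 或 'nonI')与 QP 选取亮度选择网络模型。
QP_RANGES = [(27, 31), (32, 37), (38, 44), (45, 50)]

def pick_luma_selection_model(frame_type: str, qp: int):
    """命中某一 QP 区间时返回该区间对应的示意性模型名,否则返回 None。"""
    for lo, hi in QP_RANGES:
        if lo <= qp <= hi:
            return "luma_sel_%s_qp%d_%d" % (frame_type, lo, hi)
    return None  # QP 不在任何训练区间内时的示意性处理
```

色度选择网络模型的选取方式与此相同,只是候选模型集合按色度分量训练得到。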
进一步地,对于至少一个候选亮度选择网络模型和至少一个候选色度选择网络模型的模型训练,在一些实施例中,该方法还可以包括:
确定第二训练集,其中,第二训练集包括至少一个训练样本,且所述训练样本是根据至少一种量化参数得到的;
利用第二训练集中训练样本的亮度分量对第三神经网络结构进行训练,得到至少一个候选亮度选择网络模型;
利用第二训练集中训练样本的色度分量对第三神经网络结构进行训练,得到至少一个候选色度选择网络模型。
也就是说,至少一个候选亮度选择网络模型是根据至少一个训练样本对第三神经网络结构进行模型训练确定的,且这至少一个候选亮度选择网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。另外,至少一个候选色度选择网络模型也是根据至少一个训练样本对第三神经网络结构进行模型训练确定的,且这至少一个候选色度选择网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
需要说明的是,在本申请实施例中,第三神经网络结构可以包括下述至少之一:卷积层、池化层、全连接层和激活层。
在一种具体的示例中,第三神经网络结构可以包括第六卷积模块和全连接模块,第六卷积模块和全连接模块顺次连接。
在一种更具体的示例中,第六卷积模块可以包括若干个卷积子模块,每一个卷积子模块可以由一层卷积层和一层池化层组成;全连接模块可以包括若干个全连接子模块,每一个全连接子模块可以由一层全连接层和一层激活层组成。
也就是说,预设选择网络模型可以选择多层卷积神经网络和多层全连接层神经网络组成,然后利用训练样本进行深度学习以得到当前块的预设选择网络模型,如亮度选择网络模型或者色度选择网络模型。
在本申请实施例中,深度学习是机器学习的一种,而机器学习是实现人工智能的必经路径。深度学习的概念源于人工神经网络的研究,含多个隐藏层的多层感知器就是一种深度学习结构。深度学习可以通过组合低层特征形成更加抽象的高层表示属性类别或特征,以发现数据的分布式特征表示。在本申请实施例中,以卷积神经网络(Convolutional Neural Networks,CNN)为例,它是一类包含卷积计算且具有深度结构的前馈神经网络(Feedforward Neural Networks),是深度学习(Deep Learning)的代表算法之一。这里的预设选择网络模型可以是一种卷积神经网络结构。
示例性地,无论是亮度选择网络模型还是色度选择网络模型,其可以看作是由第三神经网络结构进行训练得到的。也就是说,对于预设选择网络模型,本申请实施例还设计了第三神经网络结构,具体如图9A和图9B。
如图9A所示,第三神经网络结构的输入是重建帧,输出是当前块使用环路滤波网络模型时的各个候选环路滤波网络模型以及当前块不使用环路滤波网络模型时的概率分布情况。在图9A中,该第三神经网络结构包括有:第六卷积模块901和全连接模块902,且第六卷积模块901和全连接模块902顺次连接。其中,第六卷积模块901可以包括若干个卷积子模块,每一个卷积子模块可以由一层卷积层和一层池化层组成;全连接模块902可以包括若干个全连接子模块,每一个全连接子模块可以由一层全连接层和一层激活层组成。
在一种具体的示例中,如图9B所示,第三神经网络结构可以由多层卷积神经网络和多层全连接神经网络组成。其中,该网络结构可以包括K层卷积层、M层池化层、L层全连接层和N层激活层,K、M、L、N均为大于或等于1的整数。
在一种更具体的示例中,K=3,M=3,L=2,N=2。
这样,基于图9B所示的网络结构,其可以由3层卷积层和2层全连接层组成,而且每一层卷积层之后设置有池化层;其中,卷积层的卷积核可以为3×3,即可以用3×3Conv表示;池化层可以采用最大值池化层,用2×2MaxPool表示;另外,每一层全连接层之后设置有激活层,在这里,激活层可以为线性激活函数,也可以为非线性激活函数,比如ReLU和Softmax等。
还需要说明的是,对于预设选择网络模型(比如候选亮度选择网络模型或者候选色度选择网络模型),还可以利用损失函数进行模型训练。在一些实施例中,该方法还可以包括:
确定第二训练集以及预设损失函数;其中,第二训练集包括至少一个训练样本,且所述训练样本是根据至少一种量化参数得到的;
利用第二训练集中训练样本的亮度分量对第三神经网络结构进行训练,在所述预设损失函数的损失值收敛到损失阈值时,得到至少一个候选亮度选择网络模型;以及
利用第二训练集中训练样本的色度分量对第三神经网络结构进行训练,在所述预设损失函数的损失值收敛到损失阈值时,得到至少一个候选色度选择网络模型。
需要说明的是,对于预设损失函数来说,在一种可能的实施方式中,本申请实施例还提供了一种加权的损失函数进行模型训练的方法。具体如下式所示,
lossFunction=(clip(Wa×reca+Wb×recb+…+Wn×recn+Woff×rec0,0,N)-orig)²
其中,Wa,Wb,…,Wn,Woff分别表示预设选择网络模型的输出,代表了至少一个候选环路滤波网络模型a,b,…,n,以及不使用环路滤波网络模型(即模型关闭)的概率值。reca,recb,…,recn分别表示使用候选环路滤波网络模型a,b,…,n后的输出重建图像,rec0则表示经过DBF和SAO之后的输出重建图像。Clip函数将数值限定在0~N之间。N表示像素值的最大值,例如对于10bit的YUV图像,N为1023;orig则表示原始图像。
这样,可以将预设选择网络模型的至少两个输出概率值作为至少一个候选CNNLF模型以及不使用CNNLF模型时的输出重建图像的加权权值,最终与原始图像orig计算均方误差,可以得到损失函数值。
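上述加权损失的计算过程可以用如下示意性的纯Python代码表示。为简明起见以单个像素(标量)为例,clip的上下限、权值与符号含义沿用正文,实际训练时在整个图像块上逐像素计算再求均值:

```python
# 示意性代码:按正文的加权损失函数计算单个像素位置的损失值。
def clip(x, lo, hi):
    """将数值限定在 [lo, hi] 之间,对应正文中的 clip 函数。"""
    return max(lo, min(hi, x))

def weighted_loss(weights, recs, w_off, rec0, orig, n_max=1023):
    """weights/recs 对应候选模型 a..n 的权值与重建值,w_off 为模型关闭(Woff)的权值,
    rec0 为经过 DBF 和 SAO 之后的重建值,n_max 对应 10bit 图像的最大像素值 N=1023。"""
    mixed = sum(w * r for w, r in zip(weights, recs)) + w_off * rec0
    return (clip(mixed, 0, n_max) - orig) ** 2
```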
在另一种可能的实施方式中,本申请实施例还提供了一种将分类网络常用的交叉熵损失函数应用到本申请实施例的技术方案中。具体如下式所示,
label(i)=argmin((reca-orig)²,(recb-orig)²,…,(recn-orig)²,(rec0-orig)²)
lossFunction=-label(i)×log(softmax(Wi))
其中,label(i)表示至少一个候选环路滤波网络模型a,b,…,n的输出重建图像,以及经过DBF和SAO之后的输出重建图像分别与原始图像计算均方误差,并取其中最小误差所对应的序号的值i。Wa,Wb,…,Wn,Woff分别表示预设选择网络模型的输出,代表了至少一个候选环路滤波网络模型a,b,…,n,以及不使用环路滤波网络模型(即模型关闭)的概率值。Wi表示与label(i)相同序号的概率值。然后计算Wi的softmax,并与label(i)相乘,可以得到交叉熵损失值。
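上述交叉熵损失的计算可以用如下示意性的纯Python代码表示。各候选(含模型关闭)的均方误差以标量示意,实际为整个图像块上的均方误差:

```python
import math

# 示意性代码:先按 argmin 均方误差确定 label(i),
# 再对与其同序号的输出 Wi 计算 softmax 并取负对数,得到交叉熵损失。
def cross_entropy_loss(outputs, errors):
    """outputs: 预设选择网络模型的输出 [Wa,...,Wn,Woff];
    errors: 各候选模型输出重建图像(及 DBF+SAO 重建图像)与原图的均方误差。"""
    i = min(range(len(errors)), key=lambda k: errors[k])   # label(i)
    exp_sum = sum(math.exp(w) for w in outputs)
    softmax_i = math.exp(outputs[i]) / exp_sum
    return -math.log(softmax_i)
```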
进一步地,根据上述的实施方式,在确定出预设选择网络模型和至少一个候选环路滤波网络模型之后,还可以确定当前块使用环路滤波网络模型时的各个候选环路滤波网络模型以及当前块不使用环路滤波网络模型时的概率分布情况。在一些实施例中,所述根据当前块的预设选择网络模型确定至少两个输出值,可以包括:
确定当前块的第二重建图像块;
将第二重建图像块输入预设选择网络模型,得到至少两个输出值。
在这里,这至少两个输出值可以包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值。
还需要说明的是,以输出值为概率值为例,环路滤波网络模型可以是指前述的CNNLF模型。在确定出待输入CNNLF模型的第二重建图像块之后,将第二重建图像块作为预设选择网络模型的输入,而预设选择网络模型的输出即为至少一个候选CNNLF模型以及当前块不使用CNNLF模型的概率分布情况(包括:这至少一个候选CNNLF模型各自对应的第一值和当前块不使用CNNLF模型时的第二值)。
S403:根据至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型。
S404:当当前块使用环路滤波网络模型时,利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。
需要说明的是,在确定出至少一个候选CNNLF模型各自对应的第一值和当前块不使用CNNLF模型时的第二值之后,可以根据这至少两个输出值确定出当前块使用环路滤波网络模型时的目标环路滤波网络模型或者所述当前块不使用环路滤波网络模型。
在一些实施例中,所述根据所述至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型,可以包括:
从至少两个输出值中确定目标值;
若目标值为所述第一值,则确定当前块使用环路滤波网络模型,且将目标值对应的候选环路滤波网络模型作为目标环路滤波网络模型;或者,
若目标值为所述第二值,则确定当前块不使用环路滤波网络模型。
在一种具体的示例中,所述从至少两个输出值中确定目标值,可以包括:从至少两个输出值中选取最大值,将最大值作为所述目标值。
也就是说,无论是亮度环路滤波网络模型还是色度环路滤波网络模型,均是先通过模型训练以得到若干个候选亮度环路滤波网络模型或者若干个候选色度环路滤波网络模型,然后再利用预设选择网络模型进行模型决策,如果这至少两个输出值中第二值为最大值,那么可以确定出当前块不使用环路滤波网络模型;如果这至少两个输出值中第二值不为最大值,那么将第一值中的最大值对应的候选环路滤波网络模型确定为目标环路滤波网络模型,以便利用该目标环路滤波网络模型对当前块进行滤波处理。
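上述"取最大值作为目标值"的决策规则可以用如下示意性的纯Python代码表示。这里约定输出向量的最后一个元素对应第二值(不使用环路滤波网络模型),其余依次对应各候选模型的第一值,这一排列顺序为示例性假设:

```python
# 示意性代码:从预设选择网络模型的至少两个输出值中选出目标值并作出决策。
def decide_model(output_values):
    """返回 None 表示当前块不使用环路滤波网络模型;
    否则返回目标候选环路滤波网络模型的索引序号。"""
    target = max(range(len(output_values)), key=lambda k: output_values[k])
    if target == len(output_values) - 1:   # 目标值为第二值(模型关闭)
        return None
    return target                          # 目标值为某个第一值
```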
还需要说明的是,根据颜色分量类型的不同,预设选择网络模型包括亮度选择网络模型和色度选择网络模型;这样,对于第二重建图像块来说,也可以包括输入重建亮度图像块和输入重建色度图像块。
在一种可能的实施方式中,在当前块的颜色分量类型为亮度分量的情况下,所述根据当前块的预设选择网络模型确定至少两个输出值,可以包括:
确定亮度环路滤波网络模型的输入重建亮度图像块;
将输入重建亮度图像块输入亮度选择网络模型,得到至少两个亮度输出值。
在这里,至少两个亮度输出值可以包括当前块使用亮度环路滤波网络模型时至少一个候选亮度环路滤波网络模型各自对应的第一值和当前块不使用亮度环路滤波网络模型时的第二值。
进一步地,在一些实施例中,以亮度输出值为概率值为例,该方法还可以包括:从至少两个亮度输出值中选取最大概率值;若最大概率值为第一值,则确定当前块使用亮度环路滤波网络模型,且将最大概率值对应的候选亮度环路滤波网络模型作为目标亮度环路滤波网络模型;或者,若最大概率值为第二值,则确定当前块不使用亮度环路滤波网络模型。
在另一种可能的实施方式中,在当前块的颜色分量类型为色度分量的情况下,所述根据当前块的预设选择网络模型确定至少两个输出值,可以包括:
确定色度环路滤波网络模型的输入重建色度图像块;
将输入重建色度图像块输入色度选择网络模型,得到至少两个色度输出值。
在这里,至少两个色度输出值可以包括当前块使用色度环路滤波网络模型时至少一个候选色度环路滤波网络模型各自对应的第一值和当前块不使用色度环路滤波网络模型时的第二值。
进一步地,在一些实施例中,以色度输出值为概率值为例,该方法还可以包括:从至少两个色度输出值中选取最大概率值;若最大概率值为第一值,则确定当前块使用色度环路滤波网络模型,且将最大概率值对应的候选色度环路滤波网络模型作为目标色度环路滤波网络模型;或者,若最大概率值为第二值,则确定当前块不使用色度环路滤波网络模型。
这样,在确定出当前块使用的目标环路滤波网络模型(包括目标亮度环路滤波网络模型或者目标色度环路滤波网络模型)之后,可以利用所选取的目标环路滤波网络模型对当前块进行滤波处理。具体地,在一种可能的实施方式中,当当前块使用环路滤波网络模型时,所述利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块,可以包括:
确定当前块的第二重建图像块;
将第二重建图像块输入到目标环路滤波网络模型,得到当前块的第一重建图像块。
在另一种可能的实施方式中,当当前块不使用环路滤波网络模型时,该方法还可以包括:将第二重建图像块确定为当前块的第一重建图像块。
简言之,在确定出这至少两个输出值后,如果从这至少两个输出值中确定出最大值为第二值,意味着当前块不使用环路滤波网络模型的率失真代价最小,那么可以确定出当前块不使用环路滤波网络模型,即将第二重建图像块直接确定为当前块的第一重建图像块;如果从这至少两个输出值中确定出最大值为某一第一值,意味着当前块使用环路滤波网络模型的率失真代价最小,那么可以将某一第一值对应的候选环路滤波网络模型确定为目标环路滤波网络模型,然后将第二重建图像块输入到该目标环路滤波网络模型中,得到当前块的第一重建图像块。
在一些实施例中,对于第二重建图像块(包括输入重建亮度图像块或者输入重建色度图像块)来说,这里,第二重建图像块可以是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到。
还需要说明的是,本申请实施例所述的环路滤波网络模型可以为CNNLF模型。这样,利用所选取的CNNLF模型对当前块进行CNNLF滤波处理,可以得到当前块的第一重建图像块。
进一步地,在一些实施例中,该方法还可以包括:在确定出当前块的第一重建图像块之后,利用自适应修正滤波器对第一重建图像块进行滤波处理。
示例性地,参见图10,其示出了本申请实施例提供的一种使用预设选择网络模型的整体框架示意图。如图10所示,结合图9B所示的网络结构,该网络结构的输入为CNNLF模型的输入重建亮度图像块或输入重建色度图像块,该网络结构的输出为至少一个CNNLF模型各自对应的概率值以及当前块不使用CNNLF模型(即决策关闭CNNLF模型)的概率值。如果输出的概率值最大的为某个CNNLF模型的索引序号,那么可以选择该CNNLF模型为输入重建亮度图像块或输入重建色度图像块进行CNNLF滤波处理;如果输出的概率值最大的为决策关闭CNNLF模型,那么可以不使用神经网络滤波处理。另外,根据图10还可以得到,第二重建图像块是经由去块滤波器(DBF)和样值自适应补偿滤波器(SAO)进行滤波处理后得到的,然后第二重建图像块经由模型自适应选择模块和CNNLF模型后得到的第一重建图像块还可以输入自适应修正滤波器(ALF)继续进行滤波处理。
本实施例提供了一种解码方法,应用于解码器。通过解析码流,确定第一语法元素标识信息的取值;当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值;其中,至少两个输出值包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值;根据至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;当当前块使用环路滤波网络模型时,利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。这样,通过引入基于深度学习的神经网络技术对环路滤波网络模型进行自适应决策,可以确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;如果当前块使用环路滤波网络模型,那么还可以利用目标环路滤波网络模型对当前块进行滤波处理,如此不仅可以降低复杂度,还可以避免额外的比特开销,提升编码性能,进而能够提高编解码效率;另外,还可以使得最终输出的第一重建图像块更加接近于原始图像块,能够提升视频图像质量。
在本申请的另一实施例中,为了节省解码器的复杂度,参见图11,其示出了本申请实施例提供的另一种解码方法的流程示意图。如图11所示,该方法可以包括:
S1101:解析码流,确定第一语法元素标识信息的取值。
S1102:当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,解析码流,确定环路滤波网络模型的标识信息。
S1103:若环路滤波网络模型的标识信息为模型关闭信息,则确定当前块不使用环路滤波网络模型。
S1104:若环路滤波网络模型的标识信息为环路滤波网络模型索引序号,则根据环路滤波网络模型索引序号,从至少一个候选环路滤波网络模型中确定当前块使用的目标环路滤波网络模型。
S1105:利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。
需要说明的是,为了方便解码器能够确定当前块是否允许使用预设选择网络模型进行模型决策,可以设置一个第一语法元素标识信息,然后根据解码获得的第一语法元素标识信息的取值来确定。其中,第一语法元素标识信息可以用model_adaptive_decision_enable_flag表示。
在一种具体的示例中,如果model_adaptive_decision_enable_flag的取值为第一标识值,那么可以确定当前块允许使用预设选择网络模型进行模型决策;或者,如果model_adaptive_decision_enable_flag的取值为第二标识值,那么可以确定当前块不允许使用所述预设选择网络模型进行模型决策。示例性地,第一标识值可以为1,第二标识值可以为0,但这里不作任何限定。
还需要说明的是,本申请实施例还可以设置一个环路滤波网络模型的标识信息,用于确定当前块使用环路滤波网络模型时的环路滤波网络模型索引序号或者当前块不使用环路滤波网络模型。
以CNNLF模型为例,对于解码器侧的模型自适应决策使用模块,可以根据解码获取的、由编码器侧模型自适应决策使用模块所确定的环路滤波网络模型的标识信息,确定出当前块不使用环路滤波网络模型,或者确定出当前块使用的环路滤波网络模型索引序号。根据该环路滤波网络模型索引序号即可确定出当前块使用的目标环路滤波网络模型,进而根据目标环路滤波网络模型对当前块进行CNNLF滤波处理,从而降低解码器的复杂度。
除此之外,针对前述实施例中的第一神经网络结构、第二神经网络结构和第三神经网络结构等,其包括的卷积层数量,全连接层数量,非线性激活函数等均可以进行调整。另外,模型自适应决策使用模块所针对的环路滤波网络模型,除了CNNLF模型之外,还可以是针对其他高效的神经网络滤波器模型进行模型的自适应决策使用,本申请实施例也不作任何限定。
简言之,本申请实施例提出了一种基于深度学习的模型自适应决策使用模块,用于对CNNLF模型的使用进行自适应决策,不再需要计算率失真代价和传输帧级、CTU级等开关信息,避免额外的比特开销,提升编码性能。模型自适应决策使用模块可以看作是由多层卷积神经网络和多层全连接神经网络组成的预设选择网络模型,其输入为当前块的第二重建图像块(即CNNLF模型的输入重建图像块),输出为各个CNNLF模型以及决策为关闭CNNLF模型的概率分布情况。模型自适应决策使用模块位于编码器/解码器中的位置如图5所示,模型自适应选择模块的使用不依赖于DBF、SAO、ALF、CNNLF的标志位,只是在位置上置于CNNLF之前。
在一种具体的示例中,本申请实施例的技术方案作用在解码器的环路滤波模块中,其具体流程如下:
解码器获取并解析码流,当解析到环路滤波模块时,按照预设的滤波器顺序进行处理。这里,预设的滤波器顺序为DBF滤波---->SAO滤波---->模型自适应决策使用模块---->CNNLF滤波---->ALF滤波。当进入模型自适应决策使用模块时,
(a)首先根据解码得到的model_adaptive_decision_enable_flag判断当前块下是否允许使用模型自适应决策使用模块进行模型决策。如果model_adaptive_decision_enable_flag为“1”,那么对当前块尝试进行模型自适应决策使用模块处理,跳转至(b);如果model_adaptive_decision_enable_flag为“0”,那么跳转至(e);
(b)判断当前块的颜色分量类型,如果当前块为亮度块,那么跳转至(c);如果当前块为色度块,那么跳转(d);
(c)对于亮度分量,将CNNLF模型的输入重建亮度图像块作为模型自适应决策使用模块的输入,输出为各个亮度CNNLF模型以及决策为关闭亮度CNNLF模型的概率分布情况。若输出的概率值最大的为决策关闭亮度CNNLF模型,则跳转至(e);若输出的概率值最大的为某个亮度CNNLF模型的索引序号,则选择该模型对当前亮度图像块进行CNNLF滤波处理,得到最终输出的重建亮度图像块;
(d)对于色度分量,将CNNLF模型的输入重建色度图像块作为模型自适应决策使用模块的输入,输出为各个色度CNNLF模型以及决策为关闭色度CNNLF模型的概率分布情况。若输出的概率值最大的为决策关闭色度CNNLF模型,则跳转至(e);若输出的概率值最大的为某个色度CNNLF模型的索引序号,则选择该模型对当前色度图像块进行CNNLF滤波处理,得到最终输出的重建色度图像块;
(e)如果当前帧已完成模型自适应决策使用模块的处理,那么加载下一帧进行处理,然后跳转至(a)。
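上述(a)~(e)的逐块决策流程可以用如下示意性的纯Python代码串起来。其中 select_model、apply_cnnlf 为示例性占位函数,分别代表预设选择网络模型推理和对应索引序号的CNNLF滤波,并约定概率向量末位对应"关闭CNNLF模型":

```python
# 示意性代码:针对单个图像块的模型自适应决策使用流程。
def process_block(rec_block, enable_flag, select_model, apply_cnnlf):
    if enable_flag != 1:                     # (a) 不允许进行模型自适应决策
        return rec_block
    probs = select_model(rec_block)          # (c)/(d) 输出概率分布,末位为"关闭"
    best = max(range(len(probs)), key=lambda k: probs[k])
    if best == len(probs) - 1:               # 决策为关闭 CNNLF,跳转 (e)
        return rec_block
    return apply_cnnlf(best, rec_block)      # 用索引序号对应的 CNNLF 模型滤波
```

亮度块与色度块((b)中的分支)各自调用对应分量的选择网络与候选模型集合,流程结构相同。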
在实现中,其语法元素的修改如下所示。
(1)序列头定义,其语法元素的修改如表1所示。
表1
(表1的具体语法元素内容在原公开文本中以图像形式给出)
其中,基于神经网络的模型自适应决策使用的允许标志可以用model_adaptive_decision_enable_flag表示。
(2)帧内预测图像头定义,其语法元素的修改如表2所示。
表2
(表2的具体语法元素内容在原公开文本中以图像形式给出)
其中,当基于神经网络的模型自适应决策使用的允许标志model_adaptive_decision_enable_flag为1时,可取消以下语义的定义:
图像级神经网络滤波允许标志picture_nn_filter_enable_flag[compIdx]
图像级选择性滤波自适应标志picture_nn_filter_adaptive_flag[compIdx]
图像级神经网络滤波模型索引picture_nn_filter_set_index[compIdx]
(3)帧间预测图像头定义,其语法元素的修改如表3所示。
表3
(表3的具体语法元素内容在原公开文本中以图像形式给出)
其中,当基于神经网络的模型自适应决策使用的允许标志model_adaptive_decision_enable_flag为1时,可取消以下语义的定义:
图像级神经网络滤波允许标志picture_nn_filter_enable_flag[compIdx]
图像级选择性滤波自适应标志picture_nn_filter_adaptive_flag[compIdx]
图像级神经网络滤波模型索引picture_nn_filter_set_index[compIdx]
(4)片定义,其语法元素的修改如表4所示。
表4
(表4的具体语法元素内容在原公开文本中以图像形式给出)
其中,当基于神经网络的模型自适应决策使用的允许标志model_adaptive_decision_enable_flag为1时,可取消以下语义的定义:
最大编码单元神经网络滤波允许标志nn_filter_lcu_enable_flag[compIdx][LcuIdx]
最大编码单元神经网络滤波模型索引序号标志nn_filter_lcu_set_index[compIdx][LcuIdx]
通过上述实施例,对前述实施例的具体实现进行了详细阐述,从中可以看出,通过前述实施例的技术方案,该实施例通过引入基于深度学习的模型自适应决策技术,将当前块的第二重建图像块(即CNNLF模型的输入重建图像块)输入多层卷积层加多层全连接层的神经网络结构中,输出各个CNNLF模型以及决策为关闭CNNLF模型的概率分布情况,为第二重建图像块自适应地决策使用合适的CNNLF模型或者不使用CNNLF模型,这时候不再需要计算率失真代价和传输帧级、CTU级等开关信息,避免额外的比特开销,使编码性能提升。
在本申请的又一实施例中,参见图12,其示出了本申请实施例提供的一种编码方法的流程示意图。如图12所示,该方法可以包括:
S1201:确定第一语法元素标识信息的取值。
需要说明的是,视频图像可以划分为多个图像块,每个当前待编码的图像块可以称为编码块。这里,每个编码块可以包括第一图像分量、第二图像分量和第三图像分量;而当前块即为视频图像中当前待进行第一图像分量、第二图像分量或者第三图像分量环路滤波处理的编码块。其中,这里的当前块可以为CTU,也可以为CU,甚至还可以是比CU更小的块,本申请实施例不作任何限定。
在这里,针对第一图像分量、第二图像分量和第三图像分量,从颜色划分角度,本申请实施例可以将其划分为亮度分量和色度分量等两种颜色分量类型。在这种情况下,如果当前块进行亮度分量的预测、反变换与反量化、环路滤波等操作,那么当前块也可以称为亮度块;或者,如果当前块进行色度分量的预测、反变换与反量化、环路滤波等操作,那么当前块也可以称为色度块。
还需要说明的是,在编码器侧,本申请实施例具体提供了一种环路滤波方法,尤其是一种基于深度学习的环路滤波网络模型使用的自适应决策方法,该方法应用在如图3A所示的滤波单元108部分。在这里,滤波单元108可以包括去块滤波器(DBF)、样值自适应补偿滤波器(SAO)、基于残差神经网络的环路滤波器(CNNLF)和自适应修正滤波器(ALF)。对于该滤波单元108来说,利用本申请实施例所述的方法可以对该滤波单元108中的CNNLF模型进行自适应地决策,以便决策出当前块使用CNNLF模型时的目标模型或者当前块不使用CNNLF模型。
更具体地,本申请实施例提出了一种基于深度学习的模型自适应决策使用模块,详见图5所示的模型自适应选择模块,可以用于对环路滤波网络模型(比如CNNLF模型)是否使用以及使用哪一个CNNLF模型进行自适应决策,从而提升编码性能。
在本申请实施例中,对于模型自适应决策使用模块,当前块是否允许使用预设选择网络模型进行模型决策,可以通过一个第一语法元素标识信息进行指示。在一些实施例中,所述确定第一语法元素标识信息的取值,包括:
若当前块允许使用预设选择网络模型进行模型决策,则确定第一语法元素标识信息的取值为第一标识值;和/或,
若当前块不允许使用预设选择网络模型进行模型决策,则确定第一语法元素标识信息的取值为第二标识值。
进一步地,该方法还包括:对第一语法元素标识信息的取值进行编码,将编码比特写入码流。
也就是说,首先可以设置一个第一语法元素标识信息,以指示当前块是否允许使用预设选择网络模型进行模型决策。在这里,如果当前块允许使用预设选择网络模型进行模型决策,那么可以确定第一语法元素标识信息的取值为第一标识值;如果当前块不允许使用预设选择网络模型进行模型决策,那么可以确定第一语法元素标识信息的取值为第二标识值。这样,在编码器中,当确定出第一语法元素标识信息的取值后,将第一语法元素标识信息的取值写入码流以传输到解码器,使得解码器通过解析码流即可获知当前块是否允许使用预设选择网络模型进行模型决策。
在这里,第一标识值和第二标识值不同,而且第一标识值和第二标识值可以是参数形式,也可以是数字形式。具体地,第一语法元素标识信息可以是写入在概述(profile)中的参数,也可以是一个标志(flag)的取值,本申请实施例对此不作任何限定。
以第一语法元素标识信息为一个flag为例,这时候对于第一标识值和第二标识值而言,第一标识值可以设置为1,第二标识值可以设置为0;或者,第一标识值还可以设置为true,第二标识值还可以设置为false;或者,第一标识值还可以设置为0,第二标识值还可以设置为1;或者,第一标识值还可以设置为false,第二标识值还可以设置为true。示例性地,对于flag而言,一般情况下,第一标识值可以为1,第二标识值可以为0,但是并不作任何限定。
还需要说明的是,预设选择网络模型可以看作是一个神经网络,而第一语法元素标识信息可以看作是一个基于神经网络的模型自适应决策的允许标志,这里可以用model_adaptive_decision_enable_flag表示。具体来说,model_adaptive_decision_enable_flag可以用于指示当前块是否允许使用预设选择网络模型进行模型的自适应决策。
这样,以第一标识值为1,第二标识值为0为例,如果model_adaptive_decision_enable_flag的取值为1,那么可以确定当前块允许使用预设选择网络模型进行模型决策;如果model_adaptive_decision_enable_flag的取值为0,那么可以确定当前块不允许使用预设选择网络模型进 行模型决策。
S1202:当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值;其中,至少两个输出值包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值。
需要说明的是,如果当前块允许使用预设选择网络模型进行模型决策,那么这时候可以根据当前块的颜色分量类型、量化参数和所属帧的帧类型等,从若干个候选的预设选择网络模型中确定出当前块使用的预设选择网络模型,然后根据预设选择网络模型确定当前块使用环路滤波网络模型时的至少一个候选环路滤波网络模型和当前块不使用环路滤波网络模型的概率分布情况。具体地,在本申请实施例中,这至少两个输出值可以包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值。
在一种具体的示例中,第一值可以用于反映当前块使用环路滤波网络模型时这至少一个候选环路滤波网络模型的概率分布情况,第二值可以用于反映当前块不使用环路滤波网络模型时的概率分布情况。换言之,第一值和第二值均可以用概率值表示;即根据预设选择网络模型,所确定的至少两个输出值可以为至少两个概率值。或者,第一值和第二值还可以用于反映当前块使用环路滤波网络模型时这至少一个候选环路滤波网络模型和当前块不使用环路滤波网络模型的权重分配情况;即第一值和第二值也可以称为权重值,本申请实施例不作任何限定。
可以理解地,对于颜色分量类型而言,其可以包括亮度分量和色度分量。针对不同的颜色分量类型,这里的预设选择网络模型并不相同。在本申请实施例中,亮度分量对应的预设选择网络模型可以称为亮度选择网络模型,色度分量对应的预设选择网络模型可以称为色度选择网络模型。因此,在一些实施例中,所述确定当前块的预设选择网络模型,可以包括:
若当前块的颜色分量类型为亮度分量(即当前块为亮度块时),则确定当前块的亮度选择网络模型;或者,
若当前块的颜色分量类型为色度分量(即当前块为色度块时),则确定当前块的色度选择网络模型。
相应地,针对不同的颜色分量类型,这里的候选环路滤波网络模型也是不同的。在本申请实施例中,亮度分量对应的一个或多个候选环路滤波网络模型可以称为候选亮度环路滤波网络模型,色度分量对应的一个或多个候选环路滤波网络模型可以称为候选色度环路滤波网络模型。因此,在一些实施例中,所述根据当前块的预设选择网络模型确定至少两个输出值,可以包括:
若当前块的颜色分量类型为亮度分量,则根据亮度选择网络模型确定至少两个亮度输出值;其中,至少两个亮度输出值包括当前块使用亮度环路滤波网络模型时至少一个候选亮度环路滤波网络模型各自对应的第一值和当前块不使用亮度环路滤波网络模型时的第二值;或者,
若当前块的颜色分量类型为色度分量,则根据色度选择网络模型确定至少两个色度输出值;其中,至少两个色度输出值包括当前块使用色度环路滤波网络模型时至少一个候选色度环路滤波网络模型各自对应的第一值和当前块不使用色度环路滤波网络模型时的第二值。
进一步地,对于帧类型来说,其可以包括I帧、P帧和B帧。在本申请实施例中,帧类型可以包括第一类型和第二类型。针对不同的帧类型,这里的预设选择网络模型也是不同的。在一种具体的示例中,第一类型可以为I帧,第二类型可以为非I帧。需要注意的是,这里并不作具体限定。
在一种可能的实施方式中,对于亮度选择网络模型而言,第一类型对应的亮度选择网络模型可以称为第一亮度选择网络模型,第二类型对应的亮度选择网络模型可以称为第二亮度选择网络模型。因此,在一些实施例中,在当前块的颜色分量类型为亮度分量的情况下,所述确定当前块的亮度选择网络模型,可以包括:
若当前块所属帧的帧类型为第一类型,则确定当前块的第一亮度选择网络模型;或者,
若当前块所属帧的帧类型为第二类型,则确定当前块的第二亮度选择网络模型。
相应地,对于候选亮度环路滤波网络模型来说,根据不同的帧类型,候选亮度环路滤波网络模型也是不同的。具体地,第一类型对应的候选亮度环路滤波网络模型可以称为候选第一亮度环路滤波网络模型,第二类型对应的候选亮度环路滤波网络模型可以称为候选第二亮度环路滤波网络模型。因此,在一些实施例中,所述根据亮度选择网络模型确定至少两个亮度输出值,可以包括:
若当前块所属帧的帧类型为第一类型,则根据第一亮度选择网络模型确定至少两个亮度输出值;其中,这至少两个亮度输出值包括当前块使用第一亮度环路滤波网络模型时至少一个候选第一亮度环路滤波网络模型各自对应的第一值和当前块不使用第一亮度环路滤波网络模型时的第二值;或者,
若当前块所属帧的帧类型为第二类型,则根据第二亮度选择网络模型确定至少两个亮度输出值;其中,这至少两个亮度输出值包括当前块使用第二亮度环路滤波网络模型时至少一个候选第二亮度环路滤波网络模型各自对应的第一值和当前块不使用第二亮度环路滤波网络模型时的第二值。
进一步地,在本申请实施例中,对于亮度分量对应的至少一个候选环路滤波网络模型(可简称为“候选亮度环路滤波网络模型”),无论是第一类型对应的至少一个候选第一亮度环路滤波网络模型,还是第二类型对应的至少一个候选第二亮度环路滤波网络模型,这些候选环路滤波网络模型都是通过模型训练得到的。
在一些实施例中,该方法还可以包括:
确定第一训练集;其中,第一训练集包括至少一个第一训练样本和至少一个第二训练样本,第一训练样本的帧类型为第一类型,第二训练样本的帧类型为第二类型,且第一训练样本和第二训练样本均是根据至少一种量化参数得到的;
利用至少一个第一训练样本的亮度分量对第一神经网络结构进行训练,得到至少一个候选第一亮度环路滤波网络模型;以及
利用至少一个第二训练样本的亮度分量对第一神经网络结构进行训练,得到至少一个候选第二亮度环路滤波网络模型。
在这里,第一神经网络结构包括下述至少之一:卷积层、激活层、残差块和跳转连接层。
也就是说,至少一个候选第一亮度环路滤波网络模型和至少一个候选第二亮度环路滤波网络模型是根据至少一个训练样本对第一神经网络结构进行模型训练确定的,且至少一个候选第一亮度环路滤波网络模型和至少一个候选第二亮度环路滤波网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在另一种可能的实施方式中,对于色度选择网络模型而言,第一类型对应的色度选择网络模型可以称为第一色度选择网络模型,第二类型对应的色度选择网络模型可以称为第二色度选择网络模型。因此,在一些实施例中,在当前块的颜色分量类型为色度分量的情况下,所述确定当前块的色度选择网络模型,可以包括:
若当前块所属帧的帧类型为第一类型,则确定当前块的第一色度选择网络模型;或者,
若当前块所属帧的帧类型为第二类型,则确定当前块的第二色度选择网络模型。
相应地,对于候选色度环路滤波网络模型来说,根据不同的帧类型,候选色度环路滤波网络模型也是不同的。具体地,第一类型对应的一个或多个候选色度环路滤波网络模型可以称为候选第一色度环路滤波网络模型,第二类型对应的一个或多个候选色度环路滤波网络模型可以称为候选第二色度环路滤波网络模型。因此,在一些实施例中,所述根据色度选择网络模型确定至少两个色度输出值,可以包括:
若当前块所属帧的帧类型为第一类型,则根据第一色度选择网络模型确定至少两个色度输出值;其中,这至少两个色度输出值包括当前块使用第一色度环路滤波网络模型时至少一个候选第一色度环路滤波网络模型各自对应的第一值和当前块不使用第一色度环路滤波网络模型时的第二值;或者,
若当前块所属帧的帧类型为第二类型,则根据第二色度选择网络模型确定至少两个色度输出值;其中,这至少两个色度输出值包括当前块使用第二色度环路滤波网络模型时至少一个候选第二色度环路滤波网络模型各自对应的第一值和当前块不使用第二色度环路滤波网络模型时的第二值。
进一步地,在本申请实施例中,对于色度分量对应的至少一个候选环路滤波网络模型(可简称为“候选色度环路滤波网络模型”),无论是第一类型对应的至少一个候选第一色度环路滤波网络模型,还是第二类型对应的至少一个候选第二色度环路滤波网络模型,这些候选环路滤波网络模型都是通过模型训练得到的。
在一些实施例中,该方法还可以包括:
确定第一训练集;其中,第一训练集包括至少一个第一训练样本和至少一个第二训练样本,第一训练样本的帧类型为第一类型,第二训练样本的帧类型为第二类型,且第一训练样本和第二训练样本均是根据至少一种量化参数得到的;
利用至少一个第一训练样本的色度分量对第二神经网络结构进行训练,得到至少一个候选第一色度环路滤波网络模型;以及
利用至少一个第二训练样本的色度分量对第二神经网络结构进行训练,得到至少一个候选第二色度环路滤波网络模型。
在这里,第二神经网络结构包括下述至少之一:采样层、卷积层、激活层、残差块、池化层和跳转连接层。
也就是说,至少一个候选第一色度环路滤波网络模型和至少一个候选第二色度环路滤波网络模型是根据至少一个训练样本对第二神经网络结构进行模型训练确定的,且至少一个候选第一色度环路滤波网络模型和至少一个候选第二色度环路滤波网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在一种具体的示例中,第一神经网络结构可以包括第一卷积模块、第一残差模块、第二卷积模块和第一连接模块。其中,对于第一神经网络结构而言,第一卷积模块可以由一层卷积层和一层激活层组成,第二卷积模块可以由两层卷积层和一层激活层组成,连接模块可以由跳转连接层组成,第一残差模块可以包括若干个残差块,且每一个残差块可以由两层卷积层和一层激活层组成。
在另一种具体的示例中,第二神经网络结构可以包括上采样模块、第三卷积模块、第四卷积模块、融合模块、第二残差模块、第五卷积模块和第二连接模块。其中,对于第二神经网络结构而言,第三卷积模块可以由一层卷积层和一层激活层组成,第四卷积模块可以由一层卷积层和一层激活层组成,第五卷积模块可以由两层卷积层、一层激活层和一层池化层组成,连接模块可以由跳转连接层组成,第二残差模块可以包括若干个残差块,且每一个残差块可以由两层卷积层和一层激活层组成。
示例性地,以环路滤波网络模型为CNNLF为例,CNNLF对于亮度分量和色度分量分别设计了不同的网络结构。其中,对于亮度分量,其设计了第一神经网络结构,具体参见图6A和图7A;对于色度分量,其设计了第二神经网络结构,具体参见图6B和图7B。
对于亮度分量,以图7A为例,整个网络结构可以由卷积层、激活层、残差块、跳转连接层等部分组成。这里,卷积层的卷积核可以为3×3,即可以用3×3Conv表示;激活层可以为线性激活函数,即可以用线性整流函数(Rectified Linear Unit,ReLU)表示,又可称为修正线性单元,是一种人工神经网络中常用的激活函数,通常指代以斜坡函数及其变种为代表的非线性函数。残差块(ResBlock)的网络结构如图8中的虚线框所示,可以由卷积层(Conv)、激活层(ReLU)和跳转连接层等组成。在网络结构中,跳转连接层(Concat)是指网络结构中所包括的一条从输入到输出的全局跳转连接,能够使网络能够专注于学习残差,加速了网络的收敛过程。
对于色度分量,以图7B为例,这里引入了亮度分量作为输入之一来指导色度分量的滤波,整个网络结构可以由卷积层、激活层、残差块、池化层、跳转连接层等部分组成。由于分辨率的不一致性,色度分量首先需要进行上采样。为了避免在上采样过程中引入其他噪声,可以通过直接拷贝邻近像素来完成分辨率的扩大,以得到放大色度帧(Enlarged chroma frame)。另外,在网络结构的末端,还使用了池化层(如2×2AvgPool)来完成色度分量的下采样。具体地,在HPM-ModAI的应用中,亮度分量网络的残差块数量可设置为N=20,色度分量网络的残差块数量可设置为N=10。
在这里,CNNLF的使用可以包含离线训练和推理测试两个阶段。其中,在离线训练阶段,可以离线地训练出4个I帧亮度分量模型,4个非I帧亮度分量模型,4个色度U分量模型,4个色度V分量模型等共16种模型。具体地,使用预设图像数据集(例如DIV2K,该数据集有1000张高清图(2K分辨率),其中,800张作为训练,100张作为验证,100张作为测试),将图像从RGB转换成YUV4:2:0格式的单帧视频序列,作为标签数据。然后使用HPM在All Intra配置下对序列进行编码,关闭DBF,SAO和ALF等传统滤波器,量化步长设置为27到50。对于编码得到的重建序列,按照QP 27~31、32~37、38~44、45~50为范围划分为4个区间,切割为128×128的图像块作为训练数据,分别训练了4种I帧亮度分量模型,4种色度U分量模型,4种色度V分量模型。进一步地,使用预设视频数据集(例如BVI-DVC),使用HPM-ModAI在Random Access配置下编码,关闭DBF,SAO和ALF等传统滤波器,并打开I帧的CNNLF,收集编码重建的非I帧数据,分别训练了4种非I帧亮度分量模型。
在推理测试阶段,HPM-ModAI为亮度分量设置了开关形式的帧级标志位与CTU级标志位以控制是否打开CNNLF模型,而为色度分量设置了开关形式的帧级标志位以控制是否打开CNNLF模型。在这里,标志位通常可以用flag表示。另外,帧级标志位由式(1)确定,其中,D=D_net-D_rec表示CNNLF处理后减少的失真(D_net为滤波后的失真,D_rec为滤波前的失真),R表示当前帧的CTU个数,λ与自适应修正滤波器的λ保持一致。当RDcost为负时,打开帧级标志位,否则关闭帧级标志位。
RDcost=D+λ×R           (1)
当帧级标志位打开时,还需要进一步通过率失真代价方式决策每个CTU是否打开CNNLF模型。这里,设置了CTU级标志位以控制是否打开CNNLF。具体地,CTU级标志位由式(2)确定。
RDcost=D          (2)
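式(1)、式(2)的帧级与CTU级标志位决策可以用如下示意性的纯Python代码表示。其中 d_net、d_rec 分别为滤波后/滤波前失真,r 为当前帧的CTU个数,lam 为λ,函数名为示例性假设:

```python
# 示意性代码:按式(1)/式(2)计算率失真代价并决定标志位开关。
def frame_flag_on(d_net, d_rec, r, lam):
    """式(1):RDcost = D + λ×R,其中 D = D_net - D_rec;代价为负时打开帧级标志位。"""
    rd_cost = (d_net - d_rec) + lam * r
    return rd_cost < 0

def ctu_flag_on(d_net, d_rec):
    """式(2):RDcost = D;失真减少(D 为负)时打开 CTU 级标志位。"""
    return (d_net - d_rec) < 0
```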
在一种可能的实施方式中,在HPM-ModAI中,编码器可以通过率失真代价方式确定当前帧或者当前块是否使用CNNLF模型进行滤波处理,但是这时候需要将帧级和CTU级等开关信息写入码流,造成额外的比特开销。
在另一种可能的实施方式中,本申请实施例提出了一种基于深度学习的预设选择网络模型,可以对CNNLF模型的使用进行自适应决策,这时候将不再需要计算率失真代价和编码帧级和CTU级等开关信息。
具体来讲,针对不同的颜色分量类型,其对应的预设选择网络模型也不相同。在这里,亮度分量对应的预设选择网络模型可以称为亮度选择网络模型,色度分量对应的预设选择网络模型可以称为色度选择网络模型。
在一种可能的实施方式中,在当前块的颜色分量类型为亮度分量的情况下,所述确定当前块的亮度选择网络模型,可以包括:
确定至少一个候选亮度选择网络模型,候选亮度选择网络模型包括候选第一亮度选择网络模型和/或候选第二亮度选择网络模型;
确定当前块所属帧的帧类型和量化参数;
若帧类型为第一类型,则从至少一个候选亮度选择网络模型中确定第一类型对应的至少一个候选第一亮度选择网络模型,并根据量化参数从至少一个候选第一亮度选择网络模型中确定当前块的第一亮度选择网络模型;或者,
若帧类型为第二类型,则从至少一个候选亮度选择网络模型中确定第二类型对应的至少一个候选第二亮度选择网络模型,并根据量化参数从至少一个候选第二亮度选择网络模型中确定当前块的第二亮度选择网络模型。
在另一种可能的实施方式中,在当前块的颜色分量类型为色度分量的情况下,所述确定当前块的色度选择网络模型,可以包括:
确定至少一个候选色度选择网络模型,候选色度选择网络模型包括候选第一色度选择网络模型和/或候选第二色度选择网络模型;
确定当前块所属帧的帧类型和量化参数;
若帧类型为第一类型,则从至少一个候选色度选择网络模型中确定第一类型对应的至少一个候选第一色度选择网络模型,并根据量化参数从至少一个候选第一色度选择网络模型中确定当前块的第一色度选择网络模型;或者,
若帧类型为第二类型,则从至少一个候选色度选择网络模型中确定第二类型对应的至少一个候选第二色度选择网络模型,并根据量化参数从至少一个候选第二色度选择网络模型中确定当前块的第二色度选择网络模型。
需要说明的是,当前块的预设选择网络模型不仅和量化参数有关,而且还和帧类型、颜色分量类型有关。其中,不同的颜色分量类型,对应有不同的预设选择网络模型,比如对于亮度分量来说,预设选择网络模型可以是与亮度分量相关的亮度选择网络模型;对于色度分量来说,预设选择网络模型可以是与色度分量相关的色度选择网络模型。而且,不同的帧类型,其对应的预设选择网络模型也是不同的。对于与亮度分量相关的亮度选择网络模型,第一类型对应的亮度选择网络模型可以称为第一亮度选择网络模型,第二类型对应的亮度选择网络模型可以称为第二亮度选择网络模型;对于与色度分量相关的色度选择网络模型,第一类型对应的色度选择网络模型可以称为第一色度选择网络模型,第二类型对应的色度选择网络模型可以称为第二色度选择网络模型。
还需要说明的是,在本申请实施例中,根据不同的量化参数,比如QP的取值为27~31、32~37、38~44、45~50等,以及不同的帧类型,比如第一类型和第二类型等,预先可以训练出至少一个候选亮度选择网络模型(包括候选第一亮度选择网络模型和/或候选第二亮度选择网络模型)以及至少一个候选色度选择网络模型(包括候选第一色度选择网络模型和/或候选第二色度选择网络模型)。
这样,对于亮度分量,在确定出当前块的帧类型后,假定帧类型为I帧,可以从至少一个候选亮度选择网络模型中确定出I帧类型对应的至少一个候选I帧亮度选择网络模型;根据当前块的量化参数,可以从至少一个候选I帧亮度选择网络模型中选取出该量化参数对应的I帧亮度选择网络模型,即当前块的亮度选择网络模型;或者,假定帧类型为非I帧,可以从至少一个候选亮度选择网络模型中确定出非I帧类型对应的至少一个候选非I帧亮度选择网络模型;根据当前块的量化参数,可以从至少一个候选非I帧亮度选择网络模型中选取出该量化参数对应的非I帧亮度选择网络模型,即当前块的亮度选择网络模型。另外,对于色度分量,其色度选择网络模型的确定方式与亮度分量相同,这里不再详述。
进一步地,对于至少一个候选亮度选择网络模型和至少一个候选色度选择网络模型的模型训练,在一些实施例中,该方法还可以包括:
确定第二训练集,其中,第二训练集包括至少一个训练样本,且所述训练样本是根据至少一种量化参数得到的;
利用第二训练集中训练样本的亮度分量对第三神经网络结构进行训练,得到至少一个候选亮度选择网络模型;
利用第二训练集中训练样本的色度分量对第三神经网络结构进行训练,得到至少一个候选色度选择网络模型。
也就是说,至少一个候选亮度选择网络模型是根据至少一个训练样本对第三神经网络结构进行模型训练确定的,且这至少一个候选亮度选择网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。另外,至少一个候选色度选择网络模型也是根据至少一个训练样本对第三神经网络结构进行模型训练确定的,且这至少一个候选色度选择网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
需要说明的是,在本申请实施例中,第三神经网络结构可以包括下述至少之一:卷积层、池化层、全连接层和激活层。
在一种具体的示例中,第三神经网络结构可以包括第六卷积模块和全连接模块,第六卷积模块和全连接模块顺次连接。其中,第六卷积模块可以包括若干个卷积子模块,每一个卷积子模块可以由一层卷积层和一层池化层组成;全连接模块可以包括若干个全连接子模块,每一个全连接子模块可以由一层全连接层和一层激活层组成。
也就是说,预设选择网络模型可以选择多层卷积神经网络和多层全连接层神经网络组成,然后利用训练样本进行深度学习以得到当前块的预设选择网络模型,比如亮度选择网络模型或者色度选择网络模型。
示例性地,以图9B为例,第三神经网络结构可以由3层卷积层和2层全连接层组成,而且每一层卷积层之后设置有池化层;其中,卷积层的卷积核可以为3×3,即可以用3×3Conv表示;池化层可以采用最大值池化层,用2×2MaxPool表示;另外,每一层全连接层之后设置有激活层,在这里,激活层可以为线性激活函数,也可以为非线性激活函数,比如ReLU和Softmax等。
还需要说明的是,对于预设选择网络模型(比如候选亮度选择网络模型或者候选色度选择网络模型),还可以利用损失函数进行模型训练。在一些实施例中,该方法还可以包括:
确定第二训练集以及预设损失函数;其中,第二训练集包括至少一个训练样本,且所述训练样本是根据至少一种量化参数得到的;
利用第二训练集中训练样本的亮度分量对第三神经网络结构进行训练,在所述预设损失函数的损失值收敛到损失阈值时,得到至少一个候选亮度选择网络模型;以及
利用第二训练集中训练样本的色度分量对第三神经网络结构进行训练,在所述预设损失函数的损失值收敛到损失阈值时,得到至少一个候选色度选择网络模型。
需要说明的是,对于预设损失函数来说,在一种可能的实施方式中,本申请实施例还提供了一种加权的损失函数进行模型训练的方法。具体如下式所示,
lossFunction=(clip(Wa×reca+Wb×recb+…+Wn×recn+Woff×rec0,0,N)-orig)²
其中,Wa,Wb,…,Wn,Woff分别表示预设选择网络模型的输出,代表了至少一个候选环路滤波网络模型a,b,…,n,以及不使用环路滤波网络模型(即模型关闭)的概率值。reca,recb,…,recn分别表示使用候选环路滤波网络模型a,b,…,n后的输出重建图像,rec0则表示经过DBF和SAO之后的输出重建图像。Clip函数将数值限定在0~N之间。N表示像素值的最大值,例如对于10bit的YUV图像,N为1023;orig则表示原始图像。
这样,可以将预设选择网络模型的至少两个输出概率值作为至少一个候选CNNLF模型以及不使用CNNLF模型时的输出重建图像的加权权值,最终与原始图像orig计算均方误差,可以得到损失函数值。
在另一种可能的实施方式中,本申请实施例还提供了一种将分类网络常用的交叉熵损失函数应用到本申请实施例的技术方案中。具体如下式所示,
label(i)=argmin((reca-orig)²,(recb-orig)²,…,(recn-orig)²,(rec0-orig)²)
lossFunction=-label(i)×log(softmax(Wi))
其中,label(i)表示至少一个候选环路滤波网络模型a,b,…,n的输出重建图像,以及经过DBF和SAO之后的输出重建图像分别与原始图像计算均方误差,并取其中最小误差所对应的序号的值i。Wa,Wb,…,Wn,Woff分别表示预设选择网络模型的输出,代表了至少一个候选环路滤波网络模型a,b,…,n,以及不使用环路滤波网络模型(即模型关闭)的概率值。Wi表示与label(i)相同序号的概率值。然后计算Wi的softmax,并与label(i)相乘,可以得到交叉熵损失值。
进一步地,根据上述的实施方式,在确定出预设选择网络模型和至少一个候选环路滤波网络模型之后,还可以确定当前块使用环路滤波网络模型时的各个候选环路滤波网络模型以及当前块不使用环路滤波网络模型时的概率分布情况。在一些实施例中,所述根据当前块的预设选择网络模型确定至少两个输出值,可以包括:
确定当前块的第二重建图像块;
将第二重建图像块输入预设选择网络模型,得到至少两个输出值。
在这里,这至少两个输出值可以包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值。
还需要说明的是,以输出值为概率值为例,环路滤波网络模型可以是指前述的CNNLF模型。在确定出待输入CNNLF模型的第二重建图像块之后,将第二重建图像块作为预设选择网络模型的输入,而预设选择网络模型的输出即为至少一个候选CNNLF模型以及当前块不使用CNNLF模型的概率分布情况(包括:这至少一个候选CNNLF模型各自对应的第一值和当前块不使用CNNLF模型时的第二值)。
S1203:根据至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型。
S1204:当当前块使用环路滤波网络模型时,利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。
需要说明的是,在确定出至少一个候选CNNLF模型各自对应的第一值和当前块不使用CNNLF模型时的第二值之后,可以根据这至少两个输出值确定出当前块使用环路滤波网络模型时的目标环路滤波网络模型或者所述当前块不使用环路滤波网络模型。
在一些实施例中,所述根据所述至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型,可以包括:
从至少两个输出值中确定目标值;
若目标值为所述第一值,则确定当前块使用环路滤波网络模型,且将目标值对应的候选环路滤波网络模型作为目标环路滤波网络模型;或者,
若目标值为所述第二值,则确定当前块不使用环路滤波网络模型。
在一种具体的示例中,所述从至少两个输出值中确定目标值,可以包括:从至少两个输出值中选取最大值,将最大值作为所述目标值。
也就是说,无论是亮度环路滤波网络模型还是色度环路滤波网络模型,均是先通过模型训练以得到若干个候选亮度环路滤波网络模型或者若干个候选色度环路滤波网络模型,然后再利用预设选择网络模型进行模型决策,如果这至少两个输出值中第二值为最大值,那么可以确定出当前块不使用环路滤波网络模型;如果这至少两个输出值中第二值不为最大值,那么将第一值中的最大值对应的候选环路滤波网络模型确定为目标环路滤波网络模型,以便利用该目标环路滤波网络模型对当前块进行滤波处理。
还需要说明的是,根据颜色分量类型的不同,预设选择网络模型包括亮度选择网络模型和色度选择网络模型;这样,对于第二重建图像块来说,也可以包括输入重建亮度图像块和输入重建色度图像块。
在一种可能的实施方式中,在当前块的颜色分量类型为亮度分量的情况下,所述根据当前块的预设选择网络模型确定至少两个输出值,可以包括:
确定亮度环路滤波网络模型的输入重建亮度图像块;
将输入重建亮度图像块输入亮度选择网络模型,得到至少两个亮度输出值。
在这里,至少两个亮度输出值可以包括当前块使用亮度环路滤波网络模型时至少一个候选亮度环路滤波网络模型各自对应的第一值和当前块不使用亮度环路滤波网络模型时的第二值。
进一步地,在一些实施例中,以亮度输出值为概率值为例,该方法还可以包括:从至少两个亮度输出值中选取最大概率值;若最大概率值为第一值,则确定当前块使用亮度环路滤波网络模型,且将最大概率值对应的候选亮度环路滤波网络模型作为目标亮度环路滤波网络模型;或者,若最大概率值为第二值,则确定当前块不使用亮度环路滤波网络模型。
在另一种可能的实施方式中,在当前块的颜色分量类型为色度分量的情况下,所述根据当前块的预设选择网络模型确定至少两个输出值,可以包括:
确定色度环路滤波网络模型的输入重建色度图像块;
将输入重建色度图像块输入色度选择网络模型,得到至少两个色度输出值。
在这里,至少两个色度输出值可以包括当前块使用色度环路滤波网络模型时至少一个候选色度环路滤波网络模型各自对应的第一值和当前块不使用色度环路滤波网络模型时的第二值。
进一步地,在一些实施例中,以色度输出值为概率值为例,该方法还可以包括:从至少两个色度输出值中选取最大概率值;若最大概率值为第一值,则确定当前块使用色度环路滤波网络模型,且将最大概率值对应的候选色度环路滤波网络模型作为目标色度环路滤波网络模型;或者,若最大概率值为第二值,则确定当前块不使用色度环路滤波网络模型。
这样,在确定出当前块使用的目标环路滤波网络模型(包括目标亮度环路滤波网络模型或者目标色度环路滤波网络模型)后,可以利用所选取的目标环路滤波网络模型对当前块进行滤波处理。具体地,在一种可能的实施方式中,当当前块使用环路滤波网络模型时,所述利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块,可以包括:
确定当前块的第二重建图像块;
将第二重建图像块输入到目标环路滤波网络模型,得到当前块的第一重建图像块。
在另一种可能的实施方式中,当当前块不使用环路滤波网络模型时,该方法还可以包括:将第二重建图像块确定为当前块的第一重建图像块。
简言之,在确定出这至少两个输出值后,如果从这至少两个输出值中确定出最大值为第二值,意味着当前块不使用环路滤波网络模型的率失真代价最小,那么可以确定出当前块不使用环路滤波网络模型,即将第二重建图像块直接确定为当前块的第一重建图像块;如果从这至少两个输出值中确定出最大值为某一第一值,意味着当前块使用环路滤波网络模型的率失真代价最小,那么可以将某一第一值对应的候选环路滤波网络模型确定为目标环路滤波网络模型,然后将第二重建图像块输入到该目标环路滤波网络模型中,得到当前块的第一重建图像块。
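把决策与滤波合在一起,整个流程可示意如下(Python片段,非规范实现):若第二值最大则直接复用第二重建图像块,否则将其输入目标模型得到第一重建图像块。其中以可调用对象代替真实的CNNLF模型,仅为示意性替身。

```python
def loop_filter_block(rec2, outputs, models):
    # rec2:    第二重建图像块(经 DBF 与 SAO 滤波后,展平的像素列表)
    # outputs: 选择网络的输出值,最后一项为“模型关闭”的概率
    # models:  候选环路滤波网络模型列表(此处用可调用对象示意)
    idx = max(range(len(outputs)), key=outputs.__getitem__)
    if idx == len(outputs) - 1:
        return rec2                 # 不使用模型:第二重建块即第一重建块
    return models[idx](rec2)        # 用目标模型滤波,得到第一重建图像块
```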
在一些实施例中,第二重建图像块(包括输入重建亮度图像块或者输入重建色度图像块)可以是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到的。
还需要说明的是,本申请实施例所述的环路滤波网络模型可以为CNNLF模型。这样,利用所选取的CNNLF模型对当前块进行CNNLF滤波处理,可以得到当前块的第一重建图像块。
进一步地,在一些实施例中,该方法还可以包括:在确定出当前块的第一重建图像块之后,利用自适应修正滤波器对第一重建图像块进行滤波处理。
以图10为例,第二重建图像块是经由去块滤波器(DBF)和样值自适应补偿滤波器(SAO)进行滤波处理后得到的,然后第二重建图像块经由模型自适应选择模块和CNNLF模型后得到的第一重建图像块还可以输入自适应修正滤波器(ALF)继续进行滤波处理。
除此之外,为了节省复杂度,在一些实施例中,在确定当前块使用的目标环路滤波网络模型之后,该方法还可以包括:
确定环路滤波网络模型的标识信息;
对所述环路滤波网络模型的标识信息进行编码,将编码比特写入码流。
在一种具体的示例中,所述确定环路滤波网络模型的标识信息,可以包括:
若当前块使用环路滤波网络模型,则将目标环路滤波网络模型对应的环路滤波网络模型索引序号确定为环路滤波网络模型的标识信息;和/或,
若当前块不使用环路滤波网络模型,则将模型关闭信息确定为环路滤波网络模型的标识信息。
这样,以CNNLF模型为例,根据编码器侧模型自适应决策使用模块的决策结果,如果当前块使用环路滤波网络模型,那么可以将目标环路滤波网络模型对应的环路滤波网络模型索引序号确定为环路滤波网络模型的标识信息;如果当前块不使用环路滤波网络模型,那么可以将模型关闭信息确定为环路滤波网络模型的标识信息;然后对环路滤波网络模型的标识信息进行编码并写入码流中;如此,后续在解码器中根据解码获得的环路滤波网络模型的标识信息即可直接确定出当前块不使用环路滤波网络模型或者当前块使用的环路滤波网络模型索引序号,从而能够降低解码器的复杂度。
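编解码两侧对标识信息的对称处理可示意如下(Python片段,非规范实现;"off"作为模型关闭信息的取值仅为假设,实际语法元素的编码形式由码流语法规定):

```python
MODEL_OFF = "off"  # 假设的“模型关闭信息”取值


def encode_id_info(use_model, target_index=None):
    # 编码器侧:使用模型时写入目标模型的索引序号,否则写入模型关闭信息
    return target_index if use_model else MODEL_OFF


def decode_id_info(info, candidates):
    # 解码器侧:根据解码得到的标识信息直接确定目标模型或不滤波,
    # 无需再运行选择网络,从而降低解码器复杂度
    if info == MODEL_OFF:
        return None                # 当前块不使用环路滤波网络模型
    return candidates[info]        # 按索引序号取目标环路滤波网络模型
```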
在本申请实施例中,针对前述实施例中的第一神经网络结构、第二神经网络结构和第三神经网络结构等,其包括的卷积层数量,全连接层数量,非线性激活函数等均可以进行调整。另外,模型自适应选择模块所针对的环路滤波网络模型,除了CNNLF模型之外,还可以是针对其他高效的神经网络滤波器模型进行模型自适应选择,这里不作任何限定。
简言之,本申请实施例提出了一种基于深度学习的模型自适应决策使用模块,用于对CNNLF模型的使用进行自适应决策,不再需要计算率失真代价和传输帧级、CTU级等开关信息,避免额外的比特开销,提升编码性能。模型自适应决策使用模块可以看作是由多层卷积神经网络和多层全连接神经网络组成的预设选择网络模型,其输入为当前块的第二重建图像块(即CNNLF模型的输入重建图像块),输出为各个CNNLF模型以及决策为关闭CNNLF模型的概率分布情况。模型自适应决策使用模块位于编码器/解码器中的位置如图5所示,模型自适应选择模块的使用不依赖于DBF、SAO、ALF、CNNLF的标志位,只是在位置上置于CNNLF之前。
在一种具体的示例中,本申请实施例的技术方案作用在编码器的环路滤波模块中,其具体流程如下:
编码端进入环路滤波模块时,按照预设的滤波器顺序进行处理。这里,预设的滤波器顺序为DBF滤波---->SAO滤波---->模型自适应决策使用模块---->CNNLF滤波---->ALF滤波。当进入模型自适应决策使用模块时,
(a)首先根据model_adaptive_decision_enable_flag判断当前块下是否允许使用模型自适应决策使用模块进行模型决策。如果model_adaptive_decision_enable_flag为“1”,那么对当前块尝试进行模型自适应决策使用模块处理,跳转至(b);如果model_adaptive_decision_enable_flag为“0”,那么跳转至(e);
(b)判断当前块的颜色分量类型,如果当前块为亮度块,那么跳转至(c);如果当前块为色度块, 那么跳转至(d);
(c)对于亮度分量,将CNNLF模型的输入重建亮度图像块作为模型自适应决策使用模块的输入,输出为各个亮度CNNLF模型以及决策为关闭亮度CNNLF模型的概率分布情况。若输出的概率值最大的为决策关闭亮度CNNLF模型,则跳转至(e);若输出的概率值最大的为某个亮度CNNLF模型的索引序号,则选择该模型对当前亮度图像块进行CNNLF滤波处理,得到最终输出的重建亮度图像块;
(d)对于色度分量,将CNNLF模型的输入重建色度图像块作为模型自适应决策使用模块的输入,输出为各个色度CNNLF模型以及决策为关闭色度CNNLF模型的概率分布情况。若输出的概率值最大的为决策关闭色度CNNLF模型,则跳转至(e);若输出的概率值最大的为某个色度CNNLF模型的索引序号,则选择该模型对当前色度图像块进行CNNLF滤波处理,得到最终输出的重建色度图像块;
(e)如果当前帧已完成模型自适应决策使用模块的处理,那么加载下一帧进行处理,然后跳转至(a)。
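上述(a)至(e)的流程可以用如下Python片段示意(非规范实现,省略了DBF/SAO/ALF等环节;块的数据结构、选择网络与模型均为示意性替身):

```python
def process_frame(blocks, enable_flag, select_net, luma_models, chroma_models):
    # 按 (a)-(e) 的流程对一帧内各块做模型自适应决策
    out = []
    for blk in blocks:
        if not enable_flag:
            # (a) model_adaptive_decision_enable_flag 为 0:不做模型决策
            out.append(blk["rec"])
            continue
        # (b) 按颜色分量类型选择候选模型集合(亮度/色度)
        models = luma_models if blk["is_luma"] else chroma_models
        # (c)/(d) 选择网络输出概率分布,约定最后一项为“关闭模型”的概率
        probs = select_net(blk["rec"])
        idx = max(range(len(probs)), key=probs.__getitem__)
        if idx == len(probs) - 1:
            out.append(blk["rec"])               # 决策为关闭:直接输出重建块
        else:
            out.append(models[idx](blk["rec"]))  # 选中模型做 CNNLF 滤波
    return out  # (e) 当前帧处理完毕,可加载下一帧
```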
在实现中,其语法元素的修改如下所示。其中,对于序列头定义,其语法元素的修改如表1所示;对于帧内预测图像头定义,其语法元素的修改如表2所示;对于帧间预测图像头定义,其语法元素的修改如表3所示;对于片定义,其语法元素的修改如表4所示。
本实施例提供了一种编码方法,应用于编码器。确定第一语法元素标识信息的取值;当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值;其中,至少两个输出值包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值;根据至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;当当前块使用环路滤波网络模型时,利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。这样,通过引入基于深度学习的神经网络技术对环路滤波网络模型进行自适应决策,可以确定出当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;如果当前块使用环路滤波网络模型,那么还可以利用目标环路滤波网络模型对当前块进行滤波处理,如此不仅可以降低复杂度,还可以避免额外的比特开销,提升编码性能,进而能够提高编解码效率;另外,还可以使得最终输出的第一重建图像块更加接近于原始图像块,能够提升视频图像质量。
在本申请的再一实施例中,基于前述实施例相同的发明构思,参见图13,其示出了本申请实施例提供的一种编码器130的组成结构示意图。如图13所示,该编码器130可以包括:第一确定单元1301、第一决策单元1302和第一滤波单元1303;其中,
第一确定单元1301,配置为确定第一语法元素标识信息的取值;
第一决策单元1302,配置为当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值;其中,至少两个输出值包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值;以及根据至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;
第一滤波单元1303,配置为当当前块使用环路滤波网络模型时,利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。
在一些实施例中,第一确定单元1301,还配置为确定当前块的第二重建图像块;
第一滤波单元1303,还配置为将第二重建图像块输入到目标环路滤波网络模型,得到当前块的第一重建图像块。
在一些实施例中,第一滤波单元1303,还配置为将第二重建图像块确定为当前块的第一重建图像块。
在一些实施例中,第一决策单元1302,还配置为从至少两个输出值中确定目标值;以及若目标值为第一值,则确定当前块使用环路滤波网络模型,且将目标值对应的候选环路滤波网络模型作为目标环路滤波网络模型;或者,若目标值为第二值,则确定当前块不使用环路滤波网络模型。
在一些实施例中,第一决策单元1302,还配置为从至少两个输出值中选取最大值,将最大值作为目标值。
在一些实施例中,第一确定单元1301,还配置为若当前块允许使用预设选择网络模型进行模型决策,则确定第一语法元素标识信息的取值为第一标识值;和/或,若当前块不允许使用预设选择网络模型进行模型决策,则确定第一语法元素标识信息的取值为第二标识值。
在一些实施例中,参见图13,编码器130还可以包括编码单元1304,配置为对第一语法元素标识信息的取值进行编码,将编码比特写入码流。
在一些实施例中,第一确定单元1301,还配置为若当前块的颜色分量类型为亮度分量,则确定当前块的亮度选择网络模型;或者,若当前块的颜色分量类型为色度分量,则确定当前块的色度选择网络模型;
相应地,第一决策单元1302,还配置为若当前块的颜色分量类型为亮度分量,则根据亮度选择网络模型确定至少两个亮度输出值;其中,至少两个亮度输出值包括当前块使用亮度环路滤波网络模型时至少一个候选亮度环路滤波网络模型各自对应的第一值和当前块不使用亮度环路滤波网络模型时的第二值;或者,若当前块的颜色分量类型为色度分量,则根据色度选择网络模型确定至少两个色度输出值;其中,至少两个色度输出值包括当前块使用色度环路滤波网络模型时至少一个候选色度环路滤波网络模型各自对应的第一值和当前块不使用色度环路滤波网络模型时的第二值。
在一些实施例中,第一确定单元1301,还配置为在当前块的颜色分量类型为亮度分量的情况下,若当前块所属帧的帧类型为第一类型,则确定当前块的第一亮度选择网络模型;或者,若当前块所属帧的帧类型为第二类型,则确定当前块的第二亮度选择网络模型;
相应地,第一决策单元1302,还配置为若当前块所属帧的帧类型为第一类型,则根据第一亮度选择网络模型确定至少两个亮度输出值;其中,至少两个亮度输出值包括当前块使用第一亮度环路滤波网络模型时至少一个候选第一亮度环路滤波网络模型各自对应的第一值和当前块不使用第一亮度环路滤波网络模型时的第二值;或者,若当前块所属帧的帧类型为第二类型,则根据第二亮度选择网络模型确定至少两个亮度输出值;其中,至少两个亮度输出值包括当前块使用第二亮度环路滤波网络模型时至少一个候选第二亮度环路滤波网络模型各自对应的第一值和当前块不使用第二亮度环路滤波网络模型时的第二值。
在一些实施例中,至少一个候选第一亮度环路滤波网络模型和至少一个候选第二亮度环路滤波网络模型是根据至少一个训练样本对第一神经网络结构进行模型训练确定的,且至少一个候选第一亮度环路滤波网络模型和至少一个候选第二亮度环路滤波网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在一些实施例中,第一神经网络结构包括第一卷积模块、第一残差模块、第二卷积模块和第一连接模块,第一卷积模块、第一残差模块、第二卷积模块和第一连接模块顺次连接,且第一连接模块还与第一卷积模块的输入连接。
在一些实施例中,第一卷积模块由一层卷积层和一层激活层组成,第二卷积模块由两层卷积层和一层激活层组成,连接模块由跳转连接层组成,第一残差模块包括若干个残差块,且残差块由两层卷积层和一层激活层组成。
在一些实施例中,第一确定单元1301,还配置为在当前块的颜色分量类型为色度分量的情况下,若当前块所属帧的帧类型为第一类型,则确定当前块的第一色度选择网络模型;或者,若当前块所属帧的帧类型为第二类型,则确定当前块的第二色度选择网络模型;
相应地,第一决策单元1302,还配置为若当前块所属帧的帧类型为第一类型,则根据第一色度选择网络模型确定至少两个色度输出值;其中,至少两个色度输出值包括当前块使用第一色度环路滤波网络模型时至少一个候选第一色度环路滤波网络模型各自对应的第一值和当前块不使用第一色度环路滤波网络模型时的第二值;或者,若当前块所属帧的帧类型为第二类型,则根据第二色度选择网络模型确定至少两个色度输出值;其中,至少两个色度输出值包括当前块使用第二色度环路滤波网络模型时至少一个候选第二色度环路滤波网络模型各自对应的第一值和当前块不使用第二色度环路滤波网络模型时的第二值。
在一些实施例中,至少一个候选第一色度环路滤波网络模型和至少一个候选第二色度环路滤波网络模型是根据至少一个训练样本对第二神经网络结构进行模型训练确定的,且至少一个候选第一色度环路滤波网络模型和至少一个候选第二色度环路滤波网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在一些实施例中,第二神经网络结构包括上采样模块、第三卷积模块、第四卷积模块、融合模块、第二残差模块、第五卷积模块和第二连接模块,上采样模块和第三卷积模块连接,第三卷积模块和第四卷积模块与融合模块连接,融合模块、第二残差模块、第五卷积模块和第二连接模块顺次连接,且第二连接模块还与上采样模块的输入连接。
在一些实施例中,第三卷积模块由一层卷积层和一层激活层组成,第四卷积模块由一层卷积层和一层激活层组成,第五卷积模块由两层卷积层、一层激活层和一层池化层组成,连接模块由跳转连接层组成,第二残差模块包括若干个残差块,且残差块由两层卷积层和一层激活层组成。
在一些实施例中,第一确定单元1301,还配置为在当前块的颜色分量类型为亮度分量的情况下, 确定至少一个候选亮度选择网络模型,候选亮度选择网络模型包括候选第一亮度选择网络模型和/或候选第二亮度选择网络模型;以及确定当前块所属帧的帧类型和量化参数;若帧类型为第一类型,则从至少一个候选亮度选择网络模型中确定第一类型对应的至少一个候选第一亮度选择网络模型,并根据量化参数从至少一个候选第一亮度选择网络模型中确定当前块的第一亮度选择网络模型;或者,若帧类型为第二类型,则从至少一个候选亮度选择网络模型中确定第二类型对应的至少一个候选第二亮度选择网络模型,并根据量化参数从至少一个候选第二亮度选择网络模型中确定当前块的第二亮度选择网络模型。
在一些实施例中,第一确定单元1301,还配置为在当前块的颜色分量类型为色度分量的情况下,确定至少一个候选色度选择网络模型,候选色度选择网络模型包括候选第一色度选择网络模型和/或候选第二色度选择网络模型;以及确定当前块所属帧的帧类型和量化参数;若帧类型为第一类型,则从至少一个候选色度选择网络模型中确定第一类型对应的至少一个候选第一色度选择网络模型,并根据量化参数从至少一个候选第一色度选择网络模型中确定当前块的第一色度选择网络模型;或者,若帧类型为第二类型,则从至少一个候选色度选择网络模型中确定第二类型对应的至少一个候选第二色度选择网络模型,并根据量化参数从至少一个候选第二色度选择网络模型中确定当前块的第二色度选择网络模型。
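按帧类型、颜色分量类型和量化参数从候选选择网络模型中取模型的对应关系,可以用如下Python片段示意(非规范实现):以(帧类型, 颜色分量, QP档位)为键建立查找表,并按与当前QP最接近的已训练档位取模型。表中的键名、QP档位及"最近档位"匹配策略均为示意性假设,本申请并未限定具体的匹配方式。

```python
# 假设的查找表:键为 (帧类型, 颜色分量, 训练所用 QP 档位),值为模型标识
SELECT_NETS = {
    ("I", "luma", 27): "luma_I_qp27",
    ("I", "luma", 38): "luma_I_qp38",
    ("B", "luma", 38): "luma_B_qp38",
}


def pick_select_net(frame_type, component, qp, table=SELECT_NETS):
    # 先按帧类型与颜色分量筛出候选档位,再取与当前 QP 最接近的档位
    keys = [k for k in table if k[0] == frame_type and k[1] == component]
    best = min(keys, key=lambda k: abs(k[2] - qp))
    return table[best]
```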
在一些实施例中,至少一个候选亮度选择网络模型是根据至少一个训练样本对第三神经网络结构进行模型训练确定的,且至少一个候选亮度选择网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在一些实施例中,至少一个候选色度选择网络模型是根据至少一个训练样本对第三神经网络结构进行模型训练确定的,且至少一个候选色度选择网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在一些实施例中,第三神经网络结构包括第六卷积模块和全连接模块,第六卷积模块和全连接模块顺次连接;其中,第六卷积模块包括若干个卷积子模块,卷积子模块由一层卷积层和一层池化层组成;全连接模块包括若干个全连接子模块,全连接子模块由一层全连接层和一层激活层组成。
在一些实施例中,第一确定单元1301,还配置为确定环路滤波网络模型的标识信息;
编码单元1304,还配置为对环路滤波网络模型的标识信息进行编码,将编码比特写入码流。
在一些实施例中,第一确定单元1301,还配置为若当前块使用环路滤波网络模型,则将目标环路滤波网络模型对应的环路滤波网络模型索引序号确定为环路滤波网络模型的标识信息;和/或,若当前块不使用环路滤波网络模型,则将模型关闭信息确定为环路滤波网络模型的标识信息。
在一些实施例中,环路滤波网络模型为CNNLF模型。
在一些实施例中,第一决策单元1302,还配置为确定当前块的第二重建图像块;以及将第二重建图像块输入预设选择网络模型,得到至少两个输出值。
在一些实施例中,第二重建图像块是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到。
在一些实施例中,第一滤波单元1303,还配置为在确定出第一重建图像块之后,利用自适应修正滤波器对第一重建图像块进行滤波处理。
可以理解地,在本申请实施例中,“单元”可以是部分电路、部分处理器、部分程序或软件等等,当然也可以是模块,还可以是非模块化的。而且在本实施例中的各组成部分可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
所述集成的单元如果以软件功能模块的形式实现并非作为独立的产品进行销售或使用时,可以存储在一个计算机可读取存储介质中,基于这样的理解,本实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或processor(处理器)执行本实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
因此,本申请实施例提供了一种计算机存储介质,应用于编码器130,该计算机存储介质存储有计算机程序,所述计算机程序被第一处理器执行时实现前述实施例中任一项所述的方法。
基于上述编码器130的组成以及计算机存储介质,参见图14,其示出了本申请实施例提供的编码器130的具体硬件结构示意图。如图14所示,可以包括:第一通信接口1401、第一存储器1402和第一处理器1403;各个组件通过第一总线系统1404耦合在一起。可理解,第一总线系统1404用于实现这些组件之间的连接通信。第一总线系统1404除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图14中将各种总线都标为第一总线系统1404。其中,
第一通信接口1401,用于在与其他外部网元之间进行收发信息过程中,信号的接收和发送;
第一存储器1402,用于存储能够在第一处理器1403上运行的计算机程序;
第一处理器1403,用于在运行所述计算机程序时,执行:
确定第一语法元素标识信息的取值;
当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值;其中,至少两个输出值包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值;
根据至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;
当当前块使用环路滤波网络模型时,利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。
可以理解,本申请实施例中的第一存储器1402可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDRSDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(Synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DRRAM)。本申请描述的***和方法的第一存储器1402旨在包括但不限于这些和任意其它适合类型的存储器。
而第一处理器1403可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过第一处理器1403中的硬件的集成逻辑电路或者软件形式的指令完成。上述的第一处理器1403可以是通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于第一存储器1402,第一处理器1403读取第一存储器1402中的信息,结合其硬件完成上述方法的步骤。
可以理解的是,本申请描述的这些实施例可以用硬件、软件、固件、中间件、微码或其组合来实现。对于硬件实现,处理单元可以实现在一个或多个专用集成电路(Application Specific Integrated Circuits,ASIC)、数字信号处理器(Digital Signal Processing,DSP)、数字信号处理设备(DSP Device,DSPD)、可编程逻辑设备(Programmable Logic Device,PLD)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)、通用处理器、控制器、微控制器、微处理器、用于执行本申请所述功能的其它电子单元或其组合中。对于软件实现,可通过执行本申请所述功能的模块(例如过程、函数等)来实现本申请所述的技术。软件代码可存储在存储器中并通过处理器执行。存储器可以在处理器中或在处理器外部实现。
可选地,作为另一个实施例,第一处理器1403还配置为在运行所述计算机程序时,执行前述实施例中任一项所述的方法。
本实施例提供了一种编码器,该编码器可以包括第一确定单元、第一决策单元和第一滤波单元。这样,通过引入基于深度学习的神经网络技术对环路滤波网络模型进行自适应决策,不仅可以降低复杂度,还可以避免额外的比特开销,提升编码性能,进而能够提高编解码效率;另外,还可以使得最终输出的第一重建图像块更加接近于原始图像块,能够提升视频图像质量。
在本申请的再一实施例中,基于前述实施例相同的发明构思,参见图15,其示出了本申请实施例提供的一种解码器150的组成结构示意图。如图15所示,该解码器150可以包括:解析单元1501、第二决策单元1502和第二滤波单元1503;其中,
解析单元1501,配置为解析码流,确定第一语法元素标识信息的取值;
第二决策单元1502,配置为当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值;其中,至少两个输出值包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波 网络模型时的第二值;以及根据至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;
第二滤波单元1503,配置为当当前块使用环路滤波网络模型时,利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。
在一些实施例中,参见图15,解码器150还可以包括第二确定单元1504,配置为确定当前块的第二重建图像块;
第二滤波单元1503,还配置为将第二重建图像块输入到目标环路滤波网络模型,得到当前块的第一重建图像块。
在一些实施例中,第二滤波单元1503,还配置为当当前块不使用环路滤波网络模型时,将第二重建图像块确定为当前块的第一重建图像块。
在一些实施例中,第二决策单元1502,还配置为从至少两个输出值中确定目标值;以及若目标值为第一值,则确定当前块使用环路滤波网络模型,且将目标值对应的候选环路滤波网络模型作为目标环路滤波网络模型;或者,若目标值为第二值,则确定当前块不使用环路滤波网络模型。
在一些实施例中,第二决策单元1502,还配置为从至少两个输出值中选取最大值,将最大值作为目标值。
在一些实施例中,第二确定单元1504,还配置为若第一语法元素标识信息的取值为第一标识值,则确定第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策;或者,若第一语法元素标识信息的取值为第二标识值,则确定第一语法元素标识信息指示当前块不允许使用预设选择网络模型进行模型决策。
在一些实施例中,第二确定单元1504,还配置为若当前块的颜色分量类型为亮度分量,则确定当前块的亮度选择网络模型;或者,若当前块的颜色分量类型为色度分量,则确定当前块的色度选择网络模型;
相应地,第二决策单元1502,还配置为若当前块的颜色分量类型为亮度分量,则根据亮度选择网络模型确定至少两个亮度输出值;其中,至少两个亮度输出值包括当前块使用亮度环路滤波网络模型时至少一个候选亮度环路滤波网络模型各自对应的第一值和当前块不使用亮度环路滤波网络模型时的第二值;或者,若当前块的颜色分量类型为色度分量,则根据色度选择网络模型确定至少两个色度输出值;其中,至少两个色度输出值包括当前块使用色度环路滤波网络模型时至少一个候选色度环路滤波网络模型各自对应的第一值和当前块不使用色度环路滤波网络模型时的第二值。
在一些实施例中,第二确定单元1504,还配置为在当前块的颜色分量类型为亮度分量的情况下,若当前块所属帧的帧类型为第一类型,则确定当前块的第一亮度选择网络模型;或者,若当前块所属帧的帧类型为第二类型,则确定当前块的第二亮度选择网络模型;
相应地,第二决策单元1502,还配置为若当前块所属帧的帧类型为第一类型,则根据第一亮度选择网络模型确定至少两个亮度输出值;其中,至少两个亮度输出值包括当前块使用第一亮度环路滤波网络模型时至少一个候选第一亮度环路滤波网络模型各自对应的第一值和当前块不使用第一亮度环路滤波网络模型时的第二值;或者,若当前块所属帧的帧类型为第二类型,则根据第二亮度选择网络模型确定至少两个亮度输出值;其中,至少两个亮度输出值包括当前块使用第二亮度环路滤波网络模型时至少一个候选第二亮度环路滤波网络模型各自对应的第一值和当前块不使用第二亮度环路滤波网络模型时的第二值。
在一些实施例中,至少一个候选第一亮度环路滤波网络模型和至少一个候选第二亮度环路滤波网络模型是根据至少一个训练样本对第一神经网络结构进行模型训练确定的,且至少一个候选第一亮度环路滤波网络模型和至少一个候选第二亮度环路滤波网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在一些实施例中,第一神经网络结构包括第一卷积模块、第一残差模块、第二卷积模块和第一连接模块,第一卷积模块、第一残差模块、第二卷积模块和第一连接模块顺次连接,且第一连接模块还与第一卷积模块的输入连接。
在一些实施例中,第一卷积模块由一层卷积层和一层激活层组成,第二卷积模块由两层卷积层和一层激活层组成,连接模块由跳转连接层组成,第一残差模块包括若干个残差块,且残差块由两层卷积层和一层激活层组成。
在一些实施例中,第二确定单元1504,还配置为在当前块的颜色分量类型为色度分量的情况下,若当前块所属帧的帧类型为第一类型,则确定当前块的第一色度选择网络模型;或者,若当前块所属帧的帧类型为第二类型,则确定当前块的第二色度选择网络模型;
相应地,第二决策单元1502,还配置为若当前块所属帧的帧类型为第一类型,则根据第一色度选 择网络模型确定至少两个色度输出值;其中,至少两个色度输出值包括当前块使用第一色度环路滤波网络模型时至少一个候选第一色度环路滤波网络模型各自对应的第一值和当前块不使用第一色度环路滤波网络模型时的第二值;或者,若当前块所属帧的帧类型为第二类型,则根据第二色度选择网络模型确定至少两个色度输出值;其中,至少两个色度输出值包括当前块使用第二色度环路滤波网络模型时至少一个候选第二色度环路滤波网络模型各自对应的第一值和当前块不使用第二色度环路滤波网络模型时的第二值。
在一些实施例中,至少一个候选第一色度环路滤波网络模型和至少一个候选第二色度环路滤波网络模型是根据至少一个训练样本对第二神经网络结构进行模型训练确定的,且至少一个候选第一色度环路滤波网络模型和至少一个候选第二色度环路滤波网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在一些实施例中,第二神经网络结构包括上采样模块、第三卷积模块、第四卷积模块、融合模块、第二残差模块、第五卷积模块和第二连接模块,上采样模块和第三卷积模块连接,第三卷积模块和第四卷积模块与融合模块连接,融合模块、第二残差模块、第五卷积模块和第二连接模块顺次连接,且第二连接模块还与上采样模块的输入连接。
在一些实施例中,第三卷积模块由一层卷积层和一层激活层组成,第四卷积模块由一层卷积层和一层激活层组成,第五卷积模块由两层卷积层、一层激活层和一层池化层组成,连接模块由跳转连接层组成,第二残差模块包括若干个残差块,且残差块由两层卷积层和一层激活层组成。
在一些实施例中,第二确定单元1504,还配置为在当前块的颜色分量类型为亮度分量的情况下,确定至少一个候选亮度选择网络模型,候选亮度选择网络模型包括候选第一亮度选择网络模型和/或候选第二亮度选择网络模型;以及确定当前块所属帧的帧类型和量化参数;若帧类型为第一类型,则从至少一个候选亮度选择网络模型中确定第一类型对应的至少一个候选第一亮度选择网络模型,并根据量化参数从至少一个候选第一亮度选择网络模型中确定当前块的第一亮度选择网络模型;或者,若帧类型为第二类型,则从至少一个候选亮度选择网络模型中确定第二类型对应的至少一个候选第二亮度选择网络模型,并根据量化参数从至少一个候选第二亮度选择网络模型中确定当前块的第二亮度选择网络模型。
在一些实施例中,第二确定单元1504,还配置为在当前块的颜色分量类型为色度分量的情况下,确定至少一个候选色度选择网络模型,候选色度选择网络模型包括候选第一色度选择网络模型和/或候选第二色度选择网络模型;以及确定当前块所属帧的帧类型和量化参数;若帧类型为第一类型,则从至少一个候选色度选择网络模型中确定第一类型对应的至少一个候选第一色度选择网络模型,并根据量化参数从至少一个候选第一色度选择网络模型中确定当前块的第一色度选择网络模型;或者,若帧类型为第二类型,则从至少一个候选色度选择网络模型中确定第二类型对应的至少一个候选第二色度选择网络模型,并根据量化参数从至少一个候选第二色度选择网络模型中确定当前块的第二色度选择网络模型。
在一些实施例中,至少一个候选亮度选择网络模型是根据至少一个训练样本对第三神经网络结构进行模型训练确定的,且至少一个候选亮度选择网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在一些实施例中,至少一个候选色度选择网络模型是根据至少一个训练样本对第三神经网络结构进行模型训练确定的,且至少一个候选色度选择网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
在一些实施例中,第三神经网络结构包括第六卷积模块和全连接模块,第六卷积模块和全连接模块顺次连接;其中,第六卷积模块包括若干个卷积子模块,卷积子模块由一层卷积层和一层池化层组成;全连接模块包括若干个全连接子模块,全连接子模块由一层全连接层和一层激活层组成。
在一些实施例中,解析单元1501,还配置为当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,解析码流,确定环路滤波网络模型的标识信息;
第二确定单元1504,还配置为若环路滤波网络模型的标识信息为模型关闭信息,则确定当前块不使用环路滤波网络模型;或者,若环路滤波网络模型的标识信息为环路滤波网络模型索引序号,则根据环路滤波网络模型索引序号,从至少一个候选环路滤波网络模型中确定当前块使用的目标环路滤波网络模型;
第二滤波单元1503,还配置为利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。
在一些实施例中,环路滤波网络模型为CNNLF模型。
在一些实施例中,第二确定单元1504,还配置为确定当前块的第二重建图像块;
第二决策单元1502,还配置为将第二重建图像块输入预设选择网络模型,得到至少两个输出值。
在一些实施例中,第二重建图像块是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到。
在一些实施例中,第二滤波单元1503,还配置为在确定出第一重建图像块之后,利用自适应修正滤波器对第一重建图像块进行滤波处理。
可以理解地,在本实施例中,“单元”可以是部分电路、部分处理器、部分程序或软件等等,当然也可以是模块,还可以是非模块化的。而且在本实施例中的各组成部分可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
所述集成的单元如果以软件功能模块的形式实现并非作为独立的产品进行销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本实施例提供了一种计算机存储介质,应用于解码器150,该计算机存储介质存储有计算机程序,所述计算机程序被第二处理器执行时实现前述实施例中任一项所述的方法。
基于上述解码器150的组成以及计算机存储介质,参见图16,其示出了本申请实施例提供的解码器150的具体硬件结构示意图。如图16所示,可以包括:第二通信接口1601、第二存储器1602和第二处理器1603;各个组件通过第二总线系统1604耦合在一起。可理解,第二总线系统1604用于实现这些组件之间的连接通信。第二总线系统1604除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图16中将各种总线都标为第二总线系统1604。其中,
第二通信接口1601,用于在与其他外部网元之间进行收发信息过程中,信号的接收和发送;
第二存储器1602,用于存储能够在第二处理器1603上运行的计算机程序;
第二处理器1603,用于在运行所述计算机程序时,执行:
解析码流,确定第一语法元素标识信息的取值;
当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值;其中,至少两个输出值包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值;
根据至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;
当当前块使用环路滤波网络模型时,利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。
可选地,作为另一个实施例,第二处理器1603还配置为在运行所述计算机程序时,执行前述实施例中任一项所述的方法。
可以理解,第二存储器1602与第一存储器1402的硬件功能类似,第二处理器1603与第一处理器1403的硬件功能类似;这里不再详述。
本实施例提供了一种解码器,该解码器可以包括解析单元、第二决策单元和第二滤波单元。这样,通过引入基于深度学习的神经网络技术对环路滤波网络模型进行自适应决策,不仅可以降低复杂度,还可以避免额外的比特开销,提升编码性能,进而能够提高编解码效率;另外,还可以使得最终输出的第一重建图像块更加接近于原始图像块,能够提升视频图像质量。
在本申请的再一实施例中,参见图17,其示出了本申请实施例提供的一种编解码系统的组成结构示意图。如图17所示,编解码系统170可以包括前述实施例任一项所述的编码器130和前述实施例任一项所述的解码器150。
在一些实施例中,本申请实施例还提供了一种码流,该码流是根据待编码信息进行比特编码生成的;其中,待编码信息包括第一语法元素标识信息的取值,第一语法元素标识信息用于指示当前块是否允许使用预设选择网络模型进行模型决策。
进一步地,在一些实施例中,这里的待编码信息还可以包括环路滤波网络模型的标识信息;其中,环路滤波网络模型的标识信息用于确定当前块使用环路滤波网络模型时的环路滤波网络模型索引序号或者当前块不使用环路滤波网络模型。
需要说明的是,在编解码系统170中,编码器130可以将码流传输到解码器150。这样,解码器150通过解析码流可以获取到第一语法元素标识信息的取值,以便确定出当前块是否允许使用预设选择网络模型进行模型决策。
这样,在本申请实施例中,通过引入基于深度学习的神经网络技术对环路滤波网络模型进行自适应决策,可以确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;如果当前块使用环路滤波网络模型,那么还可以利用目标环路滤波网络模型对当前块进行滤波处理,如此不仅可以降低复杂度,还可以避免额外的比特开销,提升编码性能,进而能够提高编解码效率;另外,还可以使得最终输出的第一重建图像块更加接近于原始图像块,能够提升视频图像质量。
需要说明的是,在本申请中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
本申请所提供的几个方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。
本申请所提供的几个产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。
本申请所提供的几个方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。
工业实用性
本申请实施例中,无论是编码器还是解码器,在确定出第一语法元素标识信息的取值之后,当第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值,这至少两个输出值包括当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和当前块不使用环路滤波网络模型时的第二值;根据至少两个输出值,确定当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;当当前块使用环路滤波网络模型时,利用目标环路滤波网络模型对当前块进行滤波处理,得到当前块的第一重建图像块。这样,通过引入基于深度学习的神经网络技术对环路滤波网络模型进行自适应决策,可以确定出当前块使用环路滤波网络模型时的目标环路滤波网络模型或者当前块不使用环路滤波网络模型;如果当前块使用环路滤波网络模型,那么还可以利用目标环路滤波网络模型对当前块进行滤波处理,如此不仅可以降低复杂度,还可以避免额外的比特开销,提升编码性能,进而能够提高编解码效率;另外,还可以使得最终输出的第一重建图像块更加接近于原始图像块,能够提升视频图像质量。

Claims (56)

  1. 一种解码方法,应用于解码器,所述方法包括:
    解析码流,确定第一语法元素标识信息的取值;
    当所述第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值;其中,所述至少两个输出值包括所述当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和所述当前块不使用环路滤波网络模型时的第二值;
    根据所述至少两个输出值,确定所述当前块使用环路滤波网络模型时的目标环路滤波网络模型或者所述当前块不使用环路滤波网络模型;
    当所述当前块使用环路滤波网络模型时,利用所述目标环路滤波网络模型对所述当前块进行滤波处理,得到所述当前块的第一重建图像块。
  2. 根据权利要求1所述的方法,其中,当所述当前块使用环路滤波网络模型时,所述利用所述目标环路滤波网络模型对所述当前块进行滤波处理,得到所述当前块的第一重建图像块,包括:
    确定所述当前块的第二重建图像块;
    将所述第二重建图像块输入到所述目标环路滤波网络模型,得到所述当前块的第一重建图像块。
  3. 根据权利要求2所述的方法,其中,当所述当前块不使用环路滤波网络模型时,所述方法还包括:
    将所述第二重建图像块确定为所述当前块的第一重建图像块。
  4. 根据权利要求1所述的方法,其中,所述根据所述至少两个输出值,确定所述当前块使用环路滤波网络模型时的目标环路滤波网络模型或者所述当前块不使用环路滤波网络模型,包括:
    从所述至少两个输出值中确定目标值;
    若所述目标值为所述第一值,则确定所述当前块使用环路滤波网络模型,且将所述目标值对应的候选环路滤波网络模型作为所述目标环路滤波网络模型;或者,
    若所述目标值为所述第二值,则确定所述当前块不使用环路滤波网络模型。
  5. 根据权利要求4所述的方法,其中,所述从所述至少两个输出值中确定目标值,包括:从所述至少两个输出值中选取最大值,将所述最大值作为所述目标值。
  6. 根据权利要求1所述的方法,其中,所述方法还包括:
    若所述第一语法元素标识信息的取值为第一标识值,则确定所述第一语法元素标识信息指示所述当前块允许使用所述预设选择网络模型进行模型决策;或者,
    若所述第一语法元素标识信息的取值为第二标识值,则确定所述第一语法元素标识信息指示所述当前块不允许使用所述预设选择网络模型进行模型决策。
  7. 根据权利要求1所述的方法,其中,所述方法还包括:
    若所述当前块的颜色分量类型为亮度分量,则确定所述当前块的亮度选择网络模型;或者,
    若所述当前块的颜色分量类型为色度分量,则确定所述当前块的色度选择网络模型;
    所述根据当前块的预设选择网络模型确定至少两个输出值,包括:
    若所述当前块的颜色分量类型为亮度分量,则根据所述亮度选择网络模型确定至少两个亮度输出值;其中,所述至少两个亮度输出值包括所述当前块使用亮度环路滤波网络模型时至少一个候选亮度环路滤波网络模型各自对应的第一值和所述当前块不使用所述亮度环路滤波网络模型时的第二值;或者,
    若所述当前块的颜色分量类型为色度分量,则根据所述色度选择网络模型确定至少两个色度输出值;其中,所述至少两个色度输出值包括所述当前块使用色度环路滤波网络模型时至少一个候选色度环路滤波网络模型各自对应的第一值和所述当前块不使用所述色度环路滤波网络模型时的第二值。
  8. 根据权利要求7所述的方法,其中,在所述当前块的颜色分量类型为亮度分量的情况下,所述确定所述当前块的亮度选择网络模型,包括:
    若所述当前块所属帧的帧类型为第一类型,则确定所述当前块的第一亮度选择网络模型;或者,
    若所述当前块所属帧的帧类型为第二类型,则确定所述当前块的第二亮度选择网络模型;
    所述根据所述亮度选择网络模型确定至少两个亮度输出值,包括:
    若所述当前块所属帧的帧类型为第一类型,则根据所述第一亮度选择网络模型确定所述至少两个亮度输出值;其中,所述至少两个亮度输出值包括所述当前块使用第一亮度环路滤波网络模型时至少一个候选第一亮度环路滤波网络模型各自对应的第一值和所述当前块不使用所述第一亮度环路滤波网络模型时的第二值;或者,
    若所述当前块所属帧的帧类型为第二类型,则根据所述第二亮度选择网络模型确定所述至少两个亮度输出值;其中,所述至少两个亮度输出值包括所述当前块使用第二亮度环路滤波网络模型时至少一个候选第二亮度环路滤波网络模型各自对应的第一值和所述当前块不使用所述第二亮度环路滤波网络模型时的第二值。
  9. 根据权利要求8所述的方法,其中,第一神经网络结构包括第一卷积模块、第一残差模块、第二卷积模块和第一连接模块,所述第一卷积模块、所述第一残差模块、所述第二卷积模块和所述第一连接模块顺次连接,且所述第一连接模块还与所述第一卷积模块的输入连接。
  10. 根据权利要求9所述的方法,其中,所述第一卷积模块由一层卷积层和一层激活层组成,所述第二卷积模块由两层卷积层和一层激活层组成,所述连接模块由跳转连接层组成,所述第一残差模块包括若干个残差块,且所述残差块由两层卷积层和一层激活层组成。
  11. 根据权利要求7所述的方法,其中,在所述当前块的颜色分量类型为色度分量的情况下,所述确定所述当前块的色度选择网络模型,包括:
    若所述当前块所属帧的帧类型为第一类型,则确定所述当前块的第一色度选择网络模型;或者,
    若所述当前块所属帧的帧类型为第二类型,则确定所述当前块的第二色度选择网络模型;
    所述根据所述色度选择网络模型确定至少两个色度输出值,包括:
    若所述当前块所属帧的帧类型为第一类型,则根据所述第一色度选择网络模型确定所述至少两个色度输出值;其中,所述至少两个色度输出值包括所述当前块使用第一色度环路滤波网络模型时至少一个候选第一色度环路滤波网络模型各自对应的第一值和所述当前块不使用所述第一色度环路滤波网络模型时的第二值;或者,
    若所述当前块所属帧的帧类型为第二类型,则根据所述第二色度选择网络模型确定所述至少两个色度输出值;其中,所述至少两个色度输出值包括所述当前块使用第二色度环路滤波网络模型时至少一个候选第二色度环路滤波网络模型各自对应的第一值和所述当前块不使用所述第二色度环路滤波网络模型时的第二值。
  12. 根据权利要求11所述的方法,其中,第二神经网络结构包括上采样模块、第三卷积模块、第四卷积模块、融合模块、第二残差模块、第五卷积模块和第二连接模块,所述上采样模块和所述第三卷积模块连接,所述第三卷积模块和所述第四卷积模块与所述融合模块连接,所述融合模块、所述第二残差模块、所述第五卷积模块和所述第二连接模块顺次连接,且所述第二连接模块还与所述上采样模块的输入连接。
  13. 根据权利要求12所述的方法,其中,所述第三卷积模块由一层卷积层和一层激活层组成,所述第四卷积模块由一层卷积层和一层激活层组成,所述第五卷积模块由两层卷积层、一层激活层和一层池化层组成,所述连接模块由跳转连接层组成,所述第二残差模块包括若干个残差块,且所述残差块由两层卷积层和一层激活层组成。
  14. 根据权利要求8所述的方法,其中,在所述当前块的颜色分量类型为亮度分量的情况下,所述确定所述当前块的亮度选择网络模型,包括:
    确定至少一个候选亮度选择网络模型,所述候选亮度选择网络模型包括所述候选第一亮度选择网络模型和/或所述候选第二亮度选择网络模型;
    确定所述当前块所属帧的帧类型和量化参数;
    若所述帧类型为第一类型,则从所述至少一个候选亮度选择网络模型中确定所述第一类型对应的至少一个候选第一亮度选择网络模型,并根据所述量化参数从所述至少一个候选第一亮度选择网络模型中确定所述当前块的第一亮度选择网络模型;或者,
    若所述帧类型为第二类型,则从所述至少一个候选亮度选择网络模型中确定所述第二类型对应的至少一个候选第二亮度选择网络模型,并根据所述量化参数从所述至少一个候选第二亮度选择网络模型中确定所述当前块的第二亮度选择网络模型。
  15. 根据权利要求11所述的方法,其中,在所述当前块的颜色分量类型为色度分量的情况下,所述确定所述当前块的色度选择网络模型,包括:
    确定至少一个候选色度选择网络模型,所述候选色度选择网络模型包括所述候选第一色度选择网络模型和/或所述候选第二色度选择网络模型;
    确定所述当前块所属帧的帧类型和量化参数;
    若所述帧类型为第一类型,则从所述至少一个候选色度选择网络模型中确定所述第一类型对应的至少一个候选第一色度选择网络模型,并根据所述量化参数从所述至少一个候选第一色度选择网络模型中确定所述当前块的第一色度选择网络模型;或者,
    若所述帧类型为第二类型,则从所述至少一个候选色度选择网络模型中确定所述第二类型对应的至 少一个候选第二色度选择网络模型,并根据所述量化参数从所述至少一个候选第二色度选择网络模型中确定所述当前块的第二色度选择网络模型。
  16. 根据权利要求14或15所述的方法,其中,第三神经网络结构包括第六卷积模块和全连接模块,所述第六卷积模块和所述全连接模块顺次连接;
    其中,所述第六卷积模块包括若干个卷积子模块,所述卷积子模块由一层卷积层和一层池化层组成;所述全连接模块包括若干个全连接子模块,所述全连接子模块由一层全连接层和一层激活层组成。
  17. 根据权利要求1所述的方法,其中,所述方法还包括:
    当所述第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,解析所述码流,确定环路滤波网络模型的标识信息;
    若所述环路滤波网络模型的标识信息为模型关闭信息,则确定所述当前块不使用环路滤波网络模型;或者,
    若所述环路滤波网络模型的标识信息为环路滤波网络模型索引序号,则根据所述环路滤波网络模型索引序号,从至少一个候选环路滤波网络模型中确定所述当前块使用的目标环路滤波网络模型;
    利用所述目标环路滤波网络模型对所述当前块进行滤波处理,得到所述当前块的第一重建图像块。
  18. 根据权利要求1所述的方法,其中,所述环路滤波网络模型为基于残差神经网络的环路滤波器(CNNLF)模型。
  19. 根据权利要求1所述的方法,其中,所述根据当前块的预设选择网络模型确定至少两个输出值,包括:
    确定所述当前块的第二重建图像块;
    将所述第二重建图像块输入所述预设选择网络模型,得到所述至少两个输出值。
  20. 根据权利要求19所述的方法,其中,所述第二重建图像块是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到。
  21. 根据权利要求1至20任一项所述的方法,其中,所述方法还包括:
    在确定出所述第一重建图像块之后,利用自适应修正滤波器对所述第一重建图像块进行滤波处理。
  22. 一种编码方法,应用于编码器,所述方法包括:
    确定第一语法元素标识信息的取值;
    当所述第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据所述当前块的预设选择网络模型确定至少两个输出值;其中,所述至少两个输出值包括所述当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和所述当前块不使用环路滤波网络模型时的第二值;
    根据所述至少两个输出值,确定所述当前块使用环路滤波网络模型时的目标环路滤波网络模型或者所述当前块不使用环路滤波网络模型;
    当所述当前块使用环路滤波网络模型时,利用所述目标环路滤波网络模型对所述当前块进行滤波处理,得到所述当前块的第一重建图像块。
  23. 根据权利要求22所述的方法,其中,当所述当前块使用环路滤波网络模型时,所述利用所述目标环路滤波网络模型对所述当前块进行滤波处理,得到所述当前块的第一重建图像块,包括:
    确定所述当前块的第二重建图像块;
    将所述第二重建图像块输入到所述目标环路滤波网络模型,得到所述当前块的第一重建图像块。
  24. 根据权利要求23所述的方法,其中,当所述当前块不使用环路滤波网络模型时,所述方法还包括:
    将所述第二重建图像块确定为所述当前块的第一重建图像块。
  25. 根据权利要求22所述的方法,其中,所述根据所述至少两个输出值,确定所述当前块使用环路滤波网络模型时的目标环路滤波网络模型或者所述当前块不使用环路滤波网络模型,包括:
    从所述至少两个输出值中确定目标值;
    若所述目标值为所述第一值,则确定所述当前块使用环路滤波网络模型,且将所述目标值对应的候选环路滤波网络模型作为所述目标环路滤波网络模型;或者,
    若所述目标值为所述第二值,则确定所述当前块不使用环路滤波网络模型。
  26. 根据权利要求25所述的方法,其中,所述从所述至少两个输出值中确定目标值,包括:从所述至少两个输出值中选取最大值,将所述最大值作为所述目标值。
  27. 根据权利要求22所述的方法,其中,所述确定第一语法元素标识信息的取值,包括:
    若所述当前块允许使用预设选择网络模型进行模型决策,则确定所述第一语法元素标识信息的取值为第一标识值;和/或,
    若所述当前块不允许使用预设选择网络模型进行模型决策,则确定所述第一语法元素标识信息的取值为第二标识值。
  28. 根据权利要求27所述的方法,其中,所述方法还包括:
    对所述第一语法元素标识信息的取值进行编码,将编码比特写入码流。
  29. 根据权利要求22所述的方法,其中,所述方法还包括:
    若所述当前块的颜色分量类型为亮度分量,则确定所述当前块的亮度选择网络模型;或者,
    若所述当前块的颜色分量类型为色度分量,则确定所述当前块的色度选择网络模型;
    所述根据当前块的预设选择网络模型确定至少两个输出值,包括:
    若所述当前块的颜色分量类型为亮度分量,则根据所述亮度选择网络模型确定至少两个亮度输出值;其中,所述至少两个亮度输出值包括所述当前块使用亮度环路滤波网络模型时至少一个候选亮度环路滤波网络模型各自对应的第一值和所述当前块不使用所述亮度环路滤波网络模型时的第二值;或者,
    若所述当前块的颜色分量类型为色度分量,则根据所述色度选择网络模型确定至少两个色度输出值;其中,所述至少两个色度输出值包括所述当前块使用色度环路滤波网络模型时至少一个候选色度环路滤波网络模型各自对应的第一值和所述当前块不使用所述色度环路滤波网络模型时的第二值。
  30. 根据权利要求29所述的方法,其中,在所述当前块的颜色分量类型为亮度分量的情况下,所述确定所述当前块的亮度选择网络模型,包括:
    若所述当前块所属帧的帧类型为第一类型,则确定所述当前块的第一亮度选择网络模型;或者,
    若所述当前块所属帧的帧类型为第二类型,则确定所述当前块的第二亮度选择网络模型;
    所述根据所述亮度选择网络模型确定至少两个亮度输出值,包括:
    若所述当前块所属帧的帧类型为第一类型,则根据所述第一亮度选择网络模型确定所述至少两个亮度输出值;其中,所述至少两个亮度输出值包括所述当前块使用第一亮度环路滤波网络模型时至少一个候选第一亮度环路滤波网络模型各自对应的第一值和所述当前块不使用所述第一亮度环路滤波网络模型时的第二值;或者,
    若所述当前块所属帧的帧类型为第二类型,则根据所述第二亮度选择网络模型确定所述至少两个亮度输出值;其中,所述至少两个亮度输出值包括所述当前块使用第二亮度环路滤波网络模型时至少一个候选第二亮度环路滤波网络模型各自对应的第一值和所述当前块不使用所述第二亮度环路滤波网络模型时的第二值。
  31. 根据权利要求30所述的方法,其中,所述至少一个候选第一亮度环路滤波网络模型和所述至少一个候选第二亮度环路滤波网络模型是根据至少一个训练样本对第一神经网络结构进行模型训练确定的,且所述至少一个候选第一亮度环路滤波网络模型和所述至少一个候选第二亮度环路滤波网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
  32. 根据权利要求31所述的方法,其中,所述第一神经网络结构包括第一卷积模块、第一残差模块、第二卷积模块和第一连接模块,所述第一卷积模块、所述第一残差模块、所述第二卷积模块和所述第一连接模块顺次连接,且所述第一连接模块还与所述第一卷积模块的输入连接。
  33. 根据权利要求32所述的方法,其中,所述第一卷积模块由一层卷积层和一层激活层组成,所述第二卷积模块由两层卷积层和一层激活层组成,所述连接模块由跳转连接层组成,所述第一残差模块包括若干个残差块,且所述残差块由两层卷积层和一层激活层组成。
  34. 根据权利要求29所述的方法,其中,在所述当前块的颜色分量类型为色度分量的情况下,所述确定所述当前块的色度选择网络模型,包括:
    若所述当前块所属帧的帧类型为第一类型,则确定所述当前块的第一色度选择网络模型;或者,
    若所述当前块所属帧的帧类型为第二类型,则确定所述当前块的第二色度选择网络模型;
    所述根据所述色度选择网络模型确定至少两个色度输出值,包括:
    若所述当前块所属帧的帧类型为第一类型,则根据所述第一色度选择网络模型确定所述至少两个色度输出值;其中,所述至少两个色度输出值包括所述当前块使用第一色度环路滤波网络模型时至少一个候选第一色度环路滤波网络模型各自对应的第一值和所述当前块不使用所述第一色度环路滤波网络模型时的第二值;或者,
    若所述当前块所属帧的帧类型为第二类型,则根据所述第二色度选择网络模型确定所述至少两个色度输出值;其中,所述至少两个色度输出值包括所述当前块使用第二色度环路滤波网络模型时至少一个候选第二色度环路滤波网络模型各自对应的第一值和所述当前块不使用所述第二色度环路滤波网络模型时的第二值。
  35. 根据权利要求34所述的方法,其中,所述至少一个候选第一色度环路滤波网络模型和所述至少一个候选第二色度环路滤波网络模型是根据至少一个训练样本对第二神经网络结构进行模型训练确 定的,且所述至少一个候选第一色度环路滤波网络模型和所述至少一个候选第二色度环路滤波网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
  36. 根据权利要求35所述的方法,其中,所述第二神经网络结构包括上采样模块、第三卷积模块、第四卷积模块、融合模块、第二残差模块、第五卷积模块和第二连接模块,所述上采样模块和所述第三卷积模块连接,所述第三卷积模块和所述第四卷积模块与所述融合模块连接,所述融合模块、所述第二残差模块、所述第五卷积模块和所述第二连接模块顺次连接,且所述第二连接模块还与所述上采样模块的输入连接。
  37. 根据权利要求36所述的方法,其中,所述第三卷积模块由一层卷积层和一层激活层组成,所述第四卷积模块由一层卷积层和一层激活层组成,所述第五卷积模块由两层卷积层、一层激活层和一层池化层组成,所述连接模块由跳转连接层组成,所述第二残差模块包括若干个残差块,且所述残差块由两层卷积层和一层激活层组成。
  38. 根据权利要求30所述的方法,其中,在所述当前块的颜色分量类型为亮度分量的情况下,所述确定所述当前块的亮度选择网络模型,包括:
    确定至少一个候选亮度选择网络模型,所述候选亮度选择网络模型包括所述候选第一亮度选择网络模型和/或所述候选第二亮度选择网络模型;
    确定所述当前块所属帧的帧类型和量化参数;
    若所述帧类型为第一类型,则从所述至少一个候选亮度选择网络模型中确定所述第一类型对应的至少一个候选第一亮度选择网络模型,并根据所述量化参数从所述至少一个候选第一亮度选择网络模型中确定所述当前块的第一亮度选择网络模型;或者,
    若所述帧类型为第二类型,则从所述至少一个候选亮度选择网络模型中确定所述第二类型对应的至少一个候选第二亮度选择网络模型,并根据所述量化参数从所述至少一个候选第二亮度选择网络模型中确定所述当前块的第二亮度选择网络模型。
  39. 根据权利要求34所述的方法,其中,在所述当前块的颜色分量类型为色度分量的情况下,所述确定所述当前块的色度选择网络模型,包括:
    确定至少一个候选色度选择网络模型,所述候选色度选择网络模型包括所述候选第一色度选择网络模型和/或所述候选第二色度选择网络模型;
    确定所述当前块所属帧的帧类型和量化参数;
    若所述帧类型为第一类型,则从所述至少一个候选色度选择网络模型中确定所述第一类型对应的至少一个候选第一色度选择网络模型,并根据所述量化参数从所述至少一个候选第一色度选择网络模型中确定所述当前块的第一色度选择网络模型;或者,
    若所述帧类型为第二类型,则从所述至少一个候选色度选择网络模型中确定所述第二类型对应的至少一个候选第二色度选择网络模型,并根据所述量化参数从所述至少一个候选第二色度选择网络模型中确定所述当前块的第二色度选择网络模型。
  40. 根据权利要求38所述的方法,其中,所述至少一个候选亮度选择网络模型是根据至少一个训练样本对第三神经网络结构进行模型训练确定的,且所述至少一个候选亮度选择网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
  41. 根据权利要求39所述的方法,其中,所述至少一个候选色度选择网络模型是根据至少一个训练样本对第三神经网络结构进行模型训练确定的,且所述至少一个候选色度选择网络模型与帧类型、颜色分量类型和量化参数之间具有对应关系。
  42. 根据权利要求40或41所述的方法,其中,所述第三神经网络结构包括第六卷积模块和全连接模块,所述第六卷积模块和所述全连接模块顺次连接;
    其中,所述第六卷积模块包括若干个卷积子模块,所述卷积子模块由一层卷积层和一层池化层组成;所述全连接模块包括若干个全连接子模块,所述全连接子模块由一层全连接层和一层激活层组成。
  43. 根据权利要求22所述的方法,其中,所述方法还包括:
    确定环路滤波网络模型的标识信息;
    对所述环路滤波网络模型的标识信息进行编码,将编码比特写入码流。
  44. 根据权利要求43所述的方法,其中,所述确定环路滤波网络模型的标识信息,包括:
    若所述当前块使用环路滤波网络模型,则将所述目标环路滤波网络模型对应的环路滤波网络模型索引序号确定为所述环路滤波网络模型的标识信息;和/或,
    若所述当前块不使用环路滤波网络模型,则将模型关闭信息确定为所述环路滤波网络模型的标识信息。
  45. 根据权利要求22所述的方法,其中,所述环路滤波网络模型为基于残差神经网络的环路滤波 器(CNNLF)模型。
  46. 根据权利要求22所述的方法,其中,所述根据当前块的预设选择网络模型确定至少两个输出值,包括:
    确定所述当前块的第二重建图像块;
    将所述第二重建图像块输入所述预设选择网络模型,得到所述至少两个输出值。
  47. 根据权利要求46所述的方法,其中,所述第二重建图像块是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到。
  48. 根据权利要求22至47任一项所述的方法,其中,所述方法还包括:
    在确定出所述第一重建图像块之后,利用自适应修正滤波器对所述第一重建图像块进行滤波处理。
  49. 一种码流,所述码流是根据待编码信息进行比特编码生成的;其中,所述待编码信息包括第一语法元素标识信息的取值,所述第一语法元素标识信息用于指示当前块是否允许使用预设选择网络模型进行模型决策。
  50. 根据权利要求49所述的码流,其中,所述待编码信息还包括环路滤波网络模型的标识信息,所述环路滤波网络模型的标识信息用于确定所述当前块使用环路滤波网络模型时的环路滤波网络模型索引序号或者所述当前块不使用环路滤波网络模型。
  51. 一种编码器,所述编码器包括第一确定单元、第一决策单元和第一滤波单元;其中,
    所述第一确定单元,配置为确定第一语法元素标识信息的取值;
    所述第一决策单元,配置为当所述第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据所述当前块的预设选择网络模型确定至少两个输出值;其中,所述至少两个输出值包括所述当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和所述当前块不使用环路滤波网络模型时的第二值;以及根据所述至少两个输出值,确定所述当前块使用环路滤波网络模型时的目标环路滤波网络模型或者所述当前块不使用环路滤波网络模型;
    所述第一滤波单元,配置为当所述当前块使用环路滤波网络模型时,利用所述目标环路滤波网络模型对所述当前块进行滤波处理,得到所述当前块的第一重建图像块。
  52. 一种编码器,所述编码器包括第一存储器和第一处理器;其中,
    所述第一存储器,用于存储能够在所述第一处理器上运行的计算机程序;
    所述第一处理器,用于在运行所述计算机程序时,执行如权利要求22至48任一项所述的方法。
  53. 一种解码器,所述解码器包括解析单元、第二决策单元和第二滤波单元;其中,
    所述解析单元,配置为解析码流,确定第一语法元素标识信息的取值;
    所述第二决策单元,配置为当所述第一语法元素标识信息指示当前块允许使用预设选择网络模型进行模型决策时,根据当前块的预设选择网络模型确定至少两个输出值;其中,所述至少两个输出值包括所述当前块使用环路滤波网络模型时至少一个候选环路滤波网络模型各自对应的第一值和所述当前块不使用环路滤波网络模型时的第二值;以及根据所述至少两个输出值,确定所述当前块使用环路滤波网络模型时的目标环路滤波网络模型或者所述当前块不使用环路滤波网络模型;
    所述第二滤波单元,配置为当所述当前块使用环路滤波网络模型时,利用所述目标环路滤波网络模型对所述当前块进行滤波处理,得到所述当前块的第一重建图像块。
  54. 一种解码器,所述解码器包括第二存储器和第二处理器;其中,
    所述第二存储器,用于存储能够在所述第二处理器上运行的计算机程序;
    所述第二处理器,用于在运行所述计算机程序时,执行如权利要求1至21任一项所述的方法。
  55. 一种编解码系统,其中,所述编解码系统包括如权利要求51或52所述的编码器和如权利要求53或54所述的解码器。
  56. 一种计算机存储介质,其中,所述计算机存储介质存储有计算机程序,所述计算机程序被执行时实现如权利要求1至21任一项所述的方法、或者如权利要求22至48任一项所述的方法。
PCT/CN2021/099813 2021-06-11 2021-06-11 编解码方法、码流、编码器、解码器、系统和存储介质 WO2022257130A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202180099002.9A CN117461316A (zh) 2021-06-11 2021-06-11 编解码方法、码流、编码器、解码器、系统和存储介质
EP21944637.4A EP4354873A1 (en) 2021-06-11 2021-06-11 Encoding method, decoding method, code stream, encoder, decoder, system and storage medium
PCT/CN2021/099813 WO2022257130A1 (zh) 2021-06-11 2021-06-11 编解码方法、码流、编码器、解码器、系统和存储介质
US18/529,318 US20240107073A1 (en) 2021-06-11 2023-12-05 Encoding method, decoding method, bitstream, encoder, decoder, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/099813 WO2022257130A1 (zh) 2021-06-11 2021-06-11 编解码方法、码流、编码器、解码器、系统和存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/529,318 Continuation US20240107073A1 (en) 2021-06-11 2023-12-05 Encoding method, decoding method, bitstream, encoder, decoder, system and storage medium

Publications (1)

Publication Number Publication Date
WO2022257130A1 true WO2022257130A1 (zh) 2022-12-15

Family

ID=84425614

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/099813 WO2022257130A1 (zh) 2021-06-11 2021-06-11 编解码方法、码流、编码器、解码器、***和存储介质

Country Status (4)

Country Link
US (1) US20240107073A1 (zh)
EP (1) EP4354873A1 (zh)
CN (1) CN117461316A (zh)
WO (1) WO2022257130A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108184129A (zh) * 2017-12-11 2018-06-19 北京大学 一种视频编解码方法、装置及用于图像滤波的神经网络
CN108520505A (zh) * 2018-04-17 2018-09-11 上海交通大学 基于多网络联合构建与自适应选择的环路滤波实现方法
CN110351568A (zh) * 2019-06-13 2019-10-18 天津大学 一种基于深度卷积网络的视频环路滤波器
WO2021006624A1 (ko) * 2019-07-08 2021-01-14 엘지전자 주식회사 적응적 루프 필터를 적용하는 비디오 또는 영상 코딩


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
H. YIN (INTEL), R. YANG (INTEL), X. FANG, S. MA, Y. YU (INTEL): "AHG9 : Adaptive convolutional neural network loop filter", 13. JVET MEETING; 20190109 - 20190118; MARRAKECH; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-M0566, 5 January 2019 (2019-01-05), XP030200692 *

Also Published As

Publication number Publication date
US20240107073A1 (en) 2024-03-28
CN117461316A (zh) 2024-01-26
EP4354873A1 (en) 2024-04-17

Similar Documents

Publication Publication Date Title
TW201841503A (zh) Intra filtering flag in video coding
CN105308960A (zh) Adaptive color space transform coding
US20230062752A1 (en) A method, an apparatus and a computer program product for video encoding and video decoding
US11683505B2 (en) Method and a device for picture encoding and decoding
JP7439841B2 (ja) In-loop filtering method and in-loop filtering apparatus
US10834395B2 (en) Method and a device for decoding an intra predicted block of a picture and corresponding coding method and coding device
WO2022052533A1 (zh) Encoding method, decoding method, encoder, decoder, and encoding system
CN116916036A (zh) Video compression method, apparatus, and system
WO2022116085A1 (zh) Encoding method, decoding method, encoder, decoder, and electronic device
WO2022116207A1 (zh) Encoding method, decoding method, encoding apparatus, and decoding apparatus
US20230262251A1 (en) Picture prediction method, encoder, decoder and computer storage medium
WO2022227062A1 (zh) Encoding and decoding method, bitstream, encoder, decoder, and storage medium
WO2022257130A1 (zh) Encoding method, decoding method, bitstream, encoder, decoder, system and storage medium
WO2022257049A1 (zh) Encoding and decoding method, bitstream, encoder, decoder, and storage medium
WO2022178686A1 (zh) Encoding and decoding method, codec device, codec system, and computer-readable storage medium
WO2023245544A1 (zh) Encoding and decoding method, bitstream, encoder, decoder, and storage medium
WO2024016156A1 (zh) Filtering method, encoder, decoder, bitstream, and storage medium
Ghassab et al. Video Compression Using Convolutional Neural Networks of Video with Chroma Subsampling
WO2023092404A1 (zh) Video encoding and decoding method, device, system, and storage medium
WO2024077573A1 (zh) Encoding and decoding method, encoder, decoder, bitstream, and storage medium
WO2023197230A1 (zh) Filtering method, encoder, decoder, and storage medium
WO2023193254A1 (zh) Decoding method, encoding method, decoder, and encoder
WO2023123398A1 (zh) Filtering method, filtering apparatus, and electronic device
WO2023130226A1 (zh) Filtering method, decoder, encoder, and computer-readable storage medium
WO2023070505A1 (zh) Intra prediction method, decoder, encoder, and codec system

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21944637

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202180099002.9

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 2021944637

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2021944637

Country of ref document: EP

Effective date: 20240111