CN117956145A - Video encoding method, video encoding device, electronic equipment and storage medium - Google Patents


Publication number
CN117956145A
Authority
CN
China
Prior art keywords
decoding complexity
coding mode
coding
encoding
pixel block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410137930.5A
Other languages
Chinese (zh)
Inventor
何盈燊
周超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202410137930.5A priority Critical patent/CN117956145A/en
Publication of CN117956145A publication Critical patent/CN117956145A/en
Pending legal-status Critical Current

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The disclosure relates to a video encoding method, a video encoding device, an electronic device, and a storage medium. The method includes: acquiring a plurality of pixel blocks contained in a video frame of a video to be encoded; for each pixel block, acquiring a plurality of candidate coding mode combinations of the pixel block, where each candidate coding mode combination includes at least two coding modes of different dimensions and each coding mode corresponds to a decoding complexity weight; obtaining, according to the decoding complexity weight of each coding mode contained in each candidate coding mode combination, the decoding complexity of encoding the pixel block based on that candidate coding mode combination; obtaining the coding loss of encoding the pixel block based on each candidate coding mode combination; obtaining, using the decoding complexity and the coding loss, the target coding mode combination with the best performance from the candidate coding mode combinations; and encoding the pixel block according to the coding modes contained in the target coding mode combination. The method and device can reduce the complexity of video decoding at the decoding end.

Description

Video encoding method, video encoding device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of video processing, and in particular to a video encoding method, a video encoding device, an electronic device, and a storage medium.
Background
With the development of video processing technology, services have emerged in which a user uploads a video to a server and the server shares the uploaded video with other users. During this process, the uploaded video generally needs to be encoded into a more efficient video format before being sent to user terminals for decoding and playback.
When decoding and playing online video, mobile terminals with limited performance often suffer frame loss, stuttering, and overheating during decoding. Reducing the decoding complexity can effectively alleviate these problems. The current approach to reducing decoding complexity is usually to improve decoder performance by optimizing the decoder itself, for example through assembly optimization, memory optimization, or architecture optimization of the decoder.
However, because the room for optimizing a decoder is limited, it is difficult for this approach to further improve decoder performance, and the complexity of video decoding at the decoding end therefore remains high.
Disclosure of Invention
The disclosure provides a video encoding method, a video encoding device, an electronic device, and a storage medium, so as to at least solve the problem in the related art of high video decoding complexity at the decoding end. The technical solution of the disclosure is as follows:
According to a first aspect of an embodiment of the present disclosure, there is provided a video encoding method, including:
acquiring a plurality of pixel blocks contained in a video frame in a video to be encoded;
for each pixel block, acquiring a plurality of candidate coding mode combinations corresponding to the pixel block, wherein each candidate coding mode combination includes at least two coding modes of different dimensions, and each coding mode corresponds to a decoding complexity weight;
obtaining, according to the decoding complexity weight corresponding to each coding mode contained in each candidate coding mode combination, the decoding complexity of encoding the pixel block based on that candidate coding mode combination; obtaining the coding loss of encoding the pixel block based on each candidate coding mode combination; and obtaining, using the decoding complexity and the coding loss, the target coding mode combination with the best performance from the candidate coding mode combinations;
and encoding the pixel block according to the coding modes contained in the target coding mode combination.
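The first-aspect method above can be sketched as a simple search over candidate combinations. This is an illustrative sketch only: the function names, the example mode names, the weight values, and the additive cost model (coding loss plus a weighted decoding-complexity term) are assumptions, since the disclosure does not fix a concrete formula at this point.

```python
def combination_decoding_complexity(combination, weights):
    """Sum the preset decoding-complexity weight of each coding mode in the combination."""
    return sum(weights[mode] for mode in combination)

def select_target_combination(candidates, weights, coding_loss, mu=0.1):
    """Pick the combination minimising coding loss plus mu-weighted decoding complexity."""
    best, best_cost = None, float("inf")
    for combination in candidates:
        complexity = combination_decoding_complexity(combination, weights)
        cost = coding_loss[combination] + mu * complexity
        if cost < best_cost:
            best, best_cost = combination, cost
    return best

# Hypothetical modes, weights, and losses for two candidate combinations.
weights = {"intra_dc": 1.0, "inter_affine": 3.5, "mts": 2.0, "dct2": 1.0}
candidates = [("intra_dc", "dct2"), ("inter_affine", "mts")]
coding_loss = {("intra_dc", "dct2"): 10.0, ("inter_affine", "mts"): 9.5}
target = select_target_combination(candidates, weights, coding_loss)
```

With these made-up numbers the second combination wins: its lower coding loss outweighs its higher decoding complexity at this value of `mu`; a larger `mu` would steer the choice toward cheaper-to-decode modes.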
In an exemplary embodiment, before obtaining the decoding complexity according to the preset decoding complexity weight of each coding mode contained in each candidate coding mode combination, the method further includes: taking any one of the plurality of candidate coding mode combinations as the current coding mode combination, and acquiring each coding mode contained in the current coding mode combination; and obtaining, from a pre-constructed weight list, the decoding complexity weight corresponding to each coding mode in the current coding mode combination, where the weight list stores the correspondence between the full set of coding modes and their decoding complexity weights.
In an exemplary embodiment, the weight list stores decoding complexity weights of the full set of coding modes for different pixel block areas and different color space categories. Obtaining the decoding complexity weight corresponding to each coding mode in the current coding mode combination from the pre-constructed weight list includes: acquiring the pixel block area and the color space category of the pixel block; and obtaining, from the pre-constructed weight list, the decoding complexity weight corresponding to the pixel block area and the color space category for each coding mode in the current coding mode combination.
In an exemplary embodiment, before obtaining from the pre-constructed weight list the decoding complexity weight corresponding to the pixel block area and the color space category for each coding mode in the current coding mode combination, the method further includes: acquiring a sample pixel block together with its sample pixel block area and sample color space category; encoding the sample pixel block with a target coding mode, and obtaining the decoding time of decoding the encoded sample coding block, where the target coding mode is the coding mode whose decoding complexity weight is to be obtained; obtaining, according to the decoding time, the decoding complexity weight corresponding to the target coding mode, the sample pixel block area, and the sample color space category; and constructing the weight list based on the correspondence between the decoding complexity weight and the target coding mode, the sample pixel block area, and the sample color space category.
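The sample-based construction above can be illustrated as follows. The timing values, mode names, and the choice to normalise decoding times against the fastest measured entry are all hypothetical; the disclosure only states that weights are derived from decoding time.

```python
def build_weight_list(measurements):
    """measurements: iterable of (mode, block_area, color_space, decode_time_us).
    Weights are normalised so the fastest measured entry gets weight 1.0."""
    baseline = min(t for _, _, _, t in measurements)
    return {
        (mode, area, cspace): t / baseline
        for mode, area, cspace, t in measurements
    }

# Hypothetical decode-time measurements in microseconds for 16x16 sample blocks.
measurements = [
    ("intra_dc",     16 * 16, "luma",   2000),
    ("inter_affine", 16 * 16, "luma",   7000),
    ("intra_dc",     16 * 16, "chroma", 1000),
]
weight_list = build_weight_list(measurements)
```

Keying the list by (mode, area, color space) matches the embodiment's observation that the same coding mode can cost different amounts to decode on blocks of different sizes and on luma versus chroma.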
In an exemplary embodiment, obtaining the target coding mode combination with the best performance from the candidate coding mode combinations using the decoding complexity and the coding loss includes: acquiring the first coding loss and the first decoding complexity corresponding to the coding mode combination to be compared, where the coding mode combination to be compared is the candidate coding mode combination with the best performance before the current comparison round; obtaining the comparison result of a pre-constructed rate-distortion comparison function according to the first coding loss, the first decoding complexity, and the second coding loss and second decoding complexity corresponding to the current coding mode combination, where the rate-distortion comparison function is used to compare the performance of the coding mode combination to be compared with that of the current coding mode combination; when the comparison result indicates that the current coding mode combination performs better than the coding mode combination to be compared, taking the current coding mode combination as the new coding mode combination to be compared; and taking the coding mode combination to be compared of the last comparison round as the target coding mode combination.
In an exemplary embodiment, obtaining the comparison result of the pre-constructed rate-distortion comparison function according to the first coding loss, the first decoding complexity, and the second coding loss and second decoding complexity corresponding to the current coding mode combination includes: acquiring the video frame level of the video frame corresponding to the pixel block, and acquiring the codec complexity adjustment parameter corresponding to the video frame level; substituting the first coding loss, the first decoding complexity, the second coding loss, the second decoding complexity, and the codec complexity adjustment parameter into the rate-distortion comparison function to obtain a rate-distortion comparison value; and obtaining the comparison result according to the magnitude relation between the rate-distortion comparison value and a rate-distortion comparison threshold preset for the rate-distortion comparison function.
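One plausible shape for the rate-distortion comparison above is the classical rate-distortion cost extended with a decoding-complexity term scaled by the frame-level adjustment parameter. The functional form, the per-level parameter values, and the use of zero as the comparison threshold are all assumptions for illustration; the disclosure does not specify the function.

```python
def rd_compare(loss_a, complexity_a, loss_b, complexity_b, mu):
    """Compare combination A (incumbent) with B (current candidate).
    A negative return value means B performs better under adjustment parameter mu."""
    cost_a = loss_a + mu * complexity_a
    cost_b = loss_b + mu * complexity_b
    return cost_b - cost_a

# Hypothetical per-frame-level adjustment parameters: frames referenced by many
# other frames (level 0) weigh decoding complexity more heavily.
mu_by_frame_level = {0: 0.5, 1: 0.3, 2: 0.1}

value = rd_compare(10.0, 4.0, 9.0, 8.0, mu_by_frame_level[2])
current_is_better = value < 0.0  # threshold of 0 chosen for illustration
```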
In an exemplary embodiment, obtaining the codec complexity adjustment parameter corresponding to the video frame level includes: acquiring an encoded video frame associated with the video frame corresponding to the pixel block, and the decoding complexity corresponding to the encoded video frame, where the decoding complexity corresponding to the encoded video frame is determined based on the decoding complexity of each pixel block contained in the encoded video frame; and acquiring an initial codec complexity adjustment parameter corresponding to the video frame level, and adjusting the initial codec complexity adjustment parameter using the decoding complexity corresponding to the encoded video frame to obtain the codec complexity adjustment parameter, where the adjustment is positively correlated with the decoding complexity corresponding to the encoded video frame.
In an exemplary embodiment, the weight list stores in advance first decoding complexity weights corresponding to a plurality of prediction modes, second decoding complexity weights corresponding to a plurality of internal selections of each prediction mode, and third decoding complexity weights corresponding to a plurality of non-prediction modes. Obtaining each coding mode contained in the current coding mode combination includes: acquiring the target prediction mode corresponding to the current coding mode combination, the target mode internal selection for the target prediction mode, and the target non-prediction mode corresponding to the current coding mode combination. Obtaining the decoding complexity weight corresponding to each coding mode in the current coding mode combination from the pre-constructed weight list includes: acquiring, from the weight list, the first decoding complexity weight corresponding to the target prediction mode, the second decoding complexity weight corresponding to the target mode internal selection, and the third decoding complexity weight corresponding to the target non-prediction mode.
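The three-tier weight list described above might be organised as three lookup tables, one per tier; all mode names and weight values below are illustrative, not taken from the disclosure.

```python
# Tier 1: weights per prediction mode (hypothetical values).
prediction_weights = {"intra": 1.0, "inter": 2.0}

# Tier 2: weights per internal selection within a prediction mode.
mode_internal_weights = {
    ("intra", "planar"): 0.5,
    ("intra", "angular"): 1.5,
    ("inter", "affine"): 3.0,
}

# Tier 3: weights per non-prediction mode (e.g. transform choice).
non_prediction_weights = {"dct2": 1.0, "mts": 2.5}

def lookup_weights(pred_mode, internal_choice, non_pred_mode):
    """Return the (first, second, third) decoding-complexity weights for one combination."""
    return (
        prediction_weights[pred_mode],
        mode_internal_weights[(pred_mode, internal_choice)],
        non_prediction_weights[non_pred_mode],
    )
```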
In an exemplary embodiment, in a case where the target prediction mode is a preset prediction mode, obtaining the decoding complexity of encoding the pixel block based on the current coding mode combination includes: acquiring the video frame level of the video frame corresponding to the pixel block, and acquiring a complexity change coefficient matched with the video frame level; and obtaining the decoding complexity of the current coding mode combination based on the decoding complexity weight corresponding to each coding mode in the current coding mode combination and the complexity change coefficient.
In an exemplary embodiment, obtaining the complexity change coefficient matched with the video frame level includes: acquiring an encoded video frame associated with the video frame corresponding to the pixel block, and the decoding complexity corresponding to the encoded video frame, where the decoding complexity corresponding to the encoded video frame is determined based on the decoding complexity of each pixel block contained in the encoded video frame; and acquiring an initial complexity change coefficient corresponding to the video frame level, and adjusting the initial complexity change coefficient using the decoding complexity corresponding to the encoded video frame to obtain the complexity change coefficient, where the adjustment is positively correlated with the decoding complexity corresponding to the encoded video frame.
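A minimal sketch of the positively correlated adjustment above, assuming a linear scaling of the initial coefficient by the ratio of the encoded frame's measured decoding complexity to a target budget; both the linear form and the numbers are assumptions.

```python
def adjust_coefficient(initial, encoded_frame_complexity, target_complexity):
    """Scale the initial coefficient up when recently encoded frames exceeded the
    decoding-complexity budget, steering later blocks toward cheaper modes."""
    return initial * (encoded_frame_complexity / target_complexity)

# Previous frame came in 50% over budget, so the coefficient grows by 50%.
adjusted = adjust_coefficient(0.2, encoded_frame_complexity=150.0,
                              target_complexity=100.0)
```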
According to a second aspect of embodiments of the present disclosure, there is provided a video encoding apparatus including:
an encoded pixel block acquisition unit configured to perform acquisition of a plurality of pixel blocks included in a video frame in a video to be encoded;
A candidate combination obtaining unit configured to obtain, for each pixel block, a plurality of candidate coding mode combinations corresponding to the pixel block, where each candidate coding mode combination includes at least two coding modes with different dimensions, and each coding mode corresponds to a decoding complexity weight;
a target combination obtaining unit configured to obtain, according to the decoding complexity weight corresponding to each coding mode contained in each candidate coding mode combination, the decoding complexity of encoding the pixel block based on that candidate coding mode combination, obtain the coding loss of encoding the pixel block based on each candidate coding mode combination, and obtain, using the decoding complexity and the coding loss, the target coding mode combination with the best performance from the candidate coding mode combinations; and
And a pixel block encoding unit configured to perform encoding processing on the pixel block according to the encoding modes included in the target encoding mode combination.
In an exemplary embodiment, the target combination obtaining unit is further configured to take any one of the plurality of candidate coding mode combinations as the current coding mode combination and acquire each coding mode contained in the current coding mode combination; and to obtain, from a pre-constructed weight list, the decoding complexity weight corresponding to each coding mode in the current coding mode combination, where the weight list stores the correspondence between the full set of coding modes and their decoding complexity weights.
In an exemplary embodiment, the weight list stores decoding complexity weights of the full set of coding modes for different pixel block areas and different color space categories; the target combination obtaining unit is further configured to acquire the pixel block area and the color space category of the pixel block, and to obtain, from the pre-constructed weight list, the decoding complexity weight corresponding to the pixel block area and the color space category for each coding mode in the current coding mode combination.
In an exemplary embodiment, the video encoding apparatus further includes a weight list construction unit configured to acquire a sample pixel block together with its sample pixel block area and sample color space category; encode the sample pixel block with a target coding mode and obtain the decoding time of decoding the encoded sample coding block, where the target coding mode is the coding mode whose decoding complexity weight is to be obtained; obtain, according to the decoding time, the decoding complexity weight corresponding to the target coding mode, the sample pixel block area, and the sample color space category; and construct the weight list based on the correspondence between the decoding complexity weight and the target coding mode, the sample pixel block area, and the sample color space category.
In an exemplary embodiment, the target combination obtaining unit is further configured to acquire the first coding loss and the first decoding complexity corresponding to the coding mode combination to be compared, where the coding mode combination to be compared is the candidate coding mode combination with the best performance before the current comparison round; obtain the comparison result of a pre-constructed rate-distortion comparison function according to the first coding loss, the first decoding complexity, and the second coding loss and second decoding complexity corresponding to the current coding mode combination, where the rate-distortion comparison function is used to compare the performance of the coding mode combination to be compared with that of the current coding mode combination; take the current coding mode combination as the new coding mode combination to be compared when the comparison result indicates that the current coding mode combination performs better than the coding mode combination to be compared; and take the coding mode combination to be compared of the last comparison round as the target coding mode combination.
In an exemplary embodiment, the target combination obtaining unit is further configured to acquire the video frame level of the video frame corresponding to the pixel block and the codec complexity adjustment parameter corresponding to the video frame level; substitute the first coding loss, the first decoding complexity, the second coding loss, the second decoding complexity, and the codec complexity adjustment parameter into the rate-distortion comparison function to obtain a rate-distortion comparison value; and obtain the comparison result according to the magnitude relation between the rate-distortion comparison value and a rate-distortion comparison threshold preset for the rate-distortion comparison function.
In an exemplary embodiment, the target combination obtaining unit is further configured to acquire an encoded video frame associated with the video frame corresponding to the pixel block and the decoding complexity corresponding to the encoded video frame, where the decoding complexity corresponding to the encoded video frame is determined based on the decoding complexity of each pixel block contained in the encoded video frame; and to acquire an initial codec complexity adjustment parameter corresponding to the video frame level and adjust it using the decoding complexity corresponding to the encoded video frame to obtain the codec complexity adjustment parameter, where the adjustment is positively correlated with the decoding complexity corresponding to the encoded video frame.
In an exemplary embodiment, the weight list stores in advance first decoding complexity weights corresponding to a plurality of prediction modes, second decoding complexity weights corresponding to a plurality of internal selections of each prediction mode, and third decoding complexity weights corresponding to a plurality of non-prediction modes; the target combination obtaining unit is further configured to acquire the target prediction mode corresponding to the current coding mode combination, the target mode internal selection for the target prediction mode, and the target non-prediction mode corresponding to the current coding mode combination; and to acquire, from the weight list, the first decoding complexity weight corresponding to the target prediction mode, the second decoding complexity weight corresponding to the target mode internal selection, and the third decoding complexity weight corresponding to the target non-prediction mode.
In an exemplary embodiment, in a case where the target prediction mode is a preset prediction mode, the target combination obtaining unit is further configured to acquire the video frame level of the video frame corresponding to the pixel block and a complexity change coefficient matched with the video frame level, and to obtain the decoding complexity of the current coding mode combination based on the decoding complexity weight corresponding to each coding mode in the current coding mode combination and the complexity change coefficient.
In an exemplary embodiment, the target combination obtaining unit is further configured to acquire an encoded video frame associated with the video frame corresponding to the pixel block and the decoding complexity corresponding to the encoded video frame, where the decoding complexity corresponding to the encoded video frame is determined based on the decoding complexity of each pixel block contained in the encoded video frame; and to acquire an initial complexity change coefficient corresponding to the video frame level and adjust it using the decoding complexity corresponding to the encoded video frame to obtain the complexity change coefficient, where the adjustment is positively correlated with the decoding complexity corresponding to the encoded video frame.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the video encoding method according to any one of the embodiments of the first aspect.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions which, when executed by a processor of an electronic device, cause the electronic device to perform the video encoding method according to any one of the embodiments of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product including instructions which, when executed by a processor of an electronic device, enable the electronic device to perform the video encoding method according to any one of the embodiments of the first aspect.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
Acquiring a plurality of pixel blocks contained in a video frame of a video to be encoded; for each pixel block, acquiring a plurality of candidate coding mode combinations corresponding to the pixel block, where each candidate coding mode combination includes at least two coding modes of different dimensions and each coding mode corresponds to a decoding complexity weight; obtaining, according to the decoding complexity weight corresponding to each coding mode contained in each candidate coding mode combination, the decoding complexity of encoding the pixel block based on that candidate coding mode combination; obtaining the coding loss of encoding the pixel block based on each candidate coding mode combination; obtaining, using the decoding complexity and the coding loss, the target coding mode combination with the best performance from the candidate coding mode combinations; and encoding the pixel block according to the coding modes contained in the target coding mode combination.
According to the method and device, for each pixel block contained in a video frame of the video to be encoded, candidate coding mode combinations composed of at least two coding modes of different dimensions are acquired. The decoding complexity of each candidate coding mode combination is obtained from the preset decoding complexity weight of each coding mode it contains, the target coding mode combination with the best performance is screened out using the coding loss and decoding complexity corresponding to the candidate coding mode combinations, and the pixel block is then encoded with the coding modes contained in the target coding mode combination. Because the influence of decoding complexity is considered when selecting the coding mode combination, and coding modes of different dimensions can each be assigned their own decoding complexity weight, coding modes that help reduce the decoding complexity at the decoding end can be selected accurately for video encoding, thereby reducing the video decoding complexity at the decoding end.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.
Fig. 1 is a flowchart illustrating a video encoding method according to an exemplary embodiment.
Fig. 2 is a flow chart illustrating the acquisition of preset decoding complexity weights according to an exemplary embodiment.
FIG. 3 is a flowchart illustrating the construction of a weight list according to an exemplary embodiment.
Fig. 4 is a flow chart illustrating obtaining a target coding mode combination with optimal performance according to an exemplary embodiment.
Fig. 5 is a flowchart illustrating a comparison result of obtaining a rate distortion comparison function according to an exemplary embodiment.
FIG. 6 is a flow chart illustrating mode combination selection according to an exemplary embodiment.
FIG. 7 is a flow chart illustrating the calculation of decoding complexity for a certain pattern combination, according to an exemplary embodiment.
Fig. 8 is a block diagram illustrating a video encoding apparatus according to an exemplary embodiment.
Fig. 9 is a block diagram of an electronic device, according to an example embodiment.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
It should be further noted that, the user information (including, but not limited to, user equipment information, user personal information, etc.) and the data (including, but not limited to, data for presentation, analyzed data, etc.) related to the present disclosure are information and data authorized by the user or sufficiently authorized by each party.
Fig. 1 is a flowchart illustrating a video encoding method according to an exemplary embodiment. As shown in Fig. 1, the method is used in a terminal and includes the following steps.
In step S101, a plurality of pixel blocks included in a video frame in a video to be encoded are acquired.
The video to be encoded is a video on which a video encoding operation needs to be performed; it may be composed of a plurality of video frames. A pixel block is a block of pixels that makes up a video frame, and may serve as the unit of encoding. Specifically, after obtaining the video to be encoded, the terminal may first obtain the pixel blocks of each video frame of the video to be encoded, so as to determine the coding mode of each pixel block.
In step S102, for each pixel block, a plurality of candidate coding mode combinations corresponding to the pixel block are obtained, where each candidate coding mode combination includes at least two coding modes with different dimensions, and each coding mode corresponds to a decoding complexity weight.
A candidate coding mode combination is a selectable combination of coding modes, which may consist of at least two coding modes of different dimensions. For example, coding modes of different dimensions may include: coding modes in the dimension of the prediction process, such as intra-frame prediction modes and inter-frame prediction modes, where the intra-frame prediction modes further include PLANAR, DC, MIP, and CCLM modes, angular prediction, and the like, and inter-frame prediction includes multiple prediction modes such as ordinary unidirectional and bidirectional interpolation, AFFINE, GEO, and the like; coding modes in the dimension of selections made inside a prediction mode, such as coding modes corresponding to different choices of prediction vector within a prediction mode; and mode selections in non-prediction processes, such as the coding mode selections of MV derivation, inverse transform, inverse quantization, and the like. The decoding complexity weight describes the decoding complexity of each coding mode in a candidate coding mode combination. For example, a certain candidate coding mode combination may consist of coding mode A, coding mode B, and coding mode C, where coding mode A is preset with decoding complexity weight a, coding mode B with decoding complexity weight b, and coding mode C with decoding complexity weight c.
Specifically, after obtaining the plurality of pixel blocks, the terminal may perform the following operations for each pixel block, first determining a plurality of candidate coding mode combinations formed by a plurality of different coding modes of the pixel block.
In step S103, according to the decoding complexity weights corresponding to the encoding modes included in the candidate encoding mode combinations, the decoding complexity corresponding to the encoding of the pixel block based on the candidate encoding mode combinations is obtained, the encoding loss corresponding to the encoding of the pixel block based on the candidate encoding mode combinations is obtained, and the target encoding mode combination with optimal performance is obtained from the candidate encoding mode combinations by using the decoding complexity and the encoding loss.
The decoding complexity is the decoding complexity corresponding to the whole candidate coding mode combination. For example, a candidate coding mode combination may be composed of coding mode a, coding mode B and coding mode C, where coding mode a corresponds to decoding complexity weight a, coding mode B corresponds to decoding complexity weight B, and coding mode C corresponds to decoding complexity weight C, and then the decoding complexity corresponding to the candidate coding mode combination may be obtained from decoding complexity weight a, decoding complexity weight B and decoding complexity weight C.
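As an illustration, the derivation of a combination-level decoding complexity from the per-mode weights can be sketched as follows. The plain summation and the weight values are assumptions made for this sketch; the publication does not fix a specific aggregation formula:

```python
# Sketch: derive the decoding complexity of a candidate coding mode
# combination from the preset per-mode decoding complexity weights.
# The plain summation below is an illustrative assumption; the publication
# does not fix a specific aggregation formula.

def combination_decoding_complexity(mode_weights):
    """Decoding complexity of one candidate combination from its modes' weights."""
    return sum(mode_weights)

# Hypothetical combination of coding modes A, B, C with preset weights a, b, c.
weight_a, weight_b, weight_c = 2.0, 1.5, 0.5
complexity = combination_decoding_complexity([weight_a, weight_b, weight_c])
```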
The coding loss refers to video coding loss caused by executing candidate coding mode combinations, the target coding mode combination refers to the coding mode combination with the optimal performance among the candidate coding mode combinations, and the performance evaluation of the coding mode combination can be realized through the coding loss and the decoding complexity, so that after determining the coding loss and the decoding complexity corresponding to the pixel block and each candidate coding mode combination, the terminal can find out the target coding mode combination with the optimal performance from a plurality of candidate coding mode combinations by utilizing the coding loss and the decoding complexity.
In step S104, the pixel block is subjected to encoding processing according to the encoding modes included in the target encoding mode combination.
Finally, after the target coding mode combination with the optimal performance is determined for each pixel block through the above method, each pixel block can be encoded using the coding modes contained in its target coding mode combination. Since the performance evaluation of a coding mode combination considers not only the coding loss but also the influence of decoding complexity, the screened-out target coding mode combination can balance the loss caused by encoding against the complexity at the decoding end, effectively reducing the decoding complexity of the decoding end.
In the video coding method, a plurality of pixel blocks contained in a video frame in a video to be coded are obtained; for each pixel block, acquiring a plurality of candidate coding mode combinations corresponding to the pixel block, wherein each candidate coding mode combination comprises at least two coding modes with different dimensions, and each coding mode is respectively corresponding to a decoding complexity weight; obtaining decoding complexity corresponding to the pixel block based on the encoding of each candidate encoding mode combination according to decoding complexity weight corresponding to each encoding mode contained in each candidate encoding mode combination, obtaining encoding loss corresponding to the pixel block based on the encoding of each candidate encoding mode combination, and obtaining a target encoding mode combination with optimal performance from the candidate encoding mode combination by utilizing the decoding complexity and the encoding loss; and performing coding processing on the pixel blocks according to the coding modes contained in the target coding mode combination. 
According to the method and the device, the candidate coding mode combinations of the pixel blocks contained in the video frames in the video to be coded and composed of at least two coding modes with different dimensions are obtained, so that the decoding complexity of the candidate coding mode combinations is obtained according to the preset decoding complexity weight of each coding mode contained in the candidate coding mode combinations, the target coding mode combinations with optimal performance are screened out by using the coding loss and the decoding complexity corresponding to the candidate coding mode combinations, the coding modes contained in the target coding mode combinations are used for coding, the influence of the decoding complexity is considered in the coding mode combination selection, the coding modes with different dimensions can be respectively provided with the decoding complexity weights, and therefore the coding modes which are favorable for reducing the decoding complexity of a decoding end can be accurately selected for video coding, and the video decoding complexity in the decoding end is further reduced.
In an exemplary embodiment, as shown in fig. 2, before step S103, the method may further include:
In step S201, any one of a plurality of candidate coding mode combinations is used as a current coding mode combination, and each coding mode included in the current coding mode combination is acquired.
The current coding mode combination refers to any one of a plurality of candidate coding mode combinations, and the terminal may further obtain each coding mode included in the current coding mode combination.
In step S202, decoding complexity weights corresponding to the coding modes in the current coding mode combination are obtained from a pre-constructed weight list; the weight list stores the corresponding relation between the full-scale coding mode and the decoding complexity weight.
The weight list is a table storing the correspondence between the full set of coding modes and their decoding complexity weights. For example, it may record the decoding complexity weights of the different intra-frame and inter-frame prediction modes; the decoding complexity weights of luma coding and of joint and separate chroma coding; the decoding complexity weights corresponding to the 16 combinations of whether interpolation is needed in the horizontal and vertical directions for the two prediction vectors in bidirectional inter-frame prediction; the decoding complexity weights corresponding to the 9 MTS combinations of different primary and secondary transform types selected by MTS; and the like. After obtaining each coding mode in the current coding mode combination, the terminal may query the pre-constructed weight list to obtain the decoding complexity weight corresponding to each coding mode.
In this embodiment, a weight list storing the correspondence between the full-scale coding modes and the decoding complexity weights may be constructed in advance, so that the decoding complexity weights of the coding modes included in the current coding mode combination may be obtained through the weight list, and the decoding complexity weights do not need to be calculated in real time, thereby improving the efficiency of obtaining the decoding complexity weights.
Further, the weight list stores decoding complexity weights of the full set of coding modes for different pixel block areas and different color space classes; step S202 may further include: acquiring the pixel block area and the color space class of the pixel block; and obtaining, from the pre-constructed weight list, the decoding complexity weight corresponding to the pixel block area and the color space class for each coding mode in the current coding mode combination.
When the decoding end processes blocks of different sizes, the average processing time per pixel differs: the larger the block, the smaller the average processing time per pixel, so ignoring the area of the currently processed block would make the calculated weight inaccurate. When the decoding end performs prediction, transformation, quantization and similar processes on a block, if the selected mode is the same, a larger block allows more pixels to be processed in parallel, and the average decoding complexity required to finish processing each pixel is lower; blocks of different pixel areas therefore need to be given different weights in each case.
Meanwhile, the color space types corresponding to different pixel blocks are also different, and the decoding complexity weights corresponding to the same coding modes are also different when different color space types select the same coding modes, for example, the color space types can include luminance and chrominance, the pixel blocks can be divided into luminance blocks and chrominance blocks, and even if the same coding mode combination is adopted for coding processing on the luminance blocks and the chrominance blocks with the same size, the decoding complexity corresponding to the same coding mode combination is also different, so that the decoding complexity weights corresponding to the different pixel block areas and the different color space types in the full-scale coding modes are stored in the weight list in the embodiment.
For example, the same coding mode A may correspond to pixel blocks of different sizes and different color space classes: pixel block 1 and pixel block 2 belong to the same color space class but differ in size, while pixel block 1 and pixel block 5 have the same size but differ in color space class. In this case, the decoding complexity weights of pixel block 1, pixel block 2 and pixel block 5 are different, and may be decoding complexity weight A1, decoding complexity weight A2 and decoding complexity weight A5, respectively.
Therefore, the terminal may determine the pixel block area and the color space class of the pixel block, and then query the weight list using the pixel block area, the color space class and each coding mode included in the current coding mode combination to obtain the corresponding decoding complexity weight. For example, if the pixel block has the same size and color space class as pixel block 1, and the current coding mode combination includes coding mode A, the decoding complexity weight corresponding to coding mode A may be decoding complexity weight A1.
In this embodiment, decoding complexity weights corresponding to different pixel block areas and different color space categories in the full-scale coding mode may also be stored in the weight list, so that the influence of the pixel block areas and the color space categories on the decoding complexity is considered, and the accuracy of obtaining the decoding complexity weights is further improved.
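The lookup keyed by coding mode, pixel block area and color space class can be sketched as follows. All mode names, areas and weight values here are hypothetical illustrations, not values from the publication:

```python
# Sketch of a weight list keyed by (coding mode, pixel block area, color
# space class). Mode names, areas and weight values are hypothetical.

WEIGHT_LIST = {
    ("mode_A", 256, "luma"):   1.8,  # e.g. a 16x16 luma block
    ("mode_A", 1024, "luma"):  1.2,  # larger block: lower per-pixel complexity
    ("mode_A", 256, "chroma"): 1.5,  # same area, different color space class
}

def lookup_weight(mode, area, color_class):
    """Query the decoding complexity weight for one coding mode of a block."""
    return WEIGHT_LIST[(mode, area, color_class)]
```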
In addition, as shown in fig. 3, before obtaining, from the pre-constructed weight list, the decoding complexity weight corresponding to the pixel block area and the color space class for each coding mode in the current coding mode combination, the method may further include:
In step S301, a sample pixel block is acquired, and a sample pixel block area and a sample color space class corresponding to the sample pixel block are acquired.
The sample pixel block refers to a pixel block used for constructing the weight list, the sample pixel block area refers to a pixel block area corresponding to the sample pixel block, and the sample color space class is a color space class of the sample pixel block. In this embodiment, when the weight list is constructed, a sample pixel block for constructing the weight list, and the area and color space class of the sample pixel block may be obtained first.
In step S302, the sample pixel block is encoded by using the target encoding mode, and the decoding time for decoding the encoded sample encoded block is obtained; the target coding mode is the coding mode of which the decoding complexity weight is to be acquired.
The target coding mode refers to a coding mode needing to determine a decoding complexity weight, in this embodiment, the terminal may determine a coding mode needing to obtain the decoding complexity weight as the target coding mode, then may perform coding processing on the sample pixel block through the target coding mode, input the coded sample coding block to the decoding end, and then may count decoding time of the decoding end for performing decoding processing on the sample coding block.
In step S303, a decoding complexity weight corresponding to the target encoding mode and the sample pixel block area and the sample color space class is obtained according to the decoding time, and a weight list is constructed based on the corresponding relationship between the decoding complexity weight and the target encoding mode and the corresponding relationship between the sample pixel block area and the sample color space class.
After the decoding time is obtained, a decoding complexity weight corresponding to the target coding mode and the sample pixel block area and the sample color space category can be obtained according to the decoding time, so that a corresponding relation between the decoding complexity weight and the target coding mode and the corresponding relation between the sample pixel block area and the sample color space category are established, and a weight list is constructed by utilizing the corresponding relation.
For example, suppose the pixel block area of a certain sample pixel block is area 1, its color space class is class 1, and the target coding mode is coding mode A. The terminal may encode the sample pixel block using coding mode A, input the encoded sample coding block to the decoding end, and count the decoding time of the sample coding block; a corresponding decoding complexity weight is then obtained from the decoding time and written into the weight list as decoding complexity weight A1 corresponding to coding mode A, area 1 and class 1. In this way, the decoding complexity weights corresponding to each sample pixel block area and color space class under the full set of coding modes can be constructed to generate the weight list.
In this embodiment, the weight list may be constructed by counting the decoding time of decoding after the pixel blocks of various pixel block areas and color space classes are encoded according to various encoding modes, and by this way, the construction accuracy of the weight list may be further improved.
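The construction of one weight-list entry from a timed decode can be sketched as follows. Normalizing the measured time per pixel is an assumed design choice; the publication only states that the weight is obtained from the decoding time, and the timing values below are hypothetical:

```python
# Sketch: build one weight-list entry by timing the decoding of a sample
# block encoded with the target coding mode. Normalizing the measured time
# per pixel is an assumed choice; the publication only states that the
# weight is obtained from the decoding time.

def build_weight_entry(mode, area, color_class, decode_time_us):
    """Return a ((mode, area, class), weight) pair from a timed decode."""
    weight = decode_time_us / area  # per-pixel decoding time as the weight
    return (mode, area, color_class), weight

# Hypothetical measurements for coding mode A on two luma block sizes.
weight_list = dict([
    build_weight_entry("mode_A", 256, "luma", 460.8),
    build_weight_entry("mode_A", 1024, "luma", 1228.8),
])
```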
In an exemplary embodiment, as shown in fig. 4, step S103 may further include:
In step S401, a first coding loss and a first decoding complexity corresponding to the coding mode combination to be compared are obtained; and the coding mode combination to be compared is a candidate coding mode combination with optimal performance before the current comparison round.
The current comparison round refers to the comparison round of the current coding mode combination. In this embodiment, the candidate coding mode combination with the optimal performance is screened out through multiple rounds of performance comparison, each round comparing a new coding mode combination, namely the current coding mode combination, against the candidate coding mode combination with the optimal performance among those already compared. Therefore, the terminal may first obtain, from the candidate coding mode combinations, the combination that has already been compared and has the optimal performance, use it as the coding mode combination to be compared against the current coding mode combination, and obtain its coding loss and decoding complexity as the first coding loss and the first decoding complexity.
In step S402, a comparison result of a pre-constructed rate-distortion comparison function is obtained according to the first coding loss, the first decoding complexity, and the second coding loss and the second decoding complexity corresponding to the current coding mode combination; the rate-distortion comparison function is used to compare the performance of the coding mode combination to be compared with the current coding mode combination.
In this embodiment, the comparison of coding mode combination performance may be implemented by a rate-distortion comparison function. The form of the rate-distortion comparison function may be constructed from the coding loss and decoding complexity of the coding mode combination to be compared, namely the first coding loss and the first decoding complexity, and the coding loss and decoding complexity of the current coding mode combination, namely the second coding loss and the second decoding complexity. By substituting the first coding loss and first decoding complexity and the second coding loss and second decoding complexity into the pre-constructed rate-distortion comparison function, a comparison result can be obtained, which characterizes whether the performance of the coding mode combination to be compared or that of the current coding mode combination is better.
In step S403, when the comparison result indicates that the performance of the current coding mode combination is better than that of the coding mode combination to be compared, the current coding mode combination is used as a new coding mode combination to be compared;
in step S404, the coding mode combination to be compared of the last comparison round is taken as the target coding mode combination.
And if the comparison result represents that the performance of the current coding mode combination is better than that of the coding mode combination to be compared, the terminal can take the current coding mode combination as a new coding mode combination to be compared, acquire the current coding mode combination corresponding to the next comparison round again to execute performance comparison again until the last comparison round, and take the coding mode combination to be compared of the last comparison round as a target coding mode combination.
For example, the candidate coding mode combinations may include coding mode combination A, coding mode combination B, coding mode combination C and coding mode combination D. The terminal may first compare the performance of coding mode combination A with that of coding mode combination B by substituting their coding losses and decoding complexities into the pre-constructed rate-distortion comparison function to obtain a comparison result. If the comparison result characterizes coding mode combination B as better than coding mode combination A, the terminal may use coding mode combination B as the new coding mode combination to be compared and compare its performance with the new current coding mode combination, namely coding mode combination C; if B remains better, coding mode combination B is kept as the coding mode combination to be compared and is then compared with the new current coding mode combination, namely coding mode combination D. If the performance of coding mode combination D is better, D becomes the coding mode combination to be compared, and since this is already the last comparison round, the terminal may take the last coding mode combination to be compared, namely coding mode combination D, as the target coding mode combination.
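The round-by-round screening described above can be sketched as follows. The weighted-sum cost used here stands in for the rate-distortion comparison function and is an illustrative assumption, as are the candidate names and numeric values:

```python
# Sketch of the round-by-round screening: the current best ("to-be-compared")
# combination is challenged once per round by the next candidate, and the
# winner of the last round is the target combination. The weighted-sum cost
# below stands in for the rate-distortion comparison function and is an
# illustrative assumption.

def select_target_combination(candidates, rd_ratio=1.0):
    """candidates: list of (name, coding_loss, decoding_complexity) tuples."""
    def cost(candidate):
        _, loss, complexity = candidate
        return loss + rd_ratio * complexity

    best = candidates[0]                # initial to-be-compared combination
    for current in candidates[1:]:      # one comparison round per candidate
        if cost(current) < cost(best):  # current combination performs better
            best = current              # it becomes the new to-be-compared one
    return best[0]
```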
In this embodiment, the terminal may obtain the comparison result of the pre-constructed rate-distortion comparison function through the second coding loss and second decoding complexity corresponding to the current coding mode combination and the first coding loss and first decoding complexity of the coding mode combination to be compared that has the optimal performance before the current comparison round, thereby implementing the performance comparison of coding mode combinations.
Further, as shown in fig. 5, step S402 may further include:
in step S501, a video frame level of a video frame corresponding to a pixel block is acquired, and a codec complexity adjustment parameter corresponding to the video frame level is acquired.
The video frame level refers to the level of the video frame corresponding to the pixel block, and the codec complexity adjustment parameter refers to a parameter controlling the strength of the adjustment applied to the codec complexity. In the encoding process, high-level frames use low-level frames as reference frames, and there are far more high-level frames than low-level frames, so different control parameters need to be applied to different frames to reduce the coding loss. Specifically, the terminal may first obtain the video frame level of the video frame corresponding to a certain pixel block, so as to determine the corresponding codec complexity adjustment parameter according to the video frame level.
In step S502, the first coding loss, the first decoding complexity, the second coding loss, the second decoding complexity, and the codec complexity adjustment parameter are substituted into the rate-distortion comparison function to obtain a rate-distortion comparison value;
In step S503, a comparison result is obtained according to the magnitude relation between the rate-distortion comparison value and the rate-distortion comparison threshold value preset for the rate-distortion comparison function.
The rate-distortion comparison threshold refers to a preset threshold for size comparison with the rate-distortion comparison value, and may be set based on the form of the pre-constructed rate-distortion comparison function. Specifically, the terminal may substitute the obtained first coding loss, first decoding complexity, second coding loss, second decoding complexity and codec complexity adjustment parameter into the preset rate-distortion comparison function to calculate the corresponding rate-distortion comparison value, and compare the rate-distortion comparison value with the rate-distortion comparison threshold preset for that function, thereby obtaining the corresponding comparison result.
For example, the first coding loss may be characterized by bestCost, the first decoding complexity by bestDecodeEnergy, the second coding loss by tempCost, the second decoding complexity by tempDecodeEnergy, and the codec complexity adjustment parameter by derdo_rd_ratio; the rate-distortion comparison function may then take the following forms:
Function 1: (tempCost + 1.0) / (bestCost + tempCost + 2.0) + (1.0 * (tempDecodeEnergy + 1.0) / (bestDecodeEnergy + tempDecodeEnergy + 2.0) - 0.5) * derdo_rd_ratio < 0.5, where 0.5 is the rate-distortion comparison threshold set in advance for function 1.
Function 2: (tempCost + 1.0) / (bestCost + 1.0) + (1.0 * (tempDecodeEnergy + 1.0) / (bestDecodeEnergy + 1.0) - 1.0) * derdo_rd_ratio < 1.0, where 1.0 is the rate-distortion comparison threshold set in advance for function 2.
Function 3: tempCost + derdo_rd_ratio * tempDecodeEnergy - bestCost - derdo_rd_ratio * bestDecodeEnergy < 0, where 0 is the rate-distortion comparison threshold set in advance for function 3.
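Transcribed directly into Python, the three function forms read as follows (variable names follow the text; returning the boolean comparison result is the only addition):

```python
# The three rate-distortion comparison function forms, transcribed from the
# text. Each returns True when the current combination (temp*) is judged
# better than the to-be-compared combination (best*).

def compare_fn1(tempCost, bestCost, tempDecodeEnergy, bestDecodeEnergy, derdo_rd_ratio):
    return ((tempCost + 1.0) / (bestCost + tempCost + 2.0)
            + (1.0 * (tempDecodeEnergy + 1.0)
               / (bestDecodeEnergy + tempDecodeEnergy + 2.0) - 0.5)
              * derdo_rd_ratio) < 0.5

def compare_fn2(tempCost, bestCost, tempDecodeEnergy, bestDecodeEnergy, derdo_rd_ratio):
    return ((tempCost + 1.0) / (bestCost + 1.0)
            + (1.0 * (tempDecodeEnergy + 1.0) / (bestDecodeEnergy + 1.0) - 1.0)
              * derdo_rd_ratio) < 1.0

def compare_fn3(tempCost, bestCost, tempDecodeEnergy, bestDecodeEnergy, derdo_rd_ratio):
    return (tempCost + derdo_rd_ratio * tempDecodeEnergy
            - bestCost - derdo_rd_ratio * bestDecodeEnergy) < 0.0
```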
In this embodiment, the terminal may further introduce into the rate-distortion comparison function a codec complexity adjustment parameter matched to the video frame level of the video frame corresponding to the pixel block, and obtain the rate-distortion comparison value using this parameter, thereby obtaining the comparison result of the rate-distortion comparison function.
Further, step S501 may further include: acquiring an encoded video frame associated with a video frame corresponding to a pixel block and decoding complexity corresponding to the encoded video frame; the decoding complexity corresponding to the encoded video frame is determined based on the decoding complexity of each pixel block contained in the encoded video frame; acquiring an initial encoding and decoding complexity adjustment parameter corresponding to a video frame level, and adjusting the initial encoding and decoding complexity adjustment parameter by utilizing decoding complexity corresponding to an encoded video frame to obtain an encoding and decoding complexity adjustment parameter; the adjustment is positively correlated with the decoding complexity corresponding to the encoded video frame.
The encoded video frame associated with the video frame corresponding to the pixel block refers to a video frame that has already been encoded and is associated with the video frame corresponding to the pixel block; for example, it may be the temporally nearest reference frame before or after the video frame corresponding to the pixel block, or a combination of such frames. The decoding complexity corresponding to the encoded video frame refers to the decoding complexity of that encoded video frame, which may be determined based on the decoding complexity of each pixel block contained in the encoded video frame.
The initial codec complexity adjustment parameter is a codec complexity adjustment parameter preset in correspondence with a video frame level. In this embodiment, the codec complexity adjustment parameter is related to the video frame level and may be adaptively adjusted according to the decoding complexity of the encoded video frame; that is, the initial codec complexity adjustment parameter may be adjusted according to the decoding complexity of the encoded video frame, with the adjustment positively correlated with that decoding complexity: when the decoding complexity is relatively large, the adjustment parameter is correspondingly enlarged, and otherwise it is reduced. In this way, frames with relatively large complexity focus more on decoding complexity while simple frames focus more on coding loss, achieving an adaptive effect according to the characteristics of the sequence frames.
In this embodiment, the terminal may adaptively adjust the encoding and decoding complexity adjustment parameters based on the decoding complexity of the encoded video frame associated with the video frame corresponding to the pixel block, so as to achieve an adaptive effect according to the characteristics of the sequence frame, and further improve the accuracy of the encoding and decoding complexity adjustment parameters.
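The adaptive adjustment can be sketched as follows. The per-level preset values and the linear scaling against a reference complexity are illustrative assumptions; the publication only requires the adjustment to be positively correlated with the decoding complexity of the associated encoded frames:

```python
# Sketch of the adaptive codec complexity adjustment parameter: an initial
# value preset per video frame level is scaled in positive correlation with
# the decoding complexity of the associated encoded frames. The preset
# values and the linear scaling are illustrative assumptions.

INITIAL_RATIO_BY_LEVEL = {0: 0.2, 1: 0.4, 2: 0.8}  # hypothetical presets

def adapted_rd_ratio(frame_level, encoded_frame_complexity, reference=1.0):
    """Enlarge the parameter for complex frames, shrink it for simple ones."""
    return INITIAL_RATIO_BY_LEVEL[frame_level] * (encoded_frame_complexity / reference)
```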
In an exemplary embodiment, a first decoding complexity weight corresponding to a plurality of prediction modes, a second decoding complexity weight corresponding to a plurality of mode internal selections of each prediction mode, and a third decoding complexity weight corresponding to a plurality of non-prediction modes are pre-stored in a weight list; step S201 may further include: acquiring a target prediction mode corresponding to a current coding mode combination, target mode internal selection aiming at the target prediction mode, and a target non-prediction mode corresponding to the current coding mode combination; step S202 may further include: and acquiring a first decoding complexity weight corresponding to the target prediction mode, a second decoding complexity weight corresponding to the target mode internal selection and a third decoding complexity weight corresponding to the target non-prediction mode from the weight list.
In this embodiment, the correspondence between the full-scale coding mode and the decoding complexity weight may include multiple prediction modes, for example, multiple prediction modes such as intra-frame prediction and inter-frame prediction, and the correspondence between the corresponding first decoding complexity weight, multiple mode internal selections of each prediction mode, for example, different selections of prediction vectors, and the corresponding second decoding complexity weight, and multiple non-prediction modes, for example, non-prediction process modes such as MV derivation, inverse transformation, inverse quantization, and the correspondence between the corresponding third decoding complexity weight.
Specifically, after obtaining the current coding mode combination, the terminal may determine a prediction mode included in the coding mode combination as a target prediction mode, a mode internal selection for the target prediction mode as a target mode internal selection, and a non-prediction mode process of the current coding mode combination as a target non-prediction mode, respectively. And then, respectively acquiring a first decoding complexity weight corresponding to the target prediction mode, a second decoding complexity weight corresponding to the target mode internal selection and a third decoding complexity weight corresponding to the target non-prediction mode from the weight list, and taking the first decoding complexity weight and the second decoding complexity weight as decoding complexity weights respectively corresponding to all the coding modes in the current coding mode combination.
In this embodiment, the decoding complexity weights corresponding to the multiple coding modes may be stored in the weight list, which respectively includes a first decoding complexity weight corresponding to the multiple prediction modes, a second decoding complexity weight corresponding to the multiple mode internal selection of each prediction mode, and a third decoding complexity weight corresponding to the multiple non-prediction modes, so that the integrity of the correspondence between the coding modes and the decoding complexity weights may be further improved, and the accuracy of the decoding complexity corresponding to the obtained current coding mode combination may be further improved.
Further, when the target prediction mode is a preset prediction mode, obtaining the decoding complexity of the current coding mode combination based on the decoding complexity weights corresponding to the coding modes in the current coding mode combination, may further include: acquiring a video frame level of a video frame corresponding to the pixel block and a complexity change coefficient matched with the video frame level; and obtaining the decoding complexity of the current coding mode combination based on the decoding complexity weight and the complexity change coefficient respectively corresponding to each coding mode in the current coding mode combination.
In this embodiment, if the target prediction mode adopted in the current coding mode combination is a preset prediction mode, for example, the common interpolation and AFFINE modes of inter-frame prediction, the terminal may further obtain a complexity change coefficient, for example a meRatio coefficient, matched to the video frame level of the video frame where the pixel block is located. This coefficient may be used to adjust the second decoding complexity weights corresponding to the internal selections of the preset prediction mode, so that the decoding complexity of the current coding mode combination may be obtained based on the decoding complexity weights and the complexity change coefficient corresponding to each coding mode in the current coding mode combination. Since the meRatio of low-level frames is smaller and the meRatio of high-level frames is larger, high-level frames are more inclined to select inter-frame prediction modes with low complexity.
In this embodiment, when the target prediction mode is the preset prediction mode, the matched complexity change coefficient may be set based on the video frame level of the video frame where the pixel block is located, so that the decoding complexity of the current coding mode combination is obtained by using the decoding complexity weights and the complexity change coefficients corresponding to the coding modes in the current coding mode combination, so that frames of different levels may have different prediction mode selection tendencies, and the decoding complexity of the decoding end is further reduced under the condition of ensuring the coding effect.
Further, obtaining the complexity change coefficient matched with the video frame level may further include: acquiring an encoded video frame associated with a video frame corresponding to a pixel block and decoding complexity corresponding to the encoded video frame; the decoding complexity corresponding to the encoded video frame is determined based on the decoding complexity of each pixel block contained in the encoded video frame; acquiring an initial complexity change coefficient corresponding to a video frame level, and adjusting the initial complexity change coefficient by utilizing decoding complexity corresponding to an encoded video frame to obtain a complexity change coefficient; the adjustment is positively correlated with the decoding complexity corresponding to the encoded video frame.
The initial complexity change coefficient is a complexity change coefficient for which a correspondence with the video frame level is preset. In this embodiment, similar to the codec complexity adjustment parameter, the complexity change coefficient is related to the video frame level and can be adaptively adjusted according to the decoding complexity of the encoded video frame, with the adjustment positively correlated with that decoding complexity. That is, the initial complexity change coefficient is adjusted according to the decoding complexity of the encoded video frame, so that not only are high-level frames more prone to selecting a low-complexity inter-frame prediction mode, but the coefficient also adapts to the characteristics of the frame sequence.
In this embodiment, the terminal may adaptively adjust the complexity change coefficient based on the decoding complexity of the encoded video frame associated with the video frame corresponding to the pixel block, so as to achieve an adaptive effect according to the characteristics of the sequence frame, and further improve the accuracy of the complexity change coefficient.
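The positively correlated adjustment described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the linear scaling, and the `reference_complexity` normalizer are all assumptions introduced for the example.

```python
def adjust_me_ratio(initial_me_ratio, encoded_frame_complexity, reference_complexity):
    """Adapt the complexity change coefficient (meRatio) to the sequence.

    The adjustment is positively correlated with the decoding complexity of the
    already-encoded associated frame: the more complex the encoded frame turned
    out to be, the larger the resulting coefficient, pushing subsequent
    high-level frames further toward low-complexity inter prediction modes.
    The linear scaling against reference_complexity is an illustrative choice.
    """
    scale = encoded_frame_complexity / reference_complexity
    return initial_me_ratio * scale
```

A frame whose encoded neighbors measured twice the reference complexity would, under this sketch, double its initial coefficient.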
In an exemplary embodiment, a method for reducing the complexity of video decoding is also provided. The method obtains accurate weight proportions by systematically analyzing the performance of each tool module at the decoding end, and precisely controls decoding complexity while encoding the bit stream, thereby reducing decoding complexity as a whole.
That is, by adding decoding complexity weights to the mode-selection calculation between different coding tools, a mode with lower decoding complexity is selected at the cost of little loss. This can comprise the following parts.
1. Decoding complexity weights are added for the various tools in intra-frame prediction and inter-frame prediction.
Different decoding complexity weights are given to the multiple modes of intra-frame and inter-frame prediction. For example, intra-frame prediction has PLANAR, DC, MIP, CCLM, angular prediction and other modes, while inter-frame prediction has normal unidirectional and bidirectional interpolation, AFFINE, GEO and other prediction modes.
2. The luminance, chrominance, direction of each coding mode are given separate weights.
Within a single mode, if the encoded video format is yuv420, the numbers of luminance and chrominance pixels differ, and the two chrominance components may be processed jointly at the decoding end. The decoding complexity therefore differs between luminance, joint chrominance and separate chrominance, and different weights need to be given according to the complexity derived in the decoding process.
3. The different selections inside each coding mode are given separate weights.
Within a single prediction mode, different internal selections, such as the prediction vectors chosen, correspond to different decoding complexities, so each selection is given its own weight. For example, in bidirectional inter prediction, whether each of the two prediction vectors requires horizontal and vertical interpolation yields 16 combinations, each given a different weight.
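The 16 combinations come from four independent yes/no decisions: horizontal and vertical interpolation for each of the two prediction vectors. A minimal enumeration sketch, with a hypothetical per-pass weight (`interp_weight` and `tap_cost` are illustrative names, not from the source):

```python
from itertools import product

# Each of the two prediction vectors either needs or does not need horizontal
# and vertical interpolation: 2**4 = 16 combinations, each weighted separately.
combinations = list(product([False, True], repeat=4))  # (v0_h, v0_v, v1_h, v1_v)

def interp_weight(v0_h, v0_v, v1_h, v1_v, tap_cost=1.0):
    """Hypothetical weight: one unit of cost per interpolation pass performed."""
    return tap_cost * sum((v0_h, v0_v, v1_h, v1_v))
```

In practice each of the 16 entries would carry a weight measured at the decoding end rather than a simple pass count.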
In the inverse transform, the MTS has three candidate transform matrices whose decoding complexities differ; the primary transform types and secondary transform types selected by the MTS can form 9 combinations, each given a different weight.
4. The non-predictive processes such as MV derivation, inverse transformation, inverse quantization, etc. are weighted by the decoding complexity.
Decoding complexity weights are also calculated for the non-prediction processes with higher decoding complexity, such as MV derivation, inverse transformation and inverse quantization, and taken into account during mode selection. For example, if the final quantized transform coefficients are all 0, the inverse transform and inverse quantization are not needed, which saves considerable decoding complexity; for MV derivation, if the current block contains sub-blocks, derivation is performed per sub-block and consumes more decoding complexity.
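The idea that a non-prediction step is only charged when it actually runs can be sketched as follows. The weight values and function signature are hypothetical, chosen only to illustrate the all-zero-coefficient skip and the extra sub-block MV derivation cost:

```python
def non_prediction_weight(coeffs, has_subblocks,
                          w_inv_transform=2.0, w_inv_quant=1.0,
                          w_mv_derive=1.0, w_mv_derive_subblock=3.0):
    """Sum the decoding complexity weights of the non-prediction steps.

    When every quantized transform coefficient is zero, inverse transform and
    inverse quantization are skipped entirely, so their weights are not
    charged; per-sub-block MV derivation is charged at a higher weight than
    whole-block derivation. All weight values here are placeholders.
    """
    weight = w_mv_derive_subblock if has_subblocks else w_mv_derive
    if any(c != 0 for c in coeffs):
        weight += w_inv_transform + w_inv_quant
    return weight
```

An all-zero block thus pays only the MV derivation cost, which is why the encoder can save substantial decoding complexity by steering toward all-zero residuals.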
In this way, decoding complexity weights are added for the above codec modes, with fine subdivision by luminance, chrominance and mode-internal selection, yielding a weight list that covers the tools of the whole codec process and contains up to 74 different weights.
5. The area size of the currently processed block is considered, and different weights are given.
When the decoding end processes blocks of different sizes, the average processing time per pixel differs: the larger the block, the smaller the average processing time per pixel. Ignoring the area of the currently processed block would therefore make the calculated weights inaccurate.
In the decoding-end calculations for a prediction block, such as prediction, transformation and quantization, even under the same mode (one of the 74 cases), a larger block allows more pixels to be processed in parallel, so the average decoding complexity needed per pixel is smaller. Blocks of different pixel sizes in each case are therefore given different weights, each weight representing the time complexity of processing a single pixel, and the number of pixels is multiplied in when calculating the total weight. The block sizes range from 4x4 to 128x128, a total of 13 block sizes, yielding a 74 x 13 weight table derived by the analysis tool at the decoding end. With the block-area weights, the encoding end makes a more accurate decision on whether a block needs to be split, and tends to select larger blocks, which reduces decoding-end complexity. When calculating the weight, the weights of all modules contained in the current mode are added to obtain the decoding complexity.
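A minimal sketch of the 74 x 13 table lookup described above. The table contents here are placeholders (the real per-pixel weights come from the decoding-end analysis tool), and the indexing scheme for the 13 block-size classes is an assumption:

```python
# Hypothetical per-pixel decoding-complexity weights, indexed by coding-mode
# case (74 cases) and block-size class (13 sizes from 4x4 to 128x128).
NUM_MODE_CASES = 74
NUM_BLOCK_SIZES = 13

# weight_table[mode_case][size_idx] holds the measured per-pixel time; filled
# with a placeholder value purely for illustration.
weight_table = [[1.0] * NUM_BLOCK_SIZES for _ in range(NUM_MODE_CASES)]

def block_weight(mode_case, size_idx, width, height):
    """Total weight of one block: the per-pixel weight for this (mode,
    block-size) entry multiplied by the number of pixels in the block."""
    return weight_table[mode_case][size_idx] * (width * height)
```

Because the per-pixel weight shrinks as blocks grow, two 4x8 blocks can cost more than one 8x8 block under the same mode, which is what biases the split decision toward larger blocks.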
Complexity analysis tools for the decoding tools are developed at the decoding end, so the time-consuming parts of decoding can be analyzed very easily, the weights for the 74 modes and the 13 block sizes of each mode can be derived directly, and the complexity weights used at the encoding end are calculated more accurately.
6. The decoding complexity of the mode is taken into account in the comparison formula for mode selection.
A new rate-distortion comparison function is proposed for weighing loss against decoding complexity at the encoding end. The formula is as follows, where bestCost is the current best loss, bestDecodeEnergy is the current best decoding complexity, tempCost is the candidate loss, tempDecodeEnergy is the candidate decoding complexity, and derdo_rd_ratio is the codec complexity adjustment parameter. The new function compares in percentage form, which avoids the problems caused by loss and complexity being at different orders of magnitude.
(tempCost + 1.0) / (bestCost + tempCost + 2.0) + (1.0 * (tempDecodeEnergy + 1.0) / (bestDecodeEnergy + tempDecodeEnergy + 2.0) - 0.5) * derdo_rd_ratio < 0.5.
In addition, the following rate distortion function may also be employed:
Alternatively, tempCost and bestCost may be compared directly: (tempCost + 1.0) / (bestCost + 1.0) + (1.0 * (tempDecodeEnergy + 1.0) / (bestDecodeEnergy + 1.0) - 1.0) * derdo_rd_ratio < 1.0.
A comparison of the loss plus linearly weighted decoding complexity may also be used directly: tempCost + derdo_rd_ratio * tempDecodeEnergy < bestCost + derdo_rd_ratio * bestDecodeEnergy.
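The three comparison rules above can be sketched directly from the formulas. Each function returns True when the candidate (temp) mode should replace the current best mode; the function names are illustrative:

```python
def rd_compare_percentage(best_cost, best_energy, temp_cost, temp_energy, derdo_rd_ratio):
    """Percentage-form comparison: both terms are normalized ratios in [0, 1],
    so loss and decoding complexity at different magnitudes cannot dominate
    each other."""
    cost_term = (temp_cost + 1.0) / (best_cost + temp_cost + 2.0)
    energy_term = (temp_energy + 1.0) / (best_energy + temp_energy + 2.0) - 0.5
    return cost_term + energy_term * derdo_rd_ratio < 0.5

def rd_compare_direct(best_cost, best_energy, temp_cost, temp_energy, derdo_rd_ratio):
    """Direct ratio comparison of the candidate against the best."""
    cost_term = (temp_cost + 1.0) / (best_cost + 1.0)
    energy_term = (temp_energy + 1.0) / (best_energy + 1.0) - 1.0
    return cost_term + energy_term * derdo_rd_ratio < 1.0

def rd_compare_weighted(best_cost, best_energy, temp_cost, temp_energy, derdo_rd_ratio):
    """Loss plus linearly weighted decoding complexity."""
    return (temp_cost + derdo_rd_ratio * temp_energy
            < best_cost + derdo_rd_ratio * best_energy)
```

With derdo_rd_ratio set to 0 the percentage form degenerates to a pure loss comparison (temp_cost < best_cost); raising it trades more coding loss for lower decoding complexity.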
7. Computing decoding complexity weights differently for frames of different layers.
In the encoding process, high-level frames take low-level frames as reference frames, and there are far more high-level frames than low-level frames. Different control parameters are therefore added for different frames: for low-level frames, the decoding complexity weight in mode selection is reduced, which reduces encoding loss.
In the normal inter-frame interpolation and AFFINE modes of inter-frame prediction, a different meRatio is set for frames of different levels, and the decoding complexity is multiplied by meRatio before being added to the loss. The meRatio of low-level frames is smaller and that of high-level frames is larger, so high-level frames are more prone to selecting low-complexity inter prediction modes.
A different derdo_rd_ratio is used for frames of different levels in the mode-selection comparison formula; the derdo_rd_ratio of a lower level is smaller than that of a higher level, which reduces the coding loss introduced by the algorithm while retaining the decoding-complexity benefit.
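The level-dependent parameter selection can be sketched as a simple lookup in which both coefficients grow with the frame's level in the coding hierarchy. The base values, the maximum level, and the linear growth are all illustrative assumptions, not values from the source:

```python
def level_params(frame_level, max_level=4,
                 base_me_ratio=0.5, base_rd_ratio=0.2):
    """Return hypothetical (meRatio, derdo_rd_ratio) for a frame level.

    Both coefficients increase with the level: low-level (reference) frames
    keep them small to minimize coding loss, while high-level frames trade
    loss for lower decoding complexity. Linear growth is an assumption.
    """
    scale = 1.0 + frame_level / max_level
    return base_me_ratio * scale, base_rd_ratio * scale
```

A level-0 frame then uses the base values unchanged, while the top-level frame uses twice the base values.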
The specific mode-combination selection process may be as shown in fig. 6: the decoding complexities of the best mode and the candidate mode are calculated respectively, and the rate-distortion function is then used for a rate-distortion comparison to select a new best mode with better performance.
The specific calculation of the decoding complexity of a mode combination may be implemented by the flow shown in fig. 7: the execution time of each module is measured at the decoding end to obtain its running time per pixel, that running time is used as the module's decoding complexity weight, and the weights of the modules involved in the mode combination are then taken from the weight table to obtain the decoding complexity of the combination.
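The fig. 7 flow can be sketched as a summation over the modules a combination uses. The module names and weight values below are hypothetical; in practice each weight is a per-pixel running time measured at the decoding end:

```python
def combination_complexity(module_weights, modules_in_combination, num_pixels):
    """Decoding complexity of one mode combination.

    module_weights maps module name -> measured per-pixel running time (its
    decoding complexity weight). The combination's complexity is the sum of
    the weights of the modules it uses, scaled by the block's pixel count.
    """
    per_pixel = sum(module_weights[m] for m in modules_in_combination)
    return per_pixel * num_pixels

# Illustrative weights (placeholder values, not measurements):
weights = {"inter_interp": 3.0, "inverse_transform": 2.0, "inverse_quant": 1.0}
```

The resulting value is what enters the rate-distortion comparison as tempDecodeEnergy or bestDecodeEnergy.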
In this embodiment, the whole decoder tool module is systematically analyzed, and each module is given different weights according to different brightness, chromaticity and direction, so that the decoding complexity is precisely controlled in the whole. The decoding complexity of the mode is considered in a comparison formula of mode selection, and the gain of the coding loss and the decoding complexity is further precisely controlled. For frames of different layers, different calculation weights are given, so that coding performance loss is reduced.
It should be understood that, although the steps in the flowcharts of figs. 1 to 7 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited in execution order and may be executed in other orders. Moreover, at least some of the steps in figs. 1 to 7 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages need not be performed sequentially but may be performed in turn or alternately with at least some of the sub-steps or stages of other steps.
It should be understood that the same or similar parts of the method embodiments in this specification may refer to each other; each embodiment focuses on its differences from the others, and for the rest reference may be made to the descriptions of the other method embodiments.
Fig. 8 is a block diagram of a video encoding apparatus according to an exemplary embodiment. Referring to fig. 8, the apparatus includes an encoded pixel block acquisition unit 801, a candidate combination acquisition unit 802, a target combination acquisition unit 803, and a pixel block encoding unit 804.
An encoded pixel block acquisition unit 801 configured to perform acquisition of a plurality of pixel blocks included in a video frame in a video to be encoded;
A candidate combination obtaining unit 802 configured to obtain, for each pixel block, a plurality of candidate coding mode combinations corresponding to the pixel block, where each candidate coding mode combination includes at least two coding modes with different dimensions, and each coding mode corresponds to a decoding complexity weight;
A target combination obtaining unit 803 configured to obtain, according to the decoding complexity weight corresponding to each encoding mode included in each candidate encoding mode combination, the decoding complexity of encoding the pixel block based on each candidate encoding mode combination, obtain the encoding loss of encoding the pixel block based on each candidate encoding mode combination, and obtain a target encoding mode combination with optimal performance from the candidate encoding mode combinations using the decoding complexity and the encoding loss;
The pixel block encoding unit 804 is configured to perform encoding processing on the pixel block according to the encoding modes included in the target encoding mode combination.
In an exemplary embodiment, the target combination obtaining unit 803 is further configured to perform obtaining each encoding mode included in the current encoding mode combination by taking any one of a plurality of candidate encoding mode combinations as the current encoding mode combination; obtaining decoding complexity weights corresponding to the coding modes in the current coding mode combination from a pre-constructed weight list; the weight list stores the corresponding relation between the full-scale coding mode and the decoding complexity weight.
In an exemplary embodiment, the decoding complexity weights for the full-scale coding modes corresponding to different pixel block areas and different color space categories are stored in the weight list; a target combination acquisition unit 803 further configured to perform acquisition of a pixel block area of the pixel block, and a color space class of the pixel block; and obtaining decoding complexity weights corresponding to the pixel block area and the color space category of each coding mode in the current coding mode combination from a pre-constructed weight list.
In an exemplary embodiment, the video encoding apparatus further includes: the weight list construction unit is configured to acquire a sample pixel block, and sample pixel block area and sample color space category corresponding to the sample pixel block; performing coding processing on the sample pixel block by using a target coding mode, and obtaining decoding time for performing decoding processing on the coded sample coding block; the target coding mode is a coding mode of which the decoding complexity weight is to be acquired; and obtaining decoding complexity weights corresponding to the target coding mode and the sample pixel block area and the sample color space category according to the decoding time, and constructing a weight list based on the corresponding relation between the decoding complexity weights and the target coding mode and the corresponding relation between the sample pixel block area and the sample color space category.
In an exemplary embodiment, the target combination obtaining unit 803 is further configured to perform obtaining a first coding loss and a first decoding complexity corresponding to the coding mode combination to be compared; the coding mode combination to be compared is a candidate coding mode combination with optimal performance before the current comparison round; obtaining a comparison result of a pre-constructed rate-distortion comparison function according to the first coding loss, the first decoding complexity, the second coding loss corresponding to the current coding mode combination and the second decoding complexity; the rate distortion comparison function is used for comparing the performance of the coding mode combination to be compared with the performance of the current coding mode combination; under the condition that the comparison result represents that the performance of the current coding mode combination is better than that of the coding mode combination to be compared, the current coding mode combination is used as a new coding mode combination to be compared; and taking the to-be-compared coding mode combination of the last comparison round as a target coding mode combination.
In an exemplary embodiment, the target combination obtaining unit 803 is further configured to perform obtaining a video frame level of a video frame corresponding to the pixel block, and obtaining a codec complexity adjustment parameter corresponding to the video frame level; the first coding loss, the first decoding complexity, the second coding loss, the second decoding complexity and the coding and decoding complexity adjusting parameters are brought into a rate distortion comparison function to obtain a rate distortion comparison value; and obtaining a comparison result according to the magnitude relation between the rate distortion comparison value and a rate distortion comparison threshold value preset for the rate distortion comparison function.
In an exemplary embodiment, the target combination obtaining unit 803 is further configured to perform obtaining an encoded video frame associated with a video frame corresponding to the pixel block, and a decoding complexity corresponding to the encoded video frame; the decoding complexity corresponding to the encoded video frame is determined based on the decoding complexity of each pixel block contained in the encoded video frame; acquiring an initial encoding and decoding complexity adjustment parameter corresponding to a video frame level, and adjusting the initial encoding and decoding complexity adjustment parameter by utilizing decoding complexity corresponding to an encoded video frame to obtain an encoding and decoding complexity adjustment parameter; the adjustment is positively correlated with the decoding complexity corresponding to the encoded video frame.
In an exemplary embodiment, a first decoding complexity weight corresponding to a plurality of prediction modes, a second decoding complexity weight corresponding to a plurality of mode internal selections of each prediction mode, and a third decoding complexity weight corresponding to a plurality of non-prediction modes are pre-stored in a weight list; a target combination obtaining unit 803 further configured to perform obtaining a target prediction mode corresponding to the current coding mode combination, a target mode internal selection for the target prediction mode, and a target non-prediction mode corresponding to the current coding mode combination; and acquiring a first decoding complexity weight corresponding to the target prediction mode, a second decoding complexity weight corresponding to the target mode internal selection and a third decoding complexity weight corresponding to the target non-prediction mode from the weight list.
In an exemplary embodiment, in a case where the target prediction mode is a preset prediction mode, the target combination obtaining unit 803 is further configured to perform obtaining a video frame level of a video frame corresponding to the pixel block, and obtaining a complexity change coefficient matched with the video frame level; and obtaining the decoding complexity of the current coding mode combination based on the decoding complexity weight and the complexity change coefficient respectively corresponding to each coding mode in the current coding mode combination.
In an exemplary embodiment, the target combination obtaining unit 803 is further configured to perform obtaining an encoded video frame associated with a video frame corresponding to the pixel block, and a decoding complexity corresponding to the encoded video frame; the decoding complexity corresponding to the encoded video frame is determined based on the decoding complexity of each pixel block contained in the encoded video frame; acquiring an initial complexity change coefficient corresponding to a video frame level, and adjusting the initial complexity change coefficient by utilizing decoding complexity corresponding to an encoded video frame to obtain a complexity change coefficient; the adjustment is positively correlated with the decoding complexity corresponding to the encoded video frame.
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments have been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
Fig. 9 is a block diagram illustrating an electronic device 900 for video encoding according to an example embodiment. For example, electronic device 900 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, and the like.
Referring to fig. 9, an electronic device 900 may include one or more of the following components: a processing component 902, a memory 904, a power component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
The processing component 902 generally controls overall operation of the electronic device 900, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 902 may include one or more processors 920 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 902 can include one or more modules that facilitate interaction between the processing component 902 and other components. For example, the processing component 902 can include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
The memory 904 is configured to store various types of data to support operations at the electronic device 900. Examples of such data include instructions for any application or method operating on the electronic device 900, contact data, phonebook data, messages, pictures, video, and so forth. The memory 904 may be implemented by any type of volatile or nonvolatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, optical disk, or graphene memory.
The power supply component 906 provides power to the various components of the electronic device 900. Power supply components 906 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for electronic device 900.
The multimedia component 908 comprises a screen between the electronic device 900 and the user that provides an output interface. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 908 includes a front-facing camera and/or a rear-facing camera. When the electronic device 900 is in an operational mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 900 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 904 or transmitted via the communication component 916. In some embodiments, the audio component 910 further includes a speaker for outputting audio signals.
The I/O interface 912 provides an interface between the processing component 902 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 914 includes one or more sensors for providing status assessment of various aspects of the electronic device 900. For example, the sensor assembly 914 may detect an on/off state of the electronic device 900, a relative positioning of the components, such as a display and keypad of the electronic device 900, the sensor assembly 914 may also detect a change in position of the electronic device 900 or a component of the electronic device 900, the presence or absence of a user's contact with the electronic device 900, an orientation or acceleration/deceleration of the device 900, and a change in temperature of the electronic device 900. The sensor assembly 914 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 914 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 916 is configured to facilitate communication between the electronic device 900 and other devices, either wired or wireless. The electronic device 900 may access a wireless network based on a communication standard, such as WiFi, an operator network (e.g., 2G, 3G, 4G, or 5G), or a combination thereof. In one exemplary embodiment, the communication component 916 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 916 further includes a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, ultra Wideband (UWB) technology, bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 900 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory 904 including instructions executable by the processor 920 of the electronic device 900 to perform the above-described method. For example, the computer readable storage medium may be ROM, random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
In an exemplary embodiment, a computer program product is also provided, comprising instructions executable by the processor 920 of the electronic device 900 to perform the above-described method.
It should be noted that the descriptions of the foregoing apparatus, the electronic device, the computer readable storage medium, the computer program product, and the like according to the method embodiments may further include other implementations, and the specific implementation may refer to the descriptions of the related method embodiments and are not described herein in detail.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following its general principles and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (13)

1. A video encoding method, comprising:
acquiring a plurality of pixel blocks contained in a video frame in a video to be encoded;
for each pixel block, acquiring a plurality of candidate coding mode combinations corresponding to the pixel block, wherein each candidate coding mode combination comprises at least two coding modes with different dimensions, and each coding mode is respectively corresponding to a decoding complexity weight;
Obtaining decoding complexity corresponding to the pixel block based on the encoding of each candidate encoding mode combination according to decoding complexity weights corresponding to the encoding modes contained in each candidate encoding mode combination, obtaining encoding loss corresponding to the pixel block based on the encoding of each candidate encoding mode combination, and obtaining a target encoding mode combination with optimal performance from the candidate encoding mode combination by utilizing the decoding complexity and the encoding loss;
And carrying out coding processing on the pixel block according to the coding modes contained in the target coding mode combination.
2. The method according to claim 1, wherein before the obtaining decoding complexity corresponding to the pixel block based on the encoding of each candidate encoding mode combination according to decoding complexity weights corresponding to the encoding modes contained in each candidate encoding mode combination, the method further comprises:
Taking any one of the plurality of candidate coding mode combinations as a current coding mode combination, and acquiring each coding mode contained in the current coding mode combination;
Obtaining decoding complexity weights corresponding to the coding modes in the current coding mode combination respectively from a pre-constructed weight list; and the weight list stores the corresponding relation between the full-scale coding mode and the decoding complexity weight.
3. The method of claim 2, wherein the weight list has stored therein decoding complexity weights for full-scale coding modes corresponding to different pixel block areas and different color space categories;
The obtaining decoding complexity weights corresponding to the coding modes in the current coding mode combination from a pre-constructed weight list comprises the following steps:
Acquiring the pixel block area of the pixel block and the color space class of the pixel block;
and obtaining, from the pre-constructed weight list, the decoding complexity weights corresponding to the pixel block area and the color space category for each coding mode in the current coding mode combination.
4. The method according to claim 3, wherein before the obtaining, from the pre-constructed weight list, the decoding complexity weights corresponding to the pixel block area and the color space category for each coding mode in the current coding mode combination, the method further comprises:
Acquiring a sample pixel block, and a sample pixel block area and a sample color space category corresponding to the sample pixel block;
Performing coding processing on the sample pixel block by using a target coding mode, and obtaining the decoding time for performing decoding processing on the encoded sample pixel block; the target coding mode is a coding mode whose decoding complexity weight is to be obtained;
And obtaining decoding complexity weights corresponding to the target coding mode, the sample pixel block area and the sample color space category according to the decoding time, and constructing the weight list based on the corresponding relation between the decoding complexity weights and the target coding mode, the sample pixel block area and the sample color space category.
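The weight-list construction of claim 4 can be sketched as timing the decode of each encoded sample block and keying a normalized time by (coding mode, pixel block area, color space category). The normalization-by-fastest-decode rule and all names and timings below are illustrative assumptions; the patent does not specify how decode time maps to a weight.

```python
# Illustrative weight-list construction: the weight of each
# (mode, block area, color space) entry is its measured decode time
# normalized by the fastest observed decode, so the cheapest entry is 1.0.
def build_weight_list(samples):
    """samples: list of (mode, block_area, color_space, decode_time_us) tuples."""
    base = min(t for _, _, _, t in samples)
    return {(mode, area, cs): t / base for mode, area, cs, t in samples}

samples = [
    ("intra_dc",      64,  "luma",   20.0),
    ("intra_angular", 64,  "luma",   50.0),
    ("intra_dc",      256, "chroma", 40.0),
]
weights = build_weight_list(samples)
```

A later lookup, as in claim 3, then only needs the block's area and color space category plus the candidate mode to fetch the weight.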
5. The method of claim 2, wherein said obtaining a target coding mode combination of optimal performance from said candidate coding mode combinations using said decoding complexity and said coding loss comprises:
Acquiring a first coding loss and a first decoding complexity corresponding to a coding mode combination to be compared; the coding mode combination to be compared is a candidate coding mode combination with optimal performance before the current comparison round;
Obtaining a comparison result of a pre-constructed rate-distortion comparison function according to the first coding loss, the first decoding complexity and the second coding loss and the second decoding complexity corresponding to the current coding mode combination; the rate distortion comparison function is used for comparing the performance of the coding mode combination to be compared with the performance of the current coding mode combination;
under the condition that the comparison result represents that the performance of the current coding mode combination is better than that of the coding mode combination to be compared, the current coding mode combination is used as a new coding mode combination to be compared;
and taking the coding mode combination to be compared of the last comparison round as the target coding mode combination.
6. The method of claim 5, wherein obtaining the comparison result of the pre-constructed rate-distortion comparison function according to the first coding loss, the first decoding complexity, and the second coding loss and the second decoding complexity corresponding to the current coding mode combination comprises:
Acquiring a video frame level of a video frame corresponding to the pixel block, and acquiring a coding and decoding complexity adjustment parameter corresponding to the video frame level;
substituting the first coding loss, the first decoding complexity, the second coding loss, the second decoding complexity and the coding and decoding complexity adjustment parameter into the rate-distortion comparison function to obtain a rate-distortion comparison value;
And obtaining the comparison result according to the magnitude relation between the rate distortion comparison value and a rate distortion comparison threshold value preset for the rate distortion comparison function.
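The best-so-far comparison loop of claims 5 and 6 can be sketched as follows. The patent does not disclose the functional form of the rate-distortion comparison function; a difference of weighted costs, with a frame-level adjustment parameter `beta` scaling decoding complexity and compared against a threshold, is one plausible reading, and all numeric values here are illustrative.

```python
# Hypothetical rate-distortion comparison: the current combination wins when
# its loss + beta * complexity undercuts the best-so-far by more than a threshold.
def rd_compare(loss1, cplx1, loss2, cplx2, beta, threshold=0.0):
    """Return True if the current combination (loss2, cplx2) beats the
    best-so-far combination to be compared (loss1, cplx1)."""
    value = (loss2 + beta * cplx2) - (loss1 + beta * cplx1)
    return value < threshold

def pick_best(combos, beta):
    """combos: list of (coding loss, decoding complexity) pairs.
    Walks the candidates keeping a best-so-far, as in claim 5; returns its index."""
    best = 0
    for i in range(1, len(combos)):
        l1, c1 = combos[best]
        l2, c2 = combos[i]
        if rd_compare(l1, c1, l2, c2, beta):
            best = i
    return best

idx = pick_best([(10.0, 4.0), (9.5, 8.0), (9.8, 3.0)], beta=0.2)
```

Note how the second candidate has the lowest coding loss but loses once its higher decoding complexity is weighted in; the third candidate wins overall.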
7. The method of claim 6, wherein the obtaining the coding and decoding complexity adjustment parameter corresponding to the video frame level comprises:
Acquiring an encoded video frame associated with a video frame corresponding to the pixel block and decoding complexity corresponding to the encoded video frame; the decoding complexity corresponding to the encoded video frame is determined based on the decoding complexity of each pixel block contained in the encoded video frame;
Acquiring an initial coding and decoding complexity adjustment parameter corresponding to the video frame level, and adjusting the initial coding and decoding complexity adjustment parameter by using the decoding complexity corresponding to the encoded video frame to obtain the coding and decoding complexity adjustment parameter; the adjustment is positively correlated with the decoding complexity corresponding to the encoded video frame.
8. The method according to claim 2, wherein the weight list stores in advance a first decoding complexity weight corresponding to a plurality of prediction modes, a second decoding complexity weight corresponding to a plurality of mode internal selections of each prediction mode, and a third decoding complexity weight corresponding to a plurality of non-prediction modes;
The step of obtaining each coding mode contained in the current coding mode combination by taking any one of the plurality of candidate coding mode combinations as the current coding mode combination comprises the following steps:
acquiring a target prediction mode corresponding to the current coding mode combination, target mode internal selection aiming at the target prediction mode, and target non-prediction mode corresponding to the current coding mode combination;
The obtaining decoding complexity weights corresponding to the coding modes in the current coding mode combination from a pre-constructed weight list comprises the following steps:
and acquiring a first decoding complexity weight corresponding to the target prediction mode, a second decoding complexity weight corresponding to the target mode internal selection and a third decoding complexity weight corresponding to the target non-prediction mode from the weight list.
9. The method according to claim 8, wherein, in the case that the target prediction mode is a preset prediction mode, the obtaining the decoding complexity of the current coding mode combination based on the decoding complexity weights corresponding to the coding modes in the current coding mode combination includes:
acquiring a video frame level of a video frame corresponding to the pixel block, and acquiring a complexity change coefficient matched with the video frame level;
and obtaining the decoding complexity of the current coding mode combination based on the decoding complexity weight corresponding to each coding mode in the current coding mode combination and the complexity change coefficient.
10. The method of claim 9, wherein the obtaining a complexity variation coefficient that matches the video frame level comprises:
Acquiring an encoded video frame associated with a video frame corresponding to the pixel block and decoding complexity corresponding to the encoded video frame; the decoding complexity corresponding to the encoded video frame is determined based on the decoding complexity of each pixel block contained in the encoded video frame;
Acquiring an initial complexity change coefficient corresponding to the video frame level, and adjusting the initial complexity change coefficient by using the decoding complexity corresponding to the encoded video frame to obtain the complexity change coefficient; the adjustment is positively correlated with the decoding complexity corresponding to the encoded video frame.
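Claims 7 and 10 both describe an adjustment that grows with the decoding complexity of an associated already-encoded frame. A minimal sketch, assuming a simple linear form: the `gain` constant and the multiplicative update are illustrative assumptions, since the claims only require a positive correlation.

```python
# Illustrative adjustment of an initial per-frame-level parameter: the frame's
# decoding complexity is the sum of its blocks' decoding complexities, and the
# adjusted parameter increases monotonically with that sum (positive correlation).
def adjust_parameter(initial, encoded_frame_block_complexities, gain=0.01):
    frame_complexity = sum(encoded_frame_block_complexities)
    return initial * (1.0 + gain * frame_complexity)

low = adjust_parameter(1.0, [1.0, 2.0])        # frame complexity 3.0
high = adjust_parameter(1.0, [5.0, 5.0, 5.0])  # frame complexity 15.0
```

The same shape serves for both the coding and decoding complexity adjustment parameter of claim 7 and the complexity change coefficient of claim 10: a more expensive reference frame pushes the encoder to penalize decoding complexity more heavily.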
11. A video encoding apparatus, comprising:
an encoded pixel block acquisition unit configured to perform acquisition of a plurality of pixel blocks included in a video frame in a video to be encoded;
A candidate combination obtaining unit configured to obtain, for each pixel block, a plurality of candidate coding mode combinations corresponding to the pixel block, where each candidate coding mode combination includes at least two coding modes with different dimensions, and each coding mode corresponds to a decoding complexity weight;
A target combination obtaining unit configured to obtain, according to the decoding complexity weights corresponding to the coding modes contained in each of the candidate coding mode combinations, the decoding complexity corresponding to the pixel block encoded based on each of the candidate coding mode combinations, obtain the coding loss corresponding to the pixel block encoded based on each of the candidate coding mode combinations, and obtain a target coding mode combination with optimal performance from the candidate coding mode combinations using the decoding complexity and the coding loss;
And a pixel block encoding unit configured to perform encoding processing on the pixel block according to the encoding modes included in the target encoding mode combination.
12. An electronic device, comprising:
A processor;
A memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video encoding method of any one of claims 1 to 10.
13. A computer readable storage medium, characterized in that instructions in the computer readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the video encoding method of any one of claims 1 to 10.
CN202410137930.5A 2024-01-31 2024-01-31 Video encoding method, video encoding device, electronic equipment and storage medium Pending CN117956145A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410137930.5A CN117956145A (en) 2024-01-31 2024-01-31 Video encoding method, video encoding device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117956145A true CN117956145A (en) 2024-04-30

Family

ID=90791862

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410137930.5A Pending CN117956145A (en) 2024-01-31 2024-01-31 Video encoding method, video encoding device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117956145A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination