CN115052150A

CN115052150A - Video encoding method, video encoding device, electronic equipment and storage medium

Info

Publication number: CN115052150A
Application number: CN202210640961.3A
Authority: CN
Inventors: 黄博; 闻兴; 于冰; 谷嘉文; 刘晶
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2022-06-08
Filing date: 2022-06-08
Publication date: 2022-09-13

Abstract

The present disclosure relates to a video encoding method, apparatus, electronic device, storage medium, and computer program product. The method comprises the following steps: acquiring pixel value discrete information of each image block in a video frame to be coded; determining offset quantization information of each image block according to the relationship between the pixel value discrete information of each image block and a preset threshold; updating the initial quantization information of each image block according to the offset quantization information of each image block to obtain target quantization information of each image block, and encoding each image block according to the target quantization information of each image block. By adopting the method, the image distortion can be reduced.

Description

Video coding method and device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of video coding technologies, and in particular, to a video coding method, an apparatus, an electronic device, a storage medium, and a computer program product.

Background

With the development of video processing technology, video data can be compressed by encoding it, so that the video data can be stored and transmitted conveniently.

In the related art, video coding uses offset quantization information corresponding to each image block in a video frame. The offset quantization information used by the encoder for each image block of the video frame is fixed; that is, it is set to a fixed value such as 1 or 0.5. However, the texture information, image details, and the like differ between the image blocks of a video frame. If the video frame is encoded with fixed offset quantization information, the resulting rate-distortion performance is not ideal, and the encoded image is therefore heavily distorted.

Disclosure of Invention

The present disclosure provides a video encoding method, apparatus, electronic device, storage medium, and computer program product to at least solve the problem of large distortion of an encoded image in the related art. The technical scheme of the disclosure is as follows:

according to a first aspect of the embodiments of the present disclosure, there is provided a video encoding method, including:

acquiring pixel value discrete information of each image block in a video frame to be coded;

determining offset quantization information of each image block according to the relationship between the pixel value discrete information of each image block and a preset threshold;

updating the initial quantization information of each image block according to the offset quantization information of each image block to obtain target quantization information of each image block, and encoding each image block according to the target quantization information of each image block.

In an exemplary embodiment, the determining offset quantization information of each image block according to a relationship between pixel value discrete information of each image block and a preset threshold includes:

for each of the image blocks, in a case where the pixel value discrete information of the image block is greater than the preset threshold, determining the offset quantization information of the image block according to the pixel value discrete information of the image block and preset parameters;

in a case where the pixel value discrete information of the image block is less than or equal to the preset threshold, taking the N-th power of the pixel value discrete information of the image block as the offset quantization information of the image block, where N is greater than 0.

In an exemplary embodiment, the determining offset quantization information of the image block according to the pixel value discrete information of the image block and a preset parameter includes:

determining first offset quantization information of the image block according to the preset parameters and the pixel value discrete information of the image block;

and determining the offset quantization information of the image block according to the N power of the pixel value discrete information of the image block and the first offset quantization information.

In an exemplary embodiment, the preset parameter is obtained by:

acquiring a data set, the data set comprising a sample video frame;

inputting the sample video frame into an encoder to be trained to obtain an encoded video frame corresponding to the sample video frame;

determining an image similarity evaluation index according to the sample video frame and the coded video frame corresponding to the sample video frame;

and adjusting the coding parameters of the encoder to be trained according to the image similarity evaluation index, training the encoder again with the adjusted coding parameters until a training end condition is reached, and taking the coding parameters of the trained encoder that reaches the training end condition as the preset parameters.

In an exemplary embodiment, the determining offset quantization information of each image block according to a relationship between the discrete information of pixel values of each image block and a preset threshold further includes:

determining target pixel value discrete information and a target pixel value discrete information average value of each image block according to the pixel value discrete information of each image block;

for each of the image blocks, in a case where the pixel value discrete information of the image block is greater than the preset threshold, determining the offset quantization information of the image block according to a preset scaling length, the target pixel value discrete information of the image block, the average value of the target pixel value discrete information, a preset coefficient, and the pixel value discrete information of the image block;

and under the condition that the pixel value discrete information of the image block is smaller than or equal to the preset threshold, determining the offset quantization information of the image block according to the preset scaling length, the target pixel value discrete information of the image block and the target pixel value discrete information average value.

In an exemplary embodiment, the determining the target pixel value dispersion information and the target pixel value dispersion information average value of each image block according to the pixel value dispersion information of each image block includes:

determining target pixel value discrete information of each image block according to the pixel value discrete information of each image block, the maximum value of the pixels in the video frame to be coded and the number of the pixels in each image block;

determining the average value of the target pixel value discrete information according to the target pixel value discrete information of each image block and the number of the target image blocks in the video frame to be coded; the target image block is an image block which meets the preset image size in the video frame to be coded.

In an exemplary embodiment, the determining offset quantization information of the image block according to a preset scaling length, target pixel value dispersion information of the image block, an average value of the target pixel value dispersion information, a preset coefficient, and pixel value dispersion information of the image block includes:

determining second offset quantization information of the image block according to the preset coefficient and the pixel value discrete information of the image block;

and determining the offset quantization information of the image block according to a preset scaling length, the difference value between the target pixel value discrete information of the image block and the average value of the target pixel value discrete information and the second offset quantization information of the image block.

According to a second aspect of the embodiments of the present disclosure, there is provided a video encoding apparatus comprising:

the information acquisition unit is configured to acquire pixel value discrete information of each image block in a video frame to be encoded;

an information determining unit configured to perform determining offset quantization information of the respective image blocks according to a relationship between pixel value dispersion information of the respective image blocks and a preset threshold;

and the encoding processing unit is configured to update the initial quantization information of each image block according to the offset quantization information of each image block to obtain target quantization information of each image block, and perform encoding processing on each image block according to the target quantization information of each image block.

In an exemplary embodiment, the information determining unit is further configured to, for each of the image blocks, determine the offset quantization information of the image block according to the pixel value discrete information of the image block and a preset parameter in a case where the pixel value discrete information of the image block is greater than the preset threshold; and, in a case where the pixel value discrete information of the image block is less than or equal to the preset threshold, take the N-th power of the pixel value discrete information of the image block as the offset quantization information of the image block, where N is greater than 0.

In an exemplary embodiment, the information determining unit is further configured to determine first offset quantization information of the image block according to the preset parameter and the discrete information of the pixel values of the image block; and determining the offset quantization information of the image block according to the N power of the pixel value discrete information of the image block and the first offset quantization information.

In an exemplary embodiment, the apparatus further comprises a parameter determination unit configured to: acquire a data set, the data set comprising a sample video frame; input the sample video frame into an encoder to be trained to obtain an encoded video frame corresponding to the sample video frame; determine an image similarity evaluation index according to the sample video frame and the encoded video frame corresponding to the sample video frame; and adjust the coding parameters of the encoder to be trained according to the image similarity evaluation index, train the encoder again with the adjusted coding parameters until a training end condition is reached, and take the coding parameters of the trained encoder that reaches the training end condition as the preset parameters.

In an exemplary embodiment, the information determining unit is further configured to: determine target pixel value discrete information of each image block and an average value of the target pixel value discrete information according to the pixel value discrete information of each image block; for each of the image blocks, in a case where the pixel value discrete information of the image block is greater than the preset threshold, determine the offset quantization information of the image block according to a preset scaling length, the target pixel value discrete information of the image block, the average value of the target pixel value discrete information, a preset coefficient, and the pixel value discrete information of the image block; and, in a case where the pixel value discrete information of the image block is less than or equal to the preset threshold, determine the offset quantization information of the image block according to the preset scaling length, the target pixel value discrete information of the image block, and the average value of the target pixel value discrete information.

In an exemplary embodiment, the information determining unit is further configured to determine target pixel value dispersion information of each image block according to the pixel value dispersion information of each image block, the maximum value of the pixels in the video frame to be encoded, and the number of the pixels in each image block; determining the average value of the target pixel value discrete information according to the target pixel value discrete information of each image block and the number of the target image blocks in the video frame to be coded; the target image block is an image block which meets the preset image size in the video frame to be coded.

In an exemplary embodiment, the information determining unit is further configured to determine second offset quantization information of the image block according to the preset coefficient and the pixel value dispersion information of the image block; and determining the offset quantization information of the image block according to a preset scaling length, the difference value between the target pixel value discrete information of the image block and the average value of the target pixel value discrete information and the second offset quantization information of the image block.

According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the video encoding method of any of the above.

According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein instructions, when executed by a processor of an electronic device, enable the electronic device to perform a video encoding method as in any one of the above.

According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising instructions which, when executed by a processor of an electronic device, enable the electronic device to perform the video encoding method as defined in any one of the above.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

The pixel value discrete information of each image block in a video frame to be encoded is acquired; the offset quantization information of each image block is then determined according to the relationship between the pixel value discrete information of each image block and a preset threshold; finally, the initial quantization information of each image block is updated according to the offset quantization information of each image block to obtain the target quantization information of each image block, and each image block is encoded according to its target quantization information. In this way, the offset quantization information corresponding to each image block is determined adaptively from the relationship between the pixel value discrete information of that image block and the preset threshold. Because the pixel value discrete information of each image block is taken into account, the actual texture information of each image block is reflected, and each image block is subsequently encoded using its own offset quantization information. This effectively reduces image distortion and avoids the large distortion of the encoded image that results from encoding the video frame with fixed offset quantization information.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

Fig. 1 is a flow chart illustrating a method of video encoding according to an example embodiment.

Fig. 2 is a flowchart illustrating a method of determining offset quantization information for respective image blocks according to an exemplary embodiment.

Fig. 3 is a flowchart illustrating another method of determining offset quantization information for respective image blocks according to an example embodiment.

Fig. 4 is a flow chart illustrating another video encoding method according to an example embodiment.

Fig. 5 is a block diagram illustrating a video encoding apparatus according to an example embodiment.

Fig. 6 is a block diagram illustrating an electronic device in accordance with an example embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the disclosure, as detailed in the appended claims.

It should be further noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for presentation, analyzed data, etc.) referred to in the present disclosure are information and data authorized by the user or sufficiently authorized by each party.

Fig. 1 is a flowchart illustrating a video encoding method according to an exemplary embodiment. As shown in Fig. 1, the method is used in a terminal; it is understood that the method can also be applied to a server, or to a system comprising a terminal and a server, in which case it is realized through the interaction of the terminal and the server. In the present exemplary embodiment, the method includes the following steps:

In step S110, the pixel value discrete information of each image block in the video frame to be encoded is obtained.

The video frame to be encoded refers to a video frame that needs to be encoded, such as a video frame in a live video, a video frame in a short video, and the like.

The image block is obtained by segmenting a video frame to be coded; the video frame to be encoded comprises a plurality of image blocks. Typically, the image size of each image block is the same, such as 16 × 16 or 64 × 64. For example, assuming that the image size of a video frame to be encoded is 160 × 160, the video frame to be encoded may be divided into 100 16 × 16 image blocks; assuming that the image size of the video frame to be encoded is 640 × 640, the video frame to be encoded may be sliced into 100 64 × 64 image blocks.

It should be noted that, when the video frame to be encoded is segmented, the video frame may be uniformly segmented or non-uniformly segmented; moreover, the width and height of each image block may be the same or different.

Wherein, the pixel value discrete information refers to the variance of the pixel values; the pixel value dispersion information of the image block refers to a variance obtained by calculating the pixel value of each pixel in the image block.

Specifically, a terminal acquires a video frame in a video as a video frame to be encoded; segmenting a video frame to be encoded according to a preset image size to obtain a plurality of image blocks; and acquiring the pixel value of each pixel in each image block, and calculating to obtain the pixel value variance of each image block according to the pixel value of each pixel in each image block, wherein the pixel value variance is correspondingly used as the pixel value discrete information of each image block.
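As a rough illustration of step S110, the following Python sketch splits a frame into equally sized blocks and computes the variance of each block. The function names, the use of a nested list as the frame, and the choice of population variance (rather than sample variance) are assumptions made for this example; the disclosure only states that the variance of the pixel values is used.

```python
from statistics import pvariance


def split_into_blocks(frame, block_size):
    """Yield (block_row, block_col, block) for each block_size x block_size tile.

    `frame` is a 2-D list of pixel values (a hypothetical stand-in for a decoded frame).
    """
    height, width = len(frame), len(frame[0])
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size] for row in frame[top:top + block_size]]
            yield top // block_size, left // block_size, block


def block_variances(frame, block_size=16):
    """Return {(block_row, block_col): variance of the pixel values in that block}."""
    result = {}
    for block_row, block_col, block in split_into_blocks(frame, block_size):
        pixels = [pixel for row in block for pixel in row]
        # Assumption: population variance; the text does not say which variant is meant.
        result[(block_row, block_col)] = pvariance(pixels)
    return result


# Toy 4x4 "frame" split into 2x2 blocks (made-up data for illustration only).
frame = [
    [10, 10, 0, 255],
    [10, 10, 255, 0],
    [1, 2, 5, 5],
    [3, 4, 5, 5],
]
variances = block_variances(frame, block_size=2)
```

Flat blocks yield a variance of zero, while the checkerboard-like block in the top-right corner yields a large variance, which is the property the later threshold comparison relies on.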

In step S120, offset quantization information of each image block is determined according to a relationship between the pixel value dispersion information of each image block and a preset threshold.

The preset threshold is set in advance, for example, 2000, and may be adjusted according to actual conditions.

The relationship between the pixel value discrete information of each image block and the preset threshold refers to the magnitude relationship between the two; for example, the pixel value discrete information of the image block is greater than the preset threshold, or the pixel value discrete information of the image block is less than or equal to the preset threshold.

The offset quantization information of an image block refers to an offset value applied to the quantization information of the image block, specifically an offset relative to the quantization information of the video frame to be encoded. The quantization information refers to the encoder quantization parameter (QP). The quantization parameter of the encoder reflects how strongly spatial detail is compressed: the smaller the quantization parameter, the more detail is preserved and the smaller the image distortion; the larger the quantization parameter, the more detail is lost, the stronger the image distortion, and the lower the image quality.

Specifically, the terminal compares the pixel value discrete information of each image block with a preset threshold respectively to obtain the relationship between the pixel value discrete information of each image block and the preset threshold; and performing corresponding adjustment processing on the pixel value discrete information of each image block according to the relationship between the pixel value discrete information of each image block and a preset threshold value to obtain the adjusted pixel value discrete information of each image block, wherein the adjusted pixel value discrete information is correspondingly used as the offset quantization information of each image block.

In step S130, the initial quantization information of each image block is updated according to the offset quantization information of each image block to obtain the target quantization information of each image block, and each image block is encoded according to the target quantization information of each image block.

The initial quantization information of each image block is the quantization information of the video frame to be encoded; that is, the initial quantization information of each image block is the same as the quantization information of the video frame to be encoded. The quantization information of the video frame to be encoded represents the quantization step size, specifically the quantization parameter of the encoder.

The initial quantization information of each image block refers to an initial quantization parameter of each image block. The target quantization information of each image block refers to a target quantization parameter of each image block.

Specifically, the terminal acquires quantization information of a video frame to be coded as initial quantization information of each image block; adding the initial quantization information and the offset quantization information of each image block respectively to obtain target quantization information of each image block; and according to the target quantization information of each image block, coding each image block to obtain a coding code stream of each image block, and outputting the coding code stream of each image block.

Furthermore, the terminal can also perform coding processing on each image block according to the target quantization information of each image block to obtain each coded image block, and combine each coded image block to obtain a coded video frame corresponding to the video frame to be coded.

For example, assuming that the initial quantization information of an image block is frame_QP and the offset quantization information is QP_adj, the target quantization information of the image block is final_block_QP = frame_QP + QP_adj.
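The update in step S130 can be sketched in a few lines of Python. The frame-level QP and the per-block offsets below are made-up values, and the function name is hypothetical; any rounding or clamping to an encoder's valid QP range is not described in this passage and is therefore omitted.

```python
def target_qp(frame_qp, qp_adj):
    """Target QP of one block: the frame-level QP plus that block's offset."""
    return frame_qp + qp_adj


frame_qp = 30                      # hypothetical frame-level quantization parameter
block_offsets = [1.5, -0.5, 0.0]   # hypothetical per-block offset values

# Every block starts from the same frame-level QP and receives its own offset.
block_qps = [target_qp(frame_qp, adj) for adj in block_offsets]
```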

It should be noted that, at the same bitrate, the method provided by the present disclosure can significantly improve the SSIM (structural similarity) index of the compressed video compared with publicly known methods.

In the video encoding method, the pixel value discrete information of each image block in a video frame to be encoded is acquired; the offset quantization information of each image block is then determined according to the relationship between the pixel value discrete information of each image block and a preset threshold; finally, the initial quantization information of each image block is updated according to the offset quantization information of each image block to obtain the target quantization information of each image block, and each image block is encoded according to its target quantization information. In this way, the offset quantization information corresponding to each image block is determined adaptively from the relationship between the pixel value discrete information of that image block and the preset threshold. Because the pixel value discrete information of each image block is taken into account, the actual texture information of each image block is reflected, and each image block is subsequently encoded using its own offset quantization information. This effectively reduces image distortion and avoids the large distortion of the encoded image that results from encoding the video frame with fixed offset quantization information.

In an exemplary embodiment, as shown in fig. 2, in step S120, the offset quantization information of each image block is determined according to a relationship between the discrete information of the pixel value of each image block and a preset threshold, which may be specifically implemented by the following steps:

In step S210, for each of the image blocks, in a case where the pixel value discrete information of the image block is greater than the preset threshold, the offset quantization information of the image block is determined according to the pixel value discrete information of the image block and the preset parameters.

In step S220, in a case where the pixel value discrete information of the image block is less than or equal to the preset threshold, the N-th power of the pixel value discrete information of the image block is taken as the offset quantization information of the image block, where N is greater than 0.

The preset parameters are obtained by training on a data set used by a first encoder (such as a first image encoder). For example, the parameters of the first encoder are changed, the sample video frames in the data set are encoded, the average SSIM over the data set is calculated, and the parameters that yield the best average SSIM are selected as the preset parameters. For example, assume that the average SSIM of the data set is n1 for the first set of encoding parameters, n2 for the second set, n3 for the third set, and so on, up to nm for the m-th set; if n2 is the largest, the second set of encoding parameters is determined as the preset parameters.
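The selection described above amounts to picking the candidate parameter set with the highest average SSIM. A minimal Python sketch follows; the candidate names and SSIM scores are made-up placeholders, and the encoding and SSIM measurement steps are assumed to have been carried out elsewhere.

```python
from statistics import mean


def pick_best_parameters(ssim_by_candidate):
    """Return (best_candidate, averages).

    `ssim_by_candidate` maps a candidate parameter-set name to the list of
    per-frame SSIM scores measured on the data set (hypothetical input format).
    """
    averages = {name: mean(scores) for name, scores in ssim_by_candidate.items()}
    best = max(averages, key=averages.get)  # candidate with the largest average SSIM
    return best, averages


# Made-up scores for three candidate parameter sets over a two-frame data set.
measured = {
    "set1": [0.90, 0.92],
    "set2": [0.95, 0.97],
    "set3": [0.93, 0.91],
}
best_set, average_scores = pick_best_parameters(measured)
```

With these placeholder scores the second set has the largest average, mirroring the n2 case in the text.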

SSIM (structural similarity) is an index used to measure the similarity of pictures and is often used to assess the quality of compressed pictures or videos. SSIM measures image similarity in terms of three aspects: luminance, contrast, and structure. The mean is used as an estimate of luminance, the standard deviation as an estimate of contrast, and the covariance as a measure of structural similarity. The value range of SSIM is [0,1]; the larger the SSIM value, the smaller the image distortion and the more similar the images. It is therefore desirable for the encoder to improve the SSIM index of the compressed video at the same bitrate consumption.
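To make the roles of mean, variance, and covariance concrete, the sketch below computes a simplified SSIM over a single window covering all supplied pixels. This is an assumption-laden illustration: practical SSIM implementations usually evaluate the index over local windows and average the results, and the stabilizing constants K1 = 0.01 and K2 = 0.03 are the commonly used defaults rather than values taken from this disclosure.

```python
def global_ssim(x, y, max_value=255):
    """Single-window SSIM between two equally long lists of pixel values."""
    n = len(x)
    mean_x = sum(x) / n                      # luminance estimate of x
    mean_y = sum(y) / n                      # luminance estimate of y
    var_x = sum((a - mean_x) ** 2 for a in x) / n   # contrast estimate of x (variance)
    var_y = sum((b - mean_y) ** 2 for b in y) / n   # contrast estimate of y (variance)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n  # structure term
    c1 = (0.01 * max_value) ** 2             # assumed default constant K1 = 0.01
    c2 = (0.03 * max_value) ** 2             # assumed default constant K2 = 0.03
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [10, 20, 30, 40]   # made-up original pixel values
distorted = [12, 18, 33, 37]   # made-up values after lossy coding
score = global_ssim(reference, distorted)
```

Comparing an image with itself gives a score of 1, and the score drops as the distorted pixels drift away from the reference.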

The N-th power of the pixel value discrete information of the image block here refers to the 0.1 power of the variance of the pixel values of the image block, that is, N = 0.1.

Specifically, the terminal compares the pixel value discrete information of each image block with a preset threshold, acquires a parameter obtained by training according to a data set applied by the first encoder as a preset parameter when the pixel value discrete information of the image block is greater than the preset threshold, and adjusts the pixel value discrete information of the image block according to the preset parameter to obtain adjusted pixel value discrete information as offset quantization information of the image block. And under the condition that the pixel value discrete information of the image block is less than or equal to a preset threshold value, calculating the power N of the pixel value discrete information of the image block as the offset quantization information of the image block.

For example, when the pixel value dispersion information of the image block is less than or equal to the preset threshold, the offset quantization information of the image block is calculated by the following formula:

when energy(i) ≤ thA, qp_adj(i) = energy(i)^0.1;

Here, energy(i) refers to the pixel value discrete information of the i-th image block, that is, the pixel value variance of the i-th image block; thA is the preset threshold; qp_adj(i) refers to the offset quantization information of the i-th image block, that is, the offset quantization parameter of the i-th image block.
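The threshold test and the low-variance branch can be sketched in Python as follows. The default threshold of 2000 and the exponent 0.1 are the example values given in the text; the function name and the `high_variance_rule` callback are hypothetical, with the callback standing in for whatever rule is applied to blocks above the threshold.

```python
def qp_offset(energy, threshold=2000.0, n=0.1, high_variance_rule=None):
    """Offset quantization value for one block, given its pixel value variance.

    Blocks at or below the threshold use energy ** n. Blocks above it are
    delegated to `high_variance_rule`, a hypothetical callback supplied by the caller.
    """
    if energy <= threshold:
        return energy ** n
    if high_variance_rule is None:
        raise ValueError("no rule supplied for blocks above the threshold")
    return high_variance_rule(energy)


flat_block_offset = qp_offset(0.0)        # completely flat block
textured_block_offset = qp_offset(1024.0) # moderately textured block, still below the threshold
```

Because the exponent is small, the offset grows slowly with the variance, so blocks with very different variances below the threshold still receive offsets within a narrow range.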

According to the technical scheme provided by the embodiment of the disclosure, the offset quantization information corresponding to each image block is adaptively determined according to the relationship between the pixel value discrete information of each image block and the preset threshold, and the pixel value discrete information of each image block is comprehensively considered, so that the actual texture information of each image block is favorably considered, and the accuracy rate of determining the offset quantization information of the image block is improved.

In an exemplary embodiment, in step S210, determining offset quantization information of an image block according to the pixel value discrete information of the image block and a preset parameter specifically includes: determining first offset quantization information of the image block according to preset parameters and pixel value discrete information of the image block; and determining the offset quantization information of the image block according to the N-th power of the pixel value discrete information of the image block and the first offset quantization information.

The preset parameters comprise a first preset parameter, a second preset parameter, a third preset parameter and a fourth preset parameter. The first preset parameter is parameter A, where A = -3.0; the second preset parameter is parameter B, where B = 0.5; the third preset parameter is parameter C, where C = -1.45; and the fourth preset parameter is parameter D, where D = 0.5. It should be noted that the first preset parameter, the second preset parameter, the third preset parameter, and the fourth preset parameter are not fixed, and may be adjusted according to actual conditions.

The first offset quantization information of the image block is clip3(A, B, log2(log2(Energy(i)^0.1 + C))) + D. When log2(log2(Energy(i)^0.1 + C)) is less than A, A + D is output; when log2(log2(Energy(i)^0.1 + C)) is greater than B, B + D is output; and when A ≤ log2(log2(Energy(i)^0.1 + C)) ≤ B, log2(log2(Energy(i)^0.1 + C)) + D is output.

Specifically, the terminal adjusts the pixel value discrete information of the image block according to a first preset parameter, a second preset parameter, a third preset parameter and a fourth preset parameter to obtain adjusted pixel value discrete information serving as first offset quantization information of the image block; and acquiring the N power of the pixel value discrete information of the image block, and adding the N power of the pixel value discrete information of the image block and the first offset quantization information to obtain the offset quantization information of the image block.

For example, when the pixel value dispersion information of the image block is greater than the preset threshold, the offset quantization information of the image block is calculated by the following formula:

when Energy(i) > thA, qp_adj(i) = Energy(i)^0.1 + clip3(A, B, log2(log2(Energy(i)^0.1 + C))) + D;

Wherein, Energy(i) refers to the pixel value discrete information of the i-th image block; thA is the preset threshold; qp_adj(i) refers to the offset quantization information of the i-th image block; clip3(A, B, log2(log2(Energy(i)^0.1 + C))) + D refers to the first offset quantization information of the i-th image block; A = -3.0, B = 0.5, C = -1.45, and D = 0.5.
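The two branches of this scheme can be put together in a short sketch. It rests on several assumptions that are not stated in the disclosure: the 0.1-th power is read as applying to Energy(i) inside the inner logarithm, the threshold thA is supplied by the caller because no value is given here, and blocks for which the nested logarithm is undefined are treated as if the clipped term were the lower bound A. The function names are illustrative.

```python
import math


def clip3(lower: float, upper: float, value: float) -> float:
    """Clamp value to the range [lower, upper]."""
    return max(lower, min(upper, value))


def offset_first_scheme(energy: float, th_a: float,
                        a: float = -3.0, b: float = 0.5,
                        c: float = -1.45, d: float = 0.5) -> float:
    """qp_adj(i) for one block; th_a is a caller-supplied threshold."""
    base = energy ** 0.1
    if energy <= th_a:
        # Low-variance branch: qp_adj(i) = Energy(i)^0.1
        return base
    inner = base + c
    if inner <= 1.0:
        # Assumption: log2(log2(inner)) is undefined or tends to minus
        # infinity here, so the clipped term is taken as the lower bound A.
        clipped = a
    else:
        clipped = clip3(a, b, math.log2(math.log2(inner)))
    # High-variance branch: Energy(i)^0.1 + first offset quantization information
    return base + clipped + d
```

Under this reading, the clipped term moves from A to B as the variance increases by several orders of magnitude, so only very highly textured blocks reach the upper bound.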

According to the technical scheme provided by the embodiment of the disclosure, under the condition that the pixel value discrete information of the image block is larger than the preset threshold, the pixel value discrete information of the image block and the preset parameters obtained according to the data set training applied by the first encoder are comprehensively considered, so that the accuracy rate of determining the offset quantization information of the image block is favorably improved.

In an exemplary embodiment, the preset parameters are obtained by: acquiring a data set, the data set including a sample video frame; inputting the sample video frame into an encoder to be trained to obtain an encoded video frame corresponding to the sample video frame; determining an image similarity evaluation index according to the sample video frame and the encoded video frame corresponding to the sample video frame; and adjusting the encoding parameters of the encoder to be trained according to the image similarity evaluation index, retraining the encoder with the adjusted encoding parameters until a training end condition is reached, and taking the encoding parameters of the trained encoder that reaches the training end condition as the preset parameters.

The image similarity evaluation index is SSIM.

The training end condition means that the average SSIM corresponding to the data set does not increase any more. The encoder to be trained refers to an image encoder to be trained. The trained encoder refers to a trained image encoder.

Specifically, the terminal acquires a data set comprising a plurality of sample video frames, inputs the sample video frames in the data set into the encoder to be trained, and encodes the sample video frames through the encoder to be trained to obtain the encoded video frames corresponding to the sample video frames. The terminal then calculates the image similarity evaluation index from each sample video frame and its corresponding encoded video frame, and adjusts the encoding parameters of the encoder to be trained according to the image similarity evaluation index to obtain an encoder with adjusted encoding parameters. The encoder with the adjusted encoding parameters is trained again, and this process is repeated until the average value of the image similarity evaluation indexes corresponding to the data set no longer increases; the encoding parameters of the encoder trained at that point are used as the preset parameters.

For example, the average value of the image similarity evaluation indexes corresponding to the data set is calculated as follows: the terminal adds up the image similarity evaluation indexes corresponding to each sample video frame in the data set to obtain a target image similarity evaluation index, and then divides the target image similarity evaluation index by the total number of sample video frames in the data set to obtain the average value of the image similarity evaluation indexes corresponding to the data set.
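The iteration and its end condition can be illustrated with a small sketch. The disclosure does not say how the encoding parameters are adjusted between rounds, so the sketch simply assumes that the candidate parameter sets are supplied in the order in which they are tried, and that a caller-provided function `score_frames` (hypothetical) encodes the data set with one parameter set and returns the per-frame SSIM values.

```python
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

Params = TypeVar("Params")


def average_index(scores: List[float]) -> float:
    """Sum of the per-frame indexes divided by the number of sample frames."""
    return sum(scores) / len(scores)


def tune_until_no_gain(
    candidates: Iterable[Params],
    score_frames: Callable[[Params], List[float]],
) -> Tuple[Optional[Params], float]:
    """Try parameter sets in order until the data set average stops rising."""
    best_params: Optional[Params] = None
    best_average = float("-inf")
    for params in candidates:
        average = average_index(score_frames(params))
        if average <= best_average:
            break  # end condition: the average index no longer increases
        best_params, best_average = params, average
    return best_params, best_average
```

The function returns the last parameter set that improved the average, which corresponds to the parameters taken as the preset parameters once the end condition is met.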

According to the technical scheme provided by the embodiment of the disclosure, the encoding parameters of the encoder are iteratively updated using a data set of a plurality of sample video frames until the average value of the image similarity evaluation indexes corresponding to the data set no longer increases, and the encoding parameters of the encoder trained at that point are taken as the preset parameters. This is beneficial to improving the accuracy with which the preset parameters are determined, which in turn improves the accuracy with which the offset quantization information of the image block is determined and reduces the image distortion after subsequent encoding based on the offset quantization information of the image block.

In an exemplary embodiment, as shown in fig. 3, in the step S120, determining offset quantization information of each image block according to a relationship between the discrete information of the pixel value of each image block and a preset threshold may specifically be implemented by the following steps:

in step S310, target pixel value dispersion information and a target pixel value dispersion information average value of each image block are determined according to the pixel value dispersion information of each image block.

In step S320, for each image block in each image block, when the pixel value dispersion information of the image block is greater than the preset threshold, the offset quantization information of the image block is determined according to the preset scaling length, the target pixel value dispersion information of the image block, the target pixel value dispersion information average value, the preset coefficient, and the pixel value dispersion information of the image block.

In step S330, when the pixel value dispersion information of the image block is less than or equal to the preset threshold, the offset quantization information of the image block is determined according to the preset scaling length, the target pixel value dispersion information of the image block, and the target pixel value dispersion information average value.

Wherein, the target pixel value discrete information of the image block refers to energyLog; the target pixel value discrete information of the i-th image block is energyLog(i) = log2(2 × Energy(i) + aqTuneSsimC2).

The target pixel value discrete information average value refers to an average value of target pixel value discrete information of the target image block. The target image block is an image block which meets a preset image size in a video frame to be coded.

The preset scaling length (strength) is, for example, 1.0, and may be adjusted according to actual conditions.

Wherein, the preset coefficients are obtained by training according to a data set applied by a second encoder (e.g., a second image encoder); for example, the parameters of the second encoder are changed, the sample video frames in the data set are then encoded, the average SSIM of the data set is calculated, and the parameters with the optimal average SSIM are selected as the preset coefficients. For example, assume that for the first group of encoding parameters, the average SSIM corresponding to the data set is p1; for the second group of encoding parameters, the average SSIM corresponding to the data set is p2; for the third group of encoding parameters, the average SSIM corresponding to the data set is p3; and so on, until for the m-th group of encoding parameters, the average SSIM corresponding to the data set is pm. If p3 is the largest, the third group of encoding parameters is determined as the preset coefficients.
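The selection among the m groups of encoding parameters amounts to picking the group with the largest data set average. A minimal sketch follows; the group names and SSIM values are made up for illustration and are not results from the disclosure.

```python
from typing import Dict, List, Tuple


def select_best_group(ssim_by_group: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the parameter group with the highest data set average SSIM."""
    averages = {
        name: sum(scores) / len(scores)
        for name, scores in ssim_by_group.items()
    }
    best_name = max(averages, key=lambda name: averages[name])
    return best_name, averages[best_name]


# Made-up per-frame SSIM values for three candidate parameter groups.
example = {
    "group_1": [0.90, 0.92],
    "group_2": [0.91, 0.93],
    "group_3": [0.95, 0.97],
}
best_group, best_average = select_best_group(example)
```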

The preset coefficients comprise a first preset coefficient, a second preset coefficient, a third preset coefficient, a fourth preset coefficient, a fifth preset coefficient and a sixth preset coefficient. The first preset coefficient is coefficient E, where E = 4.0; the second preset coefficient is coefficient F, where F = 0.8; the third preset coefficient is coefficient G, where G = 3.0; the fourth preset coefficient is coefficient H, where H = 0.9; the fifth preset coefficient is coefficient K, where K = 0.001; and the sixth preset coefficient is coefficient L, where L = 1.5. It should be noted that the first preset coefficient, the second preset coefficient, the third preset coefficient, the fourth preset coefficient, the fifth preset coefficient, and the sixth preset coefficient are not fixed, and may be adjusted according to actual conditions.

Specifically, the terminal calculates the target pixel value discrete information of each image block according to the pixel value discrete information of each image block, and calculates the target pixel value discrete information average value according to the target pixel value discrete information of each image block and the number of target image blocks in the video frame to be coded. The terminal then compares the pixel value discrete information of each image block with the preset threshold. When the pixel value discrete information of an image block is greater than the preset threshold, the terminal acquires, as the preset coefficients, coefficients obtained by training according to a data set applied by the second encoder, and determines the offset quantization information of the image block according to the preset scaling length, the target pixel value discrete information of the image block, the target pixel value discrete information average value, the preset coefficients and the pixel value discrete information of the image block. When the pixel value discrete information of the image block is less than or equal to the preset threshold, the terminal determines the offset quantization information of the image block according to the preset scaling length, the target pixel value discrete information of the image block and the target pixel value discrete information average value. For example, when the pixel value discrete information of the image block is less than or equal to the preset threshold, the terminal obtains the difference between the target pixel value discrete information of the image block and the target pixel value discrete information average value, multiplies the difference by the preset scaling length to obtain a product, and uses the product as the offset quantization information of the image block, as in the following formula:

when Energy(i) ≤ thB, qp_adj(i) = 3.0 × strength × (energyLog(i) - avgEnergy);

wherein, Energy(i) refers to the pixel value discrete information of the i-th image block; thB is the preset threshold; qp_adj(i) refers to the offset quantization information of the i-th image block; strength refers to the preset scaling length; energyLog(i) refers to the target pixel value discrete information of the i-th image block; and avgEnergy refers to the target pixel value discrete information average value.
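As a minimal sketch of this branch, the following function applies the formula to a list of energyLog values. It assumes, for simplicity, that every block in the list is a target image block and falls in the low-variance branch, so that avgEnergy is the plain mean of the list; the function name is illustrative.

```python
from typing import List


def low_variance_offsets(energy_logs: List[float],
                         strength: float = 1.0) -> List[float]:
    """qp_adj(i) = 3.0 * strength * (energyLog(i) - avgEnergy) per block."""
    avg_energy = sum(energy_logs) / len(energy_logs)
    return [3.0 * strength * (value - avg_energy) for value in energy_logs]
```

Blocks whose energyLog is below the frame average receive a negative offset and blocks above the average receive a positive one, so under these assumptions the offsets sum to zero over the frame.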

According to the technical scheme provided by the embodiment of the disclosure, the offset quantization information corresponding to each image block is adaptively determined according to the relationship between the pixel value discrete information of each image block and the preset threshold, and the pixel value discrete information of each image block is comprehensively considered, so that the actual texture information of each image block is favorably considered, and the determination accuracy of the offset quantization information of the image block is improved; and meanwhile, the subsequent coding is carried out based on the offset quantization information corresponding to each image block, so that the image distortion can be effectively reduced.

In an exemplary embodiment, in step S310, the target pixel value dispersion information and the target pixel value dispersion information average value of each image block are determined according to the pixel value dispersion information of each image block, which specifically includes the following contents: respectively determining target pixel value discrete information of each image block according to the pixel value discrete information of each image block, the maximum value of pixels in a video frame to be coded and the number of pixels in each image block; determining the average value of the discrete information of the target pixel value according to the discrete information of the target pixel value of each image block and the number of the target image blocks in the video frame to be coded; the target image block is an image block which meets the preset image size in the video frame to be coded.

The target image block is an image block of which a corresponding image size is the same as a preset image size (for example, 8 × 8) in image blocks included in the video frame to be encoded.

Specifically, the terminal calculates and obtains target pixel value discrete information of each image block according to the pixel value discrete information of the image block, the maximum value of the pixels in the video frame to be encoded and the number of the pixels in the image block, so as to obtain the target pixel value discrete information of each image block; the number of target image blocks in a video frame to be coded and the sum of the target pixel value discrete information of each image block are obtained, and the sum and the number are divided to obtain an average value which is used as the average value of the target pixel value discrete information.

For example, the target pixel value discrete information average value is calculated by the following formula:

energyLog(i) = log2(2 × Energy(i) + aqTuneSsimC2);

aqTuneSsimC2 = 0.03 × 0.03 × PIXEL_MAX × PIXEL_MAX × numPixelBlock × (numPixelBlock - 1);

sumEnergylog = the sum of energyLog(i) over all image blocks;

avgEnergy = sumEnergylog / block_NUM;

wherein, energyLog(i) refers to the target pixel value discrete information of the i-th image block; aqTuneSsimC2 refers to the optimized value of the constant C2 in SSIM; PIXEL_MAX refers to the maximum value of a pixel in the video frame to be encoded; numPixelBlock refers to the number of pixels in the i-th image block; and block_NUM refers to the number of target image blocks in the video frame to be encoded.
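The formulas above translate directly into code. The sketch below assumes that all blocks passed in are target image blocks of the same size, so that block_NUM equals the length of the list; the function names are illustrative.

```python
import math
from typing import List


def aq_tune_ssim_c2(pixel_max: int, num_pixel_block: int) -> float:
    """0.03 * 0.03 * PIXEL_MAX^2 * numPixelBlock * (numPixelBlock - 1)."""
    return (0.03 * 0.03 * pixel_max * pixel_max
            * num_pixel_block * (num_pixel_block - 1))


def energy_log(energy: float, c2: float) -> float:
    """energyLog(i) = log2(2 * Energy(i) + aqTuneSsimC2)."""
    return math.log2(2.0 * energy + c2)


def average_energy_log(energies: List[float], c2: float) -> float:
    """avgEnergy = sum of energyLog(i) divided by the number of blocks."""
    logs = [energy_log(energy, c2) for energy in energies]
    return sum(logs) / len(logs)


# Example: 8-bit video (PIXEL_MAX = 255) with 8x8 blocks (64 pixels).
c2 = aq_tune_ssim_c2(255, 64)
```

Because the constant is added inside the logarithm, energyLog stays finite even for a completely flat block whose variance is zero.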

According to the technical scheme provided by the embodiment of the disclosure, the target pixel value discrete information and the target pixel value discrete information average value of each image block are determined according to the pixel value discrete information of each image block, so that the offset quantization information of each image block is determined based on the target pixel value discrete information and the target pixel value discrete information average value of each image block, and the purpose of adaptively determining the offset quantization information of each image block is achieved.

In an exemplary embodiment, in step S320, the offset quantization information of the image block is determined according to the preset scaling length, the target pixel value dispersion information of the image block, the target pixel value dispersion information average value, the preset coefficient, and the pixel value dispersion information of the image block, and specifically includes the following contents: determining second offset quantization information of the image block according to the preset coefficient and the pixel value discrete information of the image block; and determining the offset quantization information of the image block according to the preset scaling length, the difference value between the target pixel value discrete information of the image block and the average value of the target pixel value discrete information and the second offset quantization information of the image block.

Wherein, the second offset quantization information of the image block is clip3(E, F, (G × log2(log2(Energy(i)^0.1 + H) + K) + L)). When G × log2(log2(Energy(i)^0.1 + H) + K) + L is less than E, E is output; when G × log2(log2(Energy(i)^0.1 + H) + K) + L is greater than F, F is output; and when E ≤ G × log2(log2(Energy(i)^0.1 + H) + K) + L ≤ F, G × log2(log2(Energy(i)^0.1 + H) + K) + L is output.

Specifically, the terminal adjusts the pixel value discrete information of the image block according to a preset coefficient to obtain adjusted pixel value discrete information serving as second offset quantization information of the image block; acquiring a difference value between target pixel value discrete information and a target pixel value discrete information average value of an image block, and multiplying the difference value by a preset scaling length to obtain a product; and adding the product and the second offset quantization information of the image block to obtain the offset quantization information of the image block.

when Energy(i) > thB, qp_adj(i) = 3.0 × strength × (energyLog(i) - avgEnergy) + clip3(E, F, (G × log2(log2(Energy(i)^0.1 + H) + K) + L));

Wherein, Energy(i) refers to the pixel value discrete information of the i-th image block; thB refers to the preset threshold, such as 2000; qp_adj(i) refers to the offset quantization information of the i-th image block; strength refers to the preset scaling length; energyLog(i) refers to the target pixel value discrete information of the i-th image block, and avgEnergy refers to the target pixel value discrete information average value; E = 4.0, F = 0.8, G = 3.0, H = 0.9, K = 0.001 and L = 1.5.
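Both branches of this scheme can be sketched together as follows. This is only an illustration under stated assumptions: the 0.1-th power is read as applying to Energy(i) inside the inner logarithm, the coefficients are passed in by the caller rather than fixed in the code, and energyLog(i) and avgEnergy are assumed to have been computed beforehand. The class and function names are illustrative and do not come from the disclosure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Coefficients:
    """Preset coefficients E, F, G, H, K and L, supplied by the caller."""
    e: float  # lower clipping bound
    f: float  # upper clipping bound
    g: float
    h: float
    k: float
    l: float


def clip3(lower: float, upper: float, value: float) -> float:
    """Clamp value to the range [lower, upper]."""
    return max(lower, min(upper, value))


def second_offset(energy: float, coeff: Coefficients) -> float:
    """clip3(E, F, G * log2(log2(Energy^0.1 + H) + K) + L).

    math.log2 raises ValueError when its argument is not positive; the
    handling of such inputs is not described, so none is assumed here.
    """
    inner = math.log2(energy ** 0.1 + coeff.h) + coeff.k
    raw = coeff.g * math.log2(inner) + coeff.l
    return clip3(coeff.e, coeff.f, raw)


def offset_second_scheme(energy: float, energy_log: float, avg_energy: float,
                         th_b: float, coeff: Coefficients,
                         strength: float = 1.0) -> float:
    """qp_adj(i) for one block under the energyLog-based scheme."""
    base = 3.0 * strength * (energy_log - avg_energy)
    if energy <= th_b:
        return base
    return base + second_offset(energy, coeff)
```

Keeping the coefficients in a separate object makes it straightforward to evaluate several candidate coefficient groups against a data set without changing the formula itself.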

According to the technical scheme provided by the embodiment of the disclosure, under the condition that the pixel value discrete information of the image block is larger than the preset threshold, the preset coefficient obtained by training according to the data set applied by the second encoder, the pixel value discrete information of the image block, the preset scaling length, the target pixel value discrete information of the image block and the target pixel value discrete information average value are comprehensively considered, and the accuracy rate of determining the offset quantization information of the image block is favorably improved.

Fig. 4 is a flowchart illustrating another video encoding method according to an exemplary embodiment, which is used in a terminal, as shown in fig. 4, and includes the steps of:

in step S410, discrete information of pixel values of each image block in the video frame to be encoded is obtained.

In step S420, for each image block in each image block, when the pixel value dispersion information of the image block is greater than the preset threshold, the offset quantization information of the image block is determined according to the pixel value dispersion information of the image block and the preset parameter.

In step S430, in a case that the pixel value dispersion information of the image block is less than or equal to the preset threshold, acquiring the N-th power of the pixel value dispersion information of the image block as the offset quantization information of the image block; n is greater than 0.

In step S440, for each image block in each image block, when the pixel value dispersion information of the image block is greater than the preset threshold, the offset quantization information of the image block is determined according to the preset scaling length, the target pixel value dispersion information of the image block, the target pixel value dispersion information average value, the preset coefficient, and the pixel value dispersion information of the image block.

In step S450, when the pixel value dispersion information of the image block is less than or equal to the preset threshold, the offset quantization information of the image block is determined according to the preset scaling length, the target pixel value dispersion information of the image block, and the target pixel value dispersion information average value.

In step S460, the initial quantization information of each image block is updated according to the offset quantization information of each image block to obtain the target quantization information of each image block, and each image block is encoded according to the target quantization information of each image block.

In the video coding method, the offset quantization information corresponding to each image block is adaptively determined according to the relationship between the pixel value discrete information of each image block and the preset threshold, and the pixel value discrete information of each image block is comprehensively considered, which is beneficial to considering the actual texture information of each image block, so that the subsequent coding is performed based on the offset quantization information corresponding to each image block, the image distortion can be effectively reduced, and the defect that the coded image distortion is larger due to the fact that the video frame is coded by adopting the fixed quantization offset information is avoided.

It should be understood that, although the steps in the flowcharts related to the embodiments described above are displayed sequentially as indicated by the arrows, the steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least a part of the steps in the flowcharts related to the embodiments described above may include multiple sub-steps or multiple stages, which are not necessarily performed at the same time but may be performed at different times; the execution order of these sub-steps or stages is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.

It is understood that the same/similar parts between the embodiments of the method described above in this specification can be referred to each other, and each embodiment focuses on the differences from the other embodiments, and it is sufficient that the relevant points are referred to the descriptions of the other method embodiments.

Based on the same inventive concept, the disclosed embodiments also provide a video encoding apparatus for implementing the above-mentioned video encoding method.

Fig. 5 is a block diagram illustrating a video encoding apparatus according to an exemplary embodiment. Referring to fig. 5, the apparatus includes an information obtaining unit 510, an information determining unit 520, and an encoding processing unit 530.

An information obtaining unit 510 configured to perform obtaining pixel value discrete information of each image block in a video frame to be encoded.

An information determining unit 520 configured to perform determining offset quantization information of each image block according to a relationship between pixel value dispersion information of each image block and a preset threshold.

And an encoding processing unit 530 configured to update the initial quantization information of each image block according to the offset quantization information of each image block to obtain target quantization information of each image block, and perform encoding processing on each image block according to the target quantization information of each image block.

In an exemplary embodiment, the information determining unit 520 is further configured to perform, for each of the respective image blocks, determining offset quantization information of the image block according to the pixel value dispersion information of the image block and a preset parameter if the pixel value dispersion information of the image block is greater than a preset threshold; under the condition that the pixel value discrete information of the image block is smaller than or equal to a preset threshold, acquiring the N power of the pixel value discrete information of the image block as the offset quantization information of the image block; n is greater than 0.

In an exemplary embodiment, the information determining unit 520 is further configured to determine first offset quantization information of the image block according to a preset parameter and pixel value discrete information of the image block; and determining the offset quantization information of the image block according to the N-th power of the pixel value discrete information of the image block and the first offset quantization information.

In an exemplary embodiment, the video encoding apparatus provided by the present disclosure further comprises a parameter determination unit configured to perform: acquiring a data set, the data set including a sample video frame; inputting the sample video frame into an encoder to be trained to obtain an encoded video frame corresponding to the sample video frame; determining an image similarity evaluation index according to the sample video frame and the encoded video frame corresponding to the sample video frame; and adjusting the encoding parameters of the encoder to be trained according to the image similarity evaluation index, retraining the encoder with the adjusted encoding parameters until a training end condition is reached, and taking the encoding parameters of the trained encoder that reaches the training end condition as the preset parameters.

In an exemplary embodiment, the information determining unit 520 is further configured to determine the target pixel value dispersion information and the target pixel value dispersion information average value of each image block according to the pixel value dispersion information of each image block; for each image block in each image block, under the condition that the pixel value dispersion information of the image block is larger than a preset threshold value, determining offset quantization information of the image block according to a preset scaling length, target pixel value dispersion information of the image block, a target pixel value dispersion information average value, a preset coefficient and the pixel value dispersion information of the image block; and under the condition that the pixel value discrete information of the image block is smaller than or equal to a preset threshold, determining the offset quantization information of the image block according to a preset scaling length, the target pixel value discrete information of the image block and the target pixel value discrete information average value.

In an exemplary embodiment, the information determining unit 520 is further configured to perform determining target pixel value discrete information of each image block according to pixel value discrete information of each image block, a maximum value of pixels in a video frame to be encoded, and the number of pixels in each image block, respectively; determining the average value of the discrete information of the target pixel value according to the discrete information of the target pixel value of each image block and the number of the target image blocks in the video frame to be coded; the target image block is an image block which meets the preset image size in the video frame to be coded.

In an exemplary embodiment, the information determining unit 520 is further configured to determine the second offset quantization information of the image block according to the preset coefficient and the pixel value dispersion information of the image block; and determining the offset quantization information of the image block according to the preset scaling length, the difference value between the target pixel value discrete information of the image block and the average value of the target pixel value discrete information and the second offset quantization information of the image block.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

The various modules in the video encoding apparatus described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent of a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.

Fig. 6 is a block diagram illustrating an electronic device 600 for performing a video encoding method according to an example embodiment. For example, the electronic device 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and so forth.

Referring to fig. 6, the electronic device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.

The processing component 602 generally controls overall operation of the electronic device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 602 can include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 can include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.

The memory 604 is configured to store various types of data to support operations at the electronic device 600. Examples of such data include instructions for any application or method operating on the electronic device 600, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 604 may be implemented by any type or combination of volatile or non-volatile storage devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, optical disk, or graphene memory.

The power component 606 provides power to the various components of the electronic device 600. The power component 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 600.

The multimedia component 608 includes a screen that provides an output interface between the electronic device 600 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. A touch sensor may sense not only the boundary of a touch or slide action but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front-facing camera and/or a rear-facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the electronic device 600 is in an operation mode, such as a shooting mode or a video mode. Each of the front-facing camera and the rear-facing camera may be a fixed optical lens system or may have focal length and optical zoom capability.

The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 604 or transmitted via the communication component 616. In some embodiments, audio component 610 also includes a speaker for outputting audio signals.

The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The sensor component 614 includes one or more sensors for providing status assessments of various aspects of the electronic device 600. For example, the sensor component 614 may detect an open/closed state of the electronic device 600 and the relative positioning of components, such as a display and a keypad of the electronic device 600. The sensor component 614 may also detect a change in the position of the electronic device 600 or of a component of the electronic device 600, the presence or absence of user contact with the electronic device 600, the orientation or acceleration/deceleration of the electronic device 600, and a change in the temperature of the electronic device 600. The sensor component 614 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 616 is configured to facilitate wired or wireless communication between the electronic device 600 and other devices. The electronic device 600 may access a wireless network based on a communication standard, such as WiFi, a carrier network (such as 2G, 3G, 4G, or 5G), or a combination thereof. In an exemplary embodiment, the communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the electronic device 600 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.

In an exemplary embodiment, a computer-readable storage medium comprising instructions is also provided, such as the memory 604 comprising instructions, where the instructions are executable by the processor 620 of the electronic device 600 to perform the above-described method. For example, the computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

In an exemplary embodiment, a computer program product is also provided that includes instructions executable by the processor 620 of the electronic device 600 to perform the above-described method.
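As a rough illustration of what such instructions might look like, the following Python sketch walks through the steps of the method as summarized in the abstract: obtain pixel value discrete information for each image block, derive offset quantization information from its relationship to a preset threshold, and update the initial quantization information to obtain the target quantization information. It is a minimal sketch under stated assumptions, not the implementation of the disclosure: the use of population variance as the discrete information, the linear offset mapping, the direction of the adjustment, and the names block_dispersion, offset_for_block, target_quantization, threshold, and scale are all hypothetical.

```python
from statistics import pvariance


def block_dispersion(block):
    """Pixel value discrete information of one block.

    Assumption: population variance of the pixel values is used as the measure.
    """
    pixels = [p for row in block for p in row]
    return pvariance(pixels)


def offset_for_block(dispersion, threshold, scale):
    """Offset quantization information for one block.

    Assumption: the offset is zero at or below the threshold and grows
    linearly with the excess above it; the disclosure's formula may differ.
    """
    if dispersion > threshold:
        return scale * (dispersion - threshold)
    return 0.0


def target_quantization(blocks, initial_q, threshold=100.0, scale=0.01):
    """Update each block's initial quantization value with its offset."""
    targets = []
    for block, q0 in zip(blocks, initial_q):
        d = block_dispersion(block)
        targets.append(q0 + offset_for_block(d, threshold, scale))
    return targets


if __name__ == "__main__":
    flat = [[10, 10], [10, 10]]        # no texture: variance is 0
    textured = [[0, 100], [100, 0]]    # strong texture: variance is 2500
    print(target_quantization([flat, textured], [30.0, 30.0]))
```

With these hypothetical defaults, the flat block keeps its initial value because its variance does not exceed the threshold, while the textured block receives a non-zero offset. A real encoder would additionally clamp the result to the valid quantization range of the codec in use.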

It should be noted that the above-described apparatus, electronic device, computer-readable storage medium, computer program product, and the like may also include other embodiments corresponding to the method embodiments. For specific implementations, refer to the descriptions of the related method embodiments, which are not repeated here.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles, including departures from the present disclosure that come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A video encoding method, comprising:

2. The method according to claim 1, wherein the determining offset quantization information of each image block according to a relationship between pixel value discrete information of each image block and a preset threshold comprises:

for each of the image blocks, determining offset quantization information of the image block according to the pixel value discrete information of the image block and a preset parameter in a case where the pixel value discrete information of the image block is greater than the preset threshold;

3. The method according to claim 2, wherein determining offset quantization information of the image block according to the discrete information of pixel values of the image block and preset parameters comprises:

4. The method according to claim 2, characterized in that the preset parameters are obtained by:

acquiring a data set, wherein the data set comprises a sample video frame;

5. The method according to claim 1, wherein the determining offset quantization information of each image block according to a relationship between pixel value discrete information of each image block and a preset threshold, further comprises:

6. The method according to claim 5, wherein the determining target pixel value discrete information of each image block and an average value of the target pixel value discrete information according to the pixel value discrete information of each image block comprises:

7. The method according to claim 5, wherein the determining offset quantization information of the image block according to a preset scaling length, the target pixel value discrete information of the image block, the average value of the target pixel value discrete information, a preset coefficient, and the pixel value discrete information of the image block comprises:

and determining the offset quantization information of the image block according to a preset scaling length, the difference between the target pixel value discrete information of the image block and the average value of the target pixel value discrete information and the second offset quantization information of the image block.

8. A video encoding apparatus, comprising:

9. An electronic device, comprising:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the video encoding method of any of claims 1 to 7.

10. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the video encoding method of any of claims 1-7.

11. A computer program product comprising instructions which, when executed by a processor of an electronic device, enable the electronic device to carry out the video encoding method of any one of claims 1 to 7.