CN118233647A - Video encoding method, video encoding device, electronic equipment and storage medium

Info

Publication number: CN118233647A
Application number: CN202410273529.4A
Authority: CN
Inventors: 高敏; 陈靖
Original assignee: Shuhang Technology Beijing Co ltd
Current assignee: Shuhang Technology Beijing Co ltd
Priority date: 2024-03-11
Filing date: 2024-03-11
Publication date: 2024-06-21

Abstract

The embodiments of the application provide a video encoding method, a video encoding apparatus, an electronic device, and a storage medium. The video encoding method comprises: obtaining a frame to be encoded, wherein the frame to be encoded comprises at least one block to be encoded; determining a predicted motion vector and a true motion vector of each block to be encoded; locating, in a reference frame of the frame to be encoded, a first prediction block corresponding to each block to be encoded based on the predicted motion vector of that block, and locating, in the reference frame of the frame to be encoded, a second prediction block corresponding to each block to be encoded based on the true motion vector of that block; and determining, according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block, whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded. The embodiments of the application can improve the coding effect.

Description

Video encoding method, video encoding device, electronic equipment and storage medium

Technical Field

The present application relates to the field of video technologies, and in particular, to a video encoding method, apparatus, electronic device, and storage medium.

Background

In a short-video service scenario, a video uploaded by a creator needs to be transcoded by a server, and the transcoded video is sent to users for viewing. Prediction techniques in video coding are classified into intra prediction and inter prediction, and inter prediction is further classified into forward prediction and bi-prediction according to the prediction direction. According to the prediction mode, input video frames are generally classified into intra-predicted frames (I frames), forward-predicted frames (P frames), and bi-predicted frames (B frames). Specifically, an image encoded with intra prediction is referred to as an I frame, an image encoded with forward prediction is referred to as a P frame, and an image encoded with bi-prediction is referred to as a B frame.

The conventional P frame has only one reference frame, and a block to which a motion vector points is used to generate a final predicted block of the current block during prediction. To improve the coding efficiency of the conventional P frame, a generalized P frame may be introduced. Generalized P-frames refer to the prediction of conventional P-frames in a bi-directional prediction manner similar to B-frames. In this prediction mode, a P frame predicts using two reference frames, which are images before the P frame, and in the prediction process, two prediction blocks of the current block are generated by using the blocks pointed by the two motion vectors, respectively, and then the final prediction block of the current block is obtained based on the weighted average of the two prediction blocks.

In order to reduce the bit rate spent on motion vectors, motion vectors can be coded differentially. For example, a suitable motion vector is selected as the predicted value (denoted MVP) of the motion vector (denoted MV) of the current block, and then the residual between the MV and the MVP is encoded. For generalized P frames, in order to further reduce the bit rate spent on motion vectors, one of the motion vectors (e.g., MV1) may be set equal to its predicted value (denoted MVP1), in which case the encoder no longer needs to encode the residual between MV1 and MVP1. However, this may enlarge the residual between the current block and its prediction block, which degrades the accuracy of inter prediction and thus the coding effect.
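By way of illustration only, the differential coding described above can be sketched as follows. The `MotionVector` type and the function names are assumptions introduced for this sketch and do not appear in the application; integer-pixel vectors are assumed.

```python
from typing import NamedTuple


class MotionVector(NamedTuple):
    """Hypothetical motion vector type: displacement in pixels."""
    x: int  # horizontal component
    y: int  # vertical component


def mv_residual(mv: MotionVector, mvp: MotionVector) -> MotionVector:
    """Residual (MV - MVP) that a differential coder would write to the bitstream."""
    return MotionVector(mv.x - mvp.x, mv.y - mvp.y)


def reconstruct_mv(mvp: MotionVector, mvd: MotionVector) -> MotionVector:
    """Decoder side: recover MV from the predictor and the transmitted residual."""
    return MotionVector(mvp.x + mvd.x, mvp.y + mvd.y)


mv1 = MotionVector(5, -3)    # motion vector found by motion estimation
mvp1 = MotionVector(4, -1)   # predictor, e.g. derived from neighbouring blocks
mvd1 = mv_residual(mv1, mvp1)
assert reconstruct_mv(mvp1, mvd1) == mv1

# Shortcut from the text: force MV1 to MVP1, so the residual is zero and is not coded.
forced_mv1 = mvp1
assert mv_residual(forced_mv1, mvp1) == MotionVector(0, 0)
```

The last two lines show the trade-off discussed above: the zero residual costs no bits, but `forced_mv1` may point to a worse prediction block than `mv1` would.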

Disclosure of Invention

The application provides a video coding method, a video coding device, electronic equipment and a storage medium, which can improve coding effect.

In a first aspect, an embodiment of the present application provides a video encoding method, including:

obtaining a frame to be encoded, wherein the frame to be encoded comprises at least one block to be encoded;

determining a predicted motion vector and a true motion vector of each block to be encoded;

locating, in a reference frame of the frame to be encoded, a first prediction block corresponding to each block to be encoded based on the predicted motion vector of that block, and locating, in the reference frame of the frame to be encoded, a second prediction block corresponding to each block to be encoded based on the true motion vector of that block;

and determining, according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block, whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded.

With the above embodiment, the first prediction block and the second prediction block corresponding to each block to be encoded in the frame to be encoded are determined based on the predicted motion vector and the true motion vector, respectively. The residual between each block to be encoded and its corresponding first prediction block reflects the accuracy of inter prediction based on the predicted motion vector, and the residual between each block to be encoded and its corresponding second prediction block reflects the accuracy of inter prediction based on the true motion vector. Whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded is then determined according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block. In this way, whether the motion vector residuals of the blocks to be encoded in the frame to be encoded need to be encoded can be decided flexibly according to the actual condition of the frame, which gives good adaptivity, improves the coding effect, and reduces the video bit rate without affecting the video image quality.
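By way of illustration only, locating the first and second prediction blocks and forming the two residuals may be sketched as follows. The function names, the list-of-rows frame representation, and the sample values are assumptions made for this sketch; integer-pixel motion is assumed and no boundary handling is shown.

```python
from typing import List, Tuple

Block = List[List[int]]


def locate_block(ref: Block, top: int, left: int, size: int,
                 mv: Tuple[int, int]) -> Block:
    """Return the size x size block of `ref` displaced by mv = (dx, dy).

    Assumes the displaced block lies entirely inside the reference frame.
    """
    dx, dy = mv
    rows = ref[top + dy:top + dy + size]
    return [row[left + dx:left + dx + size] for row in rows]


def residual(cur: Block, pred: Block) -> Block:
    """Sample-wise difference between the block to be encoded and a prediction block."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(cur, pred)]


ref_frame = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
    [40, 41, 42, 43],
]
cur_block = [[22, 23], [32, 33]]   # 2x2 block to be encoded at (top=1, left=1)
top, left, size = 1, 1, 2

mvp = (0, 0)   # predicted motion vector (illustrative value)
mv = (1, 0)    # true motion vector from motion estimation (illustrative value)

first_pred = locate_block(ref_frame, top, left, size, mvp)
second_pred = locate_block(ref_frame, top, left, size, mv)
first_residual = residual(cur_block, first_pred)
second_residual = residual(cur_block, second_pred)
```

In this toy example the true motion vector yields an all-zero residual, while the predicted motion vector leaves a residual of 1 in every sample; the method compares residuals of this kind to decide whether coding MV − MVP is worthwhile.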

In a possible implementation manner of the first aspect, the determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block includes:

calculating a first inter-frame prediction cost of the frame to be encoded according to the residuals between each block to be encoded and its corresponding first prediction block;

calculating a second inter-frame prediction cost of the frame to be encoded according to the residuals between each block to be encoded and its corresponding second prediction block;

and determining, according to the magnitude relation between the first inter-frame prediction cost and the second inter-frame prediction cost, whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded.

According to the embodiment, the accuracy of inter-frame prediction based on the predicted motion vector is measured by using the first inter-frame prediction cost, the accuracy of inter-frame prediction based on the real motion vector is measured by using the second inter-frame prediction cost, and whether the accuracy of inter-frame prediction based on the predicted motion vector meets the requirement can be judged according to the magnitude relation between the first inter-frame prediction cost and the second inter-frame prediction cost, so that whether the residual error between the predicted motion vector and the real motion vector of each block to be coded needs to be coded can be determined.
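By way of illustration only, the two frame-level costs may be sketched as follows. This passage does not fix a particular cost metric, so the sum of absolute differences (SAD) used here is an assumption, as are the function names and sample residuals.

```python
from typing import List

Block = List[List[int]]


def block_sad(residual_block: Block) -> int:
    """Sum of absolute differences of one residual block (assumed cost metric)."""
    return sum(abs(v) for row in residual_block for v in row)


def frame_cost(residual_blocks: List[Block]) -> int:
    """Frame-level inter prediction cost: block costs summed over the frame."""
    return sum(block_sad(b) for b in residual_blocks)


# Residuals of two blocks against their first prediction blocks (located by MVP) ...
first_residuals = [[[1, -1], [2, 0]], [[3, 0], [0, -3]]]
# ... and against their second prediction blocks (located by the true MV).
second_residuals = [[[0, 0], [1, 0]], [[0, -1], [0, 0]]]

first_cost = frame_cost(first_residuals)    # 4 + 6 = 10
second_cost = frame_cost(second_residuals)  # 1 + 1 = 2
```

A transform-domain metric could be substituted for `block_sad` without changing the structure of the comparison.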

In a possible implementation manner of the first aspect, the determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the magnitude relation between the first inter-frame prediction cost and the second inter-frame prediction cost includes:

determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded if a first condition is satisfied, the first condition comprising: the first inter-frame prediction cost is less than the second inter-frame prediction cost; or

determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded if a second condition is satisfied, the second condition comprising: the first inter-frame prediction cost is greater than the second inter-frame prediction cost, and the difference between the first inter-frame prediction cost and the second inter-frame prediction cost is greater than a first threshold; or

determining, if neither the first condition nor the second condition is satisfied, whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the true motion vector of each block to be encoded.

With the above embodiment, when the first inter-frame prediction cost is less than the second inter-frame prediction cost, it can be judged that the accuracy of inter prediction based on the predicted motion vector meets the requirement, so it can be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded, thereby saving bit rate. When the first inter-frame prediction cost is greater than the second inter-frame prediction cost and the difference between the two is greater than the first threshold, it can be judged that the accuracy of inter prediction based on the predicted motion vector does not meet the requirement, so it can be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded, thereby improving the accuracy of inter prediction.
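By way of illustration only, the three-way outcome of the cost comparison may be sketched as follows. The enum, the function name, and the numeric values are assumptions introduced for this sketch; the application does not give a value for the first threshold.

```python
from enum import Enum


class CostDecision(Enum):
    SKIP_MVD = "residual between MVP and MV is not encoded"
    ENCODE_MVD = "residual between MVP and MV is encoded"
    CHECK_MOTION = "decide from the true motion vectors"


def compare_costs(first_cost: float, second_cost: float,
                  first_threshold: float) -> CostDecision:
    """Apply the first and second conditions to the two frame-level costs."""
    # First condition: predicting with MVP is already cheaper.
    if first_cost < second_cost:
        return CostDecision.SKIP_MVD
    # Second condition: MVP is worse by more than the first threshold.
    if first_cost > second_cost and first_cost - second_cost > first_threshold:
        return CostDecision.ENCODE_MVD
    # Neither condition holds: fall back to the motion-based check.
    return CostDecision.CHECK_MOTION


print(compare_costs(900, 1000, first_threshold=100).name)
print(compare_costs(1200, 1000, first_threshold=100).name)
print(compare_costs(1050, 1000, first_threshold=100).name)
```

Equal costs satisfy neither condition and therefore also fall through to the motion-based check.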

In a possible implementation manner of the first aspect, the determining, according to the true motion vector of each block to be encoded, whether a residual error between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded includes:

obtaining the absolute displacement value of each block to be coded and the average absolute displacement value of all the blocks to be coded according to the real motion vector of each block to be coded;

determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded if a third condition is satisfied, the third condition comprising: the average displacement absolute value is greater than a second threshold, or the number of blocks to be encoded whose displacement absolute value is greater than a third threshold is greater than a first number.

With the above embodiment, when the average displacement absolute value of all the blocks to be encoded in the frame to be encoded is greater than the second threshold, or the number of blocks to be encoded whose displacement absolute value is greater than the third threshold is greater than the first number, it can be judged that the motion of the frame to be encoded relative to its reference frame is relatively intense, so that the predicted motion vector of each block to be encoded differs considerably from its true motion vector. It can therefore be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded, so as to improve the accuracy of inter prediction.
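By way of illustration only, the third condition may be sketched as follows. The function and parameter names and the numeric thresholds are assumptions introduced for this sketch; the per-block displacement absolute values and their average are taken as inputs here.

```python
from typing import Sequence


def third_condition(average_displacement: float,
                    displacements: Sequence[float],
                    second_threshold: float,
                    third_threshold: float,
                    first_number: int) -> bool:
    """True when the frame is judged to move intensely relative to its reference."""
    # Count blocks whose displacement absolute value exceeds the third threshold.
    large_motion_blocks = sum(1 for d in displacements if d > third_threshold)
    return (average_displacement > second_threshold
            or large_motion_blocks > first_number)


displacements = [6, 6, 9, 3]   # per-block displacement absolute values
average = 6.0                  # average displacement absolute value of the frame

# The average (6.0) is below 8, but three blocks exceed 5, and 3 > 2.
intense = third_condition(average, displacements,
                          second_threshold=8, third_threshold=5, first_number=2)
```

Either branch alone is sufficient: a frame with modest average motion still satisfies the condition if enough individual blocks move strongly.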

In a possible implementation manner of the first aspect, the obtaining, according to the true motion vector of each block to be encoded, a displacement absolute value of each block to be encoded and an average displacement absolute value of all blocks to be encoded includes:

obtaining a horizontal displacement absolute value and a vertical displacement absolute value of each block to be encoded according to the true motion vector of each block to be encoded;

obtaining the displacement absolute value of each block to be encoded according to the sum of the horizontal displacement absolute value and the vertical displacement absolute value of that block;

and obtaining the average displacement absolute value according to the sum of the average horizontal displacement absolute value and the average vertical displacement absolute value of all the blocks to be encoded.

According to the embodiment, the displacement absolute value of each block to be coded is obtained according to the sum of the horizontal displacement absolute value and the vertical displacement absolute value of each block to be coded, and the displacement absolute value is obtained by comprehensively considering the motion conditions in the horizontal direction and the vertical direction, so that the motion intensity of each block to be coded relative to the corresponding reference block can be accurately reflected. And obtaining an average displacement absolute value according to the sum of the average horizontal displacement absolute value and the average vertical displacement absolute value of all the blocks to be coded, wherein the average displacement absolute value is obtained by comprehensively considering the motion conditions of all the blocks to be coded in the horizontal direction and the vertical direction in the frames to be coded, so that the motion intensity of the frames to be coded relative to the corresponding reference frames can be accurately reflected.
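By way of illustration only, the displacement absolute values described above may be computed as follows. The function names and the sample motion vectors are assumptions introduced for this sketch; each true motion vector is represented as an (x, y) pair.

```python
from typing import List, Tuple

MV = Tuple[int, int]  # (horizontal displacement, vertical displacement)


def displacement_abs(mv: MV) -> int:
    """Displacement absolute value of one block: |x| + |y|."""
    return abs(mv[0]) + abs(mv[1])


def average_displacement_abs(mvs: List[MV]) -> float:
    """Average horizontal |x| plus average vertical |y| over all blocks."""
    n = len(mvs)
    avg_horizontal = sum(abs(x) for x, _ in mvs) / n
    avg_vertical = sum(abs(y) for _, y in mvs) / n
    return avg_horizontal + avg_vertical


true_mvs = [(4, -2), (0, 6), (-8, 1), (2, -1)]
per_block = [displacement_abs(mv) for mv in true_mvs]   # [6, 6, 9, 3]
frame_average = average_displacement_abs(true_mvs)      # 3.5 + 2.5 = 6.0
```

Because the per-block value is itself a sum of the two components, `frame_average` equals the mean of `per_block`.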

In a possible implementation manner of the first aspect, the determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block further includes:

calculating a first variance, namely the variance of the inter prediction residuals of the frame to be encoded, according to the residuals between each block to be encoded and its corresponding second prediction block;

determining, from the frame to be encoded, a third prediction block corresponding to each block to be encoded, and calculating a second variance, namely the variance of the intra prediction residuals of the frame to be encoded, according to the residuals between each block to be encoded and its corresponding third prediction block;

determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded if a fourth condition is satisfied, the fourth condition comprising: the first variance is greater than the second variance;

the determining whether the residual error between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the magnitude relation between the first inter-frame prediction cost and the second inter-frame prediction cost comprises:

And if the fourth condition is not met, determining whether residual errors between the predicted motion vectors and the actual motion vectors of the blocks to be encoded need to be encoded according to the magnitude relation between the first inter-frame prediction cost and the second inter-frame prediction cost.

With the above embodiment, when the variance of the inter prediction residuals of the frame to be encoded (i.e., the first variance) is greater than the variance of its intra prediction residuals (i.e., the second variance), it can be judged that the frame to be encoded changes greatly relative to its reference frame. In this case, further improving the accuracy of the motion vectors does not help to reduce the prediction residuals, so it can be determined that the residual between the predicted motion vector and the true motion vector does not need to be encoded, thereby saving bit rate.
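By way of illustration only, the two variances and the fourth condition may be sketched as follows. The passage does not state how the frame-level variance is formed, so this sketch assumes the population variance over all residual samples of the frame; the function names and sample residuals are likewise assumptions.

```python
from typing import List

Block = List[List[int]]


def frame_residual_variance(residual_blocks: List[Block]) -> float:
    """Population variance over every residual sample of the frame (assumed definition)."""
    samples = [v for block in residual_blocks for row in block for v in row]
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def fourth_condition(first_variance: float, second_variance: float) -> bool:
    """True when inter residuals vary more than intra residuals."""
    return first_variance > second_variance


# Residuals against the second prediction blocks (inter, located by the true MV).
inter_residuals = [[[4, -4], [4, -4]], [[2, -2], [-2, 2]]]
# Residuals against the third prediction blocks (intra, from the frame itself).
intra_residuals = [[[1, -1], [1, -1]], [[1, -1], [-1, 1]]]

first_variance = frame_residual_variance(inter_residuals)    # 80 / 8 = 10.0
second_variance = frame_residual_variance(intra_residuals)   # 8 / 8 = 1.0
skip_mvd = fourth_condition(first_variance, second_variance)
```

Here inter prediction performs worse than intra prediction, so under the fourth condition the motion vector residual would not be encoded.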

In a possible implementation manner of the first aspect, the method further includes: in case said third condition is not fulfilled,

determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded when a fifth condition is satisfied, the fifth condition comprising: the first variance is less than or equal to the second variance, and the first variance is greater than a fourth threshold; or

determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded when the fifth condition is not satisfied.

With the above embodiment, when the variance of the inter prediction residuals of the frame to be encoded (i.e., the first variance) is less than or equal to the variance of its intra prediction residuals (i.e., the second variance), and the first variance is greater than the fourth threshold, it can be judged that the frame to be encoded does not change greatly relative to its reference frame, but the accuracy of inter prediction based on the predicted motion vector does not meet the requirement. It can therefore be determined that the residual between the predicted motion vector and the true motion vector needs to be encoded, so as to improve the accuracy of inter prediction.
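By way of illustration only, the fifth condition, which is evaluated when the third condition is not satisfied, may be sketched as follows. The function name and the numeric values are assumptions introduced for this sketch.

```python
def fifth_condition(first_variance: float, second_variance: float,
                    fourth_threshold: float) -> bool:
    """True when the MV residual should be encoded after the third condition failed.

    first_variance:  variance of the inter prediction residuals of the frame
    second_variance: variance of the intra prediction residuals of the frame
    """
    return (first_variance <= second_variance
            and first_variance > fourth_threshold)


# Inter residuals are no worse than intra residuals, yet still above the threshold:
encode_mvd = fifth_condition(first_variance=60.0, second_variance=100.0,
                             fourth_threshold=50.0)

# Inter residuals are already small: the residual between MVP and MV is not encoded.
skip_mvd = not fifth_condition(first_variance=40.0, second_variance=100.0,
                               fourth_threshold=50.0)
```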

In a possible implementation manner of the first aspect, the method further includes: determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded if a sixth condition is satisfied, the sixth condition comprising: the number of frames in the group of pictures containing the frame to be encoded is less than a second number;

the determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block includes:

determining, if the sixth condition is not satisfied, whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block.

With the above embodiment, when the number of frames in the group of pictures containing the frame to be encoded is less than the second number, the accumulated motion vector residual is considered to have little effect on the accuracy of inter prediction, so it can be determined that the residual between the predicted motion vector and the true motion vector does not need to be encoded, thereby saving bit rate.
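By way of illustration only, the six conditions of the first aspect may be chained into one frame-level decision as sketched below. The ordering follows the "if the ... condition is not satisfied" wording of the implementation manners above; the data classes, field names, and threshold values are assumptions introduced for this sketch, and the per-frame statistics are taken as already computed.

```python
from dataclasses import dataclass, replace


@dataclass
class FrameStats:
    gop_frames: int            # number of frames in the GOP containing the frame
    first_cost: float          # inter prediction cost using MVP
    second_cost: float         # inter prediction cost using the true MV
    first_variance: float      # variance of inter prediction residuals
    second_variance: float     # variance of intra prediction residuals
    avg_displacement: float    # average displacement absolute value
    large_motion_blocks: int   # blocks whose displacement exceeds the third threshold


@dataclass
class Thresholds:
    first_threshold: float
    second_threshold: float
    fourth_threshold: float
    first_number: int
    second_number: int


def should_encode_mvd(s: FrameStats, t: Thresholds) -> bool:
    """Return True when the residual between MVP and MV should be encoded."""
    if s.gop_frames < t.second_number:                       # sixth condition
        return False
    if s.first_variance > s.second_variance:                 # fourth condition
        return False
    if s.first_cost < s.second_cost:                         # first condition
        return False
    if (s.first_cost > s.second_cost
            and s.first_cost - s.second_cost > t.first_threshold):  # second condition
        return True
    if (s.avg_displacement > t.second_threshold
            or s.large_motion_blocks > t.first_number):      # third condition
        return True
    return (s.first_variance <= s.second_variance
            and s.first_variance > t.fourth_threshold)       # fifth condition


limits = Thresholds(first_threshold=100, second_threshold=8, fourth_threshold=50,
                    first_number=10, second_number=8)
base = FrameStats(gop_frames=16, first_cost=1050, second_cost=1000,
                  first_variance=60, second_variance=100,
                  avg_displacement=2, large_motion_blocks=0)

print(should_encode_mvd(base, limits))                         # fifth condition holds
print(should_encode_mvd(replace(base, gop_frames=4), limits))  # sixth condition holds
```

Keeping the statistics in one record separates the measurement steps from the decision logic, so each condition can be tested in isolation by changing a single field.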

In a second aspect, an embodiment of the present application provides a video encoding apparatus, including:

An obtaining unit, configured to obtain a frame to be encoded, where the frame to be encoded includes at least one block to be encoded;

A determining unit, configured to determine a predicted motion vector and a true motion vector of each block to be encoded;

A prediction unit, configured to locate, in a reference frame of the frame to be encoded, a first prediction block corresponding to each block to be encoded based on a predicted motion vector of each block to be encoded, and locate, in a reference frame of the frame to be encoded, a second prediction block corresponding to each block to be encoded based on a true motion vector of each block to be encoded;

And the coding unit is used for determining whether the residual error between the predicted motion vector and the actual motion vector of each block to be coded is needed to be coded according to the residual error between each block to be coded and the corresponding first predicted block and/or the residual error between each block to be coded and the corresponding second predicted block.

In a possible implementation manner of the second aspect, the encoding unit is specifically configured to, when determining whether a residual between a predicted motion vector and a true motion vector of each block to be encoded is needed to be encoded according to a residual between each block to be encoded and a first prediction block corresponding to the block to be encoded and/or a residual between each block to be encoded and a second prediction block corresponding to the block to be encoded:

In a possible implementation manner of the second aspect, the encoding unit is specifically configured to, when determining whether a residual error between a predicted motion vector and a true motion vector of each block to be encoded needs to be encoded according to a magnitude relation between the first inter-prediction cost and the second inter-prediction cost:

In a possible implementation manner of the second aspect, the encoding unit is specifically configured to, when determining whether a residual error between a predicted motion vector and a true motion vector of each block to be encoded needs to be encoded according to the true motion vector of each block to be encoded:

In a possible implementation manner of the second aspect, the encoding unit is specifically configured to, when obtaining, according to a true motion vector of each block to be encoded, an absolute value of a displacement of each block to be encoded and an absolute value of an average displacement of all blocks to be encoded:

In a possible implementation manner of the second aspect, the encoding unit is further configured to: in case said third condition is not fulfilled,

In a possible implementation manner of the second aspect, the encoding unit is further configured to:

determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded if a sixth condition is satisfied, the sixth condition comprising: the number of frames in the group of pictures containing the frame to be encoded is less than a second number;

the coding unit is specifically configured to, when determining whether to code a residual error between a predicted motion vector and a true motion vector of each block to be coded according to the residual error between each block to be coded and a first prediction block corresponding to the block to be coded and/or the residual error between each block to be coded and a second prediction block corresponding to the block to be coded:

In a third aspect, an embodiment of the present application provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor implements the method in the first aspect and any one of possible implementation manners of the first aspect when executing the computer program.

In a fourth aspect, embodiments of the present application provide a computer readable storage medium having a computer program stored therein, which when executed by a processor, implements the method of the first aspect and any of its possible implementation manners.

In a fifth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements the method of the first aspect and any one of its possible implementation manners.

For the advantages of the second to fifth aspects, reference may be made to the description of the advantages of the first aspect, which is not repeated here.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.

Drawings

In order to more clearly describe the embodiments of the present application or the technical solutions in the background art, the following description will describe the drawings that are required to be used in the embodiments of the present application or the background art.

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.

FIG. 1 is a schematic diagram of a GOP;

FIG. 2 is a schematic diagram of a conventional P frame;

FIG. 3 is a schematic diagram of a B frame;

FIG. 4 is a schematic diagram of a generalized P frame;

FIG. 5 is a schematic diagram of an application environment of a video encoding method according to an embodiment of the present application;

FIG. 6 is a schematic flowchart of a video encoding method according to an embodiment of the present application;

FIG. 7 is a schematic diagram of adjacent blocks according to an embodiment of the present application;

FIG. 8 is a schematic diagram of motion estimation according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a residual block according to an embodiment of the present application;

FIG. 10 is a schematic flowchart of another video encoding method according to an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application;

FIG. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order that those skilled in the art may better understand the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Apparently, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of protection of the application.

The terms "first," "second," and the like in embodiments of the present application are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.

It should be understood that, in the present application, "at least one (item)" means one or more, "a plurality" means two or more, and "at least two (items)" means two, or three or more. The term "and/or" describes an association between associated objects and covers three relationships; for example, "A and/or B" may mean: only A is present, only B is present, or both A and B are present, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following items" or a similar expression refers to any combination of these items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or plural.

Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor do they refer to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art will understand, both explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.

In a short-video service scenario, a video uploaded by a creator needs to be transcoded by a server, and the transcoded video is sent to users for viewing. Prediction techniques in video coding are classified into intra prediction and inter prediction. Inter prediction is further classified into forward prediction and bi-prediction according to the prediction direction. According to the prediction mode, input video frames are generally classified into intra-predicted frames (I frames), forward-predicted frames (P frames), and bi-predicted frames (B frames). Specifically, an image encoded with intra prediction is referred to as an I frame, an image encoded with forward prediction is referred to as a P frame, and an image encoded with bi-prediction is referred to as a B frame.

In video coding, a group of pictures (GOP) is the basic unit of inter prediction. Illustratively, a GOP is made up of a series of consecutive video frames, in which the first frame is a key frame (an I frame or a P frame) and the subsequent frames are B frames. Each video frame in the GOP has a temporal identifier (temporal_id) that indicates the position and temporal relationship of the frame within the GOP. Frames with a smaller temporal_id are more important because they are key frames or reference frames, which have a significant impact on video quality and compression efficiency. During encoding, the encoder may determine the encoding order and the reference relationships of the frames according to their temporal_id values to achieve the best compression. During decoding, the decoder can restore the temporal relationship of the frames according to their temporal_id values so that the video plays correctly. Referring to FIG. 1, FIG. 1 is a schematic diagram of a GOP. As shown in FIG. 1, the length of the GOP (that is, the number of video frames constituting the GOP) is 16, and the GOP is divided by bisection to obtain the temporal_id of each frame, where levels 0 to 4 on the left side represent the temporal_id.
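By way of illustration only, one way to assign temporal_id values by repeated bisection of a GOP is sketched below. The figure itself is not reproduced here, so the exact layout is an assumption: the sketch places the GOP boundary frames at level 0 and gives each successive midpoint the next level, which yields levels 0 to 4 for a GOP length of 16. The function name is hypothetical.

```python
def temporal_id(frame_index: int, gop_length: int = 16) -> int:
    """Temporal level of a frame under repeated bisection of the GOP.

    Assumes gop_length is a power of two and frame_index counts from the
    start of the GOP (index 0 and index gop_length are boundary frames).
    """
    if frame_index % gop_length == 0:
        return 0                      # key frame at a GOP boundary
    level, step = 1, gop_length // 2
    # Halve the interval until the frame index falls on a split point.
    while frame_index % step != 0:
        step //= 2
        level += 1
    return level


levels = [temporal_id(i) for i in range(17)]
print(levels)
# [0, 4, 3, 4, 2, 4, 3, 4, 1, 4, 3, 4, 2, 4, 3, 4, 0]
```

Under this layout the middle frame (index 8) is at level 1 and every odd-indexed frame is at the highest level, so frames at lower levels serve as references for more of the GOP.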

Referring to FIG. 2, FIG. 2 is a schematic diagram of a conventional P frame. As shown in FIG. 2, the conventional P frame has only one reference frame. In the prediction process, for the current block shown by the hatched area in the conventional P frame, the final prediction block of the current block is generated using the block to which a motion vector (denoted MV0) points.

Referring to FIG. 3, FIG. 3 is a schematic diagram of a B frame. As shown in FIG. 3, the B frame has two reference frames (referred to as reference frame 0 and reference frame 1, respectively); reference frame 0 is a picture before the B frame, and reference frame 1 is a picture after the B frame. In the prediction process, for the current block shown by the shaded area in the B frame, one prediction block of the current block is generated using the block pointed to by one motion vector (denoted MV0), another prediction block of the current block is generated using the block pointed to by another motion vector (denoted MV1), and a weighted average of the two prediction blocks is then used as the final prediction block of the current block.

To improve the coding efficiency of the conventional P frame, a generalized P frame may be introduced. A generalized P frame is a P frame that is predicted in a bi-prediction manner similar to that of a B frame. In this prediction mode, a P frame is predicted using two reference frames. Referring to fig. 4, fig. 4 is a schematic diagram of a generalized P frame. As shown in fig. 4, the generalized P frame has two reference frames (respectively referred to as reference frame 0 and reference frame 1), and both reference frame 0 and reference frame 1 are images before the generalized P frame. In the prediction process, for a current block shown by a shaded area in the generalized P frame, a prediction block of the current block is generated by using a block pointed to by a motion vector (referred to as MV 0), another prediction block of the current block is generated by using a block pointed to by another motion vector (referred to as MV 1), and then a weighted average of the two prediction blocks is used as a final prediction block of the current block.
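The weighted averaging of two prediction blocks, used by both B frames and generalized P frames, might be sketched as follows. The function name `bi_predict`, the default equal weights, and the rounding rule are assumptions for illustration; the text does not specify the weights.

```python
def bi_predict(block0, block1, w0=0.5, w1=0.5):
    """Blend two equally sized prediction blocks sample by sample.

    block0 / block1: 2-D lists of non-negative samples located by MV 0 / MV 1.
    w0 / w1: hypothetical weights (equal weighting is assumed by default).
    """
    same_rows = len(block0) == len(block1)
    if not same_rows or any(len(a) != len(b) for a, b in zip(block0, block1)):
        raise ValueError("prediction blocks must have the same shape")
    # int(v + 0.5) rounds half up for the non-negative samples assumed here.
    return [
        [int(w0 * a + w1 * b + 0.5) for a, b in zip(row0, row1)]
        for row0, row1 in zip(block0, block1)
    ]


final_block = bi_predict([[100, 50]], [[102, 53]])
print(final_block)
```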

Based on the above, the embodiment of the application provides a video coding method to improve coding effect.

The video encoding method may be performed by a video encoding apparatus. For example, the video encoding method may be performed by a terminal device (or a terminal), a server, or another processing device, where the terminal device may be, but is not limited to, a personal computer, a notebook computer, a tablet computer, a smartphone, or a portable wearable device, and the server may be implemented by a stand-alone server or a server cluster composed of a plurality of servers. In some possible implementations, the video encoding method may be implemented by way of a processor invoking computer readable instructions stored in a memory.

Referring to fig. 5, fig. 5 is a schematic view of an application environment of a video encoding method according to an embodiment of the present application. The application environment relates to a first terminal 501, a server 502 and a second terminal 503. Wherein the server 502 may be communicatively connected to the first terminal 501 and the second terminal 503, respectively.

Illustratively, the first terminal 501 may be a terminal corresponding to a video creator, and the second terminal 503 may be a terminal corresponding to a video viewer. The video creator may upload the video to the video platform through the first terminal 501, the server 502 may be a server where the video platform is located, and the server 502 may transcode the video uploaded by the video creator and send the transcoded video to the second terminal 503 for viewing by a video viewer.

Referring to fig. 6, fig. 6 is a flowchart illustrating a video encoding method according to an embodiment of the application, and the video encoding method can be applied to the server 502 in fig. 5. As shown in fig. 6, the video encoding method includes the following steps S601 to S604.

S601, obtaining a frame to be encoded, wherein the frame to be encoded comprises at least one block to be encoded.

The frame to be encoded refers to a video frame to be encoded. A video may be composed of a plurality of video frames, and each video frame in the video may be encoded after the video to be encoded is obtained, that is, each video frame in the video may be regarded as a frame to be encoded. In an exemplary embodiment of the present application, taking a frame to be encoded as a generalized P-frame as an example, encoding of a motion vector corresponding to one reference frame of two associated reference frames is described.

A block to be encoded refers to an area (which may also be referred to as a block, or a block of pixels) of a certain size in a frame to be encoded. A video frame may be composed of a plurality of pixel blocks, and after a frame to be encoded is obtained, each pixel block in the frame to be encoded may be encoded, that is, each pixel block in the frame to be encoded may be used as a block to be encoded. By way of example, the size of the block to be encoded may be 8 x 8, i.e. the width and height of the block to be encoded each comprise 8 pixels.

S602, determining a predicted motion vector and a true motion vector of each block to be coded.

The predicted motion vector refers to a predicted value of the motion vector, and the true motion vector refers to a true value of the motion vector. For any block to be encoded, the predicted motion vector and the true motion vector of the block to be encoded may be determined according to the motion vectors of neighboring blocks of the block to be encoded, where the neighboring blocks of the block to be encoded refer to pixel blocks neighboring the block to be encoded, and the neighboring blocks may include spatial neighboring blocks, temporal neighboring blocks, or both spatial and temporal neighboring blocks.

Referring to fig. 7, fig. 7 is a schematic diagram of an adjacent block according to an embodiment of the application. As shown in fig. 7, the right side represents a frame to be encoded (or called a current frame), the left side represents a reference frame of the frame to be encoded, and a hatched area in the frame to be encoded represents a block to be encoded (or called a current block). The spatial neighboring blocks of the block to be encoded refer to blocks adjacent to the block to be encoded in the frame to be encoded, for example, blocks corresponding to A0, A1, B0, B1 or B2. The time-domain adjacent block of the block to be encoded refers to a block adjacent to the block to be encoded, for example, a block corresponding to C0 or C1, in the reference frame of the frame to be encoded.

In one possible implementation manner, for any block to be encoded, a neighboring block closest to the block to be encoded may be selected from neighboring blocks of the block to be encoded as a target neighboring block, and a true motion vector of the target neighboring block is used as a prediction motion vector of the block to be encoded.

Specifically, a pixel difference (or error) between each neighboring block of the block to be encoded and the block to be encoded may be calculated, and the neighboring block with the minimum error relative to the block to be encoded is determined as the neighboring block closest to the block to be encoded, thereby determining the target neighboring block. Alternatively, the error between a neighboring block and the block to be encoded may be measured using the sum of absolute differences (Sum of Absolute Differences, SAD) or the sum of absolute values of Hadamard transform coefficients (Sum of Absolute Transformed Differences, SATD).
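The selection of the target neighboring block can be sketched as below, using SAD as the error measure. The names `sad` and `predict_mv`, and the representation of each neighbor as a `(pixel_block, true_mv)` pair, are illustrative assumptions.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def predict_mv(current_block, neighbors):
    """Return the true MV of the neighbor whose pixels are closest (by SAD).

    neighbors: list of (pixel_block, true_mv) pairs for the spatial and/or
    temporal neighboring blocks (a hypothetical data layout).
    """
    if not neighbors:
        raise ValueError("at least one neighboring block is required")
    _, best_mv = min(neighbors, key=lambda n: sad(current_block, n[0]))
    return best_mv


current = [[10, 10], [10, 10]]
neighbors = [
    ([[0, 0], [0, 0]], (4, 0)),       # SAD = 40
    ([[10, 11], [10, 10]], (1, -2)),  # SAD = 1, so this one is the target
]
print(predict_mv(current, neighbors))
```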

In one possible implementation, after obtaining the predicted motion vector of any block to be encoded, the true motion vector of the block to be encoded may be determined by motion estimation. Referring to fig. 8, fig. 8 is a schematic diagram of motion estimation according to an embodiment of the application. As shown in fig. 8, the right side represents a frame to be encoded (or called a current frame), the left side represents a reference frame of the frame to be encoded, and a hatched area in the frame to be encoded represents a block to be encoded (or called a current block). A block (denoted as a first pixel block, as indicated by a cross-hatched area in the reference frame) may be located in the reference frame of the frame to be encoded based on the predicted motion vector of the block to be encoded, and a block (denoted as a second pixel block, as indicated by a vertical hatched area in the reference frame) closest to the block to be encoded may be found in the vicinity of the first pixel block, and then the true motion vector of the block to be encoded may be obtained based on the relative position of the second pixel block and the block to be encoded.

Specifically, the search area may be determined based on the first pixel block, and the search area may be, for example, a search window centered on the first pixel block as shown by a dashed box on the left side of fig. 8, where the size of the search window may be set in connection with an actual requirement or situation, which is not limited by the embodiment of the present application. After determining the search window, all possible pixel blocks in the search window may be traversed, errors between each pixel block and the block to be encoded are calculated, and a pixel block having the smallest error with the block to be encoded is determined to be the block closest to the block to be encoded, thereby determining a second pixel block. Alternatively, SAD or SATD may be used to measure the error between the above pixel block and the block to be encoded.
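The search-window traversal can be sketched as an exhaustive integer-pixel search centered on the first pixel block. This is a simplified illustration under stated assumptions: the names `motion_search`, `extract`, and `radius` are hypothetical, SAD is used as the error measure, and sub-pixel refinement and fast search strategies are not modeled.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def extract(frame, x, y, size):
    """Return the size x size block whose top-left sample is at (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def motion_search(cur_frame, ref_frame, bx, by, size, pmv, radius):
    """Full search around the block located by the predicted MV.

    Returns (true_mv, cost), where true_mv is relative to the position
    (bx, by) of the block to be encoded.
    """
    cur = extract(cur_frame, bx, by, size)
    height, width = len(ref_frame), len(ref_frame[0])
    cx, cy = bx + pmv[0], by + pmv[1]  # top-left of the first pixel block
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = cx + dx, cy + dy
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue  # candidate falls outside the reference frame
            cost = sad(cur, extract(ref_frame, x, y, size))
            if best is None or cost < best[0]:
                best = (cost, (x - bx, y - by))
    if best is None:
        raise ValueError("search window lies entirely outside the frame")
    return best[1], best[0]
```

The candidate with the smallest error plays the role of the second pixel block, and its offset from the block to be encoded is taken as the true motion vector.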

S603, positioning a first prediction block corresponding to each block to be coded in a reference frame of the frame to be coded based on the prediction motion vector of each block to be coded, and positioning a second prediction block corresponding to each block to be coded in the reference frame of the frame to be coded based on the real motion vector of each block to be coded.

Wherein the first prediction block refers to a prediction block determined based on a prediction motion vector, and the second prediction block refers to a prediction block determined based on a true motion vector. The first prediction block may be the same as or different from the second prediction block.

For example, as shown in fig. 8, taking a block to be encoded represented by a shaded area in a frame to be encoded as an example, a first prediction block corresponding to the block to be encoded is a first pixel block represented by a cross-hatched area of a reference frame, and a second prediction block corresponding to the block to be encoded is a second pixel block represented by a vertical-hatched area of the reference frame.

S604, determining whether the residual error between the predicted motion vector and the actual motion vector of each block to be coded is needed to be coded according to the residual error between each block to be coded and the corresponding first predicted block and/or the residual error between each block to be coded and the corresponding second predicted block.

The residual between a block to be encoded and its corresponding first prediction block can be understood as the prediction residual resulting from inter prediction based on the predicted motion vector. It reflects the accuracy of inter prediction based on the predicted motion vector, and can therefore be used for determining whether the residual between the predicted motion vector and the true motion vector (or simply, the motion vector residual) needs to be encoded. For example, the smaller the residual between the block to be encoded and its corresponding first prediction block, the higher the accuracy of inter prediction based on the predicted motion vector may be considered to be, and thus it may be considered that the residual between the predicted motion vector and the true motion vector does not need to be encoded. For another example, the larger the residual between the block to be encoded and its corresponding first prediction block, the lower the accuracy of inter prediction based on the predicted motion vector, and thus it may be considered that the residual between the predicted motion vector and the true motion vector needs to be encoded.

The residual between a block to be encoded and its corresponding second prediction block can be understood as the prediction residual resulting from inter prediction based on the true motion vector. It reflects the accuracy of inter prediction based on the true motion vector, and hence the true inter-frame difference, and can therefore also be used for determining whether the residual between the predicted motion vector and the true motion vector needs to be encoded. For example, when the residual between the block to be encoded and its corresponding second prediction block is large to some extent, the inter-frame difference may be considered to be inherently large, in which case continuing to improve the accuracy of the motion vector does not greatly help to reduce the prediction residual, and thus it may be considered that the residual between the predicted motion vector and the true motion vector does not need to be encoded.

In one possible implementation, the value of an inter syntax element (denoted mvdL1Zero) is used to indicate whether the residual between the predicted motion vector and the true motion vector needs to be encoded. Accordingly, the value of mvdL1Zero may be set according to the residual between each block to be encoded and its corresponding first prediction block, and/or the residual between each block to be encoded and its corresponding second prediction block, so as to determine whether to encode the residual between the predicted motion vector and the true motion vector.

For example, mvdL1Zero taking a first value (e.g., 1) indicates that the residual between the predicted motion vector and the true motion vector does not need to be encoded. Accordingly, the residual between the predicted motion vector and the true motion vector is not encoded at encoding time, which reduces the code rate spent on encoding the motion vector. On the other hand, since inter prediction based on the predicted motion vector may be less accurate than inter prediction based on the true motion vector, the residual between the block to be encoded and its corresponding prediction block may become larger, and thus the code rate for encoding the prediction residual may increase.

For another example, mvdL1Zero taking a second value (e.g., 0) indicates that the residual between the predicted motion vector and the true motion vector needs to be encoded. Accordingly, the residual between the predicted motion vector and the true motion vector is encoded at encoding time, in which case more code rate is spent on encoding the motion vector. On the other hand, since inter prediction based on the true motion vector is more accurate, the residual between the block to be encoded and its corresponding prediction block is smaller, so that the code rate for encoding the prediction residual can be reduced.
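The effect of the two values of mvdL1Zero on what is written to the bitstream and on which motion vector is effectively used for prediction can be sketched as follows. The function name `apply_mvd_l1_zero` and the tuple representation of motion vectors are assumptions for illustration only; the actual bitstream syntax is not modeled.

```python
def apply_mvd_l1_zero(pmv, true_mv, mvd_l1_zero):
    """Return (mvd_to_encode, effective_mv) for one block.

    mvd_l1_zero == 1: no motion vector residual is encoded, so prediction
    falls back to the predicted motion vector.
    mvd_l1_zero == 0: the residual true_mv - pmv is encoded, so prediction
    can use the true motion vector.
    """
    if mvd_l1_zero == 1:
        return None, pmv
    mvd = (true_mv[0] - pmv[0], true_mv[1] - pmv[1])
    return mvd, true_mv


print(apply_mvd_l1_zero((3, 1), (5, 0), 1))  # residual omitted
print(apply_mvd_l1_zero((3, 1), (5, 0), 0))  # residual encoded
```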

In one possible implementation manner, determining whether to encode the residual error between the predicted motion vector and the true motion vector of each block to be encoded according to the residual error between each block to be encoded and the corresponding first prediction block and/or the residual error between each block to be encoded and the corresponding second prediction block may specifically include: calculating a first inter-frame prediction cost of a frame to be encoded according to residual errors between each block to be encoded and a corresponding first prediction block; calculating second inter-frame prediction cost of the frame to be encoded according to residual errors between each block to be encoded and the corresponding second prediction block; and determining whether residual errors between the predicted motion vectors and the actual motion vectors of the blocks to be encoded need to be encoded according to the magnitude relation between the first inter-frame prediction cost and the second inter-frame prediction cost.

The first inter-frame prediction cost may reflect a difference between the block to be encoded and the corresponding first prediction block, so that the difference may be used to measure accuracy of inter-frame prediction based on the prediction motion vector. Illustratively, the smaller the first inter prediction cost, the smaller the difference between the block to be encoded and its corresponding first prediction block, so that it can be considered that the higher the accuracy of inter prediction based on the prediction motion vector; the larger the first inter prediction cost, the larger the difference between the block to be encoded and its corresponding first prediction block, so that it can be considered that the accuracy of inter prediction based on the prediction motion vector is lower.

Alternatively, the SATD may be used to evaluate the residual between each block to be encoded and its corresponding first prediction block, and thus calculate the first inter prediction cost of the frame to be encoded. Specifically, for each block to be encoded, performing hadamard transform on a residual error between the block to be encoded and a first prediction block corresponding to the block to be encoded, summing absolute values of transform coefficients to obtain an SATD of the residual error, and then calculating a first inter-frame prediction cost of a frame to be encoded according to the SATD of the residual error.

Taking the example that the size of the block to be encoded is 8×8, the residual between the block to be encoded and its corresponding first prediction block may be a residual block of 8×8 size. Referring to fig. 9, fig. 9 is a schematic diagram of a residual block according to an embodiment of the application. As shown in fig. 9, the 8×8 residual block may be divided into four 4×4 residual blocks. The SATD of an 8×8 residual block may be calculated using a 4×4 Hadamard transform matrix.

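In one possible implementation, a 4×4 Hadamard transform matrix (denoted H) may be as follows:

$$H = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$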

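The Hadamard transform of a 4×4 residual block may be implemented by matrix multiplication as follows, with H denoting the 4×4 Hadamard transform matrix:

$$C = H \cdot \mathrm{Res} \cdot H^{T}, \qquad \mathrm{Res} = \left(r_{m,n}\right)$$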

Wherein Res represents a 4×4 residual block; r_{m,n} represents the value at (m, n) in Res; C represents the Hadamard transform block corresponding to Res, namely the matrix obtained by applying the Hadamard transform to Res, and C is a matrix of 4×4 size.

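The SATD of an 8×8 residual block may be calculated by the following formula:

$$\mathrm{SATD}_{\mathrm{blk}8\times 8} = \sum_{\mathrm{blkIdx}=0}^{3} \sum_{i=0}^{3} \sum_{j=0}^{3} \left| C_{\mathrm{blkIdx}}(i,j) \right|$$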

Wherein SATD_blk8×8 represents the SATD of a residual block of 8×8 size; blkIdx denotes the index of a 4×4 residual block and takes the values 0, 1, 2, and 3, corresponding respectively to the four 4×4 residual blocks into which the 8×8 residual block is divided; C_blkIdx(i, j) represents the value located at (i, j) in the Hadamard transform block corresponding to the 4×4 residual block with index blkIdx.
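The SATD computation for an 8×8 residual block can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the matrix `H4` is one common ordering of the 4×4 Hadamard matrix, the four sub-blocks are indexed in raster order, and no normalization factor is applied (practical encoders often scale the result). The names `hadamard4` and `satd_8x8` are hypothetical.

```python
# One common (symmetric) form of the 4x4 Hadamard matrix; assumed here.
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def hadamard4(res):
    """C = H * Res * H^T; H4 is symmetric, so H^T equals H4."""
    return matmul4(matmul4(H4, res), H4)


def satd_8x8(res8):
    """Sum |C_blkIdx(i, j)| over the four 4x4 sub-blocks of an 8x8 residual."""
    total = 0
    for blk_idx in range(4):
        oy, ox = (blk_idx // 2) * 4, (blk_idx % 2) * 4  # raster order (assumed)
        sub = [row[ox:ox + 4] for row in res8[oy:oy + 4]]
        coeffs = hadamard4(sub)
        total += sum(abs(v) for row in coeffs for v in row)
    return total


flat_residual = [[1] * 8 for _ in range(8)]
print(satd_8x8(flat_residual))
```

For a constant residual, all energy is concentrated in the (0, 0) coefficient of each 4×4 sub-block, which is why SATD tends to track the cost of coding a residual after a transform more closely than SAD.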

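The first inter prediction cost of a frame to be encoded may be calculated by the following formula, taken here as the average of the SATD values over all 8×8 residual blocks:

$$\mathrm{MvpInterCost} = \frac{1}{N_{\mathrm{blk}8\times 8}} \sum_{k=1}^{N_{\mathrm{blk}8\times 8}} \mathrm{SATD}_{\mathrm{blk}8\times 8}(k)$$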

Wherein MvpInterCost denotes the first inter prediction cost of the frame to be encoded; N_blk8×8 represents the number of residual blocks of 8×8 size, i.e., the number of blocks of 8×8 size in the frame to be encoded; SATD_blk8×8(k) represents the SATD of the k-th residual block of 8×8 size.

The second inter prediction cost may reflect the difference between the block to be encoded and its corresponding second prediction block, thereby being used to measure the accuracy of inter prediction based on the true motion vector. Illustratively, the smaller the second inter prediction cost, the smaller the difference between the block to be encoded and its corresponding second prediction block, so that it can be considered that the accuracy of inter prediction based on the true motion vector is higher; the larger the second inter prediction cost, the larger the difference between the block to be encoded and its corresponding second prediction block, so that the lower the accuracy of inter prediction based on the true motion vector can be considered.

Alternatively, the SATD may be used to evaluate the residual between each block to be encoded and its corresponding second prediction block, thereby obtaining the second inter prediction cost of the frame to be encoded. Specifically, for each block to be encoded, performing hadamard transform on a residual error between the block to be encoded and a second prediction block corresponding to the block to be encoded, summing absolute values of transform coefficients to obtain an SATD of the residual error, and then obtaining a second inter-frame prediction cost of the frame to be encoded according to an average value of the SATD of the residual error.

It should be understood that, the specific calculation manner of the second inter-frame prediction cost may correspond to the relevant description of the calculation manner of the first inter-frame prediction cost in the foregoing, which is not described herein.

The second inter prediction cost may be understood as the optimal inter prediction cost. After the first inter-frame prediction cost and the second inter-frame prediction cost of the frame to be encoded are obtained, whether the accuracy of inter-frame prediction based on the prediction motion vector meets the requirement or not can be judged by comparing the magnitude relation of the first inter-frame prediction cost and the second inter-frame prediction cost, so that whether the residual error between the prediction motion vector and the real motion vector of each block to be encoded needs to be encoded or not is determined. For example, when the accuracy of inter prediction based on the predicted motion vector satisfies the requirement, it may be determined that it is not necessary to encode a residual between the predicted motion vector and the true motion vector of each block to be encoded. For another example, when the accuracy of inter prediction based on the predicted motion vector does not meet the requirement, a residual between the predicted motion vector and the true motion vector of each block to be encoded may be determined to be encoded.

In one possible implementation manner, determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the magnitude relation between the first inter prediction cost and the second inter prediction cost may specifically include: in the case that a first condition is satisfied, determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded, the first condition including: the first inter prediction cost is less than the second inter prediction cost; or, in the case that a second condition is satisfied, determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded, the second condition including: the first inter prediction cost is greater than the second inter prediction cost, and the difference between the first inter prediction cost and the second inter prediction cost is greater than a first threshold; or, in the case that neither the first condition nor the second condition is satisfied, determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the true motion vector of each block to be encoded.

The first inter prediction cost is denoted by MvpInterCost, the second inter prediction cost is denoted by InterCost, and the first condition may be: MvpInterCost < InterCost. When the first condition is satisfied, the accuracy of inter prediction based on the predicted motion vector may be considered to be better than the accuracy of inter prediction based on the true motion vector, so that it may be judged that the accuracy of inter prediction based on the predicted motion vector meets the requirement, and thus it may be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded.

The second condition may be expressed as: MvpInterCost - InterCost > thr1, where thr1 represents the first threshold and thr1 > 0 (so the inequality also implies MvpInterCost > InterCost). It should be understood that the first threshold may be set in connection with actual situations or requirements, which is not limited by the embodiment of the present application. When the second condition is satisfied, the accuracy of inter prediction based on the predicted motion vector can be considered to be worse than the accuracy of inter prediction based on the true motion vector by too large a margin, so that it can be judged that the accuracy of inter prediction based on the predicted motion vector does not meet the requirement, and thus it can be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded.

When neither the first condition nor the second condition is satisfied, that is, MvpInterCost is greater than or equal to InterCost and MvpInterCost - InterCost is less than or equal to thr1, whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded can be further determined according to the true motion vector of each block to be encoded. In other words, whether the motion vector residual needs to be encoded is first determined from the magnitude relation between the first inter prediction cost and the second inter prediction cost; in cases where that relation alone is not decisive, the true motion vectors of the blocks to be encoded are additionally taken into account.

The true motion vector of the block to be encoded can be used to measure the motion intensity of the video content, and thus can be used to determine whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded. For example, when the motion of the video content is intense, it may be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded. For another example, when the motion of the video content is not intense, it may be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded.

In other possible embodiments, the first condition may be: MvpInterCost ≤ InterCost. The second condition may be: MvpInterCost - InterCost ≥ thr1.
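The three-way decision based on the two frame-level costs can be summarized in a short sketch, using the strict-inequality form of the first and second conditions. The enum `MvdDecision` and the function `decide_from_costs` are hypothetical names, and the threshold value in the usage lines is arbitrary.

```python
from enum import Enum


class MvdDecision(Enum):
    SKIP_MVD = "skip"              # motion vector residual is not encoded
    ENCODE_MVD = "encode"          # motion vector residual is encoded
    CHECK_MOTION = "check_motion"  # undecided: inspect the true motion vectors


def decide_from_costs(mvp_inter_cost, inter_cost, thr1):
    """Apply the first and second conditions to the two frame-level costs."""
    if thr1 <= 0:
        raise ValueError("thr1 is expected to be positive")
    if mvp_inter_cost < inter_cost:         # first condition
        return MvdDecision.SKIP_MVD
    if mvp_inter_cost - inter_cost > thr1:  # second condition
        return MvdDecision.ENCODE_MVD
    return MvdDecision.CHECK_MOTION         # neither condition holds


print(decide_from_costs(90.0, 100.0, thr1=20.0))
print(decide_from_costs(150.0, 100.0, thr1=20.0))
print(decide_from_costs(110.0, 100.0, thr1=20.0))
```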

In one possible implementation manner, determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the true motion vector of each block to be encoded may specifically include: obtaining the displacement absolute value of each block to be encoded and the average displacement absolute value of all the blocks to be encoded according to the true motion vector of each block to be encoded; and, in the case that a third condition is satisfied, determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded, the third condition including: the average displacement absolute value is greater than a second threshold, or the number of blocks to be encoded whose displacement absolute value is greater than a third threshold is greater than a first number.

The absolute value of the displacement of the block to be coded can be used for measuring the motion intensity of the block to be coded relative to the corresponding reference block. Illustratively, the greater the absolute value of the displacement of a block to be encoded, the more intense the motion of that block to be encoded relative to the corresponding reference block can be considered; the smaller the absolute value of the displacement of a block to be encoded, the less severe the motion of the block to be encoded with respect to the corresponding reference block can be considered.

The number of blocks to be coded with displacement absolute values larger than a certain value in the frames to be coded can be used for measuring the motion intensity of the frames to be coded relative to the corresponding reference frames. For example, the greater the number of blocks to be encoded whose displacement absolute value is greater than a certain value, the more intense the motion of the frame to be encoded with respect to the corresponding reference frame can be considered; the smaller the number of blocks to be encoded whose displacement absolute value is greater than a certain value in a frame to be encoded, the less severe the motion of the frame to be encoded with respect to a corresponding reference frame can be considered.

The absolute value of the average displacement of all the blocks to be coded in the frame to be coded can also be used for measuring the motion intensity of the frame to be coded relative to the corresponding reference frame. For example, the larger the average displacement absolute value of all the blocks to be encoded in the frame to be encoded, the more intense the motion of the frame to be encoded relative to the corresponding reference frame can be considered; the smaller the absolute value of the average displacement of all the blocks to be encoded in the frame to be encoded, the less severe the motion of the frame to be encoded relative to the corresponding reference frame can be considered.

The average displacement absolute value of all the blocks to be encoded is represented by m_avg, the number of blocks to be encoded (denoted as first blocks to be encoded) whose displacement absolute value is greater than a third threshold (denoted by thr3) is represented by N_thr3, and the displacement absolute value of a block to be encoded is represented by m_blk. The displacement absolute value of a first block to be encoded satisfies the following condition (denoted as the target condition for distinction): m_blk > thr3. The third condition may be: m_avg > thr2, or N_thr3 > N1, where thr2 represents the second threshold and N1 represents the first number. It should be understood that the second threshold, the third threshold, and the first number may be set in connection with actual situations or requirements, which is not limited by the embodiment of the present application.

When the third condition is satisfied, the motion of the frame to be encoded relative to the corresponding reference frame can be considered to be intense, so that the predicted motion vector of each block to be encoded can be judged to differ greatly from the true motion vector, and therefore it can be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded.

In one example, when the third condition is not satisfied, the motion of the frame to be encoded with respect to the corresponding reference frame may be considered not severe, so that it may be determined that the predicted motion vector of each block to be encoded is less different from the true motion vector, and thus it may be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded.

In other examples, when the third condition is not satisfied, other factors (for example, variance of prediction residual) may be further considered to determine whether the residual between the prediction motion vector and the true motion vector of each block to be encoded needs to be encoded, which will be described later.

In another possible embodiment, the third condition may be: m_avg ≥ thr2, or N_thr3 > N1. Or the third condition may be: m_avg > thr2, or N_thr3 ≥ N1. Or the third condition may be: m_avg ≥ thr2, or N_thr3 ≥ N1. The target condition may be: m_blk ≥ thr3.
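The third condition, in its strict-inequality form, can be sketched as a small predicate. The function name `third_condition` and the threshold values in the usage lines are hypothetical; the inputs are the average displacement absolute value and the per-block displacement absolute values of one frame.

```python
def third_condition(m_avg, m_blks, thr2, thr3, n1):
    """Return True when the frame's motion is considered intense.

    m_avg: average displacement absolute value of all blocks to be encoded.
    m_blks: displacement absolute value of each block to be encoded.
    """
    # N_thr3: number of blocks meeting the target condition m_blk > thr3.
    n_thr3 = sum(1 for m_blk in m_blks if m_blk > thr3)
    return m_avg > thr2 or n_thr3 > n1


# Two fast-moving blocks exceed thr3, so the count criterion fires.
print(third_condition(2.0, [1, 1, 9, 9], thr2=5, thr3=8, n1=1))
```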

In one possible implementation manner, obtaining the absolute value of the displacement of each block to be encoded and the absolute value of the average displacement of all the blocks to be encoded according to the actual motion vector of each block to be encoded specifically may include: according to the real motion vector of each block to be coded, obtaining the absolute value of horizontal displacement and the absolute value of vertical displacement of each block to be coded; obtaining the displacement absolute value of each block to be coded according to the sum of the horizontal displacement absolute value and the vertical displacement absolute value of each block to be coded; and obtaining an average displacement absolute value according to the sum of the average horizontal displacement absolute value and the average vertical displacement absolute value of all the blocks to be coded.

Let the true motion vector of a block to be encoded be (x, y). The horizontal displacement absolute value of the block is |x|, the vertical displacement absolute value of the block is |y|, and the displacement absolute value of the block (m_blk) can be calculated by the following formula: m_blk = |x| + |y|.

The average horizontal displacement absolute value (denoted by x_avg) of all blocks to be encoded can be calculated by the following formula:

x_avg = (1 / N_mv) × Σ_{z=1}^{N_mv} |x_z|

where N_mv represents the number of motion vectors in the frame to be encoded, i.e., the number of blocks to be encoded in the frame to be encoded, and |x_z| represents the horizontal displacement absolute value of the z-th block to be encoded in the frame to be encoded.

The average vertical displacement absolute value (denoted by y_avg) of all blocks to be encoded can be calculated by the following formula:

y_avg = (1 / N_mv) × Σ_{z=1}^{N_mv} |y_z|

where N_mv represents the number of motion vectors in the frame to be encoded, i.e., the number of blocks to be encoded in the frame to be encoded, and |y_z| represents the vertical displacement absolute value of the z-th block to be encoded in the frame to be encoded.

The average displacement absolute value (m_avg) of all the blocks to be encoded can be calculated by the following formula: m_avg=x_avg+y_avg.
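The displacement statistics above, together with the third condition they feed into, can be sketched as follows. This is a minimal illustration only: the function names `displacement_stats` and `third_condition` and the sample threshold values are assumptions introduced here and do not appear in the application.

```python
from typing import List, Tuple


def displacement_stats(true_mvs: List[Tuple[int, int]]) -> Tuple[List[int], float]:
    """Return (m_blk for every block, m_avg) from the true motion vectors (x, y)."""
    if not true_mvs:
        raise ValueError("a frame to be encoded contains at least one block")
    n_mv = len(true_mvs)  # N_mv: number of motion vectors (blocks) in the frame
    # m_blk = |x| + |y| for each block
    m_blk = [abs(x) + abs(y) for x, y in true_mvs]
    # x_avg and y_avg are the means of |x_z| and |y_z| over all blocks
    x_avg = sum(abs(x) for x, _ in true_mvs) / n_mv
    y_avg = sum(abs(y) for _, y in true_mvs) / n_mv
    return m_blk, x_avg + y_avg  # m_avg = x_avg + y_avg


def third_condition(m_blk: List[int], m_avg: float,
                    thr2: float, thr3: float, n1: int) -> bool:
    """m_avg > thr2, or more than N1 blocks have m_blk > thr3."""
    n_thr3 = sum(1 for m in m_blk if m > thr3)  # N_thr3
    return m_avg > thr2 or n_thr3 > n1


# Hypothetical motion vectors and thresholds, chosen only for illustration.
m_blk, m_avg = displacement_stats([(3, -4), (0, 2), (-1, -1)])
print(m_blk, round(m_avg, 3))
print(third_condition(m_blk, m_avg, thr2=4, thr3=5, n1=0))
```

Because m_avg is the sum of two per-axis means, it equals the mean of the per-block values m_blk; computing it per axis simply mirrors the formulas as written.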

In one possible implementation manner, determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded, according to the residual between each block to be encoded and the corresponding first prediction block and/or the residual between each block to be encoded and the corresponding second prediction block, may further include: calculating a first variance of the inter prediction residuals of the frame to be encoded according to the residual between each block to be encoded and the corresponding second prediction block; determining a third prediction block corresponding to each block to be encoded from the frame to be encoded, and calculating a second variance of the intra prediction residuals of the frame to be encoded according to the residual between each block to be encoded and the corresponding third prediction block; and, in the case that a fourth condition is satisfied, determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded, the fourth condition including: the first variance is greater than the second variance. In this case, determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the magnitude relation between the first inter prediction cost and the second inter prediction cost may specifically be: in the case that the fourth condition is not satisfied, determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the magnitude relation between the first inter prediction cost and the second inter prediction cost.

The first variance refers to the variance of the inter prediction residuals of the frame to be encoded, and may be calculated from the residuals between all the blocks to be encoded in the frame and their corresponding second prediction blocks (for simplicity and distinction, the residual between a block to be encoded and its corresponding second prediction block is referred to as the inter prediction residual of that block). Specifically, the inter prediction residuals of all the blocks to be encoded may be averaged to obtain a mean inter prediction residual; the squared differences between each block's inter prediction residual and this mean are then summed and averaged to obtain the first variance.

For example, the first variance may be smaller in case the frame to be encoded has a smaller variation with respect to its corresponding reference frame, and the first variance may be larger in case the frame to be encoded has a larger variation with respect to its corresponding reference frame. Accordingly, the first variance may be used to determine a change in the frame to be encoded relative to its corresponding reference frame.

The third prediction block corresponding to the block to be encoded refers to a block closest to the block to be encoded in the frame to be encoded. Specifically, errors between blocks other than the block to be encoded in the frame to be encoded and the block to be encoded may be calculated, and a block in which the error with the block to be encoded is the smallest is determined as a block closest to the block to be encoded, thereby determining the third prediction block. Alternatively, SAD or SATD may be used to measure the error between the other block and the block to be encoded.
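The search for the third prediction block can be illustrated with a short sketch. It assumes, purely for illustration, that the frame has already been split into equally sized blocks held in a list and that the error is measured with SAD; the names `sad` and `closest_block_index` are hypothetical and are not taken from the application.

```python
from typing import List, Sequence

Block = Sequence[Sequence[int]]  # rows of pixel values


def sad(a: Block, b: Block) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(pa - pb) for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def closest_block_index(blocks: List[Block], target: int) -> int:
    """Index of the block, other than blocks[target], with the smallest SAD."""
    candidates = [i for i in range(len(blocks)) if i != target]
    if not candidates:
        raise ValueError("the frame must contain at least one other block")
    return min(candidates, key=lambda i: sad(blocks[i], blocks[target]))


# Three hypothetical 2x2 blocks from the same frame.
blocks = [
    [[10, 10], [10, 10]],
    [[12, 10], [10, 11]],
    [[50, 50], [50, 50]],
]
print(closest_block_index(blocks, 0))  # block 1 is closest to block 0
```

Replacing `sad` with a SATD measure would leave the search itself unchanged, since only the error function differs.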

The second variance refers to the variance of the intra prediction residuals of the frame to be encoded, and may be calculated from the residuals between all the blocks to be encoded in the frame and their corresponding third prediction blocks (for simplicity and distinction, the residual between a block to be encoded and its corresponding third prediction block is referred to as the intra prediction residual of that block). Specifically, the intra prediction residuals of all the blocks to be encoded may be averaged to obtain a mean intra prediction residual; the squared differences between each block's intra prediction residual and this mean are then summed and averaged to obtain the second variance.
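Both variances follow the same recipe: take the mean, then average the squared deviations from it. A minimal sketch is given below, assuming that each block's residual has already been reduced to a single number (for example, its SAD against the prediction block); that reduction and the name `residual_variance` are assumptions made here, not details stated in the application.

```python
from typing import Sequence


def residual_variance(residuals: Sequence[float]) -> float:
    """Mean of squared deviations of the per-block residuals from their mean."""
    if not residuals:
        raise ValueError("at least one block residual is required")
    mean = sum(residuals) / len(residuals)
    return sum((r - mean) ** 2 for r in residuals) / len(residuals)


# Hypothetical per-block residual magnitudes for one frame.
var_inter = residual_variance([2, 4, 4, 4, 5, 5, 7, 9])  # against second prediction blocks
var_intra = residual_variance([3, 3, 3])                 # against third prediction blocks
print(var_inter, var_intra)  # 4.0 0.0
```

The same function yields Var_Inter when fed the inter prediction residuals and Var_Intra when fed the intra prediction residuals.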

It will be appreciated that the video content in the same frame to be encoded may differ significantly at different locations, i.e. the difference between different blocks to be encoded in the same frame to be encoded may be large, and thus the second variance is also large.

The first variance is denoted by Var_Inter, the second variance is denoted by Var_Intra, and the fourth condition may be: Var_Inter > Var_Intra. When the fourth condition is satisfied, the frame to be encoded may be considered to differ so much from its reference frame that the first variance exceeds the second variance. In this case, further improving the accuracy of the motion vector does little to reduce the prediction residual, and thus it may be determined that the residual between the predicted motion vector and the true motion vector does not need to be encoded.

When the fourth condition is not satisfied, that is, Var_Inter ≤ Var_Intra, whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded may be determined according to the magnitude relation between the first inter prediction cost and the second inter prediction cost, as described in the previous embodiment. In other words, the magnitude relation between the first variance and the second variance is considered first; for cases that are difficult to decide on that basis alone, the magnitude relation between the first inter prediction cost and the second inter prediction cost is additionally used to determine whether the motion vector residual needs to be encoded.

In another possible embodiment, the fourth condition may be: Var_Inter ≥ Var_Intra.

In a possible implementation manner, in the case that the third condition is not satisfied: when a fifth condition is satisfied, it is determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded, where the fifth condition includes: the first variance is less than or equal to the second variance, and the first variance is greater than a fourth threshold; or, when the fifth condition is not satisfied, it is determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded.

When the third condition is not satisfied, that is, m_avg ≤ thr2 and N_thr3 ≤ N1, the magnitude of the first variance may additionally be used to determine whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded. In other words, the true motion vector of each block to be encoded is considered first; for cases that are difficult to decide on that basis alone, the magnitude of the first variance is additionally used to determine whether the motion vector residual needs to be encoded.

The fifth condition may be: Var_Inter ≤ Var_Intra, and Var_Inter > thr4, where thr4 represents the fourth threshold. It should be understood that the fourth threshold may be set according to actual situations or requirements, which is not limited by the embodiments of the present application. When the fifth condition is satisfied, the frame to be encoded may be considered not to differ much from its reference frame, while the accuracy of inter prediction based on the predicted motion vector is too low, so that the first variance is large. The accuracy of inter prediction based on the predicted motion vector is therefore judged unsatisfactory, and it may be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded.

When the fifth condition is not satisfied, that is, Var_Inter ≤ Var_Intra and Var_Inter ≤ thr4, the accuracy of inter prediction based on the predicted motion vector may be considered satisfactory, and thus it may be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded.

In other possible embodiments, the fifth condition may be: Var_Inter < Var_Intra, and Var_Inter > thr4. Or the fifth condition may be: Var_Inter ≤ Var_Intra, and Var_Inter ≥ thr4. Or the fifth condition may be: Var_Inter < Var_Intra, and Var_Inter ≥ thr4.
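The fifth condition and its variants differ only in whether each comparison is strict. One way to express them in a single function is sketched below; the keyword flags `strict_intra` and `inclusive_thr4` are hypothetical names introduced here to select between the variants, and the numeric values are illustrative only.

```python
import operator


def fifth_condition(var_inter: float, var_intra: float, thr4: float, *,
                    strict_intra: bool = False,
                    inclusive_thr4: bool = False) -> bool:
    """Default form: Var_Inter <= Var_Intra and Var_Inter > thr4."""
    # strict_intra=True switches the first comparison to Var_Inter < Var_Intra
    below_intra = operator.lt if strict_intra else operator.le
    # inclusive_thr4=True switches the second comparison to Var_Inter >= thr4
    above_thr4 = operator.ge if inclusive_thr4 else operator.gt
    return below_intra(var_inter, var_intra) and above_thr4(var_inter, thr4)


# Boundary cases where the variants disagree.
print(fifth_condition(5, 5, 3))                       # True
print(fifth_condition(5, 5, 3, strict_intra=True))    # False
print(fifth_condition(3, 5, 3))                       # False
print(fifth_condition(3, 5, 3, inclusive_thr4=True))  # True
```

The variants only change the outcome when Var_Inter equals Var_Intra or thr4 exactly, so the choice between them mainly matters for integer-valued or coarsely quantized statistics.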

In a possible implementation manner, in the case that a sixth condition is satisfied, it is determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded, where the sixth condition includes: the number of frames of the group of pictures in which the frame to be encoded is located is smaller than a second number. In this case, determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded, according to the residual between each block to be encoded and the corresponding first prediction block and/or the residual between each block to be encoded and the corresponding second prediction block, may specifically be: in the case that the sixth condition is not satisfied, determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the residual between each block to be encoded and the corresponding first prediction block and/or the residual between each block to be encoded and the corresponding second prediction block.

The number of frames in a group of pictures (Group of Pictures, GOP) in which a frame to be encoded is located refers to the number of video frames that make up the GOP. When the number of frames of the GOP in which the frame to be encoded is located is small, the accumulated residual of the motion vectors can be considered small, and the influence on the accuracy of inter-frame prediction is small, so that the residual between the predicted motion vector and the true motion vector of each block to be encoded can be determined without encoding. When the number of frames of the GOP in which the frame to be encoded is located is large, it may be considered that the accumulated residual error of the motion vector is large, and the influence on the accuracy of inter prediction may be large, in which case, as described in the foregoing embodiment, it may be determined whether the residual error between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the residual error between each block to be encoded and the corresponding first prediction block and/or the residual error between each block to be encoded and the corresponding second prediction block.

The number of frames of the GOP in which the frame to be encoded is located is denoted by N_GOP, and the sixth condition may be: N_GOP < N2, where N2 represents the second number. It should be appreciated that the second number may be set according to actual situations or needs, and the embodiments of the present application are not limited in this regard. Optionally, N2 = 4. When the sixth condition is satisfied, the accumulated residual of the motion vectors may be considered to have little influence on the accuracy of inter prediction, and thus it may be determined that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded.

When the sixth condition is not satisfied, that is, N_GOP ≥ N2, whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded may be determined according to the residual between each block to be encoded and the corresponding first prediction block and/or the residual between each block to be encoded and the corresponding second prediction block, as described in the previous embodiment. In other words, the number of frames of the GOP in which the frame to be encoded is located is considered first; for cases that are difficult to decide on that basis alone, the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block is additionally used to determine whether the motion vector residual needs to be encoded.

In another possible embodiment, the sixth condition may be: N_GOP ≤ N2.

Referring to fig. 10, fig. 10 is a flowchart illustrating another video encoding method according to an embodiment of the application. As shown in fig. 10, the video encoding method includes the following steps S1001 to S1014.

S1001, obtaining a frame to be encoded, wherein the frame to be encoded comprises at least one block to be encoded.

S1002, determining a predicted motion vector and a true motion vector of each block to be encoded.

S1003, locating a first prediction block corresponding to each block to be encoded in a reference frame of the frame to be encoded based on the predicted motion vector of each block to be encoded, and locating a second prediction block corresponding to each block to be encoded in the reference frame of the frame to be encoded based on the true motion vector of each block to be encoded.

S1004, judging whether the number of frames of the group of pictures in which the frame to be encoded is located is smaller than the second number, if yes, executing step S1013, otherwise, executing step S1005.

S1005, calculating a first variance of inter-frame prediction residues of the frame to be encoded according to residues between each block to be encoded and the corresponding second prediction block, determining a third prediction block corresponding to each block to be encoded from the frame to be encoded, and calculating a second variance of intra-frame prediction residues of the frame to be encoded according to residues between each block to be encoded and the corresponding third prediction block.

S1006, judging whether the first variance is larger than the second variance, if yes, executing step S1013, otherwise, executing step S1007.

S1007, calculating the first inter-frame prediction cost of the frame to be encoded according to the residual error between each block to be encoded and the corresponding first prediction block, and calculating the second inter-frame prediction cost of the frame to be encoded according to the residual error between each block to be encoded and the corresponding second prediction block.

S1008, judging whether the first inter prediction cost is smaller than the second inter prediction cost, if yes, executing step S1013, otherwise, executing step S1009.

S1009, determining whether the difference between the first inter-frame prediction cost and the second inter-frame prediction cost is greater than the first threshold, if so, executing step S1014, otherwise, executing step S1010.

S1010, obtaining the displacement absolute value of each block to be encoded and the average displacement absolute value of all the blocks to be encoded according to the true motion vector of each block to be encoded.

S1011, judging whether at least one of the following is satisfied: the average displacement absolute value is greater than the second threshold; the number of blocks to be encoded whose displacement absolute value is greater than the third threshold is greater than the first number. If yes, executing step S1014, otherwise, executing step S1012.

S1012, judging whether the first variance is greater than the fourth threshold, if yes, executing step S1014, otherwise, executing step S1013.

S1013, determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded.

S1014, determining that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded.
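The decision chain of steps S1004 to S1014 can be condensed into one function, sketched below. For brevity, the sketch assumes that all frame statistics have been computed in advance, whereas the flow of fig. 10 computes them only when a step requires them. The class names, field names, and threshold values are hypothetical and serve only as illustration; only N2 = 4 is mentioned as an option in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameStats:
    n_gop: int        # number of frames in the GOP (N_GOP)
    var_inter: float  # first variance (Var_Inter)
    var_intra: float  # second variance (Var_Intra)
    cost1: float      # first inter prediction cost (predicted motion vectors)
    cost2: float      # second inter prediction cost (true motion vectors)
    m_avg: float      # average displacement absolute value
    n_thr3: int       # number of blocks with m_blk > thr3 (N_thr3)


@dataclass(frozen=True)
class Thresholds:
    n2: int      # second number
    thr1: float  # first threshold
    thr2: float  # second threshold
    n1: int      # first number
    thr4: float  # fourth threshold


def needs_mv_residual(s: FrameStats, t: Thresholds) -> bool:
    """True corresponds to S1014 (encode), False to S1013 (do not encode)."""
    if s.n_gop < t.n2:                        # S1004: sixth condition
        return False
    if s.var_inter > s.var_intra:             # S1006: fourth condition
        return False
    if s.cost1 < s.cost2:                     # S1008: first condition
        return False
    if s.cost1 - s.cost2 > t.thr1:            # S1009: second condition
        return True
    if s.m_avg > t.thr2 or s.n_thr3 > t.n1:   # S1011: third condition
        return True
    return s.var_inter > t.thr4               # S1012: first variance vs. thr4


# Illustrative values only.
t = Thresholds(n2=4, thr1=100.0, thr2=8.0, n1=10, thr4=50.0)
s = FrameStats(n_gop=8, var_inter=60.0, var_intra=100.0,
               cost1=1050.0, cost2=1000.0, m_avg=2.0, n_thr3=0)
print(needs_mv_residual(s, t))  # True: falls through to S1012, 60 > 50
```

Ordering the checks from cheapest to most expensive, as the flowchart does, means a production implementation could skip computing the variances, costs, and displacement statistics whenever an earlier check already settles the decision.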

It should be understood that, for the specific description of the above steps S1001 to S1014, reference may be made to the related description in the foregoing embodiments, which is not repeated here. According to this embodiment, whether the motion vector residual of each block to be encoded in the frame to be encoded needs to be encoded can be determined flexibly according to the actual condition of the frame to be encoded. The method is therefore highly adaptive, improves the encoding effect, and reduces the video bit rate without affecting the video image quality.

It will be appreciated by those skilled in the art that in the above-described method of the specific embodiments, the written order of steps is not meant to imply a strict order of execution but rather should be construed according to the function and possibly inherent logic of the steps.

The foregoing description of the various embodiments emphasizes the differences between them. For parts that are the same or similar, the embodiments may be referred to one another, and these parts are not repeated here for brevity.

Referring to fig. 11, fig. 11 is a schematic structural diagram of a video encoding apparatus 1100 according to an embodiment of the present application, where the video encoding apparatus 1100 includes: an obtaining unit 1101, a determining unit 1102, a prediction unit 1103, and an encoding unit 1104, wherein:

an obtaining unit 1101, configured to obtain a frame to be encoded, where the frame to be encoded includes at least one block to be encoded;

a determining unit 1102, configured to determine a predicted motion vector and a true motion vector of each block to be encoded;

a prediction unit 1103, configured to locate, based on the predicted motion vector of each block to be encoded, a first prediction block corresponding to each block to be encoded in a reference frame of the frame to be encoded, and locate, based on the true motion vector of each block to be encoded, a second prediction block corresponding to each block to be encoded in the reference frame of the frame to be encoded;

an encoding unit 1104, configured to determine whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block.

In one possible implementation, the encoding unit 1104 is specifically configured to, when determining whether to encode a residual between a predicted motion vector and a true motion vector of each block to be encoded according to a residual between each block to be encoded and its corresponding first prediction block and/or a residual between each block to be encoded and its corresponding second prediction block: calculating a first inter-frame prediction cost of a frame to be encoded according to residual errors between each block to be encoded and a corresponding first prediction block; calculating second inter-frame prediction cost of the frame to be encoded according to residual errors between each block to be encoded and the corresponding second prediction block; and determining whether residual errors between the predicted motion vectors and the actual motion vectors of the blocks to be encoded need to be encoded according to the magnitude relation between the first inter-frame prediction cost and the second inter-frame prediction cost.

In one possible implementation, the encoding unit 1104 is specifically configured to, when determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the magnitude relation between the first inter prediction cost and the second inter prediction cost: in the case that a first condition is satisfied, determine that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded, the first condition including: the first inter prediction cost is less than the second inter prediction cost; or, in the case that a second condition is satisfied, determine that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded, the second condition including: the first inter prediction cost is greater than the second inter prediction cost, and the difference between the first inter prediction cost and the second inter prediction cost is greater than a first threshold; or, in the case that neither the first condition nor the second condition is satisfied, determine whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the true motion vector of each block to be encoded.

In one possible implementation, the encoding unit 1104 is specifically configured to, when determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the true motion vector of each block to be encoded: obtain the displacement absolute value of each block to be encoded and the average displacement absolute value of all the blocks to be encoded according to the true motion vector of each block to be encoded; and, in the case that a third condition is satisfied, determine that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded, the third condition including: the average displacement absolute value is greater than a second threshold, or the number of blocks to be encoded whose displacement absolute value is greater than a third threshold is greater than a first number.

In one possible implementation, the encoding unit 1104 is specifically configured to, when obtaining the displacement absolute value of each block to be encoded and the average displacement absolute value of all the blocks to be encoded according to the true motion vector of each block to be encoded: obtain the horizontal displacement absolute value and the vertical displacement absolute value of each block to be encoded according to the true motion vector of that block; obtain the displacement absolute value of each block to be encoded as the sum of its horizontal displacement absolute value and its vertical displacement absolute value; and obtain the average displacement absolute value as the sum of the average horizontal displacement absolute value and the average vertical displacement absolute value of all the blocks to be encoded.

In one possible implementation, the encoding unit 1104 is specifically configured to, when determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block: calculate a first variance of the inter prediction residuals of the frame to be encoded according to the residual between each block to be encoded and the corresponding second prediction block; determine a third prediction block corresponding to each block to be encoded from the frame to be encoded, and calculate a second variance of the intra prediction residuals of the frame to be encoded according to the residual between each block to be encoded and the corresponding third prediction block; in the case that a fourth condition is satisfied, determine that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded, the fourth condition including: the first variance is greater than the second variance; and, in the case that the fourth condition is not satisfied, determine whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the magnitude relation between the first inter prediction cost and the second inter prediction cost.

In one possible implementation, the encoding unit 1104 is further configured to, in the case that the third condition is not satisfied: when a fifth condition is satisfied, determine that the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded, the fifth condition including: the first variance is less than or equal to the second variance, and the first variance is greater than a fourth threshold; or, when the fifth condition is not satisfied, determine that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded.

In one possible implementation, the encoding unit 1104 is further configured to: in the case that a sixth condition is satisfied, determine that the residual between the predicted motion vector and the true motion vector of each block to be encoded does not need to be encoded, the sixth condition including: the number of frames of the group of pictures in which the frame to be encoded is located is smaller than a second number. The encoding unit 1104 is specifically configured to, when determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block: in the case that the sixth condition is not satisfied, determine whether the residual between the predicted motion vector and the true motion vector of each block to be encoded needs to be encoded according to the residual between each block to be encoded and the corresponding first prediction block and/or the residual between each block to be encoded and the corresponding second prediction block.

For specific limitations of the video encoding apparatus, reference may be made to the above limitations of the video encoding method, which are not repeated here. The units of the video encoding apparatus described above may be implemented in whole or in part by software, hardware, or a combination thereof. The units may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and execute the operations corresponding to each unit.

Referring to fig. 12, fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the application. The electronic device 1200 includes a memory 1201 and a processor 1202. Optionally, the electronic device 1200 further comprises a communication interface 1203 and a bus 1204. The memory 1201, the processor 1202 and the communication interface 1203 implement communication connection with each other through the bus 1204. The memory 1201 stores a computer program, and the processor 1202 is configured to execute the computer program stored in the memory 1201 to implement the method in the above-described method embodiments.

The embodiment of the application also provides a computer readable storage medium, wherein a computer program is stored in the computer readable storage medium, and the computer program realizes the method in the above method embodiments when being executed by a processor.

The present application also provides a computer program product comprising a computer program which, when executed by a processor, implements the method of the method embodiments described above.

It should be appreciated that the memory or readable storage medium in the embodiments of the present application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a hard disk drive (HDD), a solid-state drive (SSD), a read-only memory (ROM), a flash memory, or the like. The volatile memory may be a random access memory (RAM) or an external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).

The processor in the embodiments of the present application may be one or a combination of processing modules such as a central processing unit (CPU), a graphics processing unit (GPU), or a microprocessor unit (MPU), and may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.

The above examples illustrate only a few embodiments of the application. Although they are described in specific detail, they are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application.

Claims

1. A method of video encoding, the method comprising:

2. The method according to claim 1, wherein determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded is required to be encoded according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block comprises:

3. The method according to claim 2, wherein determining whether a residual between a predicted motion vector and a true motion vector of each block to be encoded is required to be encoded according to a magnitude relation between the first inter-prediction cost and the second inter-prediction cost comprises:

4. A method according to claim 3, wherein said determining whether a residual between a predicted motion vector and a true motion vector of each of said blocks to be encoded is required to be encoded based on a true motion vector of each of said blocks to be encoded comprises:

5. The method according to claim 4, wherein the obtaining the absolute displacement value of each block to be encoded and the absolute average displacement value of all blocks to be encoded according to the true motion vector of each block to be encoded comprises:

6. The method according to claim 4, wherein determining whether the residual between the predicted motion vector and the true motion vector of each block to be encoded is needed according to the residual between each block to be encoded and its corresponding first prediction block and/or the residual between each block to be encoded and its corresponding second prediction block, further comprises:

7. The method as recited in claim 6, further comprising: in case said third condition is not fulfilled,

8. The method according to any one of claims 1 to 7, further comprising:

9. A video encoding device, the device comprising:

10. An electronic device comprising a memory storing a computer program and a processor implementing the method of any one of claims 1 to 8 when the computer program is executed by the processor.

11. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the method of any of claims 1 to 8.

12. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the method of any one of claims 1 to 8.