CN117596396A - Video coding method, device, equipment and storage medium - Google Patents


Info

Publication number
CN117596396A
Authority
CN
China
Prior art keywords
coding
reference frame
coding unit
frame
target video
Prior art date
Legal status
Pending
Application number
CN202311553679.2A
Other languages
Chinese (zh)
Inventor
张佳
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Co., Ltd.
Priority to CN202311553679.2A
Publication of CN117596396A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/172 Adaptive coding characterised by the coding unit, the unit being an image region, the region being a picture, frame or field
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/20 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiments of this application disclose a video coding method, apparatus, device, and storage medium. The method comprises the following steps: according to the positions of M first coding units in a target video frame and the position of a second coding unit in the target video frame, screening, from the M first coding units, those first coding units that have an overlapping area with the second coding unit, where the M first coding units are obtained by dividing the target video frame according to a first coding and decoding standard, and the second coding unit is obtained by dividing the target video frame according to a second coding and decoding standard; determining a reference frame of the second coding unit according to the reference information of the first coding units that overlap the second coding unit and an initial reference frame list of the second coding unit; and encoding the second coding unit based on its reference frame to obtain code stream data of the target video frame under the second coding and decoding standard. The embodiments of this application can effectively reduce coding complexity.

Description

Video coding method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a video encoding method, apparatus, device, and storage medium.
Background
As technology advances, consumers' demands on video quality (e.g., sharpness) keep increasing. To support the transmission of high-quality video, video codec standards are also continuously updated. In practice, converting the first code stream data of a video encoded according to a first codec standard (an older standard) into second code stream data under a second codec standard (a newer standard) requires first decoding the first code stream data and then re-encoding the decoded video according to the second codec standard.
When re-encoding the decoded video according to the second codec standard, each coding block in a video frame may be encoded in an inter-frame translation prediction mode. Inter-frame translation prediction predicts the pixels of the current Coding Unit (CU) from the pixels of a block region (the reference block) in an adjacent frame. The smaller the difference between the reference block and the current CU, the more accurate the prediction and the fewer bits are needed to encode the residual. Therefore, re-encoding requires deciding in which frames to search for the reference block, i.e., determining the reference frame. This selection is typically done by traversing all possible combinations, which results in high coding complexity.
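As an illustrative sketch only (function names and the use of SAD as the distortion metric are assumptions, not the patent's method), inter-frame translation prediction amounts to searching a reference frame for the block most similar to the current block:

```python
def sad(block_a, block_b):
    # Sum of absolute differences between two equally sized pixel blocks.
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_match(cur_block, ref_frame, block_h, block_w):
    # Exhaustive translational search: try every block position in the
    # reference frame and keep the one with the smallest SAD. A smaller
    # residual means fewer bits to encode the difference.
    best_pos, best_cost = None, float("inf")
    h, w = len(ref_frame), len(ref_frame[0])
    for y in range(h - block_h + 1):
        for x in range(w - block_w + 1):
            cand = [row[x:x + block_w] for row in ref_frame[y:y + block_h]]
            cost = sad(cur_block, cand)
            if cost < best_cost:
                best_pos, best_cost = (y, x), cost
    return best_pos, best_cost
```

Repeating this search over every candidate reference frame is what makes reference frame selection expensive.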
Disclosure of Invention
The embodiment of the application provides a video coding method, a video coding device, video coding equipment and a computer readable storage medium, which can effectively reduce coding complexity.
In one aspect, an embodiment of the present application provides a video encoding method, including:
acquiring coding information of a target video frame, wherein the target video frame comprises M first coding units, and the M first coding units are obtained by dividing the target video frame according to a first coding and decoding standard; the coding information comprises the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard, the reference information of any first coding unit comprises whether the prediction mode of any first coding unit comprises an inter-frame translation prediction mode or not, and M is a positive integer;
acquiring a second coding unit to be coded in the target video frame, wherein the second coding unit is obtained by dividing the target video frame according to a second coding and decoding standard, and the second coding and decoding standard is different from the first coding and decoding standard;
according to the positions of the M first coding units in the target video frame and the positions of the second coding units in the target video frame, first coding units with overlapping areas with the second coding units are screened out of the M first coding units;
determining a reference frame of the second coding unit according to the reference information of the first coding unit with an overlapping area with the second coding unit and an initial reference frame list of the second coding unit;
and encoding the second coding unit based on the reference frame of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
In one aspect, an embodiment of the present application provides a video encoding apparatus, including:
the device comprises an acquisition unit, a decoding unit and a processing unit, wherein the acquisition unit is used for acquiring coding information of a target video frame, the target video frame comprises M first coding units, and the M first coding units are obtained by dividing the target video frame according to a first coding and decoding standard; the coding information comprises the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard, the reference information of any first coding unit comprises whether the prediction mode of any first coding unit comprises an inter-frame translation prediction mode or not, and M is a positive integer;
the obtaining unit is further configured to obtain a second coding unit to be coded in the target video frame, where the second coding unit is obtained by dividing the target video frame according to a second coding and decoding standard, and the second coding and decoding standard is different from the first coding and decoding standard;
the processing unit is used for screening first coding units with overlapping areas with the second coding units from the M first coding units according to the positions of the M first coding units in the target video frame and the positions of the second coding units in the target video frame;
the processing unit is further configured to determine a reference frame of the second coding unit according to reference information of the first coding unit having an overlapping area with the second coding unit and an initial reference frame list of the second coding unit;
the processing unit is further configured to encode the second encoding unit based on the reference frame of the second encoding unit, to obtain code stream data of the target video frame under the second encoding and decoding standard.
In one embodiment, the processing unit determines the reference frame of the second coding unit according to the reference information of the first coding unit having an overlapping region with the second coding unit and the initial reference frame list of the second coding unit, and includes:
if the prediction modes of the first coding units with the overlapping areas with the second coding units do not include the inter-frame translation prediction mode, obtaining the play distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit;
selecting a target reference frame from the initial reference frame list based on the play distance between each reference frame and the second coding unit, and determining the target reference frame as the reference frame of the second coding unit; the playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit.
In one embodiment, the initial reference frame list includes a forward reference frame list and a backward reference frame list;
the processing unit encodes the second encoding unit based on the reference frame of the second encoding unit to obtain the code stream data of the target video frame under the second encoding and decoding standard, including:
if the reference frame is located in the forward reference frame list, encoding the second encoding unit based on the reference frame and the forward prediction mode of the second encoding unit to obtain code stream data of the target video frame under the second encoding and decoding standard;
and if the reference frame is positioned in the backward reference frame list, encoding the second encoding unit based on the reference frame and the backward prediction mode of the second encoding unit to obtain the code stream data of the target video frame under the second encoding and decoding standard.
In one embodiment, the reference information further includes reference frame indexes of the M first coding units;
the processing unit determines a reference frame of the second coding unit according to the reference information of the first coding unit having an overlapping area with the second coding unit and the initial reference frame list of the second coding unit, and includes:
and if the prediction mode of at least one first coding unit with the overlapping area with the second coding unit comprises an inter-frame translation prediction mode, determining the reference frame with the same reference frame index in the initial reference frame list as the reference frame of the second coding unit.
In one embodiment, the initial reference frame list includes a forward reference frame list and a backward reference frame list, and the reference information further includes prediction modes adopted by the M first coding units;
the processing unit determines a reference frame with the same reference frame index as the reference frame of the second coding unit in the initial reference frame list, including:
and determining the reference frame with the same index as the reference frame with the forward prediction mode in the forward reference frame list as the reference frame of the second coding unit, and determining the reference frame with the same index as the reference frame with the backward prediction mode in the backward reference frame list as the reference frame of the second coding unit.
In one embodiment, the prediction modes of the respective first coding units having an overlap region with the second coding unit each include an inter-frame translation prediction mode;
the obtaining unit is further configured to obtain a candidate prediction mode of the second encoding unit;
the processing unit is further configured to determine, as a prediction mode of the second coding unit, a prediction mode that is the same as a prediction mode adopted by the M first coding units in the candidate prediction modes;
the processing unit encodes the second encoding unit based on the reference frame of the second encoding unit to obtain the code stream data of the target video frame under the second encoding and decoding standard, including:
and encoding the second coding unit based on the reference frame of the second coding unit and the prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
In one embodiment, the prediction modes of some of the first coding units having an overlap region with the second coding unit include an inter-frame translation prediction mode;
the obtaining unit is further configured to obtain a candidate prediction mode of the second encoding unit;
the processing unit is further configured to determine the candidate prediction mode as a prediction mode of the second coding unit;
the processing unit encodes the second encoding unit based on the reference frame of the second encoding unit to obtain the code stream data of the target video frame under the second encoding and decoding standard, including:
and encoding the second coding unit based on the reference frame of the second coding unit and the prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
In one embodiment, the prediction modes of some of the first coding units having an overlap region with the second coding unit include an inter-frame translation prediction mode;
the processing unit determines a reference frame with the same reference frame index as the reference frame of the second coding unit in the initial reference frame list, including:
acquiring the playing distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit;
selecting a target reference frame from the initial reference frame list based on the playing distance between each reference frame and the second coding unit; the playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit;
and determining the reference frame with the same reference frame index in the initial reference frame list as the reference frame of the second coding unit.
In one embodiment, the processing unit obtains a second encoding unit to be encoded in the target video frame, including:
dividing an object to be encoded according to P preset dividing modes to obtain P dividing results of the object to be encoded, wherein the object to be encoded is a target video frame or a target region in the target video frame, and P is a positive integer;
and if the coding efficiency of the object to be coded under each of the P division results is not higher than the coding efficiency of the object to be coded without division, determining the object to be coded as a second coding unit to be coded in the target video frame.
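The division decision above can be sketched as follows; `cost_fn` and `split_fns` are illustrative placeholders (lower cost standing in for higher coding efficiency), not an encoder's actual rate-distortion machinery:

```python
def decide_partition(obj, cost_fn, split_fns):
    # Keep obj as a single coding unit unless some preset division
    # yields a strictly lower cost (i.e., strictly higher efficiency).
    best, best_cost = [obj], cost_fn([obj])
    for split in split_fns:
        parts = split(obj)
        cost = cost_fn(parts)
        if cost < best_cost:
            best, best_cost = parts, cost
    return best

# Toy setup: an "object" is a 1-D interval, and the single split
# function divides it in half.
halve = lambda o: [(o[0], (o[0] + o[1]) // 2), ((o[0] + o[1]) // 2, o[1])]
```

In a real encoder the same comparison would be applied recursively to each sub-region, which is why the partitioning search itself is costly.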
In one embodiment, the processing unit obtains the target video frame and the encoding information of the target video frame, including:
acquiring code stream data of a target video frame under a first coding and decoding standard;
and decoding the code stream data of the target video frame under the first coding and decoding standard to obtain the target video frame and the coding information of the target video frame.
Accordingly, the present application provides a computer device comprising:
A memory in which a computer program is stored;
and a processor, configured to load the computer program to implement the video encoding method described above.
Accordingly, the present application provides a computer readable storage medium storing a computer program adapted to be loaded by a processor and to perform the video encoding method described above.
Accordingly, the present application provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the video encoding method described above.
In the embodiments of the present application, coding information of a target video frame is obtained. The target video frame includes M first coding units; the coding information includes the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard, where the reference information of any first coding unit includes whether its prediction mode includes an inter-frame translation prediction mode. A second coding unit to be coded in the target video frame is obtained, a reference frame of the second coding unit is determined according to the reference information of those first coding units among the M first coding units that have an overlapping area with the second coding unit and the initial reference frame list of the second coding unit, and the second coding unit is encoded based on its reference frame to obtain code stream data of the target video frame under the second coding and decoding standard. The selection of the reference frame of the second coding unit is thus accelerated by the inter-frame translation prediction results of the overlapping first coding units, which simplifies the reference frame determination process and effectively reduces coding complexity.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic view of a video coding scheme according to an embodiment of the present application;
fig. 2 is a flowchart of a video encoding method according to an embodiment of the present application;
FIG. 3a is a schematic diagram of a basic partitioning method according to an embodiment of the present application;
fig. 3b is a schematic diagram of a video frame division result provided in an embodiment of the present application;
fig. 4 is a flowchart of another video encoding method according to an embodiment of the present application;
FIG. 5a is a schematic diagram of a second encoding unit included in a first encoding unit according to an embodiment of the present application;
FIG. 5b is a schematic diagram of a second encoding unit included in a plurality of first encoding units according to an embodiment of the present disclosure;
fig. 5c is a schematic diagram of a code stream conversion flow provided in an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a video encoding device according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The following is a brief description of the relevant terms involved in this application:
high efficiency video coding (High Efficiency Video Coding, HEVC) standard: also known as the h.265 codec standard, may be used to augment the h.264/AVC codec standard, which specifies the codec flow and associated syntax of the bitstream data corresponding to h.265.
Multifunctional video coding (Versatile Video Coding, VVC) standard: also referred to as the h.266 codec standard, which specifies the codec flow and associated syntax of the bitstream data corresponding to h.266.
Coding Unit (CU): the coding unit may refer to the entire video frame (in the case of not dividing the video frame) or a partial region in the video frame (in the case of dividing the video frame).
Inter prediction: refers to referencing information from video frames adjacent to the frame to which the coding unit belongs when encoding the coding unit. Inter-frame translation prediction refers to taking only translational motion into account when performing motion compensation. Illustratively, inter-frame translation prediction may specifically include one or more of forward prediction, backward prediction, and bi-directional prediction.
In one example, taking the first codec standard as the HEVC codec standard and the second codec standard as the VVC codec standard, when a VVC encoder performs inter-frame translation prediction for a CU, the common practice is to build a forward reference frame list and a backward reference frame list for it. The frames in these lists are reconstructed frames obtained by decoding previously encoded frames. The encoder then tries each of up to three prediction modes in turn. For each prediction mode, the reference frame is likewise selected by trying all possible choices: forward prediction selects a frame from the forward reference frame list, backward prediction selects a frame from the backward reference frame list, and bi-directional prediction selects one frame from each list. There are many possible combinations of prediction modes and reference frames, and encoding all combinations and then choosing the one with the best rate-distortion performance introduces very high computational complexity into the encoder.
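The combinatorial blow-up described above can be made concrete with a small sketch (an illustration of the search space only, not encoder code):

```python
from itertools import product

def exhaustive_combinations(fwd_list, bwd_list):
    # Every (prediction mode, reference) choice a brute-force encoder
    # would have to rate-distortion test: forward picks from fwd_list,
    # backward from bwd_list, and bi-directional picks one frame from
    # each list.
    combos = [("forward", f) for f in fwd_list]
    combos += [("backward", b) for b in bwd_list]
    combos += [("bi", pair) for pair in product(fwd_list, bwd_list)]
    return combos
```

With F forward and B backward candidates the encoder faces F + B + F*B combinations, each of which would otherwise be fully encoded before choosing the best.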
It was found that the purpose of inter-frame translation prediction is to find the reference block in the reference frame that is most similar to the current block to be encoded, and that for the same video content the motion of an image block is the same regardless of whether it is encoded with HEVC or VVC. Thus, for a given image block, the prediction mode and reference frame selected by HEVC are, with high probability, the same as those VVC would select. Based on this, the present application accelerates the reference frame selection process of VVC encoding by using the inter-frame translation prediction results from the HEVC bitstream. That is, by analyzing the reference frame selections and prediction mode results of the HEVC bitstream in a target area, the VVC encoder can make fast reference frame and prediction mode decisions when encoding that area, improving transcoding speed and saving computing resources. In other words, the method preserves the H.266 rate-distortion performance during transcoding as much as possible while reducing coding complexity as much as possible.
Referring to fig. 1, fig. 1 is a schematic view of a video coding scheme according to an embodiment of the present application. As shown in fig. 1, the video encoding scheme may be performed by a computer device 101, and the computer device 101 may be a terminal device or a server. The terminal device includes, but is not limited to: smart phones (such as Android phones, iOS phones, etc.), tablet computers, portable personal computers, smart home appliances, vehicle terminals, wearable devices, and other smart devices, which are not limited in the embodiments of the present application. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), basic cloud computing services such as big data, and artificial intelligence platforms, which is not limited in the embodiments of the present application.
It should be noted that the number of computer devices is merely an example and does not constitute a practical limitation of the present application; for example, the encoding scenario may also include another terminal device or server. The target video frame and its intermediate information may be transmitted to the computer device 101 by a computer device other than the computer device 101 (such as a terminal device), or may be obtained by the computer device 101 decoding locally stored or acquired code stream data produced by encoding the target video frame according to the first codec standard.
The general principle of a video coding scheme is as follows:
(1) The computer device 101 obtains the encoding information of the target video frame. The target video frame includes M first coding units, which are obtained by dividing the target video frame according to a first coding and decoding standard (such as the HEVC standard); the specific division manner may include any of the following: no division, horizontal binary division, vertical binary division, quad division, horizontal ternary division, and vertical ternary division. The encoding information includes the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard, where the reference information of any first coding unit may include whether its prediction mode includes an inter-frame translation prediction mode, and M is a positive integer. The position of each first coding unit in the target video frame may be indicated by, for example, coordinates, or a combination of coordinates and side lengths, which is not limited in this application. Optionally, the reference information of any first coding unit may further include one or both of a reference frame index and a prediction mode. The prediction modes may include one or more of a forward prediction mode, a backward prediction mode, and a bi-directional prediction mode.
In one embodiment, the computer device 101 obtains the code stream data of the target video frame under the first coding and decoding standard, and decodes the code stream data of the target video frame under the first coding and decoding standard according to the decoding standard corresponding to the first coding and decoding standard, to obtain the target video frame and the coding information of the target video frame.
(2) The computer device 101 obtains a second coding unit to be coded in the target video frame, where the second coding unit is obtained by dividing the target video frame according to a second coding and decoding standard (such as VVC standard), and the first coding and decoding standard is different from the second coding and decoding standard. It should be noted that, in the specific implementation process, the result of dividing the target video frame according to the first codec standard and the result of dividing the target video frame according to the second codec standard may be the same or different.
In one embodiment, the computer device 101 divides the target video frame according to the second coding standard to obtain one or more second coding units. The second coding unit to be coded may be any one of one or more second coding units.
(3) The computer device 101 screens out the first coding units having an overlapping area with the second coding unit from the M first coding units according to the positions of the M first coding units in the target video frame and the positions of the second coding units in the target video frame.
In one embodiment, the positions of the first coding units and the second coding units may be indicated by vertex coordinates, and the computer device 101 may determine M first coding units and corresponding regions of the second coding units in the target video frame by coordinates of the first coding units and the second coding units, and screen the first coding units having overlapping regions with the second coding units from the M first coding units according to the M first coding units and corresponding regions of the second coding units in the target video frame.
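As an illustrative sketch (the rectangle representation `(x, y, w, h)` is an assumption, not the patent's data structure), screening by position reduces to rectangle-intersection tests:

```python
def overlaps(a, b):
    # Rectangles as (x, y, width, height); true if the regions intersect.
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def screen_overlapping(first_units, second_unit):
    # Keep only the first-standard CUs whose region intersects the
    # second-standard CU currently being encoded.
    return [u for u in first_units if overlaps(u, second_unit)]
```

Only the surviving first coding units contribute reference information to the decision for the second coding unit.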
(4) The computer device 101 determines a reference frame of the second coding unit from the reference information of the first coding unit having an overlapping region with the second coding unit and the initial reference frame list of the second coding unit.
In one embodiment, if the prediction modes of the first coding units having an overlapping area with the second coding unit do not include the inter-frame translation prediction mode, that is, the first coding units having an overlapping area with the second coding unit are all encoded using non-inter-frame translation prediction modes, this indicates that the second coding unit is very unlikely to exhibit inter-frame translation correlation, so unidirectional prediction is performed using only the reference frame with the minimum play distance. The non-inter-frame translation prediction modes may include, for example, an intra-frame prediction mode or an inter-frame affine prediction mode, which is not particularly limited by the embodiments of the present application.
In another embodiment, if the prediction modes of the first coding units having an overlapping region with the second coding unit all include the inter-frame translation prediction mode, that is, the first coding units having an overlapping region with the second coding unit are all encoded using the inter-frame translation prediction mode, the reference frame list of the second coding unit may be reduced using the reference frames adopted by those first coding units. Specifically, a reference frame in the initial reference frame list of the second coding unit having the same reference frame index as a reference frame index in the coding information of the target video frame is determined as a reference frame of the second coding unit. Optionally, the initial reference frame list may include a forward reference frame list and a backward reference frame list, where the forward reference frame list of the second coding unit only retains the forward reference frames adopted by the above first coding units, and frames that are not adopted as forward reference frames by those first coding units are removed from the forward reference frame list. Similarly, frames that are not adopted as backward reference frames by those first coding units are removed from the backward reference frame list.
In still another embodiment, if only some of the first coding units having an overlapping region with the second coding unit are encoded using the inter-frame translation prediction mode, the reference frame list of the second coding unit may likewise be reduced using the reference frames adopted by the first coding units having an overlapping region with the second coding unit: a reference frame in the initial reference frame list of the second coding unit having the same reference frame index as a reference frame index in the coding information of the target video frame is determined as a reference frame of the second coding unit, the forward reference frame list only retains the forward reference frames adopted by those first coding units, and the backward reference frame list only retains the backward reference frames adopted by those first coding units. In addition, the reference frame in the reference frame list with the smallest play distance to the second coding unit is not removed.
(5) The computer device 101 encodes the second encoding unit based on the reference frame of the second encoding unit, to obtain the code stream data of the target video frame under the second coding and decoding standard.
In one embodiment, the target video frame is divided into one or more second coding units according to the second coding standard, and after obtaining the coding information of the target video frame, the computer device 101 may determine the reference frame of each second coding unit according to the method in step (3) and step (4), and encode each second coding unit based on its reference frame, to obtain the code stream data of the target video frame under the second coding and decoding standard.
In the embodiment of the present application, the coding information of a target video frame is obtained. The target video frame includes M first coding units, and the coding information includes the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard, where the reference information of any first coding unit includes whether the prediction mode of that first coding unit includes the inter-frame translation prediction mode. A second coding unit to be coded in the target video frame is obtained; the reference frame of the second coding unit is determined according to the reference information of the first coding units, among the M first coding units, having an overlapping area with the second coding unit and the initial reference frame list of the second coding unit; and the second coding unit is coded based on its reference frame, so as to obtain the code stream data of the target video frame under the second coding and decoding standard. In this way, the selection of the reference frame of the second coding unit is accelerated by the inter-frame translation prediction results of the first coding units having an overlapping area with it, which simplifies the determination of the reference frame of the second coding unit and effectively reduces the coding complexity of the video frame.
Based on the above video coding scheme, the embodiment of the present application proposes a more detailed video coding method, which will be described in detail below with reference to the accompanying drawings.
Referring to fig. 2, fig. 2 is a flowchart of a video encoding method according to an embodiment of the present application, where the video encoding method may be performed by a computer device, and the computer device may be a terminal device or a server. As shown in fig. 2, the video encoding method may include the following steps S201 to S205:
S201, obtaining coding information of a target video frame.
The target video frame may be any video frame of a video to be converted, where the video to be converted may be understood as a video that needs to convert code stream data encoded according to a first codec standard (e.g., HEVC standard) into code stream data encoded according to a second codec standard (e.g., VVC standard).
The target video frame comprises M first coding units, the M first coding units are obtained by dividing the target video frame according to a first coding and decoding standard, and M is a positive integer. Fig. 3a is a schematic diagram of a basic partitioning method provided in an embodiment of the present application. As shown in fig. 3a, the basic partitioning approach may include any of the following: non-division, horizontal two-division, vertical two-division, four-division, horizontal three-division, and vertical three-division.
It can be understood that, when the target video frame is divided according to the first codec standard, the above 6 basic division modes may be combined with each other, and the number of times the target video frame is divided according to the first codec standard is not limited. For example, the target video frame may first be horizontally two-divided to obtain its upper half and lower half; the upper half of the target video frame is then vertically three-divided, and the lower half is not further divided. Fig. 3b is a schematic diagram of a video frame division result according to an embodiment of the present application. As shown in fig. 3b, the target video frame is first four-divided to obtain its upper left, upper right, lower left and lower right parts; the upper left part is not further divided; the upper right part is four-divided again; the lower left part is horizontally two-divided, the upper half of the lower left part is vertically three-divided, and the lower half of the lower left part is not further divided; the lower right part is horizontally three-divided.
The coding information comprises the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard. The position of each first coding unit in the target video frame may be indicated in various ways, such as by coordinates or by a combination of coordinates and side lengths, which is not limited in this application. The reference information of each first coding unit may include whether the inter-frame translation prediction mode is used.
In one embodiment, the computer device may obtain the code stream data of the video to be converted under the first codec standard, and obtain the code stream data of the target video frame under the first codec standard from the code stream data of the video to be converted under the first codec standard, where the target video frame may be any video frame in the video to be converted. After the code stream data of the target video frame under the first coding and decoding standard is obtained, the computer equipment can decode the code stream data of the target video frame under the first coding and decoding standard according to the decoding standard corresponding to the first coding and decoding standard to obtain the target video frame and the coding information of the target video frame.
S202, acquiring a second coding unit to be coded in the target video frame.
The second coding unit is obtained by dividing the target video frame according to a second coding and decoding standard (such as VVC standard), where the first coding and decoding standard and the second coding and decoding standard are different.
In one embodiment, the computer device divides the object to be encoded according to P preset division modes (such as the five division modes in fig. 3a other than non-division), to obtain P division results of the object to be encoded; the object to be encoded may be the target video frame, or may be a target area in the target video frame (e.g., the upper half of the target video frame), where P is a positive integer. If none of the P division results yields higher coding efficiency than coding the object to be encoded without division, the computer device may determine the object to be encoded as the second coding unit to be coded in the target video frame. Correspondingly, if at least one of the P division results yields higher coding efficiency than coding the object to be encoded without division, the computer device may further divide the object to be encoded (using the division mode corresponding to the division result with the highest coding efficiency), until every coding unit obtained by division achieves the highest coding efficiency without further division.
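As an illustrative sketch only (not part of the claimed method), the recursive split decision described above can be expressed as follows, where `encode_cost`, `split_modes`, and `apply_split` are hypothetical stand-ins for the encoder's rate-distortion evaluation and partition machinery:

```python
def decide_partition(block, encode_cost, split_modes, apply_split):
    """Recursively decide whether to split `block` further.

    encode_cost(block)       -> cost of coding the block without splitting
    split_modes              -> the P candidate division modes
    apply_split(block, mode) -> list of sub-blocks for that mode
    (All names here are illustrative, not from any real encoder API.)
    """
    base = encode_cost(block)
    best_mode, best = None, base
    for mode in split_modes:
        subs = apply_split(block, mode)
        if len(subs) < 2:
            continue  # mode not applicable to this block
        cost = sum(encode_cost(sb) for sb in subs)
        if cost < best:  # lower cost = higher coding efficiency
            best_mode, best = mode, cost
    if best_mode is None:
        return [block]  # no division beats coding the block as-is
    units = []
    for sb in apply_split(block, best_mode):
        units.extend(decide_partition(sb, encode_cost, split_modes, apply_split))
    return units
```

The recursion stops exactly when, for each resulting unit, undivided coding has the highest efficiency, matching the termination condition described above.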
It should be noted that, in the specific implementation process, the basic dividing manner adopted by dividing the target video frame according to the first coding and decoding standard and dividing the target video frame according to the second coding and decoding standard may be the same; for example, in the process of dividing the target video frame according to the first coding standard and the process of dividing the target video frame according to the second coding standard, the basic division manner adopted by the computer device is 6 basic division manners shown in fig. 3 a. The result of dividing the target video frame according to the first codec standard and the result of dividing the target video frame according to the second codec standard may be the same or different.
S203, according to the positions of the M first coding units in the target video frame and the positions of the second coding units in the target video frame, the first coding units with the overlapping areas with the second coding units are screened out of the M first coding units.
In one embodiment, the positions of the first coding units and the second coding unit may be indicated by vertex coordinates, and the computer device may determine the regions of the M first coding units and of the second coding unit in the target video frame from their coordinates, and screen out, from the M first coding units, the first coding units having overlapping regions with the second coding unit according to those regions.
In one example, if four vertices of a first coding unit are not exactly the same as four vertices of a second coding unit, and all four vertices of the first coding unit are outside the second coding unit, and all four vertices of the second coding unit are outside the area covered by the first coding unit, the first coding unit and the second coding unit do not overlap, otherwise, there is overlap.
There are also various methods for determining whether a vertex is outside a coding unit region. For example, assuming that the vertex coordinates are (a, b), the upper left corner vertex coordinates of a rectangular region are (x, y), and its width and height are w and h, respectively, then when x < a < x + w and y < b < y + h, the vertex falls inside the rectangular region; otherwise, the vertex falls outside the rectangular region.
In another example, assume that the upper left corner vertex coordinates of the second coding unit are (a0, b0) and its lower right corner vertex coordinates are (a1, b1), and that the upper left corner vertex coordinates of the first coding unit are (c0, d0) and its lower right corner vertex coordinates are (c1, d1). Then the first coding unit and the second coding unit have an overlapping region if any one of the following conditions is met: b1 < d0 < b0 and a0 < c0 < a1; b1 < d0 < b0 and a0 < c1 < a1; b1 < d1 < b0 and a0 < c0 < a1; or b1 < d1 < b0 and a0 < c1 < a1.
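For illustration only, such vertex-based overlap checks reduce to a standard axis-aligned interval test; the helper names below are assumptions, not part of any codec API:

```python
def intervals_overlap(lo1, hi1, lo2, hi2):
    # Two intervals overlap iff each one starts strictly before the other ends.
    return lo1 < hi2 and lo2 < hi1

def cus_overlap(cu_a, cu_b):
    # Each CU is given as ((x0, y0), (x1, y1)): two opposite corner vertices.
    # min/max normalizes the coordinates, so the test works regardless of
    # which corner is listed first or which way the y axis points.
    (ax0, ay0), (ax1, ay1) = cu_a
    (bx0, by0), (bx1, by1) = cu_b
    return (intervals_overlap(min(ax0, ax1), max(ax0, ax1),
                              min(bx0, bx1), max(bx0, bx1))
            and intervals_overlap(min(ay0, ay1), max(ay0, ay1),
                                  min(by0, by1), max(by0, by1)))
```

Note that with the strict inequalities above, two coding units that merely share an edge do not count as overlapping.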
S204, determining the reference frame of the second coding unit according to the reference information of the first coding unit with the overlapping area with the second coding unit and the initial reference frame list of the second coding unit.
In one embodiment, if the prediction modes of the first coding units having the overlapping areas with the second coding units do not include the inter-frame translation prediction mode, a play distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit is obtained. And selecting a target reference frame from the initial reference frame list based on the play distance between each reference frame and the second coding unit, and determining the target reference frame as the reference frame of the second coding unit. The playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit.
In another embodiment, if the prediction mode of at least one first coding unit having an overlap region with the second coding unit includes the inter-frame translation prediction mode, a reference frame in the initial reference frame list having the same reference frame index as a reference frame index in the reference information is determined as the reference frame of the second coding unit.
Optionally, the initial reference frame list may include a forward reference frame list and a backward reference frame list, and the reference information of the target video frame may further include the prediction modes adopted by the M first coding units. On this basis, a reference frame in the forward reference frame list whose index is the same as the reference frame index of a reference frame used in the forward prediction mode may be determined as a reference frame of the second coding unit, and a reference frame in the backward reference frame list whose index is the same as the reference frame index of a reference frame used in the backward prediction mode may be determined as a reference frame of the second coding unit.
Optionally, if only some of the first coding units having an overlapping region with the second coding unit use a prediction mode that includes the inter-frame translation prediction mode, the play distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit may be obtained, and a target reference frame may be selected from the initial reference frame list based on those play distances, where the play distance between the target reference frame and the second coding unit is smaller than the play distance between any other reference frame and the second coding unit. The reference frames in the initial reference frame list having the same reference frame index as a reference frame index in the reference information, together with the target reference frame, are then determined as the reference frames of the second coding unit.
S205, encoding the second encoding unit based on the reference frame of the second encoding unit to obtain the code stream data of the target video frame under the second encoding and decoding standard.
The computer equipment encodes the second coding unit according to the second coding and decoding standard and the reference frame of the second coding unit to obtain the coding result of the second coding unit. After obtaining the encoding results of all the second encoding units included in the target video frame, the computer equipment obtains the code stream data of the target video frame under the second encoding and decoding standard based on the encoding results of all the second encoding units included in the target video frame.
In the embodiment of the application, the target video frame and its coding information are obtained. The target video frame includes M first coding units, the coding information includes the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard, and the reference information of any first coding unit includes whether the prediction mode of that first coding unit includes the inter-frame translation prediction mode. A second coding unit to be coded in the target video frame is obtained, the reference frame of the second coding unit is determined according to the reference information of the first coding units, among the M first coding units, having an overlapping area with the second coding unit and the initial reference frame list of the second coding unit, and the second coding unit is coded based on its reference frame, so as to obtain the code stream data of the target video frame under the second coding and decoding standard. In this way, the selection of the reference frame of the second coding unit is accelerated by the inter-frame translation prediction results of the first coding units having an overlapping area with it, which simplifies the determination of the reference frame of the second coding unit and effectively reduces the coding complexity of the video frame.
Referring to fig. 4, fig. 4 is a flowchart of another video encoding method provided in an embodiment of the present application, where the video encoding method may be performed by a computer device, and the computer device may be a terminal device or a server. As shown in fig. 4, the video encoding method may include the following steps S401 to S416:
S401, obtaining coding information of a target video frame.
The encoding information of the target video frame may include the positions of the M first coding units in the target video frame, and the reference information of the M first coding units under the first codec standard. The reference information of each first coding unit may include the following three kinds of information: whether the first coding unit used the inter-frame translation prediction mode when encoded according to the first codec standard; in the case that the first coding unit used the inter-frame translation prediction mode, which prediction mode it specifically used, such as the forward prediction mode, the backward prediction mode, or the bi-directional prediction mode; and the reference frame index of the reference frame adopted by the first coding unit in the process of encoding according to the first codec standard.
S402, acquiring a second coding unit to be coded in the target video frame.
S403, according to the positions of the M first coding units in the target video frame and the positions of the second coding units in the target video frame, the first coding units with the overlapping areas with the second coding units are screened out of the M first coding units.
Fig. 5a is a schematic diagram of a second coding unit included in one first coding unit according to an embodiment of the present application. As shown in fig. 5a, when the second coding unit is included in one first coding unit, the second coding unit has an overlapping region with only that first coding unit. Specifically, the second coding unit being included in the j-th first coding unit means that the second coding unit either coincides exactly with the j-th first coding unit or is located inside it, where j is a positive integer less than or equal to M.
In one embodiment, assume that the upper left corner vertex coordinates of the second coding unit are (a0, b0) and its lower right corner vertex coordinates are (a1, b1), and that the upper left corner vertex coordinates of the j-th first coding unit are (c0, d0) and its lower right corner vertex coordinates are (c1, d1). The condition to be satisfied for the second coding unit to be included in the j-th first coding unit can be expressed as: a0 ≥ c0, d0 ≥ b0, c1 ≥ a1, and b1 ≥ d1.
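A minimal sketch of such a containment condition, under the coordinate convention of the overlap inequalities in step S203 (top-left y not smaller than bottom-right y; function name hypothetical):

```python
def cu_contained_in(second, first):
    # second, first: ((x_tl, y_tl), (x_br, y_br)) corner-vertex pairs.
    # The second CU lies inside the first CU iff each of its edges is at
    # or inside the corresponding edge of the first CU.
    (a0, b0), (a1, b1) = second
    (c0, d0), (c1, d1) = first
    return a0 >= c0 and d0 >= b0 and c1 >= a1 and b1 >= d1
```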
Fig. 5b is a schematic diagram of a second coding unit included in a plurality of first coding units according to an embodiment of the present application. As shown in fig. 5b, when the second coding unit is included in at least two first coding units, there is an overlapping region of the second coding unit with a plurality (at least two) of the first coding units.
The specific embodiments of step S401 to step S403 may refer to the embodiments of step S201 to step S203 in fig. 2, and are not described herein.
S404, acquiring an initial reference frame list and a candidate prediction mode of the second coding unit.
Wherein the initial reference frame list may include at least one of a forward reference frame list and a backward reference frame list. The candidate prediction modes may include at least one of a forward prediction mode, a backward prediction mode, and a bi-prediction mode.
For example, in encoding the second coding unit according to the second coding standard (e.g., VVC standard) by the second encoder (e.g., VVC encoder), the initial reference frame list and the candidate prediction mode may be determined for the second coding unit.
S405, if the prediction modes of the first coding units with the overlapping areas with the second coding units do not include the inter-frame translation prediction mode, obtaining the play distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit.
S406, selecting a target reference frame from the initial reference frame list based on the play distance between each reference frame and the second coding unit, and determining the target reference frame as the reference frame of the second coding unit.
In this embodiment, the computer device may determine whether the prediction modes of the first coding units having an overlapping area with the second coding unit do not include the inter-frame translation prediction mode. If so, that is, if the first coding units having an overlapping area with the second coding unit are all encoded using non-inter-frame translation prediction modes, this indicates that the second coding unit is very unlikely to exhibit inter-frame translation correlation, so unidirectional prediction is performed using only the reference frame with the smallest play distance.
Specifically, the computer device may obtain a play distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit when the prediction modes of each first coding unit having an overlapping region with the second coding unit do not include the inter-frame translation prediction mode; and selecting a target reference frame from the initial reference frame list based on the play distance between each reference frame and the second coding unit, and determining the target reference frame as the reference frame of the second coding unit. The playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit.
Play distance refers to the number of frames, in video play order, between a reference frame and the frame currently being encoded (i.e., the target video frame). Assuming that the reference frame is numbered r in the video sequence and the frame currently being encoded is numbered f, the play distance between them is |f - r|. The play distance between the target reference frame and the second coding unit is the play distance between the target reference frame and the target video frame.
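As a small illustrative sketch of this definition, selecting the reference frame with the smallest play distance reduces to taking the minimum of |f - r| over the candidate reference frames (function name hypothetical):

```python
def select_min_distance_ref(current_frame_number, ref_frame_numbers):
    # Play distance = |f - r| in display order, as defined above.
    # Returns the candidate reference frame number closest to the frame
    # currently being encoded.
    return min(ref_frame_numbers,
               key=lambda r: abs(current_frame_number - r))
```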
S407, if the reference frame is located in the forward reference frame list, the second coding unit is coded based on the reference frame and the forward prediction mode of the second coding unit, so as to obtain the code stream data of the target video frame under the second coding and decoding standard.
And S408, if the reference frame is positioned in the backward reference frame list, encoding the second coding unit based on the reference frame and the backward prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
After determining the reference frame of the second coding unit, it may be determined whether the reference frame is located in the forward reference frame list or the backward reference frame list of the second coding unit. If the reference frame is located in the forward reference frame list, the computer device may determine that the prediction mode of the second coding unit is a forward prediction mode, and further encode the second coding unit based on the reference frame and the forward prediction mode of the second coding unit, to obtain an encoding result of the second coding unit. If the reference frame is located in the backward reference frame list, the computer device may determine that the prediction mode of the second coding unit is a backward prediction mode, and further encode the second coding unit based on the reference frame and the backward prediction mode of the second coding unit, to obtain an encoding result of the second coding unit.
After obtaining the encoding results of all the second encoding units included in the target video frame, the computer equipment obtains the code stream data of the target video frame under the second encoding and decoding standard based on the encoding results of all the second encoding units included in the target video frame.
S409, if the prediction modes of the first coding units having overlapping areas with the second coding unit all include the inter-frame translation prediction mode, the reference frame in the forward reference frame list having the same index as a reference frame index used in the forward prediction mode is determined as a reference frame of the second coding unit, and the reference frame in the backward reference frame list having the same index as a reference frame index used in the backward prediction mode is determined as a reference frame of the second coding unit.
In this embodiment, when the overlapped first coding units are all encoded using the inter-frame translation prediction mode, the initial reference frame list of the second coding unit may be reduced using the reference frames employed by the overlapped first coding units. Specifically, the forward reference frame list of the second coding unit only retains forward reference frames employed by the overlapped first coding unit, and frames not employed as forward reference frames by the overlapped first coding unit are to be removed from the forward reference frame list. Similarly, frames that are not employed as backward reference frames by the overlapped first coding unit will be removed from the backward reference frame list.
For example, assume that the first coding units having an overlapping region with the second coding unit include a first first coding unit and a second first coding unit. Based on the reference information of the target video frame, the computer device knows that the reference frame indexes adopted by the first first coding unit are 1, 2 and 3, and the reference frames indicated by those indexes are forward reference frames; the reference frame indexes adopted by the second first coding unit are 2, 4 and 7, and the reference frames indicated by those indexes are backward reference frames. Assume that the forward reference frame list of the second coding unit includes 1, 2, 4 and 5, and the backward reference frame list includes 7, 9 and 11. Since reference frames 1 and 2 adopted by the first first coding unit are located in the forward reference frame list of the second coding unit, the computer device may determine reference frames 1 and 2 as reference frames of the second coding unit. In addition, since reference frame 7 adopted by the second first coding unit is located in the backward reference frame list of the second coding unit, the computer device may determine reference frame 7 as a reference frame of the second coding unit. That is, the reference frames of the second coding unit may include reference frame 1, reference frame 2, and reference frame 7.
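The list pruning in this example can be sketched as follows; the per-unit data layout is an assumption made for illustration, since a real transcoder would read these indexes from the decoded syntax elements:

```python
def prune_reference_lists(fwd_list, bwd_list, overlapping_cus):
    """Keep only reference frames that some overlapping first CU adopted.

    overlapping_cus: per-CU dicts like {"fwd": {...indexes}, "bwd": {...}}.
    (Illustrative layout, not a real codec data structure.)
    """
    used_fwd = set().union(*(cu.get("fwd", set()) for cu in overlapping_cus))
    used_bwd = set().union(*(cu.get("bwd", set()) for cu in overlapping_cus))
    # Frames not adopted as forward/backward references are removed.
    return ([f for f in fwd_list if f in used_fwd],
            [b for b in bwd_list if b in used_bwd])
```

With the lists from the example, the pruned forward list keeps frames 1 and 2 and the pruned backward list keeps frame 7.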
S410, determining the prediction mode which is the same as the prediction mode adopted by the M first coding units in the candidate prediction modes as the prediction mode of the second coding unit.
In this embodiment, when the overlapped first coding units are all encoded using the inter-frame translation prediction mode, the candidate prediction modes of the second coding unit may be reduced using the prediction modes employed by the overlapped first coding units. Specifically, prediction modes not employed by any overlapped first coding unit are eliminated from the candidate prediction modes.
For example, assume that the first coding units having an overlapping region with the second coding unit include a first one and a second one of the first coding units. The computer device learns, from the reference information of the target video frame, that the prediction mode adopted by the first one of the first coding units is the forward prediction mode, and the prediction mode adopted by the second one of the first coding units is the backward prediction mode. Assuming that the candidate prediction modes of the second coding unit are the forward prediction mode, the backward prediction mode, and the bi-directional prediction mode, since neither overlapped first coding unit adopts the bi-directional prediction mode, the computer device may determine the forward prediction mode and the backward prediction mode as the prediction modes of the second coding unit.
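As a hedged illustration (the function name and mode strings are assumptions, not the patent's notation), this mode pruning amounts to intersecting the candidate set with the modes actually used by the overlapped first coding units:

```python
def prune_prediction_modes(candidates, used_modes):
    """Narrow the second coding unit's candidate prediction modes to
    those adopted by the overlapped first coding units."""
    return [m for m in candidates if m in used_modes]

# Example from the text: the two overlapped first coding units used the
# forward and backward modes, so bi-directional prediction is eliminated.
modes = prune_prediction_modes(
    candidates=["forward", "backward", "bidirectional"],
    used_modes={"forward", "backward"})
print(modes)  # ['forward', 'backward']
```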
S411, coding the second coding unit based on the reference frame of the second coding unit and the prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
After the computer device determines the reference frames and prediction modes of the second coding unit, it may combine them into a plurality of coding modes, traverse all the coding modes, and select the coding mode with the best rate-distortion performance to encode the second coding unit, thereby obtaining the encoding result of the second coding unit. After obtaining the encoding results of all the second coding units included in the target video frame, the computer device obtains the code stream data of the target video frame under the second coding and decoding standard based on these encoding results.
In one example, assume that the reference frames of the second coding unit include reference frame 1 and reference frame 2 in the forward reference frame list, and reference frame 7 in the backward reference frame list; the prediction modes of the second coding unit include the forward prediction mode and the backward prediction mode. On this basis, the computer device may traverse the combinations to obtain three coding modes, namely forward prediction with reference frame 1, forward prediction with reference frame 2, and backward prediction with reference frame 7.
In another example, assume that the reference frames of the second coding unit include reference frame 1 and reference frame 2 in the forward reference frame list, and reference frame 7 in the backward reference frame list; the prediction modes of the second coding unit include only the forward prediction mode. On this basis, the computer device may traverse the combinations to obtain two coding modes, namely forward prediction with reference frame 1 and forward prediction with reference frame 2.
In yet another example, assume that the reference frames of the second coding unit include reference frame 1 and reference frame 2 in the forward reference frame list, and reference frame 7 in the backward reference frame list; the prediction modes of the second coding unit include the forward prediction mode and the bi-directional prediction mode. On this basis, the computer device may traverse the combinations to obtain four coding modes, namely forward prediction with reference frame 1, forward prediction with reference frame 2, bi-directional prediction with reference frames 1 and 7, and bi-directional prediction with reference frames 2 and 7.
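The three examples above can be sketched as one enumeration routine (an illustrative assumption, not the patent's implementation), where bi-directional prediction pairs one forward and one backward reference frame:

```python
def enumerate_coding_modes(fwd_refs, bwd_refs, pred_modes):
    """Combine pruned reference frames with pruned prediction modes into
    the candidate coding modes to be searched by rate-distortion cost."""
    combos = []
    if "forward" in pred_modes:
        combos += [("forward", f) for f in fwd_refs]
    if "backward" in pred_modes:
        combos += [("backward", b) for b in bwd_refs]
    if "bidirectional" in pred_modes:
        combos += [("bidirectional", (f, b)) for f in fwd_refs for b in bwd_refs]
    return combos

# First example: forward refs {1, 2}, backward ref {7} -> 3 coding modes.
three = enumerate_coding_modes([1, 2], [7], {"forward", "backward"})
# Third example: forward + bi-directional -> 4 coding modes.
four = enumerate_coding_modes([1, 2], [7], {"forward", "bidirectional"})
print(len(three), len(four))  # 3 4
```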
S412, if the prediction modes of only some of the first coding units having an overlapping area with the second coding unit include the inter-frame translation prediction mode, determining, in the forward reference frame list, the reference frames whose indexes match the reference frame indexes adopted under the forward prediction mode as reference frames of the second coding unit, and determining, in the backward reference frame list, the reference frames whose indexes match the reference frame indexes adopted under the backward prediction mode as reference frames of the second coding unit.
In this embodiment, when only some of the overlapped first coding units are encoded using the inter-frame translation prediction mode, the initial reference frame list of the second coding unit may be reduced based on those overlapped first coding units that use inter-frame translation prediction. That is, frames not employed as forward reference frames by at least one overlapped first coding unit using inter-frame translation prediction are removed from the forward reference frame list of the second coding unit, and frames not employed as backward reference frames by at least one such first coding unit are removed from the backward reference frame list.
For example, assume that the first coding units having an overlapping area with the second coding unit include a first one, a second one, and a third one of the first coding units. The computer device learns, from the reference information of the target video frame, that the first one adopts the forward prediction mode with reference frame indexes 1, 2, and 3, the frames indicated by these indexes being forward reference frames; that the second one adopts the backward prediction mode with reference frame indexes 2, 4, and 7, the frames indicated by these indexes being backward reference frames; and that the third one adopts a prediction mode other than inter-frame translation prediction. Assume further that the forward reference frame list of the second coding unit includes 1, 2, 4, 5 and the backward reference frame list includes 7, 9, 11. Since reference frames 1 and 2 employed by the first one of the first coding units are located in the forward reference frame list of the second coding unit, the computer device may determine reference frames 1 and 2 as reference frames of the second coding unit. In addition, since reference frame 7 employed by the second one of the first coding units is located in the backward reference frame list of the second coding unit, the computer device may determine reference frame 7 as a reference frame of the second coding unit. That is, the reference frames of the second coding unit may include reference frame 1, reference frame 2, and reference frame 7.
S413, obtaining the playing distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit.
S414, selecting a target reference frame from the initial reference frame list based on the play distance between each reference frame and the second coding unit.
The playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit.
It can be understood that, in the case where only part of the overlapped first coding units uses the inter-frame translation prediction mode for coding, the reference frame with the smallest playing distance from the target video frame in the initial reference frame list of the second coding unit is not removed from the initial reference frame list.
S415, determining the target reference frame as the reference frame of the second coding unit.
Optionally, the computer device may obtain a play distance between each reference frame in the forward reference frame list of the second coding unit and the second coding unit, and select the first target reference frame from the forward reference frame list based on the play distance between each reference frame and the second coding unit, where the play distance between the first target reference frame and the second coding unit is smaller than the play distances between other reference frames in the forward reference frame list and the second coding unit. In addition, the computer device may further obtain a play distance between each reference frame in the backward reference frame list of the second coding unit and the second coding unit, and select a second target reference frame from the backward reference frame list based on the play distance between each reference frame and the second coding unit, where the play distance between the second target reference frame and the second coding unit is smaller than the play distances between other reference frames in the backward reference frame list and the second coding unit. Further, the computer device may determine the first target reference frame and the second target reference frame as reference frames of the second coding unit.
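A minimal sketch of this distance-based selection follows, under the assumption (ours, for illustration) that the playing distance can be modeled as the absolute difference in display order between a reference frame and the frame being encoded:

```python
def nearest_reference(ref_list, frame_pos):
    """Pick the reference frame whose playing distance (difference in
    display order) to the current frame is smallest."""
    return min(ref_list, key=lambda r: abs(r - frame_pos))

# Target frame at display position 6: from forward list [1, 2, 4, 5] the
# first target reference frame is 5, from backward list [7, 9, 11] the
# second target reference frame is 7.
first_target = nearest_reference([1, 2, 4, 5], 6)
second_target = nearest_reference([7, 9, 11], 6)
print(first_target, second_target)  # 5 7
```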
S416, encoding the second coding unit based on the reference frame of the second coding unit and the candidate prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
In the case where only some of the overlapped first coding units are encoded using the inter-frame translation prediction mode, the candidate prediction modes of the second coding unit may be left unreduced; that is, the candidate prediction modes of the second coding unit are determined as the prediction modes of the second coding unit. After the computer device determines the reference frames and prediction modes of the second coding unit, it may combine them into a plurality of coding modes, traverse all the coding modes, and select the coding mode with the best rate-distortion performance to encode the second coding unit, thereby obtaining the encoding result of the second coding unit. After obtaining the encoding results of all the second coding units included in the target video frame, the computer device obtains the code stream data of the target video frame under the second coding and decoding standard based on these encoding results.
In one example, assume that the reference frames of the second coding unit include reference frame 1 and reference frame 2 in the forward reference frame list, and reference frame 7 in the backward reference frame list; the prediction modes of the second coding unit include the forward prediction mode and the backward prediction mode. On this basis, the computer device may traverse the combinations to obtain three coding modes, namely forward prediction with reference frame 1, forward prediction with reference frame 2, and backward prediction with reference frame 7.
In another example, assume that the reference frames of the second coding unit include reference frame 1 and reference frame 2 in the forward reference frame list, and reference frame 7 in the backward reference frame list; the prediction modes of the second coding unit include only the forward prediction mode. On this basis, the computer device may traverse the combinations to obtain two coding modes, namely forward prediction with reference frame 1 and forward prediction with reference frame 2.
In yet another example, assume that the reference frames of the second coding unit include reference frame 1 and reference frame 2 in the forward reference frame list, and reference frame 7 in the backward reference frame list; the prediction modes of the second coding unit include the forward prediction mode and the bi-directional prediction mode. On this basis, the computer device may traverse the combinations to obtain four coding modes, namely forward prediction with reference frame 1, forward prediction with reference frame 2, bi-directional prediction with reference frames 1 and 7, and bi-directional prediction with reference frames 2 and 7.
In the embodiment of the application, the selection of the reference frame and the prediction mode of the second coding unit is accelerated using the inter-frame translation prediction results of the first coding units having an overlapping area with the second coding unit, which simplifies the process of determining the reference frame and the prediction mode of the second coding unit and thus effectively reduces the coding complexity of the video frame.
Fig. 5c is a schematic diagram of a code stream conversion flow according to an embodiment of the present application. As shown in fig. 5c, in the process of code stream conversion, first, the first code stream data (the code stream data obtained by encoding the video according to the first coding and decoding standard) is decoded by a decoder (such as an HEVC decoder), so as to obtain the decompressed video and the encoding information. After obtaining the decompressed video and the encoding information, an encoder (e.g., a VVC encoder) encodes the decompressed video based on the encoding information (according to a second encoding and decoding standard) to obtain second code stream data.
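The flow of Fig. 5c can be sketched as follows; the `HEVCDecoder` and `VVCEncoder` classes here are stand-in placeholders for illustration, not a real codec API:

```python
class HEVCDecoder:
    """Placeholder for a first-standard decoder (e.g., HEVC)."""
    def decode(self, bitstream):
        # Returns (decompressed_video, coding_info); stubbed here.
        return bitstream["video"], bitstream["coding_info"]

class VVCEncoder:
    """Placeholder for a second-standard encoder (e.g., VVC)."""
    def encode(self, video, coding_info):
        # Reuses the first standard's coding info (CU positions, reference
        # info) to speed up reference-frame and mode decisions.
        return {"standard": "VVC", "frames": len(video), "hints": coding_info}

first_stream = {"video": ["f0", "f1"], "coding_info": {"cu_positions": []}}
video, info = HEVCDecoder().decode(first_stream)
second_stream = VVCEncoder().encode(video, info)
print(second_stream["standard"], second_stream["frames"])  # VVC 2
```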
The foregoing details of the method of embodiments of the present application are set forth in order to provide a better understanding of the foregoing aspects of embodiments of the present application, and accordingly, the following provides a device of embodiments of the present application.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a video encoding apparatus provided in an embodiment of the present application, where the video encoding apparatus shown in fig. 6 may be installed in a computer device, and the computer device may be a terminal device or a server. The video encoding device may be used to perform some or all of the functions of the method embodiments described above with respect to fig. 2 and 4. Referring to fig. 6, the video encoding apparatus includes:
An obtaining unit 601, configured to obtain encoding information of a target video frame, where the target video frame includes M first encoding units, and the M first encoding units are obtained by dividing the target video frame according to a first coding and decoding standard; the coding information comprises the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard, the reference information of any first coding unit comprises whether the prediction mode of any first coding unit comprises an inter-frame translation prediction mode or not, and M is a positive integer;
the obtaining unit 601 is further configured to obtain a second coding unit to be coded in the target video frame, where the second coding unit is obtained by dividing the target video frame according to a second coding and decoding standard, and the second coding and decoding standard is different from the first coding and decoding standard;
a processing unit 602, configured to screen, according to the positions of the M first coding units in the target video frame and the positions of the second coding units in the target video frame, first coding units that have an overlapping area with the second coding units from the M first coding units;
The processing unit 602 is further configured to determine a reference frame of the second coding unit according to the reference information of the first coding unit having an overlapping area with the second coding unit and the initial reference frame list of the second coding unit;
the processing unit 602 is further configured to encode the second encoding unit based on the reference frame of the second encoding unit, to obtain the code stream data of the target video frame under the second coding and decoding standard.
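The position-based screening performed by the processing unit can be illustrated by a simple axis-aligned rectangle overlap test; the (x, y, width, height) representation of coding unit positions is an assumption for this sketch, as the patent does not fix a representation:

```python
def overlaps(cu_a, cu_b):
    """Axis-aligned overlap test between two coding units given as
    (x, y, width, height) rectangles in frame coordinates."""
    ax, ay, aw, ah = cu_a
    bx, by, bw, bh = cu_b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

# A 64x64 second coding unit at (0, 0) overlaps a 32x32 first coding
# unit at (32, 32) but not one that starts at (64, 0).
print(overlaps((0, 0, 64, 64), (32, 32, 32, 32)))  # True
print(overlaps((0, 0, 64, 64), (64, 0, 32, 32)))   # False
```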
In one embodiment, the processing unit 602 determines the reference frame of the second coding unit according to the reference information of the first coding unit having an overlapping area with the second coding unit and the initial reference frame list of the second coding unit, including:
if the prediction modes of the first coding units with the overlapping areas with the second coding units do not include the inter-frame translation prediction mode, obtaining the play distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit;
selecting a target reference frame from the initial reference frame list based on the play distance between each reference frame and the second coding unit, and determining the target reference frame as the reference frame of the second coding unit; the playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit.
In one embodiment, the initial reference frame list includes a forward reference frame list and a backward reference frame list;
the processing unit 602 encodes the second encoding unit based on the reference frame of the second encoding unit to obtain the code stream data of the target video frame under the second coding and decoding standard, where the code stream data includes:
if the reference frame is located in the forward reference frame list, encoding the second encoding unit based on the reference frame and the forward prediction mode of the second encoding unit to obtain code stream data of the target video frame under the second encoding and decoding standard;
and if the reference frame is positioned in the backward reference frame list, encoding the second encoding unit based on the reference frame and the backward prediction mode of the second encoding unit to obtain the code stream data of the target video frame under the second encoding and decoding standard.
In one embodiment, the reference information further includes reference frame indexes of the M first coding units;
the processing unit 602 determines a reference frame of the second coding unit according to the reference information of the first coding unit having an overlapping area with the second coding unit and the initial reference frame list of the second coding unit, including:
And if the prediction mode of at least one first coding unit with the overlapping area with the second coding unit comprises an inter-frame translation prediction mode, determining the reference frame with the same reference frame index in the initial reference frame list as the reference frame of the second coding unit.
In one embodiment, the initial reference frame list includes a forward reference frame list and a backward reference frame list, and the reference information further includes prediction modes adopted by the M first coding units;
the processing unit 602 determines, as the reference frame of the second encoding unit, the reference frame having the same reference frame index as the reference frame index in the initial reference frame list, including:
and determining the reference frame with the same index as the reference frame with the forward prediction mode in the forward reference frame list as the reference frame of the second coding unit, and determining the reference frame with the same index as the reference frame with the backward prediction mode in the backward reference frame list as the reference frame of the second coding unit.
In one embodiment, the prediction modes of the respective first coding units having an overlap region with the second coding unit each include an inter-frame translation prediction mode;
The obtaining unit 601 is further configured to obtain a candidate prediction mode of the second encoding unit;
the processing unit 602 is further configured to determine, as a prediction mode of the second coding unit, a prediction mode that is the same as a prediction mode adopted by the M first coding units in the candidate prediction modes;
the processing unit 602 encodes the second encoding unit based on the reference frame of the second encoding unit to obtain the code stream data of the target video frame under the second coding and decoding standard, where the code stream data includes:
and encoding the second coding unit based on the reference frame of the second coding unit and the prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
In one embodiment, the prediction mode of the portion of the first coding unit having an overlap region with the second coding unit includes an inter-frame translation prediction mode;
the obtaining unit 601 is further configured to obtain a candidate prediction mode of the second encoding unit;
the processing unit 602 is further configured to determine the candidate prediction mode as a prediction mode of the second coding unit;
The processing unit 602 encodes the second encoding unit based on the reference frame of the second encoding unit to obtain the code stream data of the target video frame under the second coding and decoding standard, where the code stream data includes:
and encoding the second coding unit based on the reference frame of the second coding unit and the prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
In one embodiment, the prediction mode of the portion of the first coding unit having an overlap region with the second coding unit includes an inter-frame translation prediction mode;
the processing unit 602 determines, as the reference frame of the second encoding unit, the reference frame having the same reference frame index as the reference frame index in the initial reference frame list, including:
acquiring the playing distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit;
selecting a target reference frame from the initial reference frame list based on the playing distance between each reference frame and the second coding unit; the playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit;
And determining the reference frame with the same reference frame index as the reference frame index in the initial reference frame list as the reference frame of the second coding unit.
In one embodiment, the processing unit 602 obtains a second encoding unit to be encoded in the target video frame, including:
dividing an object to be encoded according to P preset dividing modes to obtain P dividing results of the object to be encoded, wherein the object to be encoded is a target video frame or a target region in the target video frame, and P is a positive integer;
and if the coding efficiency of the object to be encoded under each of the P division results is not higher than the coding efficiency of the undivided object to be encoded, determining the object to be encoded as a second coding unit to be encoded in the target video frame.
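The split decision described above can be sketched as follows, under the illustrative assumption that coding efficiency is compared via rate-distortion cost (lower cost meaning higher efficiency); the function name and framing are ours, not the patent's:

```python
def decide_no_split(rd_cost_unsplit, rd_costs_split):
    """Keep the block as one second coding unit when none of the P
    division results improves on encoding it whole."""
    return all(c >= rd_cost_unsplit for c in rd_costs_split)

# No division beats the whole block: it becomes a second coding unit.
print(decide_no_split(100.0, [120.0, 105.0, 130.0]))  # True
# One division is cheaper: the block should be divided further.
print(decide_no_split(100.0, [90.0, 105.0]))          # False
```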
In one embodiment, the processing unit 602 obtains the target video frame and the encoding information of the target video frame, including:
acquiring code stream data of a target video frame under a first coding and decoding standard;
and decoding the code stream data of the target video frame under the first coding and decoding standard to obtain the target video frame and the coding information of the target video frame.
According to one embodiment of the present application, some of the steps involved in the video encoding methods shown in fig. 2 and 4 may be performed by various units in the video encoding apparatus shown in fig. 6. For example, step S201 and step S202 shown in fig. 2 may be performed by the acquisition unit 601 shown in fig. 6, and steps S203 to S205 may be performed by the processing unit 602 shown in fig. 6; step S401 and step S402 shown in fig. 4 may be performed by the acquisition unit 601 shown in fig. 6, and steps S403 to S416 may be performed by the processing unit 602 shown in fig. 6. Some or all of the units in the video encoding apparatus shown in fig. 6 may be combined into one or several other units, or one or more of the units may be further split into a plurality of functionally smaller units to achieve the same operation, without affecting the implementation of the technical effects of the embodiments of the present application. The above units are divided based on logic functions; in practical applications, the function of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the present application, the video encoding apparatus may also include other units, and in practical applications these functions may also be implemented with the assistance of other units, or by cooperation of multiple units.
According to another embodiment of the present application, the video encoding apparatus shown in fig. 6 may be constructed, and the video encoding method of the embodiments of the present application implemented, by running a computer program (including program code) capable of executing the steps involved in the respective methods shown in fig. 2 and 4 on a general-purpose computing apparatus, such as a computer device, that includes processing elements and storage elements such as a central processing unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM). The computer program may be recorded on, for example, a computer-readable recording medium, loaded into the above-described computing apparatus through the computer-readable recording medium, and run therein.
Based on the same inventive concept, the principles and beneficial effects of the video coding device provided in the embodiments of the present application for solving the problems are similar to those of the video coding method in the embodiments of the present application, and may refer to the principles and beneficial effects of implementation of the method, which are not described herein for brevity.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a computer device provided in an embodiment of the present application, where the computer device may be a terminal device or a server. As shown in fig. 7, the computer device includes at least a processor 701, a communication interface 702, and a memory 703, which may be connected by a bus or in other ways. The processor 701 (or central processing unit (Central Processing Unit, CPU)) is the computing core and control core of the computer device, and can parse various instructions in the computer device and process various data of the computer device. For example, the CPU can parse a power-on/off instruction sent by the object to the computer device and control the computer device to perform a power-on/off operation; for another example, the CPU may transmit various types of interaction data between internal structures of the computer device, and so on. The communication interface 702 may optionally include a standard wired interface or a wireless interface (e.g., WI-FI, a mobile communication interface, etc.), and may be controlled by the processor 701 to receive and transmit data; the communication interface 702 may also be used for transmission and interaction of data within the computer device. The memory 703 (Memory) is a memory device in the computer device for storing programs and data. It will be appreciated that the memory 703 here may comprise a built-in memory of the computer device, or an extended memory supported by the computer device. The memory 703 provides storage space that stores the operating system of the computer device, which may include, but is not limited to: an Android system, an internet operating system (Internetworking Operating System, IOS), etc., which is not limited in this application.
The embodiments of the present application also provide a computer-readable storage medium (Memory), which is a Memory device in a computer device, for storing programs and data. It is understood that the computer readable storage medium herein may include both built-in storage media in a computer device and extended storage media supported by the computer device. The computer readable storage medium provides storage space that stores a processing system of a computer device. In this memory space, a computer program suitable for being loaded and executed by the processor 701 is stored. Note that the computer readable storage medium can be either a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one magnetic disk memory; alternatively, it may be at least one computer-readable storage medium located remotely from the aforementioned processor.
In one embodiment, the processor 701 performs the following operations by running a computer program in the memory 703:
acquiring coding information of a target video frame, wherein the target video frame comprises M first coding units, and the M first coding units are obtained by dividing the target video frame according to a first coding and decoding standard; the coding information comprises the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard, the reference information of any first coding unit comprises whether the prediction mode of any first coding unit comprises an inter-frame translation prediction mode or not, and M is a positive integer;
Acquiring a second coding unit to be coded in the target video frame, wherein the second coding unit is obtained by dividing the target video frame according to a second coding and decoding standard, and the second coding and decoding standard is different from the first coding and decoding standard;
according to the positions of the M first coding units in the target video frame and the positions of the second coding units in the target video frame, first coding units with overlapping areas with the second coding units are screened out of the M first coding units;
determining a reference frame of the second coding unit according to the reference information of the first coding unit with an overlapping area with the second coding unit and an initial reference frame list of the second coding unit;
and encoding the second coding unit based on the reference frame of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
In one embodiment, the processor 701 is specifically configured to, when determining the reference frame of the second coding unit according to the reference information of the first coding unit having an overlapping area with the second coding unit and the initial reference frame list of the second coding unit, perform the following operations:
If the prediction modes of the first coding units with the overlapping areas with the second coding units do not include the inter-frame translation prediction mode, obtaining the play distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit;
selecting a target reference frame from the initial reference frame list based on the play distance between each reference frame and the second coding unit, and determining the target reference frame as the reference frame of the second coding unit; the playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit.
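The distance-based fallback above amounts to picking the reference frame nearest to the current frame in display order. A minimal sketch, assuming frames are identified by picture-order-count-style display indices; all names are illustrative:

```python
def select_nearest_reference(current_poc, initial_reference_list):
    """Select the target reference frame whose play distance (absolute
    difference in display order) to the frame being encoded is smallest."""
    return min(initial_reference_list, key=lambda poc: abs(poc - current_poc))
```

On a tie, `min` keeps the earliest entry in the list, which matches a typical reference-list priority order.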
In one embodiment, the initial reference frame list includes a forward reference frame list and a backward reference frame list;
the processor 701 is specifically configured to perform the following operations when encoding the second coding unit based on the reference frame of the second coding unit to obtain the bitstream data of the target video frame under the second coding and decoding standard:
if the reference frame is located in the forward reference frame list, encoding the second encoding unit based on the reference frame and the forward prediction mode of the second encoding unit to obtain code stream data of the target video frame under the second encoding and decoding standard;
And if the reference frame is positioned in the backward reference frame list, encoding the second encoding unit based on the reference frame and the backward prediction mode of the second encoding unit to obtain the code stream data of the target video frame under the second encoding and decoding standard.
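The direction choice in this embodiment can be sketched as a simple lookup: the prediction mode follows whichever list the chosen reference frame belongs to. The list representation and names are assumptions for illustration:

```python
def choose_prediction_direction(ref_poc, forward_list, backward_list):
    """Return the prediction direction implied by the list containing the
    chosen reference frame."""
    if ref_poc in forward_list:
        return "forward"
    if ref_poc in backward_list:
        return "backward"
    raise ValueError("reference frame is in neither list")
```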
In one embodiment, the reference information further includes reference frame indexes of the M first coding units;
the processor 701 is specifically configured to, when determining a reference frame of the second coding unit according to the reference information of the first coding unit having an overlapping area with the second coding unit and the initial reference frame list of the second coding unit, perform the following operations:
and if the prediction mode of at least one first coding unit having an overlapping area with the second coding unit includes the inter-frame translation prediction mode, determining the reference frame in the initial reference frame list having the same reference frame index as that first coding unit as the reference frame of the second coding unit.
In one embodiment, the initial reference frame list includes a forward reference frame list and a backward reference frame list, and the reference information further includes prediction modes adopted by the M first coding units;
The processor 701 is specifically configured to, when determining, as the reference frame of the second coding unit, a reference frame having the same reference frame index as the reference frame in the initial reference frame list:
and determining, in the forward reference frame list, the reference frame whose index is the same as the reference frame index of the first coding unit adopting the forward prediction mode as a reference frame of the second coding unit, and determining, in the backward reference frame list, the reference frame whose index is the same as the reference frame index of the first coding unit adopting the backward prediction mode as a reference frame of the second coding unit.
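The index-reuse rule above can be sketched as follows: the reference frame index recorded for a translation-predicted first coding unit is looked up in the forward or backward list according to that unit's prediction direction. The dict layout is an assumption for illustration:

```python
def reuse_reference_by_index(first_unit, forward_list, backward_list):
    """Reuse a first coding unit's reference frame index under the second
    standard. first_unit: {'ref_index': int, 'direction': 'forward'|'backward'}."""
    idx = first_unit["ref_index"]
    if first_unit["direction"] == "forward":
        return forward_list[idx]
    return backward_list[idx]
```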
In one embodiment, the prediction modes of the respective first coding units having an overlap region with the second coding unit each include an inter-frame translation prediction mode;
the processor 701 is further configured to perform the following operations:
obtaining a candidate prediction mode of the second coding unit;
determining, among the candidate prediction modes, the prediction modes that are the same as the prediction modes adopted by the M first coding units as the prediction modes of the second coding unit;
the processor 701 is specifically configured to perform the following operations when encoding the second coding unit based on the reference frame of the second coding unit to obtain the bitstream data of the target video frame under the second coding and decoding standard:
And encoding the second coding unit based on the reference frame of the second coding unit and the prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
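The mode-pruning step above, keeping only candidate prediction modes that the overlapping first coding units also used, can be sketched as an order-preserving intersection. Mode names are illustrative placeholders:

```python
def prune_candidate_modes(candidate_modes, first_unit_modes):
    """Keep the candidate prediction modes that some first coding unit also
    adopted, preserving the candidate order."""
    used = set(first_unit_modes)
    return [m for m in candidate_modes if m in used]
```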
In one embodiment, the prediction modes of a portion of the first coding units having an overlapping area with the second coding unit include the inter-frame translation prediction mode;
the processor 701 is further configured to perform the following operations:
obtaining a candidate prediction mode of the second coding unit;
determining the candidate prediction mode as a prediction mode of the second coding unit;
the processor 701 is specifically configured to perform the following operations when encoding the second coding unit based on the reference frame of the second coding unit to obtain the bitstream data of the target video frame under the second coding and decoding standard:
and encoding the second coding unit based on the reference frame of the second coding unit and the prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
In one embodiment, the prediction modes of a portion of the first coding units having an overlapping area with the second coding unit include the inter-frame translation prediction mode;
The processor 701 is specifically configured to, when determining, as the reference frame of the second coding unit, a reference frame having the same reference frame index as the reference frame in the initial reference frame list:
acquiring the playing distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit;
selecting a target reference frame from the initial reference frame list based on the playing distance between each reference frame and the second coding unit; the playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit;
and determining the reference frame in the initial reference frame list having the same reference frame index as the target reference frame as the reference frame of the second coding unit.
In one embodiment, the processor 701 is specifically configured to, when acquiring the second coding unit to be coded in the target video frame, perform the following operations:
dividing an object to be encoded according to P preset dividing modes to obtain P dividing results of the object to be encoded, wherein the object to be encoded is a target video frame or a target region in the target video frame, and P is a positive integer;
and if the coding efficiency of the object to be encoded under each of the P division results is not higher than the coding efficiency of the undivided object to be encoded, determining the object to be encoded as the second coding unit to be encoded in the target video frame.
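The partition decision above can be sketched as a comparison of coding efficiencies: if none of the P preset division results beats leaving the object undivided, the object itself is taken as the second coding unit. Efficiency values here are illustrative placeholders (higher is better):

```python
def decide_coding_unit(unsplit_efficiency, split_efficiencies):
    """Return True if the object to be encoded should be kept whole as a
    coding unit, i.e. no division result achieves higher coding efficiency."""
    return all(e <= unsplit_efficiency for e in split_efficiencies)
```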
In one embodiment, the processor 701 is specifically configured to, when acquiring the target video frame and the encoding information of the target video frame, perform the following operations:
acquiring code stream data of a target video frame under a first coding and decoding standard;
and decoding the code stream data of the target video frame under the first coding and decoding standard to obtain the target video frame and the coding information of the target video frame.
Based on the same inventive concept, the computer device provided in the embodiments of the present application solves problems on a principle, and with beneficial effects, similar to those of the video encoding method in the method embodiments of the present application; for details, reference may be made to the implementation of the method, which is not repeated here for brevity.
The present application also provides a computer readable storage medium having a computer program stored therein, the computer program being adapted to be loaded by a processor and to perform the video encoding method of the above method embodiments.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the video encoding method described above.
The steps in the method of the embodiment of the application can be sequentially adjusted, combined and deleted according to actual needs.
The modules in the device of the embodiment of the application can be combined, divided and deleted according to actual needs.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the readable storage medium may include: a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disc, and the like.
The foregoing disclosure is merely a preferred embodiment of the present application and is not intended to limit the scope of the claims. Those of ordinary skill in the art will understand that all or part of the processes implementing the above embodiments, as well as equivalent changes made in accordance with the claims of the present application, still fall within the scope covered by the claims.

Claims (13)

1. A method of video encoding, the method comprising:
acquiring coding information of a target video frame, wherein the target video frame comprises M first coding units, and the M first coding units are obtained by dividing the target video frame according to a first coding and decoding standard; the coding information comprises the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard, the reference information of any first coding unit indicates whether the prediction mode of that first coding unit includes an inter-frame translation prediction mode, and M is a positive integer;
acquiring a second coding unit to be coded in the target video frame, wherein the second coding unit is obtained by dividing the target video frame according to a second coding and decoding standard, and the second coding and decoding standard is different from the first coding and decoding standard;
according to the positions of the M first coding units in the target video frame and the positions of the second coding units in the target video frame, first coding units with overlapping areas with the second coding units are screened out of the M first coding units;
Determining a reference frame of the second coding unit according to the reference information of the first coding unit with an overlapping area with the second coding unit and an initial reference frame list of the second coding unit;
and encoding the second coding unit based on the reference frame of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
2. The method of claim 1, wherein the determining the reference frame of the second coding unit based on the reference information of the first coding unit having an overlap region with the second coding unit and the initial reference frame list of the second coding unit comprises:
if none of the prediction modes of the first coding units having an overlapping area with the second coding unit includes the inter-frame translation prediction mode, obtaining the play distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit;
selecting a target reference frame from the initial reference frame list based on the play distance between each reference frame and the second coding unit, and determining the target reference frame as the reference frame of the second coding unit; the playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit.
3. The method of claim 2, wherein the initial reference frame list comprises a forward reference frame list and a backward reference frame list;
the encoding the second coding unit based on the reference frame of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard includes:
if the reference frame is located in the forward reference frame list, encoding the second encoding unit based on the reference frame and the forward prediction mode of the second encoding unit to obtain code stream data of the target video frame under the second encoding and decoding standard;
and if the reference frame is positioned in the backward reference frame list, encoding the second encoding unit based on the reference frame and the backward prediction mode of the second encoding unit to obtain the code stream data of the target video frame under the second encoding and decoding standard.
4. The method of claim 1, wherein the reference information further comprises reference frame indices of the M first coding units;
the determining the reference frame of the second coding unit according to the reference information of the first coding unit having an overlapping area with the second coding unit and the initial reference frame list of the second coding unit includes:
and if the prediction mode of at least one first coding unit having an overlapping area with the second coding unit includes the inter-frame translation prediction mode, determining the reference frame in the initial reference frame list having the same reference frame index as that first coding unit as the reference frame of the second coding unit.
5. The method of claim 4, wherein the initial reference frame list comprises a forward reference frame list and a backward reference frame list, the reference information further comprising prediction modes employed by the M first coding units;
the determining the reference frame with the same reference frame index in the initial reference frame list as the reference frame of the second coding unit includes:
and determining, in the forward reference frame list, the reference frame whose index is the same as the reference frame index of the first coding unit adopting the forward prediction mode as a reference frame of the second coding unit, and determining, in the backward reference frame list, the reference frame whose index is the same as the reference frame index of the first coding unit adopting the backward prediction mode as a reference frame of the second coding unit.
6. The method of claim 5, wherein the prediction modes of the first coding units having an overlapping area with the second coding unit each comprise the inter-frame translation prediction mode; the method further comprises:
Obtaining a candidate prediction mode of the second coding unit;
determining, among the candidate prediction modes, the prediction modes that are the same as the prediction modes adopted by the M first coding units as the prediction modes of the second coding unit;
the encoding the second coding unit based on the reference frame of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard includes:
and encoding the second coding unit based on the reference frame of the second coding unit and the prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
7. The method of claim 5, wherein the prediction modes of a portion of the first coding units having an overlapping area with the second coding unit include the inter-frame translation prediction mode; the method further comprises:
acquiring a candidate prediction mode of the second coding unit, and determining the candidate prediction mode as a prediction mode of the second coding unit;
the encoding the second coding unit based on the reference frame of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard includes:
And encoding the second coding unit based on the reference frame of the second coding unit and the prediction mode of the second coding unit to obtain the code stream data of the target video frame under the second coding and decoding standard.
8. The method of claim 4, wherein the prediction modes of a portion of the first coding units having an overlapping area with the second coding unit include the inter-frame translation prediction mode;
the determining the reference frame with the same reference frame index in the initial reference frame list as the reference frame of the second coding unit includes:
acquiring the playing distance between each reference frame in the initial reference frame list of the second coding unit and the second coding unit;
selecting a target reference frame from the initial reference frame list based on the playing distance between each reference frame and the second coding unit; the playing distance between the target reference frame and the second coding unit is smaller than the playing distance between other reference frames and the second coding unit;
and determining the reference frame in the initial reference frame list having the same reference frame index as the target reference frame as the reference frame of the second coding unit.
9. The method of claim 1, wherein the obtaining the second coding unit to be coded in the target video frame comprises:
dividing an object to be encoded according to P preset dividing modes to obtain P dividing results of the object to be encoded, wherein the object to be encoded is the target video frame or a target area in the target video frame, and P is a positive integer;
and if the coding efficiency of the object to be encoded under each of the P division results is not higher than the coding efficiency of the undivided object to be encoded, determining the object to be encoded as the second coding unit to be encoded in the target video frame.
10. The method of any of claims 1-9, wherein the obtaining the target video frame and the encoding information of the target video frame comprises:
acquiring code stream data of a target video frame under a first coding and decoding standard;
and decoding the code stream data of the target video frame under a first coding and decoding standard to obtain the target video frame and the coding information of the target video frame.
11. A video encoding apparatus, the video encoding apparatus comprising:
an acquisition unit, configured to acquire coding information of a target video frame, wherein the target video frame comprises M first coding units, and the M first coding units are obtained by dividing the target video frame according to a first coding and decoding standard; the coding information comprises the positions of the M first coding units in the target video frame and the reference information of the M first coding units under the first coding and decoding standard, the reference information of any first coding unit indicates whether the prediction mode of that first coding unit includes an inter-frame translation prediction mode, and M is a positive integer;
The obtaining unit is further configured to obtain a second coding unit to be coded in the target video frame, where the second coding unit is obtained by dividing the target video frame according to a second coding and decoding standard, and the second coding and decoding standard is different from the first coding and decoding standard;
the processing unit is used for screening first coding units with overlapping areas with the second coding units from the M first coding units according to the positions of the M first coding units in the target video frame and the positions of the second coding units in the target video frame;
the processing unit is further configured to determine a reference frame of the second coding unit according to reference information of the first coding unit having an overlapping area with the second coding unit and an initial reference frame list of the second coding unit;
the processing unit is further configured to encode the second encoding unit based on the reference frame of the second encoding unit, to obtain code stream data of the target video frame under the second encoding and decoding standard.
12. A computer device, comprising: a memory and a processor;
a memory in which a computer program is stored;
A processor for loading the computer program for implementing the video coding method according to any of claims 1-10.
13. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program adapted to be loaded by a processor and to perform the video encoding method according to any of claims 1-10.
CN202311553679.2A 2023-11-20 2023-11-20 Video coding method, device, equipment and storage medium Pending CN117596396A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311553679.2A CN117596396A (en) 2023-11-20 2023-11-20 Video coding method, device, equipment and storage medium


Publications (1)

Publication Number Publication Date
CN117596396A true CN117596396A (en) 2024-02-23

Family

ID=89917601


Similar Documents

Publication Publication Date Title
CN112789852B (en) Image coding method and device based on history
RU2683495C1 (en) Innovations in block vector prediction and estimation of reconstructed sample values within an overlap area
TWI504237B (en) Buffering prediction data in video coding
TWI616089B (en) Decoder, encoder and associated method and computer program
CN111567045A (en) Method and apparatus for using inter prediction information
CN111480340A (en) Improved predictor candidates for motion compensation
EP3700213A1 (en) Method and apparatus for encoding or decoding an image with inter layer motion information prediction according to motion information compression scheme
JP2022511637A (en) Video picture prediction method and equipment
TW202402049A (en) Video coding and decoding
TW202025727A (en) Vector predictor list generation
CN118301368A (en) Apparatus and method for image and video coding
TWI790662B (en) Encoding and decoding method, apparatus and device thereof
JP2022502966A (en) Methods, devices, encoders and decoders for obtaining candidate motion vector lists
CN104956676A (en) Inter-layer syntax prediction control
CN114501019A (en) Partition prediction
US10630985B2 (en) Method for scanning coding blocks inside a video frame by video codecs
US20240187624A1 (en) Methods and devices for decoder-side intra mode derivation
CA3206439A1 (en) Block vector predictor candidate selection
CN111328449B (en) Video signal processing system and method based on automatic scanning order of subdivision
CN117596396A (en) Video coding method, device, equipment and storage medium
CN115883835B (en) Video coding method, device, equipment and storage medium
US20240214580A1 (en) Intra prediction modes signaling
RU2809558C2 (en) Method and device for image encoding and decoding
US20240040133A1 (en) Video coding using multi-model linear model
WO2024077553A1 (en) Video encoding method and apparatus, video decoding method and apparatus, device, system, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination