CN111770340B - Video encoding method, device, equipment and storage medium - Google Patents

Video encoding method, device, equipment and storage medium

Info

Publication number
CN111770340B
CN111770340B (application CN202010717847.7A)
Authority
CN
China
Prior art keywords
sub
unit
blocks
encoded
coded
Prior art date
Legal status
Active
Application number
CN202010717847.7A
Other languages
Chinese (zh)
Other versions
CN111770340A (en)
Inventor
许桂森
李一鸣
王诗涛
刘杉
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010717847.7A priority Critical patent/CN111770340B/en
Publication of CN111770340A publication Critical patent/CN111770340A/en
Application granted granted Critical
Publication of CN111770340B publication Critical patent/CN111770340B/en

Classifications

    All classifications fall under H (Electricity) > H04 (Electric communication technique) > H04N (Pictorial communication, e.g. television) > H04N19/00 (Methods or arrangements for coding, decoding, compressing or decompressing digital video signals):
    • H04N19/176: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/136: Adaptive coding controlled by incoming video signal characteristics or properties
    • H04N19/147: Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/182: Adaptive coding characterised by the coding unit, the unit being a pixel
    • H04N19/186: Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/19: Adaptive coding using optimisation based on Lagrange multipliers
    • H04N19/96: Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application discloses a video coding method, apparatus, device and storage medium, belonging to the field of video processing. According to the technical scheme provided by the embodiments of the application, the terminal can determine whether a unit to be encoded is divided according to the degree of texture similarity between the unit to be encoded and at least two sub-blocks obtained by pre-dividing it. When the at least two sub-blocks are highly similar to the unit to be encoded, the terminal determines not to divide the unit to be encoded. The terminal therefore does not need to traverse every mode of dividing the unit to be encoded and only divides units whose texture similarity to their sub-blocks is low, which reduces the number of sub-blocks, lowers coding complexity, and shortens encoding time while ensuring video coding quality.

Description

Video encoding method, device, equipment and storage medium
Technical Field
The present application relates to the field of video processing, and in particular, to a video encoding method, apparatus, device, and storage medium.
Background
With the development of network technology, more and more users watch videos on various terminals. Because a video often occupies a very large amount of storage space while network bandwidth is limited, the video needs to be compressed, i.e. encoded, before transmission so as to reduce the storage space and bandwidth it occupies. In the related art, when a computer device encodes a video, it needs to divide each video image into a plurality of blocks and make a decision for every possible block in order to select an optimal encoding mode. As a result, encoding a video image is computationally complex and time-consuming for the computer device.
Disclosure of Invention
The embodiment of the application provides a video coding method, a video coding device, video coding equipment and a storage medium, which can improve the video coding effect. The technical scheme is as follows:
in one aspect, a video encoding method is provided, and the method includes:
in the video coding process, pre-dividing a unit to be coded according to a target division mode to obtain at least two sub-blocks;
acquiring texture characteristic parameters of the at least two sub-blocks, wherein the texture characteristic parameters are used for representing image textures of the corresponding sub-blocks;
and in response to that the difference information between the texture feature parameters of the at least two sub-blocks and the texture feature parameters of the unit to be encoded meets a target condition, the unit to be encoded is not divided according to the target division mode during video encoding.
In one aspect, a video encoding apparatus is provided, the apparatus comprising:
the device comprises a pre-dividing module, a coding module and a decoding module, wherein the pre-dividing module is used for pre-dividing a unit to be coded according to a target division mode in the video coding process to obtain at least two sub-blocks;
an obtaining module, configured to obtain texture feature parameters of the at least two sub-blocks, where the texture feature parameters are used to represent image textures of corresponding sub-blocks;
and the dividing module is used for responding that the difference information between the texture characteristic parameters of the at least two sub-blocks and the texture characteristic parameters of the unit to be coded meets a target condition, and when the video is coded, the unit to be coded is not divided according to the target dividing mode.
In a possible implementation manner, the dividing module is further configured to divide the unit to be encoded into the at least two sub-blocks according to the target dividing mode in response to that difference information between texture feature parameters of the at least two sub-blocks and texture feature parameters of the unit to be encoded does not meet the target condition.
In a possible embodiment, the apparatus further comprises:
the determining module is used for determining the average pixel value and the pixel value variance of the pixel points in the unit to be coded according to the pixel values of the pixel points in the unit to be coded in the video coding process;
the dividing module is further configured to, in response to that the pixel value variance is greater than a product of the average pixel value and a first threshold, not divide the unit to be encoded during video encoding, where the first threshold is used to indicate sharpness of video encoding.
In a possible implementation manner, the dividing module is further configured to, in response to that difference information between texture feature parameters of the at least two sub-blocks meets a target difference condition, not divide the unit to be encoded according to the target dividing mode when video encoding.
In one aspect, a computer device is provided that includes one or more processors and one or more memories having at least one program code stored therein, the program code being loaded and executed by the one or more processors to implement the video encoding method.
In one aspect, a computer-readable storage medium having at least one program code stored therein is provided, the program code being loaded and executed by a processor to implement the video encoding method.
In one aspect, a computer program product or a computer program is provided, the computer program product or the computer program comprising computer program code, the computer program code being stored in a computer-readable storage medium, the computer program code being read by a processor of a computer device from the computer-readable storage medium, the computer program code being executed by the processor to cause the computer device to perform the video encoding method provided in the various alternative implementations described above.
According to the technical scheme provided by the embodiments of the application, the terminal can determine whether a unit to be encoded is divided according to the degree of texture similarity between the unit to be encoded and the at least two sub-blocks obtained by pre-dividing it. When the at least two sub-blocks are highly similar to the unit to be encoded, the terminal determines not to divide the unit to be encoded. The terminal therefore does not need to traverse every mode of dividing the unit to be encoded and only divides units whose texture similarity to their sub-blocks is low, which reduces the number of sub-blocks, lowers coding complexity, and shortens encoding time while ensuring video coding quality.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic diagram of an image frame dividing method provided in an embodiment of the present application;
fig. 2 is a schematic diagram of an implementation environment of a video encoding method according to an embodiment of the present application;
fig. 3 is a flowchart of a video encoding method according to an embodiment of the present application;
fig. 4 is a flowchart of a video encoding method according to an embodiment of the present application;
fig. 5 is a schematic diagram illustrating an effect of binary tree partitioning according to an embodiment of the present application;
FIG. 6 is a schematic diagram illustrating an effect of a ternary tree partition according to an embodiment of the present application;
FIG. 7 is a diagram illustrating an effect of quadtree partitioning according to an embodiment of the present disclosure;
fig. 8 is a schematic diagram of a pixel arrangement manner according to an embodiment of the present disclosure;
fig. 9 is a flowchart of a video encoding method according to an embodiment of the present application;
fig. 10 is a flowchart of a video encoding method according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 13 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The terms "first," "second," and the like in this application are used for distinguishing between similar items and items that have substantially the same function or similar functionality, and it should be understood that "first," "second," and "nth" do not have any logical or temporal dependency or limitation on the number or order of execution.
The term "at least one" in this application means one or more, "a plurality" means two or more, for example, a plurality of reference face images means two or more reference face images.
In order to more clearly explain the technical solutions provided in the present application, first, terms related in the embodiments of the present application are described:
Coding Unit (CU): one image frame includes a plurality of coding units. In the video encoding process, an image frame needs to be divided, and the division proceeds over a plurality of levels. For example, referring to fig. 1, in the video encoding process, a computer device can divide an image frame 101 into four macroblocks 102, and the four macroblocks 102 are coding units. The computer device can continue to divide any of the macroblocks 102 into sub-blocks. Taking binary tree division as an example, the computer device can divide one macroblock 102 into two sub-blocks 103, and can likewise continue to divide either of the two sub-blocks 103.
Pixel value: at least one of the gray value and the luminance-chrominance (YUV) value of a pixel point.
The video encoding method provided by the present application can be applied under multiple video encoding standards, for example in Versatile Video Coding (VVC/H.266) and the third-generation Audio Video coding Standard (AVS3), which is not limited in the embodiments of the present application.
Fig. 2 is a schematic diagram of an implementation environment of a video encoding method according to an embodiment of the present application, and referring to fig. 2, the implementation environment may include a terminal 210 and a server 240.
The terminal 210 is connected to the server 240 through a wireless network or a wired network. Optionally, the terminal 210 is a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited thereto. The terminal 210 is installed and operated with an application program supporting video encoding.
Optionally, the server 240 is an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, web services, cloud communication, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms. The terminal and the server can be directly or indirectly connected through wired or wireless communication, which is not limited in the present application.
In the embodiments of the present application, either the server or the terminal may serve as the execution subject of the technical scheme provided herein, and the technical scheme may also be implemented through interaction between the terminal and the server, which is not limited in the embodiments of the present application. The following description takes the terminal as the execution subject.
Fig. 3 is a flowchart of a video encoding method provided in an embodiment of the present application, which is described by taking an execution subject as a terminal, and referring to fig. 3, the method includes:
301. in the video coding process, the terminal pre-divides the unit to be coded according to the target division mode to obtain at least two sub-blocks.
The target division mode is at least one of binary tree division, ternary tree division and quadtree division, and other division modes are included with the development of the technology, which is not limited in the embodiment of the present application.
In order to more clearly explain the technical solution provided in the present application, the following describes the pre-partitioning and the difference of partitioning for the unit to be coded related in the present application.
The pre-division is a virtual division process: when pre-division is adopted, the terminal does not divide the unit to be encoded into at least two actual sub-blocks according to the target division mode, but obtains at least two virtual sub-blocks. In the encoding process, if the terminal divides any unit to be encoded, that unit is no longer the basic unit of encoding; the basic units of encoding become the sub-blocks obtained from the division. If the terminal merely pre-divides a unit to be encoded, the basic unit of encoding is still the unit itself, and the obtained sub-blocks are used only for calculating texture feature parameters. Based on the result of pre-dividing the unit according to the target division mode, the terminal can determine whether to divide the unit into at least two actual sub-blocks according to that mode, so pre-division serves as a step preceding division. For example, if the result obtained by pre-dividing the unit to be encoded according to the target division mode does not meet the division condition, the terminal determines not to divide the unit into actual sub-blocks. If the result does meet the division condition, the terminal divides the unit to be encoded into at least two actual sub-blocks according to the target division mode and then performs the subsequent encoding process on those sub-blocks.
302. The terminal acquires texture characteristic parameters of at least two sub-blocks, and the texture characteristic parameters are used for representing image textures of the corresponding sub-blocks.
The texture characteristic parameter is at least one of an average pixel value, a variance of the pixel value, an image gradient parameter, a root mean square error of the pixel value and a standard deviation of the pixel value of the pixel point.
303. And in response to that the difference information between the texture characteristic parameters of the at least two sub-blocks and the texture characteristic parameters of the unit to be coded meets the target condition, the terminal does not divide the unit to be coded according to the target division mode during video coding.
According to the technical scheme provided by the embodiments of the application, the terminal can determine whether a unit to be encoded is divided according to the degree of texture similarity between the unit to be encoded and the at least two sub-blocks obtained by pre-dividing it. When the at least two sub-blocks are highly similar to the unit to be encoded, the terminal determines not to divide the unit to be encoded. The terminal therefore does not need to traverse every mode of dividing the unit to be encoded and only divides units whose texture similarity to their sub-blocks is low, which reduces the number of sub-blocks, lowers coding complexity, and shortens encoding time while ensuring video coding quality.
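For ease of reference, the following is a minimal sketch, in Python, of the decision flow of steps 301-303 under stated assumptions: the texture feature parameter is the average pixel value, the target division mode is a horizontal binary tree split, and the target condition is the concrete form given later as formula (4). The helper names, the 2 × 2 unit and the threshold value are illustrative and not part of the patent.

import numpy as np

def texture_param(block):
    # Average pixel value, one of the texture feature parameters the text lists.
    return float(block.mean())

def pre_divide_horizontal_bt(unit):
    # Virtual split: returns two views of the unit; nothing is actually encoded.
    h = unit.shape[0] // 2
    return [unit[:h, :], unit[h:, :]]

def should_skip_partition(unit, threshold):
    # Step 303 with the concrete condition of formula (4): skip the split when
    # the summed sub-block parameters exceed threshold * parent parameter.
    subs = pre_divide_horizontal_bt(unit)
    return sum(texture_param(b) for b in subs) > threshold * texture_param(unit)

unit = np.array([[125., 110.], [108., 98.]])
print(should_skip_partition(unit, 1.9))  # True here: 117.5 + 103.0 > 1.9 * 110.25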
Fig. 4 is a flowchart of a video encoding method provided in an embodiment of the present application, which is described by taking an execution subject as a terminal, and referring to fig. 4, the method includes:
401. in the video coding process, the terminal determines the rate-distortion parameters corresponding to multiple division modes for pre-dividing the unit to be encoded. A rate-distortion parameter is the weighted sum of the coding rate and the image distortion rate, where the image distortion rate is the distortion between the unit to be encoded and the image obtained by encoding the sub-blocks produced when the unit is divided according to the corresponding division mode.
The partition mode is a coding unit partition mode of video coding supported by the terminal, and the plurality of partition modes include, for example, binary tree partition, ternary tree partition, and quadtree partition.
In a possible implementation manner, the terminal determines rate distortion parameters corresponding to multiple partition modes according to the code rate of video coding and the image distortion rate between an image block obtained by coding and a unit to be coded.
For example, if the code rate of the video coding is 1000 kbps, the terminal determines the rate-distortion parameters corresponding to binary tree division, ternary tree division and quadtree division at the code rate of 1000 kbps. The relationship between the video code rate and the image distortion rate obtained by the same video coding method is shown in formula (1). Based on formula (1), the terminal generates the Lagrange cost function (2) and determines the rate-distortion parameters corresponding to the multiple division modes through it.
D = δ²·e^(-αR) (1)
Wherein R is the code rate, D is the image distortion rate, α is a coefficient, and δ² is the variance of the pixel values of the plurality of pixel points in the unit to be encoded.
Min{J=D+λR} (2)
Wherein Min{ } denotes taking the minimum value, λ is a coefficient (the Lagrange multiplier), and J is the rate-distortion parameter.
402. And the terminal determines the division mode corresponding to the minimum rate distortion parameter as the target division mode from the rate distortion parameters corresponding to the multiple division modes.
In this implementation manner, the terminal can determine the partition mode with the smallest rate-distortion parameter as the target partition mode in the same video coding method, so that the target partition mode can be used to partition the unit to be coded, and the quality of the image obtained by coding the obtained subblocks is good.
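As an illustration only, the sketch below applies the Lagrange cost function (2) to pick the target division mode as in step 402; the distortion and rate numbers are made up for the example, and the λ value is an arbitrary coefficient rather than a value prescribed by any standard.

def rd_cost(distortion, rate, lam):
    # Formula (2): J = D + lambda * R.
    return distortion + lam * rate

def pick_target_mode(candidates, lam):
    # Step 402: among the candidate division modes, the one with the smallest
    # rate-distortion parameter J becomes the target division mode.
    return min(candidates, key=lambda m: rd_cost(*candidates[m], lam))

modes = {"binary_tree": (12.0, 950.0),    # hypothetical (distortion D, rate R) pairs
         "ternary_tree": (10.5, 1010.0),
         "quad_tree": (9.8, 1100.0)}
print(pick_target_mode(modes, lam=0.01))  # ternary_tree: J = 10.5 + 10.1 = 20.6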
It should be noted that steps 401 and 402 are optional. Besides executing them, the terminal can determine the target division mode according to the actual situation: for example, a technician designates a division mode as the target division mode, or the terminal adopts the division mode indicated in the video coding standard. The terminal can also determine the target division mode in other ways; the embodiment of the present application does not limit how the target division mode is determined.
403. And the terminal pre-divides the unit to be coded according to the target division mode to obtain at least two sub-blocks.
In one possible implementation, in response to the target division mode being binary tree division, the terminal pre-divides the unit to be encoded into two rectangular sub-blocks with the same area. Binary Tree (BT) division is a method of dividing a coding unit into two sub-blocks and includes horizontal binary tree division and vertical binary tree division. In the embodiments of the present application, "rectangular" covers both oblong rectangles and squares: if the unit to be encoded is square, binary tree division yields two oblong sub-blocks; if the coding unit is an oblong rectangle with an aspect ratio of 2:1, binary tree division yields either two square sub-blocks or two oblong sub-blocks.
For example, referring to fig. 5, the terminal can perform horizontal binary tree division on the first unit to be encoded 501 to obtain the subblocks 5011 and 5012. The terminal can perform vertical binary tree division on the second unit to be encoded 502 to obtain a sub-block 5021 and a sub-block 5022. The method for judging whether the unit to be coded is divided into the horizontal binary tree or the vertical binary tree by the terminal and the method for judging the target division mode belong to the same inventive concept, and are detailed in step 401 and step 402, which are not described herein again.
In addition, different minimum sub-block sizes (minBTSize) may be set for binary tree division under different encoding methods. For VVC/H.266, minBTSize is 4 × 4, i.e. the divided sub-blocks are at least 4 × 4 in size. If the sub-blocks obtained by vertical binary tree division would be smaller than 4 × 4, the terminal does not adopt vertical binary tree division for the unit to be encoded.
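A sketch of the size check just described, under the assumption that the unit is stored as a 2-D array and that minBTSize constrains both sides of a sub-block; the 4 × 4 value is the VVC/H.266 figure quoted above, and the function name is illustrative.

import numpy as np

MIN_BT_SIZE = 4  # minBTSize quoted above for VVC/H.266

def bt_split(unit, horizontal):
    # Pre-divides a unit into two equal-area rectangular sub-blocks, or
    # returns None when the split would produce a side smaller than minBTSize.
    h, w = unit.shape
    if horizontal:
        return None if h // 2 < MIN_BT_SIZE else [unit[: h // 2, :], unit[h // 2 :, :]]
    return None if w // 2 < MIN_BT_SIZE else [unit[:, : w // 2], unit[:, w // 2 :]]

cu = np.zeros((8, 4))
print(bt_split(cu, horizontal=True)[0].shape)  # (4, 4): allowed
print(bt_split(cu, horizontal=False))          # None: halved width 2 < minBTSize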
In one possible implementation, in response to the target division mode being the ternary tree division, the terminal pre-divides the unit to be encoded into three rectangular sub-blocks with the same area. The method for dividing the unit to be coded into three sub-blocks by the aid of the ternary tree comprises horizontal ternary tree division and vertical ternary tree division.
For example, referring to fig. 6, the terminal can perform horizontal ternary tree division on the third unit to be encoded 601 to obtain sub-block 6011, sub-block 6012 and sub-block 6013, and can perform vertical ternary tree division on the fourth unit to be encoded 602 to obtain sub-block 6021, sub-block 6022 and sub-block 6023. The method by which the terminal decides between horizontal and vertical ternary tree division belongs to the same inventive concept as the method for determining the target division mode, detailed in step 401 and step 402, and is not repeated here.
In one possible embodiment, in response to the target division mode being quadtree division, the terminal pre-divides the unit to be encoded into four square sub-blocks with the same area. Quad Tree (QT) division is a method of dividing a unit to be encoded into four sub-blocks.
For example, referring to fig. 7, the terminal can perform quadtree division on the fifth unit to be encoded 701, resulting in a sub-block 7011, a sub-block 7012, a sub-block 7013, and a sub-block 7014.
404. The terminal acquires texture characteristic parameters of at least two sub-blocks, and the texture characteristic parameters are used for representing image textures of the corresponding sub-blocks.
In a possible implementation manner, the terminal obtains pixel values of a plurality of reference pixels in at least two sub-blocks. The terminal obtains statistical information of at least two sub-blocks based on pixel values of a plurality of reference pixel points, wherein the statistical information is at least one of an average pixel value, a pixel value variance, an image gradient parameter, a pixel value root mean square error and a pixel value standard deviation of the reference pixel points in one sub-block. And the terminal takes the statistical information of the at least two sub-blocks as texture characteristic parameters of the at least two sub-blocks.
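A minimal sketch of the statistics named in this step for a grayscale sub-block; the numbers reproduce the sub-block A and sub-block B examples that follow (the text keeps only the integer parts of these values).

import numpy as np

def texture_stats(block):
    # Average pixel value, pixel value variance and pixel value standard
    # deviation of the reference pixel points in one sub-block.
    mean = float(block.mean())
    var = float(((block - mean) ** 2).mean())
    return {"mean": mean, "variance": var, "std": var ** 0.5}

sub_a = np.array([125., 110., 108., 98.])
sub_b = np.array([135., 128., 130., 145.])
print(texture_stats(sub_a))  # mean 110.25, variance ~93.2, std ~9.7
print(texture_stats(sub_b))  # mean 134.5, variance ~43.5, std ~6.6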
For ease of understanding, the following describes how the terminal acquires the texture feature parameters of two sub-blocks, taking binary tree division of the unit to be encoded as the example. It should be noted that, for simplicity and clarity, only the integer part of each computed value is kept below.
Taking the statistical information as the average pixel value of a plurality of reference pixels as an example, the terminal divides the unit to be encoded into a sub-block A and a sub-block B, wherein the sub-block A and the sub-block B are both composed of four pixels.
If the unit to be encoded is a gray image, the gray values of the four pixels of sub-block A are 125, 110, 108 and 98, respectively, and the gray values of the four pixels of sub-block B are 135, 128, 130 and 145, respectively. The terminal computes the average gray value of the four pixel points of sub-block A, (125+110+108+98)/4 = 110, and takes it as the texture feature parameter of sub-block A. The terminal computes the average gray value of the four pixel points of sub-block B, (135+128+130+145)/4 = 134, and takes it as the texture feature parameter of sub-block B.
If the unit to be encoded is a color image, the luminance-chrominance values (in the order YUV, where Y represents luminance and U and V represent chrominance) of the four pixels of sub-block A are (78, 135, 160), (85, 98, 64), (58, 69, 120) and (46, 84, 90), respectively, and those of the four pixels of sub-block B are (98, 135, 25), (63, 55, 46), (72, 89, 85) and (74, 76, 230), respectively. The terminal takes the component-wise average (67, 96, 108) of the luminance-chrominance values of the four pixel points of sub-block A as the texture feature parameter of sub-block A, and the average (77, 89, 96) of those of sub-block B as the texture feature parameter of sub-block B.
Taking statistical information as the pixel value variance of a plurality of reference pixels as an example, the terminal divides a unit to be encoded into a sub-block A and a sub-block B, wherein the sub-block A and the sub-block B are both composed of four pixels.
If the unit to be encoded is a gray image, the gray values of the four pixel points of sub-block A are 125, 110, 108 and 98, the gray values of the four pixel points of sub-block B are 135, 128, 130 and 145, the average gray value of sub-block A is 110, and the average gray value of sub-block B is 134. The terminal determines the gray value variance of sub-block A as [(125-110)² + (110-110)² + (108-110)² + (98-110)²]/4 = 93 and takes the variance 93 as the texture feature parameter of sub-block A. The terminal determines the gray value variance of sub-block B as [(135-134)² + (128-134)² + (130-134)² + (145-134)²]/4 = 43 and takes the variance 43 as the texture feature parameter of sub-block B.
If the unit to be encoded is a color image, the luminance-chrominance values (in the order YUV) of the four pixels of sub-block A are (78, 135, 160), (85, 98, 64), (58, 69, 120) and (46, 84, 90), and those of the four pixels of sub-block B are (98, 135, 25), (63, 55, 46), (72, 89, 85) and (74, 76, 230). The terminal determines the variance of the three luminance-chrominance components of sub-block A to be (241, 599, 1277) and uses it as the texture feature parameter of sub-block A. The terminal determines the variance of the three luminance-chrominance components of sub-block B to be (167, 860, 6404) and uses it as the texture feature parameter of sub-block B.
Taking the statistical information as the image gradient parameters of a plurality of reference pixels as an example, the terminal divides the unit to be encoded into sub-block A and sub-block B, each composed of four pixels. The arrangement of the pixels in sub-block A and sub-block B is shown in fig. 8, and for ease of understanding the direction along which the pixels are arranged is defined as the x direction in both sub-blocks.
If the unit to be encoded is a grayscale image, in one possible embodiment, the terminal can determine the image gradients of sub-block A and sub-block B by using the gradient approximation formula (3).
Gx(x, y) = H(x+1, y) - H(x-1, y) (3)
Wherein Gx(x, y) is the image gradient in the x direction, H(x, y) is the gray value of the pixel point, and (x, y) are the coordinates of the pixel point.
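As an illustration, the central-difference gradient of formula (3) applied to a small grayscale block; the pixel values are illustrative, and restricting the result to interior columns is an implementation choice, since the formula needs both neighbours.

import numpy as np

def grad_x(h):
    # G_x(x, y) = H(x+1, y) - H(x-1, y), evaluated where both neighbours exist.
    return h[:, 2:] - h[:, :-2]

h = np.array([[125., 110., 108., 98.],
              [135., 128., 130., 145.]])
print(grad_x(h))  # [[-17. -12.], [-5. 17.]]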
Taking the statistical information as the standard deviation of the pixel values of a plurality of reference pixels as an example, the terminal divides the unit to be encoded into a sub-block A and a sub-block B, wherein the sub-block A and the sub-block B are both composed of four pixels.
If the unit to be encoded is a gray image, the gray values of the four pixel points of the sub-block a are 125, 110, 108 and 98, the gray values of the four pixel points of the sub-block B are 135, 128, 130 and 145, the average value of the gray values of the sub-block a is 110, and the average value of the gray values of the sub-block B is 134. The terminal determines that the standard deviation of the sub-block a is 9 and the standard deviation of the sub-block B is 6. The terminal takes the standard deviation 9 as the texture characteristic parameter of the sub-block A and takes the standard deviation 6 as the texture characteristic parameter of the sub-block B.
If the unit to be encoded is a color image, the luminance-chrominance values (in the order YUV) of the four pixels of sub-block A are (78, 135, 160), (85, 98, 64), (58, 69, 120) and (46, 84, 90), and those of the four pixels of sub-block B are (98, 135, 25), (63, 55, 46), (72, 89, 85) and (74, 76, 230). The terminal determines the standard deviation of the three luminance-chrominance components of sub-block A to be (15, 24, 35) and uses it as the texture feature parameter of sub-block A. The terminal determines the standard deviation of the three luminance-chrominance components of sub-block B to be (13, 29, 80) and uses it as the texture feature parameter of sub-block B.
It should be noted that the terminal can execute step 405 after executing step 404, can execute step 405 before executing step 404, and can execute step 405 while executing step 404, and the execution order of step 404 and step 405 is not limited in the embodiment of the present application.
405. And the terminal acquires the texture characteristic parameters of the unit to be coded.
The texture characteristic parameter of the unit to be encoded is at least one of an average pixel value, a pixel value variance, an image gradient parameter, a pixel value root mean square error and a pixel value standard deviation of pixel values of a plurality of pixel points in the unit to be encoded.
In a possible implementation, the terminal is capable of determining texture feature parameters of the unit to be encoded at the beginning of video encoding, and storing the identifier of the unit to be encoded and the texture feature parameters in a storage space of the terminal. When the texture feature parameters of the unit to be coded need to be obtained, the terminal can obtain the texture feature parameters of the unit to be coded from the storage space according to the identifier of the unit to be coded. The method for obtaining texture parameters of the unit to be encoded and the method for obtaining texture feature parameters of at least two sub-blocks in step 404 belong to the same inventive concept, and the obtaining method may refer to the description in step 404, and is not described herein again.
In this implementation manner, the terminal can determine the texture feature parameters of the unit to be encoded in advance, and store the texture feature parameters of the unit to be encoded in advance. When the texture characteristic parameters of the unit to be coded need to be obtained, the terminal can directly obtain the texture characteristic parameters of the unit to be coded, so that the time consumed in the video coding process is saved, and the video coding efficiency is improved.
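A minimal caching sketch for this implementation: texture feature parameters are computed once at the start of encoding and later looked up by the unit's identifier. The dictionary store and the identifier format are assumptions for illustration, not the patent's storage layout.

texture_cache = {}

def store_param(unit_id, param):
    # Save the texture feature parameter under the unit's identifier.
    texture_cache[unit_id] = param

def fetch_param(unit_id):
    # Retrieve the pre-computed parameter instead of recomputing it.
    return texture_cache[unit_id]

store_param("cu_64x64_at_0_0", 110.25)  # hypothetical identifier and value
print(fetch_param("cu_64x64_at_0_0"))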
Of course, in addition to the foregoing embodiments, the terminal may also be capable of acquiring the texture feature parameters of the unit to be encoded in real time after the above step 404 is executed, and the timing for acquiring the texture feature parameters of the unit to be encoded by the terminal is not limited in the embodiment of the present application.
After performing step 405, the terminal can determine whether to perform step 406 or step 407 according to difference information between texture parameters of at least two sub-blocks and texture parameters of a unit to be encoded. In response to that the difference information between the texture feature parameters of the at least two sub-blocks and the texture feature parameters of the unit to be encoded meets the target condition, the terminal performs step 406; in response to that the difference information between the texture feature parameters of the at least two sub-blocks and the texture feature parameters of the unit to be encoded does not meet the target condition, the terminal performs step 407.
406. And in response to that the difference information between the texture characteristic parameters of the at least two sub-blocks and the texture characteristic parameters of the unit to be coded meets the target condition, the terminal does not divide the unit to be coded according to the target division mode during video coding.
In one possible implementation, in response to that the sum of texture feature parameters of at least two sub-blocks is greater than the product of the texture feature parameter of the unit to be encoded and a first threshold value, the terminal does not divide the unit to be encoded according to the target division mode during video encoding, and the first threshold value is used for indicating the definition of video encoding.
Next, the effects of setting the first threshold to different values in this embodiment are described.
If the terminal sets the first threshold to a larger value, fewer units to be encoded satisfy the condition that the sum of the texture feature parameters of the at least two sub-blocks is greater than the product of the texture feature parameter of the unit to be encoded and the first threshold; the terminal then divides the image frames of the video into more sub-blocks, and the image obtained by video encoding is clearer. If the terminal sets the first threshold to a smaller value, more units to be encoded satisfy that condition; the terminal can then proceed to subsequent encoding without dividing the image frames into as many sub-blocks, which markedly reduces the computing resources consumed in the encoding process and speeds up video encoding.
Having explained the first threshold, the above implementation is illustrated below through several examples.
1. Taking the texture feature parameter as the average pixel value of a plurality of pixel points and the target division mode as binary tree division as an example, the terminal can divide the unit to be encoded into sub-block A and sub-block B, each of which comprises a plurality of pixel points.
If the unit to be encoded is a gray image, the terminal determines the average gray values of the pixel points in sub-block A and sub-block B to be M0 and M1 respectively, and the average gray value of the pixel points in the unit to be encoded to be Mp. If the terminal sets the first threshold λ1, the terminal determines whether M0, M1 and Mp satisfy formula (4). If they do, the terminal does not process the unit to be encoded according to binary tree division during video encoding and directly performs subsequent encoding on the unit to be encoded.
M0 + M1 > λ1·Mp (4)
If the unit to be encoded is a color image, the terminal determines the average luminance-chrominance values of sub-block A and sub-block B to be (Y1, U1, V1) and (Y2, U2, V2) respectively, and the average luminance-chrominance value of the unit to be encoded to be (Yp, Up, Vp). If the terminal sets the first threshold λ2, the terminal determines whether the three satisfy formula (5). If they do, the terminal does not process the unit to be encoded according to binary tree division during video encoding and directly performs subsequent encoding on the unit to be encoded.
(α1Y1 + α2U1 + α3V1) + (α1Y2 + α2U2 + α3V2) > λ2(α1Yp + α2Up + α3Vp) (5)
Wherein α1, α2 and α3 are coefficients, α1 + α2 + α3 = 1, α1 > α2 and α1 > α3.
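The two conditions above, sketched directly; λ1, λ2 and the α weights are free parameters in the text, so the values below are only placeholders (with α1 largest and the three summing to 1, as required).

def skip_bt_gray(m0, m1, mp, lam1):
    # Formula (4): skip binary tree division when M0 + M1 > lambda1 * Mp.
    return m0 + m1 > lam1 * mp

def weighted(yuv, a1=0.6, a2=0.2, a3=0.2):
    # Weighted YUV combination used on both sides of formula (5).
    y, u, v = yuv
    return a1 * y + a2 * u + a3 * v

def skip_bt_color(avg_a, avg_b, avg_parent, lam2):
    # Formula (5) with average (Y, U, V) tuples per block.
    return weighted(avg_a) + weighted(avg_b) > lam2 * weighted(avg_parent)

print(skip_bt_gray(110, 134, 122, 1.9))  # True: 244 > 231.8
print(skip_bt_color((67, 96, 108), (77, 89, 96), (72, 92, 102), 1.9))  # True: 164.2 > 155.8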
2. Taking the texture feature parameter as the average pixel value of a plurality of pixel points and the target division mode as quadtree division as an example, the terminal can divide the unit to be encoded into sub-block A, sub-block B, sub-block C and sub-block D, each of which comprises a plurality of pixel points.
If the unit to be encoded is a gray image, the terminal determines the average gray values of the pixel points in sub-block A, sub-block B, sub-block C and sub-block D to be M0, M1, M2 and M3 respectively, and the average gray value of the pixel points in the unit to be encoded to be Mp. If the terminal sets the first threshold λ3, the terminal determines whether M0, M1, M2, M3 and Mp satisfy formula (6). If they do, the terminal does not process the unit to be encoded according to quadtree division during video encoding and directly performs subsequent encoding on the unit to be encoded.
M0 + M1 + M2 + M3 > λ3·Mp (6)
If the unit to be encoded is a color image, the terminal determines the average luminance-chrominance values of sub-block A, sub-block B, sub-block C and sub-block D to be (Y1, U1, V1), (Y2, U2, V2), (Y3, U3, V3) and (Y4, U4, V4) respectively, and the average luminance-chrominance value of the unit to be encoded to be (Yp, Up, Vp). If the terminal sets the first threshold λ4, the terminal determines whether the five satisfy formula (7). If they do, the terminal does not process the unit to be encoded according to quadtree division during video encoding and directly performs subsequent encoding on the unit to be encoded.
(α1Y1 + α2U1 + α3V1) + (α1Y2 + α2U2 + α3V2) + (α1Y3 + α2U3 + α3V3) + (α1Y4 + α2U4 + α3V4) > λ4(α1Yp + α2Up + α3Vp) (7)
Wherein α1, α2 and α3 are coefficients, α1 + α2 + α3 = 1, α1 > α2 and α1 > α3.
3. Taking the texture feature parameter as the pixel value variance of a plurality of pixel points and the target division mode as binary tree division as an example, the terminal can divide the unit to be encoded into sub-block A and sub-block B, each of which comprises a plurality of pixel points.
If the unit to be encoded is a gray image, the terminal determines the gray value variances of the pixel points in sub-block A and sub-block B to be V0 and V1 respectively, and the gray value variance of the pixel points in the unit to be encoded to be Vp. If the terminal sets the first threshold λ5, the terminal determines whether V0, V1 and Vp satisfy formula (8). If they do, the terminal does not process the unit to be encoded according to binary tree division during video encoding and directly performs subsequent encoding on the unit to be encoded.
V0 + V1 > λ5·Vp (8)
If the unit to be encoded is a color image, the terminal determines the variances of the luminance-chrominance values of sub-block A and sub-block B to be (VY1, VU1, VV1) and (VY2, VU2, VV2) respectively, and the variance of the luminance-chrominance values of the unit to be encoded to be (VYp, VUp, VVp). If the terminal sets the first threshold λ6, the terminal determines whether the three satisfy formula (9). If they do, the terminal does not process the unit to be encoded according to binary tree division during video encoding and directly performs subsequent encoding on the unit to be encoded.
(α1VY1 + α2VU1 + α3VV1) + (α1VY2 + α2VU2 + α3VV2) > λ6(α1VYp + α2VUp + α3VVp) (9)
Wherein α1, α2 and α3 are coefficients, α1 + α2 + α3 = 1, α1 > α2 and α1 > α3.
4. Taking the texture feature parameter as the pixel value variance of a plurality of pixel points and the target division mode as quadtree division as an example, the terminal can divide the unit to be encoded into sub-block A, sub-block B, sub-block C and sub-block D, each of which comprises a plurality of pixel points.
If the unit to be encoded is a gray image, the terminal determines the gray value variances of the pixel points in sub-block A, sub-block B, sub-block C and sub-block D to be V0, V1, V2 and V3 respectively, and the gray value variance of the pixel points in the unit to be encoded to be Vp. If the terminal sets the first threshold λ7, the terminal determines whether V0, V1, V2, V3 and Vp satisfy formula (10). If they do, the terminal does not process the unit to be encoded according to quadtree division during video encoding and directly performs subsequent encoding on the unit to be encoded.
V0 + V1 + V2 + V3 > λ7·Vp (10)
If the unit to be encoded is a color image, the terminal determines the variances of the luminance-chrominance values of sub-block A, sub-block B, sub-block C and sub-block D to be (VY1, VU1, VV1), (VY2, VU2, VV2), (VY3, VU3, VV3) and (VY4, VU4, VV4) respectively, and the variance of the luminance-chrominance values of the unit to be encoded to be (VYp, VUp, VVp). If the terminal sets the first threshold λ8, the terminal determines whether the five satisfy formula (11). If they do, the terminal does not process the unit to be encoded according to quadtree division during video encoding and directly performs subsequent encoding on the unit to be encoded.
(α1VY1 + α2VU1 + α3VV1) + (α1VY2 + α2VU2 + α3VV2) + (α1VY3 + α2VU3 + α3VV3) + (α1VY4 + α2VU4 + α3VV4) > λ8(α1VYp + α2VUp + α3VVp) (11)
Wherein α1, α2 and α3 are coefficients, α1 + α2 + α3 = 1, α1 > α2 and α1 > α3.
It should be noted that the above four examples take the texture feature parameter as the average pixel value or the pixel value variance of a plurality of pixel points, and the target division mode as binary tree or quadtree division. In other possible embodiments, the texture feature parameter can also be any one of the image gradient parameter, the pixel value root mean square error and the pixel value standard deviation; the method by which the terminal performs the judgment based on these other feature parameters belongs to the same inventive concept as the above four examples, whose descriptions may be consulted for the implementation, and is not repeated here.
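Since every variant follows the same shape, a single generic check covers formulas (4), (6), (8) and (10); the statistic plugged in (mean, variance, gradient, RMSE or standard deviation) and the threshold are whatever the encoder configures, and the sample values below are illustrative.

def skip_partition(sub_params, parent_param, threshold):
    # Skip the pre-divided split when the summed sub-block statistics exceed
    # threshold * parent statistic, for any number of sub-blocks.
    return sum(sub_params) > threshold * parent_param

print(skip_partition([93, 43], 70, 1.9))               # binary tree, variances: True (136 > 133)
print(skip_partition([110, 134, 120, 126], 128, 3.9))  # quadtree, means: False (490 < 499.2)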
407. And in response to that the difference information between the texture characteristic parameters of the at least two sub-blocks and the texture characteristic parameters of the unit to be encoded does not meet the target condition, the terminal divides the unit to be encoded into the at least two sub-blocks according to the target division mode.
Optionally, after step 407, the terminal can continue to perform steps 401-407 on any one of the at least two sub-blocks obtained by the division, and determine whether that sub-block needs to be divided further.
It should be noted that steps 401-407 are described with the terminal as the execution subject. In other possible embodiments, steps 401-407 can also be implemented with the server as the execution subject, that is, the terminal uploads the video to the server and the server encodes it using the video encoding method provided by the present application; the embodiment of the present application does not limit the execution subject of the method.
Through the above steps 401-407, the terminal can determine whether a unit to be encoded is divided according to the degree of texture similarity between the unit to be encoded and the at least two sub-blocks obtained by pre-dividing it. When the at least two sub-blocks are highly similar to the unit to be encoded, the terminal determines not to divide the unit to be encoded. The terminal therefore does not need to traverse every mode of dividing the unit to be encoded and only divides units whose texture similarity to their sub-blocks is low, which reduces the number of sub-blocks, lowers coding complexity, and shortens encoding time while ensuring video coding quality.
In addition to the method shown in steps 401-407, an embodiment of the present application provides another video encoding method. Taking the terminal as the execution subject for explanation, referring to fig. 9, the method includes:
901. in the video coding process, the terminal determines the average pixel value and the pixel value variance of the pixel points in the unit to be coded according to the pixel values of the pixel points in the unit to be coded.
The method for determining the average pixel value and the variance of the pixel values of the pixel points in the unit to be encoded by the terminal belongs to the same inventive concept as that in the step 404, and the implementation method refers to the description of the above four examples, which is not described herein again.
902. And in response to the fact that the pixel value variance is larger than the product of the average pixel value and a first threshold value, the terminal does not divide the unit to be coded during video coding, and the first threshold value is used for representing the definition of the video coding.
For example, if the pixel value variance of the unit to be encoded is Vp, the average pixel value is Mp and the first threshold is λ9, then when Vp, Mp and λ9 satisfy formula (12), the terminal determines not to divide the unit to be encoded during video encoding.
Vp > λ9 × Mp (12)
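A sketch of this pre-division-free check of formula (12); the λ9 value and the 2 × 2 unit are illustrative.

import numpy as np

def skip_by_variance(unit, lam9):
    # Formula (12): do not divide when Vp > lambda9 * Mp.
    return float(unit.var()) > lam9 * float(unit.mean())

unit = np.array([[125., 110.], [108., 98.]])
print(skip_by_variance(unit, 0.5))  # True: ~93.2 > ~55.1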
It should be noted that steps 901 and 902 are described with the terminal as the execution subject. In other possible embodiments, steps 901 and 902 can also be implemented with the server as the execution subject, that is, the terminal uploads the video to the server and the server encodes it using the video encoding method provided in this application; the embodiment of the present application does not limit the execution subject of the method.
Through steps 901 and 902, the terminal does not need to pre-divide the unit to be encoded; it can determine whether the unit needs to be divided directly from the average pixel value and the pixel value variance of its pixel points, which makes video encoding more efficient.
In addition to steps 401-407 and steps 901-902, the embodiment of the present application also provides another video encoding method. Taking the terminal as the execution subject for explanation, and referring to fig. 10, the method includes:
1001. In the video coding process, the terminal determines rate distortion parameters corresponding to multiple division modes for pre-dividing the unit to be encoded. The rate distortion parameter is the sum of the coding rate and the image distortion rate, where the image distortion rate is the distortion between the unit to be encoded and the unit obtained by dividing it with one division mode and encoding the divided sub-blocks.
Step 1001 and step 401 belong to the same inventive concept; for the implementation, refer to the description of step 401, which is not repeated here.
1002. From the rate distortion parameters corresponding to the multiple division modes, the terminal determines the division mode corresponding to the minimum rate distortion parameter as the target division mode.
Step 1002 and step 402 belong to the same inventive concept; for the implementation, refer to the description of step 402, which is not repeated here.
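A minimal sketch of the selection in steps 1001-1002 follows, assuming the rate distortion parameters have already been computed per candidate mode; the mode names and numeric values are placeholders for illustration, not measured results.

```python
def select_target_mode(rd_params: dict[str, float]) -> str:
    """Pick the division mode whose rate distortion parameter
    (coding rate + image distortion rate) is minimal."""
    return min(rd_params, key=rd_params.get)

# Hypothetical usage; the values below are placeholders.
rd_params = {"binary_tree": 41.2, "ternary_tree": 39.8, "quad_tree": 44.5}
target_mode = select_target_mode(rd_params)  # -> "ternary_tree"
```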
1003. The terminal pre-divides the unit to be encoded according to the target division mode to obtain at least two sub-blocks.
Step 1003 and step 403 belong to the same inventive concept; for the implementation, refer to the description of step 403, which is not repeated here.
1004. The terminal acquires the texture characteristic parameters of the at least two sub-blocks; the texture characteristic parameters are used to represent the image textures of the corresponding sub-blocks.
Step 1004 and step 404 belong to the same inventive concept; for the implementation, refer to the description of step 404, which is not repeated here.
1005. In response to the difference information between the texture characteristic parameters of the at least two sub-blocks meeting the target difference condition, the terminal does not divide the unit to be encoded according to the target division mode during video encoding.
In a possible implementation manner, in response to the difference value between the texture feature parameters of the at least two sub-blocks being smaller than a second threshold, the terminal does not divide the unit to be encoded according to the target division mode during video encoding; the second threshold is used to indicate the definition of the video encoding.
The above implementation is explained below with two examples.
1. Taking the texture characteristic parameter as the average pixel value of a plurality of pixel points and the target division mode as binary tree division as an example: the terminal can divide the unit to be encoded into sub-block A and sub-block B, each of which comprises a plurality of pixel points.
If the unit to be encoded is a grayscale image, the terminal determines the average gray values of the pixel points in sub-block A and sub-block B to be M_0 and M_1 respectively, and the difference between them to be |M_0 − M_1|. If the second threshold is μ_1, then when |M_0 − M_1| < μ_1, the terminal determines not to process the unit to be encoded according to the binary tree division and directly performs subsequent encoding on it.
If the unit to be encoded is a color image, the terminal determines the average luminance-chrominance values of sub-block A and sub-block B to be (Y_1, U_1, V_1) and (Y_2, U_2, V_2) respectively, and the difference between the pixel points of the two sub-blocks to be α_1|Y_1 − Y_2| + α_2|U_1 − U_2| + α_3|V_1 − V_2|. If the second threshold is μ_2, then when α_1|Y_1 − Y_2| + α_2|U_1 − U_2| + α_3|V_1 − V_2| < μ_2, the terminal determines not to process the unit to be encoded according to the binary tree division and directly performs subsequent encoding on it.
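The two binary-tree checks above can be sketched as follows; the function names and the NumPy representation of the sub-blocks are assumptions for illustration only.

```python
import numpy as np

def skip_binary_split_gray(sub_a: np.ndarray, sub_b: np.ndarray, mu_1: float) -> bool:
    """Grayscale case: skip the binary tree division when |M_0 - M_1| < mu_1."""
    return abs(float(sub_a.mean()) - float(sub_b.mean())) < mu_1

def skip_binary_split_yuv(yuv_a, yuv_b, alphas, mu_2: float) -> bool:
    """Color case: yuv_a and yuv_b are the (Y, U, V) averages of the two
    sub-blocks, and alphas the weights (alpha_1, alpha_2, alpha_3).
    Skip the division when the weighted difference is below mu_2."""
    diff = sum(a * abs(p - q) for a, p, q in zip(alphas, yuv_a, yuv_b))
    return diff < mu_2
```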
2. Taking the texture characteristic parameter as the average pixel value of a plurality of pixel points and the target division mode as quadtree division as an example: the terminal can divide the unit to be encoded into sub-block A, sub-block B, sub-block C, and sub-block D, each of which comprises a plurality of pixel points.
If the unit to be encoded is a grayscale image, the terminal determines the average gray values of the pixel points in sub-blocks A, B, C, and D to be M_0, M_1, M_2, and M_3 respectively, and then determines the difference of the average gray values between every two of these sub-blocks, namely |M_0 − M_1|, |M_0 − M_2|, |M_0 − M_3|, |M_1 − M_2|, |M_1 − M_3|, and |M_2 − M_3|. If the second threshold is μ_3, then when any one of these differences is less than μ_3, the terminal determines not to process the unit to be encoded according to the quadtree division and directly performs subsequent encoding on it.
If the unit to be encoded is a color image, the terminal determines the average luminance-chrominance values of sub-blocks A, B, C, and D to be (Y_1, U_1, V_1), (Y_2, U_2, V_2), (Y_3, U_3, V_3), and (Y_4, U_4, V_4) respectively, and then determines the component-wise differences between every two of these sub-blocks, namely (|Y_1 − Y_2|, |U_1 − U_2|, |V_1 − V_2|), (|Y_1 − Y_3|, |U_1 − U_3|, |V_1 − V_3|), (|Y_1 − Y_4|, |U_1 − U_4|, |V_1 − V_4|), (|Y_2 − Y_3|, |U_2 − U_3|, |V_2 − V_3|), (|Y_2 − Y_4|, |U_2 − U_4|, |V_2 − V_4|), and (|Y_3 − Y_4|, |U_3 − U_4|, |V_3 − V_4|). If the second threshold is (Y_μ4, U_μ4, V_μ4), then when any of these differences is less than the corresponding component of the threshold, the terminal determines not to process the unit to be encoded according to the quadtree division and directly performs subsequent encoding on it.
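A sketch of the grayscale quadtree check above; per the example, the division is skipped when any pairwise difference of average gray values falls below μ_3. The function name and the array representation of sub-blocks are assumptions.

```python
from itertools import combinations
import numpy as np

def skip_quad_split_gray(sub_blocks, mu_3: float) -> bool:
    """sub_blocks: the four pre-divided sub-blocks A, B, C, D as arrays.
    Skip the quadtree division when ANY pairwise difference of the
    average gray values is below mu_3, as in the example above."""
    means = [float(b.mean()) for b in sub_blocks]
    return any(abs(m - n) < mu_3 for m, n in combinations(means, 2))
```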
It should be noted that the two examples above take the texture feature parameter as the average pixel value of a plurality of pixel points and the target partition mode as binary tree or quadtree division. In other possible embodiments, the texture feature parameter may also be any one of the pixel value variance, an image gradient parameter, the pixel value root mean square error, and the pixel value standard deviation; the terminal's judgment based on these other feature parameters belongs to the same inventive concept as the two examples, whose description can be referred to and is not repeated here.
In addition, steps 1001-1005 are described with the terminal as the execution subject. In other possible embodiments, steps 1001-1005 can also be implemented with a server as the execution subject: the terminal uploads the video to the server, and the server encodes it using the video encoding method provided by this application. The embodiment of this application does not limit the execution subject of the method.
Through steps 1001-1005, the terminal can determine whether to divide the unit to be encoded according to the degree of texture similarity between the at least two sub-blocks obtained by pre-dividing it. When that similarity is high, the terminal determines not to divide the unit to be encoded. The terminal therefore does not need to traverse all division modes; it divides only units whose sub-blocks differ from one another in texture, which reduces the number of sub-blocks, lowers the coding complexity, and shortens the encoding time while preserving the video coding quality.
Fig. 11 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application, and referring to fig. 11, the apparatus includes: a pre-dividing module 1101, an obtaining module 1102, and a dividing module 1103.
The pre-dividing module 1101 is configured to pre-divide the unit to be encoded according to the target division mode in the video encoding process to obtain at least two sub-blocks.
An obtaining module 1102, configured to obtain texture feature parameters of at least two sub-blocks, where the texture feature parameters are used to represent image textures of corresponding sub-blocks.
A dividing module 1103, configured to, in response to the difference information between the texture feature parameters of the at least two sub-blocks and the texture feature parameter of the unit to be encoded meeting the target condition, not divide the unit to be encoded according to the target division mode during video encoding.
In a possible embodiment, the dividing module is configured to, in response to that a sum of texture feature parameters of the at least two sub-blocks is greater than a product of the texture feature parameter of the unit to be encoded and a first threshold, when the video is encoded, not divide the unit to be encoded according to the target division mode, where the first threshold is used to indicate a sharpness of the video encoding.
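A minimal sketch of this condition, assuming the texture feature parameters have already been obtained as scalars; the function and parameter names are illustrative only.

```python
def skip_division(sub_params, parent_param: float, lambda_1: float) -> bool:
    """Do not divide according to the target division mode when the sum of
    the sub-blocks' texture feature parameters exceeds the unit's own
    texture feature parameter multiplied by the first threshold."""
    return sum(sub_params) > lambda_1 * parent_param
```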
In a possible implementation manner, the obtaining module is configured to obtain pixel values of a plurality of reference pixels in at least two sub-blocks, obtain statistical information of the at least two sub-blocks based on the pixel values of the plurality of reference pixels, where the statistical information is at least one of an average pixel value, a pixel value variance, an image gradient parameter, a pixel value root-mean-square error, and a pixel value standard deviation of the reference pixels in one sub-block, and use the statistical information of the at least two sub-blocks as texture feature parameters of the at least two sub-blocks.
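The statistics named above might be computed along the following lines; this is a sketch, and the reference value used for the root mean square error is an assumption, since the text does not specify it.

```python
import numpy as np

def texture_feature(block: np.ndarray, kind: str = "variance") -> float:
    """Compute one of the statistics named above over a sub-block's
    reference pixel values."""
    pixels = block.astype(float)
    if kind == "mean":
        return float(pixels.mean())   # average pixel value
    if kind == "variance":
        return float(pixels.var())    # pixel value variance
    if kind == "std":
        return float(pixels.std())    # pixel value standard deviation
    if kind == "gradient":
        gy, gx = np.gradient(pixels)  # image gradient parameter
        return float(np.mean(np.hypot(gx, gy)))
    if kind == "rmse":
        # The reference for the root mean square error is not specified
        # in the text; the block mean is assumed here.
        return float(np.sqrt(np.mean((pixels - pixels.mean()) ** 2)))
    raise ValueError(f"unknown statistic: {kind}")
```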
In one possible implementation, the pre-dividing module is configured to: in response to the target division mode being binary tree division, pre-divide the unit to be encoded into two rectangular sub-blocks with the same area; or, in response to the target division mode being ternary tree division, pre-divide the unit to be encoded into three rectangular sub-blocks with the same area; or, in response to the target division mode being quadtree division, pre-divide the unit to be encoded into four square sub-blocks with the same area.
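For illustration, the three pre-division geometries described above can be sketched as rectangle arithmetic; the (x, y, width, height) representation and the horizontal/vertical orientation flag are assumptions, since the text only fixes the number and shape of the sub-blocks.

```python
def pre_divide(x: int, y: int, w: int, h: int, mode: str, horizontal: bool = True):
    """Return the (x, y, width, height) rectangles produced by each mode."""
    if mode == "binary_tree":   # two rectangular sub-blocks of equal area
        if horizontal:
            return [(x, y, w, h // 2), (x, y + h // 2, w, h - h // 2)]
        return [(x, y, w // 2, h), (x + w // 2, y, w - w // 2, h)]
    if mode == "ternary_tree":  # three rectangular sub-blocks of equal area
        if horizontal:
            t = h // 3
            return [(x, y, w, t), (x, y + t, w, t), (x, y + 2 * t, w, h - 2 * t)]
        t = w // 3
        return [(x, y, t, h), (x + t, y, t, h), (x + 2 * t, y, w - 2 * t, h)]
    if mode == "quad_tree":     # four square sub-blocks of equal area
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, w - hw, hh),
                (x, y + hh, hw, h - hh), (x + hw, y + hh, w - hw, h - hh)]
    raise ValueError(f"unknown division mode: {mode}")
```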
In one possible embodiment, the apparatus further comprises:
and the rate distortion determining module is used for determining rate distortion parameters corresponding to multiple division modes for pre-dividing the unit to be encoded, where the rate distortion parameter is the sum of the coding rate and the image distortion rate, and the image distortion rate is the distortion between the unit obtained by dividing the unit to be encoded with one division mode and encoding the divided sub-blocks, and the unit to be encoded itself.
And the division mode determining module is used for determining the division mode corresponding to the minimum rate distortion parameter as the target division mode from the rate distortion parameters corresponding to the multiple division modes.
In a possible implementation manner, the texture feature parameter of the unit to be encoded is at least one of the average pixel value, pixel value variance, image gradient parameter, pixel value root mean square error, and pixel value standard deviation of the pixel values of a plurality of pixel points in the unit to be encoded.
In a possible embodiment, the dividing module is further configured to divide the unit to be encoded into the at least two sub-blocks according to the target division mode, in response to the difference information between the texture feature parameters of the at least two sub-blocks and the texture feature parameter of the unit to be encoded not meeting the target condition.
In one possible embodiment, the apparatus further comprises:
and the determining module is used for determining the average pixel value and the pixel value variance of the pixel points in the unit to be coded according to the pixel values of the pixel points in the unit to be coded in the video coding process.
The dividing module is further configured to, in response to the pixel value variance being greater than a product of the average pixel value and a first threshold, not divide the unit to be encoded during video encoding, where the first threshold is used to indicate the sharpness of the video encoding.
In a possible embodiment, the dividing module is further configured to, in response to the difference information between the texture feature parameters of the at least two sub-blocks meeting a target difference condition, not divide the unit to be encoded according to the target division mode during video encoding.
According to the technical scheme provided by the embodiment of the application, the terminal can determine whether to divide the unit to be encoded according to the degree of texture similarity between the unit to be encoded and the at least two sub-blocks obtained by pre-dividing it. When that similarity is high, the terminal determines not to divide the unit to be encoded. The terminal therefore does not need to traverse all division modes; it divides only units whose texture differs from that of their sub-blocks, which reduces the number of sub-blocks, lowers the coding complexity, and shortens the encoding time while preserving the video coding quality.
The embodiment of the present application provides a computer device, configured to execute the methods provided in the foregoing embodiments, where the computer device may be implemented as a terminal or a server, and the structure of the terminal is described below first:
fig. 12 is a schematic structural diagram of a terminal according to an embodiment of the present application. The terminal 1200 may be: smart phones, tablet computers, notebook computers, desktop computers, smart speakers, smart watches, and the like. Terminal 1200 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and so forth.
In general, terminal 1200 includes: one or more processors 1201 and one or more memories 1202.
The processor 1201 may include one or more processing cores, such as a 4-core processor, an 8-core processor, or the like. The processor 1201 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1201 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 1201 may be integrated with a GPU (Graphics Processing Unit) that is responsible for rendering and drawing content that the display screen needs to display. In some embodiments, the processor 1201 may further include an AI (Artificial Intelligence) processor for processing a computing operation related to machine learning.
Memory 1202 may include one or more computer-readable storage media, which may be non-transitory. Memory 1202 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 1202 is used to store at least one program code for execution by the processor 1201 to implement the video encoding methods provided by the method embodiments herein.
In some embodiments, the terminal 1200 may further optionally include: a peripheral interface 1203 and at least one peripheral. The processor 1201, memory 1202, and peripheral interface 1203 may be connected by a bus or signal line. Various peripheral devices may be connected to peripheral interface 1203 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1204, display 1205, camera assembly 1206, audio circuitry 1207, positioning assembly 1208, and power supply 1209.
The peripheral interface 1203 may be used to connect at least one peripheral associated with I/O (Input/Output) to the processor 1201 and the memory 1202. In some embodiments, the processor 1201, memory 1202, and peripheral interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202 and the peripheral device interface 1203 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The Radio Frequency circuit 1204 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1204 communicates with a communication network and other communication devices by electromagnetic signals. The radio frequency circuit 1204 converts an electric signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electric signal. Optionally, the radio frequency circuit 1204 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth.
The display screen 1205 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1205 is a touch display screen, the display screen 1205 also has the ability to acquire touch signals on or over the surface of the display screen 1205. The touch signal may be input to the processor 1201 as a control signal for processing. At this point, the display 1205 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard.
Camera assembly 1206 is used to capture images or video. Optionally, camera assembly 1206 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal.
The audio circuitry 1207 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals into the processor 1201 for processing or inputting the electric signals into the radio frequency circuit 1204 to achieve voice communication.
The positioning component 1208 is configured to locate a current geographic Location of the terminal 1200 to implement navigation or LBS (Location Based Service).
The power supply 1209 is used to provide power to various components within the terminal 1200. The power source 1209 may be alternating current, direct current, disposable or rechargeable.
In some embodiments, terminal 1200 also includes one or more sensors 1210. The one or more sensors 1210 include, but are not limited to: acceleration sensor 1211, gyro sensor 1212, pressure sensor 1213, fingerprint sensor 1214, optical sensor 1215, and proximity sensor 1216.
The acceleration sensor 1211 can detect magnitudes of accelerations on three coordinate axes of the coordinate system established with the terminal 1200.
The gyro sensor 1212 may detect a body direction and a rotation angle of the terminal 1200, and the gyro sensor 1212 may collect a 3D motion of the user on the terminal 1200 in cooperation with the acceleration sensor 1211.
Pressure sensors 1213 may be disposed on the side frame of terminal 1200 and/or at a lower layer of display 1205. When the pressure sensor 1213 is disposed on the side frame of the terminal 1200, it can detect the user's holding signal on the terminal 1200, and the processor 1201 performs left-right hand recognition or quick operations according to the holding signal collected by the pressure sensor 1213. When the pressure sensor 1213 is disposed at the lower layer of the display screen 1205, the processor 1201 controls the operability controls on the UI according to the user's pressure operation on the display screen 1205.
The fingerprint sensor 1214 is used for collecting a fingerprint of the user, and the processor 1201 identifies the user according to the fingerprint collected by the fingerprint sensor 1214, or the fingerprint sensor 1214 identifies the user according to the collected fingerprint.
The optical sensor 1215 is used to collect the ambient light intensity. In one embodiment, the processor 1201 may control the display brightness of the display 1205 according to the ambient light intensity collected by the optical sensor 1215. The proximity sensor 1216 is used to collect a distance between the user and the front surface of the terminal 1200.
Those skilled in the art will appreciate that the configuration shown in fig. 12 is not intended to be limiting of terminal 1200 and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
The computer device may also be implemented as a server, and the following describes a structure of the server:
fig. 13 is a schematic structural diagram of a server 1300 according to an embodiment of the present application, where the server 1300 may generate a relatively large difference due to a difference in configuration or performance, and may include one or more processors (CPUs) 1301 and one or more memories 1302, where at least one program code is stored in the one or more memories 1302, and the at least one program code is loaded and executed by the one or more processors 1301 to implement the video encoding method provided by each method embodiment. Certainly, the server 1300 may further include components such as a wired or wireless network interface, a keyboard, and an input/output interface, so as to perform input and output, and the server 1300 may further include other components for implementing the functions of the device, which is not described herein again.
In an exemplary embodiment, a computer readable storage medium, such as a memory including program code, executable by a processor, is also provided to perform the video encoding method in the above embodiments. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product or a computer program is also provided, which includes computer program code stored in a computer-readable storage medium, which is read by a processor of a computer device from the computer-readable storage medium, and which is executed by the processor to cause the computer device to execute the video encoding method provided in the above-mentioned various alternative implementations.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by hardware associated with program code, and the program may be stored in a computer readable storage medium, where the above mentioned storage medium may be a read-only memory, a magnetic or optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (13)

1. A method of video encoding, the method comprising:
in the video coding process, pre-dividing a unit to be coded according to a target division mode to obtain at least two sub-blocks;
acquiring texture characteristic parameters of the at least two sub-blocks, wherein the texture characteristic parameters are used for representing image textures of the corresponding sub-blocks;
and in response to that the sum of the texture characteristic parameters of the at least two sub-blocks is greater than the product of the texture characteristic parameter of the unit to be encoded and a first threshold value, when the video is encoded, the unit to be encoded is not divided according to the target division mode, and the first threshold value is used for representing the definition of the video encoding.
2. The method of claim 1, wherein the obtaining texture feature parameters of the at least two sub-blocks comprises:
acquiring pixel values of a plurality of reference pixel points in the at least two sub-blocks;
acquiring statistical information of the at least two sub-blocks based on the pixel values of the plurality of reference pixels, wherein the statistical information is at least one of an average pixel value, a pixel value variance, an image gradient parameter, a pixel value root mean square error and a pixel value standard deviation of the reference pixels in one sub-block;
and taking the statistical information of the at least two sub-blocks as texture characteristic parameters of the at least two sub-blocks.
3. The method of claim 1, wherein the pre-dividing the unit to be encoded according to the target partition mode to obtain at least two sub-blocks comprises:
in response to the target division mode being binary tree division, pre-dividing the unit to be encoded into two rectangular sub-blocks with the same area; or,
in response to the target division mode being ternary tree division, pre-dividing the unit to be encoded into three rectangular sub-blocks with the same area; or,
in response to the target division mode being quadtree division, pre-dividing the unit to be encoded into four square sub-blocks with the same area.
4. The method of claim 1, wherein before the pre-dividing the unit to be encoded according to the target division mode to obtain at least two sub-blocks, the method further comprises:
determining rate distortion parameters corresponding to a plurality of division modes for pre-dividing the unit to be encoded, wherein the rate distortion parameters are the sum of an encoding code rate and an image distortion rate, and the image distortion rate is the image distortion rate between a unit obtained by encoding divided sub-blocks and the unit to be encoded after the unit to be encoded is divided by adopting one division mode;
and determining the division mode corresponding to the minimum rate-distortion parameter as the target division mode from the rate-distortion parameters corresponding to the plurality of division modes.
5. The method according to claim 1, wherein after obtaining texture feature parameters of the at least two sub-blocks, the method further comprises:
and dividing the unit to be encoded into the at least two sub-blocks according to the target division mode in response to the sum of the texture feature parameters of the at least two sub-blocks being less than or equal to the product of the texture feature parameter of the unit to be encoded and a first threshold.
6. The method of claim 1, further comprising:
in the video coding process, determining the average pixel value and the pixel value variance of the pixel points in the unit to be coded according to the pixel values of the pixel points in the unit to be coded;
in response to the pixel value variance being larger than the product of the average pixel value and a first threshold value, the unit to be encoded is not divided during video encoding, and the first threshold value is used for representing the definition of video encoding.
7. The method according to claim 1, wherein after obtaining texture feature parameters of the at least two sub-blocks, the method further comprises:
and in response to that the difference information between the texture feature parameters of the at least two sub-blocks meets a target difference condition, the unit to be coded is not divided according to the target division mode during video coding.
8. A video encoding apparatus, characterized in that the apparatus comprises:
the device comprises a pre-dividing module, a coding module and a decoding module, wherein the pre-dividing module is used for pre-dividing a unit to be coded according to a target division mode in the video coding process to obtain at least two sub-blocks;
an obtaining module, configured to obtain texture feature parameters of the at least two sub-blocks, where the texture feature parameters are used to represent image textures of corresponding sub-blocks;
and the dividing module is used for responding that the sum of the texture characteristic parameters of the at least two sub-blocks is greater than the product of the texture characteristic parameter of the unit to be coded and a first threshold value, when the video is coded, the unit to be coded is not divided according to the target dividing mode, and the first threshold value is used for expressing the definition of the video coding.
9. The apparatus according to claim 8, wherein the obtaining module is configured to obtain pixel values of a plurality of reference pixels in the at least two sub-blocks; acquiring statistical information of the at least two sub-blocks based on the pixel values of the plurality of reference pixels, wherein the statistical information is at least one of an average pixel value, a pixel value variance, an image gradient parameter, a pixel value root mean square error and a pixel value standard deviation of the reference pixels in one sub-block; and taking the statistical information of the at least two sub-blocks as texture characteristic parameters of the at least two sub-blocks.
10. The apparatus according to claim 8, wherein the pre-dividing module is configured to pre-divide the unit to be encoded into two rectangular sub-blocks with the same area in response to the target division mode being binary tree division; or, in response to the target division mode being the ternary tree division, pre-dividing the unit to be encoded into three rectangular sub-blocks with the same area; or, in response to the target division mode being the quadtree division, pre-dividing the unit to be encoded into four sub-blocks of a square with the same area.
11. The apparatus of claim 8, further comprising:
a rate distortion determining module, configured to determine rate distortion parameters corresponding to multiple partition modes for pre-partitioning the unit to be encoded, where the rate distortion parameters are a sum of an encoding rate and an image distortion rate, and the image distortion rate is an image distortion rate between a unit obtained by encoding the partitioned sub-blocks and the unit to be encoded after the unit to be encoded is partitioned by using one partition mode;
and the division mode determining module is used for determining the division mode corresponding to the minimum rate distortion parameter from the rate distortion parameters corresponding to the multiple division modes as the target division mode.
12. A computer device, characterized in that the computer device comprises one or more processors and one or more memories having at least one program code stored therein, which is loaded and executed by the one or more processors to implement the video encoding method according to any one of claims 1 to 7.
13. A computer-readable storage medium, having stored therein at least one program code, which is loaded and executed by a processor, to implement the video encoding method according to any one of claims 1 to 7.
CN202010717847.7A 2020-07-23 2020-07-23 Video encoding method, device, equipment and storage medium Active CN111770340B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010717847.7A CN111770340B (en) 2020-07-23 2020-07-23 Video encoding method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111770340A CN111770340A (en) 2020-10-13
CN111770340B (en) 2022-03-15

Family

ID=72727072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010717847.7A Active CN111770340B (en) 2020-07-23 2020-07-23 Video encoding method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111770340B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112437307B (en) * 2020-11-10 2022-02-11 腾讯科技(深圳)有限公司 Video coding method, video coding device, electronic equipment and video coding medium
CN113079374B (en) * 2021-03-03 2023-01-10 北京银河信通科技有限公司 Image encoding method and apparatus
CN113660490A (en) * 2021-06-24 2021-11-16 浙江大华技术股份有限公司 Method for dividing coding unit, coding method, electronic device and storage medium
CN113542753B (en) * 2021-07-27 2022-07-08 杭州当虹科技股份有限公司 AVS3 video coding method and encoder
CN113709482B (en) * 2021-07-30 2024-04-16 北京大学深圳研究生院 Method for determining coding unit division mode in hardware-oriented intra-frame coding mode
CN113992915B (en) * 2021-12-28 2022-05-17 康达洲际医疗器械有限公司 Coding unit dividing method and system applicable to VVC intra-frame prediction
CN115802044B (en) * 2023-02-06 2023-08-18 深流微智能科技(深圳)有限公司 Determination device, apparatus and storage medium for coding block division mode

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106961606A (en) * 2017-01-26 2017-07-18 浙江工业大学 The HEVC intra-frame encoding mode systems of selection of feature are divided based on texture
CN108184115A (en) * 2017-12-29 2018-06-19 华南理工大学 CU divisions and PU predicting mode selecting methods and system in HEVC frames
CN110691254A (en) * 2019-09-20 2020-01-14 中山大学 Quick judgment method, system and storage medium for multifunctional video coding
CN110730343A (en) * 2019-09-20 2020-01-24 中山大学 Method, system and storage medium for dividing multifunctional video coding frames

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107071418B (en) * 2017-05-05 2020-03-17 上海应用技术大学 HEVC intra-frame coding unit rapid partitioning method based on decision tree
JP6825506B2 (en) * 2017-07-19 2021-02-03 富士通株式会社 Moving image coding device, moving image coding method, moving image coding computer program, moving image decoding device and moving image decoding method, and moving image decoding computer program
CN110351556B (en) * 2018-04-02 2021-03-02 腾讯科技(北京)有限公司 Method for determining coding cost of coding unit and related device
CN108777794B (en) * 2018-06-25 2022-02-08 腾讯科技(深圳)有限公司 Image encoding method and apparatus, storage medium, and electronic apparatus

Similar Documents

Publication Publication Date Title
CN111770340B (en) Video encoding method, device, equipment and storage medium
CN110234008B (en) Encoding method, decoding method and device
JP7085014B2 (en) Video coding methods and their devices, storage media, equipment, and computer programs
CN112449192B (en) Decoding method, encoding method and device
AU2018208733A1 (en) Adaptive transfer function for video encoding and decoding
CN112532975B (en) Video encoding method, video encoding device, computer equipment and storage medium
US11985358B2 (en) Artifact removal method and apparatus based on machine learning, and method and apparatus for training artifact removal model based on machine learning
CN111603772B (en) Area detection method, device, equipment and storage medium
CN109151477B (en) Image data encoding and decoding methods and devices
CN109168032B (en) Video data processing method, terminal, server and storage medium
CN113099233A (en) Video encoding method, video encoding device, video encoding apparatus, and storage medium
CN116074512A (en) Video encoding method, video encoding device, electronic equipment and storage medium
CN111770339B (en) Video encoding method, device, equipment and storage medium
WO2023016191A1 (en) Image display method and apparatus, computer device, and storage medium
CN116563771A (en) Image recognition method, device, electronic equipment and readable storage medium
CN111698512B (en) Video processing method, device, equipment and storage medium
CN109040753B (en) Prediction mode selection method, device and storage medium
CN114079787B (en) Video decoding method, video encoding method, apparatus, device, and storage medium
CN114422782B (en) Video encoding method, video encoding device, storage medium and electronic equipment
JP2021013145A (en) Video transmission device and video transmission method
CN115474037B (en) Video quality detection method and device, electronic equipment and readable storage medium
CN112437304B (en) Video decoding method, encoding method, device, equipment and readable storage medium
CN116980627A (en) Video filtering method and device for decoding, electronic equipment and storage medium
CN117676170A (en) Method, apparatus, device and storage medium for detecting blocking effect
CN115118979A (en) Image encoding method, image decoding method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40031409

Country of ref document: HK

GR01 Patent grant