CN117119182A - Video data processing method, device, equipment and medium - Google Patents

Video data processing method, device, equipment and medium

Info

Publication number
CN117119182A
Authority
CN
China
Prior art keywords
data block
video frame
data
video
encoded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311184821.0A
Other languages
Chinese (zh)
Inventor
张洪彬
唐敏豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202311184821.0A priority Critical patent/CN117119182A/en
Publication of CN117119182A publication Critical patent/CN117119182A/en
Pending legal-status Critical Current


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103 Selection of coding mode or of prediction mode
    • H04N 19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N 19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N 19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application provides a video data processing method, device, equipment and medium, which can be applied to the technical field of intelligent traffic. The method comprises the following steps: obtaining a video frame to be processed in a target video; performing data analysis processing on a data block j to obtain the propagation cost of the data block j, so as to determine the reference data block corresponding to the data block j; encoding the reference data block to obtain an encoded reference data block and encoding strategy data; pre-decoding the encoded reference data block using the encoding strategy data to obtain the pre-decoded data block corresponding to the data block j; and, when encoding filter data are determined from the encoded reference data block and the pre-decoded data block, taking the encoding strategy data and the encoding filter data as the data encoding parameters associated with the data block j, so as to encode a video encoded data stream associated with the video frame to be processed. The application can improve the encoding quality of video frames.

Description

Video data processing method, device, equipment and medium
Technical Field
The application relates to the technical field of computers, in particular to the technical field of intelligent traffic, and specifically provides a video data processing method, device, equipment and medium.
Background
In existing video coding technology, an original video frame (e.g., video frame A) can be divided into a series of coding blocks, and these coding blocks are encoded by combining video coding methods such as prediction, transformation and entropy coding, thereby realizing the encoding of the video frame. For example, in the intelligent traffic field, encoding of road video data is involved when uploading collected road video data to a designated device. The encoder can also pre-decode the encoded data to obtain a pre-decoded block (e.g., data block B') corresponding to a data block B. Compared with the original data block B, the data block B' is distorted by quantization and other effects, so a filtering operation can further be applied to the data block B' to obtain a filtered data block B'', from which a pre-reconstructed video frame of video frame A is obtained; the pre-reconstructed video frame can then participate in the encoding process of subsequent video frames.
However, in practice it has been found that, although the filtering operation reduces the error between data block B'' and data block B, so that the pre-reconstructed video frame (e.g., video frame A') is closer to the original video frame A, the noise in video frame A is recovered at the same time. If a subsequent video frame (e.g., video frame C) needs to reference the pre-reconstructed video frame A' during encoding, the noise recovered in video frame A' affects the encoding quality of video frame C, thereby reducing the decoding and reconstruction effect of video frame C at the decoder side.
Disclosure of Invention
The embodiment of the application provides a video data processing method, a device, equipment and a medium, which can improve the coding quality of video frames.
In one aspect, an embodiment of the present application provides a video data processing method, including:
acquiring a video frame to be processed in a target video; the data block to be coded included in the video frame to be processed is a data block j; the data block j is any one data block in the video frame to be processed;
carrying out data analysis processing on the data block j to obtain the propagation cost of the data block j, and determining a reference data block corresponding to the data block j based on the propagation cost of the data block j;
encoding the reference data block to obtain an encoded reference data block and encoding strategy data associated with the encoded reference data block;
pre-decoding the coded reference data block through coding strategy data to obtain a pre-decoded data block corresponding to the data block j;
when determining the coding filtering data associated with the coded reference data block through the coded reference data block and the pre-decoding data block, taking the coding strategy data associated with the coded reference data block and the coding filtering data associated with the coded reference data block as the data coding parameters associated with the data block j;
based on the data encoding parameters associated with the data block j, encoding to obtain a video encoded data stream associated with the video frame to be processed.
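To make the claimed flow concrete, the following is a minimal Python sketch of the per-block steps above. It is an illustrative assumption throughout: the patent does not publish a propagation-cost formula or concrete data structures here, so the heuristic (inspired by lookahead schemes such as x264's MB-tree), the selection criterion and all field names are hypothetical placeholders.

```python
import numpy as np

def propagation_cost(intra_cost: float, inter_cost: float) -> float:
    # Hypothetical heuristic: how strongly later frames depend on this block.
    # Lookahead schemes often use how much cheaper inter prediction is than
    # intra prediction as a dependency measure; the patent's formula may differ.
    return max(0.0, intra_cost - inter_cost) / max(intra_cost, 1e-6)

def process_block_j(block_j, candidate_blocks, intra_cost, inter_cost):
    """Sketch of the claimed per-block flow; all helpers are stand-ins."""
    cost = propagation_cost(intra_cost, inter_cost)
    # 1) adaptively select the reference data block via the propagation cost
    #    (assumed criterion: SAD to block j, biased by the cost)
    ref = min(candidate_blocks,
              key=lambda b: float(np.abs(block_j - b).sum()) * (1.0 + cost))
    # 2) encode the reference data block -> encoded block + strategy data
    strategy = {"qp": 32, "mode": "inter"}             # assumed contents
    encoded_ref = ref                                  # stand-in for real coding
    # 3) pre-decode the encoded reference block using the strategy data
    pre_decoded = encoded_ref.astype(np.float32)       # stand-in inverse ops
    # 4) derive encoding filter data from (encoded_ref, pre_decoded) and
    #    bundle it with the strategy data as block j's encoding parameters
    filter_data = {"taps": np.full((3, 3), 1.0 / 9.0)} # assumed 3x3 smoother
    return {"strategy": strategy, "filter": filter_data}
```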
In one aspect, an embodiment of the present application provides a video data processing apparatus, including:
the video acquisition module is used for acquiring a video frame to be processed in the target video; the data block to be coded included in the video frame to be processed is a data block j; the data block j is any one data block in the video frame to be processed;
the reference data determining module is used for carrying out data analysis processing on the data block j to obtain the propagation cost of the data block j, and determining a reference data block corresponding to the data block j based on the propagation cost of the data block j;
the data coding module is used for coding the reference data block to obtain a coded reference data block and coding strategy data associated with the coded reference data block;
the data coding module is also used for carrying out pre-decoding processing on the coded reference data block through coding strategy data to obtain a pre-decoded data block corresponding to the data block j;
the data coding module is further used for taking the coding strategy data associated with the coded reference data block and the coding filtering data associated with the coded reference data block as data coding parameters associated with the data block j when the coding filtering data associated with the coded reference data block is determined through the coded reference data block and the pre-decoding data block;
the data stream encoding module is used for encoding, based on the data encoding parameters associated with the data block j, a video encoded data stream associated with the video frame to be processed.
An aspect of an embodiment of the present application provides a computer device, including a memory and a processor, where the memory is connected to the processor, and the memory is used to store a computer program, and the processor is used to call the computer program, so that the computer device performs the method provided in the foregoing aspect of the embodiment of the present application.
An aspect of an embodiment of the present application provides a computer readable storage medium, in which a computer program is stored, the computer program being adapted to be loaded and executed by a processor, to cause a computer device having a processor to perform the method provided in the above aspect of an embodiment of the present application.
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the method provided in the above aspect.
The embodiment of the application provides a method for adaptively selecting a reference data block for video encoding according to the propagation cost of the current data block. The method includes: obtaining a data block j included in a video frame to be processed in a target video, where the data block j is any data block in the video frame to be processed; determining the reference data block corresponding to the data block j based on the propagation cost obtained by performing data analysis processing on the data block j; encoding the reference data block to obtain the encoding strategy data associated with the encoded reference data block; pre-decoding the encoded reference data block using the encoding strategy data to obtain the pre-decoded data block corresponding to the data block j; and, when the encoding filter data associated with the encoded reference data block are determined from the encoded reference data block and the pre-decoded data block, taking the encoding strategy data and the encoding filter data as the data encoding parameters associated with the data block j, and encoding to obtain the video encoded data stream associated with the video frame to be processed. That is, through the propagation cost of the data block j, a suitable data block can be adaptively selected as the reference data block corresponding to the data block j, and the encoding, pre-decoding and encoding filtering of the data blocks in the video frame to be processed (i.e., the original video frame) are improved accordingly, so that the reconstruction effect of the pre-reconstructed data block strikes a balance between staying close to the original data block j and reducing noise. The noise in the pre-reconstructed video frame obtained by pre-reconstruction is thus reduced, which helps improve the encoding efficiency and encoding quality of subsequent video frames that need to reference the video frame to be processed (i.e., its pre-reconstructed video frame) during encoding; this in turn ensures the decoding prediction accuracy of the subsequent video frames at the decoding-terminal side. That is, improving the encoding quality of video frames also improves their decoding and reconstruction effect.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a network architecture according to an embodiment of the present application;
Fig. 2 is a schematic diagram of an encoding process according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a decoding process according to an embodiment of the present application;
Fig. 4 is a workflow diagram of video encoding and decoding according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a residual quadtree partitioning scheme provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of a position-based sub-block transform provided by an embodiment of the present application;
Fig. 7 is a first flowchart of a video data processing method according to an embodiment of the present application;
Fig. 8 is a schematic diagram of the processing procedure of an encoding terminal according to an embodiment of the present application;
Fig. 9 is a schematic diagram of the determination process of encoding filter data according to an embodiment of the present application;
Fig. 10 is a second flowchart of a video data processing method according to an embodiment of the present application;
Fig. 11 is a first schematic diagram of a scenario for determining propagation cost according to an embodiment of the present application;
Fig. 12 is a second schematic diagram of a scenario for determining propagation cost according to an embodiment of the present application;
Fig. 13 is a schematic diagram of a scenario for determining a reference data block according to an embodiment of the present application;
Fig. 14 is a third flowchart of a video data processing method according to an embodiment of the present application;
Fig. 15 is a schematic diagram of a video frame reconstruction scenario according to an embodiment of the present application;
Fig. 16 is a schematic structural diagram of a video data processing apparatus according to an embodiment of the present application;
Fig. 17 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Referring to fig. 1, fig. 1 is a schematic diagram of a network architecture according to an embodiment of the present application. As shown in fig. 1, the system architecture may include a service server 100 and a service terminal cluster, where the service terminal cluster may include one or more service terminals (e.g., user terminals); the number of service terminals in the cluster is not limited here. As shown in fig. 1, the service terminals in the cluster may specifically include service terminals 200a, 200b, …, 200n. Communication connections may exist between the service terminals in the cluster; for example, a communication connection exists between service terminal 200a and service terminal 200b, and between service terminal 200a and service terminal 200n. Meanwhile, any service terminal in the cluster may have a communication connection with the service server 100, so that each service terminal in the cluster can perform data interaction with the service server 100 through that connection; for example, a communication connection exists between service terminal 200a and the service server 100. The manner of connection is not limited: it may be a direct or indirect connection through wired communication, a direct or indirect connection through wireless communication, or another manner, which the present application does not restrict.
It should be appreciated that each service terminal in the service terminal cluster shown in fig. 1 may be installed with an application client for video encoding and decoding. When the application client runs on a service terminal, it can perform data interaction with the service server 100 shown in fig. 1. The application client may be any type of client with a function of displaying data information such as text, image, audio and video, for example a social client, an image processing client, an instant messaging client (e.g., a conference client), an entertainment client (e.g., a game client or a live broadcast client), a multimedia client (e.g., a video client), an information client (e.g., a news client), a shopping client, a vehicle client, and the like.
For example, the data interaction process between the service terminal 200a and the service server 100 is described here by taking a multimedia client as the application client. A multimedia client is a client capable of sending and receiving internet messages in real time and providing functions such as information search; for example, it may be a client provided by a multimedia platform, and the multimedia platform can implement the transmission of multimedia data, such as video data, which involves the encoding and decoding of video. The multimedia client on the service terminal 200a can receive and display the video data transmitted by the service server. For example, the service terminal 200a responds to a video playing operation of the user on the multimedia client and sends video playing information to the service server; when the service server obtains the target video indicated by the video playing information, it encodes the video frames in the target video and sends the encoded video data stream to the service terminal 200a, which decodes the video encoded data stream to obtain the target video. In this case the service server serves as the encoding terminal and the service terminal 200a serves as the decoding terminal.
It can be understood that video data transmission can be implemented between any one service terminal and any one service server, and video data transmission can be implemented between any two service terminals, or video data transmission can be implemented between any two service servers. The initiator of the video data transmission is an encoding terminal for encoding the transmitted target video. The receiver of the video data transmission is a decoding terminal for decoding the transmitted video encoded data stream.
Optionally, the terminal where the encoder is disposed is an encoding terminal, an encoding process in the encoding terminal is implemented by the encoder, the terminal where the decoder is disposed is a decoding terminal, and a decoding process in the decoding terminal is implemented by the decoder. The encoding terminal and the decoding terminal may be the same terminal or different terminals. I.e. the encoder and decoder may be deployed at the same terminal or at different terminals.
The encoding terminal may acquire a target video to be encoded, where the target video may be captured by an image capturing device or acquired by a computer device, or acquired from a database, or uploaded by a user, etc. The image pickup apparatus may be a hardware component provided in the encoding terminal, for example, the image pickup apparatus may be a general camera, a stereo camera, a light field camera, and the like provided in the encoding terminal. The image pickup apparatus may also refer to a hardware device connected to the encoding terminal, such as a camera connected to a server. The decoding terminal can analyze the video coding data stream to obtain the data coding parameters of the coding block after receiving the video coding data stream sent by the coding terminal, and perform decoding processing based on the data coding parameters.
It may be understood that the computer device according to the embodiment of the present application may be a server (for example, the service server 100 shown in fig. 1) or a terminal (for example, any service terminal in the service terminal cluster shown in fig. 1). The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), big data and artificial intelligence platforms, and the like. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted terminal, an aircraft, etc. The embodiment of the application can be applied to various scenarios, including but not limited to cloud technology, artificial intelligence, intelligent transportation, assisted driving and the like.
It should be understood that fig. 1 is merely an exemplary representation of a possible network architecture according to the present application, and is not limited to a specific architecture according to the present application, i.e. the present application may also provide other network architectures.
Further, referring to fig. 2, fig. 2 is a schematic diagram illustrating an encoding process according to an embodiment of the application. The encoding terminal 20a shown in fig. 2 is a computer device, which may be the service server 100 or any service terminal in the service terminal cluster (for example, the service terminal 200a) in the embodiment corresponding to fig. 1; this is not limited here, and the encoding terminal 20a is taken to be the service server as an example. The encoding process of the video may be: the encoding terminal 20a may obtain a data block j (23) from a video frame to be processed (such as video frame 22n) in the target video 21 (e.g., comprising video frames 22a, 22b, …, 22n), where the data block j is any data block to be encoded in the video frame to be processed; perform data analysis processing on the data block j to obtain the propagation cost 24 of the data block j; determine, based on the propagation cost of the data block j, the reference data block 25 corresponding to the data block j; encode the reference data block to obtain an encoded reference data block (after encoding processing, the reference data block serves as the encoded reference data block) and the encoding strategy data 26 associated with the encoded reference data block; and pre-decode the encoded reference data block using the encoding strategy data to obtain the pre-decoded data block 27 corresponding to the data block j. The encoding filter data 28 associated with the encoded reference data block are determined from the encoded reference data block and the pre-decoded data block, and the encoding strategy data 26 and the encoding filter data 28 are taken as the data encoding parameters 29 associated with the data block j; the encoding filter data 28 can be used by the encoding terminal to filter the pre-decoded data block to obtain the corresponding pre-reconstructed data block, and further to pre-reconstruct the video frame to be processed. Based on the data encoding parameters associated with the data block j, the video encoded data stream 210 associated with the video frame to be processed is obtained by encoding; for example, the data encoding parameters associated with the data block j may be encoded (e.g., entropy encoded) to obtain the video encoded data stream of the target video, which is transmitted from the encoding terminal to the decoding terminal 20b.
Further, referring to fig. 3, fig. 3 is a schematic diagram illustrating a decoding process according to an embodiment of the application. The decoding terminal 20b shown in fig. 3 is a computer device, which may be the service server 100 or any service terminal in the service terminal cluster in the embodiment corresponding to fig. 1; this is not limited here, and the decoding terminal 20b is taken to be the service terminal 200a as an example. The decoding process of the video may be: the decoding terminal 20b receives, from the encoding terminal 20a, the video encoded data stream 31 associated with the to-be-processed video frame 22n in the target video 21, and decodes the video encoded data stream 31 to obtain the data decoding parameters 32 associated with the data block j (23) in the to-be-processed video frame shown in the example of fig. 2, where the data block j is any data block to be decoded (corresponding to a data block to be encoded) in the to-be-processed video frame. The data decoding parameters include the encoding strategy data 33 and the encoding filter data 34 determined when the reference data block was encoded. The encoded reference data block is decoded based on the encoding strategy data to obtain the decoded data block 35 corresponding to the data block j, and the decoded data block is filtered according to the encoding filter data to obtain the corresponding reconstructed data block 36; from the reconstructed data blocks, the reconstructed video frame 37 of the to-be-processed video frame is finally obtained after decoding and filtering.
Technical terms involved in the present application are described below:
1. video coding:
A video may be made up of one or more video frames, each containing a portion of the video signal of that video. A video signal can be acquired in two ways: captured by a camera or generated by a computer. Because the statistical properties corresponding to the two acquisition modes differ, the compression coding modes applied to the video may also differ.
In mainstream video coding technology, taking HEVC (High Efficiency Video Coding, the international video coding standard HEVC/H.265), VVC (Versatile Video Coding, the international video coding standard VVC/H.266) and AVS (Audio Video Coding Standard, the Chinese national video coding standard) as examples, a hybrid coding framework is adopted, under which the following series of operations and processing may be performed on the video:
1) Block partition structure: depending on its size, a video frame is divided into several non-overlapping processing units, each of which undergoes a similar compression operation. Such a processing unit is called a CTU (Coding Tree Unit) or LCU (Largest Coding Unit). A CTU may be further divided into finer units, obtaining one or more basic coding units, called CUs (Coding Units, i.e., coding blocks). A coding block (i.e., a data block) is the most basic element in the coding pipeline; the various codec processes that may be applied to each coding block are described in the following embodiments of the present application.
2) Predictive coding: this includes prediction modes such as intra-frame coding (also called intra-frame prediction) and inter-frame coding (also called inter-frame prediction). The original video signal contained in a coding block of a video frame is predicted from an already reconstructed video signal (namely a referenced coding block; for inter-frame coding, the referenced coding block is determined from the pre-reconstructed video frame corresponding to an already encoded video frame, whereas for intra-frame coding the referenced coding block is determined from the video frame to be encoded itself), and a residual is then obtained. The encoding end needs to select a suitable prediction mode from a plurality of possible prediction modes for each coding block in the video frame, and inform the decoding end of the selected prediction mode. The prediction modes may include:
a. Intra coding (Intra (picture) Prediction): the reconstructed video signal used for encoding is derived from the already encoded and reconstructed region within the same video frame, i.e., the current coding block and the coding block it references are located in the same video frame. The basic idea of intra prediction is to use the correlation between adjacent pixels in the same video frame to remove spatial redundancy. In video coding, adjacent pixels refer to the reconstructed pixels of already encoded coding blocks surrounding the current coding block (the coding block currently being encoded) within the same video frame.
b. Inter-frame coding (Inter (picture) Prediction): the reconstructed video signal used for encoding comes from already encoded video frames other than the current frame; the reference video frame is a video frame that has been encoded, decoded and reconstructed. That is, the current coding block and the coding block it references are located in different video frames, where the current frame is the video frame in which the current coding block is located.
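As an illustration of block-based inter prediction, the sketch below performs a brute-force motion search that minimizes the SAD (sum of absolute differences) within a window of an already reconstructed reference frame. This is a generic textbook procedure under assumed parameters, not the patent's specific matching rule.

```python
import numpy as np

def sad(a: np.ndarray, b: np.ndarray) -> int:
    # Sum of absolute differences between two equally sized blocks.
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def full_search(cur_block, ref_frame, top, left, radius=8):
    """Find the motion vector (dy, dx) minimizing SAD in a search window.
    Assumes the co-located block at (top, left) lies inside ref_frame."""
    h, w = cur_block.shape
    best_mv = (0, 0)
    best_cost = sad(cur_block, ref_frame[top:top + h, left:left + w])
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y and y + h <= ref_frame.shape[0] \
                    and 0 <= x and x + w <= ref_frame.shape[1]:
                cost = sad(cur_block, ref_frame[y:y + h, x:x + w])
                if cost < best_cost:
                    best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost
```

The winning (dy, dx) is the motion vector carried in the prediction-mode association information, and the block it points to serves as the prediction signal.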
3) Transform & Quantization: the residual may be converted into the transform domain by transform operations such as DFT (Discrete Fourier Transform), DCT (Discrete Cosine Transform) or DST (Discrete Sine Transform), and the residual (residual data information) in the transform domain is referred to as the transform coefficients. The residual in the transform domain then undergoes a lossy quantization operation, which discards some information, so that the quantized signal is easier to compress.
In some video coding standards, more than one transform method may be available, so the encoding end also needs to choose one of the transform methods for the current coding block and inform the decoding end. The fineness of quantization is usually determined by the QP (quantization parameter): when the QP is larger, transform coefficients covering a larger range of values will be quantized to the same output, which usually brings more distortion and a lower code rate; conversely, when the QP is smaller, transform coefficients covering a smaller range of values will be quantized to the same output, which usually brings less distortion and corresponds to a higher code rate.
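The QP-to-step-size relationship can be illustrated with the HEVC-style rule that the quantization step roughly doubles every 6 QP values (Qstep ≈ 2^((QP-4)/6)); the demo values below are arbitrary.

```python
import numpy as np

def quantize(coeffs: np.ndarray, qp: int) -> np.ndarray:
    step = 2.0 ** ((qp - 4) / 6.0)   # HEVC-style: step doubles every 6 QP
    return np.round(coeffs / step).astype(np.int32)

def dequantize(levels: np.ndarray, qp: int) -> np.ndarray:
    step = 2.0 ** ((qp - 4) / 6.0)
    return levels * step

coeffs = np.array([100.0, 7.0, -3.0, 0.5])
print(quantize(coeffs, qp=22))  # finer: more nonzero levels, higher rate
print(quantize(coeffs, qp=40))  # coarser: small coefficients collapse to 0
```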
That is, when encoding a coding block, it is necessary to determine the prediction mode (such as inter prediction or intra prediction) of the video frame (video frame A) in which the coding block is located; determine, from the target video and based on the prediction mode, the video frame to be referenced by video frame A (which may be the current frame itself, a previous video frame of video frame A, etc.); determine, from the referenced video frame, the coding block matching the current coding block (i.e., the coding block associated with the current coding block); and use the matching coding block as the prediction of the current coding block, that is, perform data encoding on the current coding block (performing prediction according to the prediction mode, e.g., block-based motion compensation) to obtain the residual data information between the matching coding block and the current coding block. Information such as the motion vector (for motion compensation) and the prediction mode is used as prediction-mode association information, which indicates the matching coding block that was referenced when encoding the current coding block and is used to reconstruct the data of the matching block when decoding. The residual data information is transformed and quantized to obtain quantized transform coefficients, and the quantized transform coefficients together with the prediction-mode association information can be used as the data encoding parameters of the coding block and encoded into the video encoded data stream.
4) Entropy Coding or statistical coding: the quantized transform-domain signal is statistically encoded according to the frequency of occurrence of each value, and finally a binarized (0 or 1) video encoded data stream is output. Meanwhile, encoding produces other information, such as the selected prediction mode and motion vectors, which also requires entropy encoding to reduce the code rate. Statistical coding is a lossless coding mode that can effectively reduce the code rate required to express the same signal. Common statistical coding schemes are Variable Length Coding (VLC) or Context-based Adaptive Binary Arithmetic Coding (CABAC).
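As a small concrete instance of a VLC (not the patent's entropy coder), the order-0 Exp-Golomb code used for many syntax elements in H.264/HEVC maps small, frequent values to short codewords:

```python
def exp_golomb(v: int) -> str:
    """Order-0 Exp-Golomb code for a non-negative integer v."""
    b = bin(v + 1)[2:]               # binary representation of v + 1
    return "0" * (len(b) - 1) + b    # leading zeros encode the length

for v in range(5):
    print(v, exp_golomb(v))          # 0->'1', 1->'010', 2->'011', 3->'00100', ...
```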
5) Loop Filtering: after a coded block undergoes inverse quantization, inverse transformation and prediction compensation (the inverse operations of 2) to 4) above), the decoded image corresponding to the coded block can be reconstructed. Compared with the original image, the reconstructed decoded image suffers quantization effects: part of its information differs from the original image, producing distortion. Filtering the reconstructed decoded image can therefore effectively reduce the degree of distortion produced by quantization. The filter may be, for example, deblocking filtering, SAO (Sample Adaptive Offset) or ALF (Adaptive Loop Filter, i.e., an adaptive filter). Since these filtered reconstructed video frames will serve as reference video frames in the prediction of coding blocks to be encoded later, the filtering operation above is also called loop filtering, i.e., a filtering operation within the coding loop. A feature of the present application is that the best encoding filter data corresponding to a data block, namely the filter parameters of the adaptive filter, can be determined adaptively from the reference data block and the pre-decoded data block of the data block.
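The adaptive-filter parameters mentioned above are commonly obtained by least-squares (Wiener) estimation, the principle behind ALF-style filters: choose the taps so that filtering the degraded block best matches the target block. The sketch below estimates such taps between a pre-decoded block and its reference data block; it is a simplified unconstrained version, not the patent's exact derivation (real ALF uses classified, symmetric, fixed-precision filters).

```python
import numpy as np

def estimate_wiener_taps(pre_decoded: np.ndarray, target: np.ndarray, k: int = 1):
    """Least-squares taps of a (2k+1)x(2k+1) filter minimizing
    || target - taps * neighborhood(pre_decoded) ||^2 over interior pixels."""
    h, w = pre_decoded.shape
    rows, rhs = [], []
    for y in range(k, h - k):
        for x in range(k, w - k):
            rows.append(pre_decoded[y - k:y + k + 1, x - k:x + k + 1].ravel())
            rhs.append(target[y, x])
    A = np.asarray(rows, dtype=np.float64)
    b = np.asarray(rhs, dtype=np.float64)
    taps, *_ = np.linalg.lstsq(A, b, rcond=None)
    return taps.reshape(2 * k + 1, 2 * k + 1)  # signalled as encoding filter data
```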
Based on the above description of steps 1)-5), please refer to fig. 4; fig. 4 is a workflow diagram of video encoding and decoding according to an embodiment of the present application. The video frames in the target video are sequentially input into the encoding terminal, which divides each video frame into coding blocks. Fig. 4 takes the current coding block to be encoded (the data block to be encoded, which may be a data block in the current frame, or the reference coding block determined from the propagation cost of a data block in the current frame; the reference coding block is used as the example here) as the g-th coding block $S_g[x,y]$ of the current frame (the current video frame to be encoded), where g is a positive integer less than or equal to the total number of coding blocks contained in the current frame. $S_g[x,y]$ indicates that the coordinates of the g-th coding block are $[x,y]$, with x the pixel abscissa and y the pixel ordinate. The reference data block $S'_g[x,y]$ corresponding to the data block to be encoded $S_g[x,y]$ is processed by motion compensation or intra-frame prediction to obtain the prediction signal $\hat{S}_g[x,y]$ (i.e., the coding block referenced during encoding); subtracting the prediction signal $\hat{S}_g[x,y]$ from the original signal $S'_g[x,y]$ yields the residual $u_g[x,y]$, which is then transformed and quantized. The data output by the quantization processing takes two different paths, A and B:

A: the data output by the quantization processing can be sent to an entropy coder for entropy coding, obtaining the encoded video data stream, which is output to a buffer for storage while waiting to be transmitted.

B: the data output by the quantization processing also undergoes a pre-decoding process, i.e., inverse quantization and inverse transformation, resulting in the inverse-transformed residual $u'_{g1}[x,y]$. Then, according to the prediction mode (such as intra-frame prediction or inter-frame prediction, the latter also called motion compensation), a new prediction signal $\hat{S}_g[x,y]$ is obtained through data prediction, or the previously determined prediction signal $\hat{S}_g[x,y]$ is reused (the latter is taken as the example here). Adding the inverse-transformed residual $u'_{g1}[x,y]$ to the prediction signal $\hat{S}_g[x,y]$ gives the signal $S'_{g1}[x,y]$, i.e., the pre-decoded data block obtained after pre-decoding processing. The pre-decoded data block $S'_{g1}[x,y]$ is loop filtered (e.g., by an adaptive filter, where the filter parameters of the adaptive filter are determined from the pre-decoded data block and the reference data block to be encoded, and the determined filter parameters are also transmitted to the decoder) to obtain the reconstructed signal $S''_{g1}[x,y]$, i.e., the pre-reconstructed data block; the pre-reconstructed video frame is further obtained from these blocks and is added to a reconstructed-data buffer for saving, to participate in the encoding process of subsequent video frames.
It can be understood that by adaptively selecting the encoded reference data block, the technical scheme of the application can enable the data block to have good encoding quality, thereby improving the reconstruction effect of the corresponding pre-reconstructed data block, improving the reconstruction effect of the pre-reconstructed video frame, and improving the encoding efficiency (encoding effect, encoding quality) of the subsequent video frame for performing data encoding with reference to the pre-reconstructed video frame. Correspondingly, when the decoding terminal decodes the video coding data stream obtained by the coding in the technical scheme of the application, the obtained reconstructed data block also has better reconstruction effect, so that the reconstruction effect of the reconstructed video frame is improved, and the decoding efficiency (decoding effect, decoding quality) of the subsequent video frame for data decoding by referring to the reconstructed video frame is improved.
It will be appreciated that when all the encoded blocks in a video frame have not been encoded, the pre-reconstructed data blocks of the encoded blocks in the video frame are added to the reconstructed data buffer and stored, and are available for participation in the encoding process of subsequent encoded blocks (e.g., when intra-frame prediction is performed, the subsequent encoded blocks need to refer to the pre-reconstructed data blocks corresponding to the encoded blocks). When all the coding blocks in one video frame are coded and the corresponding pre-reconstruction data blocks are obtained, the pre-reconstruction data blocks corresponding to all the coding blocks of one video frame are taken out from the reconstruction data buffer, and the pre-reconstruction video frames corresponding to the video frame are obtained through pre-reconstruction so as to be added into the reconstruction data buffer.
It can be understood that in the encoding terminal, after encoding a video frame, the video frame is subjected to pre-decoding processing to obtain a pre-decoded video frame, and meanwhile, a pre-reconstructed video frame is obtained through an adaptive filter, where the pre-reconstructed video frame is used in the encoding process of a subsequent participating video frame. For example, the encoding process of frame 2 refers to frame 1, and thus encoding of frame 2 is performed based on the pre-reconstructed frame corresponding to frame 1.
Specifically, in the technical scheme of the present application, with the coding blocks in a video frame as granularity, before a coding block B1 is encoded, the propagation cost of the coding block B1 may be determined, and the reference coding block B2 (also referred to as a reference data block) is determined according to that propagation cost. The reference coding block B2 is encoded and then pre-decoded to obtain the pre-decoded block B3 (also referred to as a pre-decoded data block) corresponding to the reference coding block B2. The pre-decoded block B3 and the reference coding block B2 may be used to determine the filter parameters, and the pre-decoded block B3 may be filtered by a loop filter (specifically, an adaptive filter) according to the determined filter parameters to obtain the pre-reconstructed block B4 (also referred to as a pre-reconstructed data block), from which a pre-reconstructed video frame is obtained; the obtained pre-reconstructed blocks may participate in the encoding process of subsequent coding blocks.
Meanwhile, during decoding, entropy decoding is performed on the video encoded data stream, and the quantized transform parameters in the obtained encoding strategy data are inverse quantized and inverse transformed to obtain the inverse-transformed residual $u'_{g2}[x,y]$. Then, according to the prediction mode, the prediction signal $\hat{S}_g[x,y]$ (the predicted data block) is obtained through data prediction; adding the inverse-transformed residual $u'_{g2}[x,y]$ to the prediction signal $\hat{S}_g[x,y]$ gives the signal $S'_{g2}[x,y]$, i.e., the decoded data block obtained after decoding processing. The decoded data block $S'_{g2}[x,y]$ is loop filtered (e.g., by an adaptive filter whose filter parameters are determined by the encoder and transmitted to the decoder) to obtain the reconstructed signal $S''_{g2}[x,y]$, i.e., the reconstructed data block, from which a reconstructed video frame is further obtained; the reconstructed video frame can be output and displayed to the user, and the obtained reconstructed data blocks also participate in the decoding process of subsequent video frames.
It will be appreciated that the encoding and decoding methods of the encoded block may be as described above, or encoding and decoding methods capable of achieving the same purpose in the field of video compression may be used, where the encoding and decoding processes of the encoded block are not limited, and other steps may be included in the encoding process or the decoding process. Before coding the coding block, the application can select the reference data block with self-adaptive propagation cost to code, decode and filter, and can determine the final filter parameters by the reference data block to improve the coding quality, reconstruction quality and reconstruction effect of video frame.
Since the prediction methods used in predictive coding (such as intra prediction and inter prediction) leave errors, the residual needs to be transmitted to compensate the decoded video frame and thereby improve the quality of the reconstructed video frame; residual processing is therefore an important processing procedure in the hybrid coding framework.
In the hybrid coding framework, as shown in fig. 4, the residual is the difference between the original signal (i.e., the original video frame) and the predicted signal (i.e., the predicted video frame):
$u_g[x,y] = S'_g[x,y] - \hat{S}_g[x,y]$
in the HEVC, VVC and AVS3 video coding standards, the following two processing modes (1) and (2) are included for residual processing:
(1) Transformation and quantization:
by utilizing the correlation of the residuals, the residuals are subjected to energy concentration through transformation, so that the energy is concentrated on fewer low-frequency transformation coefficients, namely, after the residuals of a plurality of coding blocks are subjected to transformation processing, the transformation coefficients corresponding to the residuals of the plurality of coding blocks are smaller. By residual correlation is meant that the residuals between the coded blocks have a correlation, schematically, the residuals of one coded block will refer to the residuals of its neighboring coded block. Then, the smaller transformation coefficient becomes zero value through subsequent quantization processing, so that the cost of coding residual errors is greatly reduced. Taking the conventional DCT as an example, the transformation is as follows, a two-dimensional discrete transformation is achieved by two separate one-dimensional discrete transformations (horizontal transformation, vertical transformation).
$\mathrm{Co}_k = C\,U_k\,C^T$

where $\mathrm{Co}_k$ denotes the transform coefficients obtained after transforming the residual of the current coding block, $U_k$ denotes the residual, $C$ denotes the transform kernel of the vertical transform, and $C^T$ denotes the transform kernel of the horizontal transform.
Due to the diversity of the residual distribution, a single DCT cannot accommodate all residual characteristics. Therefore, the transformation kernels such as DST7 and DCT8 are introduced into the transformation process, so that transformation combination is introduced in the transformation process of residual errors, and the problem that a single DCT cannot adapt to all residual error characteristics is solved. Wherein, the transformation combination may refer to a combination of transformation kernels of a horizontal transformation and transformation kernels of a vertical transformation, and the horizontal transformation and the vertical transformation may employ the same or different transformation kernels. Transformation cores include, but are not limited to: DCT2, DCT8, DST7, and the like. Wherein DCT2 and DCT8 refer to different DCT transform modes, and DST7 refers to one transform mode of DST.
Taking AMT (adaptive multi-core transform) technology as an example, the possible transform combinations for one transform block (i.e., the residual block to be transformed) are as follows: (DCT2, DCT2), (DCT8, DCT8), (DCT8, DST7), (DST7, DCT8), (DST7, DST7). Taking (DCT2, DCT2) as an example, the first DCT2 represents the transform kernel of the horizontal transform and the second DCT2 represents the transform kernel of the vertical transform; the other combinations are understood in the same way.
It should be appreciated that, as to which transform combination is selected for a given transform block, a decision needs to be made at the encoding end using RDO (Rate-Distortion Optimization) rules. Although the adaptive multi-core transform improves the adaptability of the transform module to the residual, it brings with it the encoding cost of the transform kernel index (which indicates which transform kernel is used).
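A compact sketch of AMT plus an RDO decision is given below, with orthonormal DCT-II, DCT-VIII and DST-VII kernels built from their standard closed forms. The distortion is the reconstruction SSE and the rate is approximated by the count of nonzero levels; real encoders use entropy-coder-based rate estimates, so lambda and the rate proxy here are illustrative assumptions.

```python
import numpy as np

def dct2_mat(n):
    m = np.array([[np.cos(np.pi * (2 * j + 1) * i / (2 * n))
                   for j in range(n)] for i in range(n)])
    m[0] *= 1.0 / np.sqrt(2.0)
    return m * np.sqrt(2.0 / n)                     # orthonormal DCT-II

def dst7_mat(n):
    return np.array([[np.sin(np.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
                      for j in range(n)] for i in range(n)]) * np.sqrt(4.0 / (2 * n + 1))

def dct8_mat(n):
    return np.array([[np.cos(np.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2))
                      for j in range(n)] for i in range(n)]) * np.sqrt(4.0 / (2 * n + 1))

def rdo_select(residual: np.ndarray, qp: int = 32, lam: float = 10.0):
    """Pick the (horizontal, vertical) AMT combination minimizing J = D + lam*R."""
    n = residual.shape[0]
    kernels = {"DCT2": dct2_mat(n), "DCT8": dct8_mat(n), "DST7": dst7_mat(n)}
    combos = [("DCT2", "DCT2"), ("DCT8", "DCT8"), ("DCT8", "DST7"),
              ("DST7", "DCT8"), ("DST7", "DST7")]
    step = 2.0 ** ((qp - 4) / 6.0)
    best = None
    for hor, ver in combos:
        C, Ch = kernels[ver], kernels[hor]
        co = C @ residual @ Ch.T                    # Co = C U C_h^T
        levels = np.round(co / step)
        rec = C.T @ (levels * step) @ Ch            # inverse transform
        J = float(np.sum((residual - rec) ** 2)) + lam * np.count_nonzero(levels)
        if best is None or J < best[0]:
            best = (J, (hor, ver))
    return best[1]

print(rdo_select(np.random.randn(8, 8)))
```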
(2) Transform Skip (TS): in the video coding process, some residuals have weak correlation; for these residuals, the transform can be skipped on the basis of (1) to achieve higher coding efficiency, i.e., the transform process is omitted and the residuals are quantized directly.
2. Common transformation partitioning modes:
In the embodiment of the application, various common transform partitioning modes are involved in the transform processing of coding blocks. Illustratively, the transform partitioning modes may include, but are not limited to: RQT (Residual Quad-Tree mode), PBT (Position-Based Transform mode), SBT (Sub-Block Transform mode), and the like. To facilitate understanding, RQT and PBT are described below.
①RQT:
In the HEVC standard, the RQT divides coding blocks by means of a recursive quadtree and encodes the best division information for transmission in the video stream. Referring to fig. 5, fig. 5 is a schematic diagram of a residual quadtree partitioning scheme according to an embodiment of the present application. The left side shows the coding block being divided, and the right side shows the tree structure of the coding block after quadtree processing, where 1 denotes division and 0 denotes no division. In fig. 5, coding block 10 corresponds to 1, i.e., coding block 10 is quadtree-divided into 4 sub-blocks (sub-block 11, sub-block 12, sub-block 13 and sub-block 14 in fig. 5). The 1st sub-block (sub-block 11) corresponds to 1, i.e., sub-block 11 is further quadtree-divided into 4 sub-blocks (sub-block 111, sub-block 112, sub-block 113 and sub-block 114); the 2nd sub-block (sub-block 12) corresponds to 0 and the 3rd sub-block (sub-block 13) corresponds to 0, i.e., neither sub-block 12 nor sub-block 13 is subdivided; the 4th sub-block (sub-block 14) corresponds to 1, and sub-block 14 is further quadtree-divided into 4 sub-blocks (sub-block 141, sub-block 142, sub-block 143 and sub-block 144). Then, sub-block 111, sub-block 112 and sub-block 113 each correspond to 0, meaning they are no longer divided. Sub-block 114 corresponds to 1, meaning that sub-block 114 is again quadtree-divided into 4 sub-blocks (sub-block 1141, sub-block 1142, sub-block 1143 and sub-block 1144), each of which corresponds to 0 and is not subdivided. Sub-block 141 corresponds to 1, meaning that sub-block 141 is again quadtree-divided into 4 sub-blocks (sub-block 1411, sub-block 1412, sub-block 1413 and sub-block 1414), each of which corresponds to 0 and is not subdivided.
As can be seen from fig. 5, if the transform division process is performed on the encoded block using the RQT, the transform division manner of the encoded block requires more bits (i.e., longer code length) to represent.
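The split decision itself is a recursive cost comparison; the toy sketch below mirrors the 1/0 split flags of fig. 5 under an assumed leaf cost (residual energy not captured by the block mean) plus a unit flag-signalling cost. The thresholds and costs are illustrative only.

```python
import numpy as np

def rqt_partition(block: np.ndarray, cost_fn, min_size: int = 4):
    """Return (cost, tree): tree is 0 for a leaf, or [1, [four subtrees]]
    for a split, mirroring the split flags transmitted with RQT."""
    n = block.shape[0]
    leaf_cost = cost_fn(block)
    if n <= min_size:
        return leaf_cost, 0
    h = n // 2
    quads = [block[:h, :h], block[:h, h:], block[h:, :h], block[h:, h:]]
    results = [rqt_partition(q, cost_fn, min_size) for q in quads]
    split_cost = sum(c for c, _ in results) + 1.0   # +1.0: flag signalling cost
    if split_cost < leaf_cost:
        return split_cost, [1, [t for _, t in results]]
    return leaf_cost, 0

cost = lambda b: float(np.sum((b - b.mean()) ** 2))  # toy residual-energy cost
_, tree = rqt_partition(np.random.rand(16, 16), cost)
print(tree)
```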
②PBT:
Referring to fig. 6, fig. 6 is a schematic diagram of a position-based sub-block transform according to an embodiment of the present application. In the AVS3 standard, the position-based sub-block transform divides a coding block into 4 sub-blocks by quadtree (sub-block 31, sub-block 32, sub-block 33 and sub-block 34 in fig. 6), and presets the transform combination according to the position of each sub-block. A transform combination comprises a transform kernel for the horizontal transform and a transform kernel for the vertical transform, which may be the same or different.
Illustratively, as shown in fig. 6, the transform combination of sub-block 31 is (DCT8, DCT8), that of sub-block 32 is (DST7, DCT8), that of sub-block 33 is (DCT8, DST7), and that of sub-block 34 is (DST7, DST7).
Whether PBT is adopted for any coding block (such as the current coding block) can be indicated adaptively with a 1-bit flag. In one implementation, if the flag is a first value (e.g., 1), the current coding block uses PBT; if the flag is a second value (e.g., 0), the current coding block does not use PBT.
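The position-to-combination rule of fig. 6 and the 1-bit flag can be captured in a few lines; the grid coordinates and the DCT2 fallback used when PBT is off are this sketch's own assumptions.

```python
# Preset (horizontal, vertical) transform kernels by sub-block position,
# following fig. 6; (row, col) indexing is this sketch's own convention.
PBT_COMBOS = {
    (0, 0): ("DCT8", "DCT8"),   # sub-block 31, top-left
    (0, 1): ("DST7", "DCT8"),   # sub-block 32, top-right
    (1, 0): ("DCT8", "DST7"),   # sub-block 33, bottom-left
    (1, 1): ("DST7", "DST7"),   # sub-block 34, bottom-right
}

def pbt_transform_for(row: int, col: int, pbt_flag: int):
    """Return the preset combination when the 1-bit PBT flag is set."""
    if pbt_flag == 1:
        return PBT_COMBOS[(row, col)]
    return ("DCT2", "DCT2")      # assumed default when PBT is not used
```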
It will be appreciated that a coding block corresponds to a divided area in a video frame, and the coding block contains the video data information in that divided area. The video data information may be the luminance component data or the chrominance component data in the divided area. When a coding block contains luminance component data it is also called a luma coding block, and when it contains chrominance component data it is also called a chroma coding block. For example, after obtaining a plurality of divided areas 1 according to a first transform partitioning mode (e.g., RQT), the chrominance component data in those areas is extracted, thereby obtaining a plurality of chroma coding blocks. For another example, after obtaining a plurality of divided areas 2 according to a second transform partitioning mode (such as PBT), the luminance component data in those areas is extracted, thereby obtaining a plurality of luma coding blocks. It will be appreciated that luma coding blocks and chroma coding blocks are encoded, decoded and filtered separately: when encoded, a luma coding block references a matching luma coding block, and a chroma coding block references a matching chroma coding block. After the reconstructed data block corresponding to a luma coding block is obtained, the predicted luminance component data of the video frame (i.e., the decoded and reconstructed luminance component data) can be obtained; after the reconstructed data block corresponding to a chroma coding block is obtained, the predicted chrominance component data of the video frame (i.e., the reconstructed chrominance component data) can be obtained, so that the superposition of the predicted luminance component data and the predicted chrominance component data can be taken as the predicted video frame (i.e., the decoded and reconstructed video frame) of the video frame.
Therefore, when encoding data blocks, the encoding terminal obtains transform partition information for each encoded block (transmitted in the entropy-encoded data stream), and the transform partition information is also transmitted to the decoding terminal. The transform partition information comprises the transform partition manner of the encoded block and the position identifier of the encoded block after division. When the encoding terminal obtains a pre-reconstructed data block, the partition position information of the pre-reconstructed data block within a video frame can be determined according to the corresponding transform partition information, and the pre-reconstructed data blocks are spliced based on the partition position information to obtain the pre-reconstructed video frame. Similarly, when the decoding terminal obtains a reconstructed data block, it can determine the partition position information of the reconstructed data block within a video frame according to the corresponding transform partition information, and splice the reconstructed data blocks based on the partition position information to obtain the reconstructed video frame.
3. Video decoding:
The decoding process corresponds to the encoding process. At the decoding end, after obtaining the video encoding data stream for each encoded block, on one hand, entropy decoding is performed on the video encoding data stream to obtain the data decoding parameters associated with the encoded block, where the data decoding parameters comprise prediction mode association information and quantized transform coefficients; inverse quantization and inverse transformation are then performed on the quantized transform coefficients to obtain the residual data information. On the other hand, according to the prediction mode association information, the prediction signal (prediction data block) corresponding to the encoded block can be obtained. The residual data information and the prediction data block are added to obtain the decoded data block, and after the decoded data block is filtered, a reconstructed data block is obtained; the reconstructed data block can be used to reconstruct the reconstructed video frame corresponding to the video frame where the encoded block is located.
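The per-block decoding path just described can be summarized with the following sketch; every helper passed in is an assumed stand-in for a codec-internal stage, not an API defined by the text.

```python
def decode_block(bitstream_chunk, entropy_decode, dequantize,
                 inverse_transform, predict, loop_filter):
    # Entropy decoding yields the data decoding parameters.
    params = entropy_decode(bitstream_chunk)
    # Inverse quantization and inverse transform recover the residual
    # data information from the quantized transform coefficients.
    residual = inverse_transform(dequantize(
        params["quantized_transform_coefficients"]))
    # The prediction mode association information yields the prediction
    # signal (prediction data block).
    prediction = predict(params["prediction_mode_info"])
    decoded = prediction + residual  # decoded data block
    return loop_filter(decoded)      # reconstructed data block
```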
It should be understood that the service scenario according to the embodiment of the present application may be a transmission scenario of multimedia data, and may be applied to video codecs (e.g., video codecs employing sub-block transform techniques), video compression products, or multimedia products. For example, the method can be applied to the intelligent traffic field, particularly the fields of automatic driving, intelligent navigation, video image acquisition, and the like. In the automatic driving field, for instance, automatic driving can be realized through acquired road video data (road picture data); the road video data needs to be uploaded to intelligent equipment such as a server, which involves the compressed transmission of road picture video data. The method can also be applied to a video sharing scenario: the encoding terminal may be the terminal device of user A, which receives the target video uploaded or shot by user A, encodes the target video, and transmits the video encoding data stream to the decoding terminal; the decoding terminal may be the terminal device of user B, which decodes the target video based on the video encoding data stream. For another example, in a video viewing scenario, the encoding terminal may be the background server of a multimedia platform, which receives a viewing request of user A for a target video, acquires the target video from a database, encodes it, and transmits the video encoding data stream to the decoding terminal; the decoding terminal may be the terminal device of user A, which decodes and displays the target video based on the video encoding data stream. For another example, in a video recommendation scenario, the encoding terminal may be the background server of a multimedia platform, which receives a recommendation request of user A, acquires the target video to be recommended from a database, encodes it, and transmits the video encoding data stream to the decoding terminal; the decoding terminal may be the terminal device of user A, which decodes and displays the target video based on the video encoding data stream. For another example, in a live video scenario, the encoding terminal may be the terminal device of a host, which captures live video through a camera, encodes it, and transmits the video encoding data stream to a server, which forwards it to the decoding terminal; the decoding terminal may be the terminal device of a user watching the live stream, which decodes and displays the video based on the video encoding data stream. The application scenario is not limited here; the scheme can be applied to any video data transmission scenario, and the source and type of the target video are likewise not limited.
Optionally, the technical scheme of the application can be applied to the field of cloud technology. Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology, and the like applied on the basis of the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support: the background services of technical networking systems require a large amount of computing and storage resources, such as video websites, picture websites, and other portal websites. With the high development and application of the internet industry, each article may have its own identification mark in the future, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data need strong system backing, which can only be realized through cloud computing.
For example, the scheme may be used in cloud applications involving cloud servers, such as intelligent transportation based on cloud technology. The acquired road video data can be encoded and transmitted to the cloud server by the acquisition equipment on the vehicle or by satellite, and the cloud server processes the road video data to realize automatic navigation, live road analysis, and the like. The application can thus realize the transmission of video data (picture data) in the fields of map navigation, intelligent traffic, intelligent driving, and the like.
The scheme can also be applied to a cloud conference scenario. A cloud conference is an efficient, convenient, and low-cost conference form based on cloud computing technology. Through a simple internet interface, users can quickly and efficiently share voice, data files, and video with groups and clients all over the world synchronously, while the cloud conference service provider handles complex technologies such as data transmission and processing within the conference. At present, domestic cloud conferences mainly focus on service content in the SaaS (Software as a Service) mode, including service forms such as telephone, network, and video; a video conference based on cloud computing is called a cloud conference. In the cloud conference era, the transmission, processing, and storage of data are all handled by the computing resources of video conference vendors, so users can hold efficient remote conferences without purchasing expensive hardware or installing complicated software. The cloud conference system supports dynamic cluster deployment across multiple servers and provides multiple high-performance servers, greatly improving conference stability, security, and usability. In recent years, video conferencing has been welcomed by many users because it greatly improves communication efficiency, continuously reduces communication cost, and upgrades internal management; it has been widely used in transportation, finance, operator, education, and other fields. The video data transmitted in a cloud conference can use the technical scheme of the application.
It should be noted that, when the computer device in the embodiment of the present application obtains data such as a user's personal data information, a prompt interface or pop-up window may be displayed to inform the user that such data is currently being collected; the relevant data acquisition step is started only after a confirmation operation on the prompt interface or pop-up window is obtained from the user, otherwise the process ends.
It will be appreciated that the specific embodiments of the present application may involve business data of objects such as users, enterprises, institutions, and systems (e.g., information such as target videos uploaded by users); when the above embodiments of the present application are applied to specific products or technologies, the permission or consent of those objects needs to be obtained, and the collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant regions.
Further, referring to fig. 7, fig. 7 is a flowchart of a video data processing method according to an embodiment of the present application, as shown in fig. 7, the method may be performed by the above-mentioned computer device, and the computer device may be a coding terminal, for example, the coding terminal may be the service terminal 200a shown in fig. 1. The method specifically comprises the following steps of S101-S106:
S101, obtaining a video frame to be processed in a target video.
The target video may be a video uploaded by a user, a video collected by an image capturing device, or a video obtained from a database, and the source and the obtaining scene of the target video are not limited herein. The target video may be any video that requires video compression.
The target video includes one or more video frames, and the coding and decoding processing logic of each video frame is the same, and the technical scheme of the present application is described by taking any video frame as a video frame to be processed (i.e., a video frame currently being encoded) as an example. In addition, a data block to be encoded included in the video frame to be processed is a data block j; the data block j is any one of the data blocks in the video frame to be processed. It will be appreciated that the video frame to be processed includes one or more data blocks, referred to as blocks of data to be encoded during the encoding phase and as blocks of data to be decoded during the decoding phase. The coding and decoding processing logic of each data block is the same, and the technical scheme of the application is described by taking any data block as a data block j as an example.
S102, carrying out data analysis processing on the data block j to obtain the propagation cost of the data block j, and determining the reference data block corresponding to the data block j based on the propagation cost of the data block j.
The propagation cost (also called inherited cost) characterizes the importance degree of the current data block: the larger the value of the propagation cost, the more important the data block, i.e., the more important it is as a reference for subsequent frames in inter prediction. The more information a current data block contributes to subsequent frames (i.e., the more it is referenced by subsequent frames), the higher its importance. The propagation cost is a parameter that determines the QP for a data block. A CU-tree-style lookahead algorithm may be used to calculate the information transfer ratio from the intra-frame prediction cost and the inter-frame prediction cost, so that the degree of influence of the current data block on subsequent video frames (i.e., the propagation cost, i.e., the pre-analysis information) can be calculated.
Take as an example that the target video comprises N video frames, where N is a positive integer greater than 1; the video frame to be processed, i.e., the one currently being encoded, is video frame i among the N video frames, where i is a positive integer greater than 1 and less than or equal to N. Among the N video frames, the video frames located after video frame i are the video frames to be encoded. It will be appreciated that when the target video is a live stream, video frames are generated continuously, and each video frame goes through the same encoding and decoding process.
It can be understood that the propagation cost is the degree of influence of a data block in video frame i on the video frames to be encoded. The propagation costs of the data blocks in the current video frame i may be determined based on the video frames following video frame i, e.g., from the propagation costs of the data blocks in those subsequent video frames. A specific way of determining the propagation costs of the data blocks in video frame i is given in the description of the embodiments below.
It can be understood that when the data blocks of the original video frame are used for encoding, decoding, and filtering, the loop filter tends to restore the noise of those original data blocks. If a data block contributes little information to subsequent frames, that is, its decoded reconstruction has little influence on the decoding of subsequent video frames, then a decoded reconstruction close to the original-frame data block improves the reconstructed image quality of the video frame and the coding efficiency. However, if a data block contributes much information to subsequent frames, the noise restored by filtering may degrade the effect and accuracy of inter prediction for subsequent video frames and reduce coding efficiency. Therefore, an appropriate data block can be adaptively selected as the reference data block according to the propagation cost of the data block, and the reference data block is encoded, decoded, and filtered; the noise of different data blocks in the reconstructed video frame can thereby be adjusted adaptively, reducing the influence of noise in decoded and reconstructed data blocks on the encoding and decoding of subsequent video frames and improving the reconstruction quality of the video frame. The application can be used in combination with any wiener filtering, including in-loop and out-of-loop wiener filtering. It can be appreciated that the technical scheme of the application applies where a data block needs to be filtered by an adaptive filter. For example, a rate-distortion optimization (RDO) cost may be determined for a data block: when the RDO cost of the data block is greater than a threshold, the encoded filtering data of the data block is determined; when the RDO cost of the data block is less than or equal to the threshold, no adaptive filter is used for the data block.
The manner of determining the reference data block corresponding to the data block j may be: acquiring the denoised data block corresponding to the data block j from the denoised video frame corresponding to the video frame to be processed; and if the propagation cost of the data block j is greater than the propagation cost threshold, taking the denoised data block corresponding to the data block j as the reference data block corresponding to the data block j.
Alternatively, the manner of determining the reference data block corresponding to the data block j may be: acquiring the denoised data block corresponding to the data block j from the denoised video frame corresponding to the video frame to be processed; and if the propagation cost of the data block j is less than or equal to the propagation cost threshold, taking the data block j itself as the reference data block corresponding to the data block j.
The denoised video frame corresponding to the video frame to be processed is obtained by denoising the video frame to be processed through one or more denoising algorithms. The specific denoising algorithm is not limited here; for example, a temporal or spatial denoising filter may be used. The denoised video has higher temporal correlation, so the encoder can more easily remove the temporal redundancy between video frames, thereby improving compression efficiency.
It can be understood that when the propagation cost of data block j is greater than the propagation cost threshold, the data block j is of greater importance to the encoding of subsequent video frames; if the reconstructed data block of data block j contained much noise, the reconstruction of subsequent video frames would suffer, so the denoised data block corresponding to data block j can be used as the reference data block, making the reconstructed data block of data block j contain less noise. When the propagation cost of data block j is less than or equal to the propagation cost threshold, the data block j is of lesser importance to the encoding of subsequent video frames, and the noise in its reconstructed data block has little influence on the reconstruction of subsequent video frames; data block j can therefore be used directly as the reference data block, so that its reconstructed data block comes closer to the effect of the original video frame. That is, by adaptively selecting the reference data block for encoding, decoding, and filtering, the reconstruction effect of the reconstructed data block can be balanced between staying close to the original video frame and reducing noise.
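The two branches above reduce to a single adaptive choice; a minimal sketch follows, assuming the propagation cost threshold is an encoder tuning parameter (the text does not fix its value).

```python
def select_reference_block(block_j, denoised_block_j,
                           propagate_cost_j, cost_threshold):
    if propagate_cost_j > cost_threshold:
        # Block j matters to later frames: reference the denoised block
        # so its reconstruction carries less noise forward.
        return denoised_block_j
    # Block j matters little to later frames: keep the original-frame
    # samples so the reconstruction stays close to the source.
    return block_j
```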
S103, coding the reference data block to obtain a coded reference data block and coding strategy data associated with the coded reference data block.
It will be appreciated that the reference data block is an encoded reference data block after the encoding process.
Take as an example that the target video includes N video frames, where N is a positive integer greater than 1; the video frame to be processed is video frame i among the N video frames, where i is a positive integer less than or equal to N. Among the N video frames, the video frames preceding video frame i are encoded video frames. The reconstructed data buffer associated with the target video (i.e., the reconstructed data buffer in the encoding terminal) is used to buffer the pre-reconstructed video frames corresponding to the encoded video frames. When video frame i is encoded, the video frame to be referenced is determined from the reconstructed data buffer as the reference reconstructed video frame based on the prediction mode of video frame i. It will be appreciated that when the prediction mode is inter prediction, the reconstructed video frame referenced by video frame i needs to be determined from the reconstructed data buffer. If video frame i needs to be encoded with reference to the previous video frame i-1, the reconstructed video frame i-1 corresponding to video frame i-1 is obtained from the reconstructed data buffer, and the reference data blocks corresponding to the data blocks in video frame i are encoded through the reconstructed video frame i-1.
For example, the process of encoding the reference data block to obtain the encoded reference data block and the encoding strategy data associated with the encoded reference data block may be: determining, from the reference reconstructed video frame and based on the prediction mode of video frame i, the data block that the reference data block can refer to during data encoding, and taking it as the target reference data block; performing data encoding on the reference data block based on the prediction mode of video frame i to obtain the residual data information between the target reference data block and the reference data block, together with the prediction mode association information associated with the reference data block; performing transform and quantization processing on the residual data information to obtain the quantized transform coefficients associated with the reference data block; and, when the reference data block associated with the prediction mode association information and the quantized transform coefficients is taken as the encoded reference data block, taking the prediction mode association information and the quantized transform coefficients associated with the reference data block as the encoding strategy data associated with the encoded reference data block.
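A sketch of those steps for one reference data block is given below (NumPy for the array math); the transform and quantize callables and the placeholder prediction mode information are assumptions standing in for the codec's actual stages.

```python
import numpy as np

def encode_reference_block(reference_block, target_reference_block,
                           transform, quantize, prediction_mode_info):
    # Residual data information between the target reference data block
    # and the reference data block.
    residual = reference_block - target_reference_block
    # Transform and quantization yield the quantized transform coefficients.
    q_coeffs = quantize(transform(residual))
    # Encoding strategy data associated with the encoded reference block.
    return {"prediction_mode_info": prediction_mode_info,
            "quantized_transform_coefficients": q_coeffs}

# Toy usage with trivial transform/quantizer stand-ins.
ref = np.arange(16.0).reshape(4, 4)
target = np.ones((4, 4))
strategy = encode_reference_block(ref, target,
                                  transform=lambda r: r,
                                  quantize=lambda c: np.round(c / 2.0),
                                  prediction_mode_info={"mode": "inter"})
```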
It will be appreciated that the reference data block becomes an encoded reference data block after the encoding process. After the reference data blocks corresponding to all data blocks in video frame i have been encoded, video frame i is taken as an encoded video frame, and encoding continues with video frame i+1. A reference data block corresponding to a data block is taken as an encoded data block after the encoding process.
It will be appreciated that the reconstructed data blocks corresponding to the encoded data blocks in the current video frame i are also stored in the reconstructed data buffer. When the prediction mode of video frame i is intra prediction and data block j is being encoded, the data block to be referenced is determined from the reconstructed data blocks, stored in the reconstructed data buffer, that correspond to the encoded data blocks in the current video frame i, and is taken as the reference reconstructed data block. Based on the prediction mode of video frame i, data encoding is performed on the reference data block to obtain the residual data information between the reference reconstructed data block and the reference data block and the prediction mode association information of the reference data block, from which the data encoding parameters of data block j are further determined.
That is, when the reference data block corresponding to data block j has been encoded to obtain the encoded reference data block, the encoded reference data block may be pre-decoded to obtain the pre-reconstructed data block corresponding to data block j, and the pre-reconstructed data block may be added to the reconstructed data buffer. When the prediction mode of video frame i is the intra mode, the pre-reconstructed data blocks of video frame i participate in the encoding of the next data block. When all data blocks of video frame i have obtained pre-reconstructed data blocks, frame pre-reconstruction is performed on video frame i using the pre-reconstructed data blocks, stored in the reconstructed data buffer, corresponding to all the data blocks of video frame i, obtaining the pre-reconstructed video frame corresponding to video frame i, which is also added to the reconstructed data buffer. When the prediction mode of video frame i+1 is the inter mode, the pre-reconstructed video frame i corresponding to video frame i participates in the encoding of video frame i+1. The process of frame pre-reconstruction of video frame i by the encoding terminal is the same as the process of frame reconstruction of video frame i by the decoding terminal; reference may be made to the description of the following embodiments.
For example, as shown in fig. 8, fig. 8 is a schematic diagram of a processing procedure of an encoding terminal according to an embodiment of the present application. The reconstructed data buffer stores reconstructed video frame 1 to reconstructed video frame i-1 corresponding to video frame 1 to video frame i-1, and reconstructed data block 1 to reconstructed data block j-1 corresponding to data block 1 to data block j-1 in video frame i. When the reference data block corresponding to data block j in video frame i is encoded, the data block that the reference data block of data block j needs to refer to (the target reference data block) is determined from reconstructed video frame i-1 or from reconstructed data block 1 to reconstructed data block j-1 based on the prediction mode of video frame i; the reference data block corresponding to data block j is encoded based on the target reference data block to obtain the encoded reference data block; the encoded reference data block is pre-decoded (using the data in the reconstructed data buffer) and filtered to obtain the reconstructed data block j corresponding to data block j, and reconstructed data block j is added to the reconstructed data buffer.
S104, pre-decoding the coded reference data block through the coding strategy data to obtain a pre-decoded data block corresponding to the data block j.
The pre-decoding process of the encoded reference data block is the same as the decoding process performed on it by the decoding terminal; see the description of the embodiments below.
S105, when the coding filtering data associated with the coded reference data block is determined through the coded reference data block and the pre-decoding data block, the coding strategy data associated with the coded reference data block and the coding filtering data associated with the coded reference data block are used as the data coding parameters associated with the data block j.
The encoded filtering data is used to filter the pre-decoded data block to obtain the pre-reconstructed data block corresponding to the pre-decoded data block; the pre-reconstructed data block is used for frame pre-reconstruction of the video frame to be processed.
That is, the data encoding parameters associated with data block j include, but are not limited to: the encoding strategy data, the encoded filtering data, and the transform partition information.
It will be appreciated that an adaptive filter (e.g., a wiener filter) is included in the loop filter, so that the encoded filtering data of the encoded reference data block can be determined from the encoded reference data block and the pre-decoded data block. That is, each data block may correspond to its own adaptive encoded filtering data (i.e., filter parameters). Therefore, filtering the pre-decoded data block according to the encoded filtering data means that the adaptive filter in the loop filter filters the pre-decoded data block according to the encoded filtering data.
The encoding strategy data is used to pre-decode the encoded reference data block to obtain the pre-decoded data block corresponding to data block j. The process of determining the encoded filtering data of the encoded reference data block may therefore be: determining the associated difference information between the pre-decoded data block and the encoded reference data block; and determining the encoded filtering data associated with the encoded reference data block through the associated difference information.
Determining the associated difference information between the pre-decoded data block and the encoded reference data block may be: determining the autocorrelation information of the pre-decoded data block and the cross-correlation information between the pre-decoded data block and the encoded reference data block; and taking the autocorrelation information and the cross-correlation information as the associated difference information between the pre-decoded data block and the encoded reference data block.
Determining the encoded filtering data through the associated difference information may be: constructing a Wiener-Hopf equation based on the associated difference information through a wiener filtering algorithm, and solving the Wiener-Hopf equation to obtain the encoded filtering data. The left side of the Wiener-Hopf equation is the autocorrelation information of the pre-decoded data block (i.e., the distorted signal), and the right side is the cross-correlation information between the pre-decoded data block and the encoded reference data block (the reference signal). Given a reference signal and a distorted signal, wiener filtering filters the distorted signal so as to recover the reference signal, minimizing the mean squared difference between the reference signal and the reconstructed (filtered) signal.
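A one-dimensional numeric sketch of this solve follows; a real in-loop wiener filter is two-dimensional with a fixed support shape, so the tap count and signal layout here are assumptions for illustration.

```python
import numpy as np

def wiener_filter_taps(decoded, reference, taps=5):
    """Solve the Wiener-Hopf system R h = p: R holds the autocorrelation
    of the distorted (pre-decoded) signal, p its cross-correlation with
    the reference signal; h is the encoded filtering data."""
    x = np.asarray(decoded, dtype=np.float64).ravel()    # distorted signal
    d = np.asarray(reference, dtype=np.float64).ravel()  # reference signal
    n = len(x)
    r = np.array([np.dot(x[k:], x[:n - k]) for k in range(taps)])
    p = np.array([np.dot(d[k:], x[:n - k]) for k in range(taps)])
    R = np.array([[r[abs(i - j)] for j in range(taps)]
                  for i in range(taps)])                 # Toeplitz matrix
    return np.linalg.solve(R, p)  # assumes R is non-singular

rng = np.random.default_rng(0)
ref = rng.standard_normal(4096)
dec = ref + 0.1 * rng.standard_normal(4096)  # distorted observation
h = wiener_filter_taps(dec, ref)
filtered = np.convolve(dec, h)[:len(dec)]    # filtering reduces the MSE
```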
It will be appreciated that the pre-decoded data block corresponding to data block j is pre-decoded data block j. When the encoded filtering data of data block j has been determined, pre-decoded data block j and the encoded filtering data associated with data block j are input into the loop filter in the encoding terminal; the loop filter filters pre-decoded data block j according to the encoded filtering data associated with data block j to obtain the pre-reconstructed data block corresponding to pre-decoded data block j. Frame pre-reconstruction is performed on video frame i through the pre-reconstructed data blocks to obtain the pre-reconstructed video frame corresponding to video frame i, and the pre-reconstructed video frame corresponding to video frame i is added to the reconstructed data buffer. The specific process of pre-reconstructing the pre-reconstructed data blocks into the pre-reconstructed video frame of video frame i is the same as the process of reconstructing the reconstructed data blocks into the reconstructed video frame of video frame i and will not be described here.
It is understood that the data encoding parameters include the encoding strategy data and the encoded filtering data. The encoding strategy data is used (at the decoding terminal) to decode the encoded reference data block to obtain the decoded data block of data block j; the encoded filtering data is used to filter the decoded data block to obtain the reconstructed data block corresponding to the decoded data block; the reconstructed data block is used for frame reconstruction of the video frame to be processed. The specific process at the decoding terminal may be found in the description of the embodiments below and is not repeated here.
Correspondingly, at the encoding terminal, the encoding strategy data is used to pre-decode the encoded reference data block to obtain the pre-decoded data block of data block j; the encoded filtering data is used to filter the pre-decoded data block to obtain the pre-reconstructed data block corresponding to the pre-decoded data block; and the pre-reconstructed data block is used for frame pre-reconstruction of the video frame to be processed to obtain the pre-reconstructed video frame.
For example, as shown in fig. 9, fig. 9 is a schematic diagram of a process for determining encoded filtering data according to an embodiment of the present application. The reference data block of data block j is determined from data block j and the denoised data block of data block j through the propagation cost of data block j; (S41) the reference data block is encoded to obtain the encoding strategy data of data block j; (S42) the encoded reference data block is pre-decoded based on the encoding strategy data to obtain the pre-decoded data block corresponding to data block j, and the associated difference information is determined through the pre-decoded data block and the encoded reference data block; (S43) a Wiener-Hopf equation is constructed through the associated difference information and solved to obtain the filter parameters; (S44) the pre-decoded data block is filtered according to the filter parameters to obtain the reconstructed data block, which is added to the reconstructed data buffer. The reconstructed data blocks in the reconstructed data buffer are used to reconstruct the reconstructed video frame of the video frame where data block j is located.
S106, based on the data coding parameters associated with the data block j, coding to obtain a video coding data stream associated with the video frame to be processed.
It can be understood that, when each data block to be encoded in the video frame to be processed has been taken as data block j and the data encoding parameters associated with each data block have been obtained, the data encoding parameters associated with each data block in the video frame to be processed can be encoded (e.g., entropy encoded) to obtain the video encoding data stream associated with the video frame to be processed; the video encoding data stream is encapsulated and then transmitted to the decoding terminal, which decodes and reconstructs it to obtain the corresponding reconstructed video frame.
The embodiment of the application provides a method for adaptively selecting a reference data block for video encoding according to the propagation cost of the current data block: acquiring a data block j included in a video frame to be processed in a target video, where data block j is any data block in the video frame to be processed; determining the reference data block corresponding to data block j based on the propagation cost obtained by performing data analysis processing on data block j; encoding the reference data block to obtain the encoding strategy data associated with the encoded reference data block; pre-decoding the encoded reference data block through the encoding strategy data to obtain the pre-decoded data block corresponding to data block j; and, when the encoded filtering data associated with the encoded reference data block is determined through the encoded reference data block and the pre-decoded data block, taking the encoding strategy data and the encoded filtering data as the data encoding parameters associated with data block j, and encoding to obtain the video encoding data stream associated with the video frame to be processed. That is, an appropriate data block can be adaptively selected, through the propagation cost of data block j, as the reference data block corresponding to data block j; compared with directly encoding, decoding, and filtering the data blocks of the video frame to be processed (i.e., the original video frame), encoding and filtering the reference data block allows the reconstruction effect of the pre-reconstructed data block to be balanced between staying close to the original video frame and reducing noise, and reduces the noise in the pre-reconstructed video frame obtained by pre-reconstruction. This benefits the encoding efficiency and encoding quality of subsequent video frames that need to reference the video frame to be processed (i.e., its pre-reconstructed video frame) during encoding, and thus ensures the decoding prediction accuracy of those subsequent video frames on the decoding terminal side; that is, improving the encoding quality of a video frame also improves its decoding and reconstruction effect.
Further, referring to fig. 10, fig. 10 is a flowchart of a video data processing method according to an embodiment of the present application, as shown in fig. 10, the method may be performed by the above-mentioned computer device, and the computer device may be a coding terminal, for example, the coding terminal may be the service terminal 200a shown in fig. 1. The method specifically comprises the following steps of S201 to S209:
S201, obtaining a video frame to be processed in the target video. For the specific implementation of step S201, reference may be made to the relevant description of the above embodiment, which is not repeated here.
S202, acquiring estimated reference video frames associated with video frame i from the video frames to be encoded.
Specifically, determining the propagation cost of data block j begins by acquiring, from the video frames to be encoded, the estimated reference video frames associated with video frame i.
The estimated reference video frames associated with video frame i can be acquired from the video frames to be encoded according to a reference video frame acquisition strategy; there may be one or more estimated reference video frames, and a later video frame among them can reference the preceding one for data encoding. That is, the first determined estimated reference video frame (e.g., video frame i+1) is the first estimated reference video frame that can reference video frame i for data encoding; the second (e.g., video frame i+2) is the second estimated reference video frame, which can reference the first estimated reference video frame for data encoding, and so on. It will be appreciated that the propagation costs of the data blocks in a preceding video frame are determined by the propagation costs of the data blocks in the following video frames.
The reference video frame acquisition strategy may be to sequentially acquire R video frames from the video frames to be encoded as the estimated reference video frames, where the first estimated reference video frame is video frame i+1, the second is video frame i+2, and so on; when the number of video frames to be encoded is smaller than R, all of them are taken as estimated reference video frames. Alternatively, one video frame may be acquired at every interval among the video frames to be encoded until the number of acquired interval video frames reaches R, and the R acquired interval video frames are taken as the estimated reference video frames; R is a positive integer, and in this case the first estimated reference video frame is video frame i+1 and the second is video frame i+3. If the number of acquired interval video frames is smaller than R, all of them are taken as estimated reference video frames.
Optionally, the reference video frame acquisition strategy may also be to acquire, from the N video frames included in the target video, the frame group where video frame i is located, take the video frames to be encoded in that frame group as the target video frames to be encoded, and acquire the estimated reference video frames of video frame i from the target video frames to be encoded. The specific manner of acquiring the estimated reference video frames of video frame i from the target video frames to be encoded is the same as that of acquiring them from the video frames to be encoded; a sketch of the acquisition strategies appears below.
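The first two acquisition strategies can be sketched as follows (Python list slicing; the frame-group variant would simply restrict the input list first):

```python
def estimated_reference_frames(frames_to_encode, R, interval=1):
    """frames_to_encode: video frames after the current video frame i,
    in encoding order. interval=1 takes R consecutive frames
    (i+1, i+2, ...); interval=2 takes every other frame (i+1, i+3, ...).
    If fewer frames are available, all of them are used."""
    return frames_to_encode[::interval][:R]

frames = [4, 5, 6, 7, 8, 9, 10]               # frame numbers after frame 3
print(estimated_reference_frames(frames, R=5))              # [4, 5, 6, 7, 8]
print(estimated_reference_frames(frames, R=5, interval=2))  # [4, 6, 8, 10]
```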
For ease of understanding, the propagation cost determination process for data block j is described here using the example where the estimated reference video frames include a first estimated reference video frame and a second estimated reference video frame.
S203, determining a first estimated reference data block associated with the data block j from a first estimated reference video frame included in the estimated reference video frame, and determining a second estimated reference data block associated with the first estimated reference data block from a second estimated reference video frame included in the estimated reference video frame.
It can be understood that the first estimated reference video frame is a video frame, determined among the video frames to be encoded, whose data blocks can reference the data blocks in video frame i for data encoding; the second estimated reference video frame is a video frame, determined among the video frames to be encoded, whose data blocks can reference the data blocks in the first estimated reference video frame for data encoding.
Thus, the first estimated reference data block associated with data block j may be determined from the first estimated reference video frame, and the second estimated reference data block associated with the first estimated reference data block may be determined from the second estimated reference video frame. The first estimated reference data block is a data block in the first estimated reference video frame that can reference data block j for data encoding; the second estimated reference data block is a data block in the second estimated reference video frame that can reference the first estimated reference data block for data encoding.
There may be one or more first estimated reference data blocks associated with data block j in the first estimated reference video frame. The data block in video frame i that matches each data block in the first estimated reference video frame may be determined based on a block-based motion compensation technique; that is, for each data block in the first estimated reference video frame, the data block in video frame i that it can reference during data encoding is determined (it will be understood that this is only an estimation, and data encoding is not actually performed on the data blocks of the first estimated reference video frame). When the data block matching a data block (block A) in the first estimated reference video frame is data block j, block A is the first estimated reference data block associated with data block j. The second estimated reference data block is determined in the same manner as the first estimated reference data block.
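One way the matching could be computed with block-based motion compensation is sketched below: each data block of the first estimated reference video frame is projected back into video frame i by its motion vector, and its overlap with data block j decides whether (and by what ratio, anticipating the division ratio used later in this embodiment) it counts as a first estimated reference data block. The rectangle layout and field names are assumptions.

```python
def blocks_referencing(block_j_rect, next_frame_blocks):
    """block_j_rect: (x, y, w, h) of data block j in video frame i.
    next_frame_blocks: dicts with 'rect' (x, y, w, h) and 'mv' (dx, dy),
    the motion vector pointing from the next frame into video frame i.
    Returns (block, overlap_ratio) pairs for blocks that reference j."""
    jx, jy, jw, jh = block_j_rect
    hits = []
    for b in next_frame_blocks:
        x, y, w, h = b["rect"]
        sx, sy = x + b["mv"][0], y + b["mv"][1]  # source area in frame i
        ow = min(jx + jw, sx + w) - max(jx, sx)
        oh = min(jy + jh, sy + h) - max(jy, sy)
        if ow > 0 and oh > 0:
            hits.append((b, (ow * oh) / (w * h)))  # fraction of b covered
    return hits
```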
S204, determining the propagation cost of the data block j based on the second estimated reference data block and the first estimated reference data block.
The propagation cost of data block j may be determined by pre-analyzing the first estimated reference data block based on the propagation cost of the second estimated reference data block to obtain the propagation cost of the first estimated reference data block, and then pre-analyzing data block j based on the propagation cost of the first estimated reference data block to obtain the propagation cost of data block j.
It will be appreciated that when video frame i is the last video frame in the target video, the information of the data blocks in video frame i will not be propagated to any subsequent video frame; in this case the propagation cost of the data blocks in video frame i is 0.
It will be appreciated that the information of the data blocks of a preceding video frame among the estimated reference video frames is considered to be transmitted to the data blocks of the following video frame, so that the propagation cost of each data block in the current video frame i can be estimated. In addition, the information of the data blocks of the last estimated reference video frame is taken as not being transmitted to subsequent video frames; that is, the propagation cost of the data blocks of the last estimated reference video frame is 0. It will be appreciated that when a data block has no associated data block, the propagation cost of that data block is 0.
Therefore, when determining the propagation cost of data block j, the propagation costs of the data blocks in each video frame are determined from back to front among the estimated reference video frames, and by analogy the propagation cost of each data block in video frame i is obtained.
That is, if the number of estimated reference video frames is R, then for data block j, the data blocks that can reference data block j are obtained from the first estimated reference video frame (the first estimated reference data blocks); the second estimated reference data blocks that can reference the first estimated reference data blocks are obtained from the second estimated reference video frame; and so on, the Rth estimated reference data blocks that can reference the (R-1)th estimated reference data blocks are obtained from the Rth estimated reference video frame, and the propagation cost of the Rth estimated reference data blocks is 0.
Taking the propagation cost of data block j as an example, pre-analyzing data block j based on the propagation cost of the first estimated reference data block may mean pre-analyzing data block j based on the propagation cost, the intra-frame cost, and the inter-frame cost of the first estimated reference data block. The inter-frame cost of a data block in a following video frame is determined based on the data blocks in the preceding video frame; that is, the inter-frame cost of the first estimated reference data block is determined based on the data blocks in video frame i.
Accordingly, taking the propagation cost of the first estimated reference data block as an example, pre-analyzing the first estimated reference data block based on the propagation cost of the second estimated reference data block may mean pre-analyzing it based on the propagation cost, the intra-frame cost, and the inter-frame cost of the second estimated reference data block. By analogy, the (R-1)th estimated reference data block is pre-analyzed based on the propagation cost, the intra-frame cost, and the inter-frame cost of the Rth estimated reference data block to obtain the propagation cost of the (R-1)th estimated reference data block.
It will be appreciated that the propagation cost of data block j is determined based on the intra-frame cost, the inter-frame cost, and the propagation cost of the first estimated reference data block, and the propagation cost of the first estimated reference data block is determined based on the intra-frame cost, the inter-frame cost, and the propagation cost of the second estimated reference data block. That is, the propagation cost of a data block in a preceding video frame is determined by the propagation costs of the data blocks in the following video frames. By computing from the last estimated reference video frame back to the current video frame i, the propagation cost of each data block in the current video frame i with respect to the subsequent estimated reference video frames can be obtained.
Pre-analyzing the intra-frame cost, the inter-frame cost, and the propagation cost of the first estimated reference data block m1 to obtain the propagation cost propagate_cost_j of data block j may be:

propagate_cost_j = f(propagate_cost_m1, intra_cost_m1, inter_cost_m1)

where propagate_cost_m1 is the propagation cost of the first estimated reference data block m1, intra_cost_m1 is its intra-frame cost, and inter_cost_m1 is its inter-frame cost. Here f(x, y, z) represents the function that solves the propagation cost of the current data block and can be defined by the relevant business personnel.
Correspondingly, pre-analyzing the intra-frame cost, the inter-frame cost, and the propagation cost of the second estimated reference data block m2 to obtain the propagation cost of the first estimated reference data block m1 may be:

propagate_cost_m1 = f(propagate_cost_m2, intra_cost_m2, inter_cost_m2)

where propagate_cost_m2 is the propagation cost of the second estimated reference data block m2, intra_cost_m2 is its intra-frame cost, and inter_cost_m2 is its inter-frame cost.
Optionally, when the first estimated reference data block m1 is associated with a plurality of data blocks, the propagation cost obtained for m1 is divided among the plurality of data blocks according to certain proportions.
That is, the propagation cost propagate_cost_j of data block j is:

propagate_cost_j = a · f(propagate_cost_m1, intra_cost_m1, inter_cost_m1)

where a is the division ratio for data block j. The manner of determining the division ratio may be defined by the relevant business personnel; for example, the propagation cost may be divided equally among the plurality of data blocks.
Optionally, when there are a plurality of first estimated reference data blocks m1 of data block j (e.g., m11 and m12), and the propagation cost obtained for data block m11 is divided to data block j according to the ratio a, the propagation cost of data block j may be the sum of the propagation costs provided to data block j by the plurality of first estimated reference data blocks.

That is, the propagation cost propagate_cost_j of data block j is:

propagate_cost_j = a · f(propagate_cost_m11, intra_cost_m11, inter_cost_m11) + f(propagate_cost_m12, intra_cost_m12, inter_cost_m12)
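Putting the back-to-front pre-analysis together, the sketch below runs the recursion over the estimated reference video frames. The concrete f shown (a lookahead-style information transfer ratio) is only one plausible choice, since the text leaves f to the relevant business personnel; the frame/block data layout is likewise an assumption.

```python
def f(propagate, intra, inter):
    # One plausible cost-combination function: the information a block
    # carries (intra + inherited propagate cost), scaled by the fraction
    # not already predictable from its reference (1 - inter/intra).
    if intra <= 0:
        return 0.0
    return (propagate + intra) * max(0.0, 1.0 - inter / intra)

def run_preanalysis(frames):
    """frames[0] holds the blocks of the current video frame i; frames[1:]
    the estimated reference video frames in forward order. Each block dict
    has 'intra', 'inter', 'propagate' (initially 0, so the last frame's
    propagate cost stays 0) and 'refs': (index, ratio) pairs naming the
    blocks of the *previous* frame it can reference, with division ratio."""
    for frame, prev in zip(reversed(frames), reversed(frames[:-1])):
        for block in frame:
            share = f(block["propagate"], block["intra"], block["inter"])
            for idx, ratio in block["refs"]:
                prev[idx]["propagate"] += ratio * share

def blk(intra, inter, refs):
    return {"intra": intra, "inter": inter, "propagate": 0.0, "refs": refs}

frames = [
    [blk(100, 0, [])],          # current video frame i (refs unused here)
    [blk(80, 40, [(0, 1.0)])],  # first estimated reference frame
    [blk(60, 30, [(0, 1.0)])],  # second (last): propagate cost stays 0
]
run_preanalysis(frames)
print(frames[0][0]["propagate"])  # 55.0 for this toy example
```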
for example, as shown in fig. 11 to fig. 12, fig. 11 to fig. 12 are schematic diagrams of a determination scenario of propagation costs according to an embodiment of the present application; the target video comprises video frames 1-20, if at the time T1, the video frame 3 is the video frame being encoded, the video frames 4-20 are video frames to be encoded, if the propagation cost of the data block in the video frame 3 can be determined through R (such as R=5) video frames to be encoded after the video frame 3, then the estimated reference video frame (such as the video frame 4-8) associated with the video frame 3 is obtained from the video frames to be encoded, the propagation cost of the data block in the video frame 3 can be determined through the video frames 4-8, if the propagation cost of the data block in the video frame 8 can be set to 0, (s 11) the propagation cost of each data block in the video frame 7 is determined according to the propagation cost, the intra-frame cost and the inter-frame cost of the data block in the video frame 8; (s 12) determining the propagation cost of each data block in the video frame 6 according to the propagation cost, the intra-frame cost and the inter-frame cost of the data block of the video frame 7; (s 13) determining the propagation cost of each data block in the video frame 5 according to the propagation cost, the intra-frame cost and the inter-frame cost of the data block of the video frame 6; (s 14) determining the propagation cost of each data block in the video frame 4 according to the propagation cost, the intra-frame cost and the inter-frame cost of the data block of the video frame 5; (s 15) determining the propagation cost of each data block in the video frame 3, i.e. the propagation cost of each data block in the current frame, according to the propagation cost, the intra-frame cost and the inter-frame cost of the data blocks of the video frame 4.
Similarly, at time T2 (T2 > T1), if video frame 4 is the video frame being encoded, then video frames 5-20 are the video frames to be encoded. Suppose the propagation costs of the data blocks in video frame 4 are determined through the R (e.g., R=5) video frames to be encoded after video frame 4; then the estimated reference video frames associated with video frame 4 (i.e., video frames 5-9) are acquired from the video frames to be encoded, and the propagation costs of the data blocks in video frame 4 are determined through video frames 5-9. The propagation costs of the data blocks in video frame 9 may be set to 0; then (s21) the propagation cost of each data block in video frame 8 is determined according to the propagation cost, the intra-frame cost, and the inter-frame cost of the data blocks in video frame 9; (s22) the propagation cost of each data block in video frame 7 is determined according to the propagation cost, the intra-frame cost, and the inter-frame cost of the data blocks of video frame 8; (s23) the propagation cost of each data block in video frame 6 is determined according to the propagation cost, the intra-frame cost, and the inter-frame cost of the data blocks of video frame 7; (s24) the propagation cost of each data block in video frame 5 is determined according to the propagation cost, the intra-frame cost, and the inter-frame cost of the data blocks of video frame 6; (s25) the propagation cost of each data block in video frame 4, i.e., of each data block in the current frame, is determined according to the propagation cost, the intra-frame cost, and the inter-frame cost of the data blocks in video frame 5.
Here, as shown in fig. 12, taking the data block j in video frame 3 as an example: data block j is data block X1 in video frame 3. The video frame associated with video frame 3 is video frame 4, and the estimated reference data block associated with (i.e., able to reference) data block X1 determined from video frame 4 is data block X2; the video frame associated with video frame 4 is video frame 5, and the estimated reference data block associated with data block X2 determined from video frame 5 is data block X3; the video frame associated with video frame 5 is video frame 6, and the estimated reference data block associated with data block X3 determined from video frame 6 is data block X4; the video frame associated with video frame 6 is video frame 7, and the estimated reference data block associated with data block X4 determined from video frame 7 is data block X5; the video frame associated with video frame 7 is video frame 8, and the estimated reference data block associated with data block X5 determined from video frame 8 is data block X6. Therefore, (S31) data block X5 is pre-analyzed based on the inter-frame cost 16a, the intra-frame cost 16b, and the propagation cost 16c of data block X6 to obtain the propagation cost 15c of data block X5; (S32) data block X4 is pre-analyzed based on the inter-frame cost 15a, the intra-frame cost 15b, and the propagation cost 15c of data block X5 to obtain the propagation cost 14c of data block X4; (S33) data block X3 is pre-analyzed based on the inter-frame cost 14a, the intra-frame cost 14b, and the propagation cost 14c of data block X4 to obtain the propagation cost 13c of data block X3; (S34) data block X2 is pre-analyzed based on the inter-frame cost 13a, the intra-frame cost 13b, and the propagation cost 13c of data block X3 to obtain the propagation cost 12c of data block X2; (S35) data block X1 is pre-analyzed based on the inter-frame cost 12a, the intra-frame cost 12b, and the propagation cost 12c of data block X2 to obtain the propagation cost 11c of data block X1.
S205, determining a reference data block corresponding to the data block j based on the propagation cost of the data block j.
Optionally, for either a chrominance data block or a luminance data block, the corresponding reference data block may be determined through the propagation cost. Alternatively, since a chrominance data block generally contains less noise information, the chrominance data block may be used directly as its corresponding reference data block; that is, the chrominance data blocks contained in the original video frame are directly encoded, decoded, and filtered.
That is, before the data block j is subjected to data analysis processing to obtain the propagation cost of the data block j, it may first be determined whether the data block j is a chrominance data block or a luminance data block. If the data block j is a luminance data block to be encoded, the steps of performing data analysis processing on the data block j to obtain the propagation cost of the data block j (S202-S205) are executed. If the data block j is a chrominance data block to be encoded, the data block j is directly used as the reference data block corresponding to the data block j.
The specific manner of determining the reference data block based on the propagation cost may be referred to the related description of the above embodiment, which is not described herein.
For example, as shown in fig. 13, fig. 13 is a schematic diagram of a scenario of determining a reference data block according to an embodiment of the present application. If the data block j is a chrominance data block, the data block j is directly used as the reference data block of the data block j. If the data block j is a luminance data block, the propagation cost of the data block j is determined (it can be understood that, in determining the propagation cost of the data block j, the related first estimated reference data block, second estimated reference data block and the like are all data blocks of the same type as the data block j, namely luminance data blocks), and the reference data block of the data block j is determined, according to the propagation cost, from the data block j and the denoising data block corresponding to the data block j: if the propagation cost is greater than the propagation cost threshold, the denoising data block corresponding to the data block j is used as the reference data block of the data block j; if the propagation cost is less than or equal to the propagation cost threshold, the data block j is used as the reference data block of the data block j.
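The selection logic shown in fig. 13 can be summarized by the following sketch; the function boundaries and the concrete threshold value are illustrative assumptions:

```python
def select_reference_block(data_block, denoised_block, is_luma, prop_cost, threshold):
    """Returns the reference data block corresponding to one data block j."""
    if not is_luma:
        return data_block      # chrominance block: use the original block directly
    if prop_cost > threshold:  # heavily referenced luminance block: use the denoised block
        return denoised_block
    return data_block          # lightly referenced luminance block: stay close to the original
```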
S206, coding the reference data block to obtain a coded reference data block and coding strategy data associated with the coded reference data block.
S207, pre-decoding the coded reference data block through the coding strategy data to obtain a pre-decoded data block corresponding to the data block j.
S208, when the coding filtering data associated with the coded reference data block is determined through the coded reference data block and the pre-decoding data block, the coding strategy data associated with the coded reference data block and the coding filtering data associated with the coded reference data block are used as the data coding parameters associated with the data block j.
S209, based on the data encoding parameter associated with the data block j, encoding to obtain a video encoding data stream associated with the video frame to be processed. The specific implementation manner of steps S206 to S209 may be referred to the related description of the above embodiments, which is not repeated herein.
The embodiment of the application provides a method for adaptively selecting a reference data block for video coding according to the propagation cost of the current data block. The method comprises the following steps: acquiring a data block j included in a video frame to be processed in a target video, wherein the data block j is any one data block in the video frame to be processed; determining a reference data block corresponding to the data block j based on the propagation cost obtained by performing data analysis processing on the data block j; performing encoding processing on the reference data block to obtain an encoded reference data block and encoding strategy data associated with the encoded reference data block; performing pre-decoding processing on the encoded reference data block through the encoding strategy data to obtain a pre-decoded data block corresponding to the data block j; and, when the encoded filtering data associated with the encoded reference data block is determined through the encoded reference data block and the pre-decoded data block, taking the encoding strategy data and the encoded filtering data as the data encoding parameters associated with the data block j, and encoding to obtain a video encoding data stream associated with the video frame to be processed. That is, a suitable data block can be adaptively selected as the reference data block corresponding to the data block j through the propagation cost of the data block j, and the encoding, decoding and filtering of the data blocks in the video frame to be processed (i.e., the original video frame) are thereby improved, so that the reconstruction effect of the pre-reconstructed data block strikes a balance between staying close to the data block j in the original video frame and reducing noise. The noise in the pre-reconstructed video frame obtained through pre-reconstruction can thus be reduced, which helps improve the encoding efficiency and encoding quality of subsequent video frames that need to reference the video frame to be processed (i.e., its pre-reconstructed video frame) during encoding, and ensures the decoding prediction accuracy of those subsequent video frames at the decoding terminal side; that is, improving the encoding quality of a video frame also improves the decoding and reconstruction effect of the video frame.
Further, referring to fig. 14, fig. 14 is a flowchart of a video data processing method according to an embodiment of the present application, as shown in fig. 14, the method may be performed by the above-mentioned computer device, and the computer device may be a decoding terminal, for example, the decoding terminal may be the service server 100 shown in fig. 1. The method specifically comprises the following steps S301-S304:
S301, acquiring a video coding data stream associated with a video frame to be processed in a target video.
The video coding data stream is transmitted from the encoding terminal to the decoding terminal. The video frames in the target video are all associated with the video coding data stream, and decoding reconstruction of each video frame can be achieved through the video coding data stream, so as to obtain the corresponding reconstructed video frames.
S302, decoding from the video coding data stream to obtain data decoding parameters associated with the data block j.
Wherein the data decoding parameters comprise the coding strategy data and the coding filtering data; the coding strategy data and the coding filtering data are determined when encoding processing is performed on the reference data block corresponding to the data block j; the reference data block is determined through the propagation cost of the data block j obtained by performing data analysis processing on the data block j.
The video encoding data stream may be entropy decoded, so as to sequentially obtain data decoding parameters associated with each data block in the video frame to be processed, and sequentially decode each data block based on the data decoding parameters associated with each data block.
The data decoding parameters are determined by the encoding terminal, entropy encoding is carried out to obtain a video encoding data stream, and then the video encoding data stream is transmitted to the decoding terminal. It can be understood that when the encoding terminal obtains a data block, the encoding terminal determines a corresponding reference data block based on the propagation cost of the data block, and performs encoding processing on the reference data block, that is, the description of the operations of encoding, decoding, filtering, etc. on the data block described in the technical scheme of the present application actually refers to the operations of encoding, decoding, filtering on the reference data block corresponding to the data block. The determination of the data decoding parameters and the reference data blocks may be found in the relevant description of the above embodiments.
S303, decoding the coded reference data block based on the coding strategy data to obtain a decoded data block corresponding to the data block j.
It will be appreciated that the encoding strategy data includes the prediction mode association information and the quantized transform coefficients, and that the decoding process for the encoded reference data block may be: performing inverse quantization transformation processing on the quantized transform coefficients to obtain the residual data information corresponding to the data block j; carrying out data prediction on the data block j based on the prediction mode association information to obtain the predicted data block corresponding to the data block j; and determining the decoded data block corresponding to the data block j through the residual data information and the predicted data block.
It will be appreciated that the prediction mode association information is used to indicate the prediction mode of the data block j, such as intra prediction or inter prediction, together with information related to that prediction mode. For example, motion vectors determined during inter prediction are used to determine, from a reconstructed video frame, the data block referenced by the data block j during encoding, and the determined data block may be used as the predicted data block corresponding to the data block j. For another example, in intra prediction, a data block adjacent to the data block j may be indicated, so that the adjacent data block referenced by the data block j during encoding is determined from the reconstructed data blocks corresponding to the current frame, and the determined data block may be used as the predicted data block corresponding to the data block j.
Meanwhile, the quantized transform coefficients are obtained by the encoding terminal performing quantization transformation processing (or only quantization processing) on the residual data information, so the decoding terminal can perform inverse quantization transformation processing (or only inverse quantization processing) on the quantized transform coefficients to obtain the residual data information, and the superposition result of the residual data information and the predicted data block is used as the decoded data block corresponding to the data block j.
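A minimal sketch of this decoding step is given below; the flat quantization step and the use of a 2-D inverse DCT (via scipy) as the inverse transform are illustrative assumptions, since an actual codec applies its own transform and quantization design:

```python
import numpy as np
from scipy.fft import idctn  # 2-D inverse DCT, used here as a stand-in transform

def decode_data_block(quant_coeffs: np.ndarray, q_step: float,
                      predicted_block: np.ndarray) -> np.ndarray:
    # Inverse quantization followed by the inverse transform recovers the
    # residual data information corresponding to the data block j.
    residual = idctn(quant_coeffs * q_step, norm="ortho")
    # Superpose the residual data information onto the predicted data block.
    return predicted_block + residual
```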
S304, filtering the decoded data block according to the encoded filtering data to obtain a reconstructed data block corresponding to the decoded data block.
Wherein the reconstructed data block is used for performing frame reconstruction on the video frame to be processed; the corresponding operation at the encoding terminal is frame pre-reconstruction.
It will be appreciated that, for each decoded data block obtained, the decoded data block may be filtered according to the encoded filtered data to obtain a corresponding reconstructed data block.
For example, the decoded data block corresponding to the data block j is the decoded data block j, and the filtering process for the decoded data block j may be performed by inputting the decoded data block j and the encoded filtered data associated with the data block j to a loop filter in the decoding terminal, and filtering the decoded data block j by the loop filter in the decoding terminal according to the encoded filtered data associated with the data block j, to obtain the reconstructed data block corresponding to the decoded data block j.
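A minimal sketch of applying the encoded filtering data at the decoding terminal is given below; the 1-D FIR model over the flattened samples is an illustrative assumption, whereas practical in-loop filters operate on a 2-D neighborhood of each sample:

```python
import numpy as np

def apply_encoded_filter(decoded_block: np.ndarray, coeffs) -> np.ndarray:
    """Filters a decoded data block with the transmitted filter coefficients
    (the encoded filtering data) to obtain the reconstructed data block."""
    x = decoded_block.ravel().astype(float)
    # Column k holds the samples delayed by k, so lagged @ coeffs is an FIR filter.
    lagged = np.stack([np.roll(x, k) for k in range(len(coeffs))], axis=1)
    return (lagged @ np.asarray(coeffs, dtype=float)).reshape(decoded_block.shape)
```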
It will be understood that, taking the K to-be-decoded data blocks included in the video frame to be processed as an example, where K is a positive integer greater than 1, when each of the K to-be-decoded data blocks is selected as the data block j, the reconstructed data block corresponding to each to-be-decoded data block may be obtained.
Wherein one data block to be decoded may be a chrominance data block to be decoded or a luminance data block to be decoded. For example, the K data blocks to be decoded include K1 chrominance data blocks to be decoded and K2 luminance data blocks to be decoded; the sum of K1 and K2 equals K, and both are positive integers; K1 may or may not be equal to K2. The data decoding parameters of one data block include the transformation division information.
Thus, performing frame reconstruction on the video frame to be processed through the reconstructed data blocks may be as follows: the division position information of the reconstructed data blocks corresponding to the K1 to-be-decoded chrominance data blocks is determined based on the transformation division information of the K1 to-be-decoded chrominance data blocks, and the decoded chrominance data frame corresponding to the video frame to be processed is reconstructed through the division position information of the reconstructed data blocks corresponding to the K1 to-be-decoded chrominance data blocks; the division position information of the reconstructed data blocks corresponding to the K2 to-be-decoded luminance data blocks is determined based on the transformation division information of the K2 to-be-decoded luminance data blocks, and the decoded luminance data frame corresponding to the video frame to be processed is reconstructed through the division position information of the reconstructed data blocks corresponding to the K2 to-be-decoded luminance data blocks; and the reconstructed video frame corresponding to the video frame to be processed is reconstructed based on the decoded chrominance data frame and the decoded luminance data frame. Wherein the decoded chrominance data frame is the chrominance component data of the reconstructed video frame, and the decoded luminance data frame is the luminance component data of the reconstructed video frame.
For example, as shown in fig. 15, fig. 15 is a schematic diagram of a video frame reconstruction scene according to an embodiment of the present application. One coding block corresponds to one divided area in one video frame. When a coding block contains luminance component data, the coding block is also called a luminance coding block; when a coding block contains chrominance component data, the coding block is also called a chrominance coding block. That is, the data blocks in a video frame include chrominance data blocks and luminance data blocks. For example, after a video frame is transformed and divided into K1 divided areas according to transformation and division manner one, the chrominance component data in the K1 divided areas is extracted, thereby obtaining K1 chrominance coding blocks (a11, a12, ..., a1k1). For another example, after the video frame is transformed and divided into K2 divided areas according to transformation and division manner two, the luminance component data in the K2 divided areas is extracted, thereby obtaining K2 luminance coding blocks (b11, b12, ..., b1k2).
Meanwhile, the propagation costs of the K1 chrominance coding blocks (a11, a12, ..., a1k1) are determined so as to determine the reference coding blocks (a11', a12', ..., a1k1') corresponding to the K1 chrominance coding blocks, and the propagation costs of the K2 luminance coding blocks (b11, b12, ..., b1k2) are determined so as to determine the reference coding blocks (b11', b12', ..., b1k2') corresponding to the K2 luminance coding blocks.
That is, the chrominance coding block a11' is encoded to obtain the associated data decoding parameter A1, the chrominance coding block a12' is encoded to obtain the associated data decoding parameter A2, ..., and the chrominance coding block a1k1' is encoded to obtain the associated data decoding parameter Ak1. Correspondingly, the luminance coding block b11' is encoded to obtain the associated data decoding parameter B1, the luminance coding block b12' is encoded to obtain the associated data decoding parameter B2, ..., and the luminance coding block b1k2' is encoded to obtain the associated data decoding parameter Bk2.
It will be understood that, after the luminance coding blocks and the chrominance coding blocks are decoded (i.e., predicted separately), the reconstructed data blocks corresponding to the K1 chrominance coding blocks may be obtained and spliced (reconstructed) into the chrominance data frame (i.e., the predicted chrominance component data of one video frame) according to the division position information, the reconstructed data blocks corresponding to the K2 luminance coding blocks may be obtained and spliced (reconstructed) into the luminance data frame (i.e., the predicted luminance component data of one video frame) according to the division position information, and the corresponding reconstructed video frame may be determined from the chrominance data frame and the luminance data frame.
That is, during decoding, the chrominance coding block a11' is decoded based on the data decoding parameter A1 to obtain the decoded data block a21 corresponding to the chrominance coding block a11', the chrominance coding block a12' is decoded based on the data decoding parameter A2 to obtain the decoded data block a22 corresponding to the chrominance coding block a12', ..., and the chrominance coding block a1k1' is decoded based on the data decoding parameter Ak1 to obtain the decoded data block a2k1 corresponding to the chrominance coding block a1k1'. Correspondingly, the luminance coding block b11' is decoded based on the data decoding parameter B1 to obtain the decoded data block b21 corresponding to the luminance coding block b11', the luminance coding block b12' is decoded based on the data decoding parameter B2 to obtain the decoded data block b22 corresponding to the luminance coding block b12', ..., and the luminance coding block b1k2' is decoded based on the data decoding parameter Bk2 to obtain the decoded data block b2k2 corresponding to the luminance coding block b1k2'.
Subsequently, during decoding, the decoded data block a21 corresponding to the chrominance coding block a11' is filtered to obtain the reconstructed data block a31 corresponding to the decoded data block a21, the decoded data block a22 corresponding to the chrominance coding block a12' is filtered to obtain the reconstructed data block a32 corresponding to the decoded data block a22, ..., and the decoded data block a2k1 corresponding to the chrominance coding block a1k1' is filtered to obtain the reconstructed data block a3k1 corresponding to the decoded data block a2k1. Correspondingly, the decoded data block b21 corresponding to the luminance coding block b11' is filtered to obtain the reconstructed data block b31 corresponding to the decoded data block b21, the decoded data block b22 corresponding to the luminance coding block b12' is filtered to obtain the reconstructed data block b32 corresponding to the decoded data block b22, ..., and the decoded data block b2k2 corresponding to the luminance coding block b1k2' is filtered to obtain the reconstructed data block b3k2 corresponding to the decoded data block b2k2.
Thus, the decoded chrominance data frame is reconstructed based on the reconstructed data blocks a31 to a3k1, the decoded luminance data frame is reconstructed based on the reconstructed data blocks b31 to b3k2, and the reconstructed video frame is obtained from the decoded chrominance data frame and the decoded luminance data frame.
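The splicing of reconstructed data blocks into data frames described above can be sketched as follows; representing the transformation division information as (y, x, h, w) tuples and the reconstructed video frame as a simple pair of planes are illustrative assumptions:

```python
import numpy as np

def assemble_plane(reconstructed_blocks, positions, height, width):
    """Pastes each reconstructed data block back at its division position to
    reconstruct one component data frame (chrominance or luminance)."""
    plane = np.zeros((height, width))
    for block, (y, x, h, w) in zip(reconstructed_blocks, positions):
        plane[y:y + h, x:x + w] = block
    return plane

def reconstruct_video_frame(chroma_plane, luma_plane):
    # The decoded chrominance data frame and decoded luminance data frame
    # together form the reconstructed video frame.
    return {"chroma": chroma_plane, "luma": luma_plane}
```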
The embodiment of the application provides a method for adaptively selecting a reference data block for video coding according to the propagation cost of the current data block. At the decoding side, the method comprises: decoding a video coding data stream associated with a video frame to be processed in a target video to obtain the data decoding parameters of a data block j, wherein the data block j is any one of the data blocks to be decoded in the video frame to be processed; decoding based on the coding strategy data in the data decoding parameters to obtain the decoded data block corresponding to the data block j; and filtering the decoded data block according to the coding filtering data in the data decoding parameters to obtain the reconstructed data block, wherein the reconstructed data block is used for performing frame reconstruction on the video frame to be processed. The data decoding parameters are determined by encoding the reference data block, and the reference data block is determined based on the propagation cost of the data block j. That is, a suitable data block can be adaptively selected as the reference data block corresponding to the data block j through the propagation cost of the data block j, and the data blocks in the video frame to be processed (namely, the original video frame) are encoded and decoded starting from that reference data block, so that the reconstruction effect of the reconstructed data block strikes a balance between staying close to the data block j in the original video frame and reducing noise. The noise in the reconstructed video frame obtained through reconstruction can thus be reduced, which helps improve the encoding efficiency and encoding quality of subsequent video frames that need to reference the video frame to be processed (namely, its pre-reconstructed video frame) during encoding, and ensures the decoding prediction accuracy of those subsequent video frames at the decoding terminal side; that is, improving the encoding quality of a video frame also improves the decoding and reconstruction effect of the video frame.
Further, referring to fig. 16, fig. 16 is a schematic structural diagram of a video data processing apparatus according to an embodiment of the present application. As shown in fig. 16, the video data processing apparatus 1 is applicable to a computer device. It should be understood that the video data processing apparatus 1 may be a computer program (comprising program code) running in a computer device, for example the video data processing apparatus 1 may be an application software; it will be appreciated that the video data processing apparatus 1 may be adapted to perform the corresponding steps in the method provided by the embodiments of the present application. As shown in fig. 16, the video data processing apparatus 1 may include: a video acquisition module 11, a reference data determination module 12, a data encoding module 13, a data stream encoding module 14; wherein:
a video acquisition module 11, configured to acquire a video frame to be processed in a target video; the data block to be coded included in the video frame to be processed is a data block j; the data block j is any one data block in the video frame to be processed;
the reference data determining module 12 is configured to perform data analysis processing on the data block j to obtain a propagation cost of the data block j, and determine a reference data block corresponding to the data block j based on the propagation cost of the data block j;
A data encoding module 13, configured to perform encoding processing on the reference data block, so as to obtain an encoded reference data block and encoding policy data associated with the encoded reference data block;
the data encoding module 13 is further configured to perform pre-decoding processing on the encoded reference data block through encoding policy data, so as to obtain a pre-decoded data block corresponding to the data block j;
the data encoding module 13 is further configured to, when determining encoded filter data associated with the encoded reference data block from the encoded reference data block and the pre-decoded data block, use the encoded policy data associated with the encoded reference data block and the encoded filter data associated with the encoded reference data block as data encoding parameters associated with the data block j; the encoded filtering data is used for carrying out filtering processing on the pre-decoding data block to obtain a pre-reconstruction data block corresponding to the pre-decoding data block; the pre-reconstruction data block is used for carrying out frame pre-reconstruction on the video frame to be processed;
the data stream encoding module 14 is configured to encode a video encoded data stream associated with the video frame to be processed based on the data encoding parameter associated with the data block j.
Wherein the target video comprises N video frames; n is a positive integer greater than 1; the video frame to be processed is a video frame i in N video frames; i is a positive integer greater than 1 and less than or equal to N; among the N video frames, the video frame positioned behind the video frame i is the video frame to be encoded;
The reference data determination module 12 includes:
an associated data obtaining unit 121, configured to obtain, from the video frames to be encoded, the estimated reference video frames associated with the video frame i; the estimated reference video frames comprise a first estimated reference video frame and a second estimated reference video frame; the first estimated reference video frame is a video frame, determined from the video frames to be encoded, that can reference a data block in the video frame i for data encoding; the second estimated reference video frame is a video frame, determined from the video frames to be encoded, that can reference a data block in the first estimated reference video frame for data encoding;
the associated data obtaining unit 121 is further configured to determine a first estimated reference data block associated with the data block j from the first estimated reference video frame, and determine a second estimated reference data block associated with the first estimated reference data block from the second estimated reference video frame; the first estimated reference data block is a data block which can be subjected to data coding by referring to a data block j in the first estimated reference video frame; the second estimated reference data block is a data block which can refer to the first estimated reference data block in the second estimated reference video frame for data coding;
The pre-analysis unit 122 is configured to perform pre-analysis on the first estimated reference data block based on the propagation cost of the second estimated reference data block to obtain the propagation cost of the first estimated reference data block, and perform pre-analysis on the data block j based on the propagation cost of the first estimated reference data block to obtain the propagation cost of the data block j.
The reference data determining module 12 is specifically further configured to:
and if the data block j is the brightness data block to be encoded, notifying a reference data determining module to execute data analysis processing on the data block j to obtain the propagation cost of the data block j.
The reference data determining module 12 is specifically further configured to:
and if the data block j is the chromaticity data block to be encoded, using the data block j as a reference data block corresponding to the data block j.
Wherein the reference data determining module 12 comprises:
a data block determining unit 123, configured to obtain a denoising data block corresponding to the data block j from a denoising video frame corresponding to the video frame to be processed;
the data block determining unit 123 is further configured to, if the propagation cost of the data block j is greater than the propagation cost threshold, use the denoised data block corresponding to the data block j as the reference data block corresponding to the data block j.
Wherein the reference data determining module 12 comprises:
a data block determining unit 123, configured to obtain a denoising data block corresponding to the data block j from a denoising video frame corresponding to the video frame to be processed;
the data block determining unit 123 is further configured to, if the propagation cost of the data block j is less than or equal to the propagation cost threshold, use the data block j as the reference data block corresponding to the data block j.
Wherein the target video comprises N video frames; n is a positive integer greater than 1; the video frame to be processed is a video frame i in N video frames; i is a positive integer less than or equal to N; among the N video frames, the video frame positioned before the video frame i is an encoded video frame; the reconstruction data buffer associated with the target video is used for buffering a pre-reconstructed video frame corresponding to the encoded video frame; when the video frame i is subjected to coding processing, the video frame which is determined to be referred to is a reference reconstructed video frame from the reconstructed data buffer based on the prediction mode of the video frame i;
the data encoding module 13 includes:
a reference data determining unit 131, configured to determine, from the reference reconstructed video frame, a data block that can be referred to by the reference data block when performing data encoding based on the prediction mode of the video frame i, and take the determined data block that can be referred to by the reference data block when performing data encoding as a target reference data block;
A data encoding unit 132, configured to perform data encoding on the reference data block based on the prediction mode of the video frame i, to obtain residual data information between the target reference data block and the reference data block, and prediction mode association information associated with the reference data block;
a quantization transforming unit 133, configured to perform quantization transforming processing on the residual data information, so as to obtain a quantization transforming coefficient associated with the reference data block;
the policy data determining unit 134 is configured to, when a reference data block associated with prediction mode association information and quantized transform coefficients is used as an encoded reference data block, use the prediction mode association information and quantized transform coefficients associated with the reference data block as encoding policy data associated with the encoded reference data block.
Wherein the data encoding module 13 comprises:
a difference information determining unit 135 for determining associated difference information between the pre-decoded data block and the encoded reference data block;
the filtered data determining unit 136 is configured to determine the encoded filtering data associated with the encoded reference data block through the associated difference information.
The filtering data determining unit 136 specifically is configured to:
determining autocorrelation information of the pre-decoded data block and cross-correlation information between the pre-decoded data block and the encoded reference data block;
The autocorrelation information and the cross-correlation information are used as the associated difference information between the pre-decoded data block and the encoded reference data block.
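A minimal sketch of deriving the encoded filtering data from the associated difference information is given below; it solves the normal equations of the classic Wiener filter, and the 1-D FIR model and tap count are illustrative assumptions:

```python
import numpy as np

def derive_encoded_filter(pre_decoded_block: np.ndarray,
                          encoded_reference_block: np.ndarray,
                          taps: int = 5) -> np.ndarray:
    x = pre_decoded_block.ravel().astype(float)
    d = encoded_reference_block.ravel().astype(float)
    lagged = np.stack([np.roll(x, k) for k in range(taps)], axis=1)
    R = lagged.T @ lagged  # autocorrelation information of the pre-decoded block
    p = lagged.T @ d       # cross-correlation with the encoded reference block
    # Solve R h = p for the filter h that minimizes the mean-squared difference;
    # h is what gets signalled to the decoder as the encoded filtering data.
    return np.linalg.solve(R, p)
```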
Wherein the target video comprises N video frames; n is a positive integer greater than 1; the video frame to be processed is a video frame i in N video frames; i is a positive integer less than or equal to N; the video frames preceding video frame i among the N video frames are encoded video frames; the reconstruction data buffer associated with the target video is used for buffering a pre-reconstructed video frame corresponding to the encoded video frame; the pre-decoding data block corresponding to the data block j is the pre-decoding data block j;
the data encoding module 13 further includes:
a pre-decoding unit 137, configured to input the pre-decoded data block j and the encoded filtered data associated with the data block j into a loop filter in the encoding terminal, and perform filtering processing on the pre-decoded data block j according to the encoded filtered data associated with the data block j by using the loop filter in the encoding terminal to obtain a pre-reconstructed data block corresponding to the pre-decoded data block j;
a pre-reconstruction unit 138, configured to perform frame pre-reconstruction on the video frame i through the pre-reconstruction data block, so as to obtain a pre-reconstructed video frame corresponding to the video frame i;
the pre-reconstruction unit 138 is further configured to add a pre-reconstructed video frame corresponding to the video frame i to the reconstruction data buffer.
The specific implementation manners of the video acquisition module 11, the reference data determination module 12, the data encoding module 13, and the data stream encoding module 14 may be referred to the relevant descriptions in the above embodiments, and will not be further described herein. It should be understood that the description of the beneficial effects obtained by the same method will not be repeated.
Further, referring to fig. 17, fig. 17 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 17, the computer device 1700 may be a service terminal or a server, which is not limited herein. For ease of understanding, the present application takes the computer device being a server as an example, and the computer device 1700 may include: a processor 1701, a network interface 1704 and a memory 1705; in addition, the computer device 1700 may further comprise: a user interface 1703 and at least one communication bus 1702. Wherein the communication bus 1702 is used to enable connected communication between these components. The user interface 1703 may also include a standard wired interface and a wireless interface. The network interface 1704 may optionally include a standard wired interface or a wireless interface (e.g., a WI-FI interface). The memory 1705 may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory. The memory 1705 may optionally also be at least one storage device located remotely from the processor 1701. As shown in fig. 17, the memory 1705, which is a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
Wherein the network interface 1704 in the computer device 1700 can provide a network data interaction function, and the user interface 1703 is primarily an interface for providing input to a user. The processor 1701 may be configured to invoke the device control application stored in the memory 1705 to perform the description of the video data processing method in the embodiments corresponding to fig. 7, fig. 10 and fig. 14, and may also perform the description of the video data processing apparatus 1 in the embodiment corresponding to fig. 16, which is not repeated here. In addition, the description of the beneficial effects of the same method is omitted.
In one possible implementation, the memory 1705 is used to store program instructions. The processor 1701 may call program instructions to perform the steps of:
acquiring a video frame to be processed in a target video; the data block to be coded included in the video frame to be processed is a data block j; the data block j is any one data block in the video frame to be processed;
carrying out data analysis processing on the data block j to obtain the propagation cost of the data block j, and determining a reference data block corresponding to the data block j based on the propagation cost of the data block j;
Encoding the reference data block to obtain an encoded reference data block and encoding strategy data associated with the encoded reference data block;
pre-decoding the coded reference data block through coding strategy data to obtain a pre-decoded data block corresponding to the data block j;
when determining the coding filtering data associated with the coded reference data block through the coded reference data block and the pre-decoding data block, taking the coding strategy data associated with the coded reference data block and the coding filtering data associated with the coded reference data block as the data coding parameters associated with the data block j; the encoded filtering data is used for carrying out filtering processing on the pre-decoding data block to obtain a pre-reconstruction data block corresponding to the pre-decoding data block; the pre-reconstruction data block is used for carrying out frame pre-reconstruction on the video frame to be processed;
based on the data encoding parameters associated with the data block j, encoding results in a video encoded data stream associated with the video frame to be processed.
Wherein the target video comprises N video frames; n is a positive integer greater than 1; the video frame to be processed is a video frame i in N video frames; i is a positive integer greater than 1 and less than or equal to N; among the N video frames, the video frame positioned behind the video frame i is the video frame to be encoded;
The processor 1701 is specifically configured to, when performing data analysis processing on the data block j to obtain a propagation cost of the data block j:
obtaining the estimated reference video frames associated with the video frame i from the video frames to be encoded; the estimated reference video frames comprise a first estimated reference video frame and a second estimated reference video frame; the first estimated reference video frame is a video frame, determined from the video frames to be encoded, that can reference a data block in the video frame i for data encoding; the second estimated reference video frame is a video frame, determined from the video frames to be encoded, that can reference a data block in the first estimated reference video frame for data encoding;
determining a first estimated reference data block associated with the data block j from the first estimated reference video frame, and determining a second estimated reference data block associated with the first estimated reference data block from the second estimated reference video frame; the first estimated reference data block is a data block which can be subjected to data coding by referring to a data block j in the first estimated reference video frame; the second estimated reference data block is a data block which can refer to the first estimated reference data block in the second estimated reference video frame for data coding;
The first estimated reference data block is subjected to pre-analysis processing based on the propagation cost of the second estimated reference data block to obtain the propagation cost of the first estimated reference data block, and the data block j is subjected to pre-analysis processing based on the propagation cost of the first estimated reference data block to obtain the propagation cost of the data block j.
The processor 1701 is further configured to, before performing data analysis processing on the data block j to obtain a propagation cost of the data block j:
and if the data block j is the brightness data block to be encoded, notifying to execute the data analysis processing on the data block j to obtain the propagation cost of the data block j.
Wherein the processor 1701 is further configured to:
and if the data block j is the chromaticity data block to be encoded, using the data block j as a reference data block corresponding to the data block j.
The processor 1701 is specifically configured to, when determining a reference data block corresponding to the data block j based on the propagation cost of the data block j:
acquiring a denoising data block corresponding to a data block j from a denoising video frame corresponding to a video frame to be processed;
and if the propagation cost of the data block j is greater than the propagation cost threshold value, taking the denoising data block corresponding to the data block j as the reference data block corresponding to the data block j.
The processor 1701 is specifically configured to, when determining a reference data block corresponding to the data block j based on the propagation cost of the data block j:
acquiring a denoising data block corresponding to a data block j from a denoising video frame corresponding to a video frame to be processed;
and if the propagation cost of the data block j is smaller than or equal to the propagation cost threshold value, using the data block j as a reference data block corresponding to the data block j.
Wherein the target video comprises N video frames; n is a positive integer greater than 1; the video frame to be processed is a video frame i in N video frames; i is a positive integer less than or equal to N; among the N video frames, the video frame positioned before the video frame i is an encoded video frame; the reconstruction data buffer associated with the target video is used for buffering a pre-reconstructed video frame corresponding to the encoded video frame; when the video frame i is subjected to coding processing, the video frame which is determined to be referred to is a reference reconstructed video frame from the reconstructed data buffer based on the prediction mode of the video frame i;
the processor 1701, when configured to perform encoding processing on the reference data block to obtain an encoded reference data block and encoding policy data associated with the encoded reference data block, is specifically configured to:
determining a data block which can be referred to by a reference data block when data encoding is performed from a reference reconstructed video frame based on a prediction mode of the video frame i, and taking the determined data block which can be referred to by the reference data block when the data encoding is performed as a target reference data block;
Based on the prediction mode of the video frame i, carrying out data coding on the reference data block to obtain residual data information between the target reference data block and the reference data block and prediction mode associated information associated with the reference data block;
carrying out quantization transformation processing on residual data information to obtain quantization transformation coefficients associated with a reference data block;
when a reference data block associated with prediction mode association information and quantized transform coefficients is used as an encoded reference data block, the prediction mode association information and quantized transform coefficients associated with the reference data block are used as encoding strategy data associated with the encoded reference data block.
Wherein the processor 1701, when configured to determine encoded filtered data associated with the encoded reference data block from the encoded reference data block and the pre-decoded data block, is specifically configured to:
determining associated difference information between the pre-decoded data block and the encoded reference data block;
the encoded filtering data associated with the encoded reference data block is determined through the associated difference information.
Wherein the processor 1701, when configured to determine associated difference information between the pre-decoded data block and the encoded reference data block, is specifically configured to:
Determining autocorrelation information of the pre-decoded data block and cross-correlation information between the pre-decoded data block and the encoded reference data block;
the autocorrelation information and the cross-correlation information are used as the associated difference information between the pre-decoded data block and the encoded reference data block.
Wherein the target video comprises N video frames; n is a positive integer greater than 1; the video frame to be processed is a video frame i in N video frames; i is a positive integer less than or equal to N; the video frames preceding video frame i among the N video frames are encoded video frames; the reconstruction data buffer associated with the target video is used for buffering a pre-reconstructed video frame corresponding to the encoded video frame; the pre-decoding data block corresponding to the data block j is the pre-decoding data block j;
the processor 1701 is also configured to:
based on the coding strategy data, pre-decoding the coded reference data block to obtain a pre-decoded data block corresponding to the data block j; the pre-decoding data block corresponding to the data block j is the pre-decoding data block j;
inputting the pre-decoded data block j and the coded and filtered data associated with the data block j into a loop filter in the coding terminal, and performing filtering processing on the pre-decoded data block j by the loop filter in the coding terminal according to the coded and filtered data associated with the data block j to obtain a pre-reconstructed data block corresponding to the pre-decoded data block j;
And carrying out frame pre-reconstruction on the video frame i through the pre-reconstruction data block to obtain a pre-reconstructed video frame corresponding to the video frame i, and adding the pre-reconstructed video frame corresponding to the video frame i into a reconstruction data buffer.
In specific implementation, the device, the processor, the memory, etc. described in the embodiments of the present application may perform the implementation described in the foregoing method embodiments, or may perform the implementation described in the embodiments of the present application, which is not described herein again.
Furthermore, it should be noted here that: the embodiment of the present application further provides a computer-readable storage medium, in which the computer program executed by the video data processing apparatus 1 mentioned above is stored, and the computer program includes computer instructions which, when executed by a processor, can perform the description of the video data processing method in the embodiments corresponding to fig. 7, fig. 10 and fig. 14, and therefore, a detailed description will not be given here. In addition, the description of the beneficial effects of the same method is omitted. For technical details not disclosed in the embodiments of the computer-readable storage medium according to the present application, please refer to the description of the method embodiments of the present application. As an example, the computer instructions may be deployed to be executed on one computing device, or on multiple computing devices at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network, where the multiple computing devices distributed across multiple sites and interconnected by a communication network may constitute a blockchain system.
In addition, it should be noted that: embodiments of the present application also provide a computer program product or computer program that may include computer instructions that may be stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor may execute the computer instructions, so that the computer device performs the foregoing description of the video data processing method in the embodiment corresponding to fig. 7, fig. 10, and fig. 14, and therefore, a detailed description will not be given here. In addition, the description of the beneficial effects of the same method is omitted. For technical details not disclosed in the computer program product or the computer program embodiments according to the present application, reference is made to the description of the method embodiments according to the present application.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present application is not limited by the order of action described, as some steps may be performed in other order or simultaneously according to the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present application.
The steps in the method of the embodiment of the application can be sequentially adjusted, combined and deleted according to actual needs.
The modules in the device of the embodiment of the application can be combined, divided and deleted according to actual needs.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by way of a computer program stored in a computer-readable storage medium, which when executed may comprise the steps of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), or the like.
The foregoing disclosure is illustrative of the present application and is not to be construed as limiting the scope of the application, which is defined by the appended claims.

Claims (13)

1. A method of video data processing, the method comprising:
acquiring a video frame to be processed in a target video; the data block to be coded included in the video frame to be processed is a data block j; the data block j is any one data block in the video frame to be processed;
Performing data analysis processing on the data block j to obtain the propagation cost of the data block j, and determining a reference data block corresponding to the data block j based on the propagation cost of the data block j;
coding the reference data block to obtain a coded reference data block and coding strategy data associated with the coded reference data block;
performing pre-decoding processing on the coded reference data block through the coding strategy data to obtain a pre-decoded data block corresponding to the data block j;
when determining encoded filtered data associated with the encoded reference data block from the encoded reference data block and the pre-decoded data block, taking the encoded policy data associated with the encoded reference data block and the encoded filtered data associated with the encoded reference data block as data encoding parameters associated with the data block j;
and encoding to obtain a video coding data stream associated with the video frame to be processed based on the data encoding parameter associated with the data block j.
2. The method of claim 1, wherein the target video comprises N video frames; n is a positive integer greater than 1; the video frame to be processed is a video frame i in the N video frames; i is a positive integer greater than 1 and less than or equal to N; among the N video frames, the video frame located after the video frame i is a video frame to be encoded;
The data analysis processing is performed on the data block j to obtain the propagation cost of the data block j, including:
obtaining a predicted reference video frame associated with the video frame i from the video frame to be encoded; the estimated reference video frames comprise a first estimated reference video frame and a second estimated reference video frame; the first estimated reference video frame is a video frame which is determined in the video frame to be encoded and can be used for carrying out data encoding by referring to a data block in the video frame i; the second estimated reference video frame is a video frame which is determined in the video frame to be encoded and can be used for carrying out data encoding by referring to the data block in the first estimated reference video frame;
determining a first estimated reference data block associated with the data block j from the first estimated reference video frame, and determining a second estimated reference data block associated with the first estimated reference data block from the second estimated reference video frame; the first estimated reference data block is a data block which can refer to the data block j for data coding in the first estimated reference video frame; the second estimated reference data block is a data block which can refer to the first estimated reference data block in the second estimated reference video frame for data coding;
And pre-analyzing the first estimated reference data block based on the propagation cost of the second estimated reference data block to obtain the propagation cost of the first estimated reference data block, and pre-analyzing the data block j based on the propagation cost of the first estimated reference data block to obtain the propagation cost of the data block j.
3. The method of claim 1, wherein before the data block j is subjected to data analysis processing to obtain the propagation cost of the data block j, the method further comprises:
and if the data block j is the brightness data block to be encoded, notifying to execute the step of carrying out data analysis processing on the data block j to obtain the propagation cost of the data block j.
4. A method according to claim 3, characterized in that the method further comprises:
and if the data block j is the chromaticity data block to be encoded, using the data block j as a reference data block corresponding to the data block j.
5. The method of claim 1, wherein the determining the reference data block corresponding to the data block j based on the propagation cost of the data block j comprises:
Acquiring a denoising data block corresponding to the data block j from a denoising video frame corresponding to the video frame to be processed;
and if the propagation cost of the data block j is greater than the propagation cost threshold value, taking the denoising data block corresponding to the data block j as a reference data block corresponding to the data block j.
6. The method of claim 1, wherein the determining the reference data block corresponding to the data block j based on the propagation cost of the data block j comprises:
acquiring a denoising data block corresponding to the data block j from a denoising video frame corresponding to the video frame to be processed;
and if the propagation cost of the data block j is smaller than or equal to the propagation cost threshold value, using the data block j as a reference data block corresponding to the data block j.
7. The method of claim 1, wherein the target video comprises N video frames; n is a positive integer greater than 1; the video frame to be processed is a video frame i in the N video frames; i is a positive integer less than or equal to N; among the N video frames, the video frame located before the video frame i is an encoded video frame; the reconstruction data buffer associated with the target video is used for buffering a pre-reconstructed video frame corresponding to the encoded video frame; when the video frame i is subjected to coding processing, the video frame which is determined to be referred to is a reference reconstruction video frame from the reconstruction data buffer based on the prediction mode of the video frame i;
The encoding processing of the reference data block to obtain an encoded reference data block and encoding strategy data associated with the encoded reference data block includes:
determining a data block which can be referred to by the reference data block when the data coding is carried out from the reference reconstructed video frame based on the prediction mode of the video frame i, and taking the determined data block which can be referred to by the reference data block when the data coding is carried out as a target reference data block;
based on the prediction mode of the video frame i, carrying out data coding on the reference data block to obtain residual data information between the target reference data block and the reference data block and prediction mode associated information associated with the reference data block;
performing quantization transformation processing on the residual data information to obtain a quantization transformation coefficient associated with the reference data block;
when the reference data block associated with the prediction mode association information and the quantized transform coefficient is taken as the encoded reference data block, the prediction mode association information and the quantized transform coefficient associated with the reference data block are taken as encoding strategy data associated with the encoded reference data block.
8. The method of claim 1, wherein the determining the encoded filtered data associated with the encoded reference data block through the encoded reference data block and the pre-decoded data block comprises:
determining associated difference information between the pre-decoded data block and the encoded reference data block;
and determining the encoded filtered data associated with the encoded reference data block through the associated difference information.
9. The method of claim 8, wherein the determining the associated difference information between the pre-decoded data block and the encoded reference data block comprises:
determining autocorrelation information of the pre-decoded data block, and cross-correlation information between the pre-decoded data block and the encoded reference data block;
and taking the autocorrelation information and the cross-correlation information as the associated difference information between the pre-decoded data block and the encoded reference data block.
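Claims 8-9 read like a Wiener-filter derivation, where the autocorrelation of the pre-decoded samples and their cross-correlation with target samples yield filter coefficients via the normal equations R w = p. The sketch below assumes exactly that interpretation, with the encoded reference data block available as a sample array acting as the filter target; the tap size and regularization term are arbitrary choices, not from the patent.

```python
import numpy as np

def derive_encoded_filter_data(pre_decoded: np.ndarray,
                               target: np.ndarray,
                               taps: int = 3) -> np.ndarray:
    """Collect taps x taps neighbourhoods of the pre-decoded block, form
    the autocorrelation matrix R and cross-correlation vector p against
    the target samples, and solve R w = p for the filter coefficients."""
    h, w = pre_decoded.shape
    r = taps // 2
    rows, tgt = [], []
    for y in range(r, h - r):
        for x in range(r, w - r):
            rows.append(pre_decoded[y - r:y + r + 1, x - r:x + r + 1].ravel())
            tgt.append(target[y, x])
    A = np.asarray(rows, dtype=np.float64)
    t = np.asarray(tgt, dtype=np.float64)
    R = A.T @ A                      # autocorrelation information
    p = A.T @ t                      # cross-correlation information
    # small ridge term keeps R invertible on flat blocks
    return np.linalg.solve(R + 1e-6 * np.eye(R.shape[0]), p)
```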
10. The method of claim 1, wherein the target video comprises N video frames, N being a positive integer greater than 1; the video frame to be processed is a video frame i among the N video frames, i being a positive integer less than or equal to N; among the N video frames, the video frames located before the video frame i are encoded video frames; the reconstruction data buffer associated with the target video is used for buffering pre-reconstructed video frames corresponding to the encoded video frames; and the pre-decoded data block corresponding to the data block j is a pre-decoded data block j;
the method further comprises:
inputting the pre-decoded data block j and the encoded filtered data associated with the data block j into a loop filter in the encoding terminal, and performing, by the loop filter in the encoding terminal, filtering processing on the pre-decoded data block j according to the encoded filtered data associated with the data block j, to obtain a pre-reconstructed data block corresponding to the pre-decoded data block j;
performing frame pre-reconstruction on the video frame i through the pre-reconstructed data block to obtain a pre-reconstructed video frame corresponding to the video frame i;
and adding the pre-reconstructed video frame corresponding to the video frame i to the reconstruction data buffer.
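The claim-10 flow — filter each pre-decoded block with its encoded filtered data, stitch the results into a pre-reconstructed frame, and append that frame to the reconstruction data buffer — might look like the following. The FIR interpretation of the filtered data and every signature here are assumptions continuing the sketches above.

```python
import numpy as np

def apply_loop_filter(pre_decoded: np.ndarray,
                      coeffs: np.ndarray,
                      taps: int = 3) -> np.ndarray:
    """Apply derived coefficients as a small 2-D FIR filter (borders copied)."""
    h, w = pre_decoded.shape
    r = taps // 2
    out = pre_decoded.astype(np.float64).copy()
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch = pre_decoded[y - r:y + r + 1, x - r:x + r + 1].ravel()
            out[y, x] = patch @ coeffs
    return out

def pre_reconstruct_frame(blocks, frame_shape, block_size, recon_buffer):
    """blocks maps (y, x) -> (pre-decoded block, filter coefficients).
    Filter every block, assemble the pre-reconstructed frame, then make
    it available in the buffer as a reference for later frames."""
    frame = np.zeros(frame_shape)
    for (y, x), (pre_dec, coeffs) in blocks.items():
        frame[y:y + block_size, x:x + block_size] = apply_loop_filter(pre_dec, coeffs)
    recon_buffer.append(frame)
    return frame
```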
11. A video data processing apparatus, the apparatus comprising:
the video acquisition module is used for acquiring a video frame to be processed in a target video; a data block to be encoded included in the video frame to be processed is a data block j; and the data block j is any one data block in the video frame to be processed;
the reference data determining module is used for carrying out data analysis processing on the data block j to obtain the propagation cost of the data block j, and determining a reference data block corresponding to the data block j based on the propagation cost of the data block j;
the data encoding module is used for performing encoding processing on the reference data block to obtain an encoded reference data block and encoding strategy data associated with the encoded reference data block;
the data encoding module is further configured to perform pre-decoding processing on the encoded reference data block according to the encoding strategy data, so as to obtain a pre-decoded data block corresponding to the data block j;
the data encoding module is further configured to, when the encoded filtered data associated with the encoded reference data block is determined through the encoded reference data block and the pre-decoded data block, take the encoding strategy data associated with the encoded reference data block and the encoded filtered data associated with the encoded reference data block as the data encoding parameters associated with the data block j;
and the data stream encoding module is used for performing encoding based on the data encoding parameters associated with the data block j to obtain an encoded video data stream associated with the video frame to be processed.
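As a usage illustration, the claim-11 modules can be wired into one per-frame loop. The helpers here are deliberately trivial stand-ins (variance as a propagation-cost proxy, flat quantization, a DC-offset "filter") so the sketch runs on its own; none of them is the patent's actual computation.

```python
import numpy as np

def propagation_cost(block):                    # reference data determining module
    return float(np.var(block))                 # stand-in: variance as a cost proxy

def select_reference(block, denoised, cost, thr=50.0):
    return denoised if cost > thr else block

def encode_block(ref_block, qstep=16.0):        # data encoding module (stand-in)
    return np.round(ref_block / qstep)

def pre_decode_block(q_coeffs, qstep=16.0):
    return q_coeffs * qstep

def derive_filter_data(pre_dec, ref_block):
    return float(np.mean(ref_block - pre_dec))  # stand-in: DC-offset "filter"

def process_frame(frame_blocks, denoised_blocks):
    """One block -> (encoding strategy data, encoded filtered data); the
    collected pairs are the data encoding parameters that a data stream
    encoding module would entropy-code into the encoded video data stream."""
    params = []
    for block, denoised in zip(frame_blocks, denoised_blocks):
        ref = select_reference(block, denoised, propagation_cost(block))
        q = encode_block(ref)
        filt = derive_filter_data(pre_decode_block(q), ref)
        params.append((q, filt))
    return params
```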
12. A computer device comprising a memory and a processor;
wherein the memory is connected to the processor, the memory is used for storing a computer program, and the processor is used for invoking the computer program to cause the computer device to perform the method of any one of claims 1-10.
13. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program adapted to be loaded and executed by a processor, to cause a computer device having the processor to perform the method of any one of claims 1-10.
CN202311184821.0A 2023-09-13 2023-09-13 Video data processing method, device, equipment and medium Pending CN117119182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311184821.0A CN117119182A (en) 2023-09-13 2023-09-13 Video data processing method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN117119182A true CN117119182A (en) 2023-11-24

Family

ID=88796396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311184821.0A Pending CN117119182A (en) 2023-09-13 2023-09-13 Video data processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117119182A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination