CN116074528A - Video coding method and device, and coding information scheduling method and device - Google Patents

Video coding method and device, and coding information scheduling method and device

Info

Publication number
CN116074528A
Authority
CN
China
Prior art keywords
video
frame
discarded
video frame
priority coefficient
Prior art date
Legal status
Pending
Application number
CN202111277292.XA
Other languages
Chinese (zh)
Inventor
黄剑飞
Current Assignee
Beijing Ape Power Future Technology Co Ltd
Original Assignee
Beijing Ape Power Future Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Ape Power Future Technology Co Ltd filed Critical Beijing Ape Power Future Technology Co Ltd
Priority to CN202111277292.XA priority Critical patent/CN116074528A/en
Publication of CN116074528A publication Critical patent/CN116074528A/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N 19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N 19/46 Embedding additional information in the video signal during the compression process

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present specification provides a video coding method and apparatus, and an encoded-information scheduling method and apparatus. The video coding method includes: acquiring a video to be encoded, and encoding it into a base layer and at least one enhancement layer to obtain initial coding information of the video; determining, from the video data, a discard priority coefficient for each video frame of the video; and adding each frame's priority coefficient to the initial coding information to obtain the target coding information. Because the discard priority coefficient of every frame is embedded in the coding information, when frame data in an enhancement layer must be discarded during subsequent scheduling of the coding information, enhancement-layer frames can be discarded selectively according to the coefficients determined at encoding time, preserving video quality and improving the user experience.

Description

Video coding method and device, and coding information scheduling method and device
Technical Field
The present disclosure relates to the field of video processing technologies, and in particular, to a video encoding method. The present specification also relates to a video encoding apparatus, an encoded information scheduling method, an encoded information scheduling apparatus, a computing device, and a computer-readable storage medium.
Background
With the rapid development of computer and network technology, videos of every kind have proliferated, and video coding technology has advanced accordingly. As video coding has matured, user demands have diversified: coding must not only be efficient, but its output must also suit a variety of application scenarios. Scalable Video Coding (SVC) arose to meet this need. SVC can encode a video along three dimensions, namely time, space, and quality, producing coding information that is scalable in all three, so as to satisfy the requirements that network transmission rates and end users place on the temporal, spatial, and signal-to-noise characteristics of the video.
In SVC, the lowest-quality layer is called the base layer, and each layer that enhances spatial resolution, temporal resolution, or signal-to-noise ratio is called an enhancement layer. Currently, if frame data in an enhancement layer must be discarded during transmission or decoding, the entire enhancement layer is often dropped outright, which can degrade video quality and harm the user experience.
Disclosure of Invention
In view of this, the present embodiments provide a video encoding method. The present disclosure also relates to a video encoding apparatus, an encoded information scheduling method, an encoded information scheduling apparatus, a computing device, and a computer-readable storage medium, for solving the technical drawbacks of the prior art.
According to a first aspect of embodiments of the present specification, there is provided a video encoding method, including:
acquiring a video to be encoded, and encoding the video to be encoded into a base layer and at least one enhancement layer to obtain initial encoding information of the video to be encoded;
determining, according to the video data of the video to be encoded, a discard priority coefficient for each video frame of the video to be encoded, the priority coefficient representing the probability that the corresponding video frame is discarded;
and adding the priority coefficient of each video frame to the initial coding information of the video to be coded to obtain target coding information of the video to be coded.
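The three steps of this first aspect can be sketched in Python. This is purely illustrative: `CodingInfo`, `encode_svc`, and `compute_priority` are hypothetical names with stand-in logic, not the patent's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class CodingInfo:
    """Target coding information: layers plus per-frame discard priorities."""
    base_layer: list
    enhancement_layers: list
    priorities: dict = field(default_factory=dict)  # frame index -> priority coefficient

def encode_svc(frames, num_enhancement_layers=1):
    # Stand-in for a real SVC encoder: every other frame goes to the base
    # layer, the rest to the enhancement layer (a crude temporal layering).
    base = frames[::2]
    enh = [frames[1::2] for _ in range(num_enhancement_layers)]
    return CodingInfo(base_layer=base, enhancement_layers=enh)

def compute_priority(frames, i):
    # Stand-in priority: frames that barely change relative to their
    # predecessor get a high coefficient (more likely to be discarded).
    # Frames are reduced to single brightness values for the sketch.
    if i == 0:
        return 0.0  # never prefer dropping the first frame
    change = abs(frames[i] - frames[i - 1])
    return 1.0 / (1.0 + change)

def encode_video(frames):
    info = encode_svc(frames)                          # step 1: layered encoding
    info.priorities = {i: compute_priority(frames, i)  # step 2: per-frame priority
                       for i in range(len(frames))}    # step 3: embed in coding info
    return info

info = encode_video([10, 10, 12, 30, 30])
print(info.priorities[1] > info.priorities[3])  # static frame drops first: True
```

The point of the sketch is only the data flow: the priority coefficients travel inside the coding information, so a downstream scheduler needs no access to the raw video.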
According to a second aspect of embodiments of the present specification, there is provided a method for scheduling encoded information, including:
acquiring and analyzing target coding information of a video to be decoded to obtain a base layer, at least one enhancement layer and a priority coefficient of each video frame of the video to be decoded, wherein the priority coefficient is used for representing the probability of discarding the corresponding video frame;
determining, according to the current decoding conditions fed back by the decoding end and the priority coefficients, the video frames to be discarded from the at least one enhancement layer;
and taking the rest video frames except the video frames to be discarded in the base layer and at least one enhancement layer as decoding information of the video to be decoded, and scheduling the decoding information to the decoding end.
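The scheduling logic of this second aspect can be sketched as follows; the names are hypothetical, and a simple drop budget stands in for the "current decoding conditions fed back by the decoding end":

```python
def schedule(base_frames, enhancement_frames, priorities, drop_budget):
    """Drop up to `drop_budget` enhancement-layer frames, highest discard
    priority coefficient (most expendable) first; the base layer is always kept."""
    # Rank enhancement frames by descending discard priority
    ranked = sorted(enhancement_frames, key=lambda f: priorities[f], reverse=True)
    to_drop = set(ranked[:drop_budget])
    kept_enh = [f for f in enhancement_frames if f not in to_drop]
    # Decoding information: base layer plus surviving enhancement frames
    return base_frames + kept_enh

base = [0, 2, 4]
enh = [1, 3, 5]
prio = {1: 0.9, 3: 0.1, 5: 0.8}  # frames 1 and 5 barely change
print(schedule(base, enh, prio, drop_budget=2))  # → [0, 2, 4, 3]
```

Note how the drastically changing frame 3 (low coefficient) survives while the near-static frames 1 and 5 are dropped, which is exactly the selective behaviour the method aims for instead of dropping a whole enhancement layer.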
According to a third aspect of embodiments of the present specification, there is provided a video encoding apparatus comprising:
the coding module is configured to acquire a video to be coded, code the video to be coded into a base layer and at least one enhancement layer, and acquire initial coding information of the video to be coded;
a first determining module configured to determine, according to video data of the video to be encoded, a priority coefficient of each video frame of the video to be encoded being discarded, the priority coefficient being used to represent a probability of the corresponding video frame being discarded;
the adding module is configured to add the priority coefficient of each video frame to the initial coding information of the video to be coded to obtain target coding information of the video to be coded.
According to a fourth aspect of embodiments of the present specification, there is provided an encoded information scheduling apparatus comprising:
the acquisition module is configured to acquire and analyze target coding information of the video to be decoded to obtain a base layer, at least one enhancement layer and a priority coefficient of each video frame of the video to be decoded, wherein the priority coefficient is used for representing the probability of discarding the corresponding video frame;
the second determining module is configured to determine, according to the current decoding conditions fed back by the decoding end and the priority coefficients, the video frames to be discarded from the at least one enhancement layer;
and the scheduling module is configured to take the rest video frames except the video frames to be discarded in the base layer and at least one enhancement layer as decoding information of the video to be decoded, and schedule the decoding information to the decoding end.
According to a fifth aspect of embodiments of the present specification, there is provided a computing device comprising:
a memory and a processor;
the memory is used for storing computer executable instructions and the processor is used for executing the computer executable instructions to implement the steps of any video encoding method or encoded information scheduling method.
According to a sixth aspect of embodiments of the present specification, there is provided a computer readable storage medium storing computer executable instructions which, when executed by a processor, implement the steps of any video encoding method or encoded information scheduling method.
The video coding method provided by the specification can firstly obtain the video to be coded, and code the video to be coded into a base layer and at least one enhancement layer to obtain the initial coding information of the video to be coded; then determining a priority coefficient of each video frame of the video to be encoded, which is used for representing the probability of discarding the corresponding video frame, according to the video data of the video to be encoded; then, the priority coefficient of each video frame can be added into the initial coding information of the video to be coded, so as to obtain the target coding information of the video to be coded.
In this case, after the video to be encoded is encoded into a base layer and at least one enhancement layer, a discard priority coefficient is additionally determined for each video frame from the specific video data. The coefficient reflects the frame's influence on video quality and is added to the coding information. If frame data in an enhancement layer must later be discarded during scheduling of the coding information, enhancement-layer frames can be dropped selectively, in order of the coefficients determined at encoding time. Frame data with little influence on video quality is therefore discarded first, preserving video quality and improving the user experience.
The method for scheduling the encoding information provided by the specification can acquire and analyze target encoding information of the video to be decoded first to obtain a base layer, at least one enhancement layer and a priority coefficient of each video frame of the video to be decoded, wherein the priority coefficient is used for representing the probability of discarding the corresponding video frame; then, determining the video frame to be discarded in at least one enhancement layer according to the current decoding condition and the priority coefficient fed back by the decoding end; and then taking the rest video frames except the video frames to be discarded in the base layer and at least one enhancement layer as decoding information of the video to be decoded, and scheduling the decoding information to a decoding end.
In this case, besides the base layer and the at least one enhancement layer obtained by encoding, the target coding information of the video to be decoded carries a discard priority coefficient determined for each video frame from the specific video data during encoding. Because the coefficient reflects each frame's influence on video quality, when frame data in an enhancement layer must be discarded during scheduling, enhancement-layer frames can be dropped selectively according to these coefficients, so that frames with little influence on quality are discarded first, preserving video quality and improving the user experience.
Drawings
Fig. 1 is a flowchart of a video encoding method according to an embodiment of the present disclosure;
FIG. 2 is a flowchart of a method for scheduling encoded information according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of time domain coding according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of spatial/quality domain coding according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a video encoding device according to an embodiment of the present disclosure;
Fig. 6 is a schematic structural diagram of an encoded information scheduling apparatus according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of a computing device according to one embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present specification. However, the specification can be implemented in many other ways than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; the specification is therefore not limited to the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms, which are only used to distinguish one type of information from another. For example, without departing from the scope of one or more embodiments of the present specification, "first" may also be referred to as "second", and similarly "second" as "first". Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
First, terms related to one or more embodiments of the present specification will be explained.
Scalable Video Coding (SVC): a video coding technique and standard, generally divided into temporal, spatial, and quality scalability. At encoding time, the video is output as a base layer plus enhancement layers. At decoding time, the base layer alone yields basic video content at a low bit rate, and adding more enhancement layers yields progressively higher video quality. With this technique, a single encoding pass produces a flexibly configurable video bitstream suited to different network bandwidths and playback scenarios.
Time domain: also called the temporal domain; the independent variable is time, i.e. the horizontal axis is time and the vertical axis is the signal's variation. In a video sequence, it refers to the succession of images. Spatial domain: also called the pixel domain; spatial-domain processing operates at the pixel level and refers to single-frame image information in the video sequence.
In the present specification, a video encoding method is provided, and the present specification relates to a video encoding apparatus, an encoded information scheduling method, an encoded information scheduling apparatus, a computing device, and a computer-readable storage medium, one by one, which are described in detail in the following embodiments.
Fig. 1 shows a flowchart of a video encoding method according to an embodiment of the present disclosure, which specifically includes the following steps:
step 102: and obtaining the video to be encoded, and encoding the video to be encoded into a base layer and at least one enhancement layer to obtain initial encoding information of the video to be encoded.
It should be noted that, the video to be encoded may refer to a video to be encoded by scalable video, where the video to be encoded may be a video in any field and any scene, and the video to be encoded may refer to a video locally stored in an encoding device, or may be a video acquired from another device, for example, the video to be encoded may be a chat video acquired in real time in a chat process, or a live video acquired in real time in a live broadcast process, or may also be a recorded video acquired in advance by the encoding device and stored locally, or the like.
In the present description, after the encoding device obtains the video to be encoded, a scalable video encoding manner may be adopted to encode the obtained video to be encoded into a base layer and at least one enhancement layer, the layered information obtained by the scalable video encoding manner is used as initial encoding information of the video to be encoded, and then priority information of discarding the video frame is added to be used as final encoding information of the video to be encoded.
In an optional implementation manner of this embodiment, the video to be encoded may be encoded according to a time domain, a space domain and/or a quality domain to obtain the base layer and at least one enhancement layer, that is, the video to be encoded is encoded into the base layer and at least one enhancement layer, so as to obtain initial encoding information of the video to be encoded, which may be specifically implemented as follows:
and according to the time domain information, the spatial domain information and/or the quality domain information of the video to be encoded, the video to be encoded is encoded into a base layer and at least one enhancement layer, and initial encoding information of the video to be encoded is obtained.
It should be noted that scalable video coding can encode the video along three dimensions, time, space, and quality, yielding coding information that is scalable in all three; at encoding time, the video is output as a base layer plus several enhancement layers. In a specific encoding, any desired subset of the three dimensions may be selected, i.e. at least one of time, space, and quality. At decoding time, the base layer alone yields basic video content at a low bit rate, and adding more enhancement layers yields higher video quality, so a single encoding pass produces a flexibly configurable bitstream for different network bandwidths and playback scenarios.
In actual implementations, temporal scalability can be achieved with hierarchical bi-directional (B-frame) prediction; spatial scalability with layered coding that reuses inter-layer motion, texture, and residual information; and signal-to-noise-ratio scalability with coarse-grain scalability or medium-grain scalability, both of which use inter-layer prediction similar to spatial scalability.
According to the method and the device, the video to be coded can be coded into a base layer and at least one enhancement layer according to the time domain information, the space domain information and/or the quality domain information of the video to be coded, so that initial coding information of the video to be coded is obtained, the video can be coded from three aspects of time, space and quality respectively, and the coding information with scalable time, space and quality is provided to meet the requirements of network transmission rate and end users on the aspects of time, space, signal to noise ratio and the like of the video.
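As an illustration of the temporal dimension only (an assumption made for this sketch; real encoders assign layers via hierarchical B-frame prediction structures), frames in a group of pictures can be mapped to a base layer and enhancement layers by their position:

```python
def temporal_layer(frame_index, num_layers=3):
    """Map a frame index to its temporal layer: layer 0 (base) holds every
    2**(num_layers-1)-th frame; each higher layer doubles the frame rate."""
    for layer in range(num_layers):
        if frame_index % (2 ** (num_layers - 1 - layer)) == 0:
            return layer
    return num_layers - 1

# An 8-frame group of pictures with 3 temporal layers:
print([temporal_layer(i) for i in range(8)])  # → [0, 2, 1, 2, 0, 2, 1, 2]
```

Decoding only layer 0 plays the video at a quarter of the full frame rate; adding layer 1 doubles it, and adding layer 2 restores the full rate, which is the scalability property described above.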
Step 104: according to the video data of the video to be encoded, determining a priority coefficient of each video frame of the video to be encoded, wherein the priority coefficient is used for representing the probability of discarding the corresponding video frame.
Specifically, after obtaining the video to be encoded and encoding it into a base layer and at least one enhancement layer to obtain the initial coding information, a discard priority coefficient is further determined for each video frame from the video data, the coefficient representing the probability that the corresponding frame is discarded. The coefficient may be directly or inversely proportional to that probability: either a higher coefficient means the frame is more likely to be discarded, or a higher coefficient means it is less likely to be discarded.
It should be noted that, the smaller the influence of the video frame on the video quality of the complete video, that is, the smaller the influence on the experience of the user for watching the video, the greater the probability of discarding the video frame; the larger the influence of the video frame on the video quality of the complete video, namely the larger the influence on the experience of the user for watching the video, the smaller the probability of discarding the video frame, so that the influence of the video frame on the video quality of the complete video can be determined by analyzing specific video content included in the video frame according to the video data of the video to be encoded, and the priority coefficient of discarding the video frame can be determined.
In practical applications, assume the priority coefficient is directly proportional to the discard probability, i.e. the higher the coefficient, the more likely the corresponding frame is to be discarded. Then the smaller a frame's influence on the quality of the complete video, the higher its discard priority coefficient should be set; the larger its influence, the lower the coefficient. The coefficient thus expresses the frame's influence on overall video quality, that is, on the user's viewing experience, and thereby the probability that the frame may later be discarded.
In the present disclosure, after encoding a video to be encoded into a base layer and at least one enhancement layer, a priority coefficient for discarding each video frame may be determined additionally according to specific video data, where the priority coefficient may represent an influence of the video frame on video quality, that is, a probability that the video frame may be subsequently discarded.
In an optional implementation manner of this embodiment, for the case where the initial coding information is obtained by coding based on the time domain information of the video to be coded, that is, the case where one base layer and at least one enhancement layer are obtained by coding based on the time domain information of the video to be coded, the priority coefficient of the video frame may be determined according to the change between the two video frames, that is, the priority coefficient of each video frame included in the video to be coded may be determined according to the video data of the video to be coded, and the specific implementation process may be as follows:
determining the inter-frame variation amplitude between the current video frame and the previous video frame of the video to be encoded;
and determining the priority coefficient of the current video frame according to the inter-frame variation amplitude.
It should be noted that the larger the inter-frame variation between the current frame and the previous frame, the more drastic the change, i.e. the less similar the two frames are. Discarding such a frame would produce a visible jump in the decoded and played video and hurt the viewing experience; in other words, a drastically changing frame has a large influence on the quality of the complete video, so the probability of discarding it should be small.
Conversely, the smaller the inter-frame variation, the less obvious the change, i.e. the more similar the two frames are. Discarding the later of two similar frames causes no noticeable change in the decoded video and does not affect the viewing experience; a gently changing frame has little influence on overall quality, so the probability of discarding it should be large.
In this specification, when the initial coding information is obtained by encoding the time-domain information of the video, the inter-frame variation amplitude between the current frame and the previous frame can be determined, and the current frame's priority coefficient derived from it. The coefficient then reflects the degree of change between the two frames, so the effect of discarding the current frame on overall video quality can be judged from it: frames with small temporal change are discarded first and drastically changing frames are retained, making playback smoother.
In an optional implementation manner of this embodiment, the inter-frame variation amplitude between the current video frame and the previous video frame may be determined according to the pixel difference between the current video frame and the previous video frame, that is, the inter-frame variation amplitude between the current video frame and the previous video frame of the video to be encoded may be determined, and the specific implementation process may be as follows:
determining a pixel difference value between each pixel point in the current video frame and the pixel point in the previous video frame;
determining a first pixel average value and a first pixel standard deviation according to pixel difference values between each pixel point in a current video frame and each pixel point in a previous video frame;
the first pixel average value and the first pixel standard deviation are used as the inter-frame variation amplitude.
It should be noted that a single video frame is an image composed of many pixels. For each pixel in the current frame, a large difference from the corresponding pixel in the previous frame indicates a large change at that position. The pixel-by-pixel differences between the two frames therefore characterize how much the current frame has changed relative to the previous one, i.e. the inter-frame variation amplitude.
In practical applications, the difference between each pixel point in the current video frame and each pixel point in the previous video frame can be represented by the first pixel average value and the first pixel standard deviation between each pixel point in the current video frame and each pixel point in the previous video frame, that is, the first pixel average value and the first pixel standard deviation are used as the inter-frame variation amplitude, so that the variation degree of the current video frame compared with the previous video frame is reflected.
Specifically, the inter-frame variation amplitude between the current video frame and the previous video frame can be determined by the following formulas (1) and (2):
TI_std = std_space[M_n(i, j)]    (1)
TI_avg = avg_space[M_n(i, j)]    (2)
where M_n(i, j) = F_n(i, j) - F_{n-1}(i, j) is the pixel difference between the current frame F_n and the previous frame F_{n-1} at pixel (i, j); std_space denotes the standard deviation of M_n over all pixels (i, j) of the whole frame, and avg_space denotes the average of M_n over the same pixels. TI_std is the first pixel standard deviation and TI_avg is the first pixel average value; together, TI_std and TI_avg constitute the inter-frame variation amplitude between the current video frame and the previous video frame.
According to the method and the device, the change of the pixel value of each pixel point in the current video frame compared with the corresponding pixel point in the previous video frame can be comprehensively analyzed, so that the change degree of the current video frame relative to the previous video frame is represented by the per-pixel differences between the two frames, making the process of determining the inter-frame variation amplitude simple and accurate. Moreover, since the first pixel average value and the first pixel standard deviation are jointly used as the inter-frame variation amplitude, the accuracy of determining the inter-frame variation amplitude of the current video frame relative to the previous video frame is improved, which in turn improves the accuracy of the subsequently determined priority coefficient and guarantees the subsequent video quality.
In an optional implementation manner of this embodiment, a plurality of classification threshold ranges may be preset for the first pixel average value and the first pixel standard deviation, so as to determine the priority coefficient of the current video frame according to the corresponding classification threshold range, that is, determine the priority coefficient of the current video frame according to the inter-frame variation amplitude, and specifically implement the following process:
and determining the priority coefficient of the current video frame according to the first pixel average value and the corresponding multiple grading threshold ranges and the first pixel standard deviation and the corresponding multiple grading threshold ranges.
It should be noted that, for the first pixel average value, a plurality of corresponding classification threshold ranges may be preset and used to determine the threshold range in which the first pixel average value falls; for example, the plurality of classification threshold ranges corresponding to the first pixel average value may be: greater than average value threshold 0; less than average value threshold 0 and greater than average value threshold 1; less than average value threshold 1 and greater than average value threshold 2; and so on. A corresponding plurality of classification threshold ranges may likewise be preset for the first pixel standard deviation and used to determine the threshold range in which the first pixel standard deviation falls; for example, the plurality of classification threshold ranges corresponding to the first pixel standard deviation may be: greater than standard deviation threshold 0; less than standard deviation threshold 0 and greater than standard deviation threshold 1; less than standard deviation threshold 1 and greater than standard deviation threshold 2; and so on. Here, average value threshold 0, average value threshold 1, average value threshold 2, standard deviation threshold 0, standard deviation threshold 1 and standard deviation threshold 2 merely number the thresholds and do not represent specific numerical values.
In an optional implementation manner of this embodiment, the determining the priority coefficient of the current video frame according to the first pixel average value and the corresponding multiple classification threshold ranges, and the first pixel standard deviation and the corresponding multiple classification threshold ranges may be implemented as follows:
and if the first pixel average value is in the first threshold range and/or the first pixel standard deviation is in the second threshold range, determining the priority coefficient corresponding to the first threshold range and/or the second threshold range as the priority coefficient of the current video frame.
It should be noted that, in the multiple classification threshold ranges corresponding to the first pixel average value, each classification threshold range is provided with a corresponding priority coefficient, and in the multiple classification threshold ranges corresponding to the first pixel standard deviation, each classification threshold range is also provided with a corresponding priority coefficient. The first threshold range represents a threshold range in which the first pixel average value is located, the second threshold range represents a threshold range in which the first pixel standard deviation is located, the first threshold range and the second threshold range are corresponding threshold ranges, that is, the first threshold range and the second threshold range correspond to the same priority coefficient, that is, the plurality of classification threshold ranges corresponding to the first pixel average value correspond to the plurality of classification threshold ranges corresponding to the first pixel standard deviation, and the group of threshold ranges correspond to the same priority coefficient.
In practical application, the priority coefficient of the current video frame may be determined based on the and logic, that is, the first pixel average value is located in the first threshold range, and the first pixel standard deviation is located in the second threshold range, where the priority coefficients corresponding to the first threshold range and the second threshold range may be determined as the priority coefficient of the current video frame. Alternatively, the priority coefficient of the current video frame may be determined based on or logic, that is, the first pixel average value is located in the first threshold range, or the first pixel standard deviation is located in the second threshold range, where the priority coefficient corresponding to the first threshold range or the second threshold range may be determined as the priority coefficient of the current video frame.
For example, if TI_std > standard deviation threshold 0, or TI_avg > average value threshold 0, the priority coefficient of the current video frame is the priority coefficient corresponding to standard deviation threshold 0 or average value threshold 0, where the priority coefficients corresponding to standard deviation threshold 0 and average value threshold 0 are the same; if both are 0, the priority coefficient of the current video frame is 0. If standard deviation threshold 1 < TI_std <= standard deviation threshold 0, or average value threshold 1 < TI_avg <= average value threshold 0, the priority coefficient of the current video frame is the priority coefficient corresponding to standard deviation threshold 1 or average value threshold 1, where the priority coefficients corresponding to standard deviation threshold 1 and average value threshold 1 are the same; if both are 1, the priority coefficient of the current video frame is 1.
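The OR-logic grading above can be sketched as follows; the threshold values and the function name are illustrative placeholders, not values from the original (thresholds are ordered from threshold 0 downward, and the k-th range maps to priority coefficient k):

```python
def priority_from_ti(ti_std, ti_avg,
                     std_thresholds=(40.0, 20.0, 10.0),
                     avg_thresholds=(30.0, 15.0, 5.0)):
    """Map (TI_std, TI_avg) to a priority coefficient using OR logic:
    the first range whose standard-deviation OR average-value threshold
    is exceeded determines the coefficient. Threshold values here are
    illustrative placeholders, not values from the patent."""
    for k, (s, a) in enumerate(zip(std_thresholds, avg_thresholds)):
        if ti_std > s or ti_avg > a:
            return k
    # below all thresholds: the smallest-change class
    return len(std_thresholds)
```

AND logic, also described above, would simply replace `or` with `and` in the comparison.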
In the embodiment of the specification, the first pixel average value and the first pixel standard deviation between each pixel point in the current video frame and the previous video frame can be used together as the inter-frame variation amplitude. Accordingly, a plurality of corresponding classification threshold ranges can be preset for the first pixel average value and for the first pixel standard deviation, the threshold range in which each falls is determined respectively, and the priority coefficient of the current video frame is determined from the combination of the two. Determining the priority coefficient jointly from the average value and the standard deviation improves the determination accuracy.
In an optional implementation manner of this embodiment, for the case where the initial encoding information is encoded based on spatial domain information or quality domain information of the video to be encoded, that is, where one base layer and at least one enhancement layer are encoded based on spatial domain information or quality domain information of the video to be encoded, the priority coefficient of each video frame may be determined according to complexity and upsampling quality of the video frame, that is, the priority coefficient of each video frame included in the video to be encoded may be determined according to video data of the video to be encoded, which may be implemented as follows:
For each video frame included in the video to be encoded, determining the complexity of the video frame in a spatial domain or a quality domain;
and determining the priority coefficient of the video frame according to the complexity of the video frame.
It should be noted that the principle of SVC spatial scalability is that the base layer carries low-resolution video, and enhancement layers are added to restore high-resolution video frames; if the enhancement layers are discarded, high-resolution frames can only be restored by an up-sampling algorithm during playing. Compared with a video frame of high complexity, a video frame of low complexity recovers high resolution better through up-sampling when only the base layer is available.
In practical applications, for video frames with lower complexity, the perceived video quality recovered from the base layer by up-sampling is better, while for video frames with higher complexity it is worse. That is, the higher the complexity of a video frame, the worse its up-sampling quality; if such a frame is discarded, the definition of the finally decoded and played video may suffer, affecting the viewing experience. In other words, the higher the complexity of a video frame, the greater its influence on the video quality of the complete video and on the viewing experience, so the probability that a video frame of higher complexity is discarded should be lower.
Conversely, the lower the complexity of a video frame, the better its up-sampling quality; if such a frame is discarded, the definition of the finally decoded and played video and the viewing experience are affected less. That is, a video frame of lower complexity has less influence on the video quality of the complete video and on the viewing experience, so the probability that a video frame of lower complexity is discarded should be greater.
In the present specification, for the case that the initial encoding information is obtained based on spatial domain information or quality domain information encoding of a video to be encoded, the complexity of the video frame can be determined in the spatial domain or quality domain for each video frame included in the video to be encoded, and then the priority coefficient of the video frame is determined according to the complexity of the video frame, so that the determined priority coefficient can represent the influence on the video definition of the complete video after the video frame is discarded, so that the subsequent frame with low complexity, that is, the frame with good upsampling quality can be discarded preferentially, and the subjective definition is guaranteed to be better.
In an optional implementation manner of this embodiment, the complexity of each video frame may be determined according to the pixel differences of each pixel point in each video frame, that is, the complexity of the video frame is determined in the spatial domain or the quality domain, and the specific implementation process may be as follows:
Determining a second pixel average value and a second pixel standard deviation of each pixel included in the video frame;
the second pixel mean and the second pixel standard deviation are taken as the complexity of the video frame.
It should be noted that, in practice, a single video frame is an image formed by a plurality of pixel points. For each video frame, the greater the pixel differences among the pixel points in the frame, the higher the complexity of the frame; the smaller the pixel differences, the lower the complexity. Thus, for each video frame, a second pixel average value and a second pixel standard deviation of the pixel points included in the video frame may be determined and used together as the complexity of the video frame.
In particular implementations, for each video frame, the complexity of the video frame may be determined by the following equations (3) and (4):
SI_std = std_space[Sobel(F_n)]  (3)

SI_avg = avg_space[Sobel(F_n)]  (4)

wherein Sobel(F_n) represents the pixel-by-pixel result obtained by Sobel filtering the video frame F_n; std_space[·] represents taking the standard deviation of the filtering result over all pixel points of the whole frame image, and avg_space[·] represents taking the average value of the filtering result over all pixel points of the whole frame image. SI_std is the second pixel standard deviation, SI_avg is the second pixel average value, and SI_std and SI_avg together constitute the complexity of the video frame.
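A minimal sketch of formulas (3) and (4), assuming frames are grayscale NumPy arrays; the 3x3 Sobel gradient magnitude is computed here with plain NumPy slicing over the interior pixels, and the function names are illustrative:

```python
import numpy as np

def sobel_magnitude(frame: np.ndarray) -> np.ndarray:
    """3x3 Sobel gradient magnitude, computed on interior pixels only."""
    f = frame.astype(np.float64)
    # horizontal response: weighted right column minus weighted left column
    gx = (f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:]
          - f[:-2, :-2] - 2 * f[1:-1, :-2] - f[2:, :-2])
    # vertical response: weighted bottom row minus weighted top row
    gy = (f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:]
          - f[:-2, :-2] - 2 * f[:-2, 1:-1] - f[:-2, 2:])
    return np.hypot(gx, gy)

def spatial_complexity(frame: np.ndarray):
    """(SI_std, SI_avg) per formulas (3) and (4)."""
    s = sobel_magnitude(frame)
    return float(np.std(s)), float(np.mean(s))
```

A flat frame has zero gradient everywhere and hence complexity (0, 0), while a textured frame yields larger SI values, matching the intuition that high-detail frames are harder to recover by up-sampling.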
In the embodiment of the present disclosure, for each video frame, the pixel value difference of each pixel point in the video frame may be comprehensively analyzed, so that the complexity of the video frame is represented according to the pixel value difference of each pixel point, and the process of determining the complexity of the video frame is simple and accurate; and the second pixel average value and the second pixel standard deviation among the pixel points are used as the complexity of the video frame, so that the accuracy of determining the complexity of the video frame is improved, the accuracy of the follow-up determined priority coefficient is improved, and the follow-up video quality is ensured.
In an optional implementation manner of this embodiment, for each video frame, the second pixel average value and the second pixel standard deviation are used as the complexity of the video frame, and the specific process of determining the priority coefficient of the video frame according to its complexity is similar to the process, described above, of determining the priority coefficient of the current video frame according to the inter-frame variation amplitude, and is not repeated here.
Step 106: and adding the priority coefficient of each video frame to the initial coding information of the video to be coded to obtain target coding information of the video to be coded.
Specifically, on the basis of determining the priority coefficient of each video frame of the video to be encoded, which is discarded, according to the video data of the video to be encoded, further, the priority coefficient of each video frame may be added to the initial encoding information of the video to be encoded, so as to obtain the target encoding information of the video to be encoded.
It should be noted that, after the video to be encoded is encoded into a base layer and at least one enhancement layer, instead of directly using the layered information obtained by encoding as the final encoding information of the video to be encoded, the priority coefficient of each video frame in the video to be encoded can be further determined, the probability of subsequent deletion of the video frame is represented by the priority coefficient, and the priority coefficient of each video frame is added to the initial encoding information of the video to be encoded to obtain the final target encoding information of the video to be encoded, so that when the target encoding information is decoded subsequently, the video frames in the enhancement layer can be selectively discarded according to the priority coefficient of each video frame included in the target encoding information, thereby preferentially discarding the frame data with little influence on video quality, ensuring video quality and improving user experience.
In an optional implementation manner of this embodiment, after adding the priority coefficient of each video frame to the initial coding information of the video to be coded to obtain the target coding information of the video to be coded, the method may further include:
analyzing the target coding information to obtain a priority coefficient of each video frame of the video to be decoded, which is discarded;
and determining and scheduling decoding information of the video to be decoded to a decoding end according to the priority coefficient.
It should be noted that, after adding the priority coefficient of each video frame to the initial coding information of the video to be coded, final target coding information of the video to be coded can be obtained, the target coding information can be analyzed later, based on the priority coefficient of each video frame discarded obtained by analysis, the final scheduled decoding information is determined, and the decoding information is scheduled to the decoding end for decoding and playing. The detailed scheduling process is described in detail in the following coding information scheduling method.
According to the video coding method provided by the specification, after the time domain coding is carried out to obtain the base layer and at least one enhancement layer, the inter-frame change amplitude between the current video frame and the previous video frame of the video to be coded can be determined, and then the priority coefficient of the current video frame is determined according to the inter-frame change amplitude, so that the determined priority coefficient can represent the change degree between the current video frame and the previous video frame, the influence on the video quality of the complete video after the current video frame is discarded can be determined according to the priority coefficient, the frames with small changes in the time domain can be discarded preferentially, and the frames with severe changes are reserved, so that the video is more fluent to be watched.
In addition, after a base layer and at least one enhancement layer are obtained through encoding in a spatial domain or a quality domain, for each video frame included in a video to be encoded, the complexity of the video frame and the up-sampling quality of the video frame can be determined in the spatial domain or the quality domain, and then the priority coefficient of the video frame is determined according to the complexity and the up-sampling quality of the video frame, so that the determined priority coefficient can represent the influence on the video definition of the complete video after the video frame is discarded, and therefore, the frames with low discarded complexity and good up-sampling quality can be discarded preferentially in the following steps, and the subjective definition is guaranteed to be better.
That is, according to the present disclosure, the priority coefficient of each video frame to be discarded may be determined according to specific video data, and the priority coefficient of each video frame to be discarded may be added to the encoding information, so that, in the subsequent scheduling process of the encoding information, if frame data in the enhancement layer needs to be discarded, the video frame in the enhancement layer may be selectively discarded from the enhancement layer according to the priority coefficient determined in the encoding process, so that frame data with little influence on video quality may be preferentially discarded, video quality is guaranteed, and user experience is improved.
Fig. 2 shows a flowchart of a method for scheduling encoded information according to an embodiment of the present disclosure, which specifically includes the following steps:
step 202: and acquiring and analyzing target coding information of the video to be decoded to obtain a base layer, at least one enhancement layer and a priority coefficient of each video frame of the video to be decoded, wherein the priority coefficient is used for representing the probability of discarding the corresponding video frame.
In practical applications, the target coding information may include, in addition to a base layer and at least one enhancement layer obtained by coding the video to be decoded, a priority coefficient determined according to specific video data in the coding process, where each video frame is discarded. Therefore, in practical application, when the scheduling system schedules the target coding information of the video to be decoded, the scheduling system may analyze the target coding information of the video to be decoded first to obtain a base layer, at least one enhancement layer, and a priority coefficient of each video frame of the video to be decoded, which is discarded.
It should be noted that, the priority coefficient may represent an influence of a video frame on video quality, that is, a probability of discarding the video frame, so that in a scheduling process of encoded information, video frames in an enhancement layer may be selectively discarded based on the priority coefficient of discarding each video frame to obtain decoding information that may be scheduled to a decoding end, so that frame data having little influence on video quality may be preferentially discarded, video quality is guaranteed, and user experience is improved.
Step 204: and determining the video frame to be discarded in at least one enhancement layer according to the current decoding condition and the priority coefficient fed back by the decoding end.
Specifically, on the basis of obtaining and analyzing target coding information of the video to be decoded to obtain a base layer, at least one enhancement layer and a priority coefficient of each video frame of the video to be decoded, further, determining the video frame to be discarded in the at least one enhancement layer according to the current decoding condition and the priority coefficient fed back by the decoding end.
It should be noted that the current decoding condition may refer to the capability of decoding the target coding information of the video to be decoded, that is, how many enhancement layers can be decoded in addition to the base layer. For example, the current decoding condition may be the current network bandwidth, the playing scene, and the like; the larger the network bandwidth, the more enhancement layers can be decoded. The current decoding condition is fed back by the decoding end to the scheduling system, so that the scheduling system can make scheduling decisions and determine which video frames to discard.
In practical application, the current decoding capability of the decoding end can be determined according to the current decoding condition, namely, how many video frames can be decoded currently, and then the scheduling system can determine how many video frames need to be discarded currently. Then, the scheduling system can screen the video frames to be discarded from at least one enhancement layer according to the priority coefficient of each video frame included in the target coding information, and the screened video frames to be discarded are discarded subsequently and are not transmitted to the decoding end for decoding.
In addition, the priority coefficient of each video frame which is discarded and determined in the encoding process can represent the influence of the video frame on the video quality, namely the probability of being discarded, so that when the frame data in the enhancement layer needs to be discarded in the scheduling process, the video frame in the enhancement layer can be selectively discarded based on the priority coefficient of each video frame which is discarded, thereby preferentially discarding the frame data with little influence on the video quality, ensuring the video quality and improving the user experience.
In an optional implementation manner of this embodiment, when selecting a video frame to be discarded from at least one enhancement layer, it may be determined first which enhancement layers from among the enhancement layers are selected, that is, according to the current decoding condition and the priority coefficient fed back by the decoding end, it is determined that the video frame to be discarded in at least one enhancement layer, and the specific implementation process may be as follows:
determining a target enhancement layer to be discarded in at least one enhancement layer according to the current decoding condition;
and screening the video frames to be discarded from the video frames included in the target enhancement layer according to the priority coefficient.
It should be noted that the number of decodable layers differs under different decoding conditions. For example, when the network bandwidth is good, a plurality of enhancement layers, or even all enhancement layers, can be decoded in addition to the base layer, and the better the network bandwidth, the more enhancement layers can be decoded; when the network bandwidth is poor, fewer enhancement layers, or even none, can be decoded in addition to the base layer, and the worse the network bandwidth, the fewer enhancement layers can be decoded.
Therefore, in practical application, the number of layers to be discarded corresponding to different decoding conditions can be preset, so that a subsequent scheduling system can determine a target enhancement layer to be discarded in at least one enhancement layer according to the current decoding conditions, and the target enhancement layer can be an enhancement layer which cannot be completely decoded under the current decoding conditions. For encoding in the time domain, assuming that the target encoding information of the video to be decoded includes one base layer T0 and 3 enhancement layers T1, T2, and T3, the target enhancement layer to be discarded may be a T3 layer in the case of a better network bandwidth, and may be a T3 layer and a T2 layer in the case of a poor network bandwidth.
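The layer-selection rule in the T0–T3 example above can be sketched as follows; the function name and the representation of the decoding condition as "number of decodable layers" are illustrative assumptions:

```python
def target_enhancement_layers(all_layers, decodable_layers):
    """all_layers is ordered from base to highest enhancement layer,
    e.g. ["T0", "T1", "T2", "T3"]. Layers beyond the decoder's current
    capacity become the target enhancement layers to discard, always
    taken from the highest enhancement layer downward."""
    return list(reversed(all_layers[decodable_layers:]))
```

With good bandwidth (3 decodable layers) only T3 is targeted; with poor bandwidth (2 decodable layers) T3 and then T2 are targeted, matching the example in the text.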
When determining the target enhancement layer to be discarded in the at least one enhancement layer, the target enhancement layer is determined starting from the highest enhancement layer, in the layer order obtained by encoding. That is, in the scheduling process, when enhancement layers need to be discarded, discarding starts from the highest enhancement layer and proceeds downward layer by layer.
In addition, after the target enhancement layer is determined, all the target enhancement layers do not need to be directly discarded, and the video frames to be discarded can be further screened from all the video frames included in the target enhancement layer according to the priority coefficient. That is, the video frames with less influence on the video quality and the user watching experience can be further screened from the video frames included in the target enhancement layer according to the priority coefficient of the video frames, and discarded, and the video frames with more influence on the video quality and the user watching experience in the target enhancement layer are reserved, namely, the video frames with less influence on the video quality and the user watching experience can be preferentially discarded, so that the video quality of the video obtained by decoding is ensured, and the user experience is ensured.
In an optional implementation manner of this embodiment, the number of video frames to be discarded at present may be determined according to the current decoding condition, then, according to the priority coefficient of each video frame, a corresponding number of video frames are screened out from the determined target enhancement layer as video frames to be discarded, and then, the corresponding content is discarded, and decoding is not performed, that is, according to the priority coefficient, the video frames to be discarded are screened out from each video frame included in the target enhancement layer, which may be implemented as follows:
determining the current frame number to be discarded according to the current decoding condition;
sequentially screening out target video frames of the current frame number to be discarded according to a preset sequence based on the discarded priority coefficients of each video frame included in the target enhancement layer;
and taking the screened target video frame as the video frame to be discarded.
It should be noted that different frame numbers to be discarded may also be preset for different decoding conditions, so that the corresponding preset current frame number to be discarded can be obtained according to the current decoding condition.
In practical applications, each video frame included in the target enhancement layer has a corresponding discarded priority coefficient, where the priority coefficient may represent a sequence of discarding the corresponding video frame. In a possible implementation manner, the priority coefficient and the probability of being discarded may be in a proportional relationship, that is, the higher the priority coefficient is, the greater the probability of being discarded is for the corresponding video frame, at this time, the target video frame of the current frame number to be discarded needs to be screened out sequentially from high to low according to the priority coefficient of each video frame included in the target enhancement layer (that is, the preset order is that the priority coefficient is from high to low), and the screened target video frame is the video frame to be discarded, which is not decoded subsequently.
In another possible implementation manner, the priority coefficient and the probability of being discarded may be in an inverse relationship, that is, the higher the priority coefficient is, the smaller the probability of being discarded is for the corresponding video frame, at this time, the target video frame of the current frame number to be discarded needs to be screened out in sequence from low to high according to the priority coefficient of each video frame included in the target enhancement layer (that is, the preset sequence is that the priority coefficient is from low to high at this time), and the screened target video frame is the video frame to be discarded, and is not decoded subsequently.
In an optional implementation manner of this embodiment, when the preset order is that the priority coefficient is from high to low, based on the priority coefficient of each video frame included in the target enhancement layer, the target video frame of the current frame number to be discarded is sequentially screened out according to the preset order, and the specific implementation process may be as follows:
determining a first video frame with the largest priority coefficient in reference video frames, wherein the reference video frames are all video frames included in a target enhancement layer;
if the number of the first video frames exceeds the current frame number to be discarded, selecting, from the first video frames, video frames of the current frame number to be discarded, so as to obtain target video frames of the current frame number to be discarded;
if the number of the first video frames does not exceed the current frame number to be discarded, taking each first video frame as a video frame to be discarded, and continuing to screen video frames to be discarded from the remaining video frames included in the target enhancement layer until target video frames of the current frame number to be discarded are obtained.
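The two branches above can be sketched as an iterative selection loop; the function name and the (frame_id, priority_coefficient) representation are illustrative assumptions:

```python
def select_frames_to_discard(frames, num_to_discard):
    """frames: list of (frame_id, priority_coefficient) pairs for all
    video frames of the target enhancement layer, with a higher
    coefficient meaning 'discard first'. Repeatedly take the first
    video frames (those with the current maximum coefficient); if they
    exceed the remaining quota, take only as many as needed."""
    remaining = list(frames)
    discarded = []
    while len(discarded) < num_to_discard and remaining:
        top = max(p for _, p in remaining)
        first = [f for f in remaining if f[1] == top]
        need = num_to_discard - len(discarded)
        discarded.extend(first[:need])          # branch 1: quota reached
        remaining = [f for f in remaining if f[1] != top]  # branch 2: recurse
    return [fid for fid, _ in discarded]
```

For the inverse relationship described earlier (higher coefficient = keep), the same loop applies with `min` in place of `max`.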
When the preset order is priority coefficient from high to low, the higher the priority coefficient of a video frame, the more preferentially it is discarded; that is, a higher priority coefficient means a smaller influence on video quality and user experience, so video frames with higher priority coefficients can be discarded first.
In practical application, the first video frames, those with the largest priority coefficient among the video frames included in the target enhancement layer, can be determined first, and then it is determined whether their number exceeds the current number of frames to be discarded. If so, the first video frames alone can meet the required number of frames to be discarded; in this case, as many first video frames as the current number of frames to be discarded can be selected as the video frames to be discarded, yielding the required number of target video frames (that is, the current number of frames to be discarded).

If the number of the first video frames does not exceed the current number of frames to be discarded, the first video frames with the largest priority coefficient cannot by themselves meet the required number, so video frames with the next-largest priority coefficient need to be discarded as well.
In an optional implementation of this embodiment, the screening of video frames to be discarded from the remaining video frames included in the target enhancement layer continues until the target video frames of the current number of frames to be discarded are obtained. The specific implementation process may be as follows:
taking the remainder of the current number of frames to be discarded, after subtracting the number of the first video frames, as the updated current number of frames to be discarded;

taking the video frames other than the first video frames among the video frames included in the target enhancement layer as the reference video frames;

and returning to the operation step of determining the first video frames with the largest priority coefficient among the reference video frames, until the number of the determined first video frames exceeds the current number of frames to be discarded, thereby obtaining the target video frames of the current number of frames to be discarded.
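The iterative screening procedure in the steps above (find the highest-coefficient frames, take as many as still needed, otherwise take them all and repeat on the remainder) can be sketched in Python as follows. This is an illustrative sketch rather than the patent's implementation; in particular, when the tied frames exceed the number still needed, the tie is broken here by lowest frame id, whereas the text breaks such ties by preferring the higher enhancement layer.

```python
def select_frames_to_discard(priorities, n_to_discard):
    """Iteratively pick frames to discard, highest priority coefficient first.

    priorities: dict mapping frame id -> discard priority coefficient
    (higher coefficient = discarded earlier, matching the "high to low"
    preset order). Returns a list of frame ids to discard.
    """
    remaining = dict(priorities)   # the reference video frames
    discarded = []
    while n_to_discard > 0 and remaining:
        top = max(remaining.values())
        # The "first video frames": those with the largest coefficient.
        first_frames = [f for f, p in remaining.items() if p == top]
        if len(first_frames) > n_to_discard:
            # More tied frames than needed: take only as many as required
            # (simplified tie-break by lowest frame id).
            discarded.extend(sorted(first_frames)[:n_to_discard])
            n_to_discard = 0
        else:
            # Take all tied frames, then repeat on the remaining frames.
            discarded.extend(first_frames)
            n_to_discard -= len(first_frames)
            for f in first_frames:
                del remaining[f]
    return discarded
```

The loop terminates either when enough frames have been selected or when the target enhancement layer has no frames left.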
It should be noted that, since the first video frames with the currently largest priority coefficient have been determined as video frames to be discarded, the remainder of the current number of frames to be discarded, after subtracting the number of first video frames, is the number of video frames still to be screened out; this remainder can therefore be used as the updated current number of frames to be discarded for subsequent screening.

In addition, since the first video frames included in the target enhancement layer have been determined as video frames to be discarded, screening must continue among the video frames other than the first video frames. These remaining video frames are therefore taken as the reference video frames, the first video frames with the largest priority coefficient among them are determined again, and the subsequent operation steps are repeated until enough video frames to be discarded have been screened out.
It should be noted that when a video is SVC-encoded into a base layer and at least one enhancement layer, the higher layers depend on the lower layers. In the temporal domain, the more a layer is referenced by other layers, the less easily its video frames can be discarded; the video frames of layers that are referenced little or not at all can be discarded freely.

In practical application, after the target enhancement layer is determined, when video frames to be discarded are screened from it according to the priority coefficient, if several video frames have the same priority coefficient, the video frames in the higher enhancement layer can be discarded preferentially, i.e. the frames with few or no references are discarded first. Thus the determination of video frames to be discarded follows the layer order of the SVC encoding process while also taking each frame's priority coefficient into account, so that the frame data with the least influence on video quality is discarded first, the video quality is guaranteed, and the user experience is improved.
For example, fig. 3 is a schematic diagram of temporal-domain coding provided in an embodiment of the present disclosure. As shown in fig. 3, temporal-domain coding yields a base layer T0 and 3 enhancement layers T1, T2, T3. The reference relationships between the video frames of each layer in fig. 3 represent their dependency relationships, where a reference relationship means the coding reference relationship between video frames: decoding a given video frame requires the frames it references to be decoded first. For example, decoding video frame 4 requires video frame 0 and video frame 8 to be decoded first; decoding video frame 2 requires video frame 0 and video frame 4 to be decoded first; and so on.
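The decoding dependencies described for fig. 3 can be modeled as a simple reference table. Only the references of frames 4 and 2 are stated in the text; the remaining entries are assumptions based on a typical hierarchical temporal structure with T0 = {0, 8}, T1 = {4}, T2 = {2, 6}, T3 = {1, 3, 5, 7}.

```python
# Decoding dependencies implied by the temporal layering of fig. 3.
# Entries other than frames 4 and 2 are assumed, not stated in the text.
references = {
    0: [], 8: [0],          # base layer T0
    4: [0, 8],              # T1 refers to the base layer
    2: [0, 4], 6: [4, 8],   # T2 (assumed)
    1: [0, 2], 3: [2, 4],   # T3 (assumed)
    5: [4, 6], 7: [6, 8],   # T3 (assumed)
}

def decode_prerequisites(frame, refs=references):
    """Frames that must be decoded before `frame`, in decode order."""
    needed = []
    for r in refs[frame]:
        for f in decode_prerequisites(r, refs) + [r]:
            if f not in needed:
                needed.append(f)
    return needed

print(decode_prerequisites(4))  # [0, 8]
print(decode_prerequisites(2))  # [0, 8, 4]
```

Frames in T3 appear in no other frame's reference list, which is why they can be discarded freely, while discarding a T0 or T1 frame would break the decoding of many dependents.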
Assume that the number of layers to be discarded under the current decoding condition is 2, so the target enhancement layers to be discarded, from high to low, are T3 and T2, and assume that the current number of frames to be discarded under the current decoding condition is 5. As shown in fig. 3, the layers T2 and T3 include video frame 1, video frame 2, video frame 3, video frame 5, video frame 6, and video frame 7, whose discard priority coefficients are: video frame 1, P2; video frame 2, P2; video frame 3, P1; video frame 5, P0; video frame 6, P0; video frame 7, P1.
First, the video frames with the largest priority coefficient in layers T2 and T3 are determined to be video frames 1 and 2. Their number, 2, does not exceed the current number of frames to be discarded, 5, so video frames 1 and 2 are taken as video frames to be discarded, and the current number of frames to be discarded is updated to 3. Next, among the frames in layers T2 and T3 other than video frames 1 and 2, the frames with the largest priority coefficient are video frames 3 and 7. Their number, 2, does not exceed the current number of frames to be discarded, 3, so video frames 3 and 7 are taken as video frames to be discarded, and the current number of frames to be discarded is updated to 1.

Then, among the frames in layers T2 and T3 other than video frames 1, 2, 3 and 7, the frames with the largest priority coefficient are video frames 5 and 6. Their number, 2, exceeds the current number of frames to be discarded, 1, so one of video frames 5 and 6 must be selected as the video frame to be discarded. Since video frame 5 is in layer T3 while video frame 6 is in the lower layer T2 and is depended on by more video frames, video frame 5 in layer T3 is preferentially discarded and video frame 6 in layer T2 is retained (i.e. when priority coefficients are equal, the higher layer is discarded first and the lower layer is retained). Video frame 5 is therefore determined as the video frame to be discarded, and the finally selected video frames to be discarded are video frames 1, 2, 3, 5 and 7.
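The three rounds of screening in this example can equivalently be expressed as a single total ordering of the six frames: highest priority coefficient first, and among equal coefficients the higher layer first. The layer and coefficient values below are taken from the example above; the sort-based formulation is an illustrative restatement, not the patent's wording.

```python
# Layers T2/T3 of fig. 3, discarding 5 frames.
frames = {
    # frame id: (enhancement layer, discard priority coefficient)
    1: (3, 2), 2: (2, 2), 3: (3, 1), 5: (3, 0), 6: (2, 0), 7: (3, 1),
}
n = 5
# Higher coefficient first; for equal coefficients, the higher layer
# (fewer dependents) first, matching the tie-break rule in the text.
order = sorted(frames, key=lambda f: (-frames[f][1], -frames[f][0]))
to_discard = order[:n]
print(sorted(to_discard))  # [1, 2, 3, 5, 7]
```

Video frame 6 is the single frame that survives, exactly as in the worked example: it ties with frame 5 on coefficient P0 but sits in the lower layer T2.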
From the above, if only the T3 layer is to be discarded, video frame 1 (P2) may be discarded first, then video frames 3 and 7 (P1), and finally video frame 5 (P0). If the T3 and T2 layers are to be discarded together, video frames 1 and 2 are discarded first, video frames 3 and 7 next, and video frames 5 and 6 last.
In still another example, fig. 4 is a schematic diagram of spatial-domain/quality-domain coding according to an embodiment of the present disclosure. As shown in fig. 4, spatial-domain/quality-domain coding yields a base layer D0 and 1 enhancement layer D1, where the D1 layer includes video frame 1, video frame 2, video frame 3, video frame 4, and video frame 5, whose discard priority coefficients are: video frame 1, P0; video frame 2, P0; video frame 3, P1; video frame 4, P1; video frame 5, P0.
According to the layering result, the higher layer can be discarded preferentially. Assume that the number of layers to be discarded under the current decoding condition is 1, so the target enhancement layer to be discarded, from high to low, is D1, and assume that the current number of frames to be discarded under the current decoding condition is 3.

First, the video frames with the largest priority coefficient in the D1 layer are determined to be video frames 3 and 4. Their number, 2, does not exceed the current number of frames to be discarded, 3, so video frames 3 and 4 are taken as video frames to be discarded, and the current number of frames to be discarded is updated to 1. Next, among the frames in the D1 layer other than video frames 3 and 4, the frames with the largest priority coefficient are video frames 1, 2 and 5. Their number, 3, exceeds the current number of frames to be discarded, 1, so any one of video frames 1, 2 and 5 is determined as the video frame to be discarded; assuming it is video frame 1, the finally screened video frames to be discarded are video frames 1, 3 and 4.
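The spatial-domain example can be reproduced the same way. The tie among the three P0 frames is arbitrary in the text ("any one"); it is resolved here by lowest frame id, which happens to match the text's choice of video frame 1.

```python
# Enhancement layer D1 of fig. 4, discarding 3 frames.
d1_priorities = {1: 0, 2: 0, 3: 1, 4: 1, 5: 0}
n = 3
# Highest coefficient first; ties broken by lowest frame id (the text
# leaves this choice arbitrary).
order = sorted(d1_priorities, key=lambda f: (-d1_priorities[f], f))
print(sorted(order[:n]))  # [1, 3, 4]
```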
It should be noted that, in addition to the base layer and at least one enhancement layer obtained by encoding the video, the target encoding information of the video to be decoded may further include the discard priority coefficient of each video frame, determined from the specific video data during encoding, where the priority coefficient may represent the influence of the video frame on video quality. Therefore, in the scheduling process of the encoding information, if frame data in an enhancement layer needs to be discarded, the video frames in the enhancement layer can be selectively discarded according to the priority coefficients determined during encoding, so that frame data with little influence on video quality is discarded preferentially, the video quality is guaranteed, and the user experience is improved.
Step 206: taking the base layer and the remaining video frames in the at least one enhancement layer, other than the video frames to be discarded, as the decoding information of the video to be decoded, and scheduling the decoding information to the decoding end.
It should be noted that, after the video frames to be discarded in the at least one enhancement layer are determined based on the current decoding condition fed back by the decoding end and the priority coefficients, the base layer and the remaining video frames in the at least one enhancement layer other than the video frames to be discarded can be used as the decoding information of the video to be decoded, and the decoding information is scheduled to the decoding end.

In practical application, after the video frames to be discarded are screened out of the at least one enhancement layer according to the discard priority coefficients of the video frames, the screened frames can be discarded rather than scheduled for transmission; only the remaining video frames are scheduled and transmitted to the decoding end, which decodes them to obtain the corresponding decoded video for playback.
According to the encoding information scheduling method provided in this specification, the target encoding information of the video to be decoded may include not only the base layer and at least one enhancement layer obtained by encoding the video, but also the discard priority coefficient of each video frame, determined from the specific video data during encoding, which can represent the influence of the video frame on video quality. Therefore, in the scheduling process, if frame data in an enhancement layer needs to be discarded, video frames can be selectively discarded from the enhancement layer according to their priority coefficients, so that frame data with little influence on video quality is discarded preferentially, the video quality is guaranteed, and the user experience is improved.
Corresponding to the above method embodiments, the present disclosure further provides an embodiment of a video encoding apparatus, and fig. 5 shows a schematic structural diagram of a video encoding apparatus according to an embodiment of the present disclosure. As shown in fig. 5, the apparatus includes:
the encoding module 502 is configured to obtain a video to be encoded, and encode the video to be encoded into a base layer and at least one enhancement layer, so as to obtain initial encoding information of the video to be encoded;
a first determining module 504 configured to determine, according to video data of the video to be encoded, a priority coefficient of each video frame of the video to be encoded being discarded, the priority coefficient being used to represent a probability of the corresponding video frame being discarded;
the adding module 506 is configured to add the priority coefficient of each video frame to the initial encoding information of the video to be encoded, so as to obtain the target encoding information of the video to be encoded.
Optionally, the initial coding information is obtained by coding time domain information based on the video to be coded; the first determination module 504 is further configured to:
determining the inter-frame variation amplitude between the current video frame and the previous video frame of the video to be encoded;
and determining the priority coefficient of the current video frame according to the inter-frame variation amplitude.
Optionally, the first determination module 504 is further configured to:
determining, for each pixel point in the current video frame, the pixel difference between that pixel point and the corresponding pixel point in the previous video frame;

determining a first pixel average value and a first pixel standard deviation according to the pixel differences between each pixel point in the current video frame and the corresponding pixel point in the previous video frame;
and taking the first pixel average value and the first pixel standard deviation as the inter-frame variation amplitude.
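A minimal sketch, assuming numpy, of the inter-frame variation amplitude the module describes: per-pixel differences between the current and previous frame, summarized by their mean and standard deviation. The text does not specify whether the difference is signed or absolute; signed differences are assumed here.

```python
import numpy as np

def inter_frame_variation(cur, prev):
    """Return the first pixel average value and first pixel standard
    deviation of the per-pixel differences between two frames.

    cur, prev: arrays of identical shape, e.g. H x W luma planes.
    """
    diff = cur.astype(np.float64) - prev.astype(np.float64)
    return float(diff.mean()), float(diff.std())
```

A static scene yields a mean and standard deviation near zero, suggesting the frame changes little from its predecessor.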
Optionally, the first determination module 504 is further configured to:
and determining the priority coefficient of the current video frame according to the first pixel average value and the corresponding multiple grading threshold ranges and the first pixel standard deviation and the corresponding multiple grading threshold ranges.
Optionally, the first determination module 504 is further configured to:
and if the first pixel average value is in a first threshold range and/or the first pixel standard deviation is in a second threshold range, determining the priority coefficient corresponding to the first threshold range and/or the second threshold range as the priority coefficient of the current video frame.
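A hedged sketch of the graded-threshold mapping described above. The threshold values, the number of grades (P0 to P2), the direction of the mapping (a small inter-frame change is assumed to make a frame more discardable, hence a higher discard priority coefficient), and the use of the smaller of the two grades when the mean and standard deviation disagree are all assumptions; the text only states that each statistic falls into one of multiple graded threshold ranges that determine the coefficient.

```python
def priority_from_amplitude(mean, std,
                            mean_thresholds=(2.0, 8.0),
                            std_thresholds=(1.0, 4.0)):
    """Map the inter-frame variation amplitude (mean, std) to a discard
    priority coefficient in {0, 1, 2}; 2 means "discard first".
    Threshold values are illustrative placeholders."""
    def grade(value, thresholds):
        if value < thresholds[0]:
            return 2   # small change: most discardable
        if value < thresholds[1]:
            return 1
        return 0       # large change: keep if possible
    # If the two statistics disagree, be conservative and keep the frame
    # longer (take the lower discard priority of the two grades).
    return min(grade(abs(mean), mean_thresholds), grade(std, std_thresholds))
```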
Optionally, the initial coding information is obtained by coding spatial domain information or quality domain information based on the video to be coded; the first determination module 504 is further configured to:
For each video frame included in the video to be encoded, determining the complexity of the video frame in a spatial domain or a quality domain;
and determining the priority coefficient of the video frame according to the complexity of the video frame.
Optionally, the first determination module 504 is further configured to:
determining a second pixel average value and a second pixel standard deviation of each pixel point included in the video frame;
the second pixel mean and the second pixel standard deviation are taken as the complexity of the video frame.
According to the video encoding apparatus provided in this specification, after the video to be encoded is encoded into a base layer and at least one enhancement layer, the discard priority coefficient of each video frame can additionally be determined from the specific video data, based on the influence of the video frame on video quality, and added to the encoding information. In the subsequent scheduling of the encoding information, if frame data in an enhancement layer needs to be discarded, video frames can be selectively discarded from the enhancement layer according to the priority coefficients determined during encoding, so that frame data with little influence on video quality is discarded preferentially, the video quality is guaranteed, and the user experience is improved.
The above is a schematic solution of a video encoding apparatus of the present embodiment. It should be noted that, the technical solution of the video encoding device and the technical solution of the video encoding method belong to the same conception, and details of the technical solution of the video encoding device, which are not described in detail, can be referred to the description of the technical solution of the video encoding method.
Corresponding to the method embodiment, the present disclosure further provides an embodiment of an encoded information scheduling apparatus, and fig. 6 shows a schematic structural diagram of an encoded information scheduling apparatus according to an embodiment of the present disclosure. As shown in fig. 6, the apparatus includes:
the obtaining module 602 is configured to obtain and parse target coding information of the video to be decoded, so as to obtain a base layer, at least one enhancement layer of the video to be decoded, and a priority coefficient of discarding each video frame of the video to be decoded, where the priority coefficient is used to represent a probability of discarding the corresponding video frame;
a second determining module 604, configured to determine, according to the current decoding condition and the priority coefficient fed back by the decoding end, a video frame to be discarded in at least one enhancement layer;
the scheduling module 606 is configured to take the remaining video frames except the video frames to be discarded in the base layer and the at least one enhancement layer as decoding information of the video to be decoded, and schedule the decoding information to a decoding end.
Optionally, the second determination module 604 is further configured to:
determining a target enhancement layer to be discarded in at least one enhancement layer according to the current decoding condition;
and screening the video frames to be discarded from the video frames included in the target enhancement layer according to the priority coefficient.
Optionally, the second determination module 604 is further configured to:
determining the current frame number to be discarded according to the current decoding condition;
sequentially screening out target video frames of the current frame number to be discarded according to a preset sequence based on priority coefficients of all video frames included in the target enhancement layer;
and taking the screened target video frame as the video frame to be discarded.
Optionally, the preset sequence is that the priority coefficient is from high to low; the second determination module 604 is further configured to:
determining a first video frame with the largest priority coefficient in reference video frames, wherein the reference video frames are all video frames included in a target enhancement layer;
if the number of the first video frames exceeds the current frame number to be discarded, selecting the first video frames with the current frame number to be discarded as the frame number to be discarded, and obtaining target video frames with the current frame number to be discarded;
if the number of the first video frames does not exceed the number of the video frames to be discarded, taking each first video frame as the video frame to be discarded, and continuing to screen the video frame to be discarded from the rest video frames included in the target enhancement layer until the target video frame of the current frame number to be discarded is obtained.
Optionally, the second determination module 604 is further configured to:
taking the remaining number except the number of the first video frames in the current frame number to be discarded as the updated current frame number to be discarded;
taking video frames except the first video frame in all video frames included in the target enhancement layer as reference video frames;
and returning to the operation step of determining the first video frame with the largest priority coefficient in the reference video frames until the number of the determined first video frames exceeds the current frame number to be discarded, and obtaining the target video frame with the current frame number to be discarded.
According to the encoding information scheduling apparatus provided in this specification, the target encoding information of the video to be decoded may include not only the base layer and at least one enhancement layer obtained by encoding the video, but also the discard priority coefficient of each video frame, determined from the specific video data during encoding, which can represent the influence of the video frame on video quality. Therefore, in the scheduling process, if frame data in an enhancement layer needs to be discarded, video frames can be selectively discarded from the enhancement layer according to their priority coefficients, so that frame data with little influence on video quality is discarded preferentially, the video quality is guaranteed, and the user experience is improved.
The above is an exemplary scheme of an encoded information scheduling apparatus of the present embodiment. It should be noted that, the technical solution of the coding information scheduling apparatus and the technical solution of the coding information scheduling method belong to the same concept, and details of the technical solution of the coding information scheduling apparatus, which are not described in detail, can be referred to the description of the technical solution of the coding information scheduling method.
Fig. 7 illustrates a block diagram of a computing device 700 provided in accordance with an embodiment of the present specification. The components of computing device 700 include, but are not limited to, memory 710 and processor 720. Processor 720 is coupled to memory 710 via bus 730, and database 750 is used to store data.
Computing device 700 also includes an access device 740 that enables computing device 700 to communicate via one or more networks 760. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. The access device 740 may include one or more of any type of network interface, wired or wireless (e.g., a Network Interface Card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 700, as well as other components not shown in FIG. 7, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 7 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 700 may be any type of stationary or mobile computing device including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 700 may also be a mobile or stationary server.
Wherein the processor 720 is configured to execute the following computer executable instructions to implement the steps of any video encoding method or encoded information scheduling method.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the video encoding method or the encoding information scheduling method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the video encoding method or the encoding information scheduling method.
An embodiment of the present specification also provides a computer-readable storage medium storing computer instructions that, when executed by a processor, perform the steps of any video encoding method or encoded information scheduling method.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the video encoding method or the encoding information scheduling method described above belong to the same concept, and details of the technical solution of the storage medium that are not described in detail may be referred to the description of the technical solution of the video encoding method or the encoding information scheduling method described above.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium can be appropriately increased or decreased according to the requirements of legislation and patent practice in the relevant jurisdiction; for example, in some jurisdictions, the computer readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that, for the sake of simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the present description is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present description. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all necessary in the specification.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the disclosure and the practical application, to thereby enable others skilled in the art to best understand and utilize the disclosure. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims (16)

1. A method of video encoding, the method comprising:
acquiring a video to be encoded, and encoding the video to be encoded into a base layer and at least one enhancement layer to obtain initial encoding information of the video to be encoded;
determining a priority coefficient of each video frame of the video to be encoded, which is used for representing the probability of discarding the corresponding video frame, according to the video data of the video to be encoded;
And adding the priority coefficient of each video frame to the initial coding information of the video to be coded to obtain target coding information of the video to be coded.
2. The video coding method according to claim 1, wherein the initial coding information is obtained by coding based on time domain information of the video to be coded;
the determining the priority coefficient of each video frame included in the video to be encoded according to the video data of the video to be encoded includes:
determining the inter-frame variation amplitude between the current video frame and the previous video frame of the video to be encoded;
and determining the priority coefficient of the current video frame according to the inter-frame variation amplitude.
3. The video coding method of claim 2, wherein the determining an inter-frame variation amplitude between a current video frame and a previous video frame of the video to be coded comprises:
determining, for each pixel point in the current video frame, a pixel difference value between the pixel point and the corresponding pixel point in the previous video frame;
determining a first pixel average value and a first pixel standard deviation according to the pixel difference values between the pixel points in the current video frame and those in the previous video frame;
and taking the first pixel average value and the first pixel standard deviation as the inter-frame variation amplitude.
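The inter-frame variation amplitude of claim 3 amounts to summarizing the per-pixel difference between two consecutive frames by its mean and standard deviation. A minimal sketch follows; grayscale frames and the use of the absolute difference are illustrative assumptions, since the claim fixes neither:

```python
import numpy as np

def inter_frame_variation(prev_frame, cur_frame):
    # Per-pixel difference between the current and previous frame.
    # Taking the absolute value is an assumption; the claim only
    # speaks of a "pixel difference" between corresponding pixels.
    diff = np.abs(cur_frame.astype(np.int32) - prev_frame.astype(np.int32))
    # First pixel average value and first pixel standard deviation (claim 3).
    return float(diff.mean()), float(diff.std())
```

A static scene yields an amplitude near (0, 0), which under the scheme of the later claims would mark the frame as a strong candidate for discarding.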
4. The video coding method according to claim 3, wherein said determining the priority coefficient of the current video frame based on the inter-frame variation amplitude comprises:
and determining the priority coefficient of the current video frame according to the first pixel average value and its corresponding plurality of grading threshold ranges, and the first pixel standard deviation and its corresponding plurality of grading threshold ranges.
5. The video coding method according to claim 4, wherein determining the priority coefficient of the current video frame according to the first pixel average value and the corresponding plurality of grading threshold ranges, and the first pixel standard deviation and the corresponding plurality of grading threshold ranges, comprises:
and if the first pixel average value is in a first threshold range and/or the first pixel standard deviation is in a second threshold range, determining the priority coefficient corresponding to the first threshold range and/or the second threshold range as the priority coefficient of the current video frame.
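Claims 4 and 5 map the (mean, standard deviation) amplitude onto graded threshold ranges to obtain the priority coefficient. The sketch below is illustrative only: the concrete ranges, the coefficient values, and the rule for combining the two gradings (here, the smaller of the two range indices, so a frame with small change gets a high coefficient and is more likely to be discarded) are all assumptions the claims leave open:

```python
def grade(value, ranges):
    # Index of the half-open range [lo, hi) containing value;
    # the last range catches everything above it.
    for i, (lo, hi) in enumerate(ranges):
        if lo <= value < hi:
            return i
    return len(ranges) - 1

def priority_coefficient(mean, std,
                         mean_ranges=((0.0, 2.0), (2.0, 10.0), (10.0, 256.0)),
                         std_ranges=((0.0, 1.0), (1.0, 5.0), (5.0, 256.0)),
                         coefficients=(3, 2, 1)):
    # Combine the two gradings; min() of the indices is an assumed rule,
    # matching the claim's "and/or" without committing to one reading.
    idx = min(grade(mean, mean_ranges), grade(std, std_ranges))
    return coefficients[idx]
```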
6. The video coding method according to claim 2, wherein the initial coding information is obtained by coding based on spatial domain information or quality domain information of the video to be coded;
the determining the priority coefficient of each video frame included in the video to be encoded according to the video data of the video to be encoded includes:
for each video frame included in the video to be encoded, determining the complexity of the video frame in a spatial domain or a quality domain;
and determining the priority coefficient of the video frame according to the complexity of the video frame.
7. The video coding method according to claim 6, wherein said determining the complexity of the video frame in the spatial domain or the quality domain comprises:
determining a second pixel average value and a second pixel standard deviation over the pixel points included in the video frame;
and taking the second pixel average value and the second pixel standard deviation as the complexity of the video frame.
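For the spatial- or quality-domain case of claims 6 and 7, the complexity of a single frame is again a mean/standard-deviation summary, this time over the frame's own pixels rather than a frame difference. A minimal sketch (grayscale frames assumed):

```python
import numpy as np

def spatial_complexity(frame):
    # Second pixel average value and second pixel standard deviation
    # over all pixel points of one frame (claim 7).  A flat frame has
    # zero standard deviation, i.e. low complexity.
    px = frame.astype(np.float64)
    return float(px.mean()), float(px.std())
```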
8. A method for scheduling encoded information, the method comprising:
acquiring and parsing target coding information of a video to be decoded to obtain a base layer, at least one enhancement layer, and a priority coefficient of each video frame of the video to be decoded, wherein the priority coefficient is used for representing the probability that the corresponding video frame is discarded;
determining a video frame to be discarded in the at least one enhancement layer according to the current decoding condition fed back by the decoding end and the priority coefficient;
and taking the remaining video frames in the base layer and the at least one enhancement layer, other than the video frames to be discarded, as decoding information of the video to be decoded, and scheduling the decoding information to the decoding end.
9. The method for scheduling encoded information according to claim 8, wherein said determining a video frame to be discarded in said at least one enhancement layer according to the current decoding condition fed back by the decoding end and said priority coefficient comprises:
determining a target enhancement layer to be discarded in the at least one enhancement layer according to the current decoding condition;
and screening the video frames to be discarded from all video frames included in the target enhancement layer according to the priority coefficient.
10. The method for scheduling encoded information according to claim 9, wherein said screening the video frames to be discarded from the respective video frames included in the target enhancement layer according to the priority coefficient comprises:
determining the current number of frames to be discarded according to the current decoding condition;
screening out, in a preset order and based on the priority coefficients of the video frames included in the target enhancement layer, target video frames of the current number of frames to be discarded;
and taking the screened-out target video frames as the video frames to be discarded.
11. The method for scheduling encoded information according to claim 10, wherein the preset order is from the highest priority coefficient to the lowest;
the screening out, in the preset order and based on the priority coefficients of the video frames included in the target enhancement layer, the target video frames of the current number of frames to be discarded comprises:
determining a first video frame with the largest priority coefficient in reference video frames, wherein the reference video frames are all video frames included in the target enhancement layer;
if the number of the first video frames exceeds the current number of frames to be discarded, selecting, from the first video frames, as many video frames as the current number of frames to be discarded, to obtain the target video frames of the current number of frames to be discarded;
and if the number of the first video frames does not exceed the current number of frames to be discarded, taking each first video frame as a video frame to be discarded, and continuing to screen video frames to be discarded from the remaining video frames included in the target enhancement layer until the target video frames of the current number of frames to be discarded are obtained.
12. The method for scheduling encoded information according to claim 11, wherein said continuing to screen video frames to be discarded from the remaining video frames included in the target enhancement layer until the target video frames of the current number of frames to be discarded are obtained comprises:
taking the current number of frames to be discarded minus the number of the first video frames as the updated current number of frames to be discarded;
taking the video frames, other than the first video frames, among the video frames included in the target enhancement layer as the reference video frames;
and returning to the operation step of determining the first video frame with the largest priority coefficient in the reference video frames, until the number of the determined first video frames exceeds the current number of frames to be discarded, to obtain the target video frames of the current number of frames to be discarded.
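Claims 10 through 12 together describe an iterative selection loop: repeatedly take the frames whose priority coefficient is currently the largest, truncate that group if it exceeds the remaining quota, and otherwise remove it and continue on the rest. An illustrative sketch; the tie-break among equal coefficients (here, lowest frame id first) is an assumption, since the claims leave it unspecified:

```python
def select_frames_to_discard(priorities, n_to_discard):
    # priorities: {frame_id: priority_coefficient} for the frames of the
    # target enhancement layer (the "reference video frames").
    remaining = dict(priorities)
    discarded = []
    while n_to_discard > 0 and remaining:
        # First video frames: those with the largest current coefficient.
        top = max(remaining.values())
        tied = sorted(fid for fid, c in remaining.items() if c == top)
        # If the group exceeds the remaining quota, truncate it (claim 11).
        tied = tied[:n_to_discard]
        discarded.extend(tied)
        for fid in tied:
            del remaining[fid]
        # Update the current number of frames to be discarded (claim 12).
        n_to_discard -= len(tied)
    return discarded
```

With priorities {0: 1, 1: 3, 2: 3, 3: 2} and a quota of 3, the loop first removes frames 1 and 2 (coefficient 3), then frame 3 (coefficient 2).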
13. A video encoding device, the device comprising:
the coding module is configured to acquire a video to be coded, code the video to be coded into a base layer and at least one enhancement layer, and obtain initial coding information of the video to be coded;
a first determining module configured to determine, according to the video data of the video to be encoded, a priority coefficient of each video frame of the video to be encoded, the priority coefficient being used to represent the probability that the corresponding video frame is discarded;
and the adding module is configured to add the priority coefficient of each video frame to the initial coding information of the video to be coded to obtain target coding information of the video to be coded.
14. An encoded information scheduling apparatus, the apparatus comprising:
the acquisition module is configured to acquire and parse target coding information of the video to be decoded to obtain a base layer, at least one enhancement layer, and a priority coefficient of each video frame of the video to be decoded, wherein the priority coefficient is used for representing the probability that the corresponding video frame is discarded;
the second determining module is configured to determine a video frame to be discarded in the at least one enhancement layer according to the current decoding condition fed back by the decoding end and the priority coefficient;
and the scheduling module is configured to take the remaining video frames in the base layer and the at least one enhancement layer, other than the video frames to be discarded, as decoding information of the video to be decoded, and schedule the decoding information to the decoding end.
15. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer executable instructions and the processor is configured to execute the computer executable instructions to implement the steps of the video encoding method of any one of claims 1 to 7 or the encoded information scheduling method of any one of claims 8 to 12.
16. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the video encoding method of any one of claims 1 to 7 or the encoded information scheduling method of any one of claims 8 to 12.
CN202111277292.XA 2021-10-29 2021-10-29 Video coding method and device, and coding information scheduling method and device Pending CN116074528A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111277292.XA CN116074528A (en) 2021-10-29 2021-10-29 Video coding method and device, and coding information scheduling method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111277292.XA CN116074528A (en) 2021-10-29 2021-10-29 Video coding method and device, and coding information scheduling method and device

Publications (1)

Publication Number Publication Date
CN116074528A true CN116074528A (en) 2023-05-05

Family

ID=86177316

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111277292.XA Pending CN116074528A (en) 2021-10-29 2021-10-29 Video coding method and device, and coding information scheduling method and device

Country Status (1)

Country Link
CN (1) CN116074528A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116708938A (en) * 2023-08-03 2023-09-05 腾讯科技(深圳)有限公司 Video processing method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
Liu et al. End-to-End Blind Quality Assessment of Compressed Videos Using Deep Neural Networks.
US10257528B2 (en) Method and apparatus for adaptive encoding and decoding based on image quality
US11412229B2 (en) Method and apparatus for video encoding and decoding
Vranješ et al. Review of objective video quality metrics and performance comparison using different databases
US11232598B2 (en) Distinct encoding and decoding of stable information and transient/stochastic information
Lee et al. A subjective and objective study of space-time subsampled video quality
US11601617B2 (en) Method for forming an output image sequence from an input image sequence, method for reconstructing an input image sequence from an output image sequence, associated devices, server equipment, client equipment and computer programs
KR20150010903A (en) Method And Apparatus For Generating 3K Resolution Display Image for Mobile Terminal screen
EP4365820A1 (en) Video super-resolution network, and video super-resolution, encoding and decoding processing method and device
Lee et al. DPICT: Deep progressive image compression using trit-planes
KR20060027778A (en) Method and apparatus for encoding/decoding video signal using base layer
KR102602690B1 (en) Method and apparatus for adaptive encoding and decoding based on image quality
Göring et al. Modular framework and instances of pixel-based video quality models for UHD-1/4K
JPWO2018131524A1 (en) Image processing apparatus and image processing method
Rao et al. Avqbits—adaptive video quality model based on bitstream information for various video applications
Shahid et al. Subjective quality assessment of H. 264/AVC encoded low resolution videos
CN116074528A (en) Video coding method and device, and coding information scheduling method and device
US20190007685A1 (en) Devices and method for video encoding and reconstruction
Topiwala et al. Deep learning techniques in video coding and quality analysis
Akramullah et al. Video quality metrics
Herrou et al. Quality-driven variable frame-rate for green video coding in broadcast applications
US20060067410A1 (en) Method for encoding and decoding video signals
US11297353B2 (en) No-reference banding artefact predictor
Shahid Methods for objective and subjective video quality assessment and for speech enhancement
CN110636295B (en) Video encoding and decoding method and device, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination