CN101299819B

CN101299819B - Three-dimensional wavelet sub-band sorting and code stream packing method for scalable video coding

Info

Publication number: CN101299819B
Application number: CN 200810104941
Authority: CN
Inventors: 戴琼海; 彭义刚; 肖红江; 杨敬钰
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2008-04-25
Filing date: 2008-04-25
Publication date: 2010-04-14
Anticipated expiration: 2028-04-25
Also published as: CN101299819A

Abstract

The invention discloses a three-dimensional wavelet sub-band sorting and code stream packing method for scalable video coding. The temporal wavelet sub-bands are sorted according to the requirement of sequential decoding and divided into different levels; the wavelet sub-bands within the same temporal level are sorted by the magnitude of their transmission-distortion MSE value; the sorted code stream is then packed and transmitted to a receiving end. If the receiving end detects a lost packet before the decoding deadline is reached, the packet is retransmitted, with the number of retransmissions less than or equal to the maximum number of retransmissions; once the decoding deadline is reached, lost packets are not retransmitted. The method provides effective rate and transmission control, thereby providing accurate code stream allocation for video coding and transmission, together with a high-performance, three-dimensionally scalable way of organizing and transmitting the code stream that adapts to network heterogeneity, bandwidth fluctuation, delay variation, and the diversity of user receiving terminals.

Description

Three-dimensional wavelet sub-band sorting and code stream packaging method in scalable video coding

Technical Field

The invention belongs to the field of multimedia communication, and particularly relates to a three-dimensional wavelet sub-band sequencing and code stream packaging method.

Background

With the rapid development of internet technology, multimedia applications involving images, video, sound and other content are becoming increasingly popular. However, because the internet is inherently heterogeneous, different networks have different channel characteristics (e.g., different channel bandwidths, delays, and jitter). The terminal devices used by users also vary widely, and their display and processing capabilities differ markedly. As a result, the video quality obtained by users on different terminals over different networks differs, and scalable coding is needed to address this. A code stream produced by scalable coding is itself scalable: the video signal is coded once at the highest quality, and a part of the code stream can then be extracted to obtain a lower-rate code stream, which satisfies the constraints of network bandwidth and terminal processing capacity and reduces the complexity of the server. Because of this adaptability, scalable video coding is one of the most active areas in the video coding community.

For video coding, scalable coding aims to provide three kinds of scalability: quality scalability, spatial scalability, and temporal scalability. Quality scalability means that reconstructing from part of the bits gives a blurred video picture, and the reconstructed picture becomes clearer as more bits are used. Spatial scalability means that the code stream can provide pictures at different spatial resolutions. Temporal scalability means that the code stream can provide different temporal resolutions, i.e. different frame rates. Current scalable video coding methods broadly include layered video coding, fine-granularity scalable video coding, wavelet scalable video coding, and the scalable extension of MPEG-4 AVC/H.264. Among these, wavelet scalable coding is an effective tool that produces a scalable code stream while maintaining high coding efficiency. Its efficient representation and embedded coding provide great flexibility for adaptive spatial, temporal, and quality scalability. One such scheme is t+2D three-dimensional wavelet transform coding with motion compensation: a motion-compensated temporal wavelet transform is applied to the video sequence first, then a two-dimensional spatial wavelet transform, and the result is finally coded with EZBC. The different wavelet sub-bands produced by this t+2D transform coding affect the quality of the reconstructed video sequence differently. The overall flow of the coding method is shown in fig. 1 and comprises the following steps:

1) dividing the video frames into groups of pictures and performing MCTF on them: a video sequence comprising N frames is first divided into groups of pictures (GOPs) of 2^J frames each; a J-level motion-compensated temporal wavelet transform (motion-compensated temporal filtering, MCTF) is then performed on each GOP using the lifting scheme of the wavelet transform, yielding the different temporal wavelet sub-bands, namely the temporal high-frequency wavelet sub-bands and the temporal low-frequency wavelet sub-bands of each level;

2) performing the spatial wavelet transform on the temporal wavelet sub-bands to obtain the wavelet sub-bands: an M-level spatial wavelet transform is performed at full spatial resolution on each temporal wavelet sub-band of each GOP, transforming each temporal wavelet sub-band into 3M + 1 spatial wavelet sub-bands of different spatial resolutions; each GOP is thus transformed into 2^J × (3M + 1) wavelet sub-bands;

3) EZBC encoding of the wavelet sub-bands: the transformed wavelet sub-bands are encoded by embedded zero-block coding with context modeling (EZBC). Because the code tables for different quadtree levels and different sub-bands are built independently in this method, each wavelet sub-band corresponds to an independent code stream.

Both the MCTF of step 1) and the spatial wavelet transform of step 2) can be implemented with the lifting scheme of the wavelet transform. The lifting scheme comprises three steps: split, predict, and update. The split step divides the original signal into two subsets according to whether the sample index is odd or even: an even-indexed subset and an odd-indexed subset. The predict step uses the correlation between the two subsets to predict one from the other, for example predicting the even-indexed subset from the odd-indexed subset, which yields the detail signal; the detail signal is stored in place of the even-indexed subset. The update step uses the detail signal (now held in the even-indexed subset) to update the odd-indexed subset, which yields the approximation signal; the approximation signal is stored in place of the odd-indexed subset. The lifting scheme thus decomposes the wavelet transform into a few very simple basic steps, each of which is easy to invert. Reconstruction runs the transform backwards and likewise has three steps: inverse update, inverse predict, and merge.

The lifting scheme of the wavelet transform has the following advantages: it can realize integer wavelet transforms; it can handle images of any size; the transform can be computed in place without allocating extra memory, which facilitates chip-based implementation; and the algorithm is simple, well suited to parallel processing, and fast.
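The split, predict, and update steps can be illustrated with a minimal sketch. It assumes the simplest possible filter, an integer Haar-style lifting step, which the text does not prescribe; the function names are hypothetical. Following the example in the text, the even-indexed samples are predicted from the odd-indexed ones, so the detail signal replaces the even subset and the approximation replaces the odd subset.

```python
def lift_forward(x):
    """One level of integer lifting (assumed Haar-style filter).

    Returns (approx, detail); both lists contain integers only.
    """
    if len(x) % 2:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]                       # split
    detail = [e - o for e, o in zip(even, odd)]         # predict even from odd
    approx = [o + (d >> 1) for o, d in zip(odd, detail)]  # update odd with detail
    return approx, detail


def lift_inverse(approx, detail):
    """Undo lift_forward: inverse update, inverse predict, then merge."""
    odd = [s - (d >> 1) for s, d in zip(approx, detail)]  # inverse update
    even = [d + o for d, o in zip(detail, odd)]           # inverse predict
    x = [0] * (2 * len(approx))
    x[0::2], x[1::2] = even, odd                          # merge
    return x


if __name__ == "__main__":
    signal = [5, 3, 8, 8, 1, 4, 7, 0]
    a, d = lift_forward(signal)
    print(a, d)
    print(lift_inverse(a, d) == signal)
```

Because every step only adds or subtracts an integer function of the other subset, the inverse recovers the input exactly, which is the integer-transform property listed among the advantages.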

The video coding method based on the three-dimensional wavelet transform described above provides great flexibility for adaptive spatial, temporal, and quality scalability. The coded video code stream should be embedded, so that it can be truncated according to the structure of the video transmission network, the network bandwidth, and the requirements of the user's receiving terminal, while still delivering the best video quality possible. However, most current code stream organization schemes only support embedding of the intra-frame code stream. They do not consider embedded organization of the inter-frame code stream, the different importance of each wavelet sub-band to the quality of the reconstructed video frames, or the effect of transmission delay. Consequently, when the whole video code stream is organized and transmitted, a globally optimal allocation of the code stream under constrained network conditions cannot be guaranteed.

In view of the above-mentioned drawbacks and deficiencies of the prior art in the background art, the present invention provides a scalable video coding three-dimensional wavelet subband sorting and code stream packing method to provide an effective video coding transmission mode, thereby providing accurate code stream allocation for video coding and transmission, and providing a high-performance three-dimensional (temporal, spatial, quality) scalable code stream organization form adaptive to the heterogeneity of video transmission network, network bandwidth fluctuation and delay variation, and diversity of user receiving terminals.

Before the method provided by the invention is carried out, t+2D three-dimensional wavelet transform coding with motion compensation is first performed on the video sequence, as follows: a J-level MCTF is applied to each GOP of 2^J frames; an M-level spatial wavelet transform is applied at full spatial resolution to each temporal wavelet sub-band, so that each temporal wavelet sub-band is transformed into 3M + 1 spatial wavelet sub-bands of different spatial resolutions; each GOP is thus transformed into 2^J × (3M + 1) wavelet sub-bands. The three-dimensional wavelet sub-band sorting and code stream packing method for scalable video coding provided by the invention is then carried out.
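The sub-band count follows from the fact that a J-level MCTF leaves one lowest-frequency sub-band plus 2^(J-j) high-frequency sub-bands at each level j, i.e. 1 + (2^J - 1) = 2^J temporal sub-bands in total. The short sketch below enumerates them; the function names and the string labels are illustrative, and the zero-based sub-band index follows the embodiment's notation (H_0^1 to H_3^1).

```python
def temporal_subbands(J):
    """Labels of the temporal sub-bands left after a J-level MCTF of one GOP."""
    names = [f"L_0^{J}"]                      # the single lowest-frequency sub-band
    for j in range(J, 0, -1):                 # level J down to level 1
        names += [f"H_{i}^{j}" for i in range(2 ** (J - j))]
    return names


def total_subbands(J, M):
    """Each temporal sub-band splits into 3M + 1 spatial sub-bands."""
    return len(temporal_subbands(J)) * (3 * M + 1)


if __name__ == "__main__":
    print(temporal_subbands(3))
    print(total_subbands(3, 3))   # 8 temporal sub-bands x 10 spatial sub-bands
```

For the 8-frame GOP with 3 temporal and 3 spatial levels used in the embodiment, this gives 8 × 10 = 80 wavelet sub-bands.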

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing a three-dimensional wavelet sub-band sorting and code stream packing method for scalable video coding. The method takes into account both the embedding of the code stream within a video frame and the embedding of the code stream between frames, together with the effect of transmission delay and the different importance of each wavelet sub-band to the quality of the reconstructed video frames. More important sub-bands are transmitted first and less important sub-bands later, which ensures that the receiving end receives the most important information. The resulting code stream adapts well to the network and achieves the best video reconstruction quality possible under constrained network conditions.

The significant characteristic of the invention is that after EZBC coding is completed by adopting the t +2D three-dimensional wavelet transform coding method in the prior art, according to the characteristic that each wavelet sub-band after the three-dimensional wavelet transform containing motion compensation has different importance to the quality of the reconstructed video frame, the limitation of the conditions such as channel delay, jitter, bandwidth and the like in actual transmission is considered, the sub-band with larger importance is transmitted in advance, and the sub-band with smaller importance is transmitted later, thereby ensuring that the receiving end receives the most important information.

Another feature of the invention is that it combines the scalable wavelet transform of the video and embedded coding with scalable transmission, so that the whole code stream adapts well to the network and achieves the best video reconstruction quality possible under constrained network conditions.

The third characteristic of the invention is that the proposed wavelet sub-band sorting and code stream packing method has lower computational complexity and is easy to implement. Firstly, the motion compensation based time domain wavelet transform and the space domain wavelet transform can be realized by adopting a lifting scheme of wavelet transform, and the method has the advantages of small occupied memory, low operation complexity and the like. Secondly, the code stream packaging method and the maximum retransmission times are simple to calculate.

Drawings

Fig. 1 shows a t +2D three-dimensional wavelet transform coding scheme with motion compensation in existing scalable video coding.

FIG. 2 is a flow chart of a three-dimensional wavelet subband sorting and code stream packing method for scalable video coding according to the present invention.

Figure 3 is a schematic diagram of 3-level MCTF and 3-level spatial wavelet transform for an 8-frame sized GOP in one embodiment of the present invention.

FIG. 4 is a diagram illustrating GOP decoding timing, wavelet subband ordering, and code stream packet packing for 8 frames in one embodiment of the present invention.

Detailed Description

The general flow of the invention is shown in fig. 2. It comprises two parts: sorting the wavelet sub-bands of the code streams after EZBC coding, and packing and transmitting the sorted code streams. The specific scheme is as follows:

1) the sorting of the wavelet sub-bands specifically comprises the following steps:

11) according to the principle of sequential decoding when a reconstructed video sequence is decoded, frames 1, 2 and 3 … … of the reconstructed video are decoded in sequence, time domain wavelet sub-bands are sequenced, according to the relationship between the reconstructed video frame and different time domain wavelet sub-bands, the time domain wavelet sub-band related to the current frame of the reconstructed video sequence is transmitted first, and the time domain wavelet sub-band related to the next frame is transmitted later; thus, the time domain wavelet sub-bands are classified into classes with different transmission priorities; the specific process is as follows:

A_i^0 denotes a video frame, where the subscript i is the frame's sequence number and the superscript 0 indicates an original video frame before transformation; the original video frame sequence is A_0^0, A_1^0, A_2^0, A_3^0, A_4^0, A_5^0, A_6^0, A_7^0, ...; L_i^j and H_i^j denote, respectively, the temporal low-frequency and temporal high-frequency wavelet sub-bands after motion-compensated temporal filtering (MCTF); the transform leaves only one temporal lowest-frequency wavelet sub-band L_0^J and multiple temporal high-frequency wavelet sub-bands H_i^j, where the superscript j indicates that the sub-band was obtained by the j-th level of the transform, j ∈ [1, J] ∩ Z⁺, J is the total number of transform levels, and the subscript i is the sequence number of the wavelet sub-band; the j-th level of the transform yields 2^(J-j) temporal high-frequency wavelet sub-bands in total, indexed by i; among the lowest-frequency wavelet sub-band L_0^J and all the 2^(J-j) (j ∈ [1, J] ∩ Z⁺) temporal high-frequency wavelet sub-bands, those related to decoding the original video frame A_0^0, namely the temporal low-frequency wavelet sub-band L_0^J and the temporal high-frequency wavelet sub-bands H_0^J, H_0^(J-1), ..., H_0^1, are assigned level 1; by the same principle, the remaining temporal high-frequency wavelet sub-bands are divided in turn into levels 2, 3, ..., p, where p ≥ 2;

12) sorting the spatial wavelet sub-bands within the same level by the magnitude of their transmission-distortion MSE value, so that sub-bands with a large transmission-distortion MSE value are transmitted first and sub-bands with a small value are transmitted later, thereby ensuring that the receiving end receives the most important information; the specific process is as follows:

after the J-level MCTF and the M-level spatial wavelet transform, the video frames are decomposed into wavelet sub-bands S_i, i ∈ [1, n] ∩ Z⁺, where Z⁺ is the set of positive integers and n = 2^J × (3M + 1). The effect of each wavelet sub-band on the quality of the reconstructed video frames is measured by the mean squared error (MSE) between s and ŝ, where s is the original sequence of video frames and ŝ is the sequence of video frames reconstructed at the decoding side. If wavelet sub-band S_j is lost during transmission, its transmission-distortion MSE value is:

D(\bigcup_{i \neq j} S_i), i ∈ [1, n] ∩ Z⁺

This expression is the distortion, relative to the original video frames, of the video frames reconstructed from the remaining wavelet sub-bands when wavelet sub-band S_j is lost. The larger the MSE value, the more important the wavelet sub-band is for reconstructing the video frames, and the higher its transmission level. Sorting the wavelet sub-bands by the magnitude of their MSE values gives:

S_{r_0}, S_{r_1}, S_{r_2}, S_{r_3}, S_{r_4}, ..., where r_0, r_1, r_2, r_3, r_4, ... ∈ [1, n] ∩ Z⁺

The code stream lengths corresponding to these wavelet sub-bands are, respectively:

l_{r_0}, l_{r_1}, l_{r_2}, l_{r_3}, l_{r_4}, ...;

The subsequent packing operation is carried out subject to these code stream lengths and the maximum packet length.
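The ranking rule of step 12) can be sketched as a drop-one experiment: remove each sub-band in turn, reconstruct from the rest, and sort by the resulting MSE in descending order. The sketch below is only an illustration under a strong simplifying assumption: the "decoder" is a hypothetical additive toy in which the frame is the element-wise sum of its sub-band contributions, standing in for the real inverse wavelet transform. All names are illustrative.

```python
def mse(a, b):
    """Mean squared error between two equal-length sample sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rank_subbands(original, subbands, reconstruct):
    """Rank sub-band ids by the distortion caused when each one is lost.

    subbands:    dict mapping sub-band id -> sub-band data
    reconstruct: callable taking a dict of received sub-bands and
                 returning the reconstructed samples (assumed decoder)
    """
    loss = {}
    for j in subbands:
        kept = {i: v for i, v in subbands.items() if i != j}  # S_j is lost
        loss[j] = mse(original, reconstruct(kept))
    order = sorted(subbands, key=lambda j: loss[j], reverse=True)
    return order, loss


def toy_reconstruct(kept):
    """Toy decoder: sum the contributions of the received sub-bands."""
    return [sum(col) for col in zip(*kept.values())]


if __name__ == "__main__":
    bands = {1: [4, 4, 4, 4], 2: [1, -1, 1, -1], 3: [0, 2, 0, -2]}
    frame = toy_reconstruct(bands)
    print(rank_subbands(frame, bands, toy_reconstruct))
```

In this toy case the sub-band whose loss hurts most (id 1) is ranked first, so it would be placed earliest in the code stream.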

2) The sequenced code streams are packaged and transmitted, and the method specifically comprises the following steps:

21) packing the sorted code streams in order into packets L_j, where j is the sequence number of the packet; the maximum packet length is L, which is set to the length of a network IP (Internet Protocol) packet; because different wavelet sub-bands affect the quality of the reconstructed video differently, i.e. have different importance, the more important sub-bands are transmitted first and the less important sub-bands later, which ensures that the receiving end receives the most important information.
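A minimal packing sketch follows. It assumes, as one plausible reading of the embodiment in fig. 4(c), that the sorted sub-band code streams of one level are concatenated and cut into chunks of at most L bytes, and that a packet never spans two levels. The function names and the byte lengths in the example are invented for illustration.

```python
def pack_level(streams, max_len):
    """Concatenate the sorted code streams of one level and cut into packets."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    payload = b"".join(streams)
    return [payload[k:k + max_len] for k in range(0, len(payload), max_len)]


def pack_gop(levels, max_len):
    """levels: list (level 1 first) of lists of sorted sub-band code streams.

    Returns a list of (level_number, packet_bytes) in transmission order.
    """
    packets = []
    for level_no, streams in enumerate(levels, start=1):
        for chunk in pack_level(streams, max_len):
            packets.append((level_no, chunk))
    return packets


if __name__ == "__main__":
    levels = [
        [b"a" * 1000, b"b" * 1500, b"c" * 900],   # level 1, most important first
        [b"d" * 700],                             # level 2
        [b"e" * 400, b"f" * 300],                 # level 3
        [b"g" * 1200],                            # level 4
    ]
    for level_no, pkt in pack_gop(levels, max_len=1500):
        print(level_no, len(pkt))
```

Keeping packets inside one level means each packet inherits a single decoding deadline, which simplifies the retransmission decision made later.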

22) Calculating the maximum number of retransmissions M_j of packet L_j according to the delay of the network, specifically: if packet L_j is lost, the total distortion of the reconstructed video frame is D_j^L; if packet L_j is received, the remaining distortion, caused by quantization at encoding, is D_j^Q; only the distortion D_j^C caused by channel transmission is considered. The number of packets in a group of pictures (GOP) is N_P; the number of packets corresponding to each level is N_P^k, 1 ≤ k ≤ K, where K is the number of temporal wavelet sub-band levels. The maximum number of retransmissions M_j of packet L_j is obtained from the following constrained problem:

s.t. \sum_{s=1}^{k} \sum_{j=1}^{N_P^s} Time_j \le Deadline_k, 1 \le k \le N_P

where D_GOP is the total transmission distortion of a GOP, the transmission time of the wavelet sub-bands in the k-th level is bounded above, B is the channel bandwidth, L_j is the length of the j-th packet, M_j is the maximum number of retransmissions of packet L_j, Time_j is the transmission time of packet L_j, Deadline_k is the total decoding deadline of the wavelet sub-bands of the first k levels, P_Loss,j is the probability that packet L_j is lost in each transmission, and P_ARQ,j is the probability that packet L_j is lost in a retransmission.

23) Transmitting the packed code stream to the receiving end; if the receiving end detects the loss of packet L_j before the decoding deadline is reached, the transmitting end is asked to retransmit packet L_j, with the number of retransmissions less than or equal to the maximum number of retransmissions M_j; once the receiving end reaches the decoding deadline, the lost packet L_j is not retransmitted.
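The retransmission rule of step 23) can be sketched as a simple send loop. This is a simplified model, not the patent's procedure: it assumes a single sender-side clock, a known per-transmission time for each packet, cumulative per-level deadlines, and a hypothetical `send` callback that reports whether a transmission was received. All names are illustrative.

```python
def deliver(packets, max_retx, send_time, deadlines, send):
    """Send packets in order with bounded retransmission before each deadline.

    packets:   list of (packet_id, level) in transmission order
    max_retx:  dict packet_id -> maximum number of retransmissions M_j
    send_time: dict packet_id -> time taken by one transmission
    deadlines: dict level -> decoding deadline (measured from time 0)
    send:      callable(packet_id) -> True if the packet was received
    """
    clock = 0.0
    delivered = []
    for pid, level in packets:
        attempts = 0
        # One initial transmission plus at most M_j retransmissions,
        # and never start a transmission that would end after the deadline.
        while attempts <= max_retx[pid] and clock + send_time[pid] <= deadlines[level]:
            clock += send_time[pid]
            attempts += 1
            if send(pid):
                delivered.append(pid)
                break
    return delivered, clock


if __name__ == "__main__":
    outcomes = iter([False, True, False, False, True])   # scripted channel
    result = deliver(
        packets=[("p1", 1), ("p2", 1), ("p3", 2)],
        max_retx={"p1": 2, "p2": 2, "p3": 1},
        send_time={"p1": 1.0, "p2": 1.0, "p3": 1.0},
        deadlines={1: 3.0, 2: 5.0},
        send=lambda pid: next(outcomes),
    )
    print(result)
```

In the scripted run, "p2" is dropped because its level's deadline expires before a retransmission fits, even though its retransmission budget is not exhausted.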

Example of the implementation

An embodiment of the present invention is given here, but the present invention is not limited to this embodiment only. A flow chart of an embodiment of the present invention is shown in fig. 2. This embodiment describes the implementation steps with an 8-frame GOP.

The method is based on the existing three-dimensional wavelet transform and coding method with motion compensation.

A GOP of size 8 is subjected to 3-level MCTF and 3-level spatial wavelet transform as shown in figure 3. In fig. 3, curved arrows represent motion compensated temporal wavelet transforms between adjacent frames, and straight solid and dashed arrows represent temporal low and high frequency wavelet subbands formed by MCTF, respectively, with different gray values used by the blocks to distinguish the different wavelet subband blocks. The specific transformation process is as follows:

The frames in a GOP of the original video sequence are denoted A_0^0, A_1^0, A_2^0, A_3^0, A_4^0, A_5^0, A_6^0, A_7^0. After MCTF they are transformed into the first-level temporal low-frequency wavelet sub-bands L_0^1, L_1^1, L_2^1, L_3^1 and the first-level temporal high-frequency wavelet sub-bands H_0^1, H_1^1, H_2^1, H_3^1. A second-level MCTF is performed on the first-level temporal low-frequency wavelet sub-bands L_0^1, L_1^1, L_2^1, L_3^1, forming the second-level temporal low-frequency wavelet sub-bands L_0^2, L_1^2 and the second-level temporal high-frequency wavelet sub-bands H_0^2, H_1^2. A third-level MCTF is performed on the second-level temporal low-frequency wavelet sub-bands L_0^2, L_1^2, forming the third-level temporal low-frequency wavelet sub-band L_0^3 and the third-level temporal high-frequency wavelet sub-band H_0^3. A spatial wavelet transform is then applied to each temporal wavelet sub-band obtained by MCTF to obtain the spatial wavelet sub-bands, and all wavelet sub-bands are EZBC-coded to obtain the code stream corresponding to each wavelet sub-band.

After the steps, the implementation steps of the invention are carried out, which comprise:

1) sorting and code stream packaging of wavelet sub-band

First, the temporal wavelet sub-bands are sorted according to the requirements of sequential decoding and decoding delay. The temporal wavelet sub-bands produced by MCTF are sorted following the sequential decoding of frames A_0^0, A_1^0, A_2^0, A_3^0, A_4^0, A_5^0, A_6^0, A_7^0. Decoding frame A_0^0 uses the temporal wavelet sub-bands L_0^3, H_0^3, H_0^2, H_0^1; decoding frame A_1^0 additionally uses the temporal wavelet sub-band H_1^1; decoding frame A_2^0 additionally uses the temporal wavelet sub-bands H_1^2, H_2^1; decoding frame A_3^0 additionally uses the temporal wavelet sub-band H_3^1. The subsequent frames are decoded in turn in the same way. On this principle, the temporal wavelet sub-bands are divided into 4 transmission levels, i.e. the transmission order of the temporal wavelet sub-bands is:

L_0^3, H_0^3, H_0^2, H_0^1 | H_1^1 | H_1^2, H_2^1 | H_3^1 |
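The level assignment above can be reproduced mechanically: walk the frames in decoding order and open a new level each time a frame needs sub-bands that have not yet been scheduled. The sketch below takes the per-frame sub-band lists of this embodiment as input data; the function name and the string labels are illustrative, and no general rule for other GOP sizes is implied.

```python
def transmission_levels(needed_by_frame):
    """Group temporal sub-bands into levels by the first frame that needs them.

    needed_by_frame: dict frame_index -> list of sub-band labels used when
                     decoding that frame (labels already scheduled are skipped)
    """
    scheduled = set()
    levels = []
    for frame in sorted(needed_by_frame):
        new = [sb for sb in needed_by_frame[frame] if sb not in scheduled]
        if new:                       # a frame with nothing new opens no level
            levels.append(new)
            scheduled.update(new)
    return levels


if __name__ == "__main__":
    embodiment = {
        0: ["L_0^3", "H_0^3", "H_0^2", "H_0^1"],
        1: ["H_1^1"],
        2: ["H_1^2", "H_2^1"],
        3: ["H_3^1"],
    }
    for number, level in enumerate(transmission_levels(embodiment), start=1):
        print(number, level)
```

The four printed levels correspond to the four groups separated by vertical bars in the transmission order.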

Limited by network conditions, the transmission of the temporal wavelet sub-bands of each level has its own decoding deadline. The decoding deadlines corresponding to the temporal wavelet sub-bands of levels 1, 2, 3 and 4 are set to Deadline_A, Deadline_B, Deadline_C and Deadline_D, respectively.

The transmission ordering of the temporal sub-bands of the 8-frame GOP and the decoding deadlines are shown in fig. 4(a); the spatial wavelet sub-bands within the same level are then sorted, as shown in fig. 4(b); finally, the coded code stream is packed according to the sorting result of the wavelet sub-bands, with the maximum packet length set to L, as shown in fig. 4(c). Fig. 4(a) shows the result of ordering the temporal high-frequency and temporal low-frequency wavelet sub-bands formed by MCTF, together with the corresponding decoding deadlines; the order is L_0^3, H_0^3, H_0^2, H_0^1 | H_1^1 | H_1^2, H_2^1 | H_3^1. In the figure, blocks with different gray levels represent different wavelet sub-bands, the vertical black dotted lines divide the temporal wavelet sub-bands from left to right into the four levels 1, 2, 3 and 4, and the decoding deadlines of the levels are Deadline_A, Deadline_B, Deadline_C and Deadline_D, respectively. Fig. 4(b) shows the result of ordering the spatial wavelet sub-bands within the same level; the higher a wavelet sub-band is drawn, the earlier it ranks. Fig. 4(c) shows the code stream formed after packing: the level-1 wavelet sub-band code stream is packed into 3 packets, and the code streams of levels 2, 3 and 4 are each packed into 1 packet.

2) Code stream transmission

For the packaged packet, an optimal transmission strategy is determined according to actual network conditions. Consider the following optimization problem:

where N_P^k (1 ≤ k ≤ 4) is the number of packets corresponding to each of levels 1, 2, 3, 4; D_GOP is the total transmission distortion of a GOP; D_j^C is the distortion caused by channel transmission; Time_A, Time_B, Time_C, Time_D are the transmission times of the wavelet sub-bands in levels 1, 2, 3, 4, respectively; Deadline_A, Deadline_B, Deadline_C, Deadline_D are the decoding deadlines of the respective levels; L_j is the length of the j-th packet; M_j is the maximum number of retransmissions of packet L_j; Time_j is the transmission time of packet L_j; P_Loss,j is the probability that packet L_j is lost in each transmission; and P_ARQ,j is the probability that packet L_j is lost in a retransmission. Solving the above optimization problem gives the maximum number of retransmissions M_j corresponding to each packet L_j. During actual transmission, subject to the delay constraint, a lost packet is retransmitted whenever the loss is detected before the decoding deadline is reached; if the bandwidth is low and the receiving end reaches the decoding deadline, no retransmission is performed. The receiving end is thus guaranteed to receive the most important information.
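To make the shape of such a problem concrete, the sketch below searches exhaustively for retransmission limits on a tiny instance. The cost model is an assumed stand-in, not the patent's formula: the expected residual distortion of packet j is taken as D_j^C · P_Loss,j · P_ARQ,j^(M_j), the worst-case transmission time as (1 + M_j) · L_j / B, and the deadlines as cumulative per level. All function and field names are hypothetical.

```python
from itertools import product


def choose_retransmissions(packets, deadlines, bandwidth, max_m=3):
    """Brute-force the retransmission limits M_j for a small packet set.

    packets:   list of dicts with keys level, length, d_c, p_loss, p_arq
    deadlines: dict level -> cumulative decoding deadline
    Returns (tuple of M_j, expected distortion) under the assumed model.
    """
    best, best_cost = None, float("inf")
    for ms in product(range(max_m + 1), repeat=len(packets)):
        # Worst-case time spent per level if every retransmission is used.
        per_level = {}
        for p, m in zip(packets, ms):
            t = (1 + m) * p["length"] / bandwidth
            per_level[p["level"]] = per_level.get(p["level"], 0.0) + t
        elapsed, feasible = 0.0, True
        for level in sorted(deadlines):
            elapsed += per_level.get(level, 0.0)
            if elapsed > deadlines[level]:
                feasible = False
                break
        if not feasible:
            continue
        # Assumed expected residual channel distortion.
        cost = sum(p["d_c"] * p["p_loss"] * p["p_arq"] ** m
                   for p, m in zip(packets, ms))
        if cost < best_cost:
            best, best_cost = ms, cost
    return best, best_cost


if __name__ == "__main__":
    pkts = [
        {"level": 1, "length": 1000, "d_c": 100.0, "p_loss": 0.1, "p_arq": 0.1},
        {"level": 2, "length": 1000, "d_c": 1.0, "p_loss": 0.1, "p_arq": 0.1},
    ]
    print(choose_retransmissions(pkts, {1: 3.0, 2: 4.0}, bandwidth=1000.0))
```

In this toy instance the whole time budget goes to the level-1 packet, which mirrors the intent that the most important information gets the most protection. Exhaustive search is only practical for a handful of packets; it is used here purely for clarity.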

Claims

1. A three-dimensional wavelet sub-band sequencing and code stream packaging method in scalable video coding is characterized by comprising the steps of sequencing wavelet sub-bands of code streams after EZBC coding and packaging and transmitting the sequenced code streams;

first, t+2D three-dimensional wavelet transform coding with motion compensation is performed on a video sequence, specifically: a J-level MCTF is applied to each GOP of 2^J frames; an M-level spatial wavelet transform is applied at full spatial resolution to each temporal wavelet sub-band, so that each temporal wavelet sub-band is transformed into 3M + 1 spatial wavelet sub-bands of different spatial resolutions; each GOP is thus transformed into 2^J × (3M + 1) wavelet sub-bands;

11) according to the principle of sequential decoding when a reconstructed video sequence is decoded, frames of an original video are decoded in sequence, time domain wavelet sub-bands are sorted, the time domain wavelet sub-bands are divided into levels with different transmission priorities, according to the relationship between the reconstructed video sequence and the different time domain sub-bands, the priority level of the time domain wavelet sub-band related to the current frame of the reconstructed video sequence is high, and the priority level of the time domain wavelet sub-band related to the next frame is low;

12) the spatial wavelet sub-bands within the same level are sorted by the magnitude of the transmission-distortion MSE (mean squared error) value, obtained by comparing with the original video frame the video frame reconstructed from the wavelet sub-bands other than the spatial wavelet sub-band in question, so that sub-bands with a large transmission-distortion MSE value are transmitted first and sub-bands with a small value are transmitted later, thereby ensuring that the receiving end receives the most important information;

21) packing the sorted code streams in order into packets L_j, where j is the sequence number of the packet; the maximum packet length is L, which is set to the length of a network IP (Internet Protocol) packet;

22) calculating the maximum number of retransmissions M_j of packet L_j according to the delay of the network, M_j being a natural number, specifically: if packet L_j is lost, the total distortion of the reconstructed video frame is D_j^L; if packet L_j is received, the remaining distortion, caused by quantization at encoding, is D_j^Q; only the distortion D_j^C caused by channel transmission is considered; the number of packets in a group of pictures (GOP) is N_P; the number of packets corresponding to each level is N_P^k, 1 ≤ k ≤ K, where K is the number of temporal wavelet sub-band levels; the maximum number of retransmissions M_j of packet L_j is obtained by the following formula:

where D_GOP is the total transmission distortion of a GOP, the transmission time of the wavelet sub-bands in the k-th level is bounded above, B is the channel bandwidth, L_j is the length of the j-th packet, M_j is the maximum number of retransmissions of packet L_j, Time_j is the transmission time of packet L_j, Deadline_k is the total decoding deadline of the wavelet sub-bands of the first k levels, P_Loss,j is the probability that packet L_j is lost in each transmission, and P_ARQ,j is the probability that packet L_j is lost in a retransmission;

2. The method of claim 1, wherein said step 11) of dividing the time domain wavelet sub-bands into classes having different transmission priorities comprises:

A_i^0 denotes a video frame, where the subscript i is the frame's sequence number and the superscript 0 indicates an original video frame before transformation; the original video frame sequence is A_0^0, A_1^0, A_2^0, A_3^0, A_4^0, A_5^0, A_6^0, A_7^0, ...; L_i^j and H_i^j denote, respectively, the temporal low-frequency and temporal high-frequency wavelet sub-bands after motion-compensated temporal filtering (MCTF); the transform leaves only one temporal lowest-frequency wavelet sub-band L_0^J and multiple temporal high-frequency wavelet sub-bands H_i^j, where the superscript j indicates that the sub-band was obtained by the j-th level of the transform, j ∈ [1, J] ∩ Z⁺, J is the total number of transform levels, and the subscript i is the sequence number of the wavelet sub-band; the j-th level of the transform yields 2^(J-j) temporal high-frequency wavelet sub-bands in total, indexed by i; among the lowest-frequency wavelet sub-band L_0^J and all the 2^(J-j) (j ∈ [1, J] ∩ Z⁺) temporal high-frequency wavelet sub-bands, those related to decoding the original video frame A_0^0, namely the temporal low-frequency wavelet sub-band L_0^J and the temporal high-frequency wavelet sub-bands H_0^J, H_0^(J-1), ..., H_0^1, are assigned level 1; by the same principle, the remaining temporal high-frequency wavelet sub-bands are divided in turn into levels 2, 3, ..., p, where p ≥ 2;

3. The method according to claim 1, wherein the step 12) of sorting the spatial wavelet sub-bands within the same level according to the MSE values comprises:

where s is the original sequence of video frames,

D(\bigcup_{i \neq j} S_i), i ∈ [1, n] ∩ Z⁺

the above expression represents the distortion, relative to the original video frame, of the video frame reconstructed from the other wavelet sub-bands when wavelet sub-band S_j is lost; the larger the MSE value, the more important the wavelet sub-band is for reconstructing the video frames, and the higher its transmission level; sorting the wavelet sub-bands by the magnitude of their MSE values gives:

r_0 ∈ [1, n] ∩ Z⁺, and r_1, r_2, r_3, r_4 take values in the same range as r_0;

the code stream lengths corresponding to these wavelet sub-bands are, respectively:

l_{r_{0}}, l_{r_{1}}, l_{r_{2}}, l_{r_{3}}, l_{r_{4}} . . . . . .;

in the next packing operation, packing is performed according to the restrictions of the code stream length and the maximum packet length.