CN101299819B - Method for sorting three-dimensional wavelet sub-band and enveloping code flow of telescopic video coding - Google Patents

Publication number
CN101299819B
Authority
CN
China
Prior art keywords
band
wavelet
wavelet sub
mrow
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 200810104941
Other languages
Chinese (zh)
Other versions
CN101299819A (en)
Inventor
戴琼海
彭义刚
肖红江
杨敬钰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN 200810104941 priority Critical patent/CN101299819B/en
Publication of CN101299819A publication Critical patent/CN101299819A/en
Application granted granted Critical
Publication of CN101299819B publication Critical patent/CN101299819B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a three-dimensional wavelet sub-band sorting and code stream packing method for scalable video coding. The time-domain wavelet sub-bands are sorted according to the requirements of sequential decoding and divided into different levels; the wavelet sub-bands within the same level are sorted according to the magnitude of their transmission distortion MSE values; the sorted code stream is then packed and transmitted to a receiving end. If the receiving end detects a lost packet before the decoding time limit is reached, the packet is retransmitted, with the number of retransmissions less than or equal to the maximum number of retransmissions; once the receiving end reaches the decoding time limit, lost packets are not retransmitted. The method provides an effective rate and transmission control method, thereby providing accurate code stream allocation for video coding and transmission, and provides a high-performance, three-dimensionally scalable way of organizing and transmitting the code stream that adapts to network heterogeneity, bandwidth fluctuation, delay variation, and the diversity of user receiving terminals.

Description

Three-dimensional wavelet sub-band sorting and code stream packaging method in scalable video coding
Technical Field
The invention belongs to the field of multimedia communication, and particularly relates to a three-dimensional wavelet sub-band sequencing and code stream packaging method.
Background
With the rapid development of internet technology, multimedia applications involving images, video, sound and other content are becoming increasingly popular. However, because the internet is inherently heterogeneous, different networks have different channel characteristics (e.g., different channel bandwidths, delays, jitter, etc.). Meanwhile, users employ a wide variety of terminal devices whose display and processing capabilities differ markedly. As a result, the video quality obtained by users on different terminals over different networks varies, and scalable coding is needed to address this problem. A code stream obtained by coding video in a scalable manner is itself scalable. A scalable coder encodes the video signal at the highest quality, after which part of the code stream can be extracted to obtain a code stream with a lower rate, so that the constraints of network bandwidth and terminal processing capacity are met and the complexity of the server is reduced. Because of this superior adaptability, scalable video coding is one of the most actively studied areas in the video coding community.
For video coding, scalable coding aims to achieve three kinds of scalability: quality scalability, spatial scalability, and temporal scalability. Quality scalability means that reconstructing from part of the bits yields a blurred video picture, and the reconstructed picture becomes clearer as more bits are used. Spatial scalability means that the code stream can provide pictures of different spatial resolutions. Temporal scalability means that the code stream can provide different temporal resolutions, i.e. different frame rates. Current scalable video coding methods broadly include: layered video coding, fine-granularity scalable video coding, wavelet scalable video coding, and the scalable extension of MPEG AVC/H.264. Among them, wavelet scalable coding is an effective tool that provides a scalable code stream while maintaining high coding efficiency. Its efficient representation capability and embedded coding approach provide great flexibility for adaptive spatial, temporal and quality scalability. A t+2D three-dimensional wavelet transform coding scheme with motion compensation proceeds as follows: a motion-compensated time-domain wavelet transform is first applied to the video sequence, a two-dimensional spatial wavelet transform is applied next, and the result is finally coded with EZBC. The different wavelet sub-bands generated by this t+2D transform coding method have different effects on the quality of the reconstructed video sequence. The overall flow of the video encoding method is shown in Fig. 1. The method specifically comprises the following steps:
1) dividing the video frames into groups of pictures and performing MCTF on each group: a video sequence comprising N frames is first divided into groups of pictures (GOPs) of $2^J$ frames each; J-level motion-compensated temporal filtering (MCTF), i.e. a motion-compensated time-domain wavelet transform, is then performed on each GOP using the lifting scheme of the wavelet transform, yielding the time-domain wavelet sub-bands, namely the time-domain high-frequency and time-domain low-frequency wavelet sub-bands of each level;
2) performing a spatial wavelet transform on the time-domain wavelet sub-bands to obtain the wavelet sub-bands: an M-level spatial wavelet transform is performed at full spatial resolution on each time-domain wavelet sub-band formed in each GOP, transforming each time-domain wavelet sub-band into $3M + 1$ spatial wavelet sub-bands of different spatial resolutions; each GOP is thus transformed into $2^J \times (3M + 1)$ wavelet sub-bands;
3) performing EZBC coding on the wavelet sub-bands: the transformed wavelet sub-bands are coded using embedded zero-block coding with context modeling (EZBC). Because the code tables for the different quad-tree levels and different sub-bands are constructed independently in this method, each wavelet sub-band corresponds to an independent code stream.
Both the MCTF of step 1) and the spatial wavelet transform of step 2) may be implemented with the lifting scheme of the wavelet transform. The lifting scheme comprises three steps: split, predict, and update. The split step divides the original signal into two subsets according to whether the sample index is even or odd: an even-indexed subset and an odd-indexed subset. The predict step exploits the correlation between the two subsets to predict one from the other, for example using the odd-indexed subset to predict the even-indexed subset; the prediction residual is the detail signal, which is stored in place of the even-indexed subset. The update step uses the detail signal (now held in the even-indexed subset) to update the odd-indexed subset, giving the approximation (profile) signal, which is stored in place of the odd-indexed subset. In this way the wavelet transform is decomposed into a few very simple basic steps, each of which is easily inverted. Reconstruction runs the transform in reverse and also comprises three steps: inverse update, inverse prediction, and merging.
The lifting scheme has the following advantages: integer wavelet transforms can be realized; wavelet transforms of images of any size can be realized; the transform can be computed in place without allocating extra memory, which facilitates chip-based implementations; and the algorithm is simple, well suited to parallel processing, and fast.
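The split, predict and update steps described above can be illustrated with a minimal sketch. It assumes an integer Haar-style predictor and an update of half the detail signal; these particular filters, and the function names `lift_forward` and `lift_inverse`, are illustrative choices and are not taken from the patent. The sketch follows the convention in the text: the even-indexed subset is predicted from the odd-indexed subset and becomes the detail signal, and the odd-indexed subset is updated into the approximation signal.

```python
def lift_forward(signal):
    """One lifting level on an even-length list of integers."""
    even, odd = signal[0::2], signal[1::2]                  # split
    detail = [e - o for e, o in zip(even, odd)]             # predict even from odd
    approx = [o + (d >> 1) for o, d in zip(odd, detail)]    # update odd with detail
    return approx, detail


def lift_inverse(approx, detail):
    """Undo lift_forward: inverse update, inverse predict, merge."""
    odd = [a - (d >> 1) for a, d in zip(approx, detail)]    # inverse update
    even = [d + o for d, o in zip(detail, odd)]             # inverse predict
    signal = [0] * (2 * len(odd))
    signal[0::2], signal[1::2] = even, odd                  # merge
    return signal


samples = [5, 3, 8, 8, 2, 7, 0, 1]
approx, detail = lift_forward(samples)
restored = lift_inverse(approx, detail)
```

Because every step only adds or subtracts a function of the other subset, the integer transform is exactly invertible, which is the property the text refers to when it says each step is easy to invert.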
The video coding method based on the three-dimensional wavelet transform described above provides great flexibility for adaptive spatial, temporal and quality scalability. The coded video code stream should be embedded, so that it can be truncated according to the specific structure of the video transmission network, the network bandwidth, and the requirements of the user's video receiving terminal, yielding the best video quality achievable. However, most current code stream organization schemes support embedding only within the intra-frame code stream; they do not consider embedded organization of the inter-frame code stream, the differing importance of each sub-band after the wavelet transform to the quality of the reconstructed video frames, or the effect of delay during transmission. Therefore, when the whole video code stream is organized and transmitted, a globally optimal allocation of the code stream under constrained network conditions cannot be guaranteed.
In view of the above-mentioned drawbacks and deficiencies of the prior art in the background art, the present invention provides a scalable video coding three-dimensional wavelet subband sorting and code stream packing method to provide an effective video coding transmission mode, thereby providing accurate code stream allocation for video coding and transmission, and providing a high-performance three-dimensional (temporal, spatial, quality) scalable code stream organization form adaptive to the heterogeneity of video transmission network, network bandwidth fluctuation and delay variation, and diversity of user receiving terminals.
Before the method provided by the invention is carried out, t+2D three-dimensional wavelet transform coding with motion compensation is first applied to the video sequence, as follows: J-level MCTF is performed on each GOP of $2^J$ frames; an M-level spatial wavelet transform is performed at full spatial resolution on each time-domain wavelet sub-band, so that each time-domain wavelet sub-band is transformed into $3M + 1$ spatial wavelet sub-bands of different spatial resolutions; each GOP is thus transformed into $2^J \times (3M + 1)$ wavelet sub-bands. The three-dimensional wavelet sub-band sorting and code stream packing method for scalable video coding provided by the invention is then carried out.
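As a quick check of the sub-band count, the following sketch enumerates the sub-bands of one GOP. The label strings (`L0^3`, `H1^2`, `LL3`, `HL1` and so on) and the function name `enumerate_subbands` are hypothetical naming conventions introduced here for illustration; only the counts $2^J$ and $3M + 1$ come from the text.

```python
def enumerate_subbands(J, M):
    """List (temporal, spatial) sub-band labels for one GOP of 2**J frames."""
    # One lowest-frequency temporal sub-band plus 2**(J-j) high-frequency
    # sub-bands at each level j gives 2**J temporal sub-bands in total.
    temporal = [f"L0^{J}"] + [
        f"H{i}^{j}" for j in range(J, 0, -1) for i in range(2 ** (J - j))
    ]
    # One coarsest approximation plus three detail orientations per level
    # gives 3*M + 1 spatial sub-bands.
    spatial = [f"LL{M}"] + [
        f"{o}{m}" for m in range(M, 0, -1) for o in ("HL", "LH", "HH")
    ]
    return [(t, s) for t in temporal for s in spatial]


subbands = enumerate_subbands(J=3, M=3)
```

For the 8-frame GOP with three spatial levels used later in the embodiment, this gives $2^3 \times (3 \cdot 3 + 1) = 80$ sub-bands, each with its own EZBC code stream.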
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a method for sorting three-dimensional wavelet sub-bands and packing the code stream in scalable video coding. The method considers the embedded organization of the code stream both within a video frame and between frames, and jointly considers the influence of transmission delay and the differing importance of each wavelet sub-band after the wavelet transform to the quality of the reconstructed video frames: sub-bands of greater importance are transmitted first and sub-bands of lesser importance are transmitted later, thereby ensuring that the receiving end receives the most important information. The whole code stream adapts well to the network, and the best achievable video reconstruction quality is obtained under constrained network conditions.
The principal characteristic of the invention is that, after EZBC coding has been completed using the prior-art t+2D three-dimensional wavelet transform coding method, the differing importance of each wavelet sub-band of the motion-compensated three-dimensional wavelet transform to the quality of the reconstructed video frames is exploited, and the constraints of channel delay, jitter, bandwidth and similar conditions in actual transmission are taken into account: sub-bands of greater importance are transmitted first and sub-bands of lesser importance are transmitted later, thereby ensuring that the receiving end receives the most important information.
A second characteristic of the invention is that the method combines the scalable wavelet transform of the video and embedded coding with the scalability of transmission, so that the whole code stream adapts well to the network and the best achievable video reconstruction quality is obtained under constrained network conditions.
The third characteristic of the invention is that the proposed wavelet sub-band sorting and code stream packing method has lower computational complexity and is easy to implement. Firstly, the motion compensation based time domain wavelet transform and the space domain wavelet transform can be realized by adopting a lifting scheme of wavelet transform, and the method has the advantages of small occupied memory, low operation complexity and the like. Secondly, the code stream packaging method and the maximum retransmission times are simple to calculate.
Drawings
Fig. 1 shows a t +2D three-dimensional wavelet transform coding scheme with motion compensation in existing scalable video coding.
Fig. 2 is a flow chart of the three-dimensional wavelet sub-band sorting and code stream packing method for scalable video coding according to the present invention.
Fig. 3 is a schematic diagram of 3-level MCTF and 3-level spatial wavelet transform for an 8-frame GOP in one embodiment of the present invention.
Fig. 4 is a diagram illustrating the decoding time limits, wavelet sub-band ordering, and code stream packing for an 8-frame GOP in one embodiment of the present invention.
Detailed Description
The general flow of the invention is shown in Fig. 2. It comprises two parts: sorting the wavelet sub-bands of the code stream after EZBC coding, and packing and transmitting the sorted code stream. The specific solution is as follows:
1) the sorting of the wavelet sub-bands specifically comprises the following steps:
11) according to the principle of sequential decoding, in which frames 1, 2, 3, ... of the reconstructed video sequence are decoded in order, the time-domain wavelet sub-bands are sorted: based on the relationship between the reconstructed video frames and the different time-domain wavelet sub-bands, the time-domain wavelet sub-bands related to the current frame of the reconstructed video sequence are transmitted first, and the time-domain wavelet sub-bands related to the next frame are transmitted later; the time-domain wavelet sub-bands are thus classified into levels with different transmission priorities. The specific process is as follows:
Let $A_i^0$ denote a video frame, where the subscript $i$ is the frame's sequence number and the superscript 0 indicates an original video frame before transformation; the original video frame sequence is $A_0^0, A_1^0, A_2^0, A_3^0, A_4^0, A_5^0, A_6^0, A_7^0, \ldots$. Let $L_i^j$ and $H_i^j$ denote, respectively, the time-domain low-frequency and time-domain high-frequency wavelet sub-bands after motion-compensated temporal filtering (MCTF). The transform yields a single time-domain lowest-frequency wavelet sub-band $L_0^J$ and multiple time-domain high-frequency wavelet sub-bands $H_i^j$, where the superscript $j$ indicates that the sub-band was obtained by the $j$-th level of the transform, $j \in [1, J] \cap Z^+$, $J$ is the total number of transform levels, and the subscript $i$ is the sequence number of the sub-band; the $j$-th level of the transform produces $2^{J-j}$ time-domain high-frequency wavelet sub-bands, $i \in \{0, 1, \ldots, 2^{J-j} - 1\}$. Among the lowest-frequency wavelet sub-band $L_0^J$ and all $2^{J-j}$ ($j \in [1, J] \cap Z^+$) time-domain high-frequency wavelet sub-bands, those related to decoding the original video frame $A_0^0$, namely the time-domain low-frequency wavelet sub-band $L_0^J$ and the time-domain high-frequency wavelet sub-bands $H_0^J, H_0^{J-1}, \ldots, H_0^1$, are assigned level 1; by the same principle, the remaining time-domain high-frequency wavelet sub-bands are divided in turn into levels $2, 3, \ldots, p$, where $p \ge 2$;
12) the spatial wavelet sub-bands within the same level are sorted according to the magnitude of their transmission distortion MSE values, so that the sub-band with the larger transmission distortion MSE value is transmitted first and the sub-band with the smaller transmission distortion MSE value is transmitted later, thereby ensuring that the receiving end receives the most important information. The specific process is as follows:
After J-level MCTF and M-level spatial wavelet transform, the video frames are decomposed into wavelet sub-bands $S_i$, $i \in [1, n] \cap Z^+$, where $Z^+$ is the set of positive integers and $n = 2^J \times (3M + 1)$. The effect of each wavelet sub-band on the quality of the reconstructed video frames is measured by the MSE, which is defined as:
Figure G2008101049414D00041
where $s$ is the original sequence of video frames and
Figure G2008101049414D00042
is the sequence of video frames reconstructed at the decoding side. If wavelet sub-band $S_j$ is lost during transmission, the MSE value of the transmission distortion is:
$$D\Big(\bigcup_{i \neq j} S_i\Big), \quad i \in [1, n] \cap Z^+$$
This expression gives the distortion, relative to the original video frames, of the video frames reconstructed from the other wavelet sub-bands when wavelet sub-band $S_j$ is lost. The larger the MSE value, the more important the wavelet sub-band is for reconstructing the video frames, and the higher its transmission priority. Sorting the wavelet sub-bands by the magnitude of their MSE values gives:
$$S_{r_0}, S_{r_1}, S_{r_2}, S_{r_3}, S_{r_4}, \ldots, \quad r_0 \in [1, n] \cap Z^+$$
The code stream lengths corresponding to these wavelet sub-bands are, respectively:
$$l_{r_0}, l_{r_1}, l_{r_2}, l_{r_3}, l_{r_4}, \ldots$$
The subsequent packing operation is carried out subject to these code stream lengths and the maximum packet length.
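The two sorting steps can be sketched as follows. The level rule used here, level $= i \cdot 2^{j-1} + 1$ for sub-band $H_i^j$ with zero-based $i$, is an assumption inferred from the 8-frame example in the embodiment (it reproduces that ordering) and is not stated as a formula in the patent. The function names, the label strings, and the numeric MSE values in the example are hypothetical.

```python
def temporal_levels(J):
    """Group time-domain sub-bands of a 2**J-frame GOP into transmission levels."""
    levels = {1: [f"L0^{J}"]}
    for j in range(J, 0, -1):                 # coarsest temporal level first
        for i in range(2 ** (J - j)):
            level = i * 2 ** (j - 1) + 1      # assumed rule, see lead-in
            levels.setdefault(level, []).append(f"H{i}^{j}")
    return dict(sorted(levels.items()))


def order_within_level(loss_mse):
    """Sort sub-band names by the MSE incurred when that sub-band alone is lost.

    Larger loss MSE means a more important sub-band, so it is sent earlier.
    """
    return sorted(loss_mse, key=loss_mse.get, reverse=True)


levels = temporal_levels(3)
ranking = order_within_level({"LL3": 40.0, "HL1": 0.5, "LH2": 3.2})
```

With `J = 3` the first function returns four levels, and the second places the sub-band whose loss costs the most distortion at the front of its level.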
2) The sequenced code streams are packaged and transmitted, and the method specifically comprises the following steps:
21) The sorted code streams are packed in sequence into packets $L_j$, where $j$ is the serial number of the packet; the maximum packet length is $L$, which is set according to the length of a network IP (Internet Protocol) packet. Because different wavelet sub-bands affect the video reconstruction differently, i.e. have different importance, the sub-bands of higher importance are transmitted first and the sub-bands of lower importance are transmitted later, thereby ensuring that the receiving end receives the most important information.
22) The maximum number of retransmissions $M_j$ of packet $L_j$ is calculated according to the different delays of the network. Specifically: if packet $L_j$ is lost, the total distortion of the reconstructed video frames is $D_j^L$; if packet $L_j$ is received, the distortion due to quantization at encoding is $D_j^Q$; only the distortion $D_j^C$ caused by channel transmission is considered. The number of packets in a group of pictures (GOP) is $N_P$; the number of packets corresponding to each level is $N_P^k$, $1 \le k \le K$, where $K$ is the number of time-domain wavelet sub-band levels. The maximum number of retransmissions $M_j$ of packet $L_j$ is obtained from:
$$\min D_{GOP} = \sum_{j=1}^{N_P} D_j^C \cdot P_{ARQ,j} = \sum_{j=1}^{N_P} D_j^C \cdot \left(P_{Loss,j}\right)^{M_j}$$
$$\text{s.t.} \quad \sum_{s=1}^{k} \sum_{j=1}^{N_P^s} \mathrm{Time}_j \le \mathrm{Deadline}_k, \quad 1 \le k \le N_P$$
where $D_{GOP}$ is the total transmission distortion of a GOP, and the transmission time of the wavelet sub-bands in the $k$-th level is at most
Figure G2008101049414D00053
$B$ is the channel bandwidth, $L_j$ is the length of the $j$-th packet, $M_j$ is the maximum number of retransmissions of packet $L_j$, $\mathrm{Time}_j$ is the transmission time of packet $L_j$, $\mathrm{Deadline}_k$ is the total decoding time limit of the wavelet sub-bands of the first $k$ levels, $P_{Loss,j}$ is the probability that packet $L_j$ is lost in each transmission, and $P_{ARQ,j}$ is the probability that packet $L_j$ is lost after retransmission.
23) The packed code stream is transmitted to the receiving end. If the receiving end detects that packet $L_j$ has been lost before the decoding time limit is reached, it requests the transmitting end to retransmit packet $L_j$, with the number of retransmissions less than or equal to the maximum number of retransmissions $M_j$; once the receiving end has reached the decoding time limit, the lost packet $L_j$ is not retransmitted.
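The retransmission rule of step 23) can be sketched as a simple loop. This is an illustrative model only: it assumes that every transmission of a packet takes a fixed time `packet_time`, that a transmission is attempted only if it can finish before the deadline, and that `max_tx` counts all transmission attempts of the packet. The function name `send_with_arq` and the `channel` callable are hypothetical.

```python
def send_with_arq(packet_time, max_tx, deadline, now, channel):
    """Send one packet with deadline-limited retransmission.

    channel() returns True when a transmission arrives and False when it is lost.
    Returns (delivered, time_after_last_attempt, attempts_used).
    """
    attempts = 0
    while attempts < max_tx and now + packet_time <= deadline:
        attempts += 1
        now += packet_time
        if channel():
            return True, now, attempts
    # Either the attempt budget is exhausted or the deadline was reached:
    # the lost packet is given up and the sender moves on.
    return False, now, attempts


# Deterministic loss pattern for the example: lost, lost, received.
outcomes = iter([False, False, True])
result = send_with_arq(1.0, max_tx=3, deadline=5.0, now=0.0, channel=lambda: next(outcomes))
```

Bounding the attempts by both `max_tx` and the deadline keeps a badly behaved packet from consuming the time budget of the sub-band levels that follow it.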
Example of the implementation
An embodiment of the present invention is given here, but the present invention is not limited to this embodiment only. A flow chart of an embodiment of the present invention is shown in fig. 2. This embodiment describes the implementation steps with an 8-frame GOP.
The method is based on the existing three-dimensional wavelet transform and coding method with motion compensation.
A GOP of size 8 is subjected to 3-level MCTF and 3-level spatial wavelet transform as shown in figure 3. In fig. 3, curved arrows represent motion compensated temporal wavelet transforms between adjacent frames, and straight solid and dashed arrows represent temporal low and high frequency wavelet subbands formed by MCTF, respectively, with different gray values used by the blocks to distinguish the different wavelet subband blocks. The specific transformation process is as follows:
The frames in a GOP of the original video sequence are denoted $A_0^0, A_1^0, A_2^0, A_3^0, A_4^0, A_5^0, A_6^0, A_7^0$. After the first-level MCTF they are transformed into the first-level time-domain low-frequency wavelet sub-bands $L_0^1, L_1^1, L_2^1, L_3^1$ and the first-level time-domain high-frequency wavelet sub-bands $H_0^1, H_1^1, H_2^1, H_3^1$. The second-level MCTF is applied to the first-level time-domain low-frequency wavelet sub-bands $L_0^1, L_1^1, L_2^1, L_3^1$ to form the second-level time-domain low-frequency wavelet sub-bands $L_0^2, L_1^2$ and the second-level time-domain high-frequency wavelet sub-bands $H_0^2, H_1^2$. The third-level MCTF is applied to the second-level time-domain low-frequency wavelet sub-bands $L_0^2, L_1^2$ to form the third-level time-domain low-frequency wavelet sub-band $L_0^3$ and the third-level time-domain high-frequency wavelet sub-band $H_0^3$. A spatial wavelet transform is then applied to each time-domain wavelet sub-band obtained by MCTF to obtain the spatial wavelet sub-bands. Finally, EZBC coding is applied to all wavelet sub-bands to obtain the code stream corresponding to each wavelet sub-band.
After the steps, the implementation steps of the invention are carried out, which comprise:
1) sorting and code stream packaging of wavelet sub-band
First, the time-domain wavelet sub-bands are sorted according to the requirements of sequential decoding and decoding delay. The time-domain wavelet sub-bands produced by MCTF are sorted according to the sequential decoding of frames $A_0^0, A_1^0, A_2^0, A_3^0, A_4^0, A_5^0, A_6^0, A_7^0$. Decoding frame $A_0^0$ uses the time-domain wavelet sub-bands $L_0^3, H_0^3, H_0^2, H_0^1$; decoding frame $A_1^0$ then additionally uses the time-domain wavelet sub-band $H_1^1$; decoding frame $A_2^0$ additionally uses the time-domain wavelet sub-bands $H_1^2, H_2^1$; and decoding frame $A_3^0$ additionally uses the time-domain wavelet sub-band $H_3^1$. The subsequent frames are decoded in turn in the same way. According to this principle, the time-domain wavelet sub-bands are divided into 4 transmission levels, i.e. the transmission order of the time-domain wavelet sub-bands is:
$$L_0^3, H_0^3, H_0^2, H_0^1 \mid H_1^1 \mid H_1^2, H_2^1 \mid H_3^1 \mid$$
Limited by network conditions, the transmission of the time-domain wavelet sub-bands of different levels is subject to different decoding time limits. The decoding time limits at the four level boundaries marked "|" above are, from left to right:
$\mathrm{Deadline}_A$, $\mathrm{Deadline}_B$, $\mathrm{Deadline}_C$, $\mathrm{Deadline}_D$
The transmission order and decoding time limits of the time-domain sub-bands of the 8-frame GOP are shown in Fig. 4(a); the spatial wavelet sub-bands within the same level are then sorted, as shown in Fig. 4(b); finally, the coded code stream is packed according to the wavelet sub-band sorting result, with the maximum packet length set to $L$, as shown in Fig. 4(c). Fig. 4(a) shows the ordering of the time-domain high-frequency and low-frequency wavelet sub-bands formed by MCTF and the corresponding decoding time limits, where the order is $L_0^3, H_0^3, H_0^2, H_0^1 \mid H_1^1 \mid H_1^2, H_2^1 \mid H_3^1$. In the figure, blocks of different gray levels represent different wavelet sub-bands, the vertical black dotted lines divide the time-domain wavelet sub-bands, from left to right, into the four levels 1, 2, 3 and 4, and the decoding time limits of the levels are $\mathrm{Deadline}_A$, $\mathrm{Deadline}_B$, $\mathrm{Deadline}_C$ and $\mathrm{Deadline}_D$ respectively. Fig. 4(b) shows the result of sorting the spatial wavelet sub-bands within the same level; sub-bands drawn higher in the figure are ranked earlier. Fig. 4(c) shows the code stream formed by the packets, in which the level-1 wavelet sub-band code stream is packed into 3 packets and the level-2, level-3 and level-4 wavelet sub-band code streams are each packed into 1 packet.
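The packing step can be sketched as below. The sketch assumes that the sorted sub-band code streams of one level are concatenated and cut into packets of at most `max_len` bytes, and that a packet never mixes data from two levels, which matches the description of Fig. 4(c) but is otherwise an assumption; any packet headers needed to delimit sub-bands are omitted. The function name `pack_gop` and the byte counts in the example are hypothetical.

```python
def pack_gop(levels, max_len):
    """Pack sorted sub-band streams into packets of at most max_len bytes.

    levels: list of (level_number, [stream_bytes, ...]) in transmission order,
            with the streams of each level already sorted by importance.
    Returns a list of (level_number, payload) packets.
    """
    packets = []
    for level_no, streams in levels:
        payload = b"".join(streams)            # keep the importance order
        for start in range(0, len(payload), max_len):
            packets.append((level_no, payload[start:start + max_len]))
    return packets


example = [
    (1, [b"a" * 700, b"b" * 500]),   # level 1: 1200 bytes in total
    (2, [b"c" * 300]),
    (3, [b"d" * 450]),
    (4, [b"e" * 100]),
]
packets = pack_gop(example, max_len=500)
```

With these example sizes, level 1 occupies three packets and each of levels 2 to 4 occupies one, the same packet layout as described for Fig. 4(c).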
2) Code stream transmission
For the packaged packet, an optimal transmission strategy is determined according to actual network conditions. Consider the following optimization problem:
Figure G2008101049414D00061
Figure G2008101049414D00062
Figure G2008101049414D00063
Figure G2008101049414D00064
Figure G2008101049414D00065
wherein,
Figure G2008101049414D00071
$N_P^k$ ($1 \le k \le 4$) is the number of packets corresponding to each of levels 1, 2, 3 and 4; $D_{GOP}$ is the total transmission distortion of a GOP; $D_j^C$ ($1 \le k \le 4$) is the distortion caused by channel transmission; $\mathrm{Time}_A$, $\mathrm{Time}_B$, $\mathrm{Time}_C$ and $\mathrm{Time}_D$ are the transmission times of the wavelet sub-bands in levels 1, 2, 3 and 4 respectively; $\mathrm{Deadline}_A$, $\mathrm{Deadline}_B$, $\mathrm{Deadline}_C$ and $\mathrm{Deadline}_D$ are the decoding time limits of the respective levels; $L_j$ is the length of the $j$-th packet; $M_j$ is the maximum number of retransmissions of packet $L_j$; $\mathrm{Time}_j$ is the transmission time of packet $L_j$; $P_{Loss,j}$ is the probability that packet $L_j$ is lost in each transmission; and $P_{ARQ,j}$ is the probability that packet $L_j$ is lost after retransmission. Solving the above optimization problem gives the maximum number of retransmissions $M_j$ for each packet $L_j$. During actual transmission, subject to the delay constraint, a lost packet is retransmitted whenever a loss is detected and the decoding time limit has not yet been reached; if the bandwidth is low and the receiving end has reached the decoding time limit, no retransmission is performed. In this way the receiving end is guaranteed to receive the most important information.
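For a small number of packets, the optimization can be explored by exhaustive search, as in the sketch below. Several points are assumptions made here for illustration: the worst-case transmission time of packet $L_j$ is taken as $M_j \cdot L_j / B$ (the patent gives this bound only as an equation image), $M_j$ is treated as the total number of transmission attempts so that $P_{ARQ,j} = (P_{Loss,j})^{M_j}$, and the search range `m_max` is arbitrary. The function name and the example numbers are hypothetical, and a practical implementation would likely use a faster solver than brute force.

```python
from itertools import product


def best_retransmissions(packets, level_sizes, deadlines, bandwidth, m_max=4):
    """Exhaustively search transmission counts M_j minimising expected distortion.

    packets:     list of (length, channel_distortion, loss_probability),
                 in transmission order.
    level_sizes: number of packets in each level, in order.
    deadlines:   cumulative decoding deadline for each level.
    Returns (tuple_of_M_j, expected_distortion), or (None, inf) if infeasible.
    """
    best, best_cost = None, float("inf")
    for m in product(range(1, m_max + 1), repeat=len(packets)):
        elapsed, idx, feasible = 0.0, 0, True
        for size, deadline in zip(level_sizes, deadlines):
            for j in range(idx, idx + size):
                elapsed += m[j] * packets[j][0] / bandwidth   # assumed Time_j
            idx += size
            if elapsed > deadline:        # cumulative deadline of this level
                feasible = False
                break
        if not feasible:
            continue
        cost = sum(d * p ** mj for (_, d, p), mj in zip(packets, m))
        if cost < best_cost:
            best, best_cost = m, cost
    return best, best_cost


# Two packets in two levels; each transmission takes 1000 / 1000 = 1 time unit.
plan, cost = best_retransmissions(
    packets=[(1000, 100.0, 0.1), (1000, 1.0, 0.1)],
    level_sizes=[1, 1],
    deadlines=[3.0, 4.0],
    bandwidth=1000.0,
)
```

In this example the search spends the available time on the first packet, whose loss would cause far more distortion, and leaves a single attempt for the second.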

Claims (3)

1. A three-dimensional wavelet sub-band sequencing and code stream packaging method in scalable video coding is characterized by comprising the steps of sequencing wavelet sub-bands of code streams after EZBC coding and packaging and transmitting the sequenced code streams;
firstly, t+2D three-dimensional wavelet transform coding with motion compensation is performed on the video sequence, as follows: J-level MCTF is performed on each GOP of $2^J$ frames; an M-level spatial wavelet transform is performed at full spatial resolution on each time-domain wavelet sub-band, so that each time-domain wavelet sub-band is transformed into $3M + 1$ spatial wavelet sub-bands of different spatial resolutions; each GOP is thus transformed into $2^J \times (3M + 1)$ wavelet sub-bands;
1) the sorting of the wavelet sub-bands specifically comprises the following steps:
11) according to the principle of sequential decoding when a reconstructed video sequence is decoded, frames of an original video are decoded in sequence, time domain wavelet sub-bands are sorted, the time domain wavelet sub-bands are divided into levels with different transmission priorities, according to the relationship between the reconstructed video sequence and the different time domain sub-bands, the priority level of the time domain wavelet sub-band related to the current frame of the reconstructed video sequence is high, and the priority level of the time domain wavelet sub-band related to the next frame is low;
12) according to the magnitude of the transmission distortion MSE (mean square error) value, relative to the original video frames, of the video frames reconstructed from the wavelet sub-bands other than the spatial wavelet sub-band
Figure F2008101049414C00011
the spatial wavelet sub-bands within the same level are sorted, so that the sub-band with the larger transmission distortion MSE value is transmitted first and the sub-band with the smaller transmission distortion MSE value is transmitted later, thereby ensuring that the receiving end receives the most important information;
2) the sequenced code streams are packaged and transmitted, and the method specifically comprises the following steps:
21) packaging the sorted code streams in sequence by a packet LjJ represents the serial number of the packet, the maximum packet length is L, and the length of a network IP (Internet protocol) packet is set;
22) calculating packet L according to different time delays of networkjMaximum number of retransmissions Mj,MjIs a natural number, and specifically comprises: if packet LjLoss, total distortion to reconstructed video frame is Dj L(ii) a If packet LjReceived but distorted by D due to quantization at the time of encodingj Q(ii) a Taking into account only the distortion D caused by channel transmissionj CWherein
Figure F2008101049414C00012
Within a group of pictures GOPThe number of the packets is NP(ii) a The number of packets corresponding to each level is NP kK is more than or equal to 1 and less than or equal to K, and K is the time domain wavelet sub-band level number; bag LjMaximum number of retransmissions MjIs obtained by the following formula:
$$\min D_{GOP} = \sum_{j=1}^{N_P} D_j^C \cdot P_{ARQ,j} = \sum_{j=1}^{N_P} D_j^C \cdot \left(P_{Loss,j}\right)^{M_j}$$
$$\text{s.t.} \quad \sum_{s=1}^{k} \sum_{j=1}^{N_P^s} Time_j \le Deadline_k, \quad 1 \le k \le N_P$$
wherein $D_{GOP}$ is the total transmission distortion of one GOP; the transmission time of the wavelet sub-bands in the $k$-th level is at most […]; $B$ is the channel bandwidth; $L_j$ is the length of the $j$-th packet; $M_j$ is the maximum number of retransmissions of packet $L_j$; $Time_j$ is the transmission time of packet $L_j$; $Deadline_k$ is the total decoding deadline for the wavelet sub-bands of the first $k$ levels; $P_{Loss,j}$ is the probability that packet $L_j$ is lost in each transmission; and $P_{ARQ,j}$ is the probability that packet $L_j$ is still lost after retransmission;
23) transmitting the packed code stream to the receiving end; if the receiving end finds that packet $L_j$ has been lost before the decoding deadline is reached, it requests the transmitting end to retransmit packet $L_j$, the number of retransmissions being less than or equal to the maximum number of retransmissions $M_j$; once the receiving end has reached the decoding deadline, the lost packet $L_j$ is not retransmitted.
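The objective and constraint of step 22) and the retransmission rule of step 23) can be illustrated with a short sketch. It assumes that the worst-case transmission time of a packet is its maximum number of transmissions multiplied by its length divided by the bandwidth; that relation, all function names and the sample numbers are illustrative assumptions, not taken from the claim.

```python
from typing import Sequence


def expected_gop_distortion(
    d_channel: Sequence[float], p_loss: Sequence[float], max_retx: Sequence[int]
) -> float:
    """Objective of step 22): sum over packets of D_j^C * (P_Loss,j) ** M_j."""
    return sum(d * (p ** m) for d, p, m in zip(d_channel, p_loss, max_retx))


def meets_deadlines(
    packets_per_level: Sequence[int], times: Sequence[float], deadlines: Sequence[float]
) -> bool:
    """Constraint of step 22): cumulative time of levels 1..k must not exceed Deadline_k."""
    elapsed = 0.0
    start = 0
    for count, deadline in zip(packets_per_level, deadlines):
        elapsed += sum(times[start:start + count])
        start += count
        if elapsed > deadline:
            return False
    return True


def should_retransmit(
    retransmissions_so_far: int, max_retx_j: int, now: float, deadline: float
) -> bool:
    """Rule of step 23): retransmit only before the deadline and within the M_j budget."""
    return now < deadline and retransmissions_so_far < max_retx_j


# Hypothetical GOP with three packets: two in level 1, one in level 2.
d_channel = [8.0, 4.0, 1.0]   # D_j^C
p_loss = [0.5, 0.5, 0.5]      # P_Loss,j
max_retx = [3, 2, 1]          # M_j
lengths = [1000, 1000, 1000]  # packet lengths in bytes
bandwidth = 1000.0            # bytes per millisecond (assumed unit)

# Assumed worst-case time per packet: M_j * L_j / B.
times = [m * l / bandwidth for m, l in zip(max_retx, lengths)]

distortion = expected_gop_distortion(d_channel, p_loss, max_retx)
feasible = meets_deadlines([2, 1], times, deadlines=[5.0, 6.0])
too_tight = meets_deadlines([2, 1], times, deadlines=[4.0, 6.0])
```

A search over candidate values of $M_j$ would keep only the allocations for which `meets_deadlines` holds and pick the one with the smallest `expected_gop_distortion`; the search strategy itself is left open here.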
2. The method of claim 1, wherein dividing the time domain wavelet sub-bands into levels with different transmission priorities in step 11) comprises:
video frames are denoted by $A_i^0$, where the subscript $i$ is the sequence number of the frame and the superscript 0 indicates the original video frame before transformation; the original video frame sequence is $A_0^0, A_1^0, A_2^0, A_3^0, A_4^0, A_5^0, A_6^0, A_7^0, \ldots$; $L_i^j$ and $H_i^j$ denote, respectively, the time domain low-frequency wavelet sub-band and the time domain high-frequency wavelet sub-band after motion compensated temporal filtering (MCTF); the transformation yields only one time domain lowest-frequency wavelet sub-band $L_0^J$ and multiple time domain high-frequency wavelet sub-bands $H_i^j$, where the superscript $j$ indicates that the time domain high-frequency wavelet sub-band is obtained by the $j$-th level of transformation, $j \in [1, J] \cap Z^{+}$, $J$ is the total number of transform levels, and the subscript $i$ is the sequence number of the wavelet sub-band; the $j$-th level of transformation produces $2^{J-j}$ time domain high-frequency wavelet sub-bands in total, $i \in \{1, \ldots, 2^{J-j}\}$; among the lowest-frequency wavelet sub-band $L_0^J$ and all $2^{J-j}$ ($j \in [1, J] \cap Z^{+}$) time domain high-frequency wavelet sub-bands, the time domain low-frequency wavelet sub-band $L_0^J$ and the time domain high-frequency wavelet sub-bands $H_0^J, H_0^{J-1}, \ldots, H_0^1$ that are related to decoding the original video frame $A_0^0$ are assigned level 1; according to the same principle, the remaining time domain high-frequency wavelet sub-bands are divided in sequence into levels $2, 3, \ldots, p$, where $p \ge 2$.
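The level assignment can be sketched under one simplifying assumption that the claim does not state: a dyadic (Haar-like) dependency, in which frame $A_f^0$ needs $L_0^J$ and, at each transform level $j$, the high-frequency sub-band with index $\lfloor f / 2^j \rfloor$. Indices are 0-based here to match the $H_0^j$ notation, and the function name and string labels are hypothetical.

```python
from typing import Dict


def temporal_levels(total_levels: int) -> Dict[str, int]:
    """Assign a transmission level to each temporal sub-band of one GOP.

    A sub-band gets the level of the first frame, in decoding order,
    that needs it; frames that need nothing new do not open a new level.
    """
    gop_size = 1 << total_levels
    levels: Dict[str, int] = {}
    next_level = 1
    for frame in range(gop_size):
        # Assumed dyadic dependency: L_0^J plus H_{frame >> j}^j for j = J..1.
        needed = [f"L0^{total_levels}"] + [
            f"H{frame >> j}^{j}" for j in range(total_levels, 0, -1)
        ]
        new = [name for name in needed if name not in levels]
        if new:
            for name in new:
                levels[name] = next_level
            next_level += 1
    return levels


levels = temporal_levels(3)
```

For $J = 3$ this gives level 1 to $L_0^3, H_0^3, H_0^2, H_0^1$ (everything needed for $A_0^0$), and spreads the remaining four high-frequency sub-bands over levels 2 to 4. A longer temporal filter would change the dependency sets and therefore the levels.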
3. The method according to claim 1, wherein sorting the spatial wavelet sub-bands within the same level according to the MSE values in step 12) comprises:
the video frames after $J$-level MCTF and $M$-level spatial wavelet transform are divided into different wavelet sub-bands $S_i$, $i \in [1, n] \cap Z^{+}$, where $Z^{+}$ denotes the set of positive integers and $n = 2^J \times (3M + 1)$; the effect of each wavelet sub-band on the quality of the reconstructed video frames is measured by the MSE, which is defined as:
Figure F2008101049414C00021
where $s$ is the original sequence of video frames and
Figure F2008101049414C00022
is the sequence of video frames reconstructed at the decoding side; if wavelet sub-band $S_j$ is lost during transmission, the MSE value of the transmission distortion is:
$$D\Big(\bigcup_{i \neq j} S_i\Big), \quad i \in [1, n] \cap Z^{+}$$
the above formula expresses that, if wavelet sub-band $S_j$ is lost, the video frame reconstructed from the other wavelet sub-bands $\bigcup_{i \neq j} S_i$ is distorted compared with the original video frame; the larger the MSE value, the more important the wavelet sub-band is for reconstructing the video frames and the higher its transmission level; the result of ordering the wavelet sub-bands by the magnitude of their MSE values is:
$r_0 \in [1, n] \cap Z^{+}$, and $r_1$, $r_2$, $r_3$, $r_4$ take values in the same range as $r_0$;
these wavelet sub-bands
Figure F2008101049414C00026
the corresponding code stream lengths are respectively:
$l_{r_0}, l_{r_1}, l_{r_2}, l_{r_3}, l_{r_4}, \ldots$;
in the subsequent packing operation, the code streams are packed subject to these code stream lengths and the maximum packet length.
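The packing step can be sketched as follows. The sketch assumes that the sorted sub-band code streams are concatenated and cut at the maximum packet length, so that a stream may be split across two packets; the claim does not say whether splitting is allowed, and the function name, labels and lengths are hypothetical.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[str, int]  # (sub-band label, number of bytes placed in the packet)


def pack_streams(
    ordered_lengths: Sequence[Tuple[str, int]], max_packet_len: int
) -> List[List[Segment]]:
    """Fill packets in transmission order without exceeding max_packet_len."""
    if max_packet_len <= 0:
        raise ValueError("max_packet_len must be positive")
    packets: List[List[Segment]] = [[]]
    room = max_packet_len
    for name, length in ordered_lengths:
        remaining = length
        while remaining > 0:
            if room == 0:
                # Current packet is full: open the next one.
                packets.append([])
                room = max_packet_len
            take = min(remaining, room)
            packets[-1].append((name, take))
            remaining -= take
            room -= take
    return [p for p in packets if p]


# Hypothetical code stream lengths, already sorted by level and MSE.
packets = pack_streams([("S_r0", 700), ("S_r1", 500), ("S_r2", 900)], max_packet_len=1000)
```

Because the input is already in transmission order, earlier packets carry the more important sub-bands, which is what lets the retransmission budget favour them.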
CN 200810104941 2008-04-25 2008-04-25 Method for sorting three-dimensional wavelet sub-band and enveloping code flow of telescopic video coding Expired - Fee Related CN101299819B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200810104941 CN101299819B (en) 2008-04-25 2008-04-25 Method for sorting three-dimensional wavelet sub-band and enveloping code flow of telescopic video coding

Publications (2)

Publication Number Publication Date
CN101299819A CN101299819A (en) 2008-11-05
CN101299819B true CN101299819B (en) 2010-04-14

Family

ID=40079483

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810104941 Expired - Fee Related CN101299819B (en) 2008-04-25 2008-04-25 Method for sorting three-dimensional wavelet sub-band and enveloping code flow of telescopic video coding

Country Status (1)

Country Link
CN (1) CN101299819B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8825886B2 (en) * 2010-07-28 2014-09-02 Hong Kong Applied Science and Technology Research Institute Company Limited System and method for evaluating network transport effects on delivery of media content
JP2012209673A (en) * 2011-03-29 2012-10-25 Sony Corp Information processing apparatus, information processing method, image provision system, image provision method, and program
CN106559668B (en) * 2015-09-25 2018-07-27 电子科技大学 A kind of low code rate image compression method based on intelligent quantization technology
CN107371029B (en) * 2017-06-28 2020-10-30 上海大学 Video packet priority distribution method based on content
CN116016955A (en) * 2018-06-28 2023-04-25 苹果公司 Priority-based video coding and transmission
CN116996675B (en) * 2023-09-27 2023-12-19 河北天英软件科技有限公司 Instant messaging system and information processing method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1568003A (en) * 2003-06-18 2005-01-19 中国科学院研究生院 Dimension decreasable object segmentation and video coding with wavelet fractal scalability
CN1669328A (en) * 2002-07-17 2005-09-14 皇家飞利浦电子股份有限公司 3D wavelet video coding and decoding method and corresponding device
CN1794818A (en) * 2005-12-01 2006-06-28 西安交通大学 Control method of high performance three-dimensional code rate in flexible video coding

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100414
