CN102970540B

CN102970540B - Multi-view video rate control method based on a key-frame rate-quantization model

Info

Publication number: CN102970540B
Application number: CN201210479090.8A
Authority: CN
Inventors: 蒋刚毅; 郑巧燕; 郁梅; 朱高锋; 邵枫; 彭宗举
Original assignee: Ningbo University
Current assignee: Ningbo University
Priority date: 2012-11-21
Filing date: 2012-11-21
Publication date: 2016-03-02
Anticipated expiration: 2032-11-21
Also published as: CN102970540A

Abstract

The invention discloses a multi-view video rate control method based on a key-frame rate-quantization model. Rate control is carried out in four layers: the view layer, the GOP layer, the frame layer and the macroblock layer. In the view layer, the bit rate is distributed among the views according to the proportion of bits each view actually consumes when coded. In the GOP layer, the total bit rate of each GOP is allocated, and the coding quantization parameter of the key frame of each GOP is calculated from a rate-quantization-parameter model of the key frame obtained by analysis. In the frame layer, the bit rate is distributed according to the rate-allocation weights of the B frames at different hierarchy levels. In the macroblock layer, bits are allocated to each macroblock according to its coding complexity, and its coding quantization parameter is derived. The advantage is that, by using the rate-quantization model of the key frame and an unequal rate-allocation strategy for B frames at different levels, target bits are allocated more reasonably and rate control is carried out effectively, giving good rate-distortion performance and subjective quality while maintaining rate control accuracy.

Description

Multi-view video rate control method based on a key-frame rate-quantization model

Technical field

The present invention relates to multi-view video rate control technology, and in particular to a multi-view video rate control method based on a key-frame rate-quantization model.

Background technology

To meet the growing demand for video quality and richer content, multi-view video technology with 3D visual capability is attracting increasing attention from academia and industry and has become one of the focal points of video research in recent years. Rate control is one of the key technologies in multi-view video applications: it must allocate the bit rate reasonably over time and prevent buffer overflow, and it must also allocate the bit rate reasonably among the views so that video quality stays balanced across views. The international standardization body JVT (Joint Video Team) has been working on a multi-view video coding standard and has released JMVC (Joint Multi-view Video Coding) as a common research platform. JMVC adopts the HBP (Hierarchical B Pictures) predictive coding structure and uses disparity-compensated prediction and motion-compensated prediction to remove temporal, spatial and inter-view redundancy, further improving compression efficiency. However, rate control models designed for two-dimensional video do not account for the combination of disparity compensation and motion compensation, so they cannot be applied directly to multi-view video coding, and JMVC does not yet provide a rate control model. Some researchers have proposed rate control methods for multi-view video, but these methods simply extend rate control proposals for H.264 to multi-view video without giving particular consideration to its special characteristics. The analysis of the rate-quantization model of the key frame, which resembles the I frame of single-view video, and the special structure of HBP are both problems that urgently need to be addressed in multi-view video rate control.

In MVC with the HBP predictive coding structure, a group of pictures is defined as a GOP (Group of Pictures), and the first frame of each GOP is the key frame. Effective rate control of the key frame helps improve the coding quality of the whole GOP. In traditional single-view rate control algorithms, quantization parameters are adjusted only for P frames and B frames, while the I frame is coded simply with the average QP (Quantization Parameter) of all P/B frames of the previous GOP. If that average QP is large, the QP chosen for the I frame is also large, which lowers the picture quality of the whole sequence. In addition, the number of bits of a coded frame depends not only on the QP but also on the complexity of the picture, so a model relating the bit rate of the I frame to its QP should be fitted and applied when deriving the QP of the I frame. In short, rate control of the I frame comes down to building a suitable I-frame rate-quantization model from which an accurate quantization parameter can be determined. In multi-view video, therefore, the relation between the bit rate and the quantization parameter of the key frame in the HBP structure needs to be analyzed closely, and the fitted R-QP curve of the key frame applied to setting the key-frame QP in rate control.

Summary of the invention

The technical problem to be solved by the present invention is to provide a multi-view video rate control method based on a key-frame rate-quantization model that effectively improves rate-distortion performance and video coding quality while maintaining rate control accuracy.

The technical solution adopted by the present invention to solve the above technical problem is a multi-view video rate control method based on a key-frame rate-quantization model, characterized by comprising the following steps:

1. Define the k'-th view video currently being processed in the input multi-view video as the current view video, where k' is the view index and the initial value of k' is 1;

2. Divide the current view video into multiple image groups (GOPs); each image group contains the three picture types of the AVC coding format, namely I frames, P frames and B frames, and the first frame of each image group is the key frame;

3. Calculate the number of target bits pre-allocated to the current view video, denoted T_view(k'), T_view(k') = T_total × w(k'), where T_total is the total number of target bits of the input multi-view video and w(k') is the proportional weight of the k'-th view video in the input multi-view video;

4. Pre-allocate target bits to each image group in the current view video and set the coding quantization parameter of the key frame in each image group, as follows:

4.-1 Calculate the number of target bits pre-allocated to the i-th image group in the current view video, denoted f(i,0),

f (i, 0) = \begin{cases} \frac{B}{F_{r}} \times N_{gop} & i = 1 \\ \frac{B}{F_{r}} \times N_{gop} - \left(\frac{B_{s}}{8} - B_{c} (i - 1, N_{gop})\right) & 2 \leq i \leq N \end{cases},

where N is the number of image groups in the current view video, B is the externally set available bandwidth, F_r is the frame rate, N_gop is the number of frames in the i-th image group of the current view video, B_s is the initial buffer size, and B_c(i-1, N_gop) is the actual buffer occupancy after the (i-1)-th image group of the current view video has been coded;
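As an illustration, the GOP-level allocation of step 4.-1 can be sketched in Python as follows. The function and parameter names are hypothetical and do not come from the patent, and the numbers in the usage note are made-up inputs.

```python
def gop_target_bits(i, bandwidth, frame_rate, n_gop, buffer_size,
                    prev_buffer_occupancy=0.0):
    """Target bits f(i, 0) for the i-th image group (i is 1-based).

    bandwidth:             B, available bandwidth in bits per second
    frame_rate:            F_r, frames per second
    n_gop:                 N_gop, number of frames in the image group
    buffer_size:           B_s, initial buffer size in bits
    prev_buffer_occupancy: B_c(i-1, N_gop), ignored when i == 1
    """
    base = bandwidth / frame_rate * n_gop
    if i == 1:
        return base
    # Later image groups are adjusted by the deviation from B_s / 8.
    return base - (buffer_size / 8.0 - prev_buffer_occupancy)
```

With B = 1,000,000 bit/s, F_r = 25, N_gop = 8 and B_s = 400,000 bits, the first image group receives 320,000 bits; if the buffer then holds 60,000 bits, the second image group receives 330,000 bits.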

4.-2 Determine whether the i-th image group in the current view video is the first image group. If so, take the externally set initial coding quantization parameter QP₀ as the coding quantization parameter of the key frame in the first image group and go to step 4.-6; otherwise go to step 4.-3. Here 1≤i≤N and QP₀∈[0,51];

4.-3 Using the rate-quantization model of the key frame, calculate the first candidate coding quantization parameter of the key frame in the i-th image group of the current view video, denoted Q'_K(i,0), where R_K(i,0) is the number of target bits pre-allocated to the key frame in the i-th image group of the current view video, and C₁ and C₂ are constants;

4.-4 From the mean coding quantization parameter of all frames other than the key frame in the (i-1)-th image group of the current view video, calculate the second candidate coding quantization parameter of the key frame in the i-th image group of the current view video, denoted Q''_K(i,0),

Q_{K}^{''} (i, 0) = \frac{{Sum}_{BQP} (i - 1)}{N_{gop} - 1} - 1 - \frac{8 \times T_{r} (i - 1, N_{gop})}{T_{r} (i, 0)} - \frac{N_{gop}}{15},

where Sum_BQP(i-1) is the sum of the coding quantization parameters of all frames other than the key frame in the (i-1)-th image group of the current view video, T_r(i-1, N_gop) is the number of bits remaining in the current view video after its (i-1)-th image group has been coded, and T_r(i,0) is the number of bits remaining in the i-th image group before its key frame is coded;
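The second candidate of step 4.-4 is a direct evaluation of the formula above. A minimal sketch, with hypothetical names and made-up inputs in the usage note:

```python
def second_candidate_qp(sum_bqp_prev, n_gop, remaining_prev, remaining_cur):
    """Second candidate key-frame QP, Q''_K(i, 0).

    sum_bqp_prev:   Sum_BQP(i-1), sum of QPs of the non-key frames
                    of the previous image group
    n_gop:          N_gop, frames per image group
    remaining_prev: T_r(i-1, N_gop)
    remaining_cur:  T_r(i, 0)
    """
    if n_gop < 2 or remaining_cur <= 0:
        raise ValueError("need n_gop >= 2 and remaining_cur > 0")
    mean_bqp = sum_bqp_prev / (n_gop - 1)
    return mean_bqp - 1 - 8 * remaining_prev / remaining_cur - n_gop / 15
```

For a previous image group of 8 frames whose 7 non-key frames all used QP 30, with no bits left over and 320,000 bits available, the candidate is 30 − 1 − 0 − 8/15, or about 28.47.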

4.-5 From Q'_K(i,0) and Q''_K(i,0), select the smaller candidate as the preliminary coding quantization parameter of the key frame in the i-th image group of the current view video, denoted \tilde{Q}_K(i,0). Then correct \tilde{Q}_K(i,0) according to the principle that the coding quantization parameters of two consecutive frames of the same type must not differ by more than 2:

{\tilde{Q}}_{K} (i, 0) = \min \{ Q_{K} (i - 1,0) + 2, \max \{ Q_{K} (i - 1,0) - 2, {\tilde{Q}}_{K} (i, 0) \} \},

The corrected value is then further corrected according to the H.264 standard to obtain the final coding quantization parameter of the key frame in the i-th image group of the current view video, denoted Q_K(i,0), where Q_K(i-1,0) is the coding quantization parameter of the key frame in the (i-1)-th image group of the current view video, min{} is the minimum function and max{} is the maximum function;
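The selection and limiting in step 4.-5 can be sketched as below. The text does not spell out the H.264 correction; this sketch assumes it means rounding to an integer and clipping to the range [0, 51] that the text gives for QP₀. All names are hypothetical.

```python
def key_frame_qp(candidate_1, candidate_2, prev_key_qp, qp_min=0, qp_max=51):
    """Final key-frame QP from the two candidates of steps 4.-3 and 4.-4."""
    # Take the smaller of the two candidates as the preliminary QP.
    preliminary = min(candidate_1, candidate_2)
    # Keep it within +/-2 of the previous key frame's QP.
    limited = min(prev_key_qp + 2, max(prev_key_qp - 2, preliminary))
    # ASSUMPTION: the H.264 correction is a round plus a clip to [0, 51].
    return int(round(min(qp_max, max(qp_min, limited))))
```

If the previous key frame used QP 30 and the candidates are 24.3 and 28.47, the smaller candidate 24.3 is raised to 28, the lowest value the ±2 rule allows.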

4.-6 Let i=i+1, then return to step 4.-1 to pre-allocate target bits to the next image group in the current view video and set the coding quantization parameter of the key frame in that image group, until all image groups in the current view video have been processed, where "=" in i=i+1 denotes assignment;

5. Update the number of remaining bits and the target buffer level before each frame other than the key frame in each image group of the current view video is coded, then pre-allocate target bits to each such frame, as follows:

5.-1 Under constant bandwidth, calculate the number of bits remaining in the i-th image group before each frame other than the key frame in the i-th image group of the current view video is coded; the number of bits remaining in the i-th image group before its j-th frame is coded is denoted T_r(i,j), T_r(i,j)=T_r(i,j-1)-A(i,j-1). Then calculate the target buffer level before each frame other than the key frame in the i-th image group is coded; the target buffer level before the j-th frame is coded is denoted Tbl(i,j). Here 1≤i≤N, 2≤j≤N_gop, T_r(i,j-1) is the number of bits remaining in the i-th image group before its (j-1)-th frame is coded, A(i,j-1) is the actual number of bits used to code the (j-1)-th frame in the i-th image group, Tbl(i,j-1) is the target buffer level before the (j-1)-th frame in the i-th image group is coded, Tbl(i,2) is the target buffer level before the 2nd frame in the i-th image group is coded, Tbl(i,2)=B_c(i,2), and B_c(i,2) is the actual buffer occupancy after the 2nd frame in the i-th image group has been coded;

5.-2 From the number of bits remaining in the i-th image group before each frame other than the key frame is coded, calculate a candidate number of target bits for each frame other than the key frame in the i-th image group of the current view video. For the j-th frame in the i-th image group, the candidate calculated from T_r(i,j) is denoted \hat{f}(i,j),

\hat{f} (i, j) = \frac{T_{r} (i, j) \times w_{b}^{t}}{\sum_{l = t}^{L} w_{b}^{l} \times N_{b}^{l}},

where l indexes the hierarchy levels of the B frames, 1≤t≤L, L is the total number of hierarchy levels of the hierarchical B frames, w_b^t is the weight of the B frames at level t, w_b^l is the weight of the B frames at level l among the remaining B frames, and N_b^l is the number of remaining B frames at level l;
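The weighted allocation of step 5.-2 can be sketched as follows. The weights used in the usage note are invented for illustration, since the text does not list the values of w_b^l; the function and argument names are likewise hypothetical.

```python
def frame_target_from_remaining(remaining_bits, level, weights, remaining_counts):
    """Candidate target bits for a B frame at hierarchy level `level`.

    remaining_bits:   T_r(i, j)
    weights:          dict mapping level l (1..L) to its weight w_b^l
    remaining_counts: dict mapping level l to N_b^l, the number of
                      B frames of that level still to be coded
    """
    top_level = max(weights)  # L
    denom = sum(weights[l] * remaining_counts.get(l, 0)
                for l in range(level, top_level + 1))
    if denom <= 0:
        raise ValueError("no remaining B frames at or above this level")
    return remaining_bits * weights[level] / denom
```

With three levels holding 1, 2 and 4 B frames, illustrative weights 4, 2 and 1, and 240,000 remaining bits, a level-1 frame is offered 240000 × 4 / 12 = 80,000 bits.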

5.-3 From the target buffer level before each frame other than the key frame in the i-th image group of the current view video is coded, and the actual buffer occupancy after that frame has been coded, calculate a second candidate number of target bits for each frame other than the key frame in the i-th image group. For the j-th frame in the i-th image group, the candidate calculated from Tbl(i,j) and the actual buffer occupancy B_c(i,j) after the j-th frame in the i-th image group has been coded is denoted \tilde{f}(i,j),

\tilde{f} (i, j) = \frac{B}{F_{r}} + γ \times (Tbl (i, j) - B_{c} (i, j)),

where γ is a constant;

5.-4 From the two candidate numbers of target bits calculated in steps 5.-2 and 5.-3 for each frame other than the key frame in the i-th image group of the current view video, calculate the final number of target bits pre-allocated to each such frame. For the j-th frame in the i-th image group, the final number of target bits calculated from \hat{f}(i,j) and \tilde{f}(i,j) is denoted f(i,j),

f (i, j) = β \times \hat{f} (i, j) + (1 - β) \times \tilde{f} (i, j),

where β is a constant;
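Steps 5.-3 and 5.-4 can be sketched together as below. The default values γ = 0.75 and β = 0.5 are the ones the text gives for these steps; the function names and the sample numbers are hypothetical.

```python
def frame_target_from_buffer(bandwidth, frame_rate, target_level,
                             actual_occupancy, gamma=0.75):
    """Buffer-based candidate: B / F_r + gamma * (Tbl(i, j) - B_c(i, j))."""
    return bandwidth / frame_rate + gamma * (target_level - actual_occupancy)


def final_frame_target(f_hat, f_tilde, beta=0.5):
    """Blend of the remaining-bits candidate and the buffer-based candidate."""
    return beta * f_hat + (1 - beta) * f_tilde
```

For B = 1,000,000 bit/s, F_r = 25, a target level of 50,000 bits and an occupancy of 46,000 bits, the buffer-based candidate is 43,000 bits; blended with a remaining-bits candidate of 80,000 bits, the final target is 61,500 bits.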

6. Pre-allocate target bits to each macroblock in each frame other than the key frame in each image group of the current view video, then calculate the coding quantization parameter of each such macroblock, as follows:

6.-1 Calculate the number of target bits pre-allocated to each macroblock in each frame other than the key frame in the i-th image group of the current view video; the number of target bits pre-allocated to the k-th macroblock in the j-th frame of the i-th image group is denoted f_mb(j,k), where 1≤i≤N, 2≤j≤N_gop, T_mb(j,k) is the number of bits remaining in the j-th frame before its k-th macroblock is coded, MAD(j,k) is the MAD value of the k-th macroblock in the j-th frame of the i-th image group, MAD(j,p) is the MAD value of the p-th macroblock in the j-th frame of the i-th image group, 1≤k≤N_mb, 1≤p≤N_mb, and N_mb is the number of macroblocks in the j-th frame of the i-th image group of the current view video;

6.-2 From the number of target bits pre-allocated to each macroblock in each frame other than the key frame in the i-th image group of the current view video, calculate the coding quantization parameter of each such macroblock. For the k-th macroblock in the j-th frame of the i-th image group, the model

f_{mb} (j, k) = \left(\frac{X_{1}}{Q_{mb} {(j, k)}^{2}} + \frac{X_{2}}{Q_{mb} (j, k)}\right) \times MAD (j, k),

is used to calculate its coding quantization parameter, denoted Q_mb(j,k), where X₁ and X₂ are the parameters of this model; after each macroblock has been coded, the values of X₁ and X₂ are updated by linear regression;
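Solving the quadratic model of step 6.-2 for Q_mb amounts to solving a quadratic equation in 1/Q_mb. The sketch below shows one way to do this; the function name and the sample parameter values are hypothetical, and any later rounding or clipping of the result is not shown.

```python
import math


def macroblock_q(target_bits, mad, x1, x2):
    """Solve f_mb = (X1 / Q^2 + X2 / Q) * MAD for Q > 0."""
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    t = target_bits / mad            # X1 * u^2 + X2 * u = t, with u = 1 / Q
    if x1 == 0:
        if x2 <= 0:
            raise ValueError("model has no positive solution")
        return x2 / t                # model degenerates to X2 / Q = t
    disc = x2 * x2 + 4 * x1 * t
    if disc < 0:
        raise ValueError("model has no real solution")
    u = (-x2 + math.sqrt(disc)) / (2 * x1)
    if u <= 0:
        raise ValueError("model has no positive solution")
    return 1.0 / u
```

As a check with made-up parameters X₁ = 1000 and X₂ = 50: a macroblock with MAD = 4 coded at Q_mb = 10 would use (10 + 5) × 4 = 60 bits, and solving the model for 60 target bits returns 10.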

7. Take the next view video to be processed in the input multi-view video as the current view video, then return to step 2 and continue until all view videos in the input multi-view video have been processed.

The detailed process of step 4.-3 is:

a. Calculate the coding complexity of the key frame in the (i-1)-th image group of the current view video, denoted C_K(i-1,0), C_K(i-1,0)=A_K(i-1,0)×Q_K(i-1,0), where A_K(i-1,0) is the actual number of bits used to code the key frame in the (i-1)-th image group of the current view video and Q_K(i-1,0) is the coding quantization parameter of that key frame;

b. Calculate the coding complexity of each frame other than the key frame in the (i-1)-th image group of the current view video; the coding complexity of the j-th frame in the (i-1)-th image group is denoted C_B(i-1,j), C_B(i-1,j)=A_B(i-1,j)×Q_B(i-1,j). Then calculate the sum of the coding complexities of all frames other than the key frame in the (i-1)-th image group, denoted C_{B,i-1}, where 2≤j≤N_gop, A_B(i-1,j) is the actual number of bits used to code the j-th frame in the (i-1)-th image group and Q_B(i-1,j) is the coding quantization parameter of that frame;

c. From the coding complexity of the key frame in the (i-1)-th image group of the current view video and the sum of the coding complexities of all frames other than the key frame, calculate the number of target bits pre-allocated to the key frame in the i-th image group of the current view video, denoted R_K(i,0), where f(i,0) is the number of target bits pre-allocated to the i-th image group of the current view video and C_K(i-1,0) is the coding complexity of the key frame in the (i-1)-th image group;

d. Using the rate-quantization model of the key frame, calculate the first candidate coding quantization parameter of the key frame in the i-th image group of the current view video, denoted Q'_K(i,0), where C₁ and C₂ are constants.

In step 4.-3, C₁=17.96 and C₂=-0.1704.
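The formula of the key-frame rate-quantization model is not reproduced in this text. Purely as an illustration, the sketch below assumes the exponential model has the common form R = C₁ × exp(C₂ × QP) and inverts it to obtain a candidate QP from a target rate. The model form and the function names are assumptions, and so is the unit of R; only the two constants come from the text.

```python
import math

C1 = 17.96    # constant given in the text
C2 = -0.1704  # constant given in the text


def key_frame_rate(qp, c1=C1, c2=C2):
    """ASSUMED model form: R = C1 * exp(C2 * QP)."""
    return c1 * math.exp(c2 * qp)


def first_candidate_qp(target_rate, c1=C1, c2=C2):
    """Inverse of the assumed model: QP = ln(R / C1) / C2."""
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    return math.log(target_rate / c1) / c2
```

Under this assumed form the rate falls as QP grows because C₂ is negative, and inverting the rate predicted for a given QP returns that QP.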

In step 5.-2, when the length of the image groups in the current view video is 8, the total number of hierarchy levels L of the hierarchical B frames is 3, and t+l=L; in step 5.-3, γ=0.75; in step 5.-4, β=0.5.

In step 6.-1, the MAD value MAD(j,k) of the k-th macroblock in the j-th frame of the i-th image group of the current view video is obtained as follows: MAD(j,k) is linearly predicted from MAD(j-1,k), the MAD value of the co-located macroblock in the (j-1)-th frame of the i-th image group, MAD(j,k)=a₁×MAD(j-1,k)+a₂, where a₁ and a₂ are the prediction model parameters, with initial values 1 and 0 respectively; after each macroblock has been coded, the values of a₁ and a₂ are updated by linear regression.
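The MAD prediction and its parameter update can be sketched as follows. The text only says that a₁ and a₂ are updated by linear regression; this sketch assumes ordinary least squares over the (previous MAD, actual MAD) pairs observed so far. The function names are hypothetical.

```python
def predict_mad(prev_mad, a1=1.0, a2=0.0):
    """Linear prediction MAD(j, k) = a1 * MAD(j-1, k) + a2."""
    return a1 * prev_mad + a2


def fit_mad_model(prev_mads, actual_mads):
    """Least-squares estimate of (a1, a2); falls back to the initial (1, 0)."""
    n = len(prev_mads)
    if n != len(actual_mads):
        raise ValueError("input lengths differ")
    if n < 2:
        return 1.0, 0.0
    mean_x = sum(prev_mads) / n
    mean_y = sum(actual_mads) / n
    sxx = sum((x - mean_x) ** 2 for x in prev_mads)
    if sxx == 0:
        return 1.0, 0.0              # all predictors equal: slope undefined
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(prev_mads, actual_mads))
    a1 = sxy / sxx
    return a1, mean_y - a1 * mean_x
```

With the initial parameters the prediction simply copies the co-located macroblock's MAD; once enough macroblocks have been coded, the fitted pair replaces (1, 0).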

Compared with the prior art, the advantage of the present invention is that rate control is carried out in four layers: the view layer, the GOP layer, the frame layer and the macroblock layer. In the view layer, the bit rate is distributed among the views according to the proportion of bits each view actually consumes when coded. In the GOP layer, the total bit rate of each GOP is allocated, and the coding quantization parameter of the key frame of each GOP is calculated from the rate-quantization-parameter model of the key frame obtained by analysis. In the frame layer, the bit rate is distributed according to the rate-allocation weights of the B frames at different hierarchy levels. In the macroblock layer, bits are allocated to each macroblock according to its coding complexity, and its coding quantization parameter is derived. Experimental results show that the method of the invention controls the bit rate more accurately, with the rate control error mostly within 1%, and that both PSNR and rate-distortion performance are improved. In addition to the objective quality evaluation, the subjective quality of each sequence was also evaluated; the final experimental results show that the method of the invention achieves not only good objective quality but also better subjective quality.

Brief description of the drawings

Fig. 1 is a schematic diagram of the multi-view video predictive coding structure based on the HBP structure, showing the key frames;

Fig. 2 is the overall block diagram of the method of the invention;

Fig. 3a shows the R-QP statistical curves of the key frames of the three view classes of the Breakdancers sequence;

Fig. 3b shows the R-QP statistical curves of the key frames of the three view classes of the Doorflowers sequence;

Fig. 4 shows the R-QP statistical curves of the key frames of four sequences at 1024 × 768 resolution;

Fig. 5a shows the rate-distortion curves of the Breakdancers sequence coded with the original method and with the method of the invention;

Fig. 5b shows the rate-distortion curves of the Ballet sequence coded with the original method and with the method of the invention;

Fig. 5c shows the rate-distortion curves of the Doorflowers sequence coded with the original method and with the method of the invention;

Fig. 5d shows the rate-distortion curves of the Alt_Moabit sequence coded with the original method and with the method of the invention;

Fig. 6a is a reconstructed image of the Breakdancers sequence coded with the original method;

Fig. 6b is a reconstructed image of the Breakdancers sequence coded with the method of the invention;

Fig. 6c is a local region of the original image of the Breakdancers sequence;

Fig. 6d is a local region of the reconstructed image of the Breakdancers sequence coded with the original method;

Fig. 6e is a local region of the reconstructed image of the Breakdancers sequence coded with the method of the invention;

Fig. 7a is a reconstructed image of the Ballet sequence coded with the original method;

Fig. 7b is a reconstructed image of the Ballet sequence coded with the method of the invention;

Fig. 7c is a local region of the original image of the Ballet sequence;

Fig. 7d is a local region of the reconstructed image of the Ballet sequence coded with the original method;

Fig. 7e is a local region of the reconstructed image of the Ballet sequence coded with the method of the invention;

Fig. 8a is a reconstructed image of the Alt_Moabit sequence coded with the original method;

Fig. 8b is a reconstructed image of the Alt_Moabit sequence coded with the method of the invention;

Fig. 8c is a local region of the original image of the Alt_Moabit sequence;

Fig. 8d is a local region of the reconstructed image of the Alt_Moabit sequence coded with the original method;

Fig. 8e is a local region of the reconstructed image of the Alt_Moabit sequence coded with the method of the invention;

Fig. 9a is a reconstructed image of the Doorflowers sequence coded with the original method;

Fig. 9b is a reconstructed image of the Doorflowers sequence coded with the method of the invention;

Fig. 9c is a local region of the original image of the Doorflowers sequence;

Fig. 9d is a local region of the reconstructed image of the Doorflowers sequence coded with the original method;

Fig. 9e is a local region of the reconstructed image of the Doorflowers sequence coded with the method of the invention.

Detailed description of the embodiments

The present invention is described in further detail below with reference to the drawings and embodiments.

The multi-view video rate control method based on a key-frame rate-quantization model proposed by the present invention performs rate control in the view layer, the image group (GOP) layer, the frame layer and the macroblock layer. In the view layer, the bit rate is distributed among the views according to the proportion of bits each view actually consumes when coded. In the image group layer, the total bit rate of each image group is allocated, and the coding quantization parameter of the key frame in each image group is calculated from the exponential rate-quantization model of the key frame obtained by analysis. In the frame layer, the bit rate is distributed according to the rate-allocation weights of the B frames at different hierarchy levels. In the macroblock layer, bits are allocated to each macroblock according to its coding complexity, and its coding quantization parameter is derived. Fig. 1 shows a schematic diagram of the multi-view video predictive coding structure based on the HBP structure; in Fig. 1, horizontal arrows denote temporal references and vertical arrows denote inter-view references. An I-class view (I-View) uses intra-frame coding and does not reference other views. A P-class view (P-View) uses unidirectional inter-view reference to the I view, like View2 in Fig. 1. Similarly, a B-class view (B-View) is obtained by bidirectional inter-view prediction from the reconstructed I-class and P-class views, like View1 in Fig. 1. As shown in Fig. 1, an image group consists of all frames from time T0 to time T7 in each view, and the first frame of each image group is the key frame, namely I0, B1 and P0 in the respective views. In the method of the invention the views are divided into three classes: I-view, P-view and B-view. Fig. 2 shows the overall block diagram of the method of the invention, which comprises the following steps:

1. Define the k'-th view video currently being processed in the input multi-view video as the current view video, where k' is the view index and the initial value of k' is 1. For example, if the current view video is the 1st view video, then k'=1.

2. Divide the current view video into multiple image groups. Each image group contains the three picture types of the AVC coding format, namely I frames, P frames and B frames, where the B frames in turn comprise four types, B₁, B₂, B₃ and B₄. The first frame of each image group is the key frame, as shown in Fig. 1.

3. Calculate the number of target bits pre-allocated to the current view video, denoted T_view(k'), T_view(k') = T_total × w(k'), where T_total is the total number of target bits of the input multi-view video and w(k') is the proportional weight of each view video of the input multi-view video. w(k') is determined by the ratio of the actual numbers of coded bits obtained when each view video is coded with a fixed coding quantization parameter.
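The view-layer allocation of step 3 can be sketched as below: the weights are the shares of bits each view consumed in a trial coding pass at a fixed QP, and each view's budget is its share of the total. The function names and sample numbers are hypothetical.

```python
def view_weights(coded_bits_per_view):
    """w(k') for each view: its share of the bits used at a fixed QP."""
    total = sum(coded_bits_per_view)
    if total <= 0:
        raise ValueError("coded bit counts must sum to a positive value")
    return [bits / total for bits in coded_bits_per_view]


def view_target_bits(total_target_bits, weights):
    """T_view(k') = T_total * w(k') for each view."""
    return [total_target_bits * w for w in weights]
```

If three views used 500, 300 and 200 kbit in the fixed-QP pass, their weights are 0.5, 0.3 and 0.2, and a total budget of 2,000,000 bits is split into about 1,000,000, 600,000 and 400,000 bits.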

4. Pre-allocate target bits to each image group in the current view video and set the coding quantization parameter of the key frame in each image group, as follows:

4.-1 Calculate the number of target bits pre-allocated to the i-th image group in the current view video, denoted f(i,0),

f (i, 0) = \begin{cases} \frac{B}{F_{r}} \times N_{gop} & i = 1 \\ \frac{B}{F_{r}} \times N_{gop} - \left(\frac{B_{s}}{8} - B_{c} (i - 1, N_{gop})\right) & 2 \leq i \leq N \end{cases},

where N is the number of image groups in the current view video, B is the externally set available bandwidth, F_r is the frame rate, N_gop is the number of frames in the i-th image group of the current view video, B_s is the initial buffer size, and B_c(i-1, N_gop) is the actual buffer occupancy after the (i-1)-th image group of the current view video has been coded.

Here, to ensure that all image groups in the current view video have a certain quality, the actual buffer occupancy must stay at about B_s/8, the level that appears in the formula above, after each image group has been coded.

4.-2 Determine whether the i-th image group in the current view video is the first image group. If so, take the externally set initial coding quantization parameter QP₀ as the coding quantization parameter of the key frame in the first image group and go to step 4.-6; otherwise go to step 4.-3. Here 1≤i≤N and QP₀∈[0,51], that is, QP₀ is an integer in the range 0 to 51.

4.-3 Using the rate-quantization model of the key frame, calculate the first candidate coding quantization parameter of the key frame in the i-th image group of the current view video, denoted Q'_K(i,0), where R_K(i,0) is the number of target bits pre-allocated to the key frame in the i-th image group of the current view video, and C₁ and C₂ are constants.

In this embodiment, the detailed process of step 4.-3 is:

a. Calculate the coding complexity of the key frame in the (i-1)-th image group of the current view video, denoted C_K(i-1,0), C_K(i-1,0)=A_K(i-1,0)×Q_K(i-1,0), where A_K(i-1,0) is the actual number of bits used to code the key frame in the (i-1)-th image group of the current view video and Q_K(i-1,0) is the coding quantization parameter of that key frame.

b. Calculate the coding complexity of each frame other than the key frame in the (i-1)-th image group of the current view video; the coding complexity of the j-th frame in the (i-1)-th image group is denoted C_B(i-1,j), C_B(i-1,j)=A_B(i-1,j)×Q_B(i-1,j). Then calculate the sum of the coding complexities of all frames other than the key frame in the (i-1)-th image group, denoted C_{B,i-1}, where 2≤j≤N_gop, A_B(i-1,j) is the actual number of bits used to code the j-th frame in the (i-1)-th image group and Q_B(i-1,j) is the coding quantization parameter of that frame.

c. From the coding complexity of the key frame in the (i-1)-th image group of the current view video and the sum of the coding complexities of all frames other than the key frame, calculate the number of target bits pre-allocated to the key frame in the i-th image group of the current view video, denoted R_K(i,0), where f(i,0) is the number of target bits pre-allocated to the i-th image group of the current view video and C_K(i-1,0) is the coding complexity of the key frame in the (i-1)-th image group.

d. Using the rate-quantization model of the key frame, calculate the first candidate coding quantization parameter of the key frame in the i-th image group of the current view video, denoted Q'_K(i,0), where C₁ and C₂ are constants; here C₁=17.96 and C₂=-0.1704.
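Steps a and b reduce to multiplying the coded bits of each frame by its QP. The sketch below shows this, followed by a key-frame bit budget for step c. The formula for R_K(i,0) is not reproduced in this text, so the proportional split used here is only an assumption made for illustration; all names and numbers are hypothetical.

```python
def coding_complexity(actual_bits, qp):
    """Complexity of one coded frame: actual bits times its QP."""
    return actual_bits * qp


def non_key_complexity_sum(bits_per_frame, qp_per_frame):
    """Sum of complexities over the non-key frames of an image group."""
    return sum(coding_complexity(a, q)
               for a, q in zip(bits_per_frame, qp_per_frame))


def key_frame_target_bits(gop_bits, key_complexity, non_key_sum):
    """ASSUMPTION: split f(i, 0) in proportion to the key frame's share
    of the previous image group's total complexity."""
    total = key_complexity + non_key_sum
    if total <= 0:
        raise ValueError("total complexity must be positive")
    return gop_bits * key_complexity / total
```

For a previous image group whose key frame used 400,000 bits at QP 28 and whose seven B frames each used 50,000 bits at QP 32, both complexities equal 11,200,000, so under the assumed split the next key frame would receive half of a 320,000-bit image group budget.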

In the present embodiment, the rate-quantization (R-QP) model of the key frame is a curve model obtained by statistical analysis and fitting. Fig. 3a shows the rate-quantization statistical curves of the key frames of the three view classes for the Breakdancers sequence at a resolution of 1024 × 768, and Fig. 3b shows the same curves for the Doorflowers sequence at a resolution of 1024 × 768. In Fig. 3a and Fig. 3b, the abscissa is the quantization parameter, which takes its values from a set of 26 different QPs, denoted {QP_K | QP_K=2 × K, K ∈ N, 0≤K≤25}, and the ordinate is the actual number of bits used to code the key frame at the corresponding coding quantization parameter, in megabits (Mbit). Fig. 3a and Fig. 3b show that, for a given sequence, the rate-quantization curves of the key frames of the three view classes are almost identical. Fig. 4 shows the rate-quantization scatter plot of the key frames of four different sequences at 1024 × 768 resolution over the QP_K values. Fig. 4 shows that the overall trends of the four sequences are similar. The average bit rate scatter plot of the four sequences is obtained with the median method, and from the average bit rate values of the four sequences and the corresponding quantization parameters two rate-quantization curves of the key frame are fitted: one an exponential model, the other a quadratic model.

Table 1 gives some indicators of the two fits. SSE (sum of squared errors) is the sum of the squared deviations of the measured points from the fitted values; the smaller it is, the better. R-square is the goodness of fit, with range [0,1]: the closer R-square is to 1, the better the regression curve fits the measured values, and the closer it is to 0, the worse the fit. The data in Table 1 show that, for QP ∈ {QP_K | QP_K=2 × K, K ∈ N, 0≤K≤25}, the SSE of the quadratic model is slightly lower than that of the exponential model and its R-square is higher, so the quadratic model is slightly better than the exponential model, but neither is accurate enough.

Although the analysis of Fig. 4 and Table 1 finds the quadratic model slightly better, the QP range used in practical applications is generally QP ∈ {QP_K | QP_K=2 × K, K ∈ N, 10≤K≤25}. To remove the influence of the QP interval from 0 to 20, denoted QP ∈ {QP_K | QP_K=2 × K, K ∈ N, 0≤K≤9}, on the QPs actually used, and to better fit the rate-quantization model of the key frame for practical use, the quadratic and exponential models are fitted again for each sequence with QP ∈ {QP_K | QP_K=2 × K, K ∈ N, 10≤K≤25}. Table 2 gives the indicators of the two models for the four sequences so that their quality can be compared. Table 2 shows that, when the R-QP model is fitted over the interval QP ∈ {QP_K | QP_K=2 × K, K ∈ N, 10≤K≤25}, the exponential model is more accurate than the quadratic model for all four sequences. The SSE of the exponential model is below 0.01, and its R-square exceeds 0.99 for three sequences, against only one sequence for the quadratic model, whose other correlation coefficients are all lower. In addition, for the average R-QP statistics obtained by averaging the bit rates of the four sequences at each QP, the correlation coefficient of the exponential curve is as high as 0.99, while that of the quadratic curve is only 0.92; likewise, the SSE of the quadratic curve is larger. Finally, the rate-quantization model of the key frame is determined to be the exponential model: where C₁ and C₂ are constants, R is the target bit rate of the key frame, and
QP is the quantization parameter of key frame.

Table 1 Comparison of the R-QP fitted curves of the key frame for QP ∈ {QP_K | QP_K = 2 × K, K ∈ N, 0 ≤ K ≤ 25}

Fitted curve	Equation	SSE	R-square
Exponential model	R = 5.1060 × exp(-0.0873 × QP)	1.5021	0.9726
Quadratic model	R = 0.0033 × QP² - 0.2527 × QP + 4.6310	0.6427	0.9884

Table 2 R-QP fitted curves over the interval QP ∈ {QP_K | QP_K = 2 × K, K ∈ N, 10 ≤ K ≤ 25}
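As an illustration of how an exponential key-frame model of this kind can be used in both directions, the following minimal sketch assumes the form R = C₁ × exp(C₂ × QP) (the same form as the exponential fit in Table 1) with the constants C₁ = 17.96 and C₂ = -0.1704 given in claim 3. The function names are hypothetical, and the unit of R is assumed to be the unit used for the fit (Mbit in Fig. 3a and Fig. 3b).

```python
import math

# Constants as stated in claim 3; the unit of R is an assumption (Mbit).
C1 = 17.96
C2 = -0.1704


def keyframe_rate(qp, c1=C1, c2=C2):
    """Exponential R-QP model: R = C1 * exp(C2 * QP)."""
    return c1 * math.exp(c2 * qp)


def keyframe_qp(target_rate, c1=C1, c2=C2):
    """Invert the model: QP = ln(R / C1) / C2 (real-valued, not rounded)."""
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    return math.log(target_rate / c1) / c2


if __name__ == "__main__":
    rate = keyframe_rate(30)
    print(rate, keyframe_qp(rate))
```

Because C₂ is negative, the modelled bit rate falls monotonically as QP grows, so the inversion yields a unique QP for any positive target bit rate.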

4.-4, according to the mean value of the coding quantization parameters of all frames except the key frame in the (i-1)-th group of pictures (GOP) in the current viewpoint video, calculate the second candidate coding quantization parameter of the key frame in the i-th GOP in the current viewpoint video, denoted Q''_K(i, 0),

Q_{K}^{''} (i, 0) = \frac{{Sum}_{BQP} (i - 1)}{N_{gop} - 1} - 1 - \frac{8 \times T_{r} (i - 1, N_{gop})}{T_{r} (i, 0)} - \frac{N_{gop}}{15},

where Sum_BQP(i-1) represents the sum of the coding quantization parameters of all frames except the key frame in the (i-1)-th GOP in the current viewpoint video, T_r(i-1, N_gop) represents the number of bits remaining in the current viewpoint video after the (i-1)-th GOP has been coded, and T_r(i, 0) represents the number of bits remaining in the i-th GOP when the key frame in the i-th GOP in the current viewpoint video is about to be coded.

4.-5, from Q'_K(i, 0) and Q''_K(i, 0), select the candidate coding quantization parameter with the smaller value as the preliminary coding quantization parameter of the key frame in the i-th GOP in the current viewpoint video, denoted \hat{Q}_K(i, 0); then, so that the quality difference between two successive frames is small, \hat{Q}_K(i, 0) is corrected according to the principle that the difference between the coding quantization parameters of two successive frames of the same type must not exceed 2, and the corrected coding quantization parameter is denoted \tilde{Q}_K(i, 0),

{\tilde{Q}}_{K} (i, 0) = \min \{ Q_{K} (i - 1, 0) + 2, \max \{ Q_{K} (i - 1, 0) - 2, {\hat{Q}}_{K} (i, 0) \} \},

Then \tilde{Q}_K(i, 0) is further corrected according to the H.264 standard to obtain the final coding quantization parameter of the key frame in the i-th GOP in the current viewpoint video, denoted Q_K(i, 0), where Q_K(i-1, 0) represents the coding quantization parameter of the key frame in the (i-1)-th GOP in the current viewpoint video, min{} is the minimum function, and max{} is the maximum function.
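Steps 4.-4 and 4.-5 can be sketched as follows. This is an illustrative sketch only: the function and parameter names are hypothetical, and the final "correction according to the H.264 standard" is assumed here to mean rounding to an integer and clipping to the valid H.264 QP range 0 to 51, which the text does not spell out.

```python
def second_candidate_qp(sum_bqp_prev, n_gop, t_r_prev_end, t_r_cur_start):
    """Second candidate Q''_K(i, 0) of step 4.-4.

    sum_bqp_prev  : Sum_BQP(i-1), sum of QPs of the non-key frames of the previous GOP
    n_gop         : N_gop, number of frames per GOP
    t_r_prev_end  : T_r(i-1, N_gop), bits remaining after coding the previous GOP
    t_r_cur_start : T_r(i, 0), bits remaining before coding the current key frame
    """
    return (sum_bqp_prev / (n_gop - 1)
            - 1
            - 8 * t_r_prev_end / t_r_cur_start
            - n_gop / 15)


def final_keyframe_qp(q_first, q_second, q_prev_key, qp_min=0, qp_max=51):
    """Step 4.-5: pick the smaller candidate, then limit the change to +/-2."""
    q_hat = min(q_first, q_second)
    q_tilde = min(q_prev_key + 2, max(q_prev_key - 2, q_hat))
    # Assumption: the H.264 correction is rounding plus clipping to [0, 51].
    return max(qp_min, min(qp_max, round(q_tilde)))


if __name__ == "__main__":
    q2 = second_candidate_qp(210, 8, 0, 1000)
    print(q2, final_keyframe_qp(31, q2, 32))
```

The clamp keeps the key-frame QP of consecutive GOPs within 2 of each other regardless of which candidate is selected.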

4.-6, let i = i + 1, then return to step 4.-1 to continue pre-allocating target bits to the next GOP in the current viewpoint video and setting the coding quantization parameter of the key frame in the next GOP, until all GOPs in the current viewpoint video have been processed, where "=" in i = i + 1 is the assignment operator.

5.-1, in the case of constant bandwidth, calculate the number of bits remaining in the i-th GOP when each frame except the key frame in the i-th GOP in the current viewpoint video is about to be coded; the number of bits remaining in the i-th GOP when the j-th frame in the i-th GOP is about to be coded is denoted T_r(i, j), T_r(i, j) = T_r(i, j-1) - A(i, j-1). Then calculate the target buffer level when each frame except the key frame in the i-th GOP in the current viewpoint video is about to be coded; the target buffer level when the j-th frame in the i-th GOP is about to be coded is denoted Tbl(i, j), where 1 ≤ i ≤ N, 2 ≤ j ≤ N_gop, T_r(i, j-1) represents the number of bits remaining in the i-th GOP before the (j-1)-th frame in the i-th GOP is coded, A(i, j-1) represents the actual number of bits used to code the (j-1)-th frame in the i-th GOP, Tbl(i, j-1) represents the target buffer level when the (j-1)-th frame in the i-th GOP is about to be coded, Tbl(i, 2) represents the target buffer level when the 2nd frame in the i-th GOP is about to be coded, Tbl(i, 2) = B_c(i, 2), and B_c(i, 2) represents the actual buffer occupancy after the 2nd frame in the i-th GOP in the current viewpoint video has been coded.

5.-2, according to the number of bits remaining in the i-th GOP when each frame except the key frame in the i-th GOP in the current viewpoint video is about to be coded, calculate a candidate target bit number pre-allocated to each frame except the key frame in the i-th GOP. For the j-th frame in the i-th GOP in the current viewpoint video, the candidate target bit number pre-allocated to the j-th frame is calculated from T_r(i, j), where l denotes the level of a B frame, 1 ≤ t ≤ L, L represents the total number of levels of the hierarchical B frames, and the remaining symbols represent, respectively, the weight corresponding to B frames of the t-th level, the weight corresponding to B frames of the l-th level among the remaining B frames, and the number of B frames of the l-th level among the remaining B frames. Here, when the GOP length in the current viewpoint video is 8, the total number of levels L of the hierarchical B frames is 3.

\tilde{f} (i, j) = \frac{B}{F_{r}} + γ \times (Tbl (i, j) - B_{c} (i, j)),

where γ is a constant; here γ = 0.75.
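The remaining-bits update of step 5.-1 and the buffer-based candidate above can be sketched as follows. This is a minimal sketch with hypothetical names; B is read here as the channel bandwidth in bits per second and F_r as the frame rate in frames per second, which is an interpretation of the symbols rather than a definition given in this passage.

```python
def remaining_bits(t_r_prev, actual_bits_prev):
    """T_r(i, j) = T_r(i, j-1) - A(i, j-1)."""
    return t_r_prev - actual_bits_prev


def buffer_based_target(bandwidth, frame_rate, target_buffer_level,
                        actual_buffer_level, gamma=0.75):
    """Buffer-based candidate: B / F_r + gamma * (Tbl(i, j) - B_c(i, j)).

    Assumption: bandwidth is in bits per second and frame_rate in frames
    per second, so the result is in bits per frame.
    """
    return (bandwidth / frame_rate
            + gamma * (target_buffer_level - actual_buffer_level))


if __name__ == "__main__":
    print(remaining_bits(100000, 12000))
    print(buffer_based_target(1000000, 25, 5000, 3000))
```

When the actual buffer occupancy is below the target level, the candidate exceeds the average per-frame budget B / F_r; when it is above, the candidate is reduced.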

5.-4, according to the two candidate target bit numbers, calculated in step 5.-2 and step 5.-3, that are pre-allocated to each frame except the key frame in the i-th GOP in the current viewpoint video, calculate the final target bit number pre-allocated to each frame except the key frame in the i-th GOP. For the j-th frame in the i-th GOP in the current viewpoint video, the final target bit number pre-allocated to the j-th frame is calculated from its two candidate target bit numbers and is denoted f(i, j), where β is a constant; here β = 0.5.

6.-1, calculate the target bit number pre-allocated to each macroblock in each frame except the key frame in the i-th GOP in the current viewpoint video; the target bit number pre-allocated to the k-th macroblock in the j-th frame in the i-th GOP is denoted f_mb(j, k), where 1 ≤ i ≤ N, 2 ≤ j ≤ N_gop, T_mb(j, k) represents the number of bits remaining in the j-th frame when the k-th macroblock in the j-th frame in the i-th GOP is about to be coded, MAD(j, k) represents the MAD value of the k-th macroblock in the j-th frame in the i-th GOP, MAD(j, p) represents the MAD value of the p-th macroblock in the j-th frame in the i-th GOP, 1 ≤ k ≤ N_mb, 1 ≤ p ≤ N_mb, and N_mb represents the number of macroblocks contained in the j-th frame in the i-th GOP in the current viewpoint video.

In this particular embodiment, the MAD value MAD(j, k) of the k-th macroblock in the j-th frame in the i-th GOP in the current viewpoint video is obtained as follows: from the MAD value MAD(j-1, k) of the macroblock in the (j-1)-th frame in the i-th GOP that is at the same position as the k-th macroblock in the j-th frame, MAD(j, k) is obtained by linear prediction, MAD(j, k) = a₁ × MAD(j-1, k) + a₂, where a₁ and a₂ are prediction model parameters, the initial value of a₁ is 1, the initial value of a₂ is 0, and the values of a₁ and a₂ are updated by linear regression after each macroblock has been coded.
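The linear MAD prediction can be sketched as below. The text only says that a₁ and a₂ are updated by "linear regression"; ordinary least squares over all samples observed so far is an assumption made for this sketch, and the class and method names are hypothetical.

```python
class MadPredictor:
    """Linear MAD predictor: MAD(j, k) = a1 * MAD(j-1, k) + a2."""

    def __init__(self):
        # Initial values given in the text: a1 = 1, a2 = 0.
        self.a1 = 1.0
        self.a2 = 0.0
        self._prev = []    # MAD of co-located macroblocks in the previous frame
        self._actual = []  # actual MAD measured after coding

    def predict(self, mad_prev):
        return self.a1 * mad_prev + self.a2

    def update(self, mad_prev, mad_actual):
        """Refit a1 and a2 by ordinary least squares (assumed technique)."""
        self._prev.append(mad_prev)
        self._actual.append(mad_actual)
        n = len(self._prev)
        if n < 2:
            return  # not enough samples to fit a line
        mean_x = sum(self._prev) / n
        mean_y = sum(self._actual) / n
        sxx = sum((x - mean_x) ** 2 for x in self._prev)
        if sxx == 0:
            return  # all inputs identical: keep the current parameters
        sxy = sum((x - mean_x) * (y - mean_y)
                  for x, y in zip(self._prev, self._actual))
        self.a1 = sxy / sxx
        self.a2 = mean_y - self.a1 * mean_x


if __name__ == "__main__":
    p = MadPredictor()
    for x, y in [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]:
        p.update(x, y)
    print(p.a1, p.a2, p.predict(4.0))
```

With the initial parameters the predictor simply reuses the MAD of the co-located macroblock; the parameters only change once at least two distinct samples are available.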

f_{mb} (j, k) = (\frac{X_{1}}{Q_{mb} {(j, k)}^{2}} + \frac{X_{2}}{Q_{mb} (j, k)}) \times MAD (j, k),

where X₁ and X₂ are the model parameters in the above formula, and the values of X₁ and X₂ are updated by linear regression after each macroblock has been coded.
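Given a macroblock target bit number and a predicted MAD, the quadratic relation f_mb = (X₁ / Q² + X₂ / Q) × MAD can be solved for Q. The sketch below does this by substituting u = 1 / Q and taking the positive root of X₁ × u² + X₂ × u - f_mb / MAD = 0. The function names are hypothetical, and whether Q denotes a quantization step or a quantization parameter is left as in the text; the example parameter values are illustrative only.

```python
import math


def macroblock_bits(q, mad, x1, x2):
    """Forward model: f_mb = (X1 / Q^2 + X2 / Q) * MAD."""
    return (x1 / q ** 2 + x2 / q) * mad


def solve_quantizer(target_bits, mad, x1, x2):
    """Solve the quadratic model for Q (positive root in u = 1 / Q)."""
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    ratio = target_bits / mad
    if x1 == 0:
        # Model degenerates to f_mb = X2 / Q * MAD.
        if x2 <= 0:
            raise ValueError("no positive solution")
        return x2 / ratio
    disc = x2 * x2 + 4.0 * x1 * ratio
    if disc < 0:
        raise ValueError("no real solution")
    u = (-x2 + math.sqrt(disc)) / (2.0 * x1)
    if u <= 0:
        raise ValueError("no positive solution")
    return 1.0 / u


if __name__ == "__main__":
    bits = macroblock_bits(10.0, 4.0, 200.0, 30.0)
    print(bits, solve_quantizer(bits, 4.0, 200.0, 30.0))
```

For X₁ > 0 the discriminant exceeds X₂², so a positive root always exists and the solution is unique.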

The method of the invention is tested below to demonstrate its validity and feasibility. The test environment is listed in Table 3; the original method and the method of the invention were tested on a computer with an Intel Core 2 Duo 3.0 GHz processor and 3.25 GB of memory. The original method refers to the G012 algorithm of H.264/AVC extended to multi-view video coding, whose rate control part has not been modified in any way for multi-view video. In Table 3, the original method is denoted OrgRC and the method of the invention is denoted ProRC.

Table 4 compares the rate control accuracy and the peak signal-to-noise ratio (PSNR) of the method of the invention and the original method for 4 sequences under different basic QPs. The target bit rate and the actual bit rate in Table 4 are the totals for the 3 viewpoints, and the rate control error (RCE) is used to measure the accuracy of rate control, where R_target and R_actual represent the target bit rate and the actual bit rate, respectively.

In terms of rate accuracy, Table 4 shows that the rate control error of both the original method and the method of the invention is mostly within 1%. The average rate control error of the original method is -0.17% and that of the method of the invention is -0.36%, so the two methods differ little in rate control accuracy. Because the method of the invention modifies the key frames, which are intra coded and account for a large share of the bit rate, any change in the bit rate of the key frames has a larger effect on rate accuracy, so the rate accuracy is somewhat lower, especially at the low bit rate end.

In terms of PSNR, Table 4 shows that the PSNR of the method of the invention is clearly higher than that of the original method, with an average improvement of 0.19 dB. At high bit rates the PSNR gain is less obvious, but every sequence improves considerably at lower bit rates; in particular, the gain for Doorflowers exceeds 0.8 dB and the gain for the Ballet sequence reaches 0.6 dB. The PSNR gain of the test sequences is more obvious at the low bit rate end because the original method uses the average QP of the B frames of the previous GOP when calculating the QP of the key frame, and at low bit rates the QP of the B frames is large, so the QP of the key frame is large and its quality is low. The method of the invention instead uses the rate-quantization (R-QP) model of the key frame to obtain the QP from the allocated bit rate; the allocated bit rate and the model coefficients are both relatively stable and the resulting QP is smaller, so the PSNR gain at the low bit rate end is more significant.
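The defining formula of the rate control error is not reproduced in this text. The sketch below therefore assumes a commonly used definition, RCE = (R_actual - R_target) / R_target × 100%, under which a negative value means the actual bit rate is below the target; both the definition and the function name are assumptions made for illustration.

```python
def rate_control_error(r_target, r_actual):
    """Rate control error in percent (assumed definition, see lead-in)."""
    if r_target <= 0:
        raise ValueError("r_target must be positive")
    return (r_actual - r_target) / r_target * 100.0


if __name__ == "__main__":
    print(rate_control_error(1000.0, 995.0))
```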

Fig. 5a, Fig. 5b, Fig. 5c and Fig. 5d compare the rate-distortion performance of the JMVC original method (OrgRC) and the method of the invention (ProRC) for the Breakdancers, Ballet, Doorflowers and Alt_Moabit sequences, respectively. Total rate in Fig. 5a, Fig. 5b, Fig. 5c and Fig. 5d refers to the total bit rate of the 3 classes of viewpoints, and PSNR is the mean PSNR of the 3 classes of viewpoints. Fig. 5a, Fig. 5b, Fig. 5c and Fig. 5d show that the rate-distortion performance of the method of the invention is slightly better than that of the original method: it is almost level with the original method at the high bit rate end, and the improvement is more significant at the low bit rate end. The reason is that the rate-quantization model of the key frame and the bit allocation for B frames of different levels in the method of the invention improve PSNR and rate-distortion performance to a certain extent.

Fig. 6a and Fig. 6b show the reconstructed images of the Breakdancers sequence obtained with the original method and with the method of the invention, respectively; Fig. 6c shows a local region of the original image of the Breakdancers sequence, Fig. 6d shows the corresponding local region of Fig. 6a, and Fig. 6e shows the corresponding local region of Fig. 6b. Analysis of Fig. 6a to Fig. 6e shows that, in the flat background behind the dancer, the original method produces severe blocking artifacts, whereas the method of the invention achieves higher quality without blocking artifacts. The reason should be that the method of the invention improves the quality of the key frame, which in turn considerably improves the quality of the subsequent frames and protects the quality of flat regions; the dancer's arm, especially its edge, which shows fairly obvious blocking artifacts in the original method, is also well preserved. Fig. 6d and Fig. 6e enlarge the local region of the arm: in the original method the edge of the arm is jagged, whereas in the method of the invention the edge is smoother and better matches the subjective perception of the human eye.

Fig. 7a and Fig. 7b show the reconstructed images of the Ballet sequence obtained with the original method and with the method of the invention, respectively; Fig. 7c shows a local region of the original image of the Ballet sequence, Fig. 7d shows the corresponding local region of Fig. 7a, and Fig. 7e shows the corresponding local region of Fig. 7b. As with the Breakdancers sequence, a comparison of Fig. 7a and Fig. 7b shows that the cheek, the arm and the raised leg of the ballet dancer are all fairly seriously degraded in the original method. The local regions in Fig. 7d and Fig. 7e show more clearly that the face is completely destroyed in the original method, whereas the method of the invention achieves higher quality than the original method in all of these areas.

Fig. 8a and Fig. 8b show the reconstructed images of the Alt_Moabit sequence obtained with the original method and with the method of the invention, respectively; Fig. 8c shows a local region of the original image of the Alt_Moabit sequence, Fig. 8d shows the corresponding local region of Fig. 8a, and Fig. 8e shows the corresponding local region of Fig. 8b. The Alt_Moabit sequence contains scene changes, in which a car and a relatively large bus appear. The local regions in Fig. 8c to Fig. 8e show that, in the original method, obvious blocking artifacts appear on the front of the blue car and on the front of the bus, whereas in the method of the invention the fronts of both the car and the bus are clearer and show no obvious blocking artifacts.

Fig. 9a and Fig. 9b show the reconstructed images of the Doorflowers sequence obtained with the original method and with the method of the invention, respectively; Fig. 9c shows a local region of the original image of the Doorflowers sequence, Fig. 9d shows the corresponding local region of Fig. 9a, and Fig. 9e shows the corresponding local region of Fig. 9b. In the Doorflowers sequence a man holding flowers walks in through the door, and the texture is relatively complex. Close inspection of Fig. 9a to Fig. 9e shows that the edges of the head and legs of the man entering and the edge of the chair are all blurrier and less clear in the original method, whereas the method of the invention achieves higher quality, as can also be seen from the local region figures.

Table 3 Test environment

Table 4 Comparison of the rate control accuracy and PSNR of the method of the invention and the original method

Claims

1. A multi-view video rate control method based on a key frame rate-quantization model, characterized by comprising the following steps:

3. calculate the target bit number pre-allocated to the current viewpoint video, denoted T_view(k'), T_view(k') = T_total × w(k'), where T_total represents the total target bit number of the input multi-view video, and w(k') represents the proportion weight of the k'-th viewpoint video in the input multi-view video;

f (i, 0) = \begin{cases} \frac{B}{F_{r}} \times N_{gop} & i = 1 \\ \frac{B}{F_{r}} \times N_{gop} - (\frac{B_{s}}{8} - B_{c} (i - 1, N_{gop})) & 2 \leq i \leq N \end{cases},

Q_{K}^{''} (i, 0) = \frac{{Sum}_{BQP} (i - 1)}{N_{gop} - 1} - 1 - \frac{8 \times T_{r} (i - 1, N_{gop})}{T_{r} (i, 0)} - \frac{N_{gop}}{15},

{\tilde{Q}}_{K} (i, 0) = \min \{ Q_{K} (i - 1, 0) + 2, \max \{ Q_{K} (i - 1, 0) - 2, {\hat{Q}}_{K} (i, 0) \} \},

5.-2, according to the number of bits remaining in the i-th GOP when each frame except the key frame in the i-th GOP in the current viewpoint video is about to be coded, calculate a candidate target bit number pre-allocated to each frame except the key frame in the i-th GOP; for the j-th frame in the i-th GOP in the current viewpoint video, the candidate target bit number pre-allocated to the j-th frame is calculated from T_r(i, j), where l denotes the level of a B frame, 1 ≤ t ≤ L, L represents the total number of levels of the hierarchical B frames, and the remaining symbols represent, respectively, the weight corresponding to B frames of the t-th level, the weight corresponding to B frames of the l-th level among the remaining B frames, and the number of B frames of the l-th level among the remaining B frames;

\tilde{f} (i, j) = \frac{B}{F_{r}} + γ \times (Tbl (i, j) - B_{c} (i, j)),

where γ is a constant;

5.-4, according to the two candidate target bit numbers, calculated in step 5.-2 and step 5.-3, that are pre-allocated to each frame except the key frame in the i-th GOP in the current viewpoint video, calculate the final target bit number pre-allocated to each frame except the key frame in the i-th GOP; for the j-th frame in the i-th GOP in the current viewpoint video, the final target bit number pre-allocated to the j-th frame is calculated from its two candidate target bit numbers and is denoted f(i, j), where β is a constant;

f_{mb} (j, k) = (\frac{X_{1}}{Q_{mb} {(j, k)}^{2}} + \frac{X_{2}}{Q_{mb} (j, k)}) \times MAD (j, k),

f_{mb} (j, k) = (\frac{X_{1}}{Q_{mb} {(j, k)}^{2}} + \frac{X_{2}}{Q_{mb} (j, k)}) \times MAD (j, k)

2. The multi-view video rate control method based on a key frame rate-quantization model according to claim 1, characterized in that the detailed process of said step 4.-3 is:

3. The multi-view video rate control method based on a key frame rate-quantization model according to claim 1, characterized in that, in said step 4.-3, C₁ = 17.96 and C₂ = -0.1704.

4. The multi-view video rate control method based on a key frame rate-quantization model according to any one of claims 1 to 3, characterized in that, in said step 5.-2, when the GOP length in the current viewpoint video is 8, the total number of levels L of the hierarchical B frames is 3; in said step 5.-3, γ = 0.75; and in said step 5.-4, β = 0.5.

5. The multi-view video rate control method based on a key frame rate-quantization model according to claim 4, characterized in that, in said step 6.-1, the MAD value MAD(j, k) of the k-th macroblock in the j-th frame in the i-th GOP in the current viewpoint video is obtained as follows: from the MAD value MAD(j-1, k) of the macroblock in the (j-1)-th frame in the i-th GOP that is at the same position as the k-th macroblock in the j-th frame, MAD(j, k) is obtained by linear prediction, MAD(j, k) = a₁ × MAD(j-1, k) + a₂, where a₁ and a₂ are prediction model parameters, the initial value of a₁ is 1, the initial value of a₂ is 0, and the values of a₁ and a₂ are updated by linear regression after each macroblock has been coded.