CN102547300B

CN102547300B - Method and device for detecting frame types

Info

Publication number: CN102547300B
Application number: CN201010594322.5A
Authority: CN
Inventors: 沈秋; 谢清鹏; 张冬; 李厚强
Original assignee: University of Science and Technology of China USTC; Huawei Technologies Co Ltd
Current assignee: University of Science and Technology of China USTC; Huawei Technologies Co Ltd
Priority date: 2010-12-17
Filing date: 2010-12-17
Publication date: 2015-01-21
Anticipated expiration: 2030-12-17
Also published as: EP2814255A2; EP2814255A3; US9497459B2; EP2637410A4; EP2814255B1; EP2637410B1; WO2012079406A1; CN102547300A; EP2637410A1; US20130279585A1

Abstract

An embodiment of the invention discloses a method and a device for detecting frame types. The method includes detecting the play time of each frame, and determining that the current frame is a bidirectionally predicted coded frame (B frame) if its play time is less than the maximum play time of the frames already received. The technical scheme combines the coding order of the different frame types with the relationship between the data amounts of preceding and following frames of different types, so that frame types are determined without decoding the payload, the influence of the attenuation factor is eliminated, and the accuracy of frame type detection is improved.

Description

Method and device for detecting frame types

Technical field

The present invention relates to the technical field of video processing, and in particular to a method and a device for detecting frame types.

Background

The decodable frame types in video coding standards can be divided into intra-coded frames (I-Frame, Intra coded frame, I frame), unidirectionally predicted coded frames (P-Frame, Predicted frame, P frame) and bidirectionally predicted coded frames (B-Frame, Bi-directional predicted frame, B frame). In video applications an I frame is the point from which decoding can start. It is commonly called a random access point and supports services such as random access and fast browsing. During transmission, errors in different frame types affect the subjective quality at the decoder differently. An I frame cuts off error propagation, so an error in an I frame has a very large impact on the decoding quality of the whole video. A P frame often serves as a reference frame for other inter-coded frames, so its importance is second to that of the I frame. A B frame is usually not used as a reference frame, so its loss has less impact on decoding quality. Distinguishing the frame types of a data stream is therefore very important in video transmission applications. For example, frame type is an important parameter in video quality assessment, and the accuracy of frame type determination directly affects the accuracy of the assessment result. Unequal protection can be applied to the different frame types in a video to achieve effective transmission. In addition, to save transmission resources, frames that have little effect on subjective quality can be dropped when bandwidth is insufficient.

Commonly used streaming technologies are mainly the Internet Streaming Media Alliance (ISMA) mode and the Moving Picture Experts Group-2 Transport Stream over Internet Protocol (MPEG-2 TS over IP) mode. When encapsulating a compressed video data stream, both protocol modes provide indicator bits that can indicate the video data type. The ISMA mode encapsulates the compressed video data stream directly with the Real-time Transport Protocol (RTP): MPEG-4 Part 2 follows Request For Comments 3016 (RFC 3016), and H.264/Advanced Video Coding (AVC) follows RFC 3984. Taking RFC 3984 as an example, the sequence number and timestamp carried in the RTP header can be used to detect frame loss and to help detect the frame type. The MPEG-2 TS over IP mode has two variants: TS over User Datagram Protocol/IP (TS over UDP/IP) and TS over Real-time Transport Protocol/UDP/IP (TS over RTP/UDP/IP). The variant more commonly used in video transmission is TS over RTP/UDP/IP (hereinafter TS over RTP), in which the compressed video data stream is encapsulated as an elementary stream, the elementary stream is divided into TS packets, and the TS packets are finally encapsulated in RTP and transmitted.

RTP is a transport protocol for multimedia data streams that provides end-to-end real-time data transmission. Its message mainly comprises four parts: the RTP header, the RTP extension header, the payload header and the payload data. The RTP header mainly contains the sequence number, the timestamp and the marker bit. The sequence number corresponds one-to-one with RTP packets and increases by 1 for each packet sent, so it can be used to detect packet loss. The timestamp represents the sampling time of the video data; different frames have different timestamps, so it can indicate the play order of the video data. The marker bit marks the end of a frame. This information is an important basis for frame type determination.
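The header fields named above can be read without touching the payload. The following is a minimal sketch, assuming the standard 12-byte fixed RTP header layout; the function names `parse_rtp_header` and `lost_between` are illustrative and do not come from the patent.

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Read the 12-byte fixed RTP header; the payload is never inspected."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "marker": (b1 >> 7) & 1,   # for video, set on the last packet of a frame
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,    # increases by 1 per packet, wraps at 65536
        "timestamp": ts,           # sampling time, shared by all packets of a frame
        "ssrc": ssrc,
    }

def lost_between(prev_seq: int, cur_seq: int) -> int:
    """Packets missing between two consecutively received packets."""
    return (cur_seq - prev_seq - 1) % 65536
```

Grouping packets by `timestamp` and summing their payload lengths would give the per-frame data amount that the later sections rely on; reordered packets are not handled in this sketch.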

A TS packet has 188 bytes and consists of a packet header, a variable-length adaptation field and payload data. The payload unit start indicator (PUSI) in the packet header indicates whether the payload data contains a Packetized Elementary Stream (PES) packet header or Program Specific Information (PSI). For the H.264 media format, each PES packet header indicates the beginning of a NAL unit. Some flag bits in the TS adaptation field, such as the random access indicator and the elementary stream priority indicator, can be used to judge the importance of the transmitted content. For video, a random access indicator of 1 means that the next PES packet encountered contains sequence start information, and an elementary stream priority indicator of 1 means that the payload of this TS packet contains a relatively large amount of Intra block data.
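As an illustration of where these bits sit, here is a minimal sketch that reads PUSI, the PID, the scrambling bits, the continuity counter and the two adaptation field flags from one TS packet. It assumes the standard MPEG-2 TS header layout; the function name `parse_ts_header` and the returned dictionary keys are illustrative.

```python
def parse_ts_header(pkt: bytes) -> dict:
    """Read header and adaptation-field flags of one 188-byte TS packet."""
    if len(pkt) != 188 or pkt[0] != 0x47:
        raise ValueError("not a 188-byte TS packet starting with sync byte 0x47")
    afc = (pkt[3] >> 4) & 0x3          # adaptation_field_control
    rai = espi = 0
    # Bit 1 of afc means an adaptation field is present; its first byte is
    # its length, and the flags byte exists only when that length is non-zero.
    if afc & 0x2 and pkt[4] > 0:
        flags = pkt[5]
        rai = (flags >> 6) & 1         # random access indicator
        espi = (flags >> 5) & 1        # elementary stream priority indicator
    return {
        "pusi": (pkt[1] >> 6) & 1,
        "pid": ((pkt[1] & 0x1F) << 8) | pkt[2],
        "scrambling": (pkt[3] >> 6) & 0x3,   # non-zero: payload is scrambled
        "continuity_counter": pkt[3] & 0x0F,
        "rai": rai,
        "espi": espi,
    }
```

All of these fields stay readable when the payload is scrambled, which is why they can still be used for frame type detection on encrypted streams.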

If PUSI shows that the TS packet payload contains a PES packet header, further information useful for transmission can be extracted. A PES packet consists of a PES packet header and the packet data that follows it, and the original stream data (video, audio and so on) is carried in the PES packet data. PES packets are inserted into transport stream packets, and the first byte of each PES packet header is the first byte of a transport stream packet payload. That is, a PES packet header must start in a new TS packet, and the PES packet data must fill the payload area of the TS packets. If the end of the PES packet data cannot be aligned with the end of a TS packet, a corresponding number of stuffing bytes is inserted into the adaptation field of the TS packet so that the two ends align. The PES priority indicates the importance of the payload carried in the PES packet data; for video, a value of 1 indicates Intra data. In addition, the PTS indicates the presentation time and the DTS indicates the decoding time. They can be used to determine the dependency between preceding and following video payload content and thereby the payload type.

In the TS over RTP mode, the payload is often encrypted during transmission to protect copyrighted video content. Encrypting a TS packet means encrypting its payload portion. Once the scrambling flag in the TS header is set to 1, the payload is encrypted, and only the length of the data between adjacent PUSIs with the same PID (that is, the length of one video frame) can be used to determine the payload data type. If the PES header in the TS packet is not encrypted, the PTS can also be used to assist frame type determination, in addition to the video frame length.

As described above, the data amounts of different frame types differ. An I frame removes only intra-frame redundancy, so its data amount is generally larger than that of inter-coded frames, which also remove inter-frame redundancy, and a P frame is generally larger than a B frame. Based on this property, some existing frame type detection algorithms use the data amount of a frame to determine its type when the TS packets are encrypted. Two widely used methods are described below:

Method one: parse the TS packets to obtain the length of each video frame, and infer the frame type from the length information. This method was proposed for determining the frame type when the payload portion of the TS packets is encrypted.

The method determines packet loss by parsing the Continuity Counter field of the TS packets, estimates the state of the lost packets from the group of pictures (Group Of Pictures, GOP) structure information obtained from earlier determinations, and determines the video frame type in combination with the information available in the TS adaptation field (Random Access Indicator, RAI, or Elementary Stream Priority Indicator, ESPI).

An I frame can be identified by the following three methods:

1. Use RAI or ESPI to identify the I frame.

2. When RAI or ESPI cannot be used, buffer the data of one GOP and take the largest frame in the currently buffered data as the I frame. The GOP length must be defined in advance, and the method fails once the GOP length changes.

3. Use a value representing the maximum GOP length as the I frame determination period, and determine the frame with the largest data amount in the period to be the I frame. The determination period is the maximum of the detected I frame periods.

A P frame can be identified by the following three methods:

1. Among the frames from the start frame to the frame immediately before the I frame, determine each frame whose data amount is larger than that of the surrounding frames to be a P frame. For the determination frame patterns contained in the GOP structure of the target stream, select from the determination period the consecutive frames corresponding to the N determination frame patterns as determination target frames, compare the size relationship between the data amounts of the determination target frames with the determination frame patterns, and determine the P frame based on a match between them. In the GOP structure, the following pattern is used as a determination frame pattern: all consecutive B frames immediately preceding a P frame, plus one B frame following the P frame. In this case some GOP information must be entered in advance.

2. Compare the frame data amount of each frame in the pattern with a threshold calculated from the average frame data amount of several frames at predetermined positions in the pattern.

3. Use an adjustment coefficient to adjust the threshold that distinguishes P frames from B frames by frame data amount. To obtain the adjustment coefficient, provisional adjustment coefficients are selected in turn from a given range and the same processing as frame type determination is performed, so that the frame type of each frame in a predetermined learning period is estimated. The error rate of the estimates against the actual frame types obtained from an unencrypted stream is calculated, and the provisional coefficient with the lowest error rate is taken as the real adjustment coefficient.

For B frames, the rule is: every frame other than the I frames and P frames is determined to be a B frame.

When packets are lost, the above method can detect the loss from the RTP sequence number and the continuity counter (CC) in the TS header, and can estimate the state of the lost packets by pattern matching against the GOP structure, which provides a certain degree of correction. However, the variant with a fixed threshold requires GOP information to be entered in advance, and the variant with an adjustable threshold requires frame type information from an unencrypted stream to train the coefficient, so too much manual intervention is needed. In addition, a GOP must be buffered before frame types can be estimated, which is unsuitable for real-time applications. Furthermore, the I frame determination is performed only once per period, with the period being the adjustable quantity, and the largest frame in each period is directly taken as the I frame. This considers only local characteristics and ignores global characteristics.

Method two: distinguish the frame types with thresholds, in four steps:

1. Update the thresholds:

Threshold for distinguishing I frames (ithresh):

scaled_max_iframe = scaled_max_iframe * 0.995, where scaled_max_iframe is the size of the previous I frame.

If nbytes > scaled_max_iframe,

then ithresh = (scaled_max_iframe / 4 + av_nbytes * 2) / 2, where av_nbytes is the sliding average over the current 8 frames.

Threshold for distinguishing P frames (pthresh):

scaled_max_pframe = scaled_max_pframe * 0.995, where scaled_max_pframe is the size of the previous P frame.

If nbytes > scaled_max_pframe, then pthresh = av_nbytes * 0.75.

2. Detect I frames: a video has an I frame at regular intervals, an I frame is larger than the average, and an I frame is larger than a P frame. If the data amount of the current frame is larger than ithresh, the frame is considered an I frame.

3. Detect P frames: this step uses the fact that B frames are smaller than the average. If the data amount of the current frame is greater than pthresh and less than ithresh, the frame is considered a P frame.

4. All other frames are B frames.
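The four steps of this prior-art method can be sketched as follows. This is only an illustration of the formulas quoted above: the state dictionary, the function names `update_thresholds` and `classify`, and the choice to pass `av_nbytes` in from the caller are assumptions. The text does not spell out how `scaled_max_iframe` and `scaled_max_pframe` are refreshed after a detection, so the sketch leaves that to the caller.

```python
DECAY = 0.995  # the fixed attenuation factor quoted in the text

def update_thresholds(state: dict, nbytes: float, av_nbytes: float) -> dict:
    """Step 1: decay the stored maxima and refresh ithresh / pthresh."""
    s = dict(state)
    s["scaled_max_iframe"] *= DECAY
    if nbytes > s["scaled_max_iframe"]:
        s["ithresh"] = (s["scaled_max_iframe"] / 4 + av_nbytes * 2) / 2
    s["scaled_max_pframe"] *= DECAY
    if nbytes > s["scaled_max_pframe"]:
        s["pthresh"] = av_nbytes * 0.75
    return s

def classify(state: dict, nbytes: float) -> str:
    """Steps 2 to 4: compare the frame size against the two thresholds."""
    if nbytes > state["ithresh"]:
        return "I"
    if nbytes > state["pthresh"]:
        return "P"
    return "B"
```

Because the stored maxima shrink by only 0.5% per frame, a threshold set by one very large I frame takes many frames to come down, which is the weakness discussed next.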

The second method uses an attenuation factor to control the thresholds, and this factor directly affects I frame determination. When a subsequent I frame is larger than the current I frame, it is easily detected; but when a subsequent I frame is much smaller than the current I frame, the threshold must decay over many frames before an I frame can be detected again. Moreover, the factor is fixed at 0.995 in the algorithm and does not account for GOPs that change sharply, so the method is unsuitable in many cases. The smaller the attenuation factor, the lower the I frame miss rate, but the higher the probability that a P frame is misjudged as an I frame. The larger the attenuation factor, the higher the I frame miss rate (when I frame sizes vary sharply within the sequence), with I frames judged as P frames. The detection accuracy is therefore low. In addition, because only thresholds are used to separate B frames from P frames, for a frame structure such as I/P/P/P... the algorithm misjudges many P frames as B frames, so the misjudgment rate is high.

Summary of the invention

The technical problem to be solved by the embodiments of the present invention is to provide a method and a device for detecting frame types that improve the accuracy of frame type detection.

To solve the above technical problem, the method embodiments provided by the present invention can be implemented through the following technical solutions:

Detect the play time of each frame;

If the play time of the current frame is less than the maximum play time of the frames already received, determine that the current frame is a bidirectionally predicted coded frame (B frame).

A method for detecting frame types, comprising:

Obtain the coding type of the bitstream containing the received frames, where the coding type is open-loop coding or closed-loop coding;

If the data amount of the current frame is greater than a first threshold, determine that the current frame is an obvious intra-coded frame (I frame), where the first threshold is calculated from the average data amount of a set number of consecutive frames and the I frame data amount;

If the previous frame of the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame, or if the previous frame of the current frame is an I frame, the coding type is open-loop coding and the data amount of the current frame is greater than a fourth threshold, determine that the current frame is a unidirectionally predicted coded frame (P frame), where the fourth threshold is the mean of the average P frame data amount and the average B frame data amount of one group of pictures;

If the current frame is neither an I frame nor a P frame, determine that the current frame is a B frame.

A device for detecting frame types, comprising:

A time detecting unit, configured to detect the play time of each frame;

A frame type determining unit, configured to determine that the current frame is a bidirectionally predicted coded B frame if the play time of the current frame is less than the maximum play time of the frames already received.

A device for detecting frame types, comprising:

A type obtaining unit, configured to obtain the coding type of the bitstream containing the received frames, where the coding type is open-loop coding or closed-loop coding;

A frame type determining unit, configured to determine that the current frame is an obvious I frame if the data amount of the current frame is greater than a first threshold, where the first threshold is calculated from the average data amount of a set number of consecutive frames and the I frame data amount;

and to determine that the current frame is a P frame if the previous frame of the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame, or if the previous frame of the current frame is an I frame, the coding type is open-loop coding and the data amount of the current frame is greater than a fourth threshold, where the fourth threshold is the mean of the average P frame data amount and the average B frame data amount of one group of pictures;

The technical solution provided by the embodiments of the present invention combines the coding order of the different frame types with the relationship between the data amounts of preceding and following frames of different types, determines frame types without decoding the payload, eliminates the influence of the attenuation factor, and improves the accuracy of frame type detection.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Figure 1A is a schematic flowchart of a method according to an embodiment of the present invention;

Figure 1B is a schematic flowchart of a method according to an embodiment of the present invention;

Fig. 2a is a schematic diagram of a hierarchical B frame coding structure according to an embodiment of the present invention;

Fig. 2b is a schematic diagram of the relationship between coding order and play order, and of the coding levels, according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of a frame structure with packet loss according to an embodiment of the present invention;

Fig. 4 is a schematic flowchart of a method according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a device according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a device according to an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of a device according to an embodiment of the present invention;

Fig. 8 is a schematic structural diagram of a device according to an embodiment of the present invention;

Fig. 9 is a schematic structural diagram of a device according to an embodiment of the present invention;

Figure 10 is a schematic diagram of detection results according to an embodiment of the present invention;

Figure 11 is a schematic diagram of detection results according to an embodiment of the present invention;

Figure 12 is a schematic diagram of detection results according to an embodiment of the present invention;

Figure 13 is a schematic diagram of detection results according to an embodiment of the present invention;

Figure 14 is a schematic diagram of detection results according to an embodiment of the present invention;

Figure 15 is a schematic diagram of detection results according to an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

A method for detecting frame types, as shown in Figure 1A, comprises:

101A: Detect the play time of each frame;

102A: If the play time of the current frame is less than the maximum play time of the frames already received, determine that the current frame is a bidirectionally predicted coded B frame;

Further, the embodiment of the present invention may also determine the level to which a B frame belongs in hierarchical coding according to the play order and coding order of the frames. How the level is determined is described further below. Given the characteristics of B frames, knowing the level can be useful in many fields; for example, when compressing data frames, B frames at high levels can be discarded. The embodiments of the present invention do not limit how the B frame level is applied once it is determined.

The above embodiment combines the coding order of the different frame types with the relationship between the data amounts of preceding and following frames of different types, determines frame types without decoding the payload, eliminates the influence of the attenuation factor, and improves the accuracy of frame type detection.

The embodiment of the present invention also provides another method for detecting frame types, which, as shown in Figure 1B, comprises:

101B: Obtain the coding type of the bitstream containing the received frames, where the coding type is open-loop coding or closed-loop coding;

102B: If the data amount of the current frame is greater than a first threshold, determine that the current frame is an obvious I frame, where the first threshold is calculated from the average data amount of a set number of consecutive frames and the I frame data amount;

An obvious I frame is an I frame. If a frame is judged to be an obvious I frame, the probability of misjudgment is very low, but some I frames may be missed; the other I frame determination methods applied later may misjudge a frame as an I frame.

If the previous frame of the current frame is an I frame, the coding type is closed-loop coding and the current frame is not an obvious I frame (its frame type is not yet known at this point, but whether it is an obvious I frame can be determined), or if the previous frame of the current frame is an I frame, the coding type is open-loop coding and the data amount of the current frame is greater than a fourth threshold, determine that the current frame is a P frame, where the fourth threshold is the mean of the average P frame data amount and the average B frame data amount of one group of pictures;

It should be noted that the method corresponding to Figure 1B can be applied independently or in combination with the method of Figure 1A. When the two are combined, the method of Figure 1B can serve as the implementation used when the play time cannot be detected in Figure 1A.

Obtaining the coding type of the bitstream containing the received frames comprises:

Collect statistics on the type of the frame that follows each obvious I frame. If the proportion of P frames reaches a set ratio, determine that the coding type is closed-loop coding; otherwise it is open-loop coding.
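The statistic described above amounts to a simple ratio test. A minimal sketch, assuming the types of the frames that follow obvious I frames have already been collected into a list; the function name `infer_coding_type` is illustrative, and the value of `set_ratio` is left to the caller because the text does not give one.

```python
def infer_coding_type(types_after_obvious_i: list, set_ratio: float) -> str:
    """Return 'closed-loop' when P frames follow obvious I frames often enough.

    types_after_obvious_i holds one entry ('P' or 'B') per observed obvious
    I frame: the type of the frame immediately following it.
    """
    if not types_after_obvious_i:
        raise ValueError("no obvious I frames observed yet")
    p_ratio = types_after_obvious_i.count("P") / len(types_after_obvious_i)
    return "closed-loop" if p_ratio >= set_ratio else "open-loop"
```

For example, `infer_coding_type(["P", "P", "P", "B"], 0.75)` classifies the stream as closed-loop, since three of the four observed frames are P frames.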

The following embodiments describe the scheme of Figure 1B in combination with the scheme of Figure 1A. If the scheme of Figure 1B is used independently, it is not necessary to check whether the play time can be detected.

Further, if the play time cannot be detected in 101A, the method embodiment also comprises:

If the data amount of the current frame is greater than a second threshold, determine that the current frame is an I frame. The second threshold is the maximum of the following: the data amount of the I frame preceding the current frame, the average data amount of the P frames in the group of pictures containing the current frame, and the average data amount of a set number of consecutive frames.

If the data amount of the current frame is greater than a third threshold, and the interval between the current frame and the previous I frame exceeds a fixed interval, determine that the current frame is an I frame. The third threshold is calculated from the average data amount of the frames in the group of pictures containing the current frame, the data amount of the P frame preceding the current frame, the data amount of the I frame of the group of pictures containing the current frame, and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I frame interval. Alternatively, the third threshold is calculated from the average data amount of the frames in the group of pictures containing the current frame and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I frame interval.

If the previous frame of the current frame is a P frame and the data amount of the current frame is greater than a fifth threshold, or the current group of pictures contains B frames and the data amount of the current frame is greater than a sixth threshold, determine that the current frame is a P frame. The fifth threshold is the product of a first adjustment factor and the average data amount of the P frames in the group of pictures containing the current frame, where the first adjustment factor is greater than 0.5 and less than 1. The sixth threshold is the mean of the average P frame data amount and the average B frame data amount;

If the previous frame of the current frame is a B frame and the data amount of the current frame is less than a seventh threshold, or the current group of pictures contains P frames and the data amount of the current frame is less than an eighth threshold, determine that the current frame is a P frame. The seventh threshold is the product of a second adjustment factor and the average data amount of the B frames in the group of pictures containing the current frame, where the second adjustment factor is greater than 1 and less than 1.5. The eighth threshold is the mean of the average P frame data amount and the average B frame data amount.

After frame type determination ends, determine the fixed interval of I frames. If no I frame has been determined once the fixed interval is reached, determine the frame with the largest data amount within a set range around the fixed interval to be an I frame, and update the average data amount of each frame type in the group of pictures and the I frame interval parameter.

After frame type determination ends, count consecutive B frames. If the number of consecutive B frames is greater than a predicted value, determine the frame with the largest data amount among those consecutive B frames to be a P frame, and update the average data amount of each frame type in the group of pictures. The predicted value is greater than or equal to 3 and less than or equal to 7.
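The correction of over-long B frame runs described in the last paragraph can be sketched as follows. This is an offline illustration over complete lists of labels and sizes, whereas the embodiment applies the correction while frames are still arriving; the function name `correct_long_b_runs` is illustrative, and the update of the per-type averages is omitted.

```python
def correct_long_b_runs(types: list, sizes: list, predicted_value: int) -> list:
    """Relabel as 'P' the largest frame in each B run longer than predicted_value."""
    if not 3 <= predicted_value <= 7:
        raise ValueError("the text constrains the predicted value to 3..7")
    corrected = list(types)
    i, n = 0, len(types)
    while i < n:
        if types[i] != "B":
            i += 1
            continue
        j = i
        while j < n and types[j] == "B":   # find the end of this run of B frames
            j += 1
        if j - i > predicted_value:
            largest = max(range(i, j), key=lambda k: sizes[k])
            corrected[largest] = "P"
        i = j
    return corrected
```

With labels `["I", "B", "B", "B", "B", "P"]`, sizes `[9000, 300, 320, 1500, 310, 1400]` and a predicted value of 3, the run of four B frames is too long, so the 1500-byte frame is relabelled as a P frame.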

Determine whether packet loss has occurred in the received frames, and if so, determine the packet loss type;

If the packet loss type is intra-frame packet loss, then when calculating the frame data amount, take the sum of the data amount of the received part of the frame and the data amount of the lost packets as the data amount of the frame;

If the packet loss type is inter-frame packet loss, determine whether the marker bit of the packet before the loss position is 1. If so, count the data amount of the lost packets toward the following frame; otherwise, divide the data amount of the lost packets evenly between the preceding and following frames.

Further, determining the packet loss type comprises:

Predict the coding structure by collecting statistics on the detected frame types;

If the packet loss type is inter-frame packet loss and the marker bit of the packet before the loss position cannot be detected, split the current data length according to the predicted coding structure and the position of the packet loss.
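The two accounting rules for lost data can be sketched as follows. The sketch assumes the amount of lost data has already been estimated (for example from the sequence number gap) and passed in as `lost_bytes`; both function names are illustrative. The third case, where the marker bit cannot be detected and the split follows the predicted coding structure, is not covered because the text does not give its formula.

```python
def frame_size_with_intra_loss(received_bytes: float, lost_bytes: float) -> float:
    """Intra-frame loss: the frame size is the received data plus the lost data."""
    return received_bytes + lost_bytes

def split_inter_frame_loss(lost_bytes: float, marker_before_loss: int) -> tuple:
    """Inter-frame loss: return (bytes credited to previous frame, bytes credited to next frame)."""
    if marker_before_loss == 1:
        # The previous frame ended cleanly, so all lost data belongs to the next frame.
        return 0, lost_bytes
    half = lost_bytes / 2
    return half, half
```

For instance, 1000 lost bytes after a packet whose marker bit is 1 are all credited to the following frame, while the same loss after a packet with marker bit 0 is split 500/500.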

The embodiments of the present invention make full use of the header information of RTP or TS over RTP, combined with the coding order of the different frame types in the video and the relationship between the data amounts of preceding and following frames of different types, to determine frame types quickly and in real time without decoding the video payload, and improve the accuracy of frame type detection through packet loss handling, automatic parameter updating and later frame type correction.

A video stream contains header information that indicates the play time of the video data, such as the RTP timestamp in the ISMA mode and the PTS in the PES header in the TS over RTP mode. The embodiments use the correlation between play time information and coding order to determine the coding type of certain special structures, such as B frames. In the TS over RTP mode, however, the TS payload may be fully encrypted so that the PES header cannot be decoded and the PTS is unavailable. The embodiments therefore also provide a scheme that determines frame types from information such as data amount, without using the play time.

Observation of video bitstreams in practical applications shows that the different frame types within the same GOP generally differ clearly: I frames have the largest data amount, P frames come next, and B frames are the smallest. If the I frame at the start of each GOP can be identified correctly, its data amount can be used to determine the P frames and B frames inside that GOP. However, because video signals are non-stationary, the data amounts of I frames at different positions differ considerably, and an I frame can even be comparable in size to the P frames of the preceding GOP, which makes I frame determination difficult. The embodiments design a set of dynamic parameters that adjust intelligently, to improve the robustness and accuracy of frame type determination. In particular, when determining I frames, the decision criteria and related parameters are adjusted to suit the characteristics of I frames in different application scenarios, which greatly reduces the I frame misjudgment rate.

In lossy transmission scenarios, packet loss can occur in the input video stream. By its effect on the determination process, packet loss falls into two classes. First, packet loss inside a frame: the frame boundary information is not lost, so the frame boundaries can be obtained first and the number of packets in a frame counted from the corresponding sequence numbers. Second, packet loss at a frame boundary (for example, the packet whose marker bit is 1 in RTP, or the packet with PUSI set to 1 in TS over RTP): the boundary between the preceding and following frames may be impossible to determine, or the data of the two frames may be spliced into one frame, so the frame data amount is counted inaccurately and the frame type result is affected. For this case the embodiments perform packet loss detection, frame boundary estimation and partial frame type estimation.

In the early stage of frame type determination, statistics are insufficient, so more misjudgments occur. These not only affect the output results but also affect the accuracy of subsequent determinations by changing various parameters. The embodiments add a frame type correction step after the frame type determination flow. If, as data accumulates, an output result turns out to be obviously wrong, an internal correction is made. Although the internal correction cannot change frame types already output, it improves the accuracy of subsequent determinations by adjusting the parameters.

The three main points of the embodiment of the present invention are described in detail below:

One: using playing time to judge B frames and/or hierarchical B frames:

Because a B frame uses forward and backward coded frames for prediction, it is coded after its reference frames, so its playing order often differs from its coding order. B frames can therefore be judged from playing time information: if the playing time of the current frame is less than the maximum playing time of the frames already received, the frame is certainly a B frame; otherwise it is an I frame or a P frame.
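The playing-time rule can be illustrated with a minimal Python sketch. The function name flag_b_frames and the use of plain integers as playing times are assumptions made for illustration; timestamp wrap-around and packet loss are not handled.

```python
def flag_b_frames(playing_times):
    """Flag B frames from playing times listed in coding (arrival) order.

    A frame is flagged as B when its playing time is smaller than the
    largest playing time seen so far. Hypothetical helper, not from the text.
    """
    flags = []
    max_seen = None
    for t in playing_times:
        is_b = max_seen is not None and t < max_seen
        flags.append(is_b)
        if max_seen is None or t > max_seen:
            max_seen = t
    return flags


# Coding order I(0) P(3) B(1) B(2) P(6) B(4) B(5), written as playing numbers.
example_flags = flag_b_frames([0, 3, 1, 2, 6, 4, 5])
```

In this example the frames with playing numbers 1, 2, 4 and 5 are flagged as B frames, because each arrives after a frame with a larger playing time.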

For hierarchically coded B frames, playing time can further be used to judge the highest level and the level to which each B frame belongs. Taking 7 consecutive B frames as an example, Fig. 2a shows the coding structure of hierarchical B frames in this case; the subscripts of the letters in the first row indicate the level of each frame, and the numbers in the second row are the playing sequence numbers of the frames. The actual coding order is (playing sequence numbers in parentheses) I0/P0 (0), I0/P0 (8), B1 (4), B2 (2), B3 (1), B3 (3), B2 (6), B3 (5), B3 (7). Fig. 2b shows the relationship between coding order and playing order together with the coding levels; Arabic numerals indicate playing sequence numbers, and Chinese numerals indicate coding sequence numbers.

The algorithm for judging levels from playing time has two steps:

Step 1: judge the highest level (3 in this example). The level of frame 0 is set to 0, and the playing times are then read in coding order. If the playing time of the current frame is less than that of the previous frame, the level of the current frame is the level of the previous frame plus 1; otherwise it is the same as that of the previous frame. This continues until the frame whose playing time immediately follows that of frame 0 is read, namely frame 1; the level of frame 1 at that point is the highest level.

Step 2: judge the levels of the remaining frames from the symmetry of the playing times of adjacent B frames. After step 1, the levels of the frames in the solid boxes in Fig. 2b are all determined, and the levels of the B frames in the dashed boxes remain to be detected. The detection method traverses the frames whose levels are already determined and looks for two frames whose average playing time equals the playing time of the current frame; the level of the current frame is then the larger of the levels of these two frames plus 1. The ellipses in the figure show this symmetry: the average playing time of the two upper frames in an ellipse equals the playing time of the lower frame, and the level of the lower frame is the maximum of the levels of the two upper frames plus 1.
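The two steps can be sketched in Python using the 7-B-frame example. The function name and the use of integer playing sequence numbers with unit spacing are assumptions. The text does not say which pair to use when several pairs average to the current playing time, so this sketch picks the closest pair as a tie-break.

```python
def hierarchical_levels(play_numbers):
    """Estimate hierarchy levels from playing numbers given in coding order.

    Returns (highest_level, levels aligned with the input order).
    Hypothetical helper; assumes unit-spaced integer playing numbers.
    """
    first = play_numbers[0]
    levels = {first: 0}
    prev = first
    idx = 1
    # Step 1: descend until the frame played right after frame 0 is read.
    while idx < len(play_numbers):
        cur = play_numbers[idx]
        levels[cur] = levels[prev] + 1 if cur < prev else levels[prev]
        prev = cur
        idx += 1
        if cur == first + 1:
            break
    highest = levels[prev]
    # Step 2: use the symmetry of playing times for the remaining frames.
    for cur in play_numbers[idx:]:
        best = None
        for a in sorted(levels):
            b = 2 * cur - a  # partner such that (a + b) / 2 == cur
            if a < cur and b in levels:
                if best is None or (b - a) < (best[1] - best[0]):
                    best = (a, b)  # assumed tie-break: closest pair
        if best is not None:
            levels[cur] = max(levels[best[0]], levels[best[1]]) + 1
    return highest, [levels.get(p) for p in play_numbers]


highest, per_frame = hierarchical_levels([0, 8, 4, 2, 1, 3, 6, 5, 7])
```

For the coding order 0, 8, 4, 2, 1, 3, 6, 5, 7 the sketch yields a highest level of 3 and per-frame levels 0, 0, 1, 2, 3, 3, 2, 3, 3, matching the subscripts in the example.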

Two: using frame data volume to judge frame type:

Playing time can only distinguish whether a frame is a B frame, so the present embodiment provides a scheme that judges I frames and P frames using only information such as data volume. Where B frames can be judged from playing time, only the remaining frames need to be classified as I frames or P frames. Where B frames cannot be judged from playing time (for example, when the header information is encrypted), all frames must be judged: I frames and P frames are determined first, and the remaining frames are judged to be B frames.

The present embodiment judges frame type from frame data volume using automatically updated parameters, and is mainly divided into the following modules (as shown in Fig. 6): an I frame judgment module, a P frame judgment module, a parameter update module, and a type correction module.

A: I frame judgment:

In general, the I frames in a video fall into two classes. I frames at fixed intervals are inserted during compression at a fixed interval to support random access (the interval is fixed within a given period and may change once the user switches channels). Adaptively inserted I frames are inserted at scene changes to improve compression efficiency.

For I frames at fixed intervals, the fixed interval can be estimated during identification. When this interval has been exceeded and no I frame has yet been determined, the judgment condition is actively relaxed, or local features are used for the judgment (described in detail below).

For adaptively inserted I frames, consider first a scene change between sequences of similar spatial complexity. If the frame is coded as an adaptively inserted I frame, its bit rate is usually higher than that of the preceding P frames because I frames compress less efficiently. If it is coded as a P frame, its bit rate is also relatively high because prediction degrades; such a frame is an important frame and is readily judged to be an I frame (the data volumes of such P frames and of I frames are both large, so the P frame is easily mistaken for an I frame). At a change to a scene of low spatial complexity, a frame coded as an I frame may be even smaller than the preceding P frames. I frames of this kind cannot be identified correctly, but the P frames or B frames that follow also become correspondingly smaller, so type correction can be carried out through subsequent updates to improve the recognition rate for subsequent frame types.

Therefore, I frames are judged in the following three steps, each of which compares the data volume of the current frame with a given threshold; the frame is judged to be an I frame as soon as its data volume exceeds the threshold in any step:

Judge obvious I frames according to threshold 1;

Judge I frames at non-fixed intervals according to threshold 2;

Judge I frames that exceed the expected fixed interval according to threshold 3.
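The three-step check can be sketched as a cascade in Python. The function and parameter names are hypothetical, and the thresholds are passed in as already computed numbers. Applying threshold 3 only once the distance from the previous I frame exceeds the expected fixed interval follows the device description of the third threshold.

```python
def is_i_frame(size, thresh1, thresh2, thresh3,
               curr_i_interval, expected_iframe_interval):
    """Return True if the frame is judged to be an I frame (sketch)."""
    # Step 1: obvious I frame.
    if size > thresh1:
        return True
    # Step 2: I frame at a non-fixed interval, e.g. at a scene change.
    if size > thresh2:
        return True
    # Step 3: the expected fixed interval has been exceeded, so the
    # relaxed threshold 3 is used.
    if curr_i_interval > expected_iframe_interval and size > thresh3:
        return True
    return False
```

For instance, with thresholds 8000, 6000 and 4500, a 5000-byte frame is not an I frame while the interval is still within the expected value, but is judged an I frame once the expected interval has been exceeded.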

B: P frame judgment:

Where the previous frame is an I frame and the current video stream is closed-loop coded, a B frame cannot immediately follow an I frame, so if the current frame is not judged to be an I frame, it is a P frame;

Where the previous frame is an I frame and the current video stream is open-loop coded, the current frame is a P frame if its data volume is greater than threshold 4; otherwise it is a B frame;

Where the previous frame is a P frame, the current frame is a P frame if its data volume is greater than threshold 5, or greater than threshold 6 when B frames exist in the current GOP;

Where the previous frame is a B frame, which indicates that B frames exist in the current GOP, the current frame is a P frame if its data volume is less than threshold 7, or less than threshold 8 when a P frame has already been judged in the current GOP.
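The four rules can be written as one decision function. This is a sketch that transcribes the rules exactly as stated, including the comparison direction in the previous-B case; the function name, the dictionary t of thresholds keyed 4 to 8, and the fallback to "B" when no P rule fires (the text says remaining frames are judged B frames) are assumptions.

```python
def classify_non_i_frame(size, prev_type, closed_loop,
                         gop_has_b, gop_has_p, t):
    """Classify a frame already known not to be an I frame as 'P' or 'B'.

    t maps threshold numbers 4..8 to values. Hypothetical helper.
    """
    if prev_type == "I":
        if closed_loop:
            return "P"  # a B frame cannot directly follow an I frame
        return "P" if size > t[4] else "B"
    if prev_type == "P":
        if size > t[5] or (gop_has_b and size > t[6]):
            return "P"
        return "B"  # assumed fallback: remaining frames are B
    # prev_type == "B": comparison direction follows the text as written.
    if size < t[7] or (gop_has_p and size < t[8]):
        return "P"
    return "B"  # assumed fallback


thresholds = {4: 2500, 5: 3000, 6: 2500, 7: 1250, 8: 2500}
```

A caller would first run the I frame check and only pass frames that failed it to this function.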

C: Parameter update:

Counting the coding type of the GOP (open loop or closed loop): during identification, for each obvious I frame, whether the following frame is a B frame or a P frame is counted. If most I frames are followed by a P frame, the encoder is considered to use closed-loop coding; otherwise it is considered to use open-loop coding.

Calculating the expected fixed I frame interval: after an I frame is judged, the probability distribution of the I frame intervals is counted, and the expected fixed interval is obtained by weighted averaging.
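Both statistics can be sketched in a few lines of Python. The text does not define "most" or the weights of the weighted average, so this sketch assumes a simple majority for the coding type and uses the observed interval probabilities as weights; the function names are hypothetical.

```python
from collections import Counter


def is_closed_loop(types_after_obvious_i):
    """Majority vote over the frame types ('P' or 'B') seen right after
    obvious I frames. Returns None when nothing has been observed yet."""
    if not types_after_obvious_i:
        return None
    p_count = sum(1 for t in types_after_obvious_i if t == "P")
    return p_count > len(types_after_obvious_i) / 2  # assumed: simple majority


def expected_iframe_interval(intervals):
    """Probability-weighted average of the observed I frame intervals
    (assumed weighting: each interval weighted by its relative frequency)."""
    counts = Counter(intervals)
    total = sum(counts.values())
    return sum(interval * n / total for interval, n in counts.items())
```

With observed intervals 24, 24, 24 and 12, the expected interval under this weighting is 21.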

The thresholds used in the above modules are updated in real time according to the newly judged frame types:

A) Threshold 1: calculated from the average data volume of the preceding 50 frames (av_IBPnbytes) and the data volume of the previous I frame (iframe_size_GOP) according to formula (1):

Threshold 1 = delta1*iframe_size_GOP + av_IBPnbytes formula (1)

where delta1 is a regulatory factor with a value range of (0, 1); the empirical value obtained from experiments is 0.5.

B) Threshold 2: calculated from the data volume of the previous I frame (iframe_size_GOP), the maximum P frame data volume in the current GOP (max_pframes_size_GOP), and the average data volume of the I frames and P frames in the preceding 50 frames (av_IPnbytes) according to formula (2):

Threshold 2 = max(delta2*max_pframes_size_GOP, delta2*av_IPnbytes, delta3*iframe_size_GOP) formula (2)

where delta2 and delta3 are regulatory factors whose empirical values are 1.5 and 0.5 respectively.

C) Threshold 3: calculated according to formula (3) from the average data volume of the frames of the current GOP (av_frame_size_GOP), the data volume of the previous P frame (prew_pframe_nbytes), and the data volume of the I frame of the current GOP (iframe_size_GOP); or calculated according to formula (5) from the average P frame data volume of the current GOP (av_pframes_size_GOP):

Threshold 3 = max(av_frame_size_GOP, ip_thresh*prew_pframe_nbytes, iframe_size_GOP/3) formula (3)

where ip_thresh is calculated from how far the distance from the previous I frame to the current frame (curr_i_interval) departs from the expected fixed I frame interval (expected_iframe_interval):

ip_thresh = max(2-(curr_i_interval-expected_iframe_interval)*0.1, 1.5) formula (4)

Threshold 3 = sThresh*av_pframes_size_GOP + av_pframes_size_GOP formula (5)

where sThresh is calculated from curr_i_interval and expected_iframe_interval:

sThresh = max(delta4, sThresh/(delta5*curr_i_interval/expected_iframe_interval)) formula (6)

where delta4 and delta5 are regulatory factors whose empirical values are 0.2 and 2.0 respectively.

D) Threshold 4: the mean of the average P frame data volume (av_pframes_size_Last_GOP) and the average B frame data volume (av_bframes_size_Last_GOP) of the previous GOP, as in formula (7):

Threshold 4 = (av_pframes_size_Last_GOP + av_bframes_size_Last_GOP)/2 formula (7)

E) Threshold 5: 0.75 times the average P frame data volume in the current GOP (av_pframes_size_GOP), as in formula (8):

Threshold 5 = delta6*av_pframes_size_GOP formula (8)

where delta6 is a regulatory factor whose empirical value is 0.75.

F) Threshold 6: the mean of the average P frame data volume (av_pframes_size_GOP) and the average B frame data volume (max_bframes_size_GOP), as in formula (9):

Threshold 6 = (av_pframes_size_GOP + max_bframes_size_GOP)/2 formula (9)

G) Threshold 7: 1.25 times the average B frame data volume in the current GOP (av_bframes_size_GOP), as in formula (10):

Threshold 7 = delta7*av_bframes_size_GOP formula (10)

where delta7 is a regulatory factor whose empirical value is 1.25.

H) Threshold 8: the mean of the average P frame data volume (av_pframes_size_GOP) and the average B frame data volume (av_bframes_size_GOP), as in formula (11):

Threshold 8 = (av_pframes_size_GOP + av_bframes_size_GOP)/2 formula (11)
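Formulas (1) to (11) translate directly into small functions. The sketch below uses lower-case versions of the variable names from the text and the stated empirical values as defaults. Formula (6) updates sThresh from its own previous value; the text gives no initial value, so it is passed in by the caller. Formula (6) also assumes curr_i_interval is non-zero.

```python
def threshold1(iframe_size_gop, av_ibp_nbytes, delta1=0.5):
    return delta1 * iframe_size_gop + av_ibp_nbytes            # formula (1)

def threshold2(max_pframes_size_gop, av_ip_nbytes, iframe_size_gop,
               delta2=1.5, delta3=0.5):
    return max(delta2 * max_pframes_size_gop,                   # formula (2)
               delta2 * av_ip_nbytes,
               delta3 * iframe_size_gop)

def ip_thresh(curr_i_interval, expected_iframe_interval):
    return max(2 - (curr_i_interval - expected_iframe_interval) * 0.1,
               1.5)                                             # formula (4)

def threshold3(av_frame_size_gop, prew_pframe_nbytes, iframe_size_gop,
               curr_i_interval, expected_iframe_interval):
    factor = ip_thresh(curr_i_interval, expected_iframe_interval)
    return max(av_frame_size_gop, factor * prew_pframe_nbytes,
               iframe_size_gop / 3)                             # formula (3)

def update_s_thresh(s_thresh, curr_i_interval, expected_iframe_interval,
                    delta4=0.2, delta5=2.0):
    ratio = delta5 * curr_i_interval / expected_iframe_interval
    return max(delta4, s_thresh / ratio)                        # formula (6)

def threshold3_alt(s_thresh, av_pframes_size_gop):
    return s_thresh * av_pframes_size_gop + av_pframes_size_gop # formula (5)

def threshold4(av_pframes_size_last_gop, av_bframes_size_last_gop):
    return (av_pframes_size_last_gop + av_bframes_size_last_gop) / 2  # (7)

def threshold5(av_pframes_size_gop, delta6=0.75):
    return delta6 * av_pframes_size_gop                         # formula (8)

def threshold6(av_pframes_size_gop, max_bframes_size_gop):
    return (av_pframes_size_gop + max_bframes_size_gop) / 2     # formula (9)

def threshold7(av_bframes_size_gop, delta7=1.25):
    return delta7 * av_bframes_size_gop                         # formula (10)

def threshold8(av_pframes_size_gop, av_bframes_size_gop):
    return (av_pframes_size_gop + av_bframes_size_gop) / 2      # formula (11)
```

As a worked case, a previous I frame of 10000 bytes and a 50-frame average of 3000 bytes give threshold 1 = 0.5*10000 + 3000 = 8000 bytes.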

D: Type correction:

Correcting missed I frames:

After the above steps, it may happen that no I frame has been judged even though the expected fixed interval has been far exceeded. In this case, although the frame types have already been output, local information can be used to correct the parameters so that subsequent frame type judgments are more accurate. The frame with the largest data volume near the expected fixed interval is taken, its frame type is changed to I frame, and parameters such as the average data volume of each frame type in the GOP and the I frame interval are updated.

Correcting misjudged B frames:

When video encoders in practical applications use B frames to improve coding efficiency, they generally take decoding delay and decoder memory overhead into account and do not encode more than 7 consecutive B frames; in more extreme cases, no more than 3. A predicted value for the maximum number of consecutive B frames in the stream is obtained from the statistics of previously judged frame types. When a frame is determined to be a B frame, the number of consecutive B frames must not exceed the predicted value. If it does, some of the frames currently judged to be consecutive B frames may have been misjudged; the frame with the largest data volume among them is changed to a P frame, and information such as the average data volume of each frame type in the GOP is updated.
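The two corrections can be sketched as operations on parallel lists of internal frame types and frame sizes. The function names and the window parameter (how many frames around the expected position count as "near") are assumptions, since the text does not specify the search range; the statistics update that follows each correction is omitted.

```python
def correct_missed_i(types, sizes, last_i_index, expected_interval, window=2):
    """Relabel as 'I' the largest frame near the expected I frame position.

    window is a hypothetical search radius around the expected position.
    """
    center = last_i_index + expected_interval
    lo = max(last_i_index + 1, center - window)
    hi = min(len(types) - 1, center + window)
    fixed = list(types)
    if lo <= hi:
        best = max(range(lo, hi + 1), key=lambda k: sizes[k])
        fixed[best] = "I"
    return fixed


def correct_b_runs(types, sizes, max_consecutive_b):
    """Whenever a run of 'B' frames exceeds the predicted maximum, relabel
    the largest frame of that run as 'P'."""
    fixed = list(types)
    run_start = None
    for i, t in enumerate(fixed):
        if t != "B":
            run_start = None
            continue
        if run_start is None:
            run_start = i
        if i - run_start + 1 > max_consecutive_b:
            largest = max(range(run_start, i + 1), key=lambda k: sizes[k])
            fixed[largest] = "P"
            # The remaining B frames after the relabeled one start a new run.
            run_start = largest + 1 if largest < i else None
    return fixed
```

Both functions return corrected copies intended for internal statistics only; as the text notes, frame types that have already been output are not changed.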

Three: frame type detection when the frame boundary and frame data volume cannot be determined:

The first two examples both require that the frame boundaries and frame data volumes have been obtained. Without packet loss, the frame boundaries and the data volume of each frame can be obtained accurately from the RTP sequence number, timestamp, and marker bit (ISMA mode), or from the RTP sequence number and the CC, PUSI, and PID fields in TS (TS over RTP mode). When packet loss occurs, however, if a packet at a frame boundary is lost, the position of the frame boundary cannot be judged accurately, the number of packets in a frame may be misjudged, and the data of two frames may even be spliced into one frame, which greatly interferes with frame type detection. Therefore, if packet loss exists, packet loss processing must be performed before frame type judgment to obtain information such as frame boundaries, frame data volume, and frame type.

In ISMA mode, a change in the RTP timestamp marks the arrival of a new frame, so the processing when packet loss occurs is relatively simple:

1) If the timestamps before and after the loss are the same, the lost packets lie inside one frame, and the lost data only needs to be included when counting the data volume of that frame;

2) If the timestamps before and after the loss differ, the loss occurred at a frame boundary. If the marker bit of the packet immediately before the loss is 1, the lost data is regarded as belonging to the following frame and is added to the data volume of the following frame; otherwise, the lost data volume is divided equally between the preceding and following frames (assuming here that a burst loss does not exceed the length of one frame).
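The two ISMA rules can be sketched as a function that decides where the lost data is counted. The function name and the lost_bytes argument are assumptions; how the lost byte count is estimated (for example from the sequence number gap and an average packet size) is outside the text and not shown.

```python
def assign_lost_bytes(ts_before, ts_after, marker_before, lost_bytes):
    """Return (bytes added to the frame before the gap,
               bytes added to the frame after the gap).

    ts_before / ts_after: RTP timestamps of the packets around the gap.
    marker_before: marker bit of the last packet received before the gap.
    """
    if ts_before == ts_after:
        # Rule 1: loss inside one frame; both sides are the same frame.
        return (lost_bytes, 0)
    if marker_before:
        # Rule 2, marker set: the previous frame was complete, so the
        # lost data belongs to the following frame.
        return (0, lost_bytes)
    # Rule 2, marker not set: split equally between the two frames.
    return (lost_bytes / 2, lost_bytes / 2)
```

In the same-timestamp case the first element of the result is simply the amount to add to that single frame.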

The TS over RTP case is relatively complex. The start of a frame can only be judged by whether a packet carries a PES header (that is, whether PUSI is 1), so if packet loss occurs it is difficult to judge whether the data between two packets that carry PES headers belongs to one frame or to several frames. As shown in Fig. 3, 3 packet losses occurred in the data between two packets with PES headers, but because it is unknown whether any lost packet also carried a PES header (that is, the start of a frame), it cannot be judged whether the data belongs to the same frame. Solutions are given here from two aspects.

If the PES header can be parsed, the PTS in it can be used to judge whether the current data segment (that is, the data between two packets with PES headers) contains lost frame header information:

1) Count the PTS order of correctly detected GOPs and, using the distribution probability and the distance from the current frame as weights, obtain the expected coding structure;

2) Match the PTS values of the series of frames in reception order, from the I frame up to the current PTS and the next PTS, against the expected coding structure:

a) If they conform to the expected coding structure, the lost packets in this data segment are considered not to contain frame header information; the current data segment is one frame, the packet loss occurred inside this frame, and no splitting is needed;

b) If they do not conform to the expected coding structure, the lost packets probably contain frame header information. The current data segment is split according to the expected coding structure and the positions at which the packet losses occurred (continuous length, packet loss length, and so on), and reasonable frame types, frame sizes, and PTS values are assigned.

3) If a frame previously judged to have lost its header is found later, the earlier judgment result is updated in the correction step.

In addition, whether the current data segment is one frame, and which frame type it belongs to, can be judged from the packet loss length, continuous length, maximum continuous length, maximum packet loss length, and so on:

1) If the length of this data segment is close to that of the previous I frame, it is considered to belong to a single I frame; if the length of this data segment is close to that of a P frame and the maximum continuous length is greater than the average B frame data volume within 50 frames, the whole segment is considered to belong to a single P frame; otherwise go to 2);

2) If the length of this data segment is close to that of two P frames, it is split into two P frames: the segment is divided into two sections so that the length of each is as close as possible to a P frame, ensuring that the second section starts with a lost packet; otherwise go to 3);

3) If the length of this data segment is close to that of a P frame plus a B frame, it is split into a P frame and a B frame: the packets with the largest continuous length are assigned to the P frame, and on this basis the segment is divided into two sections whose lengths are close to a P frame and a B frame respectively, ensuring that the second section starts with a lost packet; otherwise go to 4);

4) If the maximum continuous length is less than a B frame and the length of this data segment is close to that of three B frames, it is split into three B frames: the segment is divided into three sections, each close in length to a B frame, ensuring that the second and third sections each start with a lost packet; otherwise go to 5);

5) If the maximum continuous length is less than a B frame and the length of this data segment is close to that of two B frames, it is split into two B frames: the packets of the segment are divided into two sections, each close in length to a B frame, ensuring that the second section starts with a lost packet; otherwise go to 6);

6) In all other cases, the whole data segment is considered to belong to one frame.
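Rules 1) to 6) form a decision cascade that can be sketched as follows. The text does not quantify "close", so the relative tolerance tol is a hypothetical parameter, as are the function names; the sketch only decides the frame layout and does not choose the split positions (which, per the text, must place each later section at a lost packet).

```python
def close(a, b, tol=0.1):
    """True when a is within a relative tolerance of b (assumed criterion)."""
    return abs(a - b) <= tol * b


def guess_segment_layout(seg_len, max_run_len, i_size, p_size, b_size, tol=0.1):
    """Guess which frames a data segment between two PES headers contains.

    seg_len: segment length including the estimated lost data.
    max_run_len: maximum continuous (loss-free) length in the segment.
    i_size / p_size / b_size: reference I, P and B frame sizes.
    """
    if close(seg_len, i_size, tol):
        return ["I"]                       # rule 1, I frame case
    if close(seg_len, p_size, tol) and max_run_len > b_size:
        return ["P"]                       # rule 1, P frame case
    if close(seg_len, 2 * p_size, tol):
        return ["P", "P"]                  # rule 2
    if close(seg_len, p_size + b_size, tol):
        return ["P", "B"]                  # rule 3
    if max_run_len < b_size:
        if close(seg_len, 3 * b_size, tol):
            return ["B", "B", "B"]         # rule 4
        if close(seg_len, 2 * b_size, tol):
            return ["B", "B"]              # rule 5
    return ["unknown"]                     # rule 6: one frame, type left open
```

With reference sizes of 10000, 4000 and 1000 bytes for I, P and B frames, a 5000-byte segment is read as a P frame plus a B frame.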

The present embodiment combines the above examples into an optional frame type detection scheme whose flow is shown in Fig. 4. It is divided into the following stages: preliminary frame type judgment using PTS, packet loss processing, further frame type judgment using data volume, and type correction.

401: After data is input, judge whether the packet header can be parsed; if so, perform frame type judgment according to playing time, otherwise perform packet loss processing. After frame type judgment ends, judge whether any earlier frame judgment is wrong; if so, perform frame type correction, otherwise return to the frame type judgment loop, that is, to 401. The steps are performed as follows:

Frame type judgment according to playing time: first determine whether the input stream consists of TS over RTP packets; if so, it must be judged whether the PES headers of the TS packets are encrypted. For RTP packets, or TS over RTP packets whose PES headers can be parsed, whether a frame is a B frame can be preliminarily determined from the playing time information; for the specific implementation, refer to main point one;

Packet loss processing: detect whether packet loss exists. If there is no packet loss, count the data volume directly and proceed to the frame type judgment step below. If there is packet loss, packet loss processing must be performed separately for RTP or TS over RTP packets to estimate frame boundaries, frame data volume, or partial frame types; for the specific implementation, refer to main point three;

Frame type judgment according to data volume: this process judges frame types in real time and adjusts the relevant parameters dynamically; for the specific implementation, refer to main point two;

Type correction: if an earlier judgment result is found to be wrong during the judgment process, it can be corrected. This process does not affect the output results, but it can be used to update the relevant parameters to improve the accuracy of subsequent judgments; for the specific implementation, refer to main point two.

The embodiment of the present invention further provides a frame type detection device which, as shown in Fig. 5, comprises:

a time detecting unit 501, configured to detect the playing time of each frame;

a frame type determining unit 502, configured to determine that the current frame is a bidirectional predictive coded (B) frame if the playing time of the current frame is less than the maximum playing time of the frames already received.

Further, the device of Fig. 5 may also comprise:

a level determining unit 503, configured to determine, from the playing order and coding order of the frames, the level to which a B frame belongs in hierarchical coding. It should be noted that level determination is not an essential technical feature for determining B frames in the embodiment of the present invention; it is needed only when subsequent processing requires level information.

The embodiment of the present invention further provides another frame type detection device which, as shown in Fig. 6, comprises:

a type obtaining unit 601, configured to obtain the coding type of the stream to which the received frames belong, the coding type being open-loop coding or closed-loop coding;

a frame type determining unit 602, configured to determine that the current frame is an obvious I frame if the data volume of the current frame is greater than a first threshold, the first threshold being calculated from the average data volume of a set number of consecutive frames and the I frame data volume;

and to determine that the current frame is a P frame if the previous frame of the current frame is an I frame, the coding type is closed-loop coding, and the current frame is not an obvious I frame, or if the previous frame of the current frame is an I frame, the coding type is open-loop coding, and the data volume of the current frame is greater than a fourth threshold; the fourth threshold being the mean of the average P frame data volume and the average B frame data volume of a group of pictures.

Further, the frame type determining unit 602 is also configured to determine that the current frame is an I frame if the data volume of the current frame is greater than a second threshold; the second threshold being the maximum among the data volume of the I frame preceding the current frame, the average data volume of the P frames in the group of pictures containing the current frame, and the average data volume of a set number of consecutive frames.

Further, the frame type determining unit 602 is also configured to determine that the current frame is an I frame if the interval between the current frame and the previous I frame exceeds the fixed interval and the data volume of the current frame is greater than a third threshold; the third threshold being calculated from the average data volume of the frames of the group of pictures containing the current frame, the data volume of the P frame preceding the current frame, the data volume of the I frame of the group of pictures containing the current frame, and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I frame interval; or the third threshold being calculated from the average data volume of the frames of the group of pictures containing the current frame and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I frame interval.

Further, the frame type determining unit 602 is also configured to determine that the current frame is a P frame if the previous frame of the current frame is a P frame and the data volume of the current frame is greater than a fifth threshold, or if B frames exist in the current group of pictures and the data volume of the current frame is greater than a sixth threshold; the fifth threshold being the product of a first regulatory factor and the average P frame data volume of the group of pictures containing the current frame, the first regulatory factor being greater than 0.5 and less than 1; the sixth threshold being the mean of the average P frame data volume and the average B frame data volume;

and to determine that the current frame is a P frame if the previous frame of the current frame is a B frame and the data volume of the current frame is less than a seventh threshold, or if a P frame exists in the current group of pictures and the data volume of the current frame is less than an eighth threshold; the seventh threshold being the product of a second regulatory factor and the average B frame data volume of the group of pictures containing the current frame, the second regulatory factor being greater than 1 and less than 1.5; the eighth threshold being the mean of the average P frame data volume and the average B frame data volume.

Further, as shown in Fig. 7, the device also comprises:

an interval acquiring unit 701, configured to determine the fixed interval of I frames after frame type judgment ends;

the frame type determining unit 602 being also configured, if no I frame has been judged after the fixed interval is reached, to determine as an I frame the frame with the largest data volume within a set range around the fixed interval;

a first updating unit 702, configured to update the average data volume of each frame type in the group of pictures and the I frame interval parameter.

Further, as shown in Fig. 8, the device also comprises:

a statistic unit 801, configured to count consecutive B frames after frame type judgment ends;

the frame type determining unit 602 being also configured, if the number of consecutive B frames is greater than a predicted value, to determine as a P frame the frame with the largest data volume among the consecutive B frames; the predicted value being greater than or equal to 3 and less than or equal to 7;

a second updating unit 802, configured to update the average data volume of each frame type in the group of pictures.

Further, as shown in Fig. 9, the device also comprises:

a packet loss type determining unit 901, configured to determine whether packet loss has occurred in a received frame and, if so, to determine the packet loss type;

a data volume determining unit 902, configured, if the packet loss type is loss inside a frame, to take the sum of the received data volume of the frame and the lost data volume as the data volume of the frame when calculating the frame data volume.

It should be noted that the device of the present embodiment and the device of Fig. 4 or Fig. 5 can be used in combination, and the frame type determining unit 502 and the frame type determining unit 602 can be realized by the same functional unit.

Several applications of frame type judgment are given below. It should be understood that these application examples are not exhaustive and do not limit the embodiment of the present invention.

1. Unequal protection according to the judged frame type: when bandwidth is limited, different frame types can be given unequal protection according to their different effects on video quality, so that the received video quality is optimized.

2. Fast video browsing, realized by combining the average bit rate of the GOP with the expected period: for a stream stored locally, when the user does not want to browse the whole video, fast preprocessing can extract the positions of the I frames to realize fast browsing. For a stream stored on a server, when the user does not want to browse the whole video, the server can extract the positions of the I frames by fast preprocessing and selectively transmit key frame information to the user.

3. Quality of service (QoS): when bandwidth is insufficient, an intermediate node can intelligently discard some B frames or P frames (P frames near the end of a GOP) according to the judged frame type, so that the bit rate is reduced with as little effect on video quality as possible.

In addition, the effect of the technical scheme of the embodiment of the present invention was tested experimentally; the test results follow.

The experiments in this section were run without packet loss. For two cases, with and without the use of playing time, the scheme was compared with scheme two of the background art; the results are as shown in Table 1.

Table 1 Test sequences

Test sequences: TS streams captured from a live network and streams coded at constant bit rate were tested, as listed in Table 1. The first three captured streams (iptv137, iptv138, iptv139) have an encrypted payload but unencrypted PES headers; the constant bit rate streams have bit rates of (1500, 3000, 4000, 5000, 6000, 7000, 9000, 12000, 15000). All selected streams are H.264 coded, with frame types I, P, and B and no hierarchical B frames. The frame type detection results for these sequences are given below, as shown in Table 2.

Table 2 Comparison of detection results of the present method and the existing method

As shown in Table 2, this experiment compares the following measures. The I frame miss rate is the ratio of undetected I frames to the total number of I frames in the sequence. The I frame false detection rate is the ratio of the number of P or B frames misjudged as I frames to the total number of I frames (in most cases only P frames are misjudged as I frames, and B frames are misjudged as I frames only rarely, which is consistent with the fact that the bit rate of B frames is far lower than that of I frames). The P->I error rate is the ratio of the number of P frames misjudged as I frames to the actual total number of P frames. The P->B error rate is the ratio of the number of P frames misjudged as B frames to the actual total number of P frames. The B->P error rate is the ratio of the number of B frames misjudged as P frames to the actual total number of B frames. The total error rate is the ratio of the number of misjudgments to the total number of frames (a frame is misjudged whenever the judged type does not match the actual type). The mean of the I frame miss rate and the I frame false detection rate reflects the probability of correctly detecting I frames.

Because the accuracy of judging B frames using PTS is 100%, the B frame results with and without playing time are not compared separately. Meanwhile, in order to fully demonstrate the advantage of main point two of the embodiment of the present invention, the step of judging B frames using playing time was also added to the existing method when playing time was used, so the performance difference comes mainly from the methods of judging by frame data volume. The results show that, both when playing time can be used and when it cannot, the present method performs better than the existing method on the streams captured from the live network and on the self-coded streams. The detection effect is especially pronounced for the self-coded streams, where the present method is even error-free in some cases, whereas the existing method is seldom error-free.

Figures 10 to 15 give detailed detection results for some sequences. The actual curves are marked with circles and the predicted curves with triangles. Each figure includes the I frame distribution (the horizontal axis is the I frame interval; an interval of 1 means two adjacent frames are both I frames, and an interval of 0 means the I frame interval is greater than 49; the predicted I frame period is the I frame period predicted by the present method, and the actual I frame period is the real I frame period) and the frame type distribution (in the table in each figure, the diagonal of the matrix gives the numbers of correctly judged frames and the other positions are misjudgments). Each figure title is the sequence name + total number of frames + total error rate. It can be seen that live network sequences generally have one fixed I frame interval (the maximum in the figure); as scenes change, some I frames are inserted adaptively, which causes a disturbance around the maximum and produces the I frame distribution in the figure. For the FIFA sequence (Fig. 14), two maxima can be seen in the actual period, and the algorithm here distinguishes the two maxima fairly accurately. The expected I frame interval estimated by the algorithm is very close to the actual I frame interval, and can therefore be used to guide frame skipping during fast browsing.

Figure 10: iptv137, 15861 frames (error 0.6%); the result is shown in Table 3:

Table 3

iptv137	Detected as P	Detected as B	Detected as I
Actual type P	4909	0	61
Actual type B	1	10215	0
Actual type I	36	0	639
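As a check on how the figure titles relate to the tables, the total number of frames and the total false detection rate can be recomputed from the matrix: the diagonal holds the correctly judged frames and every other cell is a misjudgment. The sketch below uses the Table 3 values; the function name `summarize` is an illustrative assumption.

```python
def summarize(confusion):
    """Return (total frames, false detection rate) for a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return total, (total - correct) / total


# Rows: actual type P, B, I. Columns: detected as P, B, I (Table 3 values).
iptv137 = [
    [4909, 0, 61],
    [1, 10215, 0],
    [36, 0, 639],
]

total, error_rate = summarize(iptv137)
print(total, f"{error_rate:.1%}")  # 15861 0.6%
```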

Figure 11: iptv138, 17320 frames (error 0.1%); the result is shown in Table 4:

Table 4

iptv138	Detected as P	Detected as B	Detected as I
Actual type P	5676	0	8
Actual type B	0	10903	0
Actual type I	10	0	723

Figure 12: song, 38741 frames (error 0.9%); the result is shown in Table 5:

Table 5

song	Detected as P	Detected as B	Detected as I
Actual type P	16698	0	149
Actual type B	0	20217	0
Actual type I	210	0	1467

Figure 13: FIFA, 9517 frames (error 1.3%); the result is shown in Table 6:

Table 6

FIFA	Detected as P	Detected as B	Detected as I
Actual type P	4267	0	21
Actual type B	0	4693	0
Actual type I	106	0	430

Figure 14: travel, 1486 frames (error 0.8%); the result is shown in Table 7:

Table 7

travel	Detected as P	Detected as B	Detected as I
Actual type P	493	0	11
Actual type B	0	934	0
Actual type I	1	0	47

Figure 15: sport, 1156 frames (error 0.3%); the result is shown in Table 8:

Table 8

sport	Detected as P	Detected as B	Detected as I
Actual type P	396	0	4
Actual type B	0	719	0
Actual type I	0	0	37

A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.

The frame type detection method and device provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. Meanwhile, a person of ordinary skill in the art may, according to the idea of the present invention, make changes to the specific implementation and to the scope of application. In summary, the content of this description should not be construed as limiting the present invention.

Claims

1. A frame type detection method, characterized by comprising:

detecting the playing time of each frame;

if the playing time of the current frame is less than the maximum playing time of the frames already received, determining that the current frame is a bi-directional predictive coded frame (B frame);

if detecting the playing time fails, the method further comprises:

2. The method according to claim 1, characterized in that, after frame type detection ends, the method further comprises:

determining, according to the playing order and the coding order of each frame, the level to which a B frame belongs in hierarchical coding.

3. The method according to claim 1, characterized in that obtaining the coding type of the code stream in which the received frames are located comprises:

4. The method according to claim 1, characterized by further comprising:

if the data amount of the current frame is greater than a second threshold, determining that the current frame is an I frame, wherein the second threshold is the maximum of: the data amount of the I frame preceding the current frame, the average data amount of the P frames in the group of pictures in which the current frame is located, and the average data amount of a set number of consecutive frames.
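The second-threshold test above can be sketched as follows, taking the wording literally. The function and parameter names are hypothetical, and skipping a candidate when no P frame or no recent frame is available is an assumption of this sketch rather than something stated in the text.

```python
from statistics import mean
from typing import Sequence


def second_threshold(prev_i_size: float,
                     p_sizes_in_gop: Sequence[float],
                     recent_sizes: Sequence[float]) -> float:
    """Maximum of: previous I frame size, mean P frame size in the GOP, mean of recent frames."""
    candidates = [prev_i_size]
    if p_sizes_in_gop:          # assumption: skip when the GOP has no P frame yet
        candidates.append(mean(p_sizes_in_gop))
    if recent_sizes:            # assumption: skip when no recent frames are available
        candidates.append(mean(recent_sizes))
    return max(candidates)


def is_i_frame(size: float, prev_i_size: float,
               p_sizes_in_gop: Sequence[float],
               recent_sizes: Sequence[float]) -> bool:
    """The current frame is judged an I frame if it exceeds the second threshold."""
    return size > second_threshold(prev_i_size, p_sizes_in_gop, recent_sizes)


# Hypothetical frame sizes in bytes.
print(is_i_frame(60000, 50000, [12000, 14000], [10000, 52000, 12000]))
print(is_i_frame(40000, 50000, [12000, 14000], [10000, 52000, 12000]))
```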

5. The method according to claim 1, characterized by further comprising:

if the interval between the current frame and the previous I frame exceeds a fixed interval, and the data amount of the current frame is greater than a third threshold, determining that the current frame is an I frame, wherein the third threshold is calculated according to the average data amount of the frames in the group of pictures in which the current frame is located, the data amount of the P frame preceding the current frame, the data amount of the I frame of the group of pictures in which the current frame is located, and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I frame interval; or the third threshold is calculated according to the average data amount of the frames in the group of pictures in which the current frame is located and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I frame interval.

6. The method according to claim 1, characterized by further comprising:

if the frame preceding the current frame is a P frame and the data amount of the current frame is greater than a fifth threshold, or the current group of pictures contains a B frame and the data amount of the current frame is greater than a sixth threshold, determining that the current frame is a P frame, wherein the fifth threshold is the product of a first adjustment factor and the average data amount of the P frames in the group of pictures in which the current frame is located, the first adjustment factor being greater than 0.5 and less than 1, and the sixth threshold is the mean of the P frame average data amount and the B frame average data amount;

if the frame preceding the current frame is a B frame and the data amount of the current frame is less than a seventh threshold, or the current group of pictures contains a P frame and the data amount of the current frame is less than an eighth threshold, determining that the current frame is a B frame, wherein the seventh threshold is the product of a second adjustment factor and the average data amount of the B frames in the group of pictures in which the current frame is located, the second adjustment factor being greater than 1 and less than 1.5, and the eighth threshold is the mean of the P frame average data amount and the B frame average data amount.
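The fifth to eighth thresholds can be sketched as a single decision function. In this sketch the branch for frames below the seventh or eighth threshold is read as yielding a B frame, which is an interpretation; the names and the factor values 0.75 and 1.25 are hypothetical choices inside the stated ranges (0.5 to 1, and 1 to 1.5), and checking the P conditions first is an assumption about ordering.

```python
from typing import Optional


def classify_p_or_b(size: float, prev_type: str,
                    avg_p: Optional[float], avg_b: Optional[float],
                    k1: float = 0.75, k2: float = 1.25) -> Optional[str]:
    """Return 'P', 'B', or None when none of the four threshold tests applies.

    avg_p / avg_b are the average data amounts of P / B frames in the current
    group of pictures, or None when no such frame has been seen in it yet.
    """
    # Sixth and eighth thresholds: mean of the P and B averages (needs both).
    midpoint = None if avg_p is None or avg_b is None else (avg_p + avg_b) / 2

    if prev_type == "P" and avg_p is not None and size > k1 * avg_p:
        return "P"  # fifth threshold
    if midpoint is not None and size > midpoint:
        return "P"  # sixth threshold (the group of pictures contains a B frame)
    if prev_type == "B" and avg_b is not None and size < k2 * avg_b:
        return "B"  # seventh threshold
    if midpoint is not None and size < midpoint:
        return "B"  # eighth threshold (the group of pictures contains a P frame)
    return None


print(classify_p_or_b(900, "P", 1000, 400))
print(classify_p_or_b(450, "B", 1000, 400))
```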

7. The method according to any one of claims 1 to 6, characterized by further comprising:

8. The method according to any one of claims 1 to 6, characterized by further comprising:

after frame type judgment ends, counting consecutive B frames; if the number of consecutive B frames is greater than a predicted value, determining the frame with the largest data amount among the consecutive B frames to be a P frame; and updating the average data amount of each type of frame in the group of pictures, wherein the predicted value is greater than or equal to 3 and less than or equal to 7.
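The correction of over-long runs of B frames can be sketched as below. The function name and the list-based representation are hypothetical; the sketch promotes one frame per run, as the text describes, and leaves out the subsequent update of the per-type average data amounts.

```python
from typing import List, Sequence


def correct_long_b_runs(types: Sequence[str], sizes: Sequence[float],
                        predicted_value: int = 3) -> List[str]:
    """Relabel as P the largest frame in every run of more than predicted_value B frames."""
    result = list(types)
    i, n = 0, len(result)
    while i < n:
        if result[i] != "B":
            i += 1
            continue
        j = i
        while j < n and result[j] == "B":  # find the end of this run of B frames
            j += 1
        if j - i > predicted_value:
            largest = max(range(i, j), key=lambda k: sizes[k])
            result[largest] = "P"
        i = j
    return result


types = ["I", "B", "B", "B", "B", "B", "P"]
sizes = [50, 5, 6, 20, 5, 6, 15]
print(correct_long_b_runs(types, sizes))
```

Here the run of five B frames exceeds the value 3, so the frame of size 20 in the middle of the run is relabeled as a P frame.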

9. The method according to any one of claims 1 to 6, characterized by further comprising:

10. A frame type detection device, characterized by comprising:

a time detecting unit, configured to detect the playing time of each frame;

a frame type determining unit, configured to determine that the current frame is a bi-directional predictive coded B frame if the playing time of the current frame is less than the maximum playing time of the frames already received;

the detection device further comprises a unit configured to perform the following operation: if the time detecting unit fails to detect the playing time, obtaining the coding type of the code stream in which the received frames are located, the coding type comprising open-loop coding and closed-loop coding;

11. The device according to claim 10, characterized by further comprising:

a level determining unit, configured to determine, according to the playing order and the coding order of each frame, the level to which a B frame belongs in hierarchical coding.

12. The device according to claim 10, characterized in that

the unit configured to obtain the coding type of the code stream in which the received frames are located is configured to: count the type of the frame following each obvious I frame; if the proportion of P frames reaches a set proportion, determine that the coding type is closed-loop coding; otherwise, determine that the coding type is open-loop coding.

13. The device according to claim 10, characterized in that

the detection device further comprises a unit configured to perform the following operation: if the data amount of the current frame is greater than a second threshold, determining that the current frame is an I frame, wherein the second threshold is the maximum of: the data amount of the I frame preceding the current frame, the average data amount of the P frames in the group of pictures in which the current frame is located, and the average data amount of a set number of consecutive frames.

14. The device according to claim 10, characterized in that

the detection device further comprises a unit configured to perform the following operation: if the interval between the current frame and the previous I frame exceeds a fixed interval, and the data amount of the current frame is greater than a third threshold, determining that the current frame is an I frame, wherein the third threshold is calculated according to the average data amount of the frames in the group of pictures in which the current frame is located, the data amount of the P frame preceding the current frame, the data amount of the I frame of the group of pictures in which the current frame is located, and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I frame interval; or the third threshold is calculated according to the average data amount of the frames in the group of pictures in which the current frame is located and the degree to which the distance from the previous I frame to the current frame departs from the expected fixed I frame interval.

15. The device according to claim 10, characterized in that

the detection device further comprises a unit configured to perform the following operation: if the frame preceding the current frame is a P frame and the data amount of the current frame is greater than a fifth threshold, or the current group of pictures contains a B frame and the data amount of the current frame is greater than a sixth threshold, determining that the current frame is a P frame, wherein the fifth threshold is the product of a first adjustment factor and the average data amount of the P frames in the group of pictures in which the current frame is located, the first adjustment factor being greater than 0.5 and less than 1, and the sixth threshold is the mean of the P frame average data amount and the B frame average data amount;

16. The device according to any one of claims 10 to 15, characterized in that

the detection device further comprises a unit configured to perform the following operation: after frame type judgment ends, determining the fixed interval of I frames; if no I frame has been judged by the time the fixed interval is reached, determining the frame with the largest data amount within a set range around the fixed interval to be an I frame; and updating the average data amount of each type of frame in the group of pictures and the interval parameter of I frames.

17. The device according to any one of claims 10 to 15, characterized in that

the detection device further comprises a unit configured to perform the following operation: after frame type judgment ends, counting consecutive B frames; if the number of consecutive B frames is greater than a predicted value, determining the frame with the largest data amount among the consecutive B frames to be a P frame; and updating the average data amount of each type of frame in the group of pictures, wherein the predicted value is greater than or equal to 3 and less than or equal to 7.

18. The device according to any one of claims 10 to 15, characterized in that

the detection device further comprises a unit configured to perform the following operation: determining whether packet loss has occurred in the received frames, and if packet loss has occurred, determining the packet loss type;