WO2007072419A2 - A device for and a method of processing a data stream - Google Patents

A device for and a method of processing a data stream

Publication number
WO2007072419A2
Authority
WO
WIPO (PCT)
Prior art keywords
frame
frames
data stream
mode
stream
Prior art date
Application number
PCT/IB2006/054944
Other languages
French (fr)
Other versions
WO2007072419A3 (en
Inventor
Albert M. A. Rijckaert
Eric W. J. Moors
Roland P. J. M. Manders
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V. filed Critical Koninklijke Philips Electronics N.V.
Publication of WO2007072419A2 publication Critical patent/WO2007072419A2/en
Publication of WO2007072419A3 publication Critical patent/WO2007072419A3/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/24 Systems for the transmission of television signals using pulse code modulation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/432 Content retrieval operation from a local storage medium, e.g. hard-disk
    • H04N21/4325 Content retrieval operation from a local storage medium, e.g. hard-disk by playing back content from the storage medium
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • H04N5/78 Television signal recording using magnetic recording
    • H04N5/782 Television signal recording using magnetic recording on tape
    • H04N5/783 Adaptations for reproducing at a rate different from the recording rate
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • H04N5/91 Television signal processing therefor
    • H04N5/913 Television signal processing therefor for scrambling; for copy protection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • H04N5/91 Television signal processing therefor
    • H04N5/913 Television signal processing therefor for scrambling; for copy protection
    • H04N2005/91357 Television signal processing therefor for scrambling; for copy protection by modifying the video signal
    • H04N2005/91364 Television signal processing therefor for scrambling; for copy protection by modifying the video signal the video signal being scrambled

Definitions

  • the invention relates to a device for processing a data stream.
  • the invention further relates to a method of processing a data stream.
  • the invention further relates to a program element.
  • the invention further relates to a computer-readable medium.
  • audio and video data are often stored in a compressed manner, and for security reasons in an encrypted manner.
  • MPEG2 is a standard for the generic coding of moving pictures and associated audio and creates a video stream out of frame data that can be arranged in a specified order called the GOP ("Group Of Pictures") structure.
  • An MPEG2 video bit stream is made up of a series of data frames encoding pictures.
  • the three ways of encoding a picture are intra-coded (I picture), forward predictive (P picture) and bi-directional predictive (B picture).
  • An intra-coded frame (I-frame) is independently decodable.
  • a forward predictive frame (P-frame) needs information of a preceding I-frame or P-frame.
  • a bi-directional predictive frame (B-frame) is dependent on information of a preceding and/or subsequent I-frame or P-frame.
  • trick-play modes should also be possible with digital televisions.
  • digital televisions have an embedded decoder and a digital interface via which a standardized signal is provided, such as an MPEG signal described above.
  • a similar situation also occurs in a home network of Set Top Boxes that communicate via a digital in-home network.
  • the system providing the trick-play signal is located remotely from the decoder. It is therefore advantageous to provide a trick-play signal as a standardized signal form capable of working in conjunction with a standard decoder.
  • the signal should also preferably take into account mode transitions from normal to trick-play, trick-play to trick-play and vice versa since these transitions can occur at any time under control of the consumer, even midway through the transmission of a single video frame. In such a situation a signal already transmitted cannot be revoked.
  • a device for processing a data stream comprising a plurality of frames
  • the device comprises an input for receiving the data stream, an output for transmitting a processed data stream, a detection unit arranged to detect a switching of mode from a first reproduction mode to a second reproduction mode, a determination unit arranged to determine a first anchor frame comprised within the plurality of frames in response to the switching of mode and an insertion unit arranged to insert frames, as inserted frames, into the data stream to produce the processed data stream in the second reproduction mode, wherein the insertion unit is arranged to insert the first anchor frame determined by the determination unit into the data stream as a first frame in the second reproduction mode and insert a first succession of empty predictive frames subsequent to the first anchor frame.
  • a device for processing a data stream may comprise a detection unit for detecting a switching of mode from a first reproduction mode, for example, normal play or still picture mode to a second reproduction mode such as step picture mode.
  • the device may determine a first anchor frame as the first frame in the processed data stream in a step picture mode.
  • the determination of an anchor frame excludes the next following frame being a B-frame and therefore reduces the number of artifacts caused by the reordering of anchor frames with respect to B-frames in data streams where such reordering is commonly applied, such as in MPEG based data streams.
  • the device may insert the first anchor frame and a succession of repeated empty predicted frames, such as empty MPEG P-frames, to provide, for example, a step picture mode with a frozen frame appearing subsequent to the step.
  • the reduction of artifacts when displaying the first frame provides a processed data stream of an improved compliance, for example, to the MPEG standard and therefore an improved picture display for the user.
  • anchor frame may particularly denote a frame, which, in transmission order and/or in display order, keeps its relative temporal position with respect to other anchor frames.
  • I-frames and P-frames may be denoted as anchor frames.
  • B-frames would not be denoted as anchor frames in the context of MPEG2.
  • a method for processing a data stream comprising the method steps of receiving the data stream, detecting a switching of mode from a first reproduction mode to a second reproduction mode, determining a first anchor frame comprised within the plurality of frames in response to the switching of mode, inserting frames, as inserted frames, into the data stream to produce a processed data stream in the second reproduction mode and outputting the processed data stream, wherein the method step of inserting further comprises method steps of inserting the first anchor frame determined by the determining step into the data stream as a first frame in the second reproduction mode and inserting a first succession of empty predictive frames subsequent to the first anchor frame.
  • a program element for processing a data stream, the program element being capable of being directly loadable into the memory of a programmable device, and comprising software code portions for performing, when said program element is run on the device, the method steps of receiving a data stream comprising a plurality of frames, detecting a switching of mode from a first reproduction mode to a second reproduction mode, determining a first anchor frame comprised within the plurality of frames in response to the switching of mode, inserting frames into the data stream to produce a processed data stream in the second reproduction mode and outputting the processed data stream, wherein the method step of inserting further comprises the method steps of inserting the first anchor frame determined by the determining step into the data stream as a first frame in the second reproduction mode and inserting a succession of empty predictive frames subsequent to the first anchor frame.
  • a computer-readable medium directly loadable into the memory of a programmable device, comprising software code portions for performing processing of a data stream, when said code portions are run on the device, the method steps of receiving a data stream comprising a plurality of frames, detecting a switching of mode from a first reproduction mode to a second reproduction mode, determining a first anchor frame comprised within the plurality of frames in response to the switching of mode, inserting frames into the data stream to produce a processed data stream in the second reproduction mode and outputting the processed data stream, wherein the method step of inserting further comprises the method steps of inserting the first anchor frame determined by the determining step into the data stream as a first frame in the second reproduction mode and inserting a succession of empty predictive frames subsequent to the first anchor frame.
  • the detection unit may be further arranged to detect a step forward signal
  • the determination unit may be further arranged to, in response to detection of the step forward signal, determine a subsequent anchor frame, the subsequent anchor frame being the next following anchor frame comprised within the plurality of frames
  • the insertion unit may be arranged to insert the subsequent anchor frame determined by the determination unit into the processed data stream in the second reproduction mode and insert a second succession of empty predictive frames subsequent to the subsequent anchor frame.
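The step behaviour described above (an anchor frame followed by a succession of empty predictive frames, repeated for each step forward) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: frames are modelled as label strings in transmission order, `Pe` is a hypothetical label for an empty predictive frame, `hold` is an assumed repeat count, and the sketch always picks the next anchor frame at or after the switch position, which is only one of the choices the text describes.

```python
from typing import List

ANCHOR_TYPES = {"I", "P"}  # in MPEG2, I-frames and P-frames act as anchor frames
EMPTY_P = "Pe"             # hypothetical label for an empty predictive frame


def next_anchor(frames: List[str], start: int) -> int:
    """Return the index of the first anchor frame at or after `start`."""
    for i in range(start, len(frames)):
        if frames[i][0] in ANCHOR_TYPES:
            return i
    raise ValueError("no anchor frame found at or after position %d" % start)


def step_output(frames: List[str], switch_pos: int, steps: int, hold: int = 3) -> List[str]:
    """Build the processed stream for a mode switch followed by `steps` step-forward commands.

    Each displayed picture is one anchor frame followed by `hold` empty
    predictive frames, which freeze the picture until the next step.
    """
    out: List[str] = []
    pos = switch_pos
    for _ in range(steps + 1):      # one anchor for the switch, one per step
        idx = next_anchor(frames, pos)
        out.append(frames[idx])
        out.extend([EMPTY_P] * hold)
        pos = idx + 1               # the next search starts after this anchor
    return out


stream = ["I0", "P3", "B1", "B2", "P6", "B4", "B5", "I9", "B7", "B8"]
print(step_output(stream, switch_pos=2, steps=1, hold=2))
```

Because B-frames are never selected, the first frame after the switch does not depend on a reference frame that the decoder has not yet received.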
  • a correction unit may be provided for correcting at least one temporal parameter of the inserted frames comprised within the processed data stream and inserted by the insertion unit.
  • the temporal parameter may comprise a temporal reference, a Presentation Time Stamp and/or a Decoding Time Stamp.
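As a rough illustration of correcting one such temporal parameter, the sketch below assigns consecutive Presentation Time Stamps to a run of output frames. It relies on two properties of MPEG2 systems timestamps (a 90 kHz clock and a 33-bit counter); the function name `restamp`, the integer frame rate of 25 frames per second, and the choice to restamp every frame with a fixed period are assumptions made for the example only.

```python
from typing import List

TICKS_PER_SECOND = 90_000  # PTS and DTS count ticks of a 90 kHz clock
PTS_WRAP = 1 << 33         # PTS and DTS are 33-bit values and wrap around


def restamp(first_pts: int, n_frames: int, fps: int = 25) -> List[int]:
    """Return PTS values for `n_frames` consecutive frames, one frame period apart.

    Assumes an integer frame rate; 25 fps gives a period of 3600 ticks.
    """
    period = TICKS_PER_SECOND // fps
    return [(first_pts + i * period) % PTS_WRAP for i in range(n_frames)]


print(restamp(0, 3))  # [0, 3600, 7200]
```

Giving every inserted frame its own time stamp in this way avoids two frames in the processed stream carrying identical PTS values.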
  • a frame detector unit may be provided for detecting repeated bi-directionally predicted frames comprised within the plurality of frames in the first reproduction mode, wherein the determination unit may be further arranged to determine a further anchor frame which directly precedes the switching of mode and determine the first anchor frame to be the further anchor frame.
  • a frame detector unit may be provided for detecting repeated empty predicted anchor frames comprised within the plurality of frames in the first reproduction mode, wherein the determination unit may be further arranged to determine a further anchor frame which directly succeeds the switching of mode and determine the first anchor frame to be the further anchor frame.
  • the first reproduction mode may be a selection of one of a still picture mode, a slow forward mode or a normal play mode. Such modes are common trick-play modes in audio-video based systems and provide a user with improved functionality.
  • the second reproduction mode may be a step picture mode.
  • Such a mode is a common trick-play mode in audio-video based systems and provides a user with the ability to step through audio-video content.
  • the empty predictive frames may comprise at least one empty MPEG P type frame.
  • Such empty MPEG P type frames provide an efficient manner compliant with the MPEG standard for repeating a frame in the processed data stream.
  • the bi-directionally predicted frames might be MPEG
  • Such frames are commonly encountered in MPEG data streams and an embodiment capable of providing a compliant data stream responsive to the detection of such frames will often be required.
  • the empty predicted anchor frames might be empty MPEG P type frames.
  • Such frames are also commonly encountered in MPEG data streams and an embodiment capable of providing a compliant data stream responsive to the detection of such frames will again often be required.
  • the data stream may comprise one or more from a selection of video data, audio data and digital data. Such data streams are commonly encountered in consumer electronics devices.
  • the data stream may be an MPEG2 data stream.
  • MPEG2 is a designation for a group of audio and video coding standards agreed upon by MPEG (Moving Pictures Experts Group) and published as the ISO/IEC 13818 International Standard. For example, MPEG2 is used to encode audio and video broadcast signals including digital satellite and cable TV, but may also be used for DVD.
  • the device may also be adapted to process an MPEG4 encrypted data stream. More generally, any codec scheme may be implemented which uses anchor frames from which other frames are dependent, particularly any type of encoding using predictive frames and thus any kind of MPEG encoding/decoding.
  • the device according to the invention may be realized as at least one of the group consisting of a digital video recording device, a network-enabled device, a conditional access system, a portable audio player, a portable video player, a mobile phone, a DVD player, a CD player, a hard disk based media player, an Internet radio device, a computer, a television, a public entertainment device and an MP3 player.
  • Fig. 1 illustrates a time-stamped transport stream packet.
  • Fig. 2 shows an MPEG2 group of picture structure with intra-coded frames and forward predictive frames.
  • Fig. 3 illustrates an MPEG2 group of picture structure with intra-coded frames, forward predictive frames and bi-directional predictive frames.
  • Fig. 4a illustrates a structure of a characteristic point information file and stored stream content.
  • Fig. 4b shows an example of an Entitlement Control Message (ECM) file.
  • Fig. 5 illustrates a system for trick-play on a plaintext stream.
  • ECM Entitlement Control Message
  • Fig. 6 illustrates time compression in trick-play.
  • Fig. 7 illustrates trick-play with fractional distance.
  • Fig. 8 illustrates low speed trick-play.
  • Fig. 9 illustrates a general conditional access system structure.
  • Fig. 10 illustrates a digital video broadcasting encrypted transport stream packet.
  • Fig. 11 illustrates a transport stream packet header of the digital video broadcasting encrypted transport stream packet of Fig. 10.
  • Fig. 12 illustrates a system allowing the performance of trick-play on a fully encrypted stream.
  • Fig. 13 illustrates a full transport stream and a partial transport stream.
  • Fig. 14 illustrates Entitlement Control Messages for a stream type I and for a stream type II.
  • Fig. 15 illustrates writing Control Words to a decrypter.
  • Fig. 16 illustrates Entitlement Control Message handling in a fast forward mode.
  • Fig. 17 illustrates detection of one or two Control Words.
  • Fig. 18 illustrates the splitting of a transport stream packet at a frame boundary.
  • Fig. 19 illustrates a system allowing the performance of slow-forward trick-play on a fully encrypted stream.
  • Fig. 20 illustrates a hybrid stream with plaintext packets on each frame boundary.
  • Fig. 21 illustrates a system allowing the performance of slow-forward trick-play on a stored hybrid encrypted stream.
  • Fig. 22 illustrates an incomplete picture start code at the concatenation point of repeated B-frame data.
  • Fig. 23 illustrates the effect of MPEG frame re-ordering from transmission order to display order.
  • Fig. 24 illustrates the effect of MPEG frame re-ordering during slow-forward from transmission order through the intermediate display frame order to the actual displayed frames.
  • Fig. 25 illustrates the effect of MPEG frame re-ordering during slow-forward making use of empty P-frames before the anchor frames.
  • Fig. 26 illustrates the effect of MPEG frame re-ordering during slow-forward making use of backward predictive empty B-frames.
  • Fig. 27 illustrates the effect of MPEG frame re-ordering during slow-forward making use of forward predictive empty B-frames.
  • Fig. 28 illustrates the high table ID toggle rate due to repetition of B-frame data.
  • Fig. 29 illustrates the loss of a necessary Control Word for a type I system due to B-frame data repetition.
  • Fig. 30a illustrates the handling of ECMs for the slow-forward stream.
  • Fig. 30b illustrates the handling of ECMs for the slow-forward stream suitable for fast channel changing.
  • Fig. 31 illustrates a system for removing ECMs corresponding to the repeated B-frame data.
  • Fig. 32 illustrates a system for removing ECMs corresponding to the repeated B-frame data making use of a repetition detection unit.
  • Fig. 33 illustrates a system for removing ECMs corresponding to the repeated B-frame data making use of an analyzer unit.
  • Fig. 34 illustrates the loss of a necessary Control Word for a type II system due to B-frame data repetition and the removal of ECMs corresponding to all but the last repetition of the repeated B-frame data.
  • Fig. 35 illustrates the handling of ECMs for the slow-forward stream to prevent the loss of a necessary Control Word for a type II system due to B-frame data repetition.
  • Fig. 36 illustrates the results of a measurement of B-frame data length over time for a typical broadcast reception.
  • Fig. 37 illustrates the overlap of repeated B-frame data when the length of the B-frame data in time exceeds a single frame display time.
  • Fig. 38 illustrates the results of a measurement of distance in time, measured in frame periods, from the Presentation Time Stamp (PTS) of a B-frame to the Program Clock Reference (PCR) for a typical broadcast reception over a period of 30 seconds.
  • PTS Presentation Time Stamp
  • PCR Program Clock Reference
  • Fig. 39 illustrates the potential overlap of B-frame data from the last repeated B-frame when a switch occurs from slow forward processing to still picture mode.
  • Fig. 40 illustrates the switching to still picture mode on a B-frame comprising ECMs.
  • Fig. 41 illustrates identical PTS values and the resulting conflict for two frames when switching from slow forward processing mode to still picture mode.
  • Fig. 42 illustrates the use of a Pe-frame to avoid identical PTS values and the resulting conflict for two frames of Fig. 41 and shows further an issue with subsequently repeated B-frames.
  • Fig. 43 illustrates the switching from slow forward processing mode to still picture mode during an anchor frame.
  • Fig. 44 illustrates the switching from slow forward processing mode to still picture mode during a pre-inserted Pe-frame.
  • Fig. 45 illustrates the switching from slow forward processing mode to still picture mode during a pre-inserted Bb-frame.
  • Fig. 46 illustrates the switching from slow forward processing mode to still picture mode during a post-inserted Bf-frame.
  • Fig. 47 illustrates the switching from slow forward processing mode to still picture mode during a switching period comprising an anchor frame or subsequent Bf-frames.
  • Fig. 48 illustrates the switching from slow forward processing mode to still picture mode during a switching period comprising pre-inserted empty frames.
  • Fig. 49a illustrates the switching from slow forward processing mode to still picture mode during a switching period lasting until the start of an anchor frame.
  • Fig. 49b illustrates the switching from still picture mode to slow forward processing mode for a still picture mode using Pe-frames.
  • Fig. 50a illustrates the options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame does not contain ECMs.
  • Fig. 50b illustrates the options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame does contain ECMs, but no table ID toggle.
  • Fig. 50c illustrates the options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame contains both ECMs and a table ID toggle.
  • Fig. 50d illustrates the options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame contains both ECMs and a table ID toggle and where the table ID toggle occurs prior to the mode switch.
  • Fig. 50e illustrates the options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame contains both ECMs and a table ID toggle and where the table ID toggle occurs subsequent to the mode switch.
  • Fig. 50f illustrates the exception handling required upon the last repeated current B-frame for the situation of Fig. 50e.
  • Fig. 50g illustrates a second set of options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame contains both ECMs and a table ID toggle and where the table ID toggle occurs subsequent to the mode switch.
  • Fig. 50h illustrates a third set of options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame contains both ECMs and a table ID toggle and where the table ID toggle occurs subsequent to the mode switch.
  • Fig. 51 illustrates a device for processing an encrypted data stream.
  • Fig. 52 illustrates a second device for processing an encrypted data stream.
  • Fig. 53 illustrates an I-frame based step backwards mode.
  • Fig. 54 illustrates an undefined first frame after the first step backward.
  • Fig. 55 illustrates an undefined first frame and PTS conflict after the first step forward and equivalent artifacts for each subsequent picture step forward where a transition between an anchor frame and a B-frame occurs.
  • Fig. 56 illustrates an anchor frame based step forwards mode displaying only a single undefined frame after the first step forward.
  • Fig. 57 illustrates the first anchor frame required for the cases where the incoming data stream is comprised of repeated B-frame data, repeated Bf-frames and repeated Pe-frames.
  • Fig. 58 illustrates an I-frame based step forward mode.
  • Fig. 59 illustrates the first I-frame required for the cases where the incoming data stream is comprised of repeated B-frame data, repeated Bf-frames and repeated Pe-frames.
  • Fig. 60 illustrates a device for processing a data stream according to an exemplary embodiment of the invention.
  • Fig. 61 illustrates a second device for processing a data stream according to an exemplary embodiment of the invention.
  • Fig. 62 illustrates a third device for processing a data stream according to an exemplary embodiment of the invention.
  • the Figures are schematically drawn and not true to scale, and the identical reference numerals in different Figures refer to corresponding elements. It will be clear for those skilled in the art, that alternative but equivalent embodiments of the invention are possible without deviating from the true inventive concept, and that the scope of the invention will be limited by the claims only.
  • A time-stamped transport stream comprises transport stream packets, all of which are prepended with a 4-byte header in which the transport stream packet arrival time is placed. This time may be derived from the value of the program clock reference (PCR) time-base at the time the first byte of the packet is received at the recording device.
  • PCR program clock reference
  • Fig. 1 illustrates a time stamped transport stream packet 100 having a total length 104 of 188 Bytes and comprising a time stamp 101 having a length 105 of 4 Bytes, a packet header 102, and a packet payload 103 having a length of 184 Bytes.
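A minimal parsing sketch for such a stored packet is given below. It assumes one reading of Fig. 1: a 4-byte arrival time stamp followed by a standard 188-byte transport stream packet (4-byte header plus 184 further bytes), so that each stored record is 192 bytes. The big-endian interpretation of the time stamp, the function name and the returned field names are also assumptions; the sync byte value 0x47 and the positions of the PID and scrambling control bits follow the standard MPEG2 transport packet header.

```python
import struct
from typing import Dict, Union

TS_PACKET_LEN = 188  # standard MPEG2 transport stream packet length
STAMP_LEN = 4        # assumed length of the prepended arrival time stamp
SYNC_BYTE = 0x47     # first byte of every transport stream packet


def parse_timestamped_packet(buf: bytes) -> Dict[str, Union[int, bytes]]:
    """Split a stored record into arrival time, header fields and remaining bytes."""
    if len(buf) != STAMP_LEN + TS_PACKET_LEN:
        raise ValueError("expected a 192-byte time-stamped packet")
    (arrival,) = struct.unpack(">I", buf[:STAMP_LEN])  # big-endian is an assumption
    ts = buf[STAMP_LEN:]
    if ts[0] != SYNC_BYTE:
        raise ValueError("missing transport stream sync byte")
    return {
        "arrival_time": arrival,
        "pid": ((ts[1] & 0x1F) << 8) | ts[2],  # 13-bit packet identifier
        "scrambling_control": (ts[3] >> 6) & 0x3,  # non-zero means scrambled payload
        "body": ts[4:],  # 184 bytes after the header (may include an adaptation field)
    }


record = struct.pack(">I", 1234) + bytes([0x47, 0x01, 0x00, 0x80]) + bytes(184)
info = parse_timestamped_packet(record)
print(info["arrival_time"], info["pid"], info["scrambling_control"])
```

The packet header stays readable even when the payload is encrypted, which is why fields such as the scrambling control bits can be inspected without decryption.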
  • MPEG/DVB digital video broadcasting
  • trick-play engines When creating trick-play for an MPEG/DVB transport stream, problems may arise when the content is at least partially encrypted. It may not be possible to descend to the elementary stream level, which is the usual approach, or even access any packetized elementary stream (PES) headers before decryption. This also means that finding picture frames may not be possible.
  • PES packetized elementary stream
  • ECM denotes an Entitlement Control Message.
  • This message may particularly comprise secret provider proprietary information and may, among others, contain encrypted Control Words (CW) needed to decrypt the MPEG stream. Typically, Control Words expire in 10-20 seconds.
  • CW Control Words
  • keys particularly denotes data that may be stored in a smart card and may be transferred to the smart card using EMMs, that is so-called "Entitlement Management Messages" that may be embedded in the transport stream. These keys may be used by the smart card to decrypt the Control Words present in the ECM. An exemplary validity period of such a key may be one month.
  • Control Words particularly denotes decryption information needed to decrypt actual content. Control words may be decrypted by the smart card and then stored in a memory of the decryption core.
  • any MPEG2 streams created are MPEG2 compliant transport streams. This is because the decoder may not only be integrated within a device, but may also be connected via a standard digital interface, such as an IEEE 1394 interface, or an Internet interface, for example.
  • Fig. 2 shows a stream 200 comprising several MPEG2 GOP structures with a sequence of I-frames 201 and P-frames 202.
  • the GOP size is denoted with reference numeral 203.
  • the GOP size 203 is set to 12 frames, and only I-frames 201 and P-frames 202 are shown here.
  • a GOP structure may be used in which only the first frame is coded independently of other frames. This is the so-called intra-coded or I-frame 201.
  • the predictive frames or P-frames 202 are coded with a unidirectional prediction, meaning that they only rely on the previous I-frame 201 or P-frame 202 as indicated by arrows 204 in Figure 2.
  • Such a GOP structure has typically a size of 12 or 16 frames 201, 202.
  • Fig. 3 shows the MPEG2 GOP structure with a sequence of I-frames 201, P-frames 202 and B-frames 301.
  • the GOP size is again denoted with reference numeral 203.
  • B-frames 301 are coded with a bi-directional prediction, meaning that they rely on a previous and a next I- or P-frame 201, 202 as indicated for some B-frames 301 by curved arrows 204.
  • the transmission order of the compressed frames may be not the same as the order in which they are displayed.
  • the compressed frames may be reordered. So in transmission, the reference frames may come first.
  • the reordered stream, as it is transmitted, is also shown in Fig. 3, lower part.
  • the reordering is indicated by straight arrows 302.
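The reordering from display order to transmission order can be sketched as follows. This is an illustrative model only: frames are label strings whose first character is the frame type and whose number is the display position, and the function name is hypothetical. Each anchor frame is moved ahead of the B-frames that precede it in display order, so that both reference frames of a B-frame arrive before the B-frame itself.

```python
from typing import List


def display_to_transmission(display_order: List[str]) -> List[str]:
    """Reorder frames so every anchor frame precedes the B-frames that depend on it."""
    out: List[str] = []
    pending_b: List[str] = []  # B-frames waiting for their following anchor frame
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)
        else:
            out.append(frame)      # the anchor frame is sent first
            out.extend(pending_b)  # then the B-frames displayed before it
            pending_b = []
    out.extend(pending_b)          # trailing B-frames with no following anchor
    return out


print(display_to_transmission(["I0", "B1", "B2", "P3", "B4", "B5", "P6"]))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```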
  • a stream containing B-frames 301 can give a nice looking trick-play picture if all of the B-frames 301 are skipped. For the present example, this leads to a trick-play speed of 3x forward.
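The 3x figure follows from the ratio of all frames to retained frames. The short sketch below works this out for a 12-frame GOP, assuming two B-frames between successive anchor frames as the pattern of Fig. 3; the GOP string and the function name are illustrative assumptions.

```python
def trick_play_speed(gop: str, keep: str = "IP") -> float:
    """Return the speed-up obtained by displaying only the frame types in `keep`."""
    kept = sum(1 for frame_type in gop if frame_type in keep)
    if kept == 0:
        raise ValueError("no frames would be displayed")
    return len(gop) / kept


# 12-frame GOP with two B-frames between anchors: 12 frames / 4 anchors = 3x
print(trick_play_speed("IBBPBBPBBPBB"))  # 3.0
```

Keeping only the I-frame of the same GOP would give 12x under the same assumptions.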
  • trick-play may not be trivial.
  • the possibility of a slow-reverse based on I-frames only is briefly mentioned as an option.
  • An efficient frame based slow-reverse is practically impossible though, due to the necessary inversion of the MPEG2 GOP.
  • Slow-forward which is also known as slow motion forward is a mode in which the display picture runs at a lower than normal speed.
  • a rudimentary form of slow-forward is already possible with the technique making use of a fast-forward algorithm that generates trick-play GOPs. Setting the fast-forward speed to a value between zero and one results in a slow-forward stream based on a repetition of fast-forward trick-play GOPs.
  • the distance between I-frames in normal play may be around half a second, and for slow-forward/reverse it is multiplied by the slow motion factor. So this type of slow-forward or slow-reverse is not really the slow motion consumers are used to; in fact it is more like a slide show with a large temporal distance between the successive pictures.
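The slide-show effect can be quantified with a small calculation. The sketch below assumes the half-second I-frame distance mentioned in the text and treats the slow motion factor as the inverse of the fast-forward speed setting; the function name and default value are illustrative.

```python
def picture_interval(speed: float, i_frame_distance_s: float = 0.5) -> float:
    """Return the time in seconds between successive pictures in I-frame based slow-forward.

    `speed` is the fast-forward speed setting, between zero and one for slow-forward.
    """
    if not 0 < speed <= 1:
        raise ValueError("slow-forward speed must be in the range (0, 1]")
    return i_frame_distance_s / speed


# At one quarter speed a new picture appears only every two seconds.
print(picture_interval(0.25))  # 2.0
```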
  • In another trick-play mode, the still picture mode, the display picture is halted. This can be achieved by adding empty P-frames to the I-frame for the duration of the still picture mode. This means that the picture resulting from the last I-frame is halted. When switching to still picture from normal play, this can also be the nearest I-frame according to the data in the CPI file.
  • This technique may be an extension of the fast-forward/reverse modes and results in nice still pictures, especially if interlace kill is used. However, the positional accuracy may often be insufficient when switching from normal play or slow-forward/reverse to still picture.
  • the still picture mode can be extended to implement a step mode.
  • the step command advances the stream to some next or previous I-frame.
  • the step size is at minimum one GOP but can also be set to a higher value equal to an integer number of GOPs.
  • Step forward and step backward are both possible in this case because only I-frames are used.
  • the slow-forward can also be based on a repetition of every frame, which results in a much smoother slow motion.
  • the best form of slow-forward would in fact be a repetition of fields instead of frames because the temporal resolution is doubled and there are no interlace artifacts. This may, however, be practically impossible for the intrinsically frame based MPEG2 streams and even more so if they are largely encrypted.
  • the interlace artifacts can be significantly reduced for the I- and P-frames by using special empty frames to force the repetition. Such an interlace reduction technique may not be available for the B-frames though. Whether the use of interlace kill for the I- and P-frames is still advantageous in this case or in fact leads to a more annoying picture for the viewer can only be verified by experiments.
  • Slow-reverse on the basis of individual frames may in fact be very complicated for MPEG signals due to the temporal predictions.
  • a complete GOP has to be buffered and reversed. There is no known simple method to recode the frames in a GOP to the reverse order. So an almost complete decoding and encoding might be necessary, with an inversion of the frame order between these two. This calls for the buffering of a complete decoded GOP as well as a full MPEG decoder and encoder.
  • Still picture mode can be defined as an extension of the frame-based slow-forward mode. It may be based on a repeated display of the current frame for the duration of the still picture mode, whatever the type of this frame is. This may be, in fact, a slow-forward with an infinite slow motion factor, if this indicates the factor with which the normal play stream is slowed down. No interlace kill may be possible if the picture is halted on a B-frame. In that sense this still picture mode may be worse than the trick-play GOP based still picture mode. This can be corrected by only halting the picture at an I- or P-frame at the cost of a somewhat less accurate still picture position. Discontinuities in the temporal reference and the PTS can also be avoided in this case.
  • the bit rate may be significantly reduced because the repetition of an I- or P-frame may be forced by the insertion of empty frames instead of a repetition of the frame data itself as may be necessary for the B-frames. So, technically speaking, the halting of a picture at an I- or P-frame may be the best choice, if one accepts that lack of positional accuracy.
  • the still picture mode can also be extended with a step mode.
  • the step command advances the stream in principle to the next frame. Larger step sizes are possible by stepping to the next P-frame or some next I-frame. A step backward on frame basis may not be possible. The only option may be to step backward to one of the previous I-frames.
  • Two types have been mentioned, namely trick-play GOP based and frame based.
  • the first one may be most logically connected to fast-forward/reverse whereas the second one may be related to slow-forward.
  • the streams resulting from both methods look very alike because they are both based on the insertion of empty frames to force the repetition of an anchor frame. But on detailed stream construction level there are some differences.
  • CPI characteristic point information
  • Finding I-frames in a stream usually requires parsing the stream, to find the frame headers. Locating the positions where the I-frame starts can be done while the recording is being made, or off-line after the recording is completed, or semi on-line, in fact being off-line but with a small delay with respect to the moment of recording.
  • the I-frame end can be found by detecting the start of the next P-frame or B-frame.
  • the meta-data derived this way can be stored in a separate but coupled file that may be denoted as characteristic point information file or CPI file. This file may contain pointers to the start and eventually end of each I-frame in the transport stream file. Each individual recording may have its own CPI file.
  • a characteristic point information file 400 is visualized in Fig. 4a. Apart from the CPI file 400, stored information 401 is shown.
  • the amount of data to read from the transport stream file may be exactly known to get a complete I-frame 201. If for some reason the I-frame end is not known, the entire GOP or at least a large part of the GOP data may be read to be sure that the entire I-frame 201 is read. The end of the GOP may be given by the start of the next I-frame 201. It is known from measurements that the amount of I-frame data can be 40% or more of the total GOP data.
  • trick-play picture refresh rate can be achieved by displaying each I-frame 201 several times.
  • the bit rate will be reduced accordingly. This may be achieved by adding so-called empty P-frames 202 between the I-frames 201.
  • Such an empty P-frame 202 may not be really empty but may contain data instructing the decoder to repeat the previous frame. This has a limited bit cost, which can in many cases be neglected compared to an I-frame 201.
  • trick-play GOP structures like IPP or IPPP may be acceptable for the trick-play picture quality and even advantageous at high trick-play speeds.
  • the resulting trick-play bit rate may be of the same order as the normal play bit rate. It is also mentioned that these structures may reduce the required sustained bandwidth from the storage device.
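
As a rough illustration of why empty P-frames keep the trick-play bit rate bounded, the sketch below builds a trick-play GOP of size T (one I-frame followed by T-1 empty P-frames) and estimates the resulting bit rate. The function names, the frame labels and the byte sizes used in the example are assumptions for illustration only; they are not taken from the application.

```python
def trick_play_gop(gop_size):
    """Return the frame types of one trick-play GOP: an I-frame plus empty P-frames."""
    if gop_size < 1:
        raise ValueError("a trick-play GOP needs at least the I-frame")
    return ["I"] + ["P_empty"] * (gop_size - 1)


def trick_play_bit_rate(i_frame_bytes, empty_p_bytes, gop_size, frame_rate=25):
    """Estimate bits per second when each I-frame is shown gop_size times in a row."""
    gop_bytes = i_frame_bytes + (gop_size - 1) * empty_p_bytes
    # gop_size frames are displayed per GOP, so frame_rate / gop_size GOPs are sent per second.
    return gop_bytes * 8 * frame_rate / gop_size
```

With a hypothetical 40,000-byte I-frame and 200-byte empty P-frames at 25 Hz, an I-frame-only stream (T = 1) would need 8 Mbit/s, whereas an IPPP structure (T = 4) needs about 2 Mbit/s.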
  • a trick-play system 500 is schematically depicted in Fig. 5.
  • the trick-play system 500 comprises a recording unit 501, an I-frame selection unit 502, a trick-play generation block 503 and an MPEG2 decoder 504.
  • the trick-play generation block 503 includes a parsing unit 505, an adding unit 506, a packetizer unit 507, a table memory unit 508 and a multiplexer 509.
  • the recording unit 501 provides the I-frame selection unit 502 with plaintext MPEG2 data 510.
  • the multiplexer 509 provides the MPEG2 decoder 504 with an MPEG2 DVB compliant transport stream 511.
  • the I-frame selector 502 reads specific I-frames 201 from the storage device 501. Which I-frames 201 are chosen depends on the trick-play speed as will be described below.
  • the retrieved I-frames 201 are used to construct an MPEG-2/DVB compliant trick-play stream that may then be sent to the MPEG-2 decoder 504 for decoding and rendering.
  • the position of the I-frame packets in the trick-play stream cannot be coupled to the relative timing of the original transport stream.
  • the time axis may be compressed or expanded with the speed factor and additionally inverted for reverse trick-play. Therefore, the time stamps of the original time stamped transport stream may not be suitable for trick-play generation.
  • the original PCR time base may be disturbing for trick-play.
  • I-frames 201 normally contain two time stamps that tell the decoder 504 when to start decoding the frame (decoding time stamp, DTS) and when to start presenting, for instance displaying, it (presentation time stamp, PTS).
  • DTS decoding time stamp
  • PTS presentation time stamp
  • Decoding and presentation may be started when DTS respectively PTS are equal to the PCR time base, which may be reconstructed in the decoder 504 by means of the PCRs in the stream.
  • the distance between, e.g., the PTS values of 2 I-frames 201 corresponds to their nominal distance in display time. In trick-play this time distance may be compressed or expanded with the speed factor. Since a new PCR time base may be used in trick-play, and because the distance for DTS and PTS may be no longer correct, the original DTS and PTS of the I-frame 201 have to be replaced.
  • the I-frame 201 may first be parsed into an elementary stream in the parsing unit 505. Then the empty P-frames 202 are added on elementary stream level. The obtained trick-play GOP is mapped into one PES packet and packetized to transport stream packets. Then corrected tables like PAT, PMT, etc. are added. At this stage, a new PCR time base together with DTS and PTS are included. The transport stream packets are prepended with a 4-byte time stamp that is coupled to the PCR time base such that the trick-play stream can be handled by the same output circuitry as used for normal play.
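
The replacement of the original time stamps can be illustrated with a small sketch that assigns a fresh PTS to each trick-play GOP on a new time base. It relies on the MPEG-2 systems convention that PTS values count a 90 kHz clock in a 33-bit field; the function name, the parameters and the choice of one PTS per trick-play GOP are assumptions for this example, not the procedure of the application.

```python
PTS_CLOCK_HZ = 90_000   # PTS and DTS are expressed in units of a 90 kHz clock
PTS_MODULO = 1 << 33    # PTS is a 33-bit field and wraps around


def trick_play_pts(gop_index, gop_size, frame_rate=25, first_pts=0):
    """Return the PTS of the I-frame that starts trick-play GOP number gop_index."""
    ticks_per_frame = PTS_CLOCK_HZ // frame_rate
    # Every trick-play GOP occupies gop_size frame periods on the new time base.
    return (first_pts + gop_index * gop_size * ticks_per_frame) % PTS_MODULO
```

At 25 Hz one frame period is 3600 ticks, so with a trick-play GOP size of 4 the successive I-frames are spaced 14400 ticks apart regardless of their distance in the original stream.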
  • In the following, some aspects related to trick-play speeds will be described. In this context, firstly, fixed trick-play speeds will be discussed.
  • $N_b = G/T \qquad (1)$
  • the basic speed is an integer but this is not necessarily the case.
  • in some cases the set of trick-play speeds resulting from the method described above is satisfactory, in other cases it is not.
  • the trick-play speed formula will be inverted and the distance D will be calculated which is given by:
  • next ideal point Ip with the distance D may be calculated and one of the I-frames 201 may be chosen closest to this ideal point to construct a trick-play GOP.
  • next ideal point may be calculated by increasing the last ideal point by D.
  • the actual distance varies between int(D) and int(D)+1, the ratio between the occurrences of the two being dependent on the fraction of D, such that the average distance is equal to D.
  • the average trick-play speed is equal to N, but that the actually used frame has a small jitter with respect to the ideal frame.
  • trick-play speed N does not need to be an integer but can be any number above the basic speed Nb. Also speeds below this minimum can be chosen, but then the picture refresh rate may be lowered locally because the effective trick-play GOP size T is doubled or at still lower speeds even tripled or more. This is due to a repetition of the trick-play GOPs, as the algorithm will choose the same I-frame 201 more than once.
  • the round function is used to select the I-frames 201 and as can be seen frames 2 and 4 are selected twice.
  • the described method will allow for a continuously variable trick-play speed.
  • a negative value is chosen for N.
  • In Fig. 7 this simply means that the arrows 700 are pointing in the other direction.
  • the method described will also include the sets of fixed trick-play speeds mentioned earlier and they will have the same quality, especially if the round function is used. Therefore, it might be appropriate that the flexible method described in this section should always be implemented whatever the choice of the speeds will be.
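
The selection of I-frames around ideal points can be sketched as below. This is a minimal illustration under stated assumptions: the distance D is passed in directly and is measured in I-frames, the I-frames are numbered 0, 1, 2, ..., and the function name and parameters are hypothetical. Rounding is done as floor(x + 0.5) so that halves always round up, which avoids the round-half-to-even behaviour of Python's built-in round.

```python
import math


def select_i_frames(num_i_frames, distance, count, start=0.0):
    """Pick the I-frame closest to each ideal point; ideal points advance by `distance`."""
    picks = []
    ideal = start
    for _ in range(count):
        index = math.floor(ideal + 0.5)  # nearest I-frame to the ideal point
        if index < 0 or index >= num_i_frames:
            break  # ran off either end of the recording
        picks.append(index)
        ideal += distance  # a negative distance gives reverse trick-play
    return picks
```

With D = 2.5 the chosen frames are 0, 3, 5, 8, 10, so the actual distance alternates between 3 and 2 while averaging 2.5. With D = 0.5 the same I-frame is picked more than once, and a negative D walks backwards through the recording.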
  • refresh rate particularly denotes the frequency with which new pictures are displayed. Although not speed dependent, it will be briefly discussed here because it can influence the choice of T. If the refresh rate of the original picture is denoted by R (25 Hz or 30 Hz), the refresh rate of the trick-play picture (R_t) is given by:
  • Fig. 9 illustrates a conditional access system 900 which will now be described.
  • content 901 may be provided to a content encryption unit 902. After having encrypted the content 901, the content encryption unit 902 supplies a content decryption unit 904 with encrypted content 903.
  • ECM denotes Entitlement Control Messages.
  • KMM denotes Key Management Messages
  • GKM denotes Group Key Messages
  • EMM denotes Entitlement Management Messages.
  • a Control Word 906 may be supplied to the content encryption unit 902 and to an ECM generation unit 907.
  • the ECM generation unit 907 generates an ECM and provides the same to an ECM decoding unit 908 of a smart card 905.
  • the ECM decoding unit 908 generates from the ECM a Control Word, that is, the decryption information that is needed by and provided to the content decryption unit 904 to decrypt the encrypted content 903.
  • an authorization key 910 is provided to the ECM generation unit
  • KMM decoding unit 912 provides an output signal to the ECM decoding unit 908.
  • a group key 914 may be provided to the KMM generation unit 911 and to a GKM generation unit 915 which may further be provided with a user key 918.
  • the GKM generation unit 915 generates a GKM signal GKM and provides the same to a GKM decoding unit 916 of the smart card 905, wherein the GKM decoding unit 916 gets as a further input a user key 917.
  • entitlements 919 may be provided to an EMM generation unit 920 that generates an EMM signal and provides the same to an EMM decoding unit 921.
  • the EMM decoding unit 921 located in the smart card 905 is coupled with an entitlement list unit 913 which provides the ECM decoding unit 908 with corresponding control information.
  • CA conditional access
  • the CA system 900 uses a layered hierarchy (see Fig. 9).
  • the CA system 900 transfers the content decryption key (Control Word CW 906, 909) from server to client in the form of an encrypted message, called an ECM.
  • ECMs are encrypted using an authorization key (AK) 910.
  • the CA server 900 may renew the authorization key 910 by issuing a KMM.
  • a KMM is in fact a special type of EMM, but for clarity the term KMM may be used.
  • KMMs are also encrypted using a key that for instance can be a group key (GK) 914, which is renewed by sending a GKM that is again a special type of EMM.
  • GK group key
  • GKMs are then encrypted with the user key (UK) 917, 918, which is a fixed unique key embedded in the smart card 905 and known by the CA system 900 of the provider only.
  • Authorization keys and group keys are stored in the smart card 905 of the receiver.
  • Entitlements 919 are sent to individual customers in the form of an EMM and stored locally in a secure device (smart card 905). Entitlements 919 are coupled to a specific program. An entitlements list 913 gives access to a group of programs depending on the type of subscription. ECMs are only processed into keys (Control Words) by the smart card 905 if an entitlement 919 is available for the specific program. Entitlement EMMs are subject to an identical layered structure as the KMMs (not depicted in Fig. 9).
  • the description above is a generalized view of the CA system 900.
  • in digital video broadcasting, only the encryption algorithm, the odd/even Control Word structure, the global structure of ECMs and EMMs and their referencing are defined.
  • the detailed structure of the CA system 900 and the way the payloads of ECMs and EMMs are encoded and used are provider specific. Also the smart card is provider specific. However, from experience it is known that many providers follow essentially the structure of the generalized view of Fig. 9.
  • the applied encryption and decryption algorithm is defined by the DVB standardization organization. In principle two encryption possibilities are defined namely PES level encryption and TS level encryption. However, in real life mainly the TS level encryption method is used. Encryption and decryption of the transport stream packets is done packet based. This means that the encryption and decryption algorithm is restarted every time a new transport stream packet is received. Therefore, packets can be encrypted or decrypted individually. In the transport stream, encrypted and plaintext packets are mixed because some stream parts are encrypted (e.g. audio/video) and others are not (e.g. tables). Even within one stream part (e.g. video) encrypted and plaintext packets may be mixed. Referring to Fig. 10, a DVB encrypted transport stream packet 1000 will be described.
  • the stream packet 1000 has a length 1001 of 188 Bytes and comprises three portions.
  • a packet header 1002 has a size 1003 of 4 Bytes.
  • an adaptation field 1004 may be included in the stream packet 1000. After that, a DVB encrypted packet payload 1005 may be sent.
  • Fig. 11 illustrates a detailed structure of the transport stream packet header 1002 of Fig. 10.
  • the transport stream packet header 1002 comprises a synchronization unit (SYNC) 1010, a transport error indicator (TEI) 1011 which may indicate transport errors in a packet, a payload unit start indicator (PLUSI) 1012 which may particularly indicate a possible start of a PES packet in the subsequent payload 1005, a transport priority unit (TPI) 1017 indicating priority of the transport, a packet identifier (PID) 1013 used for determining the assignment of the packet, a transport scrambling control (SCB) 1014 used to select the CW that is needed for decrypting the transport stream packet, an adaptation field control (AFLD) 1015, and a continuity counter (CC) 1016.
  • Thus, Fig. 10 and Fig. 11 show the MPEG2 transport stream packet 1000 that has been encrypted and which comprises different parts:
  • - Packet header 1002 is in plaintext. It serves to obtain important information such as a packet identifier (PID) number, presence of an adaptation field, scrambling control bits, etc.
  • - Adaptation field 1004 is also in plaintext. It can contain important timing information such as the PCR.
  • - DVB Encrypted Packet Payload 1005 contains the actual program content that may have been encrypted using the DVB algorithm.
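
Because the packet header stays in plaintext, its fields can be read without any key. The sketch below unpacks the 4-byte header using the bit layout of the MPEG-2 transport stream syntax (8-bit sync byte 0x47, three flag bits, 13-bit PID, 2-bit scrambling control, 2-bit adaptation field control, 4-bit continuity counter). The class and function names are assumptions for this example; the field names follow the abbreviations used in the text.

```python
from dataclasses import dataclass


@dataclass
class TsHeader:
    tei: bool    # transport error indicator
    plusi: bool  # payload unit start indicator
    tpi: bool    # transport priority
    pid: int     # 13-bit packet identifier
    scb: int     # 2-bit transport scrambling control
    afld: int    # 2-bit adaptation field control
    cc: int      # 4-bit continuity counter


def parse_ts_header(packet: bytes) -> TsHeader:
    """Decode the plaintext 4-byte header of a 188-byte transport stream packet."""
    if len(packet) != 188:
        raise ValueError("a transport stream packet is 188 bytes long")
    sync, b1, b2, b3 = packet[:4]
    if sync != 0x47:
        raise ValueError("sync byte 0x47 not found")
    return TsHeader(
        tei=bool(b1 & 0x80),
        plusi=bool(b1 & 0x40),
        tpi=bool(b1 & 0x20),
        pid=((b1 & 0x1F) << 8) | b2,
        scb=(b3 >> 6) & 0x03,
        afld=(b3 >> 4) & 0x03,
        cc=b3 & 0x0F,
    )
```

A non-zero `scb` tells a player that the payload is encrypted, and `afld` tells it whether an adaptation field precedes the payload, both without touching the encrypted bytes.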
  • SCB scrambling control bits
  • trick-play on fully encrypted streams are the two extremes of a range of possibilities.
  • Another reason is that there exist applications in which it may be necessary to record fully encrypted streams.
  • a basic principle is to read a large enough block of data from the storage device, decrypt it, select an I-frame in the block and construct a trick-play stream with it.
  • Such a system 1200 is depicted in Fig. 12.
  • Fig. 12 shows the basic principle of trick-play on a fully encrypted stream.
  • data stored in a hard disk 1201 are provided as a transport stream 1202 to a decrypter 1203.
  • the hard disk 1201 provides a smart card 1204 with an ECM, wherein the smart card 1204 generates Control Words from this ECM and sends the same to the decrypter 1203.
  • the decrypter 1203 decrypts the encrypted transport stream 1202 and sends the decrypted data to an I-frame detector and filter 1205. From there, the data are provided to an insert empty P frame unit 1206 which conveys the data to a set top box 1207. From there, data are provided to a television 1208.
  • the recording must contain all the data required to playback the recording of the channel at a later stage.
  • the CAT/PMT may describe CA packets (ECMs) needed for decryption of the stream.
  • Fig. 13 illustrates a full transport stream 1301.
  • NIT network information table
  • BAT bouquet association table
  • the partial stream should have SIT (selection information table) and DIT (discontinuity information table) tables inserted.
  • Jumping to the next block during trick-play can mean jumping back in the stream. It will be explained that this may not be only the case for trick-play reverse but also for trick-play forward at moderate speeds. The situation for forward trick-play with forward jumps and for reverse trick-play with inherently backward jumps will be explained afterwards. Specific problems may occur caused by the fact that data has to be decrypted.
  • a conditional access system may be designed for transmission.
  • the transmitted stream may be reconstructed with original timings.
  • trick-play may have severe implications for the handling of cryptographic metadata due to changed timings.
  • the data may be compressed or expanded in time due to trick-play, but the latency of the smart card may remain constant.
  • the mentioned data blocks may go through a decrypter.
  • This decrypter needs the Control Words used in the encryption process to decrypt the data blocks.
  • These Control Words may also be encrypted and stored in ECMs.
  • In a normal set-top-box (STB), these ECMs may be part of the program tuned to.
  • a conditional access module may extract the ECMs, send them to a smart card, and, if the card has rights or an authorization to decrypt these ECMs, may receive the decrypted Control Words from it.
  • Control Words usually have a relatively short lifetime of, for instance, approximately 10 seconds.
  • the Scrambling Control Bit, SCB 1014, in the transport stream packet headers may indicate this lifetime. If it changes, the next Control Word has to be used. This SCB change or toggle is indicated in Fig. 14 by a vertical line and with a reference numeral 1402.
  • In Fig. 14, particularly two different scenarios or stream types may be distinguished: According to a stream type I shown in a lower row 1401 in Fig. 14, two Control Words are provided per ECM.
  • Fig. 14 illustrates the two data streams 1400, 1401 comprising subsequently arranged periods or segments A, B, C denoted with reference numeral 1403.
  • each ECM comprises two Control Words, namely the Control Word relating to the current period or ECM, and additionally the Control Word of the subsequent period or ECM.
  • there is some redundancy concerning the provision of the Control Words.
  • the conditional access module may only send the first unique ECM it finds to the smart card to reduce or minimize the traffic to the card, as it may have a fairly slow processor.
  • CW A denotes the CW that was used to encrypt period A
  • CW B denotes the CW that was used to encrypt period B
  • ECM A may be defined as being the ECM that is present during the major part of period A. It can be seen that, in that case, ECM A holds the CW for the current period A and for stream type I additionally for the next period B. In general, an ECM may hold at least the CW for the current period and might hold the CW for the next period. Due to zapping, this may probably be true for all or many providers. Before going on, more information will be provided about a decrypter and how it may handle the CWs.
  • the decrypter may contain two registers, one for the "odd" and one for the "even" CW. "Odd" and "even" does not have to mean that the values of the CWs themselves are odd or even. The terms are particularly used to distinguish between two subsequent CWs in the stream. Which CW has to be used for the decryption of a packet is indicated by the SCB 1014 in the packet header. So the CWs used to encrypt the stream are alternating between odd and even. In Fig. 14 this means that, for instance, CW A and CW C are odd, whereas CW B and CW D are even. After the decryption by the smart card, CWs may be written to the corresponding registers in the decrypter, overwriting previous values, as indicated in Fig. 15.
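
The two-register behaviour can be modelled with a few lines of code. This is a sketch only: the class and method names are hypothetical, and the mapping of SCB values to registers (binary 10 for even, 11 for odd, 00 for an unscrambled packet) is the common DVB convention, stated here as an assumption since the text does not spell out the values.

```python
class CwRegisters:
    """Model of a decrypter holding one 'even' and one 'odd' Control Word."""

    SCB_TO_PARITY = {0b10: "even", 0b11: "odd"}  # assumed DVB convention

    def __init__(self):
        self.registers = {"even": None, "odd": None}

    def load(self, parity, control_word):
        # A new CW from the smart card overwrites the previous value of that parity.
        self.registers[parity] = control_word

    def select(self, scb):
        """Return the CW for a packet, or None if the packet is not scrambled."""
        if scb == 0b00:
            return None
        control_word = self.registers[self.SCB_TO_PARITY[scb]]
        if control_word is None:
            raise LookupError("no Control Word loaded for this SCB value")
        return control_word
```

Loading CW C into the odd register replaces CW A, which is harmless in normal play because period A has ended by then, but it shows why the order and timing of CW writes matter once trick-play jumps through the stream.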
  • Fig. 15 illustrates the two registers 1501, 1502 containing even CWs (register 1501) and containing odd CWs (register 1502).
  • smart card latency 1500 that is a time needed by the smart card to retrieve or decrypt a CW from an ECM
  • Fig. 16 shows ECM handling in a fast forward mode. In a plurality of subsequent periods 1403 separated by SCB toggles 1402, a plurality of data blocks 1600 are reproduced, wherein a switching 1601 occurs between different data blocks.
  • an ECM B is sent at a border between periods A and B.
  • an ECM C is sent at a border between period A and period B.
  • an ECM C is sent at a border between period B and period C.
  • an ECM D is sent at a border between period B and period C.
  • the ECMs may be stored in a separate file. In this file it may also be indicated to which period an ECM belongs (which part of the recorded stream).
  • the packets in the MPEG stream file may be numbered. The number of the first packet of a period (SCB toggle 1402) may be stored alongside with the ECM for this same period 1403.
  • the ECM file may be generated during recording of the stream.
  • An example of an ECM file is shown in Fig. 4b.
  • the ECM file is a file that may be created during the recording.
  • ECM packets may be located which may contain the Control Words needed to decrypt the video data. Every ECM may be used for a certain period, for instance 10 seconds, and may be transmitted (repeated) several times during this period (for instance 100 times).
  • the ECM file may contain every first new ECM of such a period.
  • the ECM data may be written into this file, and may be accompanied by some metadata. First of all, a serial number (counting up from 1) may be given.
  • the ECM file may contain the position of the SCB toggle. This may denote the first packet that can use this ECM to correctly decrypt its content. Then the position in time of this SCB toggle may follow as the third field.
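
One entry of such an ECM file, and the lookup of the ECM that belongs to a given packet, might be represented as in the sketch below. The field names, the in-memory list and the lookup function are assumptions for illustration; the application only states which pieces of information may be stored.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class EcmRecord:
    serial: int         # serial number, counting up from 1
    toggle_packet: int  # number of the first packet that uses this ECM (SCB toggle)
    toggle_time: float  # position in time of the SCB toggle, in seconds
    ecm: bytes          # the ECM data itself


def ecm_for_packet(records, packet_number):
    """Return the record whose period contains packet_number (records sorted by position)."""
    positions = [record.toggle_packet for record in records]
    index = bisect_right(positions, packet_number) - 1
    if index < 0:
        raise LookupError("packet lies before the first recorded SCB toggle")
    return records[index]
```

During trick-play a player could call `ecm_for_packet` for the first packet of the next data block and send that ECM to the smart card ahead of time, instead of waiting for an ECM embedded in the stream.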
  • FIG. 17 illustrates a situation for one CW detection and for two CW detection.
  • a smartcard latency 1500 an ECM A may be decrypted to generate corresponding CWs.
  • decrypted content 1701 may be generated.
  • PES headers 1702 namely a PES header A in period A (left) and a PES header B in period B (right).
  • the area 1703 of period B for one CW in Fig. 17 indicates that the data is decrypted with the wrong key and therefore scrambled. This checking could be done while recording, in which case it will take for instance 20 to 30 seconds. It could also be done offline and, because only two packets indicated by the PLUSIs (one in each period) would have to be checked, it could be very quick. In the unlikely event that adequate PES headers are not available, the picture headers could be used instead. In fact, any known information may be useable for detection. Again, a one/two CW indication may be stored in the ECM file.
  • a plaintext normal play stream can easily be reconstructed from a plaintext slow-forward stream. So the slow-forward stream should be encrypted if the normal play stream is encrypted. Since a DVB encryptor is not permissible in a consumer device, this can only be realized if the slow-forward stream is constructed on transport stream level using the encrypted data packets from the originally transmitted encrypted data stream.
  • So first it is necessary for this packet 1800 to be split up into two packets 1803, 1804, the first one 1803 containing the data from the first frame 1801 in the original packet 1800 and the second one 1804 containing the data from the next frame 1802.
  • Each of the two packets 1803, 1804 resulting from the splitting has to be stuffed, for instance, with an adaptation field AF 1805 and 1806.
  • the splitting of packets is clearly no problem for a plaintext stream.
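
For a plaintext packet, the split and the stuffing with an adaptation field can be sketched as follows. The sketch assumes an original packet that carries payload only (adaptation field control 01) and a known byte offset of the frame boundary within the payload; the function names are hypothetical. Handling of the payload unit start indicator and renumbering of the continuity counter in later packets are left out.

```python
def _stuffed_packet(header: bytes, payload: bytes, cc: int) -> bytes:
    """Build a 188-byte packet whose adaptation field pads out a short payload."""
    if not 1 <= len(payload) <= 183:
        raise ValueError("payload must be 1..183 bytes to leave room for stuffing")
    af_length = 183 - len(payload)  # bytes following the adaptation_field_length byte
    adaptation = bytes([af_length])
    if af_length > 0:
        adaptation += b"\x00" + b"\xff" * (af_length - 1)  # flags byte, then stuffing
    # Keep the scrambling control bits, set adaptation field control to 11, set the counter.
    byte3 = (header[3] & 0xC0) | 0x30 | (cc & 0x0F)
    return header[:3] + bytes([byte3]) + adaptation + payload


def split_packet(packet: bytes, boundary: int):
    """Split a payload-only packet at payload offset `boundary` into two stuffed packets."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a 188-byte transport stream packet")
    if (packet[3] >> 4) & 0x03 != 0x01:
        raise ValueError("this sketch only handles packets without an adaptation field")
    header, payload = packet[:4], packet[4:]
    cc = packet[3] & 0x0F
    first = _stuffed_packet(header, payload[:boundary], cc)
    second = _stuffed_packet(header, payload[boundary:], (cc + 1) % 16)
    return first, second
```

Both resulting packets are again 188 bytes long, with the first carrying the tail of one frame and the second the start of the next, so each can be repeated or dropped independently.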
  • a first option would be to fully decrypt the normal play data as is depicted in Fig. 19.
  • the decryption in slow-forward mode of a stored fully encrypted stream or a stored hybrid stream is no problem because no stream data is skipped or duplicated in the stream to the decrypter.
  • the complete stored stream is simply fed at a lower than normal rate through the decrypter, which also means that there would be no problems with the embedded ECMs.
  • the plaintext stream coming from the decrypter can then be used to split the packets or in fact to perform any necessary stream manipulation. But the resulting slow-forward stream is, of course, always a plaintext stream in this case.
  • the construction of an encrypted slow-forward stream from an encrypted normal play stream has to be performed on transport stream level because the use of a DVB encrypter in consumer devices may not be allowed.
  • a hybrid stream as shown in Fig. 20, with only a few plaintext packets 2001 on all frame boundaries is preferable.
  • the other plaintext packets 2000 and encrypted packets 2002 are left unmodified.
  • Such a stream could be generated on the playback side of the storage device if the stored stream is fully encrypted.
  • the decrypter in Fig. 19 is a selective type that only decrypts the necessary packets.
  • the stream is already stored on a storage device 2100, for example, a hard disk drive as a hybrid stream which may be sent to a decryptor 2101 as is indicated in Fig. 21.
  • the plaintext packets in the hybrid stream should now also allow for the splitting of packets containing data from two frames.
  • some part of the sequence header code or picture start code can still be located in an encrypted packet. In this case an ideal splitting is not possible.
  • the split is made between the encrypted and plaintext packet.
  • other types of concatenation have to be considered than the concatenation of empty P frames to I frames, for example, the concatenation of B-frames to B-frames.
  • the amount of data for a B-frame is much more than for an empty P-frame but in general it is still significantly less than for an I-frame.
  • the transmission time is also multiplied with the slow motion factor so at least on average there need not be an increase in bit rate.
  • P-frames can be of the interlace kill type, thus reducing interlace artifacts for these pictures. But such a reduction is not possible for pictures resulting from the B-frames because the repetition is not forced by an empty frame but by a repetition of the B-frame data itself. So the B-frames will always have the original interlace effects. If interlace kill were used for the I- and P-frames, this might look inconsistent because pictures with and without interlace effects are sequentially present in the stream of displayed pictures. It may therefore be better to only use empty frames without interlace kill to construct the slow-forward stream. One might expect that the repetition of the I- and P-frames should be enforced by the insertion in the transmission stream of empty P-frames after the original I- or P-frame.
  • This method may be used for fast-forward/reverse streams consisting of I-frames followed by empty P-frames.
  • this method is not correct for a stream that also includes B-frames, as is the case for a slow-forward stream constructed from a stored transmission stream with B-frames. Due to the reordering from transmission stream to display stream the I- and P-frames will be repeated in the wrong position thus disturbing the normal display order of the frames. This will be clarified by means of Fig. 23 and Fig. 24.
  • Fig. 23 depicts the effect of reordering in normal play.
  • the top line 2300 shows a normal play transmission stream with a GOP size of 12 frames consisting of I- 2302, P- 2304 and B-frames 2303. The first four frames of the next transmission GOP are also shown here for clarity.
  • the bottom line 2301 shows the stream after reordering to the display order.
  • the index indicates the display frame order. According to pages 24 and 25 of the MPEG-2 standard, ISO/IEC 13818-1 : 1995(E), the reordering is performed as follows:
  • Anchor frames are shifted to the position of the next anchor frame.
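  • The reordering rule above can be sketched in a few lines of Python. This is only an illustrative model, not part of the described device: the frame labels (a type letter followed by the display index, such as "I2" or "B0") and the function name are assumptions made for the example.

```python
def reorder_to_display(transmission):
    """Reorder frame labels from transmission order to display order.

    Labels are hypothetical: a type letter (I, P or B) followed by the
    display index, e.g. "I2" or "B0".
    """
    display = []
    held_anchor = None  # anchor waiting for the slot of the next anchor
    for frame in transmission:
        if frame[0] in "IP":
            # The slot of this anchor shows the previously held anchor
            # (None when no earlier anchor exists).
            display.append(held_anchor)
            held_anchor = frame
        else:
            # B-frames keep their position.
            display.append(frame)
    return display, held_anchor


transmission = ["I2", "B0", "B1", "P5", "B3", "B4", "P8", "B6", "B7"]
display, pending = reorder_to_display(transmission)
print(display)   # display order, with None where no earlier anchor exists
print(pending)   # anchor still waiting for the next anchor slot
```

  • In this model the B-frames stay in place and each anchor frame appears in the slot of the anchor frame that follows it, so the display indices come out in ascending order.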
  • the top line 2400 in Fig. 24 shows the transmission order of the first part of the slow-motion stream for this case, assuming a slow motion factor of 3.
  • Empty P-frames 2404 are inserted after the I-frames 2403 and P-frames 2405, and the B-frames 2406 are repeated.
  • the middle line 2401 shows the effect of the reordering.
  • the bottom line 2402 shows how the I-frames 2403 are repeated 2407 and 2408 by the empty P-frames 2404 in this case. The same repetition occurs for the P-frames 2405.
  • An empty P-frame 2404 results in a displayed picture that is a copy of the picture resulting from the previous anchor frame, which itself could also be an empty P-frame. It is clearly visible that the normal display order indicated by the index is disturbed because the display of frame 14 is split up into two parts. Only the last time that frame 14 is displayed 2408 is it in the correct position. This also means that all the B-frames are decoded erroneously. Therefore this is not the correct way to generate a slow-forward trick-play stream. In fact there are several possibilities to solve this problem. A first one is shown in Fig. 25. Here the empty P-frames 2404 are inserted before the anchor frames 2403 and 2405 in the transmitted stream extracted from the storage device, as is shown in the top line 2500.
  • the empty P-frames 2404 are now after the anchor frames 2403 and 2405. This is where they should be for a correct repetition of the anchor frames as is clear from the bottom line 2502.
  • the first one is related to the propagation of errors within a GOP.
  • P-frames depend on the previous anchor frame and B-frames depend on the surrounding anchor frames.
  • a data error during the transfer to the STB results in decoding errors and therefore disturbances in the picture. If this error is in an anchor frame it propagates until the end of the GOP because subsequent P-frames depend on this anchor frame.
  • the B-frames are affected because they use the pictures from the disturbed surrounding anchor frames for their decoding.
  • empty B-frames. It is also indicated that several types of empty B-frames can be constructed. They have the advantage that no additional error propagation is introduced and that interlace kill can be used.
  • the most important types of empty B-frames for our discussion are the forward and backward predictive empty B-frames. We will call them respectively Bf- and Bb-frames. The terms forward and backward predictive might be confusing to the reader.
  • A B-frame is normally bi-directionally predictive, but unidirectional predictive B-frames can also exist. In the latter case they can be forward or backward predictive.
  • Forward predictive means that an anchor frame is used to predict the following B-frames during encoding. So the picture resulting from a forward predictive B-frame is reconstructed during decoding from the previous anchor frame. This means that the Bf-frame forces the repetition of the previous anchor frame. Therefore it has the same effect as an empty P-frame or Pe-frame. It will be clear that the Bb-frame has the opposite effect. It forces the display of the anchor frame following it. For both types of empty B-frames an interlace kill version also exists. Empty B-frames can be used for the construction of a slow-forward stream.
  • the first possibility on the basis of Bb-frames is depicted in Fig. 26.
  • the Bb-frames 2603 are inserted before the anchor frames 2403 and 2405 in the transmitted stream, as shown in the top line 2600, and keep their position during the reordering as shown in the middle line 2601.
  • the anchor frames 2403 and 2405 are shifted to the position of the next anchor frame as is also shown in the middle line 2601.
  • the Bb-frame 2603 forces the display of the anchor frame following it in the reordered stream as shown in the bottom line 2602.
  • Bf-frames as depicted in Fig. 27. They are inserted after the anchor frames 2403 and 2405 in the transmission stream as shown in the top line 2700. The reordered stream is shown in the middle line 2701. The repeated display, as shown in the bottom line 2702, of the anchor frames 2403 and 2405 in the reordered stream is forced by the Bf-frames 2703 that follow them.
  • The use of Bf-frames 2703 is very similar to the use of empty P-frames 2404 for the construction of fast-forward and fast-reverse streams.
  • the use of Bf-frames 2703 is also possible in that case, thus using common measures for trick-play generation, which is a further benefit.
  • the first point is that measures are taken to enforce the repetition of the frames. This means that the slow-forward stream is not just a stretched normal play stream, but that additional data is added.
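  • As an illustration of the Bf-frame variant of Fig. 27, the following Python sketch builds a slow-forward transmission sequence and simulates which picture is shown in each display slot. It is a simplified model under stated assumptions: frames are hypothetical string labels ("I2", "B0", and "Bf" for an inserted forward predictive empty B-frame), and timing, PTS/DTS values and packet data are ignored.

```python
def build_slow_forward(transmission, factor):
    """Insert factor-1 Bf-frames after each anchor and repeat each B-frame."""
    out = []
    for frame in transmission:
        if frame[0] in "IP":
            out.append(frame)
            out.extend(["Bf"] * (factor - 1))  # inserted empty B-frames
        else:
            out.extend([frame] * factor)       # repeated B-frame data
    return out


def displayed_pictures(transmission):
    """Simulate the display: reorder anchors, let Bf repeat the shown anchor."""
    shown = []
    held_anchor = None         # anchor waiting for the next anchor slot
    last_anchor_shown = None   # anchor most recently put on the display
    for frame in transmission:
        if frame == "Bf":
            # A Bf-frame copies the anchor displayed just before it.
            shown.append(last_anchor_shown)
        elif frame[0] in "IP":
            shown.append(held_anchor)
            last_anchor_shown = held_anchor
            held_anchor = frame
        else:
            shown.append(frame)
    return shown


normal = ["I2", "B0", "B1", "P5", "B3", "B4", "P8"]
slow = build_slow_forward(normal, 3)
pictures = [p for p in displayed_pictures(slow) if p is not None]
print(pictures)
```

  • With a slow motion factor of 3, every picture in this model is shown three times in a row and the display indices remain in ascending order, which is the behaviour the Bf-frame construction is meant to achieve.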
  • the repetition of the anchor frames is realized by the insertion of empty frames. These do not contain any ECMs and therefore the normal processing order is not disturbed. So the original ECMs can still be used in this case.
  • the repetition of the B-frames is enforced by the repetition of the B-frame data. These may contain ECMs and the normal ECM order is disturbed if a toggle of the table ID 2802 and 2803 is present within the B-frame as is depicted in Fig. 28 at point 2800.
  • a repetition of such a B-frame leads to a high rate of ECMs being sent to the smart card.
  • decryption messages with the same toggle ID are filtered to reduce the rate at which decryption messages are sent to the smart card.
  • a toggle in the table ID is used to filter decryption messages.
  • a table ID toggle is found at the start 2801 of the repeated B-frames and at the original toggle location 2800 somewhere along these frames. The high rate of ECMs can cause the decryption system to become overloaded and fail since the rate of different ECMs being sent to the decrypter is higher than in the encrypted data stream in its original form.
  • In Fig. 30b the output of a further advantageous embodiment is illustrated.
  • the original ECMs 3003 are only repeated in the repeated B-frames 3000 and 3001 up to the table ID toggle 2800, i.e. pre-toggle original ECMs 3004 are kept and the post-toggle ECMs after the table ID toggle 2800 are removed 3005 in a filtering process.
  • In the final repeated B-frame 3002 all original ECMs are inserted, including the post-toggle ECMs 3006 after the table ID toggle 2800. This is advantageous to increase the speed of response of the system after channel changing, or zapping.
  • the inverse process can be applied for situations when the first B-frame of a repeated sequence comprises all ECMs; the following repeated B-frames would then have the pre-toggle ECMs 3004 removed and only comprise the post-toggle ECMs 3006.
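  • The filtering of Fig. 30b can be sketched as follows. This is a minimal model and not the claimed implementation: a B-frame is represented as a list of hypothetical (kind, value) tuples, where an ECM packet carries its table ID and any other packet stands for video data, and the table ID values 0x80 and 0x81 are only illustrative. The sketch assumes at most one table ID toggle per B-frame and at least one copy.

```python
def repeat_b_frame_with_ecm_filter(frame_packets, copies):
    """Repeat a B-frame, keeping post-toggle ECMs only in the final copy."""
    # Table ID of the first ECM in the frame; ECMs with this ID are pre-toggle.
    first_id = next((v for kind, v in frame_packets if kind == "ECM"), None)

    def pre_toggle_only(packets):
        # Keep all non-ECM packets and only the ECMs before the toggle.
        return [(k, v) for k, v in packets if k != "ECM" or v == first_id]

    repeated = [pre_toggle_only(frame_packets) for _ in range(copies - 1)]
    repeated.append(list(frame_packets))  # final copy keeps all original ECMs
    return repeated


frame = [("ECM", 0x80), ("VID", "a"), ("ECM", 0x80),
         ("VID", "b"), ("ECM", 0x81), ("VID", "c")]
copies = repeat_b_frame_with_ecm_filter(frame, 3)
ecm_ids = [v for copy in copies for k, v in copy if k == "ECM"]
print([hex(i) for i in ecm_ids])
```

  • In the concatenated output the table ID changes only once, at the original toggle location in the final copy, so the number of different ECMs offered to the smart card is not increased by the repetition.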
  • Fig. 31 shows a device 3100 capable of performing the required processing on the encrypted data stream 3107 to solve the problems mentioned above.
  • a repeated portion detection unit 3101 detects the ECMs contained within the encrypted data stream that have been modified from their original form 3106 by a trick-play generator 3105. Such a modification may be performed to generate, for example, a slow-forward trick-play stream by replicating B-frame data and the ECMs corresponding to the positions in time of the replicated B-frame data.
  • the detected ECMs are communicated to a selection unit 3102 which identifies at least one of the ECMs as having been repeated subsequent to the creation of the original encrypted data stream 3106 transmitted by the original content provider, i.e.
  • the selection unit 3102 may comprise an input 3104 for identifying sections of the encrypted data stream that have been repeated. This input is preferably connected directly to a trick-play generator 3105 that inserts the repeated sections into the encrypted data stream. The selection unit 3102 then selects encryption messages or ECMs to be deleted. A deletion unit 3103 deletes the selected encryption messages from the encrypted data stream.
  • the device 3100 may also contain a repetition detection unit 3200.
  • the repetition detection unit 3200 is capable of analyzing the incoming encrypted data stream and identifying sections of the stream that have been repeated at a time later than the creation of the original encrypted data stream.
  • the repetition detection unit 3200 may comprise a decrypter as known to the skilled person to completely decrypt the incoming data stream and analyze it for repeated frames and a comparator also known from the prior art to perform such an analysis on the decrypted data stream.
  • the repetition detection unit 3200 may comprise a simple table ID toggle detector that monitors the table ID in the incoming data stream and generates a toggle signal when the table ID changes.
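  • A simple table ID toggle detector of this kind might look like the sketch below. The input is assumed to be the sequence of table IDs of the ECMs in stream order; the function name and the values 0x80 and 0x81 are illustrative assumptions rather than details taken from the description.

```python
def detect_table_id_toggles(table_ids):
    """Return the positions at which the ECM table ID differs from its predecessor."""
    toggles = []
    previous = None
    for position, table_id in enumerate(table_ids):
        if previous is not None and table_id != previous:
            toggles.append(position)  # toggle signal generated here
        previous = table_id
    return toggles


# One B-frame whose ECMs contain a single toggle, repeated three times
# without any filtering.
original = [0x80, 0x80, 0x81]
naive_repeat = original * 3
print(detect_table_id_toggles(original))
print(detect_table_id_toggles(naive_repeat))
```

  • In this example the unfiltered repetition produces five toggles where the original B-frame has one: a toggle at the start of each repeated copy plus the original toggle inside each copy. An unusually high toggle rate of this kind is one indication that a section has been repeated.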
  • the device 3100 may also contain an analyzer unit 3300.
  • the analyzer unit 3300 may be connected to the repeated portion detection unit 3101 and be capable of receiving information 3301 about the characteristics of the encryption messages, or ECMs, contained within the encrypted data stream. Exemplary forms for such characteristics are the toggles of the table IDs comprised within the encrypted data stream. It has been found preferable to use the timing characteristics of the ECMs as a suitable characteristic to analyze.
  • the analyzer unit 3300 can then use timing characteristics resulting from the detected ECMs to identify portions of the encrypted data stream that have been repeated since the creation of the encrypted data stream in its original form.
  • Such a device, like that of Fig. 32, does not require explicit information to be given to the selection unit and therefore the devices of Fig. 32 and Fig. 33 can operate independently of the trick-play unit 3105.
  • the information on repeated sections contained within the encrypted data stream is then passed on to the selection unit 3102 via the input 3104 for receiving information on repeated sections of the encrypted data stream.
  • This method will also solve the high ECM rate problem for a stream with only one Control Word per ECM.
  • the Control Word problem is a little bit different though. There is no risk that the Control Word needed to decrypt the first part of a repeated B-frame will be destroyed. But a special effect occurs if the table ID toggle and SCB toggle are in one and the same B-frame. In practice this is not to be expected to occur frequently because the distance between these two is often larger than the GOP size due to the latency of the smart card. So the method given earlier for two Control Words per ECM will generally also work for systems with one Control Word per ECM. But the case that the toggles of table ID and SCB are in one and the same B-frame will be considered anyway for the less frequent occurrences.
  • Fig. 34 depicts this infrequent situation.
  • the single Control Word 3400 resulting from the ECM with a toggled table ID 3403 is necessary to decrypt the part of the B-frame 3402 after the SCB toggle 3401. So if only the last frame 3404 of a series of identical B-frames contains ECMs 3003, the last part of the earlier B-frames 3405 of the series cannot be decrypted correctly.
  • Compressed frames can result from the repetition of the B-frame data.
  • duration of B-frames will normally be less than one frame time. On average this is true but occasionally the transmission time of a B-frame can be larger than one frame time.
  • a B-frame of 1.4 frame times was detected. This measurement is depicted in Fig. 36.
  • the average B-frame data length equals 0.6 frames, but regularly the duration of the B-frame data is larger than one frame time.
  • the DTS of this previous frame can never be earlier than the end of the data of this frame and therefore never before the start of the data of the current frame.
  • the DTS of an arbitrary frame is at least one frame time later than the start of the data for this frame.
  • the DTS is always after the end of the frame data, even if this data is evenly distributed in one frame time. So the described equal packet distribution should be applied to all B-frames except the last repeated one.
  • a compressed as well as expanded frame will both be named a compressed frame in the remainder of this specification.
  • Gluing is only necessary between the B-frames of an identical series of B-frames. So a possible additional gluing packet will only be added to the end of a compressed B-frame and never anywhere else. An additional PCR packet is added to the end of the B-frames except to the end of the last repeated B-frame because there is no room at this point. This again means that the additional PCRs are only added at the end of compressed B-frames. So no special placement algorithm is necessary for these packets because they are all included in the compression algorithm.
  • Still picture mode is a trick-play mode where we deal with the freezing of a picture on the display screen.
  • the user would like the picture on the screen to be frozen at the moment he pushes the still picture button. This is easily accomplished in a box that also contains the decoder. At that moment the decoding is stopped and the picture in the active picture memory is repeated indefinitely.
  • a more complicated aspect is in dealing with a remote decoder connected to a trick-play engine via a digital interface with a standardized signal, such as MPEG, for example.
  • the storage device 2100 and decoder 2101 are in separate boxes connected by a digital interface, and that the still picture function is part of the storage device. In this case we cannot prevent the display of pictures that have already been sent to the decoder 2101 but were not yet displayed. At the moment the still picture button is pressed, the current picture on the screen will not be frozen but some future picture. This introduces a visible latency between the action, i.e. the pushing of the button, and the reaction, i.e. the freezing of the picture. The amount of latency depends on the delay of the presentation, as indicated by the PTS, with respect to the transmission of the corresponding frame data.
  • the presentation moment indicated by the PTS is not necessarily the real moment of presentation.
  • the DTS and PTS of a B-frame are identical. This would imply a zero decoding time because a picture cannot be displayed, i.e. presented, before it is decoded.
  • the MPEG-2 standard in fact, assumes that the decoder/renderer compensates for the decoding time it needs, thus introducing an additional delay on presentation.
  • the distance of the DTS to the frame data is given by the original broadcast and is not altered for the slow-forward stream constructed.
  • the PTS and DTS are identical for B-frames. So for these frames the reaction time is identical when switching to still picture from normal play or slow-forward. But the distance measured in original frames is, in fact, reduced by the slow motion factor. This means that the still picture is closer to the actual displayed picture for larger slow motion factors.
  • the distance of the PTS/DTS with respect to the start of the B-frame data is relatively small. Measurements indicate, however, the distance to be roughly 10 frame times of 40 ms in a measurement on ZDF, as shown in Fig. 38.
  • the delay of 10 frame times combined with an expected decoding time of less than one frame time results in an expected reaction time better than half a second.
  • the displayed picture is frozen for slow motion factors above ten times and that the fourth original picture after the displayed picture is frozen for a slow motion factor of three times.
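  • The figures quoted above can be checked with a short calculation. The sketch below only restates the numbers from the text (a delay of about 10 frame times of 40 ms and a decoding time of at most one frame time); the function names are hypothetical, and treating the decoding time as exactly one frame time is an assumption that gives an upper bound.

```python
def reaction_time_bound_ms(delay_frames, frame_time_ms, decode_frames=1.0):
    """Upper bound on the still picture reaction time in milliseconds."""
    return (delay_frames + decode_frames) * frame_time_ms


def delay_in_original_frames(delay_frames, slow_motion_factor):
    """Presentation delay expressed in original frames of the stored stream."""
    return delay_frames / slow_motion_factor


bound = reaction_time_bound_ms(10, 40)
print(bound)                             # reaction time bound in ms
print(delay_in_original_frames(10, 3))   # slow motion factor 3
print(delay_in_original_frames(10, 12))  # slow motion factor above ten
```

  • The bound of 440 ms is below half a second. For a slow motion factor of 3 the delay corresponds to roughly 3.3 original frames, which is consistent with the fourth original picture being frozen, and for factors above ten it is less than one original frame.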
  • a table ID toggle 2800 within a B-frame may lead to an incorrect decryption of repeated B-frames for a stream type I. It has been disclosed that omitting the ECMs from all B-frames except the last repeated one might solve this problem. But at least part of the ECMs in the current B-frame has already been transmitted if this frame contains ECMs. This may be the case for all of the B-frames in an encrypted normal play stream and for the last repeated B-frame in an encrypted slow-forward stream. A first solution would be that the stop must be postponed to the next frame because a stop cannot be made on such a B-frame containing ECMs, if no further information about the ECMs is known.
  • Fig. 40 shows a stop on the last repeated B-frame 4000 of a slow-forward stream.
  • the presence or absence of a table ID toggle 2800 within the B-frame data is easily detected by a comparison of the table ID value of the first ECM 4001 and last ECM 4002 within the B-frame data 4000.
  • a third problem is related to the temporal reference and PTS of the previous anchor frame. These values depend on the number of B-frames that follow the anchor frame. This number of B-frames is changed in still picture mode to some undefined large value. But the temporal reference and PTS of the previous anchor frame cannot be changed because they have already been transmitted. For this reason, it was already disclosed that the switch from normal play to slow-motion should occur at the start of the next anchor frame. The same is, in fact, true when we switch from one slow motion factor to another, so also when we switch from normal play or slow-forward to still picture.
  • the STB uses the PTS to determine the moment of presentation. At the moment the PTS of the anchor frame is reached this frame will be displayed despite the fact that a B-frame with the same PTS is also available.
  • the philosophy being that in case of a conflict in the PTS the oldest frame will be used. It is expected that the B-frame with conflicting PTS will be skipped to avoid buffer problems.
  • the anchor frame is not flushed but kept in memory until a next anchor frame is received, leading to a correct decoding of the subsequent B-frames.
  • the result on the display screen is a frozen B-frame but with a flash of an anchor frame a short time after the still picture mode was entered.
  • the STB uses the PTS to determine the moment of presentation. At the moment the PTS of the anchor frame is reached the STB will display the B-frame with the identical PTS because this is the most recent frame.
  • the anchor frame is not displayed but kept in memory until a next anchor frame is received, thus enabling a correct decoding of the subsequent B-frames.
  • the result on the screen is a frozen B-frame as it was intended.
  • the STB does not use the PTS for presentation at all. It just displays the frames in sequence after reordering to the display order. This means that the B-frames are displayed one after the other and that an anchor frame is displayed when the next anchor frame is encountered in the transmitted sequence.
  • the frame display grid has a constant delay with respect to the DTS of the B-frames. Also here the result on the screen is a frozen B-frame as it was intended.
  • a picture resulting from an original anchor frame is frozen on the display screen.
  • the switching command 3900 is received during the transmission of an anchor frame 4103 as is depicted in Fig. 43.
  • the previous anchor frame 4306 will be displayed.
  • this frame is frozen on the screen. This can only be realized by the transmission of Bf-frames 2703 after the current anchor frame 4103.
  • the switching command 3900 is received during a pre-inserted Pe-frame 2404.
  • additional Pe-frames 4403 will be inserted after the current Pe-frame 2404, as shown in Fig. 44.
  • the switching command 3900 is received during a pre-inserted Bb-frame 2603, as shown in Fig. 45.
  • This Bb-frame 2603 results in the display of the previously transmitted anchor frame that should now be frozen on the screen.
  • the transmission of the slow-forward stream is continued to the start of the next anchor frame 4503.
  • Pe-frames 4403, equivalent to the previously described Pe-frames 2404, are inserted. Problems with the PTS are thus avoided.
  • the Bb-frames 2603 from the slow- forward stream as well as the Pe-frames 4403 inserted after these Bb-frames 2603 in the still picture stream 4500 lead to a repeated display of the same previously transmitted anchor frame 4306 and ultimately to the desired effect in the display stream 4502.
  • the switching command 3900 is received during a post-inserted Bf-frame 2703, as shown in Fig. 46.
  • This frame forces the display of the anchor frame 4606 previous to the anchor frame directly in front of the Bf-frames, i.e. anchor frame A(n-1).
  • This picture should then be frozen. This can only be achieved by the continued transmission of Bf-frames 2703.
  • the conflict at display position 4605 in the displayed picture line 4602 is now evident and the consequences are either accepted or alternatively the switch is delayed to the start of the next anchor frame.
  • Case A is shown in Fig. 47, which is a combination of Fig. 43 and Fig. 46.
  • the switching command 3900 is received during an anchor frame or the subsequent Bf-frames in switching period 4703.
  • the Bf-frames 2703 are then extended to an indefinite series.
  • a conflict will occur between the PTS of the anchor frame and the PTS of Bf-frame 4704 at reordered location 4705 and displayed picture location 4706.
  • the other frames 4306 will be displayed correctly.
  • Case B is shown in Fig. 48, which is a combination of Fig. 44 and Fig. 45.
  • the switching command 3900 is received during a switching period 4803 of pre-inserted empty frames 4804, of Bb or Pe type. In this case the transmission of the slow-forward stream is continued until the start of the next anchor frame 4503. From this point onwards an indefinite series of Pe-frames 4403 is transmitted. A conflict in the PTS is avoided in this way and the correct still picture frames 4306 are displayed.
  • In Fig. 49a, the previous anchor frame is preferably now displayed 4306 and frozen on the screen. This is accomplished by the transmission of an indefinite series of either Bb-frames 2603 or Pe-frames 2404 as the repeated frames 4403 (in Fig. 49a the Pe-frames 2404 version is shown) from the switching moment onwards, irrespective of the empty frame type used to construct the slow-forward stream. Since the use of Bb-frames 2603 would lead to a PTS conflict it may be preferable to make use of Pe-frames 2404, i.e. as shown in Fig. 49a. A discussion on the generation of a still picture stream would not be complete without also discussing the transition from the still picture mode back to normal play mode or a further trick-play mode, such as slow-forward mode, for example. This will now be discussed.
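  • Cases A and B can be summarised as a small decision table. The Python sketch below is an illustrative restatement of the text only: the frame kind strings, the function name and the returned field names are hypothetical, and real B-frames with picture data are deliberately left out because they are not covered by these two cases.

```python
def plan_still_picture_switch(current_frame_kind):
    """Choose the still picture strategy from the frame being transmitted.

    current_frame_kind is one of "anchor", "Bf", "Bb" or "Pe" (assumed labels).
    """
    if current_frame_kind in ("anchor", "Bf"):
        # Case A: extend the Bf-frames to an indefinite series; a PTS
        # conflict with the anchor frame has to be accepted.
        return {"switch_at": "immediately", "repeat_with": "Bf", "pts_conflict": True}
    if current_frame_kind in ("Bb", "Pe"):
        # Case B: continue the slow-forward stream to the start of the next
        # anchor frame, then transmit an indefinite series of Pe-frames.
        return {"switch_at": "next_anchor", "repeat_with": "Pe", "pts_conflict": False}
    raise ValueError("frame kind not covered by cases A and B: %r" % (current_frame_kind,))


for kind in ("anchor", "Bf", "Bb", "Pe"):
    print(kind, plan_still_picture_switch(kind))
```

  • As noted for Fig. 46, the conflict in case A can alternatively be avoided by delaying the switch to the start of the next anchor frame; that option is not modelled here.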
  • the transmitted stream consists of an indefinite series of frames, which can be of the following types: 1. B-frames;
  • the last repeated B-frame 5000 does not contain any ECMs 3003 and therefore also no table ID toggle 2800.
  • the preferred solution may be to switch directly to still picture mode since the user will see a still picture of the frame being displayed when the switch to still picture mode 3900 was made.
  • the current frame repeated may then be the last repeated B-frame 5000 and is shown in stream 5003.
  • an equally compliant stream can be provided by postponing the actual switch to still picture mode from the detection of the switch to still picture mode 3900 to the start of the next frame 5001, i.e. at location 5002, whereby the still picture mode will consist of a sequence of next frames 5001 as shown in stream 5004.
  • the encrypted data stream can be processed based on the presence, absence or content of ECMs in the next frame.
  • Elements of the devices shown in Fig. 31, Fig. 32 and Fig. 33 would be suitable for correctly processing such a stream and correcting any problems that may result from the repeated B-frame data of the next frame 5001. By taking such measures a further postponement of the actual initiation of still picture mode may not then be necessary.
  • In Fig. 50b a situation is shown wherein the switch to still picture mode 3900 is received during the processing or transmission of a frame 5010 comprising ECMs 3003. In this case, the ECMs comprised within the last repeated B-frame 5010 have to be detected.
  • the fail-safe option is to postpone the actual switch to still picture mode from the detection of the switch to still picture mode 3900 to the start of the next frame 5001, i.e. to location 5002, whereby the still picture mode will consist of a sequence of next frames 5001 as shown in stream 5012.
  • the preferable option is to immediately perform the switch to still picture mode, as shown in stream 5011.
  • In Fig. 50c a situation is shown wherein the switch to still picture mode 3900 is received during the processing or transmission of a frame 5020 comprising ECMs 3003 and a table ID toggle 2800.
  • the ECMs 3003 and the table ID toggle 2800 comprised within the last repeated B-frame 5020 have to be detected.
  • Upon detection of ECMs 3003 and a table ID toggle 2800 in the last repeated B-frame 5020 it is no longer possible to make an immediate switch to still picture mode, if no further information is known. If this were to be done, then stream 5021 would be the result.
  • the preferred option may be to postpone the actual switch to still picture mode from the detection of the switch to still picture mode 3900 to the start of the next frame 5001, i.e. to location 5002, whereby the still picture mode will consist of a sequence of next frames 5001 as shown in stream 5022. Since no portion of the next frame 5001 has yet been transmitted the encrypted data stream may be processed based only on the presence, absence or content of ECMs in the next frame and may be processed with devices, or elements thereof, as shown in Fig. 31, Fig. 32 and Fig. 33. A further postponement of the actual initiation of still picture mode may again then not be necessary.
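  • The choices described for Fig. 50a to Fig. 50c can be condensed into a small decision function. The sketch below is a hedged illustration and not the detection unit itself: it assumes that the table IDs of the ECMs found in the current B-frame are available as a list, and it detects a toggle by comparing the first and last table ID as described earlier. The function name and the return strings are hypothetical.

```python
def decide_still_picture_switch(ecm_table_ids):
    """Decide when to enter still picture mode on the current B-frame.

    ecm_table_ids lists the table IDs of the ECMs in the frame, in order.
    """
    if not ecm_table_ids:
        # Fig. 50a: no ECMs, so the current frame can be repeated at once.
        return "immediate"
    if ecm_table_ids[0] == ecm_table_ids[-1]:
        # Fig. 50b: ECMs but no table ID toggle; an immediate switch is the
        # preferable option (postponing remains the fail-safe alternative).
        return "immediate"
    # Fig. 50c: a toggle inside the frame and no further information known.
    return "postpone_to_next_frame"


print(decide_still_picture_switch([]))
print(decide_still_picture_switch([0x80, 0x80]))
print(decide_still_picture_switch([0x80, 0x81]))
```

  • The first-versus-last comparison assumes at most one toggle per frame; a frame with two toggles would return to the starting table ID and go undetected by this simple check.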
  • Fig. 50c can be further elaborated upon and indeed further improved upon with respect to the goal of responding as quickly as possible to the user switching command.
  • In Fig. 50d the situation is presented whereby the switch to still picture mode 3900 is received subsequently to the transmission or processing of a table ID toggle 2800 in the ECMs 3003.
  • a switch can still be performed if the ECMs may be filtered based upon the respective table ID toggle 2800, such that no further unwanted table ID toggle 2800 occurs in the subsequently repeated current frames 5032.
  • the required filtering is shown in stream 5031.
  • each repeated current B-frame 5032 has the ECMs 3003 occurring prior to the table ID toggle 2800 selectively identified and deleted.
  • an increase in the rate of ECMs 3003 processed by the smart-card is avoided because the rate of toggling of the table ID parameter is kept to a level approximately the same as the original encrypted data stream.
  • Elements of the devices of Fig. 31, Fig. 32 and Fig. 33 would be suitable for this purpose.
  • This scenario is especially suited for type II encryption systems comprising one control word per ECM. For a type I system postponement of the switch to still picture is the correct option.
  • A further scenario is shown in Fig. 50e.
  • the switch to still picture mode 3900 is received prior to a table ID toggle 2800 in the ECMs 3003 of the last repeated B-frame 5020.
  • a switch can, again, still be performed if the ECMs may be filtered based upon the respective table ID toggle 2800, such that no further unwanted table ID toggle 2800 occurs in the subsequent repeated current frames 5040.
  • the required filtering is shown in stream 5041.
  • each repeated current B-frame 5040 has the ECMs 3003 after the table ID toggle 2800 removed.
  • an increase in the rate of ECMs 3003 processed by the smart-card is again avoided by removing the table ID toggles 2800.
  • 50e is valid, i.e. the switch to still picture mode 3900 is received prior to a table ID toggle 2800 in the ECMs 3003 of the last repeated B-frame 5020.
  • a switch can, again, still be performed if all of the ECMs may be filtered from the switch to still picture mode 3900, as shown in the partial current frame 5060 and all of the ECMs are also filtered from each repeated current frame 5061 including the last repeated current frame 5062, i.e. all ECMs after the switch are filtered.
  • an increase in the rate of ECMs 3003 processed by the smart-card is again avoided by removing all ECMs and therefore also the table ID toggles 2800.
  • A still further scenario is shown in Fig. 50h.
  • the switch to still picture mode 3900 is received prior to a table ID toggle 2800 in the ECMs 3003 of the last repeated B-frame 5020.
  • a switch can, again, still be performed if all of the ECMs may be filtered from the switch to still picture mode 3900, as shown in the partial current frame 5060 and all of the ECMs are also filtered from each repeated current frame 5061.
  • the last repeated current frame 5050 is treated as a special case and contains all ECMs of the original current frame 5020. In such a situation an increase in the rate of ECMs 3003 processed by the smart-card is again avoided by removing the table ID toggles 2800.
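  • The filtering variants applied to the repeated current frames in the still picture scenarios can be sketched with one helper. This is an illustrative model only: packets are hypothetical (kind, value) tuples in which an ECM carries its table ID, the mode names are invented for the example, and at most one table ID toggle per frame is assumed.

```python
def filter_still_frame_ecms(packets, mode):
    """Remove ECMs from a repeated current frame according to the chosen mode.

    mode: "drop_pre_toggle"  removes the ECMs before the toggle,
          "drop_post_toggle" removes the ECMs after the toggle,
          "drop_all"         removes every ECM.
    """
    if mode not in ("drop_pre_toggle", "drop_post_toggle", "drop_all"):
        raise ValueError("unknown mode: %r" % (mode,))
    ecm_ids = [value for kind, value in packets if kind == "ECM"]
    if mode == "drop_all" or not ecm_ids:
        wanted = None            # no ECM table ID is kept
    elif mode == "drop_pre_toggle":
        wanted = ecm_ids[-1]     # keep only the post-toggle table ID
    else:
        wanted = ecm_ids[0]      # keep only the pre-toggle table ID
    return [(k, v) for k, v in packets if k != "ECM" or v == wanted]


frame = [("ECM", 0x80), ("VID", "a"), ("ECM", 0x81), ("VID", "b")]
for mode in ("drop_pre_toggle", "drop_post_toggle", "drop_all"):
    print(mode, filter_still_frame_ecms(frame, mode))
```

  • In each mode the repeated frames no longer contain a table ID toggle, so repeating them does not raise the rate of different ECMs presented to the smart card. Where the last repeated frame is treated as a special case, it would simply be passed through unfiltered.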
  • a data processing device 5100 for processing an encrypted MPEG2 data stream including video content (alternatively audio content) will be described.
  • the processing device 5100 it is possible to perform the various method steps as described referring to Fig. 18 through Fig. 50.
  • Account is taken of decryption messages, i.e. ECMs, present in the encrypted stream to take optimal decisions based upon the amount of information known with respect to the ECMs, so as to provide a processed encrypted data stream that is as compliant as possible with conditional access systems in common use.
  • Fig. 51 shows a hard disk drive 5101 on which encrypted audiovisual content, i.e. an encrypted data stream 5110, to be reproduced is stored.
  • the encrypted data stream 5110 can be received directly from a digital satellite, a digital cable signal, a Set Top Box, a digital terrestrial television broadcast or from the Internet using Internet Protocol broadcasting. More examples of suitable sources of audio/video streams are also possible.
  • the processing device 5100 may be controlled by a control unit 5102, such as a central processing unit (CPU), which, in turn, can be controlled by a human user by means of a user interface 5103.
  • a human user may control the operation of the processing device 5100, for instance, a user may initiate a normal play mode or a trick-play operation mode like a slow-forward mode, still picture mode or step picture mode.
  • a still picture mode signal 3900 may be detected.
  • the still picture mode signal 3900 may be communicated to a detection unit 5104 and/or a replication unit 5105 directly or via the control unit 5102 (not shown in Fig. 51).
  • audiovisual content in an encrypted form may be sent from the hard disk drive 5101 to the detection unit 5104, which is capable of detecting the ECMs 3003 comprised within the encrypted data stream 5110.
  • the detection unit 5104 may pass the encrypted data stream 5110 available at its input on to the replication unit 5105 for further processing.
  • the detection unit 5104 may provide the replication unit 5105 with a signal 5111 indicating that the replication unit 5105 should begin immediately with a transition to still picture mode, as an example, or should postpone the transition to still picture mode until the next frame comprised within the encrypted data stream 5110.
  • A postponement of the replication to the next frame, i.e. a subsequent frame, may imply that the current frame wherein the transition to still picture mode signal 3900 was received is passed through the replication unit 5105 without replication. It should be apparent that the replication unit 5105 performs replication of the still picture mode frames; however, the exact frame replicated is a result of what the detection unit 5104 actually detects.
  • the signal 5111 could also be communicated via the control unit 5102 (not shown in Fig. 51), which also links the detection unit 5104 indirectly to the replication unit 5105 via the common system bus 5113. To provide the signal 5111 the detection unit 5104 must decide on what action is necessary. In an exemplary embodiment the detection unit 5104 may make the decision based upon whether any ECMs 3003 are detected at all within the B-frame data 5000 (or 5010) to be repeated.
  • the replication unit 5105 may replicate portions, which may be whole or split frames in the encrypted data stream 5110, a number of times in accordance with a predetermined replication rate, for example for slow forward or still picture mode, which may be defined or determined by the control unit 5102 and/or by a user operating the user interface 5103.
  • a processed encrypted data stream 5112 may then be supplied to a reproduction unit 5106.
  • the reproduction unit 5106 may further comprise a monitor having loudspeakers, a television, a set top box, etc, wherein reproduction of this content is possible under control of the control unit 5102 and/or under control of the user via the user interface 5103. It is possible that a further decryption unit (not shown) is foreseen within the reproduction unit 5106 so as to decrypt the processed encrypted data stream 5112 for playback.
  • the detection unit 5104 may be adapted to process individual frames of the encrypted data stream 5110, which may be intra-coded frames (I-frames), forward predictive frames (P-frames) or bi-directional predictive frames (B-frames).
  • the processed content may be a data stream of video data and/or audio data.
  • the reproduction unit 5106 may be capable of reproducing the data stream connected to the replication unit 5105.
  • the encrypted data stream 5110 may be an encrypted MPEG2 data stream.
  • ECMs 3003 are, in fact, related to a cryptographic period 1403 which may encompass multiple frames. It is the relationship of these cryptographic periods 1403 with the frame periods that makes trick-play on encrypted data streams such a complex topic.
  • control unit 5102 may be under control of a human user operating a user input/output interface 5103, for example, by using a user interface (UI) which may include a display, input means like a remote control, a keypad, a joystick, a trackball, or the like and may allow a user to specify a mode according to which she or he wishes to reproduce audio/video content stored on the hard disk drive 5101. For instance, the user may adjust, via the user input/output unit 5103, parameters like playback speed, a trick-play reproduction mode, equalization, etc.
  • the detection unit 5104 is preferably adapted to also detect a toggle in the table ID parameter of the encrypted data stream. Such table ID toggles 2800 indicate where one set of encryption messages transitions to a second set of encryption messages.
  • a set of encryption messages may be taken to mean a set of encryption messages that relate to the same set of keys in a conditional access system.
  • Such transitions have consequences for correct processing of the encryption messages by the smartcard in the conditional access system. This measure is particularly useful since it provides further information to decide upon whether the switch to still picture mode 3900 can be safely initiated immediately or whether it must be postponed to a following frame, as described earlier with reference to Fig. 50b and Fig. 50c.
  • table ID toggles 2800 and the user initiated switch to still picture mode 3900 it is possible to make further optimal decisions about whether the switch to still picture mode should be immediately initiated or postponed.
  • Such situations require the relative times, or the relative positions in the stream of the relevant events and have been described earlier with respect to Fig. 50d, Fig. 50e, Fig. 50f, Fig. 50g and Fig. 50h.
  • a device 5200 which can selectively filter ECMs 3003 which would cause a decryption system to become overloaded due to a high rate of essentially different ECMs 3003, for example, resulting after an increase in the rate of table ID toggles 2800.
  • the device 5200 provides an embodiment that can process an encrypted data stream comprising repeated portions repeated by the replication unit 5105, i.e. repeated B-frame data, into a processed encrypted data stream with a rate of change of decrypting messages that is substantially equivalent to that of the encrypted data stream in its original form, i.e. when originally broadcast, transmitted or delivered via another suitable means, such as on a data storage carrier, like an optical drive, for example.
  • the replication unit 5105 may provide a replication signal 5203 to an input 3104 of selection unit 3102 indicating portions of the encrypted data stream that have been repeated with respect to the original form of the encrypted data stream 5110.
  • the selection unit 3102 was described in detail earlier in this specification in the text related to Fig. 31.
  • the detection unit 5104 detects ECMs, as described above, and may communicate the detected ECMs to the selection unit 3102 directly or via the control unit 5102 via system bus 5113.
  • the selection unit 3102 identifies at least one of the ECMs as having been repeated subsequent to the creation of the original encrypted data stream 5110 transmitted by the original content provider and may use the replication signal 5203 for this purpose.
  • a deletion unit 3103 deletes the selected encryption messages from the encrypted data stream using information from the selection unit 3102.
  • the deletion may also be envisaged as a filtering of the ECMs.
  • the devices, or elements thereof, shown in Fig. 31, Fig. 32 and Fig. 33 are also applicable to the embodiment shown in Fig. 52.
  • the detection unit 5104 may further delay the switch to the slow-forward or still picture mode by a delay time which corresponds to the time difference between the point of time of switching and a point of time of starting a next frame in the sequence of the plurality of frames.
  • Such a next frame may be a B-frame or an anchor frame, which anchor frame may then be an I-frame or a P-frame (in the nomenclature of MPEG).
  • the replication unit 5105 may correct the temporal reference between these frames.
  • Step picture mode generally means that the device or system is in still picture mode and that the user wants to step forward or backward to another frame and then resume the still picture mode. It is, however, quite feasible to enter step picture mode from other reproduction modes, such as, normal play mode, slow forward mode, or an equivalent reverse direction mode. It was already noted in this specification that reverse modes based on all frames or at least on frames other than the I-frames are practically unfeasible in MPEG due to the asymmetry in the prediction used in the MPEG encoding process. This is therefore also true for the step backward mode. So it is only practically feasible to step backward to the previous I-frame and then from I-frame to I-frame.
  • Case 1 covers the case where the same B-frame data is transmitted over and over again, i.e. use is made of repeated B-frame data.
  • the previous I-frame in display order will be the previous I-frame in the normal play transmission order, but this is not true if this I-frame was the last transmitted anchor frame. In that case, the I-frame needed is the I-frame before that in the normal play stream.
  • the I-frame to be displayed after the step backward command has to be sent to the decoder followed by an indefinite series of Pe-frames.
  • the I-frame is not followed by any B-frames in this case, and its temporal reference has to be set to zero and the PTS is Delta higher than its DTS, where Delta is a DTS increment that corresponds to one frame time.
  • the temporal reference should simply be incremented.
  • the parameter Delta is equal to the number of 90 kHz periods in one frame time because the DTS is linked to the PCR base.
  • the DTS of the I-frame is Delta higher than the DTS of the previously transmitted frame.
  • the temporal reference and DTS/PTS for the subsequent Pe-frames may be calculated using normal MPEG encoding rules taking into account Delta for each inserted frame.
  • Fig. 54 depicts the situation where the first step backward is made on a still picture from a B-frame.
  • the numbering conventions for individual frames are identical to those used earlier in this specification.
  • the PTS of the B-frame (shown in Fig. 54) or Bf-frame (not shown in Fig. 54) previous to the I-frame is equal to its DTS. This means that the distance between the PTS of this B-frame 5403 and the PTS of the I-frame 5404 is equal to two frames. In other words, some other frame 5405 should be displayed between these two.
  • the decoder might still display the anchor frame at this position, or it might repeat the display of the last B-frame 5403 to fill the gap.
  • a repeated last B-frame does not have a disturbing effect, but any other action of the decoder will lead to an incorrect first frame after the step backward. This may still be acceptable because it is limited to only one incorrect frame.
  • the I-frame previous to the last transmitted one is sent to the decoder followed by an indefinite series of Pe-frames. After the first step backward, the device or system is always in a situation where an indefinite series of Pe-frames is transmitted (case 3).
  • step backward there is no undefined first frame after the step backward. This means that an incorrect first frame can only occur after the first step backward.
  • the decryption of the signal also has to be considered. This is, of course, no issue for the Pe-frames that are always in plaintext but it is an issue for the I-frames that are expected to be (largely) encrypted, for example, in a hybrid stream.
  • the ECM handling as described for fast-reverse should also be applied here. Switching effects have to be considered before the first step backward is made.
  • the explanation of the step backward mode was based on a step size equal to the distance between the I-frames or in other words on a step size equal to one GOP. It will be clear that larger step sizes equal to an integer multiple of GOPs can also be used.
  • the maximum step size and step frequency are limited by the timely availability of CWs for the decryption process.
  • step forward mode a number of possibilities remain open.
  • Three types of step forward mode can, in fact, be distinguished: I) Frame based; II) Anchor frame based; III) I-frame based.
  • Two problems arise when the next frame is an anchor frame, as shown in Fig. 55.
  • the first problem is an undefined first frame 5503 after the step forward, as shown in the display picture line 5502.
  • the second problem is that the temporal reference and PTS for the anchor frame 5504 cannot be calculated with an infinite slow motion factor. This is due to the fact that the number of B-frames following this anchor frame may be infinite and depends upon the user. So the calculation of temporal reference and PTS should be based on some chosen slow motion factor unequal to infinity. This means that a PTS conflict as described for still picture will occur for one of the subsequent B-frames 5505, at display location 5506, in this example.
  • a second type of step forward mode is shown which is an anchor frame based one.
  • step backward mode it has first to be determined which anchor frame should be displayed next.
  • the first step picture mode step 5606 is therefore a special case.
  • the frame selected depends on the cases mentioned earlier.
  • cases 1 and 2 where the still picture stream consists of a series of B- or Bf-frames, it is the previous anchor frame in the recorded normal play stream.
  • the case 1 situation is depicted in detail in Fig. 56.
  • the previous anchor frame in the recorded normal play stream is I-frame, I 4 , 5603 and this is inserted into the processed stream 5600.
  • a succession of empty P type frames 5605, i.e. Pe-frames, is inserted into the processed data stream 5600.
  • the displayed pictures 5602, as shown in Fig. 56 show only a single artifact at display location 5503. Thereafter, no further artifacts are shown and the processed data stream therefore is more compliant.
  • the comparison between display stream 5502 of frame based step picture mode shown in Fig. 55 and the display stream 5602 of anchor frame based step picture mode shown in Fig. 56 shows clear improvements in the compliance of the processed data stream. In particular the artifacts at display locations 5506, 5507 and 5508 no longer occur.
  • the device selects the next anchor frame that would have been displayed in the original data stream, in the example of Fig. 56, it is P-frame, P 7 , 5604.
  • a succession of empty Pe-frames 5605 is used to create the frozen picture of the step picture mode.
  • the first anchor frame chosen for the first step picture mode frame is the next anchor frame in the normal data stream. Therefore, the anchor frame to be displayed next is sent to the decoder by inserting it into the processed data stream followed by an indefinite series of Pe-frames.
  • the PTS of this anchor frame is equal to its DTS increased by the frame time, Delta, because it is not followed by any B-frame.
  • the DTS is as usual equal to the DTS of the previous frame increased by the frame time, Delta.
  • the temporal references are strictly coupled to the PTS.
  • the temporal reference of this anchor frame is equal to the temporal reference of the preceding frame increased by 1. If on the other hand the preceding frame is a B-frame of any type, the increment is 2 instead of 1. It is the result of this that causes the single artifact 5503 and it stems from the fact that an additional gap of one frame exists again between the temporal references and PTS values of the anchor frame 5603 and the last B-frame 5607. This gap has to be filled by the decoder through the display of some other frame. The most probable candidates are the previous anchor frame or the last B-frame 5607. This has no disturbing effect because either the previous anchor frame is displayed one frame time earlier or the last B-frame 5607 is displayed one frame time longer.
  • the temporal reference and DTS/PTS for the Pe-frames are again calculated with normal MPEG encoding rules taking into account each inserted frame.
  • the original anchor frame in the normal play stream next to the last transmitted one is sent to the decoder followed by an indefinite series of Pe-frames.
  • the system is always in a situation where an indefinite series of Pe-frames is transmitted.
  • this second type of step forward mode fits very nicely to the step backward mode because the still picture stream always consists of a series of Pe-frames after the first step in both cases.
  • a step backward cannot be followed by a frame based step forward because the B-frame will reference the wrong anchor frames in that case.
  • the transmitted data stream 5700 comprises repeated B-frame data, B 14, 5707 in the still picture mode time period 5708.
  • the first step forward signal is received at location 5606.
  • the previous anchor frame is therefore, I 16 , 5704, in this example.
  • the succession of Pe-frames 5605 is also shown.
  • the transmitted data stream 5701 comprises repeated Bf-frames 5710 in the still picture mode time period 5708.
  • the first step forward signal is again received at location 5606.
  • the previous anchor frame in case 2 is a P-frame, P 1 C,, 5705 due to the use of repeated Bf-frames 5710.
  • the succession of Pe-frames 5605 is again shown in stream 5701.
  • the transmitted data stream 5702 comprises repeated Pe-frames 5711 to create the still picture mode in the still picture mode time period 5708.
  • the first step forward signal is again received at location 5606.
  • the anchor frame required for the first step picture mode frame in case 3 is again a P-frame, but this time it is frame P22, 5706.
  • Frame P22, 5706 is the next following anchor frame in the normal data stream sequence which can be seen in the representation of the normal data stream 5703 in Fig. 57.
  • the succession of Pe-frames 5605 is again shown in stream 5702.
  • the third type of step forward mode possible is I-frame based. Again it has to be determined first what I-frame should be displayed next. Following the methods already disclosed for frame based and anchor frame based step picture mode the I-frame required for the different cases can also be determined in a general way.
  • the normal play data stream 5800 is shown with the first I-frame 5802, the second I-frame 5803 and the third I-frame 5804 of subsequent picture steps in I-frame step picture mode. After reordering, the reordered data stream 5801 results.
  • the transmitted data stream 5900 comprises repeated B-frame data, B14, 5707 in the still picture mode time period.
  • the first step forward signal is received at location 5606.
  • the first frame to display is the previous I-frame and is therefore, I 16 , 5704, in this example.
  • the succession of Pe-frames 5605 is also shown.
  • the transmitted data stream 5901 comprises repeated Bf-frames 5710 in the still picture mode time period.
  • the first step forward signal is again received at location 5606.
  • the I-frame to be displayed in case 2 is the following I-frame, I 28 , 5907.
  • the succession of Pe-frames 5605 is again shown in stream 5901.
  • the transmitted data stream 5902 comprises repeated Pe-frames 5711 to create the still picture mode in the still picture mode time period.
  • the first step forward signal is again received at location 5606.
  • the I-frame required for the first step picture mode frame in case 3 is again I-frame, I 28 , 5907.
  • Frame I 28 , 5907 is, as for case 2, the next following I-frame in the normal data stream sequence which can be seen in the representation of the normal data stream 5906 in Fig. 59.
  • the succession of Pe-frames 5605 is again shown in stream 5902.
  • the I-frame based step forward is, in fact, identical to the I-frame based step backward already described. It should be clear that this is the only step forward mode that allows for larger step sizes.
  • the step size is then equal to an integer multiple of GOPs.
  • step forward seems to be the best choice based on the technical quality on the one hand and the step size on the other hand.
  • the step size is fixed to the anchor frame distance in this case and a larger step size cannot be chosen.
  • a mixture of anchor frame based and I-frame based step forward depending on the wanted step size is also possible.
  • step backward and step forward in relation to the ECM handling.
  • step forward mode the normal sequence of the transmitted frames is not disturbed. Only some frames are skipped in certain modes. It is expected that the original ECMs present in the stream can still be used in the step forward mode. But also here the maximum step size and step frequency are limited for encrypted streams by the timely availability of CWs for the decryption process.
  • a data processing device 6000 for processing an MPEG2 data stream including video content (alternatively audio content) will be described.
  • the device is an exemplary embodiment of the invention.
  • the processing device 6000 it is possible to perform the various method steps as described referring to Fig. 18 through Fig. 59.
  • the processing device 6000 may be controlled by a central processing unit (CPU) or control unit 5102 which, in turn, can be controlled by a human user by means of a user interface 5103.
  • the device 6000 processes a data stream 6010, which may be an MPEG2 data stream, or any other data stream wherein anchor frames are employed, such as MPEG4.
  • the data stream 6010 is retrieved, when a corresponding control signal is sent from the control unit 5102 to the hard disk drive 5101 via a system bus 5113, from a storage device 5101, which may be a hard disk drive, an optical storage device, such as a CD, DVD, a flash storage device etc.
  • the data stream 6010 may also enter the device from a traditional broadcast channel, such as terrestrial television, digital cable or digital satellite.
  • Newer forms of transmission are also possible sources for the data stream 6010, such as transmission via the Internet or a mobile transmission technology as employed in mobile phone systems.
  • a digital interface such as Ethernet or IEEE 1394, also known as "Firewire", may also be used.
  • the actual source of the data stream 6010 is not a limiting factor in implementing an embodiment of the invention.
  • a user interacts with the device using the user interface 5103.
  • This user interface can be via a remote control, a keyboard/mouse combination, or other known interaction means. Feedback can be given to the user via a display, such as a television, LCD display or monitor, for example. Other possibilities known to the skilled person are not excluded.
  • a user operating the device 6000 can interact with the device 6000 to initiate a step picture mode.
  • the user interface 5103 may then send a step forward signal 6017 to a detection unit 6001, which detects a mode change from a first reproduction mode to a second reproduction mode.
  • the first reproduction mode may be a still picture mode, also known as freeze frame mode, a slow-forward mode or a normal play mode, as typical examples.
  • the second reproduction mode may be a step picture mode.
  • the step forward signal 6017 may also be communicated from the user interface 5103 to the detection unit 6001 via the control unit 5102 and the system bus 5113.
  • the detection unit 6001 outputs a mode switch signal 6015 when the mode change is detected.
  • the mode switch signal 6015 may trigger an optional switching means (not shown) to switch between a bypass path (not shown) or a processed data stream path comprising a determination unit 6003 and an insertion unit 6004.
  • the switching means is optional since both the determination unit 6003 and the insertion unit 6004 may also be implemented as essentially pass through devices when no stream processing is required.
  • the determination unit 6003 may receive the data stream 6010 on an input 6002.
  • the mere act of receiving a data stream, such as the data stream 6010 may initiate the determination unit 6003 to determine the first anchor frame necessary for the correct step picture mode frame, though it is also possible to initiate the determination unit 6003 to determine the first anchor frame in other ways such as via a control signal sent via a system bus 5113 or via the mode switch signal 6015.
  • the actual first anchor frame determined depends upon the form that the incoming data stream 6010 takes. The various forms were described earlier with reference to Fig. 59, along with the correct first anchor frame to be used, in each case.
  • the first anchor frame that is determined by the determination unit 6003 is indicated via a first anchor frame control signal 6011 to the insertion unit 6004.
  • the insertion unit may receive the first anchor frame itself, i.e. as a copy of the first anchor frame data, via the first anchor frame control signal 6011. It is also possible that only a reference to the first anchor frame is sent via first anchor frame control signal 6011 from the determination unit 6003 to the insertion unit 6004 and that the first anchor frame itself is received by a separate data path 6018 in the normal data stream signal path.
  • the insertion unit 6004 is adapted to insert the first anchor frame into the data stream 6010 as a first step in producing a processed data stream 6013.
  • the insertion unit 6004 may be further arranged to insert a succession of frames into the processed data stream 6013 to cause a repetition of the first anchor frame on a display after decoding by reproduction unit 5106.
  • the frames necessary to repeat the first anchor frame may be bidirectional predicted frames, i.e. MPEG B-frames, a subset of MPEG B-frames, such as Bf- frames or predicted anchor frames, such as MPEG P-frames.
  • the succession of inserted frames may be empty frames.
  • a mixture of frame types is also possible. Many options have been discussed earlier in this specification with respect to still picture mode.
  • the device 6000 In responding to a subsequent picture step forward signal 6017 the device 6000 is in a known state. In this case, the device 6000 remains in the second reproduction mode and the detection unit 6001 does not need to change the optional switching means.
  • the detection unit 6001 does, however, need to indicate that a picture step forward signal 6017 was received and indicates this to the determination unit 6003 and the insertion unit 6004 via a step control signal 6016.
  • the determination unit 6003 determines the subsequent anchor frame as the next following anchor frame after the first anchor frame that would have occurred in the incoming data stream 6010.
  • the insertion unit 6004 then inserts the subsequent anchor frame into the processed data stream 6013, followed by a further succession of anchor frames.
  • the insertion unit 6004 can supply a reproduction unit 5106 with the processed data stream 6013 via an output 6005.
  • a complementary half of the optional switching means may also be placed in the output path and may again be controlled by the detection unit 6001 or by the control unit 5102 via system bus 5113.
  • the reproduction unit 5106 may be located internally to the device 6000 and therefore be under control of the control unit 5102 or it may be external to the device 6000, i.e. in a remote decoder arrangement. In the latter case the reproduction unit 5106 does not, of course, need to be connected to the device system bus 5113. Therefore, the system bus 5113 connecting the reproduction unit 5106 and the control unit 5102 is entirely optional.
  • A further embodiment of the invention is shown in Fig. 61.
  • the device 6100, shown in Fig. 61 is similar to the device 6000 shown in Fig. 60, but has a correction unit 6101, which is inserted into the processed data stream path after the insertion unit 6004.
  • the correction unit 6101 may correct any temporal parameters of the frames inserted by the insertion unit 6004 to further improve the compliance of the processed data stream 6013 to a standard such as one of the MPEG standards.
  • Temporal parameters of relevance are the temporal reference, the Decoding Time Stamp, or DTS, and the Presentation Time Stamp, or PTS.
  • the corrections required have been described earlier in the description of the step picture mode when referring to Fig. 53 and Fig. 56.
  • the correction unit 6101 may operate under the control of control unit 5102 via the system bus 5113.
  • a third embodiment of the invention is shown in Fig. 62.
  • the device 6200 shown in Fig. 62, is similar to the device 6000 of Fig. 60 and the device 6100 of Fig. 61, but has a frame detector unit 6201 inserted into the incoming data stream path.
  • the frame detector unit 6201 is capable of detecting the form of the incoming data stream 6010.
  • the determination unit 6003 may need to determine a different first anchor frame from a case where the data stream 6010 is constructed from a series of empty Pe-frames.
  • the frame detector unit 6201 can determine repeated bi-directionally predicted frames and/or repeated empty predicted frames.
  • the frame detector unit 6201 may communicate a frame detection signal directly to the determination unit 6003 or via the control unit 5102 via system bus 5113.
  • the determination unit 6003 may use the frame detection signal and the mode switch signal 6015 to determine the first anchor frame according to the situations given in Fig. 59 and the related description.
  • the frame detector unit 6201 may operate under the control of control unit 5102 via the system bus 5113.
  • the insertion unit 6004 operates in a similar manner to that of device 6000 and device 6100.
  • the correction unit 6101 operates in a similar manner to that of device 6100.
  • the correction unit 6101 has an output 6005 that may be used to provide the processed data stream 6013 to a reproduction unit 5106 via the optional switching means.
  • the reproduction unit 5106 may be a remote decoder connected via a digital interface to the device 6200.
  • the digital interface may be any suitable interface for conveying audio-video signals, such as MPEG signals.
  • any of the embodiments described comprise implicit features, such as, an internal current supply, for example, a battery or an accumulator.
  • any reference signs placed in parentheses shall not be construed as limiting the claims.
  • the words “comprising” and “comprises”, and the like, do not exclude the presence of elements or steps other than those listed in any claim or the specification as a whole.
  • the singular reference of an element does not exclude the plural reference of such elements and vice-versa.
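
The timestamp rules given above for the I-frame based step backward (temporal reference set to zero, DTS one Delta above the previously transmitted frame, PTS one Delta above the DTS) can be illustrated with a minimal sketch. The names `StampedFrame`, `frame_delta` and `stamp_step_backward` are hypothetical and do not appear in the specification. It is assumed here that each subsequent Pe-frame simply advances the DTS by Delta and the temporal reference by one, and the indefinite series of Pe-frames is cut to a finite count for illustration.

```python
from dataclasses import dataclass

TS_WRAP = 1 << 33   # PTS and DTS are 33-bit counts of 90 kHz periods
TR_WRAP = 1 << 10   # the MPEG-2 temporal reference is a 10-bit field


@dataclass
class StampedFrame:
    kind: str                # "I" for the anchor, "Pe" for an empty P-frame
    temporal_reference: int
    dts: int
    pts: int


def frame_delta(frame_rate: float) -> int:
    """Delta: the number of 90 kHz periods in one frame time."""
    return round(90_000 / frame_rate)


def stamp_step_backward(prev_dts: int, frame_rate: float, n_empty: int) -> list:
    """Stamp the I-frame sent after a step backward and n_empty Pe-frames.

    prev_dts is the DTS of the previously transmitted frame.
    Assumption: every Pe-frame advances DTS by Delta and the temporal
    reference by one, with PTS = DTS + Delta (no B-frames follow).
    """
    delta = frame_delta(frame_rate)
    dts = (prev_dts + delta) % TS_WRAP
    # The I-frame is not followed by B-frames: temporal reference 0,
    # PTS one frame time later than its DTS.
    frames = [StampedFrame("I", 0, dts, (dts + delta) % TS_WRAP)]
    for i in range(1, n_empty + 1):
        dts = (dts + delta) % TS_WRAP
        frames.append(
            StampedFrame("Pe", i % TR_WRAP, dts, (dts + delta) % TS_WRAP)
        )
    return frames
```

For a 25 Hz stream, `frame_delta(25)` gives a Delta of 3600 periods, so an I-frame following a frame with DTS 900000 would receive DTS 903600 and PTS 907200 under these assumptions.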

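The anchor frame based step forward described in the list above selects its first anchor frame according to the form of the still picture stream: the previous anchor frame in the recorded normal play stream for cases 1 and 2 (repeated B- or Bf-frames), and the next anchor frame for case 3 (repeated Pe-frames). The following is a minimal sketch of that selection only. The function names, the tuple representation of frames and the example stream are assumptions made for illustration and are not taken from the figures.

```python
ANCHOR_KINDS = ("I", "P")


def previous_anchor(stream, pos):
    """Index of the closest anchor frame before pos, or None."""
    for i in range(pos - 1, -1, -1):
        if stream[i][0] in ANCHOR_KINDS:
            return i
    return None


def next_anchor(stream, pos):
    """Index of the closest anchor frame after pos, or None."""
    for i in range(pos + 1, len(stream)):
        if stream[i][0] in ANCHOR_KINDS:
            return i
    return None


def first_step_anchor(stream, pos, case):
    """Pick the anchor frame for the first step forward.

    stream: frames as (kind, number) tuples in recorded transmission order.
    pos:    index of the frame the still picture was built from (cases 1, 2)
            or of the last transmitted anchor frame (case 3).
    """
    if case in (1, 2):      # still picture made of repeated B- or Bf-frames
        return previous_anchor(stream, pos)
    if case == 3:           # still picture made of repeated Pe-frames
        return next_anchor(stream, pos)
    raise ValueError("case must be 1, 2 or 3")


# Hypothetical recorded stream in transmission order.
stream = [("I", 4), ("B", 2), ("B", 3), ("P", 7), ("B", 5), ("B", 6), ("P", 10)]
first = first_step_anchor(stream, 1, case=1)   # still picture on B 2
following = next_anchor(stream, first)         # anchor for the next step
```

After the first step the stream consists of Pe-frames in every case, so each further step forward would use `next_anchor` from the last transmitted anchor frame.
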
Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Television Signal Processing For Recording (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A device (6000) and method is disclosed for processing a data stream comprising a plurality of frames. Further, a program element and a computer-readable medium are disclosed enabling the disclosed method. The device comprises a detection unit (6001) for detecting the switch from a first reproduction mode to a second reproduction mode, a determination unit (6003) for determining a first anchor frame in the plurality of frames and an insertion unit (6004) for inserting the first anchor frame and a succession of empty predictive frames subsequent to the first anchor frame. The device operates in a manner that improves the compliance of a processed data stream for further transmission to a remote decoder arrangement that is capable of decoding a standardized signal.

Description

A device for and a method of processing a data stream
FIELD OF THE INVENTION
The invention relates to a device for processing a data stream. The invention further relates to a method of processing a data stream. The invention further relates to a program element. The invention further relates to a computer-readable medium.
BACKGROUND OF THE INVENTION
Electronic entertainment devices are becoming more and more important.
Particularly, an increasing number of users buy hard disk based audio/video players and other entertainment equipment.
Since the reduction of storage space is an important issue in the field of audio/video players, audio and video data are often stored in a compressed manner, and for security reasons in an encrypted manner.
MPEG2 is a standard for the generic coding of moving pictures and associated audio and creates a video stream out of frame data that can be arranged in a specified order called the GOP ("Group Of Pictures") structure. An MPEG2 video bit stream is made up of a series of data frames encoding pictures. The three ways of encoding a picture are intra-coded (I picture), forward predictive (P picture) and bi-directional predictive (B picture). An intra-coded frame (I-frame) is independently decodable. A forward predictive frame (P-frame) needs information of a preceding I-frame or P-frame. A bi-directional predictive frame (B-frame) is dependent on information of a preceding and/or subsequent I-frame or P-frame.
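
The dependencies between the three frame types can be made concrete with a small sketch. The function `reference_indices` is a hypothetical helper, not part of any MPEG library, and it assumes the GOP is given in display order as a string of frame type letters.

```python
def reference_indices(gop: str, i: int) -> list:
    """Return the display-order indices of the anchor frames that frame i
    may depend on: none for I, the preceding anchor for P, and the
    preceding and/or subsequent anchor for B."""
    kind = gop[i]
    if kind == "I":
        return []  # independently decodable
    prev_anchor = next(
        (j for j in range(i - 1, -1, -1) if gop[j] in ("I", "P")), None
    )
    if kind == "P":
        return [] if prev_anchor is None else [prev_anchor]
    next_anchor = next(
        (j for j in range(i + 1, len(gop)) if gop[j] in ("I", "P")), None
    )
    return [j for j in (prev_anchor, next_anchor) if j is not None]


gop = "IBBPBBP"  # example GOP in display order
deps = {i: reference_indices(gop, i) for i in range(len(gop))}
```

In this example the B-frame at index 1 may reference the anchors at indices 0 and 3, which is why a B-frame can only be decoded after the subsequent anchor frame has been transmitted.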
It is an interesting function in a media playback device to switch from a normal reproduction mode, in which media content is played back in a normal speed, to a trick-play reproduction mode, in which media content is played back in a modified manner, for instance with a reduced speed ("slow forward"), a still picture, step picture etc.
Preferably, such trick-play modes should also be possible with digital televisions. Such digital televisions have an embedded decoder and a digital interface via which a standardized signal is provided, such as an MPEG signal described above. A similar situation also occurs in a home network of Set Top Boxes that communicate via a digital in-home network. In both cases the system providing the trick-play signal is located remotely from the decoder. It is therefore advantageous to provide a trick-play signal as a standardized signal form capable of working in conjunction with a standard decoder. The signal should also preferably take into account mode transitions from normal to trick-play, trick-play to trick-play and vice versa since these transitions can occur at any time under control of the consumer, even midway through the transmission of a single video frame. In such a situation a signal already transmitted cannot be revoked.
US 2004/0190866 A1 discloses frame advance and slide show trick modes on an MPEG data stream for a remote decoder system, wherein for each original B frame a predetermined maximum number of copies of the original B frames are added. Basic still picture trick-play is therefore provided for. However, the prior art does not produce a data stream that is fully compatible with the MPEG standard. Therefore, the actual images seen by the consumer on the display cannot be guaranteed in all circumstances. It is clear that the provision of an MPEG data stream that has an improved compatibility with the MPEG standard would be advantageous.
BRIEF SUMMARY OF THE INVENTION
It is an object of the invention to properly adjust the provision of a data stream to improve compatibility. Accordingly there is provided, in a first aspect of the invention, a device for processing a data stream comprising a plurality of frames, wherein the device comprises an input for receiving the data stream, an output for transmitting a processed data stream, a detection unit arranged to detect a switching of mode from a first reproduction mode to a second reproduction mode, a determination unit arranged to determine a first anchor frame comprised within the plurality of frames in response to the switching of mode and an insertion unit arranged to insert frames, as inserted frames, into the data stream to produce the processed data stream in the second reproduction mode, wherein the insertion unit is arranged to insert the first anchor frame determined by the determination unit into the data stream as a first frame in the second reproduction mode and insert a first succession of empty predictive frames subsequent to the first anchor frame.
A device for processing a data stream may comprise a detection unit for detecting a switching of mode from a first reproduction mode, for example, normal play or still picture mode to a second reproduction mode such as step picture mode. In response to the detection of the switching of mode the device may determine a first anchor frame as the first frame in the processed data stream in a step picture mode. The determination of an anchor frame excludes the next following frame being a B-frame and therefore reduces the number of artifacts caused by the reordering of anchor frames with respect to B-frames in data streams where such reordering is commonly applied, such as in MPEG based data streams. The device may insert the first anchor frame and a succession of repeated empty predicted frames, such as empty MPEG P-frames, to provide, for example, a step picture mode with a frozen frame appearing subsequent to the step. The reduction of artifacts when displaying the first frame provides a processed data stream of an improved compliance, for example, to the MPEG standard and therefore an improved picture display for the user. The term "anchor frame" may particularly denote a frame, which, in transmission order and/or in display order, keeps its relative temporal position with respect to other anchor frames. In the context of MPEG2, I-frames and P-frames may be denoted as anchor frames. In contrast to this, B-frames would not be denoted as anchor frames in the context of MPEG2.
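The insertion behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: frames are modelled as type letters in transmission order, the first anchor frame is assumed to be the first I- or P-frame at or after the switching position, and the names `first_anchor_index`, `build_step_stream` and `hold_frames` are hypothetical. The label "Pe" stands for an empty P-frame.

```python
from typing import List, Optional

# In the context of MPEG2, I- and P-frames are anchor frames; B-frames are not.
ANCHOR_TYPES = {"I", "P"}


def first_anchor_index(frames: List[str], switch_pos: int) -> Optional[int]:
    """Return the index of the first anchor frame at or after switch_pos.

    Searching forward from the switch position is an assumed policy chosen
    for this sketch only.
    """
    for i in range(switch_pos, len(frames)):
        if frames[i] in ANCHOR_TYPES:
            return i
    return None


def build_step_stream(frames: List[str], switch_pos: int, hold_frames: int) -> List[str]:
    """Emit the first anchor frame followed by a succession of empty P-frames."""
    idx = first_anchor_index(frames, switch_pos)
    if idx is None:
        return []
    return [f"{frames[idx]}{idx}"] + ["Pe"] * hold_frames


stream = ["I", "P", "B", "B", "P", "B", "B"]  # frame types in transmission order
print(build_step_stream(stream, switch_pos=2, hold_frames=3))
```

Because the output starts on an anchor frame, no B-frame directly follows the mode switch, which is the property the paragraph above relies on to reduce reordering artifacts.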
According to a second aspect of the invention a method is provided for processing a data stream, the method comprising the method steps of receiving the data stream, detecting a switching of mode from a first reproduction mode to a second reproduction mode, determining a first anchor frame comprised within the plurality of frames in response to the switching of mode, inserting frames, as inserted frames, into the data stream to produce a processed data stream in the second reproduction mode and outputting the processed data stream, wherein the method step of inserting further comprises method steps of inserting the first anchor frame determined by the determining step into the data stream as a first frame in the second reproduction mode and inserting a first succession of empty predictive frames subsequent to the first anchor frame. According to a third aspect of the invention a program element is provided for processing a data stream, the program element being capable of being directly loadable into the memory of a programmable device, and comprising software code portions for performing, when said program element is run on the device, the method steps of receiving a data stream comprising a plurality of frames, detecting a switching of mode from a first reproduction mode to a second reproduction mode, determining a first anchor frame comprised within the plurality of frames in response to the switching of mode, inserting frames into the data stream to produce a processed data stream in the second reproduction mode and outputting the processed data stream, wherein the method step of inserting further comprises the method steps of inserting the first anchor frame determined by the determining step into the data stream as a first frame in the second reproduction mode and inserting a succession of empty predictive frames subsequent to the first anchor frame.
According to a fourth aspect of the invention a computer-readable medium is provided, the computer-readable medium directly loadable into the memory of a programmable device, comprising software code portions for performing processing of a data stream, when said code portions are run on the device, the method steps of receiving a data stream comprising a plurality of frames, detecting a switching of mode from a first reproduction mode to a second reproduction mode, determining a first anchor frame comprised within the plurality of frames in response to the switching of mode, inserting frames into the data stream to produce a processed data stream in the second reproduction mode and outputting the processed data stream, wherein the method step of inserting further comprises the method steps of inserting the first anchor frame determined by the determining step into the data stream as a first frame in the second reproduction mode and inserting a succession of empty predictive frames subsequent to the first anchor frame. In one embodiment the detection unit may be further arranged to detect a step forward signal, the determination unit may be further arranged to, in response to detection of the step forward signal, determine a subsequent anchor frame, the subsequent anchor frame being the next following anchor frame comprised within the plurality of frames and the insertion unit may be arranged to insert the subsequent anchor frame determined by the determination unit into the processed data stream in the second reproduction mode and insert a second succession of empty predictive frames subsequent to the subsequent anchor frame. Such measures provide the advantage that after a first step in the processed data stream further steps in the processed data stream can be achieved. The further steps present no further artifacts and therefore also present an improved displayed picture sequence for a user. 
In a further embodiment a correction unit may be provided for correcting at least one temporal parameter of the inserted frames comprised within the processed data stream and inserted by the insertion unit. The temporal parameter may comprise a temporal reference, a Presentation Time Stamp and/or a Decoding Time Stamp. Such measures provide a processed data stream with a further improvement in compliance with a specific standard.
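One possible form of such a temporal correction is sketched below. PTS and DTS values in MPEG2 systems are 33-bit counters in units of a 90 kHz clock; everything else here is an assumption made for illustration: the function name `restamp_pts`, the default frame rate of 25 frames per second, and the policy of spacing the inserted frames exactly one frame period apart.

```python
from typing import List

PTS_CLOCK_HZ = 90_000   # PTS/DTS tick rate in MPEG2 systems
PTS_MODULO = 1 << 33    # PTS/DTS are 33-bit counters and wrap around


def restamp_pts(first_pts: int, count: int, frame_rate: float = 25.0) -> List[int]:
    """Return evenly spaced PTS values for `count` consecutive inserted frames.

    Assumed policy: each inserted frame is presented exactly one frame
    period after the previous one, starting at first_pts.
    """
    ticks_per_frame = round(PTS_CLOCK_HZ / frame_rate)
    return [(first_pts + i * ticks_per_frame) % PTS_MODULO for i in range(count)]


print(restamp_pts(0, 3))
```

At 25 frames per second one frame period corresponds to 3600 ticks, so consecutive inserted frames receive PTS values that differ by 3600, wrapping at 2^33.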
In a further embodiment a frame detector unit may be provided for detecting repeated bi-directionally predicted frames comprised within the plurality of frames in the first reproduction mode, wherein the determination unit may be further arranged to determine a further anchor frame which directly precedes the switching of mode and determine the first anchor frame to be the further anchor frame. Such a measure provides a processed data stream with the most appropriate first anchor frame taking into account the construction of the data stream in the first reproduction mode, such as when the data stream comprises a sequence of repeated B-frames, and therefore further improvement in compliance with a specific standard in more situations.
In another embodiment a frame detector unit may be provided for detecting repeated empty predicted anchor frames comprised within the plurality of frames in the first reproduction mode, wherein the determination unit may be further arranged to determine a further anchor frame which directly succeeds the switching of mode and determine the first anchor frame to be the further anchor frame. Such a measure provides a processed data stream with the most appropriate first anchor frame taking into account the construction of the data stream in the first reproduction mode, such as when the data stream comprises a sequence of repeated empty P-frames, and therefore further improvement in compliance with a specific standard in more situations. In one embodiment the first reproduction mode may be a selection of one of a still picture mode, a slow forward mode or a normal play mode. Such modes are common trick-play modes in audio-video based systems and provide a user with improved functionality.
In a further embodiment the second reproduction mode may be a step picture mode. Such a mode is a common trick-play mode in audio-video based systems and provides a user with the ability to step through audio-video content.
In an embodiment the empty predictive frames may comprise at least one empty MPEG P type frame. Such empty MPEG P type frames provide an efficient manner compliant with the MPEG standard for repeating a frame in the processed data stream.
In a further embodiment the bi-directionally predicted frames might be MPEG B type frames. Such frames are commonly encountered in MPEG data streams and an embodiment capable of providing a compliant data stream responsive to the detection of such frames will often be required.
In another embodiment the empty predicted anchor frames might be empty MPEG P type frames. Such frames are also commonly encountered in MPEG data streams and an embodiment capable of providing a compliant data stream responsive to the detection of such frames will again often be required. In one embodiment the data stream may comprise one or more from a selection of video data, audio data and digital data. Such data streams are commonly encountered in consumer electronics devices.
In one embodiment the data stream may be an MPEG2 data stream. MPEG2 is a designation for a group of audio and video coding standards agreed upon by MPEG (moving pictures experts group) and published as the ISO/IEC 13818 International Standard. For example, MPEG2 is used to encode audio and video broadcast signals including digital satellite and cable TV, but may also be used for DVD.
However, the device according to exemplary embodiments of the invention may also be adapted to process an MPEG4 encrypted data stream. More generally, any codec scheme may be implemented which uses anchor frames from which other frames are dependent, particularly any type of encoding using predictive frames and thus any kind of MPEG encoding/decoding.
The device according to the invention may be realized as at least one of the group consisting of a digital video recording device, a network-enabled device, a conditional access system, a portable audio player, a portable video player, a mobile phone, a DVD player, a CD player, a hard disk based media player, an Internet radio device, a computer, a television, a public entertainment device and an MP3 player. However, the applications are only exemplary. The aspects defined above and further aspects of the invention are apparent from the examples of embodiment to be described hereinafter and are explained with reference to these examples of embodiment.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention will be described in more detail hereinafter with reference to examples of embodiment but to which the invention is not limited.
Fig. 1 illustrates a time-stamped transport stream packet.
Fig. 2 shows an MPEG2 group of picture structure with intra-coded frames and forward predictive frames.
Fig. 3 illustrates an MPEG2 group of picture structure with intra-coded frames, forward predictive frames and bi-directional predictive frames.
Fig. 4a illustrates a structure of a characteristic point information file and stored stream content.
Fig. 4b shows an example of an Entitlement Control Message (ECM) file.
Fig. 5 illustrates a system for trick-play on a plaintext stream.
Fig. 6 illustrates time compression in trick-play.
Fig. 7 illustrates trick-play with fractional distance.
Fig. 8 illustrates low speed trick-play.
Fig. 9 illustrates a general conditional access system structure.
Fig. 10 illustrates a digital video broadcasting encrypted transport stream packet.
Fig. 11 illustrates a transport stream packet header of the digital video broadcasting encrypted transport stream packet of Fig. 10.
Fig. 12 illustrates a system allowing the performance of trick-play on a fully encrypted stream.
Fig. 13 illustrates a full transport stream and a partial transport stream.
Fig. 14 illustrates Entitlement Control Messages for a stream type I and for a stream type II.
Fig. 15 illustrates writing Control Words to a decrypter.
Fig. 16 illustrates Entitlement Control Message handling in a fast forward mode.
Fig. 17 illustrates detection of one or two Control Words.
Fig. 18 illustrates the splitting of a transport stream packet at a frame boundary.
Fig. 19 illustrates a system allowing the performance of slow-forward trick-play on a fully encrypted stream.
Fig. 20 illustrates a hybrid stream with plaintext packets on each frame boundary.
Fig. 21 illustrates a system allowing the performance of slow-forward trick-play on a stored hybrid encrypted stream.
Fig. 22 illustrates an incomplete picture start code at the concatenation point of repeated B-frame data.
Fig. 23 illustrates the effect of MPEG frame re-ordering from transmission order to display order.
Fig. 24 illustrates the effect of MPEG frame re-ordering during slow-forward from transmission order through the intermediate display frame order to the actual displayed frames.
Fig. 25 illustrates the effect of MPEG frame re-ordering during slow-forward making use of empty P-frames before the anchor frames.
Fig. 26 illustrates the effect of MPEG frame re-ordering during slow-forward making use of backward predictive empty B-frames.
Fig. 27 illustrates the effect of MPEG frame re-ordering during slow-forward making use of forward predictive empty B-frames.
Fig. 28 illustrates the high table ID toggle rate due to repetition of B-frame data.
Fig. 29 illustrates the loss of a necessary Control Word for a type I system due to B-frame data repetition.
Fig. 30a illustrates the handling of ECMs for the slow-forward stream.
Fig. 30b illustrates the handling of ECMs for the slow-forward stream suitable for fast channel changing.
Fig. 31 illustrates a system for removing ECMs corresponding to the repeated B-frame data.
Fig. 32 illustrates a system for removing ECMs corresponding to the repeated B-frame data making use of a repetition detection unit.
Fig. 33 illustrates a system for removing ECMs corresponding to the repeated B-frame data making use of an analyzer unit.
Fig. 34 illustrates the loss of a necessary Control Word for a type II system due to B-frame data repetition and the removal of ECMs corresponding to all but the last repetition of the repeated B-frame data.
Fig. 35 illustrates the handling of ECMs for the slow-forward stream to prevent the loss of a necessary Control Word for a type II system due to B-frame data repetition.
Fig. 36 illustrates the results of a measurement of B-frame data length over time for a typical broadcast reception.
Fig. 37 illustrates the overlap of repeated B-frame data when the length of the B-frame data in time exceeds a single frame display time.
Fig. 38 illustrates the results of a measurement of distance in time, measured in frame periods, from the Presentation Time Stamp (PTS) of a B-frame to the Program Clock Reference (PCR) for a typical broadcast reception over a period of 30 seconds.
Fig. 39 illustrates the potential overlap of B-frame data from the last repeated B-frame when a switch occurs from slow forward processing to still picture mode.
Fig. 40 illustrates the switching to still picture mode on a B-frame comprising ECMs.
Fig. 41 illustrates identical PTS values and the resulting conflict for two frames when switching from slow forward processing mode to still picture mode.
Fig. 42 illustrates the use of a Pe-frame to avoid identical PTS values and the resulting conflict for two frames of Fig. 41 and shows further an issue with subsequently repeated B-frames.
Fig. 43 illustrates the switching from slow forward processing mode to still picture mode during an anchor frame.
Fig. 44 illustrates the switching from slow forward processing mode to still picture mode during a pre-inserted Pe-frame.
Fig. 45 illustrates the switching from slow forward processing mode to still picture mode during a pre-inserted Bb-frame.
Fig. 46 illustrates the switching from slow forward processing mode to still picture mode during a post-inserted Bf-frame.
Fig. 47 illustrates the switching from slow forward processing mode to still picture mode during a switching period comprising an anchor frame or subsequent Bf-frames.
Fig. 48 illustrates the switching from slow forward processing mode to still picture mode during a switching period comprising pre-inserted empty frames.
Fig. 49a illustrates the switching from slow forward processing mode to still picture mode during a switching period lasting until the start of an anchor frame.
Fig. 49b illustrates the switching from still picture mode to slow forward processing mode for a still picture mode using Pe-frames.
Fig. 50a illustrates the options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame does not contain ECMs.
Fig. 50b illustrates the options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame does contain ECMs, but no table ID toggle.
Fig. 50c illustrates the options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame contains both ECMs and a table ID toggle.
Fig. 50d illustrates the options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame contains both ECMs and a table ID toggle and where the table ID toggle occurs prior to the mode switch.
Fig. 50e illustrates the options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame contains both ECMs and a table ID toggle and where the table ID toggle occurs subsequent to the mode switch.
Fig. 50f illustrates the exception handling required upon the last repeated current B-frame for the situation of Fig. 50e.
Fig. 50g illustrates a second set of options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame contains both ECMs and a table ID toggle and where the table ID toggle occurs subsequent to the mode switch.
Fig. 50h illustrates a third set of options available when switching from a slow forward processing mode to a still picture mode when the last repeated B-frame contains both ECMs and a table ID toggle and where the table ID toggle occurs subsequent to the mode switch.
Fig. 51 illustrates a device for processing an encrypted data stream.
Fig. 52 illustrates a second device for processing an encrypted data stream.
Fig. 53 illustrates an I-frame based step backwards mode.
Fig. 54 illustrates an undefined first frame after the first step backward.
Fig. 55 illustrates an undefined first frame and PTS conflict after the first step forward and equivalent artifacts for each subsequent picture step forward where a transition between an anchor frame and a B-frame occurs.
Fig. 56 illustrates an anchor frame based step forwards mode displaying only a single undefined frame after the first step forward.
Fig. 57 illustrates the first anchor frame required for the cases where the incoming data stream is comprised of repeated B-frame data, repeated Bf-frames and repeated Pe-frames.
Fig. 58 illustrates an I-frame based step forward mode.
Fig. 59 illustrates the first I-frame required for the cases where the incoming data stream is comprised of repeated B-frame data, repeated Bf-frames and repeated Pe-frames.
Fig. 60 illustrates a device for processing a data stream according to an exemplary embodiment of the invention.
Fig. 61 illustrates a second device for processing a data stream according to an exemplary embodiment of the invention.
Fig. 62 illustrates a third device for processing a data stream according to an exemplary embodiment of the invention.
The Figures are schematically drawn and not true to scale, and the identical reference numerals in different Figures refer to corresponding elements. It will be clear for those skilled in the art, that alternative but equivalent embodiments of the invention are possible without deviating from the true inventive concept, and that the scope of the invention will be limited by the claims only.
DETAILED DESCRIPTION OF THE INVENTION
In the following, referring to Fig. 1 to Fig. 13, different aspects of trick-play implementation for transport streams according to exemplary embodiments of the invention will be described. Particularly, several possibilities to perform trick-play on an MPEG2 encoded stream will be described, which may be partly or totally encrypted, or non-encrypted. The following description will target methods specific to the MPEG2 transport stream format. However, the invention is not restricted to this format.
Experiments were actually done with an extension, the so-called time-stamped transport stream. This comprises transport stream packets, all of which are pre-pended with a 4-byte header in which the transport stream packet arrival time is placed. This time may be derived from the value of the program clock reference (PCR) time-base at the time the first byte of the packet is received at the recording device. This is a proper method to store the timing information with the stream, so that playback of the stream becomes a relatively easy process.
One problem during playback is to ensure that the MPEG2 decoder buffer neither overflows nor underflows. If the input stream was compliant to the decoder buffer model, restoring the relative timing ensures that the output stream is also compliant. Some of the trick-play methods described herein are independent of the time stamp and perform equally well on transport streams with and without time stamps.
Fig. 1 illustrates a time stamped transport stream packet 100 having a total length 104 of 188 Bytes and comprising a time stamp 101 having a length 105 of 4 Bytes, a packet header 102, and a packet payload 103 having a length of 184 Bytes.
The following description gives an overview of the possibilities to create an MPEG/DVB (digital video broadcasting) compliant trick-play stream from a recorded transport stream and intends to cover the full spectrum of recorded streams from those that are completely plaintext, so every bit of data can be manipulated, to streams that are completely encrypted (for instance according to the DVB scheme), so that only headers and some tables may be accessible for manipulation.
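Reading a stored time-stamped packet of the kind shown in Fig. 1 can be sketched as follows. The sketch assumes that each stored record consists of the 4-byte time stamp followed by a standard 188-byte transport stream packet (4-byte packet header plus 184 remaining bytes), and that the time stamp is a big-endian unsigned integer; the byte order and the function name `parse_timestamped_packet` are assumptions and are not taken from the description. The sync byte value 0x47, the 13-bit PID and the 2-bit scrambling control field are standard MPEG2 transport stream header fields, which remain readable even when the payload is encrypted.

```python
import struct

TIMESTAMP_SIZE = 4     # pre-pended arrival time stamp
TS_PACKET_SIZE = 188   # standard transport stream packet: 4-byte header + 184 bytes
SYNC_BYTE = 0x47       # first byte of every transport stream packet


def parse_timestamped_packet(record: bytes) -> dict:
    """Split one stored record into arrival time, header fields and remaining bytes."""
    if len(record) != TIMESTAMP_SIZE + TS_PACKET_SIZE:
        raise ValueError("unexpected record length")
    # Assumption: the arrival time is stored as a big-endian 32-bit unsigned integer.
    (arrival_time,) = struct.unpack(">I", record[:TIMESTAMP_SIZE])
    packet = record[TIMESTAMP_SIZE:]
    if packet[0] != SYNC_BYTE:
        raise ValueError("sync byte not found")
    pid = ((packet[1] & 0x1F) << 8) | packet[2]   # 13-bit packet identifier
    scrambling_control = (packet[3] >> 6) & 0x03  # 0 means not scrambled
    return {
        "arrival_time": arrival_time,
        "pid": pid,
        "scrambling_control": scrambling_control,
        "body": packet[4:],  # 184 bytes following the 4-byte packet header
    }


record = struct.pack(">I", 1234) + bytes([0x47, 0x01, 0x00, 0x10]) + bytes(184)
info = parse_timestamped_packet(record)
print(info["arrival_time"], info["pid"], info["scrambling_control"], len(info["body"]))
```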
When creating trick-play for an MPEG/DVB transport stream, problems may arise when the content is at least partially encrypted. It may not be possible to descend to the elementary stream level, which is the usual approach, or even access any packetized elementary stream (PES) headers before decryption. This also means that finding picture frames may not be possible. Known trick-play engines need to be able to access and process this information.
In the frame of this description, the term "ECM" denotes an Entitlement Control Message. This message may particularly comprise secret provider proprietary information and may, among others, contain encrypted Control Words (CW) needed to decrypt the MPEG stream. Typically, Control Words expire in 10-20 seconds. The ECMs are embedded in packets in the transport stream.
In the frame of this description, the term "keys" particularly denotes data that may be stored in a smart card and may be transferred to the smart card using EMMs, that is so-called "Entitlement Management Messages" that may be embedded in the transport stream. These keys may be used by the smart card to decrypt the Control Words present in the ECM. An exemplary validity period of such a key may be one month.
In the frame of this description, the term "Control Words" (CW) particularly denotes decryption information needed to decrypt actual content. Control words may be decrypted by the smart card and then stored in a memory of the decryption core.
Some aspects related to trick-play on plaintext streams will now be described.
It may be preferable that any MPEG2 streams created are MPEG2 compliant transport streams. This is because the decoder may not only be integrated within a device, but may also be connected via a standard digital interface, such as an IEEE 1394 interface, or an Internet interface, for example.
Account should also be taken of any problems that may occur when using a video coding technique like MPEG2 that exploits the temporal redundancy of video to achieve high compression ratios. Frames can no longer be decoded independently. A structure of a plurality of groups of pictures (GOPs) is shown in Fig. 2. Particularly, Fig. 2 shows a stream 200 comprising several MPEG2 GOP structures with a sequence of I-frames 201 and P-frames 202. The GOP size is denoted with reference numeral 203. The GOP size 203 is set to 12 frames, and only I-frames 201 and P-frames 202 are shown here.
In MPEG, a GOP structure may be used in which only the first frame is coded independently of other frames. This is the so-called intra-coded or I-frame 201. The predictive frames or P-frames 202 are coded with a unidirectional prediction, meaning that they only rely on the previous I-frame 201 or P-frame 202 as indicated by arrows 204 in Figure 2. Such a GOP structure has typically a size of 12 or 16 frames 201, 202.
Another structure 300 of a plurality of GOPs is shown in Fig. 3. Particularly, Fig. 3 shows the MPEG2 GOP structure with a sequence of I-frames 201, P-frames 202 and B-frames 301. The GOP size is again denoted with reference numeral 203.
It is possible to use a GOP structure containing also bi-directionally predictive frames or B-frames 301 as shown in Fig. 3. A GOP size 203 of 12 frames is chosen for the example. The B-frames 301 are coded with a bi-directional prediction, meaning that they rely on a previous and a next I- or P-frame 201, 202 as indicated for some B-frames 301 by curved arrows 204. The transmission order of the compressed frames may not be the same as the order in which they are displayed.
To decode a B-frame 301, both reference frames before and after the B-frame 301 (in display order) are needed. To minimize the buffer demand in a decoder, the compressed frames may be reordered. So in transmission, the reference frames may come first. The reordered stream, as it is transmitted, is also shown in Fig. 3, lower part. The reordering is indicated by straight arrows 302.
A stream containing B-frames 301 can give a nice looking trick-play picture if all of the B-frames 301 are skipped. For the present example, this leads to a trick-play speed of 3x forward.
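The reordering and the 3x figure can be reproduced with a small sketch. It assumes a 12-frame GOP in which every anchor frame is followed by two B-frames in display order; the exact GOP layout of Fig. 3 and the function names `to_transmission_order` and `skip_b_frames` are assumptions made for this illustration.

```python
from typing import List


def to_transmission_order(display: List[str]) -> List[str]:
    """Move each anchor frame in front of the B-frames that precede it in display order."""
    out: List[str] = []
    pending_b: List[str] = []
    for frame in display:
        if frame.startswith("B"):
            pending_b.append(frame)   # must wait for the following anchor frame
        else:
            out.append(frame)         # anchor frame is transmitted first
            out.extend(pending_b)
            pending_b = []
    # Trailing B-frames would really wait for the next GOP's anchor frame;
    # they are simply flushed here to keep the sketch short.
    out.extend(pending_b)
    return out


def skip_b_frames(frames: List[str]) -> List[str]:
    """Keep only the anchor frames, as in the simple trick-play described above."""
    return [f for f in frames if not f.startswith("B")]


gop = ["I0", "B1", "B2", "P3", "B4", "B5", "P6", "B7", "B8", "P9", "B10", "B11"]
print(to_transmission_order(gop))
print(len(gop) / len(skip_b_frames(gop)))  # speed-up factor when B-frames are skipped
```

With 4 anchor frames remaining out of 12, the frame count is reduced by a factor of three, which matches the 3x forward speed stated for this example.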
Even if an MPEG2 stream is not encrypted (that is to say plaintext), trick-play may not be trivial. The possibility of a slow-reverse based on I-frames only is briefly mentioned as an option. An efficient frame based slow-reverse is practically impossible though, due to the necessary inversion of the MPEG2 GOP.
Slow-forward, which is also known as slow motion forward, is a mode in which the display picture runs at a lower than normal speed. A rudimentary form of slow-forward is already possible with the technique making use of a fast-forward algorithm that generates trick-play GOPs. Setting the fast-forward speed to a value between zero and one results in a slow-forward stream based on a repetition of fast-forward trick-play GOPs. For a plaintext stream this is no problem but for an encrypted stream it can lead to the erroneous decryption of part of the I-frame in certain specific conditions. There are several options to solve this problem but the most suitable way may be not to repeat the fast-forward trick-play GOP but to extend the size of the trick-play GOP by the addition of empty P-frames. This technique in fact also enables slow-reverse, because it may be based on the trick-play GOPs used for fast-forward/reverse and therefore on the independently decodable I-frames.
However, it may not be preferred to make use of this kind of I-frame based slow-forward or slow-reverse for the following reason. The distance between I-frames in normal play may be around half a second and for slow-forward/reverse it is multiplied with the slow motion factor. So this type of slow-forward or slow-reverse is not really the slow motion consumers are used to but in fact it is more like a slide show with a large temporal distance between the successive pictures.
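The slide show effect of I-frame based slow-forward can be made concrete with a short calculation. The sketch assumes that every I-frame of the recording is used, that the trick-play GOP consists of one I-frame padded with empty P-frames (labelled "Pe"), and a frame rate of 25 frames per second; the function names and parameters are hypothetical.

```python
from typing import List


def slow_trick_gop(gop_size: int, slow_factor: int) -> List[str]:
    """One I-frame extended with empty P-frames so that it spans gop_size * slow_factor periods."""
    display_periods = gop_size * slow_factor
    return ["I"] + ["Pe"] * (display_periods - 1)


def picture_interval_seconds(gop_size: int, slow_factor: int, frame_rate: float = 25.0) -> float:
    """Time between two successive new pictures on the display."""
    return gop_size * slow_factor / frame_rate


print(len(slow_trick_gop(12, 4)))       # frames in the extended trick-play GOP
print(picture_interval_seconds(12, 4))  # seconds between picture updates
```

For a 12-frame GOP and a slow motion factor of 4, a new picture appears only about every two seconds, which is why this mode resembles a slide show rather than smooth slow motion.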
In another trick-play mode called still picture mode the display picture is halted. This can be achieved by adding empty P-frames to the I-frame for the duration of the still picture mode. This means that the picture resulting from the last I-frame is halted. When switching to still picture from normal play, this can also be the nearest I-frame according to the data in the CPI file. This technique may be an extension of the fast-forward/reverse modes and results in nice still pictures especially if interlace kill is used. However the positional accuracy may often be insufficient when switching from normal play or slow-forward/reverse to still picture.
The still picture mode can be extended to implement a step mode. The step command advances the stream to some next or previous I-frame. The step size is at minimum one GOP but can also be set to a higher value equal to an integer number of GOPs. Step forward and step backward are both possible in this case because only I-frames are used.
The slow-forward can also be based on a repetition of every frame, which results in a much smoother slow motion. The best form of slow-forward would in fact be a repetition of fields instead of frames because the temporal resolution is doubled and there are no interlace artifacts. This may be however practically impossible for the intrinsically frame based MPEG2 streams and even more so if they are largely encrypted. The interlace artifacts can be significantly reduced for the I- and P-frames by using special empty frames to force the repetition. Such an interlace reduction technique may not be available for the B-frames though. Whether the use of interlace kill for the I- and P-frames is still advantageous in this case or in fact leads to a more annoying picture for the viewer can only be verified by experiments.
Slow-reverse on the basis of individual frames may in fact be very complicated for MPEG signals due to the temporal predictions. A complete GOP has to be buffered and reversed. No simple method is known for recoding the frames in a GOP into the reverse order. So an almost complete decoding and encoding might be necessary with an inversion of the frame order between these two. This asks for the buffering of a complete decoded GOP as well as a full MPEG decoder and encoder.
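The I-frame based step command described earlier in this passage can be sketched as a lookup in a sorted list of I-frame positions. The list is assumed to be available from stored index data such as the CPI file; the function name `step_target`, the use of frame numbers as positions and the clamping at the ends of the recording are assumptions made for this illustration.

```python
import bisect
from typing import List


def step_target(i_frame_positions: List[int], current: int, gops: int = 1) -> int:
    """Return the I-frame position reached by stepping `gops` GOPs.

    Positive values step forward, negative values step backward. The result
    is clamped to the first or last I-frame of the recording.
    """
    if gops == 0:
        raise ValueError("step size must be a non-zero number of GOPs")
    if not i_frame_positions:
        raise ValueError("no I-frame positions available")
    if gops > 0:
        # First I-frame strictly after the current position, then gops - 1 further.
        idx = bisect.bisect_right(i_frame_positions, current) + gops - 1
    else:
        # Last I-frame strictly before the current position, then further back.
        idx = bisect.bisect_left(i_frame_positions, current) + gops
    idx = max(0, min(idx, len(i_frame_positions) - 1))
    return i_frame_positions[idx]


positions = [0, 12, 24, 36]  # hypothetical I-frame positions, in frames
print(step_target(positions, 12))      # one GOP forward
print(step_target(positions, 12, -1))  # one GOP backward
```

Both directions work in the same way because each target is an independently decodable I-frame, which is the reason given above for step backward being possible in this mode.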
Still picture mode can be defined as an extension of the frame-based slow-forward mode. It may be based on a repeated display of the current frame for the duration of the still picture mode, whatever the type of this frame is. This may be, in fact, a slow-forward with an infinite slow motion factor, if this denotes the factor by which the normal play stream is slowed down. No interlace kill may be possible if the picture is halted on a B-frame. In that sense this still picture mode may be worse than the trick-play GOP based still picture mode. This can be corrected by only halting the picture at an I- or P-frame, at the cost of a somewhat less accurate still picture position. Discontinuities in the temporal reference and the PTS can also be avoided in this case. Moreover, the bit rate may be significantly reduced because the repetition of an I- or P-frame may be forced by the insertion of empty frames instead of a repetition of the frame data itself, as may be necessary for the B-frames. So, technically speaking, the halting of a picture at an I- or P-frame may be the best choice, if one accepts the lack of positional accuracy. The still picture mode can also be extended with a step mode. The step command advances the stream in principle to the next frame. Larger step sizes are possible by stepping to the next P-frame or some next I-frame. A step backward on a frame basis may not be possible. The only option may be to step backward to one of the previous I-frames. Two types of still picture mode have been mentioned, namely trick-play GOP based and frame based. The first one may be most logically connected to fast-forward/reverse, whereas the second one may be related to slow-forward. When switching from some mode to still picture, it may be preferable to choose the related still picture mode to minimize the switching delay.
The streams resulting from both methods look very alike because they are both based on the insertion of empty frames to force the repetition of an anchor frame. But on detailed stream construction level there are some differences.
In the following, some aspects related to a CPI ("characteristic point information") file will be described.
Finding I-frames in a stream usually requires parsing the stream to find the frame headers. Locating the positions where the I-frames start can be done while the recording is being made, or off-line after the recording is completed, or semi on-line, which is in fact off-line but with a small delay with respect to the moment of recording. The I-frame end can be found by detecting the start of the next P-frame or B-frame. The meta-data derived this way can be stored in a separate but coupled file that may be denoted as characteristic point information file or CPI file. This file may contain pointers to the start and possibly the end of each I-frame in the transport stream file. Each individual recording may have its own CPI file.
The structure of a characteristic point information file 400 is visualized in Fig. 4a. Apart from the CPI file 400, stored information 401 is shown. The CPI file 400 may also contain some other data that are not discussed here.
With the data from the CPI file 400 it is possible to jump to the start of any I-frame 201 in the stream. If the CPI file 400 also contains the end of the I-frames 201, the amount of data to read from the transport stream file may be exactly known to get a complete I-frame 201. If for some reason the I-frame end is not known, the entire GOP or at least a large part of the GOP data may be read to be sure that the entire I-frame 201 is read. The end of the GOP may be given by the start of the next I-frame 201. It is known from measurements that the amount of I-frame data can be 40% or more of the total GOP data.
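The read strategy described above can be sketched as follows. This is a minimal illustration only: the `CpiEntry` record, the `read_range` function and the byte-offset representation are hypothetical names and assumptions, since the actual layout of the CPI file 400 is not specified here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CpiEntry:
    """Hypothetical CPI record for one I-frame (layout is an assumption)."""
    start: int            # byte offset of the I-frame start in the stream file
    end: Optional[int]    # byte offset just past the I-frame end, if recorded


def read_range(entries: List[CpiEntry], i: int, file_size: int) -> Tuple[int, int]:
    """Return (offset, length) to read so that I-frame i is fully covered."""
    entry = entries[i]
    if entry.end is not None:
        # End is known: read exactly the I-frame.
        return entry.start, entry.end - entry.start
    # End unknown: read the whole GOP, which ends at the next I-frame start
    # (or at the end of the file for the last I-frame).
    next_start = entries[i + 1].start if i + 1 < len(entries) else file_size
    return entry.start, next_start - entry.start


entries = [CpiEntry(0, 400), CpiEntry(1000, None), CpiEntry(2500, None)]
print(read_range(entries, 0, 3000))  # I-frame end known
print(read_range(entries, 1, 3000))  # falls back to the full GOP
```

Storing the I-frame end avoids reading the remainder of the GOP, which matters because the I-frame may be well under half of the GOP data.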
It is known that reducing the trick-play picture refresh rate can be achieved by displaying each I-frame 201 several times. The bit rate will be reduced accordingly. This may be achieved by adding so-called empty P-frames 202 between the I-frames 201. Such an empty P-frame 202 may not be really empty but may contain data instructing the decoder to repeat the previous frame. This has a limited bit cost, which can in many cases be neglected compared to an I-frame 201. From experiments it is known that trick-play GOP structures like IPP or IPPP may be acceptable for the trick-play picture quality and even advantageous at high trick-play speeds. The resulting trick-play bit rate may be of the same order as the normal play bit rate. It is also mentioned that these structures may reduce the required sustained bandwidth from the storage device.
Here some aspects related to timing issues and stream construction will be described.
A trick-play system 500 is schematically depicted in Fig. 5. The trick-play system 500 comprises a recording unit 501, an I-frame selection unit 502, a trick-play generation block 503 and an MPEG2 decoder 504. The trick-play generation block 503 includes a parsing unit 505, an adding unit 506, a packetizer unit 507, a table memory unit 508 and a multiplexer 509.
The recording unit 501 provides the I-frame selection unit 502 with plaintext MPEG2 data 510. The multiplexer 509 provides the MPEG2 decoder 504 with an MPEG2 DVB compliant transport stream 511.
The I-frame selector 502 reads specific I-frames 201 from the storage device 501. Which I-frames 201 are chosen depends on the trick-play speed, as will be described below. The retrieved I-frames 201 are used to construct an MPEG-2/DVB compliant trick-play stream that may then be sent to the MPEG-2 decoder 504 for decoding and rendering. The position of the I-frame packets in the trick-play stream cannot be coupled to the relative timing of the original transport stream. In trick-play, the time axis may be compressed or expanded with the speed factor and additionally inverted for reverse trick-play. Therefore, the time stamps of the original time stamped transport stream may not be suitable for trick-play generation. Moreover, the original PCR time base may be disturbing for trick-play. First of all, it is not guaranteed that a PCR will be available within the selected I-frame 201. But even more important is that the frequency of the PCR time base would be changed. According to the MPEG2 specification, this frequency should be within 30 ppm from 27 MHz. The original PCR time base fulfils this requirement, but if used for trick-play it would be multiplied by the trick-play speed factor. For reverse trick-play this even leads to a time base running in the wrong direction. Therefore, the old PCR time base has to be removed and a new one added to the trick-play stream.
Finally, I-frames 201 normally contain two time stamps that tell the decoder 504 when to start decoding the frame (decoding time stamp, DTS) and when to start presenting, for instance displaying, it (presentation time stamp, PTS). Decoding and presentation may be started when DTS respectively PTS are equal to the PCR time base, which may be reconstructed in the decoder 504 by means of the PCRs in the stream. The distance between, e.g., the PTS values of 2 I-frames 201 corresponds to their nominal distance in display time. In trick-play this time distance may be compressed or expanded with the speed factor. Since a new PCR time base may be used in trick-play, and because the distance for DTS and PTS may be no longer correct, the original DTS and PTS of the I-frame 201 have to be replaced.
To solve the above-mentioned complications, the I-frame 201 may first be parsed into an elementary stream in the parsing unit 505. Then the empty P-frames 202 are added at elementary stream level. The obtained trick-play GOP is mapped into one PES packet and packetized into transport stream packets. Then corrected tables like PAT, PMT, etc. are added. At this stage, a new PCR time base together with DTS and PTS are included. The transport stream packets are prepended with a 4-byte time stamp that is coupled to the PCR time base such that the trick-play stream can be handled by the same output circuitry as used for normal play.
In the following, some aspects related to trick-play speeds will be described. In this context, firstly, fixed trick-play speeds will be discussed.
As mentioned before, a trick-play GOP structure like IPP may be used in which the I-frame 201 is followed by two empty P-frames 202. It is assumed that the original GOP has a GOP size 203 of 12 frames and that all the original I-frames 201 are used for trick-play. This means that the I-frames 201 in the normal play stream have a distance of 12 frames and the same I-frames 201 in the trick-play stream a distance of 3 frames. This leads to a trick-play speed of 12/3 = 4x. If the original GOP size 203 in frames is denoted by G, the trick-play GOP size in frames by T and the trick-play speed factor by Nb, the trick-play speed in general is given by:
Nb=G/T (1)
Nb will also be denoted as the basic speed. Higher speeds can be realized by skipping I-frames 201 from the original stream. If every second I-frame 201 is taken, the trick-play speed is doubled; if every third I-frame 201 is taken, the trick-play speed is tripled, and so on. In other words, the distance between the used I-frames 201 of the original stream is 2, 3 and so on. This distance may always be an integer number. If the distance between the I-frames 201 used for trick-play generation is denoted by D (D=1 meaning that every I-frame 201 is used), then the general trick-play speed factor N is given by:
N=D*G/T (2)
This means that all integer multiples of the basic speed can be realized, leading to an acceptable set of speeds. It should be noticed that D is negative for reverse trick-play and that D=0 results in a still picture. Data can only be read in a forward direction. Therefore, in reverse trick-play, data is read forward and jumps are made backwards to retrieve the preceding I-frame 201 given by D. It should also be noticed that a larger trick-play GOP size T results in a lower basic speed. For instance, IPPP leads to a finer grained set of speeds than IPP.
Referring to Fig. 6, time compression in trick-play will be explained. Fig. 6 shows the situation for T=3 (IPP) and G=12. For D=2, an original display time of 24 frames is compressed into a trick-play display time of 3 frames, resulting in N=8. In the given example, the basic speed is an integer, but this is not necessarily the case. For G=16 and T=3, the basic speed is 16/3 = 5 1/3, which does not result in a set of integer trick-play speeds. Therefore, the IPPP structure (T=4) is better suited for a GOP size of 16, resulting in a basic speed of 4x. If a single trick-play structure is desired that fits the most common GOP sizes of 12 and 16, IPPP may be chosen.
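The fixed-speed relation N = D*G/T can be checked with a few lines of code. The sketch below uses exact fractions so that non-integer basic speeds such as 16/3 are not rounded; the function name `trick_play_speed` is chosen here for illustration only.

```python
from fractions import Fraction


def trick_play_speed(G: int, T: int, D: int = 1) -> Fraction:
    """Speed factor N = D*G/T.

    G: original GOP size in frames, T: trick-play GOP size in frames,
    D: distance between the I-frames used (negative for reverse, 0 for still).
    """
    return Fraction(D * G, T)


print(trick_play_speed(12, 3))       # basic speed for IPP with G=12
print(trick_play_speed(12, 3, D=2))  # every second I-frame
print(trick_play_speed(16, 3))       # non-integer basic speed
print(trick_play_speed(16, 4))       # IPPP with G=16
```

With IPPP (T=4) both common GOP sizes give integer basic speeds (3x for G=12 and 4x for G=16), which is the reason given above for preferring it as a single structure.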
Secondly, arbitrary trick-play speeds will be discussed.
In some cases, the set of trick-play speeds resulting from the method described above is satisfactory, in other cases it is not. In the case of G=16 and T=3 one would probably still prefer integer trick-play speed factors. Even in the case of G=12 and T=4 it might be preferred to have a speed not available in the set, for instance 7x. Now, the trick-play speed formula will be inverted and the distance D will be calculated, which is given by:
D=N*T/G (3)
Using the above example with G=12, T=4 and N=7 results in D=2 1/3. Instead of skipping a fixed number of I-frames 201, an adaptive skipping algorithm might be used that chooses the next I-frame 201 based on which I-frame 201 best matches the required speed. To choose the best matching I-frame 201, the next ideal point Ip with the distance D may be calculated, and one of the I-frames 201 closest to this ideal point may be chosen to construct a trick-play GOP. In the following step, the next ideal point may again be calculated by increasing the last ideal point by D.
As visualized in Fig. 7, which illustrates trick-play with fractional distances, there are in particular three possibilities for choosing the I-frame 201:
A. The I-frame closest to the ideal point; I = round(Ip)
B. The last I-frame before the ideal point; I = int(Ip)
C. The first I-frame after the ideal point; I = int(Ip)+1
As can clearly be seen, the actual distance varies between int(D) and int(D)+1, the ratio between the occurrences of the two being dependent on the fraction of D, such that the average distance is equal to D. This means that the average trick-play speed is equal to N, but that the actually used frame has a small jitter with respect to the ideal frame. Several experiments have been performed with this, and although the trick-play speed may vary locally, this is not visually disturbing. Usually, it is not even noticeable, especially at somewhat higher trick-play speeds. It is also clear from Fig. 7 that it makes no essential difference whether method A, B or C is chosen.
With this method, trick-play speed N does not need to be an integer but can be any number above the basic speed Nb. Also speeds below this minimum can be chosen, but then the picture refresh rate may be lowered locally because the effective trick-play GOP size T is doubled or at still lower speeds even tripled or more. This is due to a repetition of the trick-play GOPs, as the algorithm will choose the same I-frame 201 more than once.
Fig. 8 shows an example for D=2/3, which is equivalent to N=2/3 Nb. Here, the round function is used to select the I-frames 201 and, as can be seen, frames 2 and 4 are selected twice. In any case, the described method will allow for a continuously variable trick-play speed. For reverse trick-play a negative value is chosen for N. For the example of Fig. 7 this simply means that the arrows 700 are pointing in the other direction. The method described will also include the sets of fixed trick-play speeds mentioned earlier and they will have the same quality, especially if the round function is used. Therefore, it might be appropriate to always implement the flexible method described in this section, whatever the choice of speeds will be.
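The adaptive skipping described above may be sketched as follows. This is an illustration under stated assumptions rather than the implementation: the function names, the starting I-frame index `start`, the use of the floor function for int() (so that negative distances behave consistently), and rounding halves upward for method A are all choices made here, not taken from the description.

```python
import math
from fractions import Fraction
from typing import List


def distance(N, G: int, T: int) -> Fraction:
    """Distance D = N*T/G between ideal points, kept as an exact fraction."""
    return Fraction(N) * T / G


def select_iframes(D: Fraction, count: int, method: str = "A",
                   start: int = 0) -> List[int]:
    """Pick `count` I-frame indices whose average spacing is D.

    method "A": I-frame closest to the ideal point (halves rounded up, assumed)
    method "B": last I-frame before the ideal point, floor(Ip)
    method "C": first I-frame after the ideal point, floor(Ip) + 1
    """
    ideal = Fraction(start)
    chosen = []
    for _ in range(count):
        if method == "A":
            chosen.append(math.floor(ideal + Fraction(1, 2)))
        elif method == "B":
            chosen.append(math.floor(ideal))
        elif method == "C":
            chosen.append(math.floor(ideal) + 1)
        else:
            raise ValueError("method must be 'A', 'B' or 'C'")
        ideal += D  # next ideal point
    return chosen


D = distance(7, G=12, T=4)           # 7x speed, D = 2 1/3
print(select_iframes(D, 6, "A"))     # actual distances alternate between 2 and 3
print(select_iframes(Fraction(2, 3), 7, "A", start=1))  # some frames repeat
```

For D=2/3 and an assumed start index of 1, the second call repeats I-frames 2 and 4, which illustrates how speeds below the basic speed lead to repeated trick-play GOPs.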
Now some aspects related to the refresh rate of the trick-play picture will be discussed.
The term "refresh rate" particularly denotes the frequency with which new pictures are displayed. Although not speed dependent, it will be briefly discussed here because it can influence the choice of T. If the refresh rate of the original picture is denoted by R (25Hz or 30Hz), the refresh rate of the trick-play picture (Rt) is given by:
Rt=R/T (4)
With a trick-play GOP structure of IPP (T=3) or IPPP (T=4), the refresh rate Rt is 8 1/3 Hz or 6 1/4 Hz, respectively, for Europe and 10 Hz or 7 1/2 Hz, respectively, for the USA. Although the judgment of trick-play picture quality is a somewhat subjective matter, there are clear hints from experiments that these refresh rates are acceptable for low speeds and even advantageous at higher speeds.
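The refresh rates quoted above follow directly from the original frame rate divided by the trick-play GOP size, Rt = R/T. A short check, with the function name chosen here for illustration:

```python
from fractions import Fraction


def trick_play_refresh_rate(R: int, T: int) -> Fraction:
    """Refresh rate of the trick-play picture in Hz: Rt = R/T."""
    return Fraction(R, T)


for R in (25, 30):          # original refresh rate in Hz
    for T in (3, 4):        # IPP and IPPP
        print(R, T, float(trick_play_refresh_rate(R, T)))
```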
In the following, some aspects related to encrypted stream environments will be described. Here some information about encrypted transport streams is presented as a basis for the description of trick-play on encrypted streams. It is focused on the Conditional Access System used for broadcast.
Fig. 9 illustrates a conditional access system 900 which will now be described.
In the conditional access system 900, content 901 may be provided to a content encryption unit 902. After having encrypted the content 901, the content encryption unit 902 supplies a content decryption unit 904 with encrypted content 903.
In this specification, ECM denotes Entitlement Control Messages, KMM denotes Key Management Messages, GKM denotes Group Key Messages and EMM denotes Entitlement Management Messages. A Control Word 906 may be supplied to the content encryption unit 902 and to an ECM generation unit 907. The ECM generation unit 907 generates an ECM and provides the same to an ECM decoding unit 908 of a smart card 905. The ECM decoding unit 908 generates from the ECM a Control Word, that is, the decryption information that is needed by and provided to the content decryption unit 904 to decrypt the encrypted content 903. Furthermore, an authorization key 910 is provided to the ECM generation unit
907 and to a KMM generation unit 911, wherein the latter generates a KMM and provides the same to a KMM decoding unit 912 of the smart card 905. The KMM decoding unit 912 provides an output signal to the ECM decoding unit 908.
Moreover, a group key 914 may be provided to the KMM generation unit 911 and to a GKM generation unit 915 which may further be provided with a user key 918. The GKM generation unit 915 generates a GKM and provides the same to a GKM decoding unit 916 of the smart card 905, wherein the GKM decoding unit 916 gets as a further input a user key 917.
Beyond this, entitlements 919 may be provided to an EMM generation unit 920 that generates an EMM signal and provides the same to an EMM decoding unit 921. The EMM decoding unit 921 located in the smart card 905 is coupled with an entitlement list unit 913 which provides the ECM decoding unit 908 with corresponding control information.
In many cases, content providers and service providers want to control access to certain content items through a conditional access (CA) system. To achieve this, the broadcasted content 901 is encrypted under the control of the CA system 900. In the receiver, content is decrypted before decoding and rendering if access is granted by the CA system 900.
The CA system 900 uses a layered hierarchy (see Fig. 9). The CA system 900 transfers the content decryption key (Control Word CW 906, 909) from server to client in the form of an encrypted message, called an ECM. ECMs are encrypted using an authorization key (AK) 910. For security reasons, the CA server 900 may renew the authorization key 910 by issuing a KMM. A KMM is in fact a special type of EMM, but for clarity the term KMM may be used. KMMs are also encrypted using a key that for instance can be a group key (GK) 914, which is renewed by sending a GKM that is again a special type of EMM. GKMs are then encrypted with the user key (UK) 917, 918, which is a fixed unique key embedded in the smart card 905 and known by the CA system 900 of the provider only. Authorization keys and group keys are stored in the smart card 905 of the receiver.
Entitlements 919 (for instance viewing rights) are sent to individual customers in the form of an EMM and stored locally in a secure device (smart card 905). Entitlements 919 are coupled to a specific program. An entitlements list 913 gives access to a group of programs depending on the type of subscription. ECMs are only processed into keys (Control Words) by the smart card 905 if an entitlement 919 is available for the specific program. Entitlement EMMs are subject to an identical layered structure as the KMMs (not depicted in Fig. 9).
In an MPEG2 system, encrypted content, ECMs and EMMs (including the KMM and GKM types) are all multiplexed into a single MPEG2 transport stream.
The description above is a generalized view of the CA system 900. In digital video broadcasting, only the encryption algorithm, the odd/even Control Word structure, the global structure of ECMs and EMMs and their referencing are defined. The detailed structure of the CA system 900 and the way the payloads of ECMs and EMMs are encoded and used are provider specific. Also the smart card is provider specific. However, from experience it is known that many providers follow essentially the structure of the generalized view of Fig. 9.
In the following, DVB Encryption/Decryption topics will be discussed. The applied encryption and decryption algorithm is defined by the DVB standardization organization. In principle two encryption possibilities are defined namely PES level encryption and TS level encryption. However, in real life mainly the TS level encryption method is used. Encryption and decryption of the transport stream packets is done packet based. This means that the encryption and decryption algorithm is restarted every time a new transport stream packet is received. Therefore, packets can be encrypted or decrypted individually. In the transport stream, encrypted and plaintext packets are mixed because some stream parts are encrypted (e.g. audio/video) and others are not (e.g. tables). Even within one stream part (e.g. video) encrypted and plaintext packets may be mixed. Referring to Fig. 10, a DVB encrypted transport stream packet 1000 will be described.
The stream packet 1000 has a length 1001 of 188 Bytes and comprises three portions. A packet header 1002 has a size 1003 of 4 Bytes. Subsequent to the packet header 1002, an adaptation field 1004 may be included in the stream packet 1000. After that, a DVB encrypted packet payload 1005 may be sent.
Fig. 11 illustrates a detailed structure of the transport stream packet header 1002 of Fig. 10.
The transport stream packet header 1002 comprises a synchronization unit (SYNC) 1010, a transport error indicator (TEI) 1011 which may indicate transport errors in a packet, a payload unit start indicator (PLUSI) 1012 which may particularly indicate a possible start of a PES packet in the subsequent payload 1005, a transport priority unit (TPI) 1017 indicating priority of the transport, a packet identifier (PID) 1013 used for determining the assignment of the packet, a transport scrambling control (SCB) 1014 used to select the CW that is needed for decrypting the transport stream packet, an adaptation field control (AFLD) 1015, and a continuity counter (CC) 1016. Thus, Fig. 10 and Fig. 11 show the MPEG2 transport stream packet 1000 that has been encrypted and which comprises different parts:
- Packet header 1002 is in plaintext. It serves to obtain important information such as a packet identifier (PID) number, presence of an adaptation field, scrambling control bits, etc.
- Adaptation field 1004 is also in plaintext. It can contain important timing information such as the PCR.
- DVB Encrypted Packet Payload 1005 contains the actual program content that may have been encrypted using the DVB algorithm.
In order to select the correct CW that is needed to decrypt the broadcasted program, it is necessary to parse the transport stream packet header. A schematic overview of this header is given in Fig. 11. An important field for the decryption of the broadcasted program is the scrambling control bits (SCB) field 1014. This SCB field 1014 indicates which CW the decrypter must use to decrypt the broadcasted program. Moreover, it indicates whether the payload of the packet is encrypted or in plaintext. For every new transport stream packet, this SCB 1014 must be parsed since it changes over time and can change from packet to packet.
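The header parsing described above can be sketched as follows. The bit positions follow the standard MPEG2 transport stream packet header layout; the `TsHeader` type, the field names and the `parse_ts_header` function are illustrative names chosen here. The interpretation of the SCB values in the comments (00 plaintext, 10 even CW, 11 odd CW) is the usual DVB convention and is stated as an assumption, since this description does not list the values.

```python
from typing import NamedTuple


class TsHeader(NamedTuple):
    tei: bool       # transport error indicator
    pusi: bool      # payload unit start indicator
    priority: bool  # transport priority
    pid: int        # 13-bit packet identifier
    scb: int        # 2-bit transport scrambling control
    afc: int        # 2-bit adaptation field control
    cc: int         # 4-bit continuity counter


def parse_ts_header(packet: bytes) -> TsHeader:
    """Parse the 4-byte plaintext header of a 188-byte transport stream packet."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a 188-byte packet starting with sync byte 0x47")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return TsHeader(
        tei=bool(b1 & 0x80),
        pusi=bool(b1 & 0x40),
        priority=bool(b1 & 0x20),
        pid=((b1 & 0x1F) << 8) | b2,
        scb=(b3 >> 6) & 0x03,   # assumed DVB usage: 0 plaintext, 2 even CW, 3 odd CW
        afc=(b3 >> 4) & 0x03,
        cc=b3 & 0x0F,
    )


packet = bytes([0x47, 0x41, 0x00, 0xD7]) + bytes(184)
header = parse_ts_header(packet)
print(header)
```

Because the header is always in plaintext, this parsing can be done on every packet before deciding whether, and with which CW, the payload has to be decrypted.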
In the following, some aspects related to trick-play on fully encrypted streams will be described. The first reason why this is an interesting topic is that trick-play on plaintext and fully encrypted streams are the two extremes of a range of possibilities. Another reason is that there exist applications in which it may be necessary to record fully encrypted streams. Thus, it would be useful to have a technique at hand to perform trick-play on a fully encrypted stream. A basic principle is to read a large enough block of data from the storage device, decrypt it, select an I-frame in the block and construct a trick-play stream with it. Such a system 1200 is depicted in Fig. 12.
Fig. 12 shows the basic principle of trick-play on a fully encrypted stream. For this purpose, data stored in a hard disk 1201 are provided as a transport stream 1202 to a decrypter 1203. Further, the hard disk 1201 provides a smart card 1204 with an ECM, wherein the smart card 1204 generates Control Words from this ECM and sends the same to the decrypter 1203.
Using the Control Words, the decrypter 1203 decrypts the encrypted transport stream 1202 and sends the decrypted data to an I-frame detector and filter 1205. From there, the data are provided to an insert empty P frame unit 1206 which conveys the data to a set top box 1207. From there, data are provided to a television 1208.
Some aspects will be mentioned with respect to the question of what a recording contains.
When making a recording of a single channel, the recording must contain all the data required to play back the channel at a later stage. One could simply record everything on a certain transponder, but this would record far more than is needed to play back the intended program, wasting both bandwidth and storage space. Instead, only the packets really needed should be recorded. For each program this means one must record all the MPEG2 mandatory packets like PAT (program association table), CAT (conditional access table), and obviously for each program the video and audio packets as well as the PMT (program map table) that describes which packets belong to a program. Furthermore, the CAT/PMT may describe CA packets (ECMs) needed for decryption of the stream. Unless the recording is made in plaintext after decryption, those ECM packets have to be recorded as well. If the recording made does not consist of all packets from the full multiplex, the recording becomes a so-called partial transport stream 1300 (see Fig. 13). Further, Fig. 13 illustrates a full transport stream 1301. The DVB standard requires that if a partial transport stream 1300 is played, all normal DVB mandatory tables like NIT (network information table), BAT (bouquet association table) etc. are removed. Instead of these tables, the partial stream should have SIT (selection information table) and DIT (discontinuity information table) tables inserted.
In the following, referring to Fig. 14 to Fig. 63, systems will be described which are capable of processing a data stream according to exemplary embodiments of the invention.
It is emphasized that the systems described in the specification can be implemented in the frame of and in combination with any of the systems described referring to Fig. 1 to Fig. 13.
In the following, some aspects related to dealing with ECMs will be described. Jumping to the next block during trick-play can mean jumping back in the stream. It will be explained that this may be the case not only for trick-play reverse but also for trick-play forward at moderate speeds. The situation for forward trick-play with forward jumps and for reverse trick-play with inherently backward jumps will be explained afterwards. Specific problems may occur due to the fact that data has to be decrypted.
A conditional access system may be designed for transmission. In normal play, the transmitted stream may be reconstructed with original timings. But trick-play may have severe implications for the handling of cryptographic metadata due to changed timings. The data may be compressed or expanded in time due to trick-play, but the latency of the smart card may remain constant.
To create a trick-play stream, the mentioned data blocks may go through a decrypter. This decrypter needs the Control Words used in the encryption process to decrypt the data blocks. These Control Words may also be encrypted and stored in ECMs. In a normal set-top-box (STB), these ECMs may be part of the program tuned to. A conditional access module may extract the ECMs, send them to a smart card, and, if the card has rights or an authorization to decrypt these ECMs, may receive the decrypted Control Words from it. Control Words usually have a relatively short lifetime of, for instance, approximately 10 seconds. The Scrambling Control Bit, SCB 1014, in the transport stream packet headers, may indicate this lifetime. If it changes, the next Control Word has to be used. This SCB change or toggle is indicated in Fig. 14 by a vertical line and with a reference numeral 1402.
Referring to Fig. 14, two different scenarios or stream types may in particular be distinguished:
According to a stream type I shown in a lower row 1401 in Fig. 14, two Control Words (CWs) are provided per ECM.
According to a stream type II shown in an upper row 1400 in Fig. 14, only one Control Word (CW) is provided per ECM.
Fig. 14 illustrates the two data streams 1400, 1401 comprising subsequently arranged periods or segments A, B, C denoted with reference numeral 1403. In the scenario illustrated in the upper row 1400 of Fig. 14, essentially one Control Word per corresponding ECM is provided. In contrast to this, in the lower row 1401, each ECM comprises two Control Words, namely the Control Word relating to the current period or ECM, and additionally the Control Word of the subsequent period or ECM. Thus, there is some redundancy concerning the provision of the Control Words.
During the short lifespan, items of the decryption information may be transmitted several times, so that tuning to such a channel halfway through the lifespan of such a Control Word does not mean waiting for the next Control Word. The conditional access module may only send the first unique ECM it finds to the smart card to reduce or minimize the traffic to the card, as it may have a fairly slow processor.
This shows that there may be a limitation of trick-play on encrypted streams. There may be an implicit upper speed limit, coming from the limited speed of the processing capability of the smart card. In trick-play, the Control Word lifetime of 10 seconds may be compressed or expanded with the trick-play speed factor. Sending an ECM to a smart card and receiving the decrypted Control Words may take approximately half a second. The way Control Words are packed into an ECM may be provider-specific and particularly different for stream type I and stream type II, as depicted in Fig. 14.
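A rough feel for this limit can be obtained from the two figures mentioned above. The sketch below is a back-of-the-envelope estimate only: it assumes that a new Control Word must be obtained once per compressed period and that the smart card latency is the only constraint, which ignores all other system delays. The function names are chosen here for illustration.

```python
def compressed_cw_lifetime(lifetime_s: float, speed: float) -> float:
    """Control Word lifetime as seen in the trick-play stream, in seconds."""
    return lifetime_s / abs(speed)


def speed_limit(lifetime_s: float, latency_s: float) -> float:
    """Speed at which the compressed lifetime equals the smart card latency."""
    return lifetime_s / latency_s


print(compressed_cw_lifetime(10.0, 4))   # lifetime in seconds at 4x speed
print(speed_limit(10.0, 0.5))            # rough upper bound on the speed factor
```

With a lifetime of about 10 seconds and a latency of about half a second, the compressed lifetime drops to the latency at a speed factor of about 20, so under these assumptions the practical limit lies somewhere below that value.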
CW A denotes the CW that was used to encrypt period A, CW B denotes the CW that was used to encrypt period B, and so on. Horizontally, the transmission time axis is plotted. ECM A may be defined as being the ECM that is present during the major part of period A. It can be seen that, in that case, ECM A holds the CW for the current period A and for stream type I additionally for the next period B. In general, an ECM may hold at least the CW for the current period and might hold the CW for the next period. Due to zapping, this may probably be true for all or many providers. Before going on, more information will be provided about a decrypter and how it may handle the CWs. The decrypter may contain two registers, one for the "odd" and one for the "even" CW. "Odd" and "even" does not have to mean that the values of the CWs themselves are odd or even. The terms are particularly used to distinguish between two subsequent CWs in the stream. Which CW has to be used for the decryption of a packet is indicated by the SCB 1014 in the packet header. So the CWs used to encrypt the stream are alternating between odd and even. In Fig. 14 this means that, for instance, CW A and CW C are odd, whereas CW B and CW D are even. After the decryption by the smart card, CWs may be written to the corresponding registers in the decrypter overwriting previous values, as indicated in Fig. 15.
Fig. 15 illustrates the two registers 1501, 1502 containing even CWs (register 1501) and containing odd CWs (register 1502). Further, smart card latency 1500, that is a time needed by the smart card to retrieve or decrypt a CW from an ECM, is illustrated in Fig. 15. In the case of stream type I, each ECM holds two CWs and as a result both registers 1501, 1502 may be overwritten after the decryption of the ECM. One of the registers 1501, 1502 is active and the other is inactive. Which one is active depends on the SCB 1014. In the example, the SCB 1014 will indicate during period B that the even register 1501 is the active one. The active register may only be overwritten with a CW identical to the one it already holds because it is still needed for decryption of the remainder of that particular period. Therefore, only the inactive register may be overwritten with a new value.
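The register behaviour described above may be modelled as follows. This is a simplified sketch: the class name `CwRegisters`, the string values used for the Control Words, and the choice to raise an error when the active register would be changed are assumptions made here for illustration.

```python
class CwRegisters:
    """Toy model of the even/odd Control Word registers of a decrypter."""

    def __init__(self):
        self.regs = {"even": None, "odd": None}

    def write(self, parity: str, cw: str, active: str) -> None:
        """Store a CW; the active register may only be rewritten with the same value."""
        current = self.regs[parity]
        if parity == active and current is not None and current != cw:
            # Assumed policy: refuse to change a CW that is still in use.
            raise ValueError("active register still needed for the current period")
        self.regs[parity] = cw

    def key_for(self, active: str) -> str:
        """Return the CW selected by the SCB of the current packet."""
        return self.regs[active]


# Period B of stream type I: CW A is odd, CW B is even, CW C is odd.
regs = CwRegisters()
regs.regs = {"even": "CW_B", "odd": "CW_A"}
regs.write("even", "CW_B", active="even")  # identical value, allowed
regs.write("odd", "CW_C", active="even")   # inactive register gets the next CW
print(regs.key_for("even"), regs.regs["odd"])
```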
Let us take a closer look at period B in trick-play. Assume that an ECM is sent to the smart card at the start of this period, that is, at the moment the SCB toggle 1402 is crossed. The question is which ECM should then be sent to the smart card. This ECM should hold CW C to ensure a timely decryption by the smart card for usage at the start of period C.
It may also hold CW B without disturbing the correct availability of CWs in the decrypter.
Looking again at Fig. 14, it can be seen that for stream type I this means sending ECM B and for stream type II ECM C at the start of period B. In general, the current ECM can be sent in case it holds two CWs, and one period in advance if it holds only one CW. Sending an ECM one period in advance may be contradictory though to the embedded ECMs, so the latter have to be removed from the stream in that case. For a more generalized approach it may be preferred that the original ECMs are always removed from the stream by the trick-play generation circuitry or software. However, this cannot always be true. Fig. 16 shows ECM handling in a fast forward mode. In a plurality of subsequent periods 1403 separated by SCB toggles 1402, a plurality of data blocks 1600 are reproduced, wherein a switching 1601 occurs between different data blocks.
For stream type I, an ECM B is sent at a border between periods A and B. For stream type II, an ECM C is sent at a border between period A and period B. Furthermore, according to stream type I, an ECM C is sent at a border between period B and period C. For a stream type II, an ECM D is sent at a border between period B and period C.
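The selection rule for fast forward may be sketched as follows. The function name `ecm_to_send` and the numbering of periods (A = 0, B = 1, C = 2, and so on) are assumptions made for illustration, and only the forward direction is covered.

```python
def ecm_to_send(current_period: int, cws_per_ecm: int) -> int:
    """Index of the stored ECM to send when the SCB toggle into
    `current_period` is crossed (forward direction only).

    Periods and ECMs are numbered alike: ECM n is the ECM that is present
    during the major part of period n (hypothetical numbering).
    """
    if cws_per_ecm == 2:
        # Stream type I: the current ECM already holds the CW of the next period.
        return current_period
    if cws_per_ecm == 1:
        # Stream type II: the ECM has to be sent one period in advance.
        return current_period + 1
    raise ValueError("an ECM is assumed to hold one or two CWs")
```

At the border between period A and period B, this yields ECM B for stream type I and ECM C for stream type II, in line with Fig. 16.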
For ECMs to be available for trick-play at the correct moment, the ECMs may be stored in a separate file. In this file it may also be indicated to which period an ECM belongs (which part of the recorded stream). The packets in the MPEG stream file may be numbered. The number of the first packet of a period (SCB toggle 1402) may be stored alongside with the ECM for this same period 1403. The ECM file may be generated during recording of the stream.
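A possible in-memory form of such an ECM file is sketched below. The names `EcmRecord`, `period_index` and `toggle_crossed` are hypothetical; the sketch only assumes that the records are sorted by the packet number of the SCB toggle.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class EcmRecord:
    toggle_packet: int  # number of the first packet of the period (SCB toggle)
    ecm: bytes          # ECM packet data for this period


def period_index(records, packet_number):
    """Index of the record whose period contains `packet_number`."""
    starts = [r.toggle_packet for r in records]
    i = bisect_right(starts, packet_number) - 1
    if i < 0:
        raise ValueError("packet precedes the first recorded SCB toggle")
    return i


def toggle_crossed(records, from_packet, to_packet):
    """True if a jump between the two packets crosses at least one SCB toggle."""
    return period_index(records, from_packet) != period_index(records, to_packet)
```

Because the lookup is by packet number, a crossed toggle is detected in the same way for contiguous playback and for a jump.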
An example of an ECM file is shown in Fig. 4b.
The ECM file is a file that may be created during the recording. In the stream, ECM packets may be located which may contain the Control Words needed to decrypt the video data. Every ECM may be used for a certain period, for instance 10 seconds, and may be transmitted (repeated) several times during this period (for instance 100 times). The ECM file may contain every first new ECM of such a period. The ECM data may be written into this file, and may be accompanied by some metadata. First of all, a serial number (counting up from 1) may be given. As a second field, the ECM file may contain the position of the SCB toggle. This may denote the first packet that can use this ECM to correctly decrypt its content. Then the position in time of this SCB toggle may follow as the third field. These three fields may be followed by the ECM packet data itself of which only the first bytes are depicted in Fig. 4b. The differences between the ECMs are further down the payload and not visible here. Using the SCB toggles stored in the ECM file, it may be easy to detect if such toggle is crossed even if this would be during a jump. To send the correct ECM, it may be required to know whether the ECMs contain one or two CWs. In principle, this is not known because it is provider-specific and secret. However, this can easily be determined experimentally by sending ECMs at various moments and observing the results on the display. An alternative method that is particularly suitable for implementation in the storage device itself is as follows. Send one single ECM to the smart card at the moment of an SCB toggle, decrypt the stream and check for PES headers in the coming two periods. With one PES header per GOP, there are around twenty PES headers in each period. The position of a PES header may be easily detected because a PLUSI bit in the plaintext header of the packet may indicate its presence. 
If correct PES headers are only found during the first period (after the latency of the smartcard), the ECM contains one CW. If they are also found during the second period, it contains two CWs.
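The decision described above may be sketched as follows. The function names are hypothetical. The check for a PES header relies on the 3-byte packet start code prefix 0x000001 with which every PES packet begins; counting the valid headers per period is assumed to be done elsewhere.

```python
PES_START_CODE_PREFIX = b"\x00\x00\x01"


def looks_like_pes_header(payload: bytes) -> bool:
    """True if the decrypted payload starts with the PES start code prefix."""
    return payload[:3] == PES_START_CODE_PREFIX


def cws_per_ecm(valid_in_first_period: int, valid_in_second_period: int) -> int:
    """Number of CWs in an ECM, derived from the number of valid PES headers
    found in the two periods after a single ECM was sent at an SCB toggle."""
    if valid_in_first_period == 0:
        # Without any valid header in the first period the test says nothing.
        raise ValueError("no valid PES header in the first period; inconclusive")
    return 2 if valid_in_second_period > 0 else 1
```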
Such a situation is depicted in Fig. 17. Fig. 17 illustrates a situation for one CW detection and for two CW detection.
As can be seen, different periods 1403 of encrypted content 1700 are provided. With a smartcard latency 1500, an ECM A may be decrypted to generate corresponding CWs. By decrypting the encrypted content 1700, decrypted content 1701 may be generated. Further shown in Fig. 17 are PES headers 1702, namely a PES header A in period A (left) and a PES header B in period B (right).
The area 1703 of period B for one CW in Fig. 17 indicates that the data is decrypted with the wrong key and therefore scrambled. This checking could be done while recording, in which case it will take for instance 20 to 30 seconds. It could also be done offline and, because only two packets indicated by the PLUSIs (one in each period) would have to be checked, it could be very quick. In the unlikely event that adequate PES headers are not available, the picture headers could be used instead. In fact, any known information may be useable for detection. Anyway, a one/two CW indication may be stored in the ECM file.
In the following, some aspects related to dealing with slow-forward streams in particular will be described. For the construction of a slow-forward stream many considerations apply. For example, the construction of a slow-forward stream on elementary stream level can only be performed on fully plaintext data. As a consequence, the slow-forward stream will be fully plaintext, even if the normal play stream was encrypted. Such a situation is unacceptable to a copyright holder. Furthermore, this is worse than in the case of a fast-forward/reverse stream because all information, i.e. each and every frame, is present in plaintext in the slow-forward stream and not just a subset of the frames as is the case for true fast-forward/reverse streams. Therefore a plaintext normal play stream can easily be reconstructed from a plaintext slow-forward stream. So the slow-forward stream should be encrypted if the normal play stream is encrypted. Since a DVB encrypter is not permissible in a consumer device, this can only be realized if the slow-forward stream is constructed on transport stream level using the encrypted data packets from the originally transmitted encrypted data stream.
To be able to construct a slow-forward stream on transport stream level it is first of all necessary that each individual frame is available as a series of transport stream packets. In the case of one PES packet per frame this comes naturally. A PES packet is always contained in a series of transport stream packets because PES and transport stream packets are aligned. In the case of one PES packet per GOP this is only the case for the start of the I-frame. All other frame boundaries are mostly located somewhere inside a packet. This is shown in Fig. 18. Such a packet 1800 therefore contains information from two frames 1801 and 1802. So first it is necessary for this packet 1800 to be split up into two packets 1803 and 1804, the first one 1803 containing the data from the first frame 1801 in the original packet 1800 and the second one 1804 containing the data from the next frame 1802. Each of the two packets 1803 and 1804 resulting from the splitting has to be stuffed, for instance, with an adaptation field AF 1805 and 1806.
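The splitting and stuffing step may be sketched as follows, under simplifying assumptions: the original packet is plaintext and carries payload only (no adaptation field), and the byte offset of the frame boundary within the payload is known. The function names are hypothetical. Renumbering the continuity counters of the packets that follow in the stream is left out.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def build_packet(header3: bytes, continuity: int, payload: bytes) -> bytes:
    """Build one 188-byte packet, stuffing a short payload with an adaptation field."""
    if not 1 <= len(payload) <= 184:
        raise ValueError("payload must be 1..184 bytes")
    if len(payload) == 184:
        afc, af = 0b01, b""                      # payload only, no stuffing needed
    else:
        afc = 0b11                               # adaptation field followed by payload
        af_len = 183 - len(payload)              # value of adaptation_field_length
        # One flags byte (all zero) followed by 0xFF stuffing bytes, if any.
        af = bytes([af_len]) + (b"\x00" + b"\xff" * (af_len - 1) if af_len else b"")
    byte3 = (afc << 4) | (continuity & 0x0F)     # scrambling control bits stay 00
    return header3 + bytes([byte3]) + af + payload


def split_packet(packet: bytes, boundary: int):
    """Split a plaintext packet at payload offset `boundary` into two stuffed packets."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a transport stream packet")
    if packet[3] >> 6:
        raise ValueError("packet must be plaintext")
    if (packet[3] >> 4) & 0x3 != 0b01:
        raise ValueError("this sketch handles payload-only packets")
    payload = packet[4:]
    if not 0 < boundary < len(payload):
        raise ValueError("boundary must lie inside the payload")
    cc = packet[3] & 0x0F
    first = build_packet(packet[:3], cc, payload[:boundary])
    second = build_packet(packet[:3], cc + 1, payload[boundary:])
    return first, second
```

Both resulting packets keep the PID and the other header bits of the original packet; the data of each frame ends up at the end of its own packet, preceded by the stuffing.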
The splitting of packets is clearly no problem for a plaintext stream. A first option would be to fully decrypt the normal play data as is depicted in Fig. 19. The decryption in slow- forward mode of a stored fully encrypted stream or a stored hybrid stream is no problem because no stream data is skipped or duplicated in the stream to the decrypter. The complete stored stream is simply fed at a lower than normal rate through the decrypter, which also means that there would be no problems with the embedded ECMs. The plaintext stream coming from the decrypter can then be used to split the packets or in fact to perform any necessary stream manipulation. But the resulting slow- forward stream is, of course, always a plaintext stream in this case. The construction of an encrypted slow- forward stream from an encrypted normal play stream has to be performed on transport stream level because the use of a DVB encrypter in consumer devices may not be allowed. For this a hybrid stream, as shown in Fig. 20, with only a few plaintext packets 2001 on all frame boundaries is preferable. The other plaintext packets 2000 and encrypted packets 2002 are left unmodified. Such a stream could be generated on the playback side of the storage device if the stored stream is fully encrypted. In this case the decrypter in Fig. 19 is a selective type that only decrypts the necessary packets. But preferably the stream is already stored on a storage device 2100, for example, a hard disk drive as a hybrid stream which may be sent to a decryptor 2101 as is indicated in Fig. 21. The plaintext packets in the hybrid stream should now also allow for the splitting of packets containing data from two frames. However, even with a hybrid stream some part of the sequence header code or picture start code can still be located in an encrypted packet. In this case an ideal splitting is not possible. In fact the split is made between the encrypted and plaintext packet. 
For a frame-based slow-forward, types of concatenation other than the concatenation of empty P-frames to I-frames also have to be considered, for example, the concatenation of B-frames to B-frames. This calls for a gluing algorithm at these frame boundaries, as is clarified in Fig. 22. In the example there is only one byte of the picture start code at the start of the B-frame 2200 and one byte of the picture start code at the end of the B-frame 2201. As a result two bytes are missing at the concatenation point. These two bytes have to be inserted in the gluing area 2202. For this gluing it has to be known how the 4-byte MPEG picture start code is split.
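The content of the gluing area may be derived as in the following sketch. It assumes that the available frame data ends with the first bytes of a picture start code and starts with the last bytes of a picture start code, and that the numbers of these bytes are known; the function name `glue_bytes` is hypothetical. The MPEG-2 video picture start code is 0x00000100.

```python
PICTURE_START_CODE = b"\x00\x00\x01\x00"  # MPEG-2 video picture_start_code


def glue_bytes(bytes_at_end: int, bytes_at_start: int) -> bytes:
    """Start code bytes missing between two concatenated copies of a frame.

    bytes_at_end:   leading start code bytes present at the end of the frame data
    bytes_at_start: trailing start code bytes present at the start of the frame data
    """
    if bytes_at_end < 0 or bytes_at_start < 0 or bytes_at_end + bytes_at_start > 4:
        raise ValueError("the two parts cannot exceed the 4-byte start code")
    return PICTURE_START_CODE[bytes_at_end:4 - bytes_at_start]
```

For the example of Fig. 22, with one byte on either side, the two middle bytes 0x00 0x01 are returned.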
In slow-forward mode the decoder has somehow to be forced to repeat the display of a picture in accordance with the slow motion factor. Empty P-frames can be used to force the repetition of a picture resulting from an I-frame. The same technique can also be applied for pictures resulting from P-frames. This technique cannot be applied for B-frames though because empty P-frames always point to an anchor frame being an I- or P-frame. This is in fact the case for any type of empty frame. So the repetition of a picture resulting from a B-frame has to be realized in another way. The only guaranteed method is to repeat the B-frame data itself. Since the repeated B-frames point to the same anchor frames as the original B-frame the resulting pictures will be identical. The amount of data for a B-frame is much more than for an empty P-frame but in general it is still significantly less than for an I-frame. Anyway, the transmission time is also multiplied with the slow motion factor so at least on average there need not be an increase in bit rate.
The empty frames used to force the repetition of pictures resulting from I- or P-frames can be of the interlace kill type thus reducing interlace artifacts for these pictures. But such a reduction is not possible for pictures resulting from the B-frames because the repetition is not forced by an empty frame but by a repetition of the B-frame data itself. So the B-frames will always have the original interlace effects. If interlace kill would be used for the I- and P-frames this might look inconsistent because pictures with and without interlace effects are sequentially present in the stream of displayed pictures. It is therefore better to use only empty frames without interlace kill to construct the slow-forward stream. One might expect that the repetition of the I- and P-frames should be enforced by the insertion in the transmission stream of empty P-frames after the original I- or P-frame. This method may be used for fast-forward/reverse streams consisting of I-frames followed by empty P-frames. However this method is not correct for a stream that also includes B-frames, as is the case for a slow-forward stream constructed from a stored transmission stream with B-frames. Due to the reordering from transmission stream to display stream the I- and P-frames will be repeated in the wrong position thus disturbing the normal display order of the frames. This will be clarified by means of Fig. 23 and Fig. 24.
Fig. 23 depicts the effect of reordering in normal play. The top line 2300 shows a normal play transmission stream with a GOP size of 12 frames consisting of I- 2302, P- 2304 and B-frames 2303. The first four frames of the next transmission GOP are also shown here for clarity. The bottom line 2301 shows the stream after reordering to the display order. The index indicates the display frame order. According to pages 24 and 25 of the MPEG-2 standard, ISO/IEC 13818-1 : 1995(E), the reordering is performed as follows:
• B-frames keep their original position;
• Anchor frames are shifted to the position of the next anchor frame.
Let us now look more closely at how the slow-forward stream is constructed from this normal play stream. The top line 2400 in Fig. 24 shows the transmission order of the first part of the slow-motion stream for this case, assuming a slow motion factor of 3. Empty P-frames 2404 are inserted after the I-frames 2403 and P-frames 2405, and the B-frames 2406 are repeated. The middle line 2401 shows the effect of the reordering. The bottom line 2402 shows how the I-frames 2403 are repeated 2407 and 2408 by the empty P-frames 2404 in this case. The same repetition occurs for the P-frames 2405. An empty P-frame 2404 results in a displayed picture that is a copy of the picture resulting from the previous anchor frame, which itself could also be an empty P-frame. It is clearly visible that the normal display order indicated by the index is disturbed because the display of frame 14 is split up into two parts. Only the last time that frame 14 is displayed (2408) is it in the correct position. This also means that all the B-frames are decoded erroneously. Therefore this is not the correct way to generate a slow-forward trick-play stream. In fact there are several possibilities to solve this problem. A first one is shown in Fig. 25. Here the empty P-frames 2404 are inserted before the anchor frames 2403 and 2405 in the transmitted stream extracted from the storage device as is shown in the top line 2500. In the reordered stream, shown in the middle line 2501, the empty P-frames 2404 are now after the anchor frames 2403 and 2405. This is where they should be for a correct repetition of the anchor frames as is clear from the bottom line 2502. There are however some disadvantages in the use of empty P-frames. The first one is related to the propagation of errors within a GOP. P-frames depend on the previous anchor frame and B-frames depend on the surrounding anchor frames.
A data error during the transfer to the STB results in decoding errors and therefore disturbances in the picture. If this error is in an anchor frame it propagates until the end of the GOP because subsequent P- frames depend on this anchor frame. Also the B-frames are affected because they use the pictures from the disturbed surrounding anchor frames for their decoding. So the picture disturbances gradually increase towards the end of the GOP. This is especially important for slow- forward where the GOP size can be very large and therefore very long in time. On the other hand a data error in a B-frame has only a very limited effect because no other frames depend on it. So the picture disturbances are restrained to this B-frame and its repetitions. One could argue that data errors should not occur on a digital interface but there is a second disadvantage in the use of empty P-frames. If these are of the interlace kill type they change the decoded picture by nature resulting in decoding errors for the subsequent frames. So interlace kill is not possible.
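The reordering rules and the effect of the position of the empty P-frames described above can be reproduced with a small simulation. The sketch is a simplified model, not a decoder: frames are hypothetical (type, display index) pairs, "Pe" denotes an empty P-frame, and the frame indices do not correspond to those of the figures.

```python
def slow_forward(tx, factor, empty_before):
    """Repeat B-frames and add empty P-frames before or after each anchor frame."""
    out = []
    for frame in tx:
        if frame[0] == "B":
            out += [frame] * factor
        else:
            empties = [("Pe", None)] * (factor - 1)
            out += empties + [frame] if empty_before else [frame] + empties
    return out


def resolve_empty(tx):
    """An empty P-frame yields a copy of the picture of the previous anchor frame."""
    resolved, last = [], None
    for kind, label in tx:
        if kind == "Pe":
            label = last
        if kind != "B":
            last = label  # empty P-frames are anchor frames themselves
        resolved.append((kind, label))
    return resolved


def reorder(tx):
    """B-frames keep their position; anchors move to the slot of the next anchor."""
    out, held = [], None
    for frame in tx:
        if frame[0] == "B":
            out.append(frame)
        else:
            out.append(held)
            held = frame
    return out


def displayed(tx):
    """Display indices in display order (pictures of unknown origin are omitted)."""
    order = reorder(resolve_empty(tx))
    return [f[1] for f in order if f is not None and f[1] is not None]
```

For the transmission order I2 B0 B1 P5 B3 B4 and a slow motion factor of 3, empty P-frames placed after the anchor frames show picture 2 twice before pictures 0 and 1 and once after them, whereas empty P-frames placed before the anchor frames give each picture three times in the normal order.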
It is also indicated that several types of empty B-frames can be constructed. They have the advantage that no additional error propagation is introduced and that interlace kill can be used. The most important types of empty B-frames for our discussion are the forward and backward predictive empty B-frames. We will call them respectively Bf- and Bb-frames. The terms forward and backward predictive might be confusing to the reader.
First of all a B-frame is normally bi-directionally predictive, but unidirectional predictive B- frames can also exist. In the latter case they can be forward or backward predictive. Forward predictive means that an anchor frame is used to predict the following B-frames during encoding. So the picture resulting from a forward predictive B-frame is reconstructed during decoding from the previous anchor frame. This means that the Bf- frame forces the repetition of the previous anchor frame. Therefore it has the same effect as an empty P or Pe-frame. It will be clear that the Bb-frame has the opposite effect. It forces the display of the anchor frame following it. For both types of empty B-frames an interlace kill version also exists. Empty B-frames can be used for the construction of a slow- forward stream. The first possibility on the basis of Bb-frames is depicted in Fig. 26. The Bb-frames 2603 are inserted before the anchor frames 2403 2405 in the transmitted stream, as shown in the top line 2600 and keep their position during the reordering as shown in the middle line 2601. The anchor frames 2403 2405 are shifted to the position of the next anchor frame as is also shown in the middle line 2601. The Bb-frame 2603 forces the display of the anchor frame following it in the reordered stream as shown in the bottom line 2602.
However, the preferred option is the use of Bf-frames as depicted in Fig. 27. They are inserted after the anchor frames 2403 and 2405 in the transmission stream as shown in the top line 2700. The reordered stream is shown in the middle line 2701. The repeated display, as shown in the bottom line 2702, of the anchor frames 2403 and 2405 in the reordered stream is forced by the Bf-frames 2703 that follow them. It is clear that the use of Bf-frames 2703 is very similar to the use of empty P-frames 2404 for the construction of fast-forward and fast-reverse streams. In fact the use of Bf-frames 2703 is also possible in that case, thus using common measures for trick-play generation, which is a further benefit. However, when Bf-frames 2703 are used for fast-forward and fast-reverse we have to consider the effect of reordering. This means that some parameters in the fast-forward/reverse stream like the PTS/DTS and the temporal references have to be chosen differently.
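The two constructions with empty B-frames can be checked with the following sketch. It is a simplified model under stated assumptions: frames are hypothetical (type, display index) pairs, a Bf-frame shows the anchor picture displayed most recently, and a Bb-frame shows the anchor that has been received but not yet displayed.

```python
def slow_forward(tx, factor, empty="Bf"):
    """Bf-frames are inserted after, Bb-frames before each anchor frame."""
    out = []
    for kind, label in tx:
        if kind == "B":
            out += [(kind, label)] * factor
        elif empty == "Bf":
            out += [(kind, label)] + [("Bf", None)] * (factor - 1)
        else:
            out += [("Bb", None)] * (factor - 1) + [(kind, label)]
    return out


def displayed(tx):
    """Display indices in display order (pictures of unknown origin are omitted)."""
    shown = []
    held = None     # anchor received but not yet displayed
    current = None  # anchor picture displayed most recently
    for kind, label in tx:
        if kind == "B":
            shown.append(label)
        elif kind == "Bf":
            shown.append(current)  # forward prediction: repeat the previous anchor
        elif kind == "Bb":
            shown.append(held)     # backward prediction: show the following anchor
        else:
            current = held         # the held anchor takes the slot of the new one
            shown.append(current)
            held = label
    return [x for x in shown if x is not None]
```

In this model both constructions give each picture the number of times set by the slow motion factor, in the normal display order.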
A further complication arises with the handling of ECMs in combination with the creation of a slow- forward stream that is not trivial. Extra attention has to be paid to the conditions applied to encrypted data streams.
We start with the assumption that a slow-forward stream is a normal play stream that is transferred to the decoder at a lower than normal speed. In this case, the latency of the smart card relative to the data is lowered by a factor equal to the slow motion factor L. Current conditional access systems will operate properly even with a smart card latency of zero. This means that the original embedded ECMs can be used in their original positions. So no special ECM handling is needed in this case.
However, the actual slow-forward stream deviates on some points from the above assumption. The first point is that measures are taken to enforce the repetition of the frames. This means that the slow-forward stream is not just a stretched normal play stream, but that additional data is added. The repetition of the anchor frames is realized by the insertion of empty frames. These do not contain any ECMs and therefore the normal processing order is not disturbed. So the original ECMs can still be used in this case. On the other hand, the repetition of the B-frames is enforced by the repetition of the B-frame data. These may contain ECMs and the normal ECM order is disturbed if a toggle of the table ID 2802 and 2803 is present within the B-frame as is depicted in Fig. 28 at point 2800. A repetition of such a B-frame leads to a high rate of ECMs being sent to the smart card. This is because in the known decryption systems a toggle in the table ID is used to filter decryption messages: messages with the same table ID are filtered out to reduce the rate at which decryption messages are sent to the smart card. In the present example, a table ID toggle is found at the start 2801 of the repeated B-frames and at the original toggle location 2800 somewhere along these frames. The high rate of ECMs can cause the decryption system to become overloaded and fail since the rate of different ECMs being sent to the decrypter is higher than in the encrypted data stream in its original form.
Therefore, a slow-forward trick-play stream contains not only the repeated ECMs that were present in the original encrypted data stream to improve the channel-changing response, but also ECMs that have been repeated subsequently, during the processing of the trick-play stream. Another effect has to do with the presence of Control Words in the decrypter, which influences the decryption of the stream. We start with a stream of type I, where there are two Control Words per ECM. The Control Words resulting from the ECM with a toggled table ID will destroy the Control Word needed to decrypt the first part of the B-frame. This is depicted in Fig. 29. If we assume that ECMs are discarded as long as the smart card is busy, this does not have any effect if the repetition of the B-frames 2900 is shorter than the latency of the smart card 2901. But with a small latency 2901 and a high slow motion factor it can lead to the loss of data from the first part of a repeated B-frame 2900, since a portion of the repeated B-frame data 2902 can no longer be decrypted. This is because a required control word has been overwritten in the registers of the decrypter.
Both of these problems, being the high ECM rate and the loss of data, are preferably solved by discarding the ECMs 3003 from a series of repeated frames 3000 and 3001 except from the last frame 3002 as is shown in Fig. 30a. So only the last 3002 of a series of identical B-frames 3000, 3001 and 3002 will contain the original ECMs 3003. In this case there is only one table ID toggle 2800 in the series of identical B-frames 3000, 3001 and 3002 at a moment that the Control Word for the decryption of the start of the B-frame is no longer needed. The ECMs 3003 do not take part in the repetition process, thus leading to a normal order of the ECMs with a correct timing.
In Fig. 30b the output of a further advantageous embodiment is illustrated. In the repeated B-frames 3000 and 3001 the original ECMs 3003 are only repeated up to the table ID toggle 2800, i.e. the pre-toggle original ECMs 3004 are kept and the post-toggle ECMs after the table ID toggle 2800 are removed 3005 in a filtering process. For the final repeated B-frame 3002 all original ECMs are inserted, including the post-toggle ECMs 3006 after the table ID toggle 2800. This is advantageous to increase the speed of response of the system after channel changing, or zapping. The inverse process can be applied for situations in which the first B-frame of a repeated sequence comprises all ECMs; the following repeated B-frames would then have the pre-toggle ECMs 3004 removed and only comprise the post-toggle ECMs 3006.
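The two filtering variants of Fig. 30a and Fig. 30b may be sketched as follows. The packet model is hypothetical: a frame is a list of ("V", n) video packets and ("ECM", table_id) packets, the table ID values 0x80 and 0x81 are used as the two alternating values, and at most one table ID toggle per frame is assumed.

```python
def repeat_b_frame(frame, factor, keep_pre_toggle=False):
    """Repeat a B-frame `factor` times; only the last copy keeps all ECMs.

    keep_pre_toggle=False: earlier copies carry no ECMs at all (as in Fig. 30a).
    keep_pre_toggle=True:  earlier copies keep the ECMs located before the
                           table ID toggle (as in Fig. 30b).
    """
    # Table ID of the ECMs located before the toggle, if the frame has any ECM.
    first_id = next((p[1] for p in frame if p[0] == "ECM"), None)
    out = []
    for copy in range(factor):
        last_copy = copy == factor - 1
        for packet in frame:
            if packet[0] != "ECM" or last_copy:
                out.append(packet)
            elif keep_pre_toggle and packet[1] == first_id:
                out.append(packet)
    return out


def count_toggles(packets):
    """Number of table ID changes seen in the ECMs of a packet sequence."""
    ids = [p[1] for p in packets if p[0] == "ECM"]
    return sum(1 for a, b in zip(ids, ids[1:]) if a != b)
```

In both variants the series of identical B-frames contains a single table ID toggle, located in the last copy, whereas an unfiltered repetition would contain one toggle per copy plus one at the start of each further copy.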
Fig. 31 shows a device 3100 capable of performing the required processing on the encrypted data stream 3107 to solve the problems mentioned above. A repeated portion detection unit 3101 detects the ECMs contained within the encrypted data stream that have been modified from their original form 3106 by a trick-play generator 3105. Such a modification may be performed to generate, for example, a slow-forward trick-play stream by replicating B-frame data and the ECMs corresponding to the positions in time of the replicated B-frame data. The detected ECMs are communicated to a selection unit 3102 which identifies at least one of the ECMs as having been repeated subsequent to the creation of the original encrypted data stream 3106 transmitted by the original content provider, i.e. the only entity that actually knows how the conditional access system is truly constructed and the requirements upon ECMs to guarantee correct operation of the conditional access system. The selection unit 3102 may comprise an input 3104 for identifying sections of the encrypted data stream that have been repeated. This input is preferably connected directly to a trick-play generator 3105 that inserts the repeated sections into the encrypted data stream. The selection unit 3102 then selects encryption messages or ECMs to be deleted. A deletion unit 3103 deletes the selected encryption messages from the encrypted data stream.
In a further device as shown in Fig. 32 the device 3100 may also contain a repetition detection unit 3200. The repetition detection unit 3200 is capable of analyzing the incoming encrypted data stream and identifying sections of the stream that have been repeated at a time later than the creation of the original encrypted data stream. The repetition detection unit 3200 may comprise a decrypter as known to the skilled person to completely decrypt the incoming data stream and analyze it for repeated frames and a comparator also known from the prior art to perform such an analysis on the decrypted data stream. Alternatively, the repetition detection unit 3200 may comprise a simple table ID toggle detector that monitors the table ID in the incoming data stream and generates a toggle signal when the table ID changes.
In a further device as shown in Fig. 33 the device 3100 may also contain an analyzer unit 3300. The analyzer unit 3300 may be connected to the repeated portion detection unit 3101 and be capable of receiving information 3301 about the characteristics of the encryption messages, or ECMs, contained within the encrypted data stream. Exemplary forms for such characteristics are the toggles of the table IDs comprised within the encrypted data stream. It has been found preferable to use the timing characteristics of the ECMs as a suitable characteristic to analyze. The analyzer unit 3300 can then use timing characteristics resulting from the detected ECMs to identify portions of the encrypted data stream that have been repeated since the creation of the encrypted data stream in its original form. Such a device, like that of Fig. 32, does not require explicit information to be given to the selection unit and therefore the devices of Fig. 32 and Fig. 33 can operate independently of the trick- play unit 3105. The information on repeated sections contained within the encrypted data stream is then passed on to the selection unit 3102 via the input 3104 for receiving information on repeated sections of the encrypted data stream.
This method will also solve the high ECM rate problem for a stream with only one Control Word per ECM. The Control Word problem is a little bit different though. There is no risk that the Control Word needed to decrypt the first part of a repeated B-frame will be destroyed. But a special effect occurs if the table ID toggle and SCB toggle are in one and the same B-frame. In practice this is not to be expected to occur frequently because the distance between these two is often larger than the GOP size due to the latency of the smart card. So the method given earlier for two Control Words per ECM will generally also work for systems with one Control Word per ECM. But the case that the toggles of table ID and SCB are in one and the same B-frame will be considered anyway for the less frequent occurrences.
Fig. 34 depicts this infrequent situation. The single Control Word 3400 resulting from the ECM with a toggled table ID 3403 is necessary to decrypt the part of the B-frame 3402 after the SCB toggle 3401. So if only the last frame 3404 of a series of identical B-frames contains ECMs 3003, the last part of the earlier B-frames 3405 of the series cannot be decrypted correctly.
Maintaining the original ECMs 3003 in the first frame 3500 of a series of identical B-frames and discarding them from the remaining frames 3405 can easily correct this. This situation is depicted in Fig. 35. However, ECMs only in the first frame 3500 is less preferable for systems with two Control Words per ECM, for example, if one of the Control Words would be required to be used immediately. Fortunately, such two Control Words per ECM systems generally send the necessary Control Word one complete cryptographic period in advance. So for one Control Word per ECM systems, only the first B-frame of a series of identical B-frames should contain the ECMs if one wants to consider this less frequent situation. Another point that deviates from a simply stretched normal play stream is the data rate. Although the average data rate of the slow- forward stream is certainly lower than for the normal play stream, this is not the case for the data rate of compressed frames.
Compressed frames can result from the repetition of the B-frame data. One might expect that the duration of B-frames will normally be less than one frame time. On average this is true but occasionally the transmission time of a B-frame can be larger than one frame time. In a measurement on the television channel ZDF lasting roughly 30 seconds a B- frame was detected of 1.4 frame times. This measurement is depicted in Fig. 36. The average B-frame data length equals 0.6 frames, but regularly the duration of the B-frame data is larger than one frame time.
The positioning of the packets of B-frames by means of a correction of their time stamp with the time stamp offset will lead to a correct result as long as the duration of the B-frame is smaller than one frame time. But if a B-frame in the slow- forward stream is larger than one frame, the end of it will overlap with the subsequent frame because the start of the frames is placed with a distance of one frame. The situation for a B-frame larger than one frame time is clarified in Fig. 37. This is true for all of the repeated B-frames 3700 except the last B-frame 3701 of the repeated section because the last repeated B-frame 3701 would never overlap with the subsequent frame 3702, since it did not originally overlap with the subsequent frame 3702 in the encrypted data stream in its original form. The type of the previous and next frame has no influence on this effect. So there can be an anchor frame, a B- frame or even an empty frame.
This means that all the B-frames of a series of identical B-frames 3700 except the last one 3701 have to be compressed in time, these are shown as compressed B-frames 3703 in Fig. 37. This compression can increase the local bit rate even to a level above the maximum bit rate of the total normal play stream. To limit this increase as much as possible, the packets of the B-frame are evenly distributed over the available frame time. The time stamp of the first packet of a B-frame can be calculated with offset rules to evenly distribute the frames over time.
One could imagine that the method of equal packet distribution for the B- frames should be used in all cases and not only if compression is needed. But in most cases this means that the B-frame is expanded. The application of a time stamp offset to the first packet of a B-frame means that the distance of this packet to the DTS is equal to the normal play stream. The expansion would then result in a smaller time distance than original between the end of the B-frame data and the corresponding DTS. But it can easily be understood that the DTS of a frame can never be earlier than one frame time after the start of the frame data. The reasoning is as follows. The DTS of a frame in the original stream is by definition always one frame time later than the DTS of the previous frame. The DTS of this previous frame can never be earlier than the end of the data of this frame and therefore never before the start of the data of the current frame. This means that the DTS of an arbitrary frame is at least one frame time later than the start of the data for this frame. This also means that the DTS is always after the end of the frame data, even if this data is evenly distributed in one frame time. So the described equal packet distribution should be applied to all B- frames except the last repeated one. For simplicity, a compressed as well as expanded frame will both be named a compressed frame in the remainder of this specification.
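The equal packet distribution may be sketched as follows. The function name and the use of integer 27 MHz clock ticks for the time stamps are assumptions made for illustration; the offset rules for the first packet are not modelled.

```python
def distribute_packets(frame_start: int, frame_time: int, n_packets: int) -> list[int]:
    """Time stamps that spread `n_packets` evenly over one frame time.

    frame_start: time stamp of the first packet of the frame (in clock ticks)
    frame_time:  duration of one frame in the same ticks
    """
    if n_packets <= 0:
        raise ValueError("a frame consists of at least one packet")
    # Integer arithmetic keeps every time stamp on the clock grid and
    # strictly before the start of the next frame.
    return [frame_start + (i * frame_time) // n_packets for i in range(n_packets)]
```

At 25 frames per second one frame time equals 1 080 000 ticks of a 27 MHz clock, so a frame of four packets receives time stamps that are 270 000 ticks apart. Because the last packet always lies before the start of the next frame, a compressed B-frame does not overlap with the subsequent frame.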
Gluing is only necessary between the B-frames of an identical series of B-frames. So a possible additional gluing packet will only be added to the end of a compressed B-frame and never anywhere else. An additional PCR packet is added to the end of the B-frames except to the end of the last repeated B-frame because there is no room at this point. This again means that the additional PCRs are only added at the end of compressed B-frames. So no special placement algorithm is necessary for these packets because they are all included in the compression algorithm.
A consequence of the compression of B-frames is that the earlier described correction of the value of a PCR within the frame data is no longer correct for such a B-frame.
The compression of B-frames as described above leads to a smaller time gap between the table ID toggle and the SCB toggle, which is especially critical for one Control Word per ECM systems where the time distance between these two is matched to the latency of the smart card. Therefore, the solution given for the less frequent case does not work properly in all situations. It therefore remains preferable that, in most circumstances, the ECMs are present only in the last frame of a series of identical B-frames.
In the following, some aspects related to dealing with still-picture streams in particular will be described.
Still picture mode is a trick-play mode where we deal with the freezing of a picture on the display screen. Preferably, the user would like the picture on the screen to be frozen at the moment he pushes the still picture button. This is easily accomplished in a box that also contains the decoder. At that moment the decoding is stopped and the picture in the active picture memory is repeated indefinitely. A more complicated aspect is in dealing with a remote decoder connected to a trick-play engine via a digital interface with a standardized signal, such as MPEG, for example.
In the following it is assumed that the storage device 2100 and decoder 2101, first introduced in Fig. 21, are in separate boxes connected by a digital interface, and that the still picture function is part of the storage device. In this case we cannot prevent the display of pictures that have already been sent to the decoder 2101 but were not yet displayed. At the moment the still picture button is pressed, the current picture on the screen will not be frozen but some future picture. This introduces a visible latency between the action, i.e. the pushing of the button, and the reaction, i.e. the freezing of the picture. The amount of latency depends on the delay of the presentation, as indicated by the PTS, with respect to the transmission of the corresponding frame data. It is noted, however, that the presentation moment indicated by the PTS is not necessarily the real moment of presentation. According to the MPEG-2 standard, the DTS and PTS of a B-frame are identical. This would imply a zero decoding time because a picture cannot be displayed, i.e. presented, before it is decoded. The MPEG-2 standard, in fact, assumes that the decoder/renderer compensates for the decoding time it needs, thus introducing an additional delay on presentation.
The distance of the DTS to the frame data is given by the original broadcast and is not altered for the slow-forward stream constructed. The PTS and DTS are identical for B-frames. So for these frames the reaction time is identical when switching to still picture from normal play or slow-forward. But the distance measured in original frames is, in fact, reduced by the slow motion factor. This means that the still picture is closer to the actual displayed picture for larger slow motion factors.
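The relation between reaction time and the distance in original frames can be shown with a short calculation. This is only a sketch under stated assumptions: the function name and parameters are hypothetical, the decoding time is taken as an upper bound of one frame time, and the figures of 10 frame times and 40 ms are used purely as illustrative input values.

```python
def still_picture_latency(pts_distance_frames, decode_frames,
                          frame_time_ms, slow_motion_factor):
    """Estimate the still picture latency.

    Returns a tuple (reaction_time_ms, distance_in_original_frames).
    The reaction time does not depend on the slow motion factor, whereas
    the distance expressed in original frames is divided by it.
    """
    reaction_time_ms = (pts_distance_frames + decode_frames) * frame_time_ms
    distance_in_original_frames = pts_distance_frames / slow_motion_factor
    return reaction_time_ms, distance_in_original_frames


# Illustrative values: PTS distance of 10 frame times of 40 ms,
# decoding time of at most one frame time.
normal_play = still_picture_latency(10, 1, 40, 1)   # (440, 10.0)
slow_by_3 = still_picture_latency(10, 1, 40, 3)     # 440 ms, about 3.3 frames
```

With these inputs the reaction time is the same 440 ms in both cases, while the frozen picture lies ten original frames ahead in normal play and only about three and a third original frames ahead at a slow motion factor of three.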
One might expect that the distance of the PTS/DTS with respect to the start of the B-frame data is relatively small. Measurements indicate, however, that the distance is roughly 10 frame times of 40 ms in a measurement on ZDF, as shown in Fig. 38. The delay of 10 frame times combined with an expected decoding time of less than one frame time results in an expected reaction time better than half a second. This also means that, in this case, the displayed picture is frozen for slow motion factors above ten times and that the fourth original picture after the displayed picture is frozen for a slow motion factor of three times. These numbers seem to be acceptable even in the case of such an unexpectedly large distance of the PTS/DTS to the frame data. Since one has no control over the delays, one can only consider the moment the still picture button is pressed relative to the transmission of frames to the decoder. Actions necessary to generate a still picture will then be taken as soon as possible after the button is pressed. One should distinguish between still pictures generated from a B-frame and from an anchor frame, which here means the original frames and not any empty frames that may have been inserted into the stream. For anchor frames the situation is somewhat more complex due to the reordering and increased distance of the PTS in the slow-forward stream used to generate the still picture stream.
The generation of a still picture from a B-frame will now be described. First we assume that the still picture command is received during the transmission of a B-frame. This can also be a repeated B-frame, i.e. fully repeated B-frame data with ECMs, if we switch to still picture from the slow-forward mode. We will call this frame the current B-frame. Preferably, the picture resulting from this frame is frozen on the display screen. In this way the user sees the earliest possible frame paused on the screen.
To achieve this, the data of the current B-frame has to be repeated indefinitely. So the slow-forward processing with an infinite slow motion factor is used from this point onwards. The calculation of the temporal reference, PTS, DTS and PCRs for the additional B-frames can be accomplished, but there can be a further problem with the placement of the first of these B-frames on the time axis of the stream. This is the situation, depicted in Fig. 39, in which the current B-frame 3901 occupies more than one frame time in the stream. In the slow-forward stream 3903 this is only possible for the last repeated B-frame 3901, because the previously repeated B-frames 3703 are time compressed. This problem can also occur in normal play, but here it can occur for any B-frame, since any B-frame in the normal play stream may last longer than the frame time. Since the first additional frame 3902 would normally be placed at a distance of one frame time with respect to the start of the current B-frame 3901, this would lead to a data overlap in this case. Usually this is solved by a time compression of the oversized B-frame, but this is not possible for the current B-frame 3901 because its transmission had already started before the switch to still picture mode 3900 was requested. This means that the first packet 3904 after the current frame should be positioned differently. In fact, its position should be equal to the position of the first packet that would normally follow the current B-frame 3901. The subsequent frames 3905 are again placed at frame time distances, as usual. This would not lead to overlap problems if the repeated current B-frames are compressed to one frame time by the slow-forward processing described earlier. Another problem, illustrated in Fig. 40, originates from the ECM processing and was touched upon earlier in the discussion of slow-forward processing of an encrypted data stream.
More specifically, the presence of a table ID toggle 2800 within a B-frame may lead to an incorrect decryption of repeated B-frames for a stream type I. It has been disclosed that omitting the ECMs from all B-frames except the last repeated one might solve this problem. But at least part of the ECMs in the current B-frame has already been transmitted if this frame contains ECMs. This may be the case for all of the B-frames in an encrypted normal play stream and for the last repeated B-frame in an encrypted slow-forward stream. A first solution would be that the stop must be postponed to the next frame because a stop cannot be made on such a B-frame containing ECMs, if no further information about the ECMs is known. However, further improvements upon this can be made, since this is not always true. The stop can still be made on the current B-frame if it does not contain a table ID toggle 2800, which is the case for most of the B-frames in an encrypted stream. In this case there is also no need to omit the ECMs from the repeated current B-frame. Fig. 40 shows a stop on the last repeated B-frame 4000 of a slow-forward stream. The presence or absence of a table ID toggle 2800 within the B-frame data is easily detected by a comparison of the table ID value of the first ECM 4001 and last ECM 4002 within the B-frame data 4000. If they are identical, or no ECMs are present, there is no table ID toggle within this B-frame and the stop can be made on it, leading to a faster switch to still mode for the user. A third problem is related to the temporal reference and PTS of the previous anchor frame. These values depend on the number of B-frames that follow the anchor frame. This number of B-frames is changed in still picture mode to some undefined large value. But the temporal reference and PTS of the previous anchor frame cannot be changed because they have already been transmitted.
For this reason, it was already disclosed that the switch from normal play to slow-motion should occur at the start of the next anchor frame. The same is, in fact, true when we switch from one slow motion factor to another, so also when we switch from normal play or slow-forward to still picture. This is, of course, a very realistic option, but the undefined number of Bf-frames that would follow this anchor frame is still problematic. This can be solved and will be described later in this specification.
We now focus on a switch at the end of the current B-frame. As shown by the PTS arrows 4107 and 4108 in Fig. 41, the fact that the temporal reference and PTS of a previous anchor frame 4103 cannot be changed leads to a conflict with the PTS of one of the additional B-frames 4104. It is not clear which frame is displayed despite a correct reordering 4101 of the additional B-frame 4104 to position 4105. A conflict in display sequence 4102 at position 4106 may occur.
Such a conflict is, in principle, avoided if the particular B-frame is replaced by a Pe-frame 2404, as shown in Fig. 42. This would, however, have consequences. Firstly, one consequence is that the B-frames 4203 following this Pe-frame 2404 are no longer correctly decoded because they reference the wrong anchor frames, namely twice An 4103 in the example, instead of frames An and An-1. A second one is that the temporal reference and PTS for the Pe-frame 2404 cannot be predicted because the number of subsequent B-frames is unknown. The prior art of US 2004/0190866 A1 solved this partially by limiting the number of repeated B-frames to a predetermined number. After this predetermined number some form of glitch will appear to the user for each and every repeated B-frame.
It is also possible to solve the same problem in a more elegant manner. For example, both of these consequences or problems are avoided if we continue to transmit Pe-frames from this point onwards, but in this case the picture resulting from the B-frame is only frozen for a small amount of time, after which the picture is switched to the next anchor frame in display order and halted. By replicating Pe-frames 2404 rather than B-frames only a single glitch is seen by the user initially, as opposed to a glitch for each repeated B-frame. The end effect, however, is in fact the same as if we had switched to still picture at the next anchor frame in the transmission stream, which is therefore probably the better solution.
It is also always possible to ignore the conflict in the PTS. Each decoder will respond to this in an individual manner, but a discontinuity could be flagged in the stream. The final result is not ideal because it will depend on the way that the decoder or Set Top Box decides how an anchor frame needs to be displayed. There are several possibilities:
1. The STB uses the PTS to determine the moment of presentation. At the moment the PTS of the anchor frame is reached this frame will be displayed despite the fact that a B-frame with the same PTS is also available. The philosophy is that in case of a conflict in the PTS the oldest frame will be used. It is expected that the B-frame with conflicting PTS will be skipped to avoid buffer problems. It is also assumed that the anchor frame is not flushed but kept in memory until a next anchor frame is received, leading to a correct decoding of the subsequent B-frames. The result on the display screen is a frozen B-frame but with a flash of an anchor frame a short time after the still picture mode was entered.
2. The STB uses the PTS to determine the moment of presentation. At the moment the PTS of the anchor frame is reached the STB will display the B-frame with the identical PTS because this is the most recent frame. The anchor frame is not displayed but kept in memory until a next anchor frame is received, thus enabling a correct decoding of the subsequent B-frames. The result on the screen is a frozen B-frame as it was intended.
3. The STB does not use the PTS for presentation at all. It just displays the frames in sequence after reordering to the display order. This means that the B-frames are displayed one after the other and that an anchor frame is displayed when the next anchor frame is encountered in the transmitted sequence. The frame display grid has a constant delay with respect to the DTS of the B-frames. Also here the result on the screen is a frozen B-frame as it was intended.
The creation of a still picture from an anchor frame is a preferred option. Here we, in fact, mean that a picture resulting from an original anchor frame is frozen on the display screen. For normal play such a picture can only result from the anchor frame itself, but for slow-forward it can also result from an empty frame. First we consider the situation that the switching command 3900 is received during the transmission of an anchor frame 4103, as is depicted in Fig. 43. At this moment in the sequence the previous anchor frame 4306 will be displayed. Preferably this frame is frozen on the screen. This can only be realized by the transmission of Bf-frames 2703 after the current anchor frame 4103. Initially a sequence of correctly displayed still frames 4306 is seen, but later on in the sequence 4302 there may be a conflict between the PTS of the current anchor frame 4103 and the PTS of one of the Bf-frames 2703, namely Bf-frame 4303, at display location 4305, with all the consequences that have just been described. Either these consequences are accepted or the switch is delayed to the start of the next anchor frame.
If we now consider the situation that the switching command is received during an empty frame, which can only occur when we switch from slow-forward to still picture mode, then we have to distinguish between the three possible options for empty frames in the slow-forward stream, namely the pre-insertion of Pe- or Bb-frames and the post-insertion of Bf-frames:
1. The switching command 3900 is received during a pre-inserted Pe-frame 2404. In this case additional Pe-frames 4403 will be inserted after the current Pe-frame 2404, as shown in Fig. 44. This results in a frozen picture 4306 corresponding to the current Pe-frame 2404, which is, in fact, the picture from the previous anchor frame. One could also say that the transmission of the slow-forward stream is continued until the start of the next anchor frame, and that from this point onwards Pe-frames 4403 are inserted. There are no problems with the PTS in this case.
2. The switching command 3900 is received during a pre-inserted Bb-frame 2603, as shown in Fig. 45. This Bb-frame 2603 results in the display of the previously transmitted anchor frame that should now be frozen on the screen. The transmission of the slow-forward stream is continued to the start of the next anchor frame 4503. From this point onwards Pe-frames 4403, equivalent to the previously described Pe-frames 2404, are inserted. Problems with the PTS are thus avoided. The Bb-frames 2603 from the slow-forward stream as well as the Pe-frames 4403 inserted after these Bb-frames 2603 in the still picture stream 4500 lead to a repeated display of the same previously transmitted anchor frame 4306 and ultimately to the desired effect in the display stream 4502.
3. The switching command 3900 is received during a post-inserted Bf-frame 2703, as shown in Fig. 46. This frame forces the display of the anchor frame 4606 previous to the anchor frame directly in front of the Bf-frames, i.e. anchor frame An-1. This picture should then be frozen. This can only be achieved by the continued transmission of Bf-frames 2703. This leads to a PTS conflict for Bf-frame 4603 in particular at location 4604 after reordering as shown in line 4601. The conflict at display position 4605 in the displayed picture line 4602 is now evident and the consequences are either accepted or alternatively the switch is delayed to the start of the next anchor frame.
The four anchor frame still picture situations can therefore be regrouped into two different cases:
Case A
Case A is shown in Fig. 47, which is a combination of Fig. 43 and Fig. 46. The switching command 3900 is received during an anchor frame or the subsequent Bf-frames in switching period 4703. The Bf-frames 2703 are then extended to an indefinite series. A conflict will occur between the PTS of the anchor frame and the PTS of Bf-frame 4704 at reordered location 4705 and displayed picture location 4706. The other frames 4306 will be displayed correctly.
Case B
Case B is shown in Fig. 48, which is a combination of Fig. 44 and Fig. 45. The switching command 3900 is received during a switching period 4803 of pre-inserted empty frames 4804, of Bb or Pe type. In this case the transmission of the slow-forward stream is continued until the start of the next anchor frame 4503. From this point onwards an indefinite series of Pe-frames 4403 is transmitted. A conflict in the PTS is avoided in this way and the correct still picture frames 4306 are displayed.
In the situation that post-insertion of Bf-frames is used for the construction of the slow-forward stream, only case A has to be considered for generating a still picture from an anchor frame. In the other situations, both cases A and B have to be supported.
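The grouping into case A and case B can be summarized as a small decision sketch. The type and function names below are hypothetical and serve only to restate the two cases in executable form; they are not part of the described device.

```python
from enum import Enum


class FrameKind(Enum):
    ANCHOR = "anchor"  # original anchor frame
    BF = "Bf"          # post-inserted empty frame
    BB = "Bb"          # pre-inserted empty frame
    PE = "Pe"          # pre-inserted empty frame


def anchor_still_plan(current):
    """Describe how an anchor-frame still picture is generated, depending on
    the frame kind being transmitted when the switching command arrives."""
    if current in (FrameKind.ANCHOR, FrameKind.BF):
        # Case A: extend the Bf-frames to an indefinite series; a PTS
        # conflict with the anchor frame has to be accepted.
        return {"case": "A", "continue_to_next_anchor": False,
                "repeat": FrameKind.BF, "pts_conflict": True}
    if current in (FrameKind.BB, FrameKind.PE):
        # Case B: continue the slow-forward stream until the start of the
        # next anchor frame, then transmit Pe-frames indefinitely.
        return {"case": "B", "continue_to_next_anchor": True,
                "repeat": FrameKind.PE, "pts_conflict": False}
    raise ValueError("unsupported frame kind: %r" % (current,))


plan = anchor_still_plan(FrameKind.BB)
```

A slow-forward stream built with post-inserted Bf-frames only ever reaches the first branch, which matches the statement that only case A has to be considered in that situation.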
The problems and solutions disclosed thus far concern the actions necessary when the switch to still picture is performed as fast as possible after the reception of the switching command. It is clear that a PTS conflict between the anchor frame and one of the B- or Bf-frames following it could not be avoided in this case. It was also mentioned that such a problem does not exist if the switch to still picture is delayed until the start of the next anchor frame. This seems to be acceptable when switching from normal play to still picture because it only introduces a small latency. But when switching from slow-forward to still picture, this latency is in fact multiplied by the slow motion factor and can therefore become unacceptably high.
Accepting this, we nevertheless assume that the switch is made at the start of an anchor frame 4903, wherever the switching command 3900 was received. This situation is depicted in Fig. 49a. Preferably the previous anchor frame is now displayed 4306 and frozen on the screen. This is accomplished by the transmission of an indefinite series of either Bb-frames 2603 or Pe-frames 2404 as the repeated frames 4403 (in Fig. 49a the Pe-frames 2404 version is shown) from the switching moment onwards, irrespective of the empty frame type used to construct the slow-forward stream. Since the use of Bb-frames 2603 would lead to a PTS conflict it may be preferable to make use of Pe-frames 2404, i.e. as shown in Fig. 49a. A discussion on the generation of a still picture stream would not be complete without also discussing the transition from the still picture mode back to normal play mode or a further trick-play mode, such as slow-forward mode, for example. This will now be discussed.
In still picture mode, the transmitted stream consists of an indefinite series of frames, which can be of the following types:
1. B-frames;
2. Bf-frames;
3. Pe-frames.
First we consider the switch from still picture to slow-forward. In all cases the transmission is continued with the next frame in the normal play sequence. In case 1 this can either be a B-frame or an anchor frame. In case 2 this will be a B-frame unless the normal play stream contains no such frames, in which case it will be an anchor frame. In case 3 the next frame will always be an anchor frame, as is depicted in Fig. 49b. In Fig. 49b, the switch from still picture to slow-forward occurs at point 4913; thereafter a sequence of repeated B-frames 2406 is transmitted according to the slow-forward processing described earlier. In still picture mode, the slow-forward processing with a slow motion factor of infinity may be used to calculate the temporal reference, DTS/PTS and placement of the packets and frames. This processing may then be continued with the new slow motion factor from this next frame onwards. In case 3 this will result in a flawless transition from still picture to slow-forward. But in cases 1 and 2 there exists a problem with the display of the picture resulting from the previous anchor frame that should follow the pictures resulting from the series of B-frames. This is because the PTS of this anchor frame lies far in the past. It cannot be predicted how a Set Top Box will react to this conflicting situation, but it is possible that the picture resulting from this anchor frame is still displayed at the correct moment in the sequence of pictures. It should also be noted that from the switching point onwards, the empty frames in the slow-forward stream may again be of the normally used type, where normally used means as disclosed in this specification. For case 3, which can result from a switch to still picture at the start of an anchor frame, this means that the Pe-frames are followed by an anchor frame and then for instance by a series of Bf-frames.
Considering now the switch from still picture to normal play, it is evident that this situation differs little from that just discussed. Normal play will be started with the next frame in the normal play sequence. This guarantees that there is no disruption in the frame sequence. But for a flawless transition, the slow-forward processing would preferably be continued for the normal play stream but with a slow motion factor of one. Although possible, this is not the expected situation in practice. It is expected that the normal play stream will be transmitted, as is, from the next frame onwards. An intermediate step, in which the temporal references at the transition point may be corrected, helps somewhat, but discontinuities will occur in a lot of other parameters including the DTS and PTS. The discontinuity flags available in MPEG-2 should be used as far as possible in this case.
In the following a number of embodiments will be presented that may be used to solve the ECM related problems discussed in relation to performing trick-play on encrypted streams, especially the issues related to generating an encrypted still picture stream and to some extent slow-forward processing of encrypted streams. For any processing of an encrypted data stream that requires the repetition of a portion of the encrypted data stream, such as slow-forward, still picture or a picture step mode, it is preferable that the ECMs are taken into account in generating and outputting the processed encrypted data stream. The figures Fig. 50a through Fig. 50h present some examples of situations that may occur. For example, in Fig. 50a for a preferred still picture data stream it is necessary to repeat B-frame data of the last repeated B-frame 5000. In this example, the last repeated B-frame 5000 does not contain any ECMs 3003 and therefore also no table ID toggle 2800. In this case, the preferred solution may be to switch directly to still picture mode since the user will see a still picture of the frame being displayed when the switch to still picture mode 3900 was made. The current frame repeated may then be the last repeated B-frame 5000 and is shown in stream 5003. Alternatively, an equally compliant stream can be provided by postponing the actual switch to still picture mode from the detection of the switch to still picture mode 3900 to the start of the next frame 5001, i.e. at location 5002, whereby the still picture mode will consist of a sequence of next frames 5001 as shown in stream 5004. Since no portion of the next frame 5001 has yet been transmitted the encrypted data stream can be processed based on the presence, absence or content of ECMs in the next frame. Elements of the devices shown in Fig. 31, Fig. 32 and Fig. 33 would be suitable for correctly processing such a stream and correcting any problems that may result from the repeated B-frame data of the next frame 5001. By taking such measures a further postponement of the actual initiation of still picture mode may not then be necessary.
In Fig. 50b a situation is shown wherein the switch to still picture mode 3900 is received during the processing or transmission of a frame 5010 comprising ECMs 3003. In this case, the ECMs comprised within the last repeated B-frame 5010 have to be detected. The situation denoted in Fig. 50b is one where no table ID toggle 2800 is comprised within the repeated B-frame data 5010. Upon detection of ECMs 3003 in the last repeated B-frame 5010 the fail-safe option is to postpone the actual switch to still picture mode from the detection of the switch to still picture mode 3900 to the start of the next frame 5001, i.e. to location 5002, whereby the still picture mode will consist of a sequence of next frames 5001 as shown in stream 5012. However, even in this case the preferable option is to immediately perform the switch to still picture mode, as shown in stream 5011. By detecting the lack of a table ID toggle 2800 in the last repeated B-frame 5010 we may still safely perform the switch immediately and achieve higher positional accuracy with respect to the response to the user's desired still picture command.
In Fig. 50c a situation is shown wherein the switch to still picture mode 3900 is received during the processing or transmission of a frame 5020 comprising ECMs 3003 and a table ID toggle 2800. In this case, the ECMs 3003 and the table ID toggle 2800 comprised within the last repeated B-frame 5020 have to be detected. Upon detection of ECMs 3003 and a table ID toggle 2800 in the last repeated B-frame 5020 it is no longer possible to make an immediate switch to still picture mode, if no further information is known. If this were done, then stream 5021 would be the result. The high rate of table ID toggles 2800 shown in stream 5021 would likely be beyond the rate at which the smart-card could operate, resulting in a processed encrypted data stream for which no guarantees could be given as regards its compliance with the conditional access system employed. Therefore, the preferred option may be to postpone the actual switch to still picture mode from the detection of the switch to still picture mode 3900 to the start of the next frame 5001, i.e. to location 5002, whereby the still picture mode will consist of a sequence of next frames 5001 as shown in stream 5022. Since no portion of the next frame 5001 has yet been transmitted the encrypted data stream may be processed based only on the presence, absence or content of ECMs in the next frame and may be processed with devices, or elements thereof, as shown in Fig. 31, Fig. 32 and Fig. 33. A further postponement of the actual initiation of still picture mode may then again not be necessary.
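The decision described for Fig. 50a through Fig. 50c can be sketched as follows, assuming that the table ID values of the ECMs in the current frame are available as a list in transmission order. The function names are hypothetical, and the values 0x80 and 0x81 are used only as illustrative ECM table IDs. The sketch covers the situation in which no further information about the ECMs is known and does not model the control word considerations of type I and type II systems.

```python
def has_table_id_toggle(ecm_table_ids):
    """Return True if the table ID changes within the frame.

    The table ID of the first ECM is compared with that of the last ECM;
    a frame without ECMs contains no toggle.
    """
    if not ecm_table_ids:
        return False
    return ecm_table_ids[0] != ecm_table_ids[-1]


def still_picture_switch(ecm_table_ids):
    """Choose between an immediate switch and postponement to the next frame."""
    if has_table_id_toggle(ecm_table_ids):
        # Situation of Fig. 50c: repeating the frame would repeat the toggle.
        return "postpone_to_next_frame"
    # Situations of Fig. 50a (no ECMs) and Fig. 50b (ECMs, no toggle).
    return "switch_immediately"


decision_50a = still_picture_switch([])
decision_50b = still_picture_switch([0x80, 0x80, 0x80])
decision_50c = still_picture_switch([0x80, 0x80, 0x81])
```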
However, a further problem related to the overwriting of control words must also be taken into account for the conditional access systems in common use. The overwriting of control words was described earlier with respect to Fig. 14 and Fig. 15 for type I and type II systems. In some situations, namely for a type I system comprising two control words per ECM, the exact timing of the SCB toggle is also critical. By repeating the initial section of a frame preceding an SCB toggle, after overwriting a necessary control word, the required control word may no longer be available for this initial section of the frame. This implies that this initial section of the frame is no longer decryptable. Therefore, for a type I system operating in the situation shown in Fig. 50c the correct operation is achieved by postponing the still picture mode to the following frame.
The situation depicted in Fig. 50c can be further elaborated upon and indeed further improved upon with respect to the goal of responding as quickly as possible to the user switching command. In Fig. 50d the situation is presented whereby the switch to still picture mode 3900 is received subsequent to the transmission or processing of a table ID toggle 2800 in the ECMs 3003. In such a case, a switch can still be performed if the ECMs may be filtered based upon the respective table ID toggle 2800, such that no further unwanted table ID toggle 2800 occurs in the subsequently repeated current frames 5032. In the example of Fig. 50d the required filtering is shown in stream 5031. In stream 5031 each repeated current B-frame 5032 has the ECMs 3003 occurring prior to the table ID toggle 2800 selectively identified and deleted. In such a situation an increase in the rate of ECMs 3003 processed by the smart-card is avoided because the rate of toggling of the table ID parameter is kept to a level approximately the same as the original encrypted data stream. Of course, it would also be possible to postpone the switch to the start of the next frame, as shown in stream 5033. Account should then be taken of any ECMs present in the next frame 5001, but the situation is at least somewhat easier to assess since the transmission or processing of the next frame 5001 has not yet started. Elements of the devices of Fig. 31, Fig. 32 and Fig. 33 would be suitable for this purpose. This scenario is especially suited for type II encryption systems comprising one control word per ECM. For a type I system postponement of the switch to still picture is the correct option.
A further scenario is shown in Fig. 50e. Here the switch to still picture mode 3900 is received prior to a table ID toggle 2800 in the ECMs 3003 of the last repeated B-frame 5020. In such a case, a switch can, again, still be performed if the ECMs may be filtered based upon the respective table ID toggle 2800, such that no further unwanted table ID toggle 2800 occurs in the subsequent repeated current frames 5040. In the example of Fig. 50e the required filtering is shown in stream 5041. In stream 5041 each repeated current B-frame 5040 has the ECMs 3003 after the table ID toggle 2800 removed. In such a situation an increase in the rate of ECMs 3003 processed by the smart-card is again avoided by removing the table ID toggles 2800. Of course, it would also be possible to postpone the switch to the start of the next frame, i.e. location 5002, as shown in stream 5042. Account should again be taken of any ECMs present in the next frame 5001, but the situation is again at least somewhat easier to assess since the transmission or processing of the next frame 5001 has not yet started. For the situation depicted in Fig. 50e a further processing step is recommended upon the last of the repeated current B-frames 5050, as is shown in Fig. 50f. Namely, by including all ECMs 3003 in the last of the repeated current B-frames 5050, the original order of ECMs 3003 is preserved and no increased ECM rate is perceived by the rest of the system. In Fig. 50g the same scenario as for Fig. 50e is valid, i.e. the switch to still picture mode 3900 is received prior to a table ID toggle 2800 in the ECMs 3003 of the last repeated B-frame 5020. In such a case, a switch can, again, still be performed if all of the ECMs may be filtered from the switch to still picture mode 3900, as shown in the partial current frame 5060 and all of the ECMs are also filtered from each repeated current frame 5061 including the last repeated current frame 5062, i.e. all ECMs after the switch are filtered.
In such a situation an increase in the rate of ECMs 3003 processed by the smart-card is again avoided by removing all ECMs and therefore also the table ID toggles 2800. Of course, it would also be possible to postpone the switch to the start of the next frame, i.e. location 5002, as shown in stream 5064. Account should again be taken of any ECMs present in the next frame 5001, but the situation is again at least somewhat easier to assess since the transmission or processing of the next frame 5001 has not yet started.
A still further scenario is shown in Fig. 50h. Here again the switch to still picture mode 3900 is received prior to a table ID toggle 2800 in the ECMs 3003 of the last repeated B-frame 5020. In such a case, a switch can, again, still be performed if all of the ECMs may be filtered from the switch to still picture mode 3900, as shown in the partial current frame 5060 and all of the ECMs are also filtered from each repeated current frame 5061. The last repeated current frame 5050 is treated as a special case and contains all ECMs of the original current frame 5020. In such a situation an increase in the rate of ECMs 3003 processed by the smart-card is again avoided by removing the table ID toggles 2800. Of course, it would also be possible to postpone the switch to the start of the next frame, i.e. location 5002, as shown in stream 5071. Account should again be taken of any ECMs present in the next frame 5001, but the situation is again at least somewhat easier to assess since the transmission or processing of the next frame 5001 has not yet started.
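The treatment of the repeated frames in this last scenario may be illustrated by the following sketch. It is a simplified model, not the implementation of the specification: packets are represented as hypothetical `(kind, payload)` tuples, the number of repeats is assumed to be known in advance, and the handling of the partial current frame 5060 is left out.

```python
from typing import List, Tuple

Packet = Tuple[str, object]   # hypothetical model: ("ECM", table_id) or ("VIDEO", data)


def build_repeats(frame: List[Packet], repeats: int) -> List[List[Packet]]:
    """Return the repeated frames: all but the last have every ECM removed,
    the last one carries all ECMs of the original current frame."""
    if repeats < 1:
        raise ValueError("at least one repeated frame is required")
    stripped = [p for p in frame if p[0] != "ECM"]
    return [list(stripped) for _ in range(repeats - 1)] + [list(frame)]


frame = [("VIDEO", "a"), ("ECM", 0x80), ("VIDEO", "b"), ("ECM", 0x81)]
repeated = build_repeats(frame, 3)
ecm_count = sum(1 for f in repeated for p in f if p[0] == "ECM")
```

In a real still picture mode the number of repeats is not known beforehand; the last repeated frame would presumably be emitted when the user leaves the mode.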
It is emphasized that the systems described in the following can be implemented in the frame of and in combination with any of the devices or systems described referring to Fig. 1 through Fig. 50.
In the following, referring to Fig. 51, a data processing device 5100 for processing an encrypted MPEG2 data stream including video content (alternatively audio content) will be described. By means of the processing device 5100, it is possible to perform the various method steps as described referring to Fig. 18 through Fig. 50. Account is taken of decryption messages, i.e. ECMs, present in the encrypted stream to take optimal decisions based upon the amount of information known with respect to the ECMs to provide a processed encrypted data stream that is as compliant as possible with conditional access systems in common use. Fig. 51 shows a hard disk drive 5101 on which encrypted audiovisual content, i.e. an encrypted data stream 5110, to be reproduced is stored. Alternatively, the encrypted data stream 5110 can be received directly from a digital satellite, a digital cable signal, a Set Top Box, a digital terrestrial television broadcast or from the Internet using Internet Protocol broadcasting. Other suitable sources of audio/video streams are also possible. The processing device 5100 may be controlled by a central processing unit (CPU) or control unit 5102 which, in turn, can be controlled by a human user by means of a user interface 5103. By means of the user interface 5103, a human user may control the operation of the processing device 5100; for instance, a user may initiate a normal play mode or a trick-play operation mode like a slow-forward mode, still picture mode or step picture mode. For example, a still picture mode signal 3900 may be detected. The still picture mode signal 3900 may be communicated to a detection unit 5104 and/or a replication unit 5105 directly or via the control unit 5102 (not shown in Fig. 51).
When a corresponding control signal is sent from the control unit 5102 to the hard disk drive 5101 via a system bus 5113, audiovisual content in an encrypted form may be sent from the hard disk drive 5101 to the detection unit 5104, which is capable of detecting the ECMs 3003 comprised within the encrypted data stream 5110. The detection unit 5104 may pass the encrypted data stream 5110 available at its input on to the replication unit 5105 for further processing. In parallel, the detection unit 5104 may provide the replication unit 5105 with a signal 5111 indicating that the replication unit 5105 should begin immediately with a transition to still picture mode, as an example, or should postpone the transition to still picture mode until the next frame comprised within the encrypted data stream 5110. A postponement of the replication to the next frame, i.e. a subsequent frame, may imply that the current frame wherein the transition to still picture mode signal 3900 was received is passed through the replication unit 5105 without replication. It should be apparent that the replication unit 5105 performs replication of the still picture mode frames; however, the exact frame replicated is a result of what the detection unit 5104 actually detects. Alternatively, the signal 5111 could also be communicated via the control unit 5102 (not shown in Fig. 51), which also links the detection unit 5104 indirectly to the replication unit 5105 via the common system bus 5113. To provide the signal 5111 the detection unit 5104 must decide on what action is necessary. In an exemplary embodiment the detection unit 5104 may make the decision based upon whether any ECMs 3003 are detected at all within the B-frame data 5000 (or 5010) to be repeated. This has been described earlier with reference to Fig. 50a and Fig. 50b. This broad inventive concept can be further refined. These further refinements or improvements will be described later.
The replication unit 5105 may replicate portions, which may be whole or split frames in the encrypted data stream 5110, a number of times in accordance with a predetermined replication rate, for example for slow forward or still picture mode, which may be defined or determined by the control unit 5102 and/or by a user operating the user interface 5103. A processed encrypted data stream 5112 may then be supplied to a reproduction unit 5106. The reproduction unit 5106 may further comprise a monitor having loudspeakers, a television, a set top box, etc, wherein reproduction of this content is possible under control of the control unit 5102 and/or under control of the user via the user interface 5103. It is possible that a further decryption unit (not shown) is foreseen within the reproduction unit 5106 so as to decrypt the processed encrypted data stream 5112 for playback.
The detection unit 5104 may be adapted to process individual frames of the encrypted data stream 5110, which may be intra-coded frames (I-frames), forward predictive frames (P-frames) or bi-directional predictive frames (B-frames). The processed content may be a data stream of video data and/or audio data. The reproduction unit 5106 may be capable of reproducing the data stream supplied by the replication unit 5105. The encrypted data stream 5110 may be an encrypted MPEG2 data stream. Although ECMs 3003 are, in fact, related to a cryptographic period 1403 which may encompass multiple frames, it is the relationship of these cryptographic periods 1403 with the frame periods that makes trick-play on encrypted data streams such a complex topic.
As stated, the control unit 5102 may be under control of a human user operating a user input/output interface 5103, for example, by using a user interface (UI) which may include a display, input means like a remote control, a keypad, a joystick, a trackball, or the like and may allow a user to specify a mode according to which she or he wishes to reproduce audio/video content stored on the hard disk drive 5101. For instance, the user may adjust, via the user input/output unit 5103, parameters like playback speed, a trick-play reproduction mode, equalization, etc.
The detection unit 5104 is preferably adapted to also detect a toggle in the table ID parameter of the encrypted data stream. Such table ID toggles 2800 indicate where one set of encryption messages transitions to a second set of encryption messages. Here a set of encryption messages may be taken to mean a set of encryption messages that relate to the same set of keys in a conditional access system. Such transitions have consequences for correct processing of the encryption messages by the smartcard in the conditional access system. This measure is particularly useful since it provides further information to decide upon whether the switch to still picture mode 3900 can be safely initiated immediately or whether it must be postponed to a following frame, as described earlier with reference to Fig. 50b and Fig. 50c.
When more information is available with respect to the presence and type of ECMs 3003, table ID toggles 2800 and the user initiated switch to still picture mode 3900 it is possible to make further optimal decisions about whether the switch to still picture mode should be immediately initiated or postponed. Such situations, of course, require the relative times, or the relative positions in the stream of the relevant events and have been described earlier with respect to Fig. 50d, Fig. 50e, Fig. 50f, Fig. 50g and Fig. 50h.
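One possible way of arranging such a decision is sketched below. The policy shown is an assumption made for illustration and is not taken verbatim from the figures: it supposes that a frame without ECMs, or with ECMs but without a table ID toggle, can be repeated immediately, that a frame containing a toggle can be repeated immediately only when ECM filtering is available in a system with one control word per ECM, and that the switch is otherwise postponed. The names `Action` and `decide` are hypothetical.

```python
from enum import Enum
from typing import Sequence


class Action(Enum):
    SWITCH_NOW = "switch immediately"
    SWITCH_NOW_FILTERED = "switch immediately and filter ECMs in the repeats"
    POSTPONE = "postpone the switch to the start of the next frame"


def decide(ecm_table_ids: Sequence[int], can_filter: bool,
           one_cw_per_ecm: bool) -> Action:
    """ecm_table_ids: table IDs of the ECMs found in the current frame,
    in stream order (an empty sequence means no ECMs were detected)."""
    if not ecm_table_ids:
        return Action.SWITCH_NOW                 # assumed: nothing to disturb
    has_toggle = any(a != b for a, b in zip(ecm_table_ids, ecm_table_ids[1:]))
    if not has_toggle:
        return Action.SWITCH_NOW                 # assumed: repeats add no toggle
    if can_filter and one_cw_per_ecm:
        return Action.SWITCH_NOW_FILTERED        # filtering as in Fig. 50d to 50h
    return Action.POSTPONE                       # e.g. one-frame delay


example = decide([0x80, 0x81], can_filter=True, one_cw_per_ecm=True)
```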
In Fig. 52 a device 5200 is provided which can selectively filter ECMs 3003 which would cause a decryption system to become overloaded due to a high rate of essentially different ECMs 3003, for example, resulting after an increase in the rate of table ID toggles 2800. The device 5200 provides an embodiment that can process an encrypted data stream comprising repeated portions repeated by the replication unit 5105, i.e. repeated B-frame data, into a processed encrypted data stream with a rate of change of decrypting messages that is substantially equivalent to that of the encrypted data stream in its original form, i.e. when originally broadcast, transmitted or delivered via another suitable means, such as on a data storage carrier, like an optical disc, for example.
In Fig. 52 the replication unit 5105 may provide a replication signal 5203 to an input 3104 of selection unit 3102 indicating portions of the encrypted data stream that have been repeated with respect to the original form of the encrypted data stream 5110. The selection unit 3102 was described in detail earlier in this specification in the text related to Fig. 31. The detection unit 5104 detects ECMs, as described above, and may communicate the detected ECMs to the selection unit 3102 directly or via the control unit 5102 via system bus 5113. The selection unit 3102 identifies at least one of the ECMs as having been repeated subsequent to the creation of the original encrypted data stream 5110 transmitted by the original content provider and may use the replication signal 5203 for this purpose. A deletion unit 3103 deletes the selected encryption messages from the encrypted data stream using information from the selection unit 3102. The deletion may also be envisaged as a filtering of the ECMs. The devices, or elements thereof, shown in Fig. 31, Fig. 32 and Fig. 33 are also applicable to the embodiment shown in Fig. 52. In the embodiments shown in Fig. 51 or Fig. 52 the detection unit 5104 may further delay the switch to the slow-forward or still picture mode by a delay time which corresponds to the time difference between the point of time of switching and a point of time of starting a next frame in the sequence of the plurality of frames. Such a next frame may be a B-frame or an anchor frame, which anchor frame may then be an I-frame or a P-frame (in the nomenclature of MPEG). By delaying the start of the slow-forward or still picture mode by a corresponding time difference, it is possible to obtain a smooth transition between the different playback modes even in the case of encrypted data streams.
In other words, before expiry of the delay time, the audio/video playback is continued in the normal play mode, and only after expiry of this delay time interval, the playback in the slow-forward or still picture mode is started. It may be necessary for the replication unit 5105 to also correct the temporal reference of the plurality of frames after and/or during the delay time. In other words, it may be necessary in the slow-forward or still picture mode that the order of playback of the frames be altered, or that a plurality of frames have to be repeated several times, or that additional (empty) frames are inserted in the processed encrypted data stream 5112. In order to take into account such modified frame conditions, the replication unit 5105 may correct the temporal reference between these frames.
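The delay time described above is simply the distance from the moment of the switching command to the start of the next frame. A minimal sketch follows, assuming that the frame start times are available as a sorted list in arbitrary but consistent time units; the function name `switch_delay` and the example values are hypothetical.

```python
import bisect
from typing import Sequence


def switch_delay(frame_starts: Sequence[int], switch_time: int) -> int:
    """Return the time between the switching command and the start of the
    next frame. frame_starts must be sorted in ascending order."""
    i = bisect.bisect_left(frame_starts, switch_time)
    if i == len(frame_starts):
        raise ValueError("no frame starts at or after the switching time")
    return frame_starts[i] - switch_time


# Frames starting every 3600 units; the command arrives at time 4000.
delay = switch_delay([0, 3600, 7200, 10800], 4000)
```

A command that coincides exactly with a frame start yields a delay of zero, so that the mode change takes effect at that frame.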
In the following a number of embodiments will be presented that may be used to generate a step picture data stream with an improved compatibility with the MPEG2 standard, for example.
Step picture mode generally means that the device or system is in still picture mode and that the user wants to step forward or backward to another frame and then resume the still picture mode. It is, however, quite feasible to enter step picture mode from other reproduction modes, such as normal play mode, slow forward mode, or an equivalent reverse direction mode. It was already noted in this specification that reverse modes based on all frames or at least on frames other than the I-frames are practically unfeasible in MPEG due to the asymmetry in the prediction used in the MPEG encoding process. This is therefore also true for the step backward mode. So it is only practically feasible to step backward to the previous I-frame and then from I-frame to I-frame. Given that a user wants to go back to the previous I-frame in the normal play display sequence a question remains of how this could be achieved. To realize the switch to the still picture display of this previous I-frame it is preferable to distinguish again between the three still picture cases mentioned earlier in this specification. Case 1 covers the case where the same B-frame data is transmitted over and over again, i.e. use is made of repeated B-frame data. In most cases, the previous I-frame in display order will be the previous I-frame in the normal play transmission order, but this is not true if this I-frame was the last transmitted anchor frame. In that case, the I-frame needed is the I-frame before that in the normal play stream. In case 2, the use of repeated Bf-frames results in the repeated display of the anchor frame previous to the last anchor frame. In case 3, the use of empty P-frames, or Pe-frames, repeats the display of the last anchor frame. In both of these cases, i.e. 2 and 3, the displayed picture might also result from an I-frame, but whatever the situation, the I-frame previous to the anchor frame referenced by the Bf- or Pe-frames is the one needed for the first step backward.
An example for each case 5300, 5301 and 5302 is depicted in Fig. 53. The frames 5303, 5304 and 5305 indicate the last displayed anchor frame for each case. In all cases, the I-frame to be displayed after the step backward command has to be sent to the decoder followed by an indefinite series of Pe-frames. For the one I-frame per GOP case, the I-frame is not followed by any B-frames in this case, and its temporal reference has to be set to zero and the PTS is Delta higher than its DTS, where Delta is a DTS increment that corresponds to one frame time. However, one I-frame per GOP is not compulsory within MPEG and unless a GOP start is detected, the temporal reference should simply be incremented. The parameter Delta is equal to the number of 90 kHz periods in one frame time because the DTS is linked to the PCR base. Some values for Delta are given in Table 1.

Frame rate (Hz)    Delta (90 kHz periods)
24                 3750
25                 3600
29.97              3003
30                 3000
50                 1800
60                 1500

Table 1: Delta as a function of frame rate
The DTS of the I-frame is Delta higher than the DTS of the previously transmitted frame. The temporal reference and DTS/PTS for the subsequent Pe-frames may be calculated using normal MPEG encoding rules taking into account Delta for each inserted frame.
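The values of Table 1 follow from dividing the 90 kHz time base by the frame rate. The sketch below reproduces them; the function name `delta_for` is hypothetical, and the nominal rate of 29.97 Hz is taken as the exact ratio 30000/1001, which is an assumption about how that table entry was obtained.

```python
from fractions import Fraction

TIME_BASE_HZ = 90_000              # PTS/DTS time base of 90 kHz

NTSC_RATE = Fraction(30000, 1001)  # nominal 29.97 Hz as an exact ratio


def delta_for(frame_rate) -> int:
    """Number of 90 kHz periods in one frame time."""
    ticks = Fraction(TIME_BASE_HZ) / Fraction(frame_rate)
    if ticks.denominator != 1:
        raise ValueError("frame time is not a whole number of 90 kHz periods")
    return int(ticks)


table_1 = {rate: delta_for(rate) for rate in (24, 25, NTSC_RATE, 30, 50, 60)}
```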
There are cases where the previously transmitted frame was a B-frame (case 1) or a Bf-frame (case 2). For example, Fig. 54 depicts the situation where the first step backward is made on a still picture from a B-frame. The numbering conventions for individual frames are identical to those used earlier in this specification. The PTS of the B-frame (shown in Fig. 54) or Bf-frame (not shown in Fig. 54) previous to the I-frame is equal to its DTS. This means that the distance between the PTS of this B-frame 5403 and the PTS of the I-frame 5404 is equal to two frames. In other words, some other frame 5405 should be displayed between these two. Due to the reordering this is in principle the previous anchor frame, but the PTS of this anchor frame will almost certainly not correspond to this position. The decoder might still display the anchor frame at this position, or it might repeat the display of the last B-frame 5403 to fill the gap. A repeated last B-frame does not have a disturbing effect, but any other action of the decoder will lead to an incorrect first frame after the step backward. This may still be acceptable because it is limited to only one incorrect frame. For all the subsequent steps backward, the I-frame previous to the last transmitted one is sent to the decoder followed by an indefinite series of Pe-frames. After the first step backward, the device or system is always in a situation where an indefinite series of Pe-frames is transmitted (case 3). In this case there is no undefined first frame after the step backward. This means that an incorrect first frame can only occur after the first step backward. The decryption of the signal also has to be considered. This is, of course, no issue for the Pe-frames that are always in plaintext but it is an issue for the I-frames that are expected to be (largely) encrypted, for example, in a hybrid stream. In principle, the ECM handling as described for fast-reverse should also be applied here.
Switching effects have to be considered before the first step backward is made. The explanation of the step backward mode was based on a step size equal to the distance between the I-frames or in other words on a step size equal to one GOP. It will be clear that larger step sizes equal to an integer multiple of GOPs can also be used. For encrypted streams the maximum step size and step frequency are limited by the timely availability of CWs for the decryption process.
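The selection of the I-frame for a step backward of one or more GOPs can be sketched as follows. This is an illustration only: the positions are assumed to be frame numbers of the I-frames in the normal play stream, available as a sorted list, `current_position` is assumed to be the position of the anchor frame currently referenced by the still picture, and the function name `step_back_target` is hypothetical. The constraints imposed by the timely availability of CWs are not modelled.

```python
import bisect
from typing import Optional, Sequence


def step_back_target(i_frame_positions: Sequence[int], current_position: int,
                     gops: int = 1) -> Optional[int]:
    """Return the position of the I-frame that lies `gops` I-frames before
    current_position, or None if the start of the stream is reached."""
    if gops < 1:
        raise ValueError("step size must be at least one GOP")
    # Index of the first I-frame at or after the current position.
    i = bisect.bisect_left(i_frame_positions, current_position)
    target = i - gops
    if target < 0:
        return None
    return i_frame_positions[target]


i_frames = [0, 12, 24, 36]
one_gop_back = step_back_target(i_frames, 28)        # previous I-frame
two_gops_back = step_back_target(i_frames, 28, 2)    # larger step size
```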
Turning now to step forward mode, a number of possibilities remain open. Three types of step forward mode can, in fact, be distinguished:
I) Frame based;
II) Anchor frame based;
III) I-frame based.
Starting with the first type then the next frame in the normal play sequence will be transmitted with an infinite slow motion factor. This is depicted in Fig. 55. Two problems arise when the next frame is an anchor frame. The first problem is an undefined first frame 5503 after the step forward, as shown in the display picture line 5502. The second problem is that the temporal reference and PTS for the anchor frame 5504 cannot be calculated with an infinite slow motion factor. This is due to the fact that the number of B-frames following this anchor frame may be infinite and depends upon the user. So the calculation of temporal reference and PTS should be based on some chosen slow motion factor unequal to infinity. This means that a PTS conflict as described for still picture will occur for one of the subsequent B-frames 5505, at display location 5506, in this example. It is totally unclear what would be the best choice for the slow motion factor. One choice could be as good as another in which case a slow motion factor equal to one (which represents normal play) might be a good choice. The problem is repeated each time that there is a B-frame to anchor frame transition and is unavoidable for a frame based step picture mode data stream, even taking account of the measures proposed in the prior art. At each transition from an anchor frame to an undefined number of B-frames then somewhere a conflict 5506 or 5508 occurs. When the transition from an undefined number of B-frames to an anchor frame occurs then a hole in the temporal reference 5503 or 5507 occurs. However, the effect of these problems cannot be predicted because it depends on the Set Top Box, or more specifically the decoder in the Set Top Box. In some cases, despite these problems, a correct sequence of frames without any disturbances might still be the result of the frame based step forward mode, but in other cases a disturbed sequence will be the result. This unpredictability is, of course, undesirable.
In Fig. 56 a second type of step forward mode is shown which is an anchor frame based one. As for step backward mode it has first to be determined which anchor frame should be displayed next. The first step picture mode step 5606 is therefore a special case. The frame selected depends on the cases mentioned earlier. In cases 1 and 2, where the still picture stream consists of a series of B- or Bf-frames, it is the previous anchor frame in the recorded normal play stream. The case 1 situation is depicted in detail in Fig. 56. The previous anchor frame in the recorded normal play stream is I-frame, I4, 5603 and this is inserted into the processed stream 5600. Subsequently a succession of empty P type frames 5605, i.e. Pe-frames, are inserted into the processed data stream 5600. The displayed pictures 5602, as shown in Fig. 56, show only a single artifact at display location 5503. Thereafter, no further artifacts are shown and the processed data stream therefore is more compliant. The comparison between display stream 5502 of frame based step picture mode shown in Fig. 55 and the display stream 5602 of anchor frame based step picture mode shown in Fig. 56 shows clear improvements in the compliance of the processed data stream. In particular the artifacts at display locations 5506, 5507 and 5508 no longer occur. Upon reception of a further step forward command from the user the device selects the next anchor frame that would have been displayed in the original data stream; in the example of Fig. 56, it is P-frame, P7, 5604. Again, a succession of empty Pe-frames 5605 is used to create the frozen picture of the step picture mode. In case 3, where the stream consists of Pe-frames, the first anchor frame chosen for the first step picture mode frame is the next anchor frame in the normal data stream. Therefore, the anchor frame to be displayed next is sent to the decoder by inserting it into the processed data stream followed by an indefinite series of Pe-frames.
The PTS of this anchor frame is equal to its DTS increased by the frame time, Delta, because it is not followed by any B-frame. The DTS is as usual equal to the DTS of the previous frame increased by the frame time, Delta. The temporal references are strictly coupled to the PTS. So if a Pe-frame precedes the anchor frame, the temporal reference of this anchor frame is equal to the temporal reference of the preceding frame increased by 1. If on the other hand the preceding frame is a B-frame of any type, the increment is 2 instead of 1. It is the result of this that causes the single artifact 5503 and it stems from the fact that an additional gap of one frame exists again between the temporal references and PTS values of the anchor frame 5603 and the last B-frame 5607. This gap has to be filled by the decoder through the display of some other frame. The most probable candidates are the previous anchor frame or the last B-frame 5607. This has no disturbing effect because either the previous anchor frame is displayed one frame time earlier or the last B-frame 5607 is displayed one frame time longer. The temporal reference and DTS/PTS for the Pe-frames are again calculated with normal MPEG encoding rules taking into account each inserted frame. For the subsequent steps forward, the original anchor frame in the normal play stream next to the last transmitted one is sent to the decoder followed by an indefinite series of Pe-frames. After the first step forward, the system is always in a situation where an indefinite series of Pe-frames is transmitted. As a result, this second type of step forward mode fits very nicely to the step backward mode because the still picture stream always consists of a series of Pe-frames after the first step in both cases. This means that a step forward can easily be followed by a step backward and vice versa, which is an important feature for the user.
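The timestamp rules stated above for the inserted anchor frame can be sketched as follows. The `Stamp` type and the function names are hypothetical. The treatment of the Pe-frames (temporal reference increased by 1, DTS increased by Delta, PTS equal to DTS plus Delta) is an assumption that treats each Pe-frame as an anchor frame not followed by B-frames; wrap-around of the counters and the reset of the temporal reference at a GOP start are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamp:
    frame_type: str          # "I", "P", "B", "Bf" or "Pe"
    temporal_reference: int
    dts: int                 # in 90 kHz periods
    pts: int                 # in 90 kHz periods


def stamp_anchor(prev: Stamp, anchor_type: str, delta: int) -> Stamp:
    """Stamp the anchor frame inserted for a step, given the preceding frame."""
    if prev.frame_type == "Pe":
        step = 1             # a Pe-frame precedes the anchor frame
    elif prev.frame_type in ("B", "Bf"):
        step = 2             # a B-frame of any type precedes the anchor frame
    else:
        raise ValueError("preceding frame type not covered by this sketch")
    dts = prev.dts + delta
    return Stamp(anchor_type, prev.temporal_reference + step, dts, dts + delta)


def stamp_pe(prev: Stamp, delta: int) -> Stamp:
    """Stamp one inserted Pe-frame (assumed rule, see the text above)."""
    dts = prev.dts + delta
    return Stamp("Pe", prev.temporal_reference + 1, dts, dts + delta)


# Still picture made of repeated B-frames (PTS equal to DTS), 25 Hz stream.
last_b = Stamp("B", 5, 10_000, 10_000)
anchor = stamp_anchor(last_b, "I", 3600)
first_pe = stamp_pe(anchor, 3600)
```

In this example the PTS of the anchor frame lies two frame times after the PTS of the last B-frame, which corresponds to the additional gap of one frame mentioned above.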
A step backward cannot be followed by a frame based step forward because the B-frame will reference the wrong anchor frames in that case.
A summary of all three cases is shown in Fig. 57. In case 1, the transmitted data stream 5700 comprises repeated B-frame data, B14, 5707 in the still picture mode time period 5708. The first step forward signal is received at location 5606. The previous anchor frame is therefore I16, 5704, in this example. The succession of Pe-frames 5605 is also shown. In case 2, the transmitted data stream 5701 comprises repeated Bf-frames 5710 in the still picture mode time period 5708. The first step forward signal is again received at location 5606. The previous anchor frame in case 2 is a P-frame, P19, 5705 due to the use of repeated Bf-frames 5710. The succession of Pe-frames 5605 is again shown in stream 5701. Finally, in case 3, the transmitted data stream 5702 comprises repeated Pe-frames 5711 to create the still picture mode in the still picture mode time period 5708. The first step forward signal is again received at location 5606. The anchor frame required for the first step picture mode frame in case 3 is again a P-frame, but this time it is frame P22, 5706. Frame P22, 5706 is the next following anchor frame in the normal data stream sequence which can be seen in the representation of the normal data stream 5703 in Fig. 57. The succession of Pe-frames 5605 is again shown in stream 5702.
The third type of step forward mode possible is I-frame based. Again it has to be determined first what I-frame should be displayed next. Following the methods already disclosed for frame based and anchor frame based step picture mode the I-frame required for the different cases can also be determined in a general way. In Fig. 58 the normal play data stream 5800 is shown with the first I-frame 5802, the second I-frame 5803 and the third I-frame 5804 of subsequent picture steps in I-frame step picture mode. After reordering the reordered data stream 5801 results.
In Fig. 59 some specific examples are shown for the three cases discussed previously. In case 1, the transmitted data stream 5900 comprises repeated B-frame data, B14, 5707 in the still picture mode time period. The first step forward signal is received at location 5606. The first frame to display is the previous I-frame and is therefore I16, 5704, in this example. The succession of Pe-frames 5605 is also shown. In case 2, the transmitted data stream 5901 comprises repeated Bf-frames 5710 in the still picture mode time period. The first step forward signal is again received at location 5606. The I-frame to be displayed in case 2 is the following I-frame, I28, 5907. The succession of Pe-frames 5605 is again shown in stream 5901. Finally, in case 3, the transmitted data stream 5902 comprises repeated Pe-frames 5711 to create the still picture mode in the still picture mode time period. The first step forward signal is again received at location 5606. The I-frame required for the first step picture mode frame in case 3 is again I-frame, I28, 5907. Frame I28, 5907 is, as for case 2, the next following I-frame in the normal data stream sequence which can be seen in the representation of the normal data stream 5906 in Fig. 59. The succession of Pe-frames 5605 is again shown in stream 5902. For the remainder, the I-frame based step forward is, in fact, identical to the I-frame based step backward already described. It should be clear that this is the only step forward mode that allows for larger step sizes. As for step backward, the step size is then equal to an integer multiple of GOPs.
The anchor frame based step forward seems to be the best choice based on the technical quality on the one hand and the step size on the other hand. The step size is fixed to the anchor frame distance in this case and a larger step size cannot be chosen. A mixture of anchor frame based and I-frame based step forward depending on the wanted step size is also possible. There is an important difference between step backward and step forward in relation to the ECM handling. In step forward mode the normal sequence of the transmitted frames is not disturbed. Only some frames are skipped in certain modes. It is expected that the original ECMs present in the stream can still be used in the step forward mode. But also here the maximum step size and step frequency are limited for encrypted streams by the timely availability of CWs for the decryption process. In relation to this the handling of ECMs for trick-play handling has been described earlier in this specification. It is emphasized that the systems described in the following can be implemented in the frame of and in combination with any of the devices or systems described referring to Fig. 1 through Fig. 59.
In the following, referring to Fig. 60, a data processing device 6000 for processing an MPEG2 data stream including video content (alternatively audio content) will be described. The device is an exemplary embodiment of the invention. By means of the processing device 6000, it is possible to perform the various method steps as described referring to Fig. 18 through Fig. 59.
The processing device 6000 may be controlled by a central processing unit (CPU) or control unit 5102 which, in turn, can be controlled by a human user by means of a user interface 5103. The device 6000 processes a data stream 6010, which may be an MPEG2 data stream, or any other data stream wherein anchor frames are employed, such as MPEG4. The data stream 6010 is retrieved, when a corresponding control signal is sent from the control unit 5102 to the hard disk drive 5101 via a system bus 5113, from a storage device 5101, which may be a hard disk drive, an optical storage device, such as a CD, DVD, a flash storage device etc. The data stream 6010 may also enter the device from a traditional broadcast channel, such as terrestrial television, digital cable or digital satellite. Newer forms of transmission are also possible sources for the data stream 6010, such as transmission via the Internet or a mobile transmission technology as employed in mobile phone systems. A digital interface such as Ethernet or IEEE 1394, also known as "Firewire" may also be used. The actual source of the data stream 6010 is not a limiting factor in implementing an embodiment of the invention.
In operation of the device 6000 a user interacts with the device using the user interface 5103. This user interface can be via a remote control, a keyboard/mouse combination, or other known interaction means. Feedback can be given to the user via a display, such as a television, LCD display or monitor, for example. Other possibilities known to the skilled person are not excluded. A user operating the device 6000 can interact with the device 6000 to initiate a step picture mode. The user interface 5103 may then send a step forward signal 6017 to a detection unit 6001, which detects a mode change from a first reproduction mode to a second reproduction mode. The first reproduction mode may be a still picture mode, also known as freeze frame mode, a slow-forward mode or a normal play mode, as typical examples. Other known trick-play modes may also be possible. The second reproduction mode may be a step picture mode. The step forward signal 6017 may also be communicated from the user interface 5103 to the detection unit 6001 via the control unit 5102 and the system bus 5113. The detection unit 6001 outputs a mode switch signal 6015 when the mode change is detected. The mode switch signal 6015 may trigger an optional switching means (not shown) to switch between a bypass path (not shown) or a processed data stream path comprising a determination unit 6003 and an insertion unit 6004. The switching means is optional since both the determination unit 6003 and the insertion unit 6004 may also be implemented as essentially pass through devices when no stream processing is required.
The determination unit 6003 may receive the data stream 6010 on an input 6002. The mere act of receiving a data stream, such as the data stream 6010, may initiate the determination unit 6003 to determine the first anchor frame necessary for the correct step picture mode frame, though it is also possible to initiate the determination unit 6003 to determine the first anchor frame in other ways such as via a control signal sent via a system bus 5113 or via the mode switch signal 6015. The actual first anchor frame determined depends upon the form that the incoming data stream 6010 takes. The various forms were described earlier with reference to Fig. 59, along with the correct first anchor frame to be used, in each case. The first anchor frame that is determined by the determination unit 6003 is indicated via a first anchor frame control signal 6011 to the insertion unit 6004. The insertion unit may receive the first anchor frame itself, i.e. as a copy of the first anchor frame data, via the first anchor frame control signal 6011. It is also possible that only a reference to the first anchor frame is sent via first anchor frame control signal 6011 from the determination unit 6003 to the insertion unit 6004 and that the first anchor frame itself is received by a separate data path 6018 in the normal data stream signal path.
In any actual implementation the insertion unit 6004 is adapted to insert the first anchor frame into the data stream 6010 as a first step in producing a processed data stream 6013. For a step picture mode that freezes the stepped picture frame, the insertion unit 6004 may be further arranged to insert a succession of frames into the processed data stream 6013 to cause a repetition of the first anchor frame on a display after decoding by the reproduction unit 5106. The frames necessary to repeat the first anchor frame may be bidirectionally predicted frames, i.e. MPEG B-frames, a subset of MPEG B-frames, such as Bf-frames, or predicted anchor frames, such as MPEG P-frames. The succession of inserted frames may be empty frames. A mixture of frame types is also possible. Many options have been discussed earlier in this specification with respect to still picture mode.
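The insertion step can be sketched as follows. The sketch assumes a simplified, hypothetical `Frame` record that carries only a frame type and a label. Real frames would carry coded picture data, and the number of repeated frames would depend on how long the picture has to be held. It shows the variant in which the first anchor frame is followed by a succession of empty P-frames.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    """Hypothetical frame record: 'I', 'P', 'B' or 'Pe' (empty P-frame)."""
    kind: str
    label: str = ""


def freeze_on_anchor(anchor: Frame, repeats: int) -> List[Frame]:
    """Return the anchor frame followed by `repeats` empty P-frames.

    An empty P-frame carries no changes relative to its reference, so a
    decoder shows the anchor picture again for each inserted frame.
    """
    if anchor.kind not in ("I", "P"):
        raise ValueError("an anchor frame must be an I- or P-frame")
    if repeats < 0:
        raise ValueError("repeats must not be negative")
    empty = Frame("Pe", f"repeat of {anchor.label}")
    return [anchor] + [empty] * repeats
```

Calling `freeze_on_anchor(Frame("I", "I0"), 3)` yields four frames: the anchor frame itself and three empty P-frames that repeat it.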
In responding to a subsequent picture step forward signal 6017 the device 6000 is in a known state. In this case, the device 6000 remains in the second reproduction mode and the detection unit 6001 does not need to change the optional switching means. The detection unit 6001 does, however, need to indicate to the determination unit 6003 and the insertion unit 6004 that a picture step forward signal 6017 was received, which it does via a step control signal 6016. Upon receiving the step control signal 6016 the determination unit 6003 determines the subsequent anchor frame as the next following anchor frame after the first anchor frame that would have occurred in the incoming data stream 6010. The insertion unit 6004 then inserts the subsequent anchor frame into the processed data stream 6013, followed by a further succession of frames. Again these frames may be of any type suitable to cause the freezing of the subsequent anchor frame. The insertion unit 6004 can supply a reproduction unit 5106 with the processed data stream 6013 via an output 6005. A complementary half of the optional switching means may also be placed in the output path and may again be controlled by the detection unit 6001 or by the control unit 5102 via system bus 5113. The reproduction unit 5106 may be located internally to the device 6000 and therefore be under control of the control unit 5102 or it may be external to the device 6000, i.e. in a remote decoder arrangement. In the latter case the reproduction unit 5106 does not, of course, need to be connected to the device system bus 5113. Therefore, the system bus 5113 connecting the reproduction unit 5106 and the control unit 5102 is entirely optional.
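Determining the next following anchor frame on each step can be sketched as a forward scan over the frame types of the incoming stream. The function name `next_anchor_index` and the representation of the stream as a list of frame-type strings in stream order are assumptions made for illustration only.

```python
from typing import Optional, Sequence

# In MPEG terms, I- and P-frames serve as anchor frames.
ANCHOR_TYPES = {"I", "P"}


def next_anchor_index(frame_types: Sequence[str], current: int) -> Optional[int]:
    """Return the index of the first anchor frame after `current`.

    `frame_types` lists the frame types of the incoming stream in stream
    order. Returns None when no further anchor frame is available.
    """
    for i in range(current + 1, len(frame_types)):
        if frame_types[i] in ANCHOR_TYPES:
            return i
    return None
```

With the stream `["I", "B", "B", "P", "B", "B", "P"]`, stepping from index 0 selects index 3, stepping again selects index 6, and a further step finds no anchor frame.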
A further embodiment of the invention is shown in Fig. 61. The device 6100, shown in Fig. 61, is similar to the device 6000 shown in Fig. 60, but has a correction unit 6101, which is inserted into the processed data stream path after the insertion unit 6004. The correction unit 6101 may correct any temporal parameters of the frames inserted by the insertion unit 6004 to further improve the compliance of the processed data stream 6013 to a standard such as one of the MPEG standards. Temporal parameters of relevance are the temporal reference, the Decoding Time Stamp, or DTS, and the Presentation Time Stamp, or PTS. The corrections required have been described earlier in the description of the step picture mode when referring to Fig. 53 and Fig. 56. The correction unit 6101 may operate under the control of control unit 5102 via the system bus 5113.
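As a rough illustration of what correcting the temporal parameters can involve, the sketch below re-stamps a run of frames so that the temporal reference, DTS and PTS increase consistently. It does not reproduce the corrections described with reference to Fig. 53 and Fig. 56. The record `StampedFrame`, the function `restamp`, the frame rate of 25 frames per second and the choice of PTS equal to DTS (no frame reordering, as in a run of one anchor frame followed by empty P-frames) are all assumptions. The 90 kHz time stamp clock, the 33-bit time stamp width and the 10-bit temporal reference follow the MPEG-2 standards.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence

PTS_MODULO = 1 << 33             # PTS and DTS are 33-bit counters
TEMPORAL_REF_MODULO = 1 << 10    # temporal reference is a 10-bit counter
TICKS_PER_FRAME = 90_000 // 25   # 90 kHz clock, assumed 25 frames per second


@dataclass(frozen=True)
class StampedFrame:
    """Hypothetical frame record holding only the temporal parameters."""
    kind: str
    temporal_reference: int = 0
    dts: int = 0
    pts: int = 0


def restamp(frames: Sequence[StampedFrame], start_dts: int,
            start_temporal_reference: int = 0) -> List[StampedFrame]:
    """Give each frame consecutive time stamps, one frame period apart."""
    out = []
    for n, frame in enumerate(frames):
        dts = (start_dts + n * TICKS_PER_FRAME) % PTS_MODULO
        out.append(replace(
            frame,
            temporal_reference=(start_temporal_reference + n) % TEMPORAL_REF_MODULO,
            dts=dts,
            pts=dts,  # assumption: no reordering, so presentation equals decoding time
        ))
    return out
```

The modulo operations keep the sketch valid when a time stamp or the temporal reference wraps around within the inserted run.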
A third embodiment of the invention is shown in Fig. 62. The device 6200, shown in Fig. 62, is similar to the device 6000 of Fig. 60 and the device 6100 of Fig. 61, but has a frame detector unit 6201 inserted into the incoming data stream path. The frame detector unit 6201 is capable of detecting the form of the incoming data stream 6010. Of relevance is the construction of the data stream 6010. For example, if the incoming data stream 6010 comprises repeated B-frame data or repeated Bf-frames, then the determination unit 6003 may need to determine a different first anchor frame from a case where the data stream 6010 is constructed from a series of empty Pe-frames. It is therefore useful if the frame detector unit 6201 can detect repeated bi-directionally predicted frames and/or repeated empty predicted frames. The frame detector unit 6201 may communicate a frame detection signal to the determination unit 6003 either directly or via the control unit 5102 and the system bus 5113. The determination unit 6003 may use the frame detection signal and the mode switch signal 6015 to determine the first anchor frame according to the situations given in Fig. 59 and the related description. The frame detector unit 6201 may operate under the control of control unit 5102 via the system bus 5113. The insertion unit 6004 operates in a similar manner to that of device 6000 and device 6100. The correction unit 6101 operates in a similar manner to that of device 6100. The correction unit 6101 has an output 6005 that may be used to provide the processed data stream 6013 to a reproduction unit 5106 via the optional switching means. Again the reproduction unit 5106 may be a remote decoder connected via a digital interface to the device 6200. The digital interface may be any suitable interface for conveying audio-video signals, such as MPEG signals.
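How the detected form of the stream can steer the choice of first anchor frame may be sketched as follows. The sketch follows the wording of claims 4 and 5 only: with repeated bi-directionally predicted frames, the anchor frame directly preceding the switching of mode is chosen, and with repeated empty predicted frames, the anchor frame directly succeeding it is chosen. The function name, the `repeated_form` argument and the convention that `switch_index` is the position in the original stream at which the mode switch occurred are assumptions.

```python
from typing import Optional, Sequence

ANCHOR_TYPES = {"I", "P"}


def first_anchor_index(frame_types: Sequence[str], switch_index: int,
                       repeated_form: str) -> Optional[int]:
    """Pick the first anchor frame for step picture mode.

    repeated_form is "B" when the first reproduction mode repeated
    B-frames, or "Pe" when it repeated empty P-frames.
    """
    if repeated_form == "B":
        # Anchor frame directly preceding the switching of mode.
        candidates = range(switch_index - 1, -1, -1)
    elif repeated_form == "Pe":
        # Anchor frame directly succeeding the switching of mode.
        candidates = range(switch_index, len(frame_types))
    else:
        raise ValueError("repeated_form must be 'B' or 'Pe'")
    for i in candidates:
        if frame_types[i] in ANCHOR_TYPES:
            return i
    return None
```

For the stream `["I", "B", "B", "P", "B", "B", "P"]` and a switch at index 4, the repeated B-frame case selects index 3 and the repeated empty P-frame case selects index 6.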
A list of abbreviations used in the specification is provided in Table 2.
AFLD Adaptation Field Control
BAT Bouquet Association Table
CA Conditional Access
CAT Conditional Access Table
CC Continuity Counter
CW Control Word
CPI Characteristic Point Information
DIT Discontinuity Information Table
DTS Decoding Time Stamp
DVB Digital Video Broadcast
ECM Entitlement Control Messages
EMM Entitlement Management Messages
GK Group Key
GKM Group Key Message
GOP Group Of Pictures
HDD Hard Disk Drive
KMM Key Management Message
MPEG Moving Picture Experts Group
NIT Network Information Table
PAT Program Association Table
PCR Program Clock Reference
PES Packetized Elementary Stream
PID Packet Identifier
PLUSI Payload Unit Start Indicator
PMT Program Map Table
PTS Presentation Time Stamp
SIT Selection Information Table
SCB Scrambling Control Bits
STB Set-top-box
SYNC Synchronization Unit
TEI Transport Error Indicator
TPI Transport Priority Unit
TS Transport Stream
UK User Key
Table 2 Abbreviations used in the specification
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be capable of designing many alternative embodiments without departing from the scope of the invention as defined by the appended claims. Furthermore, any of the embodiments described comprise implicit features, such as an internal current supply, for example, a battery or an accumulator. In the claims, any reference signs placed in parentheses shall not be construed as limiting the claims. The words "comprising" and "comprises", and the like, do not exclude the presence of elements or steps other than those listed in any claim or the specification as a whole. The singular reference of an element does not exclude the plural reference of such elements and vice versa. In a device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. The terms "data" and "content" have been used interchangeably throughout the text and are to be understood as equivalents.

Claims

1. A device (6000) for processing a data stream comprising a plurality of frames, wherein the device (6000) comprises: an input for receiving the data stream; an output for transmitting a processed data stream; a detection unit (6001) arranged to detect a switching of mode from a first reproduction mode to a second reproduction mode; a determination unit (6003) arranged to determine a first anchor frame comprised within the plurality of frames in response to the switching of mode; and an insertion unit (6004) arranged to insert frames, as inserted frames, into the data stream to produce the processed data stream in the second reproduction mode, wherein the insertion unit (6004) is arranged to insert the first anchor frame determined by the determination unit (6003) into the data stream as a first frame in the second reproduction mode; and insert a first succession of empty predictive frames subsequent to the first anchor frame.
2. The device (6000) of claim 1 wherein the detection unit (6001) is further arranged to detect a step forward signal; the determination unit (6003) is further arranged to, in response to detection of the step forward signal, determine a subsequent anchor frame, the subsequent anchor frame being the next following anchor frame comprised within the plurality of frames; and the insertion unit (6004) is arranged to insert the subsequent anchor frame determined by the determination unit (6003) into the processed data stream in the second reproduction mode; and insert a second succession of empty predictive frames subsequent to the subsequent anchor frame.
3. The device (6100) of claim 1 or claim 2 further comprising a correction unit (6101) for correcting at least one temporal parameter of the inserted frames comprised within the processed data stream and inserted by the insertion unit (6004).
4. The device (6200) of claim 1 or claim 2 further comprising a frame detector unit (6201) for detecting repeated bi-directionally predicted frames comprised within the plurality of frames in the first reproduction mode; wherein the determination unit (6003) is further arranged to: determine a further anchor frame which directly precedes the switching of mode; and determine the first anchor frame to be the further anchor frame.
5. The device (6200) of claim 1 or claim 2 further comprising a frame detector unit (6201) for detecting repeated empty predicted anchor frames comprised within the plurality of frames in the first reproduction mode; wherein the determination unit (6003) is further arranged to: determine a further anchor frame which directly succeeds the switching of mode; and determine the first anchor frame to be the further anchor frame.
6. The device (6100) of claim 3 wherein the at least one temporal parameter comprises a temporal reference.
7. The device (6100) of claim 3 wherein the at least one temporal parameter comprises a decoding time stamp.
8. The device (6100) of claim 3 wherein the at least one temporal parameter comprises a presentation time stamp.
9. The device (6000) of claim 1 wherein the first reproduction mode is one of: a still picture mode; a slow forward mode; or a normal play mode.
10. The device (6000) of claim 1 wherein the second reproduction mode is a step picture mode.
11. The device (6000) of claim 1 or claim 2 wherein the empty predictive frames comprise at least one empty MPEG P type frame.
12. The device (6200) of claim 4 wherein the bi-directionally predicted frames are MPEG B type frames.
13. The device (6200) of claim 5 wherein the empty predicted anchor frames are empty MPEG P type frames.
14. The device (6000) of claim 1 wherein the data stream comprises one or more from a selection of: video data; audio data; and digital data.
15. The device (6000) of claim 1 wherein the data stream is an MPEG2 data stream.
16. The device (6000) of claim 1 realized as at least one of the group consisting of: a digital video recording device; a network-enabled device; a conditional access system; a portable audio player; a portable video player; a mobile phone; a DVD player; a CD player; a hard disk based media player; an Internet radio device; a computer; a television; a public entertainment device; and an MP3 player.
17. A method of processing a data stream comprising a plurality of frames, the method comprising method steps of: receiving the data stream; detecting a switching of mode from a first reproduction mode to a second reproduction mode; determining a first anchor frame comprised within the plurality of frames in response to the switching of mode; inserting frames, as inserted frames, into the data stream to produce a processed data stream in the second reproduction mode; and outputting the processed data stream, wherein the method step of inserting further comprises method steps of: inserting the first anchor frame determined by the determining step into the data stream as a first frame in the second reproduction mode; and inserting a first succession of empty predictive frames subsequent to the first anchor frame.
18. The method of claim 17 wherein the method step of detecting comprises a further method step of detecting a step forward signal; the method step of determining comprises a further method step of determining, in response to detection of the step forward signal, a subsequent anchor frame the subsequent anchor frame being the next following anchor frame comprised within the plurality of frames; and the method step of inserting comprises further method steps of inserting the subsequent anchor frame into the processed data stream in the second reproduction mode and inserting a second succession of empty predictive frames subsequent to the subsequent anchor frame.
19. The method of claim 17 or claim 18 wherein the method step of detecting comprises a further method step of detecting repeated bi-directionally predicted frames comprised within the plurality of frames in the first reproduction mode; and the method step of determining comprises further method steps of: determining a further anchor frame which directly precedes the switching of mode from a first reproduction mode to a second reproduction mode; and determining the first anchor frame to be the further anchor frame.
20. The method of claim 17 or claim 18 wherein the method step of detecting comprises a further method step of detecting repeated empty predicted frames comprised within the plurality of frames in the first reproduction mode; and the method step of determining comprises further method steps of: determining a further anchor frame which directly succeeds the switching of mode from a first reproduction mode to a second reproduction mode; and determining the first anchor frame to be the further anchor frame.
21. A program element directly loadable into the memory of a programmable device, comprising software code portions for performing, when said program element is run on the device, the method steps of: receiving a data stream comprising a plurality of frames; detecting a switching of mode from a first reproduction mode to a second reproduction mode; determining a first anchor frame comprised within the plurality of frames in response to the switching of mode; inserting frames into the data stream to produce a processed data stream in the second reproduction mode; and outputting the processed data stream, wherein the method step of inserting further comprises the method steps of: inserting the first anchor frame determined by the determining step into the data stream as a first frame in the second reproduction mode; and inserting a succession of empty predictive frames subsequent to the first anchor frame.
22. A computer-readable medium directly loadable into the memory of a programmable device, comprising software code portions for performing, when said code portions are run on the device, the method steps of: receiving a data stream comprising a plurality of frames; detecting a switching of mode from a first reproduction mode to a second reproduction mode; determining a first anchor frame comprised within the plurality of frames in response to the switching of mode; inserting frames into the data stream to produce a processed data stream in the second reproduction mode; and outputting the processed data stream, wherein the method step of inserting further comprises the method steps of: inserting the first anchor frame determined by the determining step into the data stream as a first frame in the second reproduction mode; and inserting a succession of empty predictive frames subsequent to the first anchor frame.
PCT/IB2006/054944 2005-12-23 2006-12-19 A device for and a method of processing a data stream WO2007072419A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP05112884.1 2005-12-23
EP05112884 2005-12-23

Publications (2)

Publication Number Publication Date
WO2007072419A2 true WO2007072419A2 (en) 2007-06-28
WO2007072419A3 WO2007072419A3 (en) 2008-06-12

Family

ID=38189060

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2006/054944 WO2007072419A2 (en) 2005-12-23 2006-12-19 A device for and a method of processing a data stream

Country Status (1)

Country Link
WO (1) WO2007072419A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2106131A3 (en) * 2008-03-26 2012-07-25 Kabushiki Kaisha Toshiba Progressive scan conversion device and method for performing progressive scan conversion

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996013121A1 (en) * 1994-10-20 1996-05-02 Thomson Consumer Electronics, Inc. Hdtv trick play stream derivation for vcr
WO2002087232A1 (en) * 2001-04-24 2002-10-31 Koninklijke Philips Electronics N.V. Method and device for generating a video signal
US20030231863A1 (en) * 1998-06-11 2003-12-18 Koninklijke Philips Electronics N.V. Trick play signal generation for a digital video recorder using retrieved intra-encoded pictures and generated inter-encoded pictures

Legal Events

Date Code Title Description
NENP Non-entry into the national phase in:

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 06842603

Country of ref document: EP

Kind code of ref document: A2

122 Ep: pct application non-entry in european phase

Ref document number: 06842603

Country of ref document: EP

Kind code of ref document: A2