GB2495501A - Image decoding method based on information predictor index - Google Patents

Image decoding method based on information predictor index

Info

Publication number
GB2495501A
GB2495501A (application GB1117497.6A / GB201117497A)
Authority
GB
United Kingdom
Prior art keywords
index
predictor
information
decoding
predictors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB1117497.6A
Other versions
GB201117497D0 (en)
GB2495501B (en)
Inventor
Christophe Gisquet
Naël Ouedraogo
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to GB1117497.6A priority Critical patent/GB2495501B/en
Publication of GB201117497D0 publication Critical patent/GB201117497D0/en
Publication of GB2495501A publication Critical patent/GB2495501A/en
Application granted granted Critical
Publication of GB2495501B publication Critical patent/GB2495501B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103: Selection of coding mode or of prediction mode
    • H04N19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/107: Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136: Incoming video signal characteristics or properties
    • H04N19/137: Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139: Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: the unit being an image region, e.g. an object
    • H04N19/176: the region being a block, e.g. a macroblock
    • H04N19/189: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/192: the adaptation method, adaptation tool or adaptation type being iterative or recursive
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/513: Processing of motion vectors
    • H04N19/517: Processing of motion vectors by encoding
    • H04N19/52: Processing of motion vectors by encoding by predictive encoding
    • H04N19/557: Motion estimation characterised by stopping computation or iteration based on certain criteria, e.g. error magnitude being too large or early exit
    • H04N19/593: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N19/60: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N19/85: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/86: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
    • H04N19/90: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96: Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)

Abstract

A method and device for decoding a bitstream comprising a digital image sequence, encoded using reference image prediction (e.g. motion compensation (MC)), comprises: determining an index i of an information predictor (motion vector) selected from a set of information predictors; and decoding the image portion using the predictor indexed by index i and retrieved from the set, wherein determining the index i comprises iteratively constructing the set of predictors and, for at least one iteration during the iterative construction of the set, reading at least one item of binary information (e.g. a single bit, value 0) from the bitstream and determining, based on the read binary information, whether to stop the iterations and return an index i based on a current state of the set. Also claimed is a similar method in which positive detection of an end bit 810 of a (unary) code word, coding a predictor index, stops 812 the iterative vector/predictor determination. The reading of binary data from the bitstream may be based on the predictor set state, e.g. whether the set contains more than one predictor. Index i may be returned based on reaching a maximum number of iterations.

Description

METHOD AND DEVICE FOR DECODING A BITSTREAM COMPRISING AN
ENCODED SEQUENCE OF IMAGES
The present invention concerns a method and device for decoding a bitstream comprising an encoded sequence of digital images.
The invention belongs to the field of digital signal processing, and in particular to the field of video compression using some kind of prediction to reduce the amount of compressed data, i.e. the video bitrate.
An example of prediction is motion compensation to reduce spatial and temporal redundancies in video streams. While the invention may apply to any kind of prediction provided that a selection from information prediction candidates is made at the encoder, motion compensation is preferred for illustrative purposes in the present document.
Another example of information prediction selection is disclosed in publication US 2005/0254717, wherein the encoding method comprises selecting a prediction method from a set of prediction method candidates. Next, a rank or "index" identifying the method selected from the set is specified in the bitstream for the decoder.
Many video compression formats, such as for example H.263, H.264, MPEG-1, MPEG-2, MPEG-4, and SVC, use block-based discrete cosine transform (DCT) and motion compensation to remove spatial and temporal redundancies. They can be referred to as predictive video formats. Each frame or image of the video signal is divided into slices which are encoded and can be decoded independently. A slice is typically a rectangular portion of the frame, or more generally, a portion of a frame or an entire frame. Further, each slice is divided into macroblocks (MBs), and each macroblock is further divided into blocks, typically blocks of 8x8 or 4x4 pixels. The encoded frames are of two types: temporally predicted frames (either predicted from one reference frame, called P-frames, or predicted from two reference frames, called B-frames) and non-temporally predicted frames (called INTRA frames or I-frames).
Temporal prediction consists in finding in a reference frame, either a previous or a subsequent frame of the video sequence, an image portion or reference area which is the closest to the block to encode. This step is known as motion estimation. Next, the block is predicted using the reference area (motion compensation): the difference (or "residual") between the block to encode and the reference portion is encoded, along with an item of motion information relative to the motion vector which indicates the reference area to use for motion compensation.
In order to further reduce the cost of encoding motion information, it has been proposed to encode a motion vector in terms of a difference between the motion vector and a motion vector predictor, typically computed from the motion vectors of the blocks surrounding the block to encode.
In H.264 (or AVC for Advanced Video Coding), motion vectors are encoded with respect to a median predictor computed from the motion vectors situated in a causal neighbourhood of the block to encode, for example from the blocks situated above and to the left of the block to encode. Only the difference, also called residual motion vector, between the median predictor and the current block motion vector is encoded.
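As an illustration of this median-based scheme, the following sketch computes a component-wise median predictor from the causal neighbours and the residual motion vector that would actually be encoded. The function names and the tuple representation of vectors are illustrative, not taken from any codec implementation:

```python
def median_mv_predictor(neighbours):
    """Component-wise median of the motion vectors taken from the
    causal neighbourhood (e.g. left, above, above-right) of the block."""
    xs = sorted(v[0] for v in neighbours)
    ys = sorted(v[1] for v in neighbours)
    mid = len(neighbours) // 2
    return (xs[mid], ys[mid])

def residual_mv(mv, predictor):
    """Residual motion vector: the only motion data actually encoded."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])
```

With the usual three neighbours, the median of each component is simply the middle sorted value, so the predictor tends to track the dominant local motion even when one neighbour is an outlier.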
Encoding using residual motion vectors saves some bitrate, but necessitates the decoder to perform the same computation of the motion vector predictor in order to decode the value of the motion vector of a block to decode.
Recently, further improvements have been proposed, such as using a plurality of possible motion vector predictors. This method, often referred to as motion vector competition (MVCOMP), consists in determining from among several motion vector predictors or candidates which motion vector predictor minimizes the encoding cost, typically a rate-distortion (RD) cost, of the residual motion information. The residual motion information comprises the residual motion vector, i.e. the difference between the actual motion vector of the block to encode and the selected motion vector predictor, and an item of information indicating the selected motion vector predictor, such as for example an encoded value of the index (referred to as "predictor index") of the motion vector predictor selected from the candidates.
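A minimal sketch of such a competition follows, using a simplified cost in which the residual rate is approximated by the magnitude of the residual components and `lam` weights the index signalling cost. This toy cost model is an assumption for illustration; real encoders evaluate an actual rate-distortion measure:

```python
def select_predictor(mv, candidates, lam, index_bits):
    """Return the index of the candidate minimising a simplified cost:
    residual magnitude plus lam times the index signalling cost.
    index_bits(i) gives the code word length for predictor index i."""
    best_index, best_cost = 0, float("inf")
    for i, cand in enumerate(candidates):
        rx, ry = mv[0] - cand[0], mv[1] - cand[1]   # residual motion vector
        cost = abs(rx) + abs(ry) + lam * index_bits(i)
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index
```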
In the current version of High Efficiency Video Coding (HEVC) (which is still subject to modifications), it has been proposed to use a plurality of motion vector predictors (five) as schematically illustrated in the example of Figure 1: three so-called spatial motion vector predictors V1, V2 and V3 taken from already-encoded blocks situated in the neighbourhood of the block to encode (the motion vector predictors are said to be "spatial" because they come from blocks belonging to the current frame); a median motion vector predictor computed on the basis of components of the three spatial motion vector predictors V1, V2 and V3; and a temporal motion vector predictor V0 which is the motion vector of the co-located block in a previously encoded image of the sequence (e.g. the block of image N-1 located at the same spatial position as the block being coded of image N; the corresponding predictor is said to be the "collocated motion vector predictor").
Currently in HEVC the three spatial motion vector predictors are taken from the block situated to the left of the block to encode (V3), the block situated above (V2) and from one of the blocks situated at the respective corners of the block to encode, according to a predetermined rule of availability.
This motion vector predictor selection scheme is called Advanced Motion Vector Prediction (AMVP), which allows the selection of the best predictor from a given set. In the example of Figure 1, the vector V1 of the block situated above left is selected. Finally, a set of five motion vector predictor candidates mixing spatial predictors and temporal predictors is obtained.
This operation of producing the set of motion vector predictor candidates is referred to as the "motion vector predictor derivation".
In order to reduce the overhead of signaling the motion vector predictor in the bitstream, the set of motion vector predictors is reduced by eliminating the duplicate motion vectors, i.e. the motion vectors which have the same value. For example, in the illustration of Figure 1, V1 and V2 are equal, and V0 and V3 are also equal, so only two of them should be kept as motion vector predictor candidates, for example V0 and V1. In this case, only one bit is necessary to indicate the index of the motion vector predictor to the decoder.
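This reduction step can be sketched as an order-preserving removal of duplicates, performed identically at the encoder and the decoder (illustrative code, not the HEVC reference implementation):

```python
def reduce_predictor_set(predictors):
    """Remove duplicate motion vectors, keeping first-seen order, as in
    the deletion/reduction step performed at both encoder and decoder."""
    reduced = []
    for p in predictors:
        if p not in reduced:          # duplicate value: do not keep it
            reduced.append(p)
    return reduced
```

In the Figure 1 situation (V1 equal to V2, and V0 equal to V3), four distinct positions yield only two candidate values, so a single bit suffices to signal the index.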
The set of candidates may even be reduced to a single item, for example if all motion vector predictors are equal. In such a case it is not necessary to insert any information relative to the selected motion vector predictor in the bitstream since the decoder (which performs the same derivation and reduction) can infer the motion vector predictor value.
In AMVP the optimal amount of spatial and temporal predictors can be evaluated. A current implementation includes, for example, two spatial predictors and one temporal collocated predictor (i.e. the motion vector at the same position in a reference frame).
To summarize, the encoding of motion vectors by difference from a motion vector predictor, along with the reduction of the number of motion vector predictor candidates leads to a compression gain.
In AVC and HEVC, one block to encode may also be predicted by a weighted combination of two reference portions taken from two reference frames. In that case, two motion vectors and two reference indexes of the two reference frames are encoded in the bitstream. Such a block is called a Bidir or B-block. On the other hand, when a single motion vector is used for block prediction and then encoded, the block is called an inter or P-block.
Previously decoded images are stored in a decoded picture buffer (DPB) and are used as reference frames. Frames of the decoded picture buffer (DPB) are ordered in two different reference lists LIST0 and LIST1 that are determined symmetrically by the encoder and decoder. Each motion vector of one block is associated with one reference frame index value that indicates which frame of the decoded picture buffer is used as reference frame. The coding mode of one block defines which reference list, LIST0 or LIST1, is used.
Figure 2 is a flow chart schematically illustrating steps of an encoding process implementing the motion vector predictor derivation.
The step of derivation of motion vector predictors 204 generates the motion vector predictors set 205 by taking into account the current reference frame index 202 if needed (INTER mode) and the motion vector field 201 grouping the motion vectors computed for the already-encoded image blocks of the current frame and of the previous frames. Next, a deletion or reduction process, as will be described below, can be applied to the motion vector predictors set 205 in step 206 to produce a reduced motion vector predictors set 207. In step 208 the number of motion vector predictors 209 in this reduced set 207 is determined. Next, the best motion vector predictor is selected with the RD selection criterion and its corresponding index 203 (identifying it within the set) is received by the module for predictor index signalling. If the number 209 of motion vector predictors is higher than one, the index of the best motion vector predictor 203 is converted in the module 210 into a code word 211, the length (i.e. number of bits) of which may depend on the number 209 of motion vector predictors.
This code word, if it exists, is then entropy coded in 212.
Figure 3 shows the flow chart of the AMVP scheme applying the above approach at the decoder end.
In step 304, the motion vector predictors set 305 is generated based on the motion vectors field 301 of the current frame and of the previous frames, and based on the current reference frame index 302 if needed. Next, a deletion or reduction process can be applied in step 306 to the motion vector predictors set 305 to produce a reduced motion vector predictors set 307. The same derivation scheme is used at the encoding and decoding ends, in such a way that the reduced set 307 is similar for the encoder and for the decoder. This ensures synchronization between the two ends and correct decoding of the predictor indexes and corresponding motion vector predictor.
In step 308 the number of motion vector predictors 309 in this reduced set 307 is determined. The number of predictors 309 is then used in step 310 for the entropy decoding of the predictor index code word 311 (if needed) retrieved from the bitstream 303. In step 312 the code word 311 (if it exists) obtained in step 310 is converted into an index defining the motion vector predictor index 313 within the set 307. This process is sufficient for index decoding or predictor index parsing.
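Assuming a truncated unary code (described further below), this conventional parsing can be sketched as follows; note that the full reduced set, and hence `num_predictors`, must already be known before the code word can be read, which is precisely the complexity the invention avoids. Names are illustrative:

```python
def parse_predictor_index(bits, num_predictors):
    """Conventional predictor index parsing under a truncated unary
    code: '1' extends the prefix, '0' is the end bit, and the largest
    code word omits its end bit. bits is an iterator over 0/1 ints."""
    if num_predictors <= 1:
        return 0                      # nothing signalled: index inferred
    index = 0
    while index < num_predictors - 1:
        if next(bits) == 0:           # end bit reached: code word complete
            break
        index += 1                    # a '1' extends the unary prefix
    return index
```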
In step 314, the motion vector predictor 315 indexed by the predictor index 313 is then extracted from the reduced set 307. The same motion vector predictor 315 as the one used at the encoder is thus obtained thanks to synchronization between the encoder and the decoder. That motion vector predictor 315 is then used to decode the current block by motion prediction.
Motion vector predictor derivation has a significant impact on encoding and decoding complexity. In particular, for the current HEVC motion vector predictor derivation, each predictor position (above, left, temporal, below left, above right) should be derived. Such derivation processes involve memory accesses, scaling, etc. Moreover, each predictor is compared to all other predictors in order to eliminate duplicate candidates. Indeed such a derivation process, which is mandatory to be able to determine the number of predictors for the entropy decoding, may constitute up to 6 to 10% of the decoding process.
In general, where an information predictor (e.g. motion vector predictor) selected from a set is encoded in the bitstream and has to be retrieved based on the number of information predictors in the set when decoding, there is an increase of complexity on the decoding to obtain the set and this required number.
The present invention has been devised to address one or more of the foregoing concerns.
According to a first aspect of the invention there is provided a method of decoding a bitstream comprising an encoded sequence of images, at least one portion of an image having been encoded using prediction with respect to reference image data, the method comprising, for at least one image portion to be decoded, determining an index i of an information predictor selected from a set of information predictors for decoding the image portion and decoding the image portion using the information predictor indexed by index i and retrieved from the set, wherein determining the index i comprises iteratively constructing the set of information predictors, wherein at least one iteration during the iterative construction of the set comprises: reading at least one item of binary information from the bitstream; and determining, based on the read binary information, whether to stop the iterations and return an index i based on a current state of the set.
The invention may substantially decrease the computational complexity at the decoder, since not all the information predictors (e.g. the motion information predictors) need to be derived to be able to decode the predictor index, contrary to the known techniques.
This is achieved by stopping the iterations for forming the information predictors set on detecting that the selected information predictor has been computed and its corresponding index fully decoded. According to the invention, such detection relies on binary information in the bitstream, for example on binary information related to the selected information predictor such as the code word encoding the corresponding predictor index as explained below with respect to an exemplary code based on a unary code.
In one embodiment of the invention, determining whether to stop the iterations and return the index i comprises verifying whether the read binary information represents a code word end and, if it does, stopping the iterations and returning a counter value as the index i; otherwise, incrementing the counter.
According to this configuration, the binary information is included in the code words. Thus, the encoding operations do not need to be modified to implement the invention. The latter is therefore only an alternate decoding processing that provides lower complexity.
Furthermore, since the binary information represents a code word end, the decoder is certain to have fully decoded the predictor index, even if the whole set of information predictors has not been constructed.
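The early-stopping decoding described above can be sketched as follows. Here `candidate_positions` is a hypothetical list of zero-argument callables, each deriving one candidate (left, above, temporal, ...), so that derivation happens lazily, one predictor per iteration; a '0' end bit stops both the index parsing and the set construction. The interface is an assumption for illustration, not the reference decoder:

```python
def decode_index_early_stop(bits, candidate_positions, max_predictors):
    """Iteratively build the predictor set, reading one bit per added
    predictor once the set holds more than one candidate; a '0' end bit
    stops both the parsing and the derivation."""
    predictors = []
    counter = 0
    for derive in candidate_positions:
        cand = derive()               # derive this candidate only now
        if cand in predictors:
            continue                  # duplicate: set unchanged, no bit read
        predictors.append(cand)
        if len(predictors) < 2:
            continue                  # a single predictor needs no signalling
        if next(bits) == 0:           # code word end: stop the iterations
            return counter
        counter += 1
        if len(predictors) == max_predictors:
            break                     # truncated code word: no end bit follows
    return counter
```

When the first read bit is a '0', the remaining candidates (e.g. a temporal predictor that would require scaling and memory accesses) are simply never derived, which is where the complexity saving comes from.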
In particular, the at least one item of binary information may be a single bit.
This provision ensures low complexity when determining whether to stop the iterations.
According to a particular feature, the indexes of information predictors are encoded in the bitstream using a code at least partly built on a unary code, for example a truncated unary code (i.e. with the largest code word being truncated by deleting its end or termination bit). Using a unary-based code ensures that nearly every code word ends with a particular bit operating as a termination bit. This ensures low complexity for detecting the end of predictor indexes.
Furthermore, the unary-based code words may be assigned to the information predictors in the same order as these predictors are computed (i.e. the smallest unary code word, with a single bit, is assigned to the first computed predictor, and so on until the biggest unary code word is assigned), which is an easy-to-implement way of ensuring synchronization between the encoder and the decoder. This way of assigning ensures that, when the decoder detects a termination or end bit during the iterative construction of the set, the corresponding information predictor has already been computed and thus is available in the current set. In this situation, it is clear that there is no need to further compute other information predictors, saving computational operations.
Depending on the implementation of the unary code, the code words are made of one or several '1's followed by a single '0' as an end bit, or are made of one or several '0's followed by a single '1' as an end bit. In these two cases respectively, the single bit having the value '0' (respectively '1') represents a code word end.
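For the first of these two conventions ('1' extending the prefix, '0' terminating), the truncated unary code words can be generated as follows (illustrative sketch):

```python
def truncated_unary_codeword(index, num_predictors):
    """Code word for a predictor index, with '1' extending the prefix
    and '0' as the end bit; the largest code word drops its end bit,
    and a single-predictor set needs no code word at all."""
    if num_predictors <= 1:
        return ""                     # index inferred by the decoder
    if index == num_predictors - 1:
        return "1" * index            # truncated: termination bit deleted
    return "1" * index + "0"
```

Note that every code word except the largest one ends in '0', which is exactly the property the decoder exploits to detect a code word end with a single-bit test.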
In one embodiment of the invention, the information predictors comprise motion information predictors used in motion compensation for video compression, e.g. motion vector predictors.
In another embodiment of the invention, reading at least one item of binary information from the bitstream is decided based on a state of the set being constructed during the at least one iteration. This ensures an appropriate decision is taken to avoid complexity higher than in conventional approaches.
In particular, the state of the set for the decision comprises whether or not the set of information predictors contains more than one information predictor. This makes it possible to align the iterative counter value with the index values, since the first index value assigned to an information predictor is usually 0.
According to another particular feature, the method may further comprise preliminarily determining a maximum number of predictor-construction iterations and determining whether or not the number of performed iterations has reached the maximum number, and if it has reached the maximum number, returning the index i. This ensures detection of the truncated unary code corresponding to the last constructed predictor.
In particular, the method may further comprise decoding from the bitstream an item of information representing a coding mode for the at least one image portion to be decoded, wherein the determined maximum number depends on the decoded coding mode information. This is because, depending on the INTER, MERGE, INTRA, etc. coding mode, the number of information predictors may vary. Of course, the determined maximum number may depend on other criteria.
According to another particular feature, the maximum number depends on the availability of motion information associated with image portions neighbouring the at least one image portion to be decoded.
In another embodiment of the invention, the at least one iteration comprises computing an information predictor and adding the computed information predictor to the set if the latter does not already comprise that computed information predictor. This avoids having duplicate predictors in the set, mirroring the deletion or reduction process performed at the encoder.
In yet another embodiment of the invention, the index i is directly related to the number of predictors in the set in the course of construction, at a given iteration.
According to a second aspect of the invention there is provided a decoding device for decoding a bitstream comprising an encoded sequence of images, at least one portion of an image having been encoded using prediction with respect to reference image data, the decoding device comprising index determining means for determining, for at least one image portion to be decoded, an index i of an information predictor selected from a set of information predictors for decoding the image portion, and decoding means for decoding the image portion using the information predictor indexed by index i and retrieved from the set, wherein the index determining means are configured to iteratively construct the set of information predictors, and comprise reading means for reading, in at least one iteration of the iterative construction of the set, at least one item of binary information from the bitstream, and determining means for determining, based on the read binary information, whether to stop the iterations and return an index i based on a current state of the set.
A further aspect of the invention provides a method of decoding a bitstream comprising an encoded sequence of images, at least one portion of an image having been encoded using prediction with respect to reference image data, the method comprising, for at least one image portion to be decoded, detecting, in the bitstream, an end bit of a code word coding an index of an information predictor for decoding the image portion, the code word belonging to a code at least partly built on a unary code, and upon positive detection, stopping an iterative construction of a set of information predictors and returning, based on the current state of the set, the index of the information predictor for decoding the image portion.
Another aspect of the invention relates to a computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for carrying out the method as set out above when loaded into and executed by the programmable apparatus.
Yet another aspect of the invention relates to a non-transitory computer-readable storage medium, able to be read by a programmable apparatus, storing instructions of a computer program for carrying out the method as set out above when loaded into and executed by the programmable apparatus.
The decoding device, the computer program product and the computer-readable storage medium may have features and advantages that are analogous to those set out above and below in relation to the decoding method, in particular that of reducing complexity when decoding a bitstream comprising an encoded sequence of images.
Another aspect of the invention relates to a method for decoding a bitstream substantially as herein described with reference to, and as shown in, Figure 8; Figures 3 and 8; Figures 3, 7 and 8 of the accompanying drawings.
Optional features of the invention are further defined in the dependent appended claims.
At least parts of the method according to the invention may be computer implemented. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects which may all generally be referred to herein as a "circuit", "module" or "system". Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
Since the present invention can be implemented in software, the present invention can be embodied as computer readable code for provision to a programmable apparatus on any suitable carrier medium. A tangible carrier medium may comprise a storage medium such as a floppy disk, a CD-ROM, a hard disk drive, a magnetic tape device or a solid state memory device and the like. A transient carrier medium may include a signal such as an electrical signal, an electronic signal, an optical signal, an acoustic signal, a magnetic signal or an electromagnetic signal, e.g. a microwave or RF signal.
Embodiments of the invention will now be described, by way of example only, and with reference to the following drawings in which:
Figure 1 is a schematic diagram of a set of motion vector predictors used in a motion vector prediction process;
Figure 2 is a flow chart illustrating steps of an encoding process of the prior art;
Figure 3 is a flow chart illustrating steps of a decoding process of the prior art;
Figure 4 is a block diagram illustrating components of a processing device in which embodiments of the invention may be implemented;
Figure 5 is a block diagram illustrating components of an encoder device;
Figure 6 is a schematic diagram of a set of motion vector predictors used in a motion vector prediction process;
Figure 7 is a block diagram illustrating components of a decoder device according to embodiments of the invention;
Figure 8 is a flow chart illustrating steps of a general decoding process according to embodiments of the invention; and
Figure 9 illustrates angular prediction methods for the INTRA coding mode.
Figure 4 schematically illustrates a processing device 400 configured to implement at least one embodiment of the present invention. The processing device 400 may be a device such as a micro-computer, a workstation or a light portable device. The device 400 comprises a communication bus 413 to which there are preferably connected:
- a central processing unit 411, such as a microprocessor, denoted CPU;
- a read only memory 407, denoted ROM, for storing computer programs for implementing the invention;
- a random access memory 412, denoted RAM, for storing the executable code of the method of embodiments of the invention as well as the registers adapted to record variables and parameters necessary for implementing the method of decoding a bitstream according to embodiments of the invention, and/or a corresponding method for encoding a sequence of digital images; and
- a communication interface 402 connected to a communication network 403 over which digital data to be processed are transmitted.
Optionally, the apparatus 400 may also include the following components:
- a data storage means 404, such as a hard disk, for storing computer programs for implementing methods of one or more embodiments of the invention and data used or produced during the implementation of one or more embodiments of the invention;
- a disk drive 405 for a disk 406, the disk drive being adapted to read data from the disk 406 or to write data onto said disk;
- a screen 409 for displaying data and/or serving as a graphical interface with the user, by means of a keyboard 410 or any other pointing means.
The apparatus 400 can be connected to various peripherals, such as for example a digital camera 400 or a microphone 408, each being connected to an input/output card (not shown) so as to supply multimedia data to the apparatus 400.
The communication bus provides communication and interoperability between the various elements included in the apparatus 400 or connected to it. The representation of the bus is not limiting and in particular the central processing unit is operable to communicate instructions to any element of the apparatus 400 directly or by means of another element of the apparatus 400.
The disk 406 can be replaced by any information medium such as for example a compact disk (CD-ROM), rewritable or not, a ZIP disk or a memory card and, in general terms, by an information storage means that can be read by a microcomputer or by a microprocessor, integrated or not into the apparatus, possibly removable and adapted to store one or more programs whose execution enables the method of decoding a bitstream according to the invention to be implemented.
The executable code may be stored either in read only memory 407, on the hard disk 404 or on a removable digital medium such as for example a disk 406 as described previously. According to a variant, the executable code of the programs can be received by means of the communication network 403, via the interface 402, in order to be stored in one of the storage means of the apparatus 400, such as the hard disk 404, before being executed.
The central processing unit 411 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to the invention, which instructions are stored in one of the aforementioned storage means. On powering up, the program or programs that are stored in a non-volatile memory, for example on the hard disk 404 or in the read only memory 407, are transferred into the random access memory 412, which then contains the executable code of the program or programs, as well as registers for storing the variables and parameters necessary for implementing the invention.
In this embodiment, the apparatus is a programmable apparatus which uses software to implement the invention. However, alternatively, the present invention may be implemented in hardware (for example, in the form of an Application Specific Integrated Circuit or ASIC).
Figure 5 illustrates a block diagram of an encoder implementing motion compensation when encoding digital images. The encoder is represented by connected modules, each module being adapted to implement, for example in the form of programming instructions to be executed by the CPU 411 of device 400, at least one corresponding step of a method for encoding a sequence of digital images.
An original sequence of digital images i0 to in 501 is received as an input by the encoder 50. Each digital image is represented by a set of samples, known as pixels.
A bitstream 510 is output by the encoder 50 after implementation of the encoding process.
The bitstream 510 comprises a plurality of encoding units or slices, each slice comprising a slice header for transmitting encoding values of encoding parameters used to encode the slice and a slice body, comprising encoded video data.
The input digital images i0 to in 501 are divided into blocks of pixels by module 502. The blocks are image portions and may be of variable sizes (e.g. 4x4, 8x8, 16x16, 32x32). A coding mode is selected for each input block. There are two families of coding modes: coding modes based on spatial prediction coding (INTRA prediction), and coding modes based on temporal prediction (INTER coding, Bidir, SKIP). The possible coding modes are tested.
Module 503 implements INTRA prediction, in which the given block to be encoded is predicted by a predictor block computed from pixels from the neighbourhood of said block to be encoded. If INTRA coding is selected, an indication (e.g. a direction) of the selected INTRA predictor and the difference between the given block and its predictor block, providing a residual, are encoded.
Temporal prediction is implemented by motion estimation module 504 and motion compensation module 505. Firstly a reference image from a set of reference images 516 is selected, and a portion of the reference image, also called reference area or predictor block (generally of the block size), which is the closest area to the given block to be encoded, is selected by the motion estimation module 504. Motion compensation module 505 then predicts the block to be encoded using the selected area. The difference between the selected reference area and the given block, also called "residual block", is computed by the motion compensation module 505. The selected reference area is indicated by a motion vector, together with an index identifying the corresponding reference frame.
Thus in both cases (spatial and temporal prediction), a residual is computed by subtracting the predictor block from the original block to encode.
In the INTRA prediction implemented by module 503, a prediction direction is encoded.
In the temporal prediction, at least one motion vector is encoded.
Information relative to the motion vector and the residual block is encoded if the INTER prediction is selected.
To further reduce the bitrate for temporal prediction, the motion vector is encoded by difference (also called residual motion vector) with respect to a motion vector predictor.
A set of motion vector predictors, also called generically motion information predictors in the invention, is obtained from the motion vectors field 518 (equivalent to 201 of Figure 2) by a motion vector prediction and coding module 517. The motion vectors field 518 comprises in particular the motion vectors that have been computed and used for blocks already encoded in the current frame and in previous frames.
The set of motion vector predictors used to select a best motion vector predictor to encode a current motion vector may be generated from the motion vectors
of field 518 in various ways.
The generation of the set of motion vector predictors, which is the same at the encoder and at the decoder, may in particular comprise a so-called step of derivation and a step of reduction as already explained above with reference to Figure 2.
An example of a process of derivation of motion vector predictors for both the encoding and the decoding processes is now described with reference to Figure 6, which illustrates a current AMVP implementation for HEVC coding, according to which a set of motion vector predictors with only two spatial motion vectors and one temporal collocated vector is computed from neighbouring blocks. As already introduced above, depending on the coding mode for example, there may be a different number of motion vector predictors in the set, for example five predictors comprising three spatial motion vectors, one median motion vector and one temporal motion vector.
In the predictor set represented in Figure 6, the two spatial motion vectors of the INTER mode are chosen from among those blocks which are above and left of the block to be encoded, including the above corner blocks and left corner blocks. "A motion vector is chosen from a block" means that, if said block has been temporally encoded (using a motion vector), the corresponding motion vector is selected.
The spatial "left" predictor is selected from among the motion vectors of blocks labelled "Below Left" and "Left" in the figure. The spatial "above" predictor is selected from among the motion vectors of blocks labelled "Above Right", "Above" and "Above Left".
If no motion vector value is found, the corresponding "left" or "above" predictor is considered as unavailable. In that case, it means that the related blocks were INTRA coded or that these blocks do not exist.
The temporal motion predictor generally comes from the nearest reference frame in an encoding configuration privileging a low encoding latency.
At the end of the motion vector predictor derivation, the set of motion information predictors generated contains 0, 1, 2 or 3 predictors.
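This derivation can be sketched as follows. This is a simplified illustration only: the function and argument names are assumptions, the selection order within each group is taken from the description above, and reference-frame scaling of the temporal predictor is omitted.

```python
def derive_predictor_set(below_left, left, above_right, above, above_left, temporal):
    """Derive up to 3 predictors: one 'left', one 'above' and one temporal.

    Each argument is a motion vector tuple (mv_x, mv_y), or None when the
    neighbouring block is INTRA-coded or does not exist.
    """
    predictors = []
    # Spatial "left" predictor: first available among "Below Left" and "Left".
    left_pred = next((mv for mv in (below_left, left) if mv is not None), None)
    if left_pred is not None:
        predictors.append(left_pred)
    # Spatial "above" predictor: first available among "Above Right",
    # "Above" and "Above Left".
    above_pred = next((mv for mv in (above_right, above, above_left)
                       if mv is not None), None)
    if above_pred is not None:
        predictors.append(above_pred)
    # Temporal predictor from the collocated block of the reference frame.
    if temporal is not None:
        predictors.append(temporal)
    return predictors
```

With all neighbours unavailable the returned set is empty, matching the case described below where both motion vector components are coded without prediction.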
If no predictor is defined in the set, the motion vector to encode is not compared to a predictor. Both vertical and horizontal components of the motion vector to encode are coded on their own, i.e. without comparison with a predictor.
The MERGE mode (defined in the HEVC recommendation) is a particular form of INTER coding. Two sub-modes are considered.
The first one is the MERGE SKIP mode, according to which only the index of a predictor (if used) is transmitted in the bitstream, without a texture residual block.
The second one is the conventional MERGE mode, where the partitioning of the current block, the texture residual block (if needed) and the corresponding motion vector predictor index are all transmitted.
For both MERGE modes, no residual motion vector is transmitted, meaning that only the predictor index is transmitted.
For the MERGE modes, five predictors are considered, in the following order "Left", "Above", "Temporal", "Above Right", "Below Left".
Of course, various coding modes may lead to different numbers of predictors.
Following the derivation of motion vector predictors, a deletion or reduction operation to reduce the set of predictors is applied, in order to minimize the number of candidates in the set. This would consequently reduce the rate of the predictor index.
Practically, duplicate motion vectors are simply removed from the set.
Duplicated motion vectors are vectors having the same values (in particular horizontal and vertical components).
For the particular case of the MERGE mode, the deletion process may take into account the component values of the motion vectors and their reference frame indexes. Thus, the two components of a motion vector predictor and its reference index are compared to the corresponding components and index of each other predictor in the set: only if these three values are equal is the duplicate motion vector predictor removed from the set.
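A minimal sketch of this deletion process for the MERGE case is given below. Representing a predictor as a (mv_x, mv_y, ref_idx) tuple is an assumption made for illustration; the rule that a duplicate is removed only when all three values match follows the description above.

```python
def reduce_predictor_set(predictors):
    """Remove duplicate predictors, keeping the first occurrence of each.

    A predictor is a (mv_x, mv_y, ref_idx) tuple; two predictors are
    duplicates only if all three values are equal.
    """
    reduced = []
    for predictor in predictors:
        if predictor not in reduced:  # tuple equality compares all three values
            reduced.append(predictor)
    return reduced
```

Note that two predictors sharing the same motion vector components but pointing to different reference frames are both kept.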
Returning to Figure 5, the motion vector prediction and coding module 517 selects the best motion vector predictor from the constructed set and then encodes the corresponding predictor index and residual motion vector in the bitstream.
Signalling of the index of the selected motion vector predictor depends on the motion vector predictors set constructed as described above. More particularly, the amount of bits allocated to the signalling depends on the number of motion vector predictors remaining in the constructed set. For instance, if at the end of the iterative construction algorithm, only one motion vector remains, no overhead is required to signal the motion vector predictor index, since the index can easily be retrieved by the decoder.
Table 1 gives an example of code word assignment for each index according to the number N of predictors in the set after the deletion process. This example is a truncated unary code: the last code word is made only of bits "1", without the end bit "0".
                    Code word when the number of predictors in the set is N
  Predictor index   N=1             N=2     N=3     N=4     N=5
  0                 - (inferred)    0       0       0       0
  1                                 1       10      10      10
  2                                         11      110     110
  3                                                 111     1110
  4                                                         1111
Table 1
In this example, the shortest unary code word (0) is assigned to the first constructed motion vector predictor (index = 0), the second shortest (10) to the second constructed motion vector predictor (index = 1), and so on (note that when N = 2, since the second shortest code is also the last code, the truncated unary code is 1).
The code word assigned to the last predictor index is a truncated unary code word, i.e. without the end bit '0'. Of course, using a full unary code word for the last predictor index is still possible for the invention but would cost one additional bit for the last index compared to the case of Table 1.
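The code word assignment of Table 1 can be sketched as follows (the function name is illustrative; code words are represented as bit strings):

```python
def truncated_unary_code(index, n):
    """Return the Table 1 code word, as a bit string, for a predictor index
    when n predictors remain in the set after the deletion process."""
    if n == 1:
        return ""               # single predictor: the index is inferred
    if index == n - 1:
        return "1" * index      # last index: only '1' bits, no end bit '0'
    return "1" * index + "0"    # unary: 'index' ones followed by end bit '0'
```

For instance, with N = 5 predictors, index 2 is coded as "110", while the last index 4 is coded as "1111" without an end bit.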
In a variant, an opposite unary code may be used ('1' replacing '0' and vice-versa), possibly truncated.
It is recalled here that the decoder is able to perform the derivation and deletion operations in a similar fashion as the encoder (see Figure 3 similar to Figure 2 with respect to the generation of the set of predictors), in such a way that the same motion vector predictors are associated with the same indexes, in the same order.
The encoder 50 further comprises a module 506 for selection of the coding mode, which uses an encoding cost criterion, such as a rate-distortion criterion, to determine which is the best mode between the spatial prediction mode and the temporal prediction mode. In order to further reduce redundancies, a transform 507 is applied to the residual block (if any); the transformed data obtained are then quantized by module 508 and entropy encoded by module 509 (equivalent to 212 of Figure 2).
Finally, the encoded residual block of the current block to encode is inserted into the bitstream 510, along with the motion prediction information (e.g. reference frame index, residual motion vector and predictor index, in particular the code word from Table 1 coding the index of the selected motion vector predictor).
For the blocks encoded in 'SKIP' mode, only a reference (including the code word coding the predictor index) to the predictor together with the residual motion vector are encoded in the bitstream, without any residual block.
The encoder 50 further performs decoding of the encoded image in order to produce a reference image for the motion estimation of the subsequent images. This enables the encoder and the decoder receiving the bitstream to have the same reference frames. The module 511 performs inverse quantization of the quantized data, followed by an inverse transform 512. The reverse intra prediction module 513 uses the prediction information to determine which predictor block to use for a given block, and the reverse motion compensation module 514 uses the motion prediction information to determine which motion vector predictor to use and then adds the residual block obtained by module 512 to the reference area obtained from the set of reference images 516, using the motion vector predictor and the residual motion vector.
Optionally, a deblocking filter 515 is applied to remove the blocking effects and enhance the visual quality of the decoded image. The same deblocking filter is applied at the decoder, so that, if there is no transmission loss, the encoder and the decoder apply the same processing.
Figure 7 illustrates a block diagram of a decoder 70 according to at least one embodiment of the invention. The decoder is represented by connected modules, each module being adapted to implement, for example in the form of programming instructions to be executed by the CPU 411 of device 400, a corresponding step of a method implementing an embodiment of the invention.
The decoder 70 receives a bitstream 701 comprising encoding units, each one being composed of a header containing information on encoding parameters and a body containing the encoded video data. As explained with respect to Figure 5, the encoded video data are entropy encoded, and the motion vector predictor indexes are encoded, for a given block, using Table 1 for example. The received encoded video data are entropy decoded by module 702. The residual data are then dequantized by module 703 and then a reverse transform is applied by module 704 to obtain pixel values.
The coding mode data are also entropy decoded and, depending on the coding mode, an INTRA type decoding or an INTER type decoding is performed.
In the case of INTRA mode, an INTRA predictor is determined by intra reverse prediction module 705 based on the INTRA prediction mode (i.e. prediction direction) specified in the bitstream. Decoding the index (or direction) of the INTRA predictor may however be implemented through the index decoding process according to the invention.
If the mode is INTER, the motion prediction information is obtained as described below with reference to Figure 8 so as to find the reference area used by the encoder. The motion prediction information is composed of the reference frame index, the residual motion vector and the motion vector predictor index. The reference frame index and the residual motion vector may be directly decoded from the bitstream, while the decoding of the predictor index is performed according to steps of the present invention, as for example illustrated below with reference to Figure 8.
The motion vector predictor indexed by the obtained predictor index is then retrieved from the predictors set (as constructed with reference to Figure 8) and added to the residual motion vector in order to obtain the motion vector by motion vector decoding module 710.
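The operation of module 710 thus reduces to an addition of the retrieved predictor and the decoded residual, as sketched below (the tuple representation of motion vectors is an assumption for illustration):

```python
def decode_motion_vector(predictor, residual):
    """Reconstruct the motion vector from the predictor retrieved at the
    decoded index and the residual motion vector read from the bitstream."""
    return (predictor[0] + residual[0], predictor[1] + residual[1])
```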
The reference area indicated by the decoded motion vector is extracted from the reference image 708 indexed by the decoded reference frame index, to apply the reverse motion compensation 706 (e.g. addition of the extracted reference block to a residual block). The motion vector field 711 is updated with the decoded motion vector in order to be used for the inverse prediction of the next decoded motion vectors.
Finally, a decoded block is obtained. A deblocking filter 707 is applied, similarly to the deblocking filter 515 applied at the encoder. A decoded video signal 709 is finally provided by the decoder 70.
Figure 8 illustrates through a flow chart steps of a decoding process of the predictor index according to an embodiment of the invention.
This decoding process makes it possible to obtain the predictor index without the set of predictors having been fully constructed. This reduces computational complexity compared to the conventional techniques, where the set of predictors has to be fully constructed to know the number of predictors before the code word can be retrieved. According to the invention, the decoding process comprises iteratively constructing the set of information predictors (e.g. the motion vector predictors) and stopping the iterative construction when it is detected that the predictor index has been decoded.
This embodiment still relates to HEVC where a motion vector is usually associated with one or several blocks, i.e. coded with the MERGE or INTER mode. Of course, it may be extended to other coding modes, for example to an INTRA coding mode where the used prediction direction is selected from a constructed set of direction predictors. In the following, further explanations are given concerning the INTRA coding mode when useful.
The index decoding process starts at step 800 where it is considered that all the information necessary for the decoding of the index is available.
This information concerns for example the availability of neighbours (the motion vectors of which are used to derive the predictors) and the coding mode (MERGE or INTER in the present example).
This information helps to determine the value MAXblock at step 801, this value defining the number of candidates for the current block to decode. Consequently, MAXblock also defines the maximum number of iterations for constructing the set of predictors.
Regarding INTER prediction coding (using motion vectors), MAXblock is set to 3, unless the coding mode is MERGE, in which case MAXblock is set to 5.
In the particular case of the INTRA prediction mode, various parameters may be taken into account to determine MAXblock, such as the block size. For example, in the current version of HEVC, MAXblock is set to 4 for 64x64 blocks (corresponding to the four prediction modes provided for such blocks), to 16 for 32x32 blocks, to 35 for 8x8 blocks and to 10 for 4x4 blocks.
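Step 801 can therefore be sketched as a simple lookup. The mapping below reproduces the values quoted above; treating the INTRA case as a fixed dictionary keyed by block width is an assumption made for illustration.

```python
def determine_max_block(coding_mode, block_size=None):
    """Return MAXblock, the maximum number of candidate predictors
    (and thus of construction iterations) for the current block."""
    if coding_mode == "MERGE":
        return 5
    if coding_mode == "INTER":
        return 3
    # INTRA: the number of available prediction modes depends on block size.
    intra_modes = {64: 4, 32: 16, 8: 35, 4: 10}
    return intra_modes[block_size]
```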
Next, the initialization step 802 initializes an iteration counter 'MAX' (which counts the number of iterations and thus candidate motion vectors evaluated) and an index counter 'index' (which indicates the currently-selected candidate and is used to provide the predictor index to be determined by this process) to 0. The set of predictors is also initialized as an empty list.
The counters respectively represent the current maximum index and decoded index.
The main decoding loop, which corresponds to a loop constructing a new predictor in the set, starts at step 803 where the next candidate predictor in the order of evaluation is computed, for example "Left" before "Above", and then "Temporal", "Above Right" and "Below Left" for the MERGE modes.
The computation is similar to the one performed at the encoder, with the same order of predictors. A consequence is that the same indexes are assigned to the same predictors for both the encoder and the decoder, making it possible to accurately obtain the used motion vector predictor at the decoder.
Regarding the INTER and MERGE coding mode, the computation of the candidate predictor uses various information such as availability, actual value and reference frame index (the latter being used to scale the motion vector).
Regarding the INTRA coding mode, the predictor computation may be more complex (a whole block has to be determined), so simplification schemes can be implemented.
Next to step 803, the iteration counter MAX is incremented in step 804.
Next, the decision step 805 checks whether or not the new motion vector predictor computed at step 803 is already present in the current set of candidate predictors.
Regarding the INTER and MERGE coding mode, in one embodiment the check only consists in comparing the motion vector horizontal and vertical components and the reference frame index with the same values of each predictor already present in the set.
Regarding the INTRA coding mode, various approaches may be implemented.
A general approach may consist in comparing, pixel by pixel, the predictor block corresponding to the computed next predictor and the predictor block corresponding to each of the predictors in the set.
Simpler approaches may also be used to save computational costs.
For example, with reference to Figure 9, let's consider the 8x8 block 903.
As mentioned above, there are up to 35 angular prediction methods (controlled by a parameter called "direction") available to generate the predictor block. These directions are illustrated by the direction diagram 906 in the figure, also showing the corresponding indexes (from 0 to 34) of the various prediction modes.
In this situation, one way to perform step 805, i.e. to detect whether or not a first predictor (in the set) equals another predictor (the "next predictor" computed at step 803), is illustrated as follows.
It relies on determining the equality of the corresponding prediction methods, for example by checking the sample value on a given border.
Let's consider two predictors associated with two borders of the block to encode: the outer above border 901 together with vertical direction 904 (corresponding to direction of index 1 in 906), and outer left border 902 with horizontal direction 905 (corresponding to direction of index 2 in 906).
In this example it is determined whether or not each of these two predictors equals the predictor associated with the DC mode (of index 3), which corresponds to adding a constant value.
If the considered border is constant (i.e. all of its samples are of equal value), the corresponding predictor is identical to the predictor associated with the DC mode.
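This border test can be sketched as follows (an illustrative helper, assuming the considered border is given as a list of sample values):

```python
def directional_equals_dc(border_samples):
    """Return True when the directional predictor built from this border is
    identical to the DC predictor, i.e. when all border samples are equal."""
    return all(sample == border_samples[0] for sample in border_samples)
```

When this test succeeds, the directional predictor can be discarded as a duplicate of the DC predictor without any pixel-by-pixel block comparison.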
If already present (output "yes" from step 805), the set of predictors remains unchanged, the newly computed predictor is discarded and the process goes to step 808 further described below.
If not present in the set (output "no" from step 805), the next step 806 adds the obtained predictor to the set of candidate predictors, thereby increasing its length.
Then the decision step 807 checks whether or not the current size of the set of candidate predictors is lower than 2 (or whether or not the set contains more than 1 candidate predictor).
If it is lower than 2 (output "yes" from step 807), there is still no need to read an additional bit from the bitstream to signal which predictor has been selected.
The process then continues at step 808.
If it is 2 or higher (output "no" from step 807), at least one item of binary information (in particular, a single bit) is read from the bitstream at step 809.
Next, the decision step 810 determines, based on the read binary information, whether to stop the iterations and return the index counter value as the predictor index to decode.
In particular step 810 checks whether the read additional bitstream information indicates the index has been fully decoded. In the case of the truncated unary code of Table 1, such read information is a single bit of value 0.
The invention may also apply with Rice codes or, more generally, Golomb codes, and may also be extended to prefix codes.
If the index has been fully decoded (or the read bit represents a code word end), the processing ends at step 812 by returning the index counter value as the predictor index to decode.
If not fully decoded (for example, the read bit is 1), the processing continues at step 811 where the index counter is incremented by one. Next, the decision step 808 checks if the processing has reached its last iteration by comparing the iteration counter MAX to MAXblock, i.e. determining whether or not there is another candidate predictor left to compute.
If there is no additional predictor to possibly determine (e.g. because of availability of neighbours, maximum count reached), the predictor index has been fully decoded, and processing ends at step 812 by returning the index counter value, which corresponds to the index of the last constructed predictor.
Otherwise, a new iteration for constructing the set starts by looping back to step 803.
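The loop of Figure 8 can be sketched end to end as follows. This is a simplified model: `bitstream_bits` is assumed to be an iterator over the bits of the code word, and `compute_candidate(i)` stands for the candidate derivation of step 803.

```python
def decode_predictor_index(bitstream_bits, compute_candidate, max_block):
    """Iteratively construct the predictor set, stopping as soon as the index
    is fully decoded; return (index, partially constructed set)."""
    predictor_set = []
    index = 0
    for iteration in range(max_block):             # steps 803/804, bounded by 808
        candidate = compute_candidate(iteration)   # step 803
        if candidate in predictor_set:             # step 805: duplicate?
            continue                               # discard, go to step 808
        predictor_set.append(candidate)            # step 806
        if len(predictor_set) < 2:                 # step 807: no bit needed yet
            continue
        if next(bitstream_bits) == 0:              # steps 809/810: end bit read?
            return index, predictor_set            # step 812: index decoded
        index += 1                                 # step 811
    return index, predictor_set                    # step 808 exhausted, step 812
```

Feeding the bits 1, 1, 0 of the code word coding index 2, with five distinct candidates available, stops the construction after four iterations, with only four of the five predictors computed.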
As shown above, the decision to read the additional binary information at step 809 depends on the state of the set being constructed during the current iteration, in particular on the size of the set (compared to 2 at test 807).
Furthermore, given the way the predictors are constructed and the index (and thus the code words) is assigned, when the predictor index has been fully decoded according to the invention, it is certain that the corresponding motion vector predictor has been constructed during one of the iterations performed. It may then be retrieved directly from the constructed set, even if the latter has not been fully constructed.
This is because each predictor is computed during an iteration closely synchronized (possibly with a difference of one iteration) with the reading, in the bitstream, of the bit having the same position as the end bit of the code word that is associated with the index assigned to that predictor. This ensures that when the iterations are stopped (index fully decoded), the predictor corresponding to the decoded index (the end bit of which has just been detected) is present in the set (possibly partially constructed).
As is apparent from Figure 8, step 809 progressively (i.e. through the iterations) parses the code word coding the predictor index in the bitstream.
Computational complexity is reduced thanks to the invention when step 810 detects an end bit, meaning that the whole code word has been read. In this situation, the corresponding motion vector predictor has already been computed during a previous iteration, because the shortest code words are assigned to the first computed predictors. The iterative construction can then be stopped, thus saving computational time and cost.
For their part, tests 807 and 808 only detect when the predictor set has been fully constructed (for example, in the case where the code word has no end bit because it is the last predictor that has been selected). For illustrative purposes, the index "2" is now considered, which was encoded using the code word "110" in the bitstream (for N=5=MAXblock predictors in the set).
The index decoding of Figure 8 is summarized in Table 2 below. For simplicity, the iterations in which a computed predictor is already present in the set have been omitted, since this does not substantially modify the process.
Iteration number | index counter value at step 803 | Result of test 807 | Value of bit read at step 809 | Result of test 810 | index counter value at step 808 | Result of test 808 (MAX value)
1 | 0 | Yes | - | - | 0 | No (1)
2 | 0 | No | 1 | No | 1 | No (2)
3 | 1 | No | 1 | No | 2 | No (3)
4 | 2 | No | 0 | Yes | - | -

Table 2
In this example, the returned index counter value is "2", which corresponds to the expected predictor index. One may observe that the whole set of five predictors has not been fully constructed, but the third predictor in the set (i.e. the one indexed by "2") has been computed and can therefore be used without additional computation of predictors.
If the code word comprises an end bit "0", the process ends at step 810 by detecting that end bit.
Table 3 below illustrates the case of index "4" for N=5=MAXblock. The corresponding code word is "1111" (last row in Table 1), assigned to the last predictor of the set.
Iteration number | index counter value at step 803 | Result of test 807 | Value of bit read at step 809 | Result of test 810 | index counter value at step 808 | Result of test 808 (MAX value)
1 | 0 | Yes | - | - | 0 | No (1)
2 | 0 | No | 1 | No | 1 | No (2)
3 | 1 | No | 1 | No | 2 | No (3)
4 | 2 | No | 1 | No | 3 | No (4)
5 | 3 | No | 1 | No | 4 | Yes (5)

Table 3
At step 808 of iteration 5, it is determined that all the predictors have been computed. Therefore the returned index counter value is "4", i.e. the expected predictor index.
If the code word does not comprise an end bit "0" (because it is the last truncated unary code), the process ends at step 808 by detecting that all the predictors have been constructed.
Table 4 below illustrates another situation with MAXblock=5, but where one predictor is a duplicate of an already constructed predictor (say, at the third iteration), which restricts the set to only N=4 predictors. In this example, the expected index is "3", meaning a code word equal to "111".
Iteration number | index counter value at step 803 | Result of test 805 | Result of test 807 | Value of bit read at step 809 | Result of test 810 | index counter value at step 808 | Result of test 808 (MAX value)
1 | 0 | No | Yes | - | - | 0 | No (1)
2 | 0 | No | No | 1 | No | 1 | No (2)
3 | 1 | Yes | - | - | - | 1 | No (3)
4 | 1 | No | No | 1 | No | 2 | No (4)
5 | 2 | No | No | 1 | No | 3 | Yes (5)

Table 4
This example is quite similar to Table 3: the process exits through test 808, where it is determined that all the predictors have been computed. The returned index counter value is "3", i.e. the expected predictor index.
Although the present invention has been described hereinabove with reference to specific embodiments, the present invention is not limited to the specific embodiments, and modifications which lie within the scope of the present invention will be apparent to a person skilled in the art. Many further modifications and variations will suggest themselves to those versed in the art upon making reference to the foregoing illustrative embodiments, which are given by way of example only and which are not intended to limit the scope of the invention, that being determined solely by the appended claims. In particular, different features from different embodiments may be interchanged, where appropriate.
In particular, while the invention has been mainly described with reference to the decoding of motion vector predictor indexes, it may apply to the indexes of any kind of information predictor, provided that such information is selected from a constructed set of corresponding predictors. For example, they may be prediction method candidates as disclosed in publication US 2005/0254717. They may also be direction predictors from which one direction is selected during INTRA coding of a block.
In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that different features are recited in mutually different dependent claims does not indicate that a combination of these features cannot be advantageously used.

Claims (1)

  1. <claim-text>CLAIMS 1. A method of decoding a bitstream comprising an encoded sequence of images, at least one portion of an image having been encoded using prediction with respect to reference image data, the method comprising, for at least one image portion to be decoded, determining an index i of an information predictor selected from a set of information predictors for decoding the image portion and decoding the image portion using the information predictor indexed by index i and retrieved from the set, wherein determining the index i comprises iteratively constructing the set of information predictors, wherein at least one iteration during the iterative construction of the set comprises: reading at least one item of binary information from the bitstream; and determining, based on the read binary information, whether to stop the iterations and return an index i based on a current state of the set.</claim-text> <claim-text>2. The decoding method of Claim 1, wherein determining whether to stop the iterations and return the index i comprises verifying whether the read binary information represents a code word end, and if it does, stopping the iterations and returning a counter value as the index i; otherwise, incrementing the counter.</claim-text> <claim-text>3. The decoding method of Claim 2, wherein the at least one item of binary information is a single bit.</claim-text> <claim-text>4. The decoding method of Claim 2 or 3, wherein the indexes of information predictors are encoded in the bitstream using a code at least partly built on a unary code.</claim-text> <claim-text>5. The decoding method of Claim 4, wherein the single bit has the value '0' to represent a code word end.</claim-text> <claim-text>6. The decoding method of any of Claims 1 to 5, wherein the information predictors comprise motion information predictors used in motion compensation.</claim-text> <claim-text>7. The decoding method of any of Claims 1 to 6,
wherein reading at least one item of binary information from the bitstream is decided based on a state of the set being constructed during the at least one iteration.</claim-text> <claim-text>8. The decoding method of Claim 7, wherein the state of the set for the decision comprises whether or not the set of information predictors contains more than one information predictor.</claim-text> <claim-text>9. The decoding method of Claim 7, further comprising preliminarily determining a maximum number of predictor-construction iterations, and determining whether or not the number of performed iterations has reached the maximum number, and if it has reached the maximum number, returning the index i. 10. The decoding method of Claim 9, further comprising decoding from the bitstream an item of information representing a coding mode for the at least one image portion to be decoded, wherein the determined maximum number depends on the decoded coding mode information. 11. The decoding method of Claim 9, wherein the maximum number depends on the availability of motion information associated with image portions neighbouring the at least one image portion to be decoded. 12. The decoding method of any of Claims 1 to 11, wherein the at least one iteration comprises computing an information predictor and adding the computed information predictor to the set if the latter does not already comprise that computed information predictor. 13. The decoding method of any of Claims 1 to 12, wherein the index i is directly related to the number of predictors in the set in the course of construction, at a given iteration. 14.
A decoding device for decoding a bitstream comprising an encoded sequence of images, at least one portion of an image having been encoded using prediction with respect to reference image data, the decoding device comprising index determining means for determining, for at least one image portion to be decoded, an index i of an information predictor selected from a set of information predictors for decoding the image portion and decoding means for decoding the image portion using the information predictor indexed by index i and retrieved from the set, wherein the index determining means are configured to iteratively construct the set of information predictors, and comprise reading means for reading, in at least one iteration of the iterative construction of the set, at least one item of binary information from the bitstream and determining means for determining, based on the read binary information, whether to stop the iterations and return an index i based on a current state of the set. 15. A computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method according to any one of Claims 1 to 13 when loaded into and executed by the programmable apparatus. 16. A non-transitory computer-readable storage medium storing instructions of a computer program for implementing a method according to any one of Claims 1 to 13. 17. A method of decoding a bitstream comprising an encoded sequence of images substantially as hereinbefore described with reference to, and as shown in, Figure 8; Figures 3 and 8; Figures 3, 7 and 8 of the accompanying drawings. 18.
A method of decoding a bitstream comprising an encoded sequence of images, at least one portion of an image having been encoded using prediction with respect to reference image data, the method comprising, for at least one image portion to be decoded, detecting, in the bitstream, an end bit of a code word coding an index of an information predictor for decoding the image portion, the code word belonging to a code at least partly built on a unary code, and upon positive detection, stopping an iterative construction of a set of information predictors and returning, based on the current state of the set, the index of the information predictor for decoding the image portion.</claim-text>
GB1117497.6A 2011-10-11 2011-10-11 Method and device for decoding a bitstream comprising an encoded sequence of images Expired - Fee Related GB2495501B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1117497.6A GB2495501B (en) 2011-10-11 2011-10-11 Method and device for decoding a bitstream comprising an encoded sequence of images

Publications (3)

Publication Number Publication Date
GB201117497D0 GB201117497D0 (en) 2011-11-23
GB2495501A true GB2495501A (en) 2013-04-17
GB2495501B GB2495501B (en) 2016-02-17

Family

ID=45091831

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1117497.6A Expired - Fee Related GB2495501B (en) 2011-10-11 2011-10-11 Method and device for decoding a bitstream comprising an encoded sequence of images

Country Status (1)

Country Link
GB (1) GB2495501B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011146451A1 (en) * 2010-05-20 2011-11-24 Thomson Licensing Methods and apparatus for adaptive motion vector candidate ordering for video encoding and decoding

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20221011