GB2549942A - Method, apparatus and computer program product for encoding a mask of a frame and decoding an encoded mask of a frame - Google Patents

Info

Publication number
GB2549942A
GB2549942A GB1607587.1A GB201607587A GB2549942A GB 2549942 A GB2549942 A GB 2549942A GB 201607587 A GB201607587 A GB 201607587A GB 2549942 A GB2549942 A GB 2549942A
Authority
GB
United Kingdom
Prior art keywords
mask
frame
pixel
encoded
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1607587.1A
Other versions
GB201607587D0 (en)
Inventor
M Williams John
Ono Tomohiro
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kudan Ltd
Original Assignee
Kudan Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kudan Ltd filed Critical Kudan Ltd
Priority to GB1607587.1A priority Critical patent/GB2549942A/en
Publication of GB201607587D0 publication Critical patent/GB201607587D0/en
Priority to PCT/EP2017/059437 priority patent/WO2017186580A2/en
Publication of GB2549942A publication Critical patent/GB2549942A/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/60Editing figures and text; Combining figures or text
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/29Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding involving scalability at the object level, e.g. video object layer [VOL]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167Position within a video image, e.g. region of interest [ROI]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/21Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding with binary alpha-plane coding for video objects, e.g. context-based arithmetic encoding [CAE]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/88Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving rearrangement of data among different coding units, e.g. shuffling, interleaving, scrambling or permutation of pixel data or permutation of transform coefficient data among different blocks

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention is directed to a method 100 of encoding 120 a mask of a frame, wherein the mask comprises a plurality of pixels with at least one component, wherein the value of each component of a pixel is set to a first value if the pixel is within an area of interest or set to a second value if the pixel is outside the area of interest. The method 100 comprises: segmenting 121 the mask into first, second and third segments; setting 123 the value of the first colour component of each pixel of an encoded mask to the value of a component of a respective pixel of the first segment; setting 124 the value of the second colour component of each pixel of the encoded mask to the value of a component of a respective pixel of the second segment; and setting 125 the value of the third colour component of each pixel of the encoded mask to the value of a component of a respective pixel of the third segment. Additional inventions defined by the claims relate to a means for extracting a mask window and a means for scaling a mask, segmenting it and rearranging the segments.

Description

METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR ENCODING A MASK OF A FRAME AND DECODING AN ENCODED MASK OF A FRAME
Field of the invention [0001] The invention relates to augmented reality in which video elements (e.g. elements of a live direct or indirect view of a physical, real-world environment) are augmented or supplemented by computer-generated video or graphical input. In particular, the invention is directed to a method, an apparatus and a computer program product for encoding a mask of a frame and decoding an encoded mask of a frame, e.g. for purpose of generating an alpha video (i.e. a transparent video).
Background [0002] An alpha video is a video wherein each frame includes an alpha channel (i.e. transparency channel) in addition to a green channel, a blue channel and a red channel or in addition to a greyscale channel.
[0003] Conventionally, a method of generating an alpha video comprises inputting a frame 2, see Fig. 1, to a processor. Here, the frame 2 is a color frame and thus includes a green channel 2g, a blue channel 2b and a red channel 2r. The frame 2 comprises an area of interest, here represented as a triangle, capturing objects and/or people in the foreground. The processor generates a mask 4 of the frame 2. Here, the mask 4 (the alpha channel) is coded as a grayscale mask and thus is represented by equal values of green, blue and red in the three channels. If a pixel is within the area of interest, the grayscale component is set to '255', otherwise the grayscale component is set to '0'. By encoding the alpha channel (mask 4) in this way as a grayscale image on a second frame, the image and the mask 4 can be processed by standard video processing hardware.
[0004] At a rendering time, the processor sets the alpha component of each pixel of the frame 2 to the grayscale component of a respective pixel of the mask 4. If a pixel has an alpha component set to '255', that pixel is to be rendered (i.e. the pixel is opaque), whereas if a pixel has an alpha component set to '0', that pixel is not to be rendered (i.e. the pixel is transparent). In this way, the processor can extract the area of interest from the frame 2 and can overlay it over another frame (not represented).
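The conventional mask generation and alpha compositing described above can be sketched as follows. This is an illustrative NumPy sketch, not code from the patent; the array sizes, pixel values and variable names are all assumptions made for the example.

```python
import numpy as np

# Hypothetical 6x6 greyscale "frame" with a bright foreground object.
H, W = 6, 6
frame = np.zeros((H, W), dtype=np.uint8)
frame[1:4, 1:4] = 200  # area of interest (e.g. a person in the foreground)

# Conventional mask: 255 inside the area of interest, 0 outside.
mask = np.where(frame > 0, 255, 0).astype(np.uint8)

# At rendering time the mask becomes the alpha channel: opaque where 255,
# transparent where 0, so the area of interest can be overlaid on a background.
background = np.full((H, W), 50, dtype=np.uint8)
alpha = mask.astype(np.float32) / 255.0
composited = (alpha * frame + (1 - alpha) * background).astype(np.uint8)
```

Inside the area of interest the composite shows the frame's pixels; everywhere else it shows the background, which is the behaviour the patent attributes to the alpha channel.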
[0005] An objective of the present invention is to compress an alpha video to reduce the amount of memory resources required to store the alpha video and the amount of communication resources required to send the alpha video.
Summary of the Invention [0006] The invention relates to a method of encoding a mask of a frame, wherein the mask comprises a plurality of pixels with at least one component, wherein the value of each component of a pixel is set to a first value if the pixel is within an area of interest or set to a second value if the pixel is outside the area of interest. The method comprises: segmenting the mask into first, second and third segments; setting the value of the first color component of each pixel of an encoded mask to the value of a component of a respective pixel of the first segment; setting the value of the second color component of each pixel of the encoded mask to the value of a component of a respective pixel of the second segment; and setting the value of the third color component of each pixel of the encoded mask to the value of a component of a respective pixel of the third segment. In this way, the mask of the frame can be compressed by at least a factor of three.
[0007] According to one aspect, the mask comprises a plurality of pixels with a grayscale component, wherein the value of the grayscale component of a pixel is set to a first value if the pixel is within an area of interest or set to a second value if the pixel is outside the area of interest. The method comprises: segmenting the mask into first, second and third segments; setting the value of the first color component of each pixel of an encoded mask to the value of the grayscale component of a respective pixel of the first segment; setting the value of the second color component of each pixel of the encoded mask to the value of the grayscale component of a respective pixel of the second segment; and setting the value of the third color component of each pixel of the encoded mask to the value of the grayscale component of a respective pixel of the third segment.
[0008] According to another aspect, the mask comprises a plurality of pixels with first, second and third color components, wherein the values of the first, second and third color components of a pixel are all set to a first value if the pixel is within an area of interest or all set to a second value if the pixel is outside the area of interest. The method comprises: segmenting the mask into first, second and third segments; setting the value of the first color component of each pixel of an encoded mask to the value of the first color component of a respective pixel of the first segment; setting the value of the second color component of each pixel of the encoded mask to the value of the second color component of a respective pixel of the second segment; and setting the value of the third color component of each pixel of the encoded mask to the value of the third color component of a respective pixel of the third segment.
[0009] According to one aspect, the first, second and third color components are green, blue and red components.
[0010] According to another aspect, the encoded mask and the first, second and third segments have the same dimensions.
[0011] According to one aspect, the area of interest captures at least one person or an object in the foreground of the frame.
[0012] According to another aspect, the first value is '255' and the second value is '0' or vice versa.
[0013] According to one aspect, the method further comprises reducing the dimensions of the frame and the encoded mask.
[0014] The invention further relates to a method of decoding an encoded mask of a frame, wherein the encoded mask comprises a plurality of pixels with first, second and third color components, comprising: segmenting the frame into first, second and third segments; setting an alpha component of each pixel of the first segment of the frame to the first color component of a respective pixel of the encoded mask; setting an alpha component of each pixel of the second segment of the frame to the second color component of a respective pixel of the encoded mask; and setting an alpha component of each pixel of the third segment of the frame to the third color component of a respective pixel of the encoded mask.
[0015] According to one aspect, the first, second and third color components are green, blue and red components.
[0016] According to another aspect, the encoded mask and the first, second and third segments have the same dimensions.
[0017] According to one aspect, the first, second and third color components of each pixel of the encoded mask are set to a first value or to a second value.
[0018] According to one aspect, the first value is '255' and the second value is '0' or vice versa.
[0019] According to another aspect, the method further comprises increasing (130) the dimensions of the frame and the encoded mask.
[0020] The invention further relates to a method of encoding a mask of a frame, wherein the mask comprises a plurality of pixels with at least one component, wherein the value of each component of a pixel is set to a first value if the pixel is within an area of interest or set to a second value if a pixel is outside the area of interest. The method comprises: selecting at least one mask window overlapping with an area of interest; extracting the at least one mask window to form an encoded mask. In this way, the mask of a frame can be compressed.
[0021] According to one aspect, the method comprises: selecting a plurality of mask windows overlapping with the area of interest; extracting and rearranging (544) the plurality of mask windows to form the encoded mask.
[0022] According to another aspect, the area of interest captures at least one person or an object in the foreground of the frame.
[0023] According to one aspect, the at least one mask window is selected for the frame independently from any other frame, such that the at least one mask window varies from one frame to another.
[0024] According to another aspect, the at least one mask window is selected for the frame and at least one other frame, such that the at least one mask window is identical from one frame to another.
[0025] According to one aspect, the method further comprises: determining an area of interest for the frame; determining an area of interest for the at least one other frame; selecting the at least one mask window to be the smallest at least one mask window that encompasses all the areas of interest.
[0026] According to another aspect, the method further comprises encoding the frame, wherein encoding the frame includes: selecting at least one frame window overlapping with the area of interest; extracting the at least one frame window to form an encoded frame; and storing the encoded frame in association with information to locate the at least one frame window within the frame.
[0027] According to one aspect, the method further comprises: selecting a plurality of frame windows overlapping with the area of interest; extracting and rearranging the plurality of frame windows to form the encoded frame; and storing the encoded frame in association with information to locate the plurality of frame windows within the frame and within the encoded frame.
[0028] According to another aspect, each mask window has a corresponding frame window with the same shape, with the same dimensions, with the same location within the mask and within the frame and with the same location within the encoded mask and within the encoded frame.
[0029] According to one aspect, the component of each pixel of the mask is a greyscale component set to a first value or a second value.
[0030] According to another aspect, the first value is '255' and the second value is '0' or vice versa.
[0031] The invention further relates to a method of decoding an encoded mask (4*) of a frame, comprising: setting an alpha component of each pixel of an encoded frame to a component of a respective pixel of the encoded mask.
[0032] According to one aspect, the method further comprises: extracting a plurality of frame windows from an encoded frame based on location information locating the plurality of frame windows within the encoded frame; and rearranging the plurality of frame windows based on location information locating the plurality of frame windows within the frame.
[0033] The invention further relates to a method of encoding a mask of a frame, wherein the mask comprises a plurality of pixels with at least one component, wherein the value of each component of a pixel is set to a first value if the pixel is within an area of interest or set to a second value if the pixel is outside the area of interest. The method comprises: decreasing the dimensions of the mask by a factor N, where N is an integer greater than 1; segmenting the mask into N segments; and rearranging the N segments to form an encoded mask having one common dimension with the mask. In an example, the encoded mask and the mask have the same height and a different width. In another example, the encoded mask and the mask have the same width and a different height.
[0034] According to one aspect, the method comprises: decreasing the dimensions h x w of the mask by a factor N to have dimensions h/N x w/N; segmenting the mask into N segments having dimensions h/N x w/N²; rearranging the N segments to form an encoded mask having dimensions h x w/N².
[0035] According to another aspect, the method comprises: decreasing the dimensions h x w of the mask by a factor N to have dimensions h/N x w/N; segmenting the mask into N segments having dimensions h/N² x w/N; rearranging the N segments to form an encoded mask having dimensions h/N² x w.
[0036] According to one aspect, the factor N is equal to 2.
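With N = 2, the scale-segment-rearrange scheme of paragraphs [0033] to [0036] can be sketched as follows. This is an illustrative NumPy sketch, not the patent's implementation; the 2x2 averaging downscale, the function name and the vertical-strip segmentation are assumptions.

```python
import numpy as np

def encode_mask_scaled(mask, N=2):
    h, w = mask.shape
    # Decrease the dimensions h x w by a factor N -> h/N x w/N
    # (here by simple block averaging; the downscale method is illustrative).
    small = mask.reshape(h // N, N, w // N, N).mean(axis=(1, 3)).astype(np.uint8)
    # Segment into N vertical strips of dimensions h/N x w/N**2 each ...
    strips = np.split(small, N, axis=1)
    # ... and stack them to form an encoded mask of dimensions h x w/N**2,
    # which shares its height (one common dimension) with the original mask.
    return np.concatenate(strips, axis=0)

mask = np.full((8, 8), 255, dtype=np.uint8)   # 8 x 8 all-opaque mask
encoded = encode_mask_scaled(mask, N=2)       # 8 x 2 encoded mask
```

With h = w = 8 and N = 2 the encoded mask has dimensions 8 x 2, matching the h x w/N² case of paragraph [0034].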
[0037] The invention further relates to a method of decoding an encoded mask of a frame, comprising: setting an alpha component of each pixel of a frame to a component of a respective pixel of the encoded mask.
[0038] The invention further relates to an apparatus comprising means for performing the above methods.
[0039] The invention finally relates to a computer program product comprising instructions which when executed by an apparatus perform the above methods.
[0040] Other features and advantages of the invention will become apparent after review of the entire application, including the following sections: brief description of the drawings, detailed description and claims.
Brief description of the drawings [0041] The accompanying drawings illustrate exemplary aspects of the invention, and, together with the general description given above and the detailed description given below, serve to explain features of the invention. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.
[0042] Fig. 1 represents a frame and a mask of a frame.
[0043] Fig. 2 represents a flow diagram of a first method of encoding a mask of a frame.
[0044] Fig. 3 to Fig. 5 represent a frame and a mask of a frame at various stages of the first method of encoding a mask of a frame.
[0045] Fig. 6 represents a flow diagram of a method of decoding an encoded mask of a frame.
[0046] Fig. 7 represents a flow diagram of a second method of encoding a mask of a frame.
[0047] Fig. 8 and Fig. 9 represent a frame and a mask of a frame at various stages of the second method of encoding a mask of a frame.
[0048] Fig. 10 represents a flow diagram of a method of decoding an encoded mask of a frame.
[0049] Fig. 11 represents a flow diagram of a third method of encoding a mask of a frame.
[0050] Fig. 12 and Fig. 13 represent a frame and a mask of a frame at various stages of the third method of encoding a mask of a frame.
[0051] Fig. 14 represents a flow diagram of a method of decoding an encoded mask of a frame.
[0052] Fig. 15 represents a flow diagram of a fourth method of encoding a mask of a frame.
[0053] Fig. 16 and Fig. 17 represent a frame and a mask of a frame at various stages of the fourth method of encoding a mask of a frame.
[0054] Fig. 18 represents a flow diagram of a fourth method of decoding an encoded mask of a frame.
[0055] Fig. 19 represents a mobile device for implementing any one of the previous methods.
Detailed Description [0056] The following detailed description describes four solutions to the problem of compressing an alpha video.
First solution [0057] Fig. 2 represents a flow diagram of a method 100 of encoding the mask 4 of the frame 2 shown in Fig. 3. The method starts at step 110 with the frame 2 and the mask 4 being input.
[0058] At step 120, the mask 4 is encoded following operations 121 to 127. At operation 121, the mask 4 is segmented into three segments 4₁, 4₂ and 4₃ of the same dimensions (see Fig. 3). Here, the segments 4₁, 4₂ and 4₃ are vertical but alternatively the segments 4₁, 4₂ and 4₃ could be horizontal or could have some other (preferably non-overlapping) configuration.
[0059] Then, an encoded mask 4* (see Fig. 4) is generated based on the segments 4₁, 4₂ and 4₃. The encoded mask 4* is split into a green channel, a blue channel and a red channel. The encoded mask 4* (in total) has the same dimensions as the segments 4₁, 4₂ and 4₃.
[0060] At operation 122, a pixel of the encoded mask 4* is selected. At operation 123, the green component of the selected pixel is set to the alpha component of a respective pixel of the segment 4₁. At operation 124, the blue component of the selected pixel is set to the alpha component of a respective pixel of the segment 4₂. At operation 125, the red component of the selected pixel is set to the alpha component of a respective pixel of the segment 4₃. At operation 126, it is determined whether each pixel of the encoded mask 4* has been selected. If at least one pixel of the encoded mask 4* has not been selected, operations 122 to 126 are performed again for a next pixel (e.g. in a raster scan manner). Otherwise, step 120 comes to an end at operation 127.
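The per-pixel loop of operations 121 to 127 can be sketched in vectorised form as follows. This is an illustrative NumPy sketch, not code from the patent; the function name, the requirement that the width be divisible by three, and the G/B/R channel ordering are assumptions.

```python
import numpy as np

def encode_mask(mask):
    # Segment the greyscale mask into three vertical segments of equal
    # dimensions, then pack them into the green, blue and red channels of a
    # single encoded mask one third the width of the original.
    h, w = mask.shape
    assert w % 3 == 0, "width must be divisible by 3 for equal vertical segments"
    s1 = mask[:, : w // 3]
    s2 = mask[:, w // 3 : 2 * w // 3]
    s3 = mask[:, 2 * w // 3 :]
    return np.stack([s1, s2, s3], axis=-1)  # channel order: G, B, R

mask = np.zeros((4, 9), dtype=np.uint8)
mask[:, 3:6] = 255            # area of interest falls in the middle segment
encoded = encode_mask(mask)   # 4 x 3 x 3: one third the pixels, three channels
```

The encoded mask carries the same information in one third the number of pixels, which is the factor-of-three compression claimed in paragraph [0006].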
[0061] At an optional step 130 (Fig. 5), the resolution of the frame 2 and the encoded mask 4* can be reduced such that the combined dimensions of the frame 2 and the encoded mask 4* are equal to the dimensions of the original frame 2 input at step 110. In this way, the combined size of the frame 2 and the encoded mask 4* is equal to the size of the original frame 2. This is useful because video encoding hardware components are designed to handle standard frame sizes and can therefore process the frame and its mask as a single frame of standard size.
[0062] In another embodiment (not represented) the mask 4 of the frame 2 input at step 110 is not a grayscale mask. Instead, the mask 4 is a color mask and includes a green channel 4g, a blue channel 4b and a red channel 4r. In this other embodiment, if a pixel is within the area of interest, the green component, blue component and red component would be set to '255', otherwise the green component, blue component and red component would be set to '0'. In this other embodiment, operations 121 to 127 are slightly modified. At operation 121, the mask 4 is still segmented into three segments 4₁, 4₂ and 4₃ of the same dimensions. At operation 122, a pixel of the encoded mask 4* is selected. At operation 123, the green component of the selected pixel is set to the green component of a respective pixel of the segment 4₁. At operation 124, the blue component of the selected pixel is set to the blue component of a respective pixel of the segment 4₂. At operation 125, the red component of the selected pixel is set to the red component of a respective pixel of the segment 4₃. At operation 126, it is determined whether each pixel of the encoded mask 4* has been selected. If at least one pixel of the encoded mask 4* has not been selected, operations 122 to 126 are performed again for a next pixel. Otherwise, step 120 comes to an end at operation 127.
[0063] Fig. 6 represents a flow diagram of a method 200 of decoding the encoded mask 4* obtained via the method 100. The method 200 starts at step 210 with the frame 2 and the encoded mask 4* being input.
[0064] At an optional step 220, the dimensions of the frame 2 and the encoded mask 4* can be increased such that the dimensions of the frame 2 are equal to the dimensions of the original frame 2.
[0065] At step 230, the encoded mask 4* is decoded following operations 231 to 237. At operation 231, the frame 2 is segmented into three segments 2₁, 2₂ and 2₃ having the same dimensions as the encoded mask 4*. Here, the segments 2₁, 2₂ and 2₃ are vertical segments but alternatively the segments 2₁, 2₂ and 2₃ could be horizontal segments. Then, the encoded mask 4* is used to generate the alpha channel of the frame 2. At operation 232, a pixel of the frame 2 is selected. At operation 233, if the pixel is selected in the first segment 2₁, an alpha component of that selected pixel is set to the green component of a respective pixel of the encoded mask 4*. At operation 234, if the pixel is selected in the second segment 2₂, an alpha component of that selected pixel is set to the blue component of a respective pixel of the encoded mask 4*. At operation 235, if the pixel is selected in the third segment 2₃, an alpha component of that selected pixel is set to the red component of a respective pixel of the encoded mask 4*. At operation 236, it is determined whether each pixel of the frame 2 has been selected. If at least one pixel of the frame 2 has not been selected, operations 232 to 236 are performed again. Otherwise, step 230 comes to an end at operation 237.
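The decoding of operations 231 to 237 can be sketched as the inverse of the packing: the three colour channels of the encoded mask are laid side by side to restore the full-width alpha channel. This is an illustrative NumPy sketch, not code from the patent; names and shapes are assumptions.

```python
import numpy as np

def decode_mask(encoded):
    # The channels were packed G, B, R from the first, second and third
    # vertical segments; concatenating them side by side restores the
    # original full-width alpha channel.
    g, b, r = encoded[..., 0], encoded[..., 1], encoded[..., 2]
    return np.concatenate([g, b, r], axis=1)

encoded = np.zeros((4, 3, 3), dtype=np.uint8)
encoded[..., 1] = 255          # second segment was inside the area of interest
alpha = decode_mask(encoded)   # 4 x 9 alpha channel for the frame
```

The middle third of the reconstructed alpha channel is opaque, matching the segment that carried the area of interest.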
[0066] In the above description, the step 130 of redimensioning is included in order to provide a frame 2 and mask 4* that conform with standard frame sizes and can be further handled by standard video software and hardware. If it is not necessary to keep within standard frame sizes, the frame 2 and the mask 4* can be combined together without redimensioning and thus without loss of resolution. This principle applies equally to other solutions described herein.
Second solution [0067] Fig. 7 represents a flow diagram of a method 300 of encoding the mask 4 of the frame 2. The method 300 starts at step 310 with the frame 2 being input.
[0068] At step 320, the frame 2 is encoded following operations 322 to 326. At operation 322, a frame window 10 (see Fig. 8) is selected within the frame 2. The frame window 10 overlaps with the area of interest and preferably entirely encompasses the area of interest. The frame window 10 has an offset 12 within the frame 2. The offset 12 is a set of coordinates of a reference point (e.g. upper left corner) of the frame window 10 within the frame 2. It is to be understood that the frame window 10 is always smaller than the frame 2 and can be of any shape. Here the frame window 10 is a rectangular window but alternatively it could be a square window, a triangular window or the like, or it could be a window of a shape defined by a code (1 for square, 2 for triangle, etc.). At operation 324, the frame window 10 is extracted to form an encoded frame 2* (see Fig. 9).
[0069] At step 330, the mask 4 of the frame 2 is input.
[0070] At step 340, the mask 4 is encoded following operations 342 and 344. At operation 342, a mask window 14 (see Fig. 8) is selected within the mask 4. The mask window 14 has an offset 16 within the mask 4. The offset 16 is a set of coordinates of a reference point (e.g. upper left corner) of the mask window 14 within the mask 4. Here, the frame window 10 corresponds to the mask window 14, that is they have the same shapes, the same dimensions and the same offsets 12 and 16. At operation 344, the mask window 14 is extracted to form an encoded mask 4* (see Fig. 9). In Fig. 9, the mask is shown as being duplicated across all three channels. It will be understood that the technique of Fig. 2 can be applied to split it in three and avoid repeating the same information across different channels.
[0071] The offset 16 need not be duplicated across all three channels. It is the same offset for each. It can be encoded in a selected one of the channels (e.g. in the green channel by default) or it can be split into elements (e.g. x, y and size) and the different elements can be encoded on different channels.
[0072] At step 350, the encoded frame 2*, the encoded mask 4* and location information (e.g. offset 12) to locate the frame window 10 within the frame 2 are stored.
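The window extraction of steps 320 to 350 can be sketched as follows for a rectangular window. This is an illustrative NumPy sketch, not the patent's implementation; the bounding-box computation, the function name and the array layout are assumptions.

```python
import numpy as np

def extract_window(frame, mask):
    # Select the smallest rectangular window encompassing the area of
    # interest, extract it from both the frame and the mask, and keep the
    # offset (coordinates of the upper left corner) so the window can be
    # located within the original frame at decoding time.
    ys, xs = np.nonzero(mask)
    top, left = ys.min(), xs.min()
    bottom, right = ys.max() + 1, xs.max() + 1
    encoded_frame = frame[top:bottom, left:right]
    encoded_mask = mask[top:bottom, left:right]
    return encoded_frame, encoded_mask, (top, left)

frame = np.arange(64, dtype=np.uint8).reshape(8, 8)
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:5, 3:7] = 255                  # area of interest
enc_frame, enc_mask, offset = extract_window(frame, mask)
```

Only the window and its offset are stored, which is the source of the compression: the pixels outside the area of interest are discarded.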
[0073] Fig. 10 represents a flow diagram of a method 400 of decoding the encoded mask 4* obtained by the method 300. The method 400 starts at step 410 with the encoded frame 2*, the encoded mask 4* and the location information (e.g. offset 12) being input.
[0074] At step 420, the encoded mask 4* is decoded following operations 421 to 424 to generate an alpha channel for the encoded frame 2*. At operation 421, a pixel is selected within the encoded frame 2*. At operation 422, the alpha component of that pixel is set to the greyscale component of a respective pixel of the encoded mask 4*. At operation 423, it is determined whether each pixel of the encoded frame 2* has been selected. If at least one pixel of the encoded frame 2* has not been selected, then operations 421 to 423 are performed again. Otherwise, step 420 comes to an end at operation 424.
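Operations 421 to 424 amount to attaching the greyscale encoded mask to the encoded frame as an alpha channel, which can be sketched as follows. This is an illustrative NumPy sketch, not code from the patent; names and shapes are assumptions.

```python
import numpy as np

def apply_alpha(encoded_frame, encoded_mask):
    # encoded_frame: h x w x 3 colour window; encoded_mask: h x w greyscale.
    # Each pixel's alpha component is set to the greyscale component of the
    # respective mask pixel, yielding an h x w x 4 RGBA window.
    return np.dstack([encoded_frame, encoded_mask])

enc_frame = np.full((3, 4, 3), 120, dtype=np.uint8)
enc_mask = np.zeros((3, 4), dtype=np.uint8)
enc_mask[:, :2] = 255                # left half of the window is opaque
rgba = apply_alpha(enc_frame, enc_mask)
```

The resulting window can then be placed back into a full frame at the stored offset 12 before rendering.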
[0075] It is to be understood that the area of interest may vary over time from one frame 2 to another as a result of the number, the size and the location of the objects and/or people captured in the foreground varying. For example, the frame window 10 and the mask window 14 are selected on a frame 2 basis such that the frame window 10 and the mask window 14 change from one frame 2 to another. Alternatively, the frame window 10 and the mask window 14 are selected on a plurality of frames 2 basis such that the frame window 10 and the mask window 14 are identical from one frame 2 to the next frame in a set of frames 2. For example, the area of interest is determined in each frame 2 of the set of frames 2 and the frame window 10 and the mask window 14 are selected to be the smallest frame window 10 and mask window 14 that encompasses all the areas of interest. In this way, it is ensured that the area of interest will be captured in each frame 2 of the set of frames 2.
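Selecting the smallest window that encompasses the areas of interest of every frame in a set can be sketched as a union of per-frame bounding boxes. This is an illustrative NumPy sketch, not the patent's implementation; the function name and the (top, left, bottom, right) representation are assumptions.

```python
import numpy as np

def common_window(masks):
    # For each mask, find the bounding box of its area of interest; the
    # common window is the smallest rectangle covering all of them.
    tops, lefts, bottoms, rights = [], [], [], []
    for m in masks:
        ys, xs = np.nonzero(m)
        tops.append(int(ys.min()))
        lefts.append(int(xs.min()))
        bottoms.append(int(ys.max()) + 1)
        rights.append(int(xs.max()) + 1)
    return min(tops), min(lefts), max(bottoms), max(rights)

m1 = np.zeros((8, 8), dtype=np.uint8); m1[1:3, 1:3] = 255
m2 = np.zeros((8, 8), dtype=np.uint8); m2[4:7, 5:8] = 255
window = common_window([m1, m2])   # encompasses both areas of interest
```

Using one window for the whole set means the offset only needs to be stored once, at the cost of a window no smaller than the largest excursion of the area of interest.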
Third solution [0076] Fig. 11 represents a flow diagram of a method 500 of encoding the mask 4 of the frame 2. The method 500 starts at step 510 with the frame 2 being input.
[0077] At step 520, the frame 2 is encoded following operations 522 to 526. At operation 522, a plurality of frame windows 10 (see Fig. 8) is selected within the frame 2. Each frame window 10 overlaps partially with the area of interest. Preferably the frame windows 10 are non-overlapping with each other, i.e. define mutually exclusive portions of the area of interest, but this is not essential. Each frame window 10 has an offset 12 within the frame 2. Each offset 12 is a set of coordinates of a reference point (e.g. upper left corner) of a respective frame window 10 within the frame 2. It is to be understood that each frame window 10 is smaller than the frame 2 and can be of any shape. Here each frame window 10 is a rectangular window but alternatively it could be a square window, a triangular window or the like. Preferably the frame windows 10 all have the same shape but, as before, the shape of all the frame windows 10 or the shapes of individual frame windows 10 can be parameters to be encoded.
[0078] In the illustration, three frame windows 10 are shown, but there could be fewer or more (e.g. 16 or 64). The number of frame windows 10 can be an encoding parameter.
[0079] At operation 524, the plurality of frame windows 10 is extracted and rearranged to form an encoded frame 2* (see Fig. 13). Each frame window 10 has an offset 13 within the encoded frame 2*. Each offset 13 is a set of coordinates of a reference point (e.g. upper left corner) of a respective frame window 10 within the encoded frame 2*.
[0080] At step 530, the mask 4 of the frame 2 is input.
[0081] At step 540, the mask 4 is encoded following operations 542 and 544. At operation 542, a plurality of mask windows 14 (see Fig. 12) is selected within the mask 4. Each mask window 14 has an offset 16 within the mask 4. Each offset 16 is a set of coordinates of a reference point (e.g. upper left corner) of a respective mask window 14 within the mask 4.
[0082] At operation 544, the plurality of mask windows 14 is extracted and rearranged to form an encoded mask 4* (see Fig. 13). Each mask window 14 has an offset 17 within the encoded mask 4*. Each offset 17 is a set of coordinates of a reference point (e.g. upper left corner) of a respective mask window 14 within the encoded mask 4*. Here, the plurality of frame windows 10 is paired with the plurality of mask windows 14, that is, each frame window 10 corresponds to a mask window 14 having the same shape, the same dimensions, the same offsets 12 and 16 and the same offsets 13 and 17.
[0083] Advantageously, there are three mask windows and each of these, with its respective offset, is encoded in a different one of the three channels as shown in Fig. 13. Alternatively, there may be more mask windows, but it is advantageous that the number of mask windows is a multiple of 3.
[0084] At step 550, the encoded frame 2*, the encoded mask 4* and location information (e.g. offsets 12, 13) to locate the frame windows within the frame 2 and within the encoded frame 2* are stored.
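The extraction and rearrangement of operations 522/524 (and, symmetrically, 542/544) can be sketched as follows. This is an illustrative sketch under stated assumptions: images are nested lists, windows are (x, y, w, h) rectangles, and the packing strategy (simple vertical stacking of extracted windows) is an assumption for illustration; the patent leaves the rearrangement layout open.

```python
def extract(image, x, y, w, h):
    """Cut a w x h window out of a nested-list image at offset (x, y)."""
    return [row[x:x + w] for row in image[y:y + h]]

def encode_windows(image, windows):
    """windows: list of (x, y, w, h) offsets 12/16 within the image.

    Returns (encoded, placements), where placements pairs each source
    offset with its offset 13/17 inside the encoded image.
    """
    encoded, placements, y_out = [], [], 0
    for (x, y, w, h) in windows:
        encoded.extend(extract(image, x, y, w, h))   # rearrange: stack
        placements.append(((x, y), (0, y_out)))      # offset 12 -> offset 13
        y_out += h
    return encoded, placements

img = [[c + 10 * r for c in range(4)] for r in range(4)]   # 4 x 4 test image
enc, where = encode_windows(img, [(0, 0, 2, 1), (2, 2, 2, 2)])
print(enc)    # -> [[0, 1], [22, 23], [32, 33]]
print(where)  # -> [((0, 0), (0, 0)), ((2, 2), (0, 1))]
```

Applying the same function to the mask 4 with the paired mask windows 14 yields the encoded mask 4*, since paired windows share shapes, dimensions and offsets.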
[0085] Fig. 14 represents a flow diagram of a method 600 of decoding the encoded mask 4* obtained with method 500. The method 600 starts at step 610 with the encoded frame 2*, the encoded mask 4* and the location information (e.g. offsets 12, 13) being input (as well as the number of frame windows 10 if this is an encoding parameter, and their shape or shapes if these are encoding parameters).
[0086] At step 620, the encoded mask 4* is decoded following operations 621 to 624 to generate an alpha channel for the encoded frame 2*. At operation 621, a pixel is selected within the encoded frame 2*. At operation 622, the alpha component of that pixel is set to the greyscale component of a respective pixel of the encoded mask 4*. At operation 623, it is determined whether each pixel of the encoded frame 2* has been selected. If at least one pixel of the encoded frame 2* has not been selected, then operations 621 to 623 are performed again. Otherwise, step 620 comes to an end at operation 624.
[0087] As per the second solution, it is to be understood that the area of interest may vary over time from one frame 2 to another as a result of the number, the size and the location of the objects and/or people captured in the foreground varying. In a first embodiment, the plurality of frame windows 10 and the plurality of mask windows 14 are selected anew for each frame 2, such that the plurality of frame windows 10 and the plurality of mask windows 14 change from one frame 2 to another. In a second embodiment, the plurality of frame windows 10 and the plurality of mask windows 14 are selected for a set of frames 2, such that the plurality of frame windows 10 and the plurality of mask windows 14 are identical from one frame 2 to the next. For example, the area of interest is determined in each frame 2 of the set of frames 2. Then, the plurality of frame windows 10 and the plurality of mask windows 14 are selected to be the smallest pluralities of frame windows 10 and mask windows 14 that encompass all the areas of interest. In this way, the area of interest is certain to be captured in each frame 2 of the set of frames 2.
Fourth solution [0088] Fig. 15 represents a flow diagram of a method 700 of encoding the mask 4 of the frame 2. The method 700 starts at step 710 with the frame 2 and the mask 4 being input. Both the frame 2 and the mask 4 have the same dimensions h x w, where h is an integer representing the height and w is an integer representing the width.
[0089] At step 720, the mask 4 is encoded following operations 721 to 723. At operation 721, the resolution of the mask 4 is reduced by a factor 2 such that the dimensions of the mask 4 are h/2 x w/2 (see Fig. 16). At operation 722, the mask 4 is segmented into two vertical segments 4₁ and 4₂ having dimensions h/2 x w/4.
[0090] At operation 723, the segments 4₁ and 4₂ are rearranged on top of each other to form an encoded mask 4* having dimensions h x w/4 (see Fig. 17).
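Operations 721 to 723 can be sketched as below. The downscaling filter is not specified by the patent, so plain 2x2 subsampling is used here as an assumption; the function names are likewise illustrative.

```python
def downscale2(mask):
    # Operation 721: halve the resolution (simple 2x2 subsampling here;
    # the patent does not specify the filter, so this is an assumption).
    return [row[::2] for row in mask[::2]]

def encode_solution4(mask):
    small = downscale2(mask)                 # h/2 x w/2
    half = len(small[0]) // 2
    left = [row[:half] for row in small]     # segment 4_1: h/2 x w/4
    right = [row[half:] for row in small]    # segment 4_2: h/2 x w/4
    return left + right                      # operation 723: stack -> h x w/4

mask = [[10 * r + c for c in range(8)] for r in range(8)]   # 8 x 8 mask
print(encode_solution4(mask))
# -> [[0, 2], [20, 22], [40, 42], [60, 62], [4, 6], [24, 26], [44, 46], [64, 66]]
```

Note that the result has the same height h as the original mask and a quarter of its width, so it can sit alongside the frame 2 in a standard video buffer.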
[0091] At an optional step 730, the resolution of the resultant frame with its encoded mask 4* can be reduced such that the dimensions of both the frame 2 and the encoded mask 4* are equal to the dimensions of the original frame 2 input at step 710. In this way, the size of both the frame 2 and the encoded mask 4* is equal to the size of the original frame 2 and, moreover, both the frame 2 and the encoded mask 4* can be handled by standard video software and hardware.
[0092] It is to be understood that at step 720 the mask 4 could be encoded following modified operations 721 to 723. For example, at operation 721, the resolution of the mask 4 could be reduced by a factor 2 such that the dimensions of the mask 4 are h/2 x w/2 (see Fig. 16). At operation 722, the mask 4 could be segmented into two horizontal segments 4₁ and 4₂ having dimensions h/4 x w/2. At operation 723, the segments 4₁ and 4₂ could be rearranged next to each other to form an encoded mask 4* having dimensions h/4 x w.
[0093] Fig. 14 represents a flow diagram of a method 800 of decoding the encoded mask 4* obtained with method 700. The method 800 starts at step 810 with the frame 2 and the encoded mask 4* being input.
[0094] At an optional step 820, the dimensions of the frame 2 and the encoded mask 4* can be increased such that the dimensions of the frame 2 are equal to the dimensions of the original frame 2.
[0095] At step 830, the encoded mask 4* is decoded following operations 831 to 834 to generate an alpha channel for the frame 2. At operation 831, a pixel of the frame 2 is selected. At operation 832, the alpha component of that pixel is set to the greyscale component of a corresponding pixel of the encoded mask 4*. At operation 833, it is determined whether each pixel of the frame 2 has been selected. If at least one pixel of the frame 2 has not been selected yet, then operations 831 to 833 are performed again. Otherwise, step 830 comes to an end at operation 834.
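Before the per-pixel alpha copy, the decoder must undo the fourth solution's rearrangement: split the h x w/4 encoded mask back into its two stacked segments and place them side by side to recover the h/2 x w/2 reduced mask. A sketch of that geometric inverse, with illustrative names (the subsequent alpha assignment is the same per-pixel copy as in the other solutions):

```python
def decode_solution4(encoded_mask):
    """Invert operation 723: unstack two segments and rejoin them."""
    h = len(encoded_mask) // 2
    top, bottom = encoded_mask[:h], encoded_mask[h:]   # segments 4_1, 4_2
    return [t + b for t, b in zip(top, bottom)]        # side by side again

enc = [[0, 2], [20, 22], [40, 42], [60, 62], [4, 6], [24, 26], [44, 46], [64, 66]]
print(decode_solution4(enc))
# -> [[0, 2, 4, 6], [20, 22, 24, 26], [40, 42, 44, 46], [60, 62, 64, 66]]
```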
Combination of the first, second, third and fourth solutions [0096] For increased benefit, the above solutions can be combined (i.e. any two or three solutions can be applied one after another).
[0097] In particular, solution 1 can be combined with solution 2, 3 or 4. That is, the encoded mask 4* obtained via step 340, via step 540 or via step 720 can be further encoded via step 120. For example, at operation 121, the encoded mask 4* is segmented into three segments 4*₁, 4*₂ and 4*₃ of the same dimensions. Then, an encoded mask 4** is generated based on the segments 4*₁, 4*₂ and 4*₃. The encoded mask 4** includes a green channel, a blue channel and a red channel. The encoded mask 4** (in total) has the same dimensions as the segments 4*₁, 4*₂ and 4*₃. At operation 122, a pixel of the encoded mask 4** is selected. At operation 123, the green component of the selected pixel is set to the greyscale component of a respective pixel of the segment 4*₁. At operation 124, the blue component of the selected pixel is set to the greyscale component of a respective pixel of the segment 4*₂. At operation 125, the red component of the selected pixel is set to the greyscale component of a respective pixel of the segment 4*₃. At operation 126, it is determined whether each pixel of the encoded mask 4** has been selected. If at least one pixel of the encoded mask 4** has not been selected, operations 122 to 126 are performed again for the next pixel (e.g. in a raster scan manner). Otherwise, step 120 comes to an end at operation 127.
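The channel packing of step 120 can be sketched as follows. This is a hedged illustration: the mask is split into horizontal thirds here (the patent only requires three segments of equal dimensions), pixels are modelled as [red, green, blue] lists, and the function name is an assumption.

```python
def pack_three_channels(mask):
    """Pack a greyscale mask into the G, B and R channels of one image
    a third of its size (operations 121 to 127, sketched)."""
    h = len(mask) // 3                       # assumes height divisible by 3
    s1, s2, s3 = mask[:h], mask[h:2 * h], mask[2 * h:]
    packed = []
    for r1, r2, r3 in zip(s1, s2, s3):
        # pixel = [red, green, blue]; operations 123-125 assign
        # green <- segment 1, blue <- segment 2, red <- segment 3
        packed.append([[p3, p1, p2] for p1, p2, p3 in zip(r1, r2, r3)])
    return packed

mask = [[255, 0], [0, 0], [255, 255]]        # 3 x 2 greyscale mask
print(pack_three_channels(mask))
# -> [[[255, 255, 0], [255, 0, 0]]]
```

Applied after step 340, 540 or 720, this gives a further factor-of-three size reduction on top of the windowing or resolution reduction.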
[0098] Equally, solution 4 can be combined with solution 1, 2 or 3. That is, the encoded mask 4* obtained via step 120, via step 340 or via step 540 can be further encoded via step 720. For example, at operation 721, the resolution of the mask 4* is reduced by a factor 2 such that the dimensions of the mask 4* are h/2 x w/2 (see Fig. 16). At operation 722, the mask 4* is segmented into two vertical segments 4*₁ and 4*₂ having dimensions h/2 x w/4. At operation 723, the segments 4*₁ and 4*₂ are rearranged on top of each other to form an encoded mask 4** having dimensions h x w/4.
Mobile device [0099] Fig. 20 shows a mobile device 1000 comprising means for implementing the steps of any one of the previous methods. The mobile device 1000 comprises a video camera 1010 for capturing a video, a processor 1030 for generating an alpha video based on the captured video, a display 1020 for rendering the alpha video and volatile and non-volatile memories 1040 for storing instructions which when executed by the processor 1030 perform the steps of any one of the previous methods.
[00100] It will be understood that the above aspects have been described by way of example only, and that various changes and modifications may be made without departing from the scope of the claims.

Claims (38)

1. A method (100) of encoding (120) a mask (4) of a frame (2), wherein the mask (4) comprises a plurality of pixels with at least one component, wherein the value of each component of a pixel is set to a first value if the pixel is within an area of interest or set to a second value if the pixel is outside the area of interest, the method (100) comprising: segmenting (121) the mask (4) into first, second and third segments (4₁, 4₂, 4₃); setting (123) the value of the first color component of each pixel of an encoded mask (4*) to the value of a component of a respective pixel of the first segment (4₁); setting (124) the value of the second color component of each pixel of the encoded mask (4*) to the value of a component of a respective pixel of the second segment (4₂); and setting (125) the value of the third color component of each pixel of the encoded mask (4*) to the value of a component of a respective pixel of the third segment (4₃).
2. The method (100) of claim 1, wherein the mask (4) comprises a plurality of pixels with a grayscale component, wherein the value of the grayscale component of a pixel is set to a first value if the pixel is within an area of interest or set to a second value if a pixel is outside the area of interest, the method comprising: - segmenting (121) the mask (4) into first, second and third segments (4₁, 4₂, 4₃); - setting (123) the value of the first color component of each pixel of an encoded mask (4*) to the value of the grayscale component of a respective pixel of the first segment (4₁); - setting (124) the value of the second color component of each pixel of the encoded mask (4*) to the value of the grayscale component of a respective pixel of the second segment (4₂); and - setting (125) the value of the third color component of each pixel of the encoded mask (4*) to the value of the grayscale component of a respective pixel of the third segment (4₃).
3. The method (100) of claim 1, wherein the mask (4) comprises a plurality of pixels with first, second and third color components, wherein the values of the first, second and third color component of a pixel are all set to a first value if the pixel is within an area of interest or all set to a second value if a pixel is outside the area of interest, the method (100) comprising: - segmenting (121) the mask (4) into first, second and third segments (4₁, 4₂, 4₃); - setting (123) the value of the first color component of each pixel of an encoded mask (4*) to the value of the first color component of a respective pixel of the first segment (4₁); - setting (124) the value of the second color component of each pixel of the encoded mask (4*) to the value of the second color component of a respective pixel of the second segment (4₂); and - setting (125) the value of the third color component of each pixel of the encoded mask (4*) to the value of the third color component of a respective pixel of the third segment (4₃).
4. The method (100) of any one of claims 1 to 3, wherein the first, second and third color components are green, blue and red components.
5. The method (100) of any one of claims 1 to 4, wherein the encoded mask (4*) and the first, second and third segments (4₁, 4₂, 4₃) have the same dimensions.
6. The method (100) of any one of claims 1 to 5, wherein the area of interest captures at least one person or an object in the foreground of the frame (2).
7. The method (100) of any one of claims 1 to 6, wherein the first value is '255' and the second value is '0' or vice versa.
8. The method (100) of any one of claims 1 to 7, further comprising: reducing (130) the dimensions of the frame (2) and the encoded mask (4*).
9. A method (200) of decoding (230) an encoded mask (4*) of a frame (2), wherein the encoded mask (4*) comprises a plurality of pixels with first, second and third color components, comprising: segmenting the frame (2) into a first, second and third segments; - setting an alpha component of each pixel of the first segment of the frame to the first color component of a respective pixel of the encoded mask (4*); setting an alpha component of each pixel of the second segment of the frame to the second color component of a respective pixel of the encoded mask (4*); and setting an alpha component of each pixel of the third segment of the frame to the third color component of a respective pixel of the encoded mask (4*).
10. The method (200) of claim 9, wherein the first, second and third color components are green, blue and red components.
11. The method (200) of any one of claims 9 to 10, wherein the encoded mask (4*) and the first, second and third segments have the same dimensions.
12. The method (200) of any one of claims 9 to 11, wherein the first, second and third color components of each pixel of the encoded mask (4*) are set to a first value or to a second value.
13. The method (200) of claim 12, wherein the first value is '255' and the second value is '0' or vice versa.
14. The method (200) of any one of claims 9 to 13, further comprising: increasing (130) the dimensions of the frame (2) and the encoded mask (4*).
15. An apparatus comprising means for performing the method of any of claims 1 to 14.
16. A computer program product comprising instructions which when executed by an apparatus perform the method of any of claims 1 to 14.
17. A method (300; 500) of encoding (340; 540) a mask (4) of a frame (2), wherein the mask (4) comprises a plurality of pixels with at least one component, wherein the value of each component of a pixel is set to a first value if the pixel is within an area of interest or set to a second value if the pixel is outside the area of interest, comprising: selecting (342; 542) at least one mask window (14) overlapping with an area of interest; - extracting (344; 544) the at least one mask window (14) to form an encoded mask (4*).
18. The method (500) of claim 17, comprising: - selecting (542) a plurality of mask windows (14) overlapping with the area of interest; - extracting and rearranging (544) the plurality of mask windows (14) to form the encoded mask (4*).
19. The method (300; 500) of any one of claims 17 or 18, wherein the area of interest captures at least one person or an object in the foreground of the frame (2).
20. The method (300; 500) of any of claims 17 to 19, wherein the at least one mask window (14) is selected for the frame (2) independently from any other frame (2), such that the at least one mask window (14) varies from one frame (2) to another.
21. The method (300; 500) of any of claims 17 to 20, wherein the at least one mask window (14) is selected for the frame (2) and at least one other frame (2), such that the at least one mask window (14) is identical from one frame to another.
22. The method (300; 500) of claim 21, further comprising: - determining an area of interest for the frame (2); - determining an area of interest for the at least one other frame (2); - selecting the at least one mask window (14) to be the smallest at least one mask window (14) that encompasses all the areas of interest.
23. The method (300; 500) of any of claims 17 to 22, further comprising encoding (320; 520) the frame, wherein encoding (320; 520) the frame includes: selecting (322; 522) at least one frame window (10) overlapping with the area of interest; - extracting (324; 524) the at least one frame window (10) to form an encoded frame (2*); and - storing the encoded frame (2*) in association with information to locate the at least one frame window (10) within the frame (2).
24. The method (500) of claim 23, comprising: selecting (522) a plurality of frame windows (10) overlapping with the area of interest; extracting and rearranging (524) the plurality of frame windows (10) to form the encoded frame (2*); and storing the encoded frame (2*) in association with information to locate the plurality of frame windows (10) within the frame (2) and within the encoded frame (2*).
25. The method (300; 500) of any one of claims 23 to 24, wherein each mask window (14) has a corresponding frame window (10) with the same shape, with the same dimensions, with the same location within the mask (4) and within the frame (2) and with the same location within the encoded mask (4*) and within the encoded frame (2*).
26. The method (300; 500) of any one of claims 17 to 25, wherein the component of each pixel of the mask is a greyscale component set to a first value or a second value.
27. The method (300; 500) of any one of claims 17 to 26, wherein the first value is '255' and the second value is '0' or vice versa.
28. A method (400; 600) of decoding (420; 620) an encoded mask (4*) of a frame (2), comprising: setting (421, 422, 423; 621, 622, 623) an alpha component of each pixel of an encoded frame (2*) to a component of a respective pixel of the encoded mask (4*).
29. The method (400; 600) of claim 28, further comprising: extracting (624) a plurality of frame windows (10) from an encoded frame (2*) based on location information (13) locating the plurality of frame windows (10) within the encoded frame (2*); and rearranging (625) the plurality of frame windows (10) based on location information (12) locating the plurality of frame windows (10) within the frame (2).
30. An apparatus comprising means for performing the method of any of claims 17 to 29.
31. A computer program product comprising instructions which when executed by an apparatus perform the method of any of claims 17 to 29.
32. A method (700) of encoding (720) a mask (4) of a frame (2), wherein the mask (4) comprises a plurality of pixels with at least one component, wherein the value of each component of a pixel is set to a first value if the pixel is within an area of interest or set to a second value if a pixel is outside the area of interest, comprising: decreasing (721) the dimensions of the mask (4) by a factor N, where N is an integer greater than 1; segmenting (722) the mask (4) into N segments (4₁, 4₂); and rearranging (723) the N segments (4₁, 4₂) to form an encoded mask (4*) having one common dimension with the mask (4).
33. The method (700) of claim 32, comprising: decreasing (721) the dimensions h x w of the mask (4) by a factor N to have dimensions h/N x w/N; segmenting (722) the mask (4) into N segments (4₁, 4₂) having dimensions h/N x w/N²; rearranging (723) the N segments (4₁, 4₂) to form an encoded mask (4*) having dimensions h x w/N².
34. The method of claim 32, comprising: - decreasing (721) the dimensions h x w of the mask (4) by a factor N to have dimensions h/N x w/N; segmenting (722) the mask into N segments (4₁, 4₂) having dimensions h/N² x w/N; rearranging (723) the N segments (4₁, 4₂) to form an encoded mask having dimensions h/N² x w.
35. The method of any of claims 32 to 34, wherein the factor N is equal to 2.
36. A method of decoding (830) an encoded mask (4*) of a frame (2), comprising: setting (831, 832, 833) an alpha component of each pixel of a frame (2) to a component of a respective pixel of the encoded mask (4*).
37. An apparatus comprising means for performing the method of any of claims 32 to 36.
38. A computer program product comprising instructions which when executed by an apparatus perform the method of any of claims 32 to 36.
GB1607587.1A 2016-04-29 2016-04-29 Method, apparatus and computer program product for encoding a mask of a frame and decoding an encoded mask of a frame Withdrawn GB2549942A (en)