CN110769260A - Video decoding device and video decoding method - Google Patents

Video decoding device and video decoding method

Info

Publication number
CN110769260A
CN110769260A
Authority
CN
China
Prior art keywords
view
field
video decoding
image
expanded
Prior art date
Legal status
Pending
Application number
CN201810841960.9A
Other languages
Chinese (zh)
Inventor
林和源
Current Assignee
MStar Semiconductor Inc Taiwan
Original Assignee
MStar Semiconductor Inc Taiwan
Priority date
Filing date
Publication date
Application filed by MStar Semiconductor Inc Taiwan filed Critical MStar Semiconductor Inc Taiwan
Priority to CN201810841960.9A
Publication of CN110769260A
Status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167 Position within a video image, e.g. region of interest [ROI]
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

A video decoding device comprises a visual field generating unit for generating a visual field; a visual field expanding unit for expanding the visual field to generate an expanded visual field; and a partial decoder for decoding the image data corresponding to the expanded visual field in an image frame to generate decoded image data corresponding to the expanded visual field in the image frame.

Description

Video decoding device and video decoding method
Technical Field
The present invention relates to video decoding technology, and more particularly, to a video decoding device and a video decoding method that perform partial decoding.
Background
Virtual reality (VR) simulates a three-dimensional space by displaying a panoramic image; when the user moves, the panoramic image changes correspondingly, so that the user feels as if he or she were actually on the scene. Because the amount of data in a panoramic image is large, the panoramic image cannot be completely decoded in time when the decoding capability is insufficient, and a common practice is therefore to decode only the image data within the user's visual field. Referring to fig. 1 and fig. 2, fig. 1 is a block diagram of a conventional video decoding apparatus 100 for panoramic images, and fig. 2 is an exemplary diagram of a panoramic image and a visual field. The video decoding apparatus 100 includes a visual field generating unit 110 and a partial decoder 130. The visual field generating unit 110 may perform a perspective projection or a gnomonic projection according to a viewpoint VP of the user to generate a visual field VF, wherein the viewpoint includes a latitude and a longitude. The partial decoder 130 then decodes only the image data within the visual field VF in an image frame F of the panoramic image, and outputs the decoded image data ID of the image within the visual field VF for display.
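For reference, the gnomonic projection mentioned above maps a point on the sphere onto the plane tangent to the sphere at the viewpoint; only the conventional textbook form of that mapping is restated here, it is not taken from the patent itself. For a viewpoint at latitude $\varphi_1$ and longitude $\lambda_0$ and a sphere point at latitude $\varphi$ and longitude $\lambda$:

$$\cos c = \sin\varphi_1 \sin\varphi + \cos\varphi_1 \cos\varphi \cos(\lambda - \lambda_0),$$

$$x = \frac{\cos\varphi \, \sin(\lambda - \lambda_0)}{\cos c}, \qquad y = \frac{\cos\varphi_1 \sin\varphi - \sin\varphi_1 \cos\varphi \cos(\lambda - \lambda_0)}{\cos c},$$

where only points with $\cos c > 0$ (the hemisphere facing the viewpoint) are projected, and the points whose $(x, y)$ coordinates fall inside the display rectangle make up the visual field VF.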
However, when the user rotates the head or moves the body so that the visual field VF changes and the partial decoder 130 has not yet decoded the image data outside the previous visual field VF, the user sees nothing in that part of the picture, which degrades the virtual reality user experience. In general, in addition to the panoramic image PI, the transmitting end also provides a low-resolution panoramic image. A common way to mitigate the above problem is to decode the image data outside the visual field VF from the low-resolution panoramic image and fill the display picture outside the visual field VF with the low-resolution image data, so that the user sees a low-resolution image when rotating the head or moving the body. Although this method improves the virtual reality user experience, the user still has to wait for the normal-resolution panoramic image to finish decoding before seeing a normal-resolution image.
Disclosure of Invention
Therefore, an object of the present invention is to provide a video decoding device and a video decoding method that solve the above-mentioned problems by appropriately expanding the decoded area of the panoramic image.
The invention discloses a video decoding device, which comprises a visual field generating unit for generating a visual field; a visual field expanding unit for expanding the visual field to generate an expanded visual field; and a partial decoder for decoding image data corresponding to the expanded visual field in an image frame to generate decoded image data corresponding to the expanded visual field in the image frame.
The present invention further discloses a video decoding method, which includes generating a visual field; expanding the visual field to generate an expanded visual field; and decoding the image data corresponding to the expanded visual field in an image frame to generate decoded image data corresponding to the expanded visual field in the image frame.
The features, operation and function of the present invention will be described in detail with reference to the drawings.
Drawings
FIG. 1 is a block diagram of a conventional video decoding apparatus for panoramic images;
FIG. 2 is a schematic view of an example of a panoramic image and a field of view;
FIG. 3 is a block diagram of a video decoding apparatus according to an embodiment of the present invention;
FIG. 4 is a flowchart of a video decoding method according to an embodiment of the present invention;
FIG. 5a is a diagram illustrating an exemplary bitmap according to an embodiment of the present invention;
FIG. 5b is a diagram illustrating an exemplary mask according to an embodiment of the present invention;
FIG. 5c is a diagram illustrating an example of the visual field after the dilation operation with the mask according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating an exemplary mask according to another embodiment of the present invention;
FIG. 7 is a schematic diagram of a plurality of masks according to another embodiment of the present invention;
FIG. 8 is a block diagram of a video decoding apparatus according to another embodiment of the present invention.
Description of the symbols
100, 300, 800 video decoding device
110, 310, 810 visual field generating unit
320, 820 visual field expanding unit
130, 330, 830 partial decoder
840 frame buffer
850 image processor
VP viewpoint
VF visual field
EVF expanded visual field
F image frame
ID decoded image data
b0, b1 bits
A1, A2, A3 regions
BA buffer addresses
S410 to S430 steps
Detailed Description
Referring to fig. 3 and fig. 4 together, fig. 3 is a block diagram of a video decoding apparatus 300 according to an embodiment of the invention, and fig. 4 is a flowchart of a video decoding method 400 according to an embodiment of the invention. The video decoding apparatus 300 includes a visual field generating unit 310, a visual field expanding unit 320, and a partial decoder 330. The visual field generating unit 310 generates a visual field VF according to a viewpoint VP of the user (step S410) and outputs the visual field VF to the visual field expanding unit 320. In practice, the visual field VF can be generated by performing a perspective projection or a gnomonic projection according to the viewpoint VP; since perspective projection and gnomonic projection are well known in the art, they are not further described herein. In another embodiment, in order to save computation cost and to facilitate modification, the corresponding relationships between a plurality of viewpoints and visual fields may be stored as a lookup table, and after receiving the viewpoint VP, the visual field generating unit 310 may generate the visual field VF by querying the lookup table according to the viewpoint VP.
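A minimal sketch of the lookup-table variant described above is given below in Python. The grid spacing, bitmap size and helper names are illustrative assumptions and not details taken from the patent; in a real system each table entry would be a visual-field bitmap precomputed offline, for example by perspective or gnomonic projection.

```python
import numpy as np

STEP_DEG = 10                    # assumed grid spacing of the lookup table (degrees)
BITMAP_SHAPE = (90, 180)         # assumed visual-field bitmap size (rows, cols)

def quantize(lat_deg, lon_deg):
    """Round a viewpoint VP (latitude, longitude) to the nearest grid point."""
    return (round(lat_deg / STEP_DEG) * STEP_DEG,
            round(lon_deg / STEP_DEG) * STEP_DEG)

# Lookup table: quantized viewpoint -> precomputed visual-field bitmap.
# Dummy all-zero bitmaps stand in for the precomputed entries here.
lookup_table = {
    quantize(lat, lon): np.zeros(BITMAP_SHAPE, dtype=np.uint8)
    for lat in range(-90, 91, STEP_DEG)
    for lon in range(-180, 181, STEP_DEG)
}

def generate_view(lat_deg, lon_deg):
    """Visual field generating unit 310: look up the bitmap for viewpoint VP."""
    return lookup_table[quantize(lat_deg, lon_deg)]

vf = generate_view(12.3, -45.6)  # bitmap of the visual field VF for this viewpoint
```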
The visual field expanding unit 320 receives the visual field VF from the visual field generating unit 310, expands the visual field VF to generate an expanded visual field EVF (step S420), and outputs the expanded visual field EVF to the partial decoder 330. The partial decoder 330 decodes the image data corresponding to the expanded visual field EVF in an image frame F to generate the decoded image data ID corresponding to the expanded visual field EVF in the image frame F (step S430), where the image frame F is a panoramic image frame. In one embodiment, if the input image includes a low-resolution panoramic frame and a high-resolution panoramic frame corresponding to the same image content, the image frame F is the high-resolution panoramic frame. Because the expanded visual field EVF is larger than the visual field VF, the user can still see a high-resolution image when rotating the head or moving the body, rather than a black picture or a low-resolution image caused by the decoder failing to decode in time, thereby providing a good virtual reality user experience.
In the case where the visual field VF is represented by a bitmap, the visual field expanding unit 320 expands the visual field VF by morphological dilation with a mask to generate the expanded visual field EVF. For example, referring to fig. 5a to 5c, fig. 5a is a diagram illustrating an example of a bitmap according to an embodiment of the present invention, in which the set of all bits equal to "1" in the bitmap represents the visual field VF. Fig. 5b is a diagram illustrating an example of a mask according to an embodiment of the present invention; the mask is a 3 x 3 mask, so for a bit in the bitmap, as long as the bit itself or any bit adjacent to it is "1", that bit becomes "1" after the dilation operation with the mask. For the bit b0 in the bitmap, since the bit b1 at its bottom right corner is "1", the bit b0 becomes "1" after the dilation operation, as shown in fig. 5c, which illustrates an example of the visual field VF after the dilation operation with the mask according to an embodiment of the present invention. As can be seen from the above, after the bitmap is processed by the dilation operation with the mask, the number of bits equal to "1" increases, and thus the visual field VF is expanded. Please note that the mask can be changed according to actual requirements, as long as the number of bits equal to "1" increases after the bitmap is processed with the mask. For example, as shown in fig. 6, the set of "1"s in the mask may form a cross, with the remainder of the mask being "0".
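The dilation of fig. 5a to 5c can be reproduced in a few lines of code. The sketch below (Python with NumPy) is an illustrative implementation rather than the patent's own: it expands a visual-field bitmap with an all-ones 3 x 3 mask, applying exactly the rule that a bit becomes "1" if the mask centred on it covers any "1" bit; the bitmap size and field position are invented for the example.

```python
import numpy as np

def dilate(bitmap, mask):
    """Morphological dilation of a binary bitmap: an output bit is 1 if any
    input bit covered by the mask (centred on that position) is 1."""
    h, w = bitmap.shape
    mh, mw = mask.shape
    ph, pw = mh // 2, mw // 2
    # Pad so the mask can also be centred on bits at the border of the bitmap.
    padded = np.pad(bitmap, ((ph, mh - 1 - ph), (pw, mw - 1 - pw)))
    out = np.zeros_like(bitmap)
    for dy in range(mh):
        for dx in range(mw):
            if mask[dy, dx]:                     # only the 1 bits of the mask contribute
                out |= padded[dy:dy + h, dx:dx + w]
    return out

# Visual field VF as a small bitmap; the 1 bits mark the visual field.
vf = np.zeros((8, 16), dtype=np.uint8)
vf[3:6, 5:11] = 1

# Expanding with an all-ones 3 x 3 mask sets every bit adjacent to the visual
# field to 1 as well, producing the expanded visual field EVF.
evf = dilate(vf, np.ones((3, 3), dtype=np.uint8))
```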
When the image frame F is an equirectangular projection (ERP) image, since an ERP image projects a spherical surface onto a rectangular plane, the degree of stretching differs at different latitudes: in general, the closer to latitude 0 degrees (the equator), the smaller the horizontal stretching of the ERP image, and the closer to latitude 90 degrees (the north and south poles), the larger the horizontal stretching. For example, the equator is projected onto the center line of the rectangular plane, so its horizontal stretching is smallest, while the north and south poles are projected onto the uppermost and lowermost lines of the rectangular plane, respectively, so their horizontal stretching is largest. In order for the user to still see a high-resolution image when the visual field VF changes because of rotating the head or moving the body, the degree of expansion of the visual field VF also changes according to the latitude of the ERP image. In detail, since the horizontal stretching of the ERP image is larger the closer it is to latitude 90 degrees (the north and south poles), the horizontal expansion should be larger near the top or bottom of the ERP image and smaller near its center. Thus, different masks may be used for different areas of the ERP image. For example, as shown in fig. 7, the bitmap can be divided into three areas A1, A2 and A3 from top to bottom. For the central area A2, a mask with even expansion may be used, e.g. a 3 x 3 mask; for the areas A1 and A3 near the top and bottom, a mask with a greater horizontal expansion, such as a 5 x 2 mask, may be used.
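The region-dependent expansion of fig. 7 can be sketched as follows (Python with NumPy and SciPy). The band boundaries, mask sizes and function name are illustrative assumptions rather than the patent's implementation; each horizontal band of the bitmap is dilated with its own mask, a small square one near the equator and a horizontally wider one near the poles.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def expand_erp_view(vf):
    """Latitude-dependent expansion of a visual-field bitmap for an ERP frame,
    split into three horizontal bands as in fig. 7 (A1 top, A2 middle, A3 bottom)."""
    h = vf.shape[0]
    top, bottom = h // 4, 3 * h // 4              # assumed A1/A2/A3 boundaries
    even = np.ones((3, 3), dtype=bool)            # even expansion for A2
    wide = np.ones((2, 5), dtype=bool)            # greater horizontal expansion for A1/A3

    dilated_even = binary_dilation(vf, structure=even)
    dilated_wide = binary_dilation(vf, structure=wide)

    evf = np.zeros_like(vf)
    evf[:top] = dilated_wide[:top]                # A1: near latitude +90 degrees
    evf[top:bottom] = dilated_even[top:bottom]    # A2: near latitude 0 degrees (equator)
    evf[bottom:] = dilated_wide[bottom:]          # A3: near latitude -90 degrees
    return evf
```

Dilating the full bitmap with both masks and then stitching the bands keeps the sketch short and also lets a visual field lying near a band boundary expand across it.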
Sometimes, the visual field generating unit cannot directly obtain the viewpoint VP of the user and therefore cannot generate a visual field from the viewpoint. In this case, the visual field generating unit may generate the visual field according to a plurality of sets of buffer addresses output when an image processor (GPU) reads a frame buffer, as shown in fig. 8, which is a block diagram of a video decoding device 800 according to another embodiment of the present invention. Specifically, the partial decoder 830 outputs the decoded image data ID to the frame buffer 840, and the image processor 850, which is coupled to the frame buffer 840, generates a plurality of sets of buffer addresses BA according to the visual field VF to obtain the corresponding image data ID from the frame buffer 840. Therefore, even if the visual field generating unit 810 cannot obtain the viewpoint VP, it can derive the visual field VF directly from the plurality of sets of buffer addresses BA output when the image processor 850 reads the image data ID stored in the frame buffer 840, and generate a bitmap of the visual field VF accordingly.
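A minimal sketch of how a visual-field bitmap could be rebuilt from the buffer addresses BA is given below in Python. It assumes a linear, row-major frame buffer that starts at a known base address with a fixed number of bytes per pixel; the constants and the function name are illustrative assumptions, since the patent does not fix a particular memory layout.

```python
import numpy as np

FRAME_BASE = 0x80000000          # assumed start address of frame buffer 840
FRAME_WIDTH = 3840               # assumed panoramic frame width in pixels
FRAME_HEIGHT = 1920              # assumed panoramic frame height in pixels
BYTES_PER_PIXEL = 4              # assumed pixel size in the frame buffer

def view_from_buffer_addresses(addresses):
    """Rebuild a bitmap of the visual field VF from the buffer addresses BA
    observed while the image processor 850 reads the frame buffer 840."""
    vf = np.zeros((FRAME_HEIGHT, FRAME_WIDTH), dtype=np.uint8)
    for addr in addresses:
        pixel = (addr - FRAME_BASE) // BYTES_PER_PIXEL
        y, x = divmod(pixel, FRAME_WIDTH)
        if 0 <= y < FRAME_HEIGHT:    # ignore addresses outside the frame
            vf[y, x] = 1             # this pixel was fetched for display, so it is in VF
    return vf

# Example: three reads near the top-left corner of the frame.
ba = [FRAME_BASE, FRAME_BASE + 4, FRAME_BASE + FRAME_WIDTH * BYTES_PER_PIXEL]
vf_bitmap = view_from_buffer_addresses(ba)
```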
In summary, the present invention uses the expanded visual field as the decoding range, so that the user can still see a high-resolution image when the visual field changes due to head rotation or body movement, thereby providing a good virtual reality user experience.
The above description presents only embodiments of the present invention and is not intended to limit the scope of the present invention; any equivalent structural or process modification made according to the specification and drawings of the present invention, or any direct or indirect application thereof in other related technical fields, falls within the scope of the present invention.

Claims (16)

1. A video decoding device, comprising:
a visual field generating unit for generating a visual field;
a visual field expanding unit for expanding the visual field to generate an expanded visual field; and
a partial decoder for decoding image data corresponding to the expanded visual field in an image frame to generate decoded image data corresponding to the expanded visual field in the image frame.
2. The video decoding device of claim 1, wherein the image frame is an equirectangular projection (ERP) image, and the degree of expansion of the visual field varies with the latitude of the ERP image.
3. The video decoding device of claim 2, wherein a degree of horizontal expansion near a top or a bottom of the ERP image is greater than a degree of horizontal expansion near a center of the ERP image.
4. The video decoding device of claim 3, wherein the visual field expanding unit expands the visual field using morphological dilation to generate the expanded visual field.
5. The video decoding device of claim 1, wherein the visual field generating unit generates the visual field according to a viewpoint.
6. The video decoding device of claim 5, wherein the visual field generating unit performs a perspective projection or a gnomonic projection according to the viewpoint to generate the visual field.
7. The video decoding device of claim 5, wherein the visual field generating unit generates the visual field by querying a lookup table according to the viewpoint, the lookup table comprising a plurality of sets of corresponding relationships between viewpoints and visual fields.
8. The video decoding device of claim 1, wherein the decoded image data is stored in a frame buffer, and the visual field generating unit generates the visual field according to a plurality of sets of buffer addresses output when an image processor reads the frame buffer.
9. A video decoding method, comprising:
generating a visual field;
expanding the visual field to generate an expanded visual field; and
decoding the image data corresponding to the expanded visual field in an image frame to generate decoded image data corresponding to the expanded visual field in the image frame.
10. The video decoding method of claim 9, wherein the image frame is an equirectangular projection (ERP) image, and the degree of expansion of the visual field varies with the latitude of the ERP image.
11. The video decoding method of claim 10, wherein a degree of horizontal expansion near a top or a bottom of the ERP image is greater than a degree of horizontal expansion near a center of the ERP image.
12. The video decoding method of claim 11, wherein the step of expanding the visual field to generate the expanded visual field comprises:
expanding the visual field using morphological dilation to generate the expanded visual field.
13. The video decoding method of claim 9, wherein the step of generating the visual field comprises:
generating the visual field according to a viewpoint.
14. The video decoding method of claim 13, wherein the step of generating the visual field according to the viewpoint comprises:
performing a perspective projection or a gnomonic projection according to the viewpoint to generate the visual field.
15. The video decoding method of claim 13, wherein the step of generating the visual field according to the viewpoint comprises:
querying a lookup table according to the viewpoint to generate the visual field, wherein the lookup table comprises a plurality of sets of corresponding relationships between viewpoints and visual fields.
16. The video decoding method of claim 9, wherein the step of generating the visual field comprises:
generating the visual field according to a plurality of sets of buffer addresses output when an image processor reads a frame buffer, wherein the frame buffer stores the decoded image data.
CN201810841960.9A, filed 2018-07-27 (priority date 2018-07-27): Video decoding device and video decoding method. Status: Pending. Published as CN110769260A.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810841960.9A 2018-07-27 2018-07-27 Video decoding device and video decoding method

Publications (1)

Publication Number Publication Date
CN110769260A 2020-02-07

Family

ID=69328338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810841960.9A Video decoding device and video decoding method 2018-07-27 2018-07-27 (Pending, published as CN110769260A)

Country Status (1)

Country Link
CN (1) CN110769260A

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108322763A (en) * 2016-08-23 2018-07-24 深圳市掌网科技股份有限公司 Method and system for encoding and decoding panoramic video
US20180098090A1 (en) * 2016-10-04 2018-04-05 Mediatek Inc. Method and Apparatus for Rearranging VR Video Format and Constrained Encoding Parameters
CN108134941A (en) * 2016-12-01 2018-06-08 联发科技股份有限公司 Adaptive video coding/decoding method and its device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200207