WO2010082463A1 - Image encoding device and method, and image decoding device and method - Google Patents

Image encoding device and method, and image decoding device and method

Info

Publication number
WO2010082463A1
WO2010082463A1 PCT/JP2010/000050 JP2010000050W
Authority
WO
WIPO (PCT)
Prior art keywords
image
block
prediction
encoding
edge
Prior art date
Application number
PCT/JP2010/000050
Other languages
English (en)
Japanese (ja)
Inventor
伊藤浩朗
山口宗明
齋藤昇平
高橋昌史
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所 filed Critical 株式会社日立製作所
Publication of WO2010082463A1 publication Critical patent/WO2010082463A1/fr

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 - Incoming video signal characteristics or properties
    • H04N19/14 - Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 - Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 - Motion estimation or motion compensation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Definitions

  • The present invention relates to an encoding technique and a decoding technique with high image quality and good coding efficiency.
  • MPEG (Moving Picture Experts Group) standards such as MPEG-4 are international standard encoding methods, as is H.264/AVC (Advanced Video Coding).
  • In these methods, one screen (one frame) is divided into macroblocks (16 pixels × 16 lines), and predictive encoding that uses correlation within a screen or between screens is performed for each block.
  • A macroblock can be further divided into 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4 small blocks.
  • Prediction accuracy is improved by increasing the number of block division types available at encoding time, and a high compression rate is thereby realized.
  • Patent Document 1 discloses dividing an input image into first blocks of M×N size, further dividing each first block into second blocks of m×n size, and determining the division shape of the first block based on feature information extracted from the image of the block.
  • If a block contains an edge, the code amount of that block will increase.
  • Therefore, the presence or absence of an edge is detected as feature information of the second block; if there is an edge, the first block is divided into the shape of the second blocks, and if there is no edge, the division is not performed and the shape of the first block is kept unchanged. That is, the code amount can be reduced by treating an area where no edge exists as a large-sized block.
  • However, since the block division method is limited to rectangular shapes, an image containing an edge in a diagonal direction, for example, cannot always be predicted with high accuracy, so the image quality is lowered or the code amount increases. This problem is also common to the conventional MPEG systems.
  • The present invention has been made in view of the above problems, and its object is to provide an encoding technique and a decoding technique that achieve high image quality and good coding efficiency by dividing blocks in a manner suited to image feature portions such as edges.
  • To achieve this object, the present invention provides an image encoding apparatus that divides an input image into first blocks of a predetermined size and encodes a difference image between the input image and a prediction image generated using a plurality of encoding modes. The apparatus includes a first block dividing unit that divides the input image into the first blocks, a second block dividing unit that further divides each first block into a plurality of second blocks, a prediction image generation unit that selects one encoding mode from among the plurality of encoding modes and generates a prediction image by intra-screen or inter-screen prediction, a subtraction unit that calculates the difference image between the prediction image and the input image, and an encoded stream generation unit that generates an encoded stream by applying a frequency conversion process, a quantization process, and a variable-length encoding process to the difference image.
  • The second block dividing unit determines the block shape of the second blocks based on the edge information of encoded blocks adjacent to the first block in the same screen.
  • The present invention also provides an image encoding method for encoding a difference image between a prediction image generated using a plurality of encoding modes and an input image divided into first blocks of a predetermined size. The method includes a second block dividing step of further dividing the first block into a plurality of second blocks, a prediction image generation step of selecting one encoding mode from among the plurality of encoding modes and generating a prediction image by intra-screen or inter-screen prediction, a subtraction step of calculating the difference image between the prediction image and the input image, and an encoded stream generation step of generating an encoded stream by applying a frequency conversion process, a quantization process, and a variable-length encoding process to the difference image.
  • The second block dividing step determines the block shape of the second blocks based on the edge information of encoded blocks adjacent to the first block in the same screen.
  • The present invention further provides an image decoding apparatus that generates a decoded image by decoding an encoded stream. The apparatus generates a difference image by applying a variable-length decoding process, an inverse quantization process, and an inverse frequency transform process to the encoded stream, and includes a prediction image generation unit that generates a prediction image and an addition unit that generates the decoded image by adding the prediction image and the difference image.
  • The prediction image generation unit determines the block shape of the decoding target block based on the edge information of decoded blocks adjacent to the decoding target block in the same screen.
  • The present invention further provides an image decoding method for generating a decoded image by decoding an encoded stream. The method generates a difference image by applying a variable-length decoding process, an inverse quantization process, and an inverse frequency transform process to the encoded stream, and includes a prediction image generation step of generating a prediction image and an addition step of adding the prediction image and the difference image to generate the decoded image.
  • The prediction image generation step determines the block shape of the decoding target block based on the edge information of decoded blocks adjacent to the decoding target block in the same screen.
  • According to the present invention, it is possible to provide an image encoding device, an image decoding device, and corresponding methods with high image quality and good coding efficiency.
  • FIG. 1 is a block diagram showing an embodiment (Embodiment 1) of an image encoding device according to the present invention.
  • FIG. 3 shows an example of filter coefficients used to calculate edge strength, and FIG. 4 shows the pixel values referenced when calculating edge strength.
  • FIG. 6 is a diagram showing an operation example of inter-screen prediction.
  • FIG. 14 is a diagram explaining another operation example of the edge information acquisition unit 110 (Embodiment 2).
  • FIG. 16 is a flowchart showing a decoding processing procedure according to Embodiment 3; FIG. 17 is a flowchart showing the procedure of the intra-screen prediction process in FIG. 16; FIG. 18 is a flowchart showing the procedure of the inter-screen prediction process in FIG. 16.
  • FIG. 1 is a configuration block diagram showing an embodiment (Embodiment 1) of an image encoding apparatus according to the present invention.
  • the first block dividing unit 101 divides the input image 150 into first input image blocks 151 each having 16 ⁇ 16 pixels.
  • the subtractor 102 performs a subtraction process between the first input image block 151 and a predicted image block 155 output from the encoding mode selection unit 114 described later, and generates a difference image signal 152.
  • The frequency conversion unit 103 applies a frequency transform such as the discrete cosine transform (DCT) to the differential image signal 152 for each image block and outputs frequency transform data.
  • the quantization unit 104 quantizes the data and outputs quantized data 153.
  • The variable-length encoding unit 105 multiplexes the encoded data obtained by variable-length encoding the quantized data 153 with information 157, such as the encoding mode and motion vector output from the encoding mode selection unit 114, and generates and outputs an encoded stream 154.
  • The inverse quantization unit 106 inversely quantizes the quantized data 153 output from the quantization unit 104, and the inverse frequency transform unit 107 applies an inverse frequency transform to the result to output difference block data.
  • the adder 108 adds the difference block data and the predicted image block 155 output from the encoding mode selection unit 114 to generate a decoded image 156.
  • the frame memory 109 stores the decoded image 156.
  • the edge information acquisition unit 110 acquires edge information for the image read from the frame memory 109.
  • the second block division unit 111 further finely divides the input image block 151 output from the first block division unit 101 based on the acquired edge information, and outputs a second input image block 160. Detailed operations of the edge information acquisition unit 110 and the second block division unit 111 will be described later.
  • the intra-screen prediction unit 112 generates an intra-screen prediction image from the reference image 159 read from the frame memory 109.
  • the inter-screen prediction unit 113 generates an inter-screen prediction image from the reference image 159 read from the frame memory 109.
  • The encoding mode selection unit 114 selects either the prediction image generated by the intra-screen prediction unit 112 or the prediction image generated by the inter-screen prediction unit 113, and outputs the selected prediction image 155 to the subtractor 102 and the adder 108. It also outputs the selected coding mode and block division information 157 to the variable-length coding unit 105.
  • A feature of this embodiment is that the input image block 151 is divided into finer second input image blocks 160, intra-screen prediction and inter-screen prediction are performed on them, and an optimal prediction image 155 is created.
  • FIG. 2 is a diagram for explaining an operation example of the edge information acquisition unit 110.
  • a screen 210 is an encoding target screen, and is divided into first blocks having a predetermined size (16 ⁇ 16 pixels) by the first block dividing unit 101.
  • An arrow 210a in the screen indicates the encoding order of the blocks. The figure also shows an enlarged view of the encoding target block 201 to be encoded next and the adjacent blocks 202 to 205 that are referred to when obtaining edge information.
  • The adjacent blocks 202 to 205 are blocks that are adjacent to the encoding target block 201 in the same screen and have already been encoded according to the encoding order 210a; they are decoded images read from the frame memory 109.
  • An edge straight line at the pixel P_edge_cal(i0) is calculated by [Formula 3]. Here, x_i and y_i are the horizontal and vertical coordinates of the pixel P_edge_cal(i0) with the upper left corner of the screen as the origin.
  • Next, the validity of the obtained edge straight line is determined.
  • If the edge straight line calculated by [Formula 3] intersects the encoding target block 201, as shown by the straight line 220 in FIG. 2, the coordinates (x_i, y_i) of the pixel i and the edge angle g(i) at that time are output to the second block dividing unit 111 as edge information.
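  • The intersection test itself is straightforward geometry. The sketch below is a hypothetical helper (not taken from the patent) that keeps an edge line only when it crosses the 16×16 encoding target block, representing the line by the point (x_i, y_i) and the edge angle g.

        import math

        def edge_line_intersects_block(x0, y0, angle_g, bx, by, size=16):
            """Return True if the edge line through (x0, y0) with direction
            angle_g (radians) crosses the target block whose top-left corner
            is (bx, by) and whose side length is `size` pixels."""
            # Normal vector of the line; the line direction is (cos g, sin g).
            nx, ny = -math.sin(angle_g), math.cos(angle_g)
            corners = [(bx, by), (bx + size - 1, by),
                       (bx, by + size - 1), (bx + size - 1, by + size - 1)]
            # Signed distances of the four block corners from the line.
            d = [nx * (cx - x0) + ny * (cy - y0) for cx, cy in corners]
            # The line crosses the block iff the corners do not all lie on one side.
            return min(d) <= 0.0 <= max(d)
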
  • FIG. 3 is an example of filter coefficients used for calculating the edge strength.
  • (a) is a Sobel filter and (b) is a Prewitt filter.
  • The vertical filter is a set of coefficients that takes differences of pixel values in the horizontal direction in order to calculate the vertical edge strength f_v.
  • The horizontal filter is a set of coefficients that takes differences of pixel values in the vertical direction in order to calculate the horizontal edge strength f_h.
  • FIG. 4 shows pixel values that are referred to for calculating the edge strength.
  • A is a pixel value, and i and j represent horizontal and vertical coordinates with the upper left corner of the screen as the origin.
  • Pixel A (i, j) is the central pixel, and eight adjacent pixels are used.
  • The vertical edge strength f_v and the horizontal edge strength f_h at the pixel A(i, j) when the Sobel filter of (a) is used are calculated using [Formula 4] and [Formula 5].
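  • Since [Formula 1] to [Formula 5] are not reproduced in this text, the following sketch assumes the usual Sobel convolution together with the common definitions of edge strength (the root of the sum of squares of f_v and f_h) and edge angle (their arctangent); it is one standard way to realize these formulas, not necessarily the exact one used here.

        import numpy as np

        # Sobel coefficients of FIG. 3(a): the vertical filter differences pixel
        # values horizontally (for f_v), the horizontal filter differences them
        # vertically (for f_h).
        SOBEL_V = np.array([[-1, 0, 1],
                            [-2, 0, 2],
                            [-1, 0, 1]], dtype=np.float64)
        SOBEL_H = SOBEL_V.T

        def edge_strength_and_angle(A, i, j):
            """Edge strength and edge angle g at pixel A(i, j), computed from the
            pixel and its eight neighbours (i horizontal, j vertical, origin at
            the top-left corner of the screen)."""
            window = A[j - 1:j + 2, i - 1:i + 2].astype(np.float64)
            f_v = float(np.sum(window * SOBEL_V))   # assumed form of [Formula 4]
            f_h = float(np.sum(window * SOBEL_H))   # assumed form of [Formula 5]
            strength = np.hypot(f_v, f_h)           # assumed form of [Formula 1]
            angle_g = np.arctan2(f_h, f_v)          # assumed form of [Formula 2]
            return strength, angle_g
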
  • FIG. 5 is a diagram showing a mode of block division by the second block division unit 111.
  • When performing inter-screen prediction in H.264/AVC, as shown in FIGS. 5(1) to (7), there are four basic modes: 16×16 pixel mode, 16×8 mode, 8×16 mode, and 8×8 mode.
  • When the 8×8 mode is used, each of the four resulting 8×8 blocks can additionally be divided, independently of the others, into one of four types: 8×8 mode, 8×4 mode, 4×8 mode, and 4×4 mode, giving seven block division modes in total.
  • In the 16×8 mode, for example, the second block dividing unit 111 divides the first input image block 151 into two blocks of 16×8 size and outputs them to the inter-screen prediction unit 113.
  • the block after being divided by the second block dividing unit 111 is referred to as a sub-block.
  • the second block dividing unit 111 adds a new mode for dividing the first input image block 151 based on the edge information acquired by the edge information acquiring unit 110 in addition to the above seven block dividing modes.
  • this mode is referred to as “edge division mode (edge_base division)”.
  • the edge division mode shown in FIG. 5 (8) is provided. That is, in the new edge division mode, the encoding target block is divided into two sub-blocks 0 and 1 along the division line 220a corresponding to the edge information. Since the dividing line 220a is determined according to the edge information, an arbitrary inclination can be taken at an arbitrary position in the block.
  • The shapes of the two sub-blocks include not only rectangles but also trapezoids and triangles, and the two shapes may differ from each other. Furthermore, every pixel in the encoding target block 201 belongs to exactly one of sub-block 0 and sub-block 1; no pixel belongs to both sub-blocks or to neither.
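  • As a concrete illustration (a sketch with hypothetical helper names, not taken from the patent text), the dividing line 220a can be turned into a per-pixel assignment as follows, so that each pixel of the 16×16 block falls into exactly one sub-block.

        import numpy as np

        def edge_division_mask(x0, y0, angle_g, bx, by, size=16):
            """Assign every pixel of the target block at (bx, by) to sub-block 0
            or sub-block 1.  The dividing line passes through (x0, y0) with
            direction angle_g, taken from the edge information.  Pixels on one
            side of the line (including the line itself) form sub-block 0 and
            the remaining pixels form sub-block 1; the tie-breaking rule for
            pixels exactly on the line is an assumption."""
            nx, ny = -np.sin(angle_g), np.cos(angle_g)      # line normal
            ys, xs = np.mgrid[by:by + size, bx:bx + size]   # pixel coordinates
            signed = nx * (xs - x0) + ny * (ys - y0)
            return (signed > 0).astype(np.uint8)            # 0 or 1 per pixel

        # Example: a diagonal line yields two trapezoidal sub-blocks.
        mask = edge_division_mask(x0=8.0, y0=0.0, angle_g=np.deg2rad(60), bx=0, by=0)
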
  • Inter-screen predictive encoding is a technique of referring to a reference image block in a reference screen that has already been encoded and encoding a difference from the encoding target image.
  • In H.264/AVC, three types of slices, i.e., the I slice, the P slice, and the B slice, are used as the unit for switching the coding mode, and inter-screen predictive coding can be performed in P slices and B slices.
  • In a P slice, motion-compensated inter-screen predictive encoding is performed by referring to one reference image block from an already-encoded past screen on the time axis.
  • In a B slice, motion-compensated inter-screen predictive encoding is performed by referring to two reference image blocks in any combination from already-encoded past or future screens on the time axis.
  • FIGS. 6A and 6B are diagrams illustrating an example of the operation of inter-screen prediction.
  • FIG. 6A illustrates a case where the block division mode is the 16 ⁇ 8 mode
  • the screen 600 is a coding target screen
  • the screen 610 is a past reference screen on the time axis with respect to the coding target screen 600
  • a screen 620 shows a future reference screen on the time axis with respect to the encoding target screen 600.
  • the encoding target block 601 in the encoding target screen 600 is composed of two sub-blocks 602 and 603 having a size of 16 ⁇ 8 (rectangular).
  • An area of 16×8 size similar to each of the sub-blocks 602 and 603 is searched for in the past reference screen 610 on the time axis to obtain a predicted image. For example, a sum of squared errors or a sum of difference errors is used to measure the similarity between images.
  • blocks 612 and 613 are selected as the predicted images of the sub-blocks 602 and 603.
  • In addition, blocks 622 and 623 can be selected from the future reference screen 620 on the time axis as predicted images of the sub-blocks 602 and 603. It is then possible to refer to the two blocks, past and future, and use their average image as the predicted image.
  • FIG. 6B shows the case of the edge division mode added according to this embodiment.
  • the screen 650 shows the encoding target screen
  • the screen 660 shows the past reference screen on the time axis with respect to the encoding target screen 650
  • the screen 670 shows the future reference screen on the time axis with respect to the encoding target screen 650.
  • the encoding target block 651 in the encoding target screen 650 is composed of, for example, two trapezoidal sub-blocks 652 and 653 by edge division.
  • the sub-blocks 652 and 653 select blocks 662 and 663 as reference images from the past reference screens 660 on the time axis. These blocks 662 and 663 are regions having the same shape as the respective divided shapes (here, trapezoids) of the sub-blocks 652 and 653.
  • the sub-blocks 652 and 653 select blocks 672 and 673 as reference images from the future reference screen 670 on the time axis in addition to the blocks 662 and 663. .
  • These blocks 672 and 673 are also regions having the same shape as the divided shapes of the sub-blocks 652 and 653, respectively.
  • inter-screen prediction in the edge division mode added in the present embodiment can be performed in the same manner as in the conventional rectangular block division mode, although the shape of the sub-block is different.
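  • Inter-screen prediction for such arbitrarily shaped sub-blocks can be sketched as a block-matching search in which only the pixels belonging to the sub-block enter the error measure. The code below is an illustrative full search using a sum of absolute differences under that masking assumption; the patent itself only names a squared-error or difference-error sum and does not fix the search strategy.

        import numpy as np

        def masked_motion_search(cur_block, ref_frame, mask, bx, by, search=8):
            """Find the motion vector of one arbitrarily shaped sub-block.

            cur_block : the 16x16 encoding target block located at (bx, by)
            ref_frame : the reference screen (past or future decoded image)
            mask      : 1 for pixels that belong to this sub-block, else 0
            Only masked pixels contribute to the error, so the reference region
            automatically has the same (e.g. trapezoidal) shape as the sub-block.
            """
            h, w = cur_block.shape
            cur = cur_block.astype(np.int32)
            best_mv, best_err = (0, 0), float("inf")
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y0, x0 = by + dy, bx + dx
                    if y0 < 0 or x0 < 0 or y0 + h > ref_frame.shape[0] \
                            or x0 + w > ref_frame.shape[1]:
                        continue
                    ref = ref_frame[y0:y0 + h, x0:x0 + w].astype(np.int32)
                    err = int(np.sum(np.abs(cur - ref) * mask))
                    if err < best_err:
                        best_err, best_mv = err, (dx, dy)
            return best_mv
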
  • the inter-screen prediction unit 113 selects a block division mode with the highest coding efficiency from among the inter-screen predictions in each block division mode according to FIGS. 5 (1) to (8).
  • For this selection, an RD-Optimization method is used, which selects the optimum mode from the relationship between the image quality distortion caused by encoding and the code amount. The details of RD-Optimization are described in Reference 1 below. [Reference 1] G. Sullivan and T. Wiegand, "Rate-Distortion Optimization for Video Compression", IEEE Signal Processing Magazine, vol. 15, no. 6, pp. 74-90, 1998.
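  • RD-Optimization of this kind is usually formulated as minimizing a Lagrangian cost J = D + λR over the candidate modes. The sketch below assumes a squared-error distortion and a rate measured in bits; it is a generic illustration of the selection step, not the exact cost function of this encoder.

        import numpy as np

        def rd_cost(distortion, rate_bits, lam):
            """Lagrangian cost J = D + lambda * R."""
            return distortion + lam * rate_bits

        def select_division_mode(candidates, lam):
            """Pick the block division mode with the smallest RD cost.

            candidates: iterable of (mode, reconstructed_block, original_block,
            rate_bits) tuples produced by trying each mode of FIGS. 5(1) to (8).
            Sketch only; a real encoder obtains the rate from actual entropy
            coding of the mode, motion vectors, and residual."""
            best_mode, best_cost = None, float("inf")
            for mode, recon, orig, bits in candidates:
                d = float(np.sum((recon.astype(np.int64) - orig.astype(np.int64)) ** 2))
                cost = rd_cost(d, bits, lam)
                if cost < best_cost:
                    best_cost, best_mode = cost, mode
            return best_mode
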
  • FIG. 7 and FIG. 8 are diagrams for specifically explaining the effect of the edge division mode in inter-screen predictive coding.
  • a screen 700 shown in FIG. 7B is an encoding target screen, and is a P slice here.
  • a screen 710 illustrated in FIG. 7A is a past reference screen on the time axis with respect to the encoding target screen 700.
  • a block 701 indicates an encoding target block
  • blocks 702 to 705 indicate encoded blocks adjacent to the encoding target block 701.
  • FIG. 8 is an enlarged view of blocks 701 to 705 in FIG. 7B.
  • an image including a stationary building 751 and a moving automobile 752 (752 ′) is taken as an example.
  • The encoding target block in FIG. 7B contains parts of both the background building 751 and the moving automobile 752, and the boundary (edge) between the building and the automobile runs in an oblique direction.
  • Consider the encoding target block 701 and the reference screen 710 in (a). Since the building portion of the block 701 is still, it matches best with the block 711 at the same spatial position within the building 751 of the reference screen 710.
  • The car portion, on the other hand, is moving, and matches best with the block 712 at the corresponding position within the car 752' of the reference screen 710 (that is, the region 712 shifted by the amount the car has moved between the two screens).
  • However, because the boundary between the building and the car is oblique, whichever of the conventional block division methods (FIGS. 5(1) to (7)) is chosen, some sub-block will contain both the building and the car, and the prediction error cannot be reduced sufficiently.
  • In the edge division mode, by contrast, the encoding target block 701 is divided into two trapezoidal sub-blocks 801 and 802 along the edge straight line 800. That is, by dividing so that the building is contained in sub-block 801 and the automobile in sub-block 802, there is no sub-block in which the building and the automobile are mixed.
  • the sub-block 801 generates a prediction image with reference to the block 711 on the reference screen 710.
  • Sub-block 802 refers to block 712 to generate a predicted image.
  • In this way, the prediction error can be reduced compared with conventional inter-screen prediction in H.264/AVC.
  • FIG. 9 shows the conventional intra-screen prediction operation in H.264/AVC.
  • It shows the case where the block size for intra-screen predictive encoding is 4×4 pixels.
  • In intra-screen prediction in H.264/AVC, prediction is performed using a total of 13 decoded pixels in the four encoded blocks adjacent to the left, upper left, upper, and upper right of the encoding target block.
  • As shown in FIGS. 9(1), (2), and (4) to (9), there are modes that generate a predicted image by extending the 13 decoded pixels in the direction of the arrow, and, as shown in FIG. 9(3), a mode that generates a predicted image from the average value of the adjacent decoded pixels.
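  • As an illustration, the two simplest of these 4×4 modes can be sketched as below, assuming the standard H.264/AVC definitions of the vertical mode and the DC (average) mode; the other directional modes follow the same pattern with different extension directions.

        import numpy as np

        def intra4x4_vertical(top):
            """Mode 0 (vertical): extend the 4 decoded pixels above the block
            downwards over the whole 4x4 block."""
            return np.tile(np.asarray(top, dtype=np.uint8), (4, 1))

        def intra4x4_dc(top, left):
            """Mode 2 (DC): predict every pixel with the rounded average of the
            4 decoded pixels above and the 4 decoded pixels to the left."""
            dc = (int(np.sum(top)) + int(np.sum(left)) + 4) >> 3
            return np.full((4, 4), dc, dtype=np.uint8)
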
  • the block size for performing intra-screen predictive encoding includes cases of 8 ⁇ 8 pixel units and 16 ⁇ 16 pixel units.
  • FIG. 10 shows an operation in the case of performing intra prediction encoding in units of 16 ⁇ 16 pixels.
  • In 16×16 pixel units, similarly to the 4×4 pixel case, the decoded pixels adjacent to the encoding target block are extended in the direction of the arrow to generate a predicted image.
  • One of the four modes (prediction modes 0 to 3) shown in FIGS. 10(1) to (4) is selected.
  • For intra-screen prediction, the second block division unit 111 divides the first input image block 151 into sub-blocks of one of three sizes: 16×16, 8×8, or 4×4 pixels.
  • The intra-screen prediction unit 112 then selects one of the nine modes shown in FIG. 9 or one of the four modes shown in FIG. 10.
  • In this embodiment, in addition to the above three block division modes (16×16, 8×8, and 4×4 pixel sizes), the second block division unit 111 is further provided with a mode (edge division mode) that divides the first input image block 151 based on the edge information calculated by the edge information acquisition unit 110.
  • the edge division mode is the same as in the case of the inter-picture prediction encoding, and the description thereof is omitted here.
  • FIG. 11 shows the operation of intra prediction in the edge division mode.
  • It shows an example in which, based on the edge information calculated by the edge information acquisition unit 110, the second block division unit 111 divides the first input image block 151 (encoding target block) into sub-blocks 1101 and 1102.
  • the intra prediction unit 112 performs intra prediction for each of the divided sub blocks.
  • As the prediction mode, either prediction mode 0 or prediction mode 1 shown in FIG. 10 is applied.
  • FIG. 11B and 11C show the intra-screen prediction operation for the sub-block 1101.
  • Since the sub-block 1101 is adjacent to both the upper and left encoded blocks, prediction from the encoded upper block (prediction mode 0) and prediction from the encoded left block (prediction mode 1) can both be applied. FIG. 11(b) shows the case where prediction mode 0 is applied to the sub-block 1101.
  • This is similar to FIG. 10(1), except that in FIG. 10(1) prediction is applied to the entire encoding target block, whereas in FIG. 11(b) prediction is applied only to the sub-block 1101.
  • FIG. 11C shows the case where the prediction mode 1 is applied to the sub-block 1101. This case is also similar to FIG. 10 (2), except that the prediction is applied only to the sub-block 1101 in FIG. 11 (c).
  • the pixel group 1104 in FIG. 11C is not in contact with the sub-block 1101, and when a decoded image corresponding to the position of the pixel group 1104 is used as it is as a predicted image, a prediction error is expected to increase.
  • it is effective to copy the value of the pixel 1103 that is in contact with both the pixel group 1104 and the sub-block 1101 to the pixel group 1104 and then use it as a predicted image.
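  • The padding idea can be sketched as follows, with hypothetical variable names: before prediction mode 1 is applied to the sub-block, the rows of the left reference column that do not touch the sub-block (pixel group 1104) are overwritten with the value of the nearest touching pixel (pixel 1103).

        import numpy as np

        def pad_left_reference(left_ref, touches_subblock):
            """Prepare the left reference column for a sub-block in prediction
            mode 1.  left_ref is the decoded pixel column to the left of the
            block; touches_subblock[r] is True where row r of that column is in
            contact with the sub-block.  Non-touching rows are overwritten with
            the value of the nearest touching pixel before being used as the
            prediction reference."""
            padded = np.array(left_ref, dtype=np.uint8, copy=True)
            touching_rows = [r for r, t in enumerate(touches_subblock) if t]
            if not touching_rows:
                return padded                      # nothing to copy from
            for row, touches in enumerate(touches_subblock):
                if not touches:
                    nearest = min(touching_rows, key=lambda r: abs(r - row))
                    padded[row] = padded[nearest]  # copy of pixel 1103's value
            return padded
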
  • FIG. 11D shows the case where the prediction mode 1 is applied to the sub-block 1102. Since the sub-block 1102 is not adjacent to the encoded upper block and is adjacent to only the left block, the prediction mode applicable in this example is only the prediction mode 1.
  • FIGS. 12 and 13 are diagrams for specifically explaining the effect of the edge division mode in the intra prediction encoding.
  • a screen 1200 indicates an encoding target screen
  • a block 1201 indicates an encoding target block
  • blocks 1202 to 1205 indicate encoded blocks adjacent to the encoding target block 1201.
  • FIG. 13 is an enlarged view of blocks 1201 to 1205.
  • the encoding target block 1201 is a block in which a part of the building 1251 and the automobile 1252 coexists, and the boundary between the building and the automobile is in an oblique direction.
  • With the conventional block division modes and prediction modes (FIGS. 9(1) to (9) and FIGS. 10(1) to (4)), the prediction error of such a block could not be reduced sufficiently.
  • When the edge division mode of this embodiment is used, the edge straight line 1300 between the building and the car is calculated from the encoded adjacent block 1202 by the method described for FIG. 2.
  • the encoding target block 1201 can be divided into a trapezoidal sub-block 1301 and a sub-block 1302 by the edge straight line 1300.
  • For one of the sub-blocks, a prediction image is generated with reference to the adjacent block 1203.
  • For the other sub-block, a prediction image is generated with reference to the adjacent block 1202, using the prediction modes described for FIG. 11.
  • In this way, the prediction error can be reduced compared with conventional intra-screen prediction in H.264/AVC.
  • the intra-screen prediction image generated by the intra-screen prediction unit 112 and the inter-screen prediction image generated by the inter-screen prediction unit 113 are obtained, and both are input to the encoding mode selection unit 114.
  • the encoding mode selection unit 114 selects an optimal prediction image from the input intra prediction image and inter prediction image.
  • a selection method for example, the RD-Optimization method described in Reference Document 1 is used.
  • When the intra-screen prediction image is selected, the block division method (16×16 pixel block, 8×8 pixel block, 4×4 pixel block, or edge division) and the prediction mode information are output to the variable-length coding unit 105 as the encoding mode information 157.
  • If the block division method is one of the 16×16, 8×8, or 4×4 pixel blocks, the prediction mode information illustrated in FIG. 9 or FIG. 10 is included.
  • When the inter-screen prediction image is selected, the block division method (FIGS. 5(1) to (8)) and the motion vector information are output to the variable-length coding unit 105 as the encoding mode information 157.
  • Compared with the conventional H.264/AVC standard, the sub-block division method of this embodiment increases the types of division shapes (trapezoids and triangles) and the degree of freedom in their size, so that optimum division can be performed according to the edges of the image to be encoded and prediction errors are reduced.
  • Moreover, only one mode is added by this new block division method, so its influence on the increase in code amount is minimal.
  • The shape used in the edge division mode added in this embodiment is not a fixed pattern but is derived from the edge features of the adjacent blocks. The block division method thus exploits the strong spatial correlation of images, that is, the property that image features are similar in the spatial direction, which makes it possible to reduce the prediction error.
  • As described above, in the image coding apparatus according to Embodiment 1, which divides an input image into first blocks and further divides each first block into a plurality of second blocks, the block shape of the second blocks is determined based on the edge information of encoded blocks adjacent to the first block in the same screen. This reduces the prediction error and makes it possible to realize image coding with high image quality and good coding efficiency.
  • In this embodiment, the block is divided based on edge information, but the same effect can be obtained by extracting an image feature portion that requires a large code amount and dividing the block based on that feature.
  • Embodiment 2 describes a case where edge information is acquired by a method different from that in Embodiment 1 when block division is performed based on edge information.
  • the image coding apparatus according to the second embodiment has the same configuration as that of the first embodiment (FIG. 1), and thus the description thereof is omitted.
  • In Embodiment 1, the edge information acquisition unit 110 in FIG. 1 calculates edge information using decoded image blocks adjacent to the encoding target block in the same screen, as shown in FIG. 2.
  • In this embodiment, by contrast, the edge information acquisition unit 110 calculates the edge information of the block at the same spatial position in an already-encoded screen that is temporally adjacent to the encoding target block.
  • FIG. 14 is a diagram for explaining an operation example of the edge information acquisition unit 110.
  • a screen 1400 is an encoding target screen
  • a screen 1410 is an encoded screen that precedes the encoding target screen 1400 in time, and is a decoded image screen obtained by decoding this.
  • a block 1401 indicates an encoding target block.
  • a block 1411 indicates a block in the decoded image screen 1410 that has the same spatial position in the screen as the block 1401.
  • The edge angle g at the pixel P_edge_cal2(i1) is calculated by [Formula 2].
  • From this edge angle g, the edge straight line at the pixel P_edge_cal2(i1) is calculated by [Formula 3] of Embodiment 1.
  • The edge straight line calculated by [Formula 3] is then applied to the encoding target block 1401, which is divided into sub-blocks accordingly.
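  • The only difference from Embodiment 1 is where the pixels for the edge calculation come from. A minimal sketch, reusing edge_strength_and_angle() from the earlier sketch and assuming (since the selection rule is not reproduced here) that the pixel with the strongest response defines the edge line:

        def edge_info_from_colocated_block(prev_decoded, bx, by, size=16):
            """Edge information for the target block at (bx, by), taken from the
            block at the same spatial position of the previously decoded screen
            (block 1411 for block 1401 in FIG. 14) instead of from spatially
            adjacent blocks."""
            best = None
            for j in range(by + 1, by + size - 1):
                for i in range(bx + 1, bx + size - 1):
                    strength, angle_g = edge_strength_and_angle(prev_decoded, i, j)
                    if best is None or strength > best[0]:
                        best = (strength, angle_g, i, j)
            return best  # (strength, edge angle g, x_i, y_i)
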
  • The sub-block division method in this embodiment is the same as in Embodiment 1.
  • Compared with the conventional standard, it increases the types of division shapes (trapezoid or triangle) and the degree of freedom in their size, so that optimum division is possible according to the edges (features) of the image to be encoded, which reduces the prediction error.
  • Moreover, only one mode is added by this new block division method, so its influence on the increase in code amount is minimal.
  • the shape of the edge division mode added in this embodiment is not a fixed pattern, but is derived from image features of temporally adjacent screens.
  • The block division method thus exploits the strong temporal correlation of images, that is, the property that image features are similar in the time direction, and the prediction error can be reduced.
  • As described above, in this embodiment the block shape of the second block is determined based on the edge information of the block at the same spatial position in an encoded screen adjacent in the time direction, thereby reducing the prediction error and realizing image coding with high image quality and good coding efficiency.
  • the block is divided based on the edge information. However, the same effect can be obtained by extracting an image feature portion having a large code amount and dividing the block based on this.
  • FIG. 15 is a configuration block diagram showing an embodiment (third embodiment) of an image decoding apparatus according to the present invention.
  • The variable-length decoding unit 1501 performs variable-length decoding on the input encoded stream 1550 and acquires the quantized data 1551, which is the frequency transform component of the prediction difference, together with the encoding mode information 1555 necessary for prediction processing, such as the block size and motion vector.
  • the inverse quantization unit 1502 inversely quantizes the quantized data 1551 and outputs inversely quantized data.
  • the inverse frequency transform unit 1503 performs inverse frequency transform on the inversely quantized data and outputs difference block data 1552.
  • the adder 1504 adds difference block data 1552 and a predicted image block 1556 described later, and outputs a decoded image 1553.
  • the decoded image 1553 is output from the image decoding device 1500 and stored in the frame memory 1505.
  • the edge information acquisition unit 1506 outputs edge information 1557 for the image read from the frame memory 1505.
  • the intra prediction unit 1507 generates an intra prediction image from the reference image 1554 read from the frame memory 1505 when the encoding mode information 1555 acquired from the variable length decoding unit 1501 is the intra prediction encoding mode.
  • the predicted image block 1556 is output to the adder 1504.
  • the inter-screen prediction unit 1508 generates an inter-screen prediction image from the reference image 1554 read from the frame memory 1505 when the encoding mode information 1555 acquired from the variable length decoding unit 1501 is the inter-screen prediction encoding mode.
  • the predicted image block 1556 is output to the adder 1504.
  • The operation of the edge information acquisition unit 1506 is the same as in Embodiment 1 or Embodiment 2.
  • In Embodiment 1, the screen 210 in FIG. 2 was the encoding target screen and the block 201 was the encoding target block; here, the screen 210 should be read as the decoding target screen and the block 201 as the decoding target block.
  • blocks 202 to 205 in FIG. 2 represent decoded images in the operation of the first embodiment, and can be applied to the present embodiment as they are.
  • the edge straight line 220 for the decoding target block 201 is acquired, and the edge information 1557 is output.
  • Similarly, whereas in Embodiment 2 the screen 1400 in FIG. 14 was the encoding target screen and the block 1401 was the encoding target block, here the screen 1400 is the decoding target screen and the block 1401 is the decoding target block.
  • the intra prediction unit 1507 generates a prediction image block 1556 when the encoding mode information 1555 acquired from the variable length decoding unit 1501 is the intra prediction encoding mode.
  • the intra-screen prediction unit 1507 acquires block division mode and prediction mode information from the coding mode information 1555.
  • the block division mode is any one of 16 ⁇ 16 pixel size, 8 ⁇ 8 pixel size, 4 ⁇ 4 pixel size, and edge division mode.
  • When the block division mode is the 16×16 pixel size, the prediction mode is then acquired from the encoding mode information 1555.
  • The prediction mode in this case is one of prediction modes 0 to 3 shown in FIG. 10. Even when the block division mode is the 8×8 or 4×4 pixel size, the prediction mode is acquired from the coding mode information 1555.
  • The prediction mode in this case is one of prediction modes 0 to 8 shown in FIG. 9.
  • When the block division mode is the edge division mode, the shape of each sub-block of the decoding target block is obtained based on the edge information 1557 acquired by the edge information acquisition unit 1506.
  • The prediction mode of each sub-block is then acquired from the coding mode information 1555.
  • The prediction mode in this case is either prediction mode 0 or prediction mode 1 shown in FIG. 10.
  • the intra-screen prediction unit 1507 generates an intra-screen prediction image corresponding to the prediction mode based on the reference image 1554 read from the frame memory 1505, and outputs it as a prediction image block 1556.
  • the inter-screen prediction unit 1508 generates a prediction image block 1556 when the encoding mode information 1555 acquired from the variable length decoding unit 1501 is the inter-screen prediction encoding mode.
  • the inter-screen prediction unit 1508 acquires block division mode and motion vector information from the coding mode information 1555.
  • the block division mode is one of FIGS. 5 (1) to (8).
  • When the block division mode is one of the rectangular modes of FIGS. 5(1) to (7), the motion vector information of each sub-block is acquired from the encoding mode information 1555.
  • When the block division mode is the edge division mode of FIG. 5(8), the shape of each sub-block of the decoding target block is first obtained based on the edge information 1557 acquired by the edge information acquisition unit 1506.
  • The motion vector information of each sub-block is then acquired from the encoding mode information 1555.
  • the inter-screen prediction unit 1508 generates an inter-screen prediction image corresponding to the prediction mode based on the reference image 1554 read from the frame memory 1505 and outputs it as a prediction image block 1556.
  • FIG. 16 is a flowchart showing the decoding processing procedure for one frame in the present embodiment.
  • the following processing is performed on all blocks in one frame (S1601). That is, the prediction difference is decoded by performing variable length decoding processing (S1602), inverse quantization processing (S1603), and inverse frequency transform processing (S1604) on the input stream. Subsequently, it is determined in which method (intra-screen, between screens) the target block is predictively encoded (S1605). In accordance with the determination result, an intra-screen prediction process (S1606) or an inter-screen prediction process (S1607) is executed to generate a predicted image.
  • a decoded image generation process (S1608) is performed from the generated prediction image and the prediction difference generated in the inverse frequency conversion process (S1604). The above processing is performed for all the blocks in the frame (S1609), and decoding for one frame of the image is completed.
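  • The per-block flow of FIG. 16 can be summarized in the sketch below. The helper callables are hypothetical placeholders for the individual flowchart steps (the patent fixes only their order, not an API); the variable-length decoding of S1602 is assumed to have already produced the mode information and coefficients for each block.

        def decode_frame(blocks, steps):
            """One-frame decoding loop following FIG. 16 (S1601 to S1609).

            blocks: iterable of (mode_info, coeffs) pairs parsed from the stream.
            steps:  dict of caller-supplied callables for the individual stages.
            """
            decoded = []
            for mode_info, coeffs in blocks:                        # S1601
                residual = steps["inverse_transform"](              # S1604
                    steps["inverse_quantize"](coeffs))              # S1603
                if mode_info["intra"]:                              # S1605
                    pred = steps["intra_prediction"](mode_info)     # S1606
                else:
                    pred = steps["inter_prediction"](mode_info)     # S1607
                decoded.append(pred + residual)                     # S1608
            return decoded                                          # all blocks done: S1609
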
  • FIG. 17 is a flowchart showing in more detail the procedure of the intra-screen prediction process (S1606) shown in FIG.
  • First, it is determined whether the block division mode is the edge division mode (S1701).
  • In the edge division mode (S1701: YES), block division processing based on the edge information is performed (S1702), and the intra-screen prediction mode is acquired from the coding mode information decoded in the variable-length decoding process (S1602) of FIG. 16 (S1703).
  • Otherwise (S1701: NO), the block division mode is acquired from the coding mode information decoded in the variable-length decoding process (S1602) of FIG. 16 (S1704), and the intra-screen prediction mode is then acquired (S1705).
  • an intra-screen prediction image generation process (S1706) is performed from the block division mode and the intra-screen prediction mode acquired by any of the above methods, and the intra-screen prediction image generation process ends.
  • FIG. 18 is a flowchart showing in more detail the procedure of the inter-screen prediction process (S1607) shown in FIG.
  • First, it is determined whether the block division mode is the edge division mode (S1801).
  • In the edge division mode (S1801: YES), block division processing based on the edge information is performed (S1802), and the motion vector is acquired from the coding mode information decoded in the variable-length decoding process (S1602) of FIG. 16 (S1803).
  • Otherwise (S1801: NO), the block division mode is acquired from the coding mode information decoded in the variable-length decoding process (S1602) of FIG. 16 (S1804), and the motion vector is then acquired (S1805).
  • the inter-screen prediction image generation process (S1806) is performed from the block division mode and the motion vector information acquired by any of the above methods, and the inter-screen prediction image generation process ends.
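  • The branch that S1701/S1801 describe is common to intra-screen and inter-screen decoding: in the edge division mode the sub-block shapes are re-derived on the decoder side from the edge information 1557. A sketch with a hypothetical data layout, reusing edge_division_mask() from the earlier sketch:

        import numpy as np

        def rectangular_mask(mode, size=16):
            """Sub-block index mask for the conventional rectangular modes
            (simplified here to 16x16, 16x8 and 8x16 for brevity)."""
            mask = np.zeros((size, size), dtype=np.uint8)
            if mode == "16x8":
                mask[size // 2:, :] = 1
            elif mode == "8x16":
                mask[:, size // 2:] = 1
            return mask

        def subblock_shapes_for_decoding(mode_info, edge_info, bx, by):
            """Decide the sub-block shapes of the decoding target block
            (S1701/S1801 followed by S1702/S1802 or S1704/S1804)."""
            if mode_info["division_mode"] == "edge":
                x0, y0, angle_g = edge_info        # from the edge information 1557
                return edge_division_mask(x0, y0, angle_g, bx, by, size=16)
            return rectangular_mask(mode_info["division_mode"], size=16)
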
  • In this way, the decoding target block is divided into sub-blocks whose shapes, including trapezoids and triangles, are determined from the edge information 1557 output by the edge information acquisition unit 1506, and a prediction image is generated for each sub-block.
  • The image decoding apparatus can therefore perform decoding knowing only that the block division method of the decoding target block is the edge division mode.
  • As described above, the image decoding apparatus of this embodiment decodes an input stream that was encoded by dividing the input image into first blocks and further dividing each first block into a plurality of second blocks.
  • The division shape of the second block is determined based on the edge information of decoded blocks adjacent to the first block in the same screen, and decoding is performed.
  • Alternatively, the division shape of the second block is determined based on the edge information of the block at the same spatial position in a decoded screen adjacent to the first block in the time direction, and decoding is performed.
  • In this embodiment, the block is divided based on edge information, but the same effect can be obtained by extracting an image feature portion that requires a large code amount and dividing the block based on that feature.
  • DESCRIPTION OF SYMBOLS: 100 ... image coding apparatus; 101 ... first block division unit; 102 ... subtractor; 103 ... frequency conversion unit; 104 ... quantization unit; 105 ... variable-length coding unit; 106 ... inverse quantization unit; 107 ... inverse frequency conversion unit; 108 ... adder; 109 ... frame memory; 110 ... edge information acquisition unit; 111 ... second block division unit; 112 ... intra-screen prediction unit; 113 ... inter-screen prediction unit; 114 ... coding mode selection unit; 1500 ... image decoding device; 1501 ... variable-length decoding unit; 1502 ... inverse quantization unit; 1503 ... inverse frequency conversion unit; 1504 ... adder; 1505 ... frame memory; 1506 ... edge information acquisition unit; 1507 ... intra-screen prediction unit; 1508 ... inter-screen prediction unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to encoding and decoding technologies that achieve high image quality and high coding efficiency by dividing image feature portions, such as edges, into suitable blocks. An image encoding apparatus (100) comprises: a first block division unit (101) that divides an input image into first blocks; a second block division unit (111) that further divides the first blocks into a plurality of second blocks; and prediction image generation units (112, 113) that generate prediction images by intra-screen or inter-screen prediction of the first or second blocks. The second block division unit (111) determines the shape of the second blocks according to the edge information of encoded blocks adjacent to the first blocks in the same screen.
PCT/JP2010/000050 2009-01-13 2010-01-06 Image encoding device and method, and image decoding device and method WO2010082463A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2009-004614 2009-01-13
JP2009004614A JP2012089905A (ja) 2009-01-13 2009-01-13 Image encoding device and image encoding method, image decoding device and image decoding method

Publications (1)

Publication Number Publication Date
WO2010082463A1 true WO2010082463A1 (fr) 2010-07-22

Family

ID=42339716

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2010/000050 WO2010082463A1 (fr) 2009-01-13 2010-01-06 Image encoding device and method, and image decoding device and method

Country Status (2)

Country Link
JP (1) JP2012089905A (fr)
WO (1) WO2010082463A1 (fr)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2360927A3 (fr) * 2010-02-12 2011-09-28 Samsung Electronics Co., Ltd. Image encoding/decoding system using graph-based pixel prediction and encoding system and method
WO2012090413A1 (fr) * 2010-12-27 2012-07-05 日本電気株式会社 Video encoding device, video decoding device, video encoding method, video decoding method, and program
JP2015502064A (ja) * 2011-11-11 2015-01-19 フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ Effective prediction using partition coding
US10321148B2 (en) 2011-11-11 2019-06-11 Ge Video Compression, Llc Effective wedgelet partition coding using spatial prediction
US10341667B2 (en) 2011-11-11 2019-07-02 Ge Video Compression, Llc Adaptive partition coding
US10574981B2 (en) 2011-11-11 2020-02-25 Ge Video Compression, Llc Effective Wedgelet partition coding

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107005702B (zh) 2014-11-14 2020-10-16 华为技术有限公司 Systems and methods for processing a block of a digital image
EP3207700B1 (fr) 2014-11-14 2020-01-08 Huawei Technologies Co., Ltd. Systems and methods for mask-based processing of a block of a digital image
WO2016074744A1 (fr) 2014-11-14 2016-05-19 Huawei Technologies Co., Ltd. Systems and methods for processing a digital image
CN105872539B (zh) * 2015-02-08 2020-01-14 同济大学 Image encoding method and device, and image decoding method and device
EP3673651A1 (fr) * 2017-08-22 2020-07-01 Panasonic Intellectual Property Corporation of America Image encoder, image decoder, image encoding method, and image decoding method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008017305A (ja) * 2006-07-07 2008-01-24 Canon Inc Image processing apparatus and image processing method
WO2008016605A2 (fr) * 2006-08-02 2008-02-07 Thomson Licensing Methods and apparatus for adaptive geometric partitioning for video decoding

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008017305A (ja) * 2006-07-07 2008-01-24 Canon Inc Image processing apparatus and image processing method
WO2008016605A2 (fr) * 2006-08-02 2008-02-07 Thomson Licensing Methods and apparatus for adaptive geometric partitioning for video decoding

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2360927A3 (fr) * 2010-02-12 2011-09-28 Samsung Electronics Co., Ltd. Image encoding/decoding system using graph-based pixel prediction and encoding system and method
WO2012090413A1 (fr) * 2010-12-27 2012-07-05 日本電気株式会社 Video encoding device, video decoding device, video encoding method, video decoding method, and program
JP2015502064A (ja) * 2011-11-11 2015-01-19 フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ Effective prediction using partition coding
JP2017135724A (ja) * 2011-11-11 2017-08-03 ジーイー ビデオ コンプレッション エルエルシー Effective prediction using partition coding
CN109218735A (zh) * 2011-11-11 2019-01-15 GE Video Compression, LLC Apparatus and method for encoding and decoding
JP2019068455A (ja) * 2011-11-11 2019-04-25 ジーイー ビデオ コンプレッション エルエルシー Effective prediction using partition coding
US10321148B2 (en) 2011-11-11 2019-06-11 Ge Video Compression, Llc Effective wedgelet partition coding using spatial prediction
US10321139B2 (en) 2011-11-11 2019-06-11 Ge Video Compression, Llc Effective prediction using partition coding
US10334255B2 (en) 2011-11-11 2019-06-25 Ge Video Compression, Llc Effective prediction using partition coding
US10341667B2 (en) 2011-11-11 2019-07-02 Ge Video Compression, Llc Adaptive partition coding
US10362317B2 (en) 2011-11-11 2019-07-23 Ge Video Compression, Llc Adaptive partition coding
US10542263B2 (en) 2011-11-11 2020-01-21 Ge Video Compression, Llc Effective prediction using partition coding
US10542278B2 (en) 2011-11-11 2020-01-21 Ge Video Compression, Llc Effective wedgelet partition coding using spatial prediction
US10567776B2 (en) 2011-11-11 2020-02-18 Ge Video Compression, Llc Adaptive partition coding
US10574981B2 (en) 2011-11-11 2020-02-25 Ge Video Compression, Llc Effective Wedgelet partition coding
US10574982B2 (en) 2011-11-11 2020-02-25 Ge Video Compression, Llc Effective wedgelet partition coding
US10771794B2 (en) 2011-11-11 2020-09-08 Ge Video Compression, Llc Adaptive partition coding
US10771793B2 (en) 2011-11-11 2020-09-08 Ge Video Compression, Llc Effective prediction using partition coding
US10785497B2 (en) 2011-11-11 2020-09-22 Ge Video Compression, Llc Effective wedgelet partition coding using spatial prediction
US10911753B2 (en) 2011-11-11 2021-02-02 Ge Video Compression, Llc Effective wedgelet partition coding
JP2021013197A (ja) * 2011-11-11 2021-02-04 ジーイー ビデオ コンプレッション エルエルシー Effective prediction using partition coding
JP2021044832A (ja) * 2011-11-11 2021-03-18 ジーイー ビデオ コンプレッション エルエルシー Effective prediction using partition coding
US10986352B2 (en) 2011-11-11 2021-04-20 Ge Video Compression, Llc Adaptive partition coding
US11032562B2 (en) 2011-11-11 2021-06-08 Ge Video Compression, Llc Effective wedgelet partition coding using spatial prediction
US11032555B2 (en) 2011-11-11 2021-06-08 Ge Video Compression, Llc Effective prediction using partition coding
US11425367B2 (en) 2011-11-11 2022-08-23 Ge Video Compression, Llc Effective wedgelet partition coding
JP7126329B2 (ja) 2011-11-11 2022-08-26 ジーイー ビデオ コンプレッション エルエルシー Effective prediction using partition coding
JP2023025001A (ja) * 2011-11-11 2023-02-21 ジーイー ビデオ コンプレッション エルエルシー Effective prediction using partition coding
US11722657B2 (en) 2011-11-11 2023-08-08 Ge Video Compression, Llc Effective wedgelet partition coding
US11863763B2 (en) 2011-11-11 2024-01-02 Ge Video Compression, Llc Adaptive partition coding
CN109218735B (zh) * 2011-11-11 2024-07-05 GE Video Compression, LLC Apparatus and method for encoding and decoding

Also Published As

Publication number Publication date
JP2012089905A (ja) 2012-05-10

Similar Documents

Publication Publication Date Title
WO2010082463A1 (fr) Image encoding device and method, and image decoding device and method
US9047667B2 (en) Methods and apparatuses for encoding/decoding high resolution images
JP6084734B2 (ja) Video decoding device
KR100750128B1 (ko) Method and apparatus for intra prediction encoding and decoding of images
JP5401009B2 (ja) Method and apparatus for intra prediction encoding and decoding of video
KR100727972B1 (ko) Method and apparatus for intra prediction encoding and decoding of images
JP6005087B2 (ja) Image decoding device, image decoding method, image encoding device, image encoding method, and data structure of encoded data
US7426308B2 (en) Intraframe and interframe interlace coding and decoding
KR101623124B1 (ko) Video encoding apparatus and encoding method thereof, video decoding apparatus and decoding method thereof, and directional intra prediction method used therein
US20150010243A1 (en) Method for encoding/decoding high-resolution image and device for performing same
US20130089265A1 (en) Method for encoding/decoding high-resolution image and device for performing same
WO2010001917A1 (fr) Image processing device and method
WO2012096150A1 (fr) Moving image encoding device, moving image decoding device, moving image encoding method, and moving image decoding method
KR100727970B1 (ko) Apparatus and method for encoding and decoding an image, and recording medium on which a program for performing the same is recorded
JP4577778B2 (ja) Moving picture encoding and decoding method
JP2009049969A (ja) Moving picture encoding apparatus and method, and moving picture decoding apparatus and method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10731134

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10731134

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP