TWI478587B

TWI478587B - Fast mode prediction method for scalable multimedia video coding

Info

Publication number: TWI478587B
Application number: TW100135624A
Authority: TW
Original assignee: Nat Univ Chung Cheng
Priority date: 2011-09-30
Filing date: 2011-09-30
Publication date: 2015-03-21
Also published as: TW201315244A

Description

Fast mode prediction method for scalable multimedia video coding

The invention relates to image processing technology, and in particular to a fast mode prediction method for scalable multimedia video coding.

In scalable multimedia video coding, coding-mode prediction is built from three kinds of prediction: intra-picture texture prediction, inter-picture motion prediction, and cross-resolution/cross-layer prediction.

A macroblock in a picture can be coded in seven modes: 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, and 4x4. In this document, a mode search that evaluates all seven of these modes for a macroblock is defined as full mode.

Intra-picture texture prediction uses the pixels surrounding the current macroblock, under two kinds of macroblock partition, to build textures and predict the texture most similar to the current macroblock.

Inter-picture motion prediction takes the motion vectors (MV) of the macroblocks neighboring the current macroblock and derives a predicted motion vector (PMV). In the reference frame, a search range (SR) is opened around the predicted motion vector and searched for the macroblock most similar to the current macroblock; the vector between the two is called the motion vector. Motion prediction goes through eight kinds of macroblock partition and computes up to forty-one motion vectors for the different block partitions.

Cross-resolution/cross-layer prediction takes the picture information of a smaller-resolution picture whose mode prediction is already complete and supplies it to the larger-resolution picture for mode prediction. It comprises three modes: cross-layer intra texture prediction (Interlayer Intra), cross-layer residual prediction (Interlayer Residual), and cross-layer motion prediction (Interlayer Motion). Cross-layer residual prediction includes an inter-picture motion prediction mode with the smaller-resolution residual removed; it has eight macroblock partitions and likewise computes forty-one motion vectors.

Full-mode prediction obtains the best prediction mode after running all three of the above: intra-picture texture prediction, inter-picture motion prediction, and cross-resolution/cross-layer prediction.

To obtain the best prediction mode, the encoder must complete the two macroblock partitions of intra-picture texture prediction, the eight modes of inter-picture motion prediction, and the ten modes of cross-resolution/cross-layer prediction. Only one of these twenty modes becomes the best prediction mode; the other nineteen are discarded, which produces a very large amount of redundant computation. At small resolutions such as QCIF and CIF the number of macroblocks is small, so the accumulated redundant computation is less noticeable. At D1, HD720, Full HD1080, or larger, the number of macroblocks is large and this redundant computation becomes the main cause of slow overall performance.

Because the data stream of scalable multimedia video coding contains both large-resolution and small-resolution pictures, compression inevitably involves redundant computation. The algorithm design therefore uses several novel techniques to reduce computational complexity while preserving picture quality.

The main object of the present invention is to provide a fast mode prediction method for scalable multimedia video coding that reduces computational complexity while preserving picture quality.

To achieve the foregoing object, the present invention provides a fast mode prediction method for scalable multimedia video coding, in which the scalable multimedia video coding has pictures of three different resolutions: a base layer of minimum resolution, a first enhancement layer of medium resolution, and a second enhancement layer of maximum resolution. The method comprises the following steps:

a) Starting from the minimum-resolution picture, perform coding-mode prediction on each macroblock in sequence. After all macroblocks of the minimum-resolution picture have completed coding-mode prediction, process the macroblocks of the medium-resolution picture in sequence, and finally the macroblocks of the maximum-resolution picture.

b) Define the macroblock currently being processed as the current macroblock and determine whether it is at the minimum resolution. If so, run full mode directly, take full mode as the best coding mode of the current macroblock, and jump to step f); if not, proceed to step c). Here the best coding mode is the mode with the smallest signal-to-noise distortion ratio.

c) Refer to the best coding mode of the next-lower-resolution macroblock corresponding to the current macroblock and, using the reference-neighboring-block-mode technique, to the best coding modes of the neighboring macroblocks at the same resolution as the current macroblock. Tally these referenced coding modes, count how often each mode occurs, and sort the coding modes by number of occurrences.

d) Perform early prediction of the best coding mode by working through the neighboring macroblocks one by one in the order given by step c). For each neighboring macroblock, compute the ratio between its signal-to-noise distortion ratio and that of its corresponding next-lower-resolution macroblock. Multiply the signal-to-noise distortion ratio of the current macroblock by this ratio, and compare the product with the signal-to-noise distortion ratio of the next-lower-resolution macroblock corresponding to the current macroblock. If the former is smaller than the latter, the coding mode being compared is judged to be the best coding mode and the comparisons for the remaining coding modes stop. If the former is greater than or equal to the latter, continue with the next neighboring macroblock in order. If no comparison, through the last coding mode, yields a value smaller than the signal-to-noise distortion ratio of the next-lower-resolution macroblock, the last coding mode in the sort order is taken as the best coding mode.

e) Take the best coding mode judged in step d) as the best coding mode of the current macroblock.

f) Determine whether the current macroblock is the last macroblock. If so, go to step g); if not, designate the next macroblock as the current macroblock and jump to step b).

g) End.

In this way, computational complexity is reduced while picture quality is preserved.

To explain the technical features of the present invention in detail, a preferred embodiment is described below with reference to the drawings. As shown in Figures 1 to 4, a preferred embodiment of the present invention provides a fast mode prediction method for scalable multimedia video coding, in which the scalable multimedia video coding has pictures of three different resolutions: a base layer of minimum resolution, a first enhancement layer of medium resolution, and a second enhancement layer of maximum resolution. The method mainly comprises the following steps:

a) Starting from the minimum-resolution picture, perform coding-mode prediction on each macroblock in sequence. After all macroblocks of the minimum-resolution picture have completed coding-mode prediction, process the macroblocks of the medium-resolution picture in sequence, and finally the macroblocks of the maximum-resolution picture.

b) Define the macroblock currently being processed as the current macroblock and determine whether it is at the minimum resolution (that is, whether it belongs to the base layer). If so, run full mode directly, take full mode as the best coding mode of the current macroblock, and jump to step f); if not, proceed to step c). Here the best coding mode is the mode with the smallest signal-to-noise distortion ratio.

c) Refer to the best coding mode of the next-lower-resolution macroblock corresponding to the current macroblock and, using the reference-neighboring-block-mode technique, to the best coding modes of the neighboring macroblocks at the same resolution as the current macroblock. Tally these referenced coding modes, count how often each mode occurs, and sort the coding modes by number of occurrences. The next-lower resolution is defined relative to the current macroblock: if the current macroblock is at medium resolution, the next-lower resolution is the minimum resolution; if the current macroblock is at maximum resolution, the next-lower resolution is the medium resolution. Figure 2 illustrates the referencing of neighboring macroblocks. For the current macroblock MB(E1A) in the first enhancement layer, the referenced next-lower-resolution macroblock is the corresponding base-layer macroblock MB(BA), and the referenced same-resolution neighbors are the first-enhancement-layer macroblocks MB(E1B), MB(E1C), and MB(E1D). Likewise, for the current macroblock MB(E2A) in the second enhancement layer, the referenced next-lower-resolution macroblock is MB(E1A) and the referenced same-resolution neighbors are MB(E2B), MB(E2C), and MB(E2D).

d) Perform early prediction of the best coding mode by working through the neighboring macroblocks one by one in the order given by step c). For each neighboring macroblock, compute the ratio between its signal-to-noise distortion ratio and that of its corresponding next-lower-resolution macroblock. Multiply the signal-to-noise distortion ratio of the current macroblock by this ratio, and compare the product with the signal-to-noise distortion ratio of the next-lower-resolution macroblock corresponding to the current macroblock. If the former is smaller than the latter, the coding mode being compared is judged to be the best coding mode and the comparisons for the remaining coding modes stop. If the former is greater than or equal to the latter, continue with the next neighboring macroblock in order. If no comparison, through the last coding mode, yields a value smaller than the signal-to-noise distortion ratio of the next-lower-resolution macroblock, the last coding mode in the sort order is taken as the best coding mode. Figure 3 illustrates this comparison. The current macroblock MB(EA_ET) is in the first enhancement layer, and the macroblock to its left, MB(EB_ET), is a neighboring macroblock. The signal-to-noise distortion ratio of MB(EB_ET) is divided by that of its corresponding next-lower-resolution (base-layer) macroblock MB(BB_ET) to obtain the ratio RDcost ratio = RDcost(EB_ET) / RDcost(BB_ET). The signal-to-noise distortion ratio of the current macroblock MB(EA_ET) is then multiplied by this ratio, giving RDcost(EA_ET) * RDcost ratio, and the product is compared with the signal-to-noise distortion ratio of MB(BA_ET), the next-lower-resolution (base-layer) macroblock corresponding to MB(EA_ET).

Because step d) predicts the best coding mode early, it sometimes fails to select the true best coding mode. If such an error occurs for the current macroblock at medium resolution, it affects the correctness of the best coding mode of the corresponding macroblock at maximum resolution, so the error propagates and grows.

Therefore, this embodiment also includes step d1), which dynamically adds full mode. When coding-mode prediction reaches a macroblock of the medium-resolution picture, the signal-to-noise distortion ratios of the same-resolution neighboring macroblocks of the current macroblock are averaged, and this average is divided by the signal-to-noise distortion ratio of the current macroblock. The result is the signal-to-noise distortion ratio trend of the medium-resolution current macroblock, which is stored. When coding-mode prediction reaches a macroblock of the maximum-resolution picture, the same computation gives the signal-to-noise distortion ratio trend of the maximum-resolution current macroblock. The two trends are then compared. If they differ, the early prediction of step d) is taken to have produced an error, so the best coding mode of the macroblock that follows the maximum-resolution current macroblock is set to full mode. If they are the same, the subsequent steps continue.

Figure 4 illustrates how the signal-to-noise distortion ratio trend of step d1) is obtained. When processing the current macroblock MB(E1A_R) of the medium-resolution picture, the signal-to-noise distortion ratios of its neighboring (left and above) macroblocks MB(E1B_R) and MB(E1C_R) are averaged, and the average is divided by the signal-to-noise distortion ratio of MB(E1A_R). The resulting trend, RDcost average / RDcost(E1A_R), is stored. Then, when processing the current macroblock MB(E2A_R) of the maximum-resolution picture, the signal-to-noise distortion ratios of its neighboring (left and above) macroblocks MB(E2B_R) and MB(E2C_R) are averaged, and the average is divided by the signal-to-noise distortion ratio of MB(E2A_R), giving the trend RDcost average / RDcost(E2A_R).

In this embodiment, the signal-to-noise distortion ratio trend of step d1) has a certain range, and the user can adjust the size of this range. When the trend of the medium-resolution current macroblock is compared with the trend of the maximum-resolution current macroblock, the larger the range of the trend, the less likely the two are to be the same; the smaller the range, the more likely. Hence, the larger the range is set, the less often the two trends match, so more macroblocks are assigned full mode and the coding quality is better; conversely, a smaller range gives lower coding quality.

e) Take the best coding mode judged in step d) as the best coding mode of the current macroblock.

f) Determine whether the current macroblock is the last macroblock. If so, go to step g); if not, designate the next macroblock as the current macroblock and jump to step b).

g) End.
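The early-termination comparison of step d) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the tuple layout, and the `current_cost` callback are assumptions, and the text does not say under which mode the current macroblock's signal-to-noise distortion ratio is measured, so the sketch assumes it is measured under the candidate mode being tested. All costs are assumed positive.

```python
def early_mode_decision(candidates, current_cost, lower_cost):
    """Pick a coding mode with the early-termination test of step d).

    candidates:   list of (mode, neighbour_cost, neighbour_lower_cost) tuples,
                  already sorted in the order produced by step c).
    current_cost: callable mapping a mode to the current macroblock's cost
                  under that mode (an assumption of this sketch).
    lower_cost:   cost of the next-lower-resolution macroblock that
                  corresponds to the current macroblock.
    """
    if not candidates:
        raise ValueError("at least one candidate mode is required")
    for mode, neighbour_cost, neighbour_lower_cost in candidates:
        # Ratio between the neighbour and its lower-resolution counterpart,
        # e.g. RDcost(EB_ET) / RDcost(BB_ET) in the Figure 3 example.
        ratio = neighbour_cost / neighbour_lower_cost
        # Scaled cost of the current macroblock versus its lower-layer cost.
        if current_cost(mode) * ratio < lower_cost:
            return mode  # stop early; remaining modes are not evaluated
    # No candidate passed the test: fall back to the last mode in the order.
    return candidates[-1][0]


# Usage with made-up numbers.
costs = {"16x16": 60.0, "8x8": 30.0}
ranked = [("16x16", 200.0, 100.0), ("8x8", 150.0, 100.0)]
chosen = early_mode_decision(ranked, costs.__getitem__, 100.0)
```

In this usage example the first candidate gives 60 * 2.0 = 120, which is not below 100, so the search moves on; the second gives 30 * 1.5 = 45, which is below 100, so `chosen` is "8x8".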

The neighboring macroblocks in steps c) and d1) are the macroblocks in at least two of four positions relative to the current macroblock: left, upper-left, above, and upper-right. In this embodiment, step c) uses the macroblocks to the left of, above, and to the upper-right of the current macroblock, while step d1) uses the macroblocks to the left of and above it. Using more neighboring macroblocks makes the computation more complex but improves coding quality; using fewer makes the computation simpler but lowers coding quality.
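The tallying and sorting of step c) over these neighbor positions might look like the sketch below. The names, the dictionary of already-decided modes, and the coordinate convention are hypothetical. The text does not say how ties between equally frequent modes are broken or how missing neighbors at picture edges are handled; this sketch keeps first-seen order for ties and simply skips missing neighbors.

```python
from collections import Counter

# Neighbour offsets (dx, dy) in macroblock units; y grows downward.
NEIGHBOUR_OFFSETS = {
    "left": (-1, 0),
    "upper_left": (-1, -1),
    "above": (0, -1),
    "upper_right": (1, -1),
}


def rank_candidate_modes(best_modes, x, y, lower_layer_mode,
                         positions=("left", "above", "upper_right")):
    """Return the referenced coding modes, most frequent first (step c).

    best_modes:       dict mapping (x, y) to the best mode of macroblocks
                      already decided in the same layer.
    lower_layer_mode: best mode of the corresponding next-lower-resolution
                      macroblock.
    positions:        which neighbours to consult; the default mirrors the
                      embodiment (left, above, upper-right).
    """
    referenced = [lower_layer_mode]
    for name in positions:
        dx, dy = NEIGHBOUR_OFFSETS[name]
        mode = best_modes.get((x + dx, y + dy))
        if mode is not None:  # neighbour outside the picture or undecided
            referenced.append(mode)
    counts = Counter(referenced)
    # most_common() sorts by count, descending; ties keep first-seen order.
    return [mode for mode, _ in counts.most_common()]


# Usage: macroblock (1, 1) with three decided neighbours.
decided = {(0, 1): "16x16", (1, 0): "16x16", (2, 0): "8x8"}
order = rank_candidate_modes(decided, 1, 1, "16x8")
```

Here "16x16" occurs twice and the other two modes once each, so `order` is `["16x16", "16x8", "8x8"]`.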

Note that step d1) is not mandatory. Without step d1), the slight errors produced by the early prediction of the best coding mode only lower the coding quality somewhat, even after they propagate across resolutions, and this can still satisfy applications with lower picture-quality requirements. In other words, the present invention does not require step d1).
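For readers who do include step d1), the trend computation can be sketched as below. The function names are illustrative. The text does not define exactly how the adjustable range enters the comparison of the two trends, so the comparison rule is passed in as a callable; the `same_side_of_one` predicate is purely a placeholder for this example and is not taken from the source.

```python
def rd_trend(neighbour_costs, current_cost):
    """Average of the neighbours' costs divided by the current cost (step d1)."""
    average = sum(neighbour_costs) / len(neighbour_costs)
    return average / current_cost


def next_block_needs_full_mode(mid_trend, max_trend, trends_agree):
    """Return True when the two trends differ, meaning the macroblock that
    follows the maximum-resolution current macroblock gets full mode."""
    return not trends_agree(mid_trend, max_trend)


def same_side_of_one(a, b):
    """Placeholder comparison rule (an assumption): two trends count as the
    same when both are at least 1.0 or both are below 1.0."""
    return (a >= 1.0) == (b >= 1.0)


# Usage with made-up costs for the left and above neighbours.
mid_trend = rd_trend([100.0, 140.0], 80.0)   # medium-resolution layer
max_trend = rd_trend([90.0, 110.0], 200.0)   # maximum-resolution layer
force_full = next_block_needs_full_mode(mid_trend, max_trend, same_side_of_one)
```

With these numbers the medium-resolution trend is 1.5 and the maximum-resolution trend is 0.5, so under the placeholder rule the trends differ and `force_full` is `True`.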

As the above steps show, the present invention performs coding-mode prediction on each macroblock in sequence. It refers mainly to the best coding modes of the neighboring macroblocks, sorts them, and then performs the early prediction of the best coding mode. The best coding mode of the current macroblock can thus be predicted early, and the computation for the remaining, unnecessary coding-mode predictions is skipped, which reduces computational complexity. Moreover, because the best coding mode of the current macroblock is predicted early, picture quality is preserved rather than degraded.

Figure 1 is a flow chart of a preferred embodiment of the present invention.

Figure 2 is a schematic diagram of a preferred embodiment of the present invention, showing the referencing of neighboring macroblocks.

Figure 3 is a schematic diagram of a preferred embodiment of the present invention, showing the comparison used for early prediction of the best coding mode.

Figure 4 is a schematic diagram of a preferred embodiment of the present invention, showing how the signal-to-noise distortion ratio trend is obtained.

Claims

A fast mode prediction method for scalable multimedia video coding, the scalable multimedia video coding having pictures of three different resolutions, namely a base layer of minimum resolution, a first enhancement layer of medium resolution, and a second enhancement layer of maximum resolution, the method comprising the following steps:
a) starting from the minimum-resolution picture, performing coding-mode prediction on each macroblock in sequence; after all macroblocks of the minimum-resolution picture have completed coding-mode prediction, performing coding-mode prediction on the macroblocks of the medium-resolution picture in sequence, and finally on the macroblocks of the maximum-resolution picture in sequence;
b) defining the macroblock currently being processed as a current macroblock and determining whether the current macroblock is at the minimum resolution; if so, running full mode directly, taking full mode as the best coding mode of the current macroblock, and jumping to step f); if not, proceeding to step c); wherein the best coding mode is the mode with the smallest signal-to-noise distortion ratio;
c) referring to the best coding mode of the next-lower-resolution macroblock corresponding to the current macroblock and, using a reference-neighboring-block-mode technique, to the best coding modes of the neighboring macroblocks at the same resolution as the current macroblock; tallying the referenced coding modes, counting the occurrences of each coding mode, and sorting the coding modes by number of occurrences from largest to smallest;
d) performing early prediction of the best coding mode by working through the neighboring macroblocks one by one in the order given by step c); for each neighboring macroblock, computing the ratio between its signal-to-noise distortion ratio and that of its corresponding next-lower-resolution macroblock, multiplying the signal-to-noise distortion ratio of the current macroblock by this ratio, and comparing the product with the signal-to-noise distortion ratio of the next-lower-resolution macroblock corresponding to the current macroblock; if the former is smaller than the latter, judging the coding mode being compared to be the best coding mode and stopping the comparisons for the remaining coding modes; if the former is greater than or equal to the latter, continuing with the next neighboring macroblock in order; if no comparison, through the last coding mode, yields a value smaller than the signal-to-noise distortion ratio of the next-lower-resolution macroblock, taking the last coding mode in the sort order as the best coding mode;
e) taking the best coding mode judged in step d) as the best coding mode of the current macroblock;
f) determining whether the current macroblock is the last macroblock; if so, going to step g); if not, designating the next macroblock as the current macroblock and jumping to step b); and
g) ending.

The fast mode prediction method for scalable multimedia video coding according to claim 1, further comprising: step d1) dynamically adding full mode, wherein, when coding-mode prediction reaches a macroblock of the medium-resolution picture, the signal-to-noise distortion ratios of the same-resolution neighboring macroblocks of the current macroblock are averaged to obtain an average, and the average is divided by the signal-to-noise distortion ratio of the current macroblock to obtain the signal-to-noise distortion ratio trend of the medium-resolution current macroblock, which is stored; when coding-mode prediction reaches a macroblock of the maximum-resolution picture, the signal-to-noise distortion ratios of the same-resolution neighboring macroblocks of the current macroblock are averaged to obtain an average, and the average is divided by the signal-to-noise distortion ratio of the current macroblock to obtain the signal-to-noise distortion ratio trend of the maximum-resolution current macroblock; the trend of the medium-resolution current macroblock is compared with the trend of the maximum-resolution current macroblock; if they differ, the best coding mode of the macroblock following the maximum-resolution current macroblock is set to full mode; if they are the same, the subsequent steps continue.

The fast mode prediction method for scalable multimedia video coding according to claim 2, wherein the signal-to-noise distortion ratio trend has a certain range, and, when the trend of the medium-resolution current macroblock is compared with the trend of the maximum-resolution current macroblock, the larger the range of the trend, the less likely the two are to be the same, and the smaller the range, the more likely.

The fast mode prediction method for scalable multimedia video coding according to claim 2, wherein the neighboring macroblocks in steps c) and d1) are the macroblocks in at least two of four positions relative to the current macroblock: left, upper-left, above, and upper-right.

The fast mode prediction method for scalable multimedia video coding according to claim 1, wherein the neighboring macroblocks in step c) are the macroblocks in at least two of four positions relative to the current macroblock: left, upper-left, above, and upper-right.