JP4336402B2

JP4336402B2 - Image processing apparatus and image processing method

Info

Publication number: JP4336402B2
Application number: JP25415798A
Authority: JP
Inventors: 章佳浜中
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1998-09-08
Filing date: 1998-09-08
Publication date: 2009-09-30
Anticipated expiration: 2018-09-08
Also published as: JP2000092485A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像信号をディジタル化して得られた画像データの圧縮処理、及び、圧縮処理された画像データの伸張処理に関するものである。
【０００２】
【従来の技術】
従来より、音声や画像の符号化方式についての国際標準として、ＪＰＥＧ、Ｈ．２６１、ＪＰＥＧとＨ．２６１を改良したＭＰＥＧ等が知られている。そして、音声や画像を統合的に扱うマルチメディア時代と呼ばれる現在では、ＭＰＥＧを改良したＭＰＥＧ１、さらにＭＰＥＧ１を改良したＭＰＥＧ２が多く使用されている。
【０００３】
ここで、ＭＰＥＧ２は、高画質化の要求により進められた動画像符号化標準であり、次のような特徴がある。
（１）蓄積メディアだけでなく、通信や放送メディアへの適用も考慮されている。
（２）現行テレビ品質以上の高品質画像を対象とし、高解像度テレビ（ＨＤＴＶ：High Definition Television）品質への拡張可能なこと。
（３）ＭＰＥＧ１やＨ．２６１とは異なり、順次走査（ノンインターレース）だけでなく、飛び越し走査（インターレース）画像も行える符号化を行うこと。
（４）分解能可変性（スケーラビリティ：Scalability ）をもつこと。
（５）ＭＰＥＧ２デーコーダは、ＭＰＥＧ１のビット・ストリームもデコードできること。すなわち、下方互換性を備えていること。
【０００４】
これらの特徴のうち、特に、（４）のスケーラビリティ機能は、ＭＰＥＧ２で新たに導入された一つであり、大きく３つに分類される。これらは、空間スケーラビリティ（Spatial Scalability ）、時間スケーラビリティ（Temporal Scalability）、及び信号対雑音比（ＳＮＲ：Singnal to Noise Radio）スケーラビリティ（ＳＮＲ Scalability）である。以下、各々のスケーラビリティについての概要を説明する。
【０００５】
（空間スケーラビリティ）
図１３は、空間スケーラビリティの符号化概要を示す図である。時間解像度が小さいレイヤを基本レイヤ（Base Layer）と呼び、大きいレイヤを高位レイヤ（Enhancement Layer ）と呼ぶ。
基本レイヤとは、原画像に対して、空間的にある一定の比率で間引き（サブ・サンプリング）を施し、空間的解像度（画質）を低下させる代わりに、１フレーム当たりの符号量削減したもの、すなわち空間解像度的に低画質、低符号量のレイヤのことである。この基本レイヤでは、各フレーム間に限定（元の高画質画像を対象とせず、基本レイヤ中の画像に限定）した通常のＭＰＥＧ２で符号化する。
これに対して、高位レイヤとは、空間解像度的に高画質、高符号量のレイヤのことであり、この高位レイヤでは、基本レイヤの画像をアップ・サンプル（低解像度画面の画素間に、平均値等の画素を付加し、高解像度画面を作ること）して高位レイヤと同じ大きさの画像（基本レイヤの拡大画像：Expanded Base Layer ）をつくり出し、高位レイヤ中の画像からの予測だけでなく、アップ・サンプルされた拡大画像からも予測して、符号化する。そして、このようにして符号化された高位レイヤの画像を復号すると、空間的に原画像と同一サイズとなり、画質については、圧縮率に依存したものとなる。
このような空間スケーラビリティを用いると、２つの画像シーケンスを個々に符号化して送る場合より、効率良く２つの画像シーケンスの符号化が行える。また、通常のテレビジョン放送とＨＤＴＶ放送を同時に送り、受信側の性能によって画像を選択する、といったことも可能となる。
【０００６】
（時間スケーラビリティ）
図１４は、時間スケーラビリティの符号化概要を示す図である。時間解像度が小さいレイヤを基本レイヤ（Base Layer）と呼び、大きいレイヤを高位レイヤ（Enhancement Layer ）と呼ぶ。
基本レイヤとは、原画像の時間的解像度（フレームレート）に対して、ある一定の割合でフレーム単位の間引きを施し、時間的解像度を低下させる代わりに、伝送する符号量を削減したもの、すなわち時間解像度的に低画質、低符号量のレイヤのことである。この基本レイヤでは、各フレーム間に限定（元の高画質画像を対象とせず、基本レイヤ中の時間的に現在、過去、未来の各フレームに限定）した通常のＭＰＥＧ２で符号化する。
これに対して、高位レイヤとは、時間解像度的に高画質、高符号量のレイヤのことであり、この高位レイヤでは、高位レイヤ中でＩ、Ｐ、Ｂピクチャの使うだけでなく、基本レイヤ中の画像を使って予測して符号化する。そして、このようにして符号化された高位レイヤの画像を復号すると、その画像のフレームレートは、原画像と同一サイズとなり、画質については、圧縮率に依存したものとなる。
このような時間スケーラビリティを用いると、例えば、３０Ｈｚ順次走査の画像と、６０Ｈｚ順次走査の画像とを同時に効率良く送ることができる。また、インターレースと順次走査の組合せも可能となる。
尚、時間スケーラビリティについては、将来のＭＰＥＧ２の拡張のためにあるものであり、現在のところ使用されていない（”Reserved”の扱いとなっている）。
【０００７】
（ＳＮＲスケーラビリティ）
図１５は、ＳＮＲスケーラビリティの符号化概要を示す図である。画質が低いレイヤを基本レイヤ（Base Layer）と呼び、画質が高いレイヤを高位レイヤ（Enhancement Layer ）と呼ぶ。
基本レイヤとは、原画像を符号化（圧縮）する過程、例えば、「ブロック化→直交変換→量子化→可変長符号化」というような過程において、比較的高い圧縮率（粗い量子化ステップ・サイズ）により低符号量としたもの、すなわち画質（Ｎ／Ｓ）的に低画質、低符号量のレイヤである。この基本レイヤでは、各フレーム間に限定したＭＰＥＧ１又はＭＰＥＧ２（予測符号化）で符号化する。
これに対して、高位レイヤとは、基本レイヤに対して高画質、高符号量のレイヤのことであり、この高位レイヤでは、基本レイヤで符号化された画像を復号し、その復号画像を原画像から引いた誤差分のみを、比較的低い圧縮率（基本レイヤの量子化ステップ・サイズよりも小さな量子化ステップ・サイズ）で、フレーム内で符号化する。尚、ＳＮＲスケーラビリティにおいて、フレーム（フィールド）間予測は行わない。全て、イントラ・フレーム又はフィールド符号化である。
このようなＳＮＲスケーラビリティを用いると、画質の異なった２種類の画像を同時に効率良く、符号化及び復号することができる。
【０００８】
そこで、上述のＭＰＥＧ２を採用した画像の符号化装置として、例えば、図１６に示すようなエンコーダ９００がある。
このエンコーダ９００は、上記図１６に示すように、画像のＲＧＢデータが供給される変換回路９０１と、変換回路９０１の出力が供給される選択回路９０２と、選択回路９０２の出力が供給される第１のデータ生成回路９０４及び第２のデータ生成回路９０３と、選択回路９０２、第１のデータ生成回路９０４、及び第２のデータ生成回路９０３の出力が供給されるブロック化処理回路９０５と、ブロック化処理回路９０５の出力が供給される符号化回路９０６とを備えている。
【０００９】
先ず、変換回路９０１は、各々が８ビットのＲＧＢデータを、各々が８ビットの４：２：０のＹＣｂＣｒデータに変換する。
選択回路９０２は、空間スケーラビリティの使用モード、ＳＮＲスケーラビリティの使用モード、及び通常モード（スケーラビリティの不使用モード）の何れかのモードを選択する。この選択回路９０２における選択は、主として後述するデコーダを使用するユーザから指示される。
【００１０】
選択回路９０２で空間スケーラビリティの使用モードが選択された場合、変換回路９０１で得られたＹＣｂＣｒデータ（各８ビットデータ）は、選択回路９０２を介することで、第１のデータ生成回路９０４に供給される。
第１のデータ生成回路９０４は、供給されたＹＣｂＣｒデータから、対応する基本レイヤ及び高位レイヤのデータを生成して、ブロック化処理回路９０５に供給する。
【００１１】
一方、選択回路９０２でＳＮＲスケーラビリティの使用モードが選択された場合、変換回路９０１で得られたＹＣｂＣｒデータ（各８ビットデータ）は、選択回路９０２を介することで、第２のデータ生成回路９０３に供給される。
第２のデータ生成回路９０３は、供給されたＹＣｂＣｒデータから、対応する基本レイヤ及び高位レイヤのデータを生成して、ブロック化処理回路９０５に供給する。
【００１２】
また、選択回路９０２で通常モードが選択された場合、変換回路９０１で得られたＹＣｂＣｒデータ（各８ビットデータ）は、選択回路９０２を介することで、直接ブロック化処理回路９０５に供給される。
【００１３】
ブロック化処理回路９０５は、供給されたＹＣｂＣｒデータに対して、ＹＣｂＣｒ各々独立して、次のような処理を行う。
すなわち、水平及び垂直方向の各ｎ画素のブロックを単位として、ブロックを構成する。さらに、その各ブロックを、ＹＣｂＣｒ独立に各々ａ個、ｂ個、ｃ個まとめてマクロ・ブロックを構成する。
【００１４】
符号化回路９０６は、ブロック化処理回路９０５で得られた各マクロ・ブロックのデータに対して、マクロ・ブロック単位に所定の符号化処理を行う。例えば、イントラ（Ｉ）又はインター（Ｐ又はＢ）予測符号化方式の選択を行い、予測処理を行った後、直交変換（ＤＣＴ）処理、量子化処理、可変長符号化（ＶＬＣ）処理を行う。
この符号化回路９０６で符号化処理が行われたデータは、ＭＰＥＧ２のビットストリームとして伝送又は記録される。
【００１５】
上述のようなエンコーダ９００に対応した復号装置として、例えば、図１７に示すようなデコーダ９１０がある。
このデコーダ９１０は、基本的にエンコーダ９００の逆処理を行うものであり、上記図１７に示すように、ＭＰＥＧ２のビットストリームが供給されるヘッダ検出回路９１１と、ヘッダ検出回路９１１の出力が供給されるフラグ検出回路９１２及び復号回路９１３と、復号回路９１３の出力が供給される信号選択回路９１４と、信号選択回路９１４の出力が供給される第１のデータ復号回路９１５及び第２のデータ復号回路９１６と、信号選択回路９１４、第１のデータ復号回路９１５、及び第２のデータ復号回路９１６が供給される画質選択回路９１７とを備えている。
【００１６】
先ず、ヘッダ検出回路９１１は、ＭＰＥＧ２のビットストリームに含まれるヘッダ情報を解読し、そのヘッダ情報に応じた制御信号を生成してフラグ検出回路９１２に供給する。
フラグ検出回路９１２は、ヘッダ検出回路９１１からの制御信号から、スケーラビリティに関わるフラグを検出し、そのフラグを復号回路９１３、信号選択回路９１４、及び画質選択回路９１７に各々供給する。
【００１７】
復号回路９１３は、上記図１６の符号化回路９０６に対応したものであり、ヘッダ検出回路９１１を介して供給されたＭＰＥＧ２のビットストリームに対して、フラグ検出回路９１２からのフラグに従った所定の復号処理を行う。
【００１８】
信号選択回路９１４は、復号回路９１３で復号されたデータを構成するマクロ・ブロックを解除し、その後、フラグ検出回路９１２からのフラグに従って、信号経路を選択する。
これにより、信号選択回路１９４でマクロ・ブロックの解除が行われたデータ（復号画像データ）は、空間スケーラビリティが使用されている場合は第１のデータ復号回路９１５へ、ＳＮＲスケーラビリティが使用されている場合は第２のデータ復号回路９１６へ、スケーラビリティが使用されていない場合は直接画質選択回路９１７へ、供給される。
【００１９】
第１のデータ復号回路９１５は、上記図１６の第１のデータ生成回路９０４に対応するものであり、信号選択回路９１４からの復号画像データ（空間スケーラビリティのデータ）を、元のＹＣｂＣｒデータに復号して、画質選択回路９１７に供給する。
【００２０】
第２のデータ復号回路９１６は、上記図１６の第２のデータ生成回路９０３に対応するものであり、信号選択回路９１４からの復号画像データ（ＳＮＲスケーラビリティのデータ）を、元のＹＣｂＣｒデータに復号して、画質選択回路９１７に供給する。
【００２１】
画質選択回路９１７は、フラグ検出回路９１２からのフラグに従って、スケーラビリティが使用されている場合に、第１のデータ復号回路９１５又は第２のデータ復号回路９１６からのＹＣｂＣｒデータでの基本レイヤや高位レイヤ等に対応する画質の選択を行い、その選択に従った画像データを出力する。
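[0022]
[Problems to be solved by the invention]
Between the encoder 900 and the decoder 910 for MPEG2 described above, one of the following three schemes (encoding/decoding schemes) is selected and adopted.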
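[0023]
(First scheme)
Both the encoder and the decoder encode or decode in accordance with a desired data rate. Only one image quality (resolution) is provided.
[0024]
(Second scheme)
The encoder uses spatial scalability (see FIG. 13) to encode two images of different sizes (resolutions), namely the base layer image and the higher layer image, at the same time. The decoder, depending on the performance (data processing capability and the like) of itself or of a display connected to it, restores either a low spatial resolution image from the base layer alone or a high spatial resolution image from both the base layer and the higher layer.
For example, when the original image is an HDTV image signal (a signal of 1440 × 1152 pixels), spatial scalability yields the following two layers.
Base layer: a layer of 720 × 576 pixel images obtained by thinning the original image to 1/2 in both the horizontal and vertical directions (x and y directions).
Higher layer: the original image encoded using, in addition to forward prediction (P) and bidirectional prediction (B), an image obtained by up-sampling the base layer to the same size as the higher layer as a further prediction (comparison) target.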
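[0025]
(Third scheme)
The encoder uses SNR scalability (see FIG. 15) to encode two images with different code amounts (quantization step sizes), namely the base layer image and the higher layer image, at the same time. The decoder, depending on its own performance, restores either a low quality (low bit rate) image from the base layer alone or a high quality (high bit rate) image from both the base layer and the higher layer.
With SNR scalability, two images of different quality can be encoded and decoded at the same time. That is, two different quantization step sizes (quantization coefficients) are applied to the same image to produce versions of that image with different compression ratios. The image with the higher compression ratio (the low bit rate image) is defined as the base layer, and the error obtained by subtracting the image restored from the base layer from the original image is defined as the higher layer. Accordingly, on the decoder side, the sum of the base layer and the higher layer gives the high bit rate, high quality image.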
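[0026]
When the second or third scheme is adopted, the selection circuit 902 of the encoder 900 in FIG. 16 can select either spatial scalability or SNR scalability.
[0027]
However, when spatial scalability is selected in the selection circuit 902, the image size of the base layer is uniquely determined by its relationship to the higher layer, and there is no freedom to choose the base layer image size arbitrarily.
Likewise, when SNR scalability is selected, the frame rate (resolution) of the base layer is uniquely determined by its relationship to the higher layer, and there is no freedom to choose the base layer image size arbitrarily.
[0028]
Consequently, in the conventional encoding apparatus shown in FIG. 16, the code amount in terms of image size, frame rate, and the like could not be selected when the scalability function was used. In other words, the factors directly related to the destination decoding apparatus and to line conditions could not be selected.
In short, when the decoding apparatus (receiving side) receives an image encoded with spatial scalability, SNR scalability, or the like, the available image quality options are limited to
1. a low quality image obtained by decoding only the base layer, and
2. a high quality image obtained by decoding both the base layer and the higher layer,
so the image quality (decoding speed) cannot be selected to suit the performance of the decoding apparatus or the needs of its user.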
[0029]
The present invention has been made in view of this problem, and its object is to provide an image suited to the decoding side.
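[0030]
[Means for solving the problems]
To solve the above problem, the present invention provides an image processing apparatus that encodes image data according to a predetermined coding scheme and outputs it, comprising: receiving means for receiving, from an external apparatus, external information indicating conditions for decoding an image in the external apparatus; encoding means for selecting, from a plurality of scalability modes including at least any two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode, the scalability mode corresponding to the external information received by the receiving means, and for encoding the image data in the selected scalability mode; and transmitting means for transmitting the image data encoded by the encoding means to the external apparatus.
[0031]
To solve the above problem, the present invention also provides an image processing apparatus that decodes image data obtained by encoding according to a predetermined coding scheme, comprising: input means for inputting external information indicating conditions for decoding an image; transmitting means for transmitting the external information to an external apparatus; receiving means for receiving, from the external apparatus, image data encoded in a scalability mode that the external apparatus has selected, from a plurality of scalability modes including at least any two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode, as the mode corresponding to the external information input by the input means; and decoding means for decoding the encoded image data received by the receiving means.
[0032]
To solve the above problem, the present invention also provides an image processing method in an image processing apparatus that encodes image data according to a predetermined coding scheme and outputs it, comprising: a receiving step of receiving, from an external apparatus, external information indicating conditions for decoding an image in the external apparatus; an encoding step of selecting, from a plurality of scalability modes including at least any two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode, the scalability mode corresponding to the external information received in the receiving step, and of encoding the image data in the selected scalability mode; and a transmitting step of transmitting the image data encoded in the encoding step to the external apparatus.
[0033]
To solve the above problem, the present invention also provides an image processing method in an image processing apparatus that decodes image data obtained by encoding according to a predetermined coding scheme, comprising: an input step of inputting external information indicating conditions for decoding an image; a transmitting step of transmitting the external information to an external apparatus; a receiving step of receiving, from the external apparatus, image data encoded in a scalability mode that the external apparatus has selected, from a plurality of scalability modes including at least any two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode, as the mode corresponding to the external information input in the input step; and a decoding step of decoding the encoded image data received in the receiving step.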
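[0050]
[Embodiments of the invention]
Embodiments of the present invention are described below with reference to the drawings.
[0051]
(First embodiment)
[0052]
The present invention is implemented by, for example, an encoding apparatus 100 as shown in FIG. 1.
As shown in FIG. 1, the encoding apparatus 100 includes a conversion circuit 101 to which RGB data of 8 bits each is supplied, a first frame memory 102 to which the output of the conversion circuit 101 is supplied, a first data generation circuit 105 and a first blocking processing circuit 107 to which the output of the first frame memory 102 is supplied, and a first encoding circuit 109 to which the output of the first blocking processing circuit 107 is supplied. The output of the first data generation circuit 105 is also supplied to the first blocking processing circuit 107.
The encoding apparatus 100 also includes a second frame memory 104 to which the output of the conversion circuit 101 is supplied, a second data generation circuit 106 and a second blocking processing circuit 108 to which the output of the second frame memory 104 is supplied, and a second encoding circuit 110 to which the output of the second blocking processing circuit 108 is supplied. The output of the second data generation circuit 106 is also supplied to the second blocking processing circuit 108.
The output of the first data generation circuit 105 is also supplied to the second data generation circuit 106, and the output of the first encoding circuit 109 is also supplied to the second encoding circuit 110.
The encoding apparatus 100 further includes a bitstream generation circuit 111 to which the outputs of the first encoding circuit 109 and the second encoding circuit 110 are supplied, and a control circuit 103 that controls the operation of the entire apparatus.
[0053]
As shown in FIG. 2, the control circuit 103 includes a CPU 701, a program memory 702 that stores, in a form readable by the CPU 701, the various processing programs for controlling the operation of the entire apparatus, and an information detection circuit 703 to which external information 112, described in detail later, is supplied. Information from outside that includes infrastructure information, user request information, and the like (hereinafter "external information") 112 is supplied to the CPU 701.
The CPU 701 reads and executes the processing programs in the program memory 702, thereby realizing the operation of the encoding apparatus 100 described here.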
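[0054]
As shown in FIG. 3, the first data generation circuit 105 includes a first selector 301 to which the output (YCbCr data) of the first frame memory 102 is supplied, a second selector 303 and a sampling circuit 304 to which the output of the first selector 301 is supplied, a frame rate controller 305 to which the output of the second selector 303 is supplied, and a third selector 302 to which the outputs of the frame rate controller 305 and the sampling circuit 304 are supplied. The output of the third selector 302 is supplied to the first blocking processing circuit 107 and the second data generation circuit 106.
[0055]
As shown in FIG. 4, the second data generation circuit 106 includes a first selector 401 to which the output (YCbCr data) of the second frame memory 104 is supplied, a frame memory 405 to which the output of the first data generation circuit 105 (base layer image data) is supplied, a first difference data generation circuit 403 and a second difference data generation circuit 404 to which the outputs of the first selector 401 and the frame memory 405 are supplied, and a second selector 402 to which the outputs of the first difference data generation circuit 403 and the second difference data generation circuit 404 are supplied. The output of the second selector 402 is supplied to the second blocking processing circuit 108.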
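[0056]
In the encoding apparatus 100 described above, the conversion circuit 101 first converts the input image data (RGB data of 8 bits each) into 4:2:0 YCbCr data (8 bits each) and supplies the YCbCr data to both the first frame memory 102 and the second frame memory 104.
The first frame memory 102 and the second frame memory 104 each store the YCbCr data from the conversion circuit 101. Their operation at this time is controlled by the control circuit 103, which operates as follows.
[0057]
In the control circuit 103 (see FIG. 2), the information detection circuit 703 interprets the external information 112 and supplies corresponding control information to the CPU 701.
From the control information supplied by the information detection circuit 703, the CPU 701 obtains mode information indicating whether the scalability function is used in encoding, the type of scalability function when it is used, and various control information for the base layer and the higher layer (for example, the image size (spatial), frame rate (temporal), and compression ratio (SNR) of the base layer). The CPU 701 supplies this information (hereinafter the "encoding control signal") to the first data generation circuit 105, the second data generation circuit 106, and other circuits. At the same time, the CPU 701 supplies the first frame memory 102 and the second frame memory 104 with a read/write (R/W) control signal, in accordance with the control information from the information detection circuit 703, so that their read and write operations are linked to the functions of the first data generation circuit 105 and the second data generation circuit 106.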
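The step in which the CPU 701 turns external information into an encoding control signal can be illustrated with a small sketch. The patent does not state the selection rule, so the rule below, the source parameters, and every name (`ExternalInfo`, `EncodingControl`, `derive_control`) are hypothetical and serve only to make the data flow concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExternalInfo:
    """Hypothetical decoder-side limits carried by external information 112."""
    max_width: Optional[int] = None
    max_frame_rate: Optional[float] = None
    max_bit_rate_kbps: Optional[int] = None


@dataclass
class EncodingControl:
    """Hypothetical encoding control signal: mode plus one base layer parameter."""
    mode: str  # "spatial", "temporal", "snr", or "none"
    base_width: Optional[int] = None
    base_frame_rate: Optional[float] = None
    base_bit_rate_kbps: Optional[int] = None


def derive_control(info, src_width=1440, src_frame_rate=60.0, src_bit_rate_kbps=8000):
    """Pick a scalability mode from the first limit the source exceeds (assumed rule)."""
    if info.max_width is not None and info.max_width < src_width:
        return EncodingControl("spatial", base_width=info.max_width)
    if info.max_frame_rate is not None and info.max_frame_rate < src_frame_rate:
        return EncodingControl("temporal", base_frame_rate=info.max_frame_rate)
    if info.max_bit_rate_kbps is not None and info.max_bit_rate_kbps < src_bit_rate_kbps:
        return EncodingControl("snr", base_bit_rate_kbps=info.max_bit_rate_kbps)
    return EncodingControl("none")


control = derive_control(ExternalInfo(max_frame_rate=30.0))
print(control.mode, control.base_frame_rate)
```

The point of the sketch is that the base layer parameter is taken from the decoder's report rather than fixed by the encoder, which is the freedom the conventional encoder 900 lacks.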
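[0058]
Accordingly, the first frame memory 102 and the second frame memory 104 operate in accordance with the R/W control signal based on the external information 112, and the first data generation circuit 105, the second data generation circuit 106, and the other circuits likewise operate in accordance with the encoding control signal based on the external information 112.
[0059]
The operation of the circuits from the conversion circuit 101 onward is described below for each mode indicated by the external information 112, namely the "spatial scalability mode", the "temporal scalability mode", the "SNR scalability mode", and the "no-scalability mode".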
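[0060]
(Spatial scalability mode)
[0061]
The first frame memory 102 and the second frame memory 104 each read and write the YCbCr data from the conversion circuit 101 in accordance with the R/W control signal from the control circuit 103 (specifically, the CPU 701), which here is a control signal designating the spatial scalability mode based on the external information 112.
The YCbCr data read from the first frame memory 102 and the second frame memory 104 is thus supplied, via the first data generation circuit 105 and the second data generation circuit 106, to the first blocking processing circuit 107 and the second blocking processing circuit 108.
[0062]
At this time, the encoding control signal from the control circuit 103 (a control signal designating the spatial scalability mode based on the external information 112) is also supplied to the first data generation circuit 105 and the second data generation circuit 106, which operate in accordance with it.
[0063]
Specifically, in the first data generation circuit 105 (see FIG. 3), the first selector 301 switches its output destination to the sampling circuit 304 in response to the encoding control signal from the control circuit 103 and outputs the YCbCr data from the first frame memory 102.
The sampling circuit 304 reduces the image represented by the YCbCr data from the first selector 301 in accordance with the sub-sampling size information contained in the encoding control signal, thereby generating base layer image data.
The base layer image data obtained by the sampling circuit 304 is supplied to the third selector 302.
The third selector 302, in response to the encoding control signal from the control circuit 103, switches its output to the output of the sampling circuit 304 (the base layer image data). The base layer image data is therefore supplied to the first blocking processing circuit 107. It is also supplied to the second data generation circuit 106, described below.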
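[0064]
The base layer image data supplied in this way from the first data generation circuit 105 to the first blocking processing circuit 107 is divided into blocks by the first blocking processing circuit 107, subjected to predetermined block-by-block encoding by the first encoding circuit 109, and supplied to the bitstream generation circuit 111.
[0065]
Meanwhile, in the second data generation circuit 106 (see FIG. 4), the first selector 401 switches its output destination to the first difference data generation circuit 403 in response to the encoding control signal from the control circuit 103 and outputs the YCbCr data from the second frame memory 104.
At the same time, the frame memory 405, in response to the encoding control signal from the control circuit 103, supplies the base layer image data from the first data generation circuit 105 to the first difference data generation circuit 403.
The first difference data generation circuit 403, in accordance with the encoding control signal from the control circuit 103, up-samples the base layer image data from the frame memory 405, frame by frame (or field by field), to the same size as the original image (or the higher layer image) and generates difference image data against the higher layer image data.
The difference image data obtained by the first difference data generation circuit 403 is supplied to the second selector 402.
The second selector 402, in response to the encoding control signal from the control circuit 103, switches its output to the output of the first difference data generation circuit 403 (the difference image data). The difference image data is therefore supplied to the second blocking processing circuit 108.
[0066]
The higher layer difference image data supplied in this way from the second data generation circuit 106 to the second blocking processing circuit 108 is divided into blocks by the second blocking processing circuit 108, subjected by the second encoding circuit 110 to predetermined block-by-block encoding independently of the base layer image, and supplied to the bitstream generation circuit 111.
[0067]
The bitstream generation circuit 111 adds a header suited to the intended application (transmission, storage, or the like) to the base layer image data from the first encoding circuit 109 and the higher layer image data (difference image data) from the second encoding circuit 110, combines them into a single bitstream, and outputs the resulting scalable image data bitstream to the outside.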
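The role of the bitstream generation circuit 111 (prefix a header, then carry both layers in one stream) can be sketched as follows. The header layout used here is hypothetical and is not MPEG2 syntax; `MAGIC`, `build_bitstream`, and `parse_bitstream` are illustrative names only.

```python
import struct

MAGIC = b"SCAL"          # hypothetical stream marker, not part of MPEG2
HEADER_FMT = ">4sBII"    # marker, mode id, base layer length, higher layer length


def build_bitstream(mode_id, base_layer, higher_layer):
    """Concatenate header, base layer bytes, and higher layer bytes."""
    header = struct.pack(HEADER_FMT, MAGIC, mode_id, len(base_layer), len(higher_layer))
    return header + base_layer + higher_layer


def parse_bitstream(stream):
    """Split a stream produced by build_bitstream back into its parts."""
    size = struct.calcsize(HEADER_FMT)
    magic, mode_id, base_len, higher_len = struct.unpack(HEADER_FMT, stream[:size])
    if magic != MAGIC:
        raise ValueError("not a stream produced by build_bitstream")
    base_layer = stream[size:size + base_len]
    higher_layer = stream[size + base_len:size + base_len + higher_len]
    return mode_id, base_layer, higher_layer


stream = build_bitstream(1, b"base-bytes", b"diff-bytes")
print(parse_bitstream(stream))
```

Because the layer lengths are in the header, a decoder that wants only the base layer can stop reading after `base_len` bytes and ignore the rest.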
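[0068]
(Temporal scalability mode)
[0069]
In this mode, as in the spatial scalability mode described above, the YCbCr data read from the first frame memory 102 and the second frame memory 104 is supplied, via the first data generation circuit 105 and the second data generation circuit 106, to the first blocking processing circuit 107 and the second blocking processing circuit 108. The operation of the first data generation circuit 105 and the second data generation circuit 106, however, differs from that in the spatial scalability mode.
[0070]
Specifically, in the first data generation circuit 105 (see FIG. 3), the first selector 301 switches its output destination to the second selector 303 in response to the encoding control signal from the control circuit 103 (a control signal designating the temporal scalability mode based on the external information 112) and outputs the YCbCr data from the first frame memory 102.
The second selector 303, in response to the encoding control signal from the control circuit 103, supplies the YCbCr data from the first selector 301 to the frame rate controller 305.
The frame rate controller 305 down-samples the YCbCr data from the second selector 303 in units of frames (lowering the temporal resolution of the image data) in accordance with the frame rate information contained in the encoding control signal, thereby generating base layer image data.
The base layer image data obtained by the frame rate controller 305 is supplied to the third selector 302.
The third selector 302, in response to the encoding control signal from the control circuit 103, switches its output to the output of the frame rate controller 305 (the base layer image data). The base layer image data is therefore supplied to the first blocking processing circuit 107. It is also supplied to the second data generation circuit 106, described below.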
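[0071]
The base layer image data supplied in this way from the first data generation circuit 105 to the first blocking processing circuit 107 is divided into blocks by the first blocking processing circuit 107, subjected to predetermined block-by-block encoding by the first encoding circuit 109, and supplied to the bitstream generation circuit 111.
[0072]
Meanwhile, in the second data generation circuit 106 (see FIG. 4), the first selector 401 switches its output destination to the second difference data generation circuit 404 in response to the encoding control signal from the control circuit 103 and outputs the YCbCr data from the second frame memory 104.
At the same time, the frame memory 405, in response to the encoding control signal from the control circuit 103, supplies the base layer image data from the first data generation circuit 105 to the second difference data generation circuit 404.
The second difference data generation circuit 404, in accordance with the encoding control signal from the control circuit 103, uses the base layer image data from the frame memory 405 as prediction information for the higher layer, referring to it for temporally future and past image data, and generates difference image data as the higher layer.
The difference image data obtained by the second difference data generation circuit 404 is supplied to the second selector 402.
The second selector 402, in response to the encoding control signal from the control circuit 103, switches its output to the output of the second difference data generation circuit 404 (the difference image data). The difference image data is therefore supplied to the second blocking processing circuit 108.
[0073]
The higher layer difference image data supplied in this way from the second data generation circuit 106 to the second blocking processing circuit 108 is divided into blocks by the second blocking processing circuit 108, subjected by the second encoding circuit 110 to predetermined block-by-block encoding independently of the base layer image, and supplied to the bitstream generation circuit 111.
[0074]
As in the spatial scalability mode described above, the bitstream generation circuit 111 adds a header to the base layer image data from the first encoding circuit 109 and the higher layer image data (difference image data) from the second encoding circuit 110, forms a scalable image data bitstream, and outputs it to the outside.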
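[0075]
(SNR scalability mode)
[0076]
The first frame memory 102 and the second frame memory 104 each read and write the YCbCr data from the conversion circuit 101 in accordance with the R/W control signal from the control circuit 103 (specifically, the CPU 701), which here is a control signal designating the SNR scalability mode based on the external information.
In this case, the YCbCr data read from the first frame memory 102 and the second frame memory 104 is supplied directly to the first blocking processing circuit 107 and the second blocking processing circuit 108.
The YCbCr data is therefore divided into blocks by the first blocking processing circuit 107 and the second blocking processing circuit 108 and supplied to the first encoding circuit 109 and the second encoding circuit 110.
[0077]
The first encoding circuit 109, in response to the encoding control signal from the control circuit 103, performs predetermined block-by-block encoding on the YCbCr data from the first blocking processing circuit 107 and generates encoded base layer image data. The encoding is performed so as to reach the predetermined code amount (compression ratio) specified by the encoding control signal.
The encoded base layer image data obtained by the first encoding circuit 109 is supplied to the bitstream generation circuit 111 and is also supplied to the second encoding circuit 110 as reference data for encoding the higher layer image data.
[0078]
The second encoding circuit 110, in response to the encoding control signal from the control circuit 103, uses the base layer image data from the first encoding circuit 109 as prediction information for the higher layer, referring to it for future and past image data, and generates encoded difference image data as the higher layer.
The encoded higher layer difference image data obtained by the second encoding circuit 110 is supplied to the bitstream generation circuit 111.
[0079]
As in the spatial and temporal scalability modes described above, the bitstream generation circuit 111 adds a header to the base layer image data from the first encoding circuit 109 and the higher layer image data (difference image data) from the second encoding circuit 110, forms a scalable image data bitstream, and outputs it to the outside.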
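[0080]
(No-scalability mode)
[0081]
The first frame memory 102 and the second frame memory 104 each read and write the YCbCr data from the conversion circuit 101 in accordance with the R/W control signal from the control circuit 103 (specifically, the CPU 701), which here is a control signal for the no-scalability mode.
In this case, the YCbCr data read from the first frame memory 102 and the second frame memory 104 is supplied directly to the first blocking processing circuit 107 and the second blocking processing circuit 108.
The YCbCr data is therefore divided into blocks by the first blocking processing circuit 107 and the second blocking processing circuit 108, subjected to predetermined block-by-block encoding by the first encoding circuit 109 and the second encoding circuit 110, and supplied to the bitstream generation circuit 111.
[0082]
The bitstream generation circuit 111 adds a header suited to the intended application (transmission, storage, or the like) to the data from the first encoding circuit 109 and the second encoding circuit 110, forms an image data bitstream, and outputs it to the outside.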
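[0083]
(Second embodiment)
[0084]
The present invention is applied to, for example, a decoding apparatus 200 as shown in FIG. 5.
The decoding apparatus 200 corresponds to the encoding apparatus 100 of the first embodiment described above.
That is, the decoding apparatus 200 basically performs the reverse of the processing of the encoding apparatus 100. In particular, information from the user (user information), described later, is input to the decoding apparatus 200. This user information includes various items such as the image quality and the capabilities of the decoding apparatus 200.
Accordingly, when the user of the decoding apparatus 200 inputs the various items of information for decoding, the control circuit 208 generates external output information 212 in accordance with that input, and this external output information 212 is supplied to the encoding apparatus 100 as the external information 112 described above.
[0085]
FIG. 6 shows the internal configuration of the control circuit 208, FIG. 7 shows the internal configuration of the first data decoding circuit 209 of the decoding apparatus 200, and FIG. 8 shows the internal configuration of the second data decoding circuit of the decoding apparatus 200.
[0086]
The user information described above is input by means of the following configuration.
FIG. 9 shows the configuration of a system 240 that has the functions of the decoding apparatus 200 of FIG. 5.
As shown in FIG. 9, the system 240 includes a monitor 241, a personal computer (PC) main unit 242, and a mouse 243, which are connected to one another.
The PC main unit 242 has the functions of the decoding apparatus 200 configured as shown in FIG. 5.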
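[0087]
In the system 240, the monitor 241 first displays a genre selection menu screen for the selectable software (moving images). For example, a selection menu with entries such as "Movies", "Music", "Photo", and "etc." is displayed.
The user operates the mouse 243 to specify the desired software genre on the screen shown on the monitor 241. Specifically, for example, the user places the mouse cursor 244 on the desired genre ("Movies" in FIG. 9) and clicks or double-clicks the mouse 243. The genre ("Movies") is thereby specified.
[0088]
When this operation is finished, a title menu screen listing the individual software titles for the genre selected on the selection menu screen of FIG. 9 ("Movies") is displayed, as shown in FIG. 10. For example, a title menu with entries such as "title-A", "title-B", "title-C", and "title-D" under "Movies" is displayed.
The user operates the mouse 243 to specify the desired title on the screen shown on the monitor 241. Specifically, for example, the user places the mouse cursor 244 on the desired title ("title-A" in FIG. 10) and clicks or double-clicks the mouse 243. The title ("title-A") is thereby specified.
[0089]
When this operation is finished, a condition setting screen for setting the various decoding conditions for the title selected on the title menu screen of FIG. 10 ("title-A") is displayed, as shown in FIG. 11. The following conditions can be set here.
S/N: specify low quality (Low), high quality (High), or the optimum quality for the decoding capability of this system (Auto).
Frame Rate: specify low frame rate (Low), high frame rate (Full), or the optimum frame rate for the decoding capability of this system (Auto).
Full Spec: specify the highest quality (highest code amount) image available on the encoding apparatus side (for example, the encoding apparatus 100 of FIG. 1).
Full Auto: set every condition to the optimum for the decoding capability of this system.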
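How the four settings on the condition screen might be collapsed into the information sent to the encoder can be sketched as follows. The patent does not define how "Auto" is resolved or what the message looks like, so the capability flags (`can_high_sn`, `can_full_rate`), the dictionary format, and the function name are all hypothetical.

```python
def build_external_info(sn="Auto", frame_rate="Auto", full_spec=False, full_auto=False,
                        can_high_sn=True, can_full_rate=True):
    """Resolve the UI settings into concrete S/N and frame rate requests.

    sn:         "Low", "High", or "Auto"
    frame_rate: "Low", "Full", or "Auto"
    full_spec:  request the encoder's best quality regardless of other settings
    full_auto:  treat every condition as "Auto"
    can_*:      assumed decoder capability flags used to resolve "Auto"
    """
    if full_spec:
        return {"S/N": "High", "Frame Rate": "Full"}
    if full_auto:
        sn, frame_rate = "Auto", "Auto"
    if sn == "Auto":
        sn = "High" if can_high_sn else "Low"
    if frame_rate == "Auto":
        frame_rate = "Full" if can_full_rate else "Low"
    return {"S/N": sn, "Frame Rate": frame_rate}


# A decoder that can handle the full frame rate but not the high S/N stream.
print(build_external_info(full_auto=True, can_high_sn=False))
```

Resolving "Auto" on the decoding side keeps the encoder free of any knowledge about the decoder's hardware; it only sees concrete requests.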
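[0090]
As shown in FIG. 11, the user places the mouse cursor 244 on the condition to be set ("S/N" in FIG. 11) and clicks or double-clicks the mouse 243. The detailed options for setting "S/N" are then displayed, as shown in FIG. 12, namely "Low", "High", and "Auto".
From these options, the user places the mouse cursor 244 on the one to be set for "S/N" ("Auto" in FIG. 12) and clicks or double-clicks the mouse 243. In FIG. 12, "S/N" is thereby set to "Auto" (the optimum quality for the decoding capability of this system).
[0091]
The information on the conditions set on the screen in this way is supplied to the encoding apparatus 100 of the first embodiment as the external information 112.
On receiving it, the encoding apparatus 100 interprets the external information 112 as described above, selects the most suitable scalability, determines the settings for that scalability (image size, compression ratio, and the like), performs the series of encoding operations, and outputs the result to the system 240 (decoding apparatus 200) of FIG. 9.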
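[0092]
It goes without saying that the object of the present invention is also achieved by supplying a system or apparatus with a storage medium storing software program code that realizes the functions of the host and terminal of each embodiment described above, and by having the computer (or CPU or MPU) of that system or apparatus read and execute the program code stored in the storage medium.
In this case, the program code read from the storage medium itself realizes the functions of each embodiment, and the storage medium storing the program code constitutes the present invention.
[0093]
As the storage medium for supplying the program code, a ROM, floppy disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, magnetic tape, non-volatile memory card, or the like can be used.
[0094]
It also goes without saying that the invention covers not only the case where the functions of each embodiment are realized by the computer executing the program code it has read, but also the case where an OS or the like running on the computer performs part or all of the actual processing on the basis of the instructions of the program code, and the functions of each embodiment are realized by that processing.
[0095]
It further goes without saying that the invention covers the case where the program code read from the storage medium is written to a memory provided on a function expansion board inserted in the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on that function expansion board or function expansion unit performs part or all of the actual processing on the basis of the instructions of the program code, and the functions of each embodiment are realized by that processing.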
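According to the embodiments described above, the encoding side encodes image data in accordance with information about the decoding side (the processing capability of the decoding side, and/or the resolution or code amount specified by the user on the decoding side, and/or the capacity of the transmission line between the encoding side and the decoding side, and the like).
More specifically, the encoding side is provided with means for accepting performance (processing capability) information about the destination decoding side, and/or transmission line information, and/or the user's request (the resolution or code amount specified by the user), and the encoding side selects a scalable image that matches this information, encodes it in real time, and sends it out. The decoding side, for its part, is provided with means for sending to the encoding side its own processing capability information, the user's request entered through a user interface, and the like.
With this configuration, whatever image the decoding side (the user on the decoding side) requests, encoded image data matching that image can be given to the decoding side. Encoded image data that matches (is compatible with) the capacity of the line used for transmission and the performance of the decoding side can also be provided. The decoding side can therefore obtain a good decoded image.
If an interactive communication function is provided between the encoding side and the decoding side, encoded image data suited to whatever conditions exist between the two sides can be given to the decoding side. This brings benefits such as compatibility in a broad sense between the encoding side and the decoding side and effective use of line capacity.
Furthermore, if the decoding side is provided with a user interface, the interactive communication function can be used to provide encoded image data that adapts flexibly to the user's wishes.
[0096]
[Effects of the invention]
As described above, the present invention makes it possible to provide an image suited to the decoding side.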
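[Brief description of the drawings]
[FIG. 1] A block diagram showing the configuration of an encoding apparatus to which the present invention is applied in the first embodiment.
[FIG. 2] A block diagram showing the internal configuration of the control circuit of the encoding apparatus.
[FIG. 3] A block diagram showing the internal configuration of the first data generation circuit of the encoding apparatus.
[FIG. 4] A block diagram showing the internal configuration of the second data generation circuit of the encoding apparatus.
[FIG. 5] A block diagram showing the configuration of a decoding apparatus to which the present invention is applied in the second embodiment.
[FIG. 6] A block diagram showing the internal configuration of the control circuit of the decoding apparatus.
[FIG. 7] A block diagram showing the internal configuration of the first data decoding circuit of the decoding apparatus.
[FIG. 8] A block diagram showing the internal configuration of the second data decoding circuit of the decoding apparatus.
[FIG. 9] A diagram for explaining a system that has the functions of the decoding apparatus.
[FIG. 10] A diagram for explaining the operation of selecting a title from the genre menu on the monitor screen of the system.
[FIG. 11] A diagram for explaining the condition setting screen displayed when the title is selected.
[FIG. 12] A diagram for explaining the operation of setting a desired value for the condition item selected on the condition setting screen.
[FIG. 13] A diagram for explaining spatial scalability.
[FIG. 14] A diagram for explaining temporal scalability.
[FIG. 15] A diagram for explaining SNR scalability.
[FIG. 16] A block diagram showing the configuration of a conventional encoding apparatus.
[FIG. 17] A block diagram showing the configuration of a conventional decoding apparatus.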
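[Description of reference numerals]
100 Encoding apparatus
101 Conversion circuit
102 First frame memory
103 Control circuit
104 Second frame memory
105 First data generation circuit
106 Second data generation circuit
107 First blocking processing circuit
108 Second blocking processing circuit
109 First encoding circuit
110 Second encoding circuit
111 Bitstream generation circuit
112 External information
[0001]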
[Technical field of the invention]
The present invention relates to compression of image data obtained by digitizing an image signal, and to decompression of the compressed image data.
[0002]
[Prior art]
Conventionally, JPEG, H.261, and MPEG, which improves on JPEG and H.261, have been known as international standards for audio and image coding. Today, in the so-called multimedia era in which audio and images are handled in an integrated manner, MPEG1, an improvement on MPEG, and MPEG2, a further improvement on MPEG1, are widely used.
[0003]
Here, MPEG2 is a moving picture coding standard developed in response to demand for higher image quality, and has the following features.
(1) It is designed for communication and broadcast media as well as storage media.
(2) It targets images of higher quality than current television and can be extended to high definition television (HDTV) quality.
(3) Unlike MPEG1 and H.261, it can encode interlaced images as well as progressive (non-interlaced) images.
(4) It provides variable resolution (scalability).
(5) An MPEG2 decoder can also decode an MPEG1 bitstream; that is, it is backward compatible.
[0004]
Of these features, the scalability function in (4) is one that was newly introduced in MPEG2, and it falls into three broad types: spatial scalability, temporal scalability, and signal-to-noise ratio (SNR) scalability. An overview of each is given below.
[0005]
(Spatial scalability)
FIG. 13 is a diagram outlining spatial scalability coding. The layer with the lower spatial resolution is called the base layer, and the layer with the higher spatial resolution is called the higher layer (enhancement layer).
The base layer is obtained by thinning (sub-sampling) the original image spatially at a fixed ratio, lowering the spatial resolution (image quality) in exchange for a smaller code amount per frame. That is, it is a layer of low image quality and low code amount in terms of spatial resolution. The base layer is encoded with ordinary MPEG2, with prediction confined to frames within the base layer (the original high quality image is not referenced).
The higher layer, by contrast, is a layer of high image quality and high code amount in terms of spatial resolution. In the higher layer, the base layer image is up-sampled (pixels such as averages are inserted between the pixels of the low resolution picture to create a high resolution picture) to produce an image of the same size as the higher layer (the expanded base layer), and encoding uses prediction not only from images in the higher layer but also from this up-sampled image. When the higher layer image encoded in this way is decoded, it has the same spatial size as the original image, and its quality depends on the compression ratio.
With spatial scalability, two image sequences can be encoded more efficiently than by encoding and sending them separately. It also becomes possible, for example, to send an ordinary television broadcast and an HDTV broadcast together and let the receiving side select an image according to its capability.
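The layering described above can be illustrated on a tiny grayscale image. This is a minimal sketch, assuming 2:1 sub-sampling and nearest-neighbour up-sampling (the text mentions averaging, which would work equally well); it omits motion prediction and compression entirely, and all function names are illustrative.

```python
def downsample2(img):
    """Base layer: keep every second pixel in both directions (even dimensions assumed)."""
    return [row[::2] for row in img[::2]]


def upsample2(img):
    """Expanded base layer: repeat each pixel as a 2x2 patch."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def spatial_layers(img):
    """Return the base layer and the higher layer residual against the expanded base."""
    base = downsample2(img)
    predicted = upsample2(base)
    residual = [[o - p for o, p in zip(orow, prow)] for orow, prow in zip(img, predicted)]
    return base, residual


def reconstruct(base, residual):
    """Decoder side: expanded base layer plus residual gives the full-size image."""
    predicted = upsample2(base)
    return [[p + r for p, r in zip(prow, rrow)] for prow, rrow in zip(predicted, residual)]


image = [[10, 12, 20, 22],
         [11, 13, 21, 23],
         [30, 32, 40, 42],
         [31, 33, 41, 43]]
base, residual = spatial_layers(image)
print(base)
print(reconstruct(base, residual) == image)
```

A receiver with limited capability would display `base` alone; one with full capability would add the residual to the expanded base layer.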
[0006]
(Temporal scalability)
FIG. 14 is a diagram outlining temporal scalability coding. The layer with the lower temporal resolution is called the base layer, and the layer with the higher temporal resolution is called the higher layer (enhancement layer).
The base layer is obtained by thinning out frames at a fixed ratio relative to the temporal resolution (frame rate) of the original image, lowering the temporal resolution in exchange for a smaller transmitted code amount. That is, it is a layer of low image quality and low code amount in terms of temporal resolution. The base layer is encoded with ordinary MPEG2, with prediction confined to the temporally current, past, and future frames within the base layer (the original high quality image is not referenced).
The higher layer, by contrast, is a layer of high image quality and high code amount in terms of temporal resolution. In the higher layer, encoding uses prediction not only from the I, P, and B pictures within the higher layer but also from images in the base layer. When the higher layer image encoded in this way is decoded, its frame rate is the same as that of the original image, and its quality depends on the compression ratio.
With temporal scalability, for example, a 30 Hz progressive image and a 60 Hz progressive image can be sent together efficiently. Combinations of interlaced and progressive scanning are also possible.
Note that temporal scalability is provided for future extension of MPEG2 and is not used at present (it is treated as "Reserved").
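The frame thinning behind temporal scalability can be sketched as a simple partition of a frame sequence. This is only an illustration under stated assumptions: frames are opaque labels, the ratio is 2:1, and the inter-layer prediction that a real encoder would apply to the higher layer frames is left out. The function names are illustrative.

```python
def temporal_layers(frames, keep_every=2):
    """Base layer keeps every `keep_every`-th frame; the higher layer holds the rest."""
    base = frames[::keep_every]
    higher = [f for i, f in enumerate(frames) if i % keep_every != 0]
    return base, higher


def merge_layers(base, higher, keep_every=2):
    """Decoder side: interleave both layers to restore the original frame order."""
    base_iter, higher_iter = iter(base), iter(higher)
    merged = []
    for i in range(len(base) + len(higher)):
        merged.append(next(base_iter) if i % keep_every == 0 else next(higher_iter))
    return merged


frames = [f"frame{i}" for i in range(6)]      # e.g. a 60 Hz source
base, higher = temporal_layers(frames)        # base alone plays at 30 Hz
print(base)
print(merge_layers(base, higher) == frames)
```

Decoding `base` alone gives the lower frame rate; merging both layers restores the original rate.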
[0007]
(SNR scalability)
FIG. 15 is a diagram outlining SNR scalability coding. The layer with the lower image quality is called the base layer, and the layer with the higher image quality is called the higher layer (enhancement layer).
The base layer is produced, in the course of encoding (compressing) the original image through steps such as "blocking → orthogonal transform → quantization → variable length coding", by using a relatively high compression ratio (a coarse quantization step size) to obtain a low code amount. That is, it is a layer of low image quality and low code amount in terms of S/N. The base layer is encoded with MPEG1 or MPEG2 (predictive coding), with prediction confined to frames within the base layer.
The higher layer, by contrast, is a layer of higher image quality and higher code amount than the base layer. In the higher layer, the image encoded in the base layer is decoded, and only the error obtained by subtracting that decoded image from the original image is encoded, within the frame, at a relatively low compression ratio (a quantization step size smaller than that of the base layer). Note that inter-frame (inter-field) prediction is not performed in SNR scalability; all coding is intra-frame or intra-field.
With SNR scalability, two images of different quality can be encoded and decoded together efficiently.
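The two-step quantization behind SNR scalability can be shown on a short list of coefficient values. This is a minimal sketch: the step sizes 16 and 4 and the sample coefficients are arbitrary assumptions, and the transform and variable length coding stages are omitted.

```python
def quantize(values, step):
    """Map each value to the nearest multiple of `step`, expressed as an integer level."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    return [level * step for level in levels]


def snr_layers(coeffs, base_step=16, higher_step=4):
    """Base layer: coarse quantization. Higher layer: finely quantized residual."""
    base_levels = quantize(coeffs, base_step)
    residual = [c - b for c, b in zip(coeffs, dequantize(base_levels, base_step))]
    higher_levels = quantize(residual, higher_step)
    return base_levels, higher_levels


def decode(base_levels, higher_levels=None, base_step=16, higher_step=4):
    """Decode the base layer alone, or add the higher layer when it is available."""
    recon = dequantize(base_levels, base_step)
    if higher_levels is not None:
        recon = [b + e for b, e in zip(recon, dequantize(higher_levels, higher_step))]
    return recon


coeffs = [100, -37, 22, 5]
base_levels, higher_levels = snr_layers(coeffs)
low = decode(base_levels)
high = decode(base_levels, higher_levels)
print(low, high)
```

Adding the higher layer moves every reconstructed value closer to the original, which is the quality gain the decoder obtains by summing the two layers.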
[0008]
An image encoding apparatus that adopts MPEG2 as described above is, for example, the encoder 900 shown in FIG. 16.
As shown in FIG. 16, the encoder 900 includes a conversion circuit 901 to which RGB image data is supplied, a selection circuit 902 to which the output of the conversion circuit 901 is supplied, a first data generation circuit 904 and a second data generation circuit 903 to which the output of the selection circuit 902 is supplied, a blocking processing circuit 905 to which the outputs of the selection circuit 902, the first data generation circuit 904, and the second data generation circuit 903 are supplied, and an encoding circuit 906 to which the output of the blocking processing circuit 905 is supplied.
[0009]
First, the conversion circuit 901 converts RGB data of 8 bits each into 4:2:0 YCbCr data of 8 bits each.
The selection circuit 902 selects one of three modes: the spatial scalability mode, the SNR scalability mode, or the normal mode (scalability not used). The selection in the selection circuit 902 is made mainly on instruction from the user of the decoder described later.
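The conversion performed by the conversion circuit 901 can be sketched as follows. The patent does not give the conversion matrix, so the full-range ITU-R BT.601 coefficients used here are an assumption, as is the choice to average each 2×2 group of chroma samples for 4:2:0; function names are illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (full-range BT.601, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))


def to_420(rgb_image):
    """Return (Y, Cb, Cr) planes; chroma is averaged over 2x2 blocks (even size assumed)."""
    height, width = len(rgb_image), len(rgb_image[0])
    ycc = [[rgb_to_ycbcr(*pixel) for pixel in row] for row in rgb_image]
    y_plane = [[p[0] for p in row] for row in ycc]

    def subsample(channel):
        return [[round(sum(ycc[y + dy][x + dx][channel]
                           for dy in (0, 1) for dx in (0, 1)) / 4)
                 for x in range(0, width, 2)]
                for y in range(0, height, 2)]

    return y_plane, subsample(1), subsample(2)


gray = [[(128, 128, 128)] * 4 for _ in range(4)]
y_plane, cb_plane, cr_plane = to_420(gray)
print(len(y_plane), len(y_plane[0]), len(cb_plane), len(cb_plane[0]))
```

In 4:2:0 each chroma plane has half the width and half the height of the luma plane, so a 4×4 image yields a 4×4 Y plane and 2×2 Cb and Cr planes.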
[0010]
When the spatial scalability mode is selected by the selection circuit 902, the YCbCr data (8 bits each) obtained by the conversion circuit 901 is supplied to the first data generation circuit 904 via the selection circuit 902.
The first data generation circuit 904 generates the corresponding base layer and higher layer data from the supplied YCbCr data and supplies the data to the blocking processing circuit 905.
[0011]
When the SNR scalability mode is selected by the selection circuit 902, on the other hand, the YCbCr data (8 bits each) obtained by the conversion circuit 901 is supplied to the second data generation circuit 903 via the selection circuit 902.
The second data generation circuit 903 generates the corresponding base layer and higher layer data from the supplied YCbCr data and supplies the data to the blocking processing circuit 905.
[0012]
When the normal mode is selected by the selection circuit 902, the YCbCr data (8 bits each) obtained by the conversion circuit 901 is supplied directly to the blocking processing circuit 905 via the selection circuit 902.
[0013]
The blocking processing circuit 905 performs the following processing on the supplied YCbCr data, independently for each of Y, Cb, and Cr.
That is, it forms blocks of n pixels in each of the horizontal and vertical directions. It then groups a, b, and c of these blocks, for Y, Cb, and Cr respectively, to form a macroblock.
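The block and macroblock grouping can be sketched as follows. The text leaves n, a, b, and c open; the sketch assumes n = 8 and the 4:2:0 arrangement a = 4, b = 1, c = 1 (four luma blocks with one Cb and one Cr block), and assumes plane dimensions that are exact multiples of the macroblock size. Function names are illustrative.

```python
def block_at(plane, top, left, n):
    """Cut the n x n block whose top-left corner is (top, left)."""
    return [row[left:left + n] for row in plane[top:top + n]]


def macroblocks_420(y_plane, cb_plane, cr_plane, n=8):
    """Group 2x2 luma blocks with the co-located Cb and Cr block (a=4, b=1, c=1 assumed)."""
    result = []
    for top in range(0, len(y_plane), 2 * n):
        for left in range(0, len(y_plane[0]), 2 * n):
            y_blocks = [block_at(y_plane, top + dy, left + dx, n)
                        for dy in (0, n) for dx in (0, n)]
            result.append({
                "Y": y_blocks,
                # Chroma planes are half size, so the coordinates are halved.
                "Cb": block_at(cb_plane, top // 2, left // 2, n),
                "Cr": block_at(cr_plane, top // 2, left // 2, n),
            })
    return result


# A 32 x 16 luma plane with 16 x 8 chroma planes gives two macroblocks.
y_plane = [[100 * y + x for x in range(32)] for y in range(16)]
cb_plane = [[0] * 16 for _ in range(8)]
cr_plane = [[0] * 16 for _ in range(8)]
mbs = macroblocks_420(y_plane, cb_plane, cr_plane)
print(len(mbs), len(mbs[0]["Y"]))
```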
[0014]
The encoding circuit 906 performs predetermined encoding, macroblock by macroblock, on the macroblock data obtained by the blocking processing circuit 905. For example, it selects intra (I) or inter (P or B) predictive coding, performs the prediction, and then performs orthogonal transform (DCT), quantization, and variable length coding (VLC).
The data encoded by the encoding circuit 906 is transmitted or recorded as an MPEG2 bit stream.
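The transform and quantization stages named above can be sketched for a single block. This is a plain, unoptimized 2-D DCT-II with orthonormal scaling and a single uniform quantization step; the step value is an arbitrary assumption, and prediction and variable length coding are not shown.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (direct O(n^4) evaluation)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    return [[scale(u) * scale(v) * sum(
                block[y][x]
                * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                for y in range(n) for x in range(n))
             for v in range(n)]
            for u in range(n)]


def quantize_block(coeffs, step):
    """Uniform quantization; a larger step gives a higher compression ratio."""
    return [[round(c / step) for c in row] for row in coeffs]


flat = [[10] * 8 for _ in range(8)]          # a block with no detail
levels = quantize_block(dct_2d(flat), step=16)
print(levels[0][0], sum(abs(v) for row in levels for v in row))
```

A flat block concentrates all of its energy in the DC coefficient, so after quantization only one level is non-zero, which is what makes the subsequent variable length coding effective.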
[0015]
A decoding apparatus corresponding to the encoder 900 described above is, for example, the decoder 910 shown in FIG. 17.
The decoder 910 basically performs the reverse of the processing of the encoder 900. As shown in FIG. 17, it includes a header detection circuit 911 to which the MPEG2 bitstream is supplied, a flag detection circuit 912 and a decoding circuit 913 to which the output of the header detection circuit 911 is supplied, a signal selection circuit 914 to which the output of the decoding circuit 913 is supplied, a first data decoding circuit 915 and a second data decoding circuit 916 to which the output of the signal selection circuit 914 is supplied, and an image quality selection circuit 917 to which the outputs of the signal selection circuit 914, the first data decoding circuit 915, and the second data decoding circuit 916 are supplied.
[0016]
First, the header detection circuit 911 decodes header information included in the MPEG2 bit stream, generates a control signal corresponding to the header information, and supplies the control signal to the flag detection circuit 912.
The flag detection circuit 912 detects a flag related to scalability from the control signal from the header detection circuit 911, and supplies the flag to the decoding circuit 913, the signal selection circuit 914, and the image quality selection circuit 917.
[0017]
The decoding circuit 913 corresponds to the encoding circuit 906 in FIG. 16 described above. It performs predetermined decoding, according to the flag from the flag detection circuit 912, on the MPEG2 bit stream supplied via the header detection circuit 911.
[0018]
The signal selection circuit 914 unpacks the macroblocks constituting the data decoded by the decoding circuit 913, and then selects a signal path according to the flag from the flag detection circuit 912.
As a result, the data (decoded image data) unpacked from the macroblocks by the signal selection circuit 914 is supplied to the first data decoding circuit 915 when spatial scalability is used, to the second data decoding circuit 916 when SNR scalability is used, and directly to the image quality selection circuit 917 when scalability is not used.
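The path selection in the signal selection circuit 914, as read from paragraphs [0015] to [0020], can be sketched as a simple lookup. The enum, the function name, and the string labels are hypothetical; the sketch also assumes that data decoded without scalability goes straight to the image quality selection circuit 917.

```python
from enum import Enum


class ScalabilityFlag(Enum):
    NONE = "none"
    SPATIAL = "spatial"
    SNR = "snr"


def select_signal_path(flag):
    """Return the circuit that receives the unpacked decoded image data."""
    routes = {
        ScalabilityFlag.SPATIAL: "first data decoding circuit 915",
        ScalabilityFlag.SNR: "second data decoding circuit 916",
        # Assumption: no scalability means the data bypasses both decoders.
        ScalabilityFlag.NONE: "image quality selection circuit 917",
    }
    return routes[flag]
```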
[0019]
The first data decoding circuit 915 corresponds to the first data generation circuit 904 in FIG. 16. It decodes the decoded image data (spatial scalability data) from the signal selection circuit 914 into the original YCbCr data and supplies it to the image quality selection circuit 917.
[0020]
The second data decoding circuit 916 corresponds to the second data generation circuit 903 in FIG. 16. It decodes the decoded image data (SNR scalability data) from the signal selection circuit 914 into the original YCbCr data and supplies it to the image quality selection circuit 917.
[0021]
When the flag from the flag detection circuit 912 indicates that scalability is used, the image quality selection circuit 917 selects the base layer, the higher layer, or the like from the YCbCr data supplied by the first data decoding circuit 915 or the second data decoding circuit 916, and outputs image data according to the selection.
[0022]
[Problems to be solved by the invention]
By the way, between the encoder 900 and the decoder 910 in MPEG2 as described above, one of the following three systems (encoding / decoding system) is selected and adopted.
[0023]
(First method)
Both the encoder and decoder encode or decode corresponding to the desired data rate. Note that the image quality (resolution) is only one type.
[0024]
(Second method)
On the encoder side, spatial scalability (see FIG. 13 above) is used to simultaneously encode two types of images (base layer and higher layer images) having different sizes (resolutions). On the decoder side, depending on the performance (data processing capacity, etc.) of the decoder itself or of the display connected to it, either an image with low spatial resolution is restored from the base layer alone, or an image with high spatial resolution is restored from both the base layer and the higher layer.
Here, for example, when the original image is an HDTV image signal (a signal composed of 1440 × 1152 pixels), spatial scalability uses the following two layers, a base layer and a higher layer.
Base layer: a layer of 720 × 576-pixel images obtained by decimating the original image by half in both the horizontal and vertical directions (x and y directions).
Higher layer: a layer encoded using, for prediction (comparison), not only forward prediction (P) and bidirectional prediction (B) within the original image but also the image obtained by up-sampling the base layer to the same size as the higher layer.
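The relationship between the two layers can be illustrated with a minimal sketch. It assumes plain pixel dropping for decimation and pixel replication for up-sampling, and it omits the P/B prediction and the lossy coding steps entirely, so the higher layer here is only the difference from the up-sampled base layer. All function names are hypothetical.

```python
def decimate_by_half(image):
    """Keep every second pixel in x and y (simple sub-sampling, no filtering)."""
    return [row[::2] for row in image[::2]]


def upsample_by_two(image):
    """Enlarge by pixel replication so the size matches the higher layer."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def residual(original, prediction):
    """Difference between the original and the up-sampled base layer."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, prediction)]


def reconstruct(base, enhancement):
    """Decoder side: up-sampled base layer plus higher-layer difference."""
    up = upsample_by_two(base)
    return [[p + e for p, e in zip(prow, erow)]
            for prow, erow in zip(up, enhancement)]


# Small stand-in for a full-size frame (8 x 6 pixels).
original = [[(3 * x + 5 * y) % 256 for x in range(8)] for y in range(6)]
base = decimate_by_half(original)                 # 4 x 3 pixels
enh = residual(original, upsample_by_two(base))   # same size as original
```

A decoder with limited capability would display `base` alone, while a more capable one would call `reconstruct(base, enh)` to recover the full-size image.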
[0025]
(Third method)
On the encoder side, SNR scalability (see FIG. 15 above) is used to simultaneously encode two types of images (base layer and higher layer images) with different code amounts (quantization step sizes). On the decoder side, depending on the performance of the decoder itself, either a low image quality (low bit rate) image is restored from the base layer alone, or a high image quality (high bit rate) image is restored from both the base layer and the higher layer.
Here, in SNR scalability, two types of images with different image quality can be encoded and decoded simultaneously. That is, by applying two different quantization step sizes (quantization coefficients) to the same image, versions of that image with different compression ratios can be generated. The image with the large compression ratio (the low bit rate image) is defined as the base layer, and the error obtained by subtracting the image restored from the base layer from the original image is defined as the higher layer. Therefore, on the decoder side, the image obtained by adding the base layer and the higher layer is a high-quality image with a low compression ratio (high bit rate).
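The two-quantizer idea can be sketched on a short list of sample values. This is a simplified illustration under stated assumptions: it uses uniform scalar quantization with hypothetical step sizes of 16 and 2, and it ignores the DCT, prediction, and variable length coding stages.

```python
def quantize(values, step):
    """Uniform quantization: map each value to an integer level."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Inverse quantization: map each level back to a value."""
    return [level * step for level in levels]


def encode_snr_layers(values, base_step, enh_step):
    """Base layer = coarse quantization; higher layer = quantized error."""
    base_levels = quantize(values, base_step)
    base_recon = dequantize(base_levels, base_step)
    error = [v - r for v, r in zip(values, base_recon)]
    enh_levels = quantize(error, enh_step)
    return base_levels, enh_levels


def decode_snr_layers(base_levels, enh_levels, base_step, enh_step,
                      use_enhancement=True):
    """Restore from the base layer alone, or add the higher layer to it."""
    recon = dequantize(base_levels, base_step)
    if use_enhancement:
        correction = dequantize(enh_levels, enh_step)
        recon = [r + c for r, c in zip(recon, correction)]
    return recon


values = [100, -37, 12, 5]
base_levels, enh_levels = encode_snr_layers(values, base_step=16, enh_step=2)
low_quality = decode_snr_layers(base_levels, enh_levels, 16, 2,
                                use_enhancement=False)
high_quality = decode_snr_layers(base_levels, enh_levels, 16, 2)
```

Decoding with both layers gives values closer to the original than decoding the base layer alone, at the cost of transmitting the additional higher-layer levels.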
[0026]
Therefore, when the second or third of these three methods is adopted, the selection circuit 902 of the encoder 900 in FIG. 16 can select either spatial scalability or SNR scalability.
[0027]
However, when spatial scalability is selected in the selection circuit 902, the image size of the base layer is uniquely determined by its relationship with the higher layer, and there is no freedom to select the image size of the base layer arbitrarily.
Similarly, when SNR scalability is selected, the frame rate (resolution) of the base layer is uniquely determined by its relationship with the higher layer, and there is no freedom to select it arbitrarily.
[0028]
Therefore, in the conventional coding apparatus as shown in FIG. 16, when the scalability function is used, factors that determine the code amount, such as the image size or the frame rate, cannot be selected. In other words, factors directly related to the decoding apparatus at the output destination or to the line conditions cannot be selected.
In short, when an image encoded using spatial scalability, SNR scalability, or the like is received on the decoding device side (reception side), the choice of image quality is limited to the following two:
1. Low quality image with only base layer decoded
2. High quality image with both base and higher layers decoded
There is therefore a problem that the image quality (decoding speed) cannot be selected according to the performance of the decoding device and the needs of the user.
[0029]
The present invention has been made in view of such problems, and an object thereof is to provide an image suitable for the decoding side.
[0030]
[Means for Solving the Problems]
In order to solve the above-described problems, the present invention is an image processing apparatus that encodes image data according to a predetermined encoding method and outputs the encoded image data, comprising: receiving means for receiving, from an external apparatus, external information indicating conditions for decoding; encoding means for selecting, from a plurality of scalability modes including at least two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode, the scalability mode corresponding to the external information received by the receiving means, and for encoding the image data in the selected scalability mode; and transmission means for transmitting the image data encoded by the encoding means to the external apparatus.
[0031]
In order to solve the above problems, the present invention is an image processing apparatus for decoding image data obtained by encoding according to a predetermined encoding method, comprising: input means for inputting external information indicating conditions for decoding; receiving means for receiving, from an external device, image data encoded in a scalability mode that the external device has selected, in correspondence with the external information input by the input means, from a plurality of scalability modes including at least two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode; and decoding means for decoding the encoded image data received by the receiving means.
[0032]
In order to solve the above problems, the present invention provides an image processing method in an image processing apparatus that encodes and outputs image data according to a predetermined encoding method, comprising: a receiving step of receiving, from an external apparatus, external information indicating conditions for decoding; an encoding step of selecting, from a plurality of scalability modes including at least two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode, the scalability mode corresponding to the external information received in the receiving step, and encoding the image data in the selected scalability mode; and a transmission step of transmitting the image data encoded in the encoding step to the external apparatus.
[0033]
In order to solve the above problems, the present invention provides an image processing method in an image processing apparatus for decoding image data obtained by encoding according to a predetermined encoding method, comprising: an input step of inputting external information indicating conditions for decoding; a receiving step of receiving, from an external device, image data encoded in a scalability mode that the external device has selected, in correspondence with the external information input in the input step, from a plurality of scalability modes including at least two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode; and a decoding step of decoding the encoded image data received in the receiving step.
[0050]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0051]
(First embodiment)
[0052]
The present invention is implemented by, for example, an encoding apparatus 100 as shown in FIG.
As shown in FIG. 1, the encoding apparatus 100 includes a conversion circuit 101 to which 8-bit RGB data is supplied, a first frame memory 102 to which the output of the conversion circuit 101 is supplied, a first data generation circuit 105 and a first block processing circuit 107 to which the output of the first frame memory 102 is supplied, and a first encoding circuit 109 to which the output of the first block processing circuit 107 is supplied. The output of the first data generation circuit 105 is also supplied to the first block processing circuit 107.
The encoding apparatus 100 also includes a second frame memory 104 to which the output of the conversion circuit 101 is supplied, a second data generation circuit 106 and a second block processing circuit 108 to which the output of the second frame memory 104 is supplied, and a second encoding circuit 110 to which the output of the second block processing circuit 108 is supplied. The output of the second data generation circuit 106 is also supplied to the second block processing circuit 108.
The output of the first data generation circuit 105 is also supplied to the second data generation circuit 106, and the output of the first encoding circuit 109 is also supplied to the second encoding circuit 110.
The encoding apparatus 100 further includes a bit stream generation circuit 111 to which the outputs of the first encoding circuit 109 and the second encoding circuit 110 are supplied, and a control circuit 103 that controls the operation of the entire apparatus.
[0053]
As shown in FIG. 2, the control circuit 103 includes a CPU 701, a program memory 702 in which various processing programs for controlling the operation of the entire apparatus are stored so as to be readable by the CPU 701, and an information detection circuit 703 to which information 112 from the outside, including infrastructure information, user request information, and the like (hereinafter referred to as “external information”), is supplied. The output of the information detection circuit 703 is supplied to the CPU 701.
Therefore, the CPU 701 reads out and executes various processing programs in the program memory 702, thereby realizing the operation of the encoding apparatus 100 described here.
[0054]
As shown in FIG. 3, the first data generation circuit 105 includes a first selector 301 to which the output (YCbCr data) of the first frame memory 102 is supplied, a second selector 303 and a sampling circuit 304 to which the output of the first selector 301 is supplied, a frame rate controller 305 to which the output of the second selector 303 is supplied, and a third selector 302 to which the outputs of the frame rate controller 305 and the sampling circuit 304 are supplied. The output of the third selector 302 is supplied to the first block processing circuit 107 and the second data generation circuit 106.
[0055]
As shown in FIG. 4, the second data generation circuit 106 includes a first selector 401 to which the output (YCbCr data) of the second frame memory 104 is supplied, a frame memory 405 to which the output (base layer image data) of the first data generation circuit 105 is supplied, a first difference data generation circuit 403 and a second difference data generation circuit 404 to which the outputs of the first selector 401 and the frame memory 405 are supplied, and a second selector 402 to which the outputs of the first difference data generation circuit 403 and the second difference data generation circuit 404 are supplied. The output of the second selector 402 is supplied to the second block processing circuit 108.
[0056]
In the encoding apparatus 100 as described above, first, the conversion circuit 101 converts the input image data (8-bit RGB data) into 4:2:0 YCbCr data (8 bits each), and supplies the YCbCr data to the first frame memory 102 and the second frame memory 104.
Each of the first frame memory 102 and the second frame memory 104 stores YCbCr data from the conversion circuit 101, and the operation control at this time is performed by the control circuit 103 that operates as follows.
[0057]
That is, in the control circuit 103 (see FIG. 2 above), the information detection circuit 703 interprets the external information 112 and supplies control information corresponding to the external information 112 to the CPU 701.
From the control information supplied by the information detection circuit 703, the CPU 701 obtains mode information indicating whether the scalability function is used in encoding, the type of scalability function when it is used, and various control information for the base layer and the higher layer (for example, the base layer image size (Spatial), frame rate (Temporal), and compression rate (SNR)), and supplies this information (hereinafter referred to as the “encoding control signal”) to the first data generation circuit 105, the second data generation circuit 106, and so on. At the same time, the CPU 701 supplies the first frame memory 102 and the second frame memory 104 with a read/write (R/W) control signal according to the control information from the information detection circuit 703, so that data is read and written in step with the functions of the first data generation circuit 105 and the second data generation circuit 106.
[0058]
Therefore, the first frame memory 102 and the second frame memory 104 operate according to the R/W control signal based on the external information 112, and the first data generation circuit 105, the second data generation circuit 106, and the other circuits likewise operate according to the encoding control signal based on the external information 112.
[0059]
Hereinafter, the operation of each circuit from the conversion circuit 101 onward will be described for each of the modes indicated by the external information 112, namely the “spatial scalability usage mode”, the “temporal scalability usage mode”, the “SNR scalability usage mode”, and the “scalability function non-use mode”.
[0060]
(Spatial scalability usage mode)
[0061]
Each of the first frame memory 102 and the second frame memory 104 performs read/write operations on the YCbCr data from the conversion circuit 101 in accordance with an R/W control signal (a control signal designating the spatial scalability usage mode based on the external information 112) from the control circuit 103 (specifically, the CPU 701).
As a result, the YCbCr data read from the first frame memory 102 and the second frame memory 104 is supplied to the first block processing circuit 107 and the second block processing circuit 108 via the first data generation circuit 105 and the second data generation circuit 106.
[0062]
At this time, the first data generation circuit 105 and the second data generation circuit 106 are also supplied with an encoding control signal from the control circuit 103 (a control signal designating a spatial scalability use mode based on the external information 112), The first data generation circuit 105 and the second data generation circuit 106 operate in accordance with the encoding control signal.
[0063]
That is, in the first data generation circuit 105 (see FIG. 3 above), the first selector 301 switches its output destination to the sampling circuit 304 in accordance with the encoding control signal from the control circuit 103, and outputs the YCbCr data from the first frame memory 102.
The sampling circuit 304 performs image reduction processing on the YCbCr data from the first selector 301 in accordance with the sub-sample size information included in the encoding control signal from the control circuit 103, and generates base layer image data.
The base layer image data obtained by the sampling circuit 304 is supplied to the third selector 302.
The third selector 302 switches the output data to the output of the sampling circuit 304 (base layer image data) in accordance with the encoding control signal from the control circuit 103. Therefore, base layer image data is supplied to the first blocking processing circuit 107. The base layer image data is also supplied to a second data generation circuit 106 described later.
[0064]
In this way, the base layer image data supplied from the first data generation circuit 105 to the first block processing circuit 107 is divided into blocks by the first block processing circuit 107, subjected to predetermined block-by-block encoding by the first encoding circuit 109, and supplied to the bit stream generation circuit 111.
[0065]
On the other hand, in the second data generation circuit 106 (see FIG. 4 above), the first selector 401 switches the output destination to the first difference data generation circuit 403 in accordance with the encoding control signal from the control circuit 103, The YCbCr data from the second frame memory 104 is output.
At the same time, the frame memory 405 supplies the image data of the base layer from the first data generation circuit 105 to the first difference data generation circuit 403 according to the encoding control signal from the control circuit 103.
In accordance with the encoding control signal from the control circuit 103, the first difference data generation circuit 403 up-samples the base layer image data from the frame memory 405, frame by frame (or field by field), to the same size as the original image (or higher layer image), and generates difference image data between the up-sampled image and the higher layer image data.
The difference image data obtained by the first difference data generation circuit 403 is supplied to the second selector 402.
The second selector 402 switches the output data to the output (difference image data) of the first difference data generation circuit 403 according to the encoding control signal from the control circuit 103. Therefore, difference image data is supplied to the second block processing circuit 108.
[0066]
In this way, the higher layer difference image data supplied from the second data generation circuit 106 to the second block processing circuit 108 is divided into blocks by the second block processing circuit 108, subjected by the second encoding circuit 110 to predetermined block-by-block encoding independently of the base layer image, and supplied to the bit stream generation circuit 111.
[0067]
The bit stream generation circuit 111 adds a header corresponding to a predetermined application (transmission, storage, etc.) to the base layer image data from the first encoding circuit 109 and the higher layer image data (difference image data) from the second encoding circuit 110, combines them into a single bit stream of scalable image data, and outputs it externally.
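Combining the two encoded layers behind a header can be sketched as below. The header layout used here (one mode byte followed by two 32-bit length fields) is purely hypothetical and is not the MPEG2 bit stream syntax; it only illustrates that a header lets the receiver separate the base layer from the higher layer.

```python
import struct

# Hypothetical mode codes; the description does not define numeric values.
MODE_CODES = {"none": 0, "spatial": 1, "temporal": 2, "snr": 3}
HEADER_FORMAT = ">BII"  # mode, base layer length, higher layer length


def build_bit_stream(mode, base_payload, enhancement_payload):
    """Prefix both encoded layers with a header and join them."""
    header = struct.pack(HEADER_FORMAT, MODE_CODES[mode],
                         len(base_payload), len(enhancement_payload))
    return header + base_payload + enhancement_payload


def parse_bit_stream(stream):
    """Read the header back and split the stream into its two layers."""
    mode_code, base_len, enh_len = struct.unpack_from(HEADER_FORMAT, stream, 0)
    offset = struct.calcsize(HEADER_FORMAT)
    base = stream[offset:offset + base_len]
    enh = stream[offset + base_len:offset + base_len + enh_len]
    mode = next(name for name, code in MODE_CODES.items() if code == mode_code)
    return mode, base, enh
```

A receiver that only wants the base layer can stop reading after `base_len` bytes of payload and ignore the rest.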
[0068]
(Temporal scalability usage mode)
[0069]
Even when this mode is designated, as in the spatial scalability usage mode, the YCbCr data read from the first frame memory 102 and the second frame memory 104 is supplied to the first block processing circuit 107 and the second block processing circuit 108 via the first data generation circuit 105 and the second data generation circuit 106. However, the first data generation circuit 105 and the second data generation circuit 106 operate differently from when the spatial scalability usage mode is designated.
[0070]
That is, in the first data generation circuit 105 (see FIG. 3 above), the first selector 301 switches its output destination to the second selector 303 in accordance with the encoding control signal from the control circuit 103 (a control signal designating the temporal scalability usage mode based on the external information 112), and outputs the YCbCr data from the first frame memory 102.
The second selector 303 supplies the YCbCr data from the first selector 301 to the frame rate controller 305 according to the encoding control signal from the control circuit 103.
In accordance with the frame rate information included in the encoding control signal from the control circuit 103, the frame rate controller 305 down-samples the YCbCr data from the second selector 303 in units of frames (lowering the resolution of the image data in the time direction) to generate base layer image data.
The base layer image data obtained by the frame rate controller 305 is supplied to the third selector 302.
The third selector 302 switches the output data to the output of the frame rate controller 305 (base layer image data) in accordance with the encoding control signal from the control circuit 103. Therefore, base layer image data is supplied to the first blocking processing circuit 107. The base layer image data is also supplied to a second data generation circuit 106 described later.
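The frame-unit down-sampling performed by the frame rate controller 305 can be sketched as follows. The sketch assumes that the frame rate information reduces to an integer "keep every k-th frame" factor, which is an assumption made for illustration; the function name is hypothetical, and the difference coding of the remaining frames is omitted.

```python
def split_temporal_layers(frames, keep_every=2):
    """Return (base layer frames, frames left for the higher layer).

    Every keep_every-th frame goes to the base layer, which lowers the
    frame rate by that factor; the other frames remain for the higher layer.
    """
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    base = frames[::keep_every]
    remaining = [f for i, f in enumerate(frames) if i % keep_every != 0]
    return base, remaining
```

For example, with `keep_every=2` a 30 frames-per-second sequence would yield a 15 frames-per-second base layer.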
[0071]
In this way, the base layer image data supplied from the first data generation circuit 105 to the first block processing circuit 107 is divided into blocks by the first block processing circuit 107, subjected to predetermined block-by-block encoding by the first encoding circuit 109, and supplied to the bit stream generation circuit 111.
[0072]
On the other hand, in the second data generation circuit 106 (see FIG. 4 above), the first selector 401 switches its output destination to the second difference data generation circuit 404 in accordance with the encoding control signal from the control circuit 103, and outputs the YCbCr data from the second frame memory 104.
At the same time, the frame memory 405 supplies the base layer image data from the first data generation circuit 105 to the second difference data generation circuit 404 in response to the encoding control signal from the control circuit 103.
In accordance with the encoding control signal from the control circuit 103, the second difference data generation circuit 404 uses the base layer image data from the frame memory 405 as prediction information for the higher layer, refers to temporally future and past image data, and thereby generates difference image data as the higher layer.
The difference image data obtained by the second difference data generation circuit 404 is supplied to the second selector 402.
The second selector 402 switches the output data to the output (difference image data) of the second difference data generation circuit 404 according to the encoding control signal from the control circuit 103. Therefore, difference image data is supplied to the second block processing circuit 108.
[0073]
In this way, the higher layer difference image data supplied from the second data generation circuit 106 to the second block processing circuit 108 is divided into blocks by the second block processing circuit 108, subjected by the second encoding circuit 110 to predetermined block-by-block encoding independently of the base layer image, and supplied to the bit stream generation circuit 111.
[0074]
In the same manner as when the spatial scalability usage mode is designated, the bit stream generation circuit 111 adds a header to the base layer image data from the first encoding circuit 109 and the higher layer image data (difference image data) from the second encoding circuit 110, forms a bit stream of scalable image data, and outputs it externally.
[0075]
(SNR scalability usage mode)
[0076]
Each of the first frame memory 102 and the second frame memory 104 performs read/write operations on the YCbCr data from the conversion circuit 101 in accordance with an R/W control signal (a control signal designating the SNR scalability usage mode based on the external information 112) from the control circuit 103 (specifically, the CPU 701).
In this case, the YCbCr data read from the first frame memory 102 and the second frame memory 104 is supplied directly to the first block processing circuit 107 and the second block processing circuit 108.
The YCbCr data is therefore divided into blocks by the first block processing circuit 107 and the second block processing circuit 108, and supplied to the first encoding circuit 109 and the second encoding circuit 110.
[0077]
In accordance with the encoding control signal from the control circuit 103, the first encoding circuit 109 performs predetermined block-by-block encoding on the YCbCr data from the first block processing circuit 107, and generates encoded base layer image data. At this time, the encoding is performed so as to obtain the predetermined code amount (compression rate) specified by the encoding control signal.
The encoded base layer image data obtained by the first encoding circuit 109 is supplied to the bit stream generation circuit 111, and is also supplied to the second encoding circuit 110 as reference data for encoding the higher layer image data.
[0078]
In accordance with the encoding control signal from the control circuit 103, the second encoding circuit 110 uses the base layer image data from the first encoding circuit 109 as prediction information for the higher layer, refers to future and past image data, and generates encoded difference image data as the higher layer.
The encoded difference image data as the higher layer obtained by the second encoding circuit 110 is supplied to the bit stream generation circuit 111.
[0079]
In the same manner as when the spatial or temporal scalability usage mode is designated, the bit stream generation circuit 111 adds a header to the base layer image data from the first encoding circuit 109 and the higher layer image data (difference image data) from the second encoding circuit 110, forms a bit stream of scalable image data, and outputs it externally.
[0080]
(Scalability function non-use mode)
[0081]
Each of the first frame memory 102 and the second frame memory 104 performs read/write operations on the YCbCr data in accordance with an R/W control signal (a control signal based on the scalability function non-use mode) from the control circuit 103 (specifically, the CPU 701).
In this case, the YCbCr data read from the first frame memory 102 and the second frame memory 104 is supplied directly to the first block processing circuit 107 and the second block processing circuit 108.
The YCbCr data is therefore divided into blocks by the first block processing circuit 107 and the second block processing circuit 108, subjected to predetermined block-by-block encoding by the first encoding circuit 109 and the second encoding circuit 110, and supplied to the bit stream generation circuit 111.
[0082]
The bit stream generation circuit 111 adds a header corresponding to a predetermined application (transmission, storage, etc.) to the data from the first encoding circuit 109 and the second encoding circuit 110, forms a bit stream of the image data, and outputs it externally.
[0083]
(Second Embodiment)
[0084]
The present invention is applied to, for example, a decoding device 200 as shown in FIG.
This decoding apparatus 200 corresponds to the encoding apparatus 100 in the first embodiment described above.
That is, the decoding apparatus 200 is basically configured to perform the reverse of the processing of the encoding apparatus 100. In particular, the decoding apparatus 200 is configured so that information (user information), described later, can be input to it. This user information includes various types of information such as the image quality and the capability of the decoding apparatus 200.
Therefore, when the user of the decoding apparatus 200 inputs various information for decoding, the control circuit 208 generates external output information 212 according to this input, and this external output information 212 is supplied to the encoding apparatus 100 as the above-described external information 112.
[0085]
FIG. 6 shows the internal configuration of the above-described control circuit 208, FIG. 7 shows the internal configuration of the first data decoding circuit 209 of the decoding device 200, and FIG. shows the internal configuration of the first data decoding circuit 209 of the decoding device 200.
[0086]
Therefore, the above-described user information is input with the following configuration.
FIG. 9 shows a configuration of a system 240 having the function of the decoding device 200 of FIG.
As shown in FIG. 9, the system 240 includes a monitor 241, a personal computer (PC) main body 242, and a mouse 243, which are connected to each other.
The PC main body 242 has the function of the decoding device 200 configured as shown in FIG.
[0087]
Therefore, in the system 240, first, a genre selection menu screen of selectable software (moving images) is displayed on the monitor 241. For example, selection menu screens such as “movie”, “music”, “Photo”, and “etc.” are displayed.
The user operates the mouse 243 to specify a desired software genre from the screen displayed on the monitor 241. Specifically, for example, the mouse cursor 244 is moved to the desired software genre (“movie” in FIG. 9), and the mouse 243 is clicked or double-clicked. As a result, the genre (“movie”) is designated.
[0088]
When such an operation is completed, the title menu screen of the individual software corresponding to the genre (“movie”) selected on the selection menu screen of FIG. 9 is displayed as shown in FIG. For example, title menus such as “title-A”, “title-B”, “title-C”, and “title-D” corresponding to “movie” are displayed.
The user operates the mouse 243 to specify a desired title on the screen displayed on the monitor 241. Specifically, for example, the mouse cursor 244 is placed on a desired title (“title-A” in FIG. 10 above), and the mouse 243 is clicked or double-clicked. Thus, the title (“title-A”) is designated.
[0089]
Upon completion of such an operation, next, as shown in FIG. 11, various conditions for decoding are set for the title (“title-A”) selected on the title menu screen of FIG. The condition setting screen is displayed. Here, the following various conditions can be set.
S / N: Designate one of low image quality (Low), high image quality (High), and optimum image quality (Auto) according to the decoding capability of this system.
Frame Rate: Designate one of low frame rate (Low), high frame rate (Full), and the optimum frame rate (Auto) according to the decoding capability of this system.
Full Spec: Designates the highest image quality (high code amount) image on the side of the encoding device (such as the encoding device 100 in FIG. 1).
Full Auto: Various optimum conditions are set according to the decoding capability of this system.
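As an illustration only, the four condition items listed above could be held in a small record before being passed on. The following Python sketch is not part of the disclosed apparatus; the class names, field names, and dictionary keys are hypothetical and chosen merely to mirror the labels on the condition setting screen.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SnrSetting(Enum):
    """Choices offered for the "S / N" item (hypothetical representation)."""
    LOW = "Low"
    HIGH = "High"
    AUTO = "Auto"


class FrameRateSetting(Enum):
    """Choices offered for the "Frame Rate" item (hypothetical representation)."""
    LOW = "Low"
    FULL = "Full"
    AUTO = "Auto"


@dataclass
class DecodingConditions:
    """Settings chosen by the user for one title (assumed layout)."""
    title: str
    snr: Optional[SnrSetting] = None
    frame_rate: Optional[FrameRateSetting] = None
    full_spec: bool = False
    full_auto: bool = False

    def as_external_info(self) -> dict:
        # Flatten the settings into a plain dictionary; the key names are
        # assumptions, not a format defined by the specification.
        return {
            "title": self.title,
            "S/N": self.snr.value if self.snr else None,
            "Frame Rate": self.frame_rate.value if self.frame_rate else None,
            "Full Spec": self.full_spec,
            "Full Auto": self.full_auto,
        }


conditions = DecodingConditions(title="title-A", snr=SnrSetting.AUTO)
info = conditions.as_external_info()
```

Here `info` carries "Auto" for "S / N" and leaves the other items unset, matching the selection walked through for FIG. 11 and FIG. 12.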
[0090]
Therefore, as shown in FIG. 11, the user moves the mouse cursor 244 to the condition to be set (“S / N” in FIG. 11) and clicks or double-clicks the mouse 243. Thereby, as shown in FIG. 12, the detailed conditions for setting “S / N” are displayed. That is, the conditions “Low”, “High”, and “Auto” are displayed.
From among these detailed conditions, the user moves the mouse cursor 244 to the condition to be set for “S / N” (“Auto” in FIG. 12) and clicks or double-clicks the mouse 243. As a result, in FIG. 12, “Auto” (optimum image quality according to the decoding capability of this system) is set for “S / N”.
[0091]
As described above, information on various conditions set on the screen is supplied as the external information 112 to the encoding apparatus 100 in the first embodiment described above.
Upon receiving this, as described above, the encoding apparatus 100 interprets the external information 112, selects the optimum scalability, determines the various setting conditions (image size, compression rate, etc.) for that scalability, performs the encoding process, and outputs the result to the system 240 (decoding device 200).
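The interpretation step described above can be pictured with a minimal Python sketch. The mapping used here is an assumption made for illustration: an “S / N” setting is taken to imply SNR scalability, a “Frame Rate” setting temporal scalability, and anything else spatial scalability. The function names, the dictionary keys, and the `decoder_handles_enhancement` flag are all hypothetical and do not come from the specification.

```python
from enum import Enum


class Scalability(Enum):
    SPATIAL = "spatial"
    TEMPORAL = "temporal"
    SNR = "snr"


def select_scalability(external_info: dict) -> Scalability:
    """Pick a scalability mode from the received conditions (assumed rule)."""
    if external_info.get("S/N") is not None:
        return Scalability.SNR
    if external_info.get("Frame Rate") is not None:
        return Scalability.TEMPORAL
    return Scalability.SPATIAL


def plan_encoding(external_info: dict, decoder_handles_enhancement: bool):
    """Return (mode, send_enhancement_layer) for the received conditions."""
    mode = select_scalability(external_info)
    key = {Scalability.SNR: "S/N", Scalability.TEMPORAL: "Frame Rate"}.get(mode)
    setting = external_info.get(key) if key else None
    if setting == "Low":
        send_enhancement = False          # base layer only
    elif setting in ("High", "Full"):
        send_enhancement = True           # user explicitly asked for the higher layer
    else:
        # "Auto" or unset: follow the decoding capability reported by the decoder
        send_enhancement = decoder_handles_enhancement
    return mode, send_enhancement


plan = plan_encoding({"S/N": "Auto"}, decoder_handles_enhancement=False)
```

With “Auto” selected for “S / N” and a decoder that cannot handle the enhancement layer, this sketch settles on SNR scalability with the base layer only.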
[0092]
Needless to say, the object of the present invention can also be achieved by supplying a storage medium storing software program codes that realize the functions of the host and terminal of each of the above-described embodiments to a system or apparatus, and having the computer (or CPU) of that system or apparatus read and execute the program codes stored in the storage medium.
In this case, the program code itself read from the storage medium implements the functions of the respective embodiments, and the storage medium storing the program code constitutes the present invention.
[0093]
A ROM, floppy disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, magnetic tape, nonvolatile memory card, or the like can be used as a storage medium for supplying the program code.
[0094]
Further, it goes without saying that the present invention covers not only the case where the functions of the respective embodiments are realized by the computer executing the read program code, but also the case where the OS or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and the functions of the respective embodiments are realized by that processing.
[0095]
Further, it goes without saying that the present invention also covers the case where the program code read from the storage medium is written to a memory provided in a function expansion board inserted in the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided in the function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the respective embodiments are realized by that processing.
According to the above-described embodiments, the encoding side encodes the image data in accordance with information on the decoding side (the processing capability of the decoding side, and/or the resolution and code amount specified by the user on the decoding side, and/or the transmission line capacity between the encoding side and the decoding side, and so on).
More specifically, based on performance (processing capability) information on the decoding side that is the output destination, and/or transmission line information, and/or user-desired information (the resolution and code amount specified by the user), the encoding side selects a scalable image suited to that information, encodes it in real time, and sends it out. The decoding side, in turn, is provided with means for sending its processing capability information, the user-desired information entered through the user interface, and the like to the encoding side.
With this configuration, regardless of the image desired by the decoding side (decoding side user), encoded image data suitable for the image can be given to the decoding side. In addition, it is possible to provide the decoding side with encoded image data that is compatible with the line capacity used for image transmission and the performance on the decoding side (ensures compatibility). Therefore, a good decoded image can be obtained on the decoding side.
Further, if an interactive communication function is provided between the encoding side and the decoding side, encoded image data suited to all of the circumstances existing between the encoding side and the decoding side can be given to the decoding side. This brings advantages such as wide compatibility between the encoding side and the decoding side and effective use of line capacity.
Furthermore, if the decoding side is provided with a user interface, it is possible to provide encoded image data that can be flexibly adapted to the user's wishes using an interactive communication function.
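To make the idea of matching the encoded data to both the line capacity and the decoding performance concrete, the following Python sketch decides which layers to send under a simple bit-rate budget. It is only an illustration under stated assumptions: the function name, the use of kilobits per second, and the rule of taking the smaller of the two limits as the budget are hypothetical and are not given in the specification.

```python
def layers_to_send(base_kbps: int, enhancement_kbps: int,
                   line_capacity_kbps: int, decoder_max_kbps: int) -> list:
    """Choose layers that fit both the line and the decoder (assumed rule)."""
    # The tighter of the two limits bounds what can usefully be sent.
    budget = min(line_capacity_kbps, decoder_max_kbps)
    if base_kbps > budget:
        raise ValueError("even the base layer exceeds the available budget")
    if base_kbps + enhancement_kbps <= budget:
        return ["base", "enhancement"]
    return ["base"]


# Hypothetical figures: a fast line but a decoder limited to 4000 kbps.
choice = layers_to_send(base_kbps=1500, enhancement_kbps=4500,
                        line_capacity_kbps=10000, decoder_max_kbps=4000)
```

In this example only the base layer is chosen, because the decoder limit, not the line, is the binding constraint; raising `decoder_max_kbps` to 8000 would allow both layers.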
[0096]
[Effect of the invention]
As described above, according to the present invention, it is possible to provide an image suitable for the decoding side.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an encoding apparatus to which the present invention is applied in a first embodiment.
FIG. 2 is a block diagram showing an internal configuration of a control circuit of the encoding device.
FIG. 3 is a block diagram showing an internal configuration of a first data generation circuit of the encoding device.
FIG. 4 is a block diagram showing an internal configuration of a second data generation circuit of the encoding device.
FIG. 5 is a block diagram showing a configuration of a decoding apparatus to which the present invention is applied in the second embodiment.
FIG. 6 is a block diagram showing an internal configuration of a control circuit of the decoding device.
FIG. 7 is a block diagram showing an internal configuration of a first data decoding circuit of the decoding device.
FIG. 8 is a block diagram showing an internal configuration of a second data decoding circuit of the decoding device.
FIG. 9 is a diagram for explaining a system having the function of the decoding device.
FIG. 10 is a diagram for explaining an operation for selecting a title of a genre menu on the monitor screen of the system.
FIG. 11 is a diagram for explaining a condition setting screen displayed by selecting the title.
FIG. 12 is a diagram for explaining an operation for setting a desired condition for a condition item selected on the condition setting screen.
FIG. 13 is a diagram for explaining spatial scalability.
FIG. 14 is a diagram for explaining temporal scalability.
FIG. 15 is a diagram for explaining SNR scalability.
FIG. 16 is a block diagram illustrating a configuration of a conventional encoding device.
FIG. 17 is a block diagram showing a configuration of a conventional decoding device.
[Explanation of symbols]
100 Encoder
101 Conversion circuit
102 First frame memory
103 Control circuit
104 Second frame memory
105 First data generation circuit
106 Second data generation circuit
107 First block processing circuit
108 Second block processing circuit
109 First encoding circuit
110 Second encoding circuit
111 Bit stream generation circuit
112 External information

Claims

An image processing apparatus for encoding image data according to a predetermined encoding method and outputting the encoded image data, comprising:
receiving means for receiving, from an external device, external information indicating conditions for decoding an image in the external device;
encoding means for selecting, from a plurality of scalability modes including at least any two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode, a scalability mode corresponding to the external information received by the receiving means, and for encoding the image data in the selected scalability mode; and
transmission means for transmitting the image data encoded by the encoding means to the external device.

The image processing apparatus according to claim 1, wherein the encoding means further sets the size or the frame rate of the image data encoded in the selected scalability mode in accordance with the external information.

The image processing apparatus according to claim 1 or 2, wherein the encoding means includes base layer generation means for generating the base layers of the plurality of scalability modes, and the base layer generation processing for the plurality of scalability modes is performed in common by the base layer generation means.

An image processing apparatus for decoding image data obtained by encoding according to a predetermined encoding method, comprising:
input means for inputting external information indicating conditions for decoding an image;
transmission means for transmitting the external information to an external device;
receiving means for receiving, from the external device, image data encoded in a scalability mode that has been selected in the external device, from a plurality of scalability modes including at least any two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode, as the scalability mode corresponding to the external information input by the input means; and
decoding means for decoding the encoded image data received by the receiving means.

The image processing apparatus according to claim 4, wherein the transmission means transmits decoding capability information of its own apparatus to the external device together with, or instead of, the external information.

An image processing method in an image processing apparatus for encoding image data according to a predetermined encoding method and outputting the encoded image data, the method comprising:
a receiving step of receiving, from an external device, external information indicating conditions for decoding an image in the external device;
an encoding step of selecting, from a plurality of scalability modes including at least any two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode, a scalability mode corresponding to the external information received in the receiving step, and encoding the image data in the selected scalability mode; and
a transmission step of transmitting the image data encoded in the encoding step to the external device.

An image processing method in an image processing apparatus for decoding image data obtained by encoding according to a predetermined encoding method, the method comprising:
an input step of inputting external information indicating conditions for decoding an image;
a transmission step of transmitting the external information to an external device;
a receiving step of receiving, from the external device, image data encoded in a scalability mode that has been selected in the external device, from a plurality of scalability modes including at least any two of a spatial scalability mode, a temporal scalability mode, and an SNR scalability mode, as the scalability mode corresponding to the external information input in the input step; and
a decoding step of decoding the encoded image data received in the receiving step.