JP2004200959A

JP2004200959A - Code sequence converting apparatus, image edit system, camera system, and program

Info

Publication number: JP2004200959A
Application number: JP2002366225A
Authority: JP
Inventors: Yutaka Sano; 豊佐野; Shogo Oneda; 章吾大根田; Keiichi Suzuki; 啓一鈴木; Yukio Kadowaki; 幸男門脇; Toru Suino; 水納　　亨; Takanori Yano; 隆則矢野; Minoru Fukuda; 実福田
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2002-12-18
Filing date: 2002-12-18
Publication date: 2004-07-15

Abstract

PROBLEM TO BE SOLVED: To provide a code sequence converting apparatus that classifies an image by its motion amount, sub-band code amount distribution, and compression rate, and re-quantizes accordingly, so that a moving picture can be encoded while high image quality is maintained.

SOLUTION: A parsing means 12 analyzes the syntax of a code sequence obtained by compression-encoding moving picture data, and a quantization means 29 re-quantizes the code sequence by selecting which packets to keep and which to discard. A code sequence generating means 13 then generates a new code sequence from the remaining packets. The quantization means 29 operates on the packets according to one or more quantization tables that a quantization table selection means 28 selects from a quantization table group 27. Each quantization table is selected by index values determined from the motion amount, the sub-band code amount distribution, and the designated image compression rate.

COPYRIGHT: (C)2004,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、画像データを圧縮符号化した符号列を新たな符号列に変換する符号列変換装置、これを備えた画像編集システム及びカメラシステム、並びに画像データを圧縮符号化した符号列を新たな符号列に変換する処理を実行するプログラムに関する。
【０００２】
【従来の技術】
従来、画像圧縮伸長アルゴリズムとして、動画像専用のMPEG1／MPEG2／MPEG4や、静止画像を連続したフレームとして扱うMotion JPEGが使用されている。また、最近、後者のMotion静止画像の符号化については、国際標準としてMotion JPEG2000という新しい方式が規格化されつつある。
【０００３】
MPEG方式とMotion静止画像方式の違いは、後者はフレーム内符号化だけを行うのに対し、前者は同一フレーム内の画像ばかりではなく、異なるフレーム間画像においても相関をとり、より圧縮率を上げることができることにある。一方、各々のフレームを独立に扱う後者の方式は、前者に比較して、フレーム毎の編集が可能であり、また、通信時のエラーが他フレームに及ぶこともない。このように、MPEG方式、Motion静止画像方式は、各々特長を持っている。そして、アプリケーション毎に、適宜、方式が使い分けられている。
【０００４】
Motion JPEG2000方式では、変換方式として離散ウエーブレット変換を用いているが、離散ウエーブレット変換を用いて画像データを圧縮符号化する技術としては、特許文献１に開示の技術が知られている。
【０００５】
【特許文献１】特開２００１−３０９３８１公報
【発明が解決しようとする課題】
特許文献１に開示の技術では、画素値を離散ウエーブレット変換して圧縮符号化するのみならず、異なるフレーム間の画像においても相関をとり、フレーム間で画像の動きがない場合の動画像データの冗長性も解消するようにしているので、よりデータの圧縮率を向上させることができる。
【０００６】
かかる技術における具体的な符号量削減の方法は、タイル単位で、現フレームと直前フレームの間で、低域のウエーブレット係数値を比較し、一致すると判定されると、ペイロード部から、一致した部分が削除され、ヘッダ部には、フレーム間で動きがないことを示す情報が書き込まれ、逆に、不一致と判定されると、符号はそのまま残される、というものである。
【０００７】
すなわち、フレーム内の静止画像に対して最適化された「量子化」、フレーム間のウエーブレット係数値の比較結果に基づいた「符号量削減」、という二つの処理が、各々独立に行われる。そして、唯一の「量子化テーブル」に基づいて、フレーム内の静止画像の量子化が行われる。
【０００８】
しかしながら、かかる従来技術においては、以下に述べる（１）（２）のような不具合がある。
【０００９】
（１）従来、Motion静止画像の圧縮は、フレーム内の静止画像に対して最適化された「量子化」だけに目が向けられ、動画像固有の視覚特性を考慮した「量子化」が行われていなかった。本来、フレーム間の動き量は、画像領域毎に異なるのに、従来は、動き量が全画像領域に渡って均一であると仮定していた。その結果、静止画像の表示の際には予想できなかった、全く新たな画像品質の劣化が、動画像表示の際にしばしば現れていた。
【００１０】
この点について図面を参照して説明する。図３１には、動き量を無視してサブバンド符号を一律に削減する従来の量子化方法を示している。ここで、縦軸は動き量を表し、現フレームとそれ以前のフレームとの間におけるウエーブレット係数値の差分から求められる相関係数値が使われている。また、横軸は、タイルの単位を表している。タイルは、フレーム内の原静止画像を分割して作成された矩形領域のことである。
【００１１】
動き量について見ると、例えばタイルＡは大きく、タイルＣは小さいというように、タイルごとに異なっている。しかし、従来の方法では、画像全体を一律な動き量に置換えていた。この例では、全てのタイルの動き量が「無し」として扱っている。従って、動き量の大小に関わらず、唯一の量子化テーブルが参照されていた。なお、タイルにつけられた番号、すなわちタイル＃は、後述するように図３３で定義されている。
【００１２】
（２）また、フレーム内静止画像の量子化も、圧縮率あるいはビットレートで一義的に定義された唯一の「量子化テーブル」を用いている。本来、サブバンドの符号量分布は、画像領域毎に異なるはずなのに、従来は、そうしたことを全く考慮していなかった。その結果、高域成分の多い画像、低域成分の多い画像、高域から低域まで広く分布した画像、等の原画像の性質によって、量子化された画像の画質が大きく変動していた。
【００１３】
図３２には、符号量の分布を無視して、サブバンド符号を一律に削減する従来の量子化方法を示している。縦軸は、最上位階層の(デコンポジシン・レベル数が最大の)低域サブバンド符号量を“１”に規格化したときの相対的な符号量を表している。また、横軸は、サブバンドを表している。
【００１４】
そして、タイルごとの符号量分布を示す曲線のうち、細線の部分は符号量の削除対象となるサブバンドを、太線の部分は符号量がそのまま保存されるサブバンドを、各々表している。（ａ）は、符号量の削減対象となるサブバンドの数が多い場合、（ｂ）は少ない場合で、各々、圧縮率の低い場合と高い場合に対応している。なお、タイル番号Ａ〜Ｄは、図３３で定義されているものである。
【００１５】
タイル＃＝ＡとＤは高域にまで符号データを保持している。また、タイル＃＝Ｂは中域に比較的多くの情報を、タイル＃＝Ｃは低域にほとんどの情報を各々保有している。このように、タイルごとにサブバンド符号量の異なる場合においても、従来の方法では、符号量分布に関係なく、サブバンド符号データを一律に削減していた。その結果、特に圧縮率の高い場合には、中域から高域のデータ量が相対的に多い、タイル＃＝Ａ，Ｂ，Ｄの画質は、低域にデータが集中しているタイル＃＝Ｃの画質に比較して劣化が顕著であった。
【００１６】
以上の（１）（２）を具体的に説明するために、圧縮伸長アルゴリズムにMotion JPEG2000を使った場合に生ずる現象について、以下に詳しく観察結果を述べる。
【００１７】
図３３は、動画像コンテンツから、連続する３つのフレームを抜き出したものである。この例では、動き量の少ない景色を背景にして走る自転車を表している。動き量の大きなタイル、小さなタイル、の区別が比較的しやすい画像である。今、４つのタイルに注目する。タイル＃＝Ａは中心から下に、タイル＃＝Ｂは左上に、タイル＃＝Ｃは中心から上に、タイル＃＝Ｄは右上に、各々位置している。
【００１８】
フレーム間の動き量及びサブバンドの符号量分布を考慮しない、従来の量子化によって符号化されたMotion静止画像を伸長表示させたときには、以下に述べるような現象が現れる。
【００１９】
▲１▼．タイル＃＝Ａにおいて
自転車は、撮影点から一番近い距離にあり、構造上の特徴から高域サブバンドの符号量分布が大きい。圧縮率が高くなるに従い、高域サブバンドの符号量が減らされるので、フレーム内の静止画像で、車輪のスポーク部分を一本一本判別するのが難しくなる。動く自転車のスポークの判別は、人間の持つ動体視力の限界以上なので、動画像に与える影響はそれ程大きくはない。
【００２０】
▲２▼．タイル＃＝Ｂにおいて
遠景と前景の自転車の中間に位置している樹の動き量はほとんど無い。葉の部分は、中域サブバンドの符号量分布が大きい。圧縮率が高くなるに従い、フレーム内静止画像で葉の一枚一枚を区別することが難しくなる。但し、その傾向は、タイル＃＝Ａにおける自転車のスポークほど強くはない。
【００２１】
▲３▼．タイル＃＝Ｃにおいて
遠景には、低域サブバンドの符号量分布が大きいスカイラインが配置されている。圧縮率が高くなっても、符号量削減の影響を比較的受けにくい。但し、動き量がほとんど無いので、低域サブバンドのデータが僅かでも削られると、フレーム内静止画像全体にボケが広がってしまう。動画像で見ると、背景に「モヤモヤとした歪」が生じ、非常に目障りとなる。
【００２２】
▲４▼．タイル＃＝Ｄにおいて
画像領域内には、樹の輪郭部分が多く存在する。その結果、符号量の分布において、高域側のデータ量がやや多い点がタイル＃＝Ｃとは異なる。タイル＃＝Ｃと画像を比較したとき、フレーム内静止画像にはあまり差は現れない。しかし、動画像を表示させた時には、「モヤモヤとした歪」がより顕著に現れる。これは、削られた高域成分の量が多いためであると思われる。
【００２３】
本発明の目的は、画像の動き量、サブバンド符号量分布、圧縮率の切口で画像の種類を判別して再量子化を行うことを可能として、高い画像品質を維持しつつ動画像を符号化することである。
【００２４】
本発明の別の目的は、複数パターンで再量子化を行うことを可能として、高い画像品質を維持しつつ動画像あるいは静止画像を符号化することである。
【００２５】
【課題を解決するための手段】
請求項１に記載の発明は、動画像データをフレームごとに１又は複数の矩形領域に分割し、この矩形領域ごとに画素値を周波数変換して階層的に符号化することにより作成した符号列の構文を解析する構文解析手段と、符号の取捨選択を行って前記符号列の再量子化を行う量子化手段と、前記解析結果に基づいて前記符号列について現フレームと以前のフレームとの間の動き量を求める動き量検出手段と、前記解析結果に基づいて現フレーム内のサブバンド符号量分布を求めるサブバンド符号量分布検出手段と、現フレームの圧縮率を指定する圧縮率指定手段と、前記動き量の大きさに応じて与えられた数値で前記再量子化を行う際の符号の削減方法を指定する複数のテーブルからなるテーブル群から前記テーブルを指定する際のインデックスとなる動き量インデックス値、前記サブバンド符号量分布の大きさに応じて与えられた数値で前記テーブルを指定する際のインデックスとなる符号分布インデックス値、及び、前記圧縮率の大きさに応じて与えられた数値で前記テーブルを指定する際のインデックスとなる圧縮率インデックス値に基づいて、前記再量子化に使用する前記テーブルを１又は複数選択する選択手段と、前記再量子化後の符号列から新たな符号列を作成する符号列作成手段と、を備えている符号列変換装置である。
【００２６】
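Accordingly, the type of image can be classified from three viewpoints, namely the motion amount of the image, the sub-band code amount distribution, and the compression rate, and re-quantization can be performed on that basis, so a moving image can be encoded while high image quality is maintained. In addition, because the motion amount and the sub-band code amount distribution are obtained from code amounts and quantization is performed in the coded state, the memory capacity required for processing can be reduced, the processing speed can be increased, and the power consumed by processing can be lowered.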
【００２７】
請求項２に記載の発明は、静止画像データを１又は複数の矩形領域に分割し、この矩形領域ごとに画素値を周波数変換して階層的に符号化することにより作成した符号列の構文を解析する構文解析手段と、符号の取捨選択を行って前記符号列の再量子化を行う量子化手段と、前記解析結果に基づいてサブバンド符号量分布を求めるサブバンド符号量分布検出手段と、圧縮率を指定する圧縮率指定手段と、前記サブバンド符号量分布の大きさに応じて与えられた数値で前記再量子化を行う際の符号の削減方法を指定する複数のテーブルからなるテーブル群から前記テーブルを指定する際のインデックスとなる符号分布インデックス値、及び、前記圧縮率の大きさに応じて与えられた数値で前記テーブルを指定する際のインデックスとなる圧縮率インデックス値に基づいて、前記再量子化に使用する前記テーブルを１又は複数選択する選択手段と、前記再量子化後の符号列から新たな符号列を作成する符号列作成手段と、を備えている符号列変換装置である。
【００２８】
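Accordingly, the type of image can be classified from two viewpoints, namely the sub-band code amount distribution and the compression rate, and re-quantization can be performed on that basis, so a still image can be encoded while high image quality is maintained. In addition, because the sub-band code amount distribution is obtained from code amounts and quantization is performed in the coded state, the memory capacity required for processing can be reduced, the processing speed can be increased, and the power consumed by processing can be lowered.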
【００２９】
請求項３に記載の発明は、請求項１又は２に記載の符号列変換装置において、前記選択手段は、前記動き量、前記サブバンド符号量分布及び前記圧縮率に対応している前記各インデックス値で特定される前記テーブルを指定し、また、前記各インデックス値が所定の選択条件を満たした場合には他の前記テーブルも選択し、満たさない場合には当該他のテーブルの選択は行わない。
【００３０】
したがって、複数パターンで再量子化を行うことができ、高い画像品質を維持しつつ動画像あるいは静止画像を符号化することができる。
【００３１】
請求項４に記載の発明は、請求項３に記載の符号列変換装置において、前記選択手段は、前記動き量、前記サブバンド符号量分布の低域側総符号量及び前記圧縮率のうち何れか一つが所定の閾値以上のときに前記選択条件が満たされる。
【００３２】
したがって、動き量、サブバンド符号量分布又は圧縮率がある程度大きいときに複数パターンで再量子化を行なって、高い画像品質を維持しつつ動画像あるいは静止画像を符号化することができる。
【００３３】
請求項５に記載の発明は、請求項３又は４に記載の符号列変換装置において、前記選択条件を満たすか否かにかかわらず、前記選択手段で前記他のテーブルの選択を実行することを設定する第１の設定手段を備えている。
【００３４】
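Accordingly, when necessary, re-quantization can be performed with a plurality of patterns even if the selection condition is not satisfied, so a moving image or a still image can be encoded while high image quality is maintained.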
【００３５】
請求項６に記載の発明は、請求項３〜５のいずれかの一に記載の符号列変換装置において、前記選択手段で選択する他のテーブルの数に上限値を設定する第２の設定手段を備えている。
【００３６】
したがって、無制限に複数パターンで再量子化を行うことを防止し、無駄な処理を行わないようにすることができる。
【００３７】
請求項７に記載の発明は、請求項３〜６のいずれかの一に記載の符号列変換装置において、前記選択手段は、前記他のテーブルの前記各インデックス値は、前記選択条件を満たさないときに選択される前記各インデックス値で特定されるテーブルの当該インデックス値と、その少なくとも一つが隣接する値である。
【００３８】
したがって、動き量、サブバンド符号量分布、圧縮率に対して最も適切と考えられる再量子化の他に、それに次いで適切と考えられる再量子化を行うことができる。
【００３９】
請求項８に記載の発明は、請求項３〜７のいずれかの一に記載の符号列変換装置において、前記選択手段は、前記符号列がＲＧＢ、ＹＵＶ又はＹＣｂＣｒのいずれかの色空間で構成されているときに、前記選択条件を各色空間のうち少なくとも一つに適用する。
【００４０】
したがって、少なくともひとつの色空間で選択条件を満たすときに、複数パターンで再量子化を行なって、高い画像品質を維持しつつ動画像あるいは静止画像を符号化することができる。
【００４１】
請求項９に記載の発明は、請求項３〜８のいずれかの一に記載の符号列変換装置において、前記符号列作成手段は、前記テーブルが複数選択されたときは前記再量子化手段で前記テーブルごとに得られた符号データを互いに異なるコンポーネントに配置するように前記新たな符号列の作成を行う。
【００４２】
したがって、複数パターンで再量子化した符号データを複数コンポーネントにそれぞれ有する単一の符号列を得ることができる。
【００４３】
請求項１０に記載の発明は、請求項９に記載の符号列変換装置において、前記符号列作成手段は、前記他のテーブルが選択されなかったときに、当該他のテーブルから得られた符号データを配置すべき前記コンポーネントに予め用意した符号データを配置する。
【００４４】
したがって、複数パターンで再量子化しないときは予め用意した符号データを、他のテーブルから得られた符号データを配置すべきコンポーネントに配置して、その後の新たな符号列の利用の便宜を図ることができる。
【００４５】
請求項１１に記載の発明は、請求項１０に記載の符号列変換装置において、前記符号列作成手段は、前記予め用意した符号データとして画素値が最小となる白色の画像の符号データを配置する。
【００４６】
したがって、複数パターンで再量子化しないときは画素値が最小となる白色の画像の符号データを、他のテーブルから得られた符号データを配置すべきコンポーネントに配置して、その後の新たな符号列の利用の便宜を図ることができる。
【００４７】
請求項１２に記載の発明は、請求項１〜１１のいずれかの一に記載の符号列変換装置において、前記各インデックス値は、前記動き量、前記サブバンド符号量分布及び前記圧縮率の各値に応じてそれぞれ昇順又は降順に付されている。
【００４８】
したがって、画像の動き量、サブバンド符号量分布及び圧縮率の各値に応じてそれぞれ昇順又は降順に付されているインデックス値により的確に必要なテーブルを選択し、複数パターンで再量子化を行なって、高い画像品質を維持しつつ動画像あるいは静止画像を符号化することができる。
【００４９】
請求項１３に記載の発明は、請求項１〜１２のいずれかの一に記載の符号列変換装置において、前記サブバンド符号量分布検出手段は、サブバンドごとの直交変換係数値の符号量和及びある帯域に含まれるサブバンドの直交変換係数値の総符号量和を求めるものである。
【００５０】
したがって、サブバンド符号量分布を適切に検出することができる。
【００５１】
なお、前記構文解析手段は、前記符号列として前記直交変換に離散コサイン変換、または離散ウエーブレット変換を用いているものを対象とすることができる（請求項１４，１５）。
【００５２】
請求項１６に記載の発明は、請求項１〜１５のいずれかの一に記載の符号列変換装置と、この符号列変換装置で作成した新たな符号列を伸長する画像伸長装置と、この伸長後の画像データに基づいて画像を表示する表示装置と、を備えている画像編集システムである。
【００５３】
したがって、請求項１〜１５のいずれかの一に記載の発明と同様の作用、効果を奏する。
【００５４】
請求項１７に記載の発明は、請求項９に記載の符号列変換装置である第１の符号列変換装置と、この符号列変換装置で作成した新たな符号列を伸長する画像伸長装置と、この伸長後の画像データに基づいて画像を表示する表示装置と、前記符号列変換装置で作成した新たな符号列に含まれる前記複数のコンポーネントのうち所望のもののみを含む符号列を新たに作成する第２の符号列変換装置と、を備えている画像編集システムである。
【００５５】
したがって、複数パターンで再量子化を行った各符号データを収容した複数のコンポーネントの画質を互いに比較して、最適な画質であると判断したコンポーネントのデータのみを残した符号列を作成することができる。
【００５６】
請求項１８に記載の発明は、請求項１７に記載の画像編集システムにおいて、前記第１の符号列変換装置で作成した新たな符号列に含まれる各コンポーネントの画像データから歪量を測定する歪量測定装置を備え、前記第２の符号列変換装置は、この測定の結果に基づいて前記新たな符号列の作成を行う。
【００５７】
したがって、画像データの歪量に基づいて最適な画質のコンポーネントを自動で選択して、そのコンポーネントのみを残した符号列を作成することができる。
【００５８】
請求項１９に記載の発明は、画像を静止画像として撮像する画像入力装置と、この撮影された画像データを１又は複数の矩形領域に分割し、この矩形領域ごとに画素値を周波数変換し、階層的に圧縮符号化する画像圧縮装置と、この圧縮符号化後の符号列を処理する請求項１〜１５のいずれかの一に記載の前記画像処理装置と、を備えているカメラシステムである。
【００５９】
したがって、請求項１〜１５のいずれかの一に記載の発明と同様の作用、効果を奏する。
【００６０】
請求項２０に記載の発明は、請求項１〜１５のいずれかの一に記載の発明の前記各手段の機能をコンピュータに実行させるコンピュータに読み取り可能なプログラムである。
【００６１】
したがって、請求項１〜１５のいずれかの一に記載の発明と同様の作用、効果を奏する。
【００６２】
【発明の実施の形態】
［前提技術の概要］
まず、本実施の形態の前提技術となる「直交変換係数の配置」、「階層符号化アルゴリズム」、及び、「JPEG2000アルゴリズム」の概要について、各々順に説明する。
【００６３】
（１）「直交変換係数の配置」について
図１は二次元ＤＣＴ（離散コサイン変換）を使った場合の基底ベクトルを、図２は二次元ＤＷＴ（離散ウェーブレット変換）を使った場合のオクターブ分割を、各々示している。これらの直交変換を採用した、代表的な符号化アルゴリズムとしては、各々、JPEGアルゴリズムとJPEG2000アルゴリズムが知られている。
【００６４】
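As shown conceptually in FIG. 3, a code sequence encoded with the JPEG2000 algorithm begins with SOC (Start Of Codestream) and ends with EOC (End Of Codestream). Within the code sequence, the orthogonal transform coefficient values are placed in the payload part 211 and a summary of that information is placed in the header part 212. The header part 212 in the code sequence is detected by the syntax analysis means 213, and the packet length (the sub-band code amount of the payload part) is read by the packet length reading means 214.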
【００６５】
（２）「階層符号化アルゴリズム」及び「JPEG2000アルゴリズム」について
図４は、JPEG2000の基本となる階層符号化アルゴリズムを説明するためのブロック図である。この階層符号化アルゴリズムは、２次元ウェーブレット変換・逆変換部２０２、量子化・逆量子化部２０３、エントロピー符号化・復号化部２０４、タグ処理部２０５で構成されている。JPEGアルゴリズムと比較して、最も大きく異なる点の一つは変換方法である。JPEGでは離散コサイン変換（ＤＣＴ）を、階層符号化圧縮伸長アルゴリズムでは離散ウェーブレット変換（ＤＷＴ）を、各々用いている。ＤＷＴはＤＣＴに比べて、高圧縮領域における画質が良いという長所が、JPEGの後継アルゴリズムであるJPEG2000で採用された大きな理由の一つとなっている。また、他の大きな相違点は、後者では、最終段に符号形成をおこなうために、タグ処理部２０５と呼ばれる機能ブロックが追加されていることである。この部分で、圧縮動作時には圧縮データが符号列として生成され、伸長動作時には伸長に必要な符号列の解釈が行われる。そして、符号列によって、JPEG2000は様々な便利な機能を実現できるようになった。
【００６６】
例えば、図５はデコンポジションレベル数が３の場合の、各デコンポジションレベルにおけるサブバンドを示す図であるが、図２に示したブロックベースでのＤＷＴにおけるオクターブ分割の階層に対応した任意の階層で、静止画像の圧縮伸長処理を停止させることができる。なお、ここでの「デコンポジション」に関し、JPEG2000 PartI FDIS（Final Draft international Standard）には、以下のように定義されている。
decomposition level：
A collection of wavelet subbands where each coefficient has the same spatial impact or span with respect to the source component samples. These include the HL,LH,and HH subbands of the same two dimensional subband decomposition. For the last decomposition level the LL subband is also included.
なお、図４において、原画像の入出力部分には、色空間変換部２０１が接続されることが多い。例えば、原色系のＲ（赤）／Ｇ（緑）／Ｂ（青）の各コンポーネントからなるＲＧＢ表色系や、補色系のＹ（黄）／Ｍ（マゼンタ）／Ｃ（シアン）の各コンポーネントからなるＹＭＣ表色系から、ＹＵＶ或いはＹＣｂＣｒ表色系への変換又は逆の変換を行なう部分がこれに相当する。
【００６７】
以下、JPEG2000アルゴリズムについて、詳細に説明する。
【００６８】
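FIG. 6 shows an example of the components of a color image divided into tiles. In general, as shown in FIG. 6, each component 207_R, 207_G, 207_B of the original color image (here, the RGB primary color system) is divided into rectangular regions (tiles) 207_Rt, 207_Gt, 207_Bt. The individual tiles, for example R00, R01, ..., R15 / G00, G01, ..., G15 / B00, B01, ..., B15, are the basic units on which the compression and decompression process is executed. The compression and decompression operations are therefore performed independently for each component and for each tile.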
【００６９】
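During encoding, the data of each tile of each component is input to the color space conversion unit 201 in FIG. 4 and subjected to color space conversion, after which the two-dimensional wavelet transform unit 202 applies a two-dimensional wavelet transform (forward transform) to divide the data spatially into frequency bands.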
【００７０】
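FIG. 5 illustrates the octave division of the two-dimensional discrete wavelet transform. The larger the decomposition level number, the higher the hierarchical level. The gray portions are the sub-bands to be encoded at each hierarchical level. The example of FIG. 5 shows the sub-bands at each decomposition level when the number of decomposition levels is 3. That is, the two-dimensional wavelet transform is applied to the tile original image (0LL) obtained by tiling the original image (decomposition level 0 (206_0)), separating it into the sub-bands (1LL, 1HL, 1LH, 1HH) shown at decomposition level 1 (206_1). The transform is then applied to the low-frequency component 1LL of this level, separating it into the sub-bands (2LL, 2HL, 2LH, 2HH) shown at decomposition level 2 (206_2). In the same way, the transform is applied to the low-frequency component 2LL, separating it into the sub-bands (3LL, 3HL, 3LH, 3HH) shown at decomposition level 3 (206_3).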
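The octave division described above leaves one LL band at the deepest level plus the HL, LH, and HH bands of every level. The following is a minimal sketch that only enumerates those sub-band names; the function name `octave_subbands` is an illustrative assumption, and no transform is actually computed.

```python
def octave_subbands(levels):
    """List the sub-bands that remain after `levels` rounds of octave division."""
    if levels < 1:
        raise ValueError("levels must be at least 1")
    # Only the LL band of the deepest level survives, because each
    # shallower LL band is itself split again by the next round.
    bands = [f"{levels}LL"]
    # Every level keeps its three high-frequency bands.
    for level in range(levels, 0, -1):
        bands += [f"{level}HL", f"{level}LH", f"{level}HH"]
    return bands


print(octave_subbands(3))
```

For three decomposition levels this yields ten names, 3LL through 1HH, matching the bands named for FIG. 5.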
【００７１】
更に図５では、各デコンポジションレベルにおいて符号化の対象となるサブバンドを、グレイで表してある。例えば、デコンポジションレベル数を３とした時、グレイで示したサブバンド（３ＨＬ，３ＬＨ，３ＨＨ，２ＨＬ，２ＬＨ，２ＨＨ，１ＨＬ，１ＬＨ，１ＨＨ）が符号化対象となり、３ＬＬサブバンドは符号化されない。
【００７２】
次いで、指定した符号化の順番で符号化の対象となるビットが定められ、図４の量子化部２０３で対象ビット周辺のビットからコンテキストが生成される。
【００７３】
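FIG. 7 illustrates the relationship between precincts and code blocks. The quantized wavelet coefficients are divided, sub-band by sub-band, into non-overlapping rectangles called "precincts". Precincts were introduced so that implementations can use memory efficiently. As shown in FIG. 7, one precinct, for example precinct 208_p4, consists of three spatially coincident rectangular regions. The same applies to precinct 208_p6. Here the original image is divided at decomposition level 1 into four tiles 208_t0, 208_t1, 208_t2, and 208_t3. Each precinct is further divided into non-overlapping rectangular "code blocks" (for precinct 208_p4, code blocks 208_4b0, 208_4b1, ...). The code block is the basic unit of entropy coding. To raise coding efficiency, the coefficient values may be decomposed into bit planes, the bit planes ordered for each pixel or code block, and layers formed from one or more bit planes. That is, layers are formed from the bit planes of the coefficient values according to their significance, and encoding is performed layer by layer. In some cases only the most significant layer (MSB) and a few layers below it are encoded, and the remaining layers, including the least significant layer (LSB), are truncated.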
【００７４】
エントロピー符号化部２０４では、コンテキストと対象ビットから確率推定によって、各コンポーネントのタイルに対する符号化を行なう。こうして、原画像の全てのコンポーネントについて、タイル単位で符号化処理が行われる。
【００７５】
最後にタグ処理部２０５は、エントロピコーダ部からの全符号化データを１本の符号列に結合するとともに、それにタグを付加する処理を行なう。図８には、符号列の構造を簡単に示している。符号列の先頭と各タイルを構成する部分タイルの先頭にはヘッダ（それぞれ、メインヘッダ２０９_ｈ及びタイル部ヘッダ２０９_ｔｈ）と呼ばれるタグ情報が付加され、その後に、各タイルの符号化データ（ビットストリーム２０９_ｂ）が続く。そして、符号列の終端には、再びタグ（ＥＯＣタグ２０９_ｅ）が置かれる。
【００７６】
更に、図９は、符号化されたウェーブレット係数値の収容されたパケットを、サブバンドごとに表わしたときの符号列の構造を示すものである。タイルによる分割処理を行っても、あるいは行なわなくても、同様のパケット列構造を持っている。
【００７７】
一方、復号化時には、符号化時とは逆に、各コンポーネントの各タイルの符号列から画像データを生成する。図４を用いて簡単に説明する。この場合、タグ処理部２０５は、外部より入力した符号列に付加されたタグ情報を解釈し、符号列を各コンポーネントの各タイルの符号列に分解し、その各コンポーネントの各タイルの符号列毎に復号化処理が行われる。符号列内のタグ情報に基づく順番で復号化の対象となるビットの位置が定められるとともに、逆量子化部２０３で、その対象ビット位置の周辺ビット（既に復号化を終えている）の並びからコンテキストが生成される。エントロピー復号化部２０４で、このコンテキストと符号列から確率推定によって復号化を行い対象ビットを生成し、それを対象ビットの位置に書き込む。
【００７８】
このようにして復号化されたデータは周波数帯域毎に空間分割されているため、これを２次元ウェーブレット逆変換部２０２で２次元ウェーブレット逆変換を行なうことにより、画像データの各コンポーネントの各タイルが復元される。復元されたデータは色空間逆変換部２０１によって元の表色系のデータに変換される。
【００７９】
以上が、「JPEG2000」アルゴリズムの概要であり、静止画像、すなわち単フレームに対する方式を複数フレームに拡張したものが、「Motion JPEG2000」アルゴリズムである。
【００８０】
［発明の実施の形態１］
以下、本発明の一実施の形態について詳細に説明する。
【００８１】
なお、ここでは、Motion JPEG2000を代表とする動画像圧縮伸長方法を中心に説明するが、言うまでもなく、本発明は、以下の説明の内容に限定されるものではない。
【００８２】
図１０は、本実施の形態である符号列変換装置１のハードウエア構成を示すブロック図である。符号列変換装置１は、各種演算を行い、符号列変換装置１の各部を集中的に制御するＭＰＵ２と、各種のＲＯＭ、ＲＡＭなどからなるメモリ３とが、バス４で接続されている。メモリ３（のＲＯＭ）にはプログラムが記憶されている。このプログラムはＭＰＵ２が実行する。
【００８３】
また、バス４には、所定の処理を実行するＡＳＩＣ（又はＦＰＧＡ）５と、ネットワークインターフェイス６とが接続されている。ネットワークインターフェイス６は、ＬＡＮ、ＷＡＮ、インターネットなどのネットワーク７と符号列変換装置１とを接続するインターフェイスとなる。
【００８４】
さらに、バス４にはＲＣＬ（Reconfigurable Logic，リコンフィギャラブル・ロジック）８が接続されている。ＲＣＬ８は、符号列変換装置１で実行するアルゴリズムの実行部の上位階層、すなわち、高速処理が要求されるプロセスを担当する。高速処理の内容が固定している場合には、その機能を、ＡＳＩＣ（又はＦＰＧＡ）５のようなハード・ワイヤード・ロジックに任せることができる。その場合、ＲＣＬ８は、頻度が低いが、高速処理が必要とされる特殊な処理にのみ対応すればよい。
【００８５】
図１１、図１２は、所定のプログラムに基づいて図１０を参照して説明したハードウエアが行う処理の機能ブロック図である。入力符号列が動画像の場合を図１１に、また、符号列が静止画像の場合を図１２に、各々示した。ＲＣＬ８が行なう処理により実現すべきブロックには、※印を付して示してある。
【００８６】
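As shown in FIG. 11, the code sequence conversion device 1 includes syntax analysis means 12, which analyzes the syntax of a code sequence created, for example with the Motion JPEG2000 algorithm, by dividing moving image data into one or more rectangular regions (tiles) per frame, orthogonally transforming the pixel values of each rectangular region with a discrete cosine transform, a discrete wavelet transform, or the like, and encoding them hierarchically. From the syntax analyzed by the syntax analysis means 12, packet length detection means 21 detects the packet length of each packet in the code sequence. Motion amount detection means 22 detects the amount of motion between the current frame and a previous frame. It includes packet length storage means 23, which stores the packet lengths detected by the packet length detection means 21, and difference detection means 24, which obtains the difference in packet length between the current frame and the previous frame. In this example, the motion amount detection means 22 uses, as the amount of motion between the current frame and the previous frame, a correlation coefficient value obtained from the difference in packet length between the two frames. A large correlation coefficient value means that the images of the two frames are similar, and therefore corresponds to a small amount of motion.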
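The text does not give the formula for the correlation coefficient value, so the following is only a sketch under an assumed definition: one minus the summed absolute packet-length difference, normalized by the summed larger length. The function name and the normalization are hypothetical; the only property taken from the description is that a larger value means more similar frames and therefore less motion.

```python
def packet_length_correlation(prev_lengths, curr_lengths):
    """Hypothetical similarity in [0, 1] between two frames of one tile.

    1.0 means every packet has the same length in both frames (little
    motion); values near 0.0 mean the lengths changed a lot (much motion).
    """
    if len(prev_lengths) != len(curr_lengths):
        raise ValueError("both frames must list the same number of packets")
    total = sum(max(p, c) for p, c in zip(prev_lengths, curr_lengths))
    if total == 0:
        return 1.0  # two empty tiles are treated as identical
    diff = sum(abs(p - c) for p, c in zip(prev_lengths, curr_lengths))
    return 1.0 - diff / total
```

Because only packet lengths are compared, nothing has to be decoded, which is the point of working in the coded state.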
【００８７】
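Sub-band code amount distribution detection means 25 detects the sub-band code amount distribution within the current frame. The method described here detects the distribution from packet lengths. The sub-band code amount of the payload part, that is, the packet length, is written in the header part. The syntax analysis means 12 reads that information, the packet length detection means 21 detects the packet lengths of the frame, and the sub-band code amount distribution is detected from those packet lengths.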
【００８８】
Compression rate designation means 26 designates, in the code sequence conversion device 1, the image compression rate selected by the user.
【００８９】
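The quantization table group 27 is a matrix of tables (quantization tables) indexed by the index values for "motion amount", "sub-band code amount distribution", and "compression rate" (the motion amount index value (i), the code amount distribution index value (j), and the compression rate index value (k), described later). Each table registered in the quantization table group 27 tells the packet switch 30, described later, how to perform its packet operations, and thereby specifies the concrete method of code reduction.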
【００９０】
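Quantization table selection means 28 selects an appropriate quantization table (which becomes the selected quantization table 31) from the quantization table group 27, using the index values i, j, and k as indices. The index values are based on the amount of motion between the current frame and the previous frame detected by the motion amount detection means 22, the sub-band code amount distribution within the current frame detected by the sub-band code amount distribution detection means 25, and the user-selected image compression rate designated by the compression rate designation means 26.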
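The selection step amounts to a lookup in a matrix keyed by (i, j, k). The sketch below models the table group as a dictionary; the ranges i = 0..9 and j = L, M, H follow the examples given elsewhere in the description, while the names `build_table_group` and `select_table` and the string placeholders standing in for real tables are assumptions made for illustration.

```python
from itertools import product

MOTION_INDICES = range(10)               # i = 0..9
DISTRIBUTION_INDICES = ("L", "M", "H")   # j


def build_table_group(n):
    """Build a placeholder table for every (i, j, k), with k = 0..n."""
    return {
        (i, j, k): f"table(i={i}, j={j}, k={k})"
        for i, j, k in product(MOTION_INDICES, DISTRIBUTION_INDICES, range(n + 1))
    }


def select_table(group, i, j, k):
    """Look up the main quantization table for one tile."""
    return group[(i, j, k)]
```

With this layout the group holds 30 tables per compression rate index, (n + 1) x 30 in total.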
【００９１】
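The quantization means 29 includes a packet switch 30 that re-quantizes the code sequence. The packet switch 30 selects which of the codes making up the code sequence to keep and which to discard. Specifically, it does so by operating on the packets that make up the code sequence. As described above, this is realized by functions executed by the RCL 8. The packets are kept or discarded according to the selected quantization table 31. In accordance with the selection made by the quantization means 29, the code sequence creation means 13 rewrites the header data of the code sequence to produce the new code sequence.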
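Re-quantization here means dropping whole packets and rewriting the header, not touching coefficient values. A minimal sketch follows, assuming a quantization table can be represented as the set of sub-bands whose packets are kept; the `Packet` type, the `packet_switch` name, and the header layout as (sub-band, length) pairs are all hypothetical simplifications of a real codestream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    subband: str    # e.g. "2HL"
    payload: bytes  # entropy-coded data, left untouched


def packet_switch(packets, kept_subbands):
    """Keep only packets of the listed sub-bands and rebuild the header.

    Returns (header, kept_packets); the header lists each kept packet's
    sub-band and length, standing in for the rewritten header data.
    """
    kept = [p for p in packets if p.subband in kept_subbands]
    header = [(p.subband, len(p.payload)) for p in kept]
    return header, kept
```

The payload bytes pass through unchanged, so no entropy decoding or inverse transform is needed for the conversion.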
【００９２】
図１２に示すように、静止画像を対象とする符号列変換装置１は、構文解析手段１２は、JPEG2000アルゴリズムなどにより、静止画像データを１又は複数の矩形領域（タイル）に分割し、この矩形領域ごとに画素値を離散コサイン変換、離散ウェーブレット変換などで直交変換して階層的に符号化することにより作成した符号列の構文を解析する。図１１の場合には入力された符号列が動画像の場合に必要だった動き量検出手段２２は、静止画像の場合には不要になる。また、処理速度に対する制約も緩くなるのが一般的である。そこで、こうして生まれたリソースの余裕を、リソースの不足している他の機能モジュールに振り分け、あるいは、デバイスを休止状態にさせて、消費電力の削減を図ることもできる。
【００９３】
以下では、符号列変換装置１が実行する具体的な処理の内容について説明する。
【００９４】
図１３を参照して、符号列変換装置１について、動き量検出手段２２で検出した動画像の「動き量」を量子化手段２９で行う量子化に反映させる例を説明する。図１３において、縦軸は画像の動き量を表し、現フレームとそれ以前のフレームとの間におけるウエーブレット係数値の差分から求められる相関係数値が使われている。前述のように相関係数値が大きいことは、フレーム間の画像が似ていることを意味するので、動き量は小さいことに対応する。また、横軸はタイルの単位を表している。なお、ここでも、前述の図３１と図３３で表される動画像と矩形タイルを例に説明する。
【００９５】
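A plurality of thresholds are set for the correlation coefficient value, and a "motion amount index value" is assigned, in descending or ascending order, to each range between thresholds. The quantization table corresponding to each motion amount index value specifies the sub-bands whose coded data is reduced and the sub-bands whose coded data is preserved.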
【００９６】
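Of the index values, the "compression rate index value" is used to set the code amount roughly, in a semi-quantitative way. The user first specifies this compression rate index value. This example shows the case where "k = p" is chosen as the index value. Note that "k = 0" represents reversible encoding (lossless compression).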
【００９７】
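The motion amount index value is determined by the amount of motion detected by the motion amount detection means 22. In the example of FIG. 13, tile #A, which has the largest amount of motion, is given the motion amount index value "i = 9", and tile #C, which has the smallest amount of motion, is given "i = 1". The quantization table corresponding to each index value is then selected as the quantization table 31, and the quantization means 29 performs quantization.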
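Mapping a correlation coefficient value to a motion amount index value is a threshold lookup. The sketch below assumes nine evenly spaced thresholds on a 0 to 1 scale, which the text does not specify; only the direction is taken from the description (high correlation means little motion and a low index, low correlation means much motion and a high index).

```python
from bisect import bisect_right

# Hypothetical thresholds; the description only says that several exist.
THRESHOLDS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]


def motion_index(correlation):
    """Return i in 0..9: 9 for the least similar frames, 0 for the most similar."""
    # bisect_right counts how many thresholds lie at or below the value.
    return 9 - bisect_right(THRESHOLDS, correlation)
```

A tile with strong motion, such as tile #A in FIG. 13, would land in the top bin under these assumed thresholds.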
【００９８】
ここまでは、唯一の量子化テーブル（主量子化テーブル）を量子化テーブル選択手段２８で選択する場合について説明した。これだけでも、従来の目障りであった画像の背景に現れる「モヤモヤとした歪」を、大幅に低減させることができる。
【００９９】
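To widen the margin for image quality improvement further, an example is now described in which quantization tables other than the main quantization table (sub quantization tables) are also selected from the quantization table group 27 and used for processing. Suppose that the rectangular region with the largest amount of motion, tile #A, satisfies the selection condition for sub quantization tables. Then, in addition to the motion amount index value "i = 9" of the main quantization table, tile #A is assigned "i = 8" and "i = 7" as motion amount index values for sub quantization tables (the selection of index values for sub quantization tables is described later).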
【０１００】
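As a result, sub quantization tables are selected by their motion amount index values separately from the main quantization table, and the input code sequence is re-quantized with these tables. The code sequences newly generated in this way are placed in sub components, separately from the main component. Comparing the code sequences placed in the main component and the sub components and selecting among them achieves a further improvement in image quality (placement in the main and sub components is described later).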
【０１０１】
図１４には、「サブバンド符号量分布」を量子化手段２９における量子化に反映させた例を示す。ここでサブバンド符号量分布とは、サブバンド符号量分布検出手段２５が算出した、サブバンドごとの直交変換係数値の和のことである。縦軸は、最上位階層の（デコンポジシン・レベル数が最大の）低域サブバンド符号量を“１”に規格化したときの相対的な符号量を表している。また、横軸は、サブバンドを表している。
【０１０２】
タイルごとのサブバンド符号量分布を示す曲線のうち、細線の部分は、符号データの削除対象となるサブバンドを、一方、太線の部分は、符号データがそのまま保存されるサブバンドを、各々表している。（ａ）は、符号データの削減対象となるサブバンドの数が多い場合、（ｂ）は、少ない場合で、各々、圧縮率の低い場合と高い場合に対応している。タイル番号＃＝Ａ〜Ｄは、図３３で定義されているものである。
【０１０３】
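Tiles #A and #D hold code data up to the high-frequency range. Tile #B holds a relatively large amount of information in the mid-frequency range, and tile #C holds almost all of its information in the low-frequency range. Because the distribution of coded data among the sub-bands differs from tile to tile in this way, the quantization process here assigns one of three index values (sub-band code amount distribution index values), in descending or ascending order, according to the sub-band code amount distribution. A corresponding quantization table is prepared for each index value.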
【０１０４】
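That is, quantization table "j = H" is assigned to tiles #A and #D, whose code data is distributed heavily in the high-frequency range, quantization table "j = M" to tile #B, whose code data is distributed heavily in the mid-frequency range, and quantization table "j = L" to tile #C, whose code data is distributed heavily in the low-frequency range.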
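The description classifies tiles into H, M, and L but does not state the decision rule. The following sketch assumes one plausible rule: a tile is H if the high band carries at least a given share of the total code amount, M if the mid band does, and L otherwise. The function name, the three-band grouping, and the 20% share are all hypothetical.

```python
def distribution_index(low, mid, high, share=0.2):
    """Classify a tile as "L", "M", or "H" from its per-band code amounts.

    `low`, `mid`, `high` are total code amounts (e.g. summed packet
    lengths) of the sub-bands grouped into each band.
    """
    total = low + mid + high
    if total == 0:
        return "L"  # an empty tile has nothing in the upper bands
    if high / total >= share:
        return "H"
    if mid / total >= share:
        return "M"
    return "L"
```

Under this rule a tile like #C, with almost everything in the low band, maps to L, and a tile like #A, with data reaching into the high band, maps to H.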
【０１０５】
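As a result, the image quality degradation of tiles #A, #B, and #D that conventionally occurred, especially at high compression rates, is greatly reduced, and image quality equivalent to that of tile #C can be maintained.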
【０１０６】
また、図１５は、前述のサブバンド符号量分布に応じて最適化された「量子化」を説明する別の説明図である。ここで、縦軸には全サブバンドの持つ符号量の和を、横軸にはタイル＃を表している。画像情報が低域サブバンドに集まっているタイルは全サブバンドの符号量和は小さく、反対に、高域に画像情報が集中しているタイルは符号量和が大きい。各タイルは、サブバンド符号量分布（ここではサブバンド符号量分布インデックス値が“ｊ＝Ｈ，Ｍ，Ｌ”）に対応して設けられた最低符号量和まで、符号量が削減される。
【０１０７】
量子化テーブルでは、量子化前の符号量に応じて符号量を削減する割合が決められているので、量子化後の符号データを復号して画像として表示させたとき、高域から低域の全ての帯域に渡って画質の劣化はあまり目立たない。なぜなら、従来のように、量子化前の符号量を考慮せず、一律に符号量を削減することがないからである。
【０１０８】
ここまでは、副量子化テーブルの選択条件を考慮しない場合について説明した。以下、更に、高画質化のために、副量子化テーブルを選択した場合について説明する。
【０１０９】
今、タイル＃＝Ｄは、主インデックス値として“ｊ＝Ｈ”が与えられている。しかし、符号量分布をみると、低域サブバンドへの符号量の集中度が高いことがわかる。つまり、インデックス値として、“ｊ＝Ｈ”の他に、“ｊ＝Ｍ”や“ｊ＝Ｌ”についても、検討する余地があると考えられる。そのためには、副量子化テーブルの選択条件を、こうした場合には、主量子化テーブルの他に、副量子化テーブルをも考慮するように設定しておけばよい。
【０１１０】
そうすれば、タイル＃＝Ｄに対して、副インデックス値としては、“ｊ＝Ｍ”や“ｊ＝Ｌ”を割り与えることができる。図１６には、低圧縮率のときの、主量子化と副量子化の様子を示した。
【０１１１】
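FIG. 17(a) shows a quantization table group 27 that takes the "motion amount" into account. This quantization table group 27 is a two-dimensional matrix indexed by the "motion amount index value (i = 0, 1, ..., 9)", assigned in descending or ascending order by classifying the motion amount against preset thresholds, and the "compression rate index value (k = 0, 1, ..., n)" specified by the user. In this example, the motion amount has ten index values (i = 0 to 9), and the compression rate has index values from lossless (k = 0) to the highest compression rate (k = n).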
【０１１２】
また、図１７（ｂ）には、比較のために従来の量子化テーブル群２７ａを示した。これは、動き量が無い場合、すなわち、動き量インデックス値“ｉ＝０”の場合に相当している。
【０１１３】
本例の量子化テーブル群２７を用いた符号列変換装置１では、各タイルの特徴を分類して得た各インデックス値に従って、量子化テーブル群２７を参照し、最適な量子化テーブルを選択する。圧縮率のインデックス値として“ｋ＝ｐ”を選んだとき、タイル＃＝Ａのインデックス値は“（ｉ，ｋ）＝（９，ｐ）”、また、タイル＃＝Ｂのそれは“（ｉ，ｋ）＝（２，ｐ）”となる。このインデックス値に対応する量子化テーブルを使って、実際の量子化は行われる。
【０１１４】
各圧縮率で必要な量子化テーブルの数は、従来の１個から１０個に増える。また、全量子化テーブル数は、ｎ＋１個から（ｎ＋１）×１０個に増え、様々な画像にきめ細かな対応ができるようになる。
【０１１５】
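The code data of each tile is normally re-quantized with the main quantization table corresponding to a single set of index values. When the selection condition for sub quantization tables is met, sub index values are assigned in addition to the main index values. In that case, the code data of each tile is re-quantized in several ways, using the quantization tables corresponding to the several sets of index values. For each sub quantization table assigned here, at least one index value should preferably be adjacent to the corresponding index value of the main quantization table (the other index values are then the same as those of the main quantization table). For example, if tile #A, whose main quantization index values are "(i, k) = (9, p)", has two sets of sub quantization index values, they are "(i, k) = (8, p), (7, p)".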
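Choosing sub index values near the main one can be sketched as a nearest-neighbor walk along the motion axis. The order in which neighbors are tried (one below, one above, then two below, and so on) is an assumption; the text only asks that at least one value be adjacent to the main index. The function name `sub_motion_indices` is hypothetical.

```python
def sub_motion_indices(main_i, count, lowest=0, highest=9):
    """Return up to `count` motion index values nearest to `main_i`."""
    result = []
    distance = 1
    while len(result) < count and distance <= highest - lowest:
        # Try the neighbor below first, then the one above.
        for candidate in (main_i - distance, main_i + distance):
            if lowest <= candidate <= highest and len(result) < count:
                result.append(candidate)
        distance += 1
    return result


# Tile #A from the example: main (i, k) = (9, p) with two sub tables.
sub_pairs = [(i, "p") for i in sub_motion_indices(9, 2)]
```

For the main index 9 the walk can only move downward, which reproduces the (8, p), (7, p) pair of the example.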
【０１１６】
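FIG. 18(a) shows a quantization table group 27 that takes the "sub-band code amount distribution" into account. The sub-band code amount distribution is classified into three patterns according to the characteristics of each tile. As with the motion amount, index values are assigned; these are called "code amount distribution index values (j = L, M, H)". The user can specify the compression rate with the "compression rate index value (k = 0, 1, ..., n)". The figure shows a two-dimensional matrix formed from these two kinds of index values.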
【０１１７】
また、図１８（ｂ）には、比較のために従来の量子化テーブル群２７ａを示しておいた。これは、符号量分布を一定とした場合、例えば、符号量分布インデックス値＝Ｍの場合に相当している。
【０１１８】
本例の量子化テーブル群２７を用いた符号列変換装置１では、各タイルの特徴を分類して得た各インデックス値に従って、量子化テーブル群２７を参照し、最適な量子化テーブルを選択する。圧縮率インデックス値として“ｋ＝ｐ”を選んだとき、タイル＃＝Ａのインデックス値は“（ｊ，ｋ）＝（Ｈ，ｐ）”、また、タイル＃＝Ｂのそれは“（ｊ，ｋ）＝（Ｍ，ｐ）”となる。このインデックス値に対応する量子化テーブルを使って、実際の量子化は行われる。
【０１１９】
そして、各圧縮率で必要な量子化テーブルの数は、従来の１個から３個に増える。また、全量子化テーブル数は、ｎ＋１個から（ｎ＋１）×３個に増えるものの、より高い柔軟性をもった画像処理が可能となる。なお、副量子化テーブルを使った例は、図１６を参照して既に説明した。
【０１２０】
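FIG. 19(a) shows a quantization table group 27 for moving images that takes both the "motion amount" and the "sub-band code amount distribution" into account. This quantization table group 27 is a three-dimensional matrix (i, j, k) indexed by the "motion amount index value (i = 0, 1, ..., 9)", the "code amount distribution index value (j = L, M, H)", and the "compression rate index value (k = 0, 1, ..., n)". In this example, the index values are assigned in ascending or descending order to the ranges of the "motion amount", "sub-band code amount distribution", and "compression rate" values. FIG. 19(b) shows the conventional quantization table group 27a for comparison.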
【０１２１】
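In the code sequence conversion device 1 using the quantization table group 27 of this example, the quantization table group 27 is referenced according to the index values (i, j, k) obtained by classifying the characteristics of each tile, and the optimum quantization table is selected. When "k = p" is chosen as the compression rate index value, the index values of tile #A are "(i, j, k) = (9, H, p)" and those of tile #B are "(i, j, k) = (2, M, p)". The actual quantization is performed with the quantization table corresponding to these index values.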
【０１２２】
そして、各圧縮率で必要な量子化テーブルの数は、従来の１個から３０個に増える。また、全量子化テーブル数は、ｎ＋１個から（ｎ＋１）×３０個に増える。
【０１２３】
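FIG. 20(a) shows a quantization table group 27 that takes only the "sub-band code amount distribution" into account. It is intended for still images. This quantization table group 27 is a two-dimensional matrix (j, k) formed from the "code amount distribution index value (j = L, M, H)" and the "compression rate index value (k = 0, 1, ..., n)". FIG. 20(b) shows a conventional quantization table group as a comparative example.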
【０１２４】
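FIG. 21 illustrates the selection condition for sub quantization tables. In this example, index values are assigned in ascending order to the values of the "motion amount", the "total sub-band code amount on the low-frequency side", and the "compression rate". When the index values representing the magnitudes of these values are equal to or greater than preset thresholds (the condition may be that any one of the index values is equal to or greater than its threshold), sub quantization tables are selected in addition to the main quantization table described above (the sub quantization table group consisting of a plurality of sub quantization tables is denoted by reference numeral 27b).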
【０１２５】
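For example, when the main index values are "(i, j, k) = (7, 2, n-2)", the selection condition for sub quantization tables is satisfied, so one or more neighboring sets of index values are newly selected. When the main index values are "(i, j, k) = (0, 3, 1)", on the other hand, no sub quantization table is selected. In this example, when the code amount index value among the main index values is "j = 0", no sub quantization table is selected whatever the index values of the motion amount and the compression rate. That is, for an image whose sub-band code amount distribution is skewed toward the high-frequency range, re-quantization with the main quantization table alone is regarded as giving sufficient image quality.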
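The selection condition can be written as a small predicate. The threshold values below are hypothetical, chosen only so that the two worked examples come out as described (with n assumed to be 10); the `force` flag stands in for the first setting means, which lets the user request sub tables regardless of the condition.

```python
def wants_sub_tables(i, j, k, thresholds=(5, 4, 6), force=False):
    """Decide whether sub quantization tables are selected for a tile.

    `thresholds` holds assumed per-index thresholds for (i, j, k); the
    description states that thresholds exist but not their values.
    """
    if force:
        return True   # user override, condition ignored
    if j == 0:
        return False  # distribution skewed to high frequencies: main table only
    ti, tj, tk = thresholds
    # "Any one index at or above its threshold" variant of the condition.
    return i >= ti or j >= tj or k >= tk
```

With n = 10, the example (7, 2, n-2) satisfies the condition through its motion index, while (0, 3, 1) stays below every assumed threshold.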
【０１２６】
図２２と図２３は、主量子化テーブル及び副量子化テーブルを用いて再量子化した符号データを、コンポーネントに配置する方法について説明するもので、図２２は動画像、図２３は静止画像の場合である。今、入力符号データは、Ｙ（輝度）、Ｃｂ（青の色差）、Ｃｒ（赤の色差）の色空間で構成されているとする。説明を簡単にするために、ここでは輝度成分についてだけ説明するが、残りの色成分についても、全く同様の処理を行う。なお、入力符号列がＲＧＢ、ＹＵＶ又はＹＣｂＣｒのいずれかの色空間で構成されているときに、選択条件を各色空間のうち少なくとも一つに適用するようにすることができる。
【０１２７】
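In each frame of a moving image (see FIG. 22), re-quantization is always performed with the main quantization table, and the result is placed in the main component. Only when a sub quantization table has been selected is re-quantization also performed with the sub quantization table, in addition to the main quantization table, and that result is placed in a sub component. The code sequence creation means 13 places the data in these components. In this example, the selection condition for sub quantization tables becomes active at frames #n, #n+5, #n+6, and #n+9. For these frames, a quantization table with a motion amount index value different from that of the main index values is selected from the quantization table group 27, and the input code data is re-quantized with it. The result is placed and held in the sub component.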
【０１２８】
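In each tile of a still image (see FIG. 23), re-quantization is always performed with the main quantization table, and the result is placed in the main component. Only when a sub quantization table has been selected is re-quantization also performed with the sub quantization table, and that result is placed in a sub component. In this example, the selection condition for sub quantization tables becomes active at tiles #n, #n+7, #n+8, and #n+11. For these tiles, a quantization table with a code amount distribution index value different from that of the main index values is selected from the quantization table group 27, and the input code data is re-quantized with it. The result is placed and held in the sub component. At tile #m+2, the selection condition for sub quantization tables is not satisfied, but the user has forcibly activated it. In this case, selection of a sub quantization table is set to be executed regardless of the selection condition (first setting means), and re-quantization is performed at a compression rate different from that of the main quantization table. The user may also be allowed to set an upper limit on the number of quantization tables selected by the quantization table selection means 28 (second setting means).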
【０１２９】
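In the examples of FIG. 22 and FIG. 23, when no sub quantization table is selected, the code sequence creation means 13 places code data of an image prepared in advance in the sub component. For example, this may be code data in which every pixel has the pixel value 0 (white).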
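The component layout can be sketched as a fixed-length list: the main result first, then a fixed number of sub slots that receive either sub results or the prepared placeholder. The `build_components` name, the byte-string stand-ins for code data, and the one-byte `PLACEHOLDER` are assumptions; `sub_slots` also plays the role of the upper limit on the number of sub tables.

```python
# Stand-in for the pre-encoded image in which every pixel value is 0.
PLACEHOLDER = b"\x00"


def build_components(main_code, sub_codes, sub_slots):
    """Return [main, sub_1, ..., sub_n] with exactly `sub_slots` sub entries.

    Extra sub results beyond `sub_slots` are dropped; missing ones are
    filled with the placeholder so every frame or tile has the same layout.
    """
    subs = list(sub_codes)[:sub_slots]
    subs += [PLACEHOLDER] * (sub_slots - len(subs))
    return [main_code] + subs
```

Keeping the number of components constant means a downstream decoder or editor can address the sub components without first checking whether they were produced.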
【０１３０】
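The code amount is calculated here by a "macro" method that reads the packet length information from the header part. FIG. 24 illustrates the principle. The data amount of the payload part 42 of the code sequence 41, that is, the "packet length", is written in the header part 43, so the syntax analysis means 12 reads and uses that information. The motion amount of the image appears as a change in code amount, that is, as a change in packet length. The code amount distribution is obtained from a histogram of packet lengths per sub-band. Compared with a "micro" method, this method may introduce a slight error into the motion amount, but it has the major advantages that the configuration of the code sequence conversion device 1 is simple and that processing is fast.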
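The "macro" method reduces to summing header-reported packet lengths per sub-band. The sketch below assumes the parser has already produced (sub-band, packet length) pairs, a hypothetical intermediate form; the normalization step mirrors the figures in which the lowest-frequency sub-band of the highest level is scaled to 1.

```python
from collections import Counter


def subband_code_amounts(header_entries):
    """Sum packet lengths per sub-band from (subband, length) pairs."""
    totals = Counter()
    for subband, packet_length in header_entries:
        totals[subband] += packet_length
    return dict(totals)


def normalise(totals, reference):
    """Scale all amounts so that the `reference` sub-band equals 1.0.

    Raises ZeroDivisionError if the reference sub-band has no code.
    """
    ref = totals[reference]
    return {band: amount / ref for band, amount in totals.items()}
```

Only header fields are read, so the histogram is available before any payload byte is examined.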
【０１３１】
以上説明した、図１１の符号列変換装置１によれば、画像の「動き量」と「符号量分布」の両方を考慮した量子化が行われるので、入力した動画像の符号列に対して、きめ細かい画質制御を含んだ量子化を行うことができる。
【０１３２】
また、入力原画像が複数のコンポーネントで構成されている場合、システム全体の更なる高速化を図ることが可能である。すなわち、例えば、入力符号列がＹＣｂＣｒ信号である場合、パケット長検出をＹ成分だけで行えばよい。Ｃｂ及びＣｒ成分に対する量子化テーブルは、Ｙ成分に対して求められたものと同じ量子化テーブルを選択すればよい。
【０１３３】
また、図１２の符号列変換装置１によれば、量子化テーブルの選択基準として、「サブバンド符号量分布」が使用される。また、符号量を算出する方法としては、ヘッダ部からパケット長情報を読取る「マクロ的」手法が使われている。
【０１３４】
図２５には、サブバンド符号量分布を考慮した量子化テーブルが示されている。ここで示した例では、符号量分布に最適な量子化が行われるので、高い画像品質を維持することができる。
【０１３５】
図２６には、サブバンド符号量分布を考慮した静止画像の符号化の例を示す。図２６（ａ）の静止画像において、タイル＃＝Ｕは青空を、タイル＃＝Ｗは高層ビルの窓枠の密集した正面側を、そして、タイル＃＝Ｖは高層ビルと空の両方を、各々主たる画像としている。サブバンド符号量分布について見れば、タイル＃＝Ｕは低域に、タイル＃＝Ｗは高域に、タイル＃＝Ｖは低域から高域までの全ての範囲に、画像情報が集まっている。
【０１３６】
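Because the image information of tile #V is spread over the whole range from low to high frequencies, its quantization must be performed so as not to upset the balance of image quality between the high-frequency building facade and the low-frequency sky. This is very difficult and has often led to degraded image quality. Moreover, finding an optimum solution tends to become harder as the compression rate rises.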
【０１３７】
そこで、この例では、空とビルの境界にあたる矩形領域には、主量子化テーブルの他に、副量子化テーブルを与えている（図２６（ｂ）を参照）。
【０１３８】
副インデックス値としては、主量子化の圧縮率インデックス値よりも、より圧縮率の小さなインデックス値を割り当てている。具体的には、低圧縮率（インデックス値ｋ＝ｐ）ではロスレス圧縮（ｋ＝０）を、中圧縮率（ｋ＝ｑ）では、より低圧縮率側のインデックス値（ｋ＝ｑ−ｍ）を、高圧縮率（ｋ＝ｒ）では、より低圧縮率側のインデックス値（ｋ＝ｒ−ｎ）を、各々使っている。このように、デリケートな画像に対しては、複数の量子化を行うことにより、高い画像品質を維持したまま、原画像の符号化を行うことが可能となる。
【０１３９】
図２７には、原画像にタイル分割処理を施した場合と、施さない場合について示した。タイル分割を行なわない場合でも、プレシンクトやコード・ブロックを矩形領域として利用すれば、タイル分割を行なった場合と同様に、画像の「動き量」や「サブバンド符号量分布」を細かく検出し、画質劣化を抑えた量子化が可能である。
【０１４０】
図２８は、符号列変換装置１を備えた画像編集システム５１の概略構成を示すブロック図である。図２８に示すように、画像編集システム５１において、符号列変換装置１（第１の符号列変換装置）は、前述のように画像を圧縮符号化した符号列から新たな符号列を作成する。このとき入力する符号列は、可逆に符号化された、いわゆるロスレスのコード・ストリームであることが、この画像編集システム５１の能力を最大限に引き出すことができるので望ましい。ロスレスの符号データを得ることが難しい場合は、低圧縮率で符号化された符号列であることが望ましい。
【０１４１】
画像伸長装置５２は、主及び副量子化テーブルで量子化された符号データに、復号化／逆量子化／逆直交変換／等の処理を施して伸長し、画像表示装置５３が画像を表示出力できるようにする。
【０１４２】
画像の編集者は、この画像表示装置５３に表示された画像を見て、符号列に含まれている各主、副コンポーネントのうち、最も適当と判断したコンポーネントを主／副量子化符号データ入替指示信号により選択する。すると、符号列変換装置５４（第２の符号列変換装置）は、主／副量子化符号データ入替指示信号に基づいて、符号列変換装置１が出力した新たな符号列で、各主、副コンポーネント中から選択されたコンポーネントを取り出して、このコンポーネントを主コンポーネントとして含み、選択されなかったコンポーネントは含まない新たな符号列を作成する。こうして、複数の量子化テーブルの中から再量子化に最もふさわしいもので再量子化された符号列が最終的に確定し、画像編集システム５１から出力される。
【０１４３】
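FIG. 29 is a block diagram showing the schematic configuration of an image editing system 51 that further includes a distortion amount measuring device 55. It automates the component selection that the editor performed in the image editing system 51 of FIG. 28. One way to measure the distortion amount is to compare pixel values between the original image and the quantized image. The image quality evaluation, which previously relied on the editor's subjective judgment, thereby becomes objective. Faster, automated editing also becomes possible.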
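Automatic component selection can be sketched with any pixel-wise distortion measure. Mean squared error is used below as one common choice; the text only says that pixel values of the original and quantized images are compared, so the metric, the function names, and the flat pixel lists are assumptions.

```python
def mean_squared_error(original, decoded):
    """Average squared pixel difference between two equally sized images."""
    if len(original) != len(decoded) or not original:
        raise ValueError("images must be non-empty and the same size")
    return sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)


def pick_component(original, decoded_components):
    """Return the name of the decoded component with the least distortion.

    `decoded_components` maps a component name to its decoded pixels.
    """
    return min(
        decoded_components,
        key=lambda name: mean_squared_error(original, decoded_components[name]),
    )
```

The second code sequence conversion device would then keep only the component named by `pick_component`.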
【０１４４】
図３０は、符号列変換装置１を備えたカメラシステム６１の概略構成を示すブロック図である。図３０に示すように、画像入力装置６２は、ＣＣＤなどの光電変換素子を備え、静止画像を撮像する。画像圧縮装置６３はこの撮影した静止画像の画像データを圧縮符号化する。そして、符号列変換装置１が、この圧縮符号化後の符号列から前述のように新たな符号列を作成する。この新たな符号列は所定のネットワークを介して送信され、あるいは、記憶装置に記憶される。
【０１４５】
【発明の効果】
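The invention according to claim 1 can classify the type of image from three viewpoints, namely the motion amount of the image, the sub-band code amount distribution, and the compression rate, and re-quantize on that basis, so a moving image can be encoded while high image quality is maintained. In addition, because the motion amount and the sub-band code amount distribution are obtained from code amounts and quantization is performed in the coded state, the memory capacity required for processing can be reduced, the processing speed can be increased, and the power consumed by processing can be lowered.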
【０１４６】
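The invention according to claim 2 can classify the type of image from two viewpoints, namely the sub-band code amount distribution and the compression rate, and re-quantize on that basis, so a still image can be encoded while high image quality is maintained. In addition, because the sub-band code amount distribution is obtained from code amounts and quantization is performed in the coded state, the memory capacity required for processing can be reduced, the processing speed can be increased, and the power consumed by processing can be lowered.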
【０１４７】
請求項３に記載の発明は、請求項１又は２に記載の発明において、複数パターンで再量子化を行うことができ、高い画像品質を維持しつつ動画像あるいは静止画像を符号化することができる。
【０１４８】
請求項４に記載の発明は、請求項３に記載の発明において、動き量、サブバンド符号量分布又は圧縮率がある程度大きいときに複数パターンで再量子化を行なって、高い画像品質を維持しつつ動画像あるいは静止画像を符号化することができる。
【０１４９】
請求項５に記載の発明は、請求項３又は４に記載の発明において、必要性があるときは、選択条件を満たさなくても複数パターンで再量子化を行なって、高い画像品質を維持しつつ動画像あるいは静止画像を符号化することができる。
【０１５０】
請求項６に記載の発明は、請求項３〜５のいずれかの一に記載の発明において、無制限に複数パターンで再量子化を行うことを防止し、無駄な処理を行わないようにすることができる。
【０１５１】
請求項７に記載の発明は、請求項３〜６のいずれかの一に記載の発明において、動き量、サブバンド符号量分布、圧縮率に対して最も適切と考えられる再量子化の他に、それに次いで適切と考えられる再量子化を行うことができる。
【０１５２】
請求項８に記載の発明は、請求項３〜７のいずれかの一に記載の発明において、少なくともひとつの色空間で選択条件を満たすときに、複数パターンで再量子化を行なって、高い画像品質を維持しつつ動画像あるいは静止画像を符号化することができる。
【０１５３】
請求項９に記載の発明は、請求項３〜８のいずれかの一に記載の発明において、複数パターンで再量子化した符号データを複数コンポーネントにそれぞれ有する単一の符号列を得ることができる。
【０１５４】
請求項１０に記載の発明は、請求項９に記載の発明において、複数パターンで再量子化しないときは予め用意した符号データを、他のテーブルから得られた符号データを配置すべきコンポーネントに配置して、その後の新たな符号列の利用の便宜を図ることができる。
【０１５５】
請求項１１に記載の発明は、請求項１０に記載の発明において、複数パターンで再量子化しないときは画素値が最小となる白色の画像の符号データを、他のテーブルから得られた符号データを配置すべきコンポーネントに配置して、その後の新たな符号列の利用の便宜を図ることができる。
【０１５６】
請求項１２に記載の発明は、請求項１〜１１のいずれかの一に記載の発明において、画像の動き量、サブバンド符号量分布及び圧縮率の各値に応じてそれぞれ昇順又は降順に付されているインデックス値により的確に必要なテーブルを選択し、複数パターンで再量子化を行なって、高い画像品質を維持しつつ動画像あるいは静止画像を符号化することができる。
【０１５７】
請求項１３に記載の発明は、請求項１〜１２のいずれかの一に記載の発明において、サブバンド符号量分布を適切に検出することができる。
【０１５８】
請求項１６，１９，２０に記載の発明は、請求項１〜１５のいずれかの一に記載の発明と同様の作用、効果を奏する。
【０１５９】
請求項１７に記載の発明は、複数パターンで再量子化を行った各符号データを収容した複数のコンポーネントの画質を互いに比較して、最適な画質であると判断したコンポーネントのデータのみを残した符号列を作成することができる。
【０１６０】
請求項１８に記載の発明は、請求項１７に記載の発明において、画像データの歪量に基づいて最適な画質のコンポーネントを自動で選択して、そのコンポーネントのみを残した符号列を作成することができる。
【図面の簡単な説明】
【図１】二次元ＤＣＴ（離散コサイン変換）を使った場合の基底ベクトルを説明する説明図である。
【図２】二次元ＤＷＴ（離散ウェーブレット変換）を使った場合のオクターブ分割を説明する説明図である。
【図３】JPEG2000符号列の構文解析、パケット長の読み取りについて説明する概念図である。
【図４】JPEG2000の基本となる階層符号化アルゴリズムを説明するためのブロック図である。
【図５】デコンポジションレベル数が３の場合の、各デコンポジションレベルにおけるサブバンドを示す図である。
【図６】タイル分割されたカラー画像の各コンポーネントの例を示す図である。
【図７】プレシンクトとコードブロックの関係を説明するための図である。
【図８】符号列の構造を説明するための図である。
【図９】符号化されたウェーブレット係数値の収容されたパケットを、サブバンドごとに表わしたときの符号列の構造を示す説明図である。
【図１０】本発明の一実施の形態にかかる符号列変換装置のハードウエア構成を示すブロック図である。
【図１１】動画像を処理する場合の符号列変換装置の機能ブロック図である。
【図１２】静止画像を処理する場合の符号列変換装置の機能ブロック図である。
【図１３】動き量検出手段で検出した動画像の「動き量」を量子化手段で行う量子化に反映させる例を説明する説明図である。
【図１４】「サブバンド符号量分布」を量子化手段における量子化に反映させた例を説明する説明図である。
【図１５】サブバンド符号量分布に応じて最適化された「量子化」を説明する別の説明図である。
【図１６】低圧縮率のときの主量子化と副量子化の様子を示す説明図である。
【図１７】画像の「動き量」を考慮した量子化テーブル群の説明図である。
【図１８】「サブバンド符号量分布」を考慮した量子化テーブル群の説明図である。
【図１９】「動き量」と「サブバンド符号量分布」の両方を考慮し、動画像に対応した量子化テーブル群の説明図である。
【図２０】「サブバンド符号量分布」のみを考慮した量子化テーブル群の説明図である。
【図２１】副量子化テーブルの選択条件について説明する説明図である。
【図２２】主量子化テーブル及び副量子化テーブルを用いて再量子化した符号データを、コンポーネントに配置する方法について説明する説明図である。
【図２３】主量子化テーブル及び副量子化テーブルを用いて再量子化した符号データを、コンポーネントに配置する方法について説明する説明図である。
【図２４】ヘッダ部からパケット長情報を読取る「マクロ的」手法で符号量を算出する方法の説明図である。
【図２５】サブバンド符号量分布を考慮した量子化テーブルの説明図である。
【図２６】サブバンド符号量分布を考慮した静止画像の符号化の例を示す説明図である。
【図２７】原画像にタイル分割処理を施した場合と、施さない場合について示す説明図である。
【図２８】画像編集システムの概略構成の説明図である。
【図２９】歪量測定装置を備えた画像編集システムの概略構成の説明図である。
【図３０】カメラシステムの概略構成の説明図である。
【図３１】本発明の課題を説明する説明図である。
【図３２】本発明の課題を説明する説明図である。
【図３３】本発明の課題を説明する説明図である。
[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a code sequence conversion device that converts a code sequence obtained by compression-encoding image data into a new code sequence, to an image editing system and a camera system that include the device, and to a program that executes a process of converting a code sequence obtained by compression-encoding image data into a new code sequence.
[0002]
[Prior art]
Conventionally, MPEG1 / MPEG2 / MPEG4 dedicated to moving images and Motion JPEG handling still images as continuous frames have been used as image compression / decompression algorithms. Recently, a new method called Motion JPEG2000 is being standardized as an international standard for the encoding of the latter Motion still images.
[0003]
The difference between the MPEG method and the Motion still image method is that the latter performs only intra-frame coding, whereas the former exploits correlation not only within a frame but also between different frames, and can therefore achieve a higher compression rate. On the other hand, the latter method, which handles each frame independently, allows frame-by-frame editing, and errors during communication do not propagate to other frames. The MPEG method and the Motion still image method thus each have their own advantages, and the appropriate method is chosen for each application.
[0004]
In the Motion JPEG2000 system, a discrete wavelet transform is used as a conversion method. A technique disclosed in Patent Document 1 is known as a technique for compressing and encoding image data using the discrete wavelet transform.
[0005]
[Patent Document 1] JP-A-2001-309381
[Problems to be solved by the invention]
According to the technique disclosed in Patent Document 1, pixel values are not only subjected to discrete wavelet transform and compression-encoded; correlation is also taken between images of different frames, and the redundancy of the moving image data when there is no image movement between frames is eliminated, so that the data compression ratio can be further improved.
[0006]
Specifically, this technique reduces the code amount by comparing the low-frequency wavelet coefficient values of the current frame and the immediately preceding frame tile by tile. If they are judged to match, the code for that portion is deleted and information indicating that there is no motion between the frames is written in the header section. Conversely, if they are judged not to match, the code is left as it is.
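The tile-by-tile comparison described above can be sketched as follows. This is only an illustrative outline, not the implementation of Patent Document 1: the function name `reduce_static_tiles`, the dictionary-based data layout, and the tolerance parameter `tol` are assumptions introduced for the example.

```python
def reduce_static_tiles(prev_ll, curr_ll, curr_codes, tol=0):
    """Drop the code of tiles whose low-frequency coefficients did not change.

    prev_ll, curr_ll: dict mapping tile id -> list of low-frequency
        wavelet coefficient values (previous / current frame).
    curr_codes: dict mapping tile id -> bytes of the current frame's code.
    tol: hypothetical tolerance for deciding that two coefficients "match".

    Returns (kept_codes, no_motion_flags); the flags stand in for the
    "no motion between frames" information written to the header.
    """
    kept_codes = {}
    no_motion_flags = {}
    for tile_id, code in curr_codes.items():
        prev = prev_ll.get(tile_id)
        curr = curr_ll[tile_id]
        matches = (
            prev is not None
            and len(prev) == len(curr)
            and all(abs(a - b) <= tol for a, b in zip(prev, curr))
        )
        no_motion_flags[tile_id] = matches
        if not matches:
            # Coefficients differ: keep the code as it is.
            kept_codes[tile_id] = code
    return kept_codes, no_motion_flags
```

With a moving tile "A" and a static tile "C", only the code of tile "A" would be retained, while the flag for tile "C" records that it has no motion.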
[0007]
That is, two processes of “quantization” optimized for a still image in a frame and “reduction of code amount” based on a comparison result of wavelet coefficient values between frames are performed independently. Then, the quantization of the still image in the frame is performed based on the unique “quantization table”.
[0008]
However, such a conventional technique has the following problems (1) and (2).
[0009]
(1) Conventionally, the compression of Motion still images has focused only on "quantization" optimized for the still image within a frame, and "quantization" that takes into account the visual characteristics peculiar to moving images has not been performed. Although the motion amount between frames actually differs for each image region, it has conventionally been assumed to be uniform over the entire image. As a result, entirely new kinds of image quality degradation that could not be anticipated when displaying still images often appeared when displaying moving images.
[0010]
This will be described with reference to the drawings. FIG. 31 shows a conventional quantization method for uniformly reducing subband codes ignoring the amount of motion. Here, the vertical axis represents the motion amount, and the correlation coefficient value obtained from the difference between the wavelet coefficient values between the current frame and the previous frame is used. The horizontal axis represents the unit of the tile. A tile is a rectangular area created by dividing an original still image in a frame.
[0011]
In terms of motion amount, for example, tile A is large and tile C is small. In the conventional method, however, the entire image is treated as having a uniform motion amount; in this example, the motion amounts of all tiles are treated as "none". Consequently, only one quantization table is referenced regardless of the magnitude of the motion amount. The number assigned to each tile, that is, the tile #, is defined in FIG. 33, as described later.
[0012]
(2) Also, quantization of a still image in a frame uses a unique “quantization table” uniquely defined by a compression rate or a bit rate. Originally, the code amount distribution of the sub-band should be different for each image area, but in the past, this was not considered at all. As a result, the quality of the quantized image greatly fluctuates due to the nature of the original image such as an image with many high-frequency components, an image with many low-frequency components, and an image widely distributed from high to low frequencies.
[0013]
FIG. 32 shows a conventional quantization method that reduces subband codes uniformly, ignoring the code amount distribution. The vertical axis represents the relative code amount when the code amount of the low-frequency subband of the highest layer (the one with the largest decomposition level) is normalized to "1". The horizontal axis represents the subbands.
[0014]
In the curves showing the code amount distribution of each tile, the thin-line portions indicate subbands whose code is to be deleted, and the thick-line portions indicate subbands whose code is kept as it is. (A) corresponds to the case where many subbands are subject to code amount reduction, and (b) to the case where few are. Note that the tile numbers A to D are defined in FIG. 33.
[0015]
Tiles # = A and D hold code data up to high frequencies. Tile # = B has a relatively large amount of information in the middle band, and tile # = C has a large amount of information in the low band. Even though the subband code amount thus differs from tile to tile, the conventional method reduces the subband code data uniformly, regardless of the code amount distribution. As a result, especially at high compression ratios, the image quality of tiles # = A, B, and D, which have relatively large amounts of data in the middle to high bands, deteriorates markedly compared with that of tile # = C.
[0016]
In order to specifically explain the above (1) and (2), the observation results of the phenomenon that occurs when Motion JPEG2000 is used for the compression / decompression algorithm will be described in detail below.
[0017]
FIG. 33 shows three consecutive frames extracted from moving image content. This example shows a bicycle traveling in front of a background with little motion, so tiles with a large motion amount are relatively easy to distinguish from tiles with a small motion amount. Consider four tiles: tile # = A is located below the center, tile # = B at the upper left, tile # = C above the center, and tile # = D at the upper right.
[0018]
When a Motion still image encoded by conventional quantization is expanded and displayed without considering the amount of motion between frames and the code amount distribution of subbands, the following phenomenon appears.
[0019]
① In tile # = A
The bicycle is closest to the shooting point, and because of its structure the code amount in the high-frequency subbands is large. As the compression ratio increases, the code amount of the high-frequency subbands decreases, and it becomes difficult to make out the individual spokes of the wheels in a still image within a frame. However, discerning the spokes of a moving bicycle is beyond the limit of human dynamic visual acuity, so the effect on the moving image is not very large.
[0020]
② In tile # = B
There is almost no movement of the tree located between the distant view and the foreground bicycle. The leaf portion has a large code amount distribution of the middle subband. As the compression ratio increases, it becomes more difficult to distinguish each leaf from the still images in the frame. However, the tendency is not as strong as the spoke of a bicycle in tile # = A.
[0021]
③ In tile # = C
The skyline in the distant view has a large code amount in the low-frequency subbands, so this tile is relatively unaffected by code amount reduction even when the compression ratio becomes high. However, because there is almost no motion, removing even a little low-frequency subband data spreads blur over the entire still image in the frame. When viewed as a moving image, a "distorted distortion" appears in the background, which is very distracting.
[0022]
④ In tile # = D
There are many tree contours in the image area. As a result, the distribution of the code amount differs from tile # = C in that the data amount on the high frequency side is slightly larger. When comparing the image with the tile # = C, there is not much difference in the still image in the frame. However, when a moving image is displayed, “distorted distortion” appears more conspicuously. This seems to be due to the large amount of high-frequency components removed.
[0023]
An object of the present invention is to make it possible to encode a moving image while maintaining high image quality, by determining the type of image from the motion amount of the image, the subband code amount distribution, and the compression ratio and performing requantization accordingly.
[0024]
Another object of the present invention is to encode a moving image or a still image while maintaining high image quality by enabling requantization to be performed in a plurality of patterns.
[0025]
[Means for Solving the Problems]
According to the first aspect of the present invention, there is provided a code sequence conversion device comprising: syntax analysis means for analyzing the syntax of a code sequence created by dividing moving image data into one or more rectangular regions for each frame, frequency-transforming the pixel values of each rectangular region, and hierarchically encoding them; quantization means for requantizing the code sequence by selecting codes; motion amount detecting means for obtaining, based on the analysis result, the motion amount between the current frame and the previous frame for the code sequence; subband code amount distribution detecting means for obtaining the subband code amount distribution in the current frame based on the analysis result; compression ratio specifying means for specifying the compression ratio of the current frame; selecting means for selecting, from a table group consisting of a plurality of tables each specifying a code reduction method for the requantization, one or more tables to be used for the requantization, based on a motion amount index value, a code distribution index value, and a compression ratio index value, which are numerical values assigned according to the magnitude of the motion amount, of the subband code amount distribution, and of the compression ratio, respectively, and which serve as indices for specifying a table; and code sequence creating means for creating a new code sequence from the requantized code sequence.
[0026]
Therefore, requantization can be performed after determining the type of image from three criteria, namely the motion amount of the image, the subband code amount distribution, and the compression ratio, so that a moving image can be encoded while maintaining high image quality. In addition, because the motion amount and the subband code amount distribution are obtained from the code amount and quantization is performed in the code state, the memory required for processing can be reduced, the processing speed can be increased, and the power consumption required for processing can be reduced.
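As a rough illustration of how three index values might jointly specify one table in a table group, consider the sketch below. The bucket boundaries, the name `select_table`, and the contents of `TABLE_GROUP` (each table is simply a list of subbands to discard) are hypothetical and chosen only for the example; the specification does not prescribe them.

```python
from bisect import bisect_right
from itertools import product

# Subbands ordered from the highest-frequency side (3 decomposition levels).
SUBBANDS = ["1HH", "1LH", "1HL", "2HH", "2LH", "2HL", "3HH", "3LH", "3HL"]

# Hypothetical table group: the table for index triple (m, c, r)
# discards the first m + c + r subbands on the high-frequency side.
TABLE_GROUP = {
    (m, c, r): SUBBANDS[: m + c + r]
    for m, c, r in product(range(3), repeat=3)
}


def index_value(value, boundaries):
    """Map a measured value to an index assigned in ascending order."""
    return bisect_right(boundaries, value)


def select_table(table_group, motion, code_distribution, compression_ratio,
                 motion_bounds, code_bounds, ratio_bounds):
    """Return (index triple, table) for the three measured quantities."""
    key = (
        index_value(motion, motion_bounds),
        index_value(code_distribution, code_bounds),
        index_value(compression_ratio, ratio_bounds),
    )
    return key, table_group[key]
```

For instance, with two boundaries per axis each quantity falls into one of three buckets, so the table group in this sketch holds 3 x 3 x 3 = 27 tables.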
[0027]
According to a second aspect of the present invention, there is provided a code sequence conversion device comprising: syntax analysis means for analyzing the syntax of a code sequence created by dividing still image data into one or more rectangular regions, frequency-transforming the pixel values of each rectangular region, and hierarchically encoding them; quantization means for requantizing the code sequence by selecting codes; subband code amount distribution detecting means for obtaining the subband code amount distribution based on the analysis result; compression ratio specifying means for specifying a compression ratio; selecting means for selecting, from a table group consisting of a plurality of tables each specifying a code reduction method for the requantization, one or more tables to be used for the requantization, based on a code distribution index value and a compression ratio index value, which are numerical values assigned according to the magnitude of the subband code amount distribution and of the compression ratio, respectively, and which serve as indices for specifying a table; and code sequence creating means for creating a new code sequence from the requantized code sequence.
[0028]
Therefore, requantization can be performed after determining the type of image from two criteria, namely the subband code amount distribution and the compression ratio, so that a still image can be encoded while maintaining high image quality. Further, because the subband code amount distribution is obtained from the code amount and quantization is performed in the code state, the memory required for processing can be reduced, the processing speed can be increased, and the power consumption required for processing can be reduced.
[0029]
According to a third aspect of the present invention, in the code sequence conversion device according to the first or second aspect, the selecting means selects the table specified by the index values corresponding to the motion amount, the subband code amount distribution, and the compression ratio, and, when the index values satisfy a predetermined selection condition, also selects another table; otherwise, it does not select another table.
[0030]
Therefore, requantization can be performed with a plurality of patterns, and a moving image or a still image can be encoded while maintaining high image quality.
[0031]
According to a fourth aspect of the present invention, in the code sequence conversion device according to the third aspect, the selecting means determines that the selection condition is satisfied when at least one of the motion amount, the total code amount on the low-frequency side of the subband code amount distribution, and the compression ratio is equal to or greater than a predetermined threshold.
[0032]
Therefore, when the motion amount, the sub-band code amount distribution, or the compression ratio is large to some extent, requantization is performed in a plurality of patterns, and a moving image or a still image can be encoded while maintaining high image quality.
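The selection condition of the fourth aspect amounts to a logical OR over three threshold comparisons. A minimal sketch, in which the function name and the threshold parameters are hypothetical:

```python
def selection_condition_met(motion, low_band_code_total, compression_ratio,
                            motion_threshold, low_band_threshold,
                            ratio_threshold):
    """True when at least one quantity reaches its threshold.

    All three thresholds are assumed to be supplied by the caller; the
    specification only states that they are predetermined.
    """
    return (
        motion >= motion_threshold
        or low_band_code_total >= low_band_threshold
        or compression_ratio >= ratio_threshold
    )
```

When this returns True, another table would be selected in addition to the one specified by the index values; when it returns False, only the single table is used.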
[0033]
According to a fifth aspect of the present invention, the code sequence conversion device according to the third or fourth aspect further comprises first setting means for setting the selecting means so that it selects the other table regardless of whether the selection condition is satisfied.
[0034]
Therefore, when necessary, re-quantization can be performed with a plurality of patterns even if the selection condition is not satisfied, and a moving image or a still image can be encoded while maintaining high image quality.
[0035]
According to a sixth aspect of the present invention, the code sequence conversion device according to any one of the third to fifth aspects further comprises second setting means for setting an upper limit on the number of other tables selected by the selecting means.
[0036]
Therefore, it is possible to prevent unlimited re-quantization of a plurality of patterns and prevent unnecessary processing from being performed.
[0037]
According to a seventh aspect of the present invention, in the code sequence conversion device according to any one of the third to sixth aspects, the selecting means selects, as the other table, a table for which at least one index value is adjacent to the corresponding index value of the table that is specified by the respective index values and selected when the selection condition is not satisfied.
[0038]
Therefore, in addition to the requantization considered to be most appropriate for the motion amount, the subband code amount distribution, and the compression ratio, it is possible to perform the requantization considered next to be appropriate.
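One way to picture the choice of additional tables is to enumerate the index triples adjacent to the main one and cap their number. The sketch below adopts one possible reading, in which "adjacent" means differing by one in exactly one index; that reading, the function names, and the parameters `force_extra` and `max_extra` (standing in for the first and second setting means) are assumptions made for illustration.

```python
def adjacent_keys(main_key, sizes):
    """Index tuples that differ from main_key by +/-1 in exactly one index.

    sizes gives the number of index values available on each axis.
    """
    neighbours = []
    for axis, size in enumerate(sizes):
        for step in (-1, 1):
            value = main_key[axis] + step
            if 0 <= value < size:
                key = list(main_key)
                key[axis] = value
                neighbours.append(tuple(key))
    return neighbours


def select_tables(main_key, sizes, condition_met,
                  force_extra=False, max_extra=2):
    """Return the main table key plus up to max_extra adjacent keys."""
    keys = [main_key]
    if condition_met or force_extra:
        keys.extend(adjacent_keys(main_key, sizes)[:max_extra])
    return keys
```

Calling `select_tables((2, 0, 1), (3, 3, 3), condition_met=False)` returns only the main key, whereas setting `condition_met=True` adds neighbouring keys up to the configured limit.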
[0039]
According to an eighth aspect of the present invention, in the code sequence conversion device according to any one of the third to seventh aspects, the code sequence is composed in any one of the RGB, YUV, and YCbCr color spaces, and the selecting means applies the selection condition to at least one of the color spaces.
[0040]
Therefore, when the selection condition is satisfied in at least one color space, re-quantization is performed with a plurality of patterns, and a moving image or a still image can be encoded while maintaining high image quality.
[0041]
According to a ninth aspect of the present invention, in the code sequence conversion device according to any one of the third to eighth aspects, when a plurality of tables are selected, the code sequence creating means creates the new code sequence so that the code data obtained with each of the tables is arranged in a different component.
[0042]
Therefore, it is possible to obtain a single code string having code data requantized in a plurality of patterns in a plurality of components.
[0043]
According to a tenth aspect of the present invention, in the code sequence conversion device according to the ninth aspect, when the other table is not selected, the code sequence creating means arranges code data prepared in advance in the component in which the code data obtained from the other table would otherwise be arranged.
[0044]
Therefore, when requantization is not performed in a plurality of patterns, code data prepared in advance is arranged in the component where code data obtained from another table would otherwise be arranged, which makes the new code sequence easier to use afterwards.
[0045]
According to an eleventh aspect of the present invention, in the code sequence conversion device according to the tenth aspect, the code sequence creating means arranges, as the code data prepared in advance, the code data of a white image having the minimum pixel value.
[0046]
Therefore, when requantization is not performed in a plurality of patterns, the code data of a white image having the minimum pixel value is arranged in the component where code data obtained from another table would otherwise be arranged, which makes the new code sequence more convenient to use afterwards.
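The arrangement of requantized code data into components, including the placeholder used when no other table is selected, might look like the following. The function `arrange_components`, the fixed component count, and the byte-string representation of code data are hypothetical simplifications for illustration only.

```python
def arrange_components(main_code, extra_codes, n_components, placeholder):
    """Place code data from each table into its own component slot.

    main_code: code data obtained with the main table.
    extra_codes: list of code data obtained with the other tables
        (empty when no other table was selected).
    placeholder: code data prepared in advance, e.g. that of a uniform
        white image, used to fill component slots left unused.
    """
    if n_components < 1 or len(extra_codes) > n_components - 1:
        raise ValueError("not enough components for the selected tables")
    components = [main_code] + list(extra_codes)
    # Fill the remaining slots so the component layout stays fixed.
    components += [placeholder] * (n_components - len(components))
    return components
```

Keeping the number of components constant in this way means a later stage can always address "component k" without first checking how many tables were actually used.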
[0047]
According to a twelfth aspect of the present invention, in the code sequence conversion device according to any one of the first to eleventh aspects, each of the index values is assigned in ascending or descending order according to the value of the motion amount, the subband code amount distribution, or the compression ratio.
[0048]
Therefore, a necessary table is appropriately selected based on the index values assigned in ascending or descending order according to the values of the image motion amount, the subband code amount distribution, and the compression ratio, and requantization is performed in a plurality of patterns. Thus, a moving image or a still image can be encoded while maintaining high image quality.
[0049]
According to a thirteenth aspect of the present invention, in the code sequence conversion device according to any one of the first to twelfth aspects, the subband code amount distribution detecting means obtains the total code amount of the orthogonal transform coefficient values for each subband, and the sum of these totals over the subbands included in a given band.
[0050]
Therefore, it is possible to appropriately detect the subband code amount distribution.
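The two quantities named in the thirteenth aspect, a per-subband total and a per-band sum of those totals, can be sketched as below. The input format (a dictionary of packet lengths per subband) and all function names are assumptions for the example; the normalization step follows the convention of taking the low-frequency subband of the highest decomposition level as "1".

```python
def subband_totals(packet_lengths):
    """Total code amount per subband.

    packet_lengths: dict mapping subband name -> list of packet lengths
        (in bytes) belonging to that subband.
    """
    return {name: sum(lengths) for name, lengths in packet_lengths.items()}


def band_total(totals, band):
    """Sum of the per-subband totals over the subbands in one band."""
    return sum(totals[name] for name in band)


def relative_distribution(totals, reference):
    """Code amounts relative to the reference subband (normalized to 1)."""
    base = totals[reference]
    return {name: amount / base for name, amount in totals.items()}
```

For example, a "low band" could be defined as the list `["3LL", "3HL"]` and passed to `band_total` to obtain its combined code amount.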
[0051]
The parsing means may use a code sequence using a discrete cosine transform or a discrete wavelet transform as the orthogonal transform (claims 14 and 15).
[0052]
According to a sixteenth aspect of the present invention, there is provided an image editing system comprising: the code sequence conversion device according to any one of the first to fifteenth aspects; an image decompression device that decompresses the new code sequence created by the code sequence conversion device; and a display device that displays an image based on the decompressed image data.
[0053]
Therefore, the same operation and effect as the invention according to any one of the first to fifteenth aspects are obtained.
[0054]
According to a seventeenth aspect of the present invention, there is provided an image editing system comprising: a first code sequence conversion device, which is the code sequence conversion device according to the ninth aspect; an image decompression device that decompresses the new code sequence created by the first code sequence conversion device; a display device that displays an image based on the decompressed image data; and a second code sequence conversion device that creates a code sequence containing only the desired ones of the plurality of components included in the new code sequence created by the first code sequence conversion device.
[0055]
Therefore, the image qualities of the plurality of components that hold the code data requantized in a plurality of patterns can be compared with one another, and a code sequence can be created that retains only the data of the component judged to have the best image quality.
[0056]
According to an eighteenth aspect of the present invention, the image editing system according to the seventeenth aspect further comprises a distortion amount measuring device that measures the distortion amount of the image data of each component included in the new code sequence created by the first code sequence conversion device, and the second code sequence conversion device creates the new code sequence based on the measurement result.
[0057]
Therefore, it is possible to automatically select a component having the optimum image quality based on the distortion amount of the image data, and create a code string leaving only that component.
[0058]
According to a nineteenth aspect of the present invention, there is provided a camera system comprising: an image input device that captures an image as a still image; an image compression device that divides the captured image data into one or more rectangular regions, frequency-transforms the pixel values of each rectangular region, and performs hierarchical compression encoding; and the code sequence conversion device according to any one of the first to fifteenth aspects, which processes the code sequence after the compression encoding.
[0059]
Therefore, the same operation and effect as the invention according to any one of the first to fifteenth aspects are obtained.
[0060]
According to a twentieth aspect of the present invention, there is provided a computer readable program that causes a computer to execute the functions of the respective means of the first aspect of the present invention.
[0061]
Therefore, the same operation and effect as the invention according to any one of the first to fifteenth aspects are obtained.
[0062]
[Best mode for carrying out the invention]
[Overview of prerequisite technology]
First, the outlines of “arrangement of orthogonal transform coefficients”, “hierarchical coding algorithm”, and “JPEG2000 algorithm”, which are the prerequisite technologies of the present embodiment, will be described in order.
[0063]
(1) Regarding “arrangement of orthogonal transform coefficients”
FIG. 1 shows a basis vector when two-dimensional DCT (discrete cosine transform) is used, and FIG. 2 shows an octave division when two-dimensional DWT (discrete wavelet transform) is used. As typical encoding algorithms that employ these orthogonal transforms, a JPEG algorithm and a JPEG2000 algorithm are known, respectively.
[0064]
As conceptually shown in FIG. 3, a code sequence encoded by the JPEG2000 algorithm has the SOC (Start Of Codestream) at the head and the EOC (End Of Codestream) at the end; the orthogonal transform coefficient values are placed in the payload section 211, and information summarizing them is placed in the header section 212 within the code sequence. The header section 212 in the code sequence is detected by the syntax analysis means 213, and the packet length (the subband code amount of the payload section) is read by the packet length reading means 214.
[0065]
(2) About “Hierarchical coding algorithm” and “JPEG2000 algorithm”
FIG. 4 is a block diagram for explaining a hierarchical encoding algorithm that is the basis of JPEG2000. This hierarchical encoding algorithm includes a two-dimensional wavelet transform / inverse transform unit 202, a quantization / inverse quantization unit 203, an entropy encoding / decoding unit 204, and a tag processing unit 205. One of the biggest differences compared to the JPEG algorithm is the conversion method. JPEG uses the discrete cosine transform (DCT), and the hierarchical coding compression / decompression algorithm uses the discrete wavelet transform (DWT). The advantage that DWT has better image quality in a high compression area than DCT is one of the major reasons that it was adopted in JPEG2000 which is a successor algorithm of JPEG. Another major difference is that, in the latter case, a functional block called a tag processing unit 205 is added in order to perform code formation in the final stage. In this part, compressed data is generated as a code string during a compression operation, and a code string necessary for decompression is interpreted during a decompression operation. And, depending on the code string, JPEG2000 can realize various convenient functions.
[0066]
For example, FIG. 5 shows the subbands at each decomposition level when the number of decomposition levels is 3; compression/decompression of a still image can be stopped at an arbitrary level corresponding to the octave division hierarchy of the block-based DWT. The "decomposition" here is defined as follows in JPEG2000 Part I Final Draft International Standard (FDIS).
decomposition level:
A collection of wavelet subbands where each coefficient has the same spatial impact or span with respect to the source component samples. These include the HL, LH, and HH subbands of the same two dimensional subband decomposition. For the last decomposition level the LL subband is also included.
In FIG. 4, a color space conversion unit 201 is often connected to the input / output portion of the original image. For example, an RGB color system composed of R (red) / G (green) / B (blue) components of a primary color system, and Y (yellow) / M (magenta) / C (cyan) components of a complementary color system The conversion from the YMC color system to the YUV or YCbCr color system or the reverse conversion corresponds to this.
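As one concrete instance of such a color space conversion, the sketch below implements the reversible component transform (RCT) used by JPEG2000 Part 1 for lossless coding. It is given only as an illustration; the specification does not state which transform the color space conversion unit 201 applies, and the function names are chosen for the example.

```python
def rct_forward(r, g, b):
    """Reversible component transform: integer RGB -> (Y, U, V)."""
    y = (r + 2 * g + b) // 4  # floor division, as the transform requires
    u = b - g
    v = r - g
    return y, u, v


def rct_inverse(y, u, v):
    """Inverse reversible component transform: (Y, U, V) -> integer RGB."""
    g = y - (u + v) // 4
    r = v + g
    b = u + g
    return r, g, b
```

Because only integer additions and floor divisions are involved, applying `rct_inverse` to the output of `rct_forward` recovers the original RGB values exactly.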
[0067]
Hereinafter, the JPEG2000 algorithm will be described in detail.
[0068]
FIG. 6 shows an example of the components of a tiled color image. In a color image, each component 207_R, 207_G, 207_B of the original image (here, the RGB primary color system) is generally divided into rectangular regions (tiles) 207_Rt, 207_Gt, 207_Bt, as shown in FIG. 6. The individual tiles, for example R00, R01, ..., R15 / G00, G01, ..., G15 / B00, B01, ..., B15, are the basic units on which the compression/decompression process is executed. The compression/decompression operation is therefore performed independently for each component and for each tile.
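The tile division can be sketched as a simple enumeration of rectangles in raster order. The function name and the (x0, y0, x1, y1) tuple layout are assumptions for the example; tiles at the right and bottom edges are clipped to the image boundary.

```python
def split_into_tiles(width, height, tile_width, tile_height):
    """Return tile rectangles (x0, y0, x1, y1) in raster order.

    x1 and y1 are exclusive bounds.
    """
    tiles = []
    for y0 in range(0, height, tile_height):
        for x0 in range(0, width, tile_width):
            x1 = min(x0 + tile_width, width)
            y1 = min(y0 + tile_height, height)
            tiles.append((x0, y0, x1, y1))
    return tiles
```

A 512 x 512 component split into 128 x 128 tiles yields 16 tiles, matching the numbering 00 to 15 used for each component above.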
[0069]
At the time of encoding, the data of each tile of each component is input to the color space conversion unit 201 of FIG. 4 and subjected to color space conversion, after which the two-dimensional wavelet transform unit 202 applies a two-dimensional wavelet transform (forward transform) and divides the data spatially into frequency bands.
[0070]
FIG. 5 is an explanatory diagram of the octave division of the two-dimensional discrete wavelet transform. The higher the numerical value of the decomposition level, the higher the hierarchical level. The parts displayed in gray are the subbands to be encoded at each hierarchical level. The example of FIG. 5 shows the subbands at each decomposition level when the number of decomposition levels is three. That is, the tile original image (0LL) (decomposition level 0 (206₀)) obtained by tile division of the original image is subjected to a two-dimensional wavelet transform, and the subbands (1LL, 1HL, 1LH, 1HH) shown at decomposition level 1 (206₁) are separated. Subsequently, a two-dimensional wavelet transform is applied to the low-frequency component 1LL in this layer, and the subbands (2LL, 2HL, 2LH, 2HH) shown at decomposition level 2 (206₂) are separated. Similarly, a two-dimensional wavelet transform is applied to the low-frequency component 2LL, and the subbands (3LL, 3HL, 3LH, 3HH) shown at decomposition level 3 (206₃) are separated.
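The octave division can be summarized by listing which subbands remain at each decomposition level once only the LL band is decomposed further. The helper names below are hypothetical, and `ll_size` assumes a tile anchored at the origin so that each level halves the dimensions with rounding up.

```python
def subbands_by_level(n_levels):
    """Subbands present at each level after an octave decomposition.

    Only the LL band of the last level survives, because every earlier
    LL band is itself decomposed into the next level.
    """
    levels = {}
    for level in range(1, n_levels + 1):
        names = [f"{level}HL", f"{level}LH", f"{level}HH"]
        if level == n_levels:
            names.insert(0, f"{level}LL")
        levels[level] = names
    return levels


def ll_size(width, height, level):
    """Size of the LL band after `level` decompositions (ceil halving)."""
    for _ in range(level):
        width = (width + 1) // 2
        height = (height + 1) // 2
    return width, height
```

For three decomposition levels this gives ten subbands in total: three each at levels 1 and 2, and four at level 3.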
[0071]
Further, in FIG. 5, the subbands to be coded at each decomposition level are shown in gray. For example, when the number of decomposition levels is 3, the subbands (3HL, 3LH, 3HH, 2HL, 2LH, 2HH, 1HL, 1LH, 1HH) shown in gray are to be encoded, and the 3LL subband is not encoded. .
[0072]
Next, the bits to be encoded are determined in the designated encoding order, and the quantization unit 203 in FIG. 4 generates a context from the bits around the target bit.
[0073]
FIG. 7 is an explanatory diagram of the relationship between precincts and code blocks. The wavelet coefficients after quantization are divided, for each subband, into non-overlapping rectangles called "precincts". Precincts were introduced to make efficient use of memory in implementations. As shown in FIG. 7, one precinct, for example precinct 208_p4, consists of three spatially coincident rectangular regions. The same is true of precinct 208_p6. Here, the original image is divided into four tiles 208_t0, 208_t1, 208_t2, 208_t3 at decomposition level 1. Each precinct is further divided into non-overlapping rectangular "code blocks" (for precinct 208_p4, code blocks 208_4b0, 208_4b1, ...). The code block is the basic unit of entropy coding. To increase coding efficiency, the coefficient values may be decomposed into bit planes, the bit planes may be ordered for each pixel or code block, and layers consisting of one or more bit planes may be formed. That is, layers based on significance are formed from the bit planes of the coefficient values, and coding is performed layer by layer. In some cases only the most significant layer (MSB) and a few layers below it are coded, and the remaining layers, including the least significant layer (LSB), are truncated.
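The bit-plane decomposition and the truncation of the lower planes can be illustrated on a handful of integer coefficients. This is a simplified sketch that works on sign and magnitude and ignores the coding passes of the actual JPEG2000 entropy coder; both function names are assumptions for the example.

```python
def bit_planes(coefficients, n_bits):
    """Magnitude bit planes, ordered from the MSB plane to the LSB plane."""
    return [
        [(abs(c) >> bit) & 1 for c in coefficients]
        for bit in range(n_bits - 1, -1, -1)
    ]


def truncate_planes(coefficients, n_bits, keep):
    """Keep only the top `keep` magnitude bit planes of each coefficient.

    The lower n_bits - keep planes are set to zero; signs are preserved.
    """
    if not 0 <= keep <= n_bits:
        raise ValueError("keep must be between 0 and n_bits")
    drop = n_bits - keep
    result = []
    for c in coefficients:
        magnitude = (abs(c) >> drop) << drop
        result.append(magnitude if c >= 0 else -magnitude)
    return result
```

Discarding planes from the LSB side in this way reduces the code amount at the cost of coarser coefficient values, which is the mechanism that layer truncation relies on.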
[0074]
The entropy coding unit 204 codes each component tile by probability estimation from the context and the target bit. In this way, the encoding process is performed for all the components of the original image in tile units.
[0075]
Finally, the tag processing unit 205 combines all the encoded data from the entropy coding unit into one code sequence and adds tags to it. FIG. 8 briefly shows the structure of the code sequence. Headers (main header 209_h and tile-part header 209_th, respectively) are added at the beginning of the code sequence and at the beginning of the tile-parts constituting each tile, followed by the encoded data of each tile (bit stream 209_b). A tag (EOC tag 209_e) is then placed at the end of the code sequence.
[0076]
FIG. 9 shows the structure of a code string when a packet containing encoded wavelet coefficient values is represented for each subband. The packet sequence has the same structure whether or not the division processing by the tile is performed.
[0077]
At the time of decoding, conversely, image data is generated from the code sequence of each tile of each component. This will be briefly described with reference to FIG. 4. In this case, the tag processing unit 205 interprets the tag information added to the code sequence input from the outside and decomposes the code sequence into the code sequences of each tile of each component, and decoding is performed on the code sequence of each tile of each component. The position of the bit to be decoded is determined in the order based on the tag information in the code sequence, and the inverse quantization unit 203 generates a context from the arrangement of the neighboring (already decoded) bits around the target bit position. The entropy decoding unit 204 performs decoding by probability estimation from the context and the code sequence to generate the target bit, and writes it to the position of the target bit.
[0078]
Since the data decoded in this way is spatially divided for each frequency band, the two-dimensional inverse wavelet transform unit 202 applies a two-dimensional inverse wavelet transform to it so that each tile of each component of the image data is restored. The restored data is converted by the color space inverse converter 201 into data of the original color system.
[0079]
The above is the outline of the “JPEG2000” algorithm. The “Motion JPEG2000” algorithm is obtained by extending the method for a still image, that is, a single frame to a plurality of frames.
[0080]
[First Embodiment of the Invention]
Hereinafter, an embodiment of the present invention will be described in detail.
[0081]
Here, a description will be given mainly of a moving image compression / expansion method represented by Motion JPEG2000, but it is needless to say that the present invention is not limited to the following description.
[0082]
FIG. 10 is a block diagram illustrating a hardware configuration of the code string conversion device 1 according to the present embodiment. In the code string converter 1, an MPU 2 that performs various operations and centrally controls each unit of the code string converter 1 and a memory 3 including various ROMs and RAMs are connected by a bus 4. A program is stored in the memory 3 (ROM). This program is executed by the MPU 2.
[0083]
Further, an ASIC (or FPGA) 5 for executing predetermined processing and a network interface 6 are connected to the bus 4. The network interface 6 serves as an interface for connecting a network 7 such as a LAN, a WAN, or the Internet to the code string converter 1.
[0084]
Further, an RCL (Reconfigurable Logic) 8 is connected to the bus 4. The RCL 8 is in charge of a higher layer of an execution unit of the algorithm executed by the code sequence conversion device 1, that is, a process that requires high-speed processing. If the content of the high-speed processing is fixed, the function can be left to hard-wired logic such as the ASIC (or FPGA) 5. In this case, the RCL 8 needs to handle only special processing that is infrequent but requires high speed.
[0085]
FIGS. 11 and 12 are functional block diagrams of processing performed by the hardware described with reference to FIG. 10 based on a predetermined program. FIG. 11 shows a case where the input code string is a moving image, and FIG. 12 shows a case where the code string is a still image. Blocks to be realized by the processing performed by the RCL 8 are indicated by *.
[0086]
As shown in FIG. 11, the code sequence conversion device 1 includes a syntax analysis unit 12 that analyzes the syntax of a code string created by dividing moving image data, for each frame, into one or a plurality of rectangular regions (tiles) using the Motion JPEG2000 algorithm or the like, orthogonally transforming the pixel values of each rectangular region by discrete cosine transform, discrete wavelet transform, or the like, and hierarchically encoding the result. From the syntax analyzed by the syntax analysis unit 12, the packet length detection unit 21 detects the packet length of each packet constituting the code string. The motion amount detecting means 22 detects the amount of motion between the current frame and the previous frame. The motion amount detecting means 22 includes a packet length storing means 23 for storing the packet length detected by the packet length detecting means 21, and a difference detecting means 24 for obtaining the difference in packet length between the current frame and the previous frame. In this example, the motion amount detecting means 22 uses, as the motion amount between the current frame and the previous frame, a correlation coefficient value obtained from the difference in packet length between the current frame and the previous frame. Since a large correlation coefficient value means that the images of the two frames are similar, it corresponds to a small motion amount.
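The relationship between packet lengths and motion amount can be sketched as follows. This is only an illustration: the text does not give the formula for the correlation coefficient, so a Pearson correlation over the per-packet lengths of two frames is assumed, and the function name `packet_length_correlation` is hypothetical.

```python
from math import sqrt

def packet_length_correlation(prev_lengths, curr_lengths):
    """Pearson correlation between the packet lengths of two frames.

    Assumption: both frames expose the same number of packets in the
    same order. A value near 1 suggests similar frames (small motion).
    """
    if len(prev_lengths) != len(curr_lengths) or not prev_lengths:
        raise ValueError("frames must have the same non-zero packet count")
    n = len(prev_lengths)
    mean_p = sum(prev_lengths) / n
    mean_c = sum(curr_lengths) / n
    cov = sum((p - mean_p) * (c - mean_c)
              for p, c in zip(prev_lengths, curr_lengths))
    var_p = sum((p - mean_p) ** 2 for p in prev_lengths)
    var_c = sum((c - mean_c) ** 2 for c in curr_lengths)
    if var_p == 0 or var_c == 0:
        # Degenerate case: treat identical constant frames as fully correlated.
        return 1.0 if list(prev_lengths) == list(curr_lengths) else 0.0
    return cov / sqrt(var_p * var_c)
```

Because only header information is read, such a measure needs no decoding of the payload, which is consistent with the high-speed processing the device aims for.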
[0087]
The subband code amount distribution detecting means 25 detects the subband code amount distribution in the current frame. Here, a method of detecting the subband code amount distribution based on the packet length will be described. Since the subband code amount of the payload portion, that is, the packet length, is described in the header portion, this information is read by the syntax analysis means 12, and the packet length of the frame is detected by the packet length detection means 21. The subband code amount distribution can be detected on the basis of the packet length of the frame.
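A minimal sketch of this detection, assuming each packet can be associated with one subband label and that its length has already been read from the header. The names `subband_code_amounts` and `relative_distribution`, and the label "LL" for the lowest band, are illustrative assumptions.

```python
from collections import defaultdict

def subband_code_amounts(packets):
    """Sum packet lengths per subband.

    packets: iterable of (subband_label, packet_length) pairs.
    """
    totals = defaultdict(int)
    for subband, length in packets:
        totals[subband] += length
    return dict(totals)

def relative_distribution(totals, reference_band="LL"):
    """Express each subband total relative to a reference band (assumed "LL")."""
    ref = totals[reference_band]
    if ref == 0:
        raise ValueError("reference band has no code data")
    return {band: amount / ref for band, amount in totals.items()}
```

For example, `relative_distribution(subband_code_amounts([("LL", 200), ("HL", 50), ("LL", 50)]))` gives a relative amount of 1.0 for "LL" and 0.2 for "HL".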
[0088]
The compression ratio specifying means 26 specifies the compression ratio of the image selected by the user in the code string conversion device 1.
[0089]
The quantization table group 27 is a set of tables (quantization tables) arranged in a matrix and indexed by the index values for “motion amount”, “subband code amount distribution”, and “compression ratio” (motion amount index value (i), code amount distribution index value (j), and compression ratio index value (k), described later). Each table registered in the quantization table group 27 indicates how to execute a packet operation of the packet switch 30 described later, and specifies a specific method of code reduction.
[0090]
The quantization table selecting unit 28 selects an appropriate quantization table (which becomes the quantization table 31) from the quantization table group 27, using the above-described index values i, j, and k as indices, based on the motion amount between the current frame and the previous frame detected by the motion amount detecting unit 22, the subband code amount distribution in the current frame detected by the subband code amount distribution detecting unit 25, and the compression ratio of the image selected by the user and specified by the compression ratio specifying means 26.
[0091]
The quantization means 29 includes a packet switch 30 for requantizing the code string. The packet switch 30 selects codes constituting the code string. Specifically, packets constituting a code string are selected by performing packet operations. This is realized by the function executed by the RCL 8 as described above. The selection of the packet is performed based on the selected quantization table 31. Further, in accordance with the selection by the quantization means 29, the code string creating means 13 rewrites the header data of the code string to make a new code string.
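The packet operation can be illustrated with a small sketch. The internal format of a quantization table is not given in the text, so it is assumed here that a table is simply the set of subbands whose packets are retained; the packet dictionaries and the function name `requantize` are likewise hypothetical.

```python
def requantize(packets, table):
    """Keep only the packets whose subband the table retains.

    packets: list of dicts with keys "subband" and "data" (bytes).
    table:   set of subband labels to keep (assumed table format).
    Returns a rebuilt header (packet lengths only) and the kept packets.
    """
    kept = [p for p in packets if p["subband"] in table]
    # The header must be rewritten so that it describes only the kept packets.
    header = {"packet_lengths": [len(p["data"]) for p in kept]}
    return header, kept
```

Since packets are discarded whole and never decoded, the operation amounts to selection plus header rewriting, which matches the division of work between the packet switch 30 and the code string creating means 13.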
[0092]
As shown in FIG. 12, in the code string conversion apparatus 1 for a still image, the syntax analysis unit 12 analyzes the syntax of a code string created by dividing the still image data into one or a plurality of rectangular areas (tiles) using the JPEG2000 algorithm or the like, orthogonally transforming the pixel values of each area by discrete cosine transform, discrete wavelet transform, or the like, and hierarchically encoding the result. The motion amount detecting means 22, which is necessary in the case of FIG. 11 where the input code string is a moving image, becomes unnecessary when the input code string is a still image. In addition, restrictions on the processing speed are generally relaxed. The resource margin created in this way can therefore be allocated to another functional module that lacks resources, or the idle resources can be put into a sleep state to reduce power consumption.
[0093]
Hereinafter, the contents of the specific processing executed by the code string conversion device 1 will be described.
[0094]
An example will be described with reference to FIG. 13. In FIG. 13, the vertical axis represents the motion amount of the image, for which the correlation coefficient value obtained from the difference of the wavelet coefficient value between the current frame and the previous frame is used. As described above, a large correlation coefficient value means that the images of the two frames are similar, and thus corresponds to a small motion amount. The horizontal axis represents the tile. Here, the moving images and the rectangular tiles shown in FIGS. 31 and 33 will be described as an example.
[0095]
A plurality of thresholds are provided for the correlation coefficient value, and a corresponding “motion amount index value” is assigned to a value between the thresholds in descending or ascending order. Then, in the quantization table corresponding to each motion amount index value, a subband in which encoded data is reduced and a subband in which encoded data is stored are determined.
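The threshold classification can be sketched as below. The threshold values themselves are not given in the text; nine evenly spaced thresholds are used purely as an assumption, and the name `motion_index` is hypothetical. A higher correlation maps to a smaller motion index.

```python
from bisect import bisect_right

def motion_index(correlation, thresholds):
    """Map a correlation coefficient to a motion amount index value.

    thresholds: ascending correlation cut points (assumed values).
    The index is 0 above the highest threshold (almost no motion) and
    len(thresholds) below the lowest threshold (largest motion).
    """
    rank = bisect_right(thresholds, correlation)
    return len(thresholds) - rank

# Hypothetical thresholds 0.1, 0.2, ..., 0.9 giving index values 0 to 9.
EXAMPLE_THRESHOLDS = [k / 10 for k in range(1, 10)]
```

With these assumed thresholds, a correlation of 0.95 yields the index 0 and a correlation of 0.05 yields the index 9.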
[0096]
Among the index values, the “compression ratio index value” is used when roughly determining the size of the code amount in a semi-quantitative manner. The user first specifies this compression ratio index value. This example shows a case where “k = p” is selected as the index value. Note that “k = 0” represents lossless encoding (lossless compression).
[0097]
The motion index value is determined according to the motion detected by the motion detector 22. In the example of FIG. 13, the tile # = A having the largest motion amount is given “i = 9” as the motion index value, and the tile # = C having the smallest motion amount is given “i = 1”. Then, a quantization table corresponding to each index value is selected as the quantization table 31, and quantization is performed by the quantization means 29.
[0098]
So far, a case has been described in which only one quantization table (main quantization table) is selected by the quantization table selection unit 28. This alone can significantly reduce the "distorted distortion" that appears in the background of an image, which has been annoying in the past.
[0099]
Here, an example in which a quantization table (sub-quantization table) other than the main quantization table is selected from the quantization table group 27 and the processing is executed in order to further expand the image quality improvement range will be described. Now, it is assumed that the rectangular area having the largest motion amount, tile # = A, satisfies the selection condition of the sub-quantization table. Then, in tile # = A, in addition to the motion index value “i = 9” of the main quantization table, “i = 8” and “i = 7” are used as the motion index values of the sub-quantization table. (The selection of the index value of the sub-quantization table will be described later.)
[0100]
As a result, a sub-quantization table is selected separately from the main quantization table using the motion amount index value of the sub-quantization table, and the input code string is re-quantized using this table. Then, the code string newly generated thereby is arranged in the sub-component separately from the main component. Then, by comparing and selecting the code strings placed in the main component and the sub-component, the image quality is further improved (the arrangement in the main and sub-components will be described later).
[0101]
FIG. 14 shows an example in which the “subband code amount distribution” is reflected in the quantization in the quantization means 29. Here, the sub-band code amount distribution is a sum of orthogonal transform coefficient values for each sub-band calculated by the sub-band code amount distribution detecting unit 25. The vertical axis represents the relative code amount when the low-order sub-band code amount of the highest layer (having the largest number of decomposition levels) is normalized to “1”. The horizontal axis represents the subband.
[0102]
In the curve showing the subband code amount distribution for each tile, a thin line portion represents a subband from which code data is to be deleted, while a thick line portion represents a subband in which code data is stored as it is. (a) corresponds to a case where the number of subbands to be reduced in code data is large, and (b) corresponds to a case where the number is small; these correspond to a case where the compression ratio is low and a case where the compression ratio is high, respectively. The tile numbers # = A to D are defined in FIG.
[0103]
Tiles # = A and D hold code data up to the high band. Further, tile # = B has a relatively large amount of information in the middle band, and tile # = C has almost all of its information in the low band. As described above, since the distribution of the encoded data amount over the sub-bands differs for each tile, three types of index values in descending or ascending order (sub-band code amount distribution index values) are provided for the quantization process according to the sub-band code amount distribution. Then, a corresponding quantization table is prepared for each index value.
[0104]
That is, the quantization table “j = H” is assigned to the tiles # = A and D, in which the code data amount is distributed largely in the high band, the quantization table “j = M” is assigned to the tile # = B, in which it is distributed largely in the middle band, and the quantization table “j = L” is assigned to the tile # = C, in which it is distributed largely in the low band.
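One possible way to derive the three index values is sketched below. The text does not state the classification rule, so the rule used here (a band group counts as populated when its share of the total code amount reaches a threshold) and the threshold value 0.15 are assumptions, as is the name `code_amount_distribution_index`.

```python
def code_amount_distribution_index(low, mid, high, share_threshold=0.15):
    """Classify a tile as "L", "M" or "H" from its code amounts per band group.

    low, mid, high: code amounts summed over low, middle and high subbands.
    share_threshold: assumed minimum share for a band group to count.
    """
    total = low + mid + high
    if total <= 0:
        raise ValueError("tile has no code data")
    if high / total >= share_threshold:
        return "H"  # code data reaches up to the high band
    if mid / total >= share_threshold:
        return "M"  # information concentrated up to the middle band
    return "L"      # almost all information in the low band
```

Checking the high band first reflects the description of tiles that "hold code data up to the high band"; a real implementation might instead compare the whole distribution curve against stored patterns.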
[0105]
As a result, the image quality deterioration of the tiles # = A, B, and D, which has conventionally occurred particularly when the compression ratio is high, is greatly reduced, and image quality equivalent to that of the tile # = C can be maintained.
[0106]
FIG. 15 is another explanatory diagram illustrating “quantization” optimized according to the above-described subband code amount distribution. Here, the vertical axis represents the sum of the code amounts of all the subbands, and the horizontal axis represents the tile #. A tile in which image information is collected in a low-frequency sub-band has a small sum of code amounts of all sub-bands, and a tile in which image information is concentrated in a high-frequency band has a large sum of code amounts. The code amount of each tile is reduced to the minimum code amount sum provided corresponding to the sub-band code amount distribution (here, the sub-band code amount distribution index value is “j = H, M, L”).
[0107]
In the quantization table, the rate at which the code amount is reduced is determined according to the code amount before quantization. Therefore, when the code data after quantization is decoded and displayed as an image, the deterioration of the image quality is not very noticeable over the entire range from the high band to the low band. This is because the code amount is not reduced uniformly, without regard to the code amount before quantization, as in the related art.
[0108]
So far, the case where the selection condition of the sub-quantization table is not considered has been described. Hereinafter, a case will be described in which the sub-quantization table is further selected for higher image quality.
[0109]
Now, for the tile # = D, “j = H” is given as the main index value. However, looking at the code amount distribution, it can be seen that the degree of concentration of the code amount in the low frequency sub-band is high. That is, in addition to “j = H”, “j = M” and “j = L” may be considered as index values. For this purpose, the selection condition of the sub-quantization table may be set so as to consider not only the main quantization table but also the sub-quantization table in such a case.
[0110]
Then, “j = M” or “j = L” can be assigned to the tile # = D as a sub-index value. FIG. 16 shows a state of main quantization and sub-quantization at a low compression ratio.
[0111]
FIG. 17A shows a quantization table group 27 in which “motion amount” is considered. The quantization table group 27 shown here has a two-dimensional matrix configuration indexed by the “motion amount index value (i = 0, 1, ..., 9)”, assigned in descending or ascending order by classifying the motion amount against preset threshold values, and the specified “compression ratio index value (k = 0, 1, ..., n)”. In this example, nine index values are provided for the motion amount, and index values from lossless (k = 0) to the highest compression ratio (k = n) are provided for the compression ratio.
[0112]
FIG. 17B shows a conventional quantization table group 27a for comparison. This corresponds to a case where there is no motion amount, that is, a case where the motion amount index value is “i = 0”.
[0113]
In the code sequence conversion device 1 using the quantization table group 27 of the present example, the optimal quantization table is selected by referring to the quantization table group 27 according to each index value obtained by classifying the features of each tile. When “k = p” is selected as the index value of the compression ratio, the index value of tile # = A is “(i, k) = (9, p)”, and that of tile # = B is “(i, k) = (2, p)”. Actual quantization is performed using the quantization table corresponding to the index value.
[0114]
The number of quantization tables required for each compression ratio increases from one in the related art to ten. Further, the total number of quantization tables is increased from n + 1 to (n + 1) × 10, so that it is possible to cope with various images in detail.
[0115]
Normally, the code data of each tile is re-quantized using a main quantization table corresponding to one index value. When the selection condition of the sub-quantization table is satisfied, a sub-index value is assigned in addition to the main index value. In this case, the code data of each tile is subjected to a plurality of types of requantization using a plurality of quantization tables corresponding to a plurality of index values. It is preferable that at least one of the index values of the sub-quantization table assigned here takes a value adjacent to the corresponding index value of the main quantization table (in this case, the other index values are the same as those of the main quantization table). For example, when the tile # = A having “(i, k) = (9, p)” as the index value of the main quantization has two index values of the sub-quantization, they are “(i, k) = (8, p), (7, p)”.
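The assignment of adjacent sub-index values can be sketched as follows. The order in which neighbours are tried (the lower index first, then the higher one, widening step by step) is an assumption, as is the function name `sub_motion_indices`; the text only requires that at least one sub-index be adjacent to the main one.

```python
def sub_motion_indices(main, count, max_motion_index=9):
    """Return up to `count` sub-index pairs near the main pair (i, k).

    Only the motion index i is varied; the compression ratio index k is
    kept equal to that of the main quantization table.
    """
    i, k = main
    result = []
    step = 1
    while len(result) < count and (i - step >= 0 or i + step <= max_motion_index):
        for candidate in (i - step, i + step):
            if 0 <= candidate <= max_motion_index and len(result) < count:
                result.append((candidate, k))
        step += 1
    return result
```

For the tile # = A example, `sub_motion_indices((9, "p"), 2)` gives `[(8, "p"), (7, "p")]`, since no index above 9 exists.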
[0116]
FIG. 18A shows a quantization table group 27 in consideration of the “subband code amount distribution”. The subband code amount distribution is classified into three types of patterns according to the characteristics of each tile. As in the case of the motion amount, index values are given, and these are called “code amount distribution index values (j = L, M, H)”. In addition, the user can specify the compression ratio by the “compression ratio index value (k = 0, 1, ..., n)”. Shown here is a two-dimensional matrix composed of these two types of index values.
[0117]
FIG. 18B shows a conventional quantization table group 27a for comparison. This corresponds to a case where the code amount distribution is fixed, for example, a case where the code amount distribution index value = M.
[0118]
In the code sequence conversion device 1 using the quantization table group 27 of the present example, the optimal quantization table is selected by referring to the quantization table group 27 according to each index value obtained by classifying the features of each tile. When “k = p” is selected as the compression ratio index value, the index value of tile # = A is “(j, k) = (H, p)”, and that of tile # = B is “(j, k) = (M, p)”. Actual quantization is performed using the quantization table corresponding to the index value.
[0119]
Then, the number of quantization tables required for each compression ratio increases from one in the related art to three. Although the total number of quantization tables increases from n + 1 to (n + 1) × 3, image processing with higher flexibility can be performed. The example using the sub-quantization table has already been described with reference to FIG.
[0120]
FIG. 19A shows a quantization table group 27 corresponding to a moving image in consideration of both the “motion amount” and the “subband code amount distribution”. The quantization table group 27 shown here is a three-dimensional matrix (i, j, k) indexed by the “motion amount index value (i = 0, 1, ..., N)”, the “code amount distribution index value (j = L, M, H)”, and the “compression ratio index value (k = 0, 1, ..., n)”. In this example, each index value is assigned in ascending order or descending order to each range of the “motion amount”, “subband code amount distribution”, and “compression ratio”. FIG. 19B shows a conventional quantization table group 27a for comparison.
[0121]
Then, the code sequence conversion device 1 using the quantization table group 27 of the present example selects the optimal quantization table by referring to the quantization table group 27 according to the index values (i, j, k) obtained by classifying the features of each tile. When “k = p” is selected as the index value of the compression ratio, the index value of tile # = A is “(i, j, k) = (9, H, p)”, and that of tile # = B is “(i, j, k) = (2, M, p)”. Actual quantization is performed using the quantization table corresponding to the index value.
[0122]
Then, the number of quantization tables required for each compression ratio increases from one in the related art to thirty. Further, the total number of quantization tables increases from n + 1 to (n + 1) × 30.
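The table counts can be checked with a small sketch in which the quantization table group is modelled as a dictionary keyed by (i, j, k). The table contents are placeholders, and the names `build_table_group` and `select_table` are hypothetical; ten motion index values and three distribution index values are taken from the description.

```python
from itertools import product

def build_table_group(n, motion_levels=10, distribution_labels=("L", "M", "H")):
    """Build a placeholder table group for k = 0..n."""
    return {
        (i, j, k): f"table({i},{j},{k})"  # placeholder for real table contents
        for i, j, k in product(range(motion_levels), distribution_labels, range(n + 1))
    }

def select_table(group, i, j, k):
    """Look up the quantization table for the index values (i, j, k)."""
    return group[(i, j, k)]
```

For n = 4, the group holds (4 + 1) × 30 = 150 tables, of which 30 belong to any single compression ratio index value.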
[0123]
FIG. 20A shows a quantization table group 27 in which only the “subband code amount distribution” is considered. This corresponds to a still image. The quantization table group 27 shown here is a two-dimensional matrix (j, k) composed of the “code amount distribution index value (j = L, M, H)” and the “compression ratio index value (k = 0, 1, ..., n)”. FIG. 20B shows a conventional quantization table group as a comparative example.
[0124]
FIG. 21 illustrates conditions for selecting a sub-quantization table. In this example, the index values are assigned in ascending order to the values of “motion amount”, “sum of the total sub-band code amount on the low frequency side”, and “compression ratio”. Then, when each index value indicating the magnitude of each of these values is equal to or greater than a preset threshold value (when any one index value is equal to or greater than the threshold value), the above main quantization table In addition, a sub-quantization table is selected (a sub-quantization table group including a plurality of sub-quantization tables is indicated by reference numeral 27b).
[0125]
For example, when the main index value is “(i, j, k) = (7, 2, n−2)”, the selection condition of the sub-quantization table is satisfied, and one or more sub-quantization tables are newly selected. On the other hand, when the main index value is “(i, j, k) = (0, 3, 1)”, the sub-quantization table is not selected. In this example, when the code amount index value among the main index values is “j = 0”, the sub-quantization table is not selected regardless of the index value of the motion amount or the compression ratio. In other words, for an image in which the subband code amount distribution is biased toward the high band, sufficient image quality is assumed to be obtained by requantization with the main quantization table alone.
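The selection condition can be sketched as a simple predicate. The actual threshold values of FIG. 21 are not given in the text; the values used in the usage note below (5, 4 and n − 2) are assumptions chosen only so that the two examples in the description behave as stated, and the name `needs_sub_table` is hypothetical.

```python
def needs_sub_table(i, j, k, i_min, j_min, k_min):
    """Return True when a sub-quantization table should also be selected.

    i, j, k: main index values (motion, code amount, compression ratio).
    i_min, j_min, k_min: assumed per-index thresholds.
    """
    if j == 0:
        # Described exception: never select a sub-table when j = 0.
        return False
    # Any one index value reaching its threshold satisfies the condition.
    return i >= i_min or j >= j_min or k >= k_min
```

With n = 10 and the assumed thresholds, `needs_sub_table(7, 2, 8, 5, 4, 8)` is True and `needs_sub_table(0, 3, 1, 5, 4, 8)` is False.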
[0126]
FIGS. 22 and 23 illustrate a method of arranging code data requantized using the main quantization table and the sub-quantization table in components. FIG. 22 illustrates the case of a moving image, and FIG. 23 illustrates the case of a still image. Now, it is assumed that the input code data is configured in a color space of Y (luminance), Cb (blue color difference), and Cr (red color difference). For the sake of simplicity, only the luminance component is described here, but the same processing is performed for the remaining color components. When the input code string is configured in one of the RGB, YUV, and YCbCr color spaces, the selection condition can be applied to at least one component of the color space.
[0127]
In each frame of the moving image (see FIG. 22), requantization is always performed in the main quantization table, and the result is arranged in the main component. Only when the sub-quantization table is selected, re-quantization is performed not only in the main quantization table but also in the sub-quantization table. Then, the result is arranged in the sub-component. The arrangement of these components is performed by the code string creating means 13. In this example, the selection condition of the sub-quantization table becomes active when frame # = n, n + 5, n + 6, and n + 9. In these frames, a quantization table having a different motion index value is selected from the quantization table group 27 separately from the motion index value of the main index value, and the input code data is requantized. Then, the result is arranged and held in the sub-component.
[0128]
In each tile of the still image (see FIG. 23), requantization is always performed with the main quantization table, and the result is arranged in the main component. Only when the sub-quantization table is selected, re-quantization is performed not only with the main quantization table but also with the sub-quantization table, and the result is arranged in the sub-component. In this example, the sub-quantization table selection condition becomes active when tile # = n, n + 7, n + 8, n + 11. In these tiles, a quantization table having a different code amount distribution index value is selected from the quantization table group 27 separately from the code amount distribution index value of the main index value, and the input code data is re-quantized. Then, the result is arranged and held in the sub-component. In the case of tile # = m + 2, the selection condition of the sub-quantization table is not satisfied, but the user forcibly activates it. In this case, the selection of the sub-quantization table is set to be executed regardless of the selection condition of the quantization table (first setting means), and the re-quantization is performed at a compression rate different from that of the main quantization table. Note that the user can set an upper limit for the number of quantization tables selected by the quantization table selection unit 28 (second setting unit).
[0129]
In the examples of FIGS. 22 and 23, when the sub-quantization table is not selected, the code string creating unit 13 arranges the code data of the image prepared in advance in the sub-component. For example, code data with a pixel value of 0 (white) may be set for all pixels.
[0130]
Here, a “macro” method of reading packet length information from the header portion is used to calculate the code amount. FIG. 24 explains the principle. Since the data amount of the payload portion 42 of the code string 41, that is, the “packet length”, is described in the header portion 43, this information is read by the syntax analysis means 12 and used. The motion amount of the image appears in the change in the code amount, that is, the change in the packet length. Further, the code amount distribution is obtained from the histogram of the packet length for each subband. Although the motion amount obtained by this method may include a slight error compared with the “micro” method, the method has the great advantage that the configuration of the code string conversion device 1 is simple and processing can be performed at high speed.
[0131]
According to the code string conversion device 1 of FIG. 11 described above, quantization is performed in consideration of both the “motion amount” and the “code amount distribution” of an image, so that quantization with fine image quality control can be performed.
[0132]
Further, when the input original image is composed of a plurality of components, it is possible to further speed up the entire system. That is, for example, when the input code string is a YCbCr signal, the packet length detection may be performed using only the Y component. As the quantization table for the Cb and Cr components, the same quantization table as that obtained for the Y component may be selected.
[0133]
Further, according to the code sequence conversion device 1 of FIG. 12, “subband code amount distribution” is used as a criterion for selecting a quantization table. As a method of calculating the code amount, a “macro-like” method of reading packet length information from a header portion is used.
[0134]
FIG. 25 shows a quantization table in consideration of the subband code amount distribution. In the example shown here, the optimal quantization is performed on the code amount distribution, so that high image quality can be maintained.
[0135]
FIG. 26 shows an example of encoding a still image in consideration of the subband code amount distribution. In the still image of FIG. 26(a), the main content of tile # = U is the blue sky, that of tile # = W is the densely patterned front face of a high-rise building with its window frames, and that of tile # = V is both the high-rise building and the sky. As for the subband code amount distribution, image information is concentrated in the low band for tile # = U, in the high band for tile # = W, and over the entire range from the low band to the high band for tile # = V.
[0136]
Here, in the quantization of the tile # = V, since the image information is spread over the entire range from the low band to the high band, quantization must be performed so as not to upset the image quality balance between the front face of the building in the high band and the sky in the low band. This is very difficult and often results in poor image quality. Moreover, it tends to become more difficult to find the optimal solution as the compression ratio increases.
[0137]
Therefore, in this example, a rectangular area corresponding to the boundary between the sky and the building is provided with a sub-quantization table in addition to the main quantization table (see FIG. 26B).
[0138]
As the sub index value, an index value with a lower compression ratio than the compression ratio index value of the main quantization is assigned. Specifically, at a low compression ratio (index value k = p), lossless compression (k = 0) is used; at a medium compression ratio (k = q), an index value on the lower compression ratio side (k = q − m) is used; and at a high compression ratio (k = r), an index value on the lower compression ratio side (k = r − n) is used. As described above, by performing a plurality of quantization operations on a delicate image, it is possible to encode the original image while maintaining high image quality.
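The choice of the sub compression ratio index can be written as a one-line rule. The offsets m and n are not specified in the text, so the offset is left as a parameter, and clamping at 0 (lossless) is an assumption that reproduces the low-compression case; the name `sub_compression_index` is hypothetical.

```python
def sub_compression_index(k_main, offset):
    """Return a sub index on the lower compression ratio side of k_main.

    offset: how far to move toward lossless (assumed to be chosen per range).
    The result never goes below 0, which denotes lossless compression.
    """
    if offset < 0:
        raise ValueError("offset must not be negative")
    return max(0, k_main - offset)
```

For instance, with a main index of 6 and an offset of 2 the sub index is 4, while a main index of 1 with an offset of 5 falls back to lossless (0).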
[0139]
FIG. 27 illustrates a case where the tile division processing is performed on the original image and a case where the tile division processing is not performed. Even when tile division is not performed, if the precincts and code blocks are used as rectangular areas, the "motion amount" and "subband code amount distribution" of the image are finely detected, as in the case of performing tile division. It is possible to perform quantization while suppressing image quality deterioration.
[0140]
FIG. 28 is a block diagram illustrating a schematic configuration of an image editing system 51 including the code string conversion device 1. As shown in FIG. 28, in the image editing system 51, the code sequence conversion device 1 (first code sequence conversion device) creates a new code sequence from a code sequence obtained by compression-coding an image as described above. At this time, it is desirable that the input code string is a so-called lossless code stream that is reversibly coded so that the capability of the image editing system 51 can be maximized. When it is difficult to obtain lossless code data, a code string encoded at a low compression ratio is desirable.
[0141]
The image decompression device 52 decompresses the code data quantized with the main and sub-quantization tables by performing processes such as decoding, inverse quantization, and inverse orthogonal transform, so that the image display device 53 can display and output the image.
[0142]
The image editor views the image displayed on the image display device 53 and, by means of a main/sub quantized code data exchange instruction signal, selects the component judged most appropriate among the main and sub components included in the code string. Based on this instruction signal, the code string conversion device 54 (second code string conversion device) extracts the selected component from the components of the new code string output from the code string conversion device 1, and creates a new code string that includes this component as the main component and excludes the unselected components. In this way, the code string requantized with the most suitable of the plurality of quantization tables is finally determined and output from the image editing system 51.
[0143]
FIG. 29 is a block diagram illustrating a schematic configuration of an image editing system 51 further including a distortion amount measuring device 55. This configuration automates the component selection that the editor performs in the image editing system 51 described above. One method of measuring the amount of distortion is to compare pixel values between the original image and the image after quantization. The image quality evaluation, which otherwise depends on the subjective judgment of the editor, thereby becomes objective. In addition, the editing process can be sped up and automated.
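A minimal sketch of such an automated selection follows. The specification only states that pixel values of the original and quantized images are compared; the use of mean squared error as the distortion measure, and the names `mean_squared_error` and `pick_component`, are assumptions made for illustration.

```python
def mean_squared_error(original, decoded):
    """Average squared pixel difference between two equally sized images.

    Images are given as flat sequences of pixel values (an assumption;
    any layout works as long as both images use the same one).
    """
    if len(original) != len(decoded) or not original:
        raise ValueError("images must be non-empty and of equal size")
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)


def pick_component(original, candidates):
    """Return the name of the candidate component with the least distortion.

    `candidates` maps a component name (e.g. "main", "sub") to the pixel
    values decoded from that component.
    """
    return min(candidates, key=lambda name: mean_squared_error(original, candidates[name]))


original = [10, 20, 30, 40]
candidates = {
    "main": [10, 22, 29, 40],  # decoded from the main-quantized component
    "sub": [10, 20, 30, 41],   # decoded from the sub-quantized component
}
print(pick_component(original, candidates))
```

Here the selected name would play the role of the exchange instruction passed to the second code string conversion device. A practical system might also weigh the code amount of each component, which this sketch ignores.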
[0144]
FIG. 30 is a block diagram illustrating a schematic configuration of a camera system 61 including the code string conversion device 1. As shown in FIG. 30, the image input device 62 includes a photoelectric conversion element such as a CCD and captures a still image. The image compression device 63 compression-encodes the image data of the photographed still image. Then, the code string conversion device 1 creates a new code string from the code string after the compression encoding as described above. This new code string is transmitted via a predetermined network or stored in a storage device.
[0145]
[Effects of the Invention]
According to the first aspect of the present invention, requantization can be performed after determining the type of an image from three viewpoints, namely the motion amount of the image, the subband code amount distribution, and the compression ratio, so that a moving image can be encoded while maintaining high image quality. In addition, since the motion amount and the subband code amount distribution are obtained from the code amount and the quantization is performed in the code state, the memory required for the processing can be reduced, the processing speed can be increased, and the power consumption required for the processing can be reduced.
[0146]
According to the second aspect of the present invention, requantization can be performed after determining the type of an image from two viewpoints, namely the subband code amount distribution and the compression ratio, so that a still image can be encoded while maintaining high image quality. Further, since the subband code amount distribution is obtained from the code amount and the quantization is performed in the code state, the memory required for the processing can be reduced, the processing speed can be increased, and the power consumption required for the processing can be reduced.
[0147]
According to a third aspect of the present invention, in the invention according to the first or second aspect, requantization can be performed with a plurality of patterns, so that a moving image or a still image can be encoded while maintaining high image quality.
[0148]
According to a fourth aspect of the present invention, in the invention according to the third aspect, when the motion amount, the subband code amount distribution, or the compression ratio is relatively large, requantization is performed with a plurality of patterns, so that a moving image or a still image can be encoded while maintaining high image quality.
[0149]
According to a fifth aspect of the present invention, in the invention according to the third or fourth aspect, requantization can be performed with a plurality of patterns when necessary even if the selection condition is not satisfied, so that a moving image or a still image can be encoded while maintaining high image quality.
[0150]
According to a sixth aspect of the present invention, in the invention according to any one of the third to fifth aspects, requantization with an unlimited number of patterns can be prevented, so that unnecessary processing is avoided.
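The selection behavior summarized in the third to sixth aspects can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function `select_tables`, the dictionary keys, and the rule that sub tables are generated by stepping the compression ratio index toward the lower-compression side are all hypothetical.

```python
def select_tables(indices, values, thresholds, force_sub=False, max_sub=1):
    """Return the list of quantization table indices to use.

    indices    -- (i, j, k): motion amount, code distribution and
                  compression ratio index values of the main table
    values     -- measured values keyed by "motion", "low_band_code"
                  and "compression" (hypothetical key names)
    thresholds -- threshold per key; the selection condition holds when
                  any value is at or above its threshold (fourth aspect)
    force_sub  -- select sub tables regardless of the condition (fifth aspect)
    max_sub    -- upper limit on the number of sub tables (sixth aspect)
    """
    i, j, k = indices
    selected = [(i, j, k)]  # the main table is always selected
    condition = any(values[key] >= thresholds[key] for key in thresholds)
    if condition or force_sub:
        # Assumed rule: sub tables lower the compression ratio index step by step.
        candidates = [(i, j, k - step) for step in range(1, k + 1)]
        selected.extend(candidates[:max_sub])
    return selected


values = {"motion": 5, "low_band_code": 100, "compression": 10}
thresholds = {"motion": 4, "low_band_code": 500, "compression": 20}
print(select_tables((1, 2, 3), values, thresholds, max_sub=2))
```

Capping the candidate list with `max_sub` is what keeps the number of requantization patterns, and hence the processing load, bounded.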
[0151]
According to a seventh aspect of the present invention, in the invention according to any one of the third to sixth aspects, in addition to the requantization considered most appropriate for the motion amount, the subband code amount distribution, and the compression ratio, the requantization considered next most appropriate can also be performed.
[0152]
According to an eighth aspect of the present invention, in the invention according to any one of the third to seventh aspects, when the selection condition is satisfied in at least one color space, requantization is performed with a plurality of patterns, so that a moving image or a still image can be encoded while maintaining high image quality.
[0153]
According to a ninth aspect of the present invention, in the invention according to any one of the third to eighth aspects, a single code string can be obtained that holds, in a plurality of components, the code data requantized with a plurality of patterns.
[0154]
According to a tenth aspect of the present invention, in the invention according to the ninth aspect, when requantization is not performed with a plurality of patterns, code data prepared in advance is placed in the component where code data obtained from another table would otherwise be placed, which facilitates subsequent use of the new code string.
[0155]
According to an eleventh aspect of the present invention, in the invention according to the tenth aspect, when requantization is not performed with a plurality of patterns, code data of a white image having a minimum pixel value can be placed in the component where code data obtained from another table would otherwise be placed, which facilitates subsequent use of the new code string.
[0156]
According to a twelfth aspect of the present invention, in the invention according to any one of the first to eleventh aspects, the required table can be accurately selected by means of index values assigned in ascending or descending order according to the respective values of the motion amount of the image, the subband code amount distribution, and the compression ratio. Requantization can thus be performed with a plurality of patterns, so that a moving image or a still image can be encoded while maintaining high image quality.
[0157]
According to a thirteenth aspect of the present invention, in the invention according to any one of the first to twelfth aspects, the subband code amount distribution can be appropriately detected.
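The detection referred to here consists, according to the corresponding claim, of the code amount sum for each subband and the total code amount of the subbands contained in a given band. A minimal sketch, assuming each packet is available as a (subband name, packet length) pair read from the header; the function names and the subband labels are hypothetical:

```python
def subband_code_amounts(packets):
    """Sum packet lengths per subband.

    `packets` is an iterable of (subband_name, packet_length) pairs, where
    the length is assumed to have been read from the packet header.
    """
    totals = {}
    for subband, length in packets:
        totals[subband] = totals.get(subband, 0) + length
    return totals


def band_total(totals, subbands_in_band):
    """Total code amount of the subbands that make up one band."""
    return sum(totals.get(name, 0) for name in subbands_in_band)


packets = [("3LL", 120), ("3HL", 80), ("3LH", 70), ("1HH", 10), ("3LL", 30)]
totals = subband_code_amounts(packets)
print(totals["3LL"], band_total(totals, ["3LL", "3HL", "3LH"]))
```

Because only packet lengths are summed, no coefficient data needs to be decoded, which is consistent with the memory and speed benefits stated for the first and second aspects.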
[0158]
The inventions according to the sixteenth, nineteenth, and twentieth aspects have the same functions and effects as the invention according to any one of the first to fifteenth aspects.
[0159]
According to the seventeenth aspect of the present invention, the image quality of the plurality of components holding the code data requantized with a plurality of patterns can be compared, and a code string can be created that retains only the data of the component determined to have the optimum image quality.
[0160]
According to an eighteenth aspect of the present invention, in the invention according to the seventeenth aspect, the component having the optimum image quality is automatically selected based on the amount of distortion of the image data, and a code string retaining only that component can be created.
[Brief description of the drawings]
FIG. 1 is an explanatory diagram illustrating base vectors when two-dimensional DCT (discrete cosine transform) is used.
FIG. 2 is an explanatory diagram for explaining octave division when a two-dimensional DWT (discrete wavelet transform) is used.
FIG. 3 is a conceptual diagram illustrating syntax analysis of a JPEG2000 code string and reading of a packet length.
FIG. 4 is a block diagram for explaining a hierarchical encoding algorithm that is the basis of JPEG2000.
FIG. 5 is a diagram illustrating subbands at each decomposition level when the number of decomposition levels is three.
FIG. 6 is a diagram illustrating an example of each component of a tiled color image.
FIG. 7 is a diagram for explaining a relationship between a precinct and a code block.
FIG. 8 is a diagram illustrating the structure of a code string.
FIG. 9 is an explanatory diagram showing the structure of a code string when a packet containing encoded wavelet coefficient values is represented for each subband.
FIG. 10 is a block diagram showing a hardware configuration of a code string conversion device according to one embodiment of the present invention.
FIG. 11 is a functional block diagram of a code string conversion device when processing a moving image.
FIG. 12 is a functional block diagram of a code string conversion device when processing a still image.
FIG. 13 is an explanatory diagram illustrating an example in which the “motion amount” of a moving image detected by the motion amount detection means is reflected on quantization performed by the quantization means.
FIG. 14 is an explanatory diagram illustrating an example in which “subband code amount distribution” is reflected in quantization by a quantization unit.
FIG. 15 is another explanatory diagram illustrating “quantization” optimized according to a subband code amount distribution.
FIG. 16 is an explanatory diagram showing a state of main quantization and sub-quantization when the compression ratio is low.
FIG. 17 is an explanatory diagram of a quantization table group in consideration of the “motion amount” of an image.
FIG. 18 is an explanatory diagram of a quantization table group in consideration of “subband code amount distribution”.
FIG. 19 is an explanatory diagram of a quantization table group corresponding to a moving image in consideration of both “motion amount” and “sub-band code amount distribution”.
FIG. 20 is an explanatory diagram of a quantization table group considering only “subband code amount distribution”.
FIG. 21 is an explanatory diagram illustrating selection conditions for a sub-quantization table.
FIG. 22 is an explanatory diagram illustrating a method of arranging code data re-quantized using a main quantization table and a sub-quantization table in a component.
FIG. 23 is an explanatory diagram illustrating a method of arranging code data re-quantized using a main quantization table and a sub-quantization table in a component.
FIG. 24 is an explanatory diagram of a method of calculating a code amount by a “macro-like” method of reading packet length information from a header portion.
FIG. 25 is an explanatory diagram of a quantization table considering a subband code amount distribution.
FIG. 26 is an explanatory diagram illustrating an example of encoding of a still image in consideration of a subband code amount distribution.
FIG. 27 is an explanatory diagram illustrating a case where an original image is subjected to tile division processing and a case where tile division processing is not performed.
FIG. 28 is an explanatory diagram of a schematic configuration of an image editing system.
FIG. 29 is an explanatory diagram of a schematic configuration of an image editing system including a distortion amount measurement device.
FIG. 30 is an explanatory diagram of a schematic configuration of a camera system.
FIG. 31 is an explanatory diagram illustrating a problem of the present invention.
FIG. 32 is an explanatory diagram illustrating a problem of the present invention.
FIG. 33 is an explanatory diagram for explaining the problem of the present invention.

Claims

A code string conversion device comprising:
syntax analysis means for analyzing the syntax of a code string created by dividing moving image data into one or a plurality of rectangular areas for each frame, frequency-transforming pixel values for each rectangular area, and hierarchically encoding them;
quantizing means for requantizing the code string by selecting and discarding codes;
motion amount detecting means for obtaining, based on the analysis result, a motion amount between the current frame and a previous frame for the code string;
subband code amount distribution detecting means for obtaining, based on the analysis result, a subband code amount distribution in the current frame;
compression ratio designating means for designating the compression ratio of the current frame;
selecting means for selecting one or more tables to be used for the requantization from a table group consisting of a plurality of tables that each specify a code reduction method for the requantization, based on a motion amount index value, which is a numerical value given according to the magnitude of the motion amount and serves as an index for designating a table, a code distribution index value, which is a numerical value given according to the magnitude of the subband code amount distribution and serves as an index for designating a table, and a compression ratio index value, which is a numerical value given according to the magnitude of the compression ratio and serves as an index for designating a table; and
code string creating means for creating a new code string from the requantized code string.

A code string conversion device comprising:
syntax analysis means for analyzing the syntax of a code string created by dividing still image data into one or a plurality of rectangular areas, frequency-transforming pixel values for each rectangular area, and hierarchically encoding them;
quantizing means for requantizing the code string by selecting and discarding codes;
subband code amount distribution detecting means for obtaining a subband code amount distribution based on the analysis result;
compression ratio designating means for designating a compression ratio;
selecting means for selecting one or more tables to be used for the requantization from a table group consisting of a plurality of tables that each specify a code reduction method for the requantization, based on a code distribution index value, which is a numerical value given according to the magnitude of the subband code amount distribution and serves as an index for designating a table, and a compression ratio index value, which is a numerical value given according to the magnitude of the compression ratio and serves as an index for designating a table; and
code string creating means for creating a new code string from the requantized code string.

The code string conversion device according to claim 1 or 2, wherein the selecting means designates the table specified by the index values corresponding to the motion amount, the subband code amount distribution, and the compression ratio, further selects another of the tables when the index values satisfy a predetermined selection condition, and does not select the other table when the condition is not satisfied.

The code string conversion device according to claim 3, wherein the selection condition is satisfied when any one of the motion amount, the low-frequency-side total code amount of the subband code amount distribution, and the compression ratio is equal to or greater than a predetermined threshold.

The code string conversion device according to claim 3 or 4, further comprising first setting means for setting the selecting means to execute the selection of the other table regardless of whether the selection condition is satisfied.

The code string conversion device according to any one of claims 3 to 5, further comprising second setting means for setting an upper limit on the number of other tables selected by the selecting means.

The code string conversion device according to any one of claims 3 to 6, wherein at least one of the index values of the other table is adjacent to the corresponding index value of the table that is specified by the index values and selected when the selection condition is not satisfied.

The code string conversion device according to any one of claims 3 to 7, wherein, when the code string is configured in any one of the RGB, YUV, and YCbCr color spaces, the selecting means applies the selection condition to at least one of the color spaces.

The code string conversion device according to any one of claims 3 to 8, wherein, when a plurality of the tables are selected, the code string creating means creates the new code string so that the code data obtained for each table by the requantizing means is arranged in mutually different components.

The code string conversion device according to claim 9, wherein, when the other table is not selected, the code string creating means arranges code data prepared in advance in the component in which the code data obtained from the other table would be arranged.

The code string conversion device according to claim 10, wherein the code string creating means arranges, as the code data prepared in advance, code data of a white image having a minimum pixel value.

The code string conversion device according to any one of claims 1 to 11, wherein the index values are assigned in ascending or descending order according to the respective values of the motion amount, the subband code amount distribution, and the compression ratio.

The code string conversion device according to any one of claims 1 to 12, wherein the subband code amount distribution detecting means obtains the sum of the code amounts of the orthogonal transform coefficient values for each subband and the total sum of the code amounts of the orthogonal transform coefficient values of the subbands included in a given band.

The code string conversion device according to any one of claims 1 to 13, wherein the syntax analysis means targets a code string in which a discrete cosine transform is used as the orthogonal transform.

The code string conversion device according to any one of claims 1 to 13, wherein the syntax analysis means targets a code string in which a discrete wavelet transform is used as the orthogonal transform.

An image editing system comprising:
the code string conversion device according to any one of claims 1 to 15;
an image decompression device that decompresses the new code string created by the code string conversion device; and
a display device that displays an image based on the decompressed image data.

An image editing system comprising:
a first code string conversion device, which is the code string conversion device according to claim 9;
an image decompression device that decompresses the new code string created by the first code string conversion device;
a display device that displays an image based on the decompressed image data; and
a second code string conversion device that newly creates a code string including only desired ones of the plurality of components included in the new code string created by the first code string conversion device.

The image editing system according to claim 17, further comprising a distortion amount measuring device that measures an amount of distortion from the image data of each component included in the new code string created by the first code string conversion device,
wherein the second code string conversion device creates the new code string based on the result of this measurement.

A camera system comprising:
an image input device that captures an image as a still image;
an image compression device that divides the captured image data into one or a plurality of rectangular areas, frequency-transforms pixel values for each rectangular area, and hierarchically compression-encodes them; and
the image processing apparatus according to any one of claims 1 to 15, which processes the code string after the compression encoding.

A computer-readable program that causes a computer to execute the functions of the respective means of the invention according to any one of claims 1 to 15.