JP4039873B2 - Video information recording / playback device

Info

Publication number: JP4039873B2
Application number: JP2002088452A
Authority: JP
Inventors: 敦志清水
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 2002-03-27
Filing date: 2002-03-27
Publication date: 2008-01-30
Anticipated expiration: 2022-03-27
Also published as: JP2003283993A

Description

【０００１】
【発明の属する技術分野】
本発明は、映像データの再生装置に関するものであり、特に、映像データのダイジェスト版を生成して再生する装置に関するものである。
【０００２】
【従来の技術】
従来より、映像情報のダイジェスト版を作成して再生する装置が存在する。つまり、プロフェッショナル用途と一般の消費者用途とを問わず、大量の映像データの中から視聴したいものを見つけるには、多大な時間と労力を必要とする。例えば、ＶＴＲの早送り再生により見たい番組を見つけることも可能ではあるが、時間と労力を要することになる。これに対処するため、要約映像を作成することにより、該要約映像を通じて大まかに映像の内容を把握する装置や方法が提案されている（特開平１０−３２７７３号、特開平１１−２３９３２２号、特開平１１−１７６０３８号）。
【０００３】
また、特開２０００−３５０１２４号では、テレビジョン番組を録画する際に、番組の映像や音声を解析し、番組の中で特徴的と思われるシーンを抽出し、該シーンを縮小静止画像として生成、保存しておき、映像をブラウジングする際には、該縮小静止画像を再生データに用いている。例えば、特徴シーンを抽出する際には、シーンの変わり目、カメラの動き、色の変化、テロップの有無や内容等の点を抽出している。
【０００４】
【発明が解決しようとする課題】
しかし、上記従来の場合では、１つの基準に基づいて特徴シーンを抽出しているので、番組によっては必ずしも重要なシーンを抜き出しているとはいえない場合がある。例えば、通常、ニュースでは、各ニュース項目が開始される先頭シーンが重要であるのに対して、ドラマや歌番組では、出演者のアップシーン等が重要であるといえ、番組のジャンル（種類としてもよい）によって重要なシーンも異なるにも拘わらず、１つの基準によって特徴シーンを抽出すると、番組によっては、重要でないシーンを抽出してしまうおそれがある。
【０００５】
そこで、本発明は、番組のジャンルが異なっても、特徴シーン、すなわち、その番組にとって重要なシーンを適切に抽出して要約映像を作成してダイジェスト再生することができる映像情報記録再生装置を提供するとともに、ユーザーにとってより適切な要約映像を作成してダイジェスト再生をすることができる映像情報記録再生装置を提供することを目的とする。
【０００６】
【課題を解決するための手段】
本発明は上記問題点を解決するために創作されたものであって、第１には、映像情報の記録・再生を行う映像情報記録再生装置であって、映像情報を記憶する映像情報記憶手段と、映像情報における各シーンごとに複数種類の特徴量を検出する特徴量検出手段と、映像情報のジャンルごとに定められた基準で、該特徴量を評価するための基準に従い、検出された特徴量を評価する評価手段と、上記評価手段が、各特徴量について重み付けを行なうための重み係数が各特徴量ごとに設けられた重み係数群が、複数種類のジャンルについてそれぞれ設けられた重み係数記憶部と、該評価手段による評価結果に基づき、該シーン又は該シーンの一部を特定する情報である特定情報を記憶することにより、該特定情報からなるシナリオデータを記憶するシナリオ記憶手段と、該シナリオ記憶手段に記憶されたシナリオデータに基づいて、該映像情報記憶手段に記憶された映像データから、所定のシーン又は該シーンの一部を読み出して再生を行うシナリオ再生手段とを有し、上記評価手段は、映像情報のジャンルに対応する重み係数群を用いて特徴量を評価することを特徴とする。
【０００７】
この第１の構成の映像情報記録再生装置においては、上記映像情報記憶手段に映像情報が記憶される。また、上記特徴量検出手段は、映像情報における各シーンごとに複数種類の特徴量を検出する。そして、上記評価手段は、映像情報のジャンルごとに定められた基準で、該特徴量を評価するための基準に従い、検出された特徴量を評価する。さらに、シナリオ記憶手段は、該評価手段による評価結果に基づき、該シーン又は該シーンの一部を特定する情報である特定情報を記憶することにより、該特定情報からなるシナリオデータを記憶する。このようにして、ダイジェスト再生をする際のシナリオが各基準ごとにシナリオ記憶手段に記憶されるので、上記シナリオ再生手段は、映像情報記憶手段に記憶された映像データから、所定の基準に基づくシナリオデータに基づいて、所定のシーン又は該シーンの一部を読み出して再生を行う。よって、映像情報のジャンルごとに定められた基準に従ってシナリオが作成されて、該シナリオに基づきダイジェスト再生されるので、番組のジャンルが異なっても、番組のジャンルごとに最適なダイジェスト再生を行うことが可能となる。
【０００８】
また、第２には、映像情報の記録・再生を行う映像情報記録再生装置であって、映像情報を記憶する映像情報記憶手段と、映像情報における各シーンごとに複数種類の特徴量を検出する特徴量検出手段と、該特徴量検出手段により検出された特徴量を複数種類の基準に従いそれぞれ評価する評価手段と、上記評価手段が、各特徴量について重み付けを行なうための重み係数が各特徴量ごとに設けられた重み係数群が優先順位に従い複数設けられた重み係数記憶部と、該評価手段による評価結果に基づき、該シーン又は該シーンの一部を特定する情報である特定情報を各基準ごとに記憶することにより、各基準ごとに該特定情報からなるシナリオデータを記憶するシナリオ記憶手段と、該シナリオ記憶手段に記憶されたシナリオデータのうちの所定の基準に基づくシナリオデータに基づいて、該映像情報記憶手段に記憶された映像データから、所定のシーン又は該シーンの一部を読み出して再生を行うシナリオ再生手段とを有し、上記評価手段は、各重み係数群を用いてそれぞれ特徴量を評価することにより、該特徴量を複数種類の基準に従い評価することを特徴とする。
【０００９】
この第２の構成の映像情報記録再生装置においては、上記映像情報記憶手段に映像情報が記憶される。また、上記特徴量検出手段は、映像情報における各シーンごとに複数種類の特徴量を検出する。そして、上記評価手段は、該特徴量検出手段により検出された特徴量を複数種類の基準に従いそれぞれ評価する。さらに、シナリオ記憶手段は、該評価手段による評価結果に基づき、該シーン又は該シーンの一部を特定する情報である特定情報を各基準ごとに記憶することにより、各基準ごとに該特定情報からなるシナリオデータを記憶する。このようにして、ダイジェスト再生をする際のシナリオが各基準ごとにシナリオ記憶手段に記憶されるので、上記シナリオ再生手段は、映像情報記憶手段に記憶された映像データから、所定の基準に基づくシナリオデータに基づいて、所定のシーン又は該シーンの一部を読み出して再生を行う。よって、複数の基準に基づいてダイジェスト再生のシナリオが複数作成されるので、ユーザーは、異なる基準に基づいたシナリオによるダイジェスト再生を見ることができ、映像情報を概観しやすくなるとともに、重要シーンを見つけやすくなる。よって、ユーザーにとってより適切なダイジェスト再生が可能となる。例えば、登場人物を把握したい場合には、顔領域が存在するシーンに重きをおいた基準に従ったシナリオを再生すればよい。
【００１０】
また、第３には、映像情報の記録・再生を行う映像情報記録再生装置であって、映像情報を記憶する映像情報記憶手段と、映像情報における各シーンごとに複数種類の特徴量を検出する特徴量検出手段と、該特徴量検出手段により検出された複数種類の特徴量を、映像情報のジャンルごとに定められた基準に従いそれぞれ評価する評価手段と、上記評価手段が、各特徴量について重み付けを行なうための重み係数が各特徴量ごとに設けられた重み係数群が、複数種類のジャンルについてそれぞれ設けられた重み係数記憶部と、該評価手段による評価結果に基づき、該シーン又は該シーンの一部を特定する情報である特定情報を各基準ごとに記憶することにより、各基準ごとに該特定情報からなるシナリオデータを記憶するシナリオ記憶手段と、該シナリオ記憶手段に記憶されたシナリオデータのうちの所定の基準に基づくシナリオデータに基づいて、該映像情報記憶手段に記憶された映像データから、所定のシーン又は該シーンの一部を読み出して再生を行うシナリオ再生手段とを有し、上記評価手段は、映像情報のジャンルに対応する重み係数群を用いてそれぞれ特徴量を評価することにより、該特徴量を複数種類の基準に従い評価することを特徴とする。よって、映像情報のジャンルごとに定められた基準に従ってシナリオが作成されて、該シナリオに基づきダイジェスト再生されるので、番組のジャンルが異なっても、番組のジャンルごとに最適なダイジェスト再生を行うことが可能となる。さらに、ある番組について、複数の基準に基づいてダイジェスト再生のシナリオが複数作成されるので、ユーザーは、異なる基準に基づいたシナリオによるダイジェスト再生を見ることができ、映像情報を概観しやすくなるとともに、重要シーンを見つけやすくなる。
【００１３】
また、第４には、映像情報の記録・再生を行う映像情報記録再生装置であって、映像情報を記憶する映像情報記憶手段と、映像情報における各シーンごとに複数種類の特徴量を検出する特徴量検出手段と、該特徴量検出手段により検出された複数種類の特徴量を、映像情報のジャンルごとに定められた基準に従いそれぞれ評価する評価手段と、上記評価手段が、重み係数群集合であって、各特徴量について重み付けを行なうための重み係数が各特徴量ごとに設けられた重み係数群が優先順位に従い複数設けられた重み係数群集合を複数のジャンルについてそれぞれ有する重み係数記憶部と、該評価手段による評価結果に基づき、該シーン又は該シーンの一部を特定する情報である特定情報を各基準ごとに記憶することにより、各基準ごとに該特定情報からなるシナリオデータを記憶するシナリオ記憶手段と、該シナリオ記憶手段に記憶されたシナリオデータのうちの所定の基準に基づくシナリオデータに基づいて、該映像情報記憶手段に記憶された映像データから、所定のシーン又は該シーンの一部を読み出して再生を行うシナリオ再生手段とを有し、上記評価手段は、映像情報のジャンルに対応する重み係数群集合における複数の重み係数群を用いてそれぞれ特徴量を評価することにより、該特徴量を複数種類の基準に従い評価することを特徴とする。
【００１４】
また、第５には、上記第１あるいは第３の構成において、上記評価手段は、各特徴量と対応する重み係数とを乗算した値を積算した値を、所定のしきい値と比較し、該積算した値が該所定のしきい値よりも大きいか否かを判定することを特徴とする。
【００１５】
また、第６には、上記第２あるいは第４の構成において、上記評価手段は、各特徴量と対応する重み係数とを乗算した値を積算した値を、所定のしきい値と比較し、該積算した値が該所定のしきい値よりも大きいか否かを判定することを特徴とする。このようにして、検出された特徴量を評価する。
【００１６】
また、第７には、上記第５あるいは第６の構成において、上記シナリオ記憶手段は、上記積算した値が上記所定のしきい値よりも大きい場合に、そのシーン又は該シーンの一部を特定する情報である特定情報を記憶していくことを特徴とする。
【００１７】
また、第８には、上記第５から第７までのいずれかの構成において、上記評価手段は、評価に用いる複数種類のしきい値を有し、各しきい値に基づいて評価を行い、また、上記シナリオ記憶手段は、各しきい値に基づく評価結果に基づき、各しきい値ごとにシナリオデータを記憶することを特徴とする。
【００３６】
【発明の実施の形態】
本発明の実施の形態としての実施例を図面を利用して説明する。本発明に基づく映像情報記録再生装置Ａは、データ分離部（分離手段）１０と、Ａ／Ｄ変換部１１と、番組付加情報抽出部１２と、ＥＰＧデータ保持部１４と、Ａ／Ｄ変換部１５と、エンコーダ１６と、インデキシング部１８と、データ保持部１９と、重み係数テーブル（重み係数記憶部）２０と、ＡＶデータ保持部（映像情報記憶手段）２２と、シナリオ保持部（シナリオ記憶手段）２４と、再生データ選択部（シナリオ選択手段）２６と、ナビゲーション制御部２８と、ＡＶデータ読出し部３０と、デコーダ３２と、Ｄ／Ａ変換部３４と、モニタ３６と、を有している。
【００３７】
ここで、上記データ分離部１０は、ＥＰＧ（電子番組ガイド）データとＡＶデータとが含まれたデジタルＡＶデータ（映像情報）が入力されると、これをＥＰＧデータとＡＶデータとに分離する。ここで、図１に示す構成は、映像情報記録再生装置Ａに、現行テレビジョンのアナログ放送波のデータのようなアナログＡＶデータ（映像情報）が入力される場合の例であり、実際には、データ分離部１０の前段に受信部と復調処理部が設けられ、受信部から入力されたアナログＡＶデータが該復調部で復調処理された後にこのデータ分離部１０に入力されることになる。具体的には、このデータ分離部１０は、ＶＢＩ（垂直帰線消去区間）に多重化されたＥＰＧデータを分離する機能を有している。また、データ分離部１０は、ＥＰＧデータを上記番組付加情報抽出部１２に送り、一方、ＡＶデータをエンコーダ１６に送る。
【００３８】
なお、この映像情報記録再生装置Ａに入力されるデータが、デジタルＡＶデータの場合には、図１の構成からＡ／Ｄ変換部１１と、Ａ／Ｄ変換部１５と、エンコーダ１６が省略されることになる。また、上記のような復調部も省略される。つまり、受信されたデジタルＡＶデータが直接データ分離部１０に入力され、このデータ分離部１０においてＥＰＧデータとＡＶデータとに分離され、ＥＰＧデータは番組付加情報抽出部１２に入力され、また、ＡＶデータは、ＡＶデータ保持部２２に入力されるとともに、インデキシング部１８に入力される。なお、デジタルＡＶデータが映像情報記録再生装置Ａに入力される場合には、エンコードされたデータが入力されることになるので、図１のようなエンコーダ１６は必要ない。
【００３９】
次に、Ａ／Ｄ変換部１１は、アナログデータとしてのＥＰＧデータをデジタルデータに変換する。
【００４０】
また、番組付加情報抽出部１２は、ＥＰＧデータから必要なデータを抽出する。例えば、ジャンルや番組タイトルのデータを抽出するものである。番組付加情報抽出部１２は、抽出したデータをＥＰＧデータ保持部１４に送る機能も有している。また、ＥＰＧデータ保持部１４は、抽出されたデータを保持するものである。
【００４１】
また、Ａ／Ｄ変換部１５は、アナログデータとしてのＡＶデータをデジタルデータに変換する。
【００４２】
また、エンコーダ１６は、ＭＰＥＧエンコーダであり、送られたＡＶデータをＭＰＥＧの規格に従い圧縮符号化する。そして、エンコーダ１６は、符号化されたＡＶデータをＭＰＥＧストリームとしてＡＶデータ保持部２２とインデキシング部１８に送る機能を有している。
【００４３】
インデキシング部１８は、エンコーダ１６から送られたＡＶデータを解析処理するものであり、無音検出や、カット点検出や、顔領域の検出等を行うことにより、所定の評価関数を演算するために使用する特徴量の基礎を検出するとともに、各ショットの評価、特に、該特徴量について評価関数を演算してシナリオデータを作成する。
【００４４】
つまり、インデキシング部１８は、ＡＶデータに対して無音検出を行うことにより無音区間の最後のフレーム位置（これが「コーナー先頭フレーム」となる）を検出したり、画像の連続性がない位置のフレーム位置（これが「カット点」となる）を検出したり、該カット点を基準とした代表フレームに顔領域が存在するか否かの検出を行う。コーナー先頭フレームが存在することや、カット点が存在することや、代表フレームに顔領域が存在することが、特徴量の基礎となる。
【００４５】
また、上記評価関数の演算に関しては、インデキシング部１８は、カット点のフレームから次のカット点の前までのフレームで構成されるショットにおいて、特徴量の基礎に基づいて特徴量を算出し、算出された特徴量について所定の評価関数に従い演算を行って、評価値を算出する。評価関数の演算に際しては、ＥＰＧデータ保持部１４からのデータと、重み係数テーブル２０の情報を参照する。ここで、特徴量としては、ショット長（例えば、該ショットのフレームの数）の値や、コーナー先頭フレームの場合に与えられる値や、代表フレームに顔領域が存在する場合に与えられる値が挙げられる。なお、ショットは、特許請求の範囲における「シーン」に当たる。
【００４６】
また、上記シナリオデータの作成に際しては、インデキシング部１８は、算出した評価値を評価し、所定の基準を満たす場合にそのショットを特定するための情報を抽出していく。なお、シナリオデータは、複数種類作成される。
【００４７】
なお、インデキシング部１８における処理の詳しい内容については、追って説明する。
【００４８】
また、データ保持部１９は、インデキシング部１８により検出されたデータ等を保持するものであり、具体的には、インデキシング部１８により検出された検出結果を記憶しておくための検出結果記憶テーブル（図２参照）や、該検出結果記憶テーブルに記憶されたデータに基づいて所定のデータを記憶するための特徴量記憶テーブル（図３参照）が設けられている。
【００４９】
また、重み係数テーブル２０は、上記評価関数を演算する際に使用される重み係数のデータを保持するものであり、具体的には、図４に示すような重み係数テーブルが記憶されている。この重み係数テーブルは、ショット長、コーナー先頭か否か、顔領域があるかについて重み係数が記憶されていて、ジャンルごとに優先度に応じて複数の組み合わせが記憶されている。つまり、優先度が最も高い場合には、そのジャンルにとって最も適切な重み係数の組み合わせとなっている。
【００５０】
ここで、ショット長、コーナー先頭か否か、顔領域についての重み係数の１つの組み合わせが、上記「重み係数群」に当たる。例えば、図４の例で、ニュースにおける優先度１の重み係数０．１、０．７、０．２が重み係数群を構成する。また、あるジャンルにおいて、優先順位に従い設けられた複数の重み係数群が上記重み係数群集合に当たる。つまり、図４の例で、ニュースにおける優先度１〜３の各重み係数の組み合わせが重み係数群集合を構成する。
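The weight coefficient table described above (genre, then priority, then one weight per feature quantity) can be pictured with the following minimal Python sketch. The names `WeightGroup`, `WEIGHT_TABLE`, and `lookup_weights` are hypothetical and do not come from the patent. Only the News values for priorities 1 and 2 are stated in the description, so the other entries of FIG. 4 are deliberately left out.

```python
from typing import Dict, NamedTuple


class WeightGroup(NamedTuple):
    """One weight coefficient group: one weight per feature quantity."""
    shot_length: float
    corner_head: float
    face_region: float


# genre -> priority -> weight coefficient group.
# The inner dict for one genre plays the role of a "weight coefficient group set".
# Only values stated in the description are listed (hypothetical layout).
WEIGHT_TABLE: Dict[str, Dict[int, WeightGroup]] = {
    "news": {
        1: WeightGroup(0.1, 0.7, 0.2),
        2: WeightGroup(0.1, 0.5, 0.4),
    },
}


def lookup_weights(genre: str, priority: int) -> WeightGroup:
    """Return the weight group for a genre and priority (KeyError if absent)."""
    return WEIGHT_TABLE[genre][priority]
```

A nested mapping keeps the two lookups (genre from the EPG data, priority from the scenario number) separate, which mirrors how the description selects a group.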
【００５１】
また、ＡＶデータ保持部２２は、エンコーダ１６から送られたＡＶデータを保持するためのものである。また、シナリオ保持部２４は、上記シナリオデータを保持するものである。シナリオデータは複数種類作成されるので、作成された複数種類のシナリオデータが保持されることになる。つまり、シナリオ保持部２４は、複数のシナリオデータを記憶するシナリオデータテーブル（図５参照）を有している。
【００５２】
また、再生データ選択部２６は、ユーザーが操作を行うための操作部であり、例えば、リモコンにより構成される。この操作部は、特に、ユーザーがダイジェスト再生を行う場合に用いるものである。つまり、見たい番組を選択したり、再生するシナリオを選択したりするのに用いる。
【００５３】
また、ナビゲーション制御部２８は、再生データ選択部２６からのデータに基づき、所定のシナリオデータをシナリオ保持部２４から読み出し、ＡＶデータ読出し部３０に送る。
【００５４】
また、ＡＶデータ読出し部３０は、ナビゲーション制御部２８から送られたデータに基づきＡＶデータ保持部２２に保持されたＡＶデータから所定のデータを読み出して、デコーダ３２に送るものである。
【００５５】
また、デコーダ３２は、ＭＰＥＧデコーダであり、送られたＡＶデータを復号するものである。
【００５６】
なお、上記ナビゲーション制御部２８と、ＡＶデータ読出し部３０と、デコーダ３２と、Ｄ／Ａ変換部３４と、モニタ３６等は、上記シナリオ再生手段として機能する。
【００５７】
なお、映像情報記録再生装置Ａを構成する上記各部については、それぞれを各機能を有する装置として構成してもよいし、一部の構成を所定の処理を実行するためのプログラムと、該プログラムに基づき処理を実行するＣＰＵにより構成してもよい。
【００５８】
つまり、各部を装置により構成する場合には、例えば、上記データ分離部１０は、ＥＰＧデータとＡＶデータとに分離する機能を有する装置として構成し、エンコーダ１６についても、符号化装置により構成する。また、ＥＰＧデータ保持部１４，重み係数テーブル２０、ＡＶデータ保持部２２、シナリオ保持部２４は、記憶装置により構成されることになる。
【００５９】
また、一部の構成を所定の処理を実行するためのプログラムと、該プログラムに基づき動作するＣＰＵにより構成する場合には、各種プログラムが格納された記憶装置と、該プログラムに基づき処理を実行するＣＰＵにより構成し、該各種プログラムとしては、データ分離部１０が行なう処理を実行するためのプログラムや、番組付加情報抽出部１２が行なう処理を実行するためのプログラムや、エンコーダ１６が行なう処理を実行するためのプログラムや、インデキシング部１８が行なう処理を実行するためのプログラムや、ナビゲーション制御部２８が行なう処理を実行するためのプログラムや、ＡＶデータ読出し部３０が行なう処理を実行するためのプログラムや、デコーダ３２が行なう処理を実行するためのプログラム等が挙げられる。
【００６０】
上記構成の映像情報記録再生装置Ａの動作について説明する。まず、ＥＰＧデータとＡＶデータとが含まれたアナログＡＶデータが受信部（図示せず）を介して映像情報記録再生装置Ａに入力されると、図示しない復調部において復調された後に、データ分離部１０に入力される。そして、該ＡＶデータは、データ分離部１０において、該ＥＰＧデータとＡＶデータとに分離される。ＥＰＧデータは、Ａ／Ｄ変換部１１においてＡ／Ｄ変換された後に番組付加情報抽出部１２に送られ、また、ＡＶデータは、Ａ／Ｄ変換部１５においてＡ／Ｄ変換された後にエンコーダ１６に送られる。
【００６１】
番組付加情報抽出部１２は、ＥＰＧデータから必要なデータ、例えば、ジャンルや番組タイトルのデータを抽出し、該抽出したデータをＥＰＧデータ保持部１４に送る。送られたデータは、ＥＰＧデータ保持部１４に保持される。
【００６２】
一方、エンコーダ１６は、送られたＡＶデータをＭＰＥＧの規格に従い圧縮符号化し、符号化されたＡＶデータをＭＰＥＧストリームとしてＡＶデータ保持部２２とインデキシング部１８に送る。ＡＶデータ保持部２２では、ＡＶデータが記憶される。このＡＶデータ保持部２２への記憶が、上記映像情報記憶工程に当たる。
【００６３】
なお、上記デジタルＡＶデータの場合の構成では、デジタルＡＶデータが直接分離部１０に入力され、ＥＰＧデータは番組付加情報抽出部１２に送られ、また、ＡＶデータは、インデキシング部１８とＡＶデータ保持部２２に送られる。
【００６４】
また、インデキシング部１８では、図６に示すフローチャートの処理や図７のフローチャートの処理が行われる。
【００６５】
つまり、エンコーダ１６からＡＶデータが送られているか否かを判定すること等により、録画中の番組が終了したか否かが判定され（Ｓ１０）、番組が終了したら処理は終了する。一方、番組が終了していない場合には、ステップＳ１１に移行して、処理の対象となる対象フレームを特定する（Ｓ１１）。これは最初のフレームから順次対象フレームとして特定されることになる。
【００６６】
次に、そのフレームの位置（時間的な位置）において無音検出を行うか否かが判定される（Ｓ１２）。これは、無音検出はフレーム間隔よりも長い時間間隔ごとに行なうために、このような判定が設けられているもので、このステップＳ１２においては、複数回に１度の割合で無音検出を行う旨の判定がなされる。
【００６７】
そして、無音検出を行う場合には、そのフレームの位置が無音となっているか否かが判定される（Ｓ１３）。つまり、音声レベルが予め設定したしきい値を越えているか否かを判定することにより、無音か否かが判定される。
【００６８】
そして、無音である場合には、対応するフレーム番号を記憶しておく（Ｓ１４、Ｓ１５）。これは、インデキシング部１８自身において保持しておいてもよいし、データ保持部１９に記憶させておいてもよい。一方、無音でない場合には、ステップＳ１６に移行する。
【００６９】
ステップＳ１６では、最後の無音から所定時間が経過したか否かが判定される（Ｓ１６）。つまり、ステップＳ１５において、無音と検出された位置のフレームのフレーム番号が記憶されていくので、この記憶されたデータに従って、最後に無音と判定された位置から所定時間経過しているか否かが判定される。これは、最後に無音になってから所定時間無音が検出されない場合に初めて該最後の無音の位置をコーナー先頭とすることから、このような判定を設けるのである。そして、最後の無音から所定時間が経過している場合には、その最後の無音の位置のフレーム番号をコーナー先頭である旨のデータとともに検出結果記憶テーブル（図２参照）に記憶する。なお、コーナー先頭のデータを記憶した後にも有音区間が続いている場合に、さらに、コーナー先頭のデータを記憶する必要はないので、上記ステップＳ１６では、最後の無音の位置のフレーム番号がコーナー先頭として記憶されていないことも判定され、記憶されていない場合に、ステップＳ１７に移行することになる。つまり、ステップＳ１６では、最後の無音から所定時間が経過したか否かが判定されるとともに、最後の無音の位置のフレーム番号がコーナー先頭として記憶されていないことも判定され、最後の無音から所定時間が経過し、かつ、最後の無音の位置のフレーム番号がコーナー先頭として記憶されていない場合に、ステップＳ１７で、その最後の無音の位置のフレーム番号をコーナー先頭として記憶することになる。
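The corner-head rule of steps S13 to S17 (the last silent position becomes a corner head only after a predetermined time passes with no further silence, and it is recorded only once) can be sketched as follows. This is a simplified illustration under stated assumptions: silence is given as one boolean per frame, whereas the description samples silence less often than every frame, and the predetermined time is expressed as a frame count `min_gap`. The function name is hypothetical.

```python
def detect_corner_heads(silent_flags, min_gap):
    """Return frame numbers judged to be corner heads.

    silent_flags[i] is True when frame i was judged silent (S13).
    min_gap is the predetermined time, expressed in frames (an assumption).
    """
    corner_heads = []
    last_silent = None
    for frame_no, is_silent in enumerate(silent_flags):
        if is_silent:
            # S14/S15: remember the most recent silent frame.
            last_silent = frame_no
            continue
        # S16: enough time since the last silence, and not yet recorded?
        if (last_silent is not None
                and frame_no - last_silent >= min_gap
                and (not corner_heads or corner_heads[-1] != last_silent)):
            # S17: record the last silent frame as a corner head.
            corner_heads.append(last_silent)
    return corner_heads
```

The check against the last recorded entry corresponds to the second condition of step S16, which prevents the same position from being stored again while sound continues.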
【００７０】
例えば、図２、図９に示す例において、フレーム番号ｌ＋２のフレームの位置において無音と検出され、その後所定時間無音が検出されなかったことにより、該フレーム番号ｌ＋２のデータとコーナー先頭である旨のデータが検出結果記憶テーブルに記憶されたものである。フレーム番号ｎ＋２のフレームについても同様である。
【００７１】
次に、カット検出を行う（Ｓ１８）。これは、前のフレームと連続性があるか否かを判定することにより行われ、ＡＶデータにおいてカメラが切り替わる等物理的にフレーム間で連続性がなくなった場合に、連続性がないものと判定される。前のフレームと連続性がないと判定された場合に、対象フレームがカット点となる。このようなカット検出の方法については、すでに種々の手法が提案されており、例えば、Ｊ．Ｍｅｎｇらによる「“ＳｃｅｎｅＣｈａｎｇｅＤｅｔｅｃｔｉｏｎｉｎａＭＰＥＧＣｏｍｐｒｅｓｓｅｄＶｉｄｅｏＳｅｑｕｅｎｃｅ”，ＳＰＩＥＰｒｏｃｅｅｄｉｎｇＶｏｌ．２４１９Ｆｅｂｒｕａｒｙ１９９５」が提案する方法を用いることでＭＰＥＧ−１やＭＰＥＧ−２の映像ストリームから効率的にカット検出を実行することが可能となる。ステップＳ１８におけるカット検出によりカット点が検出されたら、対象フレームのフレーム番号をカット点である旨のデータとともに検出結果記憶テーブルに記憶する（Ｓ１９、Ｓ２０）。一方、カット点でないと検出された場合には、ステップＳ２１に移行する（Ｓ１９）。
【００７２】
例えば、図２、図９に示す例において、フレーム番号ｍ＋１のフレームにおける判定においては、前フレームであるフレーム番号ｍのフレームとの連続性がないと判定されたことにより、フレーム番号ｍ＋１のデータがカット点である旨のデータとともに検出結果記憶テーブルに記憶されたものである。フレーム番号ｎ＋１のフレームについても同様である。
【００７３】
次に、対象フレームが代表フレームか否かが判定される（Ｓ２１）。この代表フレームとは、カット点に当たるフレームから所定フレーム目のフレームをいい、例えば、カット点に当たるフレームから３番目のフレームを代表フレームと規定した場合には、対象フレームが、カット点に当たるフレームから３番目のフレームであるか否かが判定される。
【００７４】
そして、代表フレームであると判定された場合には、フレーム中に顔領域が存在するか否かが判定される（Ｓ２２）。つまり、フレーム中に顔の画像が存在するか否かを判定する。この顔領域検出もすでに種々の手法が提案されており、例えば、Ｈ．Ｗａｎｇらによる「“ＡＨｉｇｈｌｙＥｆｆｉｃｉｅｎｔＳｙｓｔｅｍｆｏｒＡｕｔｏｍａｔｉｃＦａｃｅＲｅｇｉｏｎＤｅｔｅｃｔｉｏｎｉｎＭＰＥＧＶｉｄｅｏ”，ＩＥＥＥＴＣＳＶＴ」が提案する方法を用いることが可能である。ステップＳ２２における顔領域検出により対象フレームにおいて顔領域が検出された場合には、そのフレームのフレーム番号を顔領域を含むフレームである旨のデータとともに検出結果記憶テーブルに記憶する（Ｓ２３、Ｓ２４）。一方、顔領域を含まない場合には、その対象フレームについての処理を終了し、ステップＳ１０に戻る。
【００７５】
なお、ステップＳ２１の判定において対象フレームが代表フレームでない場合には、その対象フレームについての処理を終了し、ステップＳ１０に戻る。
【００７６】
例えば、図２、図９に示す例において、フレーム番号ｎ＋４のフレームにおける判定においては、該フレームが代表フレームであり、かつ、顔領域が検出されたとして、フレーム番号ｎ＋１のデータが顔領域が存在する旨のデータとともに検出結果記憶テーブルに記憶されたものである。なお、図９において、フレーム番号ｍ＋４のフレームも代表フレームであるが、顔領域が存在しないとして、フレーム番号ｍ＋４についてのデータは検出結果記憶テーブルには記憶されていない。
【００７７】
ある対象フレームについて、ステップＳ１２〜Ｓ２４までの処理が終了したら、ステップＳ２５に移行して、ショットの評価を行う。このショットの評価の詳細については後述する。その後は、ステップＳ１１において、その次のフレームを対象フレームに特定して同じようにステップＳ１２〜Ｓ２４までの処理を繰り返していく。このようにして、最後のフレームまで処理を行っていき、検出結果記憶テーブルに検出結果を記憶していく。
【００７８】
なお、各フレームごとに順次処理を行っていくに従い、検出結果記憶テーブルにデータが記憶されていくわけであるが、インデキシング部１８においては、検出結果記憶テーブルへ順次記憶されるデータに基づいて図３に示す特徴量記憶テーブルにも記憶を行っていく。この特徴量記憶テーブルは、各ショット番号ごとに、先頭フレームと、最終フレームと、ショット長と、ショット長判定値と、コーナー先頭である場合の特徴量と、顔領域が存在する場合の特徴量と、評価の処理が完了しているか否かが記憶されるようになっている。
【００７９】
つまり、１つのショットは、カット点のフレームから次のカット点の手前のフレームにより構成されるので（先頭のショットについては、先頭のフレームから最初のカット点の手前のフレームまで）、各ショットごとに各データを記憶していく。つまり、先頭フレームについては、そのショットの先頭フレームのフレーム番号を記憶し、最終フレームはそのショットの最終フレームのフレーム番号を記憶し、ショット長は、そのショットにおけるフレーム数を記憶する。ショット長判定値は、ショット長があるしきい値よりも大きい場合には１とし、該しきい値以下の場合には０とする。このように、ショット長があるしきい値よりも大きい場合に与えられる値が、上記シーン長さ特徴量に当たる。また、コーナー先頭である場合の特徴量については、そのショット内にコーナー先頭である旨のデータが記憶されているフレームがある場合には１とし、一方、そのショット内にコーナー先頭である旨のデータが記憶されているフレームが１つもない場合には０とする。このように、そのショット内にコーナー先頭である旨のデータが記憶されているフレームがある場合に与えられる値が、上記音声レベル特徴量に当たる。同じように、顔領域が存在する場合の特徴量についても、そのショット内に顔領域が存在する旨のデータが記憶されているフレームがある場合には１とし、一方、そのショット内に顔領域が存在する旨のデータが記憶されているフレームが１つもない場合には０とする。このように、そのショット内に顔領域が存在する旨のデータが記憶されているフレームがある場合に与えられる値が、上記顔領域特徴量に当たる。つまり、ショット長判定値と、コーナー先頭と、顔領域については、２値とする。
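As a rough illustration of how the feature quantity storage table could be filled, the sketch below splits a frame range into shots at the cut points and derives the three binary feature quantities for one shot. Frame numbering from 0, the function names, and the dictionary layout are assumptions made for this example only. The threshold on shot length is not specified in the description and is passed in as a parameter.

```python
def split_into_shots(cut_points, total_frames):
    """Return (first_frame, last_frame) pairs; each cut point starts a new shot."""
    starts = [0] + sorted({c for c in cut_points if 0 < c < total_frames})
    ends = [s - 1 for s in starts[1:]] + [total_frames - 1]
    return list(zip(starts, ends))


def shot_features(first, last, corner_head_frames, face_frames, length_threshold):
    """Build one row of a (hypothetical) feature quantity table for a shot."""
    length = last - first + 1
    return {
        "first_frame": first,
        "last_frame": last,
        "shot_length": length,
        # 1 when the shot is longer than the threshold, otherwise 0.
        "length_flag": 1 if length > length_threshold else 0,
        # 1 when any corner-head frame lies inside the shot, otherwise 0.
        "corner_head": 1 if any(first <= f <= last for f in corner_head_frames) else 0,
        # 1 when any frame with a detected face region lies inside the shot.
        "face_region": 1 if any(first <= f <= last for f in face_frames) else 0,
    }
```

For example, `split_into_shots([5, 9], 12)` yields three shots, and each pair can then be passed to `shot_features` together with the detection results.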
【００８０】
なお、上記シーン長さ特徴量や、音声レベル特徴量や、顔領域特徴量は、インデキシング部１８により検出されるわけであるが、この場合のインデキシング部１８は、上記シーン長さ特徴量検出手段や、音声レベル特徴量検出手段や、顔領域特徴量検出手段として機能するといえる。また、上記のように、各特徴量が検出される工程が上記特徴量検出工程に当たる。
【００８１】
なお、評価の処理が完了しているか否かに関しては、そのショットについてステップＳ２５における評価が完了した場合には、その旨のデータ（例えば、１）を記憶する。このようにして、検出結果記憶テーブルにデータを順次記憶していくに伴い、特徴量記憶テーブルにも順次データが記憶されていく。
【００８２】
この特徴量記憶テーブルへの記憶のタイミングは、特徴量記憶テーブルへの記憶が可能になったタイミングで任意に行えばよいが、例えば、上記ステップＳ１７、Ｓ２０、Ｓ２４における検出結果記憶テーブルへの記憶のタイミングにおいて同時に行えばよい。例えば、あるカット点が検出された場合には、そのカット点の手前にあるショットの最終フレームと、次のショットの先頭フレームの番号が分かるので、ステップＳ２０において検出結果記憶テーブルにカット点のデータを書き込むのと同時に、特徴量記憶テーブルにも書込みを行なう。ショット長やショット長判定値についても、最終フレームのデータが分かれば算出可能であるので、書込み可能である。また、顔領域が存在する場合の特徴量についても、あるカット点が検出された場合には、その手前のショット内に顔領域の存在するフレームがある場合には、ステップＳ２０において検出結果記憶テーブルにカット点のデータを書き込むのと同時にその旨のデータを書き込む。なお、コーナー先頭である場合の特徴量については、最後の無音から所定時間経過しないとコーナー先頭であることが分からないので、ステップＳ１７において検出結果記憶テーブルに書き込みを行なうのと同時に行なう。また、評価の処理が完了しているか否かについては、あるショットについてステップＳ２５の処理が完了した場合に、その旨のデータを書き込む。
【００８３】
以上のようにインデキシング部１８が特徴量の検出を行うわけであるが、その際のインデキシング部１８は、上記特徴量検出手段として機能する。
【００８４】
次に、ステップＳ２５におけるショットの評価について、図７等を使用して説明する。
【００８５】
まず、未処理のショットがあるか否かが判定される（Ｓ３０）。つまり、ステップＳ２５における評価を行っていないショットがあるか否かが判定され、ある場合には、ステップＳ３１に移行し、ない場合にはステップＳ２５の処理を一旦終了して、ステップＳ１０（図６参照）に戻る。ここで、ステップＳ２５における評価を行っていないショットがあるか否かについては、特徴量記憶テーブルにおける評価の処理が完了しているか否かのデータに基づいて判定すればよい。
【００８６】
また、ステップＳ３１では、未処理のショットの直後のフレーム、つまり、未処理のショットの直後のカット点から所定時間が経過したか否かが判定される（Ｓ３１）。これは、あるフレームがコーナー先頭であるか否かは、最後の無音から所定時間が経過しないと判明しないので、直後のカット点から所定時間が経過するまでは、該未処理のショットにコーナー先頭のデータが含まれる可能性があるからである。ステップＳ３１において、所定時間が経過している場合には、Ｓ３２に移行し、経過していない場合にはステップＳ２５の処理を一旦終了して、ステップＳ１０に戻る。
【００８７】
ステップＳ３２においては、ＥＰＧデータ保持部１４に保持されているデータから処理対象のＡＶデータのジャンルについてのデータを読み出して取得する（Ｓ３２）。
【００８８】
次に、対象ショットについての特徴量を取得する（Ｓ３３）。つまり、未処理ショットについての特徴量を特徴量記憶テーブルから読み出す。なお、未処理ショットが複数ある場合には、最初の未処理ショットについての特徴量を読み出す。
【００８９】
次に、シナリオ番号を初期値にセットする（Ｓ３４）。例えば、シナリオ番号を１とする。そして、各特徴量についての重み係数を重み係数テーブルから取得する（Ｓ３５）。なお、重み係数はＳ３２で取得したジャンルと、シナリオ番号に応じて選択して取得される。例えば、ステップＳ３２で取得したＡＶデータのジャンルがニュースで、シナリオ番号が１の場合には、図４に示す重み係数テーブルに従い、ショット長については、０．１、コーナー先頭については０．７、顔領域については０．２の各重み係数を取得する。つまり、シナリオ番号１〜３は、重み係数テーブルにおける優先度１〜３に対応している。
【００９０】
そして、所定の評価関数に従い評価値を算出する。つまり、各特徴量に重み付けを行って計算し評価値を算出する（Ｓ３６）。評価関数の具体例としては、以下の評価関数を用いる。
【００９１】
Ｆ＝ｗ１＊ｖ１＋ｗ２＊ｖ２＋ｗ３＊ｖ３
上記の評価関数において、ｗ１、ｗ２、ｗ３は各重み係数を示し、ｖ１、ｖ２、ｖ３は各特徴量を示す。つまり、各特徴量について、対応する重み係数を乗算した値の和を求める。例えば、ジャンルがニュースであるＡＶデータの場合に、図３に示すショット番号１のショットにおいては、Ｆ＝０．１＊１＋０．７＊１＋０．２＊０となり、Ｆの値が評価値となる。
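The evaluation function F = w1*v1 + w2*v2 + w3*v3 is a plain weighted sum, which the short Python sketch below reproduces for the worked example (News, priority 1, shot number 1). The function name `evaluation_value` is hypothetical. Because 0.1 and 0.7 are not exactly representable as binary floating-point numbers, the result is compared with a tolerance rather than tested for equality with 0.8.

```python
import math


def evaluation_value(weights, features):
    """Weighted sum of feature quantities: F = sum of w_i * v_i."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * v for w, v in zip(weights, features))


# News, priority 1 weights applied to the feature quantities (1, 1, 0).
f = evaluation_value((0.1, 0.7, 0.2), (1, 1, 0))
# Compare with a tolerance: the float result may differ from 0.8 in the last bits.
print(math.isclose(f, 0.8))
```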
【００９２】
そして、算出された評価値と所定のしきい値とを比較し、評価値が該しきい値よりも大きい場合には、該ショットを特定するためのデータをシナリオデータテーブルに記憶する。具体的には、シナリオデータテーブルにおける所定のシナリオ番号に対応させて該ショットを特定するためのデータを書き込む。なお、該ショットを特定するためのデータとしては、該ショットにおける代表フレームのアドレスデータとする。これは、ダイジェスト再生する際には、該ショット内の所定の範囲のみを再生するものとするためである。なお、該所定の範囲を代表フレームから所定数のフレーム分（又は所定時間分）とし、この範囲を該ショットにおけるセグメントと呼ぶこととする。なお、このセグメントは、特許請求の範囲における「シーンの一部」に当たる。また、シナリオデータテーブルに書き込まれる代表フレームのアドレスデータは、上記「シーン又はシーンの一部を特定する情報である特定情報」に当たる。
【００９３】
次に、全てのシナリオについて評価が完了したか否かが判定されて（Ｓ３９）、完了してない場合には、シナリオ番号をインクリメントして次のシナリオについて評価を行なう。例えば、ニュースの場合に、シナリオ番号２について評価する場合には、図４に示す重み係数テーブルに従い、ショット長については、０．１、コーナー先頭については０．５、顔領域については０．４の各重み係数を取得して（Ｓ３５）、その後同じように評価値を計算し（Ｓ３６）、しきい値との比較を行って（Ｓ３７）、しきい値を越えている場合に、シナリオデータテーブルへの書込みを行う。シナリオ番号３についても同様である。
【００９４】
以上のようにして、全てのシナリオについて評価が完了したら、ステップＳ２５の処理を完了して、ステップＳ１０に戻る。このステップＳ２５の処理も各ショットについて処理が行われて、逐次シナリオデータテーブルにデータが書き込まれていくことになる。このようにして、シナリオデータが記憶されていく。あるＡＶデータの最後まで処理が完了した際には、シナリオデータテーブルには、各シナリオ番号ごとに、評価値がしきい値を越えたショットの代表フレームのアドレスデータが記憶されていることになる。
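The loop over scenario numbers (S34 to S39) can be sketched as below: every shot is scored once per weight coefficient group, and the address of its representative frame is appended to each scenario whose score exceeds the threshold. The function name, the shot dictionary layout, and the threshold value used in the usage note are assumptions; the description does not state a concrete threshold.

```python
def build_scenarios(shots, weight_groups, threshold):
    """Return {scenario_number: [representative frame addresses]}.

    shots: iterable of dicts with "features" (tuple of feature quantities)
           and "rep_frame" (address of the representative frame).
    weight_groups: {scenario_number: tuple of weights}, one group per priority.
    """
    scenarios = {number: [] for number in weight_groups}
    for shot in shots:
        for number, weights in weight_groups.items():
            # S36: weighted sum of the feature quantities.
            score = sum(w * v for w, v in zip(weights, shot["features"]))
            # S37/S38: keep the shot only when the score exceeds the threshold.
            if score > threshold:
                scenarios[number].append(shot["rep_frame"])
    return scenarios
```

With the News weights for priorities 1 and 2 and an assumed threshold of 0.3, a shot with feature quantities (1, 1, 0) lands in both scenarios, while a shot with (0, 0, 1) lands only in scenario 2.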
【００９５】
例えば、図５、図１０の例においては、例えば、ショット番号１１のショットでは、シナリオ番号１〜３の全てのシナリオについて評価値がしきい値を越えたことから、該ショットの代表フレームのアドレスデータが書き込まれている。また、ショット番号１２のショットでは、シナリオ番号３の場合のみ評価値がしきい値を越えたことから、シナリオ番号３についてのみショット番号１２のショットの代表フレームのアドレスデータが書き込まれている。
【００９６】
なお、複数の番組について上記の処理が行われた場合には、各番組ごとにシナリオデータテーブルが記憶されることになる。
【００９７】
なお、上記の説明では、ステップＳ２５の処理は、図６に示す一連の処理の流れの中に存在するものとして説明したが、図６のフローチャートからステップＳ２５を削除するとともに、図７のフローチャートを図６のフローチャートとは別に並行して行うようにしてもよい。
【００９８】
以上のように、上記ステップＳ２５の処理は、インデキシング部１８により行われるが、この場合のインデキシング部１８は上記評価手段として機能する。また、上記ステップＳ２５は、上記評価工程に当たる。また、上記のようなシナリオテーブルへの書込みが、上記シナリオ記憶工程に当たる。
【００９９】
次に、ＡＶデータをダイジェスト再生する場合の動作について説明する。ユーザーが再生データ選択部２６により、ダイジェスト再生したい番組を選択する。例えば、選択可能な番組がモニタ３６に表示されるので、これらから番組を選択する。
【０１００】
そして、ダイジェスト再生の操作を行うと、図８に示す処理に従い再生が行われる。つまり、ユーザーがダイジェスト再生の操作を行うと、その情報がナビゲーション制御部２８に送られる。すると、ナビゲーション制御部２８は、シナリオ保持部２４から最も優先度の高いシナリオデータを読み出す（Ｓ５０）。つまり、シナリオ番号１のシナリオにおけるアドレスデータを読み出してＡＶデータ読出し部３０に送る。そして、ＡＶデータ読出し部３０では、最初のアドレスデータに基づいて、該アドレスデータを先頭フレームとするセグメントをＡＶデータ保持部２２から読み出して、デコーダ３２に転送する（Ｓ５１）。すると、デコーダ３２で該セグメントのデコードが行われて、Ｄ／Ａ変換部３４でＤ／Ａ変換された後にモニタ３６に再生される。そして、再生データ選択部２６によりシナリオ変更の指示がない限り（Ｓ５３）、ＡＶデータ読出し部３０は、順次送られたアドレスデータに基づいて対応するセグメントをＡＶデータ保持部２２から読み出してデコーダ３２に送ることにより、以後同様に再生される。つまり、シナリオ番号１として記憶されたアドレスデータに応じたセグメントのみが順次再生されるのである。
【０１０１】
一方、再生データ選択部２６において、シナリオ変更の指示があった場合（Ｓ５３）、つまり、次に優先度の高いシナリオ、つまり、シナリオ番号２のシナリオのシナリオデータがシナリオ保持部２４から読み出される（Ｓ５４）。つまり、選択されたシナリオについて記憶されたアドレスデータが読み出されて、ＡＶデータ読出し部３０に送られる。ＡＶデータ読出し部３０は、送られたアドレスデータに従って所定のセグメントを読み出し、デコーダ３２に送り、デコーダ３２でデコードされた後に再生されることになる。この場合に、変更されたシナリオに基づき最初から再生が行われる。さらに、シナリオ変更の指示があった場合には、次に優先度の高いシナリオ、つまり、シナリオ番号３のシナリオのシナリオデータが読み出されて再生される。なお、さらに下位の優先度のシナリオがない場合には、最も優先度の高いシナリオに戻って再生を行う。つまり、シナリオ番号が１〜３までの場合には、シナリオ番号３の再生中にシナリオ変更の指示があった場合には、シナリオ番号１に戻る。
【０１０２】
つまり、ダイジェスト再生の操作を行うと、最初は最も優先度の高いシナリオに基づき再生を行うが、途中でシナリオ変更が行われると、順次優先度が下位のシナリオに基づき再生が行われ、最下位のシナリオ再生中にシナリオ変更の指示があった場合には、最上位のシナリオに戻るのである。なお、上記ステップＳ５３においては、シナリオをユーザが選択できるようにして、ステップＳ５４において、選択されたシナリオに基づいて再生するようにしてもよい。
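The scenario switching described above (start at the highest priority, move to the next lower priority on each change instruction, and wrap from the lowest back to the highest) amounts to a small cyclic counter. The helper below is a hypothetical sketch that assumes scenario numbers run from 1 to `scenario_count`.

```python
def next_scenario(current: int, scenario_count: int) -> int:
    """Return the scenario number to play after a scenario-change instruction."""
    if not 1 <= current <= scenario_count:
        raise ValueError("current must be between 1 and scenario_count")
    # 1 -> 2 -> ... -> scenario_count -> 1
    return current % scenario_count + 1
```

With three scenarios, a change instruction during scenario 3 returns playback to scenario 1, as in the description.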
【０１０３】
なお、上記及び以下の説明におけるシナリオデータに従ったダイジェスト再生が上記シナリオ再生工程に当たる。
【０１０４】
なお、以下のように「通常モード」と「シナリオ選択モード」を設けて各モードを選択できるようにしてもよい。
【０１０５】
つまり、「通常モード」と「シナリオ選択モード」を選択する画面が表示されるので、ここで「通常モード」を選択したとする。
【０１０６】
すると、再生データ選択部２６は、番組を特定するためのデータと、通常モードである旨のデータをナビゲーション制御部２８に送る。すると、ナビゲーション制御部２８は、番組を特定するためのデータに従い該番組についてのシナリオデータテーブルを選択する。そして、通常モードである旨のデータに基づいて、シナリオ番号１のシナリオを選択し、記憶されている代表フレームのアドレスデータをＡＶデータ読出し部３０に送る。
【０１０７】
ＡＶデータ読出し部３０は、送られたアドレスデータに従って、ＡＶデータ保持部２２から所定のデータを読み出して、デコーダ３２に送る。つまり、該アドレスデータが示すフレームから１セグメント分のデータを読み出して、デコーダ３２に送る。
【０１０８】
すると、デコーダ３２では、ＡＶデータ読出し部３０から送られたデータがデコードされて、Ｄ／Ａ変換部３４に送られてＤ／Ａ変換された後にモニタ３６に送られて再生される。
【０１０９】
つまり、シナリオ番号１として記憶されたアドレスデータに応じたセグメントのみが順次再生されるのである。例えば、図５の例では、ショット番号１１におけるセグメントの次には、ショット番号１５におけるセグメントが再生される。
【０１１０】
一方、上記シナリオ選択モードを選択した場合には、シナリオ選択画面が表示されるので、このシナリオ選択画面においてシナリオを選択することにより、選択されたシナリオに応じて再生が行われる。例えば、シナリオ番号２を選択した場合には、シナリオ番号２に記憶されたアドレスデータがシナリオ保持部２４からナビゲーション制御部２８を介してＡＶデータ読出し部３０に送られるので、ＡＶデータ読出し部３０は、シナリオ番号２に記憶されたアドレスデータに従ってＡＶデータを読み出して再生されるのである。
【０１１１】
以上のように、本実施例の映像情報記録再生装置によれば、映像情報のジャンルごとに定められた基準に従ってシナリオデータが作成されて、該シナリオデータに基づきダイジェスト再生されるので、番組のジャンルが異なっても、番組のジャンルごとに最適なダイジェスト再生を行うことが可能となる。また、複数の基準に基づいてダイジェスト再生のシナリオが複数作成されるので、ユーザーは、異なる基準に基づいたシナリオデータによるダイジェスト再生を見ることができ、映像情報を概観しやすくなるとともに、重要シーンを見つけやすくなる。つまり、ある映像情報について、複数のシナリオデータが作成されるので、ダイジェスト再生に用いるシナリオデータを切り替えていくことにより、種々の基準に基づくダイジェスト再生を見ることができ、映像情報を概観しやすくなる。
【０１１２】
なお、ダイジェスト再生に際して、複数の番組を同時に再生するようにしてもよい。つまり、デコーダ３２に動画サムネイル表示機能を設けることにより、図１１に示すように、複数の番組のダイジェスト再生を同時にサムネイル再生するようにしてもよい。図１１に示す例は、モニタ３６の表示画面において、上段においては、ニュース＊＊＊の番組がダイジェスト再生され、中段においては、ドラマＡＡＡの番組がダイジェスト再生され、下段においては、ドラマ○○○の番組がダイジェスト再生される例である。
【０１１３】
つまり、再生データ選択部２６により、複数の番組の同時ダイジェスト再生を指示することにより、各番組についてのシナリオデータがシナリオ保持部２４から読み出されて、ＡＶデータ読出し部３０に送られて、ＡＶデータ読出し部３０では、ＡＶデータ保持部２２に記憶された各番組のＡＶデータからシナリオデータに従ってセグメントが読み出されて、デコーダ３２に送られることになる。
【０１１４】
また、図１２に示すように、あるシナリオデータに記憶されたアドレスデータに対応するセグメントをそれぞれ一列に表示する表示領域Ｍ１〜Ｍ３等を表示画面Ｍに設け、各表示領域にセグメントを繰り返し表示するようにしてもよい。例えば、図５の例で、ショット番号１１に対応するセグメントを表示領域Ｍ１に繰り返し表示し、ショット番号１５に対応するセグメントを表示領域Ｍ２に繰り返し表示する。また、表示領域Ｍ３には、ショット番号１５の次にシナリオデータに記憶されているショット番号に対応するセグメントが表示されることになる。つまり、各セグメントを時間的に前から後ろに行くに従い、表示画面Ｍの左から右に配置し、そのうちの複数個（例えば、図１２に示すように３つ）のみを表示するようにし、他のセグメントはユーザーが入力装置を左又は右に操作することにより１画面ずつずれて表示できるようにする。
【０１１５】
このようにすることにより、ある番組の重要シーンを一度に見ることができ、その番組の概要を即座に知ることが可能となる。
【０１１６】
なお、同じシナリオに基づきながらもダイジェスト再生時間が異なるシナリオを用意することにより、再生時間の長いシナリオを選択した場合には、より詳細に各シーンを見ることができるようにしてもよい。
【０１１７】
これは、同じ重み係数の組み合わせを用いながらステップＳ３７で用いるしきい値を異ならせることにより可能である。例えば、図４の例で、ニュースにおいて優先度１の重み係数（ショット長については、０．１、コーナー先頭については０．７、顔領域については０．２）を用いて評価値を算出し、しきい値と比較するが、この際、複数のしきい値を用意して判定を行う。例えば、しきい値ａとしきい値ｂ（しきい値ａ＞しきい値ｂ）とを用意し、評価値としきい値ａとの比較結果に基づいたシナリオデータをシナリオ番号１−１とし、評価値としきい値ｂとの比較結果に基づいたシナリオデータをシナリオ番号１−２としてそれぞれをシナリオデータテーブルに書き込む。この場合、しきい値ｂの方がしきい値ａよりも小さいので、シナリオデータとして書き込まれるアドレスデータの数もしきい値ｂの方が多くなる可能性が高くなる。
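The effect of using two thresholds with the same weight coefficient group can be sketched as follows: each threshold yields its own scenario, and the lower threshold can only keep the same shots or more. The scenario labels follow the "1-1" and "1-2" naming of the description; the function name, the score values, and the threshold values in the usage note are assumptions chosen for illustration.

```python
def scenarios_by_threshold(scored_shots, thresholds):
    """Return {scenario_label: [representative frame addresses]}.

    scored_shots: list of (rep_frame, evaluation_value) pairs, all computed
                  with the same weight coefficient group.
    thresholds: {scenario_label: threshold}, e.g. {"1-1": a, "1-2": b} with a > b.
    """
    return {
        label: [rep for rep, score in scored_shots if score > limit]
        for label, limit in thresholds.items()
    }
```

For instance, with scores 0.8, 0.3, and 0.1 and assumed thresholds a = 0.5 and b = 0.2, scenario 1-1 keeps one shot and scenario 1-2 keeps two, so the second scenario gives the longer digest.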
【０１１８】
そして、ダイジェスト再生に際して、シナリオ番号１−１とシナリオ番号１−２のいずれかを選択できるようにすることにより、同じ重み係数の組み合わせにより作成されたシナリオであるにも拘わらず、再生時間の異なるように構成することが可能となる。
【０１１９】
なお、ダイジェスト再生をする場合に、どのようなシナリオに基づいて再生をしているのかが分かるようにしておくことが好ましい。つまり、シナリオの内容が分かるように、再生画面中に表示を行う。例えば、図４の例では、ニュースにおいて、シナリオ番号１のシナリオ（つまり、優先度１のシナリオ）では、コーナー先頭をかなり評価するので、例えば、「コーナー先頭多いモード」と再生画面の端位置等に表示し、シナリオ番号２のシナリオでは、コーナー先頭を評価するもののシナリオ番号１ほどではないので、例えば、「コーナー先頭やや多いモード」と表示し、シナリオ番号３のシナリオでは、コーナー先頭と顔領域とを等分に評価しているので、例えば、「コーナー先頭、顔領域やや多いモード」等と表示する。
【０１２０】
なお、上記の説明では、特徴量の基礎として、ショット長や、コーナー先頭であることや、顔領域があることを例に挙げたが、これだけには限られず、例えば、テロップがあること等他の特徴量の基礎を併用してもよい。
【０１２１】
【発明の効果】
本発明に基づく映像情報記録再生装置及び映像情報記録再生方法によれば、映像情報のジャンルごとに定められた基準に従ってシナリオが作成されて、該シナリオに基づきダイジェスト再生されるので、番組のジャンルが異なっても、番組のジャンルごとに最適なダイジェスト再生を行うことが可能となる。
【０１２２】
また、本発明に基づく映像情報記録再生装置及び映像情報記録再生方法によれば、複数の基準に基づいてダイジェスト再生のシナリオが複数作成されるので、ユーザーは、異なる基準に基づいたシナリオによるダイジェスト再生を見ることができ、映像情報を概観しやすくなるとともに、重要シーンを見つけやすくなる。例えば、登場人物を把握したい場合には、顔領域が存在するシーンに重きをおいた基準に従ったシナリオを再生すればよい。
【図面の簡単な説明】
【図１】本発明の実施例に基づく映像情報記録再生装置の構成を示すブロック図である。
【図２】検出結果記憶テーブルの構成を示す説明図である。
【図３】特徴量記憶テーブルの構成を示す説明図である。
【図４】重み係数テーブルの構成を示す説明図である。
【図５】シナリオデータテーブルの構成を示す説明図である。
【図６】本発明の実施例に基づく映像情報記録再生装置の動作を説明するためのフローチャートである。
【図７】本発明の実施例に基づく映像情報記録再生装置の動作を説明するためのフローチャートである。
【図８】本発明の実施例に基づく映像情報記録再生装置の動作を説明するためのフローチャートである。
【図９】本発明の実施例に基づく映像情報記録再生装置の動作を説明するための説明図である。
【図１０】本発明の実施例に基づく映像情報記録再生装置の動作を説明するための説明図である。
【図１１】再生における表示の一例を示す説明図である。
【図１２】再生における表示の一例を示す説明図である。
【符号の説明】
Ａ映像情報記録再生装置
１０データ分離部
１１、１５Ａ／Ｄ変換部
１２番組付加情報抽出部
１４ＥＰＧデータ保持部
１６エンコーダ
１８インデキシング部
１９データ保持部
２０重み係数テーブル
２２ＡＶデータ保持部
２４シナリオ保持部
２６再生データ選択部
２８ナビゲーション制御部
３０ＡＶデータ読出し部
３２デコーダ
３４Ｄ／Ａ変換部
３６モニタ
[0001]
[Technical Field of the Invention]
The present invention relates to an apparatus for reproducing video data, and more particularly to an apparatus for generating and reproducing a digest version of video data.
[0002]
[Prior art]
Apparatuses that create and play back a digest version of video information already exist. Finding the content one wants to watch within a large amount of video data takes considerable time and effort, in professional and consumer use alike. For example, a desired program can be found by fast-forward playback on a VTR, but this too takes time and effort. To address this, apparatuses and methods have been proposed that create a summary video so that the content can be roughly grasped through it (Japanese Patent Laid-Open Nos. 10-32773, 11-239322, and 11-176038).
[0003]
In Japanese Patent Laid-Open No. 2000-350124, when a television program is recorded, the video and audio of the program are analyzed, scenes that appear characteristic of the program are extracted, and each such scene is generated and stored as a reduced still image; when the video is later browsed, the reduced still images are used as the playback data. Characteristic scenes are extracted on the basis of, for example, scene changes, camera movement, color changes, and the presence or absence and content of on-screen captions (telops).
[0004]
[Problems to be solved by the invention]
However, because the conventional techniques above extract characteristic scenes on the basis of a single criterion, the scenes extracted are not necessarily the important ones for every program. For example, in a news program the opening scene of each news item is usually important, whereas in a drama or music program close-up scenes of the performers are important. Although the important scenes thus differ with the genre (or type) of the program, extracting characteristic scenes by a single criterion risks extracting unimportant scenes for some programs.
[0005]
Accordingly, an object of the present invention is to provide a video information recording/reproducing apparatus that can appropriately extract characteristic scenes, that is, the scenes important to a program, create a summary video, and perform digest playback even when program genres differ, and also to provide a video information recording/reproducing apparatus that can create a summary video better suited to the user and perform digest playback.
[0006]
[Means for Solving the Problems]
The present invention was created to solve the above problems. In a first configuration, a video information recording/reproducing apparatus that records and reproduces video information comprises: video information storage means for storing video information; feature quantity detection means for detecting a plurality of types of feature quantities for each scene in the video information; evaluation means for evaluating the detected feature quantities according to criteria for evaluating the feature quantities that are defined for each genre of video information; a weight coefficient storage unit in which a weight coefficient group, consisting of one weight coefficient per feature quantity for weighting that feature quantity, is provided for each of a plurality of genres; scenario storage means for storing, on the basis of the evaluation result of the evaluation means, specific information that identifies the scene or a part of the scene, and thereby storing scenario data composed of the specific information; and scenario reproduction means for reading out and reproducing a predetermined scene or a part of the scene from the video data stored in the video information storage means, on the basis of the scenario data stored in the scenario storage means. The evaluation means evaluates the feature quantities using the weight coefficient group corresponding to the genre of the video information.
[0007]
In the video information recording/reproducing apparatus of the first configuration, video information is stored in the video information storage means. The feature quantity detection means detects a plurality of types of feature quantities for each scene in the video information. The evaluation means evaluates the detected feature quantities according to the criteria for evaluating the feature quantities that are defined for each genre of video information. The scenario storage means then stores, on the basis of the evaluation result, specific information that identifies the scene or a part of the scene, and thereby stores scenario data composed of the specific information. Because the scenario used for digest playback is stored in the scenario storage means for each criterion in this way, the scenario reproduction means reads out and reproduces a predetermined scene or a part of the scene from the video data stored in the video information storage means, on the basis of the scenario data for a predetermined criterion. A scenario is thus created according to the criteria defined for each genre of video information and digest playback follows that scenario, so optimal digest playback for each program genre is possible even when program genres differ.
[0008]
In a second configuration, a video information recording/reproducing apparatus that records and reproduces video information comprises: video information storage means for storing video information; feature quantity detection means for detecting a plurality of types of feature quantities for each scene in the video information; evaluation means for evaluating the feature quantities detected by the feature quantity detection means according to each of a plurality of types of criteria; a weight coefficient storage unit in which a plurality of weight coefficient groups, each consisting of one weight coefficient per feature quantity for weighting that feature quantity, are provided in order of priority; scenario storage means for storing, for each criterion and on the basis of the evaluation result of the evaluation means, specific information that identifies the scene or a part of the scene, and thereby storing scenario data composed of the specific information for each criterion; and scenario reproduction means for reading out and reproducing a predetermined scene or a part of the scene from the video data stored in the video information storage means, on the basis of the scenario data for a predetermined criterion among the scenario data stored in the scenario storage means. The evaluation means evaluates the feature quantities with each of the weight coefficient groups, and thereby evaluates the feature quantities according to the plurality of types of criteria.
[0009]
In the video information recording/reproducing apparatus of the second configuration, video information is stored in the video information storage means. The feature quantity detection means detects a plurality of types of feature quantities for each scene in the video information. The evaluation means evaluates the detected feature quantities according to each of a plurality of types of criteria. The scenario storage means then stores, for each criterion and on the basis of the evaluation result, specific information that identifies the scene or a part of the scene, and thereby stores scenario data composed of the specific information for each criterion. Because the scenario used for digest playback is stored in the scenario storage means for each criterion in this way, the scenario reproduction means reads out and reproduces a predetermined scene or a part of the scene from the video data stored in the video information storage means, on the basis of the scenario data for a predetermined criterion. Since a plurality of digest playback scenarios are created on the basis of a plurality of criteria, the user can view digest playback under scenarios based on different criteria, which makes it easier to get an overview of the video information and to find important scenes. Digest playback better suited to the user therefore becomes possible. For example, a user who wants to identify the characters can play back the scenario built on a criterion that gives weight to scenes containing a face region.
[0010]
The third configuration is a video information recording / reproducing apparatus for recording and reproducing video information, comprising: video information storage means for storing video information; feature quantity detection means for detecting a plurality of types of feature quantities for each scene in the video information; evaluation means for evaluating the feature quantities detected by the feature quantity detection means according to each of a plurality of types of criteria defined for each genre; a weighting factor storage unit in which weighting factor groups, each providing a weighting factor for each feature quantity for use by the evaluation means, are provided for each of a plurality of genres; scenario storage means for storing, for each criterion, specific information that specifies the scene or a part of the scene based on the evaluation result of the evaluation means, thereby storing scenario data composed of the specific information for each criterion; and scenario reproduction means for reading out and reproducing a predetermined scene or a part of the scene from the video data stored in the video information storage means, based on the scenario data for a predetermined criterion among the scenario data stored in the scenario storage means, wherein the evaluation means evaluates each feature quantity using the weighting factor group corresponding to the genre of the video information and evaluates the feature quantities according to the plurality of types of criteria. Accordingly, a scenario is created according to the criteria defined for each genre of video information and digest playback is performed based on that scenario, so that optimal digest playback can be performed for each program genre even when program genres differ.
In addition, for a given program, multiple digest playback scenarios are created based on multiple criteria, so the user can view digest playback based on scenarios built on different criteria, which makes it easier to grasp the video information and to find important scenes.
[0013]
The fourth configuration is a video information recording / reproducing apparatus for recording and reproducing video information, comprising: video information storage means for storing video information; feature quantity detection means for detecting a plurality of types of feature quantities for each scene in the video information; evaluation means for evaluating the feature quantities detected by the feature quantity detection means according to each of a plurality of types of criteria defined for each genre; a weighting factor storage unit holding a plurality of weighting factor group sets for use by the evaluation means, each set containing a plurality of weighting factor groups provided according to priority, and each group providing a weighting factor for each feature quantity; scenario storage means for storing, for each criterion, specific information that specifies the scene or a part of the scene based on the evaluation result of the evaluation means, thereby storing scenario data composed of the specific information for each criterion; and scenario reproduction means for reading out and reproducing a predetermined scene or a part of the scene from the video data stored in the video information storage means, based on the scenario data for a predetermined criterion among the scenario data stored in the scenario storage means, wherein the evaluation means evaluates the feature quantities using each of the plurality of weighting factor groups in the weighting factor group set corresponding to the genre of the video information, and thereby evaluates the feature quantities according to the plurality of types of criteria.
[0014]
Fifth, in the first or third configuration, the evaluation means compares an integrated value, obtained by summing the products of each feature amount and its corresponding weighting factor, with a predetermined threshold value and determines whether the integrated value is greater than the predetermined threshold value.
[0015]
Sixth, in the second or fourth configuration, the evaluation means compares an integrated value, obtained by summing the products of each feature amount and its corresponding weighting factor, with a predetermined threshold value and determines whether the integrated value is greater than the predetermined threshold value. In this way, the detected feature amounts are evaluated.
[0016]
Seventh, in the fifth or sixth configuration, the scenario storage means stores the specific information, which specifies the scene or a part of the scene, when the integrated value is larger than the predetermined threshold value.
[0017]
Eighth, in any of the fifth to seventh configurations, the evaluation means has a plurality of types of threshold values used for evaluation and performs the evaluation based on each threshold value, and the scenario storage means stores scenario data for each threshold value based on the evaluation result obtained with that threshold value.
[0036]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of the present invention will be described with reference to the drawings. A video information recording / reproducing apparatus A according to the present invention includes a data separation unit (separation unit) 10, an A / D conversion unit 11, a program additional information extraction unit 12, an EPG data holding unit 14, an A / D conversion unit 15, an encoder 16, an indexing unit 18, a data holding unit 19, a weighting factor table (weighting factor storage unit) 20, an AV data holding unit (video information storage unit) 22, a scenario holding unit (scenario storage unit) 24, a reproduction data selection unit (scenario selection unit) 26, a navigation control unit 28, an AV data reading unit 30, a decoder 32, a D / A conversion unit 34, and a monitor 36.
[0037]
Here, when the digital AV data (video information) including EPG (electronic program guide) data and AV data is input, the data separation unit 10 separates the data into EPG data and AV data. Here, the configuration shown in FIG. 1 is an example when analog AV data (video information) such as analog broadcast wave data of the current television is input to the video information recording / reproducing apparatus A. A receiving unit and a demodulation processing unit are provided before the data separation unit 10, and analog AV data input from the reception unit is demodulated by the demodulation unit and then input to the data separation unit 10. Specifically, the data separation unit 10 has a function of separating EPG data multiplexed in VBI (vertical blanking interval). The data separation unit 10 sends EPG data to the program additional information extraction unit 12, while sending AV data to the encoder 16.
[0038]
If the data input to the video information recording / reproducing apparatus A is digital AV data, the A / D conversion unit 11, the A / D conversion unit 15, and the encoder 16 are omitted from the configuration of FIG. 1. The demodulation unit described above is also omitted. That is, the received digital AV data is input directly to the data separation unit 10 and separated into EPG data and AV data; the EPG data is input to the program additional information extraction unit 12, and the AV data is input to the AV data holding unit 22 and to the indexing unit 18. When digital AV data is input to the video information recording / reproducing apparatus A, the data is already encoded, so the encoder 16 shown in FIG. 1 is not necessary.
[0039]
Next, the A / D converter 11 converts EPG data as analog data into digital data.
[0040]
Further, the program additional information extraction unit 12 extracts necessary data from the EPG data. For example, genre and program title data are extracted. The program additional information extraction unit 12 also has a function of sending the extracted data to the EPG data holding unit 14. The EPG data holding unit 14 holds the extracted data.
[0041]
The A / D converter 15 converts AV data as analog data into digital data.
[0042]
The encoder 16 is an MPEG encoder, and compresses and encodes the sent AV data according to the MPEG standard. The encoder 16 has a function of sending the encoded AV data as an MPEG stream to the AV data holding unit 22 and the indexing unit 18.
[0043]
The indexing unit 18 analyzes the AV data sent from the encoder 16. By performing silence detection, cut point detection, face area detection, and the like, it detects the items on which the feature amounts used to calculate a predetermined evaluation function are based, and it evaluates each shot; specifically, it calculates the evaluation function from the feature amounts and creates scenario data.
[0044]
That is, by analyzing the AV data, the indexing unit 18 detects the last frame position of a silent section through silence detection (this becomes the “corner head frame”), detects a frame position where image continuity is lost (this becomes a “cut point”), and detects whether a face area exists in the representative frame determined from the cut point. The presence of a corner head frame, the presence of a cut point, and the presence of a face area in a representative frame are the basis of the feature amounts.
[0045]
In addition, regarding the calculation of the evaluation function, the indexing unit 18 calculates the feature amounts, from the basis described above, for each shot composed of the frames from a cut point frame to the frame before the next cut point. An evaluation value is then calculated by applying a predetermined evaluation function to the obtained feature amounts. When the evaluation function is calculated, the data from the EPG data holding unit 14 and the information in the weighting coefficient table 20 are referred to. Examples of the feature amounts include a value based on the shot length (for example, the number of frames in the shot), a value given when the shot contains a corner head frame, and a value given when a face area exists in the representative frame. The shot corresponds to a “scene” in the claims.
[0046]
In creating the scenario data, the indexing unit 18 evaluates the calculated evaluation value, and extracts information for specifying the shot when a predetermined criterion is satisfied. A plurality of types of scenario data are created.
[0047]
The detailed contents of the processing in the indexing unit 18 will be described later.
[0048]
The data holding unit 19 holds the data detected by the indexing unit 18. Specifically, it holds a detection result storage table (see FIG. 2) for storing the detection results of the indexing unit 18 and a feature amount storage table (see FIG. 3) for storing predetermined data derived from the data in the detection result storage table.
[0049]
The weighting factor table 20 holds the weighting factor data used when calculating the evaluation function, and specifically stores a weighting factor table as shown in FIG. 4. This weighting coefficient table stores weighting coefficients for the shot length, the presence of a corner head, and the presence of a face area, and stores a plurality of combinations according to priority for each genre. That is, the combination with the highest priority is the combination of weighting factors most appropriate for the genre.
[0050]
Here, one combination of the shot length, the corner head, and the weighting factor for the face area corresponds to the “weighting factor group”. For example, in the example of FIG. 4, weighting factors 0.1, 0.7, and 0.2 with priority 1 in news constitute a weighting factor group. In a certain genre, a plurality of weight coefficient groups provided in accordance with the priority order correspond to the weight coefficient group set. That is, in the example of FIG. 4, combinations of weighting factors of priorities 1 to 3 in news constitute a weighting factor group set.
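The relationship between a genre, a weighting factor group set, and a weighting factor group can be pictured as a nested lookup table. The following is only a minimal sketch under stated assumptions: the names WEIGHT_TABLE and weight_group are hypothetical, the priority 1 values for news are those quoted from FIG. 4 above, the priority 2 values are those quoted later in this description, and the priority 3 values are invented placeholders because the description does not state them.

```python
# Sketch of the weighting factor table: genre -> weighting factor group set.
# Each set is a list ordered by priority (index 0 holds priority 1).
# Each group is (shot_length_weight, corner_head_weight, face_area_weight).
WEIGHT_TABLE = {
    "news": [
        (0.1, 0.7, 0.2),  # priority 1, quoted from the FIG. 4 example
        (0.1, 0.5, 0.4),  # priority 2, quoted later in the description
        (0.4, 0.3, 0.3),  # priority 3: placeholder, NOT taken from the source
    ],
}


def weight_group(genre, priority):
    """Return the weighting factor group for a genre and a 1-based priority."""
    return WEIGHT_TABLE[genre][priority - 1]


news_priority_1 = weight_group("news", 1)
```

Further genres would be added as additional keys, each with its own weighting factor group set.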
[0051]
The AV data holding unit 22 is for holding AV data sent from the encoder 16. The scenario holding unit 24 holds the scenario data. Since plural types of scenario data are created, the created plural types of scenario data are held. That is, the scenario holding unit 24 has a scenario data table (see FIG. 5) that stores a plurality of scenario data.
[0052]
Further, the reproduction data selection unit 26 is an operation unit for a user to perform an operation, and is configured by a remote controller, for example. This operation unit is used particularly when the user performs digest reproduction. That is, it is used to select a program to be viewed or to select a scenario to be reproduced.
[0053]
Further, the navigation control unit 28 reads predetermined scenario data from the scenario holding unit 24 based on the data from the reproduction data selection unit 26, and sends it to the AV data reading unit 30.
[0054]
The AV data reading unit 30 reads predetermined data from the AV data held in the AV data holding unit 22 based on the data sent from the navigation control unit 28 and sends it to the decoder 32.
[0055]
The decoder 32 is an MPEG decoder and decodes the sent AV data.
[0056]
The navigation control unit 28, the AV data reading unit 30, the decoder 32, the D / A conversion unit 34, the monitor 36, and the like function as the scenario reproducing unit.
[0057]
Note that each of the above-described units constituting the video information recording / reproducing apparatus A may be configured as a device having the corresponding function, or a part of the configuration may be implemented by a program for executing predetermined processing and a CPU that performs the processing based on the program.
[0058]
That is, when each unit is configured by a device, for example, the data separation unit 10 is configured as a device having a function of separating EPG data and AV data, and the encoder 16 is also configured by an encoding device. Further, the EPG data holding unit 14, the weight coefficient table 20, the AV data holding unit 22, and the scenario holding unit 24 are configured by a storage device.
[0059]
Further, when a part of the configuration is implemented by a program for executing predetermined processing and a CPU that operates based on the program, the configuration consists of a storage device that stores various programs and a CPU that executes processing based on those programs. The various programs include a program for executing the processing performed by the data separation unit 10, a program for executing the processing performed by the program additional information extraction unit 12, a program for executing the processing performed by the encoder 16, a program for executing the processing performed by the indexing unit 18, a program for executing the processing performed by the navigation control unit 28, a program for executing the processing performed by the AV data reading unit 30, and a program for executing the processing performed by the decoder 32.
[0060]
The operation of the video information recording / reproducing apparatus A having the above configuration will be described. First, when analog AV data including EPG data and AV data is input to the video information recording / reproducing apparatus A via a receiving unit (not shown), the data is demodulated by a demodulating unit (not shown) and then input to the data separation unit 10. The data separation unit 10 then separates the input data into the EPG data and the AV data. The EPG data is A / D converted by the A / D converter 11 and then sent to the program additional information extraction unit 12. The AV data is A / D converted by the A / D converter 15 and then sent to the encoder 16.
[0061]
The program additional information extraction unit 12 extracts necessary data such as genre and program title data from the EPG data, and sends the extracted data to the EPG data holding unit 14. The sent data is held in the EPG data holding unit 14.
[0062]
On the other hand, the encoder 16 compresses and encodes the sent AV data according to the MPEG standard, and sends the encoded AV data to the AV data holding unit 22 and the indexing unit 18 as an MPEG stream. The AV data holding unit 22 stores AV data. The storage in the AV data holding unit 22 corresponds to the video information storage step.
[0063]
In the configuration for digital AV data, the digital AV data is input directly to the separation unit 10, the EPG data is sent to the program additional information extraction unit 12, and the AV data is sent to the indexing unit 18 and the AV data holding unit 22.
[0064]
Further, the indexing unit 18 performs the process of the flowchart shown in FIG. 6 and the process of the flowchart of FIG.
[0065]
That is, whether the program being recorded has ended is determined by checking whether AV data is still being sent from the encoder 16 (S10), and the process ends when the program has ended. If the program has not ended, the process proceeds to step S11, where the target frame to be processed is specified (S11). Frames are specified as the target frame sequentially, starting from the first frame.
[0066]
Next, it is determined whether silence detection is to be performed at the (temporal) position of the frame (S12). This determination is provided because silence detection is performed at time intervals longer than the frame interval; in step S12, silence detection is selected once every several frames.
[0067]
When silence detection is performed, it is determined whether or not the frame position is silence (S13). That is, it is determined whether or not there is silence by determining whether or not the sound level exceeds a preset threshold value.
[0068]
If the frame position is silent, the corresponding frame number is stored (S14, S15). This may be held in the indexing unit 18 itself or may be stored in the data holding unit 19. On the other hand, if it is not silent, the process proceeds to step S16.
[0069]
In step S16, it is determined whether a predetermined time has elapsed since the last silence (S16). That is, because the frame number of each frame detected as silent is stored in step S15, the stored data is used to determine whether a predetermined time has elapsed from the position last determined to be silent. This is done so that the last silent position is treated as the head of a corner only when no silence is detected for a predetermined time after it. When a predetermined time has passed since the last silence, the frame number of the last silent position is stored in the detection result storage table (see FIG. 2) together with data indicating that it is a corner head. If the voiced section continues after the corner head data has been stored, the corner head data need not be stored again. Step S16 therefore also checks that the frame number of the last silent position has not already been stored as a corner head. If the predetermined time has passed and the frame number of the last silent position has not yet been stored as a corner head, the process proceeds to step S17, where that frame number is stored as a corner head.
[0070]
For example, in the example shown in FIGS. 2 and 9, silence is detected at the frame position of frame number l + 2 and no silence is detected for the predetermined time thereafter, so the data of frame number l + 2 is stored in the detection result storage table together with data indicating a corner head. The same applies to the frame with frame number n + 2.
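The corner head logic of steps S13 to S17 can be sketched as follows. This is only an illustration under simplifying assumptions: silence is judged for every frame rather than at the coarser interval of step S12, the "predetermined time" is expressed as a frame count, and the names find_corner_heads and hold_frames are hypothetical.

```python
def find_corner_heads(silent_flags, hold_frames):
    """Return frame numbers treated as corner heads.

    silent_flags[i] is True when frame i was judged silent (S13).
    hold_frames is the assumed "predetermined time", counted in frames.
    """
    corner_heads = []
    last_silent = None
    for frame, silent in enumerate(silent_flags):
        if silent:
            last_silent = frame  # S14/S15: remember the latest silent frame
            continue
        if last_silent is None:
            continue  # no silence seen yet, nothing to promote
        elapsed = frame - last_silent
        already_stored = bool(corner_heads) and corner_heads[-1] == last_silent
        # S16: enough time has passed and this position is not stored yet
        if elapsed >= hold_frames and not already_stored:
            corner_heads.append(last_silent)  # S17: store as corner head
    return corner_heads


flags = [False, True, True, False, False, False,
         True, False, False, False, False]
heads = find_corner_heads(flags, hold_frames=3)
```

With these flags the last frame of each silent section (frames 2 and 6) is stored once, and a continuing voiced section does not store it again.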
[0071]
Next, cut detection is performed (S18). This is done by determining whether the target frame is continuous with the previous frame; when there is physically no continuity between frames, such as when the camera is switched in the AV data, it is determined that there is no continuity. If it is determined that there is no continuity with the previous frame, the target frame becomes a cut point. Various methods have already been proposed for such cut detection; for example, the method proposed by Meng et al. enables efficient cut detection from MPEG-1 and MPEG-2 video streams. When a cut point is detected by the cut detection in step S18, the frame number of the target frame is stored in the detection result storage table together with data indicating a cut point (S19, S20). On the other hand, if the target frame is not a cut point, the process proceeds to step S21 (S19).
[0072]
For example, in the example shown in FIG. 2 and FIG. 9, when the frame with frame number m + 1 is examined, it is determined that there is no continuity with the previous frame, frame number m, so frame number m + 1 is stored in the detection result storage table together with data indicating a cut point. The same applies to the frame with frame number n + 1.
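The idea of judging continuity with the previous frame can be illustrated with a simple gray-level histogram comparison. Note that this is a stand-in technique chosen for brevity, not the compressed-domain method of Meng et al. mentioned above; the frame representation (a flat list of 0 to 255 luminance values), the bin count, the threshold, and the function names are all assumptions.

```python
def histogram(frame, bins=16):
    """Normalized gray-level histogram of a frame (flat list of 0-255 values)."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    total = len(frame)
    return [count / total for count in counts]


def find_cut_points(frames, threshold=0.5):
    """Return frame numbers whose histogram differs strongly from the previous frame."""
    cuts = []
    previous = None
    for number, frame in enumerate(frames):
        current = histogram(frame)
        if previous is not None:
            # Half the L1 distance lies in [0, 1]; 1 means no overlap at all.
            difference = sum(abs(a - b) for a, b in zip(current, previous)) / 2
            if difference > threshold:
                cuts.append(number)  # no continuity: target frame is a cut point
        previous = current
    return cuts


dark = [10] * 64
bright = [240] * 64
cut_points = find_cut_points([dark, dark, bright, bright, dark])
```

Here the frames at which the picture switches between dark and bright (frames 2 and 4) are reported as cut points.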
[0073]
Next, it is determined whether or not the target frame is a representative frame (S21). The representative frame is the frame located a predetermined number of frames after the frame at the cut point. For example, when the third frame from the cut point frame is defined as the representative frame, it is determined whether the target frame is the third frame from the cut point frame.
[0074]
If it is determined that the frame is a representative frame, it is determined whether or not a face area exists in the frame (S22). That is, it is determined whether or not a face image exists in the frame. Various methods have already been proposed for this face area detection. It is possible to use the method proposed by Wang et al., “A High Efficient System for Automatic Face Region Detection in MPEG Video”, IEEE TCSVT. When the face area is detected in the target frame by the face area detection in step S22, the frame number of the frame is stored in the detection result storage table together with the data indicating that the frame includes the face area (S23, S24). On the other hand, if the face area is not included, the process for the target frame is terminated, and the process returns to step S10.
[0075]
If it is determined in step S21 that the target frame is not a representative frame, the process for the target frame is terminated, and the process returns to step S10.
[0076]
For example, in the examples shown in FIGS. 2 and 9, when the frame with frame number n + 4 is examined, the frame is a representative frame and a face area is detected, so the data of frame number n + 1 is stored in the detection result storage table together with data indicating that a face area exists. In FIG. 9, the frame with frame number m + 4 is also a representative frame, but no face area is assumed to exist in it, so no data for frame number m + 4 is stored in the detection result storage table.
[0077]
When the processing of steps S12 to S24 is completed for a certain target frame, the process proceeds to step S25 to perform shot evaluation. Details of this shot evaluation will be described later. Thereafter, in step S11, the next frame is specified as the target frame, and the processing from steps S12 to S24 is repeated in the same manner. In this way, processing is performed up to the last frame, and the detection result is stored in the detection result storage table.
[0078]
As the processing is performed sequentially for each frame, data accumulates in the detection result storage table. Based on the data sequentially stored in the detection result storage table, the indexing unit 18 creates the feature amount storage table shown in FIG. 3. For each shot number, this feature amount storage table stores the first frame, the last frame, the shot length, the shot length determination value, the corner head feature amount, the face area feature amount, and whether the evaluation process has been completed.
[0079]
In other words, one shot consists of the frames from a cut point frame to the frame before the next cut point (for the first shot, from the first frame to the frame before the first cut point), and the data above is stored for each shot. The first frame field stores the frame number of the first frame of the shot, the last frame field stores the frame number of the last frame of the shot, and the shot length field stores the number of frames in the shot. The shot length determination value is 1 when the shot length is greater than a certain threshold value and 0 when the shot length is less than the threshold value. The value given when the shot length is larger than the threshold thus corresponds to the scene length feature amount. The corner head feature amount is set to 1 when the shot contains a frame for which data indicating a corner head is stored, and to 0 when the shot contains no such frame. The value given when the shot contains a frame stored as a corner head thus corresponds to the audio level feature amount. Similarly, the face area feature amount is set to 1 when the shot contains a frame for which data indicating the presence of a face area is stored, and to 0 when the shot contains no such frame. The value given when the shot contains a frame stored as having a face area thus corresponds to the face area feature amount. That is, the shot length determination value, the corner head feature amount, and the face area feature amount are all binary.
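How the feature amount storage table follows from the detection results can be sketched as below. This is a minimal illustration, not the apparatus's actual data layout: frame numbers are assumed to start at 0, a shot whose length equals the threshold is assumed to receive 0 (the description leaves that case open), and the names ShotFeatures and build_feature_table are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ShotFeatures:
    first_frame: int
    last_frame: int
    shot_length: int
    length_flag: int  # shot length determination value (scene length feature)
    corner_head: int  # 1 if the shot contains a corner head frame
    face_area: int    # 1 if the shot contains a frame with a face area


def build_feature_table(total_frames, cut_points, corner_heads, face_frames,
                        length_threshold):
    """Split frames 0..total_frames-1 into shots and derive binary features."""
    cuts = sorted(cut_points)
    starts = [0] + cuts                            # each cut point opens a shot
    ends = [c - 1 for c in cuts] + [total_frames - 1]
    table = []
    for first, last in zip(starts, ends):
        length = last - first + 1
        table.append(ShotFeatures(
            first_frame=first,
            last_frame=last,
            shot_length=length,
            length_flag=int(length > length_threshold),
            corner_head=int(any(first <= f <= last for f in corner_heads)),
            face_area=int(any(first <= f <= last for f in face_frames)),
        ))
    return table


table = build_feature_table(total_frames=10, cut_points=[4, 7],
                            corner_heads=[5], face_frames=[8],
                            length_threshold=3)
```

In this example the ten frames form three shots (0 to 3, 4 to 6, 7 to 9), and each shot receives its three binary feature amounts.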
[0080]
The scene length feature amount, the audio level feature amount, and the face area feature amount are detected by the indexing unit 18. In this respect, the indexing unit 18 can also be said to function as scene length feature amount detection means, audio level feature amount detection means, and face area feature amount detection means. Further, as described above, the step of detecting each feature amount corresponds to the feature amount detection step.
[0081]
As to whether or not the evaluation process is completed, when the evaluation in step S25 is completed for the shot, data to that effect (for example, 1) is stored. In this way, as data is sequentially stored in the detection result storage table, data is also stored sequentially in the feature amount storage table.
[0082]
Data may be written to the feature amount storage table at any timing at which the write becomes possible; for example, it may be performed at the same time as the writes to the detection result storage table in steps S17, S20, and S24. For example, when a cut point is detected, the number of the last frame of the shot before the cut point and the number of the first frame of the next shot become known, so the feature amount storage table is written at the same time as the cut point data is stored in the detection result storage table in step S20. The shot length and the shot length determination value can also be written then, because they can be calculated once the last frame is known. As for the face area feature amount, when a cut point is detected and the preceding shot contains a frame in which a face area exists, data to that effect is written at the same time as the cut point data is written to the detection result storage table in step S20. The corner head feature amount is written at the same time as the write to the detection result storage table in step S17, because a frame is not known to be a corner head until a predetermined time has passed since the last silence. Whether the evaluation process has been completed is written when the process of step S25 is completed for a shot.
[0083]
As described above, the indexing unit 18 detects the feature quantity, and the indexing unit 18 at that time functions as the feature quantity detection unit.
[0084]
Next, the shot evaluation in step S25 will be described with reference to FIG.
[0085]
First, it is determined whether there is an unprocessed shot (S30), that is, a shot that has not yet been evaluated in step S25. If there is such a shot, the process proceeds to step S31. If not, the process of step S25 is temporarily terminated and the process returns to step S10 (FIG. 6). Whether there is a shot that has not been evaluated in step S25 may be determined from the data in the feature amount storage table indicating whether the evaluation process has been completed.
[0086]
In step S31, it is determined whether a predetermined time has elapsed from the frame immediately after the unprocessed shot, that is, from the cut point immediately following the unprocessed shot (S31). Whether a frame is a corner head cannot be determined until the predetermined time has elapsed since the last silence, so until the predetermined time has elapsed from the next cut point, the unprocessed shot may still turn out to contain a corner head. In step S31, if the predetermined time has elapsed, the process proceeds to S32. If not, the process of step S25 is temporarily terminated and the process returns to step S10.
[0087]
In step S32, data about the genre of the AV data to be processed is read out from the data held in the EPG data holding unit 14 (S32).
[0088]
Next, the feature amount for the target shot is acquired (S33). That is, the feature amount for the unprocessed shot is read from the feature amount storage table. If there are a plurality of unprocessed shots, the feature amount for the first unprocessed shot is read out.
[0089]
Next, the scenario number is set to an initial value (S34). For example, the scenario number is 1. Then, the weighting coefficient for each feature amount is acquired from the weighting coefficient table (S35). The weighting coefficient is selected and acquired according to the genre acquired in S32 and the scenario number. For example, when the genre of the AV data acquired in step S32 is news and the scenario number is 1, according to the weighting coefficient table shown in FIG. 4, the shot length is 0.1, the corner head is 0.7, For the face area, each weight coefficient of 0.2 is acquired. That is, scenario numbers 1 to 3 correspond to priorities 1 to 3 in the weighting coefficient table.
[0090]
Then, an evaluation value is calculated according to a predetermined evaluation function. In other words, each feature value is weighted and calculated to calculate an evaluation value (S36). As a specific example of the evaluation function, the following evaluation function is used.
[0091]
F = w1 * v1 + w2 * v2 + w3 * v3
In the evaluation function, w1, w2, and w3 indicate weighting factors, and v1, v2, and v3 indicate feature amounts. That is, the products of each feature amount and its corresponding weighting factor are summed. For example, in the case of AV data whose genre is news, for the shot of shot number 1 shown in FIG. 3, F = 0.1 * 1 + 0.7 * 1 + 0.2 * 0 = 0.8, and this value of F becomes the evaluation value.
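The evaluation function is a plain weighted sum, which can be sketched in a few lines. The function name evaluate is hypothetical; the weights and feature amounts are those of the news example above.

```python
def evaluate(weights, features):
    """Evaluation function F = w1*v1 + w2*v2 + w3*v3 as a weighted sum."""
    return sum(w * v for w, v in zip(weights, features))


# News, priority 1 weights applied to the shot number 1 example:
# shot length flag 1, corner head 1, face area 0.
f_value = evaluate((0.1, 0.7, 0.2), (1, 1, 0))
```

The result is approximately 0.8; because of floating-point rounding, comparisons against a threshold are safer with a small tolerance than with exact equality.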
[0092]
Then, the calculated evaluation value is compared with a predetermined threshold value (S37). If the evaluation value is larger than the threshold value, data for specifying the shot is stored in the scenario data table. Specifically, the data for specifying the shot is written in association with the corresponding scenario number in the scenario data table. The data for specifying the shot is the address data of a representative frame in the shot. The representative frame is used because only a predetermined range in the shot is reproduced during digest playback. Note that the predetermined range is a predetermined number of frames (or a predetermined time) from the representative frame, and this range is called a segment of the shot. This segment corresponds to “a part of the scene” in the claims. The address data of the representative frame written in the scenario data table corresponds to the “specific information that is information for specifying a scene or a part of the scene”.
[0093]
Next, it is determined whether or not the evaluation has been completed for all scenarios (S39). If the evaluation has not been completed, the scenario number is incremented and the next scenario is evaluated. For example, in the case of news, when scenario number 2 is evaluated, weighting coefficients of 0.1 for the shot length, 0.5 for the corner head, and 0.4 for the face area are acquired according to the weighting coefficient table shown in FIG. 4 (S35), the evaluation value is calculated in the same manner (S36) and compared with the threshold value (S37), and if the threshold value is exceeded, the data is written to the scenario data table. The same applies to scenario number 3.
[0094]
When the evaluation has been completed for all scenarios as described above, the process of step S25 is completed, and the process returns to step S10. The process of step S25 is performed for each shot, and data is sequentially written into the scenario data table. In this way, scenario data is stored. When the processing has reached the end of a given piece of AV data, the scenario data table holds, for each scenario number, the address data of the representative frames of the shots whose evaluation values exceeded the threshold value.
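The per-shot loop over scenarios (S34 to S39) can be sketched as follows. This is an illustration under stated assumptions, not the indexing unit's actual implementation: the threshold value 0.4, the two shots, and their representative frame addresses are hypothetical, and only the two news coefficient groups stated in the description are used.

```python
from collections import defaultdict

# Coefficient groups for news, scenario numbers 1 and 2, as stated in the text
# (order: shot length, corner head, face area).
NEWS_WEIGHTS = {1: (0.1, 0.7, 0.2), 2: (0.1, 0.5, 0.4)}
THRESHOLD = 0.4  # hypothetical value; the description does not give one


def index_shot(features, rep_frame_address, weights_by_scenario, threshold, table):
    """Evaluate one shot under every scenario and record it where it passes."""
    for scenario_number in sorted(weights_by_scenario):          # S34, S39
        weights = weights_by_scenario[scenario_number]            # S35
        value = sum(w * v for w, v in zip(weights, features))     # S36
        if value > threshold:                                     # S37
            table[scenario_number].append(rep_frame_address)


scenario_table = defaultdict(list)
# Two hypothetical shots: (feature amounts, representative frame address).
index_shot((1, 1, 0), 1200, NEWS_WEIGHTS, THRESHOLD, scenario_table)
index_shot((1, 0, 1), 4800, NEWS_WEIGHTS, THRESHOLD, scenario_table)
print(dict(scenario_table))
```

With these illustrative inputs the first shot passes under both scenarios, while the second passes only under scenario number 2, which weights the face area more heavily.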
[0095]
For example, in the example of FIGS. 5 and 10, the evaluation value of the shot with shot number 11 exceeds the threshold value for all of scenario numbers 1 to 3, so the address data of the representative frame of that shot is written for each of them. For the shot with shot number 12, the evaluation value exceeds the threshold value only for scenario number 3, and therefore the address data of the representative frame of the shot with shot number 12 is written only for scenario number 3.
[0096]
When the above processing is performed for a plurality of programs, a scenario data table is stored for each program.
[0097]
In the above description, the process of step S25 has been described as part of the series of processes shown in FIG. 6. However, step S25 may instead be removed from the flowchart of FIG. 6, and its processing may be performed in parallel with that flowchart.
[0098]
As described above, the processing in step S25 is performed by the indexing unit 18. In this case, the indexing unit 18 functions as the evaluation unit. Step S25 corresponds to the evaluation step. Further, writing to the scenario table as described above corresponds to the scenario storing step.
[0099]
Next, an operation for digest reproduction of AV data will be described. The user uses the playback data selection unit 26 to select a program to be digest played back. For example, selectable programs are displayed on the monitor 36, and a program is selected from these.
[0100]
When the digest playback operation is performed, playback is performed according to the processing shown in FIG. That is, when the user performs a digest playback operation, that information is sent to the navigation control unit 28. Then, the navigation control unit 28 reads the scenario data with the highest priority from the scenario holding unit 24 (S50). That is, the address data in the scenario of scenario number 1 is read and sent to the AV data reading unit 30. Based on the first address data, the AV data reading unit 30 reads, from the AV data holding unit 22, the segment whose first frame is indicated by the address data, and transfers it to the decoder 32 (S51). Then, the segment is decoded by the decoder 32, D / A converted by the D / A converter 34, and reproduced on the monitor 36. Unless the playback data selection unit 26 instructs a scenario change (S53), the AV data reading unit 30 continues to read the corresponding segments from the AV data holding unit 22 based on the sequentially sent address data and to send them to the decoder 32, so that they are reproduced in the same manner. That is, only the segments corresponding to the address data stored for scenario number 1 are sequentially reproduced.
[0101]
On the other hand, when the playback data selection unit 26 instructs a scenario change (S53), the scenario data of the scenario with the next highest priority, that is, the scenario with scenario number 2, is read from the scenario holding unit 24 (S54). That is, the address data stored for the selected scenario is read and sent to the AV data reading unit 30. The AV data reading unit 30 reads the predetermined segments according to the sent address data and sends them to the decoder 32, where they are decoded and then reproduced. In this case, playback starts from the beginning based on the changed scenario. When a further scenario change is instructed, the scenario data of the scenario with the next highest priority, that is, the scenario with scenario number 3, is read and reproduced. If there is no scenario with a lower priority, playback returns to the scenario with the highest priority. That is, when the scenario numbers run from 1 to 3 and a scenario change is instructed during playback of scenario number 3, playback returns to scenario number 1.
[0102]
In other words, when a digest playback operation is performed, playback is first performed based on the scenario with the highest priority; each time a scenario change is instructed, playback switches to the scenario with the next lower priority, and if a scenario change is instructed during playback of the lowest-priority scenario, playback returns to the highest-priority scenario. Alternatively, in step S53 the user may select a scenario directly, and playback may be performed based on the selected scenario in step S54.
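The cycling behaviour on a scenario change instruction (S53, S54) amounts to a wrap-around step through the scenario numbers in priority order. The following is a minimal sketch; the function name `next_scenario` is an assumption for illustration.

```python
def next_scenario(current, scenario_numbers):
    """Return the scenario with the next lower priority, wrapping to the top."""
    ordered = sorted(scenario_numbers)      # scenario number 1 = highest priority
    position = ordered.index(current)
    return ordered[(position + 1) % len(ordered)]


print(next_scenario(1, [1, 2, 3]))  # change instructed during scenario number 1
print(next_scenario(3, [1, 2, 3]))  # change instructed during scenario number 3
```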
[0103]
Note that digest playback according to the scenario data described above and below corresponds to the scenario playback step.
[0104]
It should be noted that a “normal mode” and a “scenario selection mode” may be provided, with the user selecting between them as follows.
[0105]
That is, a screen for selecting between the “normal mode” and the “scenario selection mode” is displayed. Assume first that the “normal mode” is selected.
[0106]
Then, the reproduction data selection unit 26 sends data for specifying a program and data indicating the normal mode to the navigation control unit 28. Then, the navigation control unit 28 selects a scenario data table for the program according to the data for specifying the program. Then, based on the data indicating the normal mode, the scenario of scenario number 1 is selected, and the stored address data of the representative frame is sent to the AV data reading unit 30.
[0107]
The AV data reading unit 30 reads predetermined data from the AV data holding unit 22 according to the sent address data, and sends it to the decoder 32. That is, one segment of data is read from the frame indicated by the address data and sent to the decoder 32.
[0108]
Then, in the decoder 32, the data sent from the AV data reading unit 30 is decoded, sent to the D / A conversion unit 34, subjected to D / A conversion, and then sent to the monitor 36 for reproduction.
[0109]
That is, only the segments corresponding to the address data stored as scenario number 1 are sequentially reproduced. For example, in the example of FIG. 5, the segment at shot number 15 is played after the segment at shot number 11.
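The relation between stored address data and the segments that are actually read can be sketched as follows. This is an illustration only: it assumes that address data can be treated as a frame index and that a segment is a fixed number of frames, and the segment length of 150 frames and the two addresses are hypothetical values.

```python
SEGMENT_FRAMES = 150  # hypothetical segment length in frames


def segment_ranges(addresses, segment_frames=SEGMENT_FRAMES):
    """Map each representative frame address to the (start, end) frame range to read.

    The end index is exclusive, so each range covers segment_frames frames
    starting at the representative frame.
    """
    return [(address, address + segment_frames) for address in addresses]


# Hypothetical address data stored for scenario number 1.
print(segment_ranges([1200, 4800]))
```

Storing only the representative frame address keeps the scenario data table small, since the segment extent follows from a single fixed length.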
[0110]
On the other hand, when the scenario selection mode is selected, a scenario selection screen is displayed. When a scenario is selected on the scenario selection screen, playback is performed according to the selected scenario. For example, when scenario number 2 is selected, the address data stored for scenario number 2 is sent from the scenario holding unit 24 to the AV data reading unit 30 via the navigation control unit 28. The AV data reading unit 30 then reads the AV data according to the address data stored for scenario number 2, and the data is reproduced.
[0111]
As described above, according to the video information recording / reproducing apparatus of the present embodiment, scenario data is created in accordance with a criterion defined for each genre of video information, and digest playback is performed based on the scenario data. Therefore, even if program genres differ, the optimum digest playback can be performed for each program genre. In addition, since a plurality of digest playback scenarios are created based on a plurality of criteria, the user can view digest playback based on scenario data created under different criteria, which makes it easier to browse the video information and to find important scenes. In other words, since a plurality of scenario data are created for a given piece of video information, switching the scenario data used for digest playback lets the user view digests based on various criteria and browse the video information easily.
[0112]
It should be noted that a plurality of programs may be played back simultaneously during digest playback. That is, by providing a moving image thumbnail display function in the decoder 32, digest playback of a plurality of programs may be performed simultaneously as thumbnail playback, as shown in FIG. 11. In the example shown in FIG. 11, on the display screen of the monitor 36, the program news *** is digest-reproduced in the upper part, the program drama AAA is digest-reproduced in the middle part, and the program drama XXX is digest-reproduced in the lower part.
[0113]
That is, when the playback data selection unit 26 instructs simultaneous digest playback of a plurality of programs, the scenario data for each program is read from the scenario holding unit 24 and sent to the AV data reading unit 30. In the AV data reading unit 30, the segments are read from the AV data of each program stored in the AV data holding unit 22 according to the scenario data and are sent to the decoder 32.
[0114]
Further, as shown in FIG. 12, display areas M1 to M3 and so on, which display in a row the segments corresponding to the address data stored in given scenario data, may be provided on the display screen M, and each segment may be displayed repeatedly in its display area. For example, in the example of FIG. 5, the segment corresponding to shot number 11 is repeatedly displayed in the display area M1, and the segment corresponding to shot number 15 is repeatedly displayed in the display area M2. In the display area M3, the segment corresponding to the shot number stored in the scenario data after shot number 15 is displayed. That is, the segments are arranged on the display screen M from left to right in chronological order, and only some of them (for example, three, as shown in FIG. 12) are displayed at a time. The displayed segments can be shifted one at a time when the user operates the input device to the left or right.
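The row of display areas behaves like a sliding window over the chronologically ordered segments of a scenario. The sketch below is an illustration under assumptions: the function name `visible_segments` is hypothetical, shot numbers 20 and 23 are made-up continuations after shot numbers 11 and 15, and clamping the window at both ends is one possible design rather than something the description specifies.

```python
def visible_segments(segments, offset, areas=3):
    """Return the segments shown in display areas M1..Mn for a scroll offset.

    The offset is clamped so that the window never runs past either end.
    """
    last_offset = max(0, len(segments) - areas)
    offset = max(0, min(offset, last_offset))
    return segments[offset:offset + areas]


# Shot numbers stored for one scenario (20 and 23 are hypothetical).
shots = [11, 15, 20, 23]
print(visible_segments(shots, 0))  # initial view
print(visible_segments(shots, 1))  # after one shift to the right
```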
[0115]
In this way, it is possible to view important scenes of a certain program at once, and to immediately know the outline of the program.
[0116]
Note that scenarios based on the same criterion but having different digest playback times may be prepared, so that each scene can be viewed in more detail when a scenario with a long playback time is selected.
[0117]
This is made possible by applying different threshold values in step S37 to the same combination of weighting coefficients. For example, in the example of FIG. 4, the evaluation value is calculated using the weighting coefficients of priority 1 for news (0.1 for the shot length, 0.7 for the corner head, 0.2 for the face area), and a plurality of threshold values are prepared for the determination. For example, a threshold value a and a threshold value b (threshold value a > threshold value b) are prepared; scenario data based on the comparison between the evaluation value and the threshold value a is written to the scenario data table as scenario number 1-1, and scenario data based on the comparison between the evaluation value and the threshold value b is written as scenario number 1-2. In this case, since the threshold value b is smaller than the threshold value a, more address data is likely to be written as scenario data for the threshold value b.
[0118]
At the time of digest playback, either scenario number 1-1 or scenario number 1-2 can be selected, so that scenarios with different playback times can be provided even though they are created with the same combination of weighting coefficients.
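The effect of two thresholds on one coefficient group can be sketched as follows. This is an illustration only: the threshold values 0.7 and 0.25, the three shots, and their addresses are hypothetical, while the coefficients are those of priority 1 for news.

```python
def build_threshold_scenarios(shots, weights, thresholds):
    """Build one scenario per threshold from a single coefficient group.

    shots:      list of (representative frame address, feature amounts)
    thresholds: mapping from scenario label to threshold value
    """
    table = {label: [] for label in thresholds}
    for address, features in shots:
        value = sum(w * v for w, v in zip(weights, features))
        for label, threshold in thresholds.items():
            if value > threshold:
                table[label].append(address)
    return table


news_priority_1 = (0.1, 0.7, 0.2)            # shot length, corner head, face area
thresholds = {"1-1": 0.7, "1-2": 0.25}       # hypothetical a > b
shots = [(100, (1, 1, 0)), (200, (1, 0, 1)), (300, (1, 0, 0))]  # hypothetical
result = build_threshold_scenarios(shots, news_priority_1, thresholds)
print(result)
```

In this example the lower threshold admits one more shot than the higher one, so scenario number 1-2 yields the longer digest.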
[0119]
It should be noted that when performing digest playback, it is preferable that the user can tell which scenario is being used for playback. That is, a display indicating the contents of the scenario is presented on the playback screen. For example, in the example of FIG. 4, the scenario with scenario number 1 for news (that is, the scenario with priority 1) weights the corner head heavily, so a label such as “mode with many corner heads” is displayed, for example at an edge of the playback screen. In the scenario with scenario number 2, the corner head is still weighted, but not as heavily as in scenario number 1, and the face area is weighted somewhat more; therefore, a label such as “corner head, mode with slightly more face area” is displayed.
[0120]
In the above description, the shot length, the corner head, and the face area have been given as examples of the feature amount. However, the present invention is not limited to these. For example, other feature amounts, such as the presence or absence of a telop, may be used together with them.
[0121]
[Effect of the invention]
According to the video information recording / playback apparatus and video information recording / playback method according to the present invention, a scenario is created according to a criterion defined for each genre of video information, and digest playback is performed based on the scenario. Therefore, even if program genres differ, the optimum digest playback can be performed for each genre of program.
[0122]
Further, according to the video information recording / playback apparatus and video information recording / playback method according to the present invention, a plurality of digest playback scenarios are created based on a plurality of criteria, so that the user can view digest playback based on scenarios created under different criteria. This makes it easier to browse video information and to find important scenes. For example, when the user wishes to grasp the characters appearing in a program, a scenario created under a criterion that places importance on scenes containing a face area may be reproduced.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a video information recording / reproducing apparatus according to an embodiment of the present invention.
FIG. 2 is an explanatory diagram showing a configuration of a detection result storage table.
FIG. 3 is an explanatory diagram showing a configuration of a feature amount storage table.
FIG. 4 is an explanatory diagram showing a configuration of a weighting coefficient table.
FIG. 5 is an explanatory diagram showing a configuration of a scenario data table.
FIG. 6 is a flowchart for explaining the operation of the video information recording / reproducing apparatus according to the embodiment of the present invention.
FIG. 7 is a flowchart for explaining the operation of the video information recording / reproducing apparatus according to the embodiment of the present invention.
FIG. 8 is a flowchart for explaining the operation of the video information recording / reproducing apparatus according to the embodiment of the present invention.
FIG. 9 is an explanatory diagram for explaining the operation of the video information recording / reproducing apparatus according to the embodiment of the present invention.
FIG. 10 is an explanatory diagram for explaining the operation of the video information recording / reproducing apparatus according to the embodiment of the present invention.
FIG. 11 is an explanatory diagram illustrating an example of display during reproduction.
FIG. 12 is an explanatory diagram illustrating an example of display during reproduction.
[Explanation of symbols]
A Video information recording and playback device
10 Data separator
11, 15 A / D converter
12 Program additional information extraction unit
14 EPG data holding unit
16 Encoder
18 Indexing unit
19 Data holding unit
20 Weighting coefficient table
22 AV data holding unit
24 Scenario holding unit
26 Playback data selection unit
28 Navigation control unit
30 AV data reading unit
32 Decoder
34 D / A converter
36 Monitor

Claims

A video information recording / reproducing apparatus for recording and reproducing video information, comprising:
video information storage means for storing video information;
feature amount detection means for detecting a plurality of types of feature amounts for each scene in the video information;
evaluation means for evaluating the detected feature amounts according to a criterion for evaluating the feature amounts, the criterion being defined for each genre of video information;
a weighting coefficient storage unit in which a weighting coefficient group, in which a weighting coefficient used by the evaluation means to weight each feature amount is provided for each feature amount, is provided for each of a plurality of genres;
scenario storage means for storing scenario data composed of specific information, which is information for specifying the scene or a part of the scene, by storing the specific information based on the evaluation result of the evaluation means; and
scenario reproduction means for reading out and reproducing a predetermined scene or a part of the scene from the video data stored in the video information storage means, based on the scenario data stored in the scenario storage means,
wherein the evaluation means evaluates the feature amounts using the weighting coefficient group corresponding to the genre of the video information.

A video information recording / reproducing apparatus for recording and reproducing video information, comprising:
video information storage means for storing video information;
feature amount detection means for detecting a plurality of types of feature amounts for each scene in the video information;
evaluation means for evaluating the feature amounts detected by the feature amount detection means according to each of a plurality of types of criteria;
a weighting coefficient storage unit in which a plurality of weighting coefficient groups, in each of which a weighting coefficient used by the evaluation means to weight each feature amount is provided for each feature amount, are provided in order of priority;
scenario storage means for storing, for each criterion, scenario data composed of specific information, which is information for specifying the scene or a part of the scene, by storing the specific information for each criterion based on the evaluation result of the evaluation means; and
scenario reproduction means for reading out and reproducing a predetermined scene or a part of the scene from the video data stored in the video information storage means, based on the scenario data that is based on a predetermined criterion among the scenario data stored in the scenario storage means,
wherein the evaluation means evaluates the feature amounts according to the plurality of types of criteria by evaluating the feature amounts using each of the weighting coefficient groups.

A video information recording / reproducing apparatus for recording and reproducing video information, comprising:
video information storage means for storing video information;
feature amount detection means for detecting a plurality of types of feature amounts for each scene in the video information;
evaluation means for evaluating the plurality of types of feature amounts detected by the feature amount detection means according to criteria defined for each genre of video information;
a weighting coefficient storage unit in which a weighting coefficient group, in which a weighting coefficient used by the evaluation means to weight each feature amount is provided for each feature amount, is provided for each of a plurality of genres;
scenario storage means for storing, for each criterion, scenario data composed of specific information, which is information for specifying the scene or a part of the scene, by storing the specific information for each criterion based on the evaluation result of the evaluation means; and
scenario reproduction means for reading out and reproducing a predetermined scene or a part of the scene from the video data stored in the video information storage means, based on the scenario data that is based on a predetermined criterion among the scenario data stored in the scenario storage means,
wherein the evaluation means evaluates the feature amounts according to a plurality of types of criteria by evaluating the feature amounts using the weighting coefficient group corresponding to the genre of the video information.

A video information recording / reproducing apparatus for recording and reproducing video information, comprising:
video information storage means for storing video information;
feature amount detection means for detecting a plurality of types of feature amounts for each scene in the video information;
evaluation means for evaluating the plurality of types of feature amounts detected by the feature amount detection means according to criteria defined for each genre of video information;
a weighting coefficient storage unit having, for each of a plurality of genres, a weighting coefficient group set in which a plurality of weighting coefficient groups, in each of which a weighting coefficient used by the evaluation means to weight each feature amount is provided for each feature amount, are provided in order of priority;
scenario storage means for storing, for each criterion, scenario data composed of specific information, which is information for specifying the scene or a part of the scene, by storing the specific information for each criterion based on the evaluation result of the evaluation means; and
scenario reproduction means for reading out and reproducing a predetermined scene or a part of the scene from the video data stored in the video information storage means, based on the scenario data that is based on a predetermined criterion among the scenario data stored in the scenario storage means,
wherein the evaluation means evaluates the feature amounts according to a plurality of types of criteria by evaluating the feature amounts using each of the plurality of weighting coefficient groups in the weighting coefficient group set corresponding to the genre of the video information.

The video information recording / reproducing apparatus according to claim 1 or 3, wherein the evaluation means compares a value obtained by summing the products of each feature amount and its corresponding weighting coefficient with a predetermined threshold value, and determines whether or not the summed value is greater than the predetermined threshold value.

The video information recording / reproducing apparatus according to claim 2 or 4, wherein the evaluation means compares a value obtained by summing the products of each feature amount and its corresponding weighting coefficient with a predetermined threshold value, and determines whether or not the summed value is greater than the predetermined threshold value.

The video information recording / reproducing apparatus according to claim 5 or 6, wherein, when the summed value is greater than the predetermined threshold value, the scenario storage means stores specific information that is information for specifying the scene or a part of the scene.

The video information recording / reproducing apparatus according to any one of claims 5 to 7, wherein the evaluation means has a plurality of types of threshold values used for evaluation and performs the evaluation based on each threshold value, and the scenario storage means stores scenario data for each threshold value based on the evaluation result obtained with that threshold value.