JP4176573B2

JP4176573B2 - Data editing apparatus and data editing method

Info

Publication number: JP4176573B2
Application number: JP2003203072A
Authority: JP
Inventors: 史行井澤; 隆之妹尾
Original assignee: Kyocera Corp
Current assignee: Kyocera Corp
Priority date: 2003-07-29
Filing date: 2003-07-29
Publication date: 2008-11-05
Anticipated expiration: 2023-07-29
Also published as: JP2005051307A

Description

【０００１】
【発明の属する技術分野】
この発明は、携帯電話等の携帯用電子機器に格納され、動画データをユーザの好みに合わせて編集するデータ編集装置およびデータ編集方法に関する。
【０００２】
【従来の技術】
近年、ユーザが携帯電話等を用いてユーザ間で様々な情報交換を行えるようになってきている。例えば、特許文献１には電子メールに画像および音声を送信することによって、文字だけでは伝達しきれない情報を相手方に視覚的に把握させる携帯テレビ電話端末が記載されている。近年、静止画像データのみならず、動画データを他の機器へ送信することができる携帯用電子機器が登場した。このような携帯用電子機器は、動画データの再生を行うことができると共に、最大再生時間が１５秒の動画データを電子メールに添付して他の機器へ送信することができる。なお、送信可能な動画データの再生時間が１５秒に制限されているのは、データを伝送する無線のトラヒックラインに過大な負荷が発生しないようにするためである。
【０００３】
【特許文献１】
特開２０００−３３２９０４号公報
【０００４】
【発明が解決しようとする課題】
しかし、従来の携帯用電子機器は、ユーザの撮影による動画データの作成やその再生、他の機器との動画データの送受信、Ｗｅｂサイトからの動画データのダウンロード等を行うことはできたものの、動画データの編集を行うことはできなかった。ユーザは、１５秒を超える動画データを再生することはできるが、その編集を行うことができないため、１５秒を超える動画データを他の機器へ送信することができず、自分の携帯用電子機器上で再生させて楽しむだけであった。
【０００５】
携帯電話において動画データの編集を行うには、動画データを一旦デコードし、表示用データに変換してから画面上に表示し、この表示用データに対して様々な編集操作を行ったものをエンコードして保存することが必要であると一般的に考えられている。編集中に動画データのデコードおよびエンコードを繰り返すのでは、このデータの変換に時間がかかり、ユーザにとって使いづらいため、動画データの編集機能を有する携帯電話は存在しなかった。
【０００６】
本発明は、上述した問題点に鑑みてなされたものであって、動画データの編集を行うことができるデータ編集装置およびデータ編集方法を提供することを目的とする。
【０００７】
【課題を解決するための手段】
本発明は上記の課題を解決するためになされたもので、以下の手段を採用した。
本発明は、複数の画像フレームと当該複数の画像フレームを連続デコードする際の情報を含むタイムテーブルが記載されるヘッダとを含んで構成される動画データを複数記憶する記憶手段と、前記記憶手段に記憶される複数の動画データのうち、つなぎ合わせる動画データを指定する操作手段と、前記操作手段により第１の動画データと第２の動画データとが指定されると、当該第２の動画データのうち、そのヘッダを取り除いたデータを前記第 1 の動画データの後ろに連結して第３の動画データを生成する連結手段と、前記連結手段によって生成された前記第３の画像データを前記記憶手段に格納する格納手段と、を備え、前記連結手段は、前記第３の動画データ生成の際、前記第２の動画データのヘッダ内のタイムテーブルに含まれる情報に基づいて、前記第２の動画データが前記第１の動画データに連続するよう、前記第１の動画データのヘッダ内のタイムテーブルを書き換え、当該書き換えたヘッダを前記第３の動画データのヘッダとする書き換えを行うことを特徴とする。
【０００８】
また、前記第１および第２の動画データそれぞれにおける複数の画像フレームは、基準画像を示すための基準フレームと、当該基準画像からの差分を示すための差分フレームとを含み、前記タイムテーブルには、少なくとも前記基準フレームの位置を示す情報を有するフレーム判別情報を含み、前記連結手段は、前記第２の動画データのタイムテーブルに含まれるフレーム判別情報に基づいて、前記第２の動画データのうちの画像データの有する基準フレームの位置を示す情報を、前記第１の動画データに含まれる画像フレーム数の分繰り下げ、当該繰り下げた第２の動画データの基準フレームの位置を示す情報を前記第１の動画データのフレーム判別情報に加えるよう、前記タイムテーブルの書き換えを行うことを特徴とする。
【０００９】
また、前記第１および第２の動画データそれぞれにおける複数の画像フレームの各フレームには、動画データ再生の経過時間を管理するための経過時間情報が与えられており、前記連結手段は、前記第２の動画データに含まれていた複数の画像フレームの各フレームの経過時間情報を、前記第１の動画データに含まれていた複数の画像フレームの後ろに追加することを特徴とする。
【００１０】
また、前記記憶手段は、ＲＯＭとＲＡＭとを含み、前記連結手段は、前記第３の動画データの生成に関する処理を前記ＲＡＭ上で行い、生成された前記第３の画像データを前記格納手段により、前記記憶手段のうちの前記ＲＯＭに格納する前に、前記ＲＡＭ上の第３の動画データのヘッダに含まれるタイムテーブルを参照して当該第３の動画データを再生するプレビュー手段をさらに備えることを特徴とする。
【００１１】
また、前記第１および第２の動画データは、前記複数の画像フレームを有する画像データと、当該画像データに対応し、複数のフレームからなる音声データとをそれぞれ含み、
前記第１および第２の動画データそれぞれのヘッダに含まれるタイムテーブルには、それぞれの画像データと音声データのフレームごとのデコード間隔を示す間隔時間情報がさらに含まれることを特徴とする。
【００１２】
また、前記連結手段は、前記第３の動画データを生成する際に、前記第１の動画データの有する画像データと音声データそれぞれの全フレームの合計再生時間を、それぞれの前記間隔時間情報から算出するとともに互いに比較し、比較画像データの合計再生時間が音声データの合計再生時間よりも長い場合には、音声データが不足する時間分の無音の音声データを生成して、当該音声データを前記第１の動画データの有する音声データの後ろに挿入し、音声データが前記第１の動画データに含まれない場合には当該第１の動画データに含まれる画像データの全フレームの合計再生時間と等しい時間分の無音の音声データを生成して、前記第１の動画データの有する画像データに対応付けて挿入し、いずれの場合においても挿入した音声データ分の時間情報を追加するよう前記第３の動画データのヘッダ中のタイムテーブルを書き換えることを特徴とする。
【００１３】
また、前記第１の動画データの有する画像データのフレームと音声データのフレームは、それぞれのデコード間隔の時間が、互いに約数あるいは倍数の関係が成り立たない関係であり、前記連結手段は、前記第３の動画データを生成する際に、前記第１の動画データの有する画像データと音声データそれぞれの全フレームの合計再生時間をそれぞれの前記間隔時間情報から算出して互いに比較し、互いの合計再生時間の長さが異なる場合には、これを揃えるよう、前記第３の動画データのヘッダのタイムテーブルを、前記第１の動画データの画像データのデコード間隔の時間を延長あるいは短縮するよう書き換えることを特徴とする。
【００１４】
また、前記操作手段は、前記記憶手段に記憶される動画データから、切り出し範囲を指定して新たに動画を切り出し指示可能であり、前記操作手段により、第４の動画データにおける切り出し開始位置が指定されると指定された位置に最も近い基準フレームを検索し、切り出し終了位置が指定されると指定された位置に最も近い基準フレームの直前の差分フレームを検索する第１の検索手段と、前記第１の検索手段により切り出し開始位置として特定された基準フレームから前記第４の動画データの画像フレームの最後まで、前記第４の動画データの画像フレームの最初から前記第１の検索手段により切り出し開始位置として特定された基準フレームから切り出し終了位置として特定された基準フレームの直前の差分フレームまで、前記第４の動画データの画像フレームの最初から前記検索手段により切り出し終了位置として特定された基準フレームの直前の差分フレームまで、いずれかの画像データのフレーム抽出を行い、当該抽出した画像フレームを有する新たな第５の動画データとして生成する第１の抽出手段と、を備え、前記格納手段は、前記第１の抽出手段によって生成された前記第５の画像データを前記記憶手段に格納し、前記第１の抽出手段は、当該第１の抽出手段により抽出したフレームのデコード間隔を示す間隔時間情報について、前記第４の動画データのヘッダに含まれるタイムテーブルから抽出し、当該抽出した間隔時間情報を含む新たなタイムテーブルと、当該新たなタイムテーブルを有するヘッダを生成し、当該生成したヘッダを当該第１の抽出手段により抽出した画像データに対して付加して前記第５の動画データとして生成することを特徴とする。
【００１５】
また、前記操作手段は、前記記憶手段に記憶される動画データから、切り出し範囲を指定して新たに動画を切り出し指示可能であり、前記操作手段により、第４の動画データにおける切り出し開始位置が指定されると指定された位置に最も近い基準フレームを検索し、さらに当該基準フレームを起点として所定時間範囲のフレームを検索する第２の検索手段と、前記第２の検索手段により検索された範囲のフレームの抽出を行い、当該抽出した画像フレームを有する新たな第６の動画データとして生成する第２の抽出手段と、を備え、前記格納手段は、前記第２の抽出手段によって生成された前記第６の画像データを前記記憶手段に格納し、前記第２の抽出手段は、当該第２の抽出手段により抽出したフレームのデコード間隔を示す間隔時間情報について、前記第４の動画データのヘッダに含まれるタイムテーブルから抽出し、当該抽出した間隔時間情報を含む新たなタイムテーブルと、当該新たなタイムテーブルを有するヘッダを生成し、当該生成したヘッダを当該第２の抽出手段により抽出した画像データに対して付加して前記第６の動画データとして生成することを特徴とする。
【００１６】
また、前記第４の動画データそれぞれにおける複数の画像フレームの各フレームごとのデコード間隔を示す間隔時間情報が与えられており、前記第２の検索手段は、前記記憶手段に格納される前記第４の動画データのヘッダに含まれるタイムテーブルを参照して、前記起点とする基準フレーム以降のフレームごとの前記間隔時間情報を順次加算することで前記所定時間範囲のフレームを特定することを特徴とする。
【００１７】
また、前記第２の検索手段は、前記記憶手段に格納される前記第４の動画データのヘッダに含まれるタイムテーブルを参照して、前記起点とする基準フレームから再生したときに所定時間経過する前であり、なおかつ前記所定時間範囲内の差分フレームを検索することを特徴とする。
【００１８】
また、前記記憶手段に格納される動画データを他の機器に送信可能な通信手段をさらに備え、前記所定時間は、前記通信手段により送信可能な範囲内の値であることを特徴とする請求項９から１１のいずれか一項に記載のデータ編集装置。
【００１９】
本発明は、複数の画像フレームと当該複数の画像フレームを連続デコードする際の情報を含むタイムテーブルが記載されるヘッダとを含んで構成される動画データを複数記憶する記憶手段に対し、記憶する動画データを編集するデータ編集方法であって、前記記憶手段に記憶される複数の動画データのうち、つなぎ合わせる動画データを指定するステップと、前記第１の動画データと第２の動画データとが指定されると、当該第２の動画データのうち、そのヘッダを取り除いたデータを前記第 1 の動画データの後ろに連結して第３の動画データを生成するステップと、生成された前記第３の画像データを前記記憶手段に格納するステップと、前記第３の動画データ生成の際、前記第２の動画データのヘッダ内のタイムテーブルに含まれる情報に基づいて、前記第２の動画データが前記第１の動画データに連続するよう、前記第１の動画データのヘッダ内のタイムテーブルを書き換え、当該書き換えたヘッダを前記第３の動画データのヘッダとするステップと、を備えることを特徴とする。
【００２６】
【発明の実施の形態】
以下、図面を参照し、この発明の実施形態について説明する。図１は、この発明の一実施形態によるデータ編集装置の構成を示す図である。図において、１は撮像部であり、撮影したアナログ画像データを画像制御部３へ出力する。２は表示部であり、画像制御部３から出力される表示用データに基づいて動画像を表示する。画像制御部３は撮像部１から出力されたアナログ画像データをエンコードし、ＭＰＥＧ−４形式の３ｇｐ２フォーマットの画像データを生成して制御部１０へ出力する。また、画像制御部３は制御部１０から出力される画像データをデコードして表示用データを生成し、表示部２へ出力する。４は操作部であり、ユーザによって操作されるテンキー、文字入力用キー、ファンクションキーなどの各種のキーが設けられている。
【００２７】
５はスピーカであり、音声制御部７から出力される音声信号に基づいて音声を発生する。６はマイクであり、外部の音声を音声信号に変換し、音声制御部７へ出力する。音声制御部７は制御部１０から出力されるＱＣＥＬＰ（ＱｕａｌｃｏｍｍＣｏｄｅＥｘｃｉｔｅｄＬｉｎｅａｒＰｒｅｄｉｃｔｉｏｎ）形式の音声データをデコードし、音声信号を生成してスピーカ５へ出力する。また、音声制御部７はマイク６から出力される音声信号をエンコードして音声データを生成し、制御部１０へ出力する。８はＥＰＲＯＭやＥＥＰＲＯＭなどのデータの消去および書き込みが可能であり、一定時間以上データを保持することが可能なＲＯＭである。ＲＯＭ８は制御部１０によって格納される画像データおよび音声データを保存する。９はＲＡＭであり、制御部１０によってＲＯＭ８から取り出された画像データおよび音声データが一時的に格納され、データの編集処理が行われる。１０は各部を制御する制御部である。１１はデータ送受信用のアンテナである。１２は無線通信部であり、無線通信網と無線により通信を行い、通話時の音声データやメール通信時のメールや各種データの送受信を行う。
【００２８】
次に、本実施形態における動画データの構造を図２を用いて説明する。動画データの先頭にはヘッダが付加され、ヘッダに続いて画像データと音声データが交互に並び、末尾にデータの終了を示すエンドコードが付加されている。画像データはフレームから構成され、個々の装置の設計にもよるが、例えば１秒間の動画データは１５フレームの画像データから構成される。また、フレームはＩフレームとＰフレームの２種類から構成される。Ｉフレームは画像を構成する基準となるフレームであり、完全な一つの静止画像を符号化したものである。Ｐフレームは直前のＩフレームまたはＰフレームと実際の静止画像の差分のみを符号化したものである。画像データは図２に示すように、Ｉフレームの後に複数のＰフレームが続いており、例えば１秒１５フレームで構成される動画データの場合には、１秒間の画像データは１個のＩフレームと１４個のＰフレームで構成される。一方、音声データのフレームには画像データのようなＩフレームやＰフレームという概念はない。
【００２９】
画像データの各フレームには経過時間情報を示すタイムベースとタイムインクリメントが含まれている。タイムベースおよびタイムインクリメントはＭＰＥＧ規格で定められており、動画データ再生の際に再生間隔を示す情報として用いられる。タイムインクリメントはフレーム再生時の経過時間を示しており、タイムベースは補助的に用いられ、０または１の値をとる。図３にタイムベースおよびタイムインクリメントの使用例を示す。これは１秒７．５フレーム（１秒１５フレームのハーフレート）で構成される画像データの例であり、最初のフレームの先頭のＩフレームを基準としてフレームごとにタイムインクリメントに値が加算される。なお、１÷７．５＝０．１３３３・・・、１÷１５＝０．０６６６・・・であるが、ここでは説明のため端数を切り捨て、１秒７．５フレーム（ハーフレート）の場合は１フレームごとの経過時間が１３３ミリ秒、１秒１５フレーム（フルレート）の場合は１フレームごとの経過時間が６６ミリ秒としている。
【００３０】
タイムインクリメントのみに時間情報を持たせると、タイムインクリメントのデータ量が膨大となってしまうため、１秒経過するごとにタイムインクリメントを変更する。例えば、図３では６６ミリ秒が経過するごとにタイムインクリメントが１増加する。なお、この例は１秒が７．５フレームで構成されるハーフレートの場合であり、１フレームは１３３ミリ秒となり、１フレームごとにタイムインクリメントは２ずつ上昇することとなる。９番目のフレームで経過時間が１秒となるので、タイムインクリメントを更新し、タイムインクリメントの値は、１秒から６６ミリ秒経過したことを示す１となる。（この例はハーフレートの例であるため、７．５フレームで１秒が経過する）また、１秒が経過するごとに、タイムベースは０から１へ反転し、次のＰフレームでは再び０に戻る。以下同様に、１秒が経過するごとにタイムインクリメントが変更され、タイムベースが１秒経過のフラグとなる。なお、図３はタイムベースおよびタイムインクリメントの説明のために用意したに過ぎず、図中の数値は動画データの仕様に基づいて設計時に決定される。
【００３１】
図２の動画データ先頭のヘッダにはフレームの情報を有するタイムテーブルが格納されている。タイムテーブルは、各フレームのフレームサイズ（ｓｔｓｚ）・フレーム判別情報（ｓｔｓｓ）・デコード間隔時間（ｓｔｔｓ）に関する情報を有している。フレームサイズは各フレームのサイズを示している。フレームサイズはＩフレームとＰフレームで顕著に異なっている。フレーム判別情報はＩフレームの判別に用いられ、Ｉフレームの番号を情報として有している。すなわち、どのフレームが基準画像を格納するＩフレームであるのかが把握しやすくなっており、このフレーム判別情報は動画データの早送り再生時にＩフレームのみをコマ送り再生するなどの手法に用いられる。
【００３２】
フレームは画像制御部３でデコードされた後、一時的に画像制御部３内のバッファに蓄積され、一つ前のフレームの画像表示が完了した時点で表示部２へ出力される。デコード間隔時間は再生開始から再生終了までの時間を、各フレームのデータサイズに応じて分割し、各フレームに割り当てたものである。すなわち、デコード間隔時間は、これに指定されるフレームのデコードおよび再生を行う時間である。そして、割り当てられたデコード間隔時間内に画像制御部３においてフレームのデコードが終了した場合、このフレームの画像が表示部２に表示されるが、デコード間隔時間内にフレームのデコードが終了しなかった場合は、このフレームの画像は表示されず、次のフレームのデコードが行われる。このタイムテーブルは、画像制御部３においてアナログ画像データがエンコードされる際に生成される。なお、音声データに関しても、フレームサイズおよびデコード間隔時間に相当するものがヘッダに格納されている。
【００３３】
また、動画データの各フレームにはフレームの先頭を示すフレーム先頭情報も格納されている。制御部１０は各フレームのデータを参照する場合に、このフレーム先頭情報に基づいてフレームの先頭を検索し、続けてフレーム判別情報に基づいて、そのフレームがＩフレームとＰフレームのどちらのフレームであるのか判断したり、タイムベース・タイムインクリメントに基づいて時間情報の取得を行ったりすることができる。
【００３４】
次に、上述した構造の２つの動画データのつなぎ合わせについて説明する。まず、ユーザが操作部４を用いて、表示部２に表示される動画データのタイトル等を見ながら、つなぎ合わせたい２つの動画データを指定する。図４のように２つの動画データ（データ１とデータ２）をつなぎ合わせる場合、まず、データ１およびデータ２が制御部１０によってＲＯＭ８から読み出され、ＲＡＭ９に格納される。制御部１０はデータ１のエンドコードを検出し、このエンドコードとデータ２のヘッダをデータから除き、図４右に示すように２つのデータをＲＡＭ９上の連続した領域に格納する。このとき、データ２のヘッダ内の情報がデータ１のヘッダ内のタイムテーブルに付加されるが、データ１の画像データの最後のフレームに関するデコード間隔時間と、データ２のフレームに関するフレーム判別情報とが変更される。
【００３５】
データ１の画像データの最後のフレームに関するデコード間隔時間が変更されるのは、画像データと音声データの同期を取るためであり、これについては後述する。なお、画像データのみからなる動画データどうしをつなぎ合わせる場合は、音声データとの同期を取る必要が無いので、タイムテーブル中のデコード間隔時間の変更は必要ない。また、フレーム判別情報に関しては、データ１にデータ２を追加したことによりデータ１の後ろに増えたフレーム列に関するフレーム判別情報を元のフレーム判別情報に追加する。このとき、追加されたデータ２に関しては、フレーム判別情報中のＩフレームの番号がデータ１のフレーム数だけ繰り下げられて書き換えられる。
【００３６】
つなぎ合わせたデータをＲＯＭ８に保存するには、各フレームのタイムベースおよびタイムインクリメントの変更が必要である。制御部１０はデータのつなぎ合わせの後、データ２に含まれていた先頭のフレームから順番にタイムベースおよびタイムインクリメントを書き換え、書き換えた動画データをＲＯＭ８に格納する。しかし、ユーザが動画データの編集を行い、編集が望みどおりに行えたかどうかを確認する場合に、編集した動画データを一旦保存してから再生すると保存に時間がかかるため、手軽に編集確認を行うことができない。
【００３７】
そこで、編集した動画データを保存する前に、新たに作成したヘッダ内のタイムテーブルを用いて動画を再生し（プレビュー動作）、ユーザに動画データの編集結果の確認を行わせる。このプレビュー動作は以下のように行われる。制御部１０はつなぎ合わせた動画データからフレームを読み出すごとにタイムテーブル中のデコード間隔時間を読み出す。動画データ中の画像データは画像制御部３においてデコードされ、画像制御部３内のバッファに蓄積される。画像制御部３におけるデコードが当該フレームのデコード間隔時間内に終了した場合、制御部１０は画像制御部３へ出力指示を出力し、画像制御部３は出力指示を受けた表示用データを表示部２へ出力する。この表示用データを受け取った表示部２は表示用データに基づいて画像を表示する。この動作が連続的に行われることにより、動画像が再生される。
【００３８】
なお、データ１の最後のフレームが再生されたとき、このフレームの後に挿入されていたデータ１のエンドコードが削除されているため、そのままデコードが続行され、データ２のフレームのデコードが行われる。以上のように、制御部１０がタイムテーブルで管理される時間情報に基づいて各フレームの画像の表示を制御するため、ＲＡＭ９上に一時的に生成される動画データを用いて動画の再生を行うことができる。さらに、ヘッダ内のフレーム判別情報についても更新が行われているため、プレビューを行っている状態においても動画データの早送り再生等を行うことができる。これは、長い動画データを編集した場合に、編集後の動画データを素早く確認することができるので有用である。
【００３９】
音声データに関しても上述した動作と同様の動作が行われ、スピーカ５から音声が発生する。しかし、例えばユーザがアフレコ編集で音声データを削除した場合などに生成されるような、音声データが含まれず画像データのみからなる動画データもあり、その場合には以下のような動作が行われる。図５（ａ）のように画像データ１のみからなるデータ１と、画像データ２および音声データ２からなるデータ２とをつなぎ合わせるとする。なお、図においてはデータ構造を簡略化しており、データの配置などは必ずしも実際のＲＡＭ９上の位置を示す訳ではない。データ１には音声データが無いため、データ１のヘッダは画像データ１のみに関する情報を有している。
【００４０】
データ１およびデータ２をつなぎ合わせた場合、画像データについては画像データ２に関するデータ２のヘッダの情報がデータ１のヘッダに追加され、音声データについては音声データ２に関するデータ２のヘッダの情報がデータ１のヘッダに新たに書き込まれる。制御部１０がヘッダ中のタイムテーブルを参照しながらデータ再生の制御を行う場合、音声データ２に関する時間情報がそのままヘッダ中に書き込まれているため、音声データ２は画像データ１と共に再生され、動画像と音声が一致しないことになってしまう。
【００４１】
そこで、制御部１０はデータ１とデータ２をつなぎ合わせる場合に、タイムテーブル中のデコード間隔時間を参照し、画像データ１の再生時間を算出する。そして、制御部１０はこの時間に最も近い再生時間となるように、１つあるいは複数の無音の音声フレームで音声データ１を構成し、この音声データ１を音声データ２の前に挿入し、音声データ１に関するデコード間隔時間をタイムテーブル中に追加する。また、画像データに関しては、画像データ２のフレームに関するフレーム判別情報を変更すると共に、後述するように、画像データと音声データの同期を取るため、画像データ１の最後のフレームのデコード間隔時間を変更する。この音声データ１用の無音データとしては、例えば制御部１０がホワイトノイズを生成し、これを使用してもよい。以上のようにして、図５（ｂ）のように画像データ１に音声データ１が対応し、動画データの再生が画像と音声が正しく対応するように行われる。
【００４２】
上記のようにしてつなぎ合わせた動画データをＲＯＭ８に保存する場合、前述した通り、制御部１０は画像データ中のフレームの先頭からタイムベースおよびタイムインクリメントを参照し、最初のデータの後ろにつなぎ合わされたデータのタイムベースおよびタイムインクリメントを全て書き換える。続いて制御部１０は動画データをＲＡＭ９から読み出して、ＲＯＭ８に格納する。
【００４３】
なお、ユーザがアフレコ編集によって音声データ中の途中から音声データ末尾までの音声データを削除した動画データに別の動画データをつなぎ合わせる場合も、上記と同様の動作によって、画像データを基準とした長さの無音データが挿入され、ヘッダが書き換えられる。これにより、データのつなぎ合わせが可能である。また、音声データが削除された動画データどうしをつなぎ合わせる場合には、音声データと同期を取る必要がないため、画像データに関するデコード間隔時間は変更されない。
【００４４】
次に、動画データから一部のデータを切り出し、新たな動画データを生成する動画データの切り出しについて図６を用いて説明する。データを切り出す場合、ユーザは動画を再生し、希望する先頭切り出し位置および末尾切り出し位置において再生をストップし、切り出し位置を指定する。図６（ａ）のユーザ指定位置のＰフレームが切り出しの先頭としてユーザによって指定された場合、制御部１０はタイムテーブル中のフレーム判別情報を参照し、このＰフレームに最も近いＩフレームを検索し、このＩフレームをデータの先頭切り出し位置とする。これは、画像データの先頭がＰフレームであると、画像データのデコードの際に、差分情報のみから表示用データを作成することができないからである。
【００４５】
また、図６（ｂ）のユーザ指定位置のＰフレームが切り出しの末尾としてユーザによって指定された場合、制御部１０はこのＰフレームに最も近いＩフレームを検索し、その直前のＰフレームをデータの末尾切り出し位置とする。末尾切り出し位置のフレームがＩフレームであると、切り出したデータの後ろにさらに別の動画データをつなぎ合わせる場合に、つなぎ目を境にしてＩフレームが続いてしまう。すると、この２つのＩフレームが画像制御部３においてデコードされるときに、サイズの大きなデータのデコードが続くことになってしまう。Ｉフレームはデータサイズが大きいので、Ｉフレームが連続するとビデオビットレートが大きくなり、規格外データとなる可能性がある。これを防ぐため、末尾切り出し位置はＩフレームの直前のＰフレームとしている。
【００４６】
なお、データの先頭切り出し位置および末尾切り出し位置の検索において、制御部１０はユーザによって指定されたフレームに最も近いＩフレームを検索しているが、これに限らず、ユーザによって指定されたフレームの直前または直後のフレームを検索するようにしてもよい。
【００４７】
上述した動画データの切り出しによって、図７（ａ）に示すように、動画データの一部を切り出して、新たな動画データを生成することができる。また、図７（ｂ）に示すように、動画データ中の不要な部分をカットし、カットした部分を除いてつなぎ合わせることもできる。いずれの場合も、切り出されたデータの先頭にはタイムテーブルを含むヘッダが付加され、このタイムテーブル中のデコード間隔時間が変更される。データの編集が終了したら、前述したプレビュー動作を行うことによって、ユーザは好みの編集を行うことができたかどうか確認することができる。さらに、各フレームのタイムベースおよびタイムインクリメントを変更し、データを保存することもできる。
【００４８】
動画データの切り出し・カット・つなぎ合わせなどの編集が行われると、画像と音声の同期に誤差が生じるので、両者を同期させるための補正が必要である。例えば、図８に示すように、画像フレーム１および画像フレーム２の間でデータが分割される場合、画像フレームと音声フレームの再生時間が異なることから、必ずしも両者の再生開始が一致した位置で切り出しが行われるとは限らない。音声フレームは画像フレームよりも時間が短く、さらに、その時間は互いに約数・倍数の関係にはなっていない。また、データの分割が行われる際には、フレーム単位で分割が行われるので、約数・倍数の関係にない画像フレームでの分割が指定されると、音声フレームの分割位置と画像フレームの分割位置とにわずかなずれが生じることになってしまう。
【００４９】
図８において、音声データについては音声フレーム７と音声フレーム８の間でデータが分割されるとする。画像フレーム２の再生開始時刻と音声フレーム８の再生開始時刻は９ミリ秒異なっており、切り出したデータの補正を行わずに再生を行うと、音声フレーム８の音声は画像フレーム２の画像の出力よりも９ｍｓ遅れて出力されなければ音声と画像が同期しないにもかかわらず、それぞれのフレームの先頭から同時に出力されてしまう。つまり、音声フレーム８の音声は画像に同期した位置よりも９ｍｓ早く出力されてしまうので、補正が必要である。
【００５０】
この場合、制御部１０はタイムテーブル中のデコード間隔時間を参照し、先頭のフレームから切り出し位置直前のフレームまでのデコード間隔時間を画像データおよび音声データのそれぞれについて加算し、比較することにより、画像フレーム２と音声フレーム８の再生開始の時間差が９ｍｓであることを認識する。そして、制御部１０は画像フレーム２に関するデコード間隔時間を９ｍｓ短縮するように書き換える。これにより、画像フレーム２の次の画像フレーム３からについては、画像データと音声データが完全に同期が取れた形で再生できる。
【００５１】
また、データをつなぎ合わせる場合にも、画像データと音声データの再生時間が異なることから、つなぎ目における画像データと音声データの同期が必要である。例えば、図８のように切り出された画像フレーム１および音声フレーム１〜音声フレーム７からなる動画データの後ろに他の動画データをつなぎ合わせる場合、画像フレーム１の再生終了位置と音声フレーム７の再生終了位置の間に９ｍｓの差があるため、タイムテーブルの画像フレーム１に関するデコード間隔時間が９ｍｓ延長するように書き換えられる。これにより、つなぎ目の後に続く画像フレームと音声データとが同期の取れた形でデータをつなぎ合わせることができる。
【００５２】
さらに、データをカットした場合の切り出し位置での同期補正についても同様である。例えば、図７（ａ）のようにデータが切り出された場合、切り出された画像データの先頭のフレームのデコード間隔時間が変更される。また、図７（ｂ）のようにデータが切り出され、つなぎ合わされた場合、図に向かって左側の画像データの右端のフレームのデコード間隔時間が変更される。以上のように、タイムテーブル中のデコード間隔時間の書き換えだけで画像データおよび音声データの同期のずれを補正することができる。
【００５３】
なお、画像データおよび音声データの再生時間の差が音声データの１フレーム長を超える場合には、音声データに無音データが挿入され、音声データの１フレーム長以内である場合には、画像データに関するデコード間隔時間が書き換えられる。また、ユーザによるアフレコ編集によって音声データが削除され、画像データのみからなる動画データの場合には、上述したデコード間隔時間の変更を行う必要はない。
【００５４】
次に、図９に示すようなデータの入れ替えを説明する。例えば、図中のデータＣとデータＢの順番を入れ替える場合、まずユーザはデータＣを切り出す両端の位置を指定し、続いてデータＣを挿入する位置（ここではデータＡおよびデータＢの境目の位置）を指定する。この場合、データＢおよびデータＣのＲＡＭ９上での格納位置が交換されると共にフレーム判別情報が書き換えられ、さらに画像データと音声データの同期を取るために、データＡ、データＢおよびデータＣの画像データについてタイムテーブル中のデコード間隔時間の変更が行われる。これにより、ユーザはデータの順番が入れ替えられた動画データのプレビュー動作を行うことができる。
【００５５】
次に、ユーザが動画データの編集を行う場合に、動画データの長さ（再生時間）を指定する方法を説明する。例えば、図７（ａ）のように動画データを切り出す場合、ユーザは動画データの先頭切り出し位置および動画データを切り出す長さを指定する。例えば、動画データの長さとして１５秒が指定された場合、動画データの切り出しは以下のように行われる。まず、ユーザの指定によって動画データの先頭切り出し位置が図７（ａ）で示した方法に従って決定される。続いて制御部１０はタイムテーブル中の、データの先頭切り出し位置のＩフレームに関するデコード間隔時間を読み出し、この値に、このＩフレームの後に続くフレームのデコード間隔時間を次々と加算していく。
【００５６】
制御部１０はこの加算される時間が１５秒を超えるかどうか監視する。この時間が初めて１５秒を超えたとき、制御部１０は１つ前のフレームが動画データの末尾切り出し位置であると判断する。前述したように、末尾切り出し位置はＩフレームの直前のＰフレームでなければならないため、制御部１０は末尾切り出し位置であると判断したフレームの直前のＩフレームを検索し、そのＩフレームの１つ前のＰフレームを動画データの末尾切り出し位置に決定する。これ以後の動作は前述した通りである。
【００５７】
また、ユーザが特定の時間以内であるような好みの長さの動画データを編集できるようにしてもよい。例えば、ユーザは図６（ａ）および（ｂ）で示した方法に従って動画データの先頭切り出し位置と末尾切り出し位置を指定する。制御部１０は上述した方法と同様にタイムテーブル中のデコード間隔時間を参照し、先頭切り出し位置のＩフレームから末尾切り出し位置のＰフレームまでのデコード間隔時間を加算することにより動画データの再生時間を算出する。この再生時間が特定の時間（例えば１５秒）を超えている場合、ユーザに対してもう一度編集を行うよう、表示部２からメッセージが出力される。再生時間が特定の時間を超えていなければ、制御部１０は末尾切り出し位置であると判断したフレームの直前のＩフレームを検索し、そのＩフレームの１つ前のＰフレームを動画データの末尾切り出し位置に決定する。
【００５８】
さらに続けて、ユーザはプレビュー再生や動画データの保存などを行うことができる。これにより、メールに添付して送付する動画データに対して、動画データの長さの面で通信事業者等により規定がある場合にも、ユーザが真に必要なデータのみを素早く切り出して送付することが可能となる。また、前述の規定がデータ量に関する規定であった場合には、先頭切り出し位置以後の各フレームのフレームサイズを加算することにより末尾切り出し位置を決定するように構成すればよい。
【００５９】
なお、携帯電話等の通信機能および動画撮影による動画データ生成機能を有する携帯用電子機器の内部に、上述したデータ編集装置が格納されていてもよい。その場合、撮像部１〜無線通信部１２の各構成は、携帯用電子機器と共用する形態であってもよい。携帯用電子機器が具備する通信機能により、ユーザが編集した動画データをメールに添付して他の携帯用電子機器へ送信したり、ユーザが撮影した動画データにＷｅｂサイトからダウンロードした画像データをつなぎ合わせたりする（例えば、ユーザが撮影した動画データに、オープニング映像およびエンディング映像をつなぎ合わせて映画やドラマのような動画データを作成する）など様々な編集が可能となり、ユーザの利便性が向上する。
【００６０】
特に、携帯電話においては、送信できる動画データは例えば再生時間が１５秒以内のものというように限られている場合が大半であり、ユーザが撮影した動画データの再生時間が１５秒よりも長い場合でも、ユーザは様々な編集を行って１５秒以内の動画データを作成することにより、これを他の携帯電話に送信することができる。さらに、意図的に長く撮影を行ってから必要なところだけを切り出すことを容易に実施することができるため、ユーザは落ち着いて撮影を行うことができる。また、プレビュー再生によって、データ編集の確認処理スピードが向上するため、ユーザが非常に快適に利用できる編集機能を持つ携帯用電子機器を提供することができる。
【００６１】
【発明の効果】
以上説明したように、この発明によれば、動画データ中のヘッダに含まれる時間情報が変更されるようにしたので、ユーザは動画データのつなぎ合わせや一部分の切り出し、不要な部分のカット、データの部分的な順番の入れ替えなど様々な動画データの編集を行うことができるという効果が得られる。また、動画データ中のヘッダに含まれる時間情報を変更し、この時間情報に基づいて動画データのプレビュー再生を行うようにしたので、ユーザは編集した動画データを容易に確認することができるという効果も得られる。
【図面の簡単な説明】
【図１】この発明の一実施形態によるデータ編集装置の構成を示すブロック図である。
【図２】同実施形態における動画データの構造を示す図である。
【図３】同実施形態におけるタイムベースおよびタイムインクリメントの機能を説明するための図である。
【図４】同実施形態において、２つの動画データをつなぎ合わせる場合のデータ構造を示す図である。
【図５】同実施形態における無音データの挿入を説明するための図である。
【図６】同実施形態における動画データの切り出しを説明するための図である。
【図７】同実施形態における動画データの切り出しおよびカットを概念的に示す図である。
【図８】同実施形態における動画データの切り出しの際の画像データおよび音声データの同期を説明するための図である。
【図９】同実施形態における動画データの入れ替えを概念的に示す図である。
【符号の説明】
１・・・撮像部、２・・・表示部、３・・・画像制御部、４・・・操作部、５・・・スピーカ、６・・・マイク、７・・・音声制御部、８・・・ＲＯＭ、９・・・ＲＡＭ、１０・・・制御部、１１・・・アンテナ、１２・・・無線通信部
[0001]
[Technical Field of the Invention]
The present invention relates to a data editing apparatus and a data editing method that are housed in a portable electronic device such as a mobile phone and that edit moving image data according to the user's preferences.
[0002]
[Prior art]
In recent years, users have become able to exchange many kinds of information with one another using mobile phones and similar devices. For example, Patent Document 1 describes a portable videophone terminal that sends images and audio with e-mail so that the recipient can visually grasp information that text alone cannot convey. More recently, portable electronic devices have appeared that can transmit not only still image data but also moving image data to other devices. Such a device can play back moving image data and can attach moving image data with a maximum playback time of 15 seconds to an e-mail and send it to another device. The playback time of transmittable moving image data is limited to 15 seconds so that the wireless traffic channels carrying the data are not overloaded.
[0003]
[Patent Document 1]
JP 2000-332904 A
[0004]
[Problems to be solved by the invention]
However, although conventional portable electronic devices could create moving image data by shooting and play it back, exchange moving image data with other devices, and download moving image data from websites, they could not edit moving image data. A user could play back moving image data longer than 15 seconds, but because the data could not be edited, it could not be sent to other devices; the user could only play it back and enjoy it on his or her own device.
[0005]
It is generally believed that editing moving image data on a mobile phone requires first decoding the data, converting it into display data, displaying it on the screen, performing the various editing operations on that display data, and then encoding and saving the result. If moving image data is repeatedly decoded and encoded during editing, the conversion takes time and the device becomes hard to use, so no mobile phone with a moving image editing function has existed.
[0006]
The present invention has been made in view of the above-described problems, and an object thereof is to provide a data editing apparatus and a data editing method capable of editing moving image data.
[0007]
[Means for Solving the Problems]
The present invention has been made to solve the above problems, and adopts the following means.
The present invention comprises: storage means for storing a plurality of items of moving image data, each composed of a plurality of image frames and a header describing a time table that contains information used when the image frames are decoded consecutively; operation means for designating, from among the moving image data stored in the storage means, the moving image data to be joined; connecting means for, when first moving image data and second moving image data are designated by the operation means, generating third moving image data by appending the second moving image data, with its header removed, to the end of the first moving image data; and storing means for storing the third image data generated by the connecting means in the storage means; wherein, when generating the third moving image data, the connecting means rewrites the time table in the header of the first moving image data, based on information contained in the time table in the header of the second moving image data, so that the second moving image data follows the first moving image data continuously, and uses the rewritten header as the header of the third moving image data.
[0008]
Further, the plurality of image frames in each of the first and second moving image data include reference frames, which represent a reference image, and difference frames, which represent a difference from the reference image; the time table includes frame discrimination information containing at least information indicating the positions of the reference frames; and the connecting means rewrites the time table so that, based on the frame discrimination information contained in the time table of the second moving image data, the information indicating the positions of the reference frames in the image data of the second moving image data is shifted later by the number of image frames contained in the first moving image data, and the shifted position information is added to the frame discrimination information of the first moving image data.
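The joining operation of paragraph [0007] and the renumbering of reference frames in paragraph [0008] can be sketched as follows. This is only an illustrative model, not the patented implementation: the `TimeTable` and `Clip` classes, their field names, and the use of 1-based frame numbers are assumptions introduced here, loosely mirroring the frame size, frame discrimination, and decode interval entries that the embodiment later associates with the header.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TimeTable:
    """Hypothetical stand-in for the time table kept in a clip header."""
    frame_sizes: List[int] = field(default_factory=list)          # bytes per frame
    decode_intervals_ms: List[int] = field(default_factory=list)  # one entry per frame
    i_frame_numbers: List[int] = field(default_factory=list)      # 1-based reference frames


@dataclass
class Clip:
    """Hypothetical clip: one header plus the still-encoded frames."""
    header: TimeTable
    frames: List[bytes]


def concatenate(first: Clip, second: Clip) -> Clip:
    """Append `second`, minus its header, to `first` without re-encoding any frame."""
    offset = len(first.frames)  # number of image frames in the first clip
    merged = TimeTable(
        frame_sizes=first.header.frame_sizes + second.header.frame_sizes,
        decode_intervals_ms=(first.header.decode_intervals_ms
                             + second.header.decode_intervals_ms),
        # Reference-frame numbers of the second clip move later by `offset`.
        i_frame_numbers=(first.header.i_frame_numbers
                         + [n + offset for n in second.header.i_frame_numbers]),
    )
    return Clip(header=merged, frames=first.frames + second.frames)
```

For example, joining a three-frame clip whose only reference frame is frame 1 with a two-frame clip whose only reference frame is also frame 1 yields a five-frame clip whose reference frames are frames 1 and 4. Only the header is recomputed; the frame payloads are copied unchanged.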
[0009]
Further, each of the plurality of image frames in each of the first and second moving image data is given elapsed time information for managing the elapsed playback time of the moving image data, and the connecting means adds the elapsed time information of each of the image frames that were contained in the second moving image data after the plurality of image frames that were contained in the first moving image data.
[0010]
Further, the storage means includes a ROM and a RAM, and the connecting means performs the processing for generating the third moving image data on the RAM; the apparatus further comprises preview means for playing back the third moving image data, with reference to the time table contained in the header of the third moving image data on the RAM, before the storing means stores the generated third image data in the ROM of the storage means.
[0011]
Further, each of the first and second moving image data includes image data having the plurality of image frames and audio data that corresponds to the image data and consists of a plurality of frames, and the time table contained in the header of each of the first and second moving image data further includes interval time information indicating the decode interval of each frame of the respective image data and audio data.
[0012]
Further, when generating the third moving image data, the connecting means calculates, from the respective interval time information, the total playback time of all frames of the image data and of the audio data in the first moving image data and compares the two. If the total playback time of the image data is longer than that of the audio data, the connecting means generates silent audio data covering the time by which the audio data falls short and inserts it after the audio data of the first moving image data. If the first moving image data contains no audio data, the connecting means generates silent audio data equal in duration to the total playback time of all frames of the image data contained in the first moving image data and inserts it in association with that image data. In either case, the connecting means rewrites the time table in the header of the third moving image data so as to add the time information for the inserted audio data.
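As a rough illustration of the silent-audio padding in paragraph [0012], the shortfall can be computed from the per-frame decode intervals alone. The function name, the millisecond units, and the 20 ms audio frame length are assumptions made for this sketch (the source does not state an audio frame length), and padding with whole frames only is one possible policy, not necessarily the one the apparatus uses.

```python
from typing import List


def silent_padding(video_intervals_ms: List[int],
                   audio_intervals_ms: List[int],
                   audio_frame_ms: int = 20) -> List[int]:
    """Return decode intervals for silent frames to append to the audio track.

    An empty `audio_intervals_ms` models a clip with no audio at all, in which
    case the padding covers the whole image track.
    """
    if audio_frame_ms <= 0:
        raise ValueError("audio_frame_ms must be positive")
    shortfall = sum(video_intervals_ms) - sum(audio_intervals_ms)
    if shortfall <= 0:
        return []  # audio already lasts at least as long as the images
    # Assumption: only whole silent frames are inserted; any remainder shorter
    # than one audio frame is left to be absorbed elsewhere.
    return [audio_frame_ms] * (shortfall // audio_frame_ms)
```

With fifteen 66 ms image frames (990 ms) and no audio, the sketch returns 49 silent frames (980 ms). The returned intervals are what would be appended to the audio entries of the time table.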
[0013]
Further, the decode intervals of the image data frames and the audio data frames of the first moving image data are such that neither is a divisor or a multiple of the other. When generating the third moving image data, the connecting means calculates, from the respective interval time information, the total playback time of all frames of the image data and of the audio data in the first moving image data and compares the two; if the total playback times differ, it rewrites the time table in the header of the third moving image data so as to lengthen or shorten the decode interval of the image data of the first moving image data and thereby make the two equal.
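The interval correction of paragraph [0013] amounts to absorbing the difference between the two track lengths into a decode interval of the image data. The sketch below applies the whole difference to the last image frame of the first clip, which is how the embodiment describes the joining case; the function name and the sample values are illustrative assumptions only.

```python
from typing import List


def align_video_to_audio(video_intervals_ms: List[int],
                         audio_intervals_ms: List[int]) -> List[int]:
    """Return image decode intervals whose total equals the audio total.

    The difference is added to (or subtracted from) the last image frame only;
    no frame data is decoded or re-encoded.
    """
    if not video_intervals_ms:
        raise ValueError("the clip has no image frames")
    diff = sum(audio_intervals_ms) - sum(video_intervals_ms)
    adjusted = list(video_intervals_ms)
    adjusted[-1] += diff
    if adjusted[-1] <= 0:
        raise ValueError("difference is too large to absorb in one frame")
    return adjusted
```

For instance, one 133 ms image frame paired with seven 20 ms audio frames (140 ms) has its interval extended to 140 ms, while three 66 ms image frames paired with nine 20 ms audio frames (180 ms) have the last interval shortened to 48 ms.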
[0014]
Further, the operation means can instruct that a new moving image be cut out of moving image data stored in the storage means by designating a cut-out range. The apparatus comprises: first search means for, when a cut-out start position in fourth moving image data is designated by the operation means, searching for the reference frame closest to the designated position and, when a cut-out end position is designated, searching for the difference frame immediately preceding the reference frame closest to the designated position; and first extraction means for extracting the frames of one of the following ranges of image data and generating new fifth moving image data having the extracted image frames: from the reference frame identified by the first search means as the cut-out start position to the last image frame of the fourth moving image data; from the reference frame identified as the cut-out start position to the difference frame immediately preceding the reference frame identified as the cut-out end position; or from the first image frame of the fourth moving image data to the difference frame immediately preceding the reference frame identified as the cut-out end position. The storing means stores the fifth image data generated by the first extraction means in the storage means. The first extraction means extracts, from the time table contained in the header of the fourth moving image data, the interval time information indicating the decode intervals of the frames it has extracted, generates a new time table containing the extracted interval time information and a header having the new time table, and attaches the generated header to the extracted image data to generate the fifth moving image data.
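The search in paragraph [0014] snaps the user's positions to reference-frame boundaries so that the cut-out clip starts on a reference frame and ends on the difference frame just before one. A minimal sketch follows, assuming 1-based frame numbers and a list of reference-frame numbers taken from the frame discrimination information; the function names and the tie-breaking rule (prefer the earlier frame) are assumptions not stated in the source.

```python
from typing import List, Tuple


def nearest_i_frame(i_frame_numbers: List[int], position: int) -> int:
    """Return the reference-frame number closest to `position` (earlier one on ties)."""
    if not i_frame_numbers:
        raise ValueError("clip has no reference frames")
    return min(i_frame_numbers, key=lambda n: (abs(n - position), n))


def cut_range(i_frame_numbers: List[int],
              start_position: int,
              end_position: int) -> Tuple[int, int]:
    """Snap user-chosen start and end frames to a decodable cut-out range."""
    start = nearest_i_frame(i_frame_numbers, start_position)
    # The clip ends on the difference frame immediately before a reference frame.
    end = nearest_i_frame(i_frame_numbers, end_position) - 1
    if end < start:
        raise ValueError("start and end positions snap to an empty range")
    return start, end
```

With reference frames at 1, 16, 31 and 46, a requested range of frames 20 to 40 becomes frames 16 to 45.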
[0015]
Further, the operation means can instruct that a new moving image be cut out of moving image data stored in the storage means by designating a cut-out range. The apparatus comprises: second search means for, when a cut-out start position in fourth moving image data is designated by the operation means, searching for the reference frame closest to the designated position and further searching for the frames within a predetermined time range starting from that reference frame; and second extraction means for extracting the frames in the range found by the second search means and generating new sixth moving image data having the extracted image frames. The storing means stores the sixth image data generated by the second extraction means in the storage means. The second extraction means extracts, from the time table contained in the header of the fourth moving image data, the interval time information indicating the decode intervals of the frames it has extracted, generates a new time table containing the extracted interval time information and a header having the new time table, and attaches the generated header to the extracted image data to generate the sixth moving image data.
[0016]
Further, interval time information indicating the decode interval of each of the plurality of image frames in the fourth moving image data is provided, and the second search means refers to the time table contained in the header of the fourth moving image data stored in the storage means and identifies the frames in the predetermined time range by successively adding the interval time information of each frame from the starting reference frame onward.
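The accumulation in paragraph [0016] can be sketched as a running sum over the decode intervals, stopping just before the limit is exceeded. The second helper then moves the tentative end back to the difference frame preceding the last reference frame in range, following the rule the embodiment gives for its 15-second example. Function names, 1-based frame numbering, and the sample values are assumptions for illustration.

```python
from typing import List, Optional


def last_frame_within(decode_intervals_ms: List[int],
                      start_frame: int,
                      limit_ms: int) -> Optional[int]:
    """Return the last frame such that frames start_frame..last fit in limit_ms."""
    elapsed = 0
    last = None
    for number in range(start_frame, len(decode_intervals_ms) + 1):
        elapsed += decode_intervals_ms[number - 1]
        if elapsed > limit_ms:
            break  # this frame would push the clip past the limit
        last = number
    return last


def snap_end_to_p_frame(i_frame_numbers: List[int],
                        start_frame: int,
                        tentative_end: int) -> int:
    """Move the end back to the difference frame before the preceding reference frame."""
    candidates = [n for n in i_frame_numbers if start_frame < n <= tentative_end]
    # Assumption: if no later reference frame lies in range, keep the tentative end.
    return candidates[-1] - 1 if candidates else tentative_end
```

For a 300-frame clip at 66 ms per frame with a reference frame every 15 frames, starting at frame 16 with a 15 000 ms limit gives a tentative end at frame 242, which snaps back to frame 240.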
[0017]
Further, the second search means refers to the time table contained in the header of the fourth moving image data stored in the storage means and searches for a difference frame that comes before the predetermined time has elapsed when playback starts from the starting reference frame and that lies within the predetermined time range.
[0018]
The data editing apparatus according to any one of claims 9 to 11, further comprising communication means capable of transmitting the moving image data stored in the storage means to another device, wherein the predetermined time is a value within the range that the communication means can transmit.
[0019]
The present invention is also a data editing method for editing moving image data stored in storage means that stores a plurality of items of moving image data, each composed of a plurality of image frames and a header describing a time table that contains information used when the image frames are decoded consecutively, the method comprising the steps of: designating, from among the moving image data stored in the storage means, the moving image data to be joined; when first moving image data and second moving image data are designated, generating third moving image data by appending the second moving image data, with its header removed, to the end of the first moving image data; storing the generated third image data in the storage means; and, when generating the third moving image data, rewriting the time table in the header of the first moving image data, based on information contained in the time table in the header of the second moving image data, so that the second moving image data follows the first moving image data continuously, and using the rewritten header as the header of the third moving image data.
[0026]
[Embodiments of the Invention]
Embodiments of the present invention will be described below with reference to the drawings. FIG. 1 shows the configuration of a data editing apparatus according to an embodiment of the present invention. In the figure, reference numeral 1 denotes an imaging unit, which outputs captured analog image data to the image control unit 3. Reference numeral 2 denotes a display unit, which displays moving images based on display data output from the image control unit 3. The image control unit 3 encodes the analog image data output from the imaging unit 1, generates image data in the MPEG-4-based 3gp2 format, and outputs it to the control unit 10. The image control unit 3 also decodes image data output from the control unit 10 to generate display data and outputs the display data to the display unit 2. Reference numeral 4 denotes an operation unit, which has various keys operated by the user, such as a numeric keypad, character input keys, and function keys.
[0027]
Reference numeral 5 denotes a speaker, which produces sound based on the audio signal output from the audio control unit 7. Reference numeral 6 denotes a microphone, which converts external sound into an audio signal and outputs it to the audio control unit 7. The audio control unit 7 decodes audio data in QCELP (Qualcomm Code Excited Linear Prediction) format output from the control unit 10, generates an audio signal, and outputs it to the speaker 5. The audio control unit 7 also encodes the audio signal output from the microphone 6 to generate audio data and outputs it to the control unit 10. Reference numeral 8 denotes a ROM, such as an EPROM or EEPROM, in which data can be erased and written and which can retain data for at least a certain period. The ROM 8 holds the image data and audio data stored in it by the control unit 10. Reference numeral 9 denotes a RAM, in which image data and audio data read from the ROM 8 by the control unit 10 are temporarily stored and on which data editing is performed. Reference numeral 10 denotes a control unit that controls each unit. Reference numeral 11 denotes an antenna for transmitting and receiving data. Reference numeral 12 denotes a wireless communication unit, which communicates wirelessly with a wireless communication network and transmits and receives voice data during calls, and mail and other data during mail communication.
[0028]
Next, the structure of the moving image data in the present embodiment will be described with reference to FIG. A header is added to the beginning of the moving image data, image data and audio data are arranged alternately after the header, and an end code indicating the end of the data is added at the end. Image data is composed of frames; depending on the design of each device, one second of moving image data is composed of, for example, 15 frames of image data. There are two types of frame, the I frame and the P frame. An I frame is a reference frame constituting an image and is obtained by encoding a complete still image. A P frame is obtained by encoding only the difference between the immediately preceding I frame or P frame and the actual still image. As shown in FIG. 2, the image data consists of an I frame followed by a plurality of P frames. For example, in moving image data composed of 15 frames per second, one second of image data consists of one I frame and 14 P frames. Audio data frames, on the other hand, have no distinction corresponding to the I frame and P frame of image data.
[0029]
Each frame of the image data includes a time base and a time increment, which indicate elapsed time information. The time base and the time increment are defined by the MPEG standard and are used as information indicating the reproduction interval when moving image data is reproduced. The time increment indicates the elapsed time at the time of frame reproduction, and the time base is used supplementarily and takes a value of 0 or 1. FIG. 3 shows an example of the use of the time base and time increment. This is an example of image data composed of 7.5 frames per second (half the rate of 15 frames per second), in which a value is added to the time increment for each frame, taking the I frame at the first frame as the reference. Strictly, 1 ÷ 7.5 = 0.1333... and 1 ÷ 15 = 0.0666..., but for the sake of explanation the fractions are rounded down here: the elapsed time per frame is taken as 133 milliseconds at 7.5 frames per second (half rate) and as 66 milliseconds at 15 frames per second (full rate).
[0030]
If the time information were carried by the time increment alone, the amount of time increment data would become enormous, so the time increment starts over every second. In FIG. 3, for example, the time increment increases by 1 for every 66 milliseconds that elapse. Because this example is the half rate, in which one second consists of 7.5 frames, one frame lasts 133 milliseconds and the time increment increases by 2 for each frame. Since the elapsed time passes 1 second at the ninth frame, the time increment starts over there, and its value becomes 1, indicating that 66 milliseconds have elapsed since the 1-second mark. Each time 1 second elapses, the time base changes from 0 to 1, and it returns to 0 in the next P frame. In this way, the time increment starts over every time 1 second passes, and the time base serves as a flag indicating that 1 second has passed. Note that FIG. 3 is provided merely to explain the time base and the time increment, and the numerical values in the figure are determined at design time based on the specifications of the moving image data.
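The behavior of the two fields in the half-rate example can be sketched as follows. This is only an illustration, assuming 15 ticks per second (one tick being the 66 milliseconds used in the text) and 2 ticks per frame; the function name `time_fields` and the tuple layout are hypothetical, and the actual values of FIG. 3 are fixed at design time.

```python
def time_fields(num_frames, ticks_per_frame=2, ticks_per_second=15):
    """Return (time_base, time_increment) for each frame.

    time_increment counts ticks within the current second and starts
    over every second; time_base is 1 only on the first frame after a
    one-second boundary has been crossed, otherwise 0.
    """
    fields = []
    prev_second = 0
    for n in range(num_frames):
        total_ticks = n * ticks_per_frame
        second, increment = divmod(total_ticks, ticks_per_second)
        time_base = 1 if second != prev_second else 0
        fields.append((time_base, increment))
        prev_second = second
    return fields
```

With these assumed values, the eighth frame carries an increment of 14, the ninth frame carries a time base of 1 with an increment of 1, and the tenth frame returns to a time base of 0.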
[0031]
A time table having frame information is stored in the header of the moving image data in FIG. The time table has information on the frame size (stsz), frame discrimination information (stss), and decoding interval time (stts) of each frame. The frame size indicates the size of each frame. The frame size is significantly different between the I frame and the P frame. The frame discriminating information is used for discriminating the I frame, and has the I frame number as information. That is, it is easy to grasp which frame is the I frame for storing the reference image, and this frame discrimination information is used in a technique such as frame-by-frame playback of only the I frame during fast-forward playback of moving image data.
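A simplified model of such a time table is sketched below. It assumes one flat entry per frame for the frame size and the decoding interval time, and a list of 1-based I frame numbers for the frame discrimination information; the class name `TimeTable` and its methods are hypothetical, and the layout stored in an actual file may differ.

```python
from dataclasses import dataclass, field


@dataclass
class TimeTable:
    stsz: list = field(default_factory=list)  # size of each frame in bytes
    stss: list = field(default_factory=list)  # 1-based numbers of the I frames
    stts: list = field(default_factory=list)  # decoding interval of each frame in ms

    def is_i_frame(self, frame_number):
        """True if the frame with this 1-based number is an I frame."""
        return frame_number in self.stss

    def fast_forward_frames(self):
        """Frame numbers shown when only I frames are played back."""
        return list(self.stss)
```

For example, a table built with `stss=[1, 4]` reports frames 1 and 4 as the frames to show during fast-forward playback.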
[0032]
A frame is decoded by the image control unit 3, temporarily stored in a buffer in the image control unit 3, and output to the display unit 2 when display of the previous frame is completed. The decoding interval time is obtained by dividing the time from the start of playback to the end of playback among the frames according to the data size of each frame. In other words, the decoding interval time is the time allotted to decoding and reproducing the frame to which it is assigned. When the image control unit 3 completes decoding of a frame within the allotted decoding interval time, the image of the frame is displayed on the display unit 2; when decoding is not completed within the decoding interval time, the image of the frame is not displayed and the next frame is decoded. The time table is generated when the analog image data is encoded in the image control unit 3. Note that the frame size and decoding interval time for the audio data are also stored in the header.
[0033]
Each frame of the moving image data also stores frame head information indicating the head of the frame. When referring to the data of each frame, the control unit 10 searches for the head of the frame based on the frame head information. It can then determine whether the frame is an I frame or a P frame based on the frame discrimination information, and acquire the time information given by the time base and time increment.
[0034]
Next, the joining of two pieces of moving image data having the above-described structure will be described. First, the user designates the two pieces of moving image data to be joined using the operation unit 4 while viewing the titles of the moving image data displayed on the display unit 2. When two pieces of moving image data (data 1 and data 2) are joined as shown in FIG. 4, the control unit 10 first reads data 1 and data 2 from the ROM 8 and stores them in the RAM 9. The control unit 10 detects the end code of data 1, removes that end code and the header of data 2, and stores the two pieces of data in a continuous area on the RAM 9 as shown on the right of FIG. At this time, the information in the header of data 2 is added to the time table in the header of data 1, but the decoding interval time for the last frame of the image data of data 1 and the frame discrimination information for the frames of data 2 are changed.
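The byte-level part of this joining step can be sketched as follows. The marker `END_CODE` is a made-up placeholder rather than the real end code of the format, and the function `join_bodies` and its `header2_len` parameter are hypothetical; the header of data 1 is left untouched here because its time table is rewritten separately.

```python
END_CODE = b"<END>"  # hypothetical end marker, for illustration only


def join_bodies(data1, data2, header2_len):
    """Append data 2 (minus its header) to data 1 (minus its end code)."""
    if not data1.endswith(END_CODE):
        raise ValueError("data 1 does not end with the end code")
    body2 = data2[header2_len:]  # drop the header of data 2
    # Data 2 keeps its own end code, which now closes the joined data.
    return data1[:-len(END_CODE)] + body2
```

The result occupies one continuous buffer, corresponding to the continuous area on the RAM 9 in the description.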
[0035]
The decoding interval time for the last frame of the image data of data 1 is changed in order to synchronize the image data and the audio data, as described later. Note that when moving image data consisting only of image data is joined, there is no need to synchronize with audio data, so the decoding interval time in the time table need not be changed. As for the frame discrimination information, the information for the frames added after data 1 by joining data 2 is appended to the original frame discrimination information. At this time, the I frame numbers of the added data 2 are shifted back (increased) by the number of frames in data 1 before being written.
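The merging of the two time tables can be sketched as follows, assuming that the I frame numbers of data 2 are moved later by the number of frames in data 1, as claim 2 describes, so that they refer to positions in the joined sequence. The dictionary layout and the name `merge_time_tables` are hypothetical, and the synchronization adjustment of the last frame of data 1 is not included.

```python
def merge_time_tables(table1, table2):
    """Merge the time table of data 2 into that of data 1.

    Each table is a dict with per-frame lists 'stsz' and 'stts' and a
    list 'stss' of 1-based I frame numbers. The inputs are not modified.
    """
    frames_in_data1 = len(table1["stsz"])
    return {
        "stsz": table1["stsz"] + table2["stsz"],
        "stts": table1["stts"] + table2["stts"],
        # I frame numbers of data 2 are offset by the frame count of data 1.
        "stss": table1["stss"] + [n + frames_in_data1 for n in table2["stss"]],
    }
```

If data 1 has three frames and data 2 starts with an I frame, that I frame becomes frame 4 of the joined data.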
[0036]
In order to save the joined data in the ROM 8, the time base and time increment of each frame must be changed. After the data is joined, the control unit 10 rewrites the time base and time increment sequentially from the first frame of data 2 and stores the rewritten moving image data in the ROM 8. However, if the user had to save the edited moving image data and then play it back in order to confirm that the editing was performed as desired, the confirmation would take time, and the editing result could not be checked easily.
[0037]
Therefore, before the edited moving image data is saved, the moving image is played back using the newly created time table in the header (preview operation) so that the user can confirm the editing result. The preview operation is performed as follows. Each time the control unit 10 reads a frame from the joined moving image data, it reads the decoding interval time for that frame from the time table. The image data in the moving image data is decoded by the image control unit 3 and stored in a buffer in the image control unit 3. When decoding in the image control unit 3 is completed within the decoding interval time of the frame, the control unit 10 issues an output instruction to the image control unit 3, and the image control unit 3, on receiving the instruction, outputs the display data to the display unit 2. The display unit 2 displays an image based on the received display data. A moving image is reproduced by performing this operation continuously.
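The display decision made for each frame during the preview can be sketched as follows. The decode times passed in are hypothetical measurements used only to exercise the rule, and the function name `preview_schedule` is likewise an assumption.

```python
def preview_schedule(decode_times_ms, intervals_ms):
    """Return the 1-based numbers of the frames that would be displayed.

    A frame is displayed only if its decoding finished within its
    decoding interval time; otherwise it is skipped and the next
    frame is decoded.
    """
    shown = []
    pairs = zip(decode_times_ms, intervals_ms)
    for number, (decode, interval) in enumerate(pairs, start=1):
        if decode <= interval:
            shown.append(number)
    return shown
```

With decode times of 40, 70, and 30 ms against intervals of 66 ms each, the second frame is skipped and frames 1 and 3 are displayed.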
[0038]
When the last frame of data 1 is reproduced, decoding simply continues into the frames of data 2, because the end code of data 1 that followed this frame has been deleted. As described above, the control unit 10 controls the display of each frame based on the time information managed in the time table, so the moving image can be reproduced from the moving image data temporarily generated on the RAM 9. Further, since the frame discrimination information in the header is also updated, fast-forward playback is possible even in the preview state. This is useful when editing long moving image data, because the edited result can be confirmed quickly.
[0039]
The same operation as that described above is performed on the audio data, and sound is generated from the speaker 5. However, for example, there is moving image data that does not include audio data and includes only image data, which is generated when the user deletes audio data by post-record editing. In this case, the following operation is performed. Assume that data 1 consisting only of image data 1 and data 2 consisting of image data 2 and audio data 2 are joined together as shown in FIG. In the figure, the data structure is simplified, and the arrangement of data does not necessarily indicate the actual position on the RAM 9. Since there is no audio data in the data 1, the header of the data 1 has information relating to only the image data 1.
[0040]
When data 1 and data 2 are joined, the header information of data 2 relating to image data 2 is added to the image data information in the header of data 1, and the header information of data 2 relating to audio data 2 is newly written into the header of data 1 as audio data information. If the control unit 10 controlled reproduction with reference to this time table, audio data 2 would be reproduced together with image data 1, because the time information for audio data 2 is written in the header as it is, and the image and sound of the moving image would not match.
[0041]
Therefore, when joining data 1 and data 2, the control unit 10 refers to the decoding interval times in the time table and calculates the reproduction time of image data 1. The control unit 10 then composes audio data 1 from one or more silent audio frames so that its reproduction time is as close as possible to this time, inserts audio data 1 before audio data 2, and adds the decoding interval times for audio data 1 to the time table. As for the image data, the frame discrimination information for the frames of image data 2 is changed, and the decoding interval time of the last frame of image data 1 is changed in order to synchronize the image data and the audio data, as described later. As the silent data for audio data 1, the control unit 10 may, for example, generate and use white noise. As a result, as shown in FIG. 5B, audio data 1 corresponds to image data 1, and the moving image data is reproduced with the image and the sound correctly matched.
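The number of silent audio frames needed can be estimated as in the sketch below. The description does not state the audio frame length, so the default of 20 ms is an assumption, as is the function name `silent_frame_count`; the rounding picks the frame count whose total length is nearest to the playback time of image data 1.

```python
def silent_frame_count(image_intervals_ms, audio_frame_ms=20):
    """Number of silent audio frames whose total length is nearest to
    the playback time of image data 1 (at least one frame)."""
    image_time = sum(image_intervals_ms)
    # Round to the nearest whole number of audio frames (a half rounds up).
    count = (2 * image_time + audio_frame_ms) // (2 * audio_frame_ms)
    return max(1, count)
```

Three image frames of 133 ms give 399 ms, for which 20 silent frames (400 ms) is the nearest fit under the assumed frame length; the remaining 1 ms is the kind of residue handled by rewriting the decoding interval time of the last image frame.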
[0042]
When the moving image data joined in this way is stored in the ROM 8, the control unit 10, as described above, refers to the time base and time increment of each frame from the first frame of the image data and rewrites all the time bases and time increments of the data joined after the first data. The control unit 10 then reads the moving image data from the RAM 9 and stores it in the ROM 8.
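One way to picture this rewriting is sketched below. It assumes that each frame carries a (time base, time increment) pair, that the time increment counts ticks within the current second at 15 ticks per second, and that the duration of the last frame of the first data is known in ticks; the function names and the parameter `last_frame_ticks` are hypothetical.

```python
def absolute_ticks(fields, ticks_per_second=15):
    """Convert (time_base, time_increment) pairs to absolute tick counts."""
    ticks, second = [], 0
    for time_base, increment in fields:
        second += time_base
        ticks.append(second * ticks_per_second + increment)
    return ticks


def rewrite_joined_fields(fields1, fields2, last_frame_ticks, ticks_per_second=15):
    """Rewrite the fields of data 2 so that they continue from data 1."""
    ticks1 = absolute_ticks(fields1, ticks_per_second)
    # Tick at which the first frame of data 2 starts in the joined data.
    offset = ticks1[-1] + last_frame_ticks
    ticks2 = absolute_ticks(fields2, ticks_per_second)
    ticks = ticks1 + [t + offset for t in ticks2]
    rewritten, prev_second = [], 0
    for t in ticks:
        second, increment = divmod(t, ticks_per_second)
        rewritten.append((second - prev_second, increment))
        prev_second = second
    return rewritten
```

The fields of the first data come out unchanged, and the frames of the second data receive values that continue the count across the joint.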
[0043]
Likewise, when the user joins other moving image data to moving image data whose audio data has been deleted from a midpoint to the end by post-record editing, silent data whose length is based on the image data is inserted and the header is rewritten by the same operation as described above, so that the data can be joined. In addition, when moving image data from which audio data has been deleted is joined, there is no need to synchronize with audio data, so the decoding interval time for the image data is not changed.
[0044]
Next, cutting out moving image data, in which a part of the moving image data is extracted to generate new moving image data, will be described with reference to FIG. When cutting out data, the user plays back the moving image, stops playback at the desired start and end positions, and designates the cutout positions. When the P frame at the user-designated position in FIG. 6A is designated as the start of the cutout, the control unit 10 refers to the frame discrimination information in the time table, searches for the I frame closest to this P frame, and sets that I frame as the start cutout position of the data. This is because, if the image data began with a P frame, display data could not be created from the difference information alone when the image data is decoded.
[0045]
When the P frame at the user-designated position in FIG. 6B is designated as the end of the cutout, the control unit 10 searches for the I frame closest to this P frame and sets the P frame immediately before that I frame as the end cutout position of the data. If the frame at the end cutout position were an I frame, I frames would be consecutive at the joint when other moving image data is joined after the cut-out data, and the image control unit 3 would have to decode large data twice in succession. Since the data size of an I frame is large, consecutive I frames raise the video bit rate, which may result in nonstandard data. To prevent this, the end cutout position is set to the P frame immediately before the I frame.
[0046]
In the search for the start and end cutout positions described above, the control unit 10 searches for the I frame closest to the frame designated by the user. However, the present invention is not limited to this, and the I frame immediately before or immediately after the designated frame may be searched for instead.
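The search for the cutout positions can be sketched as follows, using the nearest-I-frame variant. It assumes that the frame discrimination information is a sorted list of 1-based I frame numbers; the function names are hypothetical, and edge cases such as an end position that falls before the start position are not handled.

```python
from bisect import bisect_left


def nearest_i_frame(stss, frame_number):
    """Return the I frame number in the sorted list stss that is closest
    to frame_number (the earlier one on a tie)."""
    pos = bisect_left(stss, frame_number)
    candidates = stss[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda n: abs(n - frame_number))


def snap_cut_positions(stss, start_frame, end_frame):
    """Snap user-chosen frames to legal cutout positions.

    The cut starts on an I frame and ends on the frame just before
    an I frame.
    """
    start = nearest_i_frame(stss, start_frame)
    end = nearest_i_frame(stss, end_frame) - 1
    return start, end
```

With I frames at 1, 16, 31, and 46, a user selection of frames 20 to 40 is snapped to frames 16 to 45.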
[0047]
By cutting out the moving image data described above, as shown in FIG. 7A, a part of the moving image data can be cut out to generate new moving image data. Further, as shown in FIG. 7B, unnecessary portions in the moving image data can be cut and connected by removing the cut portions. In either case, a header including a time table is added to the head of the cut out data, and the decoding interval time in this time table is changed. When the editing of the data is completed, the user can confirm whether or not the editing can be performed as desired by performing the preview operation described above. Furthermore, the time base and time increment of each frame can be changed and data can be stored.
[0048]
When moving image data is edited by cutting out, cutting, or joining, an error arises in the synchronization between image and sound, and a correction is needed to synchronize the two. For example, when the data is divided between image frame 1 and image frame 2 as shown in FIG. 8, the audio data cannot always be divided at exactly the same time, because the playback times of an image frame and an audio frame differ. An audio frame is shorter than an image frame, and the two durations are not in a divisor/multiple relationship. In addition, data is divided in units of frames. Therefore, when a division is designated at an image frame boundary, a slight shift arises between the division position of the audio frames and the division position of the image frames.
[0049]
In FIG. 8, it is assumed that the audio data is divided between audio frame 7 and audio frame 8. The playback start time of image frame 2 and the playback start time of audio frame 8 differ by 9 milliseconds. Unless the output of audio frame 8 is delayed by 9 ms relative to the image of image frame 2, the sound and the image are not synchronized; yet if the cut-out data is played back without correction, both are output simultaneously from the head of their first frames. In other words, the audio of audio frame 8 is output 9 ms earlier than the position synchronized with the image, so correction is necessary.
[0050]
In this case, the control unit 10 refers to the decoding interval times in the time table, sums the decoding interval times from the first frame to the frame immediately before the cutout position for the image data and for the audio data, and compares the two sums, thereby recognizing that the playback start times of image frame 2 and audio frame 8 differ by 9 ms. The control unit 10 then rewrites the decoding interval time for image frame 2 so that it is shortened by 9 ms. As a result, from image frame 3 onward, the image data and the audio data are reproduced fully synchronized.
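The calculation can be sketched as follows. The interval values in the usage note are illustrative numbers chosen only to reproduce the 9 ms difference and are not taken from FIG. 8; the function names and parameters are hypothetical.

```python
def start_offset_ms(image_stts, audio_stts, image_cut, audio_cut):
    """Time by which the audio cut position lags the image cut position.

    image_cut and audio_cut are the numbers of frames that precede the
    cut in each stream; the stts lists hold per-frame intervals in ms.
    """
    return sum(audio_stts[:audio_cut]) - sum(image_stts[:image_cut])


def shorten_first_image_frame(image_stts, image_cut, offset_ms):
    """Return the intervals of the cut-out image data with its first
    frame shortened by offset_ms."""
    cut = list(image_stts[image_cut:])
    cut[0] -= offset_ms
    return cut
```

Assuming a first image frame of 131 ms followed by 133 ms frames and audio frames of 20 ms, a cut after image frame 1 and audio frame 7 gives an offset of 140 - 131 = 9 ms, and the first interval of the cut-out image data becomes 133 - 9 = 124 ms.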
[0051]
In addition, when data is connected, the reproduction time of the image data and the sound data is different, so that it is necessary to synchronize the image data and the sound data at the joint. For example, when other moving image data is connected after the moving image data composed of the image frame 1 and the sound frames 1 to 7 cut out as shown in FIG. 8, the reproduction end position of the image frame 1 and the sound frame 7 are reproduced. Since there is a difference of 9 ms between the end positions, the decoding interval time for the image frame 1 in the time table is rewritten so as to be extended by 9 ms. As a result, the image frames following the joint and the audio data can be joined together in a synchronized form.
[0052]
The same applies to the synchronization correction at the cutout position when the data is cut. For example, when data is cut out as shown in FIG. 7A, the decoding interval time of the first frame of the cut out image data is changed. When data is cut out and joined as shown in FIG. 7B, the decoding interval time of the rightmost frame of the left image data is changed as shown in the figure. As described above, the synchronization shift between the image data and the audio data can be corrected only by rewriting the decoding interval time in the time table.
[0053]
When the difference between the reproduction times of the image data and the sound data exceeds one frame length of the sound data, silence data is inserted into the sound data, and when the difference is within one frame length of the sound data, Decoding interval time is rewritten. In addition, in the case of moving image data including only image data after audio data is deleted by post-record editing by the user, it is not necessary to change the decoding interval time described above.
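The choice between the two corrections can be sketched as follows. The description states only the two cases; treating the remainder left after inserting whole silent frames as a further decoding interval correction is an assumption of this sketch, as are the function name and the requirement that the difference be non-negative (image data longer than audio data).

```python
def plan_sync_correction(difference_ms, audio_frame_ms):
    """Return (silent_frames, interval_correction_ms).

    Within one audio frame length, only the decoding interval time is
    rewritten; beyond it, whole silent audio frames are inserted and
    the remainder is left for the decoding interval time.
    """
    if difference_ms <= audio_frame_ms:
        return 0, difference_ms
    silent_frames = difference_ms // audio_frame_ms
    return silent_frames, difference_ms - silent_frames * audio_frame_ms
```

For an assumed audio frame length of 20 ms, a difference of 9 ms needs no silent frame, while a difference of 49 ms is covered by two silent frames plus a 9 ms interval correction.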
[0054]
Next, reordering of data as shown in FIG. 9 will be described. For example, to exchange the order of data C and data B in the figure, the user first designates the positions of both ends of data C to be cut out, and then designates the position at which data C is to be inserted (here, the boundary between data A and data B). In this case, the storage positions of data B and data C on the RAM 9 are exchanged, the frame discrimination information is rewritten, and the decoding interval times in the time table are changed for the image data of data A, data B, and data C in order to synchronize the image data and audio data. The user can then perform the preview operation on the moving image data whose order has been changed.
[0055]
Next, a method for designating the length (reproduction time) of the moving image data when the user edits the moving image data will be described. For example, when moving image data is cut out as shown in FIG. 7A, the user designates the start cutout position of the moving image data and the length to cut out the moving image data. For example, when 15 seconds is specified as the length of the moving image data, the moving image data is cut out as follows. First, the start cutout position of the moving image data is determined according to the method shown in FIG. Subsequently, the control unit 10 reads out the decoding interval time related to the I frame at the first cutout position of the data in the time table, and sequentially adds the decoding interval times of the frames following this I frame to this value.
[0056]
The control unit 10 monitors whether this sum exceeds 15 seconds. When the sum exceeds 15 seconds for the first time, the control unit 10 takes the frame immediately before the current one as the provisional end cutout position. Since, as described above, the end cutout position must be the P frame immediately before an I frame, the control unit 10 searches for the I frame immediately before the provisional end cutout position and determines the P frame immediately before that I frame as the end cutout position of the moving image data. The subsequent operation is as described above.
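The length-limited cutout can be sketched as follows. Frame numbers are 1-based, the start position is assumed to be an I frame listed in the frame discrimination information, and the handling of data that fits entirely within the limit is an assumption, since the description does not cover that case. The function name `end_for_duration` is hypothetical.

```python
from bisect import bisect_right


def end_for_duration(stts, stss, start, limit_ms):
    """Return the end frame number of a cut that starts at I frame
    `start` and plays for at most limit_ms, or None if no such cut
    exists."""
    total = 0
    for number in range(start, len(stts) + 1):
        total += stts[number - 1]
        if total > limit_ms:
            tentative_end = number - 1
            break
    else:
        return len(stts)  # the rest of the data fits (assumed behavior)
    if tentative_end < start:
        return None  # even the first frame exceeds the limit
    # I frame at or immediately before the tentative end ...
    i_frame = stss[bisect_right(stss, tentative_end) - 1]
    # ... and the frame just before it becomes the end of the cut.
    end = i_frame - 1
    return end if end >= start else None
```

With twelve frames of 100 ms, I frames at 1, 5, and 9, and a 1000 ms limit, the sum first exceeds the limit at frame 11, so frame 10 is the tentative end and the cut ends at frame 8.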
[0057]
The user may also be allowed to edit moving image data of any desired length within a specific time. For example, the user designates the start cutout position and the end cutout position according to the method shown in FIGS. The control unit 10 refers to the decoding interval times in the time table in the same manner as described above and sums the decoding interval times from the I frame at the start cutout position to the P frame at the end cutout position to calculate the reproduction time of the moving image data. When the reproduction time exceeds the specific time (for example, 15 seconds), a message is shown on the display unit 2 to prompt the user to edit again. When the reproduction time does not exceed the specific time, the control unit 10 searches for the I frame immediately before the frame determined to be the end cutout position and determines the P frame immediately before that I frame as the end cutout position of the moving image data.
[0058]
Furthermore, the user can perform preview playback, save the moving image data, and so on. Thus, even when a telecommunications carrier or the like imposes a rule on the length of moving image data sent as an e-mail attachment, the user can quickly cut out and send only the data that is really needed. When the rule concerns the data amount instead, the end cutout position may be determined by summing the frame sizes of the frames from the start cutout position onward.
[0059]
Note that the data editing apparatus described above may be incorporated in a portable electronic device, such as a mobile phone, that has a communication function and a function of generating moving image data by shooting. In that case, the components from the imaging unit 1 to the wireless communication unit 12 may be shared with the portable electronic device. Using the communication function of the portable electronic device, various kinds of use become possible: moving image data edited by the user can be attached to an e-mail and transmitted to another portable electronic device, and moving image data downloaded by the user can be joined to moving image data shot by the user (for example, movie-like or drama-like moving image data can be created by joining an opening video and an ending video to video shot by the user).
[0060]
In mobile phones in particular, the moving image data that can be transmitted is often restricted, for example to a playback time of 15 seconds or less. Even when the playback time of moving image data shot by the user exceeds 15 seconds, the user can transmit it to another mobile phone by editing it into moving image data of 15 seconds or less. Furthermore, since only the necessary portion can easily be cut out after deliberately shooting for a long time, the user can shoot without hurry. In addition, since preview playback speeds up confirmation of the editing result, a portable electronic device with an editing function that is very comfortable to use can be provided.
[0061]
[Effect of the invention]
As described above, according to the present invention, the time information included in the header of the moving image data is changed, so that the user can edit moving image data in various ways, such as joining moving image data, cutting out a part, cutting an unnecessary portion, and changing the order of parts of the data. Further, since the time information included in the header is changed and preview playback of the moving image data is performed based on that time information, the user can also easily check the edited moving image data.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a data editing apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing a structure of moving image data in the same embodiment.
FIG. 3 is a diagram for explaining functions of a time base and a time increment in the embodiment.
FIG. 4 is a diagram illustrating a data structure when two moving image data are connected in the embodiment;
FIG. 5 is a diagram for explaining insertion of silence data in the embodiment;
FIG. 6 is a diagram for explaining how to cut out moving image data in the embodiment;
FIG. 7 is a diagram conceptually showing cutting out and cutting of moving image data in the same embodiment.
FIG. 8 is a diagram for explaining synchronization of image data and audio data when moving image data is cut out in the embodiment.
FIG. 9 is a diagram conceptually illustrating replacement of moving image data in the embodiment.
[Explanation of symbols]
1 ... imaging unit, 2 ... display unit, 3 ... image control unit, 4 ... operation unit, 5 ... speaker, 6 ... microphone, 7 ... audio control unit, 8 ... ROM, 9 ... RAM, 10 ... control unit, 11 ... antenna, 12 ... wireless communication unit

Claims

A data editing apparatus comprising:
storage means for storing a plurality of pieces of moving image data, each including a plurality of image frames and a header in which a time table is described, the time table including information used when the plurality of image frames are decoded continuously;
operation means for designating, from among the plurality of pieces of moving image data stored in the storage means, the moving image data to be joined;
connecting means for, when first moving image data and second moving image data are designated by the operation means, generating third moving image data by connecting the second moving image data, with its header removed, after the first moving image data; and
storing means for storing the third image data generated by the connecting means in the storage means,
wherein, when generating the third moving image data, the connecting means rewrites the time table in the header of the first moving image data, based on information included in the time table in the header of the second moving image data, so that the second moving image data follows the first moving image data continuously, and uses the rewritten header as the header of the third moving image data.

The data editing apparatus according to claim 1, wherein
the plurality of image frames in each of the first and second moving image data include a reference frame representing a reference image and a difference frame representing a difference from the reference image,
the time table includes frame discrimination information having at least information indicating the position of the reference frame, and
the connecting means rewrites the time table so that, based on the frame discrimination information included in the time table of the second moving image data, the information indicating the positions of the reference frames in the image data of the second moving image data is shifted back by the number of image frames included in the first moving image data, and the shifted information indicating the positions of the reference frames of the second moving image data is added to the frame discrimination information of the first moving image data.

The data editing apparatus according to claim 1 or 2, wherein
each of the plurality of image frames in each of the first and second moving image data is given elapsed time information for managing the elapsed time of moving image data reproduction, and
the connecting means adds the elapsed time information of each of the plurality of image frames that were included in the second moving image data after the plurality of image frames that were included in the first moving image data.

The data editing apparatus according to any one of claims 1 to 3, wherein
the storage means includes a ROM and a RAM,
the connecting means performs the processing for generating the third moving image data on the RAM, and
the apparatus further comprises preview means for reproducing the third moving image data on the RAM with reference to the time table included in its header, before the generated third image data is stored by the storing means in the ROM of the storage means.

The data editing apparatus according to claim 2, wherein
each of the first and second moving image data includes image data having the plurality of image frames and audio data that corresponds to the image data and consists of a plurality of frames, and
the time table included in the header of each of the first and second moving image data further includes interval time information indicating the decoding interval of each frame of the image data and of the audio data.

The data editing apparatus according to claim 5, wherein
the connecting means, when generating the third moving image data, calculates the total reproduction time of all frames of each of the image data and the audio data of the first moving image data from the respective interval time information and compares the two totals,
when the comparison shows that the total reproduction time of the image data is longer than the total reproduction time of the audio data, generates silent audio data for the time by which the audio data falls short and inserts it after the audio data of the first moving image data,
when the first moving image data contains no audio data, generates silent audio data for a time equal to the total reproduction time of all frames of the image data included in the first moving image data and inserts it in association with the image data of the first moving image data, and
in either case rewrites the time table in the header of the third moving image data so as to add the time information for the inserted audio data.
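
A minimal sketch of the silence-padding arithmetic in the claim above, working on the time table only. The millisecond unit, the fixed silent-frame length `audio_frame_ms`, and the shorter final frame used to absorb a remainder are all assumptions made for illustration; the claims do not specify them.

```python
def pad_audio_intervals(video_intervals_ms, audio_intervals_ms, audio_frame_ms):
    """Return audio decode intervals extended with silent frames.

    Covers both cases of the claim: audio shorter than the video, and no
    audio at all (pass an empty list). Assumption: silence is added in
    frames of audio_frame_ms, with one shorter frame for any remainder.
    """
    shortfall = sum(video_intervals_ms) - sum(audio_intervals_ms)
    padded = list(audio_intervals_ms)
    if shortfall <= 0:
        return padded  # audio already covers the video; nothing to insert
    full_frames, remainder = divmod(shortfall, audio_frame_ms)
    padded.extend([audio_frame_ms] * full_frames)
    if remainder:
        padded.append(remainder)
    return padded


# Video: 10 frames of 100 ms. Audio: 45 frames of 20 ms (100 ms short).
short_audio = pad_audio_intervals([100] * 10, [20] * 45, 20)
# Video with no audio track at all.
no_audio = pad_audio_intervals([100, 100, 50], [], 20)
print(len(short_audio), sum(short_audio), sum(no_audio))
```

After padding, the audio total equals the video total, so a clip appended afterwards starts with picture and sound aligned.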

The data editing apparatus according to claim 5, wherein
the frames of the image data and the frames of the audio data of the first moving image data have decoding intervals that are neither divisors nor multiples of each other, and
the connecting means, when generating the third moving image data, calculates the total reproduction time of all frames of each of the image data and the audio data of the first moving image data from the respective interval time information and compares the two totals, and, when the totals differ, rewrites the time table in the header of the third moving image data so as to extend or shorten the decoding interval of the image data of the first moving image data so that the totals match.
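
One way to picture the interval adjustment in the claim above is sketched below. The claim only says that the image decoding interval is extended or shortened; applying the whole difference to the last image frame, and the function name `align_video_to_audio`, are assumptions chosen to keep the example short.

```python
def align_video_to_audio(video_intervals_ms, audio_intervals_ms):
    """Return image decode intervals whose total matches the audio total.

    Assumption: the entire difference is absorbed by the last image frame.
    """
    diff = sum(audio_intervals_ms) - sum(video_intervals_ms)
    adjusted = list(video_intervals_ms)
    if diff == 0 or not adjusted:
        return adjusted
    if adjusted[-1] + diff <= 0:
        raise ValueError("difference too large to absorb in the last frame")
    adjusted[-1] += diff  # extend (diff > 0) or shorten (diff < 0)
    return adjusted


# Image frames of 66 or 67 ms against audio frames of 30 ms:
# neither interval is a divisor or multiple of the other.
aligned = align_video_to_audio([66, 67, 67], [30] * 7)
print(aligned, sum(aligned))
```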

The data editing apparatus according to claim 2, wherein
the operation means is capable of designating a cutout range in the moving image data stored in the storage means and instructing that a new moving image be cut out,
the apparatus further comprises:
first search means for searching, when a cutout start position in fourth moving image data is designated by the operation means, for the reference frame closest to the designated position, and for searching, when a cutout end position is designated, for the difference frame immediately preceding the reference frame closest to the designated position; and
first extraction means for extracting the frames of the image data in any one of the following ranges and generating new fifth moving image data having the extracted image frames: from the reference frame specified as the cutout start position by the first search means to the end of the image frames of the fourth moving image data; from the beginning of the image frames of the fourth moving image data, from the reference frame specified as the cutout start position by the first search means to the difference frame immediately preceding the reference frame specified as the cutout end position; or from the beginning of the image frames of the fourth moving image data to the difference frame immediately preceding the reference frame specified as the cutout end position by the search means,
the storing means stores the fifth image data generated by the first extraction means in the storage means, and
the first extraction means extracts, from the time table included in the header of the fourth moving image data, the interval time information indicating the decoding intervals of the frames it has extracted, generates a new time table including the extracted interval time information and a header having the new time table, and adds the generated header to the image data it has extracted to generate the fifth moving image data.
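
The search for cutout boundaries in the claim above can be illustrated as follows. This sketch assumes zero-based frame indices, resolves ties toward the earlier reference frame, and reads the three ranges as start-to-last-frame, start-to-end, and first-frame-to-end; the function names are hypothetical.

```python
def nearest_reference(ref_positions, target):
    """Return the reference-frame index closest to target.

    Assumption: on a tie, the earlier reference frame wins.
    """
    return min(ref_positions, key=lambda pos: (abs(pos - target), pos))


def cutout_range(ref_positions, frame_count, start=None, end=None):
    """Return (first, last) inclusive frame indices of the cutout.

    The cut begins on a reference frame and ends on the difference frame
    just before a reference frame, so the result can be decoded without
    re-encoding.
    """
    first = nearest_reference(ref_positions, start) if start is not None else 0
    if end is not None:
        last = nearest_reference(ref_positions, end) - 1
    else:
        last = frame_count - 1
    if last < first:
        raise ValueError("cutout range is empty")
    return first, last


refs = [0, 10, 20, 30]            # reference frames in a 40-frame clip
both = cutout_range(refs, 40, start=12, end=27)
start_only = cutout_range(refs, 40, start=12)
end_only = cutout_range(refs, 40, end=27)
print(both, start_only, end_only)
```

The decoding intervals for the new time table would then be the slice of the original interval list between `first` and `last`, inclusive.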

The data editing apparatus according to claim 2, wherein
the operation means is capable of designating a cutout range in the moving image data stored in the storage means and instructing that a new moving image be cut out,
the apparatus further comprises:
second search means for searching, when a cutout start position in fourth moving image data is designated by the operation means, for the reference frame closest to the designated position, and for further searching for the frames within a predetermined time range starting from that reference frame; and
second extraction means for extracting the frames in the range found by the second search means and generating new sixth moving image data having the extracted image frames,
the storing means stores the sixth image data generated by the second extraction means in the storage means, and
the second extraction means extracts, from the time table included in the header of the fourth moving image data, the interval time information indicating the decoding intervals of the frames it has extracted, generates a new time table including the extracted interval time information and a header having the new time table, and adds the generated header to the image data it has extracted to generate the sixth moving image data.

The data editing apparatus according to claim 9, wherein
interval time information indicating the decoding interval of each frame of the plurality of image frames in each of the fourth moving image data is given, and
the second search means refers to the time table included in the header of the fourth moving image data stored in the storage means, and identifies the frames in the predetermined time range by sequentially adding up the interval time information of each frame from the starting reference frame onward.
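
The sequential addition described in the claim above might look like the sketch below. The millisecond unit and the 15-second limit (taken from the transmission limit mentioned in the description) are illustrative assumptions, as is the function name `frames_within`.

```python
def frames_within(intervals_ms, start_index, limit_ms):
    """Return (start, stop) as a half-open frame range.

    Starting at the reference frame start_index, decode intervals are added
    one by one; the range stops before the frame that would push the total
    past limit_ms.
    """
    elapsed = 0
    stop = start_index
    for interval in intervals_ms[start_index:]:
        if elapsed + interval > limit_ms:
            break
        elapsed += interval
        stop += 1
    return start_index, stop


# 200 frames at 100 ms each; cut at most 15 s starting from frame 20.
clip_range = frames_within([100] * 200, 20, 15_000)
print(clip_range)
```

Because only the time table is scanned, the range can be found without decoding any frame.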

The data editing apparatus according to claim 10, wherein
the second search means refers to the time table included in the header of the fourth moving image data stored in the storage means, and searches for a difference frame that comes before the predetermined time has elapsed when reproduction starts from the starting reference frame and that lies within the predetermined time range.

The data editing apparatus according to any one of claims 9 to 11, further comprising
communication means capable of transmitting the moving image data stored in the storage means to another device, wherein
the predetermined time is a value within the range that the communication means can transmit.

A data editing method for editing moving image data stored in storage means that stores a plurality of moving image data, each comprising a plurality of image frames and a header describing a time table that includes information used when the plurality of image frames are decoded continuously, the method comprising:
a step of designating, among the plurality of moving image data stored in the storage means, the moving image data to be joined;
a step of, when first moving image data and second moving image data are designated, generating third moving image data by concatenating the data of the second moving image data, with its header removed, after the first moving image data;
a step of storing the generated third image data in the storage means; and
a step of, when generating the third moving image data, rewriting the time table in the header of the first moving image data, based on the information included in the time table in the header of the second moving image data, so that the second moving image data continues from the first moving image data, and using the rewritten header as the header of the third moving image data.
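
As a rough illustration of the method above, the sketch below joins two clips by editing header tables only. The `Clip` layout (one interval per frame, a list of reference-frame indices, opaque encoded frames) is a hypothetical model chosen for the example, not the container format the claims refer to.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    intervals_ms: list    # time table: decode interval of each frame
    ref_positions: list   # time table: indices of reference frames
    frames: list          # encoded frame payloads, never decoded here


def concatenate(first, second):
    """Join second after first, reusing first's header with a rewritten table.

    The second clip's header is dropped; its timing entries are appended and
    its reference-frame indices are shifted by the first clip's frame count.
    """
    offset = len(first.frames)
    return Clip(
        intervals_ms=first.intervals_ms + second.intervals_ms,
        ref_positions=first.ref_positions
        + [pos + offset for pos in second.ref_positions],
        frames=first.frames + second.frames,
    )


clip_a = Clip([100, 100], [0], [b"I", b"P"])
clip_b = Clip([50, 50, 50], [0, 2], [b"I", b"P", b"I"])
joined = concatenate(clip_a, clip_b)
print(joined.ref_positions, joined.intervals_ms)
```

The inputs are left unmodified, so the joined clip can be previewed before anything is written back to storage.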