JP2002532996A - Web-based video editing method and system

Info

Publication number: JP2002532996A
Application number: JP2000588693A
Authority: JP
Inventors: イー．オルロフ，マックス; ディー．チクロフスキー，ディミトリ; サモイロフ，マイケル
Original assignee: ジェイエーヴィユーテクノロジーズ，インク．
Priority date: 1998-12-15
Filing date: 1999-12-15
Publication date: 2002-10-02
Also published as: EP1157338A1; WO2000036516A9; CA2352962A1; AU3124500A; WO2000036516A1

Abstract

(57) [Summary] A network-based system (30) is provided for processing multimedia information. A server, preferably a group of servers (20, 22, 24), is coupled to the network and incorporates a multimedia toolkit (34), or engine, that enables the creation, editing, viewing and management of multimedia objects, such as files each containing at least one of video, images, audio and animation. Each server includes memory for the multimedia objects. A client (30) with access to the network, preferably a personal computer and preferably one of a plurality of clients, incorporates a multimedia editing interface (32), preferably a graphical user interface, that enables the client to send multimedia processing instructions, drawn from a predetermined set of such instructions, to the server over the network. Preferably, a client can send a series of instructions at once, and different clients can each send instructions to be performed on the same object. The multimedia engine in each server acts on the multimedia processing instructions received from one or more clients by performing the corresponding processing operations on multimedia objects previously stored in the server's memory by the clients, and makes the server-processed multimedia objects available to the clients over the network.

Description

[Detailed Description of the Invention]

[0001]

[Field of Industrial Application]

This invention was made in part with government funding under a grant from DARPA. The government has certain rights in portions of this invention.

[0002] The present invention relates generally to multimedia software and, more particularly, to a library used to build processing-intensive multimedia software for web-based video editing applications.

[0003]

[Prior Art and Problems to Be Solved by the Invention]

The multimedia research community has traditionally focused its efforts on the compression, transfer, storage and display of multimedia data. These technologies are of fundamental importance to applications such as video conferencing and video on demand, and these efforts have led to many commercial products. For example, JPEG and MPEG, described below, are ubiquitous standards for image and audio/video compression. However, problems remain in content-based retrieval and understanding, video production, and transcoding for heterogeneity and bandwidth adaptation. The lack of a high-performance toolkit that can be used to build processing-intensive multimedia applications has hindered the development of multimedia applications. In video editing in particular, large amounts of data must be stored, accessed and manipulated efficiently. Solutions to the problem of storing video data include client-server applications and editing over the World Wide Web (Web). However, existing multimedia toolkits do not perform well enough to make these applications practical.

[0004] The data standards GIF, JPEG and MPEG dominate image and video data in the current state of the art. GIF (Graphics Interchange Format) is a bitmap graphics file format commonly used on the Web. JPEG (Joint Photographic Experts Group) is an internationally recognized standard for image data. JPEG is designed to compress full-color or grayscale still images. For video data, including audio data, the international standard is MPEG (Moving Picture Experts Group). MPEG is in fact a general reference to an evolving series of standards. For simplicity, the various MPEG versions will be referred to as the "MPEG standard" or simply "MPEG". The MPEG standard achieves highly efficient data compression by storing only the changes from one frame to the next instead of each entire image.

[0005] The MPEG standard has four types of image coding for processing: I-frames, P-frames, B-frames and D-frames (the last present in earlier versions of MPEG but absent from more recent standards).

[0006] I-frames (intra-coded images) are self-contained, that is, they are encoded without reference to other images. An I-frame is treated as a still image, and MPEG uses the JPEG standard to encode it. Compression in MPEG is often performed in real time, and the compression ratio of I-frames is the lowest within the MPEG standard. I-frames serve as the random-access points in an MPEG stream.

[0007] P-frames (predictive-coded frames) require information from the preceding I-frame and/or all preceding P-frames in the MPEG stream for encoding and decoding. The coding of P-frames is based on the principle that, in successive images, image areas tend to shift rather than change.

[0008] B-frames (bidirectionally predictive-coded frames) require information from both the preceding and the following I-frames and/or P-frames in the MPEG stream for encoding and decoding. B-frames have the highest compression ratio within the MPEG standard.

[0009] D-frames (DC-coded frames) are intra-frame coded. D-frames are absent from more recent versions of the MPEG standard; however, applications must still handle D-frames when working with older MPEG versions. A D-frame consists only of the lowest frequencies of an image. D-frames are used for display in fast-forward and fast-rewind modes. These modes could also be achieved using a suitable sequence of I-frames.

[0010] Video information is encoded in the MPEG standard using the DCT (discrete cosine transform). This technique represents data as a weighted sum of cosines. The DCT is also used for data compression in the JPEG standard.
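As an illustration of the "weighted sum of cosines" mentioned above, the standard one-dimensional DCT-II and its inverse can be written as follows. This is the textbook unnormalized form, given here as background rather than taken from the patent; JPEG and MPEG apply a scaled two-dimensional version to 8 × 8 blocks. The symbols $x_n$ (samples), $X_k$ (coefficients) and $N$ (block length) are notation introduced for this sketch.

```latex
X_k = \sum_{n=0}^{N-1} x_n \cos\!\left[\frac{\pi}{N}\left(n + \tfrac{1}{2}\right)k\right],
\qquad k = 0, \dots, N-1

x_n = \frac{1}{N} X_0 + \frac{2}{N} \sum_{k=1}^{N-1} X_k
      \cos\!\left[\frac{\pi}{N}\left(n + \tfrac{1}{2}\right)k\right],
\qquad n = 0, \dots, N-1
```

The second equation shows the weighted sum directly: each sample is rebuilt from cosines whose weights are the coefficients $X_k$.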

[0011] At present, there are several unsatisfactory options to choose from to make up for the lack of a high-performance multimedia toolkit. First, code could be developed from scratch as needed to solve a particular problem, but this is difficult given complex multimedia standards such as JPEG and MPEG. Second, existing code can be modified, but this results in systems that are complex, unmanageable, and generally difficult to maintain, debug and reuse. Third, existing standard libraries, such as ooMPEG for the MPEG standard or the Independent JPEG Group (IJG) library for the JPEG standard, could be used, but the details of the functions in these libraries are hidden, and only limited optimization can be performed.

[0012] It is desirable to have a high-performance toolkit for multimedia processing.

[0013] It is an object of the present invention to provide a method and apparatus that enable client-server video editing.

[0014] It is another object of the present invention to provide a method and apparatus that enable web-based video editing.

[0015]

[Means for Solving the Problems]

According to the present invention, a network-based system is provided for processing multimedia information. A server, preferably a group of servers, is coupled to the network and incorporates a multimedia toolkit, or engine, that enables the creation, editing, viewing and management of multimedia objects, such as files each containing at least one of video, images, audio and animation. Each server includes memory for the multimedia objects. A client with access to the network, preferably a personal computer and preferably one of a plurality of clients, incorporates a multimedia editing interface, preferably a graphical user interface (GUI), that enables the client to send multimedia processing instructions, drawn from a predetermined set of such instructions, to the server over the network. Preferably, a client can send a series of instructions at once, and multiple clients can each send instructions to be performed on the same object. The multimedia engine in each server acts on the multimedia processing instructions received from one or more clients by performing the corresponding processing operations on multimedia objects previously stored in the server's memory by the clients, and the server makes the processed multimedia objects available to the clients over the network.

[0016] Preferably, the servers are managed by a load-balancing daemon or the like, so that clients are recognized, instructions are routed to an appropriate server on the basis of various criteria including performance and feature availability, and each server has access to the memory containing the object to be processed. It is intended that authorized clients be able to manage such criteria.

[0017] The present invention, together with the above and other advantages, can best be understood from the following detailed description of embodiments of the invention illustrated in the drawings.

[0018]

[Embodiments of the Invention]

FIG. 1 shows a client/server web-based video editing system 10. A client computer (client) 15 can connect to a plurality of servers 20, 22, 24 via the Web 30. The client 15 runs a video editing user interface 32. In the present embodiment the Web 30 is used to connect the client 15 to the servers 20, 22, 24; in other embodiments, however, other types of network could be used to form the client-server connection. The servers 20, 22, 24 store audiovisual data and have multimedia processing applications.

[0019] In operation, the client 15 sends an instruction concerning video processing to at least one of the servers 20, 22, 24 via the Web 30. At least one of the servers 20, 22, 24 receives the instruction, parses it, performs the requested video processing, and sends the result to the client 15.

[0020] A more detailed sequence of operations is as follows. The user sends a request from the client 15 to a server (A, B or C: 20, 22 or 24), establishing a connection over the Web 30 between the video editing user interface and the video processing server with the multimedia editing toolkit 34. The server, selected for example by a load-balancing daemon, recognizes the user as a client and waits for user instructions. Using the video editing user interface 32, implemented for example in cross-platform Java or as a platform-specific client, the user sends an instruction requesting that some operation be performed on a particular media object stored on the server. The server validates the instruction and locates the requested media object. The server locates the appropriate toolkit (34), which contains the library for the requested operation, and performs the operation, modifying the object. The server sends feedback to the client, for example via media streaming or another transport protocol, reporting the operation performed so that the result is displayed in the user interface. This process repeats until the user closes the connection.
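The server-side part of this sequence (validate the instruction against the predetermined set, locate the media object, perform the operation) can be sketched roughly as below. This is only an illustration under stated assumptions: `MediaObject`, `handle_instruction`, the two operations and the status codes are hypothetical names invented for the sketch, and networking, client recognition and load balancing are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical media object held in server memory. */
typedef struct {
    const char *name;
    int width;
    int height;
} MediaObject;

/* An editing operation modifies the object in place; returns 0 on success. */
typedef int (*EditOp)(MediaObject *obj, int arg);

static int op_shrink(MediaObject *obj, int factor) {
    if (factor <= 0 || obj->width % factor != 0 || obj->height % factor != 0)
        return -1;
    obj->width /= factor;
    obj->height /= factor;
    return 0;
}

static int op_crop_height(MediaObject *obj, int new_height) {
    if (new_height <= 0 || new_height > obj->height)
        return -1;
    obj->height = new_height;
    return 0;
}

/* The predetermined instruction set: anything not listed is rejected. */
static const struct { const char *name; EditOp op; } INSTRUCTIONS[] = {
    { "shrink", op_shrink },
    { "crop_height", op_crop_height },
};

enum { EDIT_OK = 0, EDIT_BAD_INSTRUCTION = -1, EDIT_NO_OBJECT = -2, EDIT_FAILED = -3 };

/* Validate the instruction, locate the object, then run the operation. */
int handle_instruction(MediaObject *store, size_t count,
                       const char *instruction, const char *object_name, int arg) {
    assert(store != NULL && instruction != NULL && object_name != NULL);
    EditOp op = NULL;
    for (size_t i = 0; i < sizeof INSTRUCTIONS / sizeof INSTRUCTIONS[0]; i++)
        if (strcmp(INSTRUCTIONS[i].name, instruction) == 0)
            op = INSTRUCTIONS[i].op;
    if (op == NULL)
        return EDIT_BAD_INSTRUCTION;
    for (size_t i = 0; i < count; i++)
        if (strcmp(store[i].name, object_name) == 0)
            return op(&store[i], arg) == 0 ? EDIT_OK : EDIT_FAILED;
    return EDIT_NO_OBJECT;
}
```

The returned status code stands in for the feedback the server would send back to the client interface.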

[0021] In accordance with the principles of the present invention, the servers 20, 22, 24 have a high-performance toolkit (or library) 34 as part of their video processing application. The invention is described with reference to the MPEG standard; however, the principles of the invention also apply to other data standards. In addition, MPEG is an evolving standard, and the principles of the invention also apply to MPEG standards yet to be developed.

[0022] In summary, FIG. 1 describes an implementation of an online editing and browsing system. The client-server application lets users edit, view and manage video, images, audio, animation and text over a network. The system is a multi-component architecture comprising one or more servers at the back end and one or more clients. The system concentrates operations on the server side, driven by instructions sent from the client to the server over the network. The basic idea of the client is to interact with the user. The client incorporates a video editing interface that lets the user issue a predetermined set of instructions, send them to the server, and receive the results for display. All or most of the actual data processing takes place on the server, which provides the main data processing functionality and handles communication with remote clients.

[0023] The high-performance toolkit 34 provides code whose performance is competitive with hand-tuned C code, that allows optimizations to be made without breaking open abstractions, and that can be composed by the user in unforeseen ways. To achieve high-performance multimedia data processing with predictable performance, resource control, and replaceability and extensibility (i.e., usability in many applications), the present invention provides a toolkit, or API, designed with the following characteristics.

[0024] The first characteristic of the toolkit 34 is resource control. Resource control refers to language-level control over when I/O is performed and over memory allocation, including the reduction and/or elimination of unnecessary memory allocation. No toolkit routine of the present invention implicitly allocates memory or performs I/O. The few toolkit primitives that do perform I/O are those that load or store BitStream data. A BitStream is the actual stream of multimedia data; the MPEG bitstream is described below. All other toolkit primitives of the invention use a BitStream as their data source. The user has full control over memory use and I/O. This gives the user tight control over performance-critical resources, an essential feature for writing applications with predictable performance. The toolkit also gives the user mechanisms to optimize programs using techniques such as data-copy avoidance and to structure programs for good cache behavior.

[0025] Separating I/O in the present invention has three advantages. First, it makes the I/O method in use transparent to the toolkit primitives. Conventional libraries generally integrate processing and I/O. A library that integrates file I/O with its processing is difficult to use in a networked environment, because network I/O behaves differently from file I/O. Second, separating I/O also gives control over when I/O is performed. This makes it possible, for example, to build a multithreaded implementation of the toolkit that uses a double-buffering scheme to read and process data concurrently. Third, isolating the I/O calls makes the performance of the remaining functions more predictable.
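A minimal sketch of this separation, assuming a caller-supplied read callback: the processing primitive touches only memory handed to it, while the caller decides where bytes come from (file, socket, or memory) and owns the buffer. The names `ReadFn`, `MemSource`, `mem_read`, `sum_bytes` and `sum_stream` are hypothetical and not part of the toolkit described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Processing primitive: only caller-owned memory, no I/O, no allocation. */
static uint32_t sum_bytes(const uint8_t *buf, size_t n) {
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];
    return total;
}

/* Caller-chosen I/O: fills buf with up to cap bytes, returns 0 at end of data. */
typedef size_t (*ReadFn)(void *ctx, uint8_t *buf, size_t cap);

/* One possible source: an in-memory array. A file or socket reader
   could be substituted without touching sum_bytes(). */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t pos;
} MemSource;

static size_t mem_read(void *ctx, uint8_t *buf, size_t cap) {
    MemSource *src = ctx;
    size_t n = src->len - src->pos;
    if (n > cap)
        n = cap;
    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return n;
}

/* Driver loop: the caller supplies both the I/O method and the buffer. */
uint32_t sum_stream(ReadFn read_fn, void *ctx, uint8_t *buf, size_t cap) {
    assert(read_fn != NULL && buf != NULL && cap > 0);
    uint32_t total = 0;
    size_t n;
    while ((n = read_fn(ctx, buf, cap)) > 0)
        total += sum_bytes(buf, n);
    return total;
}
```

Because the read step is isolated, a double-buffered variant would only need to change the driver loop, not the primitive.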

[0026] The toolkit of the present invention provides two mechanisms for sharing memory between abstractions, that is, data objects. These mechanisms are called clipping and casting.

[0027] In clipping, one object "borrows" memory from another object of the same type. An example use of clipping is shown in FIG. 2. In FIG. 2, the ByteImage abstraction (a 2D array of values in the range 0..255) is used to create a black box 60 and then a gray box 62; the images, which share memory, are then combined as a black box with a gray box inside (64), using the following pseudocode:

    set a [byte-new 100 100]         *** create a new ByteImage of size 100 x 100 ***
    byte-set $a 0                    *** initialize the ByteImage to 0 (all black) ***
    set b [byte-clip $a 30 30 20 20] *** create a small 20 x 20 ByteImage at position (30, 30) inside $a; a and b share the same memory ***
    byte-set $b 128                  *** initialize the small box to gray ***
    byte-display $b                  *** display the small gray box ***
    byte-display $a                  *** display the black box with the small gray box ***
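The clipping pseudocode above can be approximated in C as follows. This is a sketch under assumptions, not the toolkit's actual API: the `ByteImage` layout (with a `stride` field and an `owns_data` flag) and the function names `byte_new`, `byte_clip`, `byte_set` and `byte_free` are invented for illustration. The point it shows is that a clip fills in only a header and points into the parent's pixels.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint8_t *data;   /* first pixel of this image */
    int width;
    int height;
    int stride;      /* bytes between the starts of consecutive rows */
    int owns_data;   /* 1 if byte_free() should release data */
} ByteImage;

/* Explicitly allocate pixel memory; returns 0 on success, -1 on failure. */
int byte_new(ByteImage *img, int width, int height) {
    assert(img != NULL && width > 0 && height > 0);
    img->data = malloc((size_t)width * (size_t)height);
    if (img->data == NULL)
        return -1;
    img->width = width;
    img->height = height;
    img->stride = width;
    img->owns_data = 1;
    return 0;
}

/* Fill in a header only: the clip borrows the parent's pixels. */
void byte_clip(const ByteImage *parent, int x, int y,
               int width, int height, ByteImage *clip) {
    assert(parent != NULL && clip != NULL);
    assert(x >= 0 && y >= 0 && width > 0 && height > 0);
    assert(x + width <= parent->width && y + height <= parent->height);
    clip->data = parent->data + (size_t)y * (size_t)parent->stride + (size_t)x;
    clip->width = width;
    clip->height = height;
    clip->stride = parent->stride;   /* rows are still spaced as in the parent */
    clip->owns_data = 0;
}

/* Set every pixel of the image (or clip) to one value, row by row. */
void byte_set(ByteImage *img, uint8_t value) {
    for (int row = 0; row < img->height; row++)
        memset(img->data + (size_t)row * (size_t)img->stride, value,
               (size_t)img->width);
}

/* Release pixel memory only if this header owns it. */
void byte_free(ByteImage *img) {
    if (img->owns_data)
        free(img->data);
    img->data = NULL;
}
```

Calling `byte_set` on a 20 × 20 clip at (30, 30) changes exactly those pixels of the 100 × 100 parent, which mirrors the gray box inside the black box of FIG. 2.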

[0028] The clipping function is cheap in terms of memory use because it allocates only an image header structure, and it is provided for all toolkit image and audio data types. Clipping is useful for avoiding unnecessary copying or processing of data. For example, if the user wants to decode only part of a grayscale image in an MPEG I-frame, the user could create a clipped DCTImage. A DCTImage is an array of DCT vectors, and the clipped DCTImage is an image containing a subset of the DCT blocks from the decoded I-frame. The user then performs an IDCT (inverse discrete cosine transform) on the clipped image to complete the decoding process. The advantage of this approach is that it avoids performing the IDCT on encoded data that will not be used.

[0029] Casting refers to sharing memory between objects of different types. Because all I/O is done through a BitStream, casting is generally used for I/O. For example, when a grayscale image file is read into a BitStream, the header is parsed and the remaining data is cast to a ByteImage. Casting avoids unnecessary copying of data.
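A rough sketch of casting, under loudly stated assumptions: the file layout used here (a 4-byte header holding big-endian 16-bit width and height, followed by raw pixels) is a hypothetical format chosen to keep the example short, and `BitStream`, `ByteImage` and `bitstream_cast_to_byte` are illustrative names rather than the toolkit's real definitions. After the header is parsed, the image simply points at the remaining bytes of the stream, so no pixel data is copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes already loaded by the caller; the only object that touched I/O. */
typedef struct {
    uint8_t *bytes;
    size_t length;
} BitStream;

typedef struct {
    uint8_t *data;
    int width;
    int height;
} ByteImage;

/* Hypothetical header: 2-byte big-endian width, then 2-byte big-endian height. */
enum { RAW_HEADER_SIZE = 4 };

/* Parse the header and cast the rest of the stream to a ByteImage.
   Returns 0 on success, -1 if the stream is too short. */
int bitstream_cast_to_byte(const BitStream *bs, ByteImage *img) {
    assert(bs != NULL && bs->bytes != NULL && img != NULL);
    if (bs->length < (size_t)RAW_HEADER_SIZE)
        return -1;
    int width = (bs->bytes[0] << 8) | bs->bytes[1];
    int height = (bs->bytes[2] << 8) | bs->bytes[3];
    if (bs->length - (size_t)RAW_HEADER_SIZE < (size_t)width * (size_t)height)
        return -1;
    img->data = bs->bytes + RAW_HEADER_SIZE;   /* shared memory, no copy */
    img->width = width;
    img->height = height;
    return 0;
}
```

Since the image and the stream share memory, the stream's buffer must outlive the image; managing that lifetime is the caller's job in this sketch.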

[0030] In the toolkit, the user explicitly allocates and frees all significant memory resources using new and free primitives, for example ByteImageNew and ByteImageFree. Functions never allocate temporary memory. If such memory (e.g., scratch space) is needed to complete an operation, the user must allocate it and pass it to the routine as a parameter. Explicit memory allocation lets the user reduce or eliminate paging and makes application performance more predictable.

[0031] Consider, for example, the function ByteCopy, which copies one ByteImage to another. A potential problem is that the two ByteImages may overlap, for example if they share memory through clipping. A prior-art way of implementing ByteCopy is as follows:

    ByteCopy(src, dest) {
        temp = malloc();
        memcpy src to temp;
        memcpy temp to dest;
        free(temp);
    }

This implementation allocates a temporary buffer, copies the source into the temporary buffer, copies the temporary buffer to the destination, and frees the temporary buffer. In contrast, the operation using the toolkit of the present invention is as follows:

    ByteCopy(src, dest) {
        memcpy src to dest;
    }

    temp = ByteNew();
    ByteCopy(src, temp);
    ByteCopy(temp, dest);
    ByteFree(temp);

The toolkit's ByteCopy operation assumes that the source and destination do not overlap and simply copies the source to the destination. The user must determine whether the source and destination overlap; if they do, the user must allocate a temporary ByteImage and make two ByteCopy calls, as shown above.
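The division of labor described above (a copy primitive that assumes no overlap, with the caller detecting overlap and supplying scratch memory) might look like this in C. The sketch works on flat byte regions rather than full ByteImages to stay short, and the names `regions_overlap`, `byte_copy` and `byte_copy_overlapping` are hypothetical. Standard C's `memmove` would also handle overlap, but it is deliberately not used here so that the example mirrors the design in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the n-byte regions starting at a and b share any byte. */
int regions_overlap(const uint8_t *a, const uint8_t *b, size_t n) {
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    return pa < pb ? (pb - pa) < n : (pa - pb) < n;
}

/* Primitive: copies src to dest, assuming the regions do not overlap.
   It never allocates memory. */
void byte_copy(const uint8_t *src, uint8_t *dest, size_t n) {
    assert(!regions_overlap(src, dest, n));
    memcpy(dest, src, n);
}

/* Overlapping case: the caller provides n bytes of scratch space and
   the copy is done as two non-overlapping byte_copy() calls. */
void byte_copy_overlapping(const uint8_t *src, uint8_t *dest, size_t n,
                           uint8_t *scratch) {
    byte_copy(src, scratch, n);
    byte_copy(scratch, dest, n);
}
```

The cost of the extra buffer and second copy is visible at the call site, and is paid only by callers that actually have overlapping regions.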

[0032] The second characteristic of the toolkit of the present invention is that it has "thin" primitives. The toolkit breaks complex functions down into simple functions that can be layered. This promotes code reuse and allows optimizations that would otherwise be difficult to exploit. For example, to decode a JPEG image, the toolkit provides three primitives: (1) a function that decodes the bitstream into three DCTImages, one for each color component (a DCTImage is an array of DCT vectors); (2) a function that converts each DCTImage into a ByteImage (a simple image whose pixels lie in the range 0..255); and (3) a function that converts from the YUV color space to the RGB color space.
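As an illustration of the third primitive, a standalone color-space conversion on caller-provided planes could be sketched as below. The coefficients are the usual JFIF YCbCr-to-RGB constants, which is an assumption on my part: the text says only "YUV to RGB" and does not give the formula. The function and parameter names are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round to nearest and clamp to the 0..255 pixel range. */
static uint8_t clamp_to_byte(double v) {
    if (v < 0.0)
        return 0;
    if (v > 255.0)
        return 255;
    return (uint8_t)(v + 0.5);
}

/* Convert n pixels from Y/Cb/Cr planes to R/G/B planes.
   All six buffers are owned by the caller; nothing is allocated here.
   Coefficients follow the JFIF convention (assumed, see lead-in). */
void yuv_to_rgb(const uint8_t *y, const uint8_t *cb, const uint8_t *cr,
                uint8_t *r, uint8_t *g, uint8_t *b, size_t n) {
    assert(y != NULL && cb != NULL && cr != NULL);
    assert(r != NULL && g != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        double u = cb[i] - 128.0;   /* chroma is stored with a +128 offset */
        double v = cr[i] - 128.0;
        r[i] = clamp_to_byte(y[i] + 1.402 * v);
        g[i] = clamp_to_byte(y[i] - 0.344136 * u - 0.714136 * v);
        b[i] = clamp_to_byte(y[i] + 1.772 * u);
    }
}
```

Keeping this step separate is what allows a caller that wants only grayscale output to skip it and use the luminance plane directly.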

[0033] Exposing this structure has several advantages. First, it promotes code reuse. For example, the inverse DCT and color-space conversion functions are shared by the JPEG and MPEG routines. Second, it enables optimizations that would otherwise be difficult to exploit. One such optimization is compressed-domain processing. Another example is decoding a JPEG image to a grayscale image, in which case only the one DCTImage representing the grayscale component needs to be decompressed.

[0034] Many toolkit primitives of the present invention implement special cases of more general operations. The special cases can be combined to achieve the same functionality as the general operation, and they have simple, fast implementations whose performance is predictable. ByteCopy is one such primitive: only the special case of non-overlapping images is implemented.

[0035] Another example is image resizing (shrinking or enlarging an image). Instead of providing a single primitive that resizes an image by an arbitrary factor, the toolkit provides five primitives for shrinking an image (Shrink4x4, Shrink2x2, Shrink2x1, Shrink1x2 and ShrinkBilinear) and five primitives for enlarging an image. Each primitive is highly optimized and performs a specific task. For example, Shrink2x2 shrinks an image by a factor of 2 in each dimension. It is implemented as a very fast operation that repeatedly adds four pixel values together and shifts the result. Similar implementations are provided for Shrink4x4, Shrink2x1 and Shrink1x2. In contrast, the function ShrinkBilinear shrinks an image by a factor between 1 and 2 using bilinear interpolation. Although arbitrary resizing can be achieved by combining these primitives, splitting them into specialized operations makes performance predictable, exposes the cost more clearly to the user, and lets the user build very fast implementations.

[0036] A drawback of specialization in the present invention is that it leads to an explosion in the number of functions in the API. Sometimes, however, several primitives can be combined into one without sacrificing performance, which considerably reduces the number of primitives in the API. This principle is called generalization.

[0037] A good example of generalization is found in the primitives that process AudioBuffers. An AudioBuffer stores mono or stereo audio data. Stereo samples from the left and right channels are interleaved in memory, as shown in FIG. 3.

[0038] Suppose the user wants to implement an operation that raises the volume on one channel (i.e., balance control). One possible design is to provide one primitive that processes the left channel and another that processes the right channel, as in the following code:

    process-left:   for (i = 0; i < n; i += 2) { process x[i] }
    process-right:  for (i = 1; i < n; i += 2) { process x[i] }

[0039] However, the two operations can be combined without loss of performance by changing the initialization of the loop variable (1 for right, 0 for left). This implementation is shown in the following code:

    process(offset): for (i = offset; i < n; i += 2) { process x[i] }
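The generalized loop can be sketched as runnable Python. This is an illustration under stated assumptions: the function name `scale_channel`, the `gain` parameter and the use of a plain list for the interleaved samples are hypothetical and do not come from the toolkit.

```python
def scale_channel(samples, offset, gain):
    """Scale every second sample in place, starting at `offset`.

    `samples` holds interleaved stereo data (L, R, L, R, ...);
    offset 0 selects the left channel and offset 1 the right.
    """
    for i in range(offset, len(samples), 2):
        samples[i] = int(samples[i] * gain)
    return samples
```

A single primitive with an offset argument replaces the two channel-specific primitives, and the loop body is unchanged, so the generalization costs nothing at run time.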

[0040] In general, specialization is recommended when it gives better performance. Otherwise, generalization should be used to reduce the number of functions in the API.

[0041] A third property of the toolkit of the present invention is exposed structure. Most libraries try to hide the details of encoding algorithms from the user while providing a simple, high-level API. In contrast, the present invention exposes the structure of compressed data in two ways.

[0042] First, the toolkit exposes the intermediate structures in the decoding process. For example, instead of decoding an MPEG frame directly into RGB format, the toolkit breaks the process into three steps: (1) bitstream decoding (Huffman decoding and inverse quantization), (2) frame reconstruction (motion compensation and IDCT), and (3) color space conversion. For example, the MpegPicParseP function parses a P-frame from a BitStream and writes the results into three DCTImages and one VectorImage. A second primitive reconstructs pixel data from the DCTImage and VectorImage data, and a third converts between color spaces. The important point is that the toolkit exposes the intermediate data structures, which allows the user to exploit optimizations that are normally impossible. For example, to decode grayscale data, one simply skips the frame reconstruction step on the Cr/Cb planes. Furthermore, compressed-domain processing techniques can be applied to the DCTImage or VectorImage structures.

[0043] The toolkit of the present invention also exposes the structure of the underlying bitstream. The toolkit provides operations for finding structural elements in compressed bitstreams such as MPEG, JPEG and GIF. This feature allows the user to exploit knowledge of the underlying bitstream structure for better performance. For example, a program that searches for an event in an MPEG video stream might cull the data by initially examining only the I-frames, since they are easily (and quickly) parsed and compressed-domain techniques can be applied to them. In some circumstances this optimization can give an improvement of several orders of magnitude in performance over conventional event-searching methods, but because other libraries hide the structure of the MPEG bitstream from the user, the optimization cannot be used with them. With the present invention, the optimization is straightforward to use. The user can use the MpegPicHdrFind function to find a picture header, MpegPicHdrParse to decode it, and, if the type field of the decoded header indicates that the picture that follows is an I-frame, MpegIPicParse to decode the picture.
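The I-frame culling idea can be illustrated with a short Python sketch that scans raw MPEG-1 video data for picture headers and reads each picture's coding type. The function name `find_picture_types` is hypothetical and this is not the toolkit's MpegPicHdrFind or MpegPicHdrParse; the sketch relies only on the MPEG-1 picture header layout (start code 0x00000100, then a 10-bit temporal reference and a 3-bit picture coding type).

```python
PICTURE_START_CODE = b"\x00\x00\x01\x00"

def find_picture_types(data):
    """Return (offset, type) for each picture header found in `data`.

    The picture start code is followed by a 10-bit temporal_reference
    and a 3-bit picture_coding_type (1 = I, 2 = P, 3 = B).
    """
    names = {1: "I", 2: "P", 3: "B"}
    found = []
    pos = data.find(PICTURE_START_CODE)
    while pos != -1 and pos + 6 <= len(data):
        # The coding type sits in bits 5..3 of the second byte after the code.
        coding_type = (data[pos + 5] >> 3) & 0x07
        found.append((pos, names.get(coding_type, "?")))
        pos = data.find(PICTURE_START_CODE, pos + 4)
    return found
```

A caller interested only in I-frames can filter the result, for example `[off for off, kind in find_picture_types(data) if kind == "I"]`, and decode just those pictures.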

[0044] The toolkit provides a number of basic abstractions. These abstractions are as follows:
- ByteImage: a 2D array of values in the range 0..255.
- BitImage: a 2D array of 0/1 values.
- DCTImage: a 2D array of elements, each of which is a sequence of (index, value) pairs representing the run-length-encoded DCT blocks found in many block-based compression schemes, such as MPEG and JPEG.
- VectorImage: a 2D array of vectors, each with a horizontal and a vertical component, representing the motion vectors found in MPEG or H.261.
- AudioBuffer: a 1D array of 8-bit or 16-bit values.
- ByteLUT: a lookup table for ByteImages. A ByteLUT is applied to one ByteImage to produce another ByteImage.
- AudioLUT: a lookup table for AudioBuffers.
- BitStream/BitParser: a BitStream is a buffer for encoded data. A BitParser provides a cursor into the BitStream and functions for reading bits from and writing bits to the BitStream.
- Kernel: a 2D array of integers, used for convolution.
- Filter: a scatter/gather list that can be used to select a subset of a BitStream.

These abstractions can be used to represent common multimedia data objects. For example:
- A grayscale image can be represented using a ByteImage.
- A monochrome image can be represented using a BitImage.
- An irregularly shaped region can be represented using a BitImage.
- An RGB image can be represented using three ByteImages, all of the same size.
- A YUV image in 4:2:0 format can be represented using three ByteImages. The ByteImage representing the Y plane is twice the width and height of the ByteImages representing the U and V planes.
- The DCT blocks of a JPEG image or an MPEG I-frame, or the error terms of MPEG P- and B-frames, can be represented using three DCTImages, one for each of the Y, U and V planes of the image in the DCT domain.
- The motion vectors of MPEG P- and B-frames can be represented with a VectorImage.
- A GIF image can be represented using three ByteLUTs, one for each color map, and one ByteImage for the color-mapped pixel data.
- 8-bit or 16-bit PCM, μ-law or A-law audio data can be represented using an AudioBuffer. The audio can be stored as a single channel or can contain both left and right channels.
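As one illustration of how these abstractions interact, a ByteLUT applied to a ByteImage can be sketched in Python. The function name `apply_byte_lut` and the representation of both objects as flat byte strings are assumptions of this sketch, not the toolkit's data layout.

```python
def apply_byte_lut(lut, image):
    """Map every 8-bit value of `image` through the 256-entry `lut`,
    producing a new image and leaving the input unchanged."""
    if len(lut) != 256:
        raise ValueError("a byte lookup table needs exactly 256 entries")
    return bytes(image).translate(bytes(lut))

# A table that inverts intensities: 0 -> 255, 255 -> 0.
INVERT = bytes(255 - v for v in range(256))
```

The same table can be reused on any number of images, so a per-pixel function is computed once (256 entries) rather than once per pixel.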

[0045] The toolkit also has abstractions for storing encoding-specific information. For example, an MpegPicHdr stores the information parsed from a picture header in an MPEG-1 video bitstream. A full list of header abstractions can be found in Table 1.

[0046]

[Table 1]

[0047] Although the set of abstractions defined in the toolkit is fairly small, the set of operators that manipulate these abstractions is not.

[0048] The following examples relating to the present invention illustrate the use of the abstractions in the toolkit and demonstrate how programs are written with it. The first example shows how to use the toolkit to manipulate images. The second example shows how to use the toolkit's primitives and abstractions for MPEG decoding. The third example shows how to use a toolkit filter to demultiplex an MPEG system stream.

[0049] The first example is a simple one that uses the toolkit to manipulate images. ByteImage functions are used. A ByteImage consists of a header and a body. The header stores information such as the width and height of the ByteImage and a pointer to the body. The body is a block of memory that contains the image data. A ByteImage can be either physical or virtual. The body of a physical ByteImage is contiguous in memory, whereas a virtual ByteImage borrows its body from part of another ByteImage (called its parent). In other words, a virtual ByteImage provides a form of shared memory: changing the body of a virtual ByteImage implicitly changes the body of its parent, as seen in FIG. 2.

[0050] A physical ByteImage is created using ByteNew(w, h). A virtual ByteImage is created using ByteClip(b, x, y, w, h). The rectangular area of size w × h whose top-left corner is at (x, y) is shared between the virtual ByteImage and the physical ByteImage. The virtual/physical distinction applies to all image types in the toolkit. For example, a virtual DCTImage can be created to decode a subset of a JPEG image.
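The physical/virtual distinction can be modelled in a few lines of Python. This is a sketch under stated assumptions: the class below only mimics the described behavior of ByteNew and ByteClip, and its method names (`clip`, `get`, `set`) and its fields (`origin`, `stride`) are hypothetical rather than the toolkit's actual header layout.

```python
class ByteImage:
    """A 2D array of 8-bit values; virtual images share a parent's body."""

    def __init__(self, width, height, body=None, origin=0, stride=None):
        self.width, self.height = width, height
        # A physical image owns a contiguous body; a virtual one borrows it.
        self.body = bytearray(width * height) if body is None else body
        self.origin = origin                                # index of pixel (0, 0)
        self.stride = width if stride is None else stride   # bytes per body row

    def clip(self, x, y, w, h):
        """Return a virtual image over the w x h rectangle at (x, y)."""
        if x < 0 or y < 0 or x + w > self.width or y + h > self.height:
            raise ValueError("clip rectangle lies outside the image")
        return ByteImage(w, h, self.body,
                         self.origin + y * self.stride + x, self.stride)

    def get(self, x, y):
        return self.body[self.origin + y * self.stride + x]

    def set(self, x, y, value):
        self.body[self.origin + y * self.stride + x] = value
```

For example, after `parent = ByteImage(4, 3)` and `view = parent.clip(1, 1, 2, 2)`, calling `view.set(0, 0, 255)` makes `parent.get(1, 1)` return 255, because no pixel data was copied when the view was created.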

[0051] Consider an operation that creates a "picture in picture" (PIP) effect on an image. The steps to create the PIP effect are as follows: (1) given an input image, scale the image down by half, (2) draw a white box slightly larger than the scaled image on the original image, and (3) paste the scaled image into the white box.

[0052] The code in FIG. 4 shows a toolkit function that performs the PIP operation. The function takes three arguments: the image (the input image), the border width (the width of the border around the inner image in the output), and the margin (the offset of the inner image from the right and bottom edges of the output image).

[0053] Lines 5 and 6 of the function query the width and height of the input image. Lines 7 through 10 calculate the position and dimensions of the inner image. Line 13 creates a new physical ByteImage, temp, which is half the size of the original image. Line 14 shrinks the input image into temp. Line 15 creates a virtual ByteImage slightly larger than the inner image, and line 18 sets the value of that virtual ByteImage to 255, achieving the effect of drawing a white box. Line 19 deallocates this virtual image. Line 20 creates another virtual ByteImage corresponding to the inner image. Line 21 copies the scaled image into the inner image using ByteCopy. Finally, lines 22 and 23 free the memory allocated for the ByteImages.
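Since the code of FIG. 4 is not reproduced here, the three PIP steps can be illustrated with a self-contained Python sketch. It is not the toolkit function: the helper names (`shrink_2x2`, `fill_rect`, `paste`, `picture_in_picture`) are hypothetical, and plain slices of a flat row-major bytearray stand in for virtual ByteImages.

```python
def shrink_2x2(src, w, h):
    """Average each 2x2 block (sum of four pixels, shifted right by 2)."""
    ow, oh = w // 2, h // 2
    out = bytearray(ow * oh)
    for y in range(oh):
        for x in range(ow):
            i = 2 * y * w + 2 * x
            out[y * ow + x] = (src[i] + src[i + 1] + src[i + w] + src[i + w + 1]) >> 2
    return out, ow, oh


def fill_rect(img, w, x, y, rw, rh, value):
    """Set a rectangle of `img` to a constant value, row by row."""
    for row in range(y, y + rh):
        img[row * w + x: row * w + x + rw] = bytes([value]) * rw


def paste(img, w, x, y, src, sw, sh):
    """Copy the sw x sh image `src` into `img` with its top-left corner at (x, y)."""
    for row in range(sh):
        img[(y + row) * w + x: (y + row) * w + x + sw] = src[row * sw:(row + 1) * sw]


def picture_in_picture(img, w, h, border, margin):
    """Apply the PIP effect to the row-major image `img` in place."""
    small, sw, sh = shrink_2x2(img, w, h)            # step 1: halve the input
    x, y = w - margin - sw, h - margin - sh          # top-left of the inner image
    if border > margin or x < border or y < border:
        raise ValueError("white box would fall outside the image")
    fill_rect(img, w, x - border, y - border,
              sw + 2 * border, sh + 2 * border, 255)  # step 2: white box
    paste(img, w, x, y, small, sw, sh)               # step 3: paste the small image
    return img
```

The shrink result is the only new allocation; the box and the pasted image are written directly into the original buffer, which mirrors the memory sharing that the virtual images provide in the toolkit version.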

[0054] This example shows how images are manipulated in the toolkit through a series of simple, "thin" operations. It also illustrates several design principles of the toolkit: (1) sharing of memory (through virtual images), (2) explicit memory control (through ByteClip, ByteNew and ByteFree), and (3) specialized operators (ByteShrink2×2).

[0055] A second example relating to the present invention illustrates how to process an MPEG video stream using the toolkit. The example program decodes the I-frames in an MPEG video stream into a series of RGB images. To parse an MPEG video stream, the encoded video data is first read into a BitStream. A BitStream is an abstraction for input/output operations; that is, it is a buffer. A BitParser is used to read data from and write data to the BitStream. The BitParser provides functions for reading and writing the data and also provides a cursor into the BitStream.

[0056] FIG. 5 shows the format of an MPEG-1 video stream 150. The MPEG video stream 150 consists of a sequence header 152, followed by a sequence of GOPs (groups of pictures) 154, followed by a sequence end marker 156. Each GOP consists of a GOP header 158 followed by a sequence of pictures 160. Each picture consists of a picture header 162 followed by a picture body 164, which is made up of the compressed data required to reconstruct the picture. The sequence header 152 contains information such as the width and height of the video, the frame rate and the aspect ratio. The GOP header 158 contains the time code for the GOP. The picture header 162 contains the information necessary for decoding the picture, most notably the picture type (I, P or B). The toolkit provides an abstraction for each of these structural elements (see Table 1).

[0057] The toolkit provides five primitives for each structural element: find, skip, dump, parse and encode. Find positions the cursor in the BitStream just before the element. Skip advances the cursor to the end of the element. Dump moves the bytes corresponding to the element from the input BitStream to the output BitStream, until the cursor is at the end of the header. Parse decodes the BitStream and stores the information in a header abstraction, and encode encodes the information from a header abstraction into a BitStream. Thus, the MpegPicHdrFind function advances the cursor to the next picture header, and MpegSeqHdrParse decodes the sequence header into a structure.

[0058] These primitives provide the functions necessary to find, skip or parse MPEG I-frames. The picture data parsed from an MPEG I-frame is represented using DCTImages. A DCTImage is similar to a ByteImage, but each "pixel" is an 8×8 DCT-encoded block.

[0059] The toolkit code in FIG. 6 decodes the I-frames of an MPEG video into RGB images. Lines 1 through 5 allocate the data structures needed for decoding. Line 6 attaches the BitParser inbp to the BitStream inbs. The cursor of inbp then points to the first byte of the buffer of inbs. Line 7 fills inbs with 64 KB of data from the input MPEG video. Line 8 moves the cursor of inbp to the beginning of the sequence header, and line 9 parses the sequence header and stores the information from it in the structure seqhdr.

[0060] Key information, such as the width, the height and the minimum amount of data that must be present to decode a picture (vbvsize), is extracted from the sequence header in lines 10 through 12. Lines 13 through 21 allocate the ByteImages and DCTImages needed to decode the I-frames. y, u and v store the decoded picture in the YUV color space, and r, g and b store the decoded picture in the RGB color space. dcty, dctu and dctv store the compressed (DCT-domain) picture data. The main loop of the decoding program (lines 22-46) begins by advancing the BitParser cursor to the beginning of the next marker (line 24) and retrieving the current marker (line 25).
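To make the sequence-header step concrete, the following Python sketch extracts the width, height and VBV buffer size from the fixed-position fields of an MPEG-1 sequence header. The function name `parse_sequence_header` and the returned dictionary are assumptions of this sketch; it is not the toolkit's MpegSeqHdrParse and it ignores the optional quantizer matrices that may follow these fields.

```python
SEQUENCE_HEADER_CODE = b"\x00\x00\x01\xb3"

def parse_sequence_header(data):
    """Extract width, height and VBV buffer size from a sequence header.

    Fields after the start code: horizontal_size (12 bits),
    vertical_size (12), aspect ratio code (4), picture rate code (4),
    bit_rate (18), marker bit (1), vbv_buffer_size (10, in units of
    16 * 1024 bits).
    """
    pos = data.find(SEQUENCE_HEADER_CODE)
    if pos == -1 or pos + 12 > len(data):
        raise ValueError("no complete sequence header found")
    bits = int.from_bytes(data[pos + 4:pos + 12], "big")  # 64 bits after the code
    width = bits >> 52
    height = (bits >> 40) & 0xFFF
    vbv_units = (bits >> 3) & 0x3FF
    return {"width": width, "height": height,
            "vbv_bytes": vbv_units * 16 * 1024 // 8}
```

The `vbv_bytes` value plays the role that vbvsize plays in the text: it bounds how much buffered data the decoder needs before it can safely decode the next picture.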

[0061] If the marker indicates the beginning of a picture header, the picture header is parsed (line 28) and its type is checked (line 29). If the picture is an I-frame, the I-frame is parsed into three DCTImages (line 30), the DCTImages are converted to ByteImages (lines 31-33), and the ByteImages are converted to the RGB color space (line 34).

[0062] If the marker indicates the beginning of a GOP header, the header is skipped (which moves the cursor to the end of the GOP header), since no information from the GOP header is needed.

[0063] Finally, if a sequence end marker is encountered, it marks the end of the video stream and the loop is exited. Lines 43 through 45 ensure that, during the next iteration of the main loop, inbs contains enough data to continue decoding. UpdateIfUnderflow checks whether the number of bytes remaining in inbs is less than vbvsize. If so, the remaining bytes are shifted to the beginning of the buffer, and the rest of the buffer is filled with data from the file.
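The refill behavior described for UpdateIfUnderflow can be sketched in Python as follows. The signature is an assumption made for this illustration (the toolkit's actual parameters are not given in the text), and a bytearray plus an integer cursor stand in for the BitStream and BitParser.

```python
import io

def update_if_underflow(buf, cursor, source, min_bytes, capacity):
    """Refill `buf` when fewer than `min_bytes` remain after `cursor`.

    Unread bytes are shifted to the start of the buffer and the freed
    space is filled from `source`. Returns the new cursor position.
    """
    remaining = len(buf) - cursor
    if remaining >= min_bytes:
        return cursor                      # enough data left; nothing to do
    fresh = source.read(capacity - remaining)
    buf[:] = buf[cursor:] + fresh          # shift the tail, then append new data
    return 0

# Usage with an in-memory "file" and a 4-byte buffer.
source = io.BytesIO(b"ABCDEFGHIJ")
buf = bytearray(source.read(4))            # buffer holds b"ABCD"
cursor = update_if_underflow(buf, 3, source, 2, 4)
```

After the call in the usage lines, the one unread byte has moved to the front and the buffer holds `b"DEFG"` with the cursor at 0, so decoding can continue without ever loading the whole file.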

[0064] Decomposing a complex decoding operation such as MPEG decoding into "thin" primitives makes the toolkit code highly configurable. For example, by removing lines 32 through 34, the program decodes MPEG I-frames into grayscale images. By replacing lines 31 through 34 with JPEG encoding primitives, an efficient MPEG I-frame to JPEG transcoder is produced.

A third example relating to the present invention extracts a subset of a BitStream for processing. Filters were designed to simplify the processing of bitstreams with interleaved data (e.g., AVI, QuickTime or MPEG system streams). Filters are similar to scatter/gather vectors: they specify an ordered subset of a larger data set.

[0065] A common use of filters is to process MPEG system streams, which consist of interleaved audio or video (A/V) streams. In MPEG, each A/V stream is assigned a unique id. Audio streams have ids in the range 0..31, and video stream ids are in the range 32..47. Each A/V stream is divided into small (approximately 2 KB) chunks called packets. Each packet has a header that contains the id of the stream, the length of the packet and other information (e.g., a time code).

[0066] In this example, a filter is constructed that can be used to copy the packets of the first video stream (id = 32) from a system stream stored in one BitStream to another. Once the packets are copied, the toolkit's MPEG video processing primitives can be used on the video-only BitStream. The toolkit code that constructs this filter is shown in FIG. 7.

[0067] Lines 2 through 8 allocate and initialize the various structures needed by this program. The variable offset stores the byte offset of a packet in the bitstream, relative to the start of the stream. Line 9 advances the cursor to the beginning of the first packet header and updates offset. The main loop (lines 10-18) parses the packet header (line 11) and, if the packet belongs to the first video stream, adds its offset and length to the filter (line 14). EndOfBitStream is a macro that checks the position of the bitstream cursor against the length of the data buffer.

[0068] Once the filter is constructed, it can be saved to disk or used as a parameter to the BitStreamFileFilter or BitStreamDumpUsingFilter functions. The former reads the subset of a file specified by the filter, and the latter copies the subset of data specified by the filter from one bitstream to another.
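The filter mechanism can be illustrated with a small Python sketch that builds a scatter/gather list of (offset, length) pairs and then uses it to copy a subset of a stream. The packet layout is deliberately simplified and is an assumption of this sketch (a 1-byte stream id, a 2-byte big-endian payload length, then the payload); it is not the real MPEG system stream syntax, and the function names are hypothetical.

```python
import struct

def build_filter(stream, wanted_id):
    """Return a list of (offset, length) pairs for packets of `wanted_id`.

    Assumes a toy packet layout: 1-byte stream id, 2-byte big-endian
    payload length, then the payload. Offsets point at the payload.
    """
    entries, pos = [], 0
    while pos + 3 <= len(stream):
        stream_id, length = struct.unpack_from(">BH", stream, pos)
        if stream_id == wanted_id:
            entries.append((pos + 3, length))
        pos += 3 + length                  # jump to the next packet header
    return entries


def dump_using_filter(stream, entries):
    """Copy the byte ranges named by the filter into a new buffer, in order."""
    return b"".join(stream[off:off + n] for off, n in entries)
```

Because the filter is just a list of offsets and lengths, it can be built once, stored, and applied later to the same data without re-parsing any packet headers.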

[0069] This example illustrates how the toolkit can be used to demultiplex interleaved data. It can easily be extended to other formats, such as QuickTime, AVI, MPEG-2 and MPEG-4. Although this mechanism uses data copying, the cost of copying is offset by the performance gain in processing the filtered data.

[0070] It should be understood that the above-described embodiments are merely illustrative of the principles of the present invention. Various other modifications and changes that embody the principles of the invention and fall within its spirit and scope may be made by those skilled in the art.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a web-based video editing system according to the principles of the present invention.

FIG. 2 is a schematic diagram of memory clipping according to the principles of the present invention.

FIG. 3 shows stereo samples interleaved in memory.

FIG. 4 shows a toolkit function that performs a picture-in-picture operation on an image according to the principles of the present invention.

FIG. 5 shows the format of an MPEG-1 video stream.

FIG. 6 shows a toolkit function that decodes the I-frames of an MPEG video into RGB images according to the principles of the present invention.

FIG. 7 shows a toolkit function, according to the principles of the present invention, acting as a filter that can be used to copy the packets of a first video stream from a system stream stored in a first bitstream to a second bitstream.

Continuation of front page

(81) Designated states: EP (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OA (BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG), AP (GH, GM, KE, LS, MW, SD, SL, SZ, TZ, UG, ZW), EA (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), AE, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, CA, CH, CN, CR, CU, CZ, DE, DK, DM, EE, ES, FI, GB, GD, GE, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZW

(72) Inventor: Tiklovsky, Dimitri D., 333 East 49th Street, Apartment 1B, New York, New York 10017, United States

(72) Inventor: Samoylov, Michael, 7 Jones Street, Apartment 8, New York, New York 10014, United States

F-terms (reference): 5C053 FA14 GB21 JA01 LA06 LA11 LA14 5C064 BA07 BB03 BC25 BD01 BD08

Claims

[Claims]

[Claim 1] A network-based system for processing multimedia information, comprising: a server module coupled to a network and incorporating a multimedia engine that enables the creation, editing, viewing and management of multimedia objects including at least one of video, images, audio and moving images, the server having access to memory for the multimedia objects; and a client module having access to the network and incorporating a multimedia editing interface that enables the client to send multimedia processing instructions, from a predetermined set of multimedia processing instructions, to the server via the network; wherein the multimedia engine in the server acts on the multimedia processing instructions received from the client by performing the corresponding processing operations on multimedia objects previously stored by the client in the memory of the server, and the server makes the processed multimedia objects available to the client.

[Claim 2] The system of claim 1, further comprising: an additional server module coupled to the network and incorporating a multimedia engine that enables the editing, viewing and management of multimedia objects including at least one of images, audio and moving images, the server having access to memory for the multimedia objects; and a control module that controls access by the client module to the server modules based on predetermined criteria; wherein the multimedia engine in the at least one server acts on the multimedia processing instructions received from the client by performing the corresponding processing operations on previously stored multimedia objects, and the server makes the processed multimedia objects available to the client.

【請求項３】前記制御モジュールは、クライアントを認識し、命令を待ち
、基準に基づいて選択されたサーバへそれらを送る負荷平衡デーモンであること
を特徴とする請求項２に記載のシステム。3. The system of claim 2, wherein the control module is a load balancing daemon that recognizes clients, waits for commands, and sends them to a server selected based on criteria.

【請求項４】前記制御モジュールは、承認済クライアントに基準を設定及
び修正させるように構成されていることを特徴とする請求項３に記載のシステム
。4. The system of claim 3, wherein the control module is configured to allow authorized clients to set and modify the criteria.

【請求項５】ネットワークへのアクセスが可能であり、ネットワークを介
して少なくとも一つのサーバへマルチメディア処理命令を送ることをクライアン
トに可能にさせるマルチメディア編集インタフェースを組み込んでいる追加のク
ライアントモジュールを更に備えており、処理命令は所定の一組のマルチメディ
ア処理命令からであることを特徴とする請求項３に記載のシステム。5. The system of claim 3, further comprising an additional client module that has access to the network and incorporates a multimedia editing interface that enables a client to send multimedia processing instructions over the network to at least one server, wherein the processing instructions are from a predetermined set of multimedia processing instructions.

【請求項６】前記制御モジュールは、一または複数のクライアントに特定
のオブジェクトに関するマルチメディア処理命令を一度に提起させるように構成
されていることを特徴とする請求項５に記載のシステム。6. The system of claim 5, wherein the control module is configured to allow one or more clients to submit multimedia processing instructions for a particular object at one time.

【請求項７】前記制御モジュールは、クライアント間での相互作用を行わ
せるように構成されていることを特徴とする請求項５に記載のシステム。7. The system of claim 5, wherein the control module is configured to enable interaction between clients.

【請求項８】ネットワークへのアクセスが可能であり、ネットワークを介
して少なくとも一つのサーバへマルチメディア処理命令を送ることをクライアン
トに可能にさせるマルチメディア編集インタフェースを組み込んでいる追加のク
ライアントモジュールを更に備えており、処理命令は所定の一組のマルチメディ
ア処理命令からであることを特徴とする請求項１または２に記載のシステム。8. The system of claim 1 or 2, further comprising an additional client module that has access to the network and incorporates a multimedia editing interface that enables a client to send multimedia processing instructions over the network to at least one server, wherein the processing instructions are from a predetermined set of multimedia processing instructions.

【請求項９】サーバは、処理されたマルチメディアオブジェクトをストリ
ーミングメディアの形式で提供することを特徴とする請求項１または２に記載の
システム。9. The system of claim 1 or 2, wherein the server provides the processed multimedia object in the form of streaming media.

【請求項１０】マルチメディア情報のネットワークに基づく処理を提供す
る方法において、第一位置において、映像、画像、音声及び動画の少なくとも一つを含むマルチ
メディアオブジェクトの作成、編集、観察及び管理を可能にするマルチメディア
エンジンプログラムを走らせるサーバコンピュータを設置し、サーバはマルチメ
ディアオブジェクトについてのメモリへのアクセス及びネットワークへの接続が
可能であり、第一位置から離れた第二位置において、ネットワークへのアクセスが可能であ
り、所定の一組のマルチメディア処理命令からのマルチメディア処理命令をネッ
トワークを介してサーバコンピュータへクライアントが送ることを可能にさせる
マルチメディア編集インタフェースプログラムを走らせるクライアントコンピュ
ータを設置し、第一位置において、サーバのメモリに予め記憶されたマルチメディアオブジェ
クトについての対応する処理操作を実行することによってクライアントからの受
信マルチメディア処理命令に基づいて動作することをマルチメディアエンジンプ
ログラムに行わせ、処理されたマルチメディアオブジェクトをネットワークを介
してクライアントコンピュータに利用可能にすることを特徴とする方法。10. A method for providing network-based processing of multimedia information, comprising: installing, at a first location, a server computer running a multimedia engine program that enables creating, editing, viewing, and managing multimedia objects including at least one of video, images, audio, and animation, the server having access to memory for the multimedia objects and a connection to a network; installing, at a second location remote from the first location, a client computer that has access to the network and runs a multimedia editing interface program that enables a client to send multimedia processing instructions, drawn from a predetermined set of multimedia processing instructions, over the network to the server computer; causing the multimedia engine program, at the first location, to operate on the multimedia processing instructions received from the client by performing corresponding processing operations on multimedia objects pre-stored in the server's memory; and making the processed multimedia objects available to the client computer over the network.
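
The server-side step of claim 10 (apply instructions from a predetermined set to an object already stored in the server's memory, then make the result available) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the three instruction names, the representation of a multimedia object as a list of frame labels, and the `MultimediaServer` class are all hypothetical.

```python
from typing import Callable, Dict, List

Frames = List[str]

# Hypothetical instruction set. The claim only requires that the set be
# predetermined; these three operations are invented for illustration.
INSTRUCTION_SET: Dict[str, Callable[[Frames], Frames]] = {
    "reverse": lambda frames: frames[::-1],
    "drop_first": lambda frames: frames[1:],
    "duplicate_last": lambda frames: frames + frames[-1:],
}


class MultimediaServer:
    """Illustrative server: holds pre-stored objects and applies instructions."""

    def __init__(self) -> None:
        self.memory: Dict[str, Frames] = {}  # object_id -> frames

    def store(self, object_id: str, frames: Frames) -> None:
        """Pre-store a multimedia object on behalf of a client."""
        self.memory[object_id] = list(frames)

    def process(self, object_id: str, instructions: List[str]) -> Frames:
        """Apply a series of instructions in order and return the result."""
        unknown = [name for name in instructions if name not in INSTRUCTION_SET]
        if unknown:
            # Reject the whole batch before touching the stored object.
            raise ValueError(f"not in the predetermined set: {unknown}")
        frames = self.memory[object_id]
        for name in instructions:
            frames = INSTRUCTION_SET[name](frames)
        self.memory[object_id] = frames
        return list(frames)
```

A client would call `store` once and then send one or more instruction names to `process`. Validating the whole batch first means an invalid instruction leaves the stored object unchanged.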

【請求項１１】ネットワークへの接続と共に、画像、音声及び動画の少な
くとも一つを含むマルチメディアオブジェクトの編集、観察及び管理を可能にす
るマルチメディアエンジンを走らせる追加のサーバコンピュータと、マルチメデ
ィアオブジェクトについての追加のメモリとを設置し、所定の基準に基づいてクライアントコンピュータにより前記サーバコンピュー
タへのアクセスを制御し、前記少なくとも一つのサーバ内のマルチメディアエンジンプログラムは、前記
少なくとも一つのサーバのメモリにクライアントによって予め記憶されたマルチ
メディアオブジェクトについての対応する処理操作を行うことによりクライアン
トからの受信マルチメディア処理命令に基づいて動作し、ネットワークを介して少なくとも一つのサーバからのマルチメディアオブジェ
クトをクライアントコンピュータに利用可能にすることを特徴とする請求項１０
に記載の方法。11. The method of claim 10, further comprising: installing, together with a connection to the network, an additional server computer running a multimedia engine that enables editing, viewing, and managing multimedia objects including at least one of images, audio, and animation, and additional memory for the multimedia objects; and controlling access by the client computer to the server computers based on predetermined criteria; wherein the multimedia engine program in the at least one server operates on the multimedia processing instructions received from the client by performing corresponding processing operations on multimedia objects pre-stored by the client in the memory of the at least one server, and the multimedia objects from the at least one server are made available to the client computer over the network.

【請求項１２】前記制御ステップは、負荷を平衡させることと、クライア
ントを認識することと、命令を待つことと、基準に基づいて選択されたサーバへ
それらを送ることとを有することを特徴とする請求項１１に記載の方法。12. The method of claim 11, wherein the controlling step comprises balancing load, recognizing clients, waiting for instructions, and sending them to a server selected based on the criteria.

【請求項１３】前記制御ステップは、承認済クライアントに基準を確立及
び修正させることを有することを特徴とする請求項１２に記載の方法。13. The method of claim 12, wherein the controlling step comprises allowing authorized clients to establish and modify the criteria.

【請求項１４】ネットワークへのアクセスが可能であり、ネットワークを
介して少なくとも一つのサーバコンピュータへマルチメディア処理命令を送るこ
とをクライアントに可能にさせるマルチメディア編集インタフェースを組み込ん
でいる追加のクライアントコンピュータを設置し、処理命令は所定の一組のマル
チメディア処理命令からなされることを特徴とする請求項１３に記載の方法。14. The method of claim 13, further comprising installing an additional client computer that has access to the network and incorporates a multimedia editing interface that enables a client to send multimedia processing instructions over the network to at least one server computer, wherein the processing instructions are from a predetermined set of multimedia processing instructions.

【請求項１４】前記制御ステップは、一または複数のクライアントに特定
のオブジェクトに関するマルチメディア処理命令を一度に提起させることを含む
ことを特徴とする請求項１３に記載の方法。14. The method of claim 13, wherein the controlling step comprises allowing one or more clients to submit multimedia processing instructions for a particular object at one time.

【請求項１５】前記制御ステップは、クライアント間の相互作用を行わせ
ることを含むことを特徴とする請求項１３に記載の方法。15. The method of claim 13, wherein the controlling step comprises enabling interaction between clients.

【請求項１６】ネットワークへのアクセスが可能であり、ネットワークを
介して少なくとも一つのサーバコンピュータへマルチメディア処理命令を送るこ
とをクライアントに可能にさせるマルチメディア編集インタフェースを組み込ん
でいる追加のクライアントコンピュータを設置し、処理命令は所定の一組のマル
チメディア処理命令からなされることを特徴とする請求項１０または１１に記載
の方法。16. The method of claim 10 or 11, further comprising installing an additional client computer that has access to the network and incorporates a multimedia editing interface that enables a client to send multimedia processing instructions over the network to at least one server computer, wherein the processing instructions are from a predetermined set of multimedia processing instructions.

【請求項１７】マルチメディア編集インタフェースは、グラフィックユー
ザインタフェースであることを特徴とする請求項１から９の何れかに記載のシス
テム。17. The system of any one of claims 1 to 9, wherein the multimedia editing interface is a graphical user interface.

【請求項１８】マルチメディアエンジンは、クライアントから同時に複数
の命令を受けることを特徴とする請求項１から９の何れかに記載のシステム。18. The system of any one of claims 1 to 9, wherein the multimedia engine receives a plurality of instructions from the client simultaneously.