JP2003302971A

JP2003302971A - Apparatus and program for video data processing

Info

Publication number: JP2003302971A
Application number: JP2002105574A
Authority: JP
Inventors: Eiichiro Aoki; 栄一郎青木
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2002-04-08
Filing date: 2002-04-08
Publication date: 2003-10-24

Abstract

PROBLEM TO BE SOLVED: To realize automatic composition of music and automatic writing of lyrics that match a video. SOLUTION: A computer executes a video data processing program. Function A designates and inputs video data, and function B selects and extracts the designated video data from a video database C. Function D detects features of the video from the extracted video data, and function E generates automatic composition data on the basis of the detected features. Function F extracts a keyword from the extracted video data, and function G generates automatic composition data on the basis of the extracted keyword. Function H automatically composes music on the basis of either the automatic composition data generated by functions E and G or the automatic composition data generated by function G alone. The title of the video, a keyword registered beforehand in the video data, or the like is used as the keyword. In addition, features of the video are extracted and a corresponding keyword is registered in the video data. Furthermore, lyrics and the like are generated using a lyric-writing template corresponding to the keyword. COPYRIGHT: (C)2004,JPO

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a video data processing apparatus that composes music and writes song lyrics or the like matching a video, to a video data processing apparatus that generates video data suited to such composition and lyric writing, and to a video data processing program executed on a computer.

[0002]

[Prior Art] In recent years, consumer video equipment such as video cameras has become widespread, and ordinary users have more opportunities to make their own videos. Users then also want to add their own music to these videos. Separately, in fields such as computer music, there are automatic composition techniques that generate melodies automatically. Such automatic composition uses, as automatic composition data, a song template consisting of the passage structure, the number of phrases per passage, the number of measures per phrase, melody data within each measure (the pitches at the start and end of the measure), and so on. A song is then generated according to the song template selected by the user and predetermined musical rules.

[0003]

[Problems to Be Solved by the Invention] Automatic composition techniques such as the one described above can, to some extent, produce music that does not sound musically unnatural. To compose music that matches a video automatically, however, the user must watch the video and set a song template appropriate to it, and finding such a template is difficult. Composing music that matches a video therefore takes considerable time and effort even for a fairly experienced user. If music matching a video (that is, music that depends on the video) could easily be composed automatically, a user could, for example, pick various videos at random and have music composed for each, enjoying a wide variety of compositions. Moreover, if song lyrics or verse suited to the video could be attached, the video would be more enjoyable.

[0004] An object of the present invention is to make it easy to automatically compose music that matches a video, to make it possible to generate song lyrics, verse, and the like that match a video, and further to process video data automatically so that it becomes video data suited to such automatic composition and automatic lyric writing.

[0005]

[Means for Solving the Problems] The video data processing apparatus according to claim 1 of the present invention comprises: video data supply means for supplying video data; keyword extraction means for extracting a keyword from the video data supplied by the video data supply means; automatic composition data generation means for generating automatic composition data according to the keyword extracted by the keyword extraction means; and song generation means for generating a song on the basis of the automatic composition data generated by the automatic composition data generation means.

[0006] In the video data processing apparatus of claim 1, the automatic composition data, such as the various parameters used to generate a song, depends on the keyword. The keyword is, for example, the title of the video, a word in the title, or a word stored in advance to suit a scene of the video; such a keyword is, or can be made to be, closely related to the content of the video. Composing automatically with automatic composition data that depends on such a keyword therefore yields a song that matches the video data.

[0007] The video data processing apparatus according to claim 2 of the present invention has the configuration of claim 1 and further comprises feature extraction means for extracting features of the video data supplied by the video data supply means, wherein the automatic composition data generation means generates automatic composition data according to both the keyword extracted by the keyword extraction means and the features of the video data extracted by the feature extraction means.

[0008] In the video data processing apparatus of claim 2, the automatic composition data, such as the various parameters used to generate a song, depends on the keyword and on the features of the video data. The features of the video data, such as the brightness (or darkness) of the video, the speed (or slowness) of its motion, or its color tone, can be extracted mechanically from the video data, yet they are also closely related to the content of the video. Composing automatically with automatic composition data that depends on both the keyword and such features therefore yields a song that matches the video data even better.

[0009] The video data processing apparatus according to claim 3 of the present invention comprises: video data supply means for supplying video data; keyword extraction means for extracting a keyword from the video data supplied by the video data supply means; and selection means for selecting a predetermined existing song on the basis of the keyword extracted by the keyword extraction means.

[0010] With the video data processing apparatus of claim 3, existing songs are associated in advance with keywords closely related to the content of the video (or video data), and a song need only be selected by keyword, so a song that matches the video can be obtained with simple processing.

[0011] The video data processing apparatus according to claim 4 of the present invention comprises: video data supply means for supplying video data; keyword extraction means for extracting a keyword from the video data supplied by the video data supply means; and lyric writing means for automatically generating lyrics on the basis of the keyword extracted by the keyword extraction means.

[0012] With the video data processing apparatus of claim 4, lyrics are generated automatically according to a keyword closely related to the content of the video (or video data), so song lyrics and the like that match the video data from which the keyword was derived can also be written.
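One way to picture keyword-driven lyric generation is a template lookup, in line with the abstract's mention of a lyric-writing template per keyword. The sketch below is illustrative only: the template wording, the `LYRIC_TEMPLATES` table, and the `write_lyrics` function are hypothetical names and data, not taken from the patent.

```python
from string import Template

# Hypothetical lyric templates keyed by keyword; the wording is invented for illustration.
LYRIC_TEMPLATES = {
    "athletic meet": Template("Run, $name, run, the finish line is near"),
    "walk": Template("Step by step with $name under the open sky"),
}


def write_lyrics(keyword: str, name: str) -> str:
    """Fill the template registered for the keyword.

    Returns an empty string when no template is registered for the keyword.
    """
    template = LYRIC_TEMPLATES.get(keyword)
    return template.substitute(name=name) if template else ""
```

A real system would presumably hold many templates per keyword and choose among them; a single template per keyword keeps the sketch short.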

[0013] The video data processing apparatus according to claim 5 of the present invention comprises: video data supply means for supplying video data; keyword extraction means for extracting a keyword from the video data supplied by the video data supply means; and selection means for selecting predetermined existing text or existing lyrics on the basis of the keyword extracted by the keyword extraction means.

[0014] With the video data processing apparatus of claim 5, existing text or existing lyrics are associated in advance with keywords closely related to the content of the video (or video data), and the existing text or lyrics need only be selected by keyword, so text or lyrics that match the video can be obtained with simple processing. Song lyrics and the like can also be obtained.

[0015] The video data processing apparatus according to claim 6 of the present invention comprises: video data supply means for supplying video data; feature extraction means for extracting features of the video data supplied by the video data supply means; and keyword storage means for storing keywords corresponding to features of video data, wherein a keyword corresponding to the features extracted by the feature extraction means is read from the keyword storage means and registered in the video data.

[0016] With the video data processing apparatus of claim 6, video data can be processed automatically so that it becomes video data suited to the automatic composition, lyric writing, and so on performed by the video data processing apparatus of each of the preceding claims. Here, "registering" means storing the original video data and the keyword as one set of data, for example in storage means.
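The registration step can be sketched as a record that keeps the video data and its keywords together as one set. Everything below is a hypothetical illustration: the `VideoRecord` type, the coarse feature labels, the 0.5 thresholds, and the feature-to-keyword pairs in `KEYWORD_STORE` are assumptions, since the text does not specify how features map to keywords.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class VideoRecord:
    """Video data and its registered keywords, stored as one set."""
    title: str
    data: bytes
    keywords: List[str] = field(default_factory=list)


# Hypothetical keyword store: coarse (brightness, motion) labels -> keyword.
KEYWORD_STORE: Dict[Tuple[str, str], str] = {
    ("bright", "fast"): "athletic meet",
    ("bright", "slow"): "walk",
    ("dark", "slow"): "visiting a grave",
}


def register_keyword(record: VideoRecord, brightness: float, motion: float) -> VideoRecord:
    """Read the keyword matching the features (both scaled 0..1) and register it."""
    label = (
        "bright" if brightness >= 0.5 else "dark",
        "fast" if motion >= 0.5 else "slow",
    )
    keyword = KEYWORD_STORE.get(label)
    if keyword is not None and keyword not in record.keywords:
        record.keywords.append(keyword)
    return record
```

Keeping the keywords on the record itself means a later keyword-extraction step can read them without re-analysing the video.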

[0017] The video data processing program according to claim 7 of the present invention causes a computer to execute: a step of supplying video data; a step of extracting a keyword from the supplied video data; a step of generating automatic composition data according to the extracted keyword; and a step of generating a song on the basis of the generated automatic composition data. Executing this video data processing program provides the same operation and effects as claim 1.

[0018] The video data processing program according to claim 8 of the present invention is the video data processing program of claim 7, further comprising a step of extracting features of the supplied video data, wherein the step of generating the song generates automatic composition data according to the extracted keyword and the extracted features of the video data. Executing this video data processing program provides the same operation and effects as claim 2.

[0019] The video data processing program according to claim 9 of the present invention causes a computer to execute: a step of supplying video data; a step of extracting a keyword from the supplied video data; and a step of selecting a predetermined existing song on the basis of the extracted keyword. Executing this video data processing program provides the same operation and effects as claim 3.

[0020] The video data processing program according to claim 10 of the present invention causes a computer to execute: a step of supplying video data; a step of extracting a keyword from the supplied video data; and a step of automatically generating lyrics on the basis of the extracted keyword. Executing this video data processing program provides the same operation and effects as claim 4.

[0021] The video data processing program according to claim 11 of the present invention causes a computer to execute: a step of supplying video data; a step of extracting a keyword from the supplied video data; and a step of selecting predetermined existing text or existing lyrics on the basis of the extracted keyword. Executing this video data processing program provides the same operation and effects as claim 5.

[0022] The video data processing program according to claim 12 of the present invention causes a computer to execute: a step of supplying video data; a step of extracting features of the supplied video data; a step of reading a keyword corresponding to the extracted features from keyword storage means that stores preset keywords corresponding to video data; and a step of registering the read keyword in the video data. Executing this video data processing program provides the same operation and effects as claim 6.

[0023] In the video data processing apparatus of claim 2 and the video data processing program of claim 4, (1) when parameters are generated as automatic composition data, whether each parameter is generated on the basis of both the keyword and the features of the video data or on the basis of the keyword alone may be selected for each type of parameter. (2) Alternatively, the average of a parameter generated from the features of the video data and the parameter generated from the keyword may be used as the parameter of the automatic composition data. (3) Furthermore, the user may be allowed to select whether parameters are generated on the basis of both the keyword and the features of the video data or on the basis of the keyword alone.

[0024]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings. FIG. 6 is a block diagram of an embodiment in which the video data processing apparatus of the present invention is implemented with a personal computer and software. The personal computer includes a CPU 1, a ROM 2, a RAM 3, a display 4, a keyboard 5, a mouse 6, an external storage device 7, a tone generator 8, a sound system 9, a communication interface 10, and a MIDI interface 11. The graphics circuit for the display 4 and the interfaces for the keyboard 5, the mouse 6, and the external storage device 7 are not shown. The tone generator 8 is implemented by a sound card or the like and includes a D/A converter and related circuitry.

[0025] The CPU 1 performs various kinds of processing under an OS (operating system) installed on, for example, a hard disk drive (HDD) of the external storage device 7, using a working area in the RAM 3. For example, it handles data input according to operation of the keyboard 5 and the mouse 6, and plays back video on the display 4. When a song is played back, the CPU 1 reads the note data in the song data by interrupt processing and outputs data corresponding to each musical tone to the tone generator 8. The tone generator 8 generates a musical tone signal according to the data input from the CPU 1 and converts it into an analog audio signal, and the sound system 9, consisting of an amplifier, speakers, and the like, produces the sound.

[0026] The external storage device 7 is a hard disk drive (HDD), a floppy (trademark) disk drive (FDD), a CD-ROM drive, a magneto-optical (MO) disk drive, a digital versatile disc (DVD) drive, or the like. The external storage device 7 is used as a database storing a plurality of items of video data, a database storing a plurality of song templates (sets of parameters) as automatic composition data, a database storing various keywords, a database storing song data of existing songs corresponding to each keyword, a database storing existing text and lyrics corresponding to each keyword, a database storing various tables, and so on. The external storage device 7 is also used to save generated melody data, generated lyrics, and the like. The MIDI interface 11 exchanges various data with other MIDI devices; for example, generated melody data and lyric data can be output to a MIDI device.

[0027] The apparatus can also connect through the communication interface 10 to a communication network 20 such as a LAN (local area network), the Internet, or a telephone line, and receive from a server computer 30 various data such as video data, the video data processing program, automatic composition data (parameters or song templates), song data of existing songs, and existing text and lyrics. In this embodiment, the video data processing program is stored on the hard disk drive (HDD) of the external storage device 7; the CPU 1 loads the program from the hard disk drive into the RAM 3 and executes it from the RAM 3 to perform the various kinds of processing.

[0028] FIG. 7 shows the structure of the automatic composition data in the embodiment. In this embodiment, one song template contains song structure data, pitch generation data, rhythm generation data, accompaniment generation data, and so on. The song structure data specifies the hierarchy of blocks, passages, and phrases and consists of parameters such as the numbers of blocks, passages, and phrases and the tempo. The pitch generation data consists of parameters such as data specifying the range of the generated melody pitches (pitch dynamics) and data specifying the first and last pitches within a measure. The rhythm generation data consists of parameters such as the presence and number of syncopations and the number of notes. The accompaniment generation data consists of parameters such as style data specifying an accompaniment pattern, chord progression data specifying chord changes by chord type and root, and tone color data.
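The four groups of parameters in a song template could be modelled roughly as nested records. This is a minimal sketch under stated assumptions: the class and field names, the use of MIDI note numbers for pitch, and the example values are all hypothetical; the text names only the kinds of data, not their encoding.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SongStructure:
    blocks: int
    passages_per_block: int
    phrases_per_passage: int
    tempo_bpm: int


@dataclass
class PitchData:
    low: int                                  # lowest allowed pitch (MIDI note number, assumed)
    high: int                                 # highest allowed pitch
    measure_start_end: List[Tuple[int, int]]  # (first, last) pitch of each measure


@dataclass
class RhythmData:
    syncopation_count: int                    # 0 means no syncopation
    notes_per_measure: int


@dataclass
class AccompanimentData:
    style: str                                # accompaniment pattern name
    chord_progression: List[Tuple[str, str]]  # (root, chord type) per change
    tone_color: str


@dataclass
class SongTemplate:
    structure: SongStructure
    pitch: PitchData
    rhythm: RhythmData
    accompaniment: AccompanimentData


# Example instance with invented values.
example = SongTemplate(
    structure=SongStructure(blocks=1, passages_per_block=2, phrases_per_passage=2, tempo_bpm=120),
    pitch=PitchData(low=55, high=79, measure_start_end=[(60, 67), (67, 60)]),
    rhythm=RhythmData(syncopation_count=0, notes_per_measure=4),
    accompaniment=AccompanimentData(
        style="pop", chord_progression=[("C", "maj"), ("G", "maj")], tone_color="piano"
    ),
)
```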

[0029] (First Example) FIG. 1 is a functional block diagram of the main part of the first example of the video data processing apparatus of the embodiment; each function is also a processing step realized by the CPU 1 executing the video data processing program. Function A designates and inputs, from among a plurality of items of video data, the video data to which a song is to be added. For example, the title (or file name) of each item of video data is shown on the display 4 so that the user can select one with a mouse click, or the user can enter the title (or file name) directly. It is also useful to run the video data processing program within video editing software or the like, in which case the file may be dragged and dropped onto a video sequence board or similar area where the editing software arranges selected videos in order.

[0030] Function B selects and extracts the designated video data from the video database C. The video data may be moving images or still images. Function B supplies the selected and extracted video data to function D and function F. Function D detects features of the video from the video data supplied by function B, and function E generates automatic composition data on the basis of the features detected by function D. Meanwhile, function F detects a keyword from the video data supplied by function B, and function G generates automatic composition data on the basis of the keyword detected by function F.

[0031] Function H then composes automatically on the basis of the automatic composition data generated by function E or function G. This automatic composition processing proceeds as follows on the basis of the automatic composition data. For example, a rhythm pattern is loaded from a database on the basis of the note count data and the syncopation presence/count data in the melody generation data. The pitches are then determined from the pitch generation data, such as the range (pitch dynamics) data and the data specifying the first and last pitches within a measure, and song data (a melody) is generated.
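The two steps described here, loading a rhythm pattern and then fixing pitches, can be sketched for a single measure. The text does not say how intermediate pitches are chosen, so the straight-line interpolation between the first and last pitch is an assumption made only for illustration, as are the `RHYTHM_PATTERNS` table, the `compose_measure` function, and the MIDI note numbers.

```python
from typing import Dict, List, Tuple

# Hypothetical rhythm "database": (note count, syncopated?) -> note durations in beats.
RHYTHM_PATTERNS: Dict[Tuple[int, bool], List[float]] = {
    (2, False): [2.0, 2.0],
    (4, False): [1.0, 1.0, 1.0, 1.0],
    (4, True): [0.5, 1.0, 1.5, 1.0],
}


def compose_measure(
    note_count: int,
    syncopated: bool,
    start_pitch: int,
    end_pitch: int,
    low: int,
    high: int,
) -> List[Tuple[int, float]]:
    """Return (pitch, duration) pairs for one measure.

    Step 1 loads the rhythm pattern for the requested note count and syncopation.
    Step 2 moves from start_pitch to end_pitch in equal steps (an assumed rule)
    and clamps every pitch to the allowed range [low, high].
    """
    durations = RHYTHM_PATTERNS[(note_count, syncopated)]
    steps = len(durations) - 1
    notes = []
    for i, duration in enumerate(durations):
        fraction = i / steps if steps else 0.0
        pitch = round(start_pitch + (end_pitch - start_pitch) * fraction)
        pitch = max(low, min(high, pitch))
        notes.append((pitch, duration))
    return notes
```

An actual composer would also apply musical rules when choosing the intermediate pitches; the interpolation only shows where the range and the first and last pitches enter the computation.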

[0032] Next, function I outputs the composed song data together with the corresponding video data. Output of the song data and video data means, for example, playing back the video on the display 4 while the tone generator 8 and the sound system 9 play the song, saving to a recording medium of the external storage device 7, or sending to another device via the communication interface 10 or the MIDI interface 11.

[0033] Of the generation of automatic composition data from features by functions D and E and the generation of automatic composition data from keywords by functions F and G, either functions F and G alone or both pairs may be used. When both are used, the output of functions D and E may be left unused depending on the type of automatic composition data parameter, or the average of the value obtained by functions D and E and the value obtained by functions F and G may be adopted as the parameter value. The user may also be allowed to choose between using both pairs and using only functions F and G.
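The per-parameter choice between the keyword value alone and the average of both sources might look like the following sketch. The `merge_parameters` function, the policy strings, and the parameter names are hypothetical; the keyword-derived value is treated as the default because the text always allows functions F and G to be used on their own.

```python
from typing import Dict


def merge_parameters(
    from_features: Dict[str, float],
    from_keyword: Dict[str, float],
    policy: Dict[str, str],
) -> Dict[str, float]:
    """Combine parameters from functions D/E (features) and F/G (keyword).

    policy maps a parameter name to "average" or "keyword". Parameters without
    a policy entry, or missing from the keyword side, keep the keyword-only rule.
    """
    merged = dict(from_keyword)
    for name, feature_value in from_features.items():
        if policy.get(name, "keyword") == "average" and name in from_keyword:
            merged[name] = (feature_value + from_keyword[name]) / 2
    return merged
```

For example, with a feature-derived tempo of 140 and a keyword-derived tempo of 100, a policy of `{"tempo_bpm": "average"}` gives a tempo of 120 while leaving every other parameter at its keyword-derived value.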

[0034] Here, the keyword of the video data is, for example, a title attached to the video data or, as described later, a keyword registered in the video data in advance after detecting the features of the video data and choosing a keyword that fits them. Automatic composition data parameters are associated with these keywords.

[0035] FIG. 4 shows a correspondence table that stores the relationship between video keywords and parameters. The table directly associates each keyword with the automatic composition data parameters to be generated. For example, the keyword "athletic meet" is associated with parameters giving a fast tempo and large pitch dynamics, the keyword "walk" with parameters giving a medium tempo and a medium number of notes, and the keyword "visiting a grave" with parameters giving low pitches and no syncopation. This correspondence table between keywords and automatic composition data parameters may also be made editable by the user, or the user may be allowed to select one of several tables.
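A keyword-to-parameter correspondence table of this kind is naturally a dictionary lookup. In the sketch below the three keywords and their qualitative tendencies (fast tempo and wide range, medium tempo and note count, low pitch without syncopation) follow the text, while the parameter names and every numeric value are assumptions chosen only to make the example concrete.

```python
from typing import Dict

# Keyword -> automatic composition parameters. Numbers are illustrative guesses.
KEYWORD_PARAMETERS: Dict[str, Dict[str, int]] = {
    "athletic meet": {"tempo_bpm": 150, "pitch_range_semitones": 24},
    "walk": {"tempo_bpm": 100, "notes_per_measure": 4},
    "visiting a grave": {"pitch_center": 48, "syncopation_count": 0},
}


def parameters_for(keyword: str, table: Dict[str, Dict[str, int]] = KEYWORD_PARAMETERS) -> Dict[str, int]:
    """Return a copy of the parameters for the keyword, or an empty dict if unknown."""
    return dict(table.get(keyword, {}))
```

Passing the table as an argument mirrors the option of letting the user edit the table or choose among several tables.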

[0036] Thus, if the title of the video is, for example, "Taro's Athletic Meet," the keyword "athletic meet" is extracted from it and the corresponding automatic composition data is obtained.
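Extracting a keyword from a title can be approximated by searching the title for any registered keyword. This is a sketch under assumptions: the text does not state the matching rule, so the case-insensitive substring search and the preference for the longest match in `extract_keyword` (a hypothetical function) are illustrative choices.

```python
from typing import Iterable, Optional


def extract_keyword(title: str, known_keywords: Iterable[str]) -> Optional[str]:
    """Return the longest registered keyword that occurs in the title, or None."""
    lowered = title.lower()
    matches = [keyword for keyword in known_keywords if keyword.lower() in lowered]
    return max(matches, key=len) if matches else None
```

Because the search works on substrings, it also applies to titles written without spaces between words, such as the original Japanese title 太郎の運動会 with the registered keyword 運動会.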

[0037] The features extracted from the video include, for example, its brightness (bright to dark), color tone (reddish to bluish), motion (fast to slow, for moving images), and complexity (simple to complex images). Specifically, the level of the luminance signal of the video (for example, the Y signal) is measured, and the average level over the whole video section is taken as the brightness data. The levels of the red and blue color signals are measured and averaged over the whole video section, and their ratio is taken as the color tone data, ranging from reddish to bluish. The correlation between successive images (between frames) is computed and averaged over the whole video section, and the average correlation value is taken as the motion data. The severity of the fluctuations in signal level within one image (one frame) is measured, the frequency with which this severity reaches or exceeds a certain level is counted, and that frequency is taken as the complexity data.
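The four measurements can be sketched in plain Python on a toy representation of video. Several points are assumptions rather than the patent's method: frames are flat lists of (R, G, B) pixels, luminance uses the ITU-R BT.601 weights, inter-frame correlation is a Pearson correlation of luminance, and "fluctuation severity" is taken as the luminance jump between neighbouring pixels with an arbitrary threshold. The function names are hypothetical, and a non-empty frame list is assumed.

```python
from typing import Dict, Sequence, Tuple

Pixel = Tuple[float, float, float]   # (R, G, B), each 0..255
Frame = Sequence[Pixel]              # pixels flattened in scan order


def luma(pixel: Pixel) -> float:
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b   # BT.601 luminance weights


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; flat frames count as identical (1.0) or unrelated (0.0)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 1.0 if list(a) == list(b) else 0.0
    return cov / (va * vb) ** 0.5


def extract_features(frames: Sequence[Frame], jump_threshold: float = 32.0) -> Dict[str, float]:
    lumas = [[luma(p) for p in frame] for frame in frames]
    brightness = mean([mean(l) for l in lumas])
    red = mean([mean([p[0] for p in frame]) for frame in frames])
    blue = mean([mean([p[2] for p in frame]) for frame in frames])
    pairs = list(zip(lumas, lumas[1:]))
    # High correlation between successive frames means little motion.
    frame_correlation = mean([correlation(a, b) for a, b in pairs]) if pairs else 1.0
    jumps = [abs(x - y) > jump_threshold for l in lumas for x, y in zip(l, l[1:])]
    return {
        "brightness": brightness,
        "red_blue_ratio": red / blue if blue else float("inf"),
        "frame_correlation": frame_correlation,
        "complexity": mean(jumps) if jumps else 0.0,
    }
```

Note that the motion measure here is a similarity: a value near 1 indicates slow motion, so a later table lookup would map low correlation to fast tempo.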

【００３８】これらの映像の特徴に応じたデータ（以
後、「特徴データ」という。）でテーブルを参照して自
動作曲用データのパラメータを得るが、どのようなテー
ブルを参照するかは図５の対応テーブルによって設定さ
れる。図５は映像の特徴と生成される自動作曲用データ
の特性との対応関係を示しており、例えば図示のように
「明るい」という特徴に対して「メジャーコードが多い
コード進行」という特性を自動作曲用データにもたせる
場合、前記明るさの特徴データに対してメジャーコード
を含有する度合のパラメータを出力するテーブルを参照
するように規定する。また、「動きが速い」という特徴
に対して「テンポを速くする」という特性を自動作曲用
データにもたせる場合、前記動きの特徴データに対して
テンポ値のパラメータを出力するテーブルを参照するよ
うに規定する。A table is referred to with data corresponding to these features of the video (hereinafter referred to as "feature data") to obtain the parameters of the automatic composition data; which table is referred to is set by the correspondence table of FIG. 5. FIG. 5 shows the correspondence between the features of the video and the characteristics of the generated automatic composition data. For example, as illustrated, when the feature "bright" is to give the automatic composition data the characteristic "chord progression with many major chords", the correspondence table specifies that a table outputting a parameter for the degree of major-chord content is referred to with the brightness feature data. Likewise, when the feature "fast motion" is to give the automatic composition data the characteristic "faster tempo", it specifies that a table outputting a tempo-value parameter is referred to with the motion feature data.

【００３９】このように、対応テーブルによって設定さ
れた状態により、特徴データに応じたテーブルを参照し
て自動作曲用データのパラメータを得る。これにより、
例えば、明るさの特徴データのレベルが大きいほどメジ
ャーコードの含み具合が多くなり、赤系の色が多いほど
ピッチが高くなり、青系の色が多いほどピッチが低くな
る。また、映像の動きが速いほどテンポが速くなる。さ
らに、画像が複雑になるほど非コード音（非和音構成
音）を多く含むようになる。なお、この映像の特徴と生
成される曲の特性との対応を規定する対応テーブルは、
ユーザが編集したり、複数のテーブルの中からいずれか
を選択できるようにしてもよい。In this way, the parameters of the automatic composition data are obtained by referring to the table corresponding to each item of feature data, according to the settings in the correspondence table. As a result, for example, the higher the level of the brightness feature data, the greater the proportion of major chords; the more reddish the colors, the higher the pitch; and the more bluish the colors, the lower the pitch. Also, the faster the motion of the video, the faster the tempo. Furthermore, the more complex the image, the more non-chord tones (tones that are not chord constituents) are included. The correspondence table that defines the correspondence between the features of the video and the characteristics of the generated song may be editable by the user, or may be selectable from among a plurality of tables.
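
The two-level lookup described above (a correspondence table that names which parameter table each feature drives, and the parameter tables themselves) might be sketched as follows. All table contents, parameter names, and breakpoints are hypothetical, and piecewise-linear interpolation is an assumption; the source states only that a table outputs a parameter for given feature data.

```python
def interpolate(table, x):
    """Piecewise-linear lookup in a sorted list of (input, output) pairs, clamped at both ends."""
    if x <= table[0][0]:
        return table[0][1]
    for (x0, y0), (x1, y1) in zip(table, table[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return table[-1][1]


# Hypothetical parameter tables (all numbers are illustrative).
TABLES = {
    "major_chord_ratio": [(0, 0.2), (255, 0.9)],        # brighter -> more major chords
    "pitch_offset": [(0.5, -12), (1.0, 0), (2.0, 12)],  # red/blue ratio: redder -> higher pitch
    "tempo_bpm": [(0.0, 160), (1.0, 60)],               # low inter-frame correlation -> fast tempo
    "non_chord_ratio": [(0.0, 0.0), (1.0, 0.5)],        # more complex -> more non-chord tones
}

# Correspondence table in the sense of FIG. 5: feature -> parameter table to refer to.
CORRESPONDENCE = {
    "brightness": "major_chord_ratio",
    "hue_ratio": "pitch_offset",
    "motion": "tempo_bpm",
    "complexity": "non_chord_ratio",
}


def parameters_from_features(features, correspondence=CORRESPONDENCE, tables=TABLES):
    """Look up one composition parameter for every feature that is present."""
    return {
        param: interpolate(tables[param], features[feature])
        for feature, param in correspondence.items()
        if feature in features
    }
```

Because the correspondence is held in data rather than in code, a user-edited or alternative table can be passed in through the `correspondence` argument.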

【００４０】図２は実施形態の映像データ処理装置にお
けるキーワード登録の機能を示す要部機能ブロック図で
あり、各機能はＣＰＵ１が映像データ処理プログラムを
実行することにより実現される処理ステップでもある。
機能Ａ′は、図１の機能Ａ，Ｂ，Ｃと同様な機能であ
り、映像データベースＣの複数ある映像データの中から
曲を映像データを指定して、映像データを機能Ｄへと供
給する。機能Ｄは図１の機能Ｄと同様な機能であり、機
能Ａ′により供給された映像データから前記同様に映像
の特徴を検出する。機能Ｊは、機能Ｄで検出した特徴に
基づいて、特徴−キーワードデータベース（外部記憶装
置７の例えばハードディスク）から該特徴に対応するキ
ーワードを検索し、検索したキーワードを機能Ｋに供給
する。機能Ｋは、上記映像データと検索されたキーワー
ドを１つのセットのデータとして、前記映像データベー
スＣ（例えば外部記憶装置７のハードディスク）に記憶
する。すなわち映像データにキーワードを登録する。以
上の処理により、映像データの特徴に合ったキーワード
がその映像データに登録される。FIG. 2 is a functional block diagram of the main part showing the keyword registration function of the video data processing device of the embodiment; each function is also a processing step realized by the CPU 1 executing the video data processing program. Function A′ is similar to functions A, B, and C in FIG. 1: it specifies video data from among the plural items of video data in the video database C and supplies that video data to function D. Function D is the same as function D in FIG. 1 and detects the features of the video from the video data supplied by function A′ in the same manner as described above. Function J searches the feature-keyword database (for example, on the hard disk of the external storage device 7) for a keyword corresponding to the features detected by function D, and supplies the retrieved keyword to function K. Function K stores the video data and the retrieved keyword as one set of data in the video database C (for example, on the hard disk of the external storage device 7). That is, the keyword is registered in the video data. Through the above processing, a keyword matching the features of the video data is registered in that video data.

【００４１】なお、このような特徴の抽出は、色合いや
複雑さ等の抽象的な特徴を抽出するものに限らず、より
具体的な映像の特徴を抽出してもよい。例えば、予め典
型的な映像パターン（風景、建物、人物、動物、植物、
乗り物等や、文字）を多数記憶しておき、これら映像パ
ターンと供給された映像データとを比較し、類似度の高
い映像パターンを求めて、これを映像の特徴として抽出
してもよい。さらに具体的には、例えば、映像の特徴と
して、映像から多数の人物が抽出され、さらに各人物の
動きが激しいことが検出された場合に、「運動会」とい
うキーワードを対応付けることもできる。Such feature extraction is not limited to extracting abstract features such as hue and complexity; more concrete features of the video may also be extracted. For example, a large number of typical video patterns (landscapes, buildings, people, animals, plants, vehicles, and so on, as well as text characters) may be stored in advance and compared with the supplied video data, and a video pattern with a high degree of similarity may be found and extracted as a feature of the video. More specifically, for example, when a large number of persons are extracted from the video and it is further detected that each person is moving vigorously, the keyword "athletic meet" can be associated with the video.
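
The keyword registration performed by functions J and K can be sketched as below. This is only an illustration under assumptions: the feature names `person_count` and `motion_speed`, the thresholds, the rule for "walk", and the `VideoRecord` structure are all hypothetical; only the association of many vigorously moving persons with "athletic meet" comes from the description.

```python
from dataclasses import dataclass, field


@dataclass
class VideoRecord:
    """One entry of the video database: video features plus registered keywords."""
    title: str
    features: dict
    keywords: list = field(default_factory=list)


# Hypothetical feature-keyword database: (predicate on feature data, keyword).
FEATURE_KEYWORD_DB = [
    (lambda f: f.get("person_count", 0) >= 10 and f.get("motion_speed", 0.0) >= 0.7,
     "athletic meet"),
    (lambda f: 1 <= f.get("person_count", 0) <= 3 and f.get("motion_speed", 1.0) < 0.3,
     "walk"),  # illustrative rule, not from the source
]


def search_keywords(features, db=FEATURE_KEYWORD_DB):
    """Function J: return every keyword whose rule matches the detected features."""
    return [keyword for matches, keyword in db if matches(features)]


def register_keywords(record, db=FEATURE_KEYWORD_DB):
    """Function K: store the retrieved keywords together with the video data."""
    for keyword in search_keywords(record.features, db):
        if keyword not in record.keywords:
            record.keywords.append(keyword)
    return record
```

A record registered in this way can later be handed to keyword extraction without re-analyzing the video.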

【００４２】（第２実施例）前記第１実施例では映像に
登録されているキーワードを自動作曲に利用するように
しているが、次の第２実施例のように、このキーワード
は自動作曲以外にも利用することができる。図３は実施
形態の映像データ処理装置における第２実施例の要部機
能ブロック図であり、各機能はＣＰＵ１が映像データ処
理プログラムを実行することにより実現される処理ステ
ップでもある。機能Ａ，Ｂ，Ｃ，Ｆは第１実施例（図
１）と同様であり、複数の映像データの中から指定され
た映像データが供給され、その映像データからキーワー
ドが抽出される。(Second Embodiment) In the first embodiment, the keyword registered in the video is used for automatic composition; however, as in the following second embodiment, this keyword can also be used for purposes other than automatic composition. FIG. 3 is a functional block diagram of the main part of the second example in the video data processing device of the embodiment; each function is also a processing step realized by the CPU 1 executing the video data processing program. Functions A, B, C, and F are the same as in the first embodiment (FIG. 1): the specified video data is supplied from among a plurality of items of video data, and a keyword is extracted from that video data.

【００４３】この抽出されたキーワードを既存曲の選択
に利用することができる。予め多数の既存曲をキーワー
ドと対応付けて外部記憶装置７の例えばハードディスク
に曲データベースとして記憶しておき、機能Ｌは、機能
Ｆで抽出されたキーワードに基づいて曲データベースか
ら該キーワードに対応する既存曲の曲データを選択し、
機能Ｍに供給する。なお、曲データベースの既存曲とキ
ーワードの対応関係は１対１に限らず１対多数、あるい
は多数対多数の関係でもよい。そして、機能Ｍは、機能
Ｌで選択された既存曲に対して、その曲の長さを映像の
長さに合うように微調整し、機能Ｎで曲を映像とともに
出力する。The extracted keyword can be used to select an existing song. A large number of existing songs are stored in advance, in association with keywords, as a song database on, for example, the hard disk of the external storage device 7. Based on the keyword extracted by function F, function L selects the song data of the existing song corresponding to that keyword from the song database and supplies it to function M. The correspondence between existing songs in the song database and keywords is not limited to one-to-one; it may be one-to-many or many-to-many. Function M then finely adjusts the length of the existing song selected by function L so that it matches the length of the video, and function N outputs the song together with the video.
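
Functions L and M might be sketched as follows. The song titles and durations are invented for illustration, choosing the candidate closest in length to the video is an assumed tie-breaking rule for the one-to-many case, and adjusting length by a bounded playback-rate factor is likewise an assumption; the description says only that the length is finely adjusted.

```python
# Hypothetical song database: keyword -> list of (song title, duration in seconds).
SONG_DB = {
    "athletic meet": [("March A", 95.0), ("Fanfare B", 120.0)],
    "walk": [("Stroll C", 180.0)],
}


def select_song(keyword, video_seconds, db=SONG_DB):
    """Function L: pick the song for the keyword whose length is closest to the video."""
    candidates = db.get(keyword, [])
    if not candidates:
        return None
    return min(candidates, key=lambda song: abs(song[1] - video_seconds))


def tempo_scale_to_fit(song_seconds, video_seconds, max_change=0.1):
    """Function M: playback-rate factor that fits the song to the video.

    A factor r makes the song last song_seconds / r, so the ideal factor is
    song_seconds / video_seconds; it is clamped to 1 +/- max_change so that
    the adjustment stays fine.
    """
    ideal = song_seconds / video_seconds
    return min(1.0 + max_change, max(1.0 - max_change, ideal))
```

When the clamped factor cannot close the gap, a real implementation would need another measure such as trimming or repeating a section, which is outside this sketch.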

【００４４】また、機能Ｆで抽出されたキーワードを、
自動作詞に利用することができる。予め多数の作詞用テ
ンプレートをキーワードと対応付けて外部記憶装置７の
例えばハードディスクに作詞用テンプレートデータベー
スとして記憶しておき、機能Ｏは、機能Ｆで抽出された
キーワードに基づいて作詞用テンプレートデータベース
から該キーワードに対応する作詞用テンプレートを選択
する。キーワードと作詞用テンプレートとの対応関係は
１対１、１対多数、多数対多数などでもよい。作詞用テ
ンプレートとしては、例えば穴あきテンプレートがあ
る。この穴あきテンプレートは、所定の歌詞の一部が空
白になっており、ユーザが自由に言葉を埋めたり、所定
の選択肢の中から言葉を選択したりすることで歌詞を完
成させるためのテンプレートである。機能Ｏは、選択さ
れた作詞用テンプレートに基づいて自動作詞（例えば穴
あきテンプレートの空白部分に言葉を埋める処理等）
し、自動作詞された詞のデータを機能Ｐに供給する。そ
して、機能Ｐで、自動作詞された詞のデータに対して、
その詞の長さを映像の長さに合うように微調整し、機能
Ｑで詞や文章を映像とともに出力する。The keyword extracted by function F can also be used for automatic lyric writing. A large number of lyric-writing templates are stored in advance, in association with keywords, as a lyric template database on, for example, the hard disk of the external storage device 7, and function O selects the lyric-writing template corresponding to the keyword extracted by function F from the lyric template database. The correspondence between keywords and lyric-writing templates may be one-to-one, one-to-many, many-to-many, and so on. One example of a lyric-writing template is a fill-in-the-blank template. In a fill-in-the-blank template, parts of predetermined lyrics are left blank, and the lyrics are completed by the user freely filling in words or selecting words from predetermined options. Function O automatically writes lyrics based on the selected template (for example, by filling words into the blank portions of a fill-in-the-blank template) and supplies the data of the automatically written lyrics to function P. Function P then finely adjusts the length of the automatically written lyrics so that it matches the length of the video, and function Q outputs the lyrics or sentences together with the video.
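
A fill-in-the-blank template of the kind described above can be illustrated with the following sketch. The template text, slot names, and word choices are invented for illustration, and defaulting each blank to the first predetermined option is an assumed policy for the automatic case.

```python
# Hypothetical lyric template database: keyword -> template text with named blanks
# and the predetermined options for each blank.
TEMPLATE_DB = {
    "athletic meet": {
        "text": "Run, {name}, run under the {sky} sky",
        "choices": {"name": ["Taro", "everyone"], "sky": ["blue", "autumn"]},
    },
}


def write_lyrics(keyword, user_words=None, db=TEMPLATE_DB):
    """Function O: select the template for the keyword and fill in its blanks.

    Words supplied by the user take priority; any remaining blank is filled
    with the first predetermined option.
    """
    template = db.get(keyword)
    if template is None:
        return None
    user_words = user_words or {}
    filled = {
        slot: user_words.get(slot, options[0])
        for slot, options in template["choices"].items()
    }
    return template["text"].format(**filled)


if __name__ == "__main__":
    print(write_lyrics("athletic meet", {"sky": "autumn"}))
```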

【００４５】さらに、機能Ｆで抽出されたキーワード
を、既存の文章や詞の選択に利用することができる。予
め多数の既存の文章や詞をキーワードと対応付けて外部
記憶装置７の例えばハードディスクに既存文章・詞デー
タベースとして記憶しておき、機能Ｒは、機能Ｆで抽出
されたキーワードに基づいて既存文章・詞データベース
から該キーワードに対応する既存の文章・詞を選択す
る。キーワードと既存文章あるいは詞との対応関係は１
対１、１対多数、多数対多数などでもよい。そして、機
能Ｒで選択された文章あるいは詞は機能Ｐに供給され、
前記同様に機能Ｐで詞の長さを映像の長さに合うように
微調整し、機能Ｑで詞や文章を映像とともに出力する。The keyword extracted by function F can further be used to select existing sentences or lyrics. A large number of existing sentences and lyrics are stored in advance, in association with keywords, as an existing sentence/lyric database on, for example, the hard disk of the external storage device 7, and function R selects the existing sentence or lyrics corresponding to the keyword extracted by function F from that database. The correspondence between keywords and existing sentences or lyrics may be one-to-one, one-to-many, many-to-many, and so on. The sentence or lyrics selected by function R are supplied to function P; as described above, function P finely adjusts their length so that it matches the length of the video, and function Q outputs the lyrics or sentences together with the video.

【００４６】なお、作詞された詞や、選択した既存の文
章や詞の出力形態としては、例えば詞をテロップとして
映像に重ねて表示したり、詞に基づいて音声合成して読
み上げ、映像と共に再生してもよい。The written lyrics, or the selected existing sentences or lyrics, may be output, for example, by superimposing the lyrics on the video as captions (telop), or by synthesizing speech from the lyrics, reading them aloud, and playing this back together with the video.

【００４７】また、実施形態の映像データ処理装置とし
ては、上記の「既存曲選択」「自動作詞」「既存文章・
詞選択」の機能のうち少なくとも１つの機能を備えてい
ればよい。複数を有する場合はいずれかを選択実行でき
るようにしたり、複数並列に実行する、例えば既存曲を
選択するとともに自動作詞するなどしたりしてもよい。
この場合「自動作曲」も選択実行や複数並列実行の対象
にしてもよい。The video data processing device of the embodiment need only have at least one of the above functions of "existing song selection", "automatic lyric writing", and "existing sentence/lyric selection". When it has more than one, any one of them may be selected and executed, or several may be executed in parallel, for example selecting an existing song while also automatically writing lyrics. In this case, "automatic composition" may also be made a target of selective execution or parallel execution.

【００４８】以上の実施形態では、特徴データを映像全
体について平均したものとして得るようにしたが、１つ
の映像データ中に場面に応じた区間情報を登録してお
き、その区間ごとに特徴データを抽出して、区間毎の自
動作曲用データで曲を生成するようにしてもよい。な
お、この区間情報はユーザが登録してもよいし、区間を
自動検出して区間情報を登録するようにしてもよい。ま
た、１つの映像データに、場面に応じて区間毎にキーワ
ードを登録しておき、この区間毎にキーワードを抽出し
て、区間毎の自動作曲用データで曲を生成するようにし
てもよい。In the above embodiment, the feature data is obtained as an average over the entire video; however, section information corresponding to scenes may be registered in one item of video data, feature data may be extracted for each section, and a song may be generated with automatic composition data for each section. The section information may be registered by the user, or sections may be detected automatically and the section information registered. Alternatively, keywords may be registered in one item of video data for each section according to the scene, a keyword may be extracted for each section, and a song may be generated with automatic composition data for each section.
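
The per-section variant can be sketched as below, assuming that section information is a list of (start, end, keyword) tuples in seconds and that a default parameter set is used when a section has no registered keyword. Both the data layout and the default are hypothetical.

```python
# Hypothetical fallback used when a section has no registered keyword.
DEFAULT_PARAMS = {"tempo": "medium"}


def composition_data_per_section(sections, table, default=DEFAULT_PARAMS):
    """Build one set of automatic composition data per video section.

    sections: list of (start_seconds, end_seconds, keyword or None)
    table:    keyword -> composition parameters
    """
    plan = []
    for start, end, keyword in sections:
        if end <= start:
            raise ValueError("a section must have positive length")
        plan.append({"start": start, "end": end,
                     "params": table.get(keyword, default)})
    return plan


if __name__ == "__main__":
    table = {"athletic meet": {"tempo": "fast"}}
    sections = [(0.0, 30.0, "athletic meet"), (30.0, 45.0, None)]
    for entry in composition_data_per_section(sections, table):
        print(entry)
```

The same loop applies when feature data rather than keywords is extracted per section; only the lookup inside the loop changes.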

【００４９】実施形態では、自動作曲用データとしてパ
ラメータを生成する例について説明したが、自動作曲用
データはパラメータのセットである曲テンプレートであ
ってもよい。例えば、特徴が「明るい」の場合は「テン
プレート１番」として対応付けて選択したり、キーワー
ドが「運動会」の場合は「テンプレート１０番」として
対応付けて選択する。In the embodiment, an example in which parameters are generated as the automatic composition data has been described, but the automatic composition data may instead be a song template, which is a set of parameters. For example, when the feature is "bright", "template No. 1" is associated and selected, and when the keyword is "athletic meet", "template No. 10" is associated and selected.

【００５０】また、映像の生成されたパラメータや曲テ
ンプレートを、ユーザが編集できるようにしてもよい。The user may also be allowed to edit the parameters or song template generated for the video.

【００５１】実施形態では映像を選択指定すると直ぐに
曲や歌詞等を生成するようにしているが、一度選択した
映像を確認した後で、自動作曲や作詞の指示を与えるこ
とで曲や歌詞等を生成するようにしてもよい。また、生
成された曲が気に入らない場合など、パラメータや曲テ
ンプレートの編集や別のものを選択する選択肢を設け、
パラメータや曲テンプレートの一部を変更して曲を再度
生成するようにしてもよい。In the embodiment, a song, lyrics, or the like is generated as soon as a video is selected and specified; however, the song, lyrics, or the like may instead be generated when an instruction for automatic composition or lyric writing is given after the selected video has been checked. In addition, for cases such as when the user does not like the generated song, options may be provided for editing the parameters or song template or for selecting a different one, so that the song is generated again with part of the parameters or song template changed.

【００５２】また、実施形態では、映像データ処理装置
をパーソナルコンピュータとソフトウエアで構成した例
について説明したが、本発明は電子楽器に適用すること
もできる。この場合、電子楽器は、音源装置、シーケン
サ、エフェクタなどそれぞれが別体の装置であって、Ｍ
ＩＤＩあるいは各種ネットワーク等の通信手段を用いて
各装置を接続するようなものであってもよい。In the embodiment, an example in which the video data processing device is configured from a personal computer and software has been described, but the present invention can also be applied to an electronic musical instrument. In this case, the electronic musical instrument may be one in which the tone generator, sequencer, effector, and so on are separate devices that are connected to one another using communication means such as MIDI or various networks.

【００５３】また、サーバと端末からなるシステムにお
いて、サーバ側に映像データのデータベースを備え、端
末で映像データを指定してサーバへ指定情報を送信し、
サーバにおいて指定された映像データに基づいて自動作
曲あるいは作詞をし、この映像データと作曲された曲あ
るいは歌詞等を端末に送信するようにしてもよい。ま
た、端末から映像データをサーバに送信し、サーバにお
いて、送信された映像データに基づいて自動作曲あるい
は作詞をし、この作曲された曲あるいは歌詞等を端末に
送信するようにしてもよい。さらに、このような場合、
作曲された曲を試聴できるようにするとよい。In a system consisting of a server and a terminal, a database of video data may be provided on the server side; the terminal specifies video data and transmits the specification information to the server, the server performs automatic composition or lyric writing based on the specified video data, and the video data and the composed song, lyrics, or the like are transmitted to the terminal. Alternatively, the terminal may transmit video data to the server, the server may perform automatic composition or lyric writing based on the transmitted video data, and the composed song, lyrics, or the like may be transmitted to the terminal. In such cases, it is preferable that the composed song can be previewed.

【００５４】[0054]

【発明の効果】本発明の請求項１の映像データ処理装置
または請求項７の映像データ処理プログラムの実行によ
れば、映像の内容に深く関係するキーワードに応じた自
動作曲用データで自動作曲するようにしたので、映像デ
ータに合った曲を作曲できる。According to the video data processing device of claim 1 of the present invention, or execution of the video data processing program of claim 7, automatic composition is performed with automatic composition data corresponding to a keyword closely related to the content of the video, so a song that matches the video data can be composed.

【００５５】本発明の請求項２の映像データ処理装置ま
たは請求項８の映像データ処理プログラムの実行によれ
ば、前記キーワードと映像データの特徴とに応じた自動
作曲用データで自動作曲するようにしたので、さらに映
像データに合った曲を作曲できる。According to the video data processing device of claim 2 of the present invention, or execution of the video data processing program of claim 8, automatic composition is performed with automatic composition data corresponding to both the keyword and the features of the video data, so a song that matches the video data even better can be composed.

【００５６】本発明の請求項３の映像データ処理装置ま
たは請求項９の映像データ処理プログラムの実行によれ
ば、映像の内容に深く関係するキーワードに既存曲を対
応付けておき、キーワードによって曲を選択するだけで
よいので、簡単な処理で映像に合った曲を得ることがで
きる。According to the video data processing device of claim 3 of the present invention, or execution of the video data processing program of claim 9, existing songs are associated with keywords closely related to the content of the video and a song need only be selected by keyword, so a song that matches the video can be obtained with simple processing.

【００５７】本発明の請求項４の映像データ処理装置ま
たは請求項１０の映像データ処理プログラムの実行によ
れば、映像の内容に深く関係するキーワードに応じた詞
が自動生成されるので、このキーワードの元となる映像
データに合った歌詞等も作詞できる。According to the video data processing device of claim 4 of the present invention, or execution of the video data processing program of claim 10, lyrics corresponding to a keyword closely related to the content of the video are generated automatically, so lyrics and the like that match the video data from which the keyword originates can also be written.

【００５８】本発明の請求項５の映像データ処理装置ま
たは請求項１１の映像データ処理プログラムの実行によ
れば、映像の内容に深く関係するキーワードに既存文章
または既存詞を対応付けておき、キーワードによって既
存文章または既存詞を選択するだけでよいので、簡単な
処理で映像に合った文章あるいは詞を得ることができ
る。According to the video data processing device of claim 5 of the present invention, or execution of the video data processing program of claim 11, existing sentences or existing lyrics are associated with keywords closely related to the content of the video and an existing sentence or existing lyrics need only be selected by keyword, so sentences or lyrics that match the video can be obtained with simple processing.

【００５９】本発明の請求項６の映像データ処理装置ま
たは請求項１２の映像データ処理プログラムの実行によ
れば、映像データを、前記各請求項の映像データ処理装
置による自動作曲、作詞等に適した映像データとなるよ
うに自動処理することができる。According to the video data processing device of claim 6 of the present invention, or execution of the video data processing program of claim 12, video data can be automatically processed so that it becomes video data suitable for automatic composition, lyric writing, and the like by the video data processing devices of the preceding claims.

【図面の簡単な説明】[Brief description of drawings]

【図１】本発明の実施形態の映像データ処理装置におけ
る第１実施例の要部機能ブロック図である。FIG. 1 is a functional block diagram of a main part of a first example in a video data processing device according to an embodiment of the present invention.

【図２】本発明の実施形態の映像データ処理装置におけ
るキーワード登録の機能を示す要部機能ブロック図であ
る。FIG. 2 is a functional block diagram of a main part showing a keyword registration function in the video data processing device according to the embodiment of the present invention.

【図３】本発明の実施形態の映像データ処理装置におけ
る第２実施例の要部機能ブロック図である。FIG. 3 is a functional block diagram of a main part of a second example of the video data processing apparatus according to the embodiment of the present invention.

【図４】本発明の実施形態における映像のキーワードと
パラメータの特性との対応関係を記憶している対応テー
ブルを示す図である。FIG. 4 is a diagram showing a correspondence table that stores a correspondence relationship between video keywords and parameter characteristics according to the embodiment of the present invention.

【図５】本発明の実施形態における映像の特徴と生成さ
れる自動作曲用データの特性との対応関係を記憶してい
る対応テーブルを示す図である。FIG. 5 is a diagram showing a correspondence table storing a correspondence relation between video features and characteristics of generated automatic music data according to the embodiment of the present invention.

【図６】本発明の実施形態の映像データ処理装置のブロ
ック図である。FIG. 6 is a block diagram of a video data processing device according to an embodiment of the present invention.

【図７】本発明の実施形態における自動作曲用データの
構造を示す図である。FIG. 7 is a diagram showing a structure of automatic music composition data according to the embodiment of the present invention.

【符号の説明】[Explanation of symbols]

１…ＣＰＵ、２…ＲＯＭ、３…ＲＡＭ、４…ディスプレ
イ、７…外部記憶装置1 ... CPU, 2 ... ROM, 3 ... RAM, 4 ... Display, 7 ... External storage device

フロントページの続き Ｆターム(参考） 5B009 ME21 MF06 ND02 5C022 CA00 5C026 DA00 5D378 PP01 5L096 AA02 AA06 FA32 FA34 FA35 GA41 HA02 Continued from front page: F-term (reference) 5B009 ME21 MF06 ND02 5C022 CA00 5C026 DA00 5D378 PP01 5L096 AA02 AA06 FA32 FA34 FA35 GA41 HA02

Claims

【特許請求の範囲】[Claims]

【請求項１】映像データを供給する映像データ供給手
段と、該映像データ供給手段で供給された映像データからキー
ワードを抽出するキーワード抽出手段と、該キーワード抽出手段で抽出したキーワードに応じた自
動作曲用データを生成する自動作曲用データ生成手段
と、該自動作曲用データ生成手段で生成された自動作曲用デ
ータに基づいて曲を生成する曲生成手段と、を備えたこ
とを特徴とする映像データ処理装置。1. A video data processing device comprising: video data supply means for supplying video data; keyword extraction means for extracting a keyword from the video data supplied by the video data supply means; automatic composition data generation means for generating automatic composition data corresponding to the keyword extracted by the keyword extraction means; and song generation means for generating a song based on the automatic composition data generated by the automatic composition data generation means.

【請求項２】請求項１記載の映像データ処理装置であ
って、前記映像データ供給手段で供給された映像データ
の特徴を抽出する特徴抽出手段を備え、前記自動作曲用データ生成手段が、前記キーワード抽出
手段で抽出したキーワードと前記特徴抽出手段で抽出し
た映像データの特徴とに応じた自動作曲用データを生成
することを特徴とする請求項１記載の映像データ処理装
置。2. The video data processing device according to claim 1, further comprising feature extraction means for extracting a feature of the video data supplied by the video data supply means, wherein the automatic composition data generation means generates automatic composition data corresponding to the keyword extracted by the keyword extraction means and the feature of the video data extracted by the feature extraction means.

【請求項３】映像データを供給する映像データ供給手
段と、該映像データ供給手段で供給された映像データからキー
ワードを抽出するキーワード抽出手段と、該キーワード抽出手段で抽出したキーワードに基づい
て、予め決められた既存曲を選択する選択手段と、を備
えたことを特徴とする映像データ処理装置。3. A video data processing device comprising: video data supply means for supplying video data; keyword extraction means for extracting a keyword from the video data supplied by the video data supply means; and selection means for selecting a predetermined existing song based on the keyword extracted by the keyword extraction means.

【請求項４】映像データを供給する映像データ供給手
段と、該映像データ供給手段で供給された映像データからキー
ワードを抽出するキーワード抽出手段と、該キーワード抽出手段で抽出したキーワードに基づい
て、詞を自動生成する作詞手段と、を備えたことを特徴
とする映像データ処理装置。4. A video data processing device comprising: video data supply means for supplying video data; keyword extraction means for extracting a keyword from the video data supplied by the video data supply means; and lyric writing means for automatically generating lyrics based on the keyword extracted by the keyword extraction means.

【請求項５】映像データを供給する映像データ供給手
段と、該映像データ供給手段で供給された映像データからキー
ワードを抽出するキーワード抽出手段と、該キーワード抽出手段で抽出したキーワードに基づい
て、予め決められた既存文章または既存詞を選択する選
択手段と、を備えたことを特徴とする映像データ処理装
置。5. A video data processing device comprising: video data supply means for supplying video data; keyword extraction means for extracting a keyword from the video data supplied by the video data supply means; and selection means for selecting a predetermined existing sentence or existing lyrics based on the keyword extracted by the keyword extraction means.

【請求項６】映像データを供給する映像データ供給手
段と、該映像データ供給手段で供給された映像データの特徴を
抽出する特徴抽出手段と、映像データの特徴に対応するキーワードを記憶するキー
ワード記憶手段と、を備え、前記特徴抽出手段で抽出した映像データの特徴に応じた
キーワードを前記キーワード記憶手段から読み出し、該
キーワードを該映像データに登録することを特徴とする
映像データ処理装置。6. A video data processing device comprising: video data supply means for supplying video data; feature extraction means for extracting a feature of the video data supplied by the video data supply means; and keyword storage means for storing keywords corresponding to features of video data, wherein a keyword corresponding to the feature of the video data extracted by the feature extraction means is read from the keyword storage means and the keyword is registered in the video data.

【請求項７】映像データを供給するステップと、供給された映像データからキーワードを抽出するステッ
プと、抽出したキーワードに応じた自動作曲用データを生成す
るステップと、生成された自動作曲用データに基づいて曲を生成するス
テップと、をコンピュータが実行するための映像データ
処理プログラム。7. A video data processing program for causing a computer to execute: a step of supplying video data; a step of extracting a keyword from the supplied video data; a step of generating automatic composition data corresponding to the extracted keyword; and a step of generating a song based on the generated automatic composition data.

【請求項８】請求項７記載の映像データ処理プログラ
ムであって、前記供給された映像データの特徴を抽出す
るステップを備え、前記曲を生成するステップが、前記
抽出したキーワードと前記抽出した映像データの特徴と
に応じた自動作曲用データを生成することを特徴とする
請求項７記載の映像データ処理プログラム。8. The video data processing program according to claim 7, further comprising a step of extracting a feature of the supplied video data, wherein the step of generating the song generates automatic composition data corresponding to the extracted keyword and the extracted feature of the video data.

【請求項９】映像データを供給するステップと、供給された映像データからキーワードを抽出するステッ
プと、抽出したキーワードに基づいて、予め決められた既存曲
を選択するステップと、をコンピュータが実行するため
の映像データ処理プログラム。9. A video data processing program for causing a computer to execute: a step of supplying video data; a step of extracting a keyword from the supplied video data; and a step of selecting a predetermined existing song based on the extracted keyword.

【請求項１０】映像データを供給するステップと、供給された映像データからキーワードを抽出するステッ
プと、抽出したキーワードに基づいて、詞を自動生成するステ
ップと、をコンピュータが実行するための映像データ処
理プログラム。10. A video data processing program for causing a computer to execute: a step of supplying video data; a step of extracting a keyword from the supplied video data; and a step of automatically generating lyrics based on the extracted keyword.

【請求項１１】映像データを供給するステップと、供給された映像データからキーワードを抽出するステッ
プと、抽出したキーワードに基づいて、予め決められた既存文
章または既存詞を選択するステップと、をコンピュータ
が実行するための映像データ処理プログラム。11. A video data processing program for causing a computer to execute: a step of supplying video data; a step of extracting a keyword from the supplied video data; and a step of selecting a predetermined existing sentence or existing lyrics based on the extracted keyword.

【請求項１２】映像データを供給するステップと、供給された映像データの特徴を抽出するステップと、抽出した映像データの特徴に応じたキーワードを、映像
データ対応する予め設定されたキーワードを記憶するキ
ーワード記憶手段から読み出すステップと、読み出したキーワードを該映像データに登録するステッ
プと、をコンピュータが実行するための映像データ処理
プログラム。12. A video data processing program for causing a computer to execute: a step of supplying video data; a step of extracting a feature of the supplied video data; a step of reading a keyword corresponding to the extracted feature of the video data from keyword storage means that stores preset keywords corresponding to video data; and a step of registering the read keyword in the video data.