JP4877177B2 - Karaoke device with scoring function

Info

Publication number: JP4877177B2
Application number: JP2007251333A
Authority: JP
Inventors: 哲也水谷
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2007-09-27
Filing date: 2007-09-27
Publication date: 2012-02-15
Anticipated expiration: 2027-09-27
Also published as: JP2009080429A

Description

The present invention relates to a karaoke apparatus with a scoring function capable of highly accurate scoring.

Karaoke apparatuses with a scoring function are widely known. Such an apparatus analyzes the singer's voice, input from a microphone or the like during a karaoke performance, evaluates the feature values obtained by the analysis using a predetermined method, converts the evaluation into a number, and reports that number to the singer as the scoring result. The feature value generally used for scoring is pitch information, that is, the frequency of the pitch extracted from the singing voice. Patent Document 1 below discloses scoring a duet song separately for the male and female parts. Patent Document 2 below discloses determining the gender of an input voice from the difference in frequency characteristics between male and female voices.
Cited patent documents: Japanese Patent Laid-Open No. 11-282478; Japanese Patent Laid-Open No. 2001-56699

When the singing voice input during a karaoke performance is analyzed (scored), acoustic signals other than the singing voice should be removed in order to obtain an accurate score. However, the singer's microphone picks up not only the singing voice but also part of the accompaniment output from the sound reproduction equipment during the performance.

Attempts have therefore been made to remove the accompaniment by filtering, for example with a frequency band filter. To remove only the accompaniment and not the singer's voice, however, the filter must be chosen so that it does not affect the input singing voice.

Male and female singing voices occupy different frequency bands (in general, a male voice lies in a lower band than a female voice), so applying the same filter to both is undesirable from the standpoint of obtaining an accurate analysis.

This point is explained using the frequency distributions of drum sounds, female voices, and male voices. In the figures described below, the horizontal axis is logarithmic frequency and the vertical axis is level in decibels. FIGS. 1 to 3 show examples of the frequency distribution of a drum sound, a low thudding beat. In FIG. 1 the first peak is near 70 Hz, in FIG. 2 it is near 60 Hz, and in FIG. 3 there is a peak near 80 Hz. These figures show that the first peak of a drum sound generally lies in the band at or below 100 Hz.

FIGS. 4 to 6 show examples of the frequency distribution of a female voice. The first peak is near 300 Hz in FIG. 4, near 400 Hz in FIG. 5, and near 500 Hz in FIG. 6.

FIGS. 7 and 8 show examples of the frequency distribution of a male voice. The first peak is near 80 Hz in FIG. 7 and between 100 and 200 Hz in FIG. 8.

These figures show a large difference between the frequency bands of female and male voices. They also show that the first peak of a low male voice is close to the first peak of a drum sound.

As these figures also show, if a filter (for example, a low-cut filter) is set for a male voice and then applied unchanged to a female voice, the drum-sound band is not removed, which degrades the acquisition of pitch information for the female voice. Conversely, if a filter is set for a female voice and applied unchanged to a male voice, the male voice itself is partly filtered out, which degrades the acquisition of pitch information for the male voice.
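To make the trade-off concrete, the following is a minimal sketch of a low-cut (high-pass) filter, assuming a first-order discretized RC design. The patent does not specify a filter design; the function name `low_cut`, the cutoff, and the sample rate are illustrative assumptions.

```python
import math

def low_cut(samples, cutoff_hz, sample_rate_hz):
    """First-order high-pass (low-cut) filter, modeled on a discretized RC circuit.

    This design is an assumption for illustration; the source does not say
    which filter structure the apparatus uses.
    """
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)   # time constant for the cutoff
    dt = 1.0 / sample_rate_hz                # sampling interval
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for n in range(1, len(samples)):
        # Pass changes in the signal, let slowly varying content decay away.
        out.append(alpha * (out[-1] + samples[n] - samples[n - 1]))
    return out
```

Raising `cutoff_hz` removes more of the band at or below 100 Hz where the drum peaks lie, but it also starts to attenuate a low male voice whose first peak is near 80 Hz, which is why a single fixed cutoff does not suit both voice types.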

Furthermore, instrument sounds other than drums also degrade the acquisition of pitch information when their sounding band is close to that of a male or female voice, so scoring while such an instrument is sounding lowers the scoring accuracy. It is therefore necessary to identify the instrument sounds that adversely affect scoring and to acquire pitch information at times when they do not affect the scoring accuracy.

FIG. 9 shows the frequency distribution of a piano sound (note A1, fundamental frequency about 55 Hz) immediately after the attack. As the figure shows, the piano sound has one peak between 50 and 60 Hz and peaks at two places between 100 and 110 Hz.

FIG. 10 shows the frequency distribution of an electric bass sound (note A1, fundamental frequency about 55 Hz) immediately after the attack. As the figure shows, the electric bass sound has one peak between 50 and 60 Hz and multiple peaks between 100 and 400 Hz.

FIG. 11 shows the frequency distribution of a tuba sound (note A1, fundamental frequency about 55 Hz) immediately after the attack. As the figure shows, the tuba sound has one peak between 50 and 60 Hz and multiple peaks between 100 and 400 Hz.

FIG. 12 shows the frequency spectrum of a contrabass sound (note A1, fundamental frequency about 55 Hz) immediately after the attack. As the figure shows, the contrabass sound has one peak between 50 and 60 Hz and multiple peaks at 100 Hz and above.

The piano and electric bass are instrument sounds whose volume decays gradually from immediately after the attack, whereas the tuba and contrabass are instrument sounds that do not decay. As FIGS. 9 to 12 show, peaks can lie in bands similar to the peaks of the human voice whether or not the instrument sound decays. It is therefore necessary to identify the intervals during which these instrument sounds are sounding.

In addition, switching filters between male and female voices requires determining whether the input voice is male or female. The singer could be asked to select a gender for the karaoke performance, but such a selection burdens the singer. If the apparatus instead makes the determination automatically, a highly accurate determination is needed, comparable to that used in security systems, for example. In general, when highly accurate voice recognition is run on a karaoke scoring device built around a one-chip CPU or the like, the recognition takes a long time, which makes it hard to adopt in a karaoke apparatus used for entertainment.

The present invention therefore aims to solve the above problems and to realize a karaoke scoring apparatus that, without placing any burden on the singer, accurately and easily discriminates male and female voices during a performance and applies filtering appropriate to each, thereby enabling highly accurate scoring.

To achieve the above object, the invention according to claim 1 is a karaoke apparatus having a scoring function and comprising singer voice input means, music playback means, control means, storage means, and scoring means. The storage means stores decay time data defining a decay time for each instrument sound, association data associating pitch frequencies with frequency band filters, and music data including singer pitch information, instrument sound information, and music performance progress information. During performance of a piece based on the music data, the control means determines, based on the instrument sound information, the music performance progress information, and the decay time data, whether a designated predetermined instrument sound is not being sounded; identifies the singer's pitch frequency from the voice input to the singer voice input means during a period in which that instrument sound is not being sounded; identifies a frequency band filter based on the identified pitch frequency and the association data; and filters the input voice with the identified frequency band filter. The scoring means scores the filtered input voice using the singer pitch information.
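As an illustration of the data that associates pitch frequencies with frequency band filters, the sketch below models it as a small lookup table from the singer's pitch frequency to a low-cut cutoff. The table layout, the boundary value, and the cutoff values are all hypothetical; the patent does not give concrete numbers.

```python
# Hypothetical association data: (upper bound of singer pitch in Hz, low-cut cutoff in Hz).
# Both the boundary and the cutoffs are illustrative assumptions, not values from the patent.
ASSOCIATION_DATA = [
    (165.0, 60.0),          # lower voices: keep the cutoff low so the voice survives
    (float("inf"), 150.0),  # higher voices: a higher cutoff removes more of the drum band
]

def select_cutoff_hz(pitch_hz):
    """Return the low-cut cutoff associated with the identified singer pitch."""
    for upper_bound_hz, cutoff_hz in ASSOCIATION_DATA:
        if pitch_hz < upper_bound_hz:
            return cutoff_hz
    return ASSOCIATION_DATA[-1][1]
```

A table lookup of this kind needs no explicit male/female decision: the filter follows directly from the measured pitch frequency.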

The invention according to claim 2 is the karaoke apparatus with a scoring function of claim 1, in which the piece is played based on music data including MIDI data. Taking as its starting point the timing at which a command to sound the predetermined instrument sound is output, the control means regards the shorter of (a) the period until the decay time of that instrument sound elapses and (b) the period until a command to stop sounding that instrument sound is output as the period during which the predetermined instrument sound is sounding, and excludes it from the period in which the instrument sound is not being sounded.
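The rule in claim 2 can be sketched as follows. This is a minimal illustration, assuming times are plain numbers in a common unit and that a non-decaying instrument is represented by `None`; the function and parameter names are hypothetical.

```python
def sounding_interval(note_on_time, decay_time, note_off_time=None):
    """Return (start, end) of the period treated as 'sounding'.

    decay_time=None is an assumed way to mark an instrument that does not decay;
    note_off_time=None means no stop command has been output yet.
    """
    candidates = []
    if decay_time is not None:
        candidates.append(note_on_time + decay_time)   # (a) decay time elapses
    if note_off_time is not None:
        candidates.append(note_off_time)               # (b) stop command is output
    # The shorter of the two periods ends the sounding interval.
    end = min(candidates) if candidates else float("inf")
    return note_on_time, end

def is_sounding(t, interval):
    """True if time t falls inside the sounding interval."""
    start, end = interval
    return start <= t < end
```

Taking the minimum covers both kinds of instrument: a decaying sound such as a piano ends at the decay time even if the key is held, while a non-decaying sound such as a tuba ends only at the stop command.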

The invention according to claim 3 is the karaoke apparatus with a scoring function of claim 1 or 2, in which the piece is played based on music data including MIDI data, and the control means detects a note-on signal containing a note number smaller than a predetermined value and sets the instrument sound corresponding to the channel of that note-on signal as the predetermined instrument sound.
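A minimal sketch of the detection in claim 3, assuming MIDI events have already been parsed into `(kind, channel, note_number)` tuples. The event representation and the threshold value are hypothetical; the patent only says "a predetermined value".

```python
NOTE_NUMBER_THRESHOLD = 48  # hypothetical threshold (note 48 is about 131 Hz)

def find_influence_channels(events, threshold=NOTE_NUMBER_THRESHOLD):
    """Collect channels that emit a note-on with a note number below the threshold.

    events: iterable of (kind, channel, note_number) tuples, an assumed format.
    """
    channels = set()
    for kind, channel, note_number in events:
        if kind == "note_on" and note_number < threshold:
            channels.add(channel)
    return channels
```

For example, a note-on with note number 36 on channel 9 would mark channel 9 as carrying an instrument sound that can interfere with pitch acquisition, while a channel that only plays note 72 would not be marked.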

The invention according to claim 4 is the karaoke apparatus with a scoring function of any one of claims 1 to 3, further comprising accumulation means, in which the pitch information of the input voice acquired during the period in which the instrument sound is not being sounded is stored in the accumulation means and, once pitch information for a predetermined length of time has accumulated, the singer's pitch frequency is identified by averaging.
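The accumulate-then-average step of claim 4 might look like the sketch below. It assumes the "predetermined length of time" is expressed as a fixed number of pitch samples and that the averaging is an arithmetic mean; the patent specifies neither, and the class name is hypothetical.

```python
class PitchAccumulator:
    """Collects pitch samples and averages them once enough have been stored."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumed: number of samples that counts as "full"
        self.samples = []

    def add(self, pitch_hz):
        """Store one pitch sample acquired while no influence sound was sounding."""
        if not self.is_full():
            self.samples.append(pitch_hz)

    def is_full(self):
        return len(self.samples) >= self.capacity

    def singer_pitch_hz(self):
        """Return the mean pitch once full, otherwise None."""
        if not self.is_full():
            return None
        return sum(self.samples) / len(self.samples)
```

Averaging over many samples smooths out individual pitch-detection errors, which is the accuracy benefit the claim relies on.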

The invention according to claim 5 is the karaoke apparatus with a scoring function of any one of claims 1, 2, and 4, in which the music data carries information that makes the predetermined instrument sound identifiable, and the control means sets the predetermined instrument sound based on that information.

The invention according to claim 6 is the karaoke apparatus with a scoring function of any one of claims 1, 2, and 4, in which the predetermined instrument sound is determined in advance.

The invention according to claim 7 is the karaoke apparatus with a scoring function of claim 4, in which the control means notifies the user of predetermined information if pitch information for the predetermined length of time has not accumulated in the accumulation means within a predetermined time after the performance of the piece starts.

The invention according to claim 8 is the karaoke apparatus with a scoring function of any one of claims 1 to 7, in which the decay time for each instrument sound is 4-byte data.

According to the invention of claim 1, the singer's pitch frequency is identified from voice input during a period in which, judging from the decay time of each instrument sound, the predetermined instrument sound is not being sounded; a filter is set based on the result; and scoring is performed on the voice filtered with that filter. By setting the instruments that interfere with the extraction of pitch information as the predetermined instrument sounds, highly accurate scoring can be achieved.

According to the invention of claim 2, the period in which the predetermined instrument sound is not being sounded is additionally determined using both the decay time of each instrument sound and the command to stop sounding it, so this period can be determined appropriately regardless of the type of instrument sound (decaying or non-decaying).

According to the invention of claim 3, the instrument sound corresponding to the channel of a note-on signal containing a note number smaller than a predetermined value can additionally be set as an instrument that adversely affects scoring, so the instrument sounds that adversely affect scoring can be determined appropriately.

According to the invention of claim 4, the singer's pitch frequency is additionally identified using pitch information covering a predetermined length of time, so the accuracy of the identified pitch frequency is improved.

According to the invention of claim 5, the music data additionally carries information that makes the predetermined instrument sound identifiable, so the instrument sounds that interfere with the acquisition of pitch information can be identified for each piece, and an appropriate interval for acquiring pitch information can be determined.

According to the invention of claim 6, the predetermined instrument sound is additionally determined in advance, so highly accurate scoring can be performed using existing music data.

According to the invention of claim 7, the user is additionally notified when pitch information for the predetermined length of time could not be accumulated within a predetermined time after the performance starts, so the singer learns early in the performance whether the male/female discrimination was carried out correctly.

According to the invention of claim 8, the decay time is additionally expressed in 4 bytes, which prevents misalignment and represents the decay time with an appropriate amount of data.

An embodiment of the karaoke apparatus according to the present invention is described in detail below with reference to the drawings. In the following description, an instrument sound that affects scoring may be called an "influence sound", and the track of such an instrument sound an "influence sound track".

FIG. 13 is a block diagram showing the internal configuration of the control device in the karaoke apparatus of this embodiment, together with its peripheral devices.

The control device 10 is connected to a host computer (not shown) via a communication line and receives music data over that line through the communication I/F 15. The received karaoke song data is stored in the storage device 12.

In addition to the data for playing the piece, the music data may include title data for the karaoke song, video data corresponding to the karaoke song, and the like.

The controller 11 controls the control device 10 as a whole and executes various programs.
The storage device 12 stores music data and the like. It is composed of a dynamic storage medium (an HDD or the like) and may, as needed, be composed of a static storage medium.

The operation panel 13 is used by the operator to enter various information such as a song selection number. Such information may also be entered via the remote controller 13a.

The RAM 14 is a temporary storage memory that holds information needed for various kinds of control. In a multithreaded process, the RAM 14 functions as shared memory. Its use as shared memory in the present invention is described later.

The communication I/F 15 communicates with the host computer (not shown) over the communication line. The communication line may be wired or wireless.

The scoring circuit 16 scores the voice input from the microphone 17. In this embodiment, the voice to be scored has undergone the filtering described later. FIG. 13 shows two microphones, but any number of microphones may be used. FIG. 13 also shows the controller 11 and the scoring circuit 16 as separate components, but the controller 11 may perform the filtering and the scoring itself, or a separate circuit may perform the filtering.

The sound source 18 is connected to the amplifier 19. The music data is converted into an audio signal by the sound source 18, amplified by the amplifier 19, and output as sound by the speaker 20. In this embodiment the sound source 18 is a MIDI sound source. The amplifier 19 also amplifies the voice input from the microphone 17.

The video control circuit 21 is connected to the monitor 22. Video data obtained from the storage device 12 or the communication line, together with the lyric information contained in the music data, is displayed on the monitor 22 through the video control circuit 21 as the background video and lyrics during karaoke playback of the piece. If the video data is encoded, the video control circuit 21 may perform the decoding.

The internal configuration described above mainly covers what is needed to explain the present invention; the apparatus naturally includes various other circuits and elements as well.

The external appearance of the karaoke apparatus in this embodiment is not limited in any way by the present invention. Some of the elements shown above as part of the internal configuration may be provided externally, and the functions of some components may be implemented by a server connected to a network.

The music data used in this embodiment contains instrument sound information, music performance progress information, singer pitch information, and the like. MIDI data is a typical example, but the embodiment is not limited to MIDI data; other data may be used as long as the embodiment can still be carried out. In MIDI data, the instrument sound is designated by the channel number, and the sounding of an instrument sound is switched on and off by, for example, note-on and note-off signals. The note numbers of the vocal track can be used as the singer pitch information.
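Since the vocal track's note numbers serve as the reference pitch, it helps to see how a MIDI note number maps to a frequency. The sketch below uses the standard equal-temperament conversion with A4 (note 69) tuned to 440 Hz; the tuning reference is an assumption, as the patent does not state one.

```python
def note_number_to_hz(note_number, a4_hz=440.0):
    """Convert a MIDI note number to frequency in Hz (equal temperament).

    Note 69 is A4; each semitone multiplies the frequency by 2 ** (1 / 12).
    """
    return a4_hz * 2.0 ** ((note_number - 69) / 12.0)
```

With this conversion, note number 33 comes out at 55 Hz, which matches the fundamental of about 55 Hz quoted for the A1 instrument examples in FIGS. 9 to 12.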

FIG. 14 shows an example of the music data in this embodiment. As FIG. 14 shows, the track header of the MIDI data contains a data area for defining the decay time. The decay time is described in detail later. For a track of a non-decaying instrument sound (an organ or the like), a predetermined value indicating that the sound does not decay (for example, FFFFFFFFh for 4-byte data) can be set in this data area. A separate data area indicating whether the instrument decays may also be provided in the track header.
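A minimal sketch of how such a 4-byte decay time field could be encoded and decoded. Only the 4-byte width and the example sentinel FFFFFFFFh come from the text; the millisecond unit, the big-endian byte order, and the function names are assumptions for illustration.

```python
import struct

NO_DECAY_SENTINEL = 0xFFFFFFFF  # example value from the text for non-decaying sounds

def encode_decay_field(decay_ms):
    """Pack a decay time into 4 bytes; None means the sound does not decay.

    Unit (milliseconds) and byte order (big-endian) are assumed, not specified.
    """
    value = NO_DECAY_SENTINEL if decay_ms is None else decay_ms
    return struct.pack(">I", value)

def decode_decay_field(field):
    """Unpack a 4-byte decay field; returns None for the no-decay sentinel."""
    (value,) = struct.unpack(">I", field)
    return None if value == NO_DECAY_SENTINEL else value
```

Reserving the all-ones value as the sentinel keeps the field at a fixed width, so no separate flag is needed unless the header provides one as the alternative described above.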

FIG. 15 shows the internal layout of the shared memory. As FIG. 15 shows, the shared memory consists of areas for storing the time at which a note-on signal was output on an influence sound track (time areas) and an area for storing the note number of the vocal track. FIG. 15 shows the contents of the shared memory when two tracks are set as influence sound tracks. In this embodiment the number of influence sound tracks can be set arbitrarily, so the number of time areas in the shared memory can be set arbitrarily as well.

Next, the processing flow of the karaoke apparatus in this embodiment is described with reference to the drawings. FIG. 16 is a flowchart of the scoring process, and FIG. 17 is a flowchart of the performance process. The scoring process and the performance process are handled as multiple threads; that is, the two processes run in parallel. Multithreaded processing is well known, so its description is omitted.

The two processes exchange information with each other by accessing the shared memory while they run.

[Scoring process]
First, the scoring process is described with reference to FIG. 16. The scoring process starts running when playback of the piece starts.
In S1, the acquired-pitch holding area is initialized. This area is formed in the RAM 14.

In S2, the low-cut filter is turned off. This embodiment uses a low-cut filter as its example, but it can of course also be implemented with a band-pass filter or the like. The filter may be a digital filter or an analog filter.

In S3, it is determined whether the acquired-pitch holding area has become full of pitch information. The amount of pitch information stored in the area can be set as appropriate and should be enough to calculate the pitch frequency of the input voice accurately. If the area is determined not to be full (S3: NO), the process proceeds to S4.

In S4, it is determined whether the performance of the piece has ended. If it has (S4: YES), the process proceeds to S11. If it has not (S4: NO), the process proceeds to S5.

In S5, it is determined whether a fixed time has elapsed since the start of the performance. If it has (S5: YES), the process proceeds to S11. This fixed time can be set as appropriate. A determination that the fixed time has elapsed means that the required amount of pitch information could not be acquired within the allotted time.

In S11, the performance of the piece is ended and a message that scoring could not be performed is displayed on the monitor 22. The processing in S5 and S11 may be omitted as needed.

If in S5 the fixed time has not elapsed since the start of the performance (S5: NO), the process proceeds to S6.
In S6, it is determined whether pitch information was acquired. If it was not (S6: NO), the process returns to S3. If it was (S6: YES), the process proceeds to S7. Any of various known techniques can be used to acquire the pitch information.

In S7, it is determined whether the input volume at the time the pitch information was acquired in S6 is equal to or higher than a predetermined level. If it is not (S7: NO), it is determined that no voice was input through the microphone 17 at that time, and the process returns to S3. If it is (S7: YES), the process proceeds to S8.

In S8, the current time is acquired. This current time represents the time at which the pitch information of S6 was acquired. The process then proceeds to S9.

In S9, it is determined whether any influence sound track is currently sounding. Specifically, the time difference between the time acquired in S8 and the time stored in the shared memory is calculated first. If a plurality of times are stored in the shared memory (that is, if there are a plurality of influence sound tracks), the time difference from each stored time is calculated. It is then determined whether each time difference is equal to or longer than the decay time of the corresponding influence sound.

This point is explained using an example in which the drum track and the piano track are the influence sound tracks. In the shared memory, a time area for the drum track and a time area for the piano track are reserved. Let Tc be the time acquired in S8, Nd the time stored in the time area for the drum track, Np the time stored in the time area for the piano track, Ad the decay time recorded in the data track of the drum sound source, and Ap the decay time recorded in the data track of the piano sound source. The time difference Dd for the drum track and the time difference Dp for the piano track are then calculated as
Dd = Tc - Nd
Dp = Tc - Np
respectively.
Only when Dd > Ad and Dp > Ap is it determined that no influence sound track is sounding, that is, the result of S9 is "NO".
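
The S9 check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the dictionary layout standing in for the shared memory, and the sample values are all assumptions introduced here, with times taken to be in milliseconds.

```python
def any_influence_track_sounding(tc, stored_times, decay_times):
    """Return True if any influence sound track counts as sounding at time tc.

    tc           -- time at which the pitch information was acquired (S8), in ms
    stored_times -- hypothetical stand-in for the shared memory:
                    track name -> stored time (Nd, Np, ...), in ms
    decay_times  -- track name -> decay time (Ad, Ap, ...), in ms
    """
    for track, stored in stored_times.items():
        diff = tc - stored  # Dd = Tc - Nd, Dp = Tc - Np
        # "Not sounding" requires diff > decay for every track.
        if not diff > decay_times[track]:
            return True
    return False


# Hypothetical values: drum note-on at 80010 ms, piano note-on at 79000 ms.
stored = {"drum": 80010, "piano": 79000}
decay = {"drum": 20, "piano": 1300}
print(any_influence_track_sounding(80040, stored, decay))  # piano is still decaying
```

A stored time of zero makes the difference equal to the current time, which is why writing the zero value into the time area marks a track as not sounding.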

If it is determined that any influence sound track is sounding (S9: YES), the process returns to S3. If it is determined that no influence sound track is sounding (S9: NO), the process proceeds to S10. The time stored in a time area is either the time at which a note-on signal of the influence sound track was output, as written by the performance process described later, or a predetermined value (zero). This point is explained in detail in the description of the performance process.

The judgment in S9 of whether a track is "not sounding" is based on whether sound is being produced at a volume that affects scoring. That is, even if an influence sound is being produced, it is judged to be "not sounding" if its volume is too low to affect scoring.

In S10, the pitch information acquired in S6 is written to the acquired-pitch holding area. The process then returns to S3.

On the other hand, if it is determined in S3 that the acquired-pitch holding area is FULL (S3: YES), the process proceeds to S12.
In S12, the average of all the data in the acquired-pitch holding area is calculated, and this average pitch information is taken as the pitch frequency of the input voice. The process then proceeds to S13.

In S13, it is determined whether the calculated pitch frequency is equal to or lower than a predetermined pitch. If it is (S13: YES), the voice input to the microphone 17 is judged to be a male voice, and the process proceeds to S14.
In S14, the low-cut filter is set for male voice.

If the pitch frequency is not equal to or lower than the predetermined pitch (S13: NO), the voice input to the microphone 17 is judged to be a female voice, and the process proceeds to S15.
In S15, the low-cut filter is set for female voice.

In the processing of S13 to S15, the pitch information of the voice input to the microphone 17 is classified into two pitch frequency classes (male voice and female voice), and a filter is set for each class. The pitch frequency may instead be classified into three or more classes, with a filter set for each. The relationship between frequency and the filter to be set may be held in advance as a data table, or may be handled in program logic (that is, IF/THEN processing). The apparatus may also be configured to perform no filtering when the voice is judged to be male.
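
As a rough sketch of S12 to S15, the following shows both the IF/THEN form and the data-table form mentioned above. The threshold values, class labels, and function names are hypothetical; the source does not give concrete frequencies or filter settings.

```python
import bisect
from statistics import mean

MALE_MAX_PITCH_HZ = 165.0  # hypothetical threshold, not taken from the source


def select_filter(pitch_buffer_hz, threshold_hz=MALE_MAX_PITCH_HZ):
    """IF/THEN form: average the holding area (S12) and classify (S13 to S15)."""
    avg = mean(pitch_buffer_hz)
    if avg <= threshold_hz:
        return "male", avg    # S14: set the low-cut filter for male voice
    return "female", avg      # S15: set the low-cut filter for female voice


# Data-table form with three classes; boundaries and labels are assumptions.
UPPER_BOUNDS_HZ = [165.0, 260.0]
FILTER_SETTINGS = ["low", "mid", "high"]


def select_filter_from_table(avg_hz):
    """Pick the first class whose upper bound is at or above avg_hz."""
    return FILTER_SETTINGS[bisect.bisect_left(UPPER_BOUNDS_HZ, avg_hz)]


print(select_filter([110.0, 120.0, 130.0]))
print(select_filter_from_table(200.0))
```

The table form makes it straightforward to add further classes without changing the control flow.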

In S16, the score is initialized. In this embodiment, scoring is performed by a point-deduction method, so an initial value of, for example, 1000 points can be set. A point-addition method or any other known scoring method may of course be applied instead, in which case the initial value will be something other than 1000 points and the processing of S22 described below will differ accordingly.

In S17, it is determined whether the performance of the song has ended. If it has (S17: YES), the process proceeds to S18 and the scoring result is displayed on the monitor 22. The apparatus may also be configured to display an interim scoring result as appropriate while the song is being performed. If the performance has not ended (S17: NO), the process proceeds to S19.

In S19, it is determined whether pitch information has been acquired. If it has not (S19: NO), the process returns to S17. If it has (S19: YES), the process proceeds to S20.

In S20, it is determined whether the input volume at the time the pitch information was acquired in S19 is equal to or higher than a predetermined level. If it is not (S20: NO), it is determined that no voice was input through the microphone 17 at that time, and the process returns to S17. If it is (S20: YES), the process proceeds to S21.

In S21, the difference between the pitch information acquired in S19 and the note number of the vocal track in the MIDI data stored in the shared memory is calculated. In this embodiment, the note number of the vocal track in the MIDI data is used as the scoring reference. The vocal track in the MIDI data is also used as the guide melody during singing. Writing to the shared memory is explained in detail in the description of the performance process. The process then proceeds to S22.

In S22, the absolute value of the difference obtained in S21 is subtracted from the current score. If the pitch information input from the microphone 17 matches the pitch indicated by the note number stored in the shared memory, the sung pitch is correct and nothing is subtracted.
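
The deduction in S21 and S22 might look like the sketch below. The source does not say how the sung pitch is put on the same scale as a note number, so two assumptions are made here: the pitch arrives as a frequency in Hz, and it is converted with the standard equal-temperament relation (note 69 = 440 Hz) and rounded to the nearest semitone. The function names are illustrative only.

```python
import math


def hz_to_note_number(freq_hz):
    """Convert a frequency to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69 + 12 * math.log2(freq_hz / 440.0)


def deduct(score, sung_hz, reference_note):
    """Subtract the absolute semitone difference from the current score."""
    # S21: difference between the sung pitch and the vocal-track note number.
    # Rounding to the nearest semitone is an assumption of this sketch.
    diff = round(hz_to_note_number(sung_hz)) - reference_note
    # S22: a correct pitch gives diff == 0, so nothing is subtracted.
    return score - abs(diff)


score = 1000  # initial value from S16
score = deduct(score, 440.0, 69)  # correct pitch, no deduction
score = deduct(score, 440.0, 71)  # two semitones off
print(score)
```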

With the scoring method of S21 and S22, the score is also affected when the singing voice is shifted in time relative to the guide melody. This effect can be reduced by storing the note numbers (pitch information) of the guide melody (vocal track) for a predetermined length of time in the shared memory and scoring them against the pitch information of the input voice using DP matching or a similar technique.

[Performance process]
Next, the performance process is described with reference to FIG. 17. The performance process starts when playback of the song starts. Among the plurality of tracks in the MIDI data that constitutes the song data, the performance process handles the influence sound tracks and the vocal track. Which tracks are influence sound tracks is assumed to be set in advance.

First, in S31, track data is acquired, and it is determined whether the acquired track data belongs to an influence sound track. If it does not (S31: NO), no influence-sound processing is performed on that track, and the process proceeds to S37.

If the track is an influence sound track (S31: YES), the process proceeds to S32.
In S32, it is determined whether the data acquired in S31 contains note-on information. If it does not (S32: NO), it is determined that this is not the sounding start timing of an influence sound, and the process proceeds to S35.
If it does (S32: YES), it is determined that this is the sounding start timing of an influence sound, and the process proceeds to S33.

In S33, the current time is acquired. This current time represents the sounding start time of the influence sound. The time elapsed since system startup can be used as the current time, but other time information may be used instead. The process then proceeds to S34.

In S34, the current time acquired in S33 is set in the time area of the shared memory that corresponds to the influence sound track identified in S31. The contents of this time area are updated each time the current time is acquired in S33.

In S35, it is determined whether the data acquired in S31 contains note-off information. If it does not (S35: NO), it is determined that this is not the sounding stop timing of the influence sound, and the process proceeds to S37. If it does (S35: YES), it is determined that this is the sounding stop timing of the influence sound, and the process proceeds to S36.

In S36, a predetermined value is set in the time area of the shared memory that corresponds to the influence sound track identified in S31. If the current time is defined as the time elapsed since system startup, "0" can be used as this predetermined value. Setting "0" guarantees that the time difference in S9 of FIG. 16 is greater than the decay time. If the current time is defined in hours, minutes, and seconds, the performance start time of the song may be used as the predetermined value, for example.

Through the processing in S36, S9 of FIG. 16 can treat as the sounding period of an influence sound the shorter of two periods, both starting when the command to sound the influence sound is output: the period until the decay time of the influence sound elapses, and the period until the command to stop the influence sound is output.
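
The interaction between S33 to S36 and the S9 check can be sketched with a small class. The class and method names are hypothetical, times are assumed to be milliseconds since system startup, and the stored attribute stands in for one time area of the shared memory.

```python
class InfluenceTrackTimeArea:
    """Hypothetical model of one influence sound track's time area."""

    def __init__(self, decay_ms):
        self.decay_ms = decay_ms
        self.stored_time = 0  # zero value: track treated as not sounding

    def note_on(self, now_ms):
        # S33 to S34: record the sounding start time.
        self.stored_time = now_ms

    def note_off(self):
        # S36: write the predetermined value (0 for time since startup).
        self.stored_time = 0

    def is_sounding(self, now_ms):
        # S9: "not sounding" only when the time difference exceeds the decay time.
        return not (now_ms - self.stored_time > self.decay_ms)


piano = InfluenceTrackTimeArea(decay_ms=1300)
piano.note_on(10000)
print(piano.is_sounding(11000))  # within the decay time
piano.note_off()
print(piano.is_sounding(11000))  # note-off arrived before the decay time elapsed
```

Because a note-off overwrites the stored time with zero, the sounding period ends at whichever comes first, the decay time or the note-off. The sketch relies on the current time being larger than any decay time, which holds for time since startup once the system has been running for longer than the longest decay time.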

In S37, it is determined whether the acquired track data belongs to the vocal track. If it does not (S37: NO), no vocal-track processing is performed on that track, and the process proceeds to S42.

If the track is the vocal track (S37: YES), the process proceeds to S38. In S38, it is determined whether the acquired data contains note-on information. If it does not (S38: NO), it is determined that this is not a vocal start timing, and the process proceeds to S40. If it does (S38: YES), it is determined that this is a vocal start timing, and the process proceeds to S39.

In S39, the note number is set in the shared memory for note numbers. This value is used as the scoring reference in the scoring process.

In S40, it is determined whether the acquired track data contains note-off information. If it does not (S40: NO), the process proceeds to S42. If it does (S40: YES), the process proceeds to S41.
In S41, the shared memory for note numbers is cleared. As a result, information is held in the shared memory for note numbers only during vocal periods.

In S42, the MIDI data is sent to the sound source. The performance is thereby controlled on the basis of the musical sound data.

In the processing above, the MIDI data is sent to the sound source after the processing for the influence sound tracks and the vocal track. The order may be reversed, so that the MIDI data is sent to the sound source first and the processing for the drum track and the vocal track is performed afterwards.

The flowcharts described above for the scoring process and the performance process are merely examples. The processing may be implemented by other flowcharts as long as they produce results equivalent to those described.

Next, the way pitch information is acquired while the scoring process and the performance process are running is described with reference to FIG. 18.
FIG. 18 shows the relationship among the note-on and note-off timing of the drum track, the time since system startup, and the acquired pitch information. The horizontal axis is the time axis.

In FIG. 18, the decay time is set to 20 ms from the start of the drum sound, but this interval can be set as appropriate. Because the drum sound receives a note-on at system time 80010 (the unit, ms, is omitted here and below), pitch information acquired from 80010 to 80030 is discarded. Pitch information 1 is therefore discarded without being stored in the acquired-pitch holding area.

Pitch information 2 and 3 are acquired after 80030 and before the next drum sound starts, so they are written to the acquired-pitch holding area. In the same way, pitch information 4 and 8 are discarded, and pitch information 5 to 7 and 9 to 11 are written to the acquired-pitch holding area. Note-off information of the drum track is not used in FIG. 18. Once pitch information for a predetermined length of time has accumulated in the acquired-pitch holding area, the pitch frequency is calculated.
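
The discard rule illustrated by FIG. 18 can be sketched as a simple filter over timestamped pitch samples. Only the first note-on time (80010) and the 20 ms decay time come from the text. The sample timestamps, the second note-on time, and the function name are hypothetical, since the figure itself is not reproduced here.

```python
DECAY_MS = 20  # decay time used in FIG. 18


def keep_pitch_samples(samples, note_on_times, decay_ms=DECAY_MS):
    """Keep only samples acquired outside every note-on decay window.

    samples       -- list of (time_ms, label) pairs
    note_on_times -- drum note-on times in ms
    """
    kept = []
    for t, label in samples:
        # A sample is discarded while 0 <= t - note_on <= decay_ms.
        in_decay = any(0 <= t - n <= decay_ms for n in note_on_times)
        if not in_decay:
            kept.append((t, label))
    return kept


# Hypothetical timestamps for pitch information 1 to 4.
samples = [(80015, "pitch 1"), (80040, "pitch 2"), (80055, "pitch 3"), (80075, "pitch 4")]
note_ons = [80010, 80070]
print(keep_pitch_samples(samples, note_ons))
```

With these assumed timestamps, pitch 1 and pitch 4 fall inside a decay window and are dropped, which matches the pattern described for FIG. 18.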

As described above, in this embodiment, after the performance starts, male or female voice is determined from pitch information acquired during periods in which no influence sound is being played, a filter corresponding to the determined gender is set, the input voice is filtered with that filter, and the filtered voice is scored. Highly accurate scoring is therefore possible. In addition, the period during which a given instrument sound is not sounding is determined using both the decay time of each instrument sound and the command that stops the instrument sound, so this period can be determined appropriately regardless of the type of instrument (decaying or non-decaying). Furthermore, if the predetermined amount of pitch information cannot be acquired within the predetermined time after the performance starts, the user can be notified of this at an early stage of the performance.

In the processing above, the vocal track information is updated as needed in the shared memory, but the processing can also be performed without the shared memory. That is, the scoring process may be configured to access the note numbers of the vocal track in the song data directly.

In the embodiment above, the influence sound tracks are designated in advance, and any sound on an influence sound track is treated as adversely affecting scoring regardless of its note number. The processing can also be performed without designating influence sound tracks in advance. Specifically, when the acquired track data contains note-on information and the note number is equal to or lower than a predetermined value, the sound is judged to adversely affect scoring, and no scoring is performed while it is sounding. The predetermined note number can be set as appropriate and can be chosen on the basis of the frequencies considered to adversely affect scoring.

When the song data is MIDI data, a program change command can be used to designate a track of any instrument sound as an influence sound track partway through the performance.
FIG. 19 shows another form of the song data. In the example of FIG. 19, a decay time is attached to the program change command, and track 16 functions as an influence sound track after its program change command is executed. The decay time of the instrument sound on track 16 is the decay time attached to the program change command. It is also possible to change from one influence sound track to another, for example from a contrabass track to a piano track.

With the configuration of the embodiment described above, the creator of the song data can specify in advance, for each song, the influence sounds that adversely affect the acquisition of pitch information. A karaoke scoring apparatus using such song data can therefore distinguish male and female voices accurately and easily, according to the types of instrument sound and their timing during the performance, without placing any burden on the singer. Because appropriate filtering can be applied separately for male and female voices, each song can be scored with high accuracy.

The influence sounds may instead be determined in advance in the karaoke apparatus. In this case, the present invention can be implemented without modifying existing song data. With this configuration, pitch information may not be acquired correctly for some songs. However, the scoring process described above reports when pitch information for the predetermined length of time could not be acquired within the predetermined time after the performance starts, so the user can learn during the performance that scoring was not carried out correctly.

In the embodiment above, the voice input to the microphone 17 before the male or female determination is made and the filter is decided is not scored. The audio signal for this period can instead be stored in a buffer or the like, filtered once the filter is decided, and reflected in the scoring result.

FIG. 20 is a flowchart of processing that stores the input voice for this period in a buffer and uses it after the filter is decided (hereinafter, the "temporary storage process").

The temporary storage process runs in parallel with the scoring process and the performance process. It starts when playback of the song starts.

In S51, it is determined whether the performance of the song has ended. If it has (S51: YES), the temporary storage process ends. If it has not (S51: NO), the process proceeds to S52.

In S52, it is determined whether a filter has been set in the scoring process. If it has (S52: YES), the process proceeds to S54. If it has not (S52: NO), the process proceeds to S53.

In S53, the audio signal input to the microphone 17 is written to the buffer. By repeating S51 to S53, the input voice up to the point at which the filter is set in the scoring process is written to the buffer.

In S54, the input audio signals stored in the buffer are read out in order. The process then proceeds to S55.
In S55, the read input audio signal is filtered using the filter set in the scoring process. The process then proceeds to S56.

In S56, the filtered audio signal is scored. This scoring is the same as that performed in the scoring process, so its description is omitted.
The processing of S54 to S56 may be performed in parallel. That is, the next input voice may be read from the buffer while the voice already read is being filtered.

In S57, the score calculated in this process is reflected in the scoring result of the scoring process. The timing at which it is reflected can be set as appropriate.

With this embodiment, the voice input to the microphone 17 before the filter is set can also be reflected in the scoring result. The flowchart above is merely an example, and the processing may be implemented by another flowchart as long as it produces an equivalent result.
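
The temporary storage process can be pictured with a first-in, first-out buffer. The sketch below is only an outline under stated assumptions: the class name is hypothetical, audio frames are represented by plain numbers, and the filter and scoring steps are passed in as placeholder functions because the source does not define them at this level.

```python
from collections import deque


class PreFilterBuffer:
    """Hypothetical buffer for audio captured before the filter is decided."""

    def __init__(self):
        self.frames = deque()

    def push(self, frame):
        # S53: store the microphone input while no filter is set.
        self.frames.append(frame)

    def drain(self, filter_fn, score_fn):
        # S54 to S56: read frames in arrival order, filter each, and score it.
        total = 0
        while self.frames:
            total += score_fn(filter_fn(self.frames.popleft()))
        return total  # S57: value to be reflected in the scoring result


buf = PreFilterBuffer()
for frame in (3, 5, 8):  # placeholder frames
    buf.push(frame)

# Placeholder filter and scoring functions, for illustration only.
deduction = buf.drain(filter_fn=lambda f: f - 1, score_fn=lambda f: f % 2)
print(deduction)
```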

[Technical significance of expressing the decay time in 4 bytes]
In the embodiment described above, the decay time differs depending on the type of influence sound, so a data area is needed to define the decay time. To prevent misalignment, the data representing each decay time should be 2, 4, or 8 bytes long rather than 3, 5, or 7 bytes. This point is also described in, for example, Japanese Patent Application Laid-Open No. 2006-18536, Japanese Patent Application Laid-Open No. 08-030505, and Japanese Patent Application Laid-Open No. 09-044397.

As for resolution, the time resolution must be sufficiently finer than the decay time itself. If the decay time is 1 to 2 seconds, it should be managed in units of milliseconds at the coarsest, not seconds. The time information of the song data should therefore also be managed at least in milliseconds. This is also clear from the fact that, in MIDI data with TMP (tempo value) = 125 and TimeBase = 48, 1 tick = 10 milliseconds, that is, the resolution is on the order of milliseconds.

The basis for 1 tick = 10 ms is as follows.
If TMP = 125 (bpm: beats per minute), a quarter note occurs once every 60000 / 125 = 480 ms within one minute (60000 ms), so the duration of one quarter note is 480 ms.
When the time base value is 48, one quarter note consists of 48 clocks.
In other words, when TMP = 125 and TimeBase = 48, a quarter note with a period of 480 ms consists of 48 clocks.
If the unit of resolution within one quarter note is called a tick, one quarter note (480 ms) consists of 48 ticks. Converting the quarter-note period into the period of one tick gives a tick duration of 480 / 48 = 10 ms.
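
The same arithmetic can be checked with a short calculation. The function name is an assumption introduced here; the inputs are the TMP and TimeBase values from the text.

```python
def tick_duration_ms(tempo_bpm, time_base):
    """Duration of one tick in milliseconds.

    tempo_bpm -- quarter notes per minute (TMP)
    time_base -- ticks per quarter note (TimeBase)
    """
    quarter_note_ms = 60000 / tempo_bpm  # 60000 ms per minute
    return quarter_note_ms / time_base


print(tick_duration_ms(125, 48))  # 480 ms per quarter note / 48 ticks
```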

ここで、４バイト（FFFFFFFFh）で扱うことができる最大数は、４２９４９６７２９５
である。したがって、ミリ秒を分解能とした場合、４バイトのデータで扱うことができる最大値は、４２９４９６７２９５ミリ秒であるから、最大７１５８３分まで扱うことができる。また、更に高精度に管理するためにマイクロ秒を分解能とした場合では、７１．６分まで扱うことができる。通常、カラオケ演奏に用いる楽曲の演奏時間は３〜６分であるから、減衰時間を定義するためのデータ領域は４バイトであれば、ミリ秒、マイクロ秒いずれの分解能にも対応できる。 Here, the maximum number that can be handled with 4 bytes (FFFFFFFFh) is 4294967295.
It is. Accordingly, when the resolution is milliseconds, the maximum value that can be handled with 4-byte data is 4294967295 milliseconds, and therefore, a maximum of 71585 minutes can be handled. Further, in order to manage with higher accuracy, when the resolution is microseconds, it can be handled up to 71.6 minutes. Normally, the performance time of music used for karaoke performance is 3 to 6 minutes. Therefore, if the data area for defining the decay time is 4 bytes, it can correspond to both millisecond and microsecond resolutions.

ところが、前記データ領域が２バイトであると、ミリ秒を分解能とした場合、２バイトのデータで扱うことができる最大値は、１．０９分となり、１楽曲の演奏時間には足りない。また、８バイトであると、マイクロ秒を分解能とした場合、８バイトのデータで扱うことができる最大値は、３０７４４５７３４５６２分であるので、楽曲の演奏時間に比べて余りにも長く、楽曲を管理する上では不必要に長い時間長であり、メモリ（データ領域）が無駄になる。 However, if the data area is 2 bytes and the resolution is milliseconds, the maximum value that can be handled with 2 bytes of data is 1.09 minutes, which is insufficient for the performance time of one piece of music. Also, if the resolution is 8 bytes, the maximum value that can be handled with 8 bytes of data is 307444534552 minutes, so the music is managed too long compared to the performance time of the music. The above time is unnecessarily long, and the memory (data area) is wasted.

Therefore, for karaoke use, where a typical piece of music runs 3 to 6 minutes, handling the decay time data in 4 bytes is considered most suitable in view of data capacity and misalignment.
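The ranges quoted for 2-, 4-, and 8-byte data areas can be checked with a short calculation. The sketch below assumes unsigned integers; the helper name `max_minutes` is hypothetical and used only for illustration.

```python
def max_minutes(num_bytes: int, units_per_second: int) -> float:
    """Longest duration, in minutes, that an unsigned counter of
    num_bytes bytes can hold at the given resolution."""
    max_value = 2 ** (8 * num_bytes) - 1  # e.g. FFFFFFFFh for 4 bytes
    return max_value / units_per_second / 60


MS = 1_000        # milliseconds per second
US = 1_000_000    # microseconds per second

for num_bytes, resolution, label in [(2, MS, "ms"), (4, MS, "ms"),
                                     (4, US, "us"), (8, US, "us")]:
    minutes = max_minutes(num_bytes, resolution)
    print(f"{num_bytes} bytes at {label} resolution: about {minutes:.2f} min")
```

Only the 4-byte width covers a 3 to 6 minute song at both resolutions without leaving most of the range unused.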

[Specific example of decay time]
Examples of decay times are shown below. FIG. 21 shows how a piano sound decays. If the volume threshold below which scoring is not affected is defined as “12 dB below the volume immediately after sounding (0 dB)”, then in FIG. 21 the volume falls below the threshold about 1.3 seconds after sounding. The decay time of the piano sound can therefore be set to 1.3 seconds. The decay shown in FIG. 21 is merely an example, and the threshold can be set as appropriate. If the note-off signal for the piano sound occurs earlier than 1.3 seconds after the sound starts (for example, when the piano sound lasts 1 second), the “period affecting scoring” runs from the start of the sound to that point. That is, for the piano sound, the “period affecting scoring” is determined using both the decay time and the note-off signal.
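One way to derive a decay time from a measured level curve is to take the first time at which the level is 12 dB or more below the level immediately after sounding. The following is only a sketch under that assumption: the function name `decay_time` and the sample values are hypothetical and are not read from FIG. 21, and a real envelope may need smoothing before the threshold test.

```python
def decay_time(times_s, levels_db, threshold_db=-12.0):
    """Return the first time (seconds after sounding) at which the level,
    relative to the level immediately after sounding (0 dB), is at or
    below threshold_db. Return None if the level never gets that low."""
    for t, level in zip(times_s, levels_db):
        if level <= threshold_db:
            return t
    return None


# Illustrative samples only (not taken from the patent figures).
times = [0.0, 0.5, 1.0, 1.3, 2.0]
decaying = [0.0, -5.0, -10.0, -12.0, -18.0]
sustained = [0.0, -1.0, -1.0, -1.0, -1.0]

print(decay_time(times, decaying))   # 1.3
print(decay_time(times, sustained))  # None
```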

FIG. 22 shows how an electric bass sound decays. If the volume threshold below which scoring is not affected is defined as “12 dB below the volume immediately after sounding (0 dB)”, then in FIG. 22 the volume falls below the threshold 0.4 seconds after sounding. The decay time of the electric bass sound can therefore be set to 0.4 seconds. The decay shown in FIG. 22 is merely an example, and the threshold can be set as appropriate. For the electric bass sound, as for the piano sound, the “period affecting scoring” is determined using both the decay time and the note-off signal.

FIG. 23 shows how a contrabass sound decays. As the figure shows, the contrabass sound does not decay, so a decay time cannot be set. For the contrabass sound, the “period affecting scoring” is therefore determined using the note-off signal alone, without a decay time.
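The three cases above (piano, electric bass, contrabass) follow one rule: the period affecting scoring ends at whichever comes first, the end of the decay time or the note-off, and at the note-off alone when no decay time is defined. A minimal sketch, with a hypothetical function name and times expressed in seconds:

```python
from typing import Optional


def scoring_affected_end(note_on_s: float,
                         note_off_s: float,
                         decay_time_s: Optional[float]) -> float:
    """Return the time at which an instrument note stops affecting scoring.

    decay_time_s is None for sounds that do not decay (e.g. the
    contrabass example), in which case only the note-off is used.
    """
    if decay_time_s is None:
        return note_off_s
    return min(note_on_s + decay_time_s, note_off_s)


# Piano (decay time 1.3 s) released after 1 s: the note-off comes first.
print(scoring_affected_end(0.0, 1.0, 1.3))   # 1.0
# Piano held for 2 s: the decay time comes first.
print(scoring_affected_end(0.0, 2.0, 1.3))   # 1.3
# Contrabass held for 3 s: no decay time, so the note-off decides.
print(scoring_affected_end(0.0, 3.0, None))  # 3.0
```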

The present invention is not limited to the embodiment described above, and various improvements and modifications can of course be made without departing from the gist of the invention. The present invention can also be realized as a scoring method in a karaoke apparatus for executing the processing described above. Furthermore, the present invention can be realized as a program for causing a computer to execute the scoring method in the karaoke apparatus, and as a recording medium on which that program is recorded.

FIG. 1 is a diagram showing an example of the frequency distribution of a drum sound.
FIG. 2 is a diagram showing an example of the frequency distribution of a drum sound.
FIG. 3 is a diagram showing an example of the frequency distribution of a drum sound.
FIG. 4 is a diagram showing an example of the frequency distribution of a female voice.
FIG. 5 is a diagram showing an example of the frequency distribution of a female voice.
FIG. 6 is a diagram showing an example of the frequency distribution of a female voice.
FIG. 7 is a diagram showing an example of the frequency distribution of a male voice.
FIG. 8 is a diagram showing an example of the frequency distribution of a male voice.
FIG. 9 is a diagram showing an example of the frequency distribution of a piano sound.
FIG. 10 is a diagram showing an example of the frequency distribution of an electric bass sound.
FIG. 11 is a diagram showing an example of the frequency distribution of a tuba sound.
FIG. 12 is a diagram showing an example of the frequency distribution of a contrabass sound.
FIG. 13 is a diagram showing the internal configuration of the control device and its peripheral elements.
FIG. 14 is a diagram showing an example of the structure of music data.
FIG. 15 is a diagram showing the internal structure of the shared memory.
FIG. 16 is a flowchart of the scoring process.
FIG. 17 is a flowchart of the performance process.
FIG. 18 is a timing diagram showing the drum sound generation timing, the time since system startup, and the acquired pitch information.
FIG. 19 is a diagram showing an example of the structure of music data.
FIG. 20 is a flowchart of the temporary storage process.
FIG. 21 is a diagram showing an example of how a piano sound decays.
FIG. 22 is a diagram showing an example of how an electric bass sound decays.
FIG. 23 is a diagram showing an example of how a contrabass sound decays.

Description of Symbols
10 Control device
11 Controller
12 Storage device
13 Operation panel
13a Remote control
14 RAM
15 Communication I/F
16 Scoring circuit
17 Microphone
18 Sound source
19 Amplifier
20 Speaker
21 Video control circuit
22 Monitor

Claims

A karaoke apparatus having a scoring function, comprising singer voice input means, music playback means, control means, storage means, and scoring means, wherein
the storage means stores:
decay time data defining a decay time for each instrument sound;
association data in which pitch frequencies are associated with frequency band filters; and
music data including singer pitch information, instrument sound information, and music performance progress information;
during performance of music based on the music data, the control means:
determines, based on the instrument sound information, the music performance progress information, and the decay time data, whether a set predetermined instrument sound is not being sounded;
identifies the pitch frequency of the singer from the input voice input to the singer voice input means during a period in which the instrument sound is not being sounded;
identifies a frequency band filter based on the identified pitch frequency and the association data; and
filters the input voice with the identified frequency band filter; and
the scoring means scores the filtered input voice using the singer pitch information.

The karaoke apparatus having a scoring function according to claim 1, wherein
the music is performed based on music data including MIDI data, and
the control means determines, as the period during which the predetermined instrument sound is being sounded, the shorter of the period from the timing at which a command to sound the predetermined instrument sound is output until the decay time of that instrument sound elapses and the period from that timing until a command to stop sounding that instrument sound is output, and excludes that period from the period in which the instrument sound is not being sounded.

The karaoke apparatus having a scoring function according to claim 1 or 2, wherein
the music is performed based on music data including MIDI data, and
the control means detects a note-on signal containing a note number smaller than a predetermined value, and sets the instrument sound corresponding to the channel of that note-on signal as the predetermined instrument sound.

The karaoke apparatus having a scoring function according to any one of claims 1 to 3, further comprising accumulation means, wherein
pitch information of the input voice acquired during the period in which the instrument sound is not being sounded is stored in the accumulation means, and
when pitch information for a predetermined length of time has been accumulated, the pitch frequency of the singer is identified by averaging.

The karaoke apparatus having a scoring function according to any one of claims 1, 2, and 4, wherein
information that makes the predetermined instrument sound identifiable is attached to the music data, and
the control means sets the predetermined instrument sound based on that information.

The karaoke apparatus having a scoring function according to any one of claims 1, 2, and 4, wherein
the predetermined instrument sound is determined in advance.

The karaoke apparatus having a scoring function according to claim 4, wherein the control means notifies the user of predetermined information when pitch information for the predetermined length of time has not been accumulated in the accumulation means within a predetermined time after the start of the performance of the music.

The karaoke apparatus having a scoring function according to any one of claims 1 to 7, wherein
the decay time for each instrument sound is 4-byte data.