JPH06202627A

JPH06202627A - Sound signal pitch extracting device

Info

Publication number: JPH06202627A
Application number: JP5148325A
Authority: JP
Inventors: Koichiro Oki; 広一郎太期
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 1993-05-28
Filing date: 1993-05-28
Publication date: 1994-07-22
Anticipated expiration: 2013-02-16
Also published as: JP2713102B2

Abstract

PURPOSE: To provide a pitch extraction device that can extract the pitches of a compound (polyphonic) tone. CONSTITUTION: A DSP (digital signal processor) 1 applies a DFT (discrete Fourier transform) to the waveform signal of the compound tone, input through an anti-aliasing filter 9, to obtain the spectrum of the waveform signal. Spectral components that meet specified conditions are then extracted from the spectrum to obtain the pitches of the compound tone.

Description

Detailed Description of the Invention

[0001]

TECHNICAL FIELD: The present invention relates to a pitch extraction device that extracts the pitch of a sound signal.

[0002]

PRIOR ART AND ITS PROBLEMS: Techniques for extracting the pitch of a sound signal are known. For example, there are waveform-processing pitch extraction devices that take a signal obtained by detecting string vibration with a pickup, or a musical tone signal converted by a microphone, and extract the fundamental pitch of the sound signal by measuring the repetition interval of the peak points or zero-crossing points of its waveform. Two further methods are known in the field of speech signal processing. The correlation pitch extraction method applies correlation processing to a sampled speech signal sequence or its residual signal sequence and extracts the pitch of the speech signal (voiced signal) by detecting the peaks that appear in the correlation function sequence. The cepstrum-analysis pitch extraction method applies a Fourier transform, logarithmic transform, inverse Fourier transform, and liftering to the speech signal to separate its spectral envelope and fine structure into a low-quefrency part and a high-quefrency part, and extracts the fundamental pitch of the speech signal by detecting the peak of the high-quefrency part. However, these prior-art techniques assume that the sound signal input under analysis contains at most one fundamental pitch, and they cannot give satisfactory analysis results for a musical tone signal that contains a plurality of pitches, such as a chord (a compound-tone signal).

To extract a plurality of pitches from a compound-tone signal input, one conceivable approach is to provide, for each pitch the musical tone signal can take, a digital band-pass filter that passes that pitch component, and to detect pitches from the outputs of these filters. However, in applications that demand accurate pitch evaluation, such as music, the frequency resolution between filters must be sufficiently high. Even when only ordinary scale tones are considered, the passbands of the filters must be spaced in steps of 1/2 semitone or less, so the number of digital band-pass filters required, the amount of signal processing, and the scale of the device become very large.

[0003]

OBJECT OF THE INVENTION: Accordingly, an object of the present invention is to provide a sound signal pitch extraction device that, while relatively simple in configuration, can handle not only a sound signal containing one pitch (monophonic signal) but also a sound signal containing a plurality of pitches (polyphonic signal).

[0004]

CONSTITUTION AND OPERATION OF THE INVENTION: To achieve the above object, the present invention provides a sound signal pitch extraction device comprising: sound signal sampling means for sampling a sound signal input; spectrum extraction means for extracting the spectrum of the sampled sound signal input; and variable-number pitch extraction means for extracting a number of pitches that varies according to the sound signal input, by detecting the components of the extracted spectrum that satisfy predetermined conditions. With this configuration, all frequency components that satisfy the conditions can be extracted as pitches of the sound signal input through condition matching in the spectral domain. Even for a compound-tone signal (polyphonic signal), for which pitch extraction has conventionally been difficult, the plurality of pitches it contains can therefore be evaluated. As the condition matching method, a reduction-type pitch extraction logic can typically be adopted, which reduces the set of pitch candidates through various logical operations including comparisons. In one configuration example, the pitch extraction means comprises means for detecting, in the extracted spectrum, frequency components (amplitude spectrum components) that exceed a predetermined fundamental level, and means for selecting, from among the detected frequency components, those pitch components whose overtone components (harmonic components) exceed a predetermined overtone level as the pitches of the sound signal input or as candidates for them.

In view of the diversity of sound signal inputs to be analyzed, the setting conditions (reference values, thresholds, or reference patterns) that the pitch extraction means compares against in each condition-matching test are preferably user-programmable. For example, if the timbre of the sound signal to be analyzed can be specified to some extent, its spectral characteristics can be narrowed down. Data for a reference spectrum pattern (fundamental-overtone amplitude pattern) may therefore be prepared for each timbre; in response to a timbre designation input from the user, the reference spectrum pattern data for the designated timbre is called up and used for pitch extraction of the sound signal input. Alternatively, the pitch extraction result may be presented audibly through an electronic sound source so that the user can compare it by ear with the pitch of the original sound, and the setting conditions may be changed according to the input of the user's judgment.

Another feature of the invention concerns quantization of the extracted pitch (for example, scale quantization). According to this feature, there is provided a sound signal pitch extraction device comprising: variable sampling frequency setting means for variably setting the sampling frequency; sound signal sampling means for sampling the sound signal input at the set sampling frequency; spectrum extraction means for extracting the spectrum of the sampled sound signal; variable-number pitch extraction means for extracting, based on the extracted spectrum and the set sampling frequency, a number of pitches that varies according to the sound signal input; and quantization means for quantizing the extracted pitches. Let the sampling frequency be f and the number of analysis samples be N. The frequency resolution Δf obtained by spectrum extraction from this sequence of N sampled sound signal values is $\Delta f = f/N$. In other words, in the extracted spectrum (a set of line spectra), the first line spectrum has frequency f/N, the second 2 × f/N, and likewise the i-th has frequency i × f/N. Among these line spectra, those that satisfy the conditions give the estimate of the pitch of the original sound, but the estimated pitch does not coincide exactly with the actual pitch of the original sound; in the worst case it deviates by 1/2 of the frequency resolution. Consequently, when quantization is applied to the extracted pitch, for example scale quantization in semitone steps, the pitch may be quantized to a scale tone a semitone above or below the pitch of the original sound. In such a case, error-free scale quantization becomes possible by changing the sampling frequency so as to shift the positions of the extracted line spectra and make the deviation between the frequency of the line spectrum evaluated as the pitch of the sound signal and the actual scale-tone pitch of the original sound sufficiently small. It is even more convenient to switch the sampling frequency within a single sampling process so that sampling need not be repeated; for example, the first N samples are taken at a first sampling frequency and the next N samples at a sampling frequency slightly different from the first.

Furthermore, as a countermeasure against quantization errors in pitch extraction, the invention can also adopt the approach of switching the number of samples of the sound signal input from which the spectrum is extracted. That is, there is provided a sound signal pitch extraction device comprising: sound signal sampling means for sampling a sound signal input; spectrum extraction means for extracting the spectrum of the sampled sound signal input for each of a plurality of different sample counts; variable-number pitch extraction means for extracting, based on the spectrum extracted for each sample count, a number of pitches that varies according to the sound signal input; and quantization means for quantizing the extracted pitches. Compared with a configuration that extracts pitch once and then switches over in order to extract pitch again, this configuration requires no repeated sampling; compared with a configuration that switches the sampling frequency during a single sampling process, it saves sample-data storage capacity. Preferably, means is provided that, when the pitch obtained from the spectrum extracted for a given sample count deviates from the scale-tone pitch of the original sound so that scale quantization of the pitch may produce an error, changes the number of samples used for spectrum extraction and detects a line spectrum whose pitch is sufficiently close to the scale-tone pitch of the original sound. Scale quantization errors can thereby be removed efficiently.
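The resolution relation above can be made concrete with a short worked example. The values f = 20 kHz and N = 1024 are borrowed from the embodiment described later; the comparison with the semitone spacing near 440 Hz is an added illustration, not a statement from the source.

```latex
\Delta f = \frac{f}{N}, \qquad f_i = i\,\Delta f, \qquad
\lvert f_i - f_{\mathrm{true}} \rvert \le \frac{\Delta f}{2}
\quad \text{(for the line spectrum nearest the true pitch)}

f = 20\,000\ \mathrm{Hz},\ N = 1024
\;\Rightarrow\;
\Delta f = \frac{20\,000}{1024} \approx 19.53\ \mathrm{Hz},
\qquad \frac{\Delta f}{2} \approx 9.77\ \mathrm{Hz}

\text{Semitone spacing just above } 440\ \mathrm{Hz}:\quad
440\,\bigl(2^{1/12} - 1\bigr) \approx 26.16\ \mathrm{Hz}
```

Under these assumptions the worst-case error of about 9.8 Hz is a large fraction of one semitone step near 440 Hz, which is why a line spectrum can land close to the boundary between two scale tones.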

[0005]

EMBODIMENT: An embodiment of the invention is described below with reference to the drawings. FIG. 1 shows the overall configuration of the embodiment. The digital signal processor (DSP) 1 is a type of CPU designed for high-speed digital computation; for example, it can execute a multiply instruction in one machine cycle. It performs operations according to the program and data tables in the program/data ROM 2a and, as the corresponding processing, outputs control signals to the various elements and performs input and output with external circuits. The program/data ROM 2a stores the program and the various data tables required for the operation of the DSP 1 and is selected by the bar MEN signal of the DSP 1. The RAM 2b is needed to handle the large amount of data involved when the DSP 1 performs the Fourier transform; it is selected by the bar MEN signal and written by the bar WE signal. The decoder 3 decodes which port has been selected when the DSP 1 executes an input/output instruction for an external port. The bus transceiver 4 is a buffer whose input/output direction can be switched; it switches direction according to the bar DEN signal, which is output when the DSP 1 executes an input instruction for an external port. Normally (when the bar DEN signal is not asserted, that is, at "H"), port A is the input and port B is the output. The outputs of latch 5 and latch 6 are normally high-impedance. In the logic gate section 7, the negative-logic AND 7c of bar DEN and bar PORT1 outputs ADI, and the negative-logic AND 7b of bar DEN and bar PORT2 outputs bar SWI; these signals put the outputs of latch 5 and latch 6 into the active state. The control switch section 8 consists of the various control switches required for the operation of the embodiment and sets 16-bit data in latch 6 by means of the TRIG signal.

[0006]

An analog sound signal input from outside passes through the anti-aliasing filter 9, which removes harmonic components at or above 1/2 of the sampling frequency. The sample-and-hold circuit 10 then holds the signal on the sampling clock SCK, the A/D converter 11 performs analog-to-digital conversion on the sampling clock bar SCK, and the result is set in latch 5 by the SCK signal.

[0007]

The parallel-serial converter 12 performs parallel-to-serial conversion in response to the bar PSO signal, which is output from the negative-logic AND 7a of bar PORT0 and bar WE in the logic gate circuit 7 when the DSP 1 executes an output instruction to port 0, and outputs the serial output through the buffer 13 as the MIDIOUT signal. As shown in detail in FIG. 2, the clock generator 14 generates the operating clock CK (20 MHz) of the DSP 1 in the oscillation circuit 14a. In addition, as one feature of the embodiment, the clock generator 14 can generate finely adjustable sampling clocks SCK and bar SCK. It has a data latch circuit 14b, which latches sampling frequency designation data from the 16-bit data bus in response to the bar FQS signal generated by the negative-logic AND 7d of bar PORT3 and bar WE in the logic gate circuit 7, and a comparator 14d, which compares the sampling frequency designation data from the data latch circuit 14b with the count from a 16-bit counter 14c driven by the 20 MHz clock CK. The coincidence signal pulse of the comparator 14d clears the 16-bit counter 14c and is also passed through a toggle circuit 14e, whose toggle output yields the sampling clock signal SCK, the level of which switches at the sampling period corresponding to the sampling frequency designation data. An inverter 14f further provides the complementary sampling clock signal bar SCK. For example, to obtain a sampling clock of 20 kHz, the sampling frequency designation data is set to 5000 (decimal), that is, 1388 (hexadecimal), as shown in FIG. 3.

[0008]

FIGS. 4 and 5 show the operation flow of the embodiment. The operation flow consists of the main processing shown in FIG. 4 and the interrupt processing shown in FIG. 5, which is entered by a jump from the main processing on the bar INT signal (bar SCK) generated every sampling period and which captures the A/D value and related data.

[0009]

First, the main processing shown in FIG. 4 is described. When the power is turned on, the power-on initial processing (S1) clears and initializes the external and internal RAM of the DSP 1 and initializes the external circuits connected to the DSP 1. This processing includes initializing the data latch circuit 14b of the clock generator 14 with sampling frequency designation data of value 1388 (hexadecimal) so that the initial sampling frequency is 20 kHz. FIG. 6 shows the address map of the external memory (ROM 2a, RAM 2b). The external ROM 2a, assigned to addresses 0000h to 03FFh, stores the program for the operation flow of the embodiment and the control data tables. The DSP 1 uses the external RAM 2b, assigned to addresses 0400h to 0FFFh, for waveform processing and as a storage buffer for the waveform data captured by the A/D converter 11, and uses its internal RAM as general-purpose registers. In detail, the first area (0400h to 07FFh) and the second area (0800h to 0BFFh) of the external RAM 2b alternate roles. In an operation cycle in which the first area serves as the waveform buffer (the buffer that stores samples from the A/D converter), the second area serves as the real-number area R(n) for the waveform-processing DFT operation. In the operation cycle after the samples have been written to the first area, the first area serves as the real-number area for the DFT operation and the second area serves as the waveform buffer. This supports situations in which sampling is performed frequently or continuously. The third area (0C00h to 0FFFh) of the external RAM 2b is used as the imaginary-number area I(n) for the waveform-processing DFT operation.

In S2 of the main routine, the buffer-full flag is checked to determine whether 1024 waveform samples have been captured by the interrupt processing. If 1024 samples have been captured, interrupts are disabled (S3), and the waveform-processing DFT area and the sampling waveform buffer area of the external RAM 2b are exchanged by switching the offset value (base address) (S4), so that sampling of the sound waveform can resume immediately after interrupts are re-enabled. Next, the address counter, which indicates the (relative) address of the waveform data A/D-converted in the interrupt processing, is cleared, the buffer-full flag is also cleared (S5), and interrupts are enabled (S6). The 1024 captured waveform data R(n), n = 0 to 1023, are then multiplied by a window function (S7). The window function smooths both ends of the 1024 finite data cut out of an unbounded time series; a Hanning window, Hamming window, Blackman window, or the like is used. For example, the Hanning window is defined as $W(n) = 0.5 - 0.5\cos(2\pi n/N)$ for $0 \le n \le N-1$ and $W(n) = 0$ outside that range. From the data R(n), I(n), n = 0 to 1023, computed by the DFT (discrete Fourier transform) subroutine (S8), an amplitude (line) spectrum consisting of 512 data is computed (S9). Using the data indicating the conditions entered with the external switches (S10), one or more spectrum numbers that satisfy the conditions are selected and converted to scale-tone data (S11), then converted to MIDI data and output to the parallel-serial converter (S12).
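Steps S7 to S9 (windowing, DFT, amplitude spectrum) can be sketched as follows. This is a minimal illustration, not the DSP firmware: the function names are hypothetical, a direct O(N²) DFT is used for clarity instead of an FFT, and the 2/N amplitude scaling is an assumption that the source does not specify.

```python
import cmath
import math


def hann_window(n_samples):
    """Hanning window W(n) = 0.5 - 0.5*cos(2*pi*n/N) for 0 <= n <= N-1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_samples)
            for n in range(n_samples)]


def amplitude_spectrum(samples):
    """Window the samples (S7), take a direct DFT (S8), return N/2 amplitudes (S9).

    Index k of the result is spectrum number k, i.e. frequency k * fs / N.
    """
    n_samples = len(samples)
    window = hann_window(n_samples)
    windowed = [s * w for s, w in zip(samples, window)]
    spectrum = []
    for k in range(n_samples // 2):
        acc = 0j
        for n, x in enumerate(windowed):
            acc += x * cmath.exp(-2j * math.pi * k * n / n_samples)
        # The 2/N scaling is an illustrative normalisation, not from the source.
        spectrum.append(2.0 * abs(acc) / n_samples)
    return spectrum


def peak_bin(spectrum):
    """Return the spectrum number with the largest amplitude."""
    return max(range(len(spectrum)), key=lambda k: spectrum[k])
```

With 1024 input samples the function returns 512 amplitudes, matching the size of the amplitude spectrum described in the text; a real implementation would normally replace the inner loops with a radix-2 FFT.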

[0010]

Next, the interrupt processing of FIG. 5 is described. When an interrupt occurs, the data captured by the A/D converter 11 is set in the waveform buffer (T1). Next, the address counter of the waveform buffer is incremented by 1 (T2). The actual address of the waveform buffer in the external RAM is determined by this counter and the waveform buffer offset value (waveform buffer base address value); each time S4 is passed, the offset value alternates between 0400h and 0800h (see FIG. 6). It is then determined whether the address counter has reached 1024 (T3); when it has, the address counter is cleared and the buffer-full flag is set (T4). Finally, because the DSP 1 used in this embodiment is placed in the interrupt-disabled state by hardware when it jumps to the interrupt processing on the bar INT signal, interrupts are enabled (T5) and control returns to the main processing.
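The interplay between the interrupt steps (T1 to T4) and the buffer exchange in the main routine (S4, S5) amounts to a two-area "ping-pong" buffer. The sketch below models it in software under stated assumptions: the class and method names are hypothetical, and Python lists stand in for the two RAM areas selected by the offset value.

```python
class PingPongBuffer:
    """Two sample areas: one fills from the A/D side while the other is analysed."""

    def __init__(self, size=1024):
        self.size = size
        self.areas = [[0] * size, [0] * size]
        self.write_area = 0       # index of the area currently used as waveform buffer
        self.address_counter = 0  # relative address inside the waveform buffer
        self.buffer_full = False

    def on_sample(self, value):
        """Interrupt-side handler, mirroring steps T1 to T4."""
        self.areas[self.write_area][self.address_counter] = value  # T1: store sample
        self.address_counter += 1                                   # T2: advance counter
        if self.address_counter == self.size:                       # T3: buffer complete?
            self.address_counter = 0                                # T4: clear counter,
            self.buffer_full = True                                 #     raise the flag

    def swap(self):
        """Main-side handler, mirroring steps S4 and S5.

        Returns the area that was just filled so that it can be analysed.
        """
        filled = self.areas[self.write_area]
        self.write_area ^= 1       # S4: exchange waveform buffer and DFT area
        self.address_counter = 0   # S5: clear the address counter
        self.buffer_full = False   # S5: clear the buffer-full flag
        return filled
```

The main loop would poll `buffer_full`, call `swap()`, and analyse the returned area while new samples accumulate in the other one. Disabling and re-enabling interrupts around the swap (S3, S6) is not modelled here.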

[0011]

The setting-condition reference shown in S11 of FIG. 4 is described in detail below. In this embodiment, the conditions for selecting one or more spectrum numbers from the amplitude spectrum are the pitch range of the fundamental, the fundamental level, and the overtone levels (harmonic levels). The operation flow of FIG. 8 (the setting-condition reference subroutine) is traced here for the case in which the DFT subroutine S8 and the amplitude spectrum computation S9 of FIG. 4 have produced the amplitude spectrum shown in FIG. 7. The example assumes that the external switches 8 have been used in S10 to set a fundamental pitch range of A₄ to A₆, a fundamental level of 0.5 or more, a second-harmonic level of 0.2 or more, and a third-harmonic level of 0.1 or more. As for the relationship between spectrum numbers and scale tones, when 1024 points are sampled at a sampling frequency of 20 kHz the frequency resolution is 20000/1024 = 19.53125 (Hz), and 19.53125 × spectrum number is the frequency (Hz) of that spectrum number; if the number of samples is too small, semitone resolution therefore cannot be obtained.

First, the designated pitch range is A₄ to A₆, and with A₄ taken as 440 Hz, A₆ = 1760 Hz. Converting each back to a spectrum number, the spectrum numbers from 22 to 91 are searched for components with an amplitude of 0.5 or more (U1). As a result, the data at spectrum numbers 27, 28, 30, 37, 38, 45, 46, and 53 satisfy the condition. Among these, a search is then made for whether the second harmonic (54, 56, 60, 74, ...) of each spectrum number, taken as a fundamental, satisfies its condition (U2). As a result, spectrum numbers 27, 30, 38, and 45 remain. Among these, a further search is made for whether the third harmonic satisfies its condition. As a result, spectrum numbers 30, 38, and 45 remain. Converting these from spectrum numbers to frequencies gives 586 Hz, 742 Hz, and 879 Hz, respectively, and converting these to the nearest scale tones gives D₄, F♯₄, and A₅, so the corresponding scale-tone code conversion (scale-tone quantization) is performed (U4). This completes the pitch evaluation of the input sound signal by setting-condition matching, and processing proceeds to the next output step.
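The reduction-type condition matching traced above (fundamental threshold first, then each harmonic threshold in turn) can be sketched as a short filter chain. The function name and the dictionary form of the harmonic thresholds are assumptions made for illustration; the thresholds 0.5, 0.2, and 0.1 and the search range 22 to 91 are the values used in the text.

```python
def select_fundamentals(spectrum, lo_bin, hi_bin,
                        fundamental_level, harmonic_levels):
    """Return spectrum numbers whose fundamental and harmonics pass the thresholds.

    spectrum          -- amplitude per spectrum number (list indexed by bin)
    lo_bin, hi_bin    -- inclusive search range for the fundamental
    fundamental_level -- minimum amplitude of the fundamental
    harmonic_levels   -- {harmonic order: minimum amplitude}, e.g. {2: 0.2, 3: 0.1}
    """
    # Step U1: candidates inside the pitch range that reach the fundamental level.
    candidates = [k for k in range(lo_bin, hi_bin + 1)
                  if k < len(spectrum) and spectrum[k] >= fundamental_level]
    # Steps U2 onward: drop candidates whose harmonic bin (order * k) is too weak.
    for order, level in sorted(harmonic_levels.items()):
        candidates = [k for k in candidates
                      if order * k < len(spectrum)
                      and spectrum[order * k] >= level]
    return candidates
```

A call such as `select_fundamentals(spectrum, 22, 91, 0.5, {2: 0.2, 3: 0.1})` reproduces the settings of the worked example. Because each pass only removes candidates, the number of pitches returned varies with the input, from none to several.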

[0012]

As described above, this embodiment refers to the set levels of a fundamental/overtone series as the setting conditions for the spectrum extracted from the sound signal input, and evaluates the one or more pitches contained in the sound signal input by finding the line spectra that are the fundamentals of fundamental/overtone series satisfying those conditions. Pitches can therefore be extracted not only from single tones but also from compound tones, which is particularly useful when a chord signal is given as the sound signal input.

【００１３】サンプリング周波数を変えたときの、音階
音とスペクトルナンバーの示す周波数との対応を第９図
に示す。同図の（ｂ）に示すように、１０２４点のサン
プルでサンプリング周波数が２０ＫＨｚの場合、スペク
トルナンバ−２６が音階音Ｂ₄とＣ₄の中間になってしま
いどちらか判定できない。このようなポイントが各サン
プリング周波数で必ず生じてしまい、特に低い周波数帯
ほど音程間の周波数サンプリングがせまいため判定でき
ないことが多くなる。この問題を改善するためにはサン
プルポイントを多くして各スペクトルナンバー間の周波
数分解能を上げれば良いがそうするとＤＦＴ演算時間が
増大するという欠点がある。そこで、この実施例では、
サンプリング周波数を微調整可能にすることでピッチの
音階音量子化エラーの問題を克服している。例えば、２
０ＫＨｚのサンプリング周波数に対する音信号スペクト
ルのピッチ抽出でスペクトルナンバー２６が基音として
評価されたとすると、このスペクトルナンバー２６の周
波数が音階音Ｃ₄とＢ₄の丁度、中間にあるので、正しい
音階音量子化を行い得ない。しかし、サンプリング周波
数を例えば、２１ＫＨｚに切り替えて、再度、ピッチ抽
出を試み、それにより、スペクトルナンバー２５が基音
として評価されたとすると、このスペクトルナンバー２
５の周波数は５１３ＨｚでＣ₄ に十分近いので、Ｃ₄と
判定することができる。このようなサンプリング周波数
の切替は制御スイッチ部８で設定でき、メインフロー
（図４）のＳ１０でサンプリング周波数の指定変更が読
まれたとき、そのサンプリング周波数指定データがクロ
ックジェネレータ１４のデータラッチ回路１４ｂにセッ
トされ、指定された周波数のサンプリングクロックＳＣ
Ｋ、バーＳＣＫが形成される。この場合、サンプル数は
固定なので、例えば１０２４のような２のべき乗のサン
プル数を選択することにより、ＤＦＴ（離散フーリェ変
換）を基数２による通常の高速フーリェ変換（ＦＦＴ）
で実現できる。FIG. 9 shows the correspondence between the scale tones and the frequencies indicated by the spectrum numbers as the sampling frequency is changed. As shown in (b) of the figure, with 1024 samples at a sampling frequency of 20 kHz, spectrum number 26 falls midway between the scale tones B₄ and C₄, so it cannot be determined which of the two it represents. Such points inevitably occur at every sampling frequency, and the lower the frequency band, the coarser the spacing of the spectrum numbers becomes relative to the spacing of the scale tones, so that a determination is often impossible. One remedy is to increase the number of sample points and thereby raise the frequency resolution between spectrum numbers, but this has the disadvantage of increasing the DFT computation time. In this embodiment, therefore, the problem of scale quantization error is overcome by making the sampling frequency finely adjustable. For example, if pitch extraction on the spectrum of a tone signal sampled at 20 kHz evaluates spectrum number 26 as the fundamental, the frequency of spectrum number 26 lies exactly midway between the scale tones C₄ and B₄, so correct scale quantization cannot be performed. However, if the sampling frequency is switched to, for example, 21 kHz, pitch extraction is tried again, and spectrum number 25 is evaluated as the fundamental, then the frequency of spectrum number 25 is 513 Hz, which is sufficiently close to C₄, and the tone can be determined to be C₄. The sampling frequency can be switched from the control switch unit 8. When a change in the designated sampling frequency is read in S10 of the main flow (FIG. 4), the sampling frequency designation data is set in the data latch circuit 14b of the clock generator 14, and the sampling clocks SCK and SCK-bar of the designated frequency are generated. In this case the number of samples is fixed, so by choosing a sample count that is a power of 2, such as 1024, the DFT (discrete Fourier transform) can be implemented as an ordinary radix-2 fast Fourier transform (FFT).
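
The arithmetic behind this example can be checked with a short sketch. It is an illustration only and not part of the disclosure: the function names, the equal-tempered reference of A = 440 Hz, and the reporting of pitch classes without octave numbers are assumptions made here.

```python
import math

# Pitch classes counted in semitones upward from A (assumed naming for this sketch).
PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def bin_frequency(spectrum_number, fs_hz, n_samples):
    """Frequency in Hz represented by a DFT spectrum number (bin index)."""
    return spectrum_number * fs_hz / n_samples


def nearest_scale_tone(freq_hz, a_ref_hz=440.0):
    """Return (pitch class, deviation in cents) of the nearest scale tone.

    Assumes equal temperament tuned to A = a_ref_hz; the patent does not
    fix a reference pitch, so this is an assumption of the sketch.
    """
    semitones = 12.0 * math.log2(freq_hz / a_ref_hz)
    nearest = round(semitones)
    cents = (semitones - nearest) * 100.0
    return PITCH_CLASSES[nearest % 12], cents


# The two cases discussed in the text: 1024 samples at 20 kHz and at 21 kHz.
for fs_hz, number in ((20000, 26), (21000, 25)):
    f = bin_frequency(number, fs_hz, 1024)
    name, cents = nearest_scale_tone(f)
    print(f"fs={fs_hz} Hz, spectrum number {number}: {f:.1f} Hz, "
          f"nearest {name} ({cents:+.0f} cents)")
```

Under these assumptions, spectrum number 26 at 20 kHz lies almost half a semitone from its nearest tone, whereas spectrum number 25 at 21 kHz falls on the C side of the B/C boundary.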

【００１４】以上で、実施例の説明を終えるが、この発
明の範囲内で種々の変形、変更が容易である。This concludes the description of the embodiment; however, various modifications and changes can readily be made within the scope of the present invention.

【００１５】例えば、ピッチ量子化に関し、一般的な状
況では、音信号入力のソース（音源）の音階音の周波数
は不明である。しかし、マイクロチューニングの操作子
を設けることで、音源に合ったピッチ量子化も可能であ
る。例えば、原音とＭＩＤＩ出力され、再生される評価
ピッチの楽音とを使用者が聴き比べ、合わなければ、マ
イクロチューニング操作子を動かす。このマイクロチュ
ーニング操作子からのデータを修正パラメータとして音
階音コードをＤＳＰ１で再評価し、再評価したピッチの
楽音を再生する。聴覚テストで一致したときのマイクロ
チューニング操作子データを利用することにより、音信
号入力ソース（音源）の音階音の周波数を正確に評価で
きる。音源のすべての音階音について個別に聴覚テスト
を行ってその周波数を評価してもよいが、平均律に従う
音源であれば、１点の音階音を評価することで残る音階
音は自動的に評価できる。また、そうでないような場合
でも、何点（例えば１オクターブ間隔）かの音階音を評
価することで、残りの音階音を補間によって近似し得
る。For example, with regard to pitch quantization, the frequencies of the scale tones of the source of the sound signal input (the sound source) are generally unknown. However, providing a micro-tuning control makes pitch quantization matched to the sound source possible. For example, the user compares by ear the original sound with the tone of the evaluated pitch that is output via MIDI and reproduced, and moves the micro-tuning control if they do not match. The DSP 1 re-evaluates the scale tone code using the data from the micro-tuning control as a correction parameter, and the tone of the re-evaluated pitch is reproduced. By using the micro-tuning control data obtained when the two match in the listening test, the frequencies of the scale tones of the sound source can be evaluated accurately. A listening test may be performed individually for every scale tone of the sound source, but if the sound source follows equal temperament, evaluating a single scale tone allows the remaining scale tones to be evaluated automatically. Even when it does not, evaluating scale tones at several points (for example, at one-octave intervals) allows the remaining scale tones to be approximated by interpolation.
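
The two cases at the end of this paragraph can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function names, the use of MIDI-style note indices, and the choice of linear interpolation in log-frequency are assumptions made here.

```python
import math


def equal_tempered_scale(ref_freq_hz, ref_index, indices):
    """Derive every scale tone from one evaluated reference tone.

    Valid only if the sound source follows equal temperament; note indices
    are counted in semitones (MIDI-style numbering is assumed here).
    """
    return {i: ref_freq_hz * 2.0 ** ((i - ref_index) / 12.0) for i in indices}


def interpolate_scale_tone(measured, index):
    """Approximate a tone lying between measured tones of a non-equal-tempered source.

    `measured` maps note index -> frequency evaluated by the listening test,
    e.g. at one-octave intervals. Interpolation is linear in log2(frequency),
    which is an assumption of this sketch.
    """
    points = sorted(measured.items())
    for (i0, f0), (i1, f1) in zip(points, points[1:]):
        if i0 <= index <= i1:
            t = (index - i0) / (i1 - i0)
            return 2.0 ** ((1.0 - t) * math.log2(f0) + t * math.log2(f1))
    raise ValueError("index lies outside the measured range")


# One reference tone evaluated as 442 Hz at index 69 fixes the whole scale.
scale = equal_tempered_scale(442.0, 69, range(57, 82))
# Three tones measured one octave apart; the tone halfway between is interpolated.
halfway = interpolate_scale_tone({57: 220.0, 69: 441.0, 81: 884.0}, 63)
```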

【００１６】音源の音階音ピッチが既知の場合、あるい
は上述したような方法で音源の各音階音のピッチを決定
した後で、音源からの和音等のピッチを音階音で評価す
るような場合において、スペクトル抽出、ピッチ分析の
処理速度が十分速ければ、実時間ベースで音階音を再生
可能である。例えば、実施例のように２０ＫＨｚ程度の
サンプリング周波数で１０２４ポイントをＤＦＴ処理
し、ピッチ抽出する処理は今日の高速ＤＳＰの能力によ
ってサンプリング時間より短い時間内で実現し得る。こ
のような場合、サンプリングしながら、ピッチ抽出が可
能なので、サンプリング周波数を分析区間ごとに切り替
えてサンプリングを行うことにより、誤りのない音階
音量子化を効率よく行える。例えば、最初の１０２４ポ
イントのサンプルを２０ＫＨｚでサンプリングし、それ
に対して、ＤＦＴ処理を施し、得られたスペクトルから
設定条件に従う成分（スペクトルナンバー）を得、その
スペクトルナンバーの周波数と既知の音階音ピッチとを
比較し、スペクトルナンバーの周波数に十分近い音階音
ピッチがあればその音階音を音源からの音信号の音階音
ピッチとして評価、再生し、音階量子化が困難なスペク
トルナンバーについては評価を打ち切る。このようなＤ
ＦＴ処理、ピッチ抽出処理と並行して、２回目の１０２
４ポイントの音信号サンプルが２０ＫＨｚとは少し異な
るサンプリング周波数で取り込まれる。２回目の１０２
４ポイントのサンプルに対するＤＦＴ処理、ピッチ抽出
処理により、前回、量子化できなかったピッチを多分、
量子化することが可能になる。あるいは、音階量子化が
困難なスペクトルナンバーを得た場合に、量子化誤差
（例えばスペクトルナンバーの周波数と最寄りの音階音
の周波数との比で与えられる）を計算し、この量子化誤
差をキャンセルするようなサンプリング周波数（例え
ば、上記比を元のサンプリング周波数に乗じたもの）を
選択し、そのサンプリング周波数で次の１０２４ポイン
トの音信号サンプルを取り込み、それに対してスペクト
ル分析を行うようにすれば、確実なピッチ量子化が可能
になる。When the scale-tone pitches of the sound source are known, or after the pitch of each scale tone of the sound source has been determined by the method described above, and the pitches of chords and the like from the sound source are to be evaluated as scale tones, the scale tones can be reproduced on a real-time basis provided that the spectrum extraction and pitch analysis are fast enough. For example, as in the embodiment, DFT processing of 1024 points at a sampling frequency of about 20 kHz and the subsequent pitch extraction can be completed by today's high-speed DSPs in less time than it takes to acquire the samples. In such a case pitch extraction can proceed while sampling continues, so error-free scale-tone quantization can be performed efficiently by switching the sampling frequency for each analysis interval. For example, the first 1024 points are sampled at 20 kHz and subjected to DFT processing, and the components (spectrum numbers) that meet the set conditions are obtained from the resulting spectrum. The frequency of each spectrum number is compared with the known scale-tone pitches. If a scale-tone pitch is sufficiently close to the frequency of the spectrum number, that scale tone is evaluated and reproduced as the scale-tone pitch of the sound signal from the sound source; for spectrum numbers that are difficult to quantize, evaluation is abandoned. In parallel with this DFT processing and pitch extraction, the second set of 1024 sound signal samples is acquired at a sampling frequency slightly different from 20 kHz. The DFT processing and pitch extraction on this second set of 1024 samples will probably make it possible to quantize the pitch that could not be quantized the previous time. Alternatively, when a spectrum number that is difficult to quantize is obtained, a quantization error (given, for example, by the ratio between the frequency of the spectrum number and the frequency of the nearest scale tone) is calculated, and a sampling frequency that cancels this quantization error (for example, the original sampling frequency multiplied by that ratio) is selected. The next 1024 sound signal samples are acquired at that sampling frequency and subjected to spectrum analysis, which enables reliable pitch quantization.
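
The error-cancelling choice of sampling frequency described at the end of this paragraph can be illustrated with a short calculation. This is a sketch under stated assumptions: the function name is hypothetical, and the ratio is taken here as scale-tone frequency divided by spectrum-number frequency (the text does not fix the orientation) so that the same spectrum number lands on the scale tone after retuning.

```python
def retuned_sampling_frequency(fs_hz, n_samples, spectrum_number, scale_tone_hz):
    """Sampling frequency at which `spectrum_number` coincides with `scale_tone_hz`.

    The quantization error is expressed as a frequency ratio and the original
    sampling frequency is multiplied by it. The orientation of the ratio
    (scale tone over bin frequency) is an assumption of this sketch.
    """
    bin_hz = spectrum_number * fs_hz / n_samples
    ratio = scale_tone_hz / bin_hz
    return fs_hz * ratio


# Spectrum number 26 at 20 kHz / 1024 points is ambiguous; retune so that it
# sits on a scale tone of 523.25 Hz (an assumed A = 440 Hz equal-tempered C).
new_fs = retuned_sampling_frequency(20000.0, 1024, 26, 523.25)
new_bin_hz = 26 * new_fs / 1024
```

With the retuned clock, a tone at the assumed scale-tone frequency falls on the centre of spectrum number 26, while a tone one semitone away falls near a neighbouring spectrum number, so the two are no longer confused.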

【００１７】更に、ピッチ量子化に関し、量子化エラー
をなくすために、スペクトルの分析区間を定める音信号
のサンプル数を若干、変更できるようにしてもよい。例
えば、１０２４個のサンプル数の代りにこれより、若
干、少ないサンプル数をＤＦＴ処理することにより、周
波数分解能を少しずらして、評価するピッチについては
量子化の判定が確実になるような線スペクトル（スペク
トルナンバー）が得られるようにする。この場合、ＤＦ
Ｔ処理を高速化するために、例えば、チャープＺ変換
（chirp Ｚ transform：ＣＺＴ）処理を採用できる。Furthermore, with regard to pitch quantization, the number of sound signal samples that defines the spectrum analysis interval may be made slightly adjustable in order to eliminate quantization errors. For example, by applying DFT processing to a number of samples slightly smaller than 1024, the frequency resolution is shifted slightly so that a line spectrum (spectrum number) is obtained for which quantization of the pitch being evaluated can be decided reliably. In this case, chirp Z transform (CZT) processing, for example, can be adopted to speed up the DFT processing.
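
How a slightly smaller sample count shifts the spectrum numbers can be seen from a small search. This is an illustrative sketch only: the function name, the search range of 960 to 1024 samples, and the 523.25 Hz target are assumptions made here, and the transform itself (DFT or CZT) is not shown.

```python
def best_sample_count(fs_hz, target_hz, n_min, n_max):
    """Find the sample count whose nearest spectrum number best matches target_hz.

    Returns (n_samples, spectrum_number, error_hz). The sampling frequency
    stays fixed, so already-acquired samples can be reused.
    """
    best = None
    for n in range(n_min, n_max + 1):
        k = round(target_hz * n / fs_hz)   # spectrum number closest to the target
        err = abs(k * fs_hz / n - target_hz)
        if best is None or err < best[2]:
            best = (n, k, err)
    return best


# At 20 kHz, look for a count just below 1024 that places a spectrum number
# close to an assumed scale tone of 523.25 Hz.
n, k, err = best_sample_count(20000.0, 523.25, 960, 1024)
```

Because such a count is generally not a power of 2, a radix-2 FFT no longer applies directly, which is why the text suggests the chirp Z transform as a way to keep the computation fast.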

【００１８】[0018]

【発明の効果】最後に、この発明考案の効果について述
べる。請求項１では音信号のスペクトル分析において、
所定の条件を満足するスペクトル成分を検出することに
より音信号の種類に応じて数の可変のピッチを抽出して
いるので、比較的簡単な構成でありながら単音（モノフ
ォニック）だけでなく和音のような複音（ポリフォニッ
ク）のピッチも抽出できる利点があり、和音の学習等に
有益である。請求項２では基音と倍音の系列に着目し、
設定した基音レベルと倍音レベルをもつ基音／倍音系列
を抽出スペクトル上でサーチすることにより、ピッチを
抽出しているので、入力される音信号の音色が既知であ
るような場合に特に有効であり、正確なピッチ評価を与
えることができる。請求項３ではサンプリング周波数を
可変に設定し、設定したサンプリング周波数と抽出スペ
クトルとに基づいてピッチを抽出し、量子化しているの
で、評価ピッチの量子化エラーを少なくすることができ
る。また、請求項４では、スペクトルを抽出するための
音信号のサンプル数を異ならせてピッチ抽出を行ってい
るので、請求項３と同様に評価ピッチの量子化エラーを
少なくすることができ、更に、サンプリングをやり直す
必要がない。Finally, the effects of the present invention are described. In claim 1, in the spectrum analysis of the sound signal, spectral components satisfying predetermined conditions are detected, so that a variable number of pitches is extracted according to the type of sound signal. Despite its relatively simple configuration, the device can therefore extract the pitches not only of single tones (monophonic) but also of complex tones (polyphonic) such as chords, which is useful for learning chords and the like. In claim 2, attention is paid to the fundamental/harmonic series: the pitch is extracted by searching the extracted spectrum for a fundamental/harmonic series having the set fundamental level and harmonic level. This is particularly effective when the timbre of the input sound signal is known, and gives an accurate pitch evaluation. In claim 3, the sampling frequency is variably set, and the pitch is extracted and quantized based on the set sampling frequency and the extracted spectrum, so the quantization error of the evaluated pitch can be reduced. In claim 4, pitch extraction is performed with different numbers of sound signal samples used for extracting the spectrum, so the quantization error of the evaluated pitch can be reduced as in claim 3, and moreover there is no need to repeat the sampling.
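
The fundamental/harmonic search summarized for claim 2 can be sketched as below. This is a simplified illustration, not the procedure of FIG. 8: the function name, the use of a plain list of magnitudes indexed by spectrum number, and the check of a single harmonic (the second by default) are assumptions made here.

```python
def extract_pitches(spectrum, fundamental_level, harmonic_level, harmonic=2):
    """Return the spectrum numbers judged to be pitches (or pitch candidates).

    A spectrum number k is kept when its magnitude exceeds `fundamental_level`
    and the magnitude at `harmonic * k` exceeds `harmonic_level`. The number of
    results varies with the input, so chords yield several pitches.
    """
    pitches = []
    for k in range(1, len(spectrum)):
        if spectrum[k] <= fundamental_level:
            continue  # too weak to be a fundamental
        h = harmonic * k
        if h < len(spectrum) and spectrum[h] > harmonic_level:
            pitches.append(k)
    return pitches


# Toy spectrum: two tones with second harmonics, plus one isolated strong line.
spectrum = [0.0] * 64
spectrum[10], spectrum[20] = 1.0, 0.5
spectrum[13], spectrum[26] = 0.9, 0.4
spectrum[30] = 0.8
found = extract_pitches(spectrum, fundamental_level=0.6, harmonic_level=0.3)
```

In this toy input the isolated line at spectrum number 30 is rejected because it has no matching harmonic, while the two tones with harmonics are both reported.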

【図面の簡単な説明】[Brief description of drawings]

【図１】この発明の実施例に係るピッチ抽出装置の全体
構成図である。FIG. 1 is an overall configuration diagram of a pitch extraction device according to an embodiment of the present invention.

【図２】図１のクロックジェネレータ１４の構成図であ
る。FIG. 2 is a configuration diagram of the clock generator 14 of FIG. 1.

【図３】サンプリング周波数を２０ＫＨｚにしたときの
クロックジェネレータの動作のタイムチャートである。FIG. 3 is a time chart of the operation of the clock generator when the sampling frequency is set to 20 KHz.

【図４】図１のデジタルシグナルプロセッサ（ＤＳＰ）
１のメイン処理のフローチャートである。FIG. 4 is a flowchart of the main processing of the digital signal processor (DSP) 1 of FIG. 1.

【図５】図１のＤＳＰ１のインタラプト処理のフローチ
ャートである。FIG. 5 is a flowchart of the interrupt processing of the DSP 1 of FIG. 1.

【図６】外部メモリのアドレスマップを示す図である。FIG. 6 is a diagram showing an address map of an external memory.

【図７】サンプル数１０２４、サンプリング周波数２０
ＫＨｚの下での音信号のスペクトルを例示する図であ
る。FIG. 7 is a diagram illustrating the spectrum of a sound signal for 1024 samples at a sampling frequency of 20 kHz.

【図８】設定条件を参照してスペクトルからピッチを抽
出する処理のフローチャートである。FIG. 8 is a flowchart of a process for extracting a pitch from a spectrum with reference to setting conditions.

【図９】異なるサンプリング周波数におけるスペクトル
ナンバーと音階音との対応を示す図である。FIG. 9 is a diagram showing correspondence between spectrum numbers and scale sounds at different sampling frequencies.

【符号の説明】[Explanation of symbols]

１ デジタルシグナルプロセッサ、２ａ プログラムデータＲＯＭ、２ｂ 演算用ＲＡＭ、８ 各種制御スイッチ部
1: digital signal processor; 2a: program data ROM; 2b: arithmetic RAM; 8: control switch unit

Claims

【特許請求の範囲】[Claims]

【請求項１】音信号入力をサンプリングする音信号サン
プリング手段と、サンプリングした音信号入力のスペクトルを抽出するス
ペクトル抽出手段と、抽出したスペクトルのなかで所定の条件を満たす成分を
検出することにより、音信号入力に従って数が可変のピ
ッチを抽出する可変数ピッチ抽出手段と、を有することを特徴とする音信号ピッチ抽出装置。1. A sound signal pitch extraction device comprising: sound signal sampling means for sampling a sound signal input; spectrum extraction means for extracting a spectrum of the sampled sound signal input; and variable-number pitch extraction means for extracting a number of pitches that varies according to the sound signal input, by detecting components in the extracted spectrum that satisfy a predetermined condition.

【請求項２】請求項１記載の音信号ピッチ抽出装置にお
いて、前記可変数ピッチ抽出手段は、抽出されたスペクトルのなかで所定の基音レベルを超え
る周波数成分を検出する手段と、検出された周波数成分のなかでその倍音成分が所定の倍
音レベルを超える周波数成分を前記ピッチまたはその候
補として選択する手段と、を有することを特徴とする音信号ピッチ抽出装置。2. The sound signal pitch extraction device according to claim 1, wherein the variable-number pitch extraction means comprises: means for detecting frequency components in the extracted spectrum that exceed a predetermined fundamental level; and means for selecting, as the pitch or a candidate thereof, those detected frequency components whose harmonic components exceed a predetermined harmonic level.

【請求項３】サンプリング周波数を可変に設定する可変
サンプリング周波数設定手段と、設定されたサンプリング周波数で音信号入力をサンプリ
ングする音信号サンプリング手段と、サンプリングした音信号のスペクトルを抽出するスペク
トル抽出手段と、抽出したスペクトルと設定されたサンプリング周波数と
に基づいて、音信号入力に従って数が可変のピッチを抽
出する可変数ピッチ抽出手段と、抽出されたピッチを量
子化する量子化手段と、を有することを特徴とする音信号ピッチ抽出装置。3. A sound signal pitch extraction device comprising: variable sampling frequency setting means for variably setting a sampling frequency; sound signal sampling means for sampling a sound signal input at the set sampling frequency; spectrum extraction means for extracting a spectrum of the sampled sound signal; variable-number pitch extraction means for extracting, based on the extracted spectrum and the set sampling frequency, a number of pitches that varies according to the sound signal input; and quantization means for quantizing the extracted pitches.

【請求項４】音信号入力をサンプリングする音信号サン
プリング手段と、複数の異なるサンプル数について、サンプリングした音
信号入力のスペクトルを抽出するスペクトル抽出手段
と、各サンプル数について抽出したスペクトルに基づいて、
音信号入力に従って数が可変のピッチを抽出する可変数
ピッチ抽出手段と、抽出したピッチを量子化する量子化手段と、を有することを特徴とする音信号ピッチ抽出装置。4. A sound signal pitch extraction device comprising: sound signal sampling means for sampling a sound signal input; spectrum extraction means for extracting a spectrum of the sampled sound signal input for each of a plurality of different sample counts; variable-number pitch extraction means for extracting, based on the spectrum extracted for each sample count, a number of pitches that varies according to the sound signal input; and quantization means for quantizing the extracted pitches.