JPS59102296A - Pitch extraction - Google Patents

Pitch extraction

Info

Publication number
JPS59102296A
Authority
JP
Japan
Prior art keywords
pitch
value
pitch period
determined
period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP21293582A
Other languages
Japanese (ja)
Inventor
泰助 渡辺
平岡 省二
達也 木村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd
Priority to JP21293582A
Publication of JPS59102296A

Landscapes

  • Working-Up Tar And Pitch (AREA)
  • Fats And Perfumes (AREA)
  • Steroid Compounds (AREA)

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

Field of Industrial Application

The present invention relates to a pitch extraction method that accurately extracts the pitch period of speech, one of the important basic parameters in a speech analysis-synthesis system that digitizes speech and compresses its bandwidth.

Prior Art and Its Problems

The autocorrelation method is widely known as a way to extract the pitch period of the voiced parts of speech in a speech analysis-synthesis system. The method cuts a time interval (called a frame) out of the digitized speech signal, computes the autocorrelation over that interval, and takes the lag at which the autocorrelation coefficient is largest as the pitch period. Its drawback is that it can mistakenly extract a component at double or half the correct pitch period. An incorrectly extracted pitch period shows up as a marked degradation of sound quality in speech synthesis.
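The prior-art method described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the lag search range, and the synthetic test tone are all assumptions, and the unnormalized sum is only one common way to write a short-time autocorrelation.

```python
import math


def autocorrelation(frame, lag):
    """Short-time autocorrelation of one frame at a single lag (assumed form)."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))


def naive_pitch_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(frame, lag))


# Synthetic frame: a pure tone with a period of 40 samples (illustrative input).
frame = [math.sin(2 * math.pi * n / 40) for n in range(160)]
period = naive_pitch_period(frame, min_lag=20, max_lag=100)
```

On a clean tone the largest peak sits at the true period. On real speech the peaks at half or double the period can be nearly as high, which is the failure the description goes on to address.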

Object of the Invention

The object of the present invention is to eliminate the above drawback of the prior art and to provide a pitch extraction method that reliably extracts the correct pitch period.

Structure of the Invention

To achieve this object, the present invention computes the autocorrelation coefficient W(τ) from the speech data X(n) within one frame, and detects the first local maximum W(τ)max1, which is the largest value of W(τ), and the second local maximum W(τ)max2, which is the second-largest peak. W(τ)max1 and W(τ)max2 are each normalized by ISIG, the sum of the absolute values of the amplitudes of the speech data in the frame, to give AW1 and AW2. Based on AW1, AW2, the pitch period PPIT already extracted in the preceding frame, and so on, either PIT1 or PIT2, the pitch periods that give AW1 and AW2, is selected as the pitch period of the current frame.

Description of the Embodiment

The present invention is described in detail below by way of an embodiment.

FIG. 1 is a block diagram showing the configuration of an apparatus that carries out the method of the present invention. A speech signal input from a microphone or the like through input terminal 1 is sampled and quantized by AD converter 2 into digital data, and one frame of about 10 ms to 20 ms is accumulated as a unit in data buffer 3. FIG. 2 shows the relationship between the speech waveform 21 and the frames. When frame 1 is analyzed, the autocorrelator 4 computes the autocorrelation coefficient W(τ) from the speech data X(n) stored in data buffer 3 according to equation (1), where N is the number of samples in the frame. FIG. 3a shows W(τ) for frame 1.
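The framing step can be sketched as follows. The text gives only the frame duration (about 10 ms to 20 ms), so the sampling rate, the non-overlapping layout, and the function name `split_into_frames` are assumptions made for illustration.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms):
    """Split a sample sequence into consecutive, non-overlapping frames.

    A trailing partial frame is dropped.
    """
    frame_len = int(sample_rate_hz * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# 8 kHz sampling and 20 ms frames are illustrative values (160 samples per frame).
frames = split_into_frames(list(range(500)), sample_rate_hz=8000, frame_ms=20)
```

Each returned frame plays the role of the data held in data buffer 3 for one analysis step.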

The local-maximum detector 6 finds, from W(τ), the first local maximum 301, W(τ)max1, which has the largest value, and the second local maximum 302, W(τ)max2, which has the second-largest value. The values of τ that give these maxima, 303 and 304, are taken as the first pitch-period candidate PIT1 and the second candidate PIT2.
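A minimal sketch of the two-peak search performed by the local-maximum detector is given below. The definition of a local maximum (strictly above the left neighbour, not below the right one), the function name, and the toy curve are assumptions; the source does not say how peaks are detected.

```python
def two_largest_peaks(w, min_lag=1):
    """Return [(lag, value), ...] for the two largest local maxima of w.

    w[lag] holds the autocorrelation at that lag. A local maximum is a point
    strictly greater than its left neighbour and not less than its right one.
    """
    peaks = [
        (lag, w[lag])
        for lag in range(max(min_lag, 1), len(w) - 1)
        if w[lag] > w[lag - 1] and w[lag] >= w[lag + 1]
    ]
    peaks.sort(key=lambda p: p[1], reverse=True)
    return peaks[:2]


# Toy autocorrelation curve with peaks at lags 3 and 6.
w = [10.0, 4.0, 1.0, 6.0, 2.0, 0.5, 8.0, 3.0]
(pit1, w_max1), (pit2, w_max2) = two_largest_peaks(w)
```

Here `pit1` and `pit2` correspond to the candidates PIT1 and PIT2, and `w_max1` and `w_max2` to the two local maxima before normalization.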

Meanwhile, the absolute-value adder 5 computes ISIG, the sum of the absolute values of the amplitudes of the speech data, given by equation (2).

Using this ISIG, the local-maximum detector 6 normalizes the local maxima W(τ)max1 and W(τ)max2 according to equation (3) to obtain AW1 and AW2.
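Equations (1) to (3) did not survive in this text. The forms below are a hedged reconstruction for orientation only: (2) follows directly from the wording "sum of the absolute values of the amplitudes", whereas the summation limits in (1) and the plain division by ISIG in (3) are assumptions based on conventional practice and may differ from the published equations.

```latex
% Assumed forms; only (2) is fixed by the wording of the description.
W(\tau) = \sum_{n=0}^{N-1-\tau} X(n)\, X(n+\tau) \qquad \text{(1, assumed)}

\mathrm{ISIG} = \sum_{n=0}^{N-1} \lvert X(n) \rvert \qquad \text{(2)}

AW_i = \frac{W(\tau)_{\max i}}{\mathrm{ISIG}}, \quad i = 1, 2 \qquad \text{(3, assumed)}
```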

Next, the two data pairs (AW1, PIT1) and (AW2, PIT2) are temporarily stored in the first candidate buffer 7 and the second candidate buffer 8, respectively. The selector 9 judges which of the outputs of the first candidate buffer 7 and the second candidate buffer 8 is the correct pitch, selects it, and outputs the extracted pitch to output terminal 10.

The selection in selector 9 is made by one of the following two selection methods.

(Selection method 1) The correct pitch 22 and the half-period pitch 23 in frame 1 of FIG. 2 correspond to 303 and 304, respectively, in FIG. 3. Since the difference 305 between AW1 of local maximum 301 and AW2 of local maximum 302 is large, the first candidate PIT1 is judged to be the pitch.

(Selection method 2) Next, in frame 2, the half-period pitch 308 in FIG. 3b becomes the first candidate and the correct pitch 309 becomes the second candidate, so the prior-art method would extract the wrong pitch. However, because the difference 310 between AW1 of local maximum 306 and AW2 of local maximum 307 is small, a further condition is added: select the candidate closer to the pitch PPIT extracted in the preceding frame. In FIG. 3, PPIT is the pitch 303 extracted in frame 1, and the difference 311 between pitch 303 and PIT1 308 is larger than the difference 312 between pitch 303 and PIT2 309, so the second candidate PIT2 309 is selected.
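The two selection methods can be sketched as a single function. The threshold `small_diff`, the function name, and the fallback to the first candidate when no previous pitch exists are assumptions; the source decides between the methods with the regions of FIG. 4 and does not state what happens in the very first frame.

```python
def select_pitch(aw1, pit1, aw2, pit2, ppit, small_diff=0.1):
    """Choose between two pitch-period candidates.

    aw1/aw2: normalized first and second local maxima (aw1 >= aw2).
    pit1/pit2: the lags that give them.
    ppit: pitch period chosen in the previous frame, or None if unavailable.
    small_diff: hypothetical threshold separating "large" from "small" gaps.
    """
    if ppit is None or (aw1 - aw2) > small_diff:
        # Selection method 1: the first peak clearly dominates.
        return pit1
    # Selection method 2: peaks are close, so prefer continuity with the previous frame.
    return pit1 if abs(pit1 - ppit) <= abs(pit2 - ppit) else pit2


# Frame 1: clear gap, so the first candidate wins.
frame1 = select_pitch(aw1=0.80, pit1=40, aw2=0.40, pit2=20, ppit=None)
# Frame 2: near tie with the half-period candidate ranked first;
# the previous pitch breaks the tie in favour of the second candidate.
frame2 = select_pitch(aw1=0.62, pit1=20, aw2=0.60, pit2=41, ppit=frame1)
```

The second call mirrors the frame 2 situation in the text: the half-period peak is marginally higher, but the candidate nearer the previous pitch is chosen.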

Reference numeral 11 in FIG. 1 is a buffer that holds the pitch PPIT extracted in the preceding frame, which is needed for the above judgment.
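How the previous pitch is carried from frame to frame can be sketched with a small stateful object standing in for buffer 11. The class name, the `small_diff` threshold, and the behaviour before the first frame are assumptions made for illustration.

```python
class PitchTracker:
    """Keeps the previous frame's pitch, playing the role of buffer 11."""

    def __init__(self, small_diff=0.1):
        self.small_diff = small_diff  # hypothetical threshold, not from the source
        self.ppit = None              # no previous pitch before the first frame

    def update(self, cand1, cand2):
        """cand1 and cand2 are (AW, PIT) pairs; returns the chosen pitch period."""
        (aw1, pit1), (aw2, pit2) = cand1, cand2
        if self.ppit is None or aw1 - aw2 > self.small_diff:
            chosen = pit1
        else:
            chosen = min((pit1, pit2), key=lambda p: abs(p - self.ppit))
        self.ppit = chosen  # becomes PPIT for the next frame
        return chosen


tracker = PitchTracker()
track = [tracker.update(c1, c2) for c1, c2 in [
    ((0.80, 40), (0.40, 20)),  # clear first peak
    ((0.62, 20), (0.60, 41)),  # near tie, half-period candidate ranked first
    ((0.61, 82), (0.59, 42)),  # near tie, double-period candidate ranked first
]]
```

Because each decision is written back to `ppit`, a single wrong choice would also bias the following frames; the sketch does not attempt to guard against that.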

FIG. 4 shows which selection method should be used depending on the values of AW1 and AW2. In region 45 the difference between AW1 and AW2 is judged to be large and selection method 1 is adopted; in region 41 the difference between AW1 and AW2 is judged to be small and selection method 2 is used.

The three functions in the figure are:
Line 41: AW2 = AW1
Line 42: AW2 = AW1 X, -0.035
Line 43: AW2 = AW1 X, -0.235

Effects of the Invention

As described above, when selecting one of the two pitch candidates that give the first and second local maxima of the autocorrelation coefficient of the speech signal within a frame, the present invention applies different selection methods depending on the values of the candidates' normalized autocorrelation coefficients. This prevents the erroneous pitch extraction, such as half-period or double-period errors, to which conventional pitch extraction devices are prone, and makes it possible to reliably extract the correct pitch.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of an apparatus that carries out the method of the present invention. FIG. 2 is a waveform diagram showing the relationship between the speech signal and the frames, together with the pitch period that should be extracted and the pitch period that is liable to be extracted by mistake. FIGS. 3a and 3b show the normalized autocorrelation coefficients used for pitch extraction by the method of the present invention. FIG. 4 is a decision diagram for determining how one of the two pitch candidates is selected in the method of the present invention.

1: input terminal; 2: AD converter; 3: data buffer; 4: autocorrelator; 5: absolute-value adder; 6: local-maximum detector; 7, 8: buffers; 9: selector; 10: output terminal.

Agent: Patent Attorney Toshio Nakao and one other

Claims (1)

[Claims]

A pitch extraction method characterized by: dividing a speech waveform into intervals of about several times the pitch period; finding the first local maximum, at which the autocorrelation coefficient in the interval is largest, and the second local maximum, together with the first and second pitch period candidates that give the respective local maxima; determining the first pitch period candidate as the pitch period of the interval when the difference between the first and second local maxima is large; and, when the difference between the local maxima is small, determining the pitch period of the interval by selecting whichever of the first and second candidates is closer to the pitch period already determined in the immediately preceding frame.
JP21293582A 1982-12-03 1982-12-03 Pitch extraction Pending JPS59102296A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP21293582A JPS59102296A (en) 1982-12-03 1982-12-03 Pitch extraction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP21293582A JPS59102296A (en) 1982-12-03 1982-12-03 Pitch extraction

Publications (1)

Publication Number Publication Date
JPS59102296A (en) 1984-06-13

Family

ID=16630725

Family Applications (1)

Application Number Title Priority Date Filing Date
JP21293582A Pending JPS59102296A (en) 1982-12-03 1982-12-03 Pitch extraction

Country Status (1)

Country Link
JP (1) JPS59102296A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002086866A1 (en) * 2001-04-16 2002-10-31 Sakai, Yasue Compression method and apparatus, decompression method and apparatus, compression/decompression system, peak detection method, program, and recording medium

Similar Documents

Publication Publication Date Title
US7672844B2 (en) Voice processing apparatus
JPH10508389A (en) Voice detection device
JPS58130393A (en) Voice recognition equipment
JPS5862699A (en) Voice recognition equipment
JP3354252B2 (en) Voice recognition device
JPS59102296A (en) Pitch extraction
JP3266124B2 (en) Apparatus for detecting similar waveform in analog signal and time-base expansion / compression device for the same signal
JP2992324B2 (en) Voice section detection method
JPH03114100A (en) Voice section detecting device
JPH09247800A (en) Method for extracting left right sound image direction
JP3360978B2 (en) Voice recognition device
JP3190231B2 (en) Apparatus and method for extracting pitch period of voiced sound signal
KR940002853B1 (en) Adaptationally sampling method for starting and finishing points of a sound signal
JP3058569B2 (en) Speaker verification method and apparatus
KR100523905B1 (en) Dual Speech Detection Method of The Startpoint and The Endpoint in Speech Recognition
JPS6217800A (en) Voice section decision system
JPS6267598A (en) Voice section detection system
JPS60101598A (en) Voice section detector
JP2602641B2 (en) Audio coding method
JPS62183500A (en) Voice pitch extractor
JPS63108400A (en) Voice encoder
JPS5872994A (en) Signal input unit
JPS6039700A (en) Detection of voice section
JPS63223696A (en) Voice pattern generation system
JPS63262695A (en) Voice recognition system