JPS58209794A - Pattern matching system - Google Patents

Pattern matching system

Info

Publication number
JPS58209794A
JPS58209794A JP57092824A JP9282482A
Authority
JP
Japan
Prior art keywords
pattern
time series
matching
series pattern
standard
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP57092824A
Other languages
Japanese (ja)
Other versions
JPH0361955B2 (en)
Inventor
Junichi Ichikawa
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to JP57092824A priority Critical patent/JPS58209794A/en
Publication of JPS58209794A publication Critical patent/JPS58209794A/en
Publication of JPH0361955B2 publication Critical patent/JPH0361955B2/ja
Granted legal-status Critical Current

Links

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

(a) Technical Field of the Invention

The present invention relates to a speech recognition system, and in particular to an improvement in the matching of patterns represented as a time series of spectra expressing the features of speech.

(b) Prior Art and Problems

In general, in order to recognize speech by pattern matching, standard speech patterns are registered in memory as a word dictionary on a word-by-word basis, and the input speech pattern is matched against the standard patterns.

In this case, it is desirable that the input speech pattern and the standard pattern be normalized on the time axis with respect to the pronunciation of the word as far as possible. However, when one word is uttered, the duration of the word and the lengths of the phonemes within it generally vary with the speaker and with the circumstances of the utterance. For speech of this kind, dynamic programming (DP matching) and the discriminant function method are cited as representative methods. However, while dynamic programming is effective for normalization against nonlinear expansion and contraction of the time axis, it has difficulty coping with stochastic variation of the parameters; and while the discriminant function method is effective for stochastic variation of the parameters, it has the drawback that normalization along the time axis is problematic.

(c) Object of the Invention

The object of the present invention is to eliminate the above drawbacks by varying the weight of the distance calculation on the time axis and on the frequency axis, thereby combining dynamic programming and the discriminant function method and providing a pattern matching system that possesses the advantages of both.

(d) Configuration of the Invention

The present invention comprises means for matching an input time-series pattern with a standard time-series pattern by dynamic programming, and means for generating a new time-series pattern for matching from the input time-series pattern based on the result of the matching means; pattern matching is performed by carrying out a distance calculation between the new time-series pattern for matching and the standard pattern using different weights on the time axis and on the frequency axis.

(e) Embodiment of the Invention

Fig. 1 is a block diagram illustrating one embodiment of the present invention. The duration of a speech pattern is divided into segments of equal time units, and the frequency spectrum of the speech pattern is obtained in every segment. Denoting the input time series by the vectors a_1, a_2, ..., a_I and the standard time series by the vectors b_1, b_2, ..., b_J, memory 1 stores the input time-series pattern A = (a_i), i = 1 to I, and memory 3 stores the standard time-series pattern B = (b_j), j = 1 to J. In DP matching section 2, the input time-series pattern A and the standard time-series pattern B from memories 1 and 3 are matched by dynamic programming.

Details of matching by dynamic programming are given in the Journal of the Acoustical Society of Japan, Vol. 27, No. 9, pp. 483-490, September 1971. In outline, the durations of the two speech patterns A and B are divided into equal time units, times 1 to I and times 1 to J, which are taken as the scales of the vertical and horizontal axes of a plane, and the frequency spectrum of each speech pattern at each time is obtained, i.e. the vectors a_1, a_2, ..., a_I and b_1, b_2, ..., b_J. An optimal path 10 is found such that the distance between the two speech patterns A and B is minimized, and the matching distance is obtained as the cumulative distance along that optimal path 10. That is, first the distance g(1,1) = ||a_1 - b_1|| is set, and then, as shown in Fig. 2, g(i,j) is obtained by adding d(i,j), the distance between the corresponding vectors of speech patterns A and B at point (i,j), to the minimum of the three values g(i-1,j), g(i-1,j-1) and g(i,j-1). By computing g(i,j) while successively increasing i and j, g(I,J) is finally obtained as the optimal matching distance between speech patterns A and B; and since it is recorded at each grid point of the matrix, in P(i,j), which of the three immediately preceding points the optimal path passed through, the optimal matching path is obtained at the same time.
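The DP recurrence described above (g(1,1) = ||a_1 - b_1||; g(i,j) = min(g(i-1,j), g(i-1,j-1), g(i,j-1)) + d(i,j); backpointer P(i,j)) can be sketched in Python as follows. This is an illustrative sketch, not the patent's implementation, and all names are invented.

```python
import math

def dp_match(A, B):
    """DP matching of two lists of spectrum vectors A, B.
    Returns the cumulative matching distance g(I, J) and the backpointer
    table P, where P[i][j] records which predecessor the optimal path used:
    '+' for (i-1, j), '0' for (i-1, j-1), '-' for (i, j-1)."""
    I, J = len(A), len(B)
    d = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    g = [[float("inf")] * J for _ in range(I)]
    P = [[None] * J for _ in range(I)]
    g[0][0] = d(A[0], B[0])  # g(1,1) = ||a_1 - b_1||
    for i in range(I):
        for j in range(J):
            if i == 0 and j == 0:
                continue
            # the three candidate predecessors of Fig. 2
            cands = []
            if i > 0:
                cands.append((g[i - 1][j], '+'))      # from (i-1, j)
            if i > 0 and j > 0:
                cands.append((g[i - 1][j - 1], '0'))  # from (i-1, j-1)
            if j > 0:
                cands.append((g[i][j - 1], '-'))      # from (i, j-1)
            best, move = min(cands)
            g[i][j] = best + d(A[i], B[j])
            P[i][j] = move
    return g[I - 1][J - 1], P
```

The '+', '0', '-' labels follow the three path directions of Fig. 2, so the P table can be fed directly to a backtracking routine.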

From the optimal matching distance g(I,J) and the optimal path P(i,j) obtained in matrix 4, a new matching pattern C is derived from the input time-series pattern A in conversion section 5. As shown in the flowchart of Fig. 3, the optimal path is traced back from point (I,J) of matrix 4 through j = J, J-1, ..., 2, 1; the vector a_i corresponding to each j is taken as c_j, and when several vectors a_i correspond to one j, their average is taken as c_j, giving the matching pattern C. Here M is a counter used for averaging when several a_i correspond to one j. At the starting point i = I, j = J, M = 0 and c_J = a_I.

The optimal path P(i,j) records whether point (i,j) was reached, as shown in Fig. 2, by the path from point (i-1,j) ('+'), from point (i-1,j-1) ('0'), or from point (i,j-1) ('-'). When P(i,j) is '+', M = 1 at the point one step back; when P(i,j) is '0' or '-', M = 0. Going back one point from the starting point (I,J): if P(i,j) is in the '0' direction, then i = i-1, j = j-1 and M = 0, so c_{j-1} = a_{i-1}; if it is in the '-' direction, c_{j-1} = a_i; and if it is in the '+' direction, then i = i-1 and M = M+1, so c_j = (a_i + a_{i-1})/2. If the path goes back one further point in the '+' direction, then again i = i-1 and M = M+1, so c_j = (2((a_i + a_{i-1})/2) + a_{i-2})/3 = (a_i + a_{i-1} + a_{i-2})/3, which shows that three vectors are averaged.
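The Fig. 3 backtracking, including the running average controlled by the counter M, might look like the following sketch (again with invented names, and assuming P uses the '+', '0', '-' labels for the three path directions):

```python
def matching_pattern(A, P):
    """Trace the optimal path back from (I, J) and build C = [c_1 ... c_J],
    each c_j being the input vector (or average of input vectors) aligned
    to standard-pattern time j. A is the list of input vectors; P is the
    backpointer table ('+': from (i-1,j), '0': from (i-1,j-1),
    '-': from (i,j-1))."""
    I, J = len(A), len(P[0])
    i, j = I - 1, J - 1
    C = [None] * J
    C[j] = list(A[i])  # starting point: c_J = a_I
    M = 0              # how many extra vectors were averaged into c_j
    while i > 0 or j > 0:
        move = P[i][j]
        if move == '0':      # came from (i-1, j-1): start a fresh c_{j-1}
            i, j, M = i - 1, j - 1, 0
            C[j] = list(A[i])
        elif move == '-':    # came from (i, j-1): a_i also serves as c_{j-1}
            j, M = j - 1, 0
            C[j] = list(A[i])
        else:                # '+': came from (i-1, j): average a_{i-1} in
            i, M = i - 1, M + 1
            C[j] = [(M * c + a) / (M + 1) for c, a in zip(C[j], A[i])]
    return C
```

The running-average update reproduces the text's example: two consecutive '+' steps give c_j = (a_i + a_{i-1} + a_{i-2})/3.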

The matching pattern C = (c_j), j = 1 to J, obtained in conversion section 5 as described above is stored in memory 6. In arithmetic section 7, in order to weight the distance calculation between the new matching pattern C and the standard time-series pattern B, the distance D is computed using the weight vectors w_1, w_2, ..., w_J from memory 8, which stores the weights W = (w_j), j = 1 to J, and the final matching distance is output. The distance D is obtained, for example, as

D = Σ_{j=1}^{J} Σ_{n=1}^{N} w_{jn} (c_{jn} - b_{jn})^2

where N is the number of dimensions of the spectrum vectors of the speech pattern.
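Assuming a weighted squared-difference form for the distance D (the exact formula is not legible in the source, so this form is an assumption), the computation of arithmetic section 7 reduces to a small sketch:

```python
def weighted_distance(C, B, W):
    """Weighted distance between matching pattern C and standard pattern B,
    with one weight per time j and frequency dimension n. Assumes
    D = sum_j sum_n w_jn * (c_jn - b_jn)**2; the patent's exact functional
    form is not preserved in the source."""
    return sum(
        w * (c - b) ** 2
        for cj, bj, wj in zip(C, B, W)   # over time axis j
        for c, b, w in zip(cj, bj, wj)   # over frequency axis n
    )
```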

As for the weight W, first an average pattern is obtained from a large number of utterance patterns of a given word and taken as B. Then, treating each individual utterance pattern as A, C is obtained for each utterance by the method of the present invention described above, and from these the weight W is determined: each weight is derived from how much the corresponding element of pattern C varies over the utterances, normalized by the number of utterances made to determine the weights. By carrying out the distance calculation with different weights on the time axis and on the frequency axis as described above, matching that makes full use of dynamic programming while coping with stochastic variation of the parameters can be performed.

That is, the distance is calculated by giving a heavy weight to the spectral components that are central to the parameters and a light weight to relatively unimportant spectral components.
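The source describes the weights only qualitatively (heavy weights for central, stable spectral components; light weights for unimportant ones). One common realization consistent with that description, offered here purely as an assumption, is the reciprocal of each element's variance across the training utterances:

```python
def estimate_weights(Cs, B, eps=1e-6):
    """Estimate per-(time, frequency) weights from the matching patterns Cs
    of repeated utterances of one word, relative to the average pattern B.
    Assumption (not stated explicitly in the source): the weight is the
    reciprocal of the element's variance, so stable components get heavy
    weights and highly variable ones get light weights."""
    M = len(Cs)                   # number of utterances used for the weights
    J, N = len(B), len(B[0])
    W = []
    for j in range(J):
        row = []
        for n in range(N):
            var = sum((Cs[m][j][n] - B[j][n]) ** 2 for m in range(M)) / M
            row.append(1.0 / (var + eps))  # small variance -> heavy weight
        W.append(row)
    return W
```

With these weights, the weighted distance behaves like a per-element normalized (Mahalanobis-style) distance, which is the usual way a discriminant-function criterion absorbs stochastic parameter variation.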

(f) Effect of the Invention

As described above, the present invention makes it possible to vary the weights of the distance calculation on the time axis and on the frequency axis. A pattern matching system can therefore be provided that combines the pattern matching system based on dynamic programming, which is effective for normalization against nonlinear expansion and contraction of the time axis, with the advantages of the discriminant function method, which is effective for stochastic variation of the parameters. Variations of patterns can thus be handled effectively, and the number of standard patterns can be reduced effectively, so the effect is considerable.

[Brief Description of the Drawings]

Fig. 1 is a block diagram illustrating one embodiment of the present invention; Fig. 2 is a diagram illustrating the selection of the matching path; Fig. 3 is a flowchart for obtaining the new matching pattern. 1, 3, 6 and 8 are memories; 2 is the DP matching section; 5 is the conversion section; 7 is the arithmetic section.

Agent: Patent Attorney Koshiro Matsuoka

Claims (1)

[Claims] A pattern matching system characterized in that it comprises means for matching an input time-series pattern with a standard time-series pattern by dynamic programming, and means for generating a new time-series pattern for matching from the input time-series pattern based on the result of said matching means, and in that pattern matching is performed by carrying out a distance calculation between said new time-series pattern for matching and the standard pattern using different weights on the time axis and on the frequency axis.
JP57092824A 1982-05-31 1982-05-31 Pattern matching system Granted JPS58209794A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57092824A JPS58209794A (en) 1982-05-31 1982-05-31 Pattern matching system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57092824A JPS58209794A (en) 1982-05-31 1982-05-31 Pattern matching system

Publications (2)

Publication Number Publication Date
JPS58209794A true JPS58209794A (en) 1983-12-06
JPH0361955B2 JPH0361955B2 (en) 1991-09-24

Family

ID=14065176

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57092824A Granted JPS58209794A (en) 1982-05-31 1982-05-31 Pattern matching system

Country Status (1)

Country Link
JP (1) JPS58209794A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007279742A (en) * 2006-04-06 2007-10-25 Toshiba Corp Speaker authentication recognition method and device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5525091A (en) * 1978-08-14 1980-02-22 Nippon Electric Co Voice characteristic pattern comparator

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5525091A (en) * 1978-08-14 1980-02-22 Nippon Electric Co Voice characteristic pattern comparator

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007279742A (en) * 2006-04-06 2007-10-25 Toshiba Corp Speaker authentication recognition method and device

Also Published As

Publication number Publication date
JPH0361955B2 (en) 1991-09-24

Similar Documents

Publication Publication Date Title
Nishimura et al. Singing Voice Synthesis Based on Deep Neural Networks.
Zen et al. Statistical parametric speech synthesis using deep neural networks
Zhang et al. Transfer learning from speech synthesis to voice conversion with non-parallel training data
US20220013106A1 (en) Multi-speaker neural text-to-speech synthesis
JP3114975B2 (en) Speech recognition circuit using phoneme estimation
CN110570876B (en) Singing voice synthesizing method, singing voice synthesizing device, computer equipment and storage medium
Nakamura et al. Singing voice synthesis based on convolutional neural networks
CN112489629A (en) Voice transcription model, method, medium, and electronic device
CN113539232A (en) Muslim class voice data set-based voice synthesis method
Li et al. Styletts-vc: One-shot voice conversion by knowledge transfer from style-based tts models
KR20190016889A (en) Method of text to speech and system of the same
CN111599339A (en) Speech splicing synthesis method, system, device and medium with high naturalness
Gao et al. Personalized Singing Voice Generation Using WaveRNN.
JP3311460B2 (en) Voice recognition device
Chen et al. The USTC System for Voice Conversion Challenge 2016: Neural Network Based Approaches for Spectrum, Aperiodicity and F0 Conversion.
JP2898568B2 (en) Voice conversion speech synthesizer
Liu et al. Controllable accented text-to-speech synthesis
JPS58209794A (en) Pattern matching system
JP3102195B2 (en) Voice recognition device
Chandra et al. Towards the development of accent conversion model for (l1) bengali speaker using cycle consistent adversarial network (cyclegan)
Bhatia et al. Speaker accent recognition by MFCC using K-nearest neighbour algorithm: a different approach
Al-Radhi et al. Nonparallel expressive tts for unseen target speaker using style-controlled adaptive layer and optimized pitch embedding
JP2886474B2 (en) Rule speech synthesizer
Sung et al. Factored maximum likelihood kernelized regression for HMM-based singing voice synthesis.
JP3438293B2 (en) Automatic Word Template Creation Method for Speech Recognition