JPS58209794A - Pattern matching system - Google Patents
Info
- Publication number
- JPS58209794A (application JP57092824A)
- Authority
- JP
- Japan
- Prior art keywords
- pattern
- time series
- matching
- series pattern
- standard
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.
Description
[Detailed Description of the Invention]
(a) Technical Field of the Invention
The present invention relates to a speech recognition system, and in particular to an improvement in the matching of patterns represented as a time series of spectra expressing the features of speech.
(b) Prior Art and Its Problems
To recognize speech by pattern matching, standard speech patterns are generally registered in memory as a word dictionary, word by word, and an input speech pattern is matched against these standard patterns. In this case it is desirable that the input speech pattern and the standard pattern be normalized on the time axis, as far as possible, with respect to the pronunciation of the word. However, when a single word is uttered, the duration of the word and the lengths of the phonemes within it generally vary with the speaker and with the circumstances of the utterance. Dynamic programming (DP matching) and the discriminant function method are cited as the two representative approaches to this variation in speech. Dynamic programming is effective for normalizing nonlinear expansion and contraction of the time axis, but it is difficult for it to cope with stochastic variation of the parameters. The discriminant function method, conversely, is effective against stochastic parameter variation but has the drawback that normalization along the time axis is problematic.
(c) Object of the Invention
The object of the present invention is to eliminate the above drawbacks by varying the weight of the distance calculation along the time axis and the frequency axis, thereby combining dynamic programming and the discriminant function method and providing a pattern matching system that has the advantages of both.
(d) Constitution of the Invention
The present invention comprises means for matching an input time-series pattern against a standard time-series pattern by dynamic programming, and means for creating, from the input time-series pattern, a new time-series pattern for matching based on the result of the matching means; pattern matching is then performed by carrying out a distance calculation between the new time-series pattern for matching and the standard pattern using weights that differ along the time axis and along the frequency axis.
(e) Embodiment of the Invention
Fig. 1 is a block diagram illustrating one embodiment of the present invention. The duration of a speech pattern is divided into equal time units, and the frequency spectrum of the speech pattern is obtained in every one of these units. For the input time series the spectra are denoted by the vectors a1, a2, ..., aI, and for the standard time series by the vectors b1, b2, ..., bJ. Memory 1 then stores the input time-series pattern A = (a_i), i = 1 to I, and memory 3 stores the standard time-series pattern B = (b_j), j = 1 to J. In DP matching section 2, the input time-series pattern A and the standard time-series pattern B are read from memories 1 and 3 and matched by dynamic programming.
Details of matching by dynamic programming are given in the Journal of the Acoustical Society of Japan, vol. 27, No. 9, pp. 483-490, September 1971. In outline, the durations of the two speech patterns A and B are each divided into equal time units, times 1 to I and times 1 to J, which are taken as the scales of the vertical and horizontal axes of a plane, and the frequency spectrum of each pattern at each time is obtained, i.e. the vectors a1, a2, ..., aI and b1, b2, ..., bJ. An optimal path 10 is then found such that the distance between the two speech patterns A and B is minimized, and the matching distance is obtained as the cumulative distance along this optimal path 10. That is, first the distance g(1,1) = ‖a1 − b1‖ is set; then, as shown in Fig. 2, g(i,j) is obtained by adding d(i,j), the distance between the corresponding vectors of speech patterns A and B at point (i,j), to the minimum of the three values g(i−1,j), g(i,j−1) and g(i−1,j−1). By computing g(i,j) while successively increasing i and j, the optimal matching distance between speech patterns A and B is finally obtained as g(I,J); and by recording in P(i,j), at each grid point of the matrix, which of the three immediately preceding points the optimal path passed through, the optimal matching path is obtained at the same time.
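The DP recurrence described above can be sketched in Python as follows. The function name `dp_match`, the '+'/'0'/'-' back-pointer tags, and the Euclidean local distance are illustrative assumptions; the patent does not fix the local distance measure.

```python
import math

def dp_match(A, B):
    """DP matching between spectral sequences A (length I) and B (length J).

    Returns the cumulative distance g(I,J) and a back-pointer matrix P, where
    P[i][j] records which predecessor the optimal path used:
    '+' for (i-1,j), '0' for (i-1,j-1), '-' for (i,j-1).
    """
    def d(x, y):  # local spectral distance (Euclidean, an assumption)
        return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

    I, J = len(A), len(B)
    g = [[math.inf] * J for _ in range(I)]
    P = [[None] * J for _ in range(I)]
    g[0][0] = d(A[0], B[0])          # g(1,1) = ||a1 - b1||
    for i in range(I):
        for j in range(J):
            if i == 0 and j == 0:
                continue
            cands = []
            if i > 0:
                cands.append((g[i - 1][j], '+'))      # from (i-1, j)
            if i > 0 and j > 0:
                cands.append((g[i - 1][j - 1], '0'))  # from (i-1, j-1)
            if j > 0:
                cands.append((g[i][j - 1], '-'))      # from (i, j-1)
            best, P[i][j] = min(cands)
            g[i][j] = best + d(A[i], B[j])
    return g[I - 1][J - 1], P
```

Tracing P back from (I,J) then recovers the optimal path 10 of Fig. 2 without a second pass over the distance matrix.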
From the optimal matching distance g(I,J) and the optimal path P(i,j) obtained in matrix 4, conversion section 5 derives from the input time-series pattern A a new pattern C for matching. As shown in the flowchart of Fig. 3, the optimal path is traced back from point (I,J) of matrix 4 through j = J, J−1, ..., 2, 1; the vector c_j is set to the vector a_i corresponding to j, or, when several vectors a_i correspond to the same j, to their average, giving the matching pattern C. Here M is a counter used for averaging when several a_i correspond to one j. At the starting point i = I, j = J, M = 0 and c_J = a_I. Let the optimal path P(i,j) be marked '+' when, as shown in Fig. 2, the path reached point (i,j) from point (i−1,j), '0' when it reached (i,j) from point (i−1,j−1), and '−' when it reached (i,j) from point (i,j−1). When P(i,j) is '+', M becomes M+1 at the preceding grid point; when P(i,j) is '0' or '−', M = 0. Tracing back one step from the starting point (I,J): if P(i,j) is in the '0' direction, then i = I−1, j = J−1 and M = 0, so c_{J−1} = a_{I−1}; if it is in the '−' direction, c_{J−1} = a_I; and if it is in the '+' direction, then i = I−1 and M = M+1 = 1, so c_J = (a_I + a_{I−1})/2. If P(i,j) is traced back one further point in the '+' direction, then i = I−2 and M = M+1 = 2, so c_J = (2·(a_I + a_{I−1})/2 + a_{I−2})/3 = (a_I + a_{I−1} + a_{I−2})/3, which shows that three vectors are averaged into c_J.
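The backtracking procedure of Fig. 3 can be sketched as follows. The helper name `warp_to_standard` is hypothetical; the '+'/'0'/'−' tags follow the convention above, and the running-mean update with counter M reproduces the averaging c_J = (a_I + a_{I−1} + ... )/(M+1).

```python
def warp_to_standard(A, P):
    """Trace the back-pointer matrix P from (I,J) back to (1,1), building the
    matching pattern C of length J: c_j is the a_i aligned with b_j, averaged
    (running mean with counter M) when several a_i map to the same j."""
    I, J = len(P), len(P[0])
    i, j = I - 1, J - 1
    C = [None] * J
    C[j] = list(A[i])        # starting point: c_J = a_I
    M = 0                    # extra a_i already folded into c_j
    while i > 0 or j > 0:
        step = P[i][j]
        if step == '+':      # came from (i-1, j): same j, fold in a_{i-1}
            i -= 1
            M += 1
            C[j] = [(c * M + a) / (M + 1) for c, a in zip(C[j], A[i])]
        elif step == '0':    # came from (i-1, j-1): move diagonally
            i -= 1; j -= 1
            M = 0
            C[j] = list(A[i])
        else:                # '-': came from (i, j-1): reuse a_i for c_{j-1}
            j -= 1
            M = 0
            C[j] = list(A[i])
    return C
```

With this update, a '+' step at M = 1 yields (2·(c_j) + a)/3, matching the three-vector average worked through above.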
The matching pattern C = (c_j), j = 1 to J, obtained in conversion section 5 as described above is stored in memory 6. In computation section 7, to weight the distance calculation between the new matching pattern C and the standard time-series pattern B, the distance D is computed using the weight vectors w1, w2, ..., wJ of the weights W = (w_j), j = 1 to J, stored in memory 8, and the final matching distance is output. The distance D is, for example,

D = Σ_{j=1…J} Σ_{n=1…N} w_{j,n} · (c_{j,n} − b_{j,n})²

where N is the number of dimensions of the spectral vectors of the speech pattern.
As for the weights W, first an average pattern is obtained from a large number of utterance patterns of a given word and taken as the standard pattern B; then, taking each individual utterance pattern in turn as A, the pattern C is obtained for it by the method of the present invention described above, and the weights W are determined from the elements of these patterns C over the number of utterances made. By carrying out the distance calculation with weighting that differs on the time axis and on the frequency axis as described above, matching can be performed that makes full use of dynamic programming while also coping with stochastic variation of the parameters. That is, the distance is calculated by weighting the central spectra of the parameters heavily and weighting comparatively unimportant spectra lightly.
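The weight-estimation step can be sketched under an inverse-variance assumption. The patent states only that W is derived from the patterns C over the training utterances, so the formula below is an assumed reading consistent with weighting stable (central) spectra heavily and highly variable ones lightly:

```python
def estimate_weights(Cs, B, eps=1e-6):
    """Estimate weights from K warped training utterances Cs (each a J-by-N
    pattern C for one utterance of the word) and the average pattern B.

    Assumption: w[j][n] is the inverse of the variance of channel n at frame j
    across utterances, so channels that vary little weigh heavily.
    """
    K = len(Cs)
    J, N = len(B), len(B[0])
    W = [[0.0] * N for _ in range(J)]
    for j in range(J):
        for n in range(N):
            var = sum((Ck[j][n] - B[j][n]) ** 2 for Ck in Cs) / K
            W[j][n] = 1.0 / (var + eps)   # eps guards zero-variance channels
    return W
```

With weights of this kind, the weighted distance above behaves like a per-frame discriminant score while the DP stage has already absorbed the time-axis warping.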
(f) Effect of the Invention
As described in detail above, the present invention allows the weight of the distance calculation to be varied on the time axis and on the frequency axis, and can therefore provide a pattern matching system that combines the pattern matching method based on dynamic programming, effective for normalizing nonlinear expansion and contraction of the time axis, with the advantages of the discriminant function method, effective against stochastic variation of parameters. It can thus cope effectively with pattern variation, and at the same time the number of standard patterns can be effectively reduced, which is a considerable effect.
Fig. 1 is a block diagram illustrating one embodiment of the present invention; Fig. 2 is a diagram illustrating the selection of the matching path; Fig. 3 is a flowchart for obtaining the new matching pattern. 1, 3, 6 and 8 are memories; 2 is a DP matching section; 5 is a conversion section; and 7 is a computation section.
Agent: Patent Attorney Koshiro Matsuoka
Claims (1)
A pattern matching system characterized by comprising: means for matching an input time-series pattern against a standard time-series pattern by dynamic programming; and means for creating, based on the result of said matching means, a new time-series pattern for matching from the input time-series pattern; wherein pattern matching is performed by carrying out a distance calculation between said new time-series pattern for matching and the standard pattern using weights that differ on the time axis and on the frequency axis.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP57092824A JPS58209794A (en) | 1982-05-31 | 1982-05-31 | Pattern matching system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP57092824A JPS58209794A (en) | 1982-05-31 | 1982-05-31 | Pattern matching system |
Publications (2)
Publication Number | Publication Date |
---|---|
JPS58209794A true JPS58209794A (en) | 1983-12-06 |
JPH0361955B2 JPH0361955B2 (en) | 1991-09-24 |
Family
ID=14065176
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP57092824A Granted JPS58209794A (en) | 1982-05-31 | 1982-05-31 | Pattern matching system |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPS58209794A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007279742A (en) * | 2006-04-06 | 2007-10-25 | Toshiba Corp | Speaker authentication recognition method and device |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS5525091A (en) * | 1978-08-14 | 1980-02-22 | Nippon Electric Co | Voice characteristic pattern comparator |
- 1982-05-31: JP application JP57092824A granted as JPS58209794A (active)
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS5525091A (en) * | 1978-08-14 | 1980-02-22 | Nippon Electric Co | Voice characteristic pattern comparator |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007279742A (en) * | 2006-04-06 | 2007-10-25 | Toshiba Corp | Speaker authentication recognition method and device |
Also Published As
Publication number | Publication date |
---|---|
JPH0361955B2 (en) | 1991-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Nishimura et al. | Singing Voice Synthesis Based on Deep Neural Networks. | |
Zen et al. | Statistical parametric speech synthesis using deep neural networks | |
Zhang et al. | Transfer learning from speech synthesis to voice conversion with non-parallel training data | |
US20220013106A1 (en) | Multi-speaker neural text-to-speech synthesis | |
JP3114975B2 (en) | Speech recognition circuit using phoneme estimation | |
CN110570876B (en) | Singing voice synthesizing method, singing voice synthesizing device, computer equipment and storage medium | |
Nakamura et al. | Singing voice synthesis based on convolutional neural networks | |
CN112489629A (en) | Voice transcription model, method, medium, and electronic device | |
CN113539232A (en) | Muslim class voice data set-based voice synthesis method | |
Li et al. | Styletts-vc: One-shot voice conversion by knowledge transfer from style-based tts models | |
KR20190016889A (en) | Method of text to speech and system of the same | |
CN111599339A (en) | Speech splicing synthesis method, system, device and medium with high naturalness | |
Gao et al. | Personalized Singing Voice Generation Using WaveRNN. | |
JP3311460B2 (en) | Voice recognition device | |
Chen et al. | The USTC System for Voice Conversion Challenge 2016: Neural Network Based Approaches for Spectrum, Aperiodicity and F0 Conversion. | |
JP2898568B2 (en) | Voice conversion speech synthesizer | |
Liu et al. | Controllable accented text-to-speech synthesis | |
JPS58209794A (en) | Pattern matching system | |
JP3102195B2 (en) | Voice recognition device | |
Chandra et al. | Towards the development of accent conversion model for (l1) bengali speaker using cycle consistent adversarial network (cyclegan) | |
Bhatia et al. | Speaker accent recognition by MFCC using K-nearest neighbour algorithm: a different approach | |
Al-Radhi et al. | Nonparallel expressive tts for unseen target speaker using style-controlled adaptive layer and optimized pitch embedding | |
JP2886474B2 (en) | Rule speech synthesizer | |
Sung et al. | Factored maximum likelihood kernelized regression for HMM-based singing voice synthesis. | |
JP3438293B2 (en) | Automatic Word Template Creation Method for Speech Recognition |