JPH0465393B2

Info

Publication number: JPH0465393B2
Application number: JP62061732A
Authority: JP
Inventors: Hiroaki Sekoe
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1987-03-16
Filing date: 1987-03-16
Publication date: 1992-10-19
Also published as: JPS63226693A

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to a pattern matching method for speech recognition, in which speech uttered by a human is recognized automatically.

(Prior Art) Various pattern matching techniques have been developed for speech recognition. One of the most established and widely used is the DP matching method, as described in the Journal of the Acoustical Society of Japan, Vol. 42, No. 9 (published September 1986), p. 725. It is regarded as highly effective for aligning time-axis distortion in speech. The clockwise DP method, described in the specification of Japanese Patent Application No. Sho 56-199098, is known as an extension of DP matching to continuous word recognition. That method is presented as a continuous word recognition method with syntax control, but it naturally applies to isolated word recognition as well. To keep the explanation simple, the essential part of the clockwise DP method is described here in the form of isolated word recognition.

Word names are designated by a number $n$, and the word set $\{n \mid n = 1, 2, \ldots, N\}$ is the recognition target. Each word has a standard pattern $B^n = b^n_1, b^n_2, \ldots, b^n_j, \ldots, b^n_{J^n}$, where $j$ denotes time and $b^n_j$ is the feature of standard pattern $B^n$ at time $j$. The input speech pattern is likewise written $A = a_1 a_2 \ldots a_i \ldots a_I$.

Speech recognition is performed by computing the inter-pattern distance $D(A, B^n)$ between the input pattern $A$ and each standard pattern $B^n$, and taking the $n$ that minimizes it as the recognition result. In DP matching, this distance is computed with the following recurrence.

Initial condition:

$$g^n(1, 1) = d^n(1, 1) \tag{1}$$

Recurrence:

$$g^n(i, j) = d^n(i, j) + \min\{\, g^n(i-1, j),\ g^n(i-1, j-1),\ g^n(i-1, j-2) \,\}, \quad i = 1, 2, \ldots, I, \quad j = 1, 2, \ldots, J^n \tag{2}$$

Inter-pattern distance:

$$D(A, B^n) = g^n(I, J^n) \tag{3}$$

Here $d^n(i, j) = \lVert a_i - b^n_j \rVert$ is the distance between the features $a_i$ and $b^n_j$. The quantity $g^n(i, j)$ computed by equation (2) is called the optimal cumulative distance.
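The recurrence (1) to (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `local_distance` and `dp_distance` are hypothetical, the Euclidean norm is one possible choice for the distance, and cells that no path can reach are simply left at infinity.

```python
import math

def local_distance(a, b):
    """Euclidean distance between two feature vectors (an assumed choice of norm)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dp_distance(A, B):
    """Pattern distance D(A, B) computed with equations (1) to (3).

    A: input pattern, a list of I feature vectors.
    B: standard pattern, a list of J feature vectors.
    """
    I, J = len(A), len(B)
    INF = math.inf
    g = [[INF] * J for _ in range(I)]
    g[0][0] = local_distance(A[0], B[0])            # initial condition (1)
    for i in range(1, I):
        for j in range(J):
            best = g[i - 1][j]                      # predecessor (i-1, j)
            if j >= 1:
                best = min(best, g[i - 1][j - 1])   # predecessor (i-1, j-1)
            if j >= 2:
                best = min(best, g[i - 1][j - 2])   # predecessor (i-1, j-2)
            if best < INF:
                g[i][j] = local_distance(A[i], B[j]) + best  # recurrence (2)
    return g[I - 1][J - 1]                          # inter-pattern distance (3)
```

Recognition would then amount to calling `dp_distance` once per word and keeping the word with the smallest result.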

This DP matching process was originally executed word by word. The clockwise DP method improves on this by executing it for all words in parallel.

FIGS. 5a and 5b illustrate the conventional method. In the space spanned by $i$, $j$, and $n$ shown in FIG. 5a, at each time $i$ of the input pattern the optimal cumulative distance $g^n(i, j)$ is computed for every pair $(n, j)$, that is, for every combination of a standard pattern index $n$ and a time $j$ within that pattern. Time $i$ is then advanced and processing continues.

In the actual computation, the notation $g^n(j) = g^n(i, j)$ and $\hat{g}^n(j) = g^n(i-1, j)$ is used, and two columns of memory, indicated by reference numeral 1 in FIG. 5a, are provided to hold these values. Between these memories, as shown in FIG. 5b, the operation

$$g^n(j) = d^n(j) + \min\{\, \hat{g}^n(j),\ \hat{g}^n(j-1),\ \hat{g}^n(j-2) \,\} \tag{4}$$

is performed, which carries out the recurrence of equation (2).

Here

$$d^n(j) = d^n(i, j). \tag{5}$$

Once equation (4) has been evaluated for all combinations of $n$ and $j$, time $i$ is advanced by one clock, $g^n(j)$ is taken as the new $\hat{g}^n(j)$, and the above processing is repeated. When processing up to $i = I$ has finished, the inter-pattern distance $D(A, B^n) = g^n(J^n)$ has been obtained for every $n$ in parallel. Exactly the same processing is performed in continuous word recognition by clockwise DP.
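The time-synchronous organization described above, with one column of memory per word and one update per clock, might look like the following sketch. The function names and the `dist` callable are hypothetical; the sketch assumes that only $j = 1$ is reachable at the first frame, as in initial condition (1).

```python
import math

def clockwise_step(g_prev, d):
    """One clock of equation (4) for a single word.

    g_prev[j] holds g(i-1, j); d[j] holds d(i, j). Returns g(i, j) for all j.
    """
    g_new = [math.inf] * len(g_prev)
    for j in range(len(g_prev)):
        best = g_prev[j]
        if j >= 1:
            best = min(best, g_prev[j - 1])
        if j >= 2:
            best = min(best, g_prev[j - 2])
        g_new[j] = d[j] + best
    return g_new

def clockwise_dp(A, references, dist):
    """Distances D(A, B^n) for every word n, advanced in step with time i."""
    # Column for the first frame: only j = 1 is reachable, per condition (1).
    cols = [[dist(A[0], B[0])] + [math.inf] * (len(B) - 1) for B in references]
    for a in A[1:]:
        # Every (n, j) pair is visited at every clock; this is the cost
        # that grows with vocabulary size.
        cols = [clockwise_step(col, [dist(a, b) for b in B])
                for col, B in zip(cols, references)]
    return [col[-1] for col in cols]
```

For example, `clockwise_dp([0, 1, 2], [[0, 1, 2], [0, 2]], lambda a, b: abs(a - b))` returns one distance per word after the last frame has been consumed.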

(Problems to be Solved by the Invention) Because this method proceeds in synchronization with time $i$ of the input pattern, processing can run in parallel with the utterance, and real-time performance is considered good. However, applying it to large-vocabulary speech recognition requires a large amount of computation, because the recurrence of equation (4) must be evaluated for every combination of $j$ and $n$. With a standard pattern length of $J^n = 30$ and a vocabulary of 1000 words, equation (4) is evaluated at $3 \times 10^4$ points for each input frame. Even at 10 μs per point, this takes 300 ms. In ordinary speech recognition the features $a_i$ of the input pattern are sampled with a period of 20 ms or less, so real-time execution is impossible for a vocabulary of this size.
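The arithmetic behind this estimate can be written out as a short check. The figures are the ones quoted in the text; the variable names are illustrative only, and the final ratio is simply derived from those figures.

```python
words = 1000             # vocabulary size N quoted in the text
frames_per_word = 30     # standard pattern length J^n quoted in the text
time_per_point_us = 10   # assumed cost of one evaluation of (4), in microseconds
frame_period_ms = 20     # upper bound on the feature sampling period

points_per_frame = words * frames_per_word                       # 3 x 10^4 points
time_per_frame_ms = points_per_frame * time_per_point_us / 1000  # 300 ms
slowdown = time_per_frame_ms / frame_period_ms                   # factor over real time
```

With these numbers each frame needs 300 ms of computation against a 20 ms budget, a factor of 15 over real time.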

An object of the present invention is to remedy this drawback of clockwise DP matching, namely its large amount of computation, and to provide a pattern matching method for a speech recognition device that is fast yet inexpensive.

(Means for Solving the Problems) The pattern matching method for speech recognition according to the present invention is a method having means for storing the standard pattern of each word $n$ as a time series of features $B^n = b^n_1 \ldots b^n_j \ldots b^n_{J^n}$, means for temporarily holding the feature $a_i$ of the input speech pattern, and means for calculating, for each word $n$, the optimal cumulative value $g^n(i, j)$ of the distance $d^n(i, j)$ between the features $a_i$ and $b^n_j$ by a dynamic programming recurrence. The method further comprises means for storing the optimal cumulative value $g^n(i-1, j)$ up to time $(i-1)$ in the form $\hat{g}^n(j)$ for each word $n$ and time $j$, and means for storing a quantity $g^n(j)$ corresponding to the optimal cumulative value $g^n(i, j)$ up to time $i$. At time $i$, only for the set of $(n, j)$ whose value $\hat{g}^n(j)$ satisfies a predetermined pruning condition, a first process adds the distance $d^n(j) = d^n(i, j)$ to $\hat{g}^n(j)$ to form a new value $g$, and a second process compares $g$ with the $g^n(j)$ of each $j$ in the neighborhood of this $j$ and, when $g < g^n(j)$, copies $g$ as the new $g^n(j)$. Each time $i$ advances by one clock, $g^n(j)$ becomes the new $\hat{g}^n(j)$ and the first and second processes are carried out again.

(Operation) The first feature of the present invention is the introduction of pruning: the dynamic programming recurrence of equation (4) (equivalently, equation (2)) is executed only for those $(n, j)$ whose $\hat{g}^n(j)$ is smaller than a certain reference value. Because the recurrence (2) or (4) of DP matching searches for a minimum, a large $\hat{g}^n(j)$ means that this $j$ is unlikely to lie on the optimal path. The idea is therefore to ignore such $\hat{g}^n(j)$. Equation (4) then needs to be evaluated only in the vicinity of the $\hat{g}^n(j)$ shown hatched in FIG. 1, and a large reduction in computation can be expected. If the recurrence (4) is executed in its original form, however, much of the benefit of pruning is lost. The reason is that the calculation of equation (2) can be skipped only when $\hat{g}^n(j)$, $\hat{g}^n(j-1)$, and $\hat{g}^n(j-2)$ on the right-hand side are all large. The magnitude of each of the three values must therefore be tested, and the probability that all three are large at once is small.

The second feature of the present invention is therefore that the calculation of equation (4) is executed as a forward conditional assignment process, as described below. FIG. 2 illustrates this process. Here $\hat{g}^n(j)$ is defined as

$$\hat{g}^n(j) = \min\{\, g^n(i-1, j),\ g^n(i-1, j-1),\ g^n(i-1, j-2) \,\} \tag{6}$$

and corresponds to the optimal cumulative distance up to the previous time $(i-1)$, with $d^n(i, j)$ not yet added. Before the following processing, $g^n(j)$ is assumed to have been preset to a sufficiently large value $\infty$. The processing for a particular $n$ and $j$ is as follows.

(1) if $\hat{g}^n(j) \ge \theta(i)$, go to (4)
(2) $\hat{g}^n(j) + d^n(j) \to g$
(3-1) if $g < g^n(j)$, $g \to g^n(j)$
(3-2) if $g < g^n(j+1)$, $g \to g^n(j+1)$
(3-3) if $g < g^n(j+2)$, $g \to g^n(j+2)$
(4) $j + 1 \to j$ ……(7)

When this procedure is repeated from $(j-2)$ to $j$, the following processing has been applied to $g^n(j)$.
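Procedure (7) can be sketched for one word and one time step as follows. The function name `forward_step` is hypothetical, and for brevity the local distances are passed in as a precomputed list `d`; a real implementation would presumably compute `d[j]` only for cells that survive the test in step (1), which is where the saving comes from.

```python
import math

def forward_step(g_hat, d, theta):
    """Apply procedure (7) to every j of one word at one time i.

    g_hat[j]: value defined by (6), accumulated up to time i-1.
    d[j]:     local distance d(i, j) at the current time.
    theta:    pruning threshold theta(i), passed as a plain number.
    Returns the g buffer, which serves as g_hat at time i+1.
    """
    J = len(g_hat)
    g = [math.inf] * J                 # g buffer preset to a large value
    for j in range(J):
        if g_hat[j] >= theta:          # step (1): pruned, skip steps (2) and (3)
            continue
        cand = g_hat[j] + d[j]         # step (2)
        for k in (j, j + 1, j + 2):    # steps (3-1), (3-2), (3-3)
            if k < J and cand < g[k]:
                g[k] = cand
    return g
```

Note that a single comparison per cell decides whether any further work is done for that cell, and that cells holding infinity are always skipped.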

At $(j-2)$: $\min(\hat{g}^n(j-2) + d^n(j-2),\ g^n(j)\ (= \infty)) \to g^n(j)$
At $(j-1)$: $\min(\hat{g}^n(j-1) + d^n(j-1),\ g^n(j)) \to g^n(j)$
At $j$: $\min(\hat{g}^n(j) + d^n(j),\ g^n(j)) \to g^n(j)$ ……(8)

Since

$$\hat{g}^n(j-2) + d^n(j-2) = g^n(i, j-2), \quad \hat{g}^n(j-1) + d^n(j-1) = g^n(i, j-1), \quad \hat{g}^n(j) + d^n(j) = g^n(i, j),$$

combining the steps of (8) gives

$$g^n(j) = \min\{\, g^n(i, j),\ g^n(i, j-1),\ g^n(i, j-2) \,\} \tag{9}$$

which is equation (6) with $i$ advanced by one clock. In summary, the conditional assignment process of (7) achieves the recurrence

$$\hat{g}^n(j) = d^n(j) + \min\{\, \hat{g}^n(j),\ \hat{g}^n(j-1),\ \hat{g}^n(j-2) \,\} \tag{10}$$

By the definition (6), this $\hat{g}^n(j)$ is related to $g^n(i, j)$ by $g^n(i, j) = d^n(j) + \hat{g}^n(j)$, so the processing of (7), which yields $\hat{g}^n(j)$, is equivalent to the calculation of equation (2), which yields $g^n(i, j)$.
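The equivalence argued here can also be checked numerically. The sketch below compares a forward (scatter) update in the style of (8) with a backward (gather) evaluation in the style of (9), with pruning switched off. The function names and the random test data are illustrative assumptions, not part of the original.

```python
import math
import random

def scatter(g_hat, d):
    """Forward form: each j pushes g_hat[j] + d[j] to j, j+1, j+2, as in (8)."""
    J = len(g_hat)
    g = [math.inf] * J
    for j in range(J):
        cand = g_hat[j] + d[j]
        for k in range(j, min(j + 3, J)):
            g[k] = min(g[k], cand)
    return g

def gather(g_hat, d):
    """Backward form: each j pulls from j, j-1, j-2, as in (9)."""
    J = len(g_hat)
    return [min(g_hat[k] + d[k] for k in range(max(j - 2, 0), j + 1))
            for j in range(J)]

random.seed(0)  # fixed seed so the check is repeatable
g_hat = [random.uniform(0, 10) for _ in range(8)]
d = [random.uniform(0, 1) for _ in range(8)]
same = scatter(g_hat, d) == gather(g_hat, d)
```

Both forms take the minimum over the same three sums for each $j$, so `same` evaluates to `True`; only the order in which the sums are visited differs.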

Executing the dynamic programming calculation in this forward manner makes pruning more efficient, as follows. The pruning decision is made by testing $\hat{g}^n(j)$ alone, in step (1) of procedure (7), which is efficient. When that test decides to prune, steps (2), (3-1), (3-2), and (3-3), that is, all of the processing needed for the recurrence, can be skipped.

Owing to the first and second features above, the pattern matching method according to the present invention is extremely efficient.

In the present invention, the procedure of (7) is executed at each time $i$ for $j = 1, 2, \ldots, J^n$ within each word $n$, and the cycle then advances to the next time $i$. Various modifications of the pruning condition in step (1) of (7) are conceivable. The simplest takes into account that the optimal cumulative distance $g^n(j)$ grows as $i$ increases, and applies step (1) with $\theta(i)$ set to a monotonically increasing linear function of $i$. Alternatively, the minimum value $g_{\min}$ of $\hat{g}^n(j)$ can be determined and given a margin $\alpha$, so that step (1) is applied with $\theta(i) = g_{\min} + \alpha$. As yet another example, instead of a threshold, pruning may be based on rank: only those $\hat{g}^n(j)$ within a given rank, counted from the smallest among all $\hat{g}^n(j)$, are retained.
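The three pruning variants mentioned above might be expressed as follows. All function and parameter names (`slope`, `offset`, `alpha`, `keep`) are hypothetical tuning knobs introduced for illustration; the text does not specify their values.

```python
import math

def theta_linear(i, slope, offset):
    """Threshold that grows linearly with time i."""
    return slope * i + offset

def theta_relative(g_hat_all, alpha):
    """Threshold gmin + alpha, where gmin is the smallest g_hat over all (n, j)."""
    gmin = min(min(col) for col in g_hat_all)
    return gmin + alpha

def survivors_by_rank(g_hat_all, keep):
    """Set of (n, j) holding the `keep` smallest finite g_hat values."""
    cells = [(value, n, j)
             for n, col in enumerate(g_hat_all)
             for j, value in enumerate(col)
             if value < math.inf]
    cells.sort()
    return {(n, j) for _, n, j in cells[:keep]}
```

Here `g_hat_all` is assumed to be a list with one list of $\hat{g}^n(j)$ values per word. The first two functions produce a threshold for step (1) of procedure (7), while the third replaces the threshold test with a membership test.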

(Embodiment) FIG. 3 is a block diagram of a word speech recognition device embodying the present invention. The speech signal input from the microphone 10 is frequency-analyzed by the analysis section 20, then sampled and digitized, and sent to the microprocessor 30 as feature vectors $a_i$. A standard pattern storage section 40 and a G memory 50 are connected to this microprocessor as memories. The standard pattern storage section 40 stores the standard pattern $B^n$ of each word $n$ as a time series of feature vectors $b^n_j$. The G memory 50 is the work memory for $\hat{g}^n(j)$ and $g^n(j)$ shown in FIGS. 1 and 2. The standard pattern storage section 40 and the G memory 50 may instead be defined as separate memory areas within the main memory of the microprocessor.

The recognition process is executed by a program on the microprocessor 30. When the first vector $a_1$ of the input pattern is input, the following initial settings are made for each $n$.

$$\hat{g}^n(1) = d^n(1) = \lVert a_1 - b^n_1 \rVert, \qquad \hat{g}^n(j) = \infty \ (2 \le j \le J^n), \qquad g^n(j) = \infty \ (1 \le j \le J^n) \tag{11}$$

These settings correspond to the initial condition of equation (1).

Thereafter, each time a feature vector $a_i$ of the input pattern is input, the microprocessor 30 executes the processing shown in the flowchart of FIG. 4. Block 101 in the figure corresponds to the pruning decision, step (1) of procedure (7). Likewise, the two blocks 102 correspond to step (2), and block 103 to steps (3-1), (3-2), and (3-3). Block 104 switches all of the $g^n(j)$ values over to serve as $\hat{g}^n(j)$; this is realized by exchanging the addresses of the area that stores $g^n(j)$ and the area that stores $\hat{g}^n(j)$. Block 105 resets the $g^n(j)$ area to a sufficiently large value $\infty$.
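The per-frame processing of the flowchart, including the address exchange of block 104 and the reset of block 105, could be sketched as below. The function name `process_frame`, the `dist` callable, and the use of Python list references in place of memory addresses are assumptions made for illustration.

```python
import math

def process_frame(g_hat, g, a, references, dist, theta):
    """Run blocks 101 to 105 of the flowchart for one input feature a.

    g_hat, g: one list of floats per word; g must hold infinity on entry.
    Returns the pair (g_hat, g) ready for the next frame.
    """
    for n, B in enumerate(references):
        for j, b in enumerate(B):
            if g_hat[n][j] >= theta:                # block 101: pruning test
                continue
            cand = g_hat[n][j] + dist(a, b)         # blocks 102: distance, then add
            for k in range(j, min(j + 3, len(B))):  # block 103: conditional copy
                if cand < g[n][k]:
                    g[n][k] = cand
    g_hat, g = g, g_hat                             # block 104: exchange the two areas
    for col in g:                                   # block 105: refill with infinity
        col[:] = [math.inf] * len(col)
    return g_hat, g
```

Exchanging the two references avoids copying the whole buffer on every clock, which mirrors the address switching described in the text.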

When the above processing is finished, the time of the input pattern is advanced by one clock, and the same processing is performed once the next feature vector $a_i$ has been input. When the speech pattern ends and the last feature vector $a_I$ has been input, the microprocessor 30 proceeds as follows. From equation (6), the $\hat{g}^n(J^n)$ stored in the G memory 50 at this point is

$$\hat{g}^n(J^n) = \min\{\, g^n(I-1, J^n),\ g^n(I-1, J^n-1),\ g^n(I-1, J^n-2) \,\} \tag{12}$$

so the inter-pattern distance $D(A, B^n)$ is obtained for each word $n$ as

$$D(A, B^n) = g^n(I, J^n) = \hat{g}^n(J^n) + d^n(I, J^n). \tag{13}$$

These distances are compared in turn to find the minimum, and the corresponding $n = \hat{n}$ is output as the recognition result.
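The final step, equation (13) followed by the search for the minimum, is small enough to sketch directly. The name `final_decision` and the `dist` callable are hypothetical, and word indices here start at 0 rather than 1.

```python
def final_decision(g_hat, a_last, references, dist):
    """Apply (13) for every word and pick the word with the smallest distance.

    g_hat[n][-1] is the stored value of (12); a_last is the final feature a_I.
    """
    D = [col[-1] + dist(a_last, B[-1]) for col, B in zip(g_hat, references)]
    n_best = min(range(len(D)), key=lambda n: D[n])
    return n_best, D
```

A word whose last cell was pruned still holds infinity at this point, so it can never be selected as the recognition result.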

The present invention has been described above on the basis of an embodiment, but this description does not limit the scope of the invention. In particular, although the similarity between patterns is evaluated above by a distance, a quantity whose ordering is the reverse of distance may be used instead; in that case the comparisons in procedure (7) and elsewhere are reversed. Also, although the features $a_i$ and $b_j$ are vectors in this embodiment, they may be scalar quantities, such as the number designating a codebook vector when vector quantization is used. Finally, an application to a word recognition device was described for simplicity, but the present pattern matching method can evidently also be used in a continuous speech recognition device of the clockwise DP type.

(Effects of the Invention) Through the pruning described above, and through executing the recurrence as a forward conditional assignment process, the present invention greatly reduces the amount of processing in DP matching and makes it possible to realize a compact, low-cost speech recognition device.

[Brief Description of the Drawings]

FIGS. 1 and 2 illustrate the principle of the present invention, FIG. 3 is a block diagram showing an embodiment of the present invention, FIG. 4 is a flowchart showing the main processing, and FIGS. 5a and 5b illustrate the prior art. 10: microphone; 20: analysis section; 30: microprocessor; 40: standard pattern storage section; 50: G memory.

Claims

[Claims]

1. A pattern matching method for speech recognition having means for storing the standard pattern of each word $n$ as a time series of features $B^n = b^n_1 \ldots b^n_j \ldots b^n_{J^n}$, means for temporarily holding the feature $a_i$ of an input speech pattern, and means for calculating, for each word $n$, the optimal cumulative value $g^n(i, j)$ of the distance $d^n(i, j)$ between the features $a_i$ and $b^n_j$ by a dynamic programming recurrence, the method being characterized by comprising means for storing the optimal cumulative value $g^n(i-1, j)$ up to time $(i-1)$ in the form $\hat{g}^n(j)$ for each word $n$ and each time $j$, and means for storing a quantity $g^n(j)$ corresponding to the optimal cumulative value $g^n(i, j)$ up to time $i$, wherein, at time $i$, only for the set of $(n, j)$ whose value $\hat{g}^n(j)$ satisfies a predetermined pruning condition, a first process is performed that adds $d^n(j) = d^n(i, j)$ to $\hat{g}^n(j)$ to form a new value $g$, and a second process is performed that compares $g$ with the $g^n(j)$ of each $j$ in the neighborhood of this $j$ and, when $g < g^n(j)$, copies $g$ as the new $g^n(j)$, and wherein, each time $i$ advances by one clock, the first and second processes are carried out with $g^n(j)$ as the new $\hat{g}^n(j)$.