JPH0134399B2


Info

Publication number: JPH0134399B2
Application number: JP56197841A
Authority: JP
Inventors: Masao Watari; Hiroaki Sekoe
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1981-12-09
Filing date: 1981-12-09
Publication date: 1989-07-19
Also published as: JPS5898796A

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a continuous speech recognition device that automatically recognizes continuous speech in which one or more words are uttered in succession.

Various methods have been tried for speech recognition. Among them, pattern matching is the simplest and most effective. In this method, a standard pattern (hereinafter, a word standard pattern) is prepared for each word in the vocabulary to be recognized. An unknown input speech pattern (hereinafter, the input pattern) is compared with each word standard pattern (that is, pattern matching is performed) to calculate a quantity expressing how much the two differ (hereinafter, the dissimilarity). The input is then judged to belong to the same word as the word standard pattern with the smallest dissimilarity.
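
The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function names, the toy templates, and the use of fixed-length vectors (no time alignment yet) are assumptions made for the example, not part of the patent.

```python
def city_block(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(input_pattern, standard_patterns, dissimilarity=city_block):
    """Return the word whose standard pattern is least dissimilar to the input."""
    return min(
        standard_patterns,
        key=lambda word: dissimilarity(input_pattern, standard_patterns[word]),
    )


# Hypothetical two-word vocabulary with three-component patterns.
templates = {"one": [1.0, 2.0, 3.0], "two": [3.0, 1.0, 0.0]}
print(classify([1.1, 2.2, 2.9], templates))
```

Passing the dissimilarity as a parameter leaves room to swap in a time-aligned measure such as DP matching.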

Japanese Laid-Open Patent Publication No. 51-104204 (JP-A-51-104204) describes the operating principle of a continuous speech recognition device based on the pattern matching method above. The principle is roughly as follows. A pattern obtained by concatenating several word standard patterns in every possible permutation is regarded as a standard pattern for continuous speech (hereinafter, a continuous speech standard pattern) and is matched against the entire input pattern. Recognition is performed by choosing the number of word standard patterns and their order so that the overall dissimilarity is minimized. In practice, this minimization is split into two stages, minimization per word and minimization over the whole, and each stage is carried out by dynamic programming (hereinafter, matching by dynamic programming is called DP matching).

In the device of that publication, the word-level minimization divides the input pattern into word units in every possible way and performs DP matching against the word standard patterns for every division. That is, if the input pattern length is $M$ and the number of word standard patterns is $V$, then $M \cdot V$ DP matchings are required.

A method that reduces the number of DP matchings to $L_{\max} \cdot V$ (where $L_{\max}$ is the maximum possible number of digits in the input pattern) is described in IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-29, No. 2, April 1981, pp. 284-297. This method (hereinafter, the HLB method) is outlined below. The dissimilarity between the input pattern $A$ and the continuous speech standard pattern $C = B^{v_1}, B^{v_2}, \ldots, B^{v_l}, \ldots, B^{v_{L_{\max}}}$ is obtained as follows. Time point $m$ of the input pattern and time point $n$ of the continuous speech standard pattern are put in correspondence by an optimal, monotonically increasing, nonlinear function $n = n(m)$ (hereinafter, the time normalization function), as shown in Fig. 1. The distances $d(m, n)$ between the feature vectors at the corresponding time points are summed along the time normalization function, and this sum is defined as the dissimilarity $S(A, C)$.

$$S(A, C) = \min_{n(m)} \sum_{m=1}^{M} d(m, n(m)) \tag{1}$$

$$n = n(m) \tag{2}$$

Here the distance $d(m, n)$ can be obtained, for example, by equation (3).

$$d(m, n) = \mathrm{Dis}(a_m, b_n^v) = \sum_{r=1}^{R} \left| a_{mr} - b_{nr}^v \right| \tag{3}$$

where $a_m = (a_{m1}, a_{m2}, \ldots, a_{mR})$ and $b_n^v = (b_{n1}^v, b_{n2}^v, \ldots, b_{nR}^v)$.

The minimization in equation (1) is carried out by the following dynamic programming procedure. Under the initial conditions

$$D(0, 0) = 0 \tag{4}$$

$$D(m, 0) = \infty, \quad m = 0, \ldots, M \tag{5}$$

$$F(m, 0) = m, \quad m = 0, \ldots, M \tag{6}$$

the recurrence

$$D(m, n) = d(m, n) + D(m-1, \hat{n}) \tag{7}$$

$$F(m, n) = F(m-1, \hat{n}) \tag{8}$$

$$\hat{n} = \operatorname*{argmin}_{n-2 \le n' \le n} D(m-1, n') \tag{9}$$

is computed for $n = L(m)$ to $U(m)$ and $m = 1$ to $M$, that is, over the hatched region of Fig. 1.

$$U(m) = 2m - 1 \tag{10}$$

$$L(m) = (m + 1)/2 \tag{11}$$

Here $\operatorname*{argmin}_{x \in X} y$ denotes the $x$ that minimizes $y$ subject to $x \in X$. That is, equation (9) takes as $\hat{n}$ the $n'$ that minimizes $D(m-1, n')$ subject to $n-2 \le n' \le n$. Equation (7) selects the minimum among the three paths shown in Fig. 2; the allowed paths are limited to three to prevent the correspondence given by the time normalization function from being distorted more than necessary. The path along which the minimum was selected when computing the dissimilarity is called the matching path, and $F(m, n)$ in equation (8) is called the path information. The value $D(M, N)$, obtained by computing the recurrences (7), (8), and (9) up to the end $M$ of the input pattern and the end $N$ of the continuous speech standard pattern, is the dissimilarity $S(A, C)$ of equation (1).
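
As a rough illustration of equations (3) to (11), the following Python sketch computes the dissimilarity between one input pattern and one standard pattern. It rests on a few assumptions that the text does not settle: patterns are lists of feature vectors, the non-integer lower limit $(m+1)/2$ is rounded up, and equation (4) is applied after equation (5) so that $D(0, 0) = 0$. The function names are made up for the example.

```python
import math

INF = math.inf


def dis(a, b):
    """City-block distance between two feature vectors, as in equation (3)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dp_match(A, B):
    """Return D(M, N) for input pattern A and standard pattern B."""
    M, N = len(A), len(B)
    # D[m][n] uses 1-based m and n; row 0 and column 0 hold initial conditions.
    D = [[INF] * (N + 1) for _ in range(M + 1)]
    D[0][0] = 0.0                                 # equation (4)
    for m in range(1, M + 1):
        lower = (m + 2) // 2                      # ceil((m + 1) / 2), equation (11)
        upper = min(N, 2 * m - 1)                 # equation (10), clipped to N
        for n in range(lower, upper + 1):
            # Equation (9): best predecessor among n, n-1, n-2 in column m-1.
            best = min(D[m - 1][k] for k in range(max(0, n - 2), n + 1))
            D[m][n] = dis(A[m - 1], B[n - 1]) + best   # equation (7)
    return D[M][N]


print(dp_match([[0], [1], [2]], [[0], [1], [2]]))
print(dp_match([[0], [0], [1]], [[0], [1]]))
```

The function returns infinity when the end point $(M, N)$ lies outside the window between $L(m)$ and $U(m)$, which is how the slope limits reject patterns whose lengths differ too much.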

Consider a point $(m_l, n_l)$ on the matching path $(m, n(m))$ obtained when the overall minimum dissimilarity was found. The partial dissimilarity accumulated along that matching path from the start to $(m_l, n_l)$ is the minimum of the partial dissimilarities obtained along all matching paths that pass through $(m_l, n_l)$. In other words, the minimum overall dissimilarity over all matching paths through a point $(m_l, n_l)$ is the sum of the minimum partial dissimilarity from the start to that point and the minimum partial dissimilarity from that point to the end. That is,

$$S(A, C) = \min_{n(m)} \sum_{m=1}^{M} d(m, n) = \min_{n(m)} \sum_{m=1}^{m_l} d(m, n) + \min_{n(m)} \sum_{m=m_l+1}^{M} d(m, n) \tag{12}$$

where $(m_l, n_l)$ is a point on the matching path that yields $S(A, C)$. Now take the boundaries between the digits of the continuous speech standard pattern as such points: the minimum partial dissimilarity can be found for each digit, and the minimum overall dissimilarity is obtained as their sum. Accordingly, the minimum dissimilarity between the input pattern $A$ and the continuous speech standard pattern $C = B^{v_1}, B^{v_2}, \ldots, B^{v_l}, \ldots, B^{v_{L_{\max}}}$ can be found as follows. First, each word standard pattern for the first digit of the continuous speech standard pattern is matched against the input pattern and the minimum dissimilarity is found. That result is used as the initial value for the second-digit matching, in which each word standard pattern for the second digit is matched against the input pattern. After matching has been carried out up to digit $L_{\max}$, the minimum over the digits of the dissimilarity at the end $M$ of the input pattern is found, which gives the optimum number of digits $L$. The matching path that gave the dissimilarity of digit $L$ is then traced backward to obtain the recognized category for each digit in turn.

Next, the calculation procedure of the HLB method is explained with reference to Figs. 3 to 6. Fig. 3 shows the order in which the dissimilarity calculation proceeds; Fig. 4 shows the dissimilarity calculation for digit $l$; Fig. 5 shows the order of the decision calculation from the digit path information FB(l, m) and the digit recognition category W(l, m); and Fig. 6 is a flowchart of Algorithm 5 described on pages 289 to 290 of the cited reference. Here $m$ is the time point of the input pattern, $n$ the time point of the standard pattern, $v$ the word, $l$ the digit, $M$ the end of the input pattern, $N^v$ the end of the $v$-th word standard pattern, $V$ the number of word standard patterns, Lmin the minimum number of digits of the input pattern, and Lmax the maximum number of digits.

The dissimilarity calculation evaluates the recurrences (7), (8), and (9) under the initial conditions (4), (5), and (6), digit by digit of the continuous speech standard pattern, from region 1 to region Lmax in Fig. 3. The initial conditions are set in block 1 of Fig. 6. The calculation for digit $l$ then proceeds as follows. The digit dissimilarity DB(l-1, m), the result of the previous digit, is set as the initial value of the dissimilarity D(m, 0) (block 2 of Fig. 6), and the recurrences (7), (8), and (9) are computed over the area enclosed by the upper limit U(m) and the lower limit L(m), as shown in Fig. 4 (blocks 3 and 4 of Fig. 6). Here the upper limit U(m) consists of the left-hand segment AB of Fig. 4 (given by equation (10)) and the upper segment BE, and the lower limit L(m) consists of the lower segment AC and the right-hand segment CE (given by equation (11)). The calculation is carried to the end $N^v$ of the word standard pattern, and the dissimilarity $D(m, N^v)$ at that end is taken as the word dissimilarity D〓(v, m) (block 5 of Fig. 6). After this has been done for all $V$ word standard patterns, the minimum of the word dissimilarities D〓(v, m) is found. That minimum becomes the digit dissimilarity DB(l, m); the category $\hat{v}$ of the word standard pattern that gave the minimum becomes the digit recognition category W(l, m); and the matching path information F〓(v, m) that gave the minimum becomes the digit path information FB(l, m) (block 6 of Fig. 6).

After the dissimilarity calculation has been carried out in this way from the first digit to digit Lmax, the input pattern is decided from the digit path information FB(l, m) and the digit recognition category W(l, m). First, among the digit dissimilarities DB(l, M) at the end $M$ of the input pattern, the minimum over the allowed digits, from Lmin to Lmax, is found (block 7 of Fig. 6); the digit $L$ that gives the minimum is the finally determined number of digits of the input pattern. Then, as shown in Fig. 5, the recognition result R(L) for digit $L$ is obtained from W(L, M), and the end of digit $L-1$ is obtained from the digit path information FB(L, M) at the end $M$ of the input pattern (block 8 of Fig. 6). Repeating this operation in turn gives the recognition result R(l) for every digit.
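
The digit-by-digit procedure can be sketched in Python as follows. This is a simplified illustration, not the flowchart of Fig. 6: it keeps the three-path recurrence and the DB, W, and FB tables, but it omits the U(m) and L(m) window, and the function name, the dictionary of word patterns, and the toy data are assumptions made for the example.

```python
import math

INF = math.inf


def dis(a, b):
    """City-block distance between two feature vectors, as in equation (3)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hlb_recognize(A, words, l_min, l_max):
    """Digit-by-digit matching; returns (recognized words, total dissimilarity)."""
    M = len(A)
    DB = [[INF] * (M + 1) for _ in range(l_max + 1)]   # digit dissimilarity
    W = [[None] * (M + 1) for _ in range(l_max + 1)]   # digit recognition category
    FB = [[0] * (M + 1) for _ in range(l_max + 1)]     # digit path information
    DB[0][0] = 0.0
    for l in range(1, l_max + 1):                      # outermost loop: digit
        for v, B in words.items():                     # then word
            N = len(B)
            # Column m-1 of D and F; index 0 carries the previous digit's result.
            prev_D = [DB[l - 1][0]] + [INF] * N
            prev_F = [0] * (N + 1)
            for m in range(1, M + 1):                  # then input time point
                cur_D = [DB[l - 1][m]] + [INF] * N     # D(m, 0) = DB(l-1, m)
                cur_F = [m] + [0] * N                  # F(m, 0) = m
                for n in range(1, N + 1):              # innermost: standard pattern
                    k = min(range(max(0, n - 2), n + 1), key=prev_D.__getitem__)
                    cur_D[n] = dis(A[m - 1], B[n - 1]) + prev_D[k]
                    cur_F[n] = prev_F[k]
                if cur_D[N] < DB[l][m]:                # keep the best word per (l, m)
                    DB[l][m], W[l][m], FB[l][m] = cur_D[N], v, cur_F[N]
                prev_D, prev_F = cur_D, cur_F
    # Decision: pick the best digit count, then trace the digit ends backward.
    L = min(range(l_min, l_max + 1), key=lambda l: DB[l][M])
    result, m = [], M
    for l in range(L, 0, -1):
        result.append(W[l][m])
        m = FB[l][m]
    return result[::-1], DB[L][M]


vocabulary = {"a": [[0], [0]], "b": [[5], [5]]}
print(hlb_recognize([[0], [0], [5], [5]], vocabulary, 1, 2))
```

Note that the loop over the digit is outermost, so digit 2 cannot start until digit 1 has been matched against the whole input. If no admissible path exists, the returned dissimilarity is infinite and the word list contains `None` entries.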

As explained above, the HLB method needs only $V$ DP matchings per digit, and therefore $L_{\max} \cdot V$ DP matchings in total. The method of JP-A-51-104204, by contrast, needs $M \cdot V$ DP matchings. Typically, when the maximum number of digits Lmax is about 5 and a frame period of 20 ms is assumed, the input pattern length $M$ is about 100, so the HLB method requires far less computation.
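
The comparison can be checked with a line of arithmetic. In the sketch below, M = 100 and Lmax = 5 come from the text, while V = 10 is an assumed vocabulary size chosen only for illustration.

```python
def dp_matching_counts(M, V, l_max):
    """Return (M * V, Lmax * V): DP matchings for the earlier method and for HLB."""
    return M * V, l_max * V


earlier, hlb = dp_matching_counts(M=100, V=10, l_max=5)
print(earlier, hlb, earlier // hlb)
```

With these figures the ratio is M / Lmax = 20, independent of the vocabulary size.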

In a speech recognition device, the recognition response time is the time from detection of the end of the speech until the recognition result is output. In the HLB method, matching of the first digit starts once the input pattern needed for it has been obtained, matching then proceeds digit by digit up to digit Lmax, and the recognition result is obtained. For an intermediate digit $l$, matching can be carried out once the input pattern has been obtained up to point E $(m_e, n_e)$ in the upper right corner of Fig. 4. Let $n_e$ be the maximum pattern length of the continuous speech standard pattern up to digit $l$, and let Nmax be the maximum pattern length of a word standard pattern. From the coordinates of point E,

$$n_e = L(m_e) \tag{14}$$

and

$$n_e = N_{\max} \cdot l, \quad L(m_e) = (m_e + 1)/2 \tag{15}$$

so that

$$m_e = 2 \cdot N_{\max} \cdot l - 1 \tag{16}$$

Matching for digit $l$ can therefore be performed once $2 \cdot N_{\max} \cdot l - 1$ time points of the input pattern have been obtained. Now suppose that the input pattern has $L$ digits, that the average word length per digit is $N_{\max}/2$, and that $L = L_{\max}$. The input pattern length is then $L_{\max} \cdot N_{\max}/2$, and substituting this for $m_e$ in equation (16) gives $l \approx L_{\max}/4$. When the end of the input speech is detected, the calculation can have advanced only as far as digit $L_{\max}/4$; the remaining $3L_{\max}/4$ digits must be computed afterward, and the time for those digits becomes the recognition response time, which is a large delay. Shortening this response time would require a complex high-speed arithmetic unit capable of parallel or pipeline processing.
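
Equation (16) and the resulting delay can be made concrete with a small calculation. The helper names below are invented for the example, and Nmax = 40 is an assumed value (with an average word length of Nmax/2 = 20 and Lmax = 5 it gives the input length of about 100 mentioned earlier).

```python
def points_needed(n_max, l):
    """Equation (16): input time points required before digit l can be matched."""
    return 2 * n_max * l - 1


def digits_ready(m_available, n_max):
    """Largest digit l whose requirement 2 * Nmax * l - 1 fits in the input so far."""
    return (m_available + 1) // (2 * n_max)


n_max, l_max = 40, 5
M = l_max * (n_max // 2)              # input length under the stated assumptions
print(points_needed(n_max, 1), points_needed(n_max, 2))
print(digits_ready(M, n_max), "of", l_max, "digits matched when the speech ends")
```

Under these assumptions only one of the five digits can be finished by the end of the utterance, in line with the estimate of roughly Lmax/4.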

An object of the present invention is to improve the HLB method described above so as to shorten the recognition response time and reduce the overall amount of calculation, and thereby to provide an economical continuous speech recognition device.

To this end, the calculation order of the HLB method is rearranged to derive a new calculation principle, called the VLB method, which is the principle of the present invention. In the HLB method, as the flowchart of Fig. 6 shows, the dissimilarities between the word standard patterns and the input pattern are first computed for every word standard pattern, and then the digit is advanced by one and the same calculation is repeated. That is, the loops of the calculation in blocks 3 and 4 of Fig. 6 are nested, from the innermost outward, in the order: time point $n$ of the word standard pattern, time point $m$ of the input pattern, word standard pattern number $v$, digit number $l$. It is shown here that this loop order can be rearranged so that the outermost loop is over the input time point $m$, as in the flowchart of Fig. 7. Because the time normalization function $n(m)$ of DP matching is monotonically increasing, the initial value DB(l-1, m-1) for digit $l$ is determined by data up to input time point $m-1$. Hence, if the dissimilarity calculation has been completed at all points up to input time point $m-1$, the calculation at time point $m$ can be carried out. That is, as the hatched part of Fig. 8 shows, the dissimilarities can be computed for one vertical column, parallel to the $n$ axis, that spans all digits. This column calculation needs the initial value DB(l-1, m-1) for each digit and the dissimilarities D(m-1, n) of each digit at point $m-1$, all of which were obtained in the calculation at point $m-1$. However, the dissimilarity D(m-1, n) and the path information F(m-1, n) at point $m-1$ must be stored for every digit $l$ and every word $v$. The dissimilarity D(m-1, n) and the path information F(m-1, n) for digit $l$ and word $v$ are therefore denoted D(l, v, n) and F(l, v, n), respectively. The organization of D(l, v, n) and F(l, v, n) is shown in Fig. 15.

The calculation procedure of the VLB method, with the calculation order rearranged in this way, is explained with reference to Figs. 7 and 8. The dissimilarity calculation evaluates the dynamic programming recurrence within the region between the upper limit U(m) and the lower limit L(m), as shown in Fig. 8, in order along the time axis $m$ of the input pattern. The initial conditions of the dissimilarity calculation in the VLB method are

$$D(l, v, n) = \infty, \quad l = 1, \ldots, L_{\max}, \; v = 1, \ldots, V, \; n = 1, \ldots, N^v \tag{17}$$

$$DB(l, m) = \infty, \quad l = 0, \ldots, L_{\max}, \; m = 0, \ldots, M \tag{18}$$

$$DB(0, 0) = 0 \tag{19}$$

and are set in block 1 of Fig. 7. The dissimilarity calculation for one vertical column parallel to the $n$ axis at input time point $m$ is then performed as follows. First, the vector distances between the feature vector $a_m$ at time point $m$ of the input pattern and the $v$-th word standard pattern $b_n^v$ are obtained by equation (3) (block 2 of Fig. 7). Next, the column calculation is performed for each digit. With the initial values

$$D(l, v, 0) = DB(l-1, m-1) \tag{20}$$

$$F(l, v, 0) = m - 1 \tag{21}$$

(block 3 of Fig. 7), the recurrences

$$D(l, v, n) = d(n) + D(l, v, \hat{n}) \tag{22}$$

$$F(l, v, n) = F(l, v, \hat{n}) \tag{23}$$

$$\hat{n} = \operatorname*{argmin}_{n-2 \le n' \le n} D(l, v, n') \tag{24}$$

are computed between U(m) and L(m) in the direction of decreasing $n$ (block 4 of Fig. 7). As Fig. 2 shows, the value at point $(m, n)$ is obtained from the dissimilarities at the three points $(m-1, n)$, $(m-1, n-1)$, and $(m-1, n-2)$. The next point, $(m, n-1)$, is obtained from the three points $(m-1, n-1)$, $(m-1, n-2)$, and $(m-1, n-3)$ and does not use the dissimilarity at $(m-1, n)$, so storing the result for $(m, n)$ in the location of $(m-1, n)$ does not affect the calculation at $(m, n-1)$. Consequently, if the calculation proceeds in the direction of decreasing $n$, the dissimilarities at point $m-1$ and those at point $m$ can share the same storage area.

After the recurrence has been evaluated for one column, the dissimilarity $D(l, v, N^v)$ at the end $N^v$ of the word standard pattern is compared with the digit dissimilarity DB(l, m), which is the minimum word dissimilarity computed so far. If $D(l, v, N^v)$ is smaller, it becomes the digit dissimilarity DB(l, m), the category $v$ of that word standard pattern becomes the digit recognition category W(l, m), and the matching path information $F(l, v, N^v)$ that gave it becomes the digit path information FB(l, m) (block 5 of Fig. 7). This column calculation (blocks 2, 3, 4, and 5 of Fig. 7) is executed for the $V$ word standard patterns. The input time point $m$ is then incremented by one and the same column calculation is executed for the $V$ word standard patterns, and so on up to the end $M$ of the input pattern.

Finally, the input pattern is decided from the digit path information FB(l, m) and the digit recognition category W(l, m). The decision method is the same as in the HLB method. First, among the digit dissimilarities DB(l, M) at the end $M$ of the input pattern, the minimum over the allowed digits, from Lmin to Lmax, is found (block 6 of Fig. 7); the digit $L$ that gives the minimum is the number of digits of the input pattern. The recognition result R(L) for digit $L$ is then obtained from W(L, M), and the end of digit $L-1$ is obtained from the digit path information FB(L, M) (block 7 of Fig. 7). Repeating this operation in turn gives the recognition result R(l) for each digit.
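
The frame-synchronous procedure can be sketched in Python as follows. This is an illustrative simplification rather than the flowchart of Fig. 7: it follows equations (17) to (24), including the in-place update with decreasing n, but it omits the U(m) and L(m) window. The class name, method names, and toy vocabulary are assumptions made for the example.

```python
import math

INF = math.inf


def dis(a, b):
    """City-block distance between two feature vectors, as in equation (3)."""
    return sum(abs(x - y) for x, y in zip(a, b))


class FrameSyncMatcher:
    """Processes one input frame at a time; all digits advance together."""

    def __init__(self, words, l_max):
        self.words, self.l_max, self.m = words, l_max, 0
        digits = range(1, l_max + 1)
        # Equation (17): D(l, v, n) = infinity; index 0 is the entry cell.
        self.D = {(l, v): [INF] * (len(B) + 1) for l in digits for v, B in words.items()}
        self.F = {(l, v): [0] * (len(B) + 1) for l in digits for v, B in words.items()}
        # Equations (18) and (19); one column is appended per frame.
        self.DB = [[INF] for _ in range(l_max + 1)]
        self.DB[0][0] = 0.0
        self.W = [[None] for _ in range(l_max + 1)]
        self.FB = [[0] for _ in range(l_max + 1)]

    def push(self, frame):
        """Compute the column for the next input time point m."""
        self.m += 1
        m = self.m
        for l in range(self.l_max + 1):
            self.DB[l].append(INF)
            self.W[l].append(None)
            self.FB[l].append(0)
        for v, B in self.words.items():
            # Distances d(n) are computed once per (m, v) and reused by every digit.
            d = [0.0] + [dis(frame, b) for b in B]
            for l in range(1, self.l_max + 1):
                D, F = self.D[(l, v)], self.F[(l, v)]
                D[0], F[0] = self.DB[l - 1][m - 1], m - 1      # equations (20), (21)
                for n in range(len(B), 0, -1):                 # decreasing n: in place
                    k = min(range(max(0, n - 2), n + 1), key=D.__getitem__)
                    D[n], F[n] = d[n] + D[k], F[k]             # equations (22)-(24)
                if D[-1] < self.DB[l][m]:
                    self.DB[l][m], self.W[l][m], self.FB[l][m] = D[-1], v, F[-1]

    def decide(self, l_min):
        """Pick the best digit count at the current end and trace back."""
        M = self.m
        L = min(range(l_min, self.l_max + 1), key=lambda l: self.DB[l][M])
        result, m = [], M
        for l in range(L, 0, -1):
            result.append(self.W[l][m])
            m = self.FB[l][m]
        return result[::-1], self.DB[L][M]


matcher = FrameSyncMatcher({"a": [[0], [0]], "b": [[5], [5]]}, l_max=2)
for frame in [[0], [0], [5], [5]]:
    matcher.push(frame)          # can be called as each frame arrives
print(matcher.decide(l_min=1))
```

Because `push` needs only results from the previous time point, all the matching work is finished when the last frame arrives, and `decide` is the only step left after the end of speech.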

Because the continuous speech recognition device of the present invention executes the VLB method described above, it requires the following units for the input pattern $A$ and the continuous speech standard pattern $C = B^{v_1}, B^{v_2}, \ldots, B^{v_l}, \ldots, B^{v_{L_{\max}}}$: a control unit that supplies the units below with a signal $m$ indicating the input time point, varied from 1 to $M$, with a signal $v$ indicating the word, varied from 1 to $V$ for each $m$, and, for each $v$, with a signal $l$ indicating the digit, varied from 1 to Lmax, and a signal $n$ indicating the time point of the standard pattern, varied from 1 to $N^v$; a dissimilarity memory unit D(l, v, n) and a path information memory unit F(l, v, n), addressed by the signals $l$, $v$, and $n$ from the control unit; a distance calculation unit that, at each time point $m$, obtains the vector distances $d(a_m, b_n^v)$, $n = 1$ to $N^v$, between the input pattern $a_m$ and the word standard pattern $b_n^v$, $n = 1$ to $N^v$, of the word $v$ successively designated by the control unit; a distance memory unit d(n) that stores these distances; a recurrence calculation unit that, at each time point $m$ and for each digit $l$ and each word $v$, first sets the initial conditions from the digit dissimilarity DB(l-1, m-1) and the digit path information FB(l-1, m-1) resulting from time point $m-1$, then computes the dynamic programming recurrence with reference to the distance d(n) and to the dissimilarity D(l, v, n) and path information F(l, v, n) at time point $m-1$, successively obtaining the dissimilarity D(l, v, n) and path information F(l, v, n) at time point $m$, and thereby the word dissimilarity $D(l, v, N^v)$ and the word path information $F(l, v, N^v)$; a digit dissimilarity calculation unit that, at each time point $m$ and for each digit $l$, finds the minimum among the word dissimilarities $D(l, v, N^v)$ obtained by the recurrence calculation unit, takes it as the digit dissimilarity DB(l, m), takes the corresponding word path information $F(l, v, N^v)$ as the digit path information FB(l, m), and takes the word $v$ that gave the minimum as the digit recognition category W(l, m); a digit dissimilarity memory unit DB(l, m), a digit path information memory unit FB(l, m), and a digit recognition category memory unit W(l, m) for storing these; and a decision unit that determines and outputs the category of each digit of the input pattern, in reverse order, based on the digit path information FB(l, m) and the digit recognition category W(l, m).

Thus, with the VLB method, the principle of the present invention, the dissimilarity calculation can advance along the time axis of the input pattern. The calculation can start as soon as speech input is detected and proceed in synchrony with the input, so the decision processing of blocks 6 and 7 of Fig. 7 can begin the moment the speech ends. The recognition response time is therefore shorter than with the conventional HLB method. In addition, the distance calculation in the HLB method is enclosed by the loops over $n$, $m$, $v$, and $l$, as block 3 of Fig. 6 shows, whereas in the VLB method it is enclosed by the loops over $n$, $v$, and $m$, as block 3 of Fig. 7 shows. The number of distance calculations is thus $N^v \cdot M \cdot V \cdot L_{\max}$ in the HLB method and $N^v \cdot V \cdot M$ in the VLB method. Compared with the conventional HLB method, the amount of distance calculation is therefore reduced to $1/L_{\max}$.
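
The two counts can be reproduced by walking the loop nests directly. The sketch below is only a counting exercise under assumed sizes (10 words of 20 frames each, M = 100, Lmax = 5); the function names are invented for the example.

```python
def hlb_distance_count(lengths, M, l_max):
    """Count distance evaluations with loops nested l > v > m > n."""
    count = 0
    for _l in range(l_max):            # digit (outermost)
        for n_v in lengths:            # word
            for _m in range(M):        # input time point
                count += n_v           # n = 1 .. N^v (innermost)
    return count


def vlb_distance_count(lengths, M):
    """Count distance evaluations with loops nested m > v > n."""
    count = 0
    for _m in range(M):                # input time point (outermost)
        for n_v in lengths:            # word
            count += n_v               # n = 1 .. N^v; the result is reused by every digit
    return count


lengths = [20] * 10                    # assumed: V = 10 words, N^v = 20 each
hlb = hlb_distance_count(lengths, M=100, l_max=5)
vlb = vlb_distance_count(lengths, M=100)
print(hlb, vlb, hlb // vlb)
```

The ratio equals Lmax because the distance between an input frame and a template frame does not depend on the digit, so computing it once per time point and word is enough.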

Next, a specific configuration of the device of the present invention is described with reference to the drawings. Fig. 9 is a block diagram showing one configuration example of the present invention, and Fig. 10 is a time chart of the control command signals. The control unit 10 controls the other units by issuing control command signals such as m1, n1, and v1 as shown in Fig. 10; the details are explained along with the operation of each unit. The input unit 11 analyzes the input speech given by the signal Speech in and outputs a feature vector at fixed time intervals. This analysis may be, for example, frequency analysis by a filter bank composed of multichannel filters. The input unit 11 also monitors the level of the input speech, detects the start and end of the speech, and reports the detection times to the control unit 10 by the signal SP. After the start of speech is detected, the input pattern buffer 12 stores the feature vectors $a_m$ supplied by the input unit 11, in accordance with the signal m3, which corresponds to the input time point $m$. The standard pattern memory unit 13 stores the $V$ word standard patterns $B^1, B^2, \ldots, B^V$, and the standard pattern length memory unit 14 stores the length $N^v$ of each word standard pattern $B^v$. The signal v1 corresponds to the word $v$ of the continuous speech standard pattern. In accordance with the signal v1, the control unit 10 reads the length $N^v$ of the word standard pattern $B^v$ from the standard pattern length memory unit 14 and generates the signal n1, which corresponds to the time point $n$ of the word standard pattern. In accordance with the signal n1, the feature vector $a_m$ of the input pattern is read from the input pattern buffer 12 and $b_1^v, b_2^v, \ldots, b_{N^v}^v$ are read in sequence from the standard pattern memory unit; the distance calculation unit 15 evaluates equation (3); and the distances d(n), $n = 1, 2, \ldots, N^v$, are stored in the distance memory unit 16.

In the distance calculation unit 15, as shown in FIG. 11, the accumulator 153 is first cleared by the signal Cl153. Then r data items are read from the input pattern buffer 12 and the standard pattern memory unit 13 in accordance with the signal r1; the absolute value circuit 151 takes the absolute value of each difference and the adder 152 adds it to the running total, so that the distance Dis(a_m, b^v_n) of equation (3) is obtained in the accumulator 153. This distance is output to the distance memory unit 16.
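The data path of the distance calculation unit can be sketched in software as follows. This is a minimal Python sketch that assumes equation (3) is the sum of absolute component differences, as the absolute value circuit and accumulator suggest; the function names and the sample vectors are hypothetical.

```python
def city_block_distance(a_m, b_n):
    """Sum of absolute component differences between two feature vectors."""
    acc = 0  # the accumulator is cleared first
    for a_r, b_r in zip(a_m, b_n):
        acc += abs(a_r - b_r)  # absolute difference, then accumulation
    return acc


def distance_column(a_m, word_pattern):
    """Return the distances d(n), n = 1..N^v, as a list (index 0 holds d(1))."""
    return [city_block_distance(a_m, b_n) for b_n in word_pattern]


# Hypothetical input frame and a three-frame word standard pattern.
a_m = [3, 1, 4]
B_v = [[3, 1, 4], [2, 2, 2], [0, 0, 0]]
d = distance_column(a_m, B_v)
print(d)
```

Because one input frame is compared against every frame of a word pattern, the resulting list corresponds to the contents written to the distance memory for that word.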

The initial values for the recurrence formula calculation are set by the signal CL from the control unit 10 before speech is input: the values given by equations (17), (18), and (19) are set in the dissimilarity memory unit 18 and the digit dissimilarity memory unit 21.

The recurrence formula calculation unit 17 performs block 4 of FIG. 7 and executes the recurrence formulas (22), (23), and (24). As shown in FIG. 12, it consists of three dissimilarity registers D1, D2, and D3, a comparison circuit 171 that finds the minimum of the three registers D1, D2, and D3, an adder 172, and three path registers F1, F2, and F3. In response to the signals n2, n21, and n22 issued by the control unit 10, the three dissimilarities D(l, v, n), D(l, v, n-1), D(l, v, n-2) and the three items of path information F(l, v, n), F(l, v, n-1), F(l, v, n-2) are read from the dissimilarity memory unit 18 and the path memory unit 19 and stored in the dissimilarity registers D1, D2, D3 and the path registers F1, F2, F3, respectively. The comparison circuit 171 detects the minimum among the dissimilarity registers D1, D2, and D3 and issues a gate signal n^ that selects the path register Fn^ corresponding to the dissimilarity register Dn^ (n^ is 1, 2, or 3) that gave the minimum. The contents of the path register Fn^ selected by the gate signal n^ are stored in F(l, v, n) of the path memory unit 19. The minimum dissimilarity D(l, v, n^) output by the comparison circuit 171 is added by the adder 172 to the distance d(n) read from the distance memory unit 16, and the sum is stored in D(l, v, n) of the dissimilarity memory unit 18.
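The register-level operation described above can be sketched as an in-place column update. The Python sketch below assumes that recurrence formulas (22) to (24) take the form "minimum of the three predecessors plus d(n)" implied by the register description; it omits the window limits U(m) and L(m), and the function name and sample values are hypothetical.

```python
import math


def update_column(D, F, d):
    """Update D(l, v, n) and F(l, v, n) in place for one input frame.

    D[0] and F[0] hold the boundary values; d[n] is the distance for
    n = 1..N (d[0] is unused). n runs downward so that the values of the
    previous frame are still intact when they are read.
    """
    N = len(D) - 1
    for n in range(N, 0, -1):
        candidates = [(D[n], F[n]), (D[n - 1], F[n - 1])]
        if n >= 2:
            candidates.append((D[n - 2], F[n - 2]))
        best_D, best_F = min(candidates, key=lambda c: c[0])  # comparison
        D[n] = best_D + d[n]  # addition of the local distance
        F[n] = best_F         # path information of the selected predecessor
    return D[N], F[N]


INF = math.inf
D = [0.0, INF, INF, INF]  # assumed boundary D[0] = 0 at the first frame
F = [0, 0, 0, 0]
update_column(D, F, [0, 1.0, 2.0, 3.0])
D[0], F[0] = INF, 1       # hypothetical new boundary for the next frame
word_D, word_F = update_column(D, F, [0, 1.0, 1.0, 1.0])
print(D, word_D, word_F)
```

Running n from high to low is what allows a single column of memory per (l, v) pair to serve both the previous and the current input frame.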

This recurrence formula calculation is carried out from n = U(m) to L(m), and the resulting word dissimilarity D(l, v, N^v) is obtained for each v and each l.

The digit dissimilarity calculation unit 20 performs block 5 of FIG. 7 and successively finds the minimum of the V word dissimilarities D(l, v, N^v). As shown in FIG. 13, it consists of a comparison circuit 201, a register 202 that holds the word dissimilarity D(l, v, N^v), a register 203 that holds the category v to which the word standard pattern belongs, and a register 204 that holds the path information F(l, v, N^v). The signal l1 is generated Lmax times within one period of the signal v1 and corresponds to the digit l of the continuous speech standard pattern. In accordance with the signal l1 issued by the control unit 10, the word dissimilarity D(l, v, N^v) and the word path information F(l, v, N^v) are read from the dissimilarity memory unit 18 and the path memory unit 19 and stored in registers 202 and 204, respectively, and the category v to which the word standard pattern belongs is stored in register 203. The comparison circuit 201 compares the word dissimilarity D(l, v, N^v) with the digit dissimilarity DB(l, m) read from the digit dissimilarity memory unit 21 and generates a gate signal v^ when it judges the word dissimilarity to be the smaller. In accordance with the gate signal v^, the word dissimilarity D(l, v, N^v), the category v, and the word path information F(l, v, N^v) held in registers 202, 203, and 204 are stored in DB(l, m) of the digit dissimilarity memory unit 21, W(l, m) of the digit recognition category memory unit 22, and FB(l, m) of the digit path memory unit 23, respectively. In addition, the signal Cl2 from the control unit 10 performs the initial setting given by equations (20) and (21) for the dissimilarity calculation of one vertical column, which is the part carried out in block 3 of FIG. 7. That is, DB(l-1, m-1) is read from the digit dissimilarity memory unit 21 and stored in D(l, v, 0) of the dissimilarity memory unit 18, and m-1 is stored in F(l, v, 0) of the path memory unit 19.

The determination unit 24 performs blocks 6 and 7 of FIG. 7 and outputs the recognition result R(l) for each digit of the input pattern from the digit path information FB(l, m) and the digit recognition category W(l, m). As shown in FIG. 14, it consists of a comparison circuit 241, a register 242 that holds the minimum digit dissimilarity, a register 243 that holds the number of digits, a register 244 that holds the digit path information FB(l, m), and a register 245 that holds the recognition result. When the end of speech is detected, the input unit 11 notifies the control unit 10 by the signal SP; the control unit 10 then issues the signal m1 to the determination unit 24, and the determination unit 24 starts the determination process. After receiving the signal m1, the determination control unit 246 issues the signal l3 to the digit dissimilarity memory unit 21. In accordance with the signal l3, the digit dissimilarities DB(l, M) at the end point M of the input pattern are read sequentially from the digit dissimilarity memory unit 21; the comparison circuit 241 successively finds the minimum, which is stored in register 242, and the digit number l at that time is stored in register 243. After the Lmax digit dissimilarities have been read in accordance with the signal l3, the contents of register 243 indicate the number of digits L of the input pattern. The determination control unit 246 then sets l = L and m = M and issues the address signals l4 and m2 to the digit path memory unit 23 and the digit recognition category memory unit 22; FB(L, M) and W(L, M) are read and stored in registers 244 and 245, and the contents of register 245 are output as the recognition result. The determination control unit 246 next sets l = l-1 and m = (contents of register 244) and issues the address signal m2 to the digit path memory unit 23 and the digit recognition category memory unit 22; FB(l, m) and W(l, m) are read and stored in registers 244 and 245. Repeating this process from L down to 1 outputs the L-digit recognition result from register R.
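The per-digit minimum selection and the final backward trace can be sketched together as follows. This is a minimal Python sketch under stated assumptions: the memories DB, FB, and W are modeled as dictionaries keyed by (l, m), a missing DB entry stands in for the initial values of equations (17) to (19), and the function names, word labels, and scores are hypothetical.

```python
import math


def update_digit(DB, FB, W, l, m, v, word_D, word_F):
    """Keep the best word ending digit l at frame m."""
    if word_D < DB.get((l, m), math.inf):
        DB[(l, m)] = word_D
        FB[(l, m)] = word_F
        W[(l, m)] = v


def decide(DB, FB, W, M, Lmax):
    """Choose the digit count L minimizing DB(l, M), then trace back from (L, M)."""
    L = min(range(1, Lmax + 1), key=lambda l: DB.get((l, M), math.inf))
    result = [None] * L
    m = M
    for l in range(L, 0, -1):
        result[l - 1] = W[(l, m)]  # recognition result R(l)
        m = FB[(l, m)]             # end frame of digit l - 1
    return result


DB, FB, W = {}, {}, {}
# Hypothetical entries: (digit l, frame m, word v, word dissimilarity, path info)
for l, m, v, word_D, word_F in [
    (1, 2, "three", 1.5, 0),
    (1, 2, "eight", 2.5, 0),
    (1, 5, "three", 9.0, 0),
    (2, 5, "seven", 4.0, 2),
]:
    update_digit(DB, FB, W, l, m, v, word_D, word_F)

print(decide(DB, FB, W, M=5, Lmax=3))
```

In this toy data the two-digit hypothesis has the smallest dissimilarity at the final frame, so the trace visits (2, 5) and then (1, 2).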

The principle of the present invention and one configuration example have been described above, but this description does not limit the scope of the invention. In particular, in the explanation of the VLB method, which is the principle of the invention, the calculation loops were ordered n, l, v, m from the innermost outward; the order l, n, v, m is also possible, for the same reasons from which the VLB method was derived.

In the description of the part that determines the input pattern from the digit dissimilarity DB(l, m), the digit path information FB(l, m), and the digit recognition category W(l, m), the number of digits of the input pattern is determined by finding the minimum of DB(l, M). It is also possible to determine the number of digits under constraint conditions such as those described in IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-27, December 1979, pp. 588-595.

Furthermore, the distance between the input pattern a_m and the standard pattern b^v_n has been described using a distance measure such as equation (3), but the Euclidean distance of equation (25), the inner product of equation (26), or the like may be used instead.

$$d(m, n) = \sum_{r=1}^{R} \left(a_{mr} - b^{v}_{nr}\right)^{2} \tag{25}$$

$$d(m, n) = -\sum_{r=1}^{R} a_{mr}\, b^{v}_{nr} \tag{26}$$

Various recurrence formulas for calculating the dissimilarity are conceivable besides the forms of equations (22), (23), and (24); it is clear that the form described in Japanese Examined Patent Publication No. 56-28278 can be used in their place.
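The two alternative measures of equations (25) and (26) can be written directly in code. The Python sketch below uses hypothetical function names and sample vectors; note that equation (25) as printed is a sum of squared differences without a square root, and the sketch follows that form.

```python
def squared_euclidean(a_m, b_n):
    """Equation (25): sum over r of (a_mr - b_nr)^2."""
    return sum((a - b) ** 2 for a, b in zip(a_m, b_n))


def negative_inner_product(a_m, b_n):
    """Equation (26): minus the sum over r of a_mr * b_nr."""
    return -sum(a * b for a, b in zip(a_m, b_n))


# Hypothetical feature vectors with R = 3 components.
a_m = [3, 1, 4]
b_n = [2, 2, 2]
print(squared_euclidean(a_m, b_n), negative_inner_product(a_m, b_n))
```

The inner product is negated so that, like the other measures, a smaller value indicates a closer match and the same minimizing recurrence can be reused.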

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the range over which the dissimilarity is calculated and an example of a matching path. FIG. 2 shows the matching paths permitted by the recurrence formula. FIG. 3 shows the calculation order of the HLB method. FIG. 4 shows the calculation order for the l-th digit in the HLB method. FIG. 5 shows the calculation order of the determination process. FIG. 6 (parts 1 and 2) is a flowchart showing the calculation procedure of the HLB method. FIG. 7 (parts 1 and 2) is a flowchart showing the calculation procedure of the VLB method, which is the principle of the present invention. FIG. 8 shows the calculation order of the VLB method. FIG. 9 is a block diagram of one embodiment of the present invention. FIG. 10 is a time chart for explaining the operation of the embodiment. FIG. 11 is a block diagram of the distance calculation unit, which is one of the components of the invention. FIG. 12 is a block diagram of the recurrence formula calculation unit. FIG. 13 is a block diagram of the digit dissimilarity calculation unit. FIG. 14 is a block diagram of the determination unit. FIG. 15 is a block diagram of the dissimilarity memory unit and the path information memory unit. FIG. 16 is a block diagram of the digit dissimilarity memory unit, the digit path information memory unit, and the digit recognition category memory unit.

In FIGS. 9, 11, 12, 13, and 14: 10, control unit; 11, input unit; 12, input pattern buffer; 13, standard pattern memory unit; 14, standard pattern length memory unit; 15, distance calculation unit; 16, distance memory unit; 17, recurrence formula calculation unit; 18, dissimilarity memory unit; 19, path information memory unit; 20, digit dissimilarity calculation unit; 21, digit dissimilarity memory unit; 22, digit recognition category memory unit; 23, digit path information memory unit; 24, determination unit; 151, absolute value circuit; 152, adder; 153, accumulator; 171, comparison circuit; 172, adder; D1, D2, D3, registers holding dissimilarities; F1, F2, F3, registers holding paths; 201, comparison circuit; 202, register holding the word dissimilarity; 203, register holding the category; 204, register holding the path information; 241, comparison circuit; 242, register holding the minimum digit dissimilarity; 243, register holding the number of digits; 244, register holding the digit path information; 245, register holding and outputting the recognition result; 246, determination control unit.

Claims

1. A continuous speech recognition device in which, between an input pattern A = a_1, a_2, ..., a_m, ..., a_M that is a time series of feature vectors and consists of one or more words, and a continuous speech standard pattern C = B^{v1}, B^{v2}, ..., B^{vl}, ..., B^{vLmax} of at most Lmax digits obtained by combining V word standard patterns B^v = b^v_1, b^v_2, ..., b^v_n, ..., b^v_{N^v} (v = 1, 2, ..., V) stored in advance, in order to find the minimum of the dissimilarity defined as the sum of the inter-vector distances d(a_m, b_n) between the input pattern a_m and the continuous speech standard pattern b_n along a time function n(m) that maps the time axis m of the input pattern to the time axis n of the continuous speech standard pattern, the continuous speech standard pattern C = B^{v1}, B^{v2}, ..., B^{vl}, ..., B^{vLmax} is divided into digits; a digit dissimilarity DB(l, m) indicating the minimum accumulated inter-vector distance given by the optimal time function n(m) for the l-th digit, digit path information FB(l, m) indicating the starting time point of that time function, and a digit recognition category W(l, m) that is the word name v giving the minimum accumulated distance on that time function are obtained successively for each digit l and each time point m of the input pattern; and finally the number of digits of the input pattern and the recognition result of each digit are determined, the device being characterized by comprising:
a control unit that varies a signal m indicating the time point of the input pattern from 1 to M, varies a signal v indicating the word from 1 to V for each m, and, for each v, varies a signal l indicating the digit from 1 to Lmax and a signal n indicating the time point of the standard pattern from 1 to N^v;
a dissimilarity memory unit D(l, v, n) and a path information memory unit F(l, v, n) addressed by the signals l, v, and n of the control unit;
a distance calculation unit that, at each time point m, obtains the inter-vector distances d(a_m, b^v_n), n = 1 to N^v, between the input pattern a_m and the word standard pattern b^v_n, n = 1 to N^v, of the word v designated successively by the control unit;
a distance memory unit d(n) that stores these distances;
a recurrence formula calculation unit that, at each time point m and for each digit l and each word v, first sets the initial conditions from the digit dissimilarity DB(l-1, m-1) and the digit path information FB(l-1, m-1) obtained at time point m-1, evaluates the dynamic programming recurrence formula with reference to the distance d(n) and to the dissimilarity D(l, v, n) and path information F(l, v, n) at time point m-1 so as to obtain successively the dissimilarity D(l, v, n) and path information F(l, v, n) at time point m, and obtains the word dissimilarity D(l, v, N^v) and the word path information F(l, v, N^v);
a digit dissimilarity calculation unit that, at each time point m and for each digit l, finds the minimum among the word dissimilarities D(l, v, N^v) obtained by the recurrence formula calculation unit and takes it as the digit dissimilarity DB(l, m), takes the corresponding word path information F(l, v, N^v) as the digit path information FB(l, m), and takes the word name v giving the minimum as the digit recognition category W(l, m);
a digit dissimilarity memory unit DB(l, m), a digit path information memory unit FB(l, m), and a digit recognition category memory unit W(l, m) for storing these; and
a determination unit that determines and outputs the category of each digit of the input pattern in reverse order on the basis of the digit path information FB(l, m) and the digit recognition category W(l, m).