JPH0259928A

JPH0259928A - Memory address control circuit for DP operation

Info

Publication number: JPH0259928A
Application number: JP63212720A
Authority: JP
Inventors: Kiyoshi Indo; 印藤　清志; Satoshi Miki; 三樹　聡
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1988-08-26
Filing date: 1988-08-26
Publication date: 1990-02-28
Anticipated expiration: 2009-03-02
Also published as: JPH0616262B2

Abstract

PURPOSE: To execute dynamic programming (DP) operations rapidly by counting a counter up or down when the frame number of the input speech changes, and by loading the spectrum pattern number into a register when the spectrum pattern number changes. CONSTITUTION: The upper and lower bits of a memory address are held in the counter 1 and the register 2, respectively. Calculating the cumulative value at a given point requires three consecutive input speech frame numbers, and these can be produced by counting up or down the counter 1, which holds the input speech frame number. When the base address changes, the counter 1 is counted up or down by a signal from the control part 4; when the offset address changes, a pseudo-phoneme number stored in the buffer memory 3 is loaded into the register 2. Consequently, memory access for the DP operation can be executed rapidly.

Description

[Detailed Description of the Invention] [Field of Industrial Application] This invention relates to a memory address control circuit for performing dynamic programming (hereinafter DP) operations at high speed in word speech recognition based on vector quantization.

[Prior Art] In a word-unit recognition scheme, introducing vector quantization into the representation of the word dictionary reduces both the storage required for the dictionary and the amount of recognition processing. In a recognition scheme that uses vector quantization, before the DP operation, the spectral distance is computed between the spectrum pattern of each frame of the input speech and each of several hundred representative spectrum patterns prepared in advance by vector quantization (hereinafter called pseudo-phoneme standard patterns), and a distance matrix is created.
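The construction of the distance matrix described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the toy two-dimensional feature vectors, and the use of squared Euclidean distance are all assumptions made for the example (the description itself names LPC cepstrum, WLR, and PWLR distances as candidates).

```python
def squared_euclidean(a, b):
    """Stand-in distance measure (assumption); the patent lists LPC cepstrum, WLR, PWLR."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_distance_matrix(frames, codebook, distance=squared_euclidean):
    """Return matrix[i][n]: distance between input frame i and standard pattern n."""
    return [[distance(frame, pattern) for pattern in codebook] for frame in frames]


# Toy data (hypothetical): 3 pseudo-phoneme standard patterns, 2 input frames.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
frames = [[0.0, 0.0], [1.0, 1.0]]
matrix = build_distance_matrix(frames, codebook)
```

Each row corresponds to one input speech frame and each column to one pseudo-phoneme standard pattern, which is the layout the address circuit later exploits.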

During the DP operation, the required distance values are read from this distance matrix according to the DP recurrence formula and accumulated. The distance matrix must therefore be accessed randomly. The memory address control units used in general-purpose signal processors and similar devices have only an up/down counter for memory access, so for random access the required address must first be calculated in the arithmetic logic unit and then set in the address counter.

As a result, in DP operations, which access memory frequently, the overhead of address generation becomes large and processing time increases.

[Means for Solving the Problems] The object of this invention is to solve the above problem and to perform memory access during DP operations at high speed.

The circuit comprises: a counter that can hold the frame number indicating the time information of the input speech and that can count up and down; a register that holds the number of a vector-quantized pseudo-phoneme standard pattern; a buffer memory that holds the pseudo-phoneme standard pattern numbers needed for the DP operation; and a control unit that sets the input speech frame number in the counter, controls the up/down counting of the counter, sets the pseudo-phoneme standard pattern number in the register, and reads pseudo-phoneme standard pattern numbers from the buffer memory. The counter supplies the upper address of the distance matrix memory and the register supplies the lower address. When the input speech frame number changes, the counter is counted up or down by a signal from the control unit; when the pseudo-phoneme standard pattern number changes, the number stored in the buffer memory can be set in the register. Address generation for distance matrix memory access during the DP operation is thereby performed at high speed.

[Embodiment] The invention is described below with reference to the drawings. Fig. 1 shows the relationship between the word dictionary and the distance matrix created in word speech recognition based on vector quantization.

The distance matrix in Fig. 1 stores the spectral distance values between each input speech frame and all pseudo-phoneme standard patterns. Here d_{i,n} denotes the spectral distance between the spectrum pattern of input speech frame number i and the n-th pseudo-phoneme standard pattern. For ease of calculation, the number of pseudo-phoneme standard patterns is chosen to be a power of 2 (256, 512, 1024, etc.). Various distance measures can be used as the spectral distance, for example the LPC cepstrum distance, the WLR distance, which emphasizes spectral peaks, and the PWLR distance, which adds a power term to the WLR distance.

The spectral distance value d_{ij} between the j-th frame of a given word dictionary and the i-th frame of the input speech can be read from the distance matrix as follows.

First, the pseudo-phoneme number n_j stored in the j-th frame is read from the word dictionary. Next, the distance value d_{i,n_j} indicated by the input speech frame number i and the pseudo-phoneme number n_j is read from the distance matrix. This d_{i,n_j} is the spectral distance d_{ij} between the i-th frame of the input speech and the j-th frame of the word dictionary.
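The two-step lookup just described (dictionary frame to pseudo-phoneme number, then matrix read) can be sketched as below. The function name, the zero-based indexing, and the toy values are assumptions for illustration only.

```python
def spectral_distance(distance_matrix, word_dictionary, i, j):
    """Return d_ij by indirection through the word dictionary.

    word_dictionary[j] is the pseudo-phoneme number n_j stored for dictionary frame j;
    distance_matrix[i][n] is the precomputed distance d_{i,n}. Indices are zero-based
    here, which is an assumption of this sketch.
    """
    n_j = word_dictionary[j]            # step 1: read n_j from the word dictionary
    return distance_matrix[i][n_j]      # step 2: read d_{i,n_j} from the distance matrix


# Hypothetical toy data: 2 input frames, 3 standard patterns, a 3-frame word.
distance_matrix = [[0.0, 1.0, 4.0],
                   [2.0, 1.0, 2.0]]
word_dictionary = [2, 0, 1]             # n_0 = 2, n_1 = 0, n_2 = 1

d_10 = spectral_distance(distance_matrix, word_dictionary, 1, 0)
```

No distance is computed at lookup time; the cost of matching a word reduces to memory reads, which is why address generation speed matters.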

Next, the DP operation is described using recurrence formula (1).

Using recurrence formula (1) does not cause any loss of generality with respect to the features of the invention described below. The concept of the DP operation in formula (1) is shown in Fig. 2.

・・・・・・ (1)

where G(k) is the cumulative distance value, the G term on the right-hand side is the preceding cumulative distance value, and d_{ij} is the spectral distance value between input speech frame i and word dictionary frame j (pseudo-phoneme number n_j).

From recurrence formula (1), calculating one cumulative point requires the distance value data d_{i-2,j-2}, d_{i-1,j-1}, d_{i,j-1}, d_{i-1,j}, and d_{ij}.

Each of these distance values can be accessed from the input speech frame numbers i, i-1, i-2 and the pseudo-phoneme numbers n_{j-2}, n_{j-1}, n_j.

Fig. 3 shows an embodiment of this invention, devised to access the distance value data d_{i-2,j-2}, d_{i-1,j-1}, d_{i,j-1}, d_{i-1,j}, and d_{ij} at high speed and to process the DP operation efficiently.

In the figure, 1 is a counter that can hold the input speech frame number and can count up and down; 2 is a register that holds the pseudo-phoneme standard pattern number; 3 is a buffer memory that holds the pseudo-phoneme standard pattern numbers needed for the DP operation; 4 is a control unit that sets the input speech frame number in the counter 1, controls the up/down counting of the counter 1, sets the pseudo-phoneme standard pattern number in the register 2, and reads pseudo-phoneme standard pattern numbers from the buffer memory 3; and 5 is the output terminal for the generated address.

By taking the input speech frame number as the base address (upper address) of the distance matrix and the pseudo-phoneme standard pattern number as the offset address (lower address), the address in the distance matrix that holds a distance value needed for the DP operation can be specified. For example, with 256 pseudo-phonemes and the distance matrix starting at address 0, the base address of the distance matrix corresponding to the input speech frame number changes as 0, 256×1, 256×2, ..., 256×(i-1) (where i is the input speech frame number). If the memory address is 16 bits wide, the upper 8 bits are the input speech frame number and the lower 8 bits are the pseudo-phoneme number, and these are held in the counter 1 and the register 2 of Fig. 3, respectively.

According to recurrence formula (1) and the conceptual diagram of the DP operation in Fig. 2, calculating the cumulative value at a given point requires three consecutive input speech frame numbers. These values can be generated by counting up or down the counter 1, which holds the input speech frame number. Accordingly, when the base address changes, the control unit 4 counts the counter 1 up or down, and when the offset address changes, the pseudo-phoneme number stored in the buffer memory 3 is set in the register 2. The address indicated by the counter 1 and the register 2 is output through the terminal 5.
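The address arithmetic in the 256-pattern, 16-bit example can be illustrated with a short sketch. The constant and function names are assumptions, and the counter value is treated here as a zero-based frame index. Because the number of patterns is a power of 2, concatenating the two 8-bit fields gives the same result as multiplying the frame index by 256 and adding the offset, so no arithmetic unit is needed to form the address.

```python
OFFSET_BITS = 8                      # lower 8 bits: pseudo-phoneme number (register 2)
FRAME_BITS = 8                       # upper 8 bits: input speech frame number (counter 1)
NUM_PATTERNS = 1 << OFFSET_BITS      # 256 pseudo-phoneme standard patterns


def compose_address(frame_counter, pattern_register):
    """Concatenate counter (upper bits) and register (lower bits) into one address."""
    if not 0 <= frame_counter < (1 << FRAME_BITS):
        raise ValueError("frame counter does not fit in the upper field")
    if not 0 <= pattern_register < NUM_PATTERNS:
        raise ValueError("pattern number does not fit in the lower field")
    return (frame_counter << OFFSET_BITS) | pattern_register


# Frame index 2, pseudo-phoneme number 0x1F: base address 256 * 2 plus offset 31.
address = compose_address(2, 0x1F)
```

Moving to an adjacent frame changes only the upper field by one, which is exactly what an up/down counter provides.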

Fig. 4 shows an example of address generation according to this embodiment. The figure shows an example in which d_{i-2,j-2}, d_{i-1,j-1}, d_{ij}, and d_{i-1,j} are accessed in sequence. The processing in each cycle is as follows. It is assumed that the input speech frame number i-2 has already been set in the counter 1.

Cycle 1: The pseudo-phoneme number n_{j-2} is read from the buffer memory 3 and set in the register 2 at the end of this cycle.

Cycle 2: The address of d_{i-2,j-2} is output from the output terminal 5. The pseudo-phoneme number n_{j-1} is read from the buffer memory 3 and set in the register 2 at the end of this cycle. The counter 1 is counted up (+1) at the end of this cycle.

Cycle 3: The address of d_{i-1,j-1} is output from the output terminal 5. The pseudo-phoneme number n_j is read from the buffer memory 3 and set in the register 2 at the end of this cycle.

Cycle 4: The address of d_{ij} is output from the output terminal 5. The counter 1 is counted down (-1) at the end of this cycle.

Cycle 5: The address of d_{i-1,j} is output from the output terminal 5.

Through the above processing, the distance values d_{i-2,j-2}, d_{i-1,j-1}, d_{ij}, and d_{i-1,j} can be accessed in consecutive cycles.
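The five-cycle sequence of Fig. 4 can be traced with a small software model. This is a sketch under stated assumptions, not the circuit itself: the class and method names are hypothetical, the lower field is assumed to be 8 bits wide, and a count-up at the end of cycle 3 is assumed, because the cycle list does not state one but cycle 4 can only address frame i if the counter has advanced from i-1.

```python
class AddressGenerator:
    """Toy model of counter 1, register 2, and buffer memory 3 (names are assumptions)."""

    def __init__(self, frame, pattern_buffer, offset_bits=8):
        self.counter = frame                  # counter 1: input speech frame number
        self.register = 0                     # register 2: pseudo-phoneme number
        self.buffer = list(pattern_buffer)    # buffer memory 3: n_{j-2}, n_{j-1}, n_j
        self.offset_bits = offset_bits
        self._next = 0                        # next buffer entry to read

    def load_register(self):
        """Read the next pseudo-phoneme number from the buffer into the register."""
        self.register = self.buffer[self._next]
        self._next += 1

    def count(self, step):
        """Count the frame counter up (+1) or down (-1)."""
        self.counter += step

    def address(self):
        """Address on terminal 5: counter in the upper bits, register in the lower bits."""
        return (self.counter << self.offset_bits) | self.register


def trace_figure4(i, n_jm2, n_jm1, n_j):
    """Return the four addresses output in cycles 2 to 5."""
    gen = AddressGenerator(frame=i - 2, pattern_buffer=[n_jm2, n_jm1, n_j])
    out = []
    gen.load_register()                                            # cycle 1
    out.append(gen.address()); gen.load_register(); gen.count(+1)  # cycle 2: d_{i-2,j-2}
    out.append(gen.address()); gen.load_register(); gen.count(+1)  # cycle 3: d_{i-1,j-1}; count-up assumed
    out.append(gen.address()); gen.count(-1)                       # cycle 4: d_{i,j}
    out.append(gen.address())                                      # cycle 5: d_{i-1,j}
    return out


addresses = trace_figure4(i=10, n_jm2=7, n_jm1=3, n_j=9)
```

Each address is appended before the end-of-cycle updates are applied, mirroring the description that register loads and counter steps take effect at the end of a cycle. One address is produced per cycle from cycle 2 onward without any separate address calculation step.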

[Effects of the Invention] As described above, according to this invention, in word speech recognition based on vector quantization, distance values can be read at high speed without wait cycles using a simple circuit configuration. In DP operations, which access memory frequently, the overhead of generating distance value addresses is reduced and the recognition processing time can be shortened.

[Brief Description of the Drawings]

Fig. 1 is a diagram showing the relationship between the word dictionary and the distance matrix created in word recognition based on vector quantization; Fig. 2 is a conceptual diagram showing an example of the DP operation; Fig. 3 is a block diagram showing an embodiment of this invention; and Fig. 4 is a diagram showing an example of address generation.

Patent applicant: Nippon Telegraph and Telephone Corporation

Claims

[Claims]

(1) In a recognition system in which the word dictionary to be recognized is expressed as a sequence of numbers of spectrum patterns created by vector quantization, and word recognition is performed by dynamic programming (DP) matching between the word dictionary and the input speech, a memory address control circuit for DP operation comprising:
a counter that can hold the frame number indicating the time information of the input speech and that can count up and down;
a register that holds the number of a vector-quantized spectrum pattern;
a buffer memory that holds the spectrum pattern numbers needed for the operation; and
a control unit that sets the input speech frame number in the counter, controls the up/down counting of the counter, sets the spectrum pattern number in the register, and reads spectrum pattern numbers from the buffer memory,
wherein, when accessing the data memory that stores the distance value data used in the DP operation, the counter indicates the upper address of the data memory and the register indicates the lower address, and
wherein, when the frame number of the input speech changes, the counter is counted up or down by a signal from the control unit, and when the spectrum pattern number changes, the spectrum pattern number stored in the buffer memory can be set in the register.