JPS636599A

JPS636599A - Word preselection system

Info

Publication number: JPS636599A
Application number: JP61150191A
Authority: JP
Inventors: 畑崎　香一郎
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1986-06-26
Filing date: 1986-06-26
Publication date: 1988-01-12
Also published as: JPH0575120B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION (Industrial Application Field) The present invention relates to a word preselection method for speech recognition. It is used in speech recognition devices, speech input devices, and the like to efficiently select, from a recognition word dictionary or similar, the words that are likely to appear in the input speech.

(Prior Art and Its Problems) In speech recognition devices and speech input devices, the vocabulary to be recognized is usually fixed in advance, and recognition is performed by treating the input speech as one word, or a sequence of words, from that vocabulary. Recognition processing means, for example, matching the input speech against the reference pattern of each word in the vocabulary, or matching the phoneme candidate sequence of the input speech against the phoneme sequence of each word in the vocabulary, in order to find the word or word sequence most similar to the input speech. This processing normally requires a large amount of computation. Moreover, recognition vocabularies keep growing, and the computation required for recognition grows with them.

Therefore, if only the words likely to appear in a given input utterance could easily be preselected from the recognition vocabulary, recognition processing would need to be performed only on the selected words, and the computation required for recognition could be reduced.

Preselection is usually performed using phoneme classes that can be detected stably in the input speech. It relies on the following principle: if several such stable phoneme classes are detected in the input speech, a vocabulary word that contains none of the detected phoneme classes is very unlikely to be contained in that input speech.
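The principle above can be sketched in a few lines. This is a minimal illustration, not the method of any cited device: the vocabulary, its phoneme class annotations, and the function name `preselect_by_classes` are assumptions made for the example.

```python
def preselect_by_classes(detected, vocabulary):
    """Keep only the words that share at least one phoneme class with the input.

    detected:   iterable of phoneme class labels detected in the input speech
    vocabulary: dict mapping word -> phoneme class sequence (hypothetical data)
    """
    detected = set(detected)
    return [word for word, classes in vocabulary.items()
            if detected & set(classes)]


# Hypothetical vocabulary annotated with vowel classes only.
vocabulary = {"eigo": "eio", "kiku": "iu", "umi": "ui"}
candidates = preselect_by_classes("eoa", vocabulary)
print(candidates)
```

Words whose annotation shares no class with the detected set are dropped before any expensive matching is attempted.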

Possible choices of stably detectable phoneme classes include: treating the five vowels, the fricatives, and the moraic nasal each as a class; treating coarse categories of consonants, such as fricatives and plosives, each as a class; or, if individual phonemes can be detected accurately, treating each phoneme as its own class.

In preselection, words that contain at least one of the one or more preselection keys obtained from the input speech are output as the selection result. As long as the words actually contained in the input speech are correctly selected, preselection is more effective the fewer other words it selects. To reduce the number of selected words, the number of key types should be increased so that the number of distinct keys obtained from the input speech becomes a small fraction of the total number of key types. For this reason, preselection has conventionally used combinations of n phoneme classes detected in the input speech as keys of length n.

On the other hand, because of articulatory variation during speech and the limited performance of the phoneme class detector, detection errors can occur: a phoneme class that should be present is dropped, or a phoneme class that is not actually present is inserted. Consequently, if only contiguous subsequences of the detected phoneme class sequence are used as preselection keys, the input word may fail to be selected.

For these reasons, the conventional approach, as described in, for example, Reference 1 (仮橋、損出, "On the reduction of vocabulary by specifying word-internal partial phoneme sequences," Proceedings of the Acoustical Society of Japan, 1-1-3, October 1983) and Reference 2 (Japanese Patent Application No. Sho 60-173422, "Word preselection method in speech recognition"), has been to use as a preselection key a sequence of n phoneme classes, not necessarily contiguous, taken from the phoneme class sequence detected in the input speech, and to select the words that contain the same phoneme classes as the key, again not necessarily contiguously. In this way, word preselection was made tolerant of phoneme class detection errors.

Even in this case, increasing n increases the number of distinct keys of length n obtained from the input speech, but the total number of key types increases even faster. The number of words containing a key obtained from the input speech therefore decreases in most cases, and effective preselection is possible.

(Problems to Be Solved by the Invention) However, the prior art described above has problems that stem from using a fixed preselection key length.

The first problem is that, when a word contained in the input speech is not sufficiently long relative to the key length, the keys needed to select that word may not be obtained correctly.

For example, suppose the phoneme classes used for keys are the five vowels a, i, u, e, o and the moraic nasal X, six classes in total. The correct phoneme class sequence for the input speech "eigo dewa" (エイゴデワ) is /eioea/, but if detection of the i fails, the preselection keys must be obtained from /eoea/.

If a key is a sequence of phoneme classes taken in order from the input speech with at most one phoneme class skipped between adjacent elements, and the key length is n = 3, the preselection keys are the four keys e-o-e, e-o-a, e-e-a, and o-e-a. Even if preselection is performed with these keys, the word "eigo", which must be selected, is not selected because it contains none of them.
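The failure described above can be reproduced with a short sketch. The function name `extract_keys`, the string encoding of keys (classes joined by hyphens), and the choice to test containment by comparing against the word's own keys are assumptions made for illustration; only the class sequences and the "at most one skip" rule come from the text.

```python
from itertools import combinations


def extract_keys(classes, n, max_skip=1):
    """Distinct length-n keys: ordered picks from `classes` in which at most
    `max_skip` classes are skipped between consecutive picks."""
    keys = set()
    for idx in combinations(range(len(classes)), n):
        if all(b - a - 1 <= max_skip for a, b in zip(idx, idx[1:])):
            keys.add("-".join(classes[i] for i in idx))
    return keys


detected = "eoea"                      # /eioea/ with the i missed
input_keys = extract_keys(detected, 3)
word_keys = extract_keys("eio", 3)     # phoneme classes of the word "eigo"
selected = bool(input_keys & word_keys)

print(sorted(input_keys))
print("eigo selected:", selected)
```

With a fixed key length of 3, the three-class word can only match the single key e-i-o, so one missed class is enough to lose it.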

Likewise, when the input speech contains a word of two syllables or fewer, no key of length 3 for selecting that word can be obtained, even if every phoneme class in the word is detected correctly.

The second problem is the converse. A word that is too long relative to the key length contains many distinct keys, so it is often selected by mistake even when it is not contained in the input speech.

For example, the four-syllable word "oXsei" contains four keys of length 3 as defined above: o-X-e, o-X-i, o-e-i, and X-e-i. By contrast, the six-syllable word "moXdaiteX" contains twelve: o-X-a, o-X-i, o-a-i, o-a-e, X-a-i, X-a-e, X-i-e, X-i-X, a-i-e, a-i-X, a-e-X, and i-e-X. The six-syllable word therefore has three times as many chances of being selected as the four-syllable word.
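The key counts in this comparison can be checked mechanically. The sketch below is an illustration under assumptions: the helper name `extract_keys` and the hyphen-joined key strings are hypothetical, while the class sequences /oXei/ and /oXaieX/ and the one-skip rule follow the text.

```python
from itertools import combinations


def extract_keys(classes, n, max_skip=1):
    """Distinct length-n keys: ordered picks from `classes` in which at most
    `max_skip` classes are skipped between consecutive picks."""
    keys = set()
    for idx in combinations(range(len(classes)), n):
        if all(b - a - 1 <= max_skip for a, b in zip(idx, idx[1:])):
            keys.add("-".join(classes[i] for i in idx))
    return keys


short_word_keys = extract_keys("oXei", 3)     # "oXsei", four syllables
long_word_keys = extract_keys("oXaieX", 3)    # "moXdaiteX", six syllables

print(len(short_word_keys), sorted(short_word_keys))
print(len(long_word_keys), sorted(long_word_keys))
print("ratio:", len(long_word_keys) / len(short_word_keys))
```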

(Means for Solving the Problems) To solve the problems described above, the present invention provides a word preselection method for speech recognition in which a sequence of n phoneme classes, not necessarily contiguous, in the input speech is taken as a key of length n, at least one key is extracted from the input speech, and words that contain the same phoneme class sequence as any of these keys, not necessarily contiguously, are output as the preselection result. The method is characterized in that keys of multiple lengths are extracted from the input speech, and each word is selected using keys of a length predetermined according to the length of that word.

(Operation) The problems described above arise because the preselection key length was fixed regardless of the lengths of the words in the recognition vocabulary.

In contrast, the present invention performs preselection using, for each word in the recognition vocabulary, preselection keys of the length best suited to the length of that word. Specifically, short keys are used when examining short words, and long keys are used when examining long words.

As a result, for short words the key is also short, so even if detection of some phoneme class in the input speech fails, enough keys for preselection can still be obtained from the remaining phoneme classes. Nor is a word ever so short that no key at all can be obtained. For long words, the key is longer, so the total number of key types is larger. The keys contained in the word are therefore a smaller fraction of that total, which means the word has fewer chances of being selected by mistake.

Using keys of multiple lengths requires extracting keys of multiple lengths from the input speech, which increases the processing for this step compared with the conventional method. However, this extraction is needed only once per input utterance, and its cost is very small compared with the cost of preselection as a whole.

(Embodiment) The present invention is described in detail below by way of an embodiment, with reference to the drawing.

FIG. 1 is a block diagram showing an embodiment of the present invention. In this embodiment, six phoneme classes are used for preselection keys: the five vowels a, i, u, e, o and the moraic nasal X. These phoneme classes are relatively stationary within the input speech and can be detected relatively stably with current technology. Four key lengths are used, for example n = 1, 2, 3, 4, chosen according to the number of syllables in the word.

The input speech is first stored in speech memory 101. Phoneme class detection unit 102 detects, from the input speech in speech memory 101, the phoneme classes that serve as components of preselection keys, and stores each detected phoneme class together with its position in the input speech in phoneme class memory 103. One known way for phoneme class detection unit 102 to detect these phoneme classes is as follows: a reference pattern per speech frame is prepared in advance for each phoneme class; the similarity between each frame of the input speech and each reference pattern is computed; and if the reference pattern of some phoneme class shows high similarity continuously over several frames, that phoneme class is detected as the phoneme class of that speech interval.
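The run-based detection just described can be sketched as follows. This is only an illustration under assumptions: the function name `detect_phoneme_classes`, the similarity threshold, the minimum run length, and the per-frame similarity values are all hypothetical, and computing the similarities themselves from reference patterns is outside the scope of the sketch.

```python
def detect_phoneme_classes(similarity, threshold=0.8, min_frames=3):
    """Detect phoneme classes from per-frame similarity scores.

    similarity: dict mapping class label -> list of per-frame similarities
                to that class's reference pattern (hypothetical values).
    Returns a list of (start_frame, label) pairs sorted by position.
    """
    detected = []
    for label, scores in similarity.items():
        run_start = None
        # A trailing sentinel below the threshold closes a run that
        # reaches the last frame.
        for t, score in enumerate(list(scores) + [float("-inf")]):
            if score >= threshold:
                if run_start is None:
                    run_start = t
            else:
                if run_start is not None and t - run_start >= min_frames:
                    detected.append((run_start, label))
                run_start = None
    return sorted(detected)


similarity = {
    "e": [0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.2],
    "o": [0.1, 0.1, 0.1, 0.9, 0.95, 0.9, 0.85],
    "a": [0.0, 0.0, 0.9, 0.9, 0.0, 0.0, 0.0],  # run too short to count
}
result = detect_phoneme_classes(similarity)
print(result)
```

Each detected class is returned with its starting frame, mirroring the class-plus-position pairs held in phoneme class memory 103.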

For example, from the input speech "eigo dewa" (エイゴデワ), the four phoneme classes /eoea/ are detected and stored in phoneme class memory 103, each together with its position in the input speech.

Key detection unit 104 takes as a key of length n a sequence of n phoneme classes read in order from phoneme class memory 103, allowing at most one phoneme class to be skipped between adjacent elements. The first to fourth key memories store the keys of length 1 to 4, respectively. As a result, first key memory 105 stores three distinct keys of length 1: e, o, and a. Second key memory 106 stores five keys of length 2: e-o, e-e, o-e, o-a, and e-a. Third key memory 107 stores four keys of length 3: e-o-e, e-o-a, e-e-a, and o-e-a. Fourth key memory 108 stores one key of length 4: e-o-e-a.
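The contents of the four key memories can be reproduced with a small sketch. The helper name `extract_keys`, the hyphen-joined key strings, and the dictionary keyed by the unit numbers 105 to 108 are assumptions for illustration; the positions and the start and end marks that the embodiment keeps with each class are omitted here.

```python
from itertools import combinations


def extract_keys(classes, n, max_skip=1):
    """Distinct length-n keys: ordered picks from `classes` in which at most
    `max_skip` classes are skipped between consecutive picks."""
    keys = set()
    for idx in combinations(range(len(classes)), n):
        if all(b - a - 1 <= max_skip for a, b in zip(idx, idx[1:])):
            keys.add("-".join(classes[i] for i in idx))
    return keys


detected = "eoea"  # phoneme classes detected from the example utterance
# Key memories 105 to 108 hold the keys of length 1 to 4, respectively.
key_memories = {104 + n: extract_keys(detected, n) for n in range(1, 5)}

for unit, keys in key_memories.items():
    print(unit, sorted(keys))
```

All four sets are built in one pass over the detected sequence, which is why the extra extraction work is paid only once per utterance.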

Next, word selection unit 109 performs preselection on each word in word dictionary 110, which stores the words of the recognition vocabulary. Each word in word dictionary 110 is annotated with the sequence, as it occurs within the word, of the phoneme classes used for preselection keys. For each word, key selection unit 111 first determines the key length to be used for preselection, and the contents of the key memory that holds keys of that length are sent to the word selection unit. In this embodiment, the key length n is determined from the number of syllables s of the word by the following formula.

n = min(max(s - 1, 1), 4)

Preselection is performed by comparing the phoneme class sequence attached to the word with each of the keys sent from the key selection unit. That is, a word that contains the phoneme class sequence of any key in the key memory is output as a preselection candidate. Here, for example, it is assumed that even when detection errors occur in phoneme class detection unit 102, the probability of two or more consecutive phoneme classes being dropped is very small; accordingly, up to one other phoneme class is allowed to be inserted between consecutive phoneme classes of the key. For the same reason, a key phoneme class that carries a start mark or an end mark must lie within the first two phoneme classes from the start of the word or the last two from its end, respectively.
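The key length formula and the containment rule can be sketched as follows. The function names `key_length` and `contains_key` are hypothetical, keys are represented here simply as strings of class labels, and the start and end mark constraint is left out to keep the sketch short.

```python
def key_length(syllables):
    """Key length n for a word of `syllables` syllables: min(max(s - 1, 1), 4)."""
    return min(max(syllables - 1, 1), 4)


def contains_key(word_classes, key, max_insert=1):
    """True if the non-empty `key` occurs in `word_classes` in order, with at
    most `max_insert` other classes between consecutive key elements."""
    def match_from(pos, k):
        # key[k - 1] was matched at word position `pos`; try to match key[k:].
        if k == len(key):
            return True
        limit = min(pos + 2 + max_insert, len(word_classes))
        for nxt in range(pos + 1, limit):
            if word_classes[nxt] == key[k] and match_from(nxt, k + 1):
                return True
        return False

    return any(word_classes[i] == key[0] and match_from(i, 1)
               for i in range(len(word_classes)))


print([key_length(s) for s in range(1, 7)])
print(contains_key("eio", "eo"))    # one class inserted between e and o
print(contains_key("eiXo", "eo"))   # two classes inserted
```

Limiting the gap to one inserted class matches the stated assumption that two consecutive dropped classes are very unlikely.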

For example, the word "eigo" (エイゴ) in word dictionary 110 has s = 3 syllables, so the formula above gives a key length of n = 2. The five keys in second key memory 106, which stores the keys of length 2, are therefore compared with the phoneme class sequence /eio/ of the word "eigo". Because the key e-o is found to be contained in it, this word is output as one of the preselection results. On the other hand, the word "taXgo" (タンゴ), which also has three syllables, has the phoneme class sequence /aXo/. It contains none of the five keys of length 2 and is therefore not selected.
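Putting the pieces together, the two dictionary lookups in this example can be traced with a compact sketch. Everything named here (`extract_keys`, `key_length`, `preselect`, and the dictionary layout of syllable count plus class sequence) is an assumption for illustration. Containment is tested by intersecting the input keys with the word's own keys extracted under the same one-skip rule, which is equivalent to allowing one inserted class between key elements; the start and end mark constraint is omitted.

```python
from itertools import combinations


def extract_keys(classes, n, max_skip=1):
    """Distinct length-n keys: ordered picks from `classes` in which at most
    `max_skip` classes are skipped between consecutive picks."""
    keys = set()
    for idx in combinations(range(len(classes)), n):
        if all(b - a - 1 <= max_skip for a, b in zip(idx, idx[1:])):
            keys.add("-".join(classes[i] for i in idx))
    return keys


def key_length(syllables):
    """Key length chosen from the word's syllable count."""
    return min(max(syllables - 1, 1), 4)


def preselect(detected, dictionary):
    """Return the words whose length-appropriate keys overlap the input keys."""
    # Keys of every length are extracted once per utterance.
    key_memories = {n: extract_keys(detected, n) for n in range(1, 5)}
    selected = []
    for word, (syllables, classes) in dictionary.items():
        n = key_length(syllables)
        if key_memories[n] & extract_keys(classes, n):
            selected.append(word)
    return selected


# word -> (number of syllables, phoneme class sequence)
dictionary = {"eigo": (3, "eio"), "taXgo": (3, "aXo")}
result = preselect("eoea", dictionary)
print(result)
```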

For words of five or more syllables, the formula above gives a key length of n = 4, but fourth key memory 108 holds only one key. The chance that a word of five or more syllables is selected by mistake is therefore extremely small.

All the other words in word dictionary 110 are then compared with the keys in the same way, and some of them are output as the preselection result.

An embodiment of the present invention has been described above. The phoneme classes used for preselection keys are not limited to those shown in the embodiment; any classes that can be detected stably may be used, such as coarse classes of consonants (fricatives, plosives, and so on). If individual phonemes can be detected accurately, they may be used directly as phoneme classes.

The key lengths used for preselection are likewise not limited to the combination used in the embodiment; a set of non-consecutive values such as 1, 3, 5 may also be used.

The length of a word is also not limited to the number of syllables as in the embodiment; it may instead be, for example, the number of pseudo-phonemes.

(Effects of the Invention) As explained above, and as the example of the word "eigo" (エイゴ) shows, words that the conventional method failed to select are correctly selected by the method of the present invention. Long words also have fewer chances of being selected by mistake.

In addition, as noted earlier, the longer the key, the larger the total number of key types, so fewer words are selected and preselection becomes more effective.

With the method of the present invention, a longer key that is best suited to each word can be used, so more effective preselection is possible.

Thus, the method of the present invention provides a word preselection method for speech recognition that is more effective and makes fewer errors.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention. 101: speech memory; 102: phoneme class detection unit; 103: phoneme class memory; 104: key detection unit; 105 to 108: key memories; 109: word selection unit; 110: word dictionary; 111: key selection unit. Agent: Patent Attorney 本庄 伸介

Claims

[Claims]

A word preselection method for speech recognition in which a sequence of n phoneme classes, not necessarily contiguous, in the input speech is taken as a key of length n, at least one key is extracted from the input speech, and words that contain the same phoneme class sequence as any of these keys, not necessarily contiguously, are output as the preselection result, characterized in that keys of multiple lengths are extracted from the input speech, and each word is selected using keys of a length predetermined according to the length of that word.