JPS6332394B2 - - Google Patents

Info

Publication number
JPS6332394B2
JPS6332394B2 JP56028139A JP2813981A JPS6332394B2 JP S6332394 B2 JPS6332394 B2 JP S6332394B2 JP 56028139 A JP56028139 A JP 56028139A JP 2813981 A JP2813981 A JP 2813981A JP S6332394 B2 JPS6332394 B2 JP S6332394B2
Authority
JP
Japan
Prior art keywords
pattern
registered
recognition
input
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
JP56028139A
Other languages
Japanese (ja)
Other versions
JPS57141700A (en)
Inventor
Masahiko Goto
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mitsubishi Electric Corp
Original Assignee
Mitsubishi Electric Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mitsubishi Electric Corp
Priority to JP56028139A
Publication of JPS57141700A
Publication of JPS6332394B2
Granted

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a speech recognition device that involves a registration operation and incorporates measures to prevent misrecognition.

FIG. 1 shows an example of the configuration of a conventional speech recognition device. The speech waveform 2 picked up by the microphone 1 undergoes, for example, frequency spectrum analysis in the speech analysis/feature extraction circuit 3, and a feature pattern 4 representing the temporal structure of the spectrum is extracted. The pattern compression circuit 5 then condenses this feature pattern 4 into a compressed pattern 6 of fixed length, regardless of how long the utterance lasted. The switch 7 that follows selects between the learning (registration) mode and the recognition mode: it is set to the dotted-line side during a speech registration operation and to the solid-line side during recognition.
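The fixed-length compression step can be sketched as follows. The patent does not say how the pattern compression circuit 5 works, so this is only an assumption: the frames are split into equal time segments and each segment is averaged. The names `compress_pattern` and `target_len` are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[float]  # one spectral frame, e.g. filter-bank energies


def compress_pattern(frames: Sequence[Frame], target_len: int = 8) -> List[List[float]]:
    """Reduce a variable-length frame sequence to exactly target_len frames.

    Assumption: uniform time segmentation with per-segment averaging.
    The source does not specify the compression method.
    """
    n = len(frames)
    if n == 0:
        raise ValueError("empty feature pattern")
    dim = len(frames[0])
    compressed = []
    for k in range(target_len):
        start = k * n // target_len
        # Guarantee at least one frame per segment, even when n < target_len.
        end = max(start + 1, (k + 1) * n // target_len)
        segment = frames[start:end]
        compressed.append(
            [sum(frame[d] for frame in segment) / len(segment) for d in range(dim)]
        )
    return compressed


short_utterance = [[1.0, 2.0]] * 5
long_utterance = [[1.0, 2.0]] * 37
# Both utterances end up with the same number of frames.
print(len(compress_pattern(short_utterance)), len(compress_pattern(long_utterance)))
```

Whatever method is used, the point is that every utterance yields a pattern of the same size, so patterns can be compared element by element.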

During speech registration, the compressed pattern 6 passes through the dotted-line side of the mode changeover switch 7 and is written sequentially into the registered pattern memory 9. This memory has capacity for N registered words, which are normally stored in order as voice No. 1, 2, 3, ..., N. Once a full set, for example 100 words, has been registered, the switch 7 is set to the solid-line side and recognition begins. During recognition, the compressed pattern 6 passes through the solid-line side of the switch 7 and is held temporarily in the input pattern memory 8. This memory 8 is updated and overwritten with each new speech input.

Next, the recognition processing circuit 12 compares the registered patterns 11 from the registered pattern memory 9, one after another, with the single input pattern 10 from the input pattern memory 8, and computes the similarity for each pair. The registered word with the greatest similarity to the input speech is selected and output as the determination result 13. To avoid misrecognition, the recognition processing circuit 12 also monitors the similarity and generates the determination control signal 14 only when the similarity exceeds a certain threshold. The determination result 13 and the determination control signal 14 are both fed to the transfer gate 15, so the final recognition result 16 is output only when the recognition score is good.

For example, if a similarity of 100 denotes a perfect match and the threshold is 90, an input utterance whose similarity is 85 has its determination output rejected. This kind of output control is an important function for coping with variation in speech patterns and with ambient noise, and for preventing misrecognition.
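The selection and gating described above can be sketched as follows. The patent does not define how similarity is computed, so the cosine-based `similarity` function on a 0-100 scale is an assumption, and `judge` and `transfer_gate` are hypothetical names standing in for the recognition processing circuit 12 and the transfer gate 15.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Pattern = Sequence[float]


def similarity(a: Pattern, b: Pattern) -> float:
    """Hypothetical similarity: cosine similarity scaled so that 100 is a perfect match."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 100.0 * dot / norm if norm else 0.0


def judge(input_pattern: Pattern, registered: Dict[str, Pattern]) -> Tuple[str, float]:
    """Return the best-matching registered word and its score (determination result 13)."""
    if not registered:
        raise ValueError("no registered patterns")
    scores = {word: similarity(input_pattern, p) for word, p in registered.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def transfer_gate(word: str, score: float, threshold: float = 90.0) -> Optional[str]:
    """Pass the word through only when the score exceeds the threshold."""
    return word if score > threshold else None


registered = {"one": [1.0, 0.0], "two": [0.0, 1.0]}
word, score = judge([1.0, 0.1], registered)
print(transfer_gate(word, score))   # similarity is about 99.5, so "one" passes the gate
print(transfer_gate("six", 85.0))   # 85 does not exceed 90, so the result is rejected (None)
```

Keeping the judgment and the gate separate mirrors the figure, where the determination result and the control signal are distinct signals that meet at the transfer gate.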

With the conventional device, however, words that sound extremely similar, such as SIX /sikkusu/ and FIX /fikkusu/, could be registered inadvertently. This caused frequent misrecognition in recognition mode and disrupted system operation. Moreover, when misrecognition persisted, the speech pattern in question had to be re-registered, and because recognition was suspended in the meantime, the system's operating rate dropped significantly.

The present invention was made to eliminate these drawbacks of the conventional device. By performing the recognition operation during speech registration as well, and thereby preventing the registration of patterns with extremely high similarity, it aims to suppress misrecognition and provide a highly reliable speech recognition device.

FIG. 2 is a block diagram showing an embodiment of the speech recognition device according to the present invention. In the figure, 20 is a transfer circuit consisting of a transfer path 18 and a route changeover switch 17. During a registration operation it transfers the input pattern through the registered pattern memory 9 to the input pattern memory 8, so that the input pattern can be compared with the patterns registered before it. Like the mode changeover switch 7, the route changeover switch 17 is set to the solid-line side during recognition and to the dotted-line side during registration. 19 is a monitoring circuit that watches the final recognition result 16, the output of the transfer gate 15, during the registration operation. If a highly similar pattern has already been registered, that is, one whose similarity exceeds a reject threshold set lower than the similarity threshold used during recognition, it raises an alarm and prompts the user to re-enter the word.

Speech registration in this device proceeds as follows.

Suppose, for example, that 1 /ichi/ is input as the first word. Its compressed pattern 6 passes through the dotted-line side of the switch 7 and is written to No. 1 of the registered pattern memory 9. Because there is nothing to compare the first word against, no recognition operation is performed. Now suppose 7 /shichi/ is input as the second word. Its compressed pattern 6 is written to No. 2 of the registered pattern memory 9 and, at the same time, is transferred to the input pattern memory 8 via the transfer path 18 and the dotted-line side of the route changeover switch 17. The recognition processing circuit 12 computes the similarity between this input pattern 10 and the registered patterns 11 stored so far, which in this example is only the first word. The similarity between /ichi/ and /shichi/ is extremely high, so the determination result 13 is the first word /ichi/, the determination control signal 14 is also issued, and the recognition result 16, /ichi/, appears at the output of the transfer gate 15. In other words, although /shichi/ was spoken, the erroneous recognition result /ichi/ is output.

The recognition result 16 is fed to the monitoring circuit 19, which detects this misrecognition and issues an alarm, a display, or the like to prompt the user to re-enter the word. If the user then enters the second word as 7 /nana/, its similarity to the first word 1 /ichi/ is low, so the compressed pattern is retained in No. 2 of the registered pattern memory 9 and registration moves on to the third word. When the third word is registered, it is compared for similarity with the first and second words registered so far. In the same way, when the Nth word is input, its similarity to each of the already registered first to (N-1)th words is evaluated in turn.
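The registration flow just described can be sketched as follows. This is a simplified illustration, not the circuit itself: the candidate is stored only after it passes the check, whereas the device writes it to the memory slot first and overwrites it on re-input. The `Registry` class, the cosine-based `similarity` function, and the three-element patterns are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence

Pattern = Sequence[float]


def similarity(a: Pattern, b: Pattern) -> float:
    """Hypothetical similarity: cosine similarity scaled so that 100 is a perfect match."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 100.0 * dot / norm if norm else 0.0


class Registry:
    """Registered pattern memory plus the registration-time check."""

    def __init__(self, reject_threshold: float = 60.0) -> None:
        self.reject_threshold = reject_threshold
        self.patterns: Dict[str, List[float]] = {}  # insertion order = voice No. 1, 2, ...

    def conflicting_word(self, pattern: Pattern) -> Optional[str]:
        """Return the most similar registered word above the reject threshold, if any."""
        best_word, best_score = None, self.reject_threshold
        for word, stored in self.patterns.items():
            score = similarity(pattern, stored)
            if score > best_score:
                best_word, best_score = word, score
        return best_word

    def register(self, word: str, pattern: Pattern) -> Optional[str]:
        """Store the pattern unless it clashes; return the clashing word or None."""
        clash = self.conflicting_word(pattern)
        if clash is None:
            self.patterns[word] = list(pattern)
        return clash


registry = Registry()
print(registry.register("ichi", [1.0, 0.2, 0.1]))    # first word, nothing to compare: None
print(registry.register("shichi", [0.9, 0.3, 0.1]))  # too close to "ichi": prints ichi
print(registry.register("nana", [0.1, 0.2, 1.0]))    # low similarity: None
print(list(registry.patterns))                       # ['ichi', 'nana']
```

A non-`None` return value plays the role of the alarm from the monitoring circuit 19: the caller would ask the user to say the word differently and try again.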

In this way, defective speech patterns caused by confusable words that invite misrecognition, by speech contaminated with extraneous noise, or by indistinct pronunciation can be kept out of the registry before they are stored.

During recognition after registration has been completed in this way, the mode changeover switch 7 and the route changeover switch 17 are set to the solid-line side, and recognition proceeds as in the conventional device. At this time, however, the threshold of the recognition processing circuit 12 is set higher than the threshold used during registration. Because the quality of the registered patterns has been improved as described above, misrecognition can be kept extremely low and the recognition rate can be improved. Furthermore, because this embodiment sets the reject threshold for registration lower than the similarity threshold for recognition, similar patterns are strictly excluded from registration. For example, if a similarity of 100 denotes a perfect match and 90 is the threshold during recognition, setting the reject threshold for registration to 60 (anything with a similarity above 60 is not registered) blocks defective patterns before they are registered and improves the recognition rate.
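The effect of the two thresholds in this example can be tabulated with a few lines of code. The values 90 and 60 come from the text; the function name `outcome` and the sample scores are illustrative only.

```python
RECOGNITION_THRESHOLD = 90.0  # first threshold, used during recognition
REJECT_THRESHOLD = 60.0       # second, lower threshold, used during registration


def outcome(score: float) -> dict:
    """Describe what a given similarity score means in each mode."""
    return {
        "accepted_in_recognition": score > RECOGNITION_THRESHOLD,
        "blocked_at_registration": score > REJECT_THRESHOLD,
    }


for score in (50.0, 75.0, 95.0):
    print(score, outcome(score))
```

A score of 75 is the interesting case: it would not be accepted as a match during recognition, yet it is high enough to block registration, which leaves a margin between the registered words.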

The embodiment above performs the similarity comparison each time a word is input, but it is also possible to perform the similarity check after all words have been input and then update only the defective patterns among them. The speech pattern registration method of the present invention is also applicable to devices that do not compress the feature pattern and instead match patterns of unequal length using dynamic programming.
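Matching patterns of unequal length by dynamic programming is commonly done with dynamic time warping (DTW). The sketch below is a textbook DTW with Euclidean frame distance, offered as an assumption about what such a device might use; the patent names only the general technique. Note that it returns a distance (lower is more similar), and mapping it onto the 0-100 similarity scale is left open here.

```python
import math
from typing import Sequence

Frame = Sequence[float]


def dtw_distance(a: Sequence[Frame], b: Sequence[Frame]) -> float:
    """Accumulated distance along the best time alignment of two frame sequences."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("patterns must contain at least one frame")
    # cost[i][j] = best accumulated distance aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(
                cost[i - 1][j],      # stretch b: one frame of b covers more of a
                cost[i][j - 1],      # stretch a: one frame of a covers more of b
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


fast = [[0.0], [1.0], [2.0]]
slow = [[0.0], [0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(fast, slow))                    # 0.0: same shape, different speed
print(dtw_distance(fast, [[0.0], [1.0], [3.0]]))   # 1.0: final frames differ by 1
```

With a matcher like this, the registration-time check works the same way as before: a new word is rejected when it is too close to any word already stored.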

Furthermore, the present invention can be extended and applied to recognition devices for acoustic signals other than speech, image signals, and the like.

As explained above, in the speech recognition device according to the present invention, learning and recognition are carried out in parallel and the quality of the registered patterns is improved, so a highly reliable system with very little misrecognition can be built. Consequently, the need to interrupt recognition and re-register speech patterns, as the conventional device required, is reduced, and the system's operating rate can be raised significantly. In addition, the speech recognition device according to the present invention requires only the addition of a simple transfer circuit and a monitoring circuit to the conventional device, so it can be realized very easily and economically.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing an example of a conventional speech recognition device, and FIG. 2 is a block diagram showing an embodiment of the speech recognition device according to the present invention.

1 ... microphone, 2 ... speech waveform, 3 ... speech analysis/feature extraction circuit, 4 ... feature pattern, 7 ... mode changeover switch, 8 ... input pattern memory, 9 ... registered pattern memory, 10 ... input pattern, 11 ... registered pattern, 12 ... recognition processing circuit, 13 ... determination result, 14 ... determination control signal, 15 ... transfer gate, 16 ... final recognition result, 19 ... monitoring circuit, 20 ... transfer circuit. In the figures, the same reference numerals denote the same or corresponding parts.

Claims (1)

[Scope of Claims]

1. A speech recognition device comprising:
a speech analysis/feature extraction circuit that analyzes an input speech waveform from a microphone and extracts a feature pattern;
a mode changeover switch that routes the feature pattern from the speech analysis/feature extraction circuit differently for a registration operation and for recognition;
a registered pattern memory that stores the feature pattern sent from the mode changeover switch during a registration operation;
an input pattern memory that stores the feature pattern sent from the mode changeover switch during recognition;
a recognition processing circuit that sequentially compares the input pattern from the input pattern memory with the registered patterns from the registered pattern memory, outputs the registered word with the greatest similarity as a determination result, and outputs a control signal when that similarity exceeds a first threshold during recognition, or exceeds a second threshold set lower than the first threshold during a registration operation;
a transfer gate that, in response to the control signal, passes the determination result and outputs a final recognition result;
a transfer circuit that, during a registration operation, transfers the input pattern through the registered pattern memory to the input pattern memory so that the input pattern can be compared with all previously registered patterns; and
a monitoring circuit that monitors the final recognition result output by the transfer gate during the registration operation and, if a pattern exceeding a threshold set lower than the similarity threshold used during recognition has already been registered, issues an alarm and prompts re-input.
JP56028139A 1981-02-26 1981-02-26 Voice recognizer Granted JPS57141700A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP56028139A JPS57141700A (en) 1981-02-26 1981-02-26 Voice recognizer

Publications (2)

Publication Number Publication Date
JPS57141700A JPS57141700A (en) 1982-09-02
JPS6332394B2 true JPS6332394B2 (en) 1988-06-29

Family

ID=12240428

Family Applications (1)

Application Number Title Priority Date Filing Date
JP56028139A Granted JPS57141700A (en) 1981-02-26 1981-02-26 Voice recognizer

Country Status (1)

Country Link
JP (1) JPS57141700A (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS595291A (en) * 1982-07-01 1984-01-12 電子計算機基本技術研究組合 Standard pattern registration for voice recognition
JPS59124398A (en) * 1982-12-29 1984-07-18 富士通株式会社 Voice recognition equipment
JPS59154500A (en) * 1983-02-21 1984-09-03 松下電器産業株式会社 Voice recognition equipment
JPS6059395A (en) * 1983-09-12 1985-04-05 富士通株式会社 Voice standard feature pattern generation processing system
JPS6063899U (en) * 1983-10-05 1985-05-04 カシオ計算機株式会社 voice recognition device
JPS62148998A (en) * 1985-12-23 1987-07-02 松下電器産業株式会社 Voice recognition equipment
JP2555029B2 (en) * 1986-06-06 1996-11-20 株式会社日立製作所 Voice recognition device
JPH0830288A (en) * 1994-07-14 1996-02-02 Nec Robotics Eng Ltd Voice recognition device
WO2006109515A1 (en) * 2005-03-31 2006-10-19 Pioneer Corporation Operator recognition device, operator recognition method, and operator recognition program

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5651800A (en) * 1979-10-04 1981-05-09 Sanyo Electric Co Sound identifier

Also Published As

Publication number Publication date
JPS57141700A (en) 1982-09-02

Similar Documents

Publication Publication Date Title
US4694493A (en) Speaker verification system
JPS6332394B2 (en)
WO1994002936A1 (en) Voice recognition apparatus and method
JPS645320B2 (en)
JPS6239749B2 (en)
JPS599080B2 (en) Voice recognition method
JP2975808B2 (en) Voice recognition device
JPS6011897A (en) Voice recognition equipment
JPH03155599A (en) Speech recognition device
JPH0556519B2 (en)
JPH06250687A (en) Voice recognition device for unspecified speaker
JPS63281196A (en) Voice recognition equipment
JPS59154498A (en) Voice input unit
JPH05216493A (en) Operator assistance type speech recognition device
JP2712586B2 (en) Pattern matching method for word speech recognition device
JPS5962900A (en) Voice recognition system
JPS61165797A (en) Voice recognition equipment
JPH0711760B2 (en) Method for correcting standard parameters in voice recognition device
JPH02272495A (en) Voice recognizing device
JPH0331274B2 (en)
JPS60170899A (en) Monosyllabic voice registration system
JPS5917597A (en) Voice recognition system
JPH01302297A (en) Speaker recognition device
JPS6028696A (en) Voice input unit
JPS595291A (en) Standard pattern registration for voice recognition