JPH02250095A

JPH02250095A - Speech recognition system

Info

Publication number: JPH02250095A
Application number: JP1070918A
Authority: JP
Inventors: Hiromi Shibuya; 渋谷　浩洋; Munekazu Maeda; 宗万前田; Yasutomo Onishi; 大西　康友
Original assignee: Matsushita Refrigeration Co
Current assignee: Panasonic Holdings Corp
Priority date: 1989-03-23
Filing date: 1989-03-23
Publication date: 1990-10-05

Abstract

PURPOSE: To achieve a high recognition rate and to keep the speaker from feeling uneasy, by having an utterance start notification means inform the speaker of the period in which speech can be recognized, so that the speaker does not speak during periods in which speech cannot be recognized, such as the utterance guidance period. CONSTITUTION: The utterance start notification means 9 informs the speaker of the speech recognizable period; it is, for example, a lamp provided in a conspicuous place on the front of a cup vending machine. The utterance start notification signal is OFF during speech unrecognizable periods, such as the utterance guidance period and the pattern selection processing period of the cup vending machine, and ON during the speech recognizable period. The speaker therefore speaks while the signal is ON and no longer speaks by mistake during a speech unrecognizable period.

Description

Detailed Description of the Invention

Field of Industrial Application

The present invention relates to a speech recognition system that recognizes word speech input by specific speakers and unspecified speakers and performs various processes according to that speech. In particular, it relates to recognition for unspecified speakers.

Prior Art

Conventionally, a speech recognition system for vending machines, such as a vending machine for cup beverages (hereinafter simply called a cup vending machine), operates as shown in FIG. 4. First, the speech that the user inputs through the microphone 1 is analyzed by the speech analysis means 2 to extract a speech pattern. The analysis uses band-pass filter (BPF) analysis with a bank of band-pass filters: the analysis results are sampled along the time axis and the frequency axis, and the intensity is processed digitally. The standard pattern storage means 3 stores, as standard patterns, the speech patterns of a plurality of discrete words uttered by a large number of unspecified speakers, extracted by the same method. The words stored as standard patterns are the names of the flavors sold by the cup vending machine (product names of beverages such as coffee and juice) and several response words (yes, no, hot, iced, and so on).
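As a rough illustration of the analysis step described above, the following sketch turns a waveform into a time-by-frequency intensity pattern. It is a stand-in under stated assumptions, not the patent's circuit: a DFT per frame, with bins pooled into bands, replaces the band-pass filter bank, and the frame length, the number of bands, and the name `band_energy_pattern` are all hypothetical.

```python
import cmath
import math


def band_energy_pattern(samples, frame_len=64, num_bands=4):
    """Return one list of band energies per frame (time axis x frequency axis).

    Assumption: a plain DFT with bins pooled into equal-width bands stands in
    for the band-pass filter bank; all parameters are illustrative.
    """
    half = frame_len // 2
    bins_per_band = half // num_bands
    pattern = []
    # Sample along the time axis: non-overlapping frames.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Power spectrum of the frame (bins 0 .. half - 1).
        power = []
        for k in range(half):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            power.append(abs(acc) ** 2)
        # Sample along the frequency axis: pool bins into bands.
        bands = [sum(power[b * bins_per_band:(b + 1) * bins_per_band])
                 for b in range(num_bands)]
        pattern.append(bands)
    return pattern


# Usage: a pure tone at DFT bin 20 falls into band 2 (bins 16..23).
tone = [math.sin(2 * math.pi * 20 * n / 64) for n in range(128)]
pattern = band_energy_pattern(tone)
```

Each row of `pattern` is one time frame and each column one frequency band, which is the kind of two-axis pattern that the standard pattern storage means would hold for every word.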

Next, the standard pattern selection means 4 recognizes the speech by selecting, from among the standard patterns, the one closest to the input pattern using the DP (dynamic programming) matching method. Dynamic programming is a mathematical programming technique proposed in 1957 by Bellman in the United States, and it is applied to the optimization of multistage decision processes. At each stage a decision (control) is made that transforms the state, and a function that evaluates how good or bad the process leading to the goal is gets maximized or minimized.

When the speech recognition system serves a specific speaker, the speech patterns of the recognition words uttered by that speaker are registered in the standard pattern storage means 3. When it serves unspecified speakers, several representative patterns are registered from among the speech patterns of the recognition words uttered by a large number of unspecified speakers. The utterance guidance means 5 consists of a speech synthesis means and, as directed by the control means 6 described later, prompts the user by voice to speak. The flavor names are clearly shown on a panel or the like on the front of the cup vending machine, and the user chooses one of them and says it. The control means 6 instructs the utterance guidance means 5 to output guidance speech as the processing requires, recognizes the word the user uttered from the standard pattern selected by the standard pattern selection means 4, and controls the subsequent operation of the cup vending machine according to the recognition result. Reference numeral 7 denotes a coin receiving means that accepts coins and returns change, and 8 denotes a beverage dispensing means that pours the selected flavor into a cup and delivers it.
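The DP matching step described above can be sketched as a dynamic-time-warping style alignment. The patent does not give the local distance or any path constraints, so the city-block frame distance, the three allowed predecessor moves, and the names `dp_distance` and `select_standard_pattern` are assumptions made only for illustration.

```python
def dp_distance(a, b):
    """Dynamic-programming (DTW-style) distance between two frame sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local distance: city-block distance between frames.
            local = sum(abs(x - y) for x, y in zip(a[i - 1], b[j - 1]))
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def select_standard_pattern(input_pattern, standard_patterns):
    """Return (word, distance) for the stored pattern closest to the input."""
    return min(((word, dp_distance(input_pattern, ref))
                for word, ref in standard_patterns.items()),
               key=lambda pair: pair[1])


# Toy standard patterns: each word is a short sequence of two-band frames.
standards = {
    "coffee": [[1, 0], [1, 1], [0, 1]],
    "juice": [[0, 1], [0, 0], [1, 0]],
}
# Same shape as "coffee", but with the first frame held twice as long.
spoken = [[1, 0], [1, 0], [1, 1], [0, 1]]
word, dist = select_standard_pattern(spoken, standards)
```

Because the alignment may repeat a frame, the slower utterance still matches "coffee" at zero cost, which is the reason time-warping alignment suits isolated-word matching.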

Problems to Be Solved by the Invention

With the method described above, however, the speaker cannot clearly tell when speech can be recognized. The speaker may, for example, say "yes" during the utterance guidance period of the cup vending machine (a period in which speech cannot be recognized), so the machine fails to recognize the speech correctly and the recognition rate drops. In addition, not knowing when to speak makes the speaker feel uneasy.

The present invention solves these conventional problems. Its object is to provide a speech recognition system that has a high recognition rate and does not make the speaker feel uneasy, by clearly notifying the speaker of the period during which speech recognition is possible.

Means for Solving the Problems

To achieve this object, the speech recognition system of the present invention comprises: a standard pattern storage means that stores a group of standard patterns for a plurality of discrete spoken words; a speech analysis means that analyzes the speaker's speech and extracts a speech pattern; a standard pattern selection means that selects, from the group of standard patterns, the standard pattern closest to the speech pattern extracted by the speech analysis means; an utterance start notification means that notifies the speaker of the speech recognizable period; and an utterance guidance means that guides the speaker to utter a word.

Operation

With this configuration, the utterance start notification means notifies the speaker of the speech recognizable period, so the speaker does not speak during speech unrecognizable periods such as the utterance guidance period of the cup vending machine. This yields a speech recognition system that has a high recognition rate and does not make the speaker feel uneasy.

Embodiment

An embodiment of the present invention is described below with reference to the drawings.

In this embodiment, a speech recognition system for unspecified speakers is applied to a cup vending machine. Components that are the same as in the conventional example carry the same reference numerals, and their description is omitted. FIG. 1 is a functional block diagram of the speech recognition system in the embodiment of the present invention. Reference numeral 9 denotes the utterance start notification means, which notifies the speaker of the speech recognizable period; it is, for example, a lamp provided in a conspicuous place on the front of the cup vending machine. FIG. 2 illustrates the operation of the utterance start notification means 9. As FIG. 2 shows, the utterance start notification signal is OFF during speech unrecognizable periods, such as the utterance guidance period and the pattern selection processing period of the cup vending machine, and ON during the speech recognizable period. The speaker therefore speaks while the utterance start notification signal is ON and, unlike in the conventional system, no longer speaks by mistake during a speech unrecognizable period.
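The ON/OFF behavior of the notification signal described above amounts to a function of the machine's current phase. The sketch below is only an illustration of that mapping; the `Phase` names and the sample timeline are assumptions, since the patent describes the periods only in prose and in FIG. 2.

```python
from enum import Enum


class Phase(Enum):
    GUIDANCE = "utterance guidance"      # machine is speaking a prompt
    LISTENING = "speech recognizable"    # microphone input is accepted
    SELECTING = "pattern selection"      # matching is in progress


def notification_signal(phase):
    """Return True (lamp ON) only while speech can be recognized."""
    return phase is Phase.LISTENING


# Hypothetical timeline for two prompt-and-answer exchanges.
timeline = [Phase.GUIDANCE, Phase.LISTENING, Phase.SELECTING,
            Phase.GUIDANCE, Phase.LISTENING, Phase.SELECTING]
lamp = ["ON" if notification_signal(p) else "OFF" for p in timeline]
```

Deriving the lamp state from the phase, rather than toggling it by hand in several places, keeps the signal from ever being ON while a prompt is playing or matching is running.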

The sales operation of the speech recognition system for a cup vending machine configured as above is described with reference to the flowchart of FIG. 3. First, step 201 determines whether a coin has been inserted into the coin receiving means 7; once a coin is inserted, processing proceeds to step 202. In step 202, the utterance guidance means 5 prompts the user with "Welcome, what would you like?" Next, in step 203, the utterance start notification means 9 issues the utterance start notification signal. On receiving the signal the speaker speaks, and in step 204 the standard pattern selection means 4 selects, from the standard patterns stored in the standard pattern storage means 3, the one closest to the uttered speech pattern and thereby recognizes the flavor name. Step 205 determines whether the recognition result of step 204 is acceptable. If it is rejected, processing proceeds to step 206, where the utterance guidance means 5 prompts the user with "Please answer again," and then returns to step 203.

If the result is not rejected, processing proceeds to step 207.

Step 207 branches the subsequent operation according to the flavor recognized in step 204. This embodiment assumes that coffee was recognized; the operation for other flavor names is the same as for coffee, so its description is omitted. In step 208, the utterance guidance means 5 confirms with "Coffee, is that right?" Then, in step 209, the utterance start notification means 9 issues the utterance start notification signal. On receiving the signal the speaker speaks, and step 210 recognizes a "yes" or "no" answer in the same way as the flavor name. Step 211 determines whether the recognition result of step 210 is acceptable; if it is rejected, processing returns to step 208, and otherwise it proceeds to step 212. In step 212, if the answer recognized in step 210 is "yes," processing proceeds to step 213; if it is "no," processing returns to step 206. In step 213, the control means 6 uses the beverage dispensing means 8 to pour the coffee into a cup and deliver it. In step 214, if change is due, the coin receiving means 7 returns it. Finally, in step 215, the utterance guidance means 5 says "Thank you," and the sequence ends.
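The flow of steps 202 through 215 can be sketched as a small loop. This is a simplified model, not the patented control means: recognition results are supplied as a prepared list (with `None` standing for a reject), coin detection in step 201 is assumed to have already happened, any answer other than "yes" is treated as "no", and `run_sale` and the log strings are hypothetical names.

```python
def run_sale(utterances, change_due=0):
    """Walk through steps 202-215 for one sale and return an event log.

    `utterances` is an iterable of recognition results; None models a reject.
    """
    heard = iter(utterances)
    log = []

    def listen():
        log.append("lamp ON")     # steps 203 / 209: notification signal ON
        result = next(heard)      # steps 204 / 210: recognition
        log.append("lamp OFF")    # signal OFF while the result is processed
        return result

    log.append("say: Welcome, what would you like?")       # step 202
    while True:
        flavor = listen()
        if flavor is None:                                  # step 205: reject
            log.append("say: Please answer again")          # step 206
            continue
        while True:
            log.append(f"say: {flavor}, is that right?")    # step 208
            answer = listen()
            if answer is not None:                          # step 211
                break
        if answer == "yes":                                 # step 212
            break
        log.append("say: Please answer again")              # step 206
    log.append(f"dispense: {flavor}")                       # step 213
    if change_due > 0:
        log.append(f"refund: {change_due}")                 # step 214
    log.append("say: Thank you")                            # step 215
    return log


# Usage: one reject, one wrong confirmation, then a successful order.
log = run_sale([None, "coffee", "no", "coffee", "yes"], change_due=20)
```

In this model the lamp is ON only inside `listen()`, so every prompt is logged while the signal is OFF, mirroring the separation between guidance periods and recognizable periods.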

As described above, in this embodiment the utterance start notification means notifies the speaker of the speech recognizable period. The speaker therefore no longer says "yes" or the like during speech unrecognizable periods, such as the utterance guidance period of the speech recognition system, and clearly knows when to speak. The result is a speech recognition system that has a high recognition rate and does not make the speaker feel uneasy, which is a significant benefit.

Effects of the Invention

As described above, the speech recognition system of the present invention is provided with: a standard pattern storage means that stores a group of standard patterns for a plurality of discrete spoken words; a speech analysis means that analyzes the speaker's speech and extracts a speech pattern; a standard pattern selection means that selects, from the group of standard patterns, the standard pattern closest to the speech pattern extracted by the speech analysis means; an utterance start notification means that notifies the speaker of the speech recognizable period; and an utterance guidance means that guides the speaker to utter a word. Because the utterance start notification means notifies the speaker of the speech recognizable period, the speaker does not speak during speech unrecognizable periods such as the utterance guidance period. This yields a speech recognition system that has a high recognition rate and does not make the speaker feel uneasy.

Brief Description of the Drawings

FIG. 1 is a functional block diagram of a speech recognition system according to an embodiment of the present invention. FIG. 2 illustrates the operation of the utterance start notification means of FIG. 1. FIG. 3 is a flowchart showing an example of the operation of the speech recognition system in the embodiment of the present invention. FIG. 4 is a functional block diagram of a conventional speech recognition system.

2: speech analysis means; 3: standard pattern storage means; 4: standard pattern selection means; 5: utterance guidance means; 9: utterance start notification means.

Agent: Patent attorney Shigetaka Awano and one other

Claims

A speech recognition system comprising: a standard pattern storage means that stores a group of standard patterns for a plurality of discrete spoken words; a speech analysis means that analyzes the speech of a speaker and extracts a speech pattern; a standard pattern selection means that selects, from the group of standard patterns, the standard pattern closest to the speech pattern extracted by the speech analysis means; an utterance start notification means that notifies the speaker of the speech recognizable period; and an utterance guidance means that guides the speaker to utter a word.