JPH0756595B2

JPH0756595B2 - Interactive voice input/output device

Info

Publication number: JPH0756595B2
Application number: JP61145219A
Authority: JP
Inventors: 正明北野; 正宏浜田; 博之直野
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1986-06-20
Filing date: 1986-06-20
Publication date: 1995-06-14
Anticipated expiration: 2010-06-14
Also published as: JPS62299997A

Description

[Detailed Description of the Invention]

Field of Industrial Application

The present invention relates to an interactive voice input/output device used to give commands to various devices by voice.

Prior Art

In recent years, with advances in voice information processing such as voice recognition and voice synthesis, and in LSI technology, voice recognition devices and voice synthesis devices have begun to be used in industrial and consumer equipment. Interactive voice input/output devices have appeared that combine a voice recognition device with a voice synthesis device, so that commands are input and information is output through a dialogue between human and machine.

An example of the conventional interactive voice input/output device described above is explained below with reference to the drawings.

FIG. 3 is a block diagram of a conventional interactive voice input/output device.

In FIG. 3, reference numeral 5 denotes a sequence control unit, which checks the states of the voice recognition device 2, the voice synthesis device 3, and the controlled device 4 (each described below) and instructs each of them to start. Reference numeral 2 denotes the voice recognition device, which recognizes voice input and passes the recognition result to the sequence control unit 5. Reference numeral 3 denotes the voice synthesis device, which, on receiving a start command from the sequence control unit 5, outputs synthesized speech asking the user for voice input. Reference numeral 4 denotes the controlled device, to which the user's voice input is conveyed as a command by this interactive voice input/output device.

The operation of the interactive voice input/output device configured as described above is explained below with reference to FIGS. 3 and 4.

FIG. 4 is a flowchart of the operation of the sequence control unit 5.

First, when the controlled device 4 issues a request for a command to the sequence control unit 5 (step 11), the sequence control unit 5 has the voice synthesis device 3 output synthesized speech asking the user to speak a function name (step 12). When output of the synthesized speech ends (step 23), the sequence control unit 5 instructs the voice recognition device 2 to start (step 14), and the voice recognition device 2 waits for the user's voice input. When the user speaks, the voice recognition device 2 recognizes the utterance and passes the result to the sequence control unit 5 (step 15). The sequence control unit 5 then has the voice synthesis device 3 output synthesized speech asking the user to say whether this recognition result is right or wrong (step 16). When output of the synthesized speech ends, the sequence control unit 5 instructs the voice recognition device 2 to start (step 27), and the voice recognition device 2 waits for the user's voice input. When the user speaks, the voice recognition device 2 recognizes the utterance and passes the result to the sequence control unit 5 (step 18). If this recognition result is "yes", the sequence control unit 5 sends the command indicated by the recognized function name to the controlled device 4 (steps 20, 21), and the controlled device 4 operates. If the result is "no", the sequence control unit 5 returns to step 12 to have the user speak the function name again, and the same operation is repeated.
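The conventional sequence described above can be sketched as a small Python model. This is only an illustration under assumptions: the class name, the two callbacks, and the prompt strings are hypothetical and do not come from the patent. The point it shows is that the recognizer is started only after each prompt has finished, for the function-name prompt and the confirmation prompt alike.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ConventionalSequenceController:
    """Hypothetical model of sequence control unit 5 (all names are illustrative)."""
    synthesize: Callable[[str], None]  # assumed to block until the prompt has finished
    recognize: Callable[[], str]       # assumed to start the recognizer and return its result
    log: List[str] = field(default_factory=list)

    def run(self) -> str:
        while True:
            # Ask for a function name; the recognizer starts after the prompt ends.
            self.synthesize("Which function?")
            self.log.append("recognizer started after prompt")
            name = self.recognize()
            # Ask for yes/no confirmation; again the recognizer starts after the prompt ends.
            self.synthesize(f"Did you say {name}?")
            self.log.append("recognizer started after prompt")
            if self.recognize() == "yes":
                return name  # this result would be sent to controlled device 4
            # On "no", loop back and ask for the function name again.
```

With scripted answers such as "play", "no", "stop", "yes", the loop runs twice and returns "stop".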

Problems to Be Solved by the Invention

With the above configuration, however, users usually finish listening to the synthesized speech before speaking when the input requires them to make a choice, whereas when confirming a recognition result they tend to speak hastily without waiting for the synthesized speech to end. The utterance is then not captured correctly by the voice recognition device, and misrecognition is frequent.
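The clipping problem can be made concrete with a simple timing model. The sketch below is an assumption for illustration, not something the patent specifies: it treats an utterance as a time interval and computes what fraction of it falls after the recognizer starts listening. The function name and all numbers are hypothetical.

```python
def captured_fraction(listen_start: float, speech_start: float, speech_end: float) -> float:
    """Fraction of an utterance that falls after the recognizer starts listening."""
    if speech_end <= speech_start:
        raise ValueError("speech_end must be later than speech_start")
    heard_from = max(listen_start, speech_start)
    heard = max(0.0, speech_end - heard_from)
    return heard / (speech_end - speech_start)


# Assumed numbers: the confirmation prompt ends at t = 2.0 s and the recognizer
# starts at that moment, but the user already begins answering at t = 1.8 s.
prompt_end = 2.0
fraction = captured_fraction(listen_start=prompt_end, speech_start=1.8, speech_end=2.3)
```

Under these assumed timings only part of the answer reaches the recognizer, which is the situation the passage above identifies as a cause of misrecognition.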

In view of the above problem, the present invention provides an interactive voice input/output device with a high recognition rate, achieved through high-quality voice input that accommodates two user habits: users speak after a comparatively long pause when the recognition requires them to make a choice, and they speak hastily when the recognition confirms something they have already input by voice.

Means for Solving the Problems

To solve the above problem, the interactive voice input/output device of the present invention comprises a time control unit, a voice recognition device controlled by it, and a voice synthesis device. When the recognition requires the user to make a choice, the time control unit starts the voice recognition device after the output of the voice synthesis device has ended. When the recognition confirms something the user has already input by voice, the time control unit starts the voice recognition device immediately before the output of the voice synthesis device ends.

Operation

With the above configuration, the time control unit starts the voice recognition device after the output of the voice synthesis device has ended when the recognition requires the user to make a choice. For recognition that confirms something the user has already input by voice, the time control unit stores in advance the duration of the synthesized speech output by the voice synthesis device and starts the voice recognition device immediately before that output ends. The voice of a user who speaks hastily, without waiting for the synthesized speech to end, is therefore also input correctly.
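The timing rule described here can be sketched as a single function. This is a minimal sketch under stated assumptions: the enum, the function name, and the `lead` parameter are hypothetical, and the default lead of 0.3 s is an arbitrary illustrative value, since the text only says the recognizer starts "immediately before" the output ends.

```python
from enum import Enum


class PromptKind(Enum):
    SELECTION = "selection"        # the user must choose something, e.g. a function name
    CONFIRMATION = "confirmation"  # the user confirms something already spoken


def recognizer_start_offset(kind: PromptKind, prompt_duration: float, lead: float = 0.3) -> float:
    """Seconds after the prompt begins at which to start the recognizer.

    `lead` (how far before the end of the prompt to start) is an assumed
    parameter; the source gives no numeric value for it.
    """
    if prompt_duration < 0 or lead < 0:
        raise ValueError("durations must be non-negative")
    if kind is PromptKind.SELECTION:
        return prompt_duration               # start once the prompt has finished
    return max(0.0, prompt_duration - lead)  # start slightly before the prompt ends
```

For a 2.0 s prompt and a 0.5 s lead, a selection prompt yields an offset of 2.0 s and a confirmation prompt yields 1.5 s. Because the offset depends on the prompt duration, the duration has to be known before playback starts, which matches the text's point that it is stored in advance.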

Embodiment

An interactive voice input/output device according to one embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram of an interactive voice input/output device according to one embodiment of the present invention.

In FIG. 1, reference numeral 1 denotes a time control unit. When the recognition requires the user to make a choice, it starts the voice recognition device 2 after the output of the voice synthesis device 3 has ended. When the recognition confirms something the user has already input by voice, it stores in advance the duration of the synthesized speech output by the voice synthesis device 3 and starts the voice recognition device 2 immediately before the output of the voice synthesis device 3 ends. The voice recognition device 2, the voice synthesis device 3, and the controlled device 4 are the same as in the conventional example.

The operation of the interactive voice input/output device configured as described above is explained below with reference to FIGS. 1 and 2.

FIG. 2 is a flowchart of the operation of the time control unit 1. First, when the controlled device 4 issues a request for a command to the time control unit 1 (step 11), the time control unit 1 has the voice synthesis device 3 output synthesized speech asking the user to speak a function name (step 12). When output of the synthesized speech ends (step 13), the time control unit 1 instructs the voice recognition device 2 to start (step 14), and the voice recognition device 2 waits for the user's voice input. When the user speaks, the voice recognition device 2 recognizes the utterance and passes the result to the time control unit 1 (step 15). The time control unit 1 then has the voice synthesis device 3 output synthesized speech asking the user to say whether this recognition result is right or wrong (step 16). The time control unit then waits for a time slightly shorter than the previously stored duration of the synthesized speech (step 17) and starts the voice recognition device 2 immediately before output of the synthesized speech ends (step 18). When the user speaks, the voice recognition device 2 recognizes the utterance and passes the result to the time control unit 1 (step 19). If this recognition result is "yes", the time control unit 1 sends the command indicated by the recognized function name to the controlled device 4 (steps 19, 20), and the controlled device 4 operates. If the result is "no", the time control unit 1 returns to step 12 to have the user speak the function name again, and the same operation is repeated (steps 12 to 20).
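The flow of the time control unit can be traced on a virtual clock. The following is a simplified sketch, not the patent's implementation: the prompt names, the stored durations, the `LEAD` value, and the function name are all assumptions, and the time the user spends speaking is ignored so that only the prompt and recognizer timing is visible.

```python
from typing import Callable, Dict, List, Tuple

# Stored prompt durations in seconds (illustrative values, not from the source).
PROMPT_DURATIONS: Dict[str, float] = {"ask_function": 2.0, "ask_confirm": 1.5}
LEAD = 0.3  # assumed lead time before the confirmation prompt ends


def run_time_control(answers: Callable[[], str]) -> Tuple[str, List[Tuple[float, str]]]:
    """Simulate steps 12-20 on a virtual clock; return (command, event log).

    Simplification: user speaking time is not added to the clock.
    """
    clock = 0.0
    events: List[Tuple[float, str]] = []
    while True:
        # Steps 12-14: function-name prompt; recognizer starts after it ends.
        events.append((clock, "synth ask_function"))
        clock += PROMPT_DURATIONS["ask_function"]
        events.append((clock, "recognizer start"))
        name = answers()  # step 15: recognized function name
        # Steps 16-18: confirmation prompt; recognizer starts LEAD seconds early.
        events.append((clock, "synth ask_confirm"))
        clock += PROMPT_DURATIONS["ask_confirm"] - LEAD  # step 17: wait
        events.append((clock, "recognizer start"))       # step 18
        reply = answers()  # step 19: recognized yes/no
        clock += LEAD      # the confirmation prompt finishes playing
        if reply == "yes":
            return name, events  # the command would go to controlled device 4
```

In the event log, the second "recognizer start" of each round falls `LEAD` seconds before the confirmation prompt would finish, while the first falls exactly at the end of the function-name prompt.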

As described above, this embodiment comprises a time control unit 1, a voice recognition device 2 controlled by it, and a voice synthesis device 3. The time control unit 1 starts the voice synthesis device 3, waits for a time slightly shorter than the previously stored duration of the synthesized speech, and starts the voice recognition device 2 immediately before output of the synthesized speech ends. As a result, the voice of a user who speaks hastily, without waiting for the synthesized speech to end, is also input correctly. When the recognition requires the user to make a choice, the time control unit 1 starts the voice recognition device 2 only after the voice synthesis device 3 has finished outputting the synthesized speech, so noise is rarely picked up and the user's voice is input correctly. Because the user's voice is input correctly in both cases, an interactive voice input/output device with a high recognition rate can be realized.

Effects of the Invention

As described above, according to the present invention, the user's voice can be input correctly even when users often speak hastily, noise is rarely picked up, and an interactive voice input/output device with a high recognition rate can be realized.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of an interactive voice input/output device according to one embodiment of the present invention; FIG. 2 is a flowchart showing the control procedure of the time control unit; FIG. 3 is a block diagram of a conventional interactive voice input/output device; and FIG. 4 is a flowchart of the sequence control unit of the conventional interactive voice input/output device.

1: time control unit; 2: voice recognition device; 3: voice synthesis device; 4: controlled device.

Claims

[Claims]

1. An interactive voice input/output device comprising: a voice recognition device; a voice synthesis device that instructs a user to provide voice input to the voice recognition device; and a time control unit that performs control such that, for recognition in which the user needs to make a choice, the voice recognition device is started after output of the voice synthesis device has ended, and, for recognition that confirms something the user has already input by voice, the voice recognition device is started immediately before output of the voice synthesis device ends.