JPH0673800U

JPH0673800U - Voice recognizer

Info

Publication number: JPH0673800U
Application number: JP015272U
Authority: JP
Inventors: 博之藤本; 和也佐古; 昇治藤本
Original assignee: Denso Ten Ltd
Current assignee: Denso Ten Ltd
Priority date: 1993-03-30
Filing date: 1993-03-30
Publication date: 1994-10-18

Abstract

(57) [Abstract]
[Purpose] To provide a voice recognition device that responds by voice to a speaker's utterance and that, by supporting voice recognition and voice response in a plurality of languages from the outset, is easier to use, highly versatile, and inexpensive.
[Configuration] The device has at least a voice input unit 1, a used language determination unit 2 that determines the language from the incoming voice signal, a multi-language voice recognition unit 3 that determines the content of the input voice signal, a control unit 4 that controls equipment according to the voice recognition result, and a multi-language response signal generation unit 5 for voice response. The determination signal output from the used language determination unit 2 is input to the voice recognition unit 3 to select the recognition target language, and is also input to the response signal generation unit 5 to select the voice response language.

Description

[Detailed description of the device]

[0001]

[Industrial field of application]

The present device relates to a voice recognition device that responds by voice to a speaker's utterance. If the driver of an automobile or the like manually operates equipment such as an air conditioner, audio device, car navigation system, car phone, or auto cruise control while driving, safe driving is impaired and an accident may result. To improve operability, it is therefore desirable that operation of such equipment be instructed by voice, and that the equipment respond by voice to the instructed operation so that the speaker can confirm it.

[0002]

[Prior art]

Voice recognition devices, not only those for automobiles, recognize the speaker's voice and operate equipment according to the recognized content. The equipment then reports the result to the speaker by emitting a signal tone.

[0003] FIG. 5 is a flowchart showing the response operation of a conventional voice recognition device. When operating audio equipment, for example, the speaker first utters "play" in step S1 to instruct playback by voice. In step S2 the voice signal is input via a microphone, and in step S3 it is preprocessed to make it suitable for voice recognition.

[0004] Next, as shown in step S4, voice recognition is performed in the voice recognition unit, and the recognition result is judged in step S5. If the word "play" has been recognized, the speaker is notified of this by a buzzer sound or the like in step S6, and the play operation starts in step S7.

[0005] If, on the other hand, the speaker's words cannot be recognized, a signal tone to that effect is output in step S8 to prompt the speaker to input the voice again.

[0006]

[Problems to be solved by the device]

However, a response by buzzer sound or the like can express little more than two states: whether or not the voice was recognized. An automobile or the like is equipped with many devices and requires many switches to be operated, so the operations performed by those switches are correspondingly diverse. It is therefore desirable that the response be not a buzzer sound but a specific spoken response that matches the content of the operation instruction.

[0007] For example, when the speaker utters "play", the device recognizes the instruction and, if playback is possible, responds by voice with something like "Playing the cassette tape"; if the voice cannot be recognized, it responds with something like "Please speak again". With such voice response (talkback), equipment can be operated interactively and more accurately, and operability improves.

[0008] Likewise, operation becomes simpler if the device can tell the speaker by voice the state of the temperature, airflow, and so on for an air conditioner, or the current speed and so on for auto cruise control.

[0009] Voice recognition is normally performed in Japanese, but for export vehicles it must be possible in the language of the destination country. Likewise, when the device responds by voice as in the present device, vehicles for export must respond in the language of the destination country.

[0010] However, to make voice recognition and talkback available in the destination language, such as English, for export vehicles only, a separate recognition device that recognizes English and also talks back in English must be developed, which raises cost.

[0011] Moreover, as internationalization advances, the same vehicle, a rental car for instance, may be driven by a Japanese driver at one time and by a foreign driver at another. In such cases, a device that supports voice recognition and talkback only in Japanese or only in English cannot serve its intended purpose and is inconvenient.

[0012] In view of these problems, the technical object of the present device is to improve usability and realize a highly versatile, inexpensive device by supporting voice recognition and voice response in a plurality of languages from the outset.

[0013]

[Means for solving the problems]

FIG. 1 is a block diagram illustrating the basic configuration of the voice recognition device according to the present device. The voice recognition device of claim 1 is a voice recognition device that, when a signal of the voice uttered by a speaker is input, outputs a talkback signal corresponding to that voice signal and responds by voice. As shown in FIG. 1(1), it has at least a voice input unit 1, a used language determination unit 2 that determines the language from the incoming voice signal, a voice recognition unit 3 that determines the content of the input voice signal, a control unit 4 that controls equipment according to the voice recognition result, and a response signal generation unit 5 for voice response.

[0014] The voice recognition unit 3 can recognize a plurality of languages from the input voice signal, and the response signal generation unit 5 can respond by voice in a plurality of languages. The determination signal output from the used language determination unit 2 is input to the voice recognition unit 3 to select the recognition target language, and is also input to the response signal generation unit 5 to select the voice response language.

[0015] The voice recognition device of claim 2 is a voice recognition device that, when a signal of the voice uttered by a speaker is input, outputs a talkback signal corresponding to that voice signal and responds by voice. As shown in FIG. 1(2), it has at least a voice input unit 1, a voice recognition unit 3 that determines the content of the input voice signal, a control unit 4 that controls equipment according to the voice recognition result, a response signal generation unit 5 for voice response, and a used language setting unit 6.

[0016] The voice recognition unit 3 can recognize a plurality of languages from the input voice signal, and the response signal generation unit 5 can respond by voice in a plurality of languages. At least one of the voice recognition unit 3 and the response signal generation unit 5 is configured so that the recognition target language or the voice response language is selected according to a selection signal output from the used language setting unit 6.

[0017]

[Action]

The voice recognition unit 3 of the present device has a voice recognition function for each language, and the determination signal from the used language determination unit 2 activates the recognition function for the determined language. It can therefore recognize a plurality of languages from the input voice signal.

[0018] The response signal generation unit 5 likewise has, for each language, a voice response function corresponding to the operation commands. The used language determination unit 2 automatically determines the language from the incoming voice signal and inputs the determination signal to the voice recognition unit 3 and the response signal generation unit 5. Once the language to be used is designated in this way, the voice recognition unit 3 selects only the designated language for recognition, and the response signal generation unit 5 selects the designated language for the voice response.
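
The way a single determination signal selects the language in both units can be pictured with a short Python sketch. It is an illustration only, under stated assumptions: text strings stand in for voice signals, and the language codes, command identifiers, table contents, and class and function names are all hypothetical rather than taken from the device.

```python
from typing import Optional

# Hypothetical per-language tables. Strings stand in for voice signals.
RECOGNITION_TABLES = {
    "ja": {"テープ・プレイ": "TAPE_PLAY", "テープ・ストップ": "TAPE_STOP"},
    "en": {"tape play": "TAPE_PLAY", "tape stop": "TAPE_STOP"},
}
RESPONSE_TABLES = {
    "ja": {"TAPE_PLAY": "テープをプレイします", "TAPE_STOP": "テープを停止します"},
    "en": {"TAPE_PLAY": "Playing the tape", "TAPE_STOP": "Stopping the tape"},
}


class VoiceRecognitionUnit:
    """Stands in for unit 3: one recognition table per language."""

    def __init__(self) -> None:
        self.language: Optional[str] = None

    def select_language(self, determination_signal: str) -> None:
        self.language = determination_signal

    def recognize(self, utterance: str) -> Optional[str]:
        # Only the table of the selected language is consulted.
        # Assumes select_language() has been called first.
        return RECOGNITION_TABLES[self.language].get(utterance)


class ResponseSignalGenerationUnit:
    """Stands in for unit 5: one response table per language."""

    def __init__(self) -> None:
        self.language: Optional[str] = None

    def select_language(self, determination_signal: str) -> None:
        self.language = determination_signal

    def respond(self, command: str) -> str:
        return RESPONSE_TABLES[self.language][command]


def apply_determination(signal: str,
                        recognizer: VoiceRecognitionUnit,
                        responder: ResponseSignalGenerationUnit) -> None:
    """Fan the one determination signal from unit 2 out to units 3 and 5."""
    recognizer.select_language(signal)
    responder.select_language(signal)


recognizer = VoiceRecognitionUnit()
responder = ResponseSignalGenerationUnit()
apply_determination("en", recognizer, responder)
print(responder.respond(recognizer.recognize("tape play")))  # Playing the tape
```

Because both units receive the same signal, the recognition language and the response language cannot drift apart in this sketch.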

[0019] Thus the language is determined simply from the voice input at the voice input unit 1, and voice recognition and voice response are performed in that language. The device can therefore recognize and respond in a plurality of languages, which is convenient for export equipment and for use by people from different countries. In addition, the device need not be built specifically for the language of the export destination; this versatility allows it to be realized at low cost.

[0020] The used language determination unit 2 may work in either of two ways: it may determine the language from the first voice signal input in the initial state and thereafter treat only that language as valid, or it may determine the language each time a voice signal is input and input the determination signal to the voice recognition unit 3 and the response signal generation unit 5 to designate the language to be used. The present device covers both.
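
The two determination methods differ only in whether the first result is held. A minimal Python sketch follows; the detector `toy_detect`, which inspects a text transcript for kana or Latin letters, is a hypothetical stand-in for real language determination on a voice signal, and keeping the previous language when nothing is detected is an assumption of this sketch.

```python
from typing import Callable, Optional


def toy_detect(text: str) -> Optional[str]:
    """Toy stand-in for language determination, working on a text transcript."""
    if any("\u3040" <= ch <= "\u30ff" for ch in text):  # hiragana or katakana
        return "ja"
    if any(ch.isascii() and ch.isalpha() for ch in text):
        return "en"
    return None


class UsedLanguageDeterminer:
    """Stands in for unit 2, with both methods selectable by a flag."""

    def __init__(self, detect: Callable[[str], Optional[str]],
                 latch_first: bool) -> None:
        self.detect = detect
        self.latch_first = latch_first
        self.current: Optional[str] = None

    def update(self, utterance: str) -> Optional[str]:
        if self.latch_first and self.current is not None:
            # Method 1: the first determined language stays valid.
            return self.current
        # Method 2 (or first utterance): determine the language now.
        detected = self.detect(utterance)
        if detected is not None:
            self.current = detected  # assumption: keep old value if undetected
        return self.current


latched = UsedLanguageDeterminer(toy_detect, latch_first=True)
per_utterance = UsedLanguageDeterminer(toy_detect, latch_first=False)
for utterance in ["こんにちは", "hello"]:
    print(latched.update(utterance), per_utterance.update(utterance))
# prints "ja ja", then "ja en"
```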

[0021] In claim 2 as well, the voice recognition unit 3 can recognize a plurality of languages from the input voice signal, and the response signal generation unit 5 can respond by voice in a plurality of languages. The setting signal output from the used language setting unit 6 is input to the voice recognition unit 3 to select the recognition target language, or is input to the response signal generation unit 5 to select the voice response language.

[0022] Accordingly, although the language must be selected manually in advance, the device supports a plurality of languages and so provides the same effects as claim 1. Furthermore, once the language is set, no further operation is needed unless the language is to be changed, which suits equipment whose language is changed infrequently; and because the configuration is simpler, the device can be realized at even lower cost.

[0023]

[Embodiment]

Next, an embodiment is used to describe how the voice recognition device according to the present device is embodied in practice. FIG. 2 is a block diagram showing an example in which the voice recognition device of claim 1 is applied to an automobile. In front of the front seat 8 of the automobile 7, microphones 1a and 1b are arranged as voice input means on both the driver's side and the passenger's side.

[0024] The voice signals picked up by the microphones 1a and 1b pass through a preprocessing circuit 9, which performs noise removal and the like, and are input to the used language determination unit 2, which determines the language spoken by the speaker in the front seat 8. When the used language determination unit 2 has determined the speaker's language, the determination signal is input to the voice recognition unit 3 and the response signal generation unit 5, and only the determined language becomes valid.

[0025] The automobile is equipped with an audio device A1, an air conditioner A2, a car phone A3, a navigation device A4, an auto cruise control An, and so on.

[0026] This embodiment is assumed to be configured to determine the language from the first voice signal input in the initial state and thereafter to treat only that language as valid. If the speaker in the front seat 8 wishes to operate the equipment in Japanese, the speaker starts the engine and, in the initial state, first utters "Nihongo". The voice signal input from microphone 1a or 1b passes through the preprocessing circuit 9 and is input to the used language determination unit 2, which determines that the language is Japanese.

[0027] In response to this determination signal, only Japanese is made valid in the voice recognition unit 3 and the response signal generation unit 5, and incoming words or sentences are compared and collated with the Japanese words or sentences registered in the voice recognition unit 3.

[0028] For example, to play a cassette tape in the audio device A1, the speaker utters "tape play". The voice signal input from microphone 1a or 1b passes through the preprocessing circuit 9 and the used language determination unit 2, is input to the voice recognition unit 3, and is compared and collated with the Japanese dictionary registered in the voice recognition unit 3.

[0029] Dictionaries for other languages, such as English, are also registered, but the tables holding those languages are not selected. The determination signal from the used language determination unit 2 selects only the table in which Japanese is registered, and the input is compared and collated with the terms registered there in advance.

[0030] As a result, the input is recognized as a cassette tape playback command, and in accordance with the recognition signal the control unit 4 checks whether a cassette is inserted and playback is possible. If playback is possible, the voice signal output from the response signal generation unit 5 causes the loudspeaker to respond in Japanese with, for example, "Playing the tape", after which tape playback starts. If no cassette is inserted, the voice response is "No cassette is inserted. Please insert a cassette."
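
The order of operations in the control unit 4 (check feasibility, respond by voice, then act) can be sketched in Python as follows. The `TapeDeck` class, the message identifiers, the `speak` callback, and the English wording are hypothetical; only the Japanese phrases are quoted from the embodiment.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical message table keyed by language and message identifier.
MESSAGES = {
    "ja": {
        "PLAY_OK": "テープをプレイします",
        "NO_CASSETTE": "カセットが挿入されていません。カセットを挿入して下さい",
    },
    "en": {
        "PLAY_OK": "Playing the tape",
        "NO_CASSETTE": "No cassette is inserted. Please insert a cassette.",
    },
}


@dataclass
class TapeDeck:
    """Minimal stand-in for the cassette section of audio device A1."""
    cassette_inserted: bool = False
    playing: bool = False


def handle_tape_play(deck: TapeDeck, language: str,
                     speak: Callable[[str], None]) -> None:
    """Check feasibility, give the voice response, then start playback."""
    if not deck.cassette_inserted:
        speak(MESSAGES[language]["NO_CASSETTE"])
        return
    speak(MESSAGES[language]["PLAY_OK"])  # respond first ...
    deck.playing = True                   # ... then start playback


spoken = []
handle_tape_play(TapeDeck(cassette_inserted=True), "en", spoken.append)
print(spoken)  # ['Playing the tape']
```

Passing `speak` as a callback keeps the response ahead of the action, matching the order described for the embodiment.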

[0031] If "tape stop" is uttered during playback, the control unit 4 checks, in the same way as for the play instruction, whether stopping is possible; the response signal generation unit 5 then outputs a response signal such as "Stopping the tape" to sound the loudspeaker, after which playback stops.

[0032] To remove the cassette tape, the speaker utters "tape eject"; on the same principle, a response such as "Ejecting the tape" is issued and the cassette tape is then ejected.

[0033] If the speaker in the front seat 8 wishes to perform hands-free operation in English, the speaker utters, for example, "English" after starting the engine. The voice signal input from microphone 1a or 1b passes through the preprocessing circuit 9 and is input to the used language determination unit 2, which determines that the language is English. In response to this determination signal, only English is made valid in the voice recognition unit 3 and the response signal generation unit 5, and the incoming English voice signal is compared and collated with the English dictionary registered in the voice recognition unit 3. Thereafter the on-board equipment can be operated hands-free in English.

[0034] FIG. 3 is a flowchart showing the processing procedure in the device of the embodiment of FIG. 2. In the initial state after the engine has been started, when the speaker utters "Nihongo" toward microphone 1a or 1b of FIG. 2 in step S1, microphone 1a or 1b outputs a voice signal in step S2, and the preprocessing circuit 9 of FIG. 2 performs noise removal and the like in step S3.

[0035] In step S4, the voice signal is input to the used language determination unit 2 of the present device. If the input term is a language determination term, the language is determined, and in step S5 Japanese is set in the voice recognition unit 3 and the response signal generation unit 5, completing designation of the language to be used.

[0036] Next, when an operation command such as "tape play" is input by voice in step S1, microphone 1a or 1b outputs a voice signal in step S2, and preprocessing such as noise removal is performed in step S3. In the following step S4, the input term is judged not to be a language determination term ("Nihongo" or "English"). Since it is an operation command term, in step S6 the voice recognition unit 3 selects the Japanese dictionary table and recognizes the meaning of the Japanese input.

[0037] If the meaning judgment in step S7 finds that the input was recognized normally, the control unit 4 checks in step S8 whether the cassette tape can be played, the voice response "Playing the tape" is given in step S9, and playback of the cassette tape starts in step S10.

[0038] If step S7 judges that the voice recognition in step S6 failed, then in step S11 the response signal generation unit 5 outputs the voice signal "Please input again" to drive the loudspeaker and prompt the speaker to retry the voice operation.

[0039] If, in the initial state after the engine has been started, the speaker utters "English" in step S1, the language is determined to be English in step S4 and English is set in the voice recognition unit 3 and the response signal generation unit 5 of FIG. 2. Subsequent voice input is then recognized and answered in English only.
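
The step sequence of FIG. 3 can be traced with a small Python sketch that returns the steps visited and the spoken response for each utterance. Several points are assumptions of the sketch rather than statements of the flowchart: text strings stand in for voice signals, the tables and English wording are hypothetical, the language is treated as fixed once set, an utterance made before any language is set is ignored, and the feasibility check in step S8 is assumed to succeed.

```python
from typing import List, Optional, Tuple

# Language determination terms checked in step S4.
LANGUAGE_TERMS = {"ニホンゴ": "ja", "イングリッシュ": "en"}
# Hypothetical dictionary and response tables.
DICTIONARIES = {
    "ja": {"テープ・プレイ": "TAPE_PLAY"},
    "en": {"tape play": "TAPE_PLAY"},
}
RESPONSES = {
    "ja": {"TAPE_PLAY": "テープを再生します", "RETRY": "もう一度入力して下さい"},
    "en": {"TAPE_PLAY": "Playing the tape", "RETRY": "Please input again"},
}


class Fig3Flow:
    def __init__(self) -> None:
        self.language: Optional[str] = None  # initial state after engine start

    def handle(self, utterance: str) -> Tuple[List[str], Optional[str]]:
        # S1 voice input, S2 microphone, S3 preprocessing, S4 term check
        steps = ["S1", "S2", "S3", "S4"]
        if self.language is None:
            if utterance in LANGUAGE_TERMS:
                self.language = LANGUAGE_TERMS[utterance]
                steps.append("S5")  # language set in units 3 and 5
            # Assumption: anything else before a language is set is ignored.
            return steps, None
        # Assumption: once set, language terms are no longer honored here.
        steps += ["S6", "S7"]  # recognition and judgment of the result
        command = DICTIONARIES[self.language].get(utterance)
        if command is None:
            steps.append("S11")  # prompt a retry
            return steps, RESPONSES[self.language]["RETRY"]
        steps += ["S8", "S9", "S10"]  # check (assumed OK), respond, act
        return steps, RESPONSES[self.language][command]


flow = Fig3Flow()
for utterance in ["ニホンゴ", "テープ・プレイ", "ワイパー・オン"]:
    # "ワイパー・オン" is not registered in this toy table.
    print(flow.handle(utterance))
```

In this trace the first utterance ends at S5, the second runs through S10, and the unregistered third utterance ends at S11 with the retry prompt in Japanese.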

[0040] In this embodiment, the speaker utters "Nihongo" or "English" in the initial state after the engine has been started, thereby designating at the outset the language to be used thereafter. The language therefore cannot be changed unless the engine is stopped.

[0041] By contrast, the determination step S4 of FIG. 3 may be omitted so that the used language determination unit 2 of FIG. 2 performs its determination on every input term and inputs the language determination signal to the voice recognition unit 3 and the response signal generation unit 5. The language is then determined each time speech is input; the voice recognition unit 3 compares and collates the input with the terms in the registration table of the determined language, and the response signal generation unit 5 responds by voice in the determined language. With this configuration there is no need to stop the engine, restart it, and set the language again each time the language is changed, so operability improves.

[0042] However, if terms common to Japanese and English are used, such as the operation command words "tape play" and "tape eject", a command given in Japanese may be answered in English, or conversely a command given in English may be answered in Japanese, causing confusion. To prevent this, commands must be given not as single words but as sentences, such as "Play the tape" or "Eject the tape" (in Japanese, 「テープをプレイしなさい」 and 「テープをイジェクトしなさい」), and the used language determination unit 2 must determine the language using not only the features of individual words but also, for example, the differences in syntax between Japanese and English.
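
Why a sentence helps where a shared command word does not can be shown with a deliberately crude Python sketch. It uses function words as a rough proxy for syntactic differences; the romanized tokens, the cue lists, and the function name are all hypothetical, and the device itself does not specify how the determination is made.

```python
from typing import List, Optional

# Hypothetical cue lists. Romanized tokens stand in for the sounds heard.
JAPANESE_CUES = {"o", "shinasai"}  # object particle, imperative ending
ENGLISH_CUES = {"the"}             # definite article


def determine_language(tokens: List[str]) -> Optional[str]:
    """Return "ja" or "en" when the cues point one way, else None."""
    has_ja = any(token in JAPANESE_CUES for token in tokens)
    has_en = any(token in ENGLISH_CUES for token in tokens)
    if has_ja and not has_en:
        return "ja"
    if has_en and not has_ja:
        return "en"
    return None  # shared words alone give no basis for a decision


print(determine_language(["tape", "play"]))                   # None
print(determine_language(["tape", "o", "play", "shinasai"]))  # ja
print(determine_language(["play", "the", "tape"]))            # en
```

The bare command "tape play" sounds the same in both languages and stays undecided here, whereas either full sentence carries at least one language-specific cue.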

[0043] The above is an example in which, following the concept of claim 1, the language is determined automatically from the input voice. When the language is instead set manually as in claim 2, the language is selected with a switch or the like in step S0 as shown in FIG. 4; the used language setting unit 6 of FIG. 1(2) then generates a setting signal, which is input to the voice recognition unit 3 and the response signal generation unit 5 to select the language used thereafter.

[0044] Accordingly, if English, for example, is set in step S0, only operation commands in English are valid thereafter, and to use another language the setting must be redone in step S0. In this embodiment the language setting switch must be operated manually, but once it is set no operation is needed unless the language is changed. The language need not be set with a manually operated switch; it may, for example, be set by voice as described above.

[0045] Only one of the recognition target language of the voice recognition unit 3 and the voice response language of the response signal generation unit 5 may be made selectable (the other being fixed to a predetermined language), or the two languages may be made settable separately and independently.
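
The variants in which the two languages are set separately, or only one of them is selectable, can be modeled as a settings record with two fields. In this Python sketch the language codes, the default of Japanese as the predetermined language, and the names `LanguageSettings` and `apply_setting` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

SUPPORTED = ("ja", "en")  # hypothetical language codes


@dataclass(frozen=True)
class LanguageSettings:
    recognition: str = "ja"  # language used by voice recognition unit 3
    response: str = "ja"     # language used by response signal generation unit 5


def apply_setting(settings: LanguageSettings,
                  recognition: Optional[str] = None,
                  response: Optional[str] = None) -> LanguageSettings:
    """Model a selection signal from setting unit 6.

    Passing None for one side leaves that side at its current language.
    """
    for language in (recognition, response):
        if language is not None and language not in SUPPORTED:
            raise ValueError(f"unsupported language: {language}")
    return LanguageSettings(
        recognition=recognition if recognition is not None else settings.recognition,
        response=response if response is not None else settings.response,
    )


settings = apply_setting(LanguageSettings(), recognition="en")
print(settings)  # LanguageSettings(recognition='en', response='ja')
```

A product that exposes only one selectable side would simply never pass the other argument.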

[0046] The above embodiment is an example in which two languages, Japanese and English, can be used, but the device is of course not limited to this, and three or more languages may be made selectable.

[0047] Besides the equipment in FIG. 2, the operation of various switches, such as those for the wipers, can as far as possible also be performed through the voice recognition device of the present device. Nor is the device limited to operating equipment in an automobile; it can also be applied to the operation of other equipment.

[0048]

[Effect of the device]

According to claim 1, the language is determined simply from the voice input, and voice recognition and voice response are performed in that language. The device can therefore recognize and respond in a plurality of languages, which is convenient for export equipment and for use by people from different countries. In addition, since the device need not be built specifically for the language of the export destination, it is highly versatile and can be realized at low cost.

[0049] With a configuration in which the language is selected manually in advance by a switch operation or the like, as in claim 2, no operation is needed once the language is set unless the language is changed. This suits equipment whose language is changed infrequently, and because the configuration is simple, the device can be realized at even lower cost.

[Brief description of the drawings]

[FIG. 1] A block diagram illustrating the basic principle of the voice recognition device according to the present device.

[FIG. 2] A block diagram showing an example in which the voice recognition device of claim 1 is applied to an automobile.

[FIG. 3] A flowchart illustrating the processing procedure in the device of the embodiment of FIG. 2.

[FIG. 4] A flowchart illustrating the processing procedure in the device of claim 2.

[FIG. 5] A flowchart showing the response operation of a conventional voice recognition device.

[Explanation of symbols]

1: voice input unit
2: used language determination unit
3: voice recognition unit
4: control unit
5: response signal generation unit
6: used language setting unit
A1 to An: various equipment in the automobile

Claims

[Claims for utility model registration]

[Claim 1] A voice recognition device that, when a signal of a voice uttered by a speaker is input, outputs a talkback signal corresponding to the voice signal and responds by voice, the device comprising at least: a voice input unit (1); a used language determination unit (2) that determines the language from the incoming voice signal; a voice recognition unit (3) that determines the content of the input voice signal; a control unit (4) that controls equipment according to the voice recognition result; and a response signal generation unit (5) for voice response, wherein the voice recognition unit (3) can recognize a plurality of languages from the input voice signal and the response signal generation unit (5) can respond by voice in a plurality of languages, and wherein the determination signal output from the used language determination unit (2) is input to the voice recognition unit (3) to select the recognition target language, and the determination signal output from the used language determination unit (2) is input to the response signal generation unit (5) to select the voice response language.

[Claim 2] A voice recognition device that, when a signal of a voice uttered by a speaker is input, outputs a talkback signal corresponding to the voice signal and responds by voice, the device comprising at least: a voice input unit (1); a voice recognition unit (3) that determines the content of the input voice signal; a control unit (4) that controls equipment according to the voice recognition result; a response signal generation unit (5) for voice response; and a used language setting unit (6), wherein the voice recognition unit (3) can recognize a plurality of languages from the input voice signal and the response signal generation unit (5) can respond by voice in a plurality of languages, and wherein at least one of the voice recognition unit (3) and the response signal generation unit (5) is configured so that the recognition target language or the voice response language is selected according to a selection signal output from the used language setting unit (6).