JP2004198488A

JP2004198488A - Electronic apparatus

Info

Publication number: JP2004198488A
Application number: JP2002363755A
Authority: JP
Inventors: Tetsuji Makino; 哲司牧野
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2002-12-16
Filing date: 2002-12-16
Publication date: 2004-07-15

Abstract

PROBLEM TO BE SOLVED: To enable a user to make an electronic apparatus execute an arbitrary operation by voice without annoying persons around the user.

SOLUTION: A CPU 10 specifies, from among a plurality of commands, a command to be executed by voice input, for example through the user's operation of an input device 20. It then has a voice corresponding to the specified command input from a voice input device 18, which picks up sound with a bone-conduction microphone, and registers that voice in an operation data table in association with the command. The CPU 10 determines whether a voice input from the voice input device 18 is registered in the operation data table and, when it is, executes the command registered in association with that voice.

COPYRIGHT: (C)2004, JPO&NCIPI

Description

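[0001]
[Technical field of the invention]
The present invention relates to an electronic device that executes processing according to an input command.
[0002]
[Prior art]
In general, various electronic devices accept commands through operations on buttons and operation keys provided on the device body and execute the processing corresponding to each command. A controller may also be connected to the device body by wire or wirelessly, so that commands can be input by operating buttons and operation keys on the controller.
[0003]
In recent years it has also become possible to operate electronic devices by voice input instead of by buttons and keys. For example, a device has been disclosed in which the user utters a command name into a microphone, voice recognition processing recognizes the command name, and the function for that command is executed (see, for example, Patent Document 1).
[0004]
[Patent Document 1]
JP 2001-34454 A
[0005]
[Problems to be solved by the invention]
In conventional electronic devices, however, operating the device by voice input required uttering a predetermined voice such as a command name. To have the device reliably recognize the uttered command name, the user also had to place the microphone where sound could be picked up stably and speak clearly in that state. Uttering a command name can therefore disturb people nearby, and in such environments it was difficult to operate the device by voice input.
[0006]
The present invention has been made in view of these problems, and its object is to provide an electronic device that lets the user execute any desired function simply and conveniently by voice without disturbing the surroundings.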
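[0007]
[Means for solving the problems]
The present invention is an electronic device that executes processing according to an input command, comprising: voice input means for inputting a voice; command specifying means for specifying a command from among a plurality of commands; command voice input means for inputting, from the voice input means, a voice corresponding to the command specified by the command specifying means; command table recording means for storing the command specified by the command specifying means in association with the voice input by the command voice input means; voice determining means for determining whether a voice input by the voice input means is among the voices stored by the command table recording means; and command executing means for executing, when the voice determining means determines that the input voice is among the stored voices, the command recorded by the command table recording means in association with that voice.
[0008]
Thereby, for example, a command specified by the command specifying means on the basis of an operation the user performs on the electronic device can be stored by the command table recording means in association with a voice arbitrarily input through the voice input means. If a voice input by the voice input means is one stored in the command table recording means, the command corresponding to that voice is executed, which realizes operation of the electronic device by voice input.
[0009]
Further, the command specifying means has command storage means for storing a plurality of commands and command selection means for selecting commands in a predetermined order from the plurality of commands stored by the command storage means, and specifies a command through the selection made by the command selection means.
[0010]
Thereby, for example, the commands stored in the command storage means are selected in a predetermined order by the command selection means without requiring the user to operate the electronic device, so even a user who does not know how to operate the device can input, one after another, the voices to be associated with the commands.
[0011]
Further, the device comprises voice extracting means for extracting voice in a predetermined frequency band from the voice input by the voice input means.
[0012]
Thereby, even where the electronic device is used in a noisy place such as on a train, unwanted noise can be cut, and the voice input to execute a command can be reliably determined by the voice determining means.
[0013]
Further, the device comprises frequency distribution detecting means for detecting the frequency distribution of the voice input by the voice input means, and frequency band setting means for setting, on the basis of the detected frequency distribution, the frequency band from which the voice extracting means extracts the voice.
[0014]
Thereby, the frequency band from which the voice extracting means extracts the voice is set according to the characteristics of the user's voice, so that voice in a frequency band suited to the user is extracted and the input voice can be determined more reliably by the voice determining means.
[0015]
Further, the voice input means is a bone conduction microphone.
Thereby, with the bone conduction microphone worn on the ear, the ear-bone vibration produced by clicking the teeth together, grinding the teeth, humming with the mouth closed, and so on is picked up and input as voice without the user having to speak with the mouth open, so functions can be specified by voice without disturbing the surroundings.
[0016]
Further, the device has audio output means for outputting sound, and the audio output means and the bone conduction microphone are integrated.
Thereby, if the electronic device outputs sound through audio output means such as earphones or headphones, integrating that audio output means with the bone conduction microphone removes the need for a separate, independently configured voice input means.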
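[0017]
[Embodiments of the invention]
Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the electronic device of this embodiment. The electronic device is, for example, a headphone stereo, a PDA (personal digital assistant), a portable video/audio player (such as a DVD player), or a portable TV.
[0018]
As shown in FIG. 1, the electronic device according to the present invention is configured by interconnecting a CPU 10 with a display device 12, an audio output device 14, a recording device 16, a voice input device 18, an input device 20, a memory 22, and a communication device 24 that is provided as needed.
[0019]
The CPU 10 controls the entire electronic device and performs various kinds of processing by executing programs recorded in the memory 22 or on the recording medium 16a. By executing these programs, the CPU 10 performs processing according to commands entered through the user's input operations. In this embodiment, it can determine the command corresponding to a voice input through the voice input device 18 and execute the processing for that command.
[0020]
The display device 12 displays various information, including the screens that accompany the CPU 10's execution of programs.
The audio output device 14 outputs various sounds and is implemented by, for example, a speaker, headphones, or earphones.
[0021]
The recording device 16 records programs, data, and the like on the recording medium 16a and reads them from it. The recording medium 16a is a magnetic or optical recording medium or a semiconductor memory, and is either fixed in the recording device 16 or removably mounted. The programs and data recorded on the recording medium 16a may be received from another device connected via a communication line or the like and then recorded. Alternatively, a recording device with a recording medium may be provided on another device connected via a communication line, and the programs and data on that medium may be used over the line. The recording device 16 also records on the recording medium 16a the data used by various applications, and can read that data as needed and place it in the memory 22. In this embodiment, the recorded programs include those for the operation data table creation process, the filter setting process, and the voice input operation process, which together realize the voice input operation function for operating the electronic device by voice. The operation data table 30 (see FIG. 4) and the voice command registration order list 32 (see FIG. 6) used by this function are also stored.
[0022]
The voice input device 18 (voice input means) inputs the voice uttered by the user and includes a microphone, an A/D conversion circuit, a digital signal processing circuit, and the like. It digitizes the sound picked up by the microphone, applies signal processing in the digital signal processing circuit, and outputs the result. In the electronic device of this embodiment, a bone conduction microphone (also called an ear-bone microphone) is used as the voice input device 18, for example. A bone conduction microphone is worn on a part of the body such as the ear or head. Besides ordinary speech with the mouth open, it can pick up the ear-bone vibration produced by clicking the teeth together, grinding the teeth, or humming with the mouth closed, and input it as voice. Tooth clicks in particular resonate readily inside the skull, so differences in click patterns (for example, the number of clicks and the intervals between them) can easily be distinguished from the sound the bone conduction microphone collects.
[0023]
The input device 20 consists of a keyboard, a pointing device such as a mouse, or various buttons provided on the device body, and is used to input data and instructions.
The memory 22 holds the programs and data accessed by the CPU 10, which are recorded as needed, for example by being read from the recording medium 16a. In this embodiment it holds the programs that realize the functions of the electronic device and the data those programs use, and it also provides a work area for temporarily recording working data.
[0024]
The communication device 24, under the control of the CPU 10, controls communication with other electronic devices connected via a network.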
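Next, the operation of the electronic device of this embodiment is described.
First, the operation data table creation process, which creates the operation data table 30 used to operate the electronic device by voice input, is described with reference to the flowchart in FIG. 2. FIG. 3 illustrates the user's operations during the operation data table creation process.
[0025]
When the electronic device of this embodiment is operated by voice input, the user is not limited to ordinary speech with the mouth open. Any sound that the bone conduction microphone provided as the voice input device 18 can collect may be used, such as clicking the teeth together, grinding the teeth, or humming with the mouth closed. The following description takes the case of producing sound by clicking the teeth together as an example.
[0026]
First, when the voice command registration button provided on the input device 20 is pushed, the CPU 10 enters a state of waiting for an input operation, so that the user can designate the existing function to be executed in association with the voice command about to be registered (step A1) (FIG. 3(a)(1)). The voice command registration button is assumed to be assigned to one of the buttons provided on the input device 20.
[0027]
Here, the user designates the function command to be executed on voice input by actually performing the operation on the device.
When an input operation is performed on the input device 20 or the like (FIG. 3(a)(2)), the CPU 10 determines that the command corresponding to this operation has been designated as the command to register in the operation data table 30 (designated operation command A) (step A2).
[0028]
When the voice command registration button is pushed again (command voice registration start button operation), the CPU 10 fixes the operation command to be registered in the operation data table 30 and starts recording the voice input through the voice input device 18 (FIG. 3(a)(3)) (step A3). That is, the user inputs the voice that will later be used to designate the operation command stored in the operation data table 30 when operating the device by voice (hereinafter called the command voice) (FIG. 3(b)(4)) (step A4). Here, as shown in FIG. 3(b), a sound produced by clicking the teeth together is input (registered voice B).
[0029]
When the voice command registration button is pushed once more (command voice registration end button operation), the CPU 10 ends input of the command voice (FIG. 3(c)(5)) (step A5).
[0030]
The CPU 10 then associates the operation command designated by the user in step A2 (designated operation command A) with the command voice input for it in step A4 (registered voice) and stores the pair in the operation data table 30 (step A6).
[0031]
FIG. 4 shows an example of the data structure of the operation data table 30. As shown in FIG. 4, each operation command is registered in association with the voice pattern data of its command voice. By repeating the operations described above, the user can store in the operation data table 30 the voices for operating a plurality of functions (operation commands) of the electronic device.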
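[0032]
In the operation data table creation process described above (FIG. 2), the user designates the command to be made executable by voice by actually performing the input operation. Registering voice commands this way requires the user to perform the operation for the desired function once at registration time, so a user who is not already familiar with operating the device may find it hard to designate a command.
[0033]
FIG. 5 is a flowchart of an operation data table creation process that can specify the command to store in the operation data table 30 without the user actually performing an input operation.
[0034]
The process in FIG. 5 assumes that a voice command registration order list 32 such as the one in FIG. 6 has been prepared in advance.
[0035]
The voice command registration order list 32 holds a plurality of commands for executing the functions of the electronic device. The list shown in FIG. 6 is an example for the e-mail function and holds commands such as "start reading", "read next mail", "read previous mail", and "end reading".
[0036]
First, when the voice command registration button provided on the input device 20 is pushed (command voice registration start button operation) (step B1), the CPU 10 reads the voice command registration order list 32 from the recording device 16 and stores it in the memory 22 (step B2). The CPU 10 then takes from the list, in a predetermined order, the command for which a voice is to be input (step B3). The predetermined order here is the order of registration in the list, starting from the head. At this point the CPU 10 notifies the user of the command awaiting a voice, for example by showing the command taken from the list on the display device 12.
[0037]
The user then inputs a command voice through the voice input device 18 (step B4).
[0038]
When the voice command registration button is pushed (command voice registration end button operation), the CPU 10 ends input of the command voice (step B5).
[0039]
The CPU 10 then associates the command taken from the voice command registration order list 32 with the command voice input for it in step B4 and stores the pair in the operation data table 30 (step B6).
[0040]
If voice input has not been completed for every command in the voice command registration order list 32, the CPU 10 takes the next command from the list, has a voice input in the same way, and stores it in the operation data table 30 (steps B3 to B7).
[0041]
In this way the commands in the voice command registration order list 32 are presented to the user one by one and the user inputs a voice for each in turn. Even a user who does not know the operations that invoke the device's functions, for example someone who has just bought the device, can easily input voices for the commands.
[0042]
In the process in FIG. 5 a voice is input for every command in the voice command registration order list 32. If the user is allowed to indicate that no voice will be input for the command just announced, voice input for that command can be skipped and the next command in the list taken up. Voice input can thus be omitted for functions the user does not intend to operate by voice, which shortens the operation data table creation process.
[0043]
The commands for which voices are input may also be limited according to the user's instruction. For example, the commands in the voice command registration order list 32 are classified into groups, and the user selects beforehand which groups are to be registered in the operation data table 30. The electronic device shows the list of groups on the display device 12 and lets the user select from it with the input device 20. The CPU 10 stores the selected groups and, when executing the operation data table creation process, takes from the list only the commands in those groups and has the user input a voice for each.
[0044]
Which of the operation data table creation processes in FIG. 2 and FIG. 5 the electronic device executes may be fixed in advance, or the user may be allowed to choose either one.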
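[0045]
Next, the voice input operation process, in which the electronic device is actually operated by voice using the operation data table 30, is described with reference to the flowchart in FIG. 7.
[0046]
In this embodiment, the user can choose whether to run the voice input operation process and so enable operation by voice, for example with a predetermined button provided on the input device 20.
[0047]
When execution of the voice input operation process has been instructed, that is, when voice input is to be used, the CPU 10 determines whether any registered voice operation exists, in other words whether a command and its associated command voice are registered in the operation data table 30 (step C1).
[0048]
If a registered voice operation exists in the operation data table 30 and voice is being input from the voice input device 18 (step C2), the CPU 10 records the incoming voice for a predetermined operation time unit (step C3). Voice input is judged to be present when sound of a predetermined volume level arrives from the voice input device 18.
[0049]
The CPU 10 compares the voice recorded for the operation time unit with the command voices (command voice patterns) registered in the operation data table 30 and determines whether the input voice exists in the table (step C4).
[0050]
If no matching command voice is registered in the operation data table 30, the CPU 10 performs no processing for the input and returns to the voice input state (step C5, No).
[0051]
If the input voice is determined to exist in the operation data table 30, the CPU 10 outputs the operation command recorded in the table in association with that voice and has the processing for the command executed (step C6).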
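Steps C1 to C6 of the voice input operation process can be sketched roughly as follows. This is only an illustration under assumptions: the function name `process_voice_input`, the `matches` comparison callback, and the default volume threshold are hypothetical, and the patent does not specify how two voice patterns are compared.

```python
def process_voice_input(samples, table, matches, level=0.1):
    """Return the command to execute for one recorded unit, or None.

    samples: amplitudes recorded for one operation time unit (step C3)
    table:   dict mapping command name -> registered voice pattern
    matches: hypothetical comparison function (input, stored) -> bool
    level:   assumed volume threshold for "voice input present"
    """
    if not table:                                   # step C1: nothing registered
        return None
    loudest = max((abs(s) for s in samples), default=0.0)
    if loudest < level:                             # step C2: no voice input
        return None
    for command, stored in table.items():           # step C4: search the table
        if matches(samples, stored):
            return command                          # step C6: caller executes it
    return None                                     # step C5, No: ignore the input
```

Passing the comparison in as a callback keeps the control flow separate from the pattern-matching method, which the text leaves open.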
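[0052]
FIG. 8 shows an example of a situation in which the electronic device is operated by voice input. In FIG. 8 the user is holding a strap on a train and has the electronic device in a pocket, so the buttons on the device cannot be operated directly.
[0053]
The bone conduction microphone provided as the voice input device 18 and the headphones (or earphones) provided as the audio output device 14 are assumed to be integrated into one device. Because both the bone conduction microphone and the headphones (earphones) are worn on the ear, integrating them lets a single device serve for both voice input and audio output, which makes handling easier for the user.
[0054]
Consider a situation in which the user has e-mail read aloud by the device's e-mail reading function and listens to it through the audio output device 14 (headphones or earphones).
[0055]
First, the user instructs the electronic device to run the e-mail reading function. The function is assumed to have been started by inputting a predetermined command voice registered beforehand. Once it has started, a voice message such as "Read the next mail?" is output from the audio output device 14 (headphones or earphones).
[0056]
In response, the user clicks the teeth together once to input the "click!" sound registered for the operation command "read next mail". The sound produced by the click is picked up by the voice input device 18 (bone conduction microphone), and the operation command "read next mail" corresponding to the input voice (command voice) is determined and executed.
[0057]
Similarly, clicking the teeth together twice to input the "click-click!" sound registered for the operation command "end reading" ends the reading of mail. Clicking three times to input the "click-click-click!" sound registered for the operation command "start reading" has the mail read aloud from the beginning.
In this way the functions of the electronic device can be operated by voice input.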
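One simple way to tell one, two, and three tooth clicks apart is to count amplitude bursts separated by a minimum gap. The sketch below is an assumption for illustration only: the threshold `level`, the `min_gap` value, and the function names are hypothetical, and the patent itself compares stored voice patterns rather than prescribing click counting.

```python
# Click counts and commands taken from the mail-reading example in the text.
CLICK_COMMANDS = {1: "read next mail", 2: "end reading", 3: "start reading"}


def count_clicks(samples, rate, level=0.5, min_gap=0.1):
    """Count bursts whose amplitude reaches `level`.

    Loud samples closer together than `min_gap` seconds are treated as
    part of the same click.
    """
    clicks = 0
    last_loud = None                      # time of the most recent loud sample
    for i, s in enumerate(samples):
        if abs(s) >= level:
            t = i / rate
            if last_loud is None or t - last_loud >= min_gap:
                clicks += 1               # a new click starts here
            last_loud = t
    return clicks


def command_for(samples, rate):
    """Map the click count to a command, or None if nothing is registered."""
    return CLICK_COMMANDS.get(count_clicks(samples, rate))
```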
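[0058]
The description above deals with sound produced by clicking the teeth together, but any sound may be used, such as grinding the teeth or humming, as long as it travels through the skull and can be picked up by the bone conduction microphone.
[0059]
By registering an arbitrary command voice in the operation data table 30 in association with an operation command, the user can later execute the function for that command simply by inputting the command voice. Because the voice registered in the operation data table 30 need not be something fixed such as the command name, it can be chosen to suit the situation in which the user uses the device. For use on a train or anywhere else with people nearby, where speaking with the mouth open is not possible, the user can register the sound made by clicking the teeth together. For use where speaking is possible, the user can register an ordinary voice.
[0060]
Next, a function is described that improves the quality of the voice input from the voice input device 18 and so makes designation of operation commands by voice more reliable.
[0061]
The voice input device 18 digitizes the sound picked up by the microphone and applies signal processing in the digital signal processing circuit. Within that circuit, voice in a predetermined frequency band is extracted by, for example, a digital FIR (finite impulse response) filter circuit such as the one shown in FIG. 9(a). The output of the digital FIR filter circuit in FIG. 9(a) is expressed as shown in FIG. 9(b). The frequency band of the output signal (voice) can be set by changing the values of the coefficients shown in FIG. 9(b).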
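FIG. 9(b) is not reproduced in this text, so the following sketch assumes the standard direct-form FIR relation, in which each output sample is the sum over k of h[k] times x[n-k]. The function name `fir_filter` and the sample coefficient values are illustrative and are not taken from the patent.

```python
def fir_filter(x, h):
    """Apply FIR coefficients h to input samples x (direct form).

    y[n] = sum over k of h[k] * x[n - k], with samples before the start
    of the input treated as zero.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += coeff * x[n - k]
        y.append(acc)
    return y


# A two-tap moving average is a very simple low-pass filter.
smoothed = fir_filter([1.0, 0.0, 0.0, 0.0], [0.5, 0.5])
```

Changing the list `h` changes which frequencies pass, which is the sense in which the text says the band is set through the coefficients.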
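[0062]
FIG. 10 shows an example of the frequency characteristic of the filter.
As one approach (Example 1), the frequency band in which the spectrum of a typical human voice carries at least a certain power is determined in advance, and the filter is designed to pass audio signals in that band. The spectrum of a human voice looks, for example, like the one in FIG. 11, with a peak at some frequency. In that case the band in which the signal is at or above level S, the reference for the fixed power in FIG. 11, is found, and the filter is designed to output signals in that band. Signals at the frequencies most present in the human voice are then extracted, and voice commands can be input accurately.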
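Finding the band in which a spectrum stays at or above the level S, as in Example 1, can be sketched as below. The function name and the sample spectrum values are made up for illustration, and the sketch assumes the spectrum is given as parallel lists of frequencies and powers.

```python
def band_above_level(freqs, powers, level):
    """Return (lowest, highest) frequency whose power is at least `level`.

    Returns None when no frequency reaches the level.
    """
    selected = [f for f, p in zip(freqs, powers) if p >= level]
    if not selected:
        return None
    return min(selected), max(selected)


# Illustrative spectrum with a single peak, loosely in the spirit of FIG. 11.
band = band_above_level([100, 200, 300, 400, 500], [1, 5, 9, 6, 2], level=5)
```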
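[0063]
As another approach (Example 2), the frequency characteristic of the filter shown in FIG. 10 may be varied according to the command voice used for registration. That is, the coefficients of the digital FIR filter circuit shown in FIGS. 9(a) and 9(b) may be set (changed) according to the command voice the user actually registers.
[0064]
FIG. 12 is a flowchart of the filter setting process, which sets the filter characteristic according to the command voice the user inputs.
First, when the filter selection request button is operated (step D1), the CPU 10 has the user input a command voice for filter setting through the voice input device 18. The CPU 10 records the command voice input through the voice input device 18 until the command voice input end button is operated (steps D2, D3).
[0065]
When operation of the command voice input end button indicates that input of the command voice for filter setting is complete, the CPU 10 detects the frequency distribution of the input voice (step D4) and, on the basis of the detected distribution, sets the coefficients of the digital FIR filter circuit in the voice input device 18 (digital signal processing circuit). In other words, it sets the frequency band from which the voice input device 18 extracts voice (step D5).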
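Steps D4 and D5 can be illustrated by computing a power spectrum of the recorded command voice and keeping the band around its peak. This is a sketch under assumptions: it uses a naive discrete Fourier transform, the "half of peak power" rule is a hypothetical choice, and turning the resulting band into FIR coefficients is left out because the patent does not give a design method.

```python
import cmath
import math


def power_spectrum(samples, rate):
    """Naive DFT power spectrum as (frequency in Hz, power) pairs."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        spectrum.append((k * rate / n, abs(acc) ** 2 / n))
    return spectrum


def dominant_band(samples, rate, fraction=0.5):
    """Band of frequencies whose power is at least `fraction` of the peak."""
    spectrum = power_spectrum(samples, rate)              # step D4
    peak = max(power for _, power in spectrum)
    kept = [f for f, power in spectrum if power >= fraction * peak]
    return min(kept), max(kept)                           # input to step D5


# A pure 50 Hz tone sampled at 400 Hz for one second.
tone = [math.sin(2 * math.pi * 50 * i / 400) for i in range(400)]
low, high = dominant_band(tone, 400)
```

The naive transform costs time proportional to the square of the input length, so a real device would more likely use an FFT.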
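[0066]
Because the filter is set from the frequency distribution of the command voice the user actually inputs, the present invention extracts voice in a frequency band appropriate to the individual user. In voice recognition generally, recognition of a specific speaker achieves a better recognition rate than recognition of unspecified speakers. Extracting voice with a filter matched to a particular user's voice, as in the present invention, therefore makes designation of operation commands by voice more reliable.
[0067]
In the description above, the sound picked up by the microphone of the voice input device 18 is digitized and the digital signal processing circuit applies signal processing including filtering. Instead, the CPU 10 may apply the signal processing to the digital signal output from the voice input device 18. In that case the CPU 10 implements, for the digital signal input from the voice input device 18, the functions of the digital signal processing circuit including the filter shown in FIG. 9.
[0068]
In this way, the electronic device of this embodiment can be operated by voice even when the hands are not free, for example when the user is holding a strap on a train or carrying a bag. In particular, using a bone conduction microphone lets the user input and execute commands without speaking aloud, so the device can be operated even where people are nearby and speaking is not possible.
[0069]
The bone conduction microphone has been described as an ear-bone microphone worn on the ear, but any other microphone that can pick up bone-conducted sound and is worn elsewhere may be used.
[0070]
The command voices registered in the operation data table 30 need not apply to commands common to all functions of the electronic device. For example, a dedicated operation data table 30 may be prepared for each application so that each application is operated with its own command voices.
[0071]
Beyond the application level, applying the present invention to an OS-level program, for example, also allows the application to run to be selected by voice input. Even when different applications are installed for different individuals, each individual can associate different voice commands with each application.
[0072]
The present invention is not limited to the embodiment described above and can be modified in various ways at the implementation stage without departing from its gist. The functions executed in the embodiment may be combined as appropriate wherever possible. The embodiment includes inventions at various stages, and various inventions can be extracted by suitably combining the disclosed constituent features. For example, even if some constituent features are removed from all those shown in the embodiment, or the order of processing is changed, the configuration with those features removed can be extracted as an invention as long as the effect is obtained.
[0073]
The processing described in the embodiments can be provided to various devices as a computer-executable program written to a recording medium such as a magnetic disk (flexible disk, hard disk, etc.), an optical disc (CD-ROM, DVD, etc.), or a semiconductor memory. It can also be provided by transmission over a communication medium. The computer that implements the electronic device reads the program recorded on the recording medium, or receives it via the communication medium, and executes the processing described above under the control of that program.
[0074]
[Effects of the invention]
As described above, according to the present invention, a voice can be arbitrarily input and registered in association with a command specified, for example, by the user's operation of the electronic device, and the device can then be operated by executing the command with the registered voice. In particular, when voice is input through a bone conduction microphone, the ear-bone vibration produced by clicking the teeth together, grinding the teeth, humming with the mouth closed, and so on is picked up and input as voice without the user having to speak with the mouth open, so arbitrary operations can be executed by voice without disturbing the surroundings.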
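[Brief description of the drawings]
[FIG. 1] A block diagram showing the configuration of the electronic device of this embodiment.
[FIG. 2] A flowchart for explaining the operation data table creation process that creates the operation data table in this embodiment.
[FIG. 3] A diagram for explaining the user's operations in the operation data table creation process of this embodiment.
[FIG. 4] A diagram showing an example of the data structure of the operation data table in this embodiment.
[FIG. 5] A flowchart for explaining the operation data table creation process in this embodiment.
[FIG. 6] A diagram showing an example of the voice command registration order list in this embodiment.
[FIG. 7] A flowchart for explaining the voice input operation process, in which the electronic device is operated by voice input using the operation data table 30 in this embodiment.
[FIG. 8] A diagram showing an example of a situation in which the electronic device is operated by voice input in this embodiment.
[FIG. 9] A diagram showing the digital FIR filter circuit in this embodiment.
[FIG. 10] A diagram showing the frequency characteristic of the filter.
[FIG. 11] A diagram showing the frequency distribution of a typical person's input voice.
[FIG. 12] A flowchart for explaining the filter setting process in this embodiment.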
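[Description of reference signs]
10: CPU
12: Display device
14: Audio output device
16: Recording device
16a: Recording medium
18: Voice input device
20: Input device
22: Memory
24: Communication device
[0001]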
TECHNICAL FIELD OF THE INVENTION
The present invention relates to an electronic device that executes a process according to an input command.
[0002]
[Prior art]
Generally, in various electronic devices, a command can be input by performing an operation on a button or an operation key provided on a device main body, and a process corresponding to the command can be executed. Alternatively, a controller may be connected to the electronic device main body by wireless or wired communication, and commands may be input by operating buttons and operation keys provided on the controller.
[0003]
Further, in recent years, it has become possible to operate an electronic device by voice input instead of by operating buttons or keys. For example, a device has been disclosed in which the user utters a command name into a microphone, voice recognition processing recognizes the command name, and the function for that command is executed (see, for example, Patent Document 1).
[0004]
[Patent Document 1]
JP 2001-34454 A
[0005]
[Problems to be solved by the invention]
However, in the conventional electronic device, when attempting to operate the electronic device by voice input, a predetermined voice, for example, a command name has to be uttered. Also, in order for the electronic device to reliably recognize the uttered command name, for example, it is necessary to set a microphone at a position where sound can be stably collected and utter clearly in that state. For this reason, when a command name is uttered when a person is around, annoyance may occur, and in such an environment, it has been difficult to operate the electronic device by voice input.
[0006]
The present invention has been made in view of the above problems, and its object is to provide an electronic device that lets the user execute any desired function simply and conveniently by voice without disturbing the surroundings.
[0007]
[Means for Solving the Problems]
The present invention is an electronic device that executes processing according to an input command, comprising: voice input means for inputting a voice; command specifying means for specifying a command from among a plurality of commands; command voice input means for inputting, from the voice input means, a voice corresponding to the command specified by the command specifying means; command table recording means for storing the command specified by the command specifying means in association with the voice input by the command voice input means; voice determining means for determining whether a voice input by the voice input means is among the voices stored by the command table recording means; and command executing means for executing, when the voice determining means determines that the input voice is among the stored voices, the command recorded by the command table recording means in association with that voice.
[0008]
Thereby, for example, a command specified by the command specifying means on the basis of an operation the user performs on the electronic device can be stored by the command table recording means in association with a voice arbitrarily input through the voice input means. If a voice input by the voice input means is one stored in the command table recording means, the command corresponding to that voice is executed, which realizes operation of the electronic device by voice input.
[0009]
Further, the command specifying means has command storage means for storing a plurality of commands and command selection means for selecting commands in a predetermined order from the plurality of commands stored by the command storage means, and specifies a command through the selection made by the command selection means.
[0010]
Thereby, for example, the commands stored in the command storage means are selected in a predetermined order by the command selection means without requiring the user to operate the electronic device, so even a user who does not know how to operate the device can input, one after another, the voices to be associated with the commands.
[0011]
Further, there is provided a voice extracting means for extracting a voice in a predetermined frequency band from the voice input by the voice input means.
[0012]
Thereby, even where the electronic device is used in a noisy place such as on a train, unwanted noise can be cut, and the voice input to execute a command can be reliably determined by the voice determining means.
[0013]
Further, the device comprises frequency distribution detecting means for detecting the frequency distribution of the voice input by the voice input means, and frequency band setting means for setting, on the basis of the detected frequency distribution, the frequency band from which the voice extracting means extracts the voice.
[0014]
Thereby, the frequency band from which the voice extracting means extracts the voice is set according to the characteristics of the user's voice, so that voice in a frequency band suited to the user is extracted and the input voice can be determined more reliably by the voice determining means.
[0015]
Further, the voice input means is a bone conduction microphone.
Thereby, with the bone conduction microphone worn on the ear, the ear-bone vibration produced by clicking the teeth together, grinding the teeth, humming with the mouth closed, and so on is picked up and input as voice without the user having to speak with the mouth open, so functions can be specified by voice without disturbing the surroundings.
[0016]
In addition, there is provided a sound output means for outputting a sound, wherein the sound output means and the bone conduction microphone are integrated.
Thus, if the electronic device outputs sound through sound output means such as earphones or headphones, integrating that sound output means with the bone conduction microphone removes the need for a separate, independently configured voice input means.
[0017]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram illustrating a configuration of an electronic device according to the present embodiment. The electronic device is, for example, a headphone stereo, a personal digital assistant (PDA), a portable video / audio player (such as a DVD player), a portable TV, or the like.
[0018]
As shown in FIG. 1, the electronic device according to the present invention is configured by interconnecting a CPU 10 with a display device 12, an audio output device 14, a recording device 16, a voice input device 18, an input device 20, a memory 22, and a communication device 24 that is provided as needed.
[0019]
The CPU 10 controls the entire electronic device, and executes various processes by executing programs recorded in the memory 22 and the recording medium 16a. By executing such a program, the CPU 10 executes various processes according to commands input by a user's input operation. In the present embodiment, it is possible to determine a command corresponding to the voice input by the voice input device 18 and execute a process corresponding to the command.
[0020]
The display device 12 is for displaying various information, and displays a screen accompanying execution of various programs by the CPU 10.
The audio output device 14 is for outputting various sounds, and is realized by, for example, a speaker, headphones, earphones, or the like.
[0021]
The recording device 16 records programs, data, and the like on the recording medium 16a and reads them from it. The recording medium 16a is a magnetic or optical recording medium or a semiconductor memory, and is either fixed in the recording device 16 or removably mounted. The programs and data recorded on the recording medium 16a may be received from another device connected via a communication line or the like and then recorded. Alternatively, a recording device with a recording medium may be provided on another device connected via a communication line, and the programs and data on that medium may be used over the line. The recording device 16 can also record on the recording medium 16a the data used by various applications, read it as needed, and place it in the memory 22. In the present embodiment, the recorded programs include those for the operation data table creation process, the filter setting process, and the voice input operation process, which together realize the voice input operation function of operating the electronic device by voice input. The operation data table 30 (see FIG. 4) and the voice command registration order list 32 (see FIG. 6) used to realize this function are also stored.
[0022]
The voice input device 18 (voice input means) is for inputting voice uttered by the user, and includes a microphone, an A/D conversion circuit, a digital signal processing circuit, and the like. The voice input device 18 digitizes the sound picked up by the microphone, applies signal processing in the digital signal processing circuit, and outputs the processed signal. In the electronic device of the present embodiment, a bone conduction microphone (also called an ear-bone microphone) is used as the voice input device 18, for example. A bone conduction microphone is worn on a part of the body such as the ear or head. Besides ordinary speech with the mouth open, it can pick up the ear-bone vibration produced by clicking the teeth together, grinding the teeth, or humming with the mouth closed, and input it as voice. Tooth clicks in particular resonate readily inside the skull, so differences in click patterns (for example, the number of clicks and the intervals between them) can easily be distinguished from the sound the bone conduction microphone collects.
[0023]
The input device 20 includes a keyboard, a pointing device such as a mouse, or various buttons provided on a main body, and is used to input data and various instructions.
The memory 22 holds the programs and data accessed by the CPU 10, which are recorded as needed, for example by being read from the recording medium 16a. In the present embodiment it holds the programs that realize the functions of the electronic device and the data those programs use, and it also provides a work area for temporarily recording working data.
[0024]
The communication device 24 controls communication with another electronic device connected via a network under the control of the CPU 10.
Next, the operation of the electronic device according to the present embodiment will be described.
First, the operation data table creation process, which creates the operation data table 30 used to operate the electronic device by voice input, will be described with reference to the flowchart shown in FIG. 2. FIG. 3 is a diagram illustrating the operations performed by the user in the operation data table creation process.
[0025]
When the electronic device of the present embodiment is operated by voice input, the user is not limited to ordinary speech with the mouth open. Any sound that the bone conduction microphone provided as the voice input device 18 can collect may be used, such as clicking the teeth together, grinding the teeth, or humming with the mouth closed. In the following description, the case of producing sound by clicking the teeth together is taken as an example.
[0026]
First, when the voice command registration button provided on the input device 20 is pushed, the CPU 10 enters a state of waiting for an input operation, so that the user can designate the existing function to be executed in association with the voice command about to be registered (step A1) (FIG. 3(a)(1)). It is assumed that the voice command registration button is assigned to one of the buttons provided on the input device 20.
[0027]
Here, the user can designate a function command to be executed when voice input is performed by actually operating the device.
When an input operation is performed on the input device 20 or the like (FIG. 3(a)(2)), the CPU 10 determines that the command corresponding to this operation has been designated as the command to register in the operation data table 30 (designated operation command A) (step A2).
[0028]
When the voice command registration button is pushed again (command voice registration start button operation), the CPU 10 fixes the operation command to be registered in the operation data table 30 and starts recording the voice input through the voice input device 18 (FIG. 3(a)(3)) (step A3). That is, the user inputs the voice that will later be used to designate the operation command stored in the operation data table 30 when operating the device by voice (hereinafter referred to as the command voice) (FIG. 3(b)(4)) (step A4). Here, as shown in FIG. 3(b), a sound produced by clicking the teeth together is input (registered voice B).
[0029]
Here, when there is a push operation on the voice command registration button (command voice registration end button operation), the CPU 10 ends the input of the command voice (FIG. 3 (c) (5)) (step A5).
[0030]
Then, the CPU 10 associates the operation command designated by the user in step A2 (designated operation command A) with the command voice input for it in step A4 (registered voice) and stores the pair in the operation data table 30 (step A6).
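Steps A1 to A6 amount to a small state machine driven by the voice command registration button. The sketch below is only an illustration: the class name `Registrar`, the state names, and the method names are hypothetical and do not come from the patent.

```python
class Registrar:
    """Registration flow driven by one button (steps A1 to A6)."""

    def __init__(self):
        self.state = "idle"
        self.command = None     # designated operation command A
        self.recorded = []      # registered voice B, as raw samples
        self.table = {}         # stands in for the operation data table 30

    def press_register_button(self):
        if self.state == "idle":
            self.state = "await_command"                      # step A1
        elif self.state == "command_chosen":
            self.recorded = []
            self.state = "recording"                          # step A3
        elif self.state == "recording":
            self.table[self.command] = tuple(self.recorded)   # steps A5, A6
            self.state = "idle"

    def operate(self, command):
        """The user performs the operation whose command is to be registered."""
        if self.state == "await_command":
            self.command = command                            # step A2
            self.state = "command_chosen"

    def hear(self, sample):
        """Voice arriving from the microphone; kept only while recording."""
        if self.state == "recording":
            self.recorded.append(sample)                      # step A4


registrar = Registrar()
registrar.press_register_button()
registrar.operate("read next mail")
registrar.press_register_button()
registrar.hear("click")
registrar.press_register_button()
```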
[0031]
FIG. 4 shows an example of the data structure of the operation data table 30. As shown in FIG. 4, the operation command and the voice pattern data of the command voice are registered in association with each other. In this way, by performing the above-described operation, the user can cause the operation data table 30 to store voices for operating the functions provided in the electronic device for a plurality of functions (operation commands).
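FIG. 4 is not reproduced here, so the following is only a guess at the shape of such a table: a mapping from each operation command to the voice pattern data registered for it. The pattern representation (a tuple of click onset times in seconds) and all names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Pattern = Tuple[float, ...]   # assumed encoding: click onset times in seconds


@dataclass
class OperationDataTable:
    """Maps each operation command to its registered voice pattern."""
    entries: Dict[str, Pattern] = field(default_factory=dict)

    def register(self, command: str, pattern: Pattern) -> None:
        # Store the command together with its command voice.
        self.entries[command] = pattern

    def lookup(self, pattern: Pattern) -> Optional[str]:
        # Exact match only; a real device would need a tolerant comparison.
        for command, stored in self.entries.items():
            if stored == pattern:
                return command
        return None


table = OperationDataTable()
table.register("read next mail", (0.0,))
table.register("end reading", (0.0, 0.3))
```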
[0032]
In the operation data table creation process described above (FIG. 2), the user designates the command to be made executable by voice by actually performing the input operation. Registering voice commands this way requires the user to perform the operation for the desired function once at registration time, so a user who is not already familiar with operating the device may find it hard to designate a command.
[0033]
FIG. 5 shows a flowchart of an operation data table creation process that allows a user to specify a command to be stored in the operation data table 30 without actually performing an input operation.
[0034]
When the operation data table creation processing shown in FIG. 5 is executed, it is assumed that a voice command registration order list 32 as shown in FIG. 6 is prepared in advance.
[0035]
A plurality of commands for executing the functions provided in the electronic device are registered in the voice command registration order list 32. The voice command registration order list 32 shown in FIG. 6 is an example for the e-mail function, in which commands such as "start reading", "read next mail", "read previous mail", and "end reading" are registered.
[0036]
First, when the voice command registration button provided on the input device 20 is pushed (command voice registration start button operation) (step B1), the CPU 10 reads the voice command registration order list 32 from the recording device 16 and stores it in the memory 22 (step B2). Then, the CPU 10 takes from the voice command registration order list 32, in a predetermined order, the command for which a voice is to be input (step B3). The predetermined order here is the order of registration in the list, starting from the head. At this time, the CPU 10 notifies the user of the command awaiting a voice, for example by showing the command taken from the list on the display device 12.
[0037]
Then, the user inputs a command voice through the voice input device 18 (step B4).
[0038]
When there is a push operation on the voice command registration button (command voice registration end button operation), the CPU 10 ends the input of the command voice (step B5).
[0039]
Then, the CPU 10 stores the command acquired from the voice command registration order list 32 in the operation data table 30 in association with the command voice input in step B4 for this command (step B6).
[0040]
If the voice input for all the commands registered in the voice command registration order list 32 has not been completed, the CPU 10 acquires the next registered command from the voice command registration order list 32, and performs the same operation as described above. Then, a voice is input and stored in the operation data table 30 (steps B3 to B7).
[0041]
In this way, the commands registered in the voice command registration order list 32 are presented to the user one after another, and the user inputs a voice corresponding to each command in turn. Therefore, even when the user does not know the operations for executing the functions provided in the electronic device, for example immediately after purchasing it, the user can easily input the voices for the commands that execute those functions.
[0042]
In the process shown in FIG. 5, a voice is input for all the commands registered in the voice command registration order list 32. However, when the user does not input a voice for the command notified as the voice input target, the voice input for this command can be omitted and the next command registered in the voice command registration order list 32 can be targeted. This makes it possible to omit voice input for functions that the user does not intend to operate by voice, thereby simplifying the operation data table creation process.
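The registration loop of FIG. 5 (steps B3 to B7), including the option of skipping a command, can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_operation_data_table`, the `record_voice` callback, and the use of a string label in place of a recorded voice pattern are all hypothetical and do not come from the specification.

```python
from typing import Callable, Dict, List, Optional


def build_operation_data_table(
    registration_order_list: List[str],
    record_voice: Callable[[str], Optional[str]],
) -> Dict[str, str]:
    """Map each recorded command voice to its command, in list order."""
    operation_data_table: Dict[str, str] = {}
    for command in registration_order_list:    # step B3: next command from the head
        voice = record_voice(command)          # steps B4-B5: user inputs a voice
        if voice is None:                      # no voice input: skip this command
            continue
        operation_data_table[voice] = command  # step B6: store the association
    return operation_data_table                # list exhausted (step B7)
```

Keying the table by voice rather than by command keeps the later lookup (voice in, command out) a single dictionary access; a real device would store voice patterns rather than labels.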
[0043]
Further, among the commands registered in the voice command registration order list 32, the commands for which a voice is to be input may be limited according to an instruction from the user. For example, the plurality of commands registered in the voice command registration order list 32 are classified into a plurality of groups, and the user selects the group of commands to be registered in the operation data table 30. The electronic device displays a group list on the display device 12 and allows the user to select a group from the list by operating the input device 20. The CPU 10 stores the selected group and, when executing the operation data table creation process, acquires only the commands included in that group from the voice command registration order list 32, so that the user can input a voice for each of those commands.
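Restricting registration to a user-selected group can be sketched as a filter over the registration order list. The group names and the `commands_for_group` helper below are hypothetical; the specification does not say how groups are represented.

```python
from typing import Dict, List, Set


def commands_for_group(
    registration_order_list: List[str],
    groups: Dict[str, Set[str]],
    selected_group: str,
) -> List[str]:
    """Return only the commands of the selected group, keeping list order."""
    members = groups.get(selected_group, set())
    return [c for c in registration_order_list if c in members]


# Hypothetical grouping of commands; "mail" and "player" are example names.
order_list = ["start reading aloud", "play", "read next mail", "stop"]
groups = {
    "mail": {"start reading aloud", "read next mail"},
    "player": {"play", "stop"},
}
mail_commands = commands_for_group(order_list, groups, "mail")
```

The filtered list can then be passed to the registration loop in place of the full list.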
[0044]
Note that which one of the operation data table creation processes shown in FIGS. 2 and 5 is to be executed by the electronic device may be determined in advance, or the user may be able to arbitrarily specify one of them.
[0045]
Next, a voice input operation process for actually operating the electronic device by voice input using the operation data table 30 will be described with reference to the flowchart shown in FIG. 7.
[0046]
In the present embodiment, it is assumed that the user can arbitrarily specify, for example with a predetermined button provided on the input device 20, whether the voice input operation process is executed so that the electronic device can be operated by voice input.
[0047]
When execution of the voice input operation process is instructed, that is, when voice input is to be used, the CPU 10 determines whether a registered voice operation exists, that is, whether a command and a command voice associated with it have been registered in the operation data table 30 (step C1).
[0048]
When a registered voice operation exists in the operation data table 30 and a voice is input from the voice input device 18 (step C2), the CPU 10 records the input voice for a predetermined operation time unit (step C3). Note that the voice input device 18 determines that a voice has been input when a voice at or above a predetermined volume level is input.
[0049]
The CPU 10 compares the input voice for the operation time unit with the command voices (command voice patterns) registered in the operation data table 30 and determines whether the input voice exists in the operation data table 30 (step C4).
[0050]
Here, if no command voice corresponding to the input voice is registered in the operation data table 30, no processing is performed for the voice input, and the process returns to the voice input state (step C5, No).
[0051]
On the other hand, when it is determined that the input voice exists in the operation data table 30, the CPU 10 outputs the operation command recorded in the operation data table 30 in association with that voice and executes the processing corresponding to the command (step C6).
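Steps C1 to C6 can be sketched as a lookup of the input voice against the registered command voice patterns. The specification does not state how patterns are compared, so the feature tuples, the Euclidean distance, and the `max_distance` threshold below are assumptions chosen only to make the flow concrete.

```python
from typing import Callable, Dict, Optional, Tuple

Pattern = Tuple[float, ...]  # hypothetical feature vector for one voice


def match_command(
    input_pattern: Pattern, table: Dict[str, Pattern], max_distance: float
) -> Optional[str]:
    """Return the command whose registered pattern is closest, if close enough."""
    best_command, best_distance = None, max_distance
    for command, pattern in table.items():
        if len(pattern) != len(input_pattern):
            continue
        distance = sum((a - b) ** 2 for a, b in zip(input_pattern, pattern)) ** 0.5
        if distance <= best_distance:
            best_command, best_distance = command, distance
    return best_command


def voice_input_operation(
    input_pattern: Pattern,
    table: Dict[str, Pattern],
    execute: Callable[[str], None],
    max_distance: float = 0.5,
) -> bool:
    """Run one pass of the flow; return True if a command was executed."""
    if not table:                      # step C1: nothing registered
        return False
    command = match_command(input_pattern, table, max_distance)  # step C4
    if command is None:                # step C5, No: ignore the input
        return False
    execute(command)                   # step C6: run the associated command
    return True
```

A threshold is used so that unrelated sounds are ignored rather than forced onto the nearest registered command.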
[0052]
FIG. 8 shows an example of a situation in which the electronic device is operated by voice input. In FIG. 8, it is assumed, for example, that the user cannot directly operate the buttons provided on the electronic device because the user is holding a strap in a train with the electronic device kept in a clothes pocket or the like.
[0053]
Further, it is assumed that the bone conduction microphone provided as the voice input device 18 and the headphones (or earphones) provided as the voice output device 14 are configured as an integrated device. That is, since both the bone conduction microphone and the headphones (earphones) are used by being worn on the ears, one device can be shared for voice input and voice output by integrating them. In this way, by integrating the headphone (earphone) and the bone conduction microphone, handling can be facilitated for the user.
[0054]
Here, a situation is assumed in which an e-mail is read aloud using the e-mail reading function provided in the electronic device and the user listens to its content through the voice output device 14 (headphones (earphones)).
[0055]
First, the electronic device is instructed to execute the e-mail reading function. It is assumed that the e-mail reading function is activated by inputting a predetermined command voice registered in advance. When the e-mail reading function is activated, for example, a voice message “Do you want to read the next mail?” is output from the voice output device 14 (headphones (earphones)).
[0056]
In response to this message, the user engages the teeth once to input the voice “click!” registered for the operation command “read next mail”. The sound generated by engaging the teeth is picked up by the voice input device 18 (bone conduction microphone), and the operation command “read next mail” corresponding to the input voice (command voice) is determined and executed.
[0057]
Similarly, reading of the mail can be ended by engaging the teeth twice to input the voice registered for the operation command “end reading”. Also, by engaging the teeth three times to input the voice registered for the operation command “start reading”, the contents of the mail can be read aloud from the beginning.
Thus, a function provided in the electronic device can be operated by voice input.
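The example above (one, two, or three tooth clicks selecting different commands) can be illustrated by counting bursts in the picked-up signal. This is a simplified sketch, not the matching method of the embodiment: the amplitude `threshold`, the `min_gap` parameter, and the `count_clicks` helper are hypothetical.

```python
from typing import Dict, Optional, Sequence

# Mapping taken from the example: once, twice, and three times.
CLICK_COMMANDS: Dict[int, str] = {
    1: "read next mail",
    2: "end reading",
    3: "start reading",
}


def count_clicks(samples: Sequence[float], threshold: float, min_gap: int) -> int:
    """Count bursts at or above threshold separated by at least min_gap quiet samples."""
    clicks = 0
    quiet = min_gap  # treat the start of the recording as preceded by silence
    for sample in samples:
        if abs(sample) >= threshold:
            if quiet >= min_gap:   # a new burst begins after enough silence
                clicks += 1
            quiet = 0
        else:
            quiet += 1
    return clicks


def command_for_clicks(samples: Sequence[float]) -> Optional[str]:
    """Look up the command for the number of clicks, or None if unregistered."""
    return CLICK_COMMANDS.get(count_clicks(samples, threshold=0.5, min_gap=2))
```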
[0058]
In the above description, the sound generated by engaging the teeth is used as the command voice, but other sounds may also be used.
[0059]
In this way, by registering an arbitrary command voice in association with an operation command in the operation data table 30, the function of the corresponding operation command can be executed by inputting that command voice. Further, since the voice registered in the operation data table 30 does not need to be a specific word such as a command name, it can be a voice suited to the situation in which the user uses the electronic device. For example, for use in a place where the user cannot speak normally with the mouth open because there are people around, such as in a train, the sound generated when the teeth are engaged can be registered. For use in a situation where the user can speak with the mouth open, a normal voice can be registered.
[0060]
Next, a function for improving the quality of voice input from the voice input device 18 and ensuring the designation of an operation command by voice input will be described.
[0061]
The voice input device 18 digitizes the voice picked up by the microphone and performs signal processing with a digital signal processing circuit. The digital signal processing circuit uses, for example, a digital FIR (finite impulse response) filter circuit as shown in FIG. 9A to extract voice in a predetermined frequency band. The output of the digital FIR filter circuit shown in FIG. 9A is expressed as shown in FIG. 9B. The frequency band of the output signal (voice) of the digital FIR filter circuit can be set by changing the values of the coefficients shown in FIG. 9.
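In its standard form, a digital FIR filter computes each output sample as a weighted sum of the current and previous input samples, y[n] = h[0]·x[n] + h[1]·x[n-1] + ... + h[K-1]·x[n-K+1]. The sketch below implements that general form; since FIG. 9 is not reproduced here, the number of taps and the coefficient values shown are assumptions, and `fir_filter` is a hypothetical name.

```python
from typing import List, Sequence


def fir_filter(x: Sequence[float], h: Sequence[float]) -> List[float]:
    """Apply FIR coefficients h to input x; samples before the start are zero."""
    y: List[float] = []
    for n in range(len(x)):
        acc = 0.0
        for k, coefficient in enumerate(h):
            if n - k >= 0:
                acc += coefficient * x[n - k]
        y.append(acc)
    return y


# Example: a two-tap moving average smooths the input.
smoothed = fir_filter([2.0, 4.0, 6.0], [0.5, 0.5])
```

Changing the values in `h` changes which frequencies pass, which is the property the filter setting relies on.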
[0062]
FIG. 10 shows an example of the frequency characteristic of the filter.
For example, as (Example 1), a frequency band in which signals of a certain power or more are distributed in the spectrum of a general human voice can be determined in advance, and the filter can be designed to pass voice signals in that band. The spectrum of the human voice has, for example, a peak at a certain frequency, as shown in FIG. 11. In such a case, the frequency band in which signals at or above the level S in FIG. 11, the reference for the certain power, are distributed is obtained, and the filter is designed so that signals in this frequency band are output. This makes it possible to extract signals at frequencies that are often contained in the human voice and to input voice commands with high accuracy.
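Selecting the pass band of (Example 1) amounts to finding the frequencies whose power is at or above the level S. A minimal sketch follows; the spectrum values are invented for illustration and are not data from FIG. 11, and `band_above_level` is a hypothetical helper.

```python
from typing import List, Optional, Tuple


def band_above_level(
    spectrum: List[Tuple[float, float]], level_s: float
) -> Optional[Tuple[float, float]]:
    """Return (lowest, highest) frequency whose power is at or above level_s."""
    passed = [frequency for frequency, power in spectrum if power >= level_s]
    if not passed:
        return None
    return min(passed), max(passed)


# Hypothetical (frequency in Hz, relative power) pairs with a single peak.
example_spectrum = [(100, 0.1), (200, 0.6), (400, 1.0), (800, 0.7), (1600, 0.2)]
pass_band = band_above_level(example_spectrum, level_s=0.5)
```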
[0063]
Further, as (Example 2), the frequency characteristics of the filter shown in FIG. 10 may be set (changed) in the digital FIR filter circuit shown in FIG. 9 according to the command voice for registration, that is, the command voice actually registered by the user.
[0064]
FIG. 12 shows a flowchart of a filter setting process for setting filter characteristics according to a command voice input by the user.
First, when the filter selection request button is operated (step D1), the CPU 10 causes the user to input a command voice for filter setting through the voice input device 18. Then, the CPU 10 records the command voice input through the voice input device 18 until the command voice input end button is operated (steps D2 and D3).
[0065]
When the command voice input end button is operated to indicate that input of the command voice for filter setting is complete, the CPU 10 detects the frequency distribution of the input voice (step D4) and, based on the detected frequency distribution, sets the coefficients of the digital FIR filter circuit in the voice input device 18 (digital signal processing circuit). That is, the frequency band from which the voice input device 18 extracts voice is set (step D5).
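Steps D4 and D5 can be sketched as: estimate the power spectrum of the recorded command voice, pick the band that holds most of the power, and derive band-pass FIR coefficients for it. The specification does not say how the coefficients are computed; the naive DFT, the "fraction of peak power" rule, and the Hamming-windowed sinc design below are assumptions swapped in for illustration.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def power_spectrum(samples: Sequence[float], sample_rate: float) -> List[Tuple[float, float]]:
    """Naive DFT: (frequency in Hz, power) for bins 0 .. N/2 (step D4)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))
        spectrum.append((k * sample_rate / n, abs(acc) ** 2))
    return spectrum


def dominant_band(spectrum: List[Tuple[float, float]], fraction: float) -> Tuple[float, float]:
    """Lowest and highest frequency with power >= fraction of the peak power."""
    peak = max(power for _, power in spectrum)
    kept = [frequency for frequency, power in spectrum if power >= fraction * peak]
    return min(kept), max(kept)


def bandpass_coefficients(
    low_hz: float, high_hz: float, sample_rate: float, num_taps: int
) -> List[float]:
    """Hamming-windowed sinc band-pass (difference of two low-pass filters), step D5."""
    middle = (num_taps - 1) / 2
    coefficients = []
    for i in range(num_taps):
        t = i - middle
        if t == 0:
            ideal = 2 * (high_hz - low_hz) / sample_rate
        else:
            ideal = (
                math.sin(2 * math.pi * high_hz * t / sample_rate)
                - math.sin(2 * math.pi * low_hz * t / sample_rate)
            ) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (num_taps - 1))
        coefficients.append(ideal * window)
    return coefficients
```

With a short filter the pass band edges are gradual, so in practice the detected band would likely be widened slightly before the coefficients are computed.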
[0066]
Thus, since the filter is set based on the frequency distribution of the command voice actually input by the user, voice in a frequency band appropriate to each user's voice is extracted. In general, speech recognition achieves a better recognition rate for a specific speaker than for unspecified speakers. Therefore, extracting voice with a filter adapted to a specific user's voice, as in the present invention, makes the designation of operation commands by voice input more reliable.
[0067]
In the above description, the voice picked up by the microphone of the voice input device 18 is digitized and the digital signal processing circuit performs signal processing including filtering, but the CPU 10 may instead perform the signal processing on the digital signal output from the voice input device 18. In this case, the CPU 10 realizes the function of the digital signal processing circuit, including the filter shown in FIG. 9, for the digital signal input from the voice input device 18.
[0068]
As described above, in the electronic device according to the present embodiment, the user can operate the electronic device by voice even when the hands cannot be used, such as when holding a strap in a train or holding a bag. In particular, by using a bone conduction microphone, commands can be input and executed without speaking aloud, so the electronic device can be operated even in an environment where people are nearby and the user cannot speak.
[0069]
Although the bone conduction microphone has been described as an ear bone microphone worn on the ear, another type of microphone capable of picking up bone-conducted sound, not worn on the ear, may be used.
[0070]
The command voices registered in the operation data table 30 do not need to target commands common to all the functions provided in the electronic device. For example, a dedicated operation data table 30 may be prepared for each application so that each application can be operated by its own command voices.
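Keeping a dedicated operation data table per application can be sketched as a nested lookup keyed first by application and then by command voice. The application names, the voice labels, and the `resolve_command` helper are hypothetical.

```python
from typing import Dict, Optional

# One operation data table per application; the same voice may map to
# different commands depending on which application is active.
APPLICATION_TABLES: Dict[str, Dict[str, str]] = {
    "mail": {"click": "read next mail", "click-click": "end reading"},
    "player": {"click": "play", "click-click": "stop"},
}


def resolve_command(
    tables: Dict[str, Dict[str, str]], active_application: str, voice: str
) -> Optional[str]:
    """Look up the command for a voice in the active application's table."""
    return tables.get(active_application, {}).get(voice)
```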
[0071]
If the present invention is applied not only at the level of each application but also to, for example, an OS-level program, the application to be executed can be selected by voice input, and even when different applications are installed for each individual, voice commands that differ for each individual can be associated with each application.
[0072]
Further, the present invention is not limited to the above-described embodiment and can be variously modified at the implementation stage without departing from its gist. The functions executed in the above-described embodiment may also be combined as appropriate wherever possible. The embodiment described above includes inventions at various stages, and various inventions can be extracted by appropriately combining the plurality of disclosed constituent features. For example, even if some components are deleted from all the components shown in the embodiment, or the order of processing is changed, the configuration from which those components are deleted can be extracted as an invention, provided that the effect is obtained.
[0073]
Further, the processing described in the above embodiment can be written, as a program executable by a computer, to a recording medium such as a magnetic disk (flexible disk, hard disk, etc.), an optical disk (CD-ROM, DVD, etc.), or a semiconductor memory, and provided to various devices. It can also be transmitted via a communication medium and provided to various devices. A computer that realizes the electronic device reads the program recorded on the recording medium, or receives the program via the communication medium, and executes the above-described processing with its operation controlled by the program.
[0074]
[Effect of the invention]
As described above, according to the present invention, a voice is arbitrarily input and registered in association with a command specified, for example, by the user's operation of the electronic device, and the command can then be executed to operate the electronic device by inputting the registered voice. In particular, when voice is input through a bone conduction microphone, the vibration of the ear bones produced by engaging the teeth, clenching the teeth, humming without opening the mouth, or the like is picked up without a normal open-mouth utterance, so any operation can be executed by voice without disturbing the people nearby.
[Brief description of the drawings]
FIG. 1 is an exemplary block diagram illustrating the configuration of an electronic device according to an embodiment.
FIG. 2 is a flowchart for explaining operation data table creation processing for creating an operation data table in the embodiment.
FIG. 3 is an exemplary view for explaining an operation by a user in an operation data table creation process according to the embodiment.
FIG. 4 is a view showing an example of a data structure of an operation data table according to the embodiment.
FIG. 5 is a flowchart illustrating an operation data table creation process according to the embodiment.
FIG. 6 is an exemplary view showing an example of a voice command registration order list in the embodiment.
FIG. 7 is an exemplary flowchart for explaining a voice input operation process for operating the electronic device by voice input using the operation data table 30 in the embodiment.
FIG. 8 is an exemplary view showing an example of a situation in which the electronic device is operated by voice input in the embodiment.
FIG. 9 is a diagram showing a digital FIR filter circuit according to the embodiment.
FIG. 10 is a diagram illustrating frequency characteristics of a filter.
FIG. 11 is a diagram showing a frequency distribution of a general human input voice.
FIG. 12 is a flowchart illustrating a filter setting process according to the embodiment.
[Explanation of symbols]
10 ... CPU
12 ... Display device
14 ... Voice output device
16 ... Recording device
16a ... Recording medium
18 ... Voice input device
20 ... Input device
22 ... Memory
24 ... Communication device

Claims

An electronic device that executes processing according to an input command, comprising:
voice input means for inputting a voice;
command specifying means for specifying a command from among a plurality of commands;
command voice input means for inputting, from the voice input means, a voice corresponding to the command specified by the command specifying means;
command table recording means for storing the command specified by the command specifying means and the voice input by the command voice input means in association with each other;
voice determining means for determining whether the voice input by the voice input means exists among the voices stored by the command table recording means; and
command execution means for executing, when the voice determining means determines that the voice input by the voice input means exists among the voices stored by the command table recording means, the command recorded by the command table recording means in association with that voice.

The electronic device according to claim 1, wherein the command specifying means includes:
command storage means for storing a plurality of commands; and
command selection means for selecting commands in a predetermined order from the plurality of commands stored by the command storage means,
and specifies a command by the selection made by the command selection means.

The electronic device according to claim 1, further comprising voice extracting means for extracting voice in a predetermined frequency band from the voice input by the voice input means.

The electronic device according to claim 1, further comprising:
frequency distribution detecting means for detecting the frequency distribution of the voice input by the voice input means; and
frequency band setting means for setting, based on the frequency distribution detected by the frequency distribution detecting means, the frequency band from which the voice extracting means extracts voice.

The electronic device according to claim 1, wherein the voice input means is a bone conduction microphone.

The electronic device according to claim 5, further comprising voice output means for outputting voice, wherein the voice output means and the bone conduction microphone are integrated.