JP4164568B2 - Character information input device, character information input method, and recording medium

Info

Publication number: JP4164568B2
Application number: JP2001305203A
Authority: JP
Inventors: 武志蔵田; 隆史大隈; 正克興梠; 丈和加藤; 勝彦坂上
Original assignee: National Institute of Advanced Industrial Science and Technology AIST
Current assignee: National Institute of Advanced Industrial Science and Technology AIST
Priority date: 2001-10-01
Filing date: 2001-10-01
Publication date: 2008-10-15
Anticipated expiration: 2021-10-01
Also published as: JP2003108923A

Description

【０００１】
【発明の属する技術分野】
本発明は、コンピュータの文字情報入力装置に関し、特に、着用型コンピュータ、携帯型コンピュータ、携帯電話、ＰＤＡ、リモコン、デジタルカメラなどの装置のための文字情報入力装置および文字情報入力方法並びに文字情報入力方法をコンピュータにより機能させるためのプログラムを記録した記録媒体に関する。
【０００２】
【従来の技術】
従来、コンピュータに文字や操作のためのコマンドを入力するために、もっぱら、QWERTYキーボードに代表されるキーボード装置や、マウス、トラックボール、タッチパッドなどのポインティング装置など卓上で使用することを前提とした入力装置を用いた文字情報入力方法や、文字を読み上げて、マイクで音声を録音し、音声認識により文字情報を得ることを特徴とする文字情報入力方法が用いられている。
【０００３】
特に、着用型コンピュータ、携帯型コンピュータ、携帯電話、ＰＤＡ、家電のリモコン、デジタルカメラなどの携帯端末（携帯装置）に文字や操作のためのコマンドを入力するためには、テンキーに代表される十数個のキーからなる携帯キーボード装置、ボタン装置、ジョグシャトル装置、ダイヤル装置、タブレット装置、ペン装置など携帯して使用することを前提とした入力装置を用いた文字情報入力方法が用いられている。
【０００４】
【発明が解決しようとする課題】
しかしながら、このように、前記卓上で使用することを前提とした入力装置を用いた従来の文字情報入力方法を用いて文字や操作のためのコマンドを迅速にしかも確実にかつ簡単に入力するためには、該装置および該方法を長期間使用し使用技術を習得する必要がある。
【０００５】
また、移動時、外出時、歩行と一時停止を繰り返しながらの行動時、作業時などにおいては、前記卓上で使用することを前提とした入力装置を携帯および使用することは困難である。
【０００６】
さらに、前記携帯して使用することを前提とした入力装置を用いた文字情報入力方法では、その使用技術を習得しても、キーの数の少なさや装置の持ちづらさなどが要因となる入力速度の物理的限界により、文字や操作のためのコマンドを迅速にしかも確実にかつ簡単に入力することは困難である。
【０００７】
さらに、前記携帯して使用することを前提とした入力装置をポケットやかばんなどに収納してある場合、例えば入力する文字数やコマンド数が少ないとしても、使用前に該入力装置を取り出す必要があるため、それを迅速にしかも確実にかつ簡単に入力することは困難である。
【０００８】
また、前記音声認識を用いた文字情報入力方法は、周辺が騒がしい場合、または逆に周辺が静かで声を出しづらい場合、または読み方がわからない場合に文字を読み上げることができない。
【０００９】
本発明は、上述の点に鑑みてなされたもので、その目的は、従来の卓上で使用または携帯して使用することを前提とした入力装置または音声認識による文字情報入力方法を用いるばかりではなく、カメラおよび表示器および使用者の手と指を用いて、屋内や屋外に表記の文字または手書きの文字を文字コードデータに変換することで、文字情報を迅速にしかも確実にかつ簡単に入力できるコンピュータの文字情報入力装置、特に、着用型コンピュータ、携帯型コンピュータ、携帯電話、ＰＤＡ、リモコン、デジタルカメラなどの装置のための文字情報入力装置および文字情報入力方法並びに文字情報入力方法をコンピュータにより機能させるためのプログラムを記録した記録媒体を提供することにある。
【００１０】
【課題を解決するための手段】
上記目的を達成するために、請求項１の文字情報入力装置の発明は、カメラから撮影画像を入力する画像入力手段と、表示器の画面に画像を表示する画像表示手段と、前記撮影画像に写された手と指の位置および指示姿勢および選択姿勢を認識する手指画像認識手段と、該手指画像認識手段で認識された手と指の位置および姿勢に基づいてカーソルやポインタなどで表現される指示記号および選択記号の前記表示器の画面上での位置を決定する記号位置決定手段と、該記号位置決定手段で決定された前記表示器の画面上での位置に前記指示記号を表示する指示記号表示手段と、前記指示姿勢および前記選択姿勢により画像範囲を指定する範囲指定手段と、該範囲指定手段で指定された範囲の画像から文字を認識する文字認識手段と、該文字認識手段で認識した文字を文字コードデータに変換する文字コードデータ変換手段と、該文字コードデータ変換手段で変換された文字コードデータを記憶媒体に記憶する文字コードデータ記憶手段とを具備することを特徴とする。
【００１１】
さらに、前記画像入力手段で入力された前記撮影画像を圧縮または非圧縮で無線または有線の通信回路を介して送信する入力画像送信手段と、該圧縮または非圧縮画像を受信し圧縮画像の場合は展開する入力画像受信手段とを有することができる。
【００１２】
さらに、前記範囲指定手段で指定された範囲の画像を圧縮または非圧縮で無線または有線の通信回路を介して送信する範囲画像送信手段と、該圧縮または非圧縮範囲画像を受信し圧縮画像の場合は展開する範囲画像受信手段とを有することができる。
【００１３】
さらに、前記画像入力手段で入力、または無線や有線の通信回線を介して受信した画像を記憶媒体に記憶する画像記憶手段と、該画像記憶手段で記憶した画像を前記表示器に表示する画像表示手段とを有することができる。
【００１４】
さらに、前記文字コードデータ記憶手段で記憶した文字コードデータに対応する文字を、文字フォントデータを用いて前記表示器に表示する文字表示手段を有することができる。
【００１５】
さらに、前記文字コードデータ記憶手段で記憶した文字コードデータを、インターネットやデータベースの検索キーワードまたはオペレーションシステム（ＯＳ）やアプリケーションのコマンドや入力文字または文字認識に使われた画像の付加情報として用いることができる。
【００１６】
さらに、前記カメラに、使用者の身体の一部に直接着用または身体の一部に着用するものに装備または携帯装置に装備したカメラを用いることができる。
【００１７】
さらに、前記表示器に、使用者の視野に入るように頭部に直接着用または頭部に着用するものに装備または腕に直接着用または腕に着用するものに装備した表示器を用いることができる。
【００１８】
さらに、前記範囲指定手段において、前記指示姿勢を認識した後に前記選択姿勢を認識してから、再び前記指示姿勢を認識するまでの間の前記選択記号の位置の集合により表現される矩形または閉ループまたは直線分または曲線分などの図形に基づいて画像範囲を指定することができる。
【００１９】
さらに、前記文字認識手段で文字認識の結果の候補が複数存在する場合は、前記文字コードデータ変換手段で該候補それぞれを文字コードデータに変換する手段と、該文字コードデータに対応する文字を、文字フォントデータを用いて前記表示器に表示する文字認識候補表示手段と、表示された前記候補から１つを前記記号位置決定手段で指示し前記選択姿勢で選択する文字認識結果選択手段と、該文字認識結果選択手段で選択した候補の文字コードデータを前記文字コードデータ記憶手段により記憶媒体に記憶する手段とを有することができる。
【００２０】
さらに、前記文字認識手段で文字認識の結果の候補に正解が含まれない場合は、前記表示器に仮想キーボードを表示する仮想キーボード表示手段と、前記指示姿勢および前記選択姿勢により仮想キーボードを操作する仮想キーボード操作手段と、前記文字認識候補表示手段および前記文字認識結果選択手段で前記候補から１つを選択する手段と、該選択候補を前記文字表示手段で表示し該仮想キーボード操作手段で修正する文字認識結果修正手段と、該文字認識結果修正手段で修正した文字の文字コードデータを前記文字コードデータ記憶手段により記憶媒体に記憶する手段とを有することができる。
【００２１】
上記目的を達成するために、請求項１２の文字情報入力装置の文字情報入力方法の発明は、文字情報入力装置の文字情報入力方法において、カメラから撮影画像を入力するステップと、表示器の画面に画像を表示するステップと、前記撮影画像に写された手と指の位置および指示姿勢および選択姿勢を認識するステップと、認識された手と指の位置および姿勢に基づいてカーソルやポインタなどで表現される指示記号および選択記号の前記表示器の画面上での位置を決定するステップと、決定された前記表示器の画面上での位置に前記記号を表示するステップと、前記指示姿勢および前記選択姿勢により画像範囲を指定するステップと、指定された範囲の画像から文字を認識するステップと、認識した文字を文字コードデータに変換するステップと、変換された文字コードデータを記憶媒体に記憶するステップとを有することを特徴とする。
【００２２】
さらに、前記撮影画像を圧縮または非圧縮で無線または有線の通信回路を介して送信するステップと、該圧縮または非圧縮画像を受信し圧縮画像の場合は展開するステップとを有することができる。
【００２３】
さらに、前記指定された範囲の画像を圧縮または非圧縮で無線または有線の通信回路を介して送信するステップと、該圧縮または非圧縮範囲画像を受信し圧縮画像の場合は展開するステップとを有することができる。
【００２４】
さらに、前記撮影画像、または無線や有線の通信回線を介して受信した画像を記憶媒体に記憶するステップと、該記憶した画像を前記表示器に表示するステップとを有することができる。
【００２５】
さらに、前記記憶した文字コードデータに対応する文字を、文字フォントデータを用いて前記表示器に表示するステップを有することができる。
【００２６】
さらに、前記文字コードデータを、インターネットやデータベースの検索キーワードまたはＯＳやアプリケーションのコマンドや入力文字または文字認識に使われた画像の付加情報として用いることができる。
【００２７】
さらに、前記カメラに、使用者の身体の一部に直接着用または身体の一部に着用するものに装備または携帯装置に装備したカメラを用いることができる。
【００２８】
さらに、前記表示器に、使用者の視野に入るように頭部に直接着用または頭部に着用するものに装備または腕に直接着用または腕に着用するものに装備した表示器を用いることができる。
【００２９】
さらに、前記画像範囲を指定するステップにおいて、前記指示姿勢を認識した後に前記選択姿勢を認識してから、再び前記指示姿勢を認識するまでの間の前記選択記号の位置の集合により表現される矩形または閉ループまたは直線分または曲線分などの図形に基づいて画像範囲を指定することができる。
【００３０】
さらに、前記文字認識の結果の候補が複数存在する場合は、該候補それぞれを文字コードデータに変換するステップと、該文字コードデータに対応する文字を、文字フォントデータを用いて前記表示器に表示するステップと、表示された前記候補から１つを前記指示記号で指示し前記選択姿勢で選択するステップと、前記文字コードデータを記憶するステップで選択した候補の文字コードデータを記憶媒体に記憶するステップとを有することができる。
【００３１】
さらに、前記文字認識の結果の候補に正解が含まれない場合は、前記表示器に仮想キーボードを表示するステップと、前記指示姿勢および前記選択姿勢により仮想キーボードを操作するステップと、前記文字認識の候補から１つを選択するステップと、該選択候補を前記表示器で表示し前記仮想キーボードで修正するステップと、修正した文字の文字コードデータを記憶媒体に記憶するステップとを有することができる。
【００３２】
上記目的を達成するため、請求項２３の記録媒体の発明は、文字情報入力装置の文字情報入力方法をコンピュータによって機能させるためのプログラムの記録媒体であって、プログラムはコンピュータに、カメラから撮影画像を入力させ、表示器の画面に画像を表示させ、前記撮影画像に写された手と指の位置および指示姿勢および選択姿勢を認識させ、認識された手と指の位置および姿勢に基づいて、カーソルやポインタなどで表現される指示記号および選択記号の前記表示器の画面上での位置を決定させ、決定された前記表示器の画面上での位置に前記記号を表示させ、前記指示姿勢および前記選択姿勢により画像範囲を指定させ、指定された範囲の画像から文字を認識させ、認識した文字を文字コードデータに変換させ、変換された文字コードデータを記憶媒体に記憶させることを特徴とする。
【００３３】
さらに、前記プログラムはコンピュータに、前記撮影画像を圧縮または非圧縮で無線または有線の通信回路を介して送信させ、該圧縮または非圧縮画像を受信し圧縮画像の場合は展開させることを特徴とする。
【００３４】
さらに、前記プログラムはコンピュータに、前記指定された範囲の画像を圧縮または非圧縮で無線または有線の通信回路を介して送信させ、該圧縮または非圧縮範囲画像を受信し圧縮画像の場合は展開させることを特徴とする。
【００３５】
さらに、前記プログラムはコンピュータに、前記撮影画像、または無線や有線の通信回線を介して受信した画像を記憶媒体に記憶させ、該記憶した画像を前記表示器に表示させることを特徴とする。
【００３６】
さらに、前記プログラムはコンピュータに、前記記憶した文字コードデータに対応する文字を、文字フォントデータを用いて前記表示器に表示させることを特徴とする。
【００３７】
さらに、前記プログラムはコンピュータに、前記文字コードデータを、インターネットやデータベースの検索キーワードまたはＯＳやアプリケーションのコマンドや入力文字または文字認識に使われた画像の付加情報として使用させることを特徴とする。
【００３８】
さらに、前記プログラムはコンピュータに、前記画像範囲を指定させる際に、前記指示姿勢を認識した後に前記選択姿勢を認識してから、再び前記指示姿勢を認識するまでの間の前記選択記号の位置の集合により表現される矩形または閉ループまたは直線分または曲線分などの図形に基づいて画像範囲を指定させることを特徴とする。
【００３９】
さらに、前記プログラムはコンピュータに、前記文字認識の結果の候補が複数存在する場合は、該候補それぞれを文字コードデータに変換させ、該文字コードデータに対応する文字を、文字フォントデータを用いて前記表示器に表示させ、表示された前記候補から１つを前記指示記号で指示させ前記選択姿勢で選択させ、前記文字コードデータを記憶する際に選択された候補の文字コードデータを記憶媒体に記憶させることを特徴とする。
【００４０】
さらに、前記プログラムはコンピュータに、前記文字認識の結果の候補に正解が含まれない場合は、前記表示器に仮想キーボードを表示させ、前記指示姿勢および前記選択姿勢により仮想キーボードを操作させ、前記文字認識の候補から１つを選択させ、該選択候補を前記表示器で表示し前記仮想キーボードで修正させ、修正した文字の文字コードデータを記憶媒体に記憶させることを特徴とする。
【００４１】
【発明の実施の形態】
以下、本発明の実施形態について、添付図面を参照して詳細に説明する。
【００４２】
図１は、本発明の実施形態の文字情報入力装置の構成を示すブロック図である。端末装置１０は、例えばＣＣＤカメラやＣＭＯＳカメラなどからなる画像入力部１０４と、例えば液晶パネルや有機ＥＬパネルやフィールドエミッションディスプレイパネルやＬＥＤや小型プロジェクタなどからなる表示部１０３と、例えば無線ＬＡＮ装置や携帯電話装置やＰＨＳ装置や赤外線通信装置などからなる通信部１０２と、画像データや文字コードデータや文字フォントデータやプログラムや文字認識用辞書や他のデータを記憶する、例えばＲＡＭやＲＯＭ、可搬性記録媒体などからなる記憶部１０１と、手指画像認識機能３００、記号位置決定機能３０１、範囲指定機能３０２、文字認識機能３０３、文字コードデータ変換機能３０４、画像圧縮展開機能３０５を含む画像処理、文字認識処理などの処理を行うとともに端末装置１０全体の制御を行う、例えばプロセッサ（ＣＰＵやＭＰＵやＤＳＰなど）やキャッシュメモリなどからなる情報処理制御部１００を備える。
【００４３】
なお、手指画像認識機能３００および文字認識機能３０３は、情報処理技術分野の当業者には周知の機能を用いて、実現することができる。手指画像認識機能３００では、例えば、事前に学習した色や輪郭形状などの画像特徴と撮影された入力画像に含まれるそれらの画像特徴とを確率統計的に組み合わせることで手と指を認識することができるし、他の一般的な方法でもよい。文字認識機能３０３では、例えば、事前にいくつものフォントの文字を濃淡の勾配特徴などに基づいて学習して文字の辞書を作るとともに単語辞書などの文字のつながりに関する辞書を用意し、入力画像に含まれる画像特徴をそれらの辞書に確率統計的に当てはめて文字を認識することができるし、他の一般的な方法でもよい。また、性能のよい文字認識ソフトも市販されているので、それを使ってもよい。
【００４４】
文字コードデータ変換機能３０４は、文字認識機能で認識された文字を、一般的なコード、例えば、ＥＵＣコードやＪＩＳコードやＳ−ＪＩＳコードやアスキーコードに変換することができる。もちろん、文字認識機能で認識された文字が、後処理、例えば、文字フォントデータを用いた表示器への文字の表示などで必要とされる文字コードで表現されている場合、文字コード変換機能は使わなくてもよい。
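As a rough illustration of the character code data conversion function 304 described above, the following Python sketch encodes a recognized string into the encodings the paragraph names (EUC, JIS, Shift-JIS, ASCII). The function name `to_character_code` and the sample strings are assumptions made for this example; the codec names are those of the Python standard library, not anything specified by the patent.

```python
def to_character_code(text: str, encoding: str) -> bytes:
    """Convert recognized characters into character code data (bytes)."""
    return text.encode(encoding)


# Python codec names assumed to correspond to EUC, JIS and Shift-JIS codes.
JAPANESE_CODECS = ("euc_jp", "iso2022_jp", "shift_jis")

recognized = "第１８会議室"  # sample text taken from the Fig. 8 example
encoded = {name: to_character_code(recognized, name) for name in JAPANESE_CODECS}

# Plain ASCII text needs no Japanese codec at all.
ascii_data = to_character_code("Room A", "ascii")
```

Decoding each byte string with the same codec returns the original text, which is a simple way to check that a chosen code can represent every recognized character.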
【００４５】
ところで、端末装置、特に、着用型コンピュータ、携帯型コンピュータ、携帯電話、ＰＤＡ、リモコン、デジタルカメラなどの携帯端末装置を、小型化、軽量化、省電力化、低コスト化する必要がある場合などにおいては、情報処理制御部の処理性能や記憶部の記憶容量を抑えることが有効である。端末装置と、例えば、画像処理サーバ、ホームコンピュータ、家電などのホスト装置とで通信して画像データや処理結果などを送受信し、手指画像認識機能３００、記号位置決定機能３０１、範囲指定機能３０２、文字認識機能３０３、文字コードデータ変換機能３０４などの処理の一部またはすべてをホスト装置に処理させることで、端末装置の機能を省き、情報処理制御部の処理性能や記憶部の記憶容量を抑えることができる。これは、携帯端末装置を、小型化、軽量化、省電力化、低コスト化するのに有効である。
【００４６】
図２は、本発明の実施形態の文字情報入力装置のホストの構成を示すブロック図である。ホスト装置２０は、端末装置１０の通信部１０２と通信する、例えば無線ＬＡＮ装置や携帯電話装置やＰＨＳ装置や赤外線通信装置やイーサネット装置や有線ＬＡＮ装置などからなる通信部２０２と、例えばＲＡＭやＲＯＭ、可搬性記録媒体などからなる記憶部２０１と、手指画像認識機能４００、記号位置決定機能４０１、範囲指定機能４０２、文字認識機能４０３、文字コードデータ変換機能４０４、画像圧縮展開機能４０５を含む画像処理、文字認識処理などの処理を行うとともにホスト装置２０全体の制御を行う、例えばプロセッサ（ＣＰＵやＭＰＵやＤＳＰなど）やキャッシュメモリなどからなる情報処理制御部２００を備える。もちろん、ホスト装置は並列コンピュータやＰＣクラスタなどの分散システムであってもよい。
【００４７】
手指画像認識機能３００と手指画像認識機能４００、記号位置決定機能３０１と記号位置決定機能４０１、範囲指定機能３０２と範囲指定機能４０２、文字認識機能３０３と文字認識機能４０３、文字コードデータ変換機能３０４と文字コードデータ変換機能４０４はそれぞれ同等の機能であるため、端末装置１０またはホスト装置２０のどちらかにそれぞれの機能が備わっていればよい。例えば、文字認識機能４０３がホスト装置に備わっていれば文字認識機能３０３を端末装置に備える必要はないため、端末装置１０で必要とされる処理能力やプログラムの記憶容量を抑えることができ、情報処理制御部１００や記憶部１０１を、小型化、軽量化、省電力化、低コスト化するのに有効である。
【００４８】
図３は、本発明の実施形態の文字情報入力装置の通信システムの構成を示すブロック図である。基地局３０は例えば無線ＬＡＮ装置や携帯電話装置やＰＨＳ装置や赤外線通信装置などからなり、端末装置１０と交換網３１を接続する。交換網３１は、公衆網、構内網どちらでもよく、ホスト装置２０は端末装置１０の通信相手となる。もちろん、現在のリモコンと家電のように、端末装置１０とホスト装置２０とが赤外線通信や無線通信などで直接通信してもよい。もちろん、有線で通信してもよい。
【００４９】
なお、端末装置１０とホスト装置２０とで画像を送受信する場合、例えば、ＪＰＥＧやＭＯＴＩＯＮＪＰＥＧやＭＰＥＧやＭＰＥＧ２やＭＰＥＧ４やＭＰＥＧ７などＤＣＴ変換やウェーブレット変換や動き補償などの技術を用いた画像圧縮展開機能３０５および画像圧縮展開機能４０５により、画像を圧縮して通信部１０２、通信部２０２を介して送受信することで、伝送容量を抑えることができる。また、手指画像認識機能、記号位置決定機能、範囲指定機能を端末装置１０に備え、文字認識機能、文字コードデータ変換機能をホスト装置２０に備えた場合、指定された範囲の画像のみを伝送することで、さらに伝送容量を抑えることができる。
【００５０】
図４は、画像入力部１０４を構成するカメラの着用位置を示す模式図である。カメラ５０〜５３は、それぞれ、肩、胸部、耳に着用補助器具などを用いて着用、またはサングラスや眼鏡などのフレームに装備することで着用したカメラの模式図である。
【００５１】
カメラ５０は、ショルダーバッグやリュックサックなどの鞄類の肩にかかる部分または服の肩にあたる部分に直接装備またはアタッチメントなどの補助器具により着脱可能な状態で装備することで図４のような位置に着用できる。
【００５２】
カメラ５１は、ショルダーバッグやリュックサックなどの鞄類の胸部にかかる部分または服の胸にあたる部分に直接装備またはアタッチメントなどの補助器具により着脱可能な状態で装備することで図４のような位置に着用できる。例えば、ブローチやペンダントなどにカメラを装備し胸部に取り付けることなどが可能である。
【００５３】
カメラ５２は、アタッチメントなどの補助器具により着脱可能な状態で装備することで図４のような位置に着用できる。例えば、イヤリングやピアスなどにカメラを装備し耳に取り付けることなどが可能である。
【００５４】
カメラ５３は、眼鏡やサングラスのフレームにカメラを装備することで図4のような位置に着用できる。
【００５５】
カメラ５０、５１は、体の前方に手と指が存在する場合にそれらを撮影するのに有効である。カメラ５２、５３は、顔の前方に手と指が存在する場合にそれらを撮影するのに有効である。もちろん、手と指が写る位置であれば身体の上述以外の部分、例えば被写体となる手と指とは逆の腕に腕時計を着用しその腕時計にカメラを装備してもよいし、着用はせず携帯装置に装備したカメラを用いてもよい。
【００５６】
図５は、表示部１０３を構成する表示器の着用位置を示す模式図である。表示器６０〜６２は、それぞれ、使用者の視野に入るように頭部に直接着用または頭部に着用するものに装備したものである。このように着用することで、ハンズフリーで表示器を見ることができ、手と指で指示・選択しながら表示を確認することが容易になる。
【００５７】
表示器６０は、額と側頭部を支点とする留め具に装着し着用する表示器であり、同様に着用できる表示器はすでに市販されている。
【００５８】
表示器６１は、耳と後頭部を支点とする留め具に装着し着用する表示器である。ヘッドホンの留め具に表示器を装着しても同様に着用できる。
【００５９】
表示器６２は、眼鏡やサングラスなどのレンズ前面に固定されるように装備、またはレンズ内部に装備した表示器であり、これと同様に装備できる表示器はすでに市販されている。
【００６０】
もちろん、使用者の視野に入るようにできるのであれば、身体の他の部分、例えば腕時計や腕時計型の携帯情報端末の表示器を流用することで、その表示器を腕に着用することができるし、着用はせず携帯装置に装備した表示器を用いてもよい。
【００６１】
図６は、本発明の実施形態の文字情報入力装置を使用する際に、画像入力部１０４で撮影する右手と指の姿勢の例を示す模式図である。ここでは、手指画像認識機能３００または４００で認識する手と指の姿勢の例として、１本指を使用する姿勢５０２、５０３と、２本指を使用する姿勢５００，５０１を例としてとりあげる。
【００６２】
もちろん、左手でもよいし、他の姿勢により、指示と選択姿勢を認識してもよい。例えば、じゃんけんのパー、チョキ、グーのような姿勢を認識し、それぞれが指示姿勢なのか選択姿勢なのかを決めてもよい。
【００６３】
指示姿勢５００は、人差し指と親指を伸ばして、それらの指先を離した姿勢を指示姿勢と決めた場合の例である。この際、記号位置決定機能３０１または記号位置決定機能４０１により決定される指示記号の位置は、例えば認識された2本の指の先を結ぶ直線の中間位置であってもよいし、人差し指の指先の位置でもよい。また、認識される指と記号位置との相対関係を使用者が使用しやすいように設定してもよい。
【００６４】
選択姿勢５０１は、指示姿勢５００の状態から、人差し指と親指の指先をくっつけた姿勢を選択姿勢と決めた場合の例である。この際、記号位置決定機能３０１または記号位置決定機能４０１により決定される選択記号の位置は、例えば認識された２本の指の接する部分の中心であってもよいし、人差し指の指先の位置でもよい。また、認識される指と記号位置との相対関係を使用者が使用しやすいように設定してもよい。
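The two-finger postures 500 and 501 can be told apart by how far the index fingertip is from the thumb tip, and the symbol position can be taken as the midpoint between them. The sketch below is one minimal way to express that, assuming a hand recognizer has already produced fingertip pixel coordinates. The function names and the 15-pixel `touch_threshold` are illustrative assumptions, not values from the patent.

```python
from math import hypot


def classify_two_finger_posture(index_tip, thumb_tip, touch_threshold=15.0):
    """Return 'select' when the fingertips touch, otherwise 'point'.

    index_tip and thumb_tip are (x, y) pixel coordinates; touch_threshold
    is an assumed distance below which the tips count as touching.
    """
    distance = hypot(index_tip[0] - thumb_tip[0], index_tip[1] - thumb_tip[1])
    return "select" if distance <= touch_threshold else "point"


def symbol_position(index_tip, thumb_tip):
    """Midpoint of the two fingertips, one of the options in the text."""
    return ((index_tip[0] + thumb_tip[0]) / 2, (index_tip[1] + thumb_tip[1]) / 2)
```

Using the index fingertip alone as the symbol position, as the text also allows, would simply return `index_tip` instead of the midpoint.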
【００６５】
指示姿勢５０２は、人差し指を伸ばした姿勢を指示姿勢と決めた場合の例である。この際、記号位置決定機能３０１または記号位置決定機能４０１により決定される指示記号の位置は、例えば認識された人差し指の指先の位置やそれに近い位置でもよい。また、認識される指と記号位置との相対関係を使用者が使用しやすいように設定してもよい。
【００６６】
選択姿勢５０３は、指示姿勢５０２の状態から、人差し指の第２関節を曲げた姿勢を選択姿勢と決めた場合の例である。もちろん、第３関節を曲げた姿勢を選択姿勢としてもよい。なお、第２関節を曲げる際に、第１関節が曲がってもよいし、第３関節を曲げる際に、第１、第２関節が曲がってもよい。この際、記号位置決定機能３０１または記号位置決定機能４０１により決定される選択記号の位置は、例えば認識された人差し指の画像上における最も上の位置でもよい。また、認識される指と記号位置との相対関係を使用者が使用しやすいように設定してもよい。
【００６７】
指示姿勢で指示した一点を選択姿勢で選択する場合は、選択姿勢が認識された時点からもっとも近い過去の時点で認識された指示姿勢に基づいて決定される指示記号位置を選択した位置とすることができる。もちろん、もっとも近い過去の時点ではなく、使用者が使用しやすいように数十から数百ミリ秒前の指示記号位置を選択した位置としてもよい。
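The rule above (use the most recent pointing-symbol position, or one from some tens to hundreds of milliseconds earlier) can be sketched with a time-stamped history of pointer positions. The data layout, the function name `pointer_position_at_selection`, and the `lookback_ms` parameter are assumptions made for this example.

```python
from bisect import bisect_right


def pointer_position_at_selection(history, select_time_ms, lookback_ms=0):
    """Pick the pointing-symbol position to treat as selected.

    history is a list of (timestamp_ms, (x, y)) entries sorted by time,
    recorded while the pointing posture was recognized. The entry used is
    the latest one at or before select_time_ms - lookback_ms.
    """
    target = select_time_ms - lookback_ms
    times = [t for t, _ in history]
    i = bisect_right(times, target)
    if i == 0:
        return None  # no pointing position old enough
    return history[i - 1][1]
```

A non-zero `lookback_ms` may help when the hand drifts slightly while the fingers close into the selection posture, since the position from just before the motion started is used instead.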
【００６８】
図７は、範囲指定機能３０２または範囲指定機能４０２により指定される矩形の例を示す模式図である。図７（ａ）では、指示姿勢５００を認識した後に選択姿勢５０１を認識してから、再び前記指示姿勢５００を認識するまでの間の選択記号位置の集合により表現される閉ループに矩形を当てはめ、当てはまった矩形を指定される画像範囲としている場合の例である。
【００６９】
閉ループは隣り合った記号位置同士を直線で結んだり、記号位置の集合にベジェ曲線やスプライン曲線を当てはめたりすることにより求められる。画像範囲を示す矩形のあてはめは、例えば、閉ループに内接するもっとも面積の大きい矩形や、外接するもっとも面積の小さい矩形を選ぶことにより可能である。もちろん、使用者が使用しやすいように閉ループと矩形との相対関係を設定してもよい。もちろん、画像範囲は矩形ではなく閉ループのままでもよい。
【００７０】
また、上述のような閉ループ的な記号位置の集合から実際には閉ループを求めず、いきなり矩形を当てはめてもよい。例えば、記号位置の集合の重心位置を求め、その重心位置より上にある記号位置のメディアン値を矩形の上側の辺とし、同様に下の辺、左右の辺とし、それぞれが交わる位置を頂点とすることで矩形を求めることができる。もちろん、他の方法、例えばハフ変換などで当てはめてもよい。
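The centroid-and-median fitting described above can be sketched as follows. The function name is hypothetical, image coordinates are assumed to have y increasing downward (so "above" means a smaller y), and the sketch assumes there is at least one point on each side of the centroid.

```python
from statistics import median


def fit_rectangle_by_medians(points):
    """Fit an axis-aligned rectangle to a loop-like set of symbol positions.

    Returns (left, top, right, bottom). Each edge is the median coordinate
    of the points lying on that side of the centroid.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    top = median(y for _, y in points if y < cy)
    bottom = median(y for _, y in points if y > cy)
    left = median(x for x, _ in points if x < cx)
    right = median(x for x, _ in points if x > cx)
    return (left, top, right, bottom)
```

Because medians are used rather than minima and maxima, a few stray positions caused by hand jitter have little effect on where the edges land.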
【００７１】
図７（ｂ）は、指示姿勢５００を認識した後に選択姿勢５０１を認識してから、再び前記指示姿勢５００を認識するまでの間の選択記号位置の集合により表現される曲線分や折れ線の上に矩形を設定し、設定された矩形を指定される画像範囲としている場合の例である。
【００７２】
折れ線は、例えば、隣り合った記号位置同士を直線で結ぶことで求めることができるし、曲線分は記号位置の集合にベジェ曲線やスプライン曲線を当てはめたりすることにより求められる。画像範囲を示す矩形を決めるには、例えば、折れ線や曲線分の下側に接する直線分を求め、それを下の辺（底辺）とし、決められた長さの左右の辺を持つ矩形を求めることで実現できる。左右の辺の長さは使用者が下の辺と左右の辺の長さの相対関係を設定してもよい。
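For the underline-style designation of Fig. 7(b), a simplified sketch is given below. It assumes horizontal text, takes the lowest point of the stroke as the bottom edge, and derives the side length from the bottom-edge length through a user-set ratio. The function name and the default `height_ratio` of 0.25 are assumptions, and a tilted tangent line is not handled.

```python
def rectangle_above_underline(points, height_ratio=0.25):
    """Build a rectangle whose bottom edge touches the stroke from below.

    points are (x, y) symbol positions with y increasing downward.
    Returns (left, top, right, bottom).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, right = min(xs), max(xs)
    bottom = max(ys)  # lowest point of the stroke on screen
    height = (right - left) * height_ratio
    return (left, bottom - height, right, bottom)
```

For vertical text traced with the right hand, the same idea applies with the roles of x and y swapped, so that the stroke defines the right edge.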
【００７３】
なお、図７（ｂ）の例は横方向に書かれた文字の下の部分を折れ線や曲線分で指定する場合であり、縦書きの文字の場合に右手によって指定するのであれば、例えば、折れ線や曲線分の右側に接する直線分を求め、それを右の辺とし、決められた長さの上下の辺を持つ矩形を求めることで画像範囲を示す矩形を求めることができる。上下の辺の長さは使用者が右の辺と上下の辺の長さの相対関係を設定してもよい。もちろん、縦書きの文字を左手によって指定する場合は、矩形の左の辺を指定してもよい。
【００７４】
また、実際には曲線分や折れ線を求めず、記号位置の集合にいきなり矩形を当てはめてもよい。例えば、ハフ変換などの統計的な手法で記号位置の集合に直線分を当てはめ、それを一辺とする矩形を上述と同様の方法で求めればよい。
【００７５】
もちろん、指示姿勢５０２と選択姿勢５０３によって、上述と同様に閉ループや折れ線や曲線分などで画像範囲の指定をしてもよいし、他の姿勢を選択姿勢、指示姿勢として用いてもよい。
【００７６】
また、一点を指示姿勢で指示し、選択姿勢でその点を選択し、その周辺の文字らしい部分を自動抽出することで範囲を指定してもよい。文字らしさは、情報処理技術分野の当業者に周知の機能を用いて実現することができる。例えば、画像の高周波成分が規則的に現れたり、輝度勾配が急であったりする部分が、横書きの文字の場合は横方向に長く分布し、縦書きの文字の場合は縦方向に長く分布するといった特性を利用して、抽出することができる。また、この文字らしい部分の形状によって、上述の記号位置の集合に基づいて決定される画像範囲を修正・調整してもよい。
【００７７】
次に、本発明装置の使用方法とその際の操作手順および本装置の処理について説明する。ここでは、本発明の使用方法として、実施形態１および実施形態２の２つを説明する。実施形態１は、使用者が着用または携帯する画像入力部１０４のカメラによって撮影された手と指および文字を含む画像から、手と指の位置と姿勢の情報および文字情報を獲得する場合であり、実施形態２は、使用者が着用または携帯する画像入力部１０４のカメラによって撮影された手と指を含む画像から手と指の位置と姿勢を獲得し、記憶部１０１または２０１に保存されている文字を含む画像から文字情報を獲得する場合である。
【００７８】
なお、以下の説明では、端末装置１０のみで処理した場合について説明するが、上述したように、手指画像認識機能３００と手指画像認識機能４００、記号位置決定機能３０１と記号位置決定機能４０１、範囲指定機能３０２と範囲指定機能４０２、文字認識機能３０３と文字認識機能４０３、文字コードデータ変換機能３０４と文字コードデータ変換機能４０４はそれぞれ同等の機能であるため、端末装置１０またはホスト装置２０のどちらかにそれぞれの機能が備わっていて、端末装置１０とホスト装置２０を組み合わせて処理してもよい。
【００７９】
図８は、本発明の実施形態１を示す模式図である。この例では、使用者が着用または携帯する画像入力部１０４のカメラによって手と指および文字を含む画像が撮影され、使用者は表示部１０３の表示器に表示されている撮影画像を見ながら、その撮影画像に含まれる「第１８会議室」という文字を、範囲指定機能３０２を用いて選択している。
【００８０】
図８に示す例では、指示記号を矢印マークのポインタを用いて表示し、指示姿勢を認識した後に選択姿勢を認識してから、再び前記指示姿勢を認識するまでの間の選択記号の軌跡を白丸で表示し、画像範囲を表す矩形を点線矩形で表示しているが、もちろん、他の記号や図形を用いてもよい。
【００８１】
図９は、本発明の実施形態１で本発明装置が行う処理のフローである。また、図１０は、図９の処理がＦ１１４まで到達する場合に、使用者が行う操作のフローの例である。
【００８２】
まず、本装置は、画像入力部１０４で画像を撮影し、実施形態１を開始する仮想ボタン（またはメニュー）や使用者への他の仮想ボタンおよびメニュー等を含む提示情報を撮影画像に重畳して、グラフィカルユーザーインターフェイス（ＧＵＩ）として表示部１０３に表示する（Ｆ１００）。実施形態１を開始する仮想ボタンが表示されている部分を、指示姿勢で指示し、選択姿勢で選択した（Ｓ１００）ことを認識、または他の仮想ボタンやメニューを選んだことを認識するまで処理Ｆ１００を繰り返す（Ｆ１０１）。実施形態１を開始する仮想ボタン以外の仮想ボタンやメニューが選ばれたと認識すれば他の処理に移る（Ｆ１１６）。
【００８３】
処理Ｆ１０１で実施形態１が選ばれたと認識すると、画像を撮影し、実施形態１を開始する仮想ボタンを含まないＧＵＩを表示する（Ｆ１０２）。次に、手指画像認識機能３００により、使用者の指示姿勢（Ｓ１０１）を認識するまで処理Ｆ１０２を繰り返す（Ｆ１０３）。
【００８４】
処理Ｆ１０３で指示姿勢を認識すると、記号位置決定機能３０１により指示記号位置を決定し（Ｆ１０４）、画像を撮影し、該指示記号を含むＧＵＩを表示する（Ｆ１０５）。次に、手指画像認識機能３００により使用者の選択姿勢（Ｓ１０２）を認識するまで処理Ｆ１０５を繰り返す（Ｆ１０６）。
【００８５】
処理Ｆ１０６で選択姿勢を認識すると、記号位置決定機能３０１により選択記号位置を決定し、その位置を範囲指定機能３０２のために蓄積し（Ｆ１０７）、画像を撮影し、該選択記号および選択記号の軌跡を示す記号（図８の例では白丸）を含むＧＵＩを表示する（Ｆ１０８）。次に、手指画像認識機能３００により使用者の選択姿勢（Ｓ１０２）を認識し続ける間、処理Ｆ１０７および処理Ｆ１０８を繰り返す（Ｆ１０９）。処理Ｆ１０９で指示姿勢（Ｓ１０３）を認識すると、記号位置決定機能３０１により指示記号位置を決定する（Ｆ１１０）。なお、処理Ｆ１０９で指示姿勢も選択姿勢も認識できなければ処理を終了し、文字コードデータは記憶されない。
【００８６】
処理Ｆ１１０で指示記号位置を決定した後、範囲指定機能３０２により画像範囲を求めてその時点の画像を保存し（Ｆ１１１）、画像を撮影し、指示記号および画像範囲を表す図形（図８の例では点線矩形）を含むＧＵＩを表示する（Ｆ１１２）。次に、処理Ｆ１１１で保存した画像の画像範囲内の部分から文字認識機能３０３により文字認識を行い、文字が認識できなければ処理は終了となり、文字コードデータは記憶されない（Ｆ１１３）。
【００８７】
文字が認識できれば、認識した文字の確認、選択、修正処理（Ｓ１０４、Ｆ１１４）を行い、最終的に確認、選択、修正された文字があれば、その文字コードデータを記憶部１０１に記憶し（Ｆ１１５）、処理を終了する。なければ文字コードデータは記憶せず終了する。
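The posture-driven part of the Embodiment 1 flow (roughly F103 to F110) behaves like a small state machine: wait for the pointing posture, wait for the selection posture, accumulate selection-symbol positions while selection continues, and finish when the pointing posture returns. The sketch below illustrates this under the assumption that a recognizer yields one `(posture, position)` pair per frame. The state names and the function name are illustrative, not taken from the patent.

```python
def run_range_designation(observations):
    """Collect selection-symbol positions for one range designation.

    observations yields (posture, position) per frame, where posture is
    'point', 'select', or None when neither posture is recognized.
    Returns the accumulated trajectory, or None if tracking is lost while
    selecting or the input ends before the gesture is completed.
    """
    state = "wait_point"
    trajectory = []
    for posture, position in observations:
        if state == "wait_point":
            if posture == "point":            # F103: pointing posture seen
                state = "wait_select"
        elif state == "wait_select":
            if posture == "select":           # F106: selection posture seen
                trajectory.append(position)   # F107: accumulate position
                state = "selecting"
        elif state == "selecting":
            if posture == "select":           # F109: still selecting
                trajectory.append(position)
            elif posture == "point":          # F109 -> F110: gesture complete
                return trajectory
            else:                             # neither posture: abort
                return None
    return None
```

The returned trajectory is what a range designation step would then turn into a rectangle or closed loop before character recognition is run.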
【００８８】
図１１は、本発明の実施形態２を示す模式図である。この例では、使用者が着用または携帯する画像入力部１０４のカメラによって撮影された手と指を含む画像とともに、記憶部１０１に保存されている文字認識の対象画像が、表示部１０３の表示器に表示される。使用者は表示部を見ながら、記憶部１０１に保存されている対象画像に含まれる「実験室Ａ」という文字を、範囲指定機能３０２を用いて選択している。
【００８９】
図１１に示す例では、その時点での撮影画像は表示器の画面の右下に対象画像に重畳して表示されているが、もちろん、他の位置でもよいし、重畳せずに、例えば表示器の画面の左上に対象画像を、右上に撮影画像を、下には他の情報を表示するなどしてもよい。
【００９０】
図１２は、本発明の実施形態２で本発明装置が行う処理のフローである。また、図１３は、図１２の処理がＦ２１６まで到達する場合に、使用者が行う操作のフローである。
【００９１】
まず、本装置は、画像入力部１０４で画像を撮影し、実施形態２を開始する仮想ボタン（またはメニュー）や使用者への他の仮想ボタンおよびメニュー等を含む提示情報を撮影画像に重畳して、グラフィカルユーザーインターフェイス（ＧＵＩ）として表示部１０３に表示する（Ｆ２００）。実施形態２を開始する仮想ボタンが表示されている部分を、使用者が指示姿勢で指示し、選択姿勢で選択した（Ｓ２００）ことを認識、または他の仮想ボタンやメニューを選んだことを認識するまで処理Ｆ２００を繰り返す（Ｆ２０１）。実施形態２を開始する仮想ボタン以外の仮想ボタンやメニューが選ばれたと認識すれば他の処理に移る（Ｆ２１８）。
【００９２】
処理Ｆ２０１で実施形態２が選ばれたと認識すると、画像を撮影し、記憶部１０１に保存されている画像のサムネイル画像（縮小画像）、または通信部１０２を介してインターネットやホスト装置などから受信したサムネイル画像の一覧を表示する（Ｆ２０２）。
【００９３】
図１４は、文字認識の対象画像を選択するためのＧＵＩの例を示した模式図である。この例ではサムネイル画像の一覧および、該一覧をスクロールさせる上下のスクロール仮想ボタンが表示されている。この例では、例えば、選択したい画像のサムネイル画像が表示されている部分を、使用者が指示姿勢で指示し、選択姿勢で選択した（Ｓ２０１）ことを認識する（Ｆ２０３）ことで、対象画像を選択することができる。
【００９４】
処理Ｆ２０３で対象画像を選択すると、画像を撮影し、図１１の例のように、撮影画像、対象画像を含むＧＵＩを表示する（Ｆ２０４）。次に、手指画像認識機能３００により使用者の指示姿勢（Ｓ２０２）を認識するまで処理Ｆ２０４を繰り返す（Ｆ２０５）。
【００９５】
処理Ｆ２０５で指示姿勢を認識すると、記号位置決定機能３０１により指示記号位置を決定し（Ｆ２０６）、画像を撮影し、該指示記号を含むＧＵＩを表示する（Ｆ２０７）。次に、手指画像認識機能３００により使用者の選択姿勢（Ｓ２０３）を認識するまで処理Ｆ２０７を繰り返す（Ｆ２０８）。
【００９６】
処理Ｆ２０８で選択姿勢を認識すると、記号位置決定機能３０１により選択記号位置を決定し、その位置を範囲指定機能３０２のために蓄積し（Ｆ２０９）、画像を撮影し、該選択記号および選択記号の軌跡を示す記号（図１１の例では白丸）を含むＧＵＩを表示する（Ｆ２１０）。
次に、手指画像認識機能３００により使用者の選択姿勢（Ｓ２０３）を認識し続ける間、処理Ｆ２０９および処理Ｆ２１０を繰り返す（Ｆ２１１）。処理Ｆ２１１で指示姿勢（Ｓ２０４）を認識すると、記号位置決定機能３０１により指示記号位置を決定する（Ｆ２１２）。なお、処理Ｆ２１１で指示姿勢も選択姿勢も認識できなければ処理を終了し、文字コードデータは記憶されない。
【００９７】
処理Ｆ２１２で指示記号位置を決定した後、範囲指定機能３０２により画像範囲を求め（Ｆ２１３）、画像を撮影し、指示記号および画像範囲を表す図形（図１１の例では点線矩形）を含むＧＵＩを表示する（Ｆ２１４）。次に、対象画像の画像範囲内の部分から文字認識機能３０３により文字認識を行い、文字が認識できなければ処理は終了となり、文字コードデータは記憶されない（Ｆ２１５）。
【００９８】
文字が認識できれば、認識した文字の確認、選択、修正処理（Ｓ２０５、Ｆ２１６）を行い、最終的に確認、選択、修正された文字があれば、その文字コードデータを記憶部１０１に記憶し（Ｆ２１７）、処理を終了する。なければ文字コードデータは記憶せず終了する。
【００９９】
もちろん、実施形態１、２が終了した時点で記憶部１０１に記憶した文字コードデータは、インターネットやデータベースの検索キーワードまたはオペレーションシステムやアプリケーションのコマンドや入力文字または文字認識に使われた画像（この場合は撮影画像）の付加情報などに用いることができる。例えば、記憶部１０１に記憶された文字コードデータをキーボードイベントとしてＧＵＩに送信したり、後処理として対象画像のヘッダ情報などに文字コードデータを埋め込む処理をしたりすることで、記憶された文字コードデータを利用できる。
【０１００】
ここで、処理Ｆ１１４、Ｆ２１６およびステップＳ１０４、Ｓ２０５、つまり、認識した文字の確認、選択、修正処理（操作）の詳細な説明をする。図１５は、認識した文字の候補を選択するためのＧＵＩの例を示した模式図である。この図の例では、認識した文字の候補として４つが文字認識機能３０３から得られ、それらの文字コードデータに対応する文字を、文字フォントデータを用いてボタンのように並べて表示している。
【０１０１】
図１６は、認識した文字を修正するためのＧＵＩの例を示した模式図である。この図の例では、編集可能なテキストボックスが上部に、携帯電話などで広く使われているテンキー型の仮想キーボードがその下に表示されている。
【０１０２】
図１７は、認識した文字の確認、選択、修正のために本発明装置が行う処理のフローである。また、図１８は、図１７の処理がＦ３０７に到達する場合に、使用者が行う操作のフローである。
【０１０３】
まず、文字認識機能３０３で得られた認識結果の候補の数を調べる（Ｆ３００）。候補が複数であれば、画像を撮影し、候補の一覧を含むＧＵＩを、図１５のように表示する（Ｆ３０１）。使用者がそれを確認し（Ｓ３００）、例えば、選択したい候補が表示されている部分を、使用者が指示姿勢で指示し、選択姿勢で選択した（Ｓ３０１）ことを認識する（Ｆ３０２）ことで、候補を選択することができる。
【０１０４】
次に、画像を撮影し、候補が１つの場合はその候補を、複数の場合は選択した候補、および決定仮想ボタン、修正仮想ボタン、キャンセル仮想ボタンを含むＧＵＩを表示する（Ｆ３０３）。いずれかの仮想ボタンが選択されるまで、処理Ｆ３０３を繰り返す（Ｆ３０４）。
【０１０５】
使用者がキャンセル仮想ボタンを選択したことを認識した場合は、処理がキャンセルされる。決定仮想ボタンを選択したと認識した場合は、処理Ｆ３０３で表示されている候補の文字コードデータを後段の処理、例えば、Ｆ１１５やＦ２１７に渡す。修正仮想ボタンを選択した（Ｓ３０２）と認識した場合は、画像を撮影し、処理Ｆ３０３で表示されている候補を表示したテキストボックス、および仮想キーボードを含むＧＵＩを、図１６のように表示する（Ｆ３０５）。
【０１０６】
テキストボックス内のカーソルは、指示姿勢で指示し選択姿勢で選択することで、修正したい文字の位置に移動することができる。例えば、消去したい文字があればその文字を指示し選択した後、仮想キーボードのクリアキーを選択すればよい。また、カーソル位置に文字を入力したい場合は、仮想キーボードの各キーを選択して入力すればよい（Ｆ３０６、Ｓ３０３）。
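The text-box editing described above (move the cursor to a character, delete it with the clear key, type a replacement on the virtual keyboard) can be modeled with a very small class. This is only a sketch: the class name `TextBox`, its methods, and the convention that the cursor index marks the selected character are assumptions made for illustration.

```python
class TextBox:
    """Minimal editable text box driven by point-and-select gestures."""

    def __init__(self, text):
        self.chars = list(text)
        self.cursor = len(self.chars)

    def move_cursor(self, index):
        """Place the cursor on the character at index (clamped to range)."""
        self.cursor = max(0, min(index, len(self.chars)))

    def clear(self):
        """Clear key: delete the character at the cursor, if any."""
        if self.cursor < len(self.chars):
            del self.chars[self.cursor]

    def insert(self, ch):
        """Virtual keyboard key: insert ch at the cursor and advance."""
        self.chars.insert(self.cursor, ch)
        self.cursor += 1

    def text(self):
        return "".join(self.chars)


# Correcting a misrecognized character in the Fig. 8 sample text.
box = TextBox("第１８合議室")
box.move_cursor(3)   # point at and select the wrong character
box.clear()          # clear key removes it
box.insert("会")     # key in the right character
```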
【０１０７】
修正中は、使用者が決定仮想ボタンを選択した（Ｓ３０４）かどうかを認識し（Ｆ３０７）、選択しない限り修正処理（操作）は続けられる。決定仮想ボタンが選択されれば、その時点でテキストボックスに表示してある文字の文字コードデータを、後段の処理、例えば、Ｆ１１５やＦ２１７に渡す。
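The confirmation, selection, and correction flow of Fig. 17 (F300 to F307) can be summarized in a few lines if the gesture interactions are abstracted as callbacks. In the sketch below, `choose`, `decide`, and `edit` are hypothetical stand-ins for the candidate list GUI, the OK / Correct / Cancel virtual buttons, and the virtual keyboard session respectively.

```python
def confirm_recognition(candidates, choose, decide, edit):
    """Return the text to store, or None when nothing should be stored.

    choose(candidates) -> index of the picked candidate (F301, F302)
    decide(text) -> 'ok', 'edit' or 'cancel'            (F303, F304)
    edit(text) -> corrected text after the user confirms (F305 to F307)
    """
    if not candidates:
        return None
    if len(candidates) == 1:          # F300: a single candidate needs no list
        text = candidates[0]
    else:
        text = candidates[choose(candidates)]
    action = decide(text)
    if action == "cancel":
        return None
    if action == "edit":
        text = edit(text)
    return text
```

The returned text corresponds to what is handed to the later storage step (F115 or F217), and a return value of `None` corresponds to ending without storing character code data.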
【０１０８】
なお、文字認識の対象となる文字は、活字や看板の文字だけではなく、手書きの文字も対象となることは言うまでもない。例えば、紙やホワイトボードに手書きで字を書き、それを撮影し、手と指で範囲指定することで文字情報を入力することができる。さらに、コンピュータの処理能力および文字認識の精度が向上すれば、範囲指定せずに画像中のすべての文字を認識してから、使いたい結果の候補だけを図１５のように手と指で指示、選択して選んでもよい。
【０１０９】
また、上述の仮想ボタンやメニューで行う操作は、他の方法で操作することもできる。表示器の画面のある部分、例えば、左上隅を指示し選択すると、処理のキャンセルをすることができるなどが考えられる。また、じゃんけんのパー、チョキ、グーのようなさまざまな姿勢を、指示姿勢や選択姿勢だけではなく、決定姿勢、キャンセル姿勢などとして認識し操作できるようにしてもよい。
【０１１０】
次に本発明に係る他の実施形態について述べる。本発明の目的は、上述した実施形態の機能を実現するプログラムを記録した記録媒体を、システムあるいは装置に供給し、そのシステムあるいは装置のコンピュータやＣＰＵやＭＰＵが記録媒体に格納されたプログラムを読み出し、実行することによっても、達成されることは言うまでもない。
【０１１１】
この場合、記録媒体から読み出されたプログラム自体が上述した実施形態の機能を実現することになり、そのプログラムを記録した記録媒体は本発明を構成することになる。
【０１１２】
そのプログラムを記録し、またそのプログラムで読み込まれるデータを記録する記録媒体としては、例えば、磁気テープやカセットテープ等のテープ媒体、フロッピーディスクやハードディスク等の磁気ディスクやＣＤ−ＲＯＭ／ＣＤ−Ｒ／ＣＤ−ＲＷ／ＭＯ／ＭＤ／ＤＶＤ−ＲＯＭ／ＤＶＤ−Ｒ／ＤＶＤ−ＲＷ／ＤＶＤ＋ＲＷ／ＤＶＤ−ＲＡＭ等の光（磁気）ディスクを含むディスク媒体、不揮発性のメモリカード／光カード等のカード媒体、ＲＯＭなどを用いることができる。
【０１１３】
また、コンピュータが読み出したプログラムを実行することにより、上述の実施形態の機能が実現されるだけでなく、そのプログラムの指示に基づいて、コンピュータ上で稼動しているＯＳなどが実際の処理の一部または全部を行ない、その処理によって上述した実施形態の機能が実現される場合も含まれることは言うまでもない。
【０１１４】
【発明の効果】
以上に説明したように、本発明によれば、カメラおよび表示器および使用者の手と指を用いて、屋内や屋外に表記の文字または手書きの文字を文字コードデータに変換することで、文字情報を迅速にしかも確実にかつ簡単に入力できる。特に、着用型コンピュータ、携帯型コンピュータ、携帯電話、ＰＤＡ、リモコン、デジタルカメラなどの着用、携帯型装置にとって有効な文字情報入力インターフェイスを提供する。
【図面の簡単な説明】
【図１】本発明の実施形態の文字情報入力装置の構成を示すブロック図である。
【図２】本発明の実施形態の文字情報入力装置のホストの構成を示すブロック図である。
【図３】本発明の実施形態の文字情報入力装置の通信システムの構成を示す模式図である。
【図４】画像入力部１０４を構成するカメラの着用位置を示す模式図である。
【図５】表示部１０３を構成する表示器の着用位置を示す模式図である。
【図６】本発明の実施形態の文字情報入力装置を使用する際に画像入力部１０４で撮影する手と指の姿勢の例を示す模式図である。
【図７】範囲指定機能３０２と範囲指定機能４０２により指定される矩形の例を示す模式図である。
【図８】本発明の実施形態１を示す模式図である。
【図９】本発明の実施形態１で本発明装置が行う処理のフローである。
【図１０】図９の処理がＦ１１４まで到達する場合に、使用者が行う操作のフローの例である。
【図１１】本発明の実施形態２を示す模式図である。
【図１２】本発明の実施形態２で本発明装置が行う処理のフローである。
【図１３】図１２の処理がＦ２１６まで到達する場合に、使用者が行う操作のフローである。
【図１４】文字認識の対象画像を選択するためのＧＵＩの例を示した模式図である。
【図１５】認識した文字の候補を選択するためのＧＵＩの例を示した模式図である。
【図１６】認識した文字を修正するためのＧＵＩの例を示した模式図である。
【図１７】認識した文字の確認、選択、修正のために本発明装置が行う処理のフローである。
【図１８】図１７の処理がＦ３０７に到達する場合に、使用者が行う操作のフローである。
【符号の説明】
１０　端末装置
２０　ホスト装置
３０　基地局
３１　交換網
５０　肩に着用したカメラ
５１　胸部に着用したカメラ
５２　耳に着用したカメラ
５３　眼鏡やサングラスなどのフレームに装備したカメラ
６０　額と側頭部を支点とする留め具に装着された表示器
６１　耳と後頭部を支点とする留め具に装着された表示器
６２　眼鏡やサングラスなどのレンズ前面、または内部に装着された表示器
１００　情報処理制御部（端末装置１０に装備）
１０１　記憶部（端末装置１０に装備）
１０２　通信部（端末装置１０に装備）
１０３　表示部（端末装置１０に装備）
１０４　画像入力部（端末装置１０に装備）
２００　情報処理制御部（ホスト装置２０に装備）
２０１　記憶部（ホスト装置２０に装備）
２０２　通信部（ホスト装置２０に装備）
３００　手指画像認識機能（端末装置１０の機能）
３０１　記号位置決定機能（端末装置１０の機能）
３０２　範囲指定機能（端末装置１０の機能）
３０３　文字認識機能（端末装置１０の機能）
３０４　文字コードデータ変換機能（端末装置１０の機能）
３０５　画像圧縮展開機能（端末装置１０の機能）
４００　手指画像認識機能（ホスト装置２０の機能）
４０１　記号位置決定機能（ホスト装置２０の機能）
４０２　範囲指定機能（ホスト装置２０の機能）
４０３　文字認識機能（ホスト装置２０の機能）
４０４　文字コードデータ変換機能（ホスト装置２０の機能）
４０５　画像圧縮展開機能（ホスト装置２０の機能）
５００　２本指を使った指示姿勢
５０１　２本指を使った選択姿勢
５０２　１本指を使った指示姿勢
５０３　１本指を使った選択姿勢
[0001]
[Technical field to which the invention belongs]
The present invention relates to a character information input device for a computer, and more particularly to a character information input device and a character information input method for devices such as wearable computers, portable computers, mobile phones, PDAs, remote controllers, and digital cameras, and to a recording medium on which is recorded a program for causing a computer to carry out the character information input method.
[0002]
[Prior art]
Conventionally, characters and operating commands have been input to a computer mainly in one of two ways: by character information input methods that use input devices intended for desktop use, such as keyboard devices typified by the QWERTY keyboard and pointing devices such as mice, trackballs, and touchpads, or by a character information input method in which the characters are read aloud, the voice is recorded with a microphone, and the character information is obtained by speech recognition.
[0003]
In particular, to input characters and operating commands to a portable terminal (portable device) such as a wearable computer, a portable computer, a mobile phone, a PDA, a home appliance remote controller, or a digital camera, character information input methods are used that rely on input devices intended to be carried, such as a portable keyboard device consisting of a dozen or so keys as typified by a numeric keypad, a button device, a jog shuttle device, a dial device, a tablet device, or a pen device.
[0004]
[Problems to be solved by the invention]
However, to input characters and operating commands quickly, reliably, and easily with the conventional character information input methods that use input devices intended for desktop use, the user must use the device and the method for a long period and acquire the necessary skill.
[0005]
In addition, it is difficult to carry and use an input device intended for desktop use while moving, while out, while alternating between walking and pausing, or while working.
[0006]
Furthermore, with the character information input methods that use input devices intended to be carried, even a skilled user finds it difficult to input characters and operating commands quickly, reliably, and easily, because the small number of keys, the difficulty of holding the device, and similar factors impose a physical limit on input speed.
[0007]
Furthermore, when the input device that is assumed to be carried and used is stored in a pocket or a bag, for example, even if the number of characters or commands to be input is small, it is necessary to take out the input device before use. Therefore, it is difficult to input it quickly, surely and easily.
[0008]
Also, with the aforementioned character information input method using speech recognition, the characters cannot be read aloud when the surroundings are noisy, when the surroundings are so quiet that speaking is awkward, or when the user does not know how the characters are read.
[0009]
The present invention has been made in view of the above points. Its object is to provide a character information input device for a computer, in particular a character information input device and a character information input method for devices such as wearable computers, portable computers, mobile phones, PDAs, remote controllers, and digital cameras, and a recording medium on which is recorded a program for causing a computer to carry out the character information input method, with which character information can be input quickly, reliably, and easily not only by the conventional input devices intended for desktop or portable use or by speech recognition, but also by using a camera, a display, and the user's hand and fingers to convert characters written indoors or outdoors, or handwritten characters, into character code data.
[0010]
[Means for Solving the Problems]
To achieve the above object, the character information input device of claim 1 is characterized by comprising: image input means for inputting a photographed image from a camera; image display means for displaying an image on the screen of a display; hand-and-finger image recognition means for recognizing the position, the pointing posture, and the selection posture of the hand and fingers shown in the photographed image; symbol position determination means for determining, based on the hand and finger position and posture recognized by the hand-and-finger image recognition means, the position on the screen of the display of a pointing symbol and a selection symbol represented by a cursor, a pointer, or the like; pointing symbol display means for displaying the pointing symbol at the on-screen position determined by the symbol position determination means; range designation means for designating an image range by means of the pointing posture and the selection posture; character recognition means for recognizing characters from the image in the range designated by the range designation means; character code data conversion means for converting the characters recognized by the character recognition means into character code data; and character code data storage means for storing the character code data converted by the character code data conversion means in a storage medium.
[0011]
Further, the device can have input image transmission means for transmitting the photographed image input by the image input means, compressed or uncompressed, via a wireless or wired communication circuit, and input image reception means for receiving the compressed or uncompressed image and expanding it if it is a compressed image.
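The transmit-then-expand-if-compressed behavior can be sketched with a one-byte flag in front of the image data. This is only an illustration: the message layout and function names are assumptions, and `zlib` is used as a lossless stand-in for whatever image codec an actual device would use.

```python
import zlib

COMPRESSED = b"\x01"
UNCOMPRESSED = b"\x00"


def pack_image(pixels: bytes, compress: bool) -> bytes:
    """Sender side: prepend a flag, compressing the payload if requested."""
    if compress:
        return COMPRESSED + zlib.compress(pixels)
    return UNCOMPRESSED + pixels


def unpack_image(message: bytes) -> bytes:
    """Receiver side: expand the payload only when it was compressed."""
    flag, body = message[:1], message[1:]
    if flag == COMPRESSED:
        return zlib.decompress(body)
    return body


# A blank 100 x 10 grayscale frame compresses to far fewer bytes.
frame = bytes(1000)
sent = pack_image(frame, compress=True)
received = unpack_image(sent)
```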
[0012]
Further, the device can have range image transmission means for transmitting the image in the range designated by the range designation means, compressed or uncompressed, via a wireless or wired communication circuit, and range image reception means for receiving the compressed or uncompressed range image and expanding it if it is a compressed image.
[0013]
Further, the device can have image storage means for storing, in a storage medium, an image input by the image input means or received via a wireless or wired communication line, and image display means for displaying the image stored by the image storage means on the display.
[0014]
Further, the device can have character display means for displaying, on the display, the characters corresponding to the character code data stored by the character code data storage means, using character font data.
[0015]
Further, the character code data stored by the character code data storage means can be used as a search keyword for the Internet or a database, as a command or input characters for an operating system (OS) or an application, or as additional information for the image used for character recognition.
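As one hedged illustration of using stored character data as a search keyword, the sketch below builds a query URL from recognized text. The endpoint `https://example.com/search` and the parameter name `q` are placeholders invented for this example, not part of the patent or of any particular search service.

```python
from urllib.parse import urlencode


def build_search_url(keyword, base="https://example.com/search"):
    """Turn recognized text into a search request URL (placeholder endpoint)."""
    return base + "?" + urlencode({"q": keyword})


url = build_search_url("room A")
```

Non-ASCII text such as recognized Japanese characters is percent-encoded as UTF-8 by `urlencode`, so the same helper covers both cases.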
[0016]
Further, the camera can be a camera worn directly on a part of the user's body, mounted on something worn on a part of the body, or mounted on a portable device.
[0017]
Further, the display can be a display that is worn directly on the head or mounted on something worn on the head so as to be within the user's field of view, or that is worn directly on the arm or mounted on something worn on the arm.
[0018]
Further, the range designation means can designate the image range based on a figure such as a rectangle, a closed loop, a straight line segment, or a curve segment represented by the set of positions of the selection symbol from when the selection posture is recognized, after the pointing posture has been recognized, until the pointing posture is recognized again.
[0019]
Further, for the case where the character recognition means yields a plurality of candidate recognition results, the device can have: means for converting each of the candidates into character code data with the character code data conversion means; character recognition candidate display means for displaying the characters corresponding to that character code data on the display using character font data; character recognition result selection means for pointing at one of the displayed candidates by way of the symbol position determination means and selecting it with the selection posture; and means for storing the character code data of the candidate selected by the character recognition result selection means in a storage medium by way of the character code data storage means.
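Selecting one of several displayed candidates amounts to a hit test between the symbol position and the candidates' on-screen rectangles. The sketch below lays candidates out as a vertical column of buttons and returns the one under the pointer. The layout parameters and function names are assumptions for illustration only.

```python
def layout_buttons(candidates, origin=(10, 10), size=(120, 30), gap=5):
    """Assign each candidate a rectangle (left, top, right, bottom)."""
    x0, y0 = origin
    w, h = size
    buttons = []
    for i, text in enumerate(candidates):
        top = y0 + i * (h + gap)
        buttons.append((text, (x0, top, x0 + w, top + h)))
    return buttons


def hit_test(buttons, position):
    """Return the candidate whose button contains position, else None."""
    x, y = position
    for text, (left, top, right, bottom) in buttons:
        if left <= x < right and top <= y < bottom:
            return text
    return None


buttons = layout_buttons(["A", "B", "C"])
```

In use, `position` would be the pointing-symbol position at the moment the selection posture is recognized.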
[0020]
Further, for the case where the candidates produced by the character recognition means do not include the correct result, the device can have: virtual keyboard display means for displaying a virtual keyboard on the display; virtual keyboard operation means for operating the virtual keyboard with the pointing posture and the selection posture; means for selecting one of the candidates with the character recognition candidate display means and the character recognition result selection means; character recognition result correction means for displaying the selected candidate with the character display means and correcting it with the virtual keyboard operation means; and means for storing the character code data of the characters corrected by the character recognition result correction means in a storage medium by way of the character code data storage means.
[0021]
To achieve the above object, the character information input method of claim 12, for a character information input device, is characterized by having: a step of inputting a photographed image from a camera; a step of displaying an image on the screen of a display; a step of recognizing the position, the pointing posture, and the selection posture of the hand and fingers shown in the photographed image; a step of determining, based on the recognized hand and finger position and posture, the position on the screen of the display of a pointing symbol and a selection symbol represented by a cursor, a pointer, or the like; a step of displaying the symbol at the determined on-screen position; a step of designating an image range by means of the pointing posture and the selection posture; a step of recognizing characters from the image in the designated range; a step of converting the recognized characters into character code data; and a step of storing the converted character code data in a storage medium.
[0022]
Further, the method may include a step of transmitting the photographed image through a wireless or wired communication line, compressed or uncompressed, and a step of receiving the compressed or uncompressed image and decompressing it if it is compressed.
[0023]
Further, the method may include a step of transmitting the image in the designated range through a wireless or wired communication line, compressed or uncompressed, and a step of receiving the compressed or uncompressed range image and decompressing it if it is compressed.
[0024]
Furthermore, the method can include a step of storing the photographed image or an image received via a wireless or wired communication line in a storage medium and a step of displaying the stored image on the display.
[0025]
Furthermore, it can have a step of displaying a character corresponding to the stored character code data on the display using character font data.
[0026]
Furthermore, the character code data can be used as a search keyword for the Internet or a database, an OS or an application command, an input character, or additional information of an image used for character recognition.
[0027]
Further, the camera can be a camera that is directly attached to or worn on a part of the user's body or is equipped on a portable device.
[0028]
Furthermore, the display may be one that is worn directly on the head, mounted on an item worn on the head, worn directly on the arm, or mounted on an item worn on the arm, so that it falls within the user's field of view.
[0029]
Further, in the step of designating the image range, the image range can be designated based on a figure such as a rectangle, a closed loop, a straight line, or a curve expressed by the set of positions of the selection symbol from the time the selection posture is recognized, after the pointing posture has been recognized, until the pointing posture is recognized again.
[0030]
Further, when there are a plurality of candidate character recognition results, the method may include: a step of converting each of the candidates into character code data; a step of displaying the characters corresponding to the character code data on the display using character font data; a step of pointing at one of the displayed candidates with the indication symbol and selecting it with the selection posture; and a step of storing, in the step of storing the character code data in a storage medium, the character code data of the selected candidate.
[0031]
Further, when the candidates resulting from character recognition do not include the correct answer, the method may include a step of displaying a virtual keyboard on the display, a step of operating the virtual keyboard with the pointing posture and the selection posture, a step of selecting one of the character recognition candidates, a step of displaying the selected candidate on the display and correcting it using the virtual keyboard, and a step of storing the character code data of the corrected character in a storage medium.
[0032]
In order to achieve the above object, the recording medium according to claim 23 is a recording medium storing a program for causing a computer to carry out the character information input method of the character information input device, the program causing the computer to: input a photographed image from a camera; display the image on the screen of a display; recognize the position of the hand and fingers shown in the photographed image together with their pointing posture and selection posture; determine, based on the recognized position and posture of the hand and fingers, the on-screen positions of an indication symbol and a selection symbol represented by a cursor or pointer; display the symbols at the determined positions on the screen of the display; designate an image range by the pointing posture and the selection posture; recognize characters from the image in the designated range; convert the recognized characters into character code data; and store the converted character code data in a storage medium.
[0033]
Furthermore, the program causes the computer to transmit the photographed image through a wireless or wired communication line, compressed or uncompressed, and to receive the compressed or uncompressed image and decompress it if it is compressed.
[0034]
Further, the program causes the computer to transmit the image in the designated range through a wireless or wired communication line, compressed or uncompressed, and to receive the compressed or uncompressed range image and decompress it if it is compressed.
[0035]
Furthermore, the program causes the computer to store the photographed image or an image received via a wireless or wired communication line in a storage medium and display the stored image on the display.
[0036]
Furthermore, the program causes the computer to display characters corresponding to the stored character code data on the display unit using character font data.
[0037]
Furthermore, the program causes the computer to use the character code data as a search keyword in the Internet or a database, an OS or an application command, an input character, or additional information of an image used for character recognition.
[0038]
Furthermore, when designating the image range, the program causes the computer to designate the image range based on a figure such as a rectangle, a closed loop, a straight line, or a curve expressed by the set of positions of the selection symbol from the time the selection posture is recognized, after the pointing posture has been recognized, until the pointing posture is recognized again.
[0039]
Further, when there are a plurality of candidate character recognition results, the program causes the computer to convert each of the candidates into character code data, display the characters corresponding to the character code data on the display using character font data, have one of the displayed candidates pointed at with the indication symbol and selected with the selection posture, and, when storing the character code data in the storage medium, store the character code data of the selected candidate.
[0040]
Further, when the candidates resulting from character recognition do not include the correct answer, the program causes the computer to display a virtual keyboard on the display, operate the virtual keyboard according to the pointing posture and the selection posture, select one of the character recognition candidates, display the selected candidate on the display, correct it with the virtual keyboard, and store the character code data of the corrected character in a storage medium.
[0041]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
[0042]
FIG. 1 is a block diagram showing the configuration of a character information input device according to an embodiment of the present invention. The terminal device 10 comprises: an image input unit 104 consisting of, for example, a CCD camera or a CMOS camera; a display unit 103 consisting of, for example, a liquid crystal panel, an organic EL panel, a field emission display panel, LEDs, or a small projector; a communication unit 102 consisting of, for example, a wireless LAN device, a mobile phone device, a PHS device, or an infrared communication device; a storage unit 101 consisting of, for example, RAM, ROM, or a portable recording medium, which stores image data, character code data, character font data, programs, a character recognition dictionary, and other data; and an information processing control unit 100 consisting of, for example, a processor (CPU, MPU, DSP, or the like) and a cache memory, which has a finger image recognition function 300, a symbol position determination function 301, a range specification function 302, a character recognition function 303, a character code data conversion function 304, and an image compression/decompression function 305, performs processing such as image processing and character recognition processing, and controls the terminal device 10 as a whole.
[0043]
Note that the finger image recognition function 300 and the character recognition function 303 can be realized with techniques well known to those skilled in the field of information processing. The finger image recognition function 300 may, for example, recognize the hand and fingers by probabilistically matching image features learned in advance, such as color and contour shape, against the same kinds of features found in the captured input image; other common methods may also be used. The character recognition function 303 may, for example, prepare a character dictionary by learning characters in several fonts in advance on the basis of gradient features, prepare a dictionary of character sequences such as a word dictionary, and recognize characters by probabilistically matching the image features in the input image against those dictionaries; other general methods may also be used. Commercially available character recognition software with good performance may also be used.
[0044]
The character code data conversion function 304 converts a character recognized by the character recognition function into a general-purpose code such as EUC, JIS, Shift JIS (S-JIS), or ASCII. Of course, if the character recognized by the character recognition function is already expressed in the character code required by later processing, for example displaying the character on the display using character font data, the character code data conversion function need not be used.
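As a minimal sketch of the conversion described above, the following uses Python's built-in codecs as a stand-in for the character code data conversion function 304. The function name `to_character_code`, the `CODECS` table, and the mapping of "JIS code" to the `iso2022_jp` codec are assumptions made for illustration and do not come from the specification.

```python
# Sketch only: Python's built-in codecs stand in for the character code
# data conversion function 304. All names here are hypothetical.
CODECS = {
    "EUC": "euc_jp",
    "JIS": "iso2022_jp",   # assumption: "JIS code" taken to mean ISO-2022-JP
    "S-JIS": "shift_jis",
    "ASCII": "ascii",
}


def to_character_code(text: str, code_name: str) -> bytes:
    """Encode recognized text into the requested character code."""
    return text.encode(CODECS[code_name])


recognized = "会議室"  # example recognition result
euc = to_character_code(recognized, "EUC")
sjis = to_character_code(recognized, "S-JIS")
jis = to_character_code(recognized, "JIS")
```

If the recognizer already returns text in the code needed downstream, the conversion step can simply be skipped, as the paragraph above notes.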
[0045]
Incidentally, reducing the processing performance of the information processing control unit and the storage capacity of the storage unit is an effective way to make portable terminal devices such as wearable computers, portable computers, mobile phones, PDAs, remote controllers, and digital cameras smaller, lighter, less power-hungry, and cheaper. If the terminal device communicates with a host device such as an image processing server, a home computer, or a home appliance, exchanging image data and processing results, then all or part of the finger image recognition function 300, the symbol position determination function 301, the range specification function 302, the character recognition function 303, the character code data conversion function 304, and so on can be processed by the host device. The corresponding functions can then be omitted from the terminal device, and the processing performance of its information processing control unit and the storage capacity of its storage unit can be reduced.
[0046]
FIG. 2 is a block diagram showing the configuration of the host of the character information input device according to the embodiment of the present invention. The host device 20 communicates with the communication unit 102 of the terminal device 10 and comprises: a communication unit 202 consisting of, for example, a wireless LAN device, a mobile phone device, a PHS device, an infrared communication device, an Ethernet device, or a wired LAN device; a storage unit 201 consisting of, for example, a portable recording medium; and an information processing control unit 200 consisting of, for example, a processor (CPU, MPU, DSP, or the like) and a cache memory, which has a finger image recognition function 400, a symbol position determination function 401, a range specification function 402, a character recognition function 403, a character code data conversion function 404, and an image compression/decompression function 405, performs processing such as image processing and character recognition processing, and controls the host device 20 as a whole. Of course, the host device may be a distributed system such as a parallel computer or a PC cluster.
[0047]
The finger image recognition functions 300 and 400, the symbol position determination functions 301 and 401, the range specification functions 302 and 402, the character recognition functions 303 and 403, and the character code data conversion functions 304 and 404 are respectively equivalent, so it suffices for either the terminal device 10 or the host device 20 to have each function. For example, if the character recognition function 403 is provided in the host device, the character recognition function 303 need not be provided in the terminal device, which reduces the processing capacity and program storage capacity required of the terminal device 10. This is effective for making the information processing control unit 100 and the storage unit 101 smaller, lighter, less power-hungry, and cheaper.
[0048]
FIG. 3 is a block diagram showing the configuration of the communication system of the character information input device according to the embodiment of the present invention. The base station 30 consists of, for example, a wireless LAN device, a mobile phone device, a PHS device, or an infrared communication device, and connects the terminal device 10 to the exchange network 31. The exchange network 31 may be either a public network or a private network, and the host device 20 is the communication partner of the terminal device 10. Of course, as with present-day remote controllers and home appliances, the terminal device 10 and the host device 20 may communicate directly by infrared or wireless communication, and they may also communicate by wire.
[0049]
When images are transmitted and received between the terminal device 10 and the host device 20, the transmission volume can be reduced by compressing the images with the image compression/decompression function 305 and the image compression/decompression function 405, using techniques such as the DCT, the wavelet transform, and motion compensation as in JPEG, Motion JPEG, MPEG, MPEG2, MPEG4, and MPEG7, and exchanging them via the communication unit 102 and the communication unit 202. Further, when the terminal device 10 has the finger image recognition function, the symbol position determination function, and the range specification function, and the host device 20 has the character recognition function and the character code data conversion function, only the image in the designated range needs to be transmitted, which reduces the transmission volume further.
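The idea of sending only the designated range in compressed form can be sketched as follows. This is an illustration under stated assumptions: the lossless `zlib` module from the Python standard library stands in for the JPEG or MPEG family codecs named above, the image is an 8-bit grayscale list of rows, and the helper names `crop`, `pack`, and `unpack` as well as the 4-byte header layout are hypothetical.

```python
import zlib


def crop(image, rect):
    """Return the sub-image inside rect = (left, top, right, bottom).

    right and bottom are exclusive; image is a list of rows of 0-255 ints.
    """
    left, top, right, bottom = rect
    return [row[left:right] for row in image[top:bottom]]


def pack(image):
    """Terminal side: serialize and compress an 8-bit grayscale image."""
    height = len(image)
    width = len(image[0]) if height else 0
    header = width.to_bytes(2, "big") + height.to_bytes(2, "big")
    body = bytes(pixel for row in image for pixel in row)
    return header + zlib.compress(body)


def unpack(payload):
    """Host side: decompress a payload produced by pack()."""
    width = int.from_bytes(payload[0:2], "big")
    height = int.from_bytes(payload[2:4], "big")
    body = zlib.decompress(payload[4:])
    return [list(body[y * width:(y + 1) * width]) for y in range(height)]


# Synthetic 64x48 frame and a designated range of 20x10 pixels.
frame = [[(x * 7 + y * 13) % 256 for x in range(64)] for y in range(48)]
designated = (10, 5, 30, 15)
region = crop(frame, designated)
payload = pack(region)       # only the designated range is transmitted
received = unpack(payload)
```

Only 200 of the frame's 3072 pixels are serialized here, which is the saving the paragraph above describes; a lossy image codec would reduce the payload further at the cost of exact reconstruction.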
[0050]
FIG. 4 is a schematic diagram showing the wearing positions of the camera constituting the image input unit 104. Cameras 50 to 53 schematically represent cameras worn on the shoulder, chest, or ear by means of a wearing aid, or mounted on a frame such as that of sunglasses or glasses.
[0051]
The camera 50 can be worn at the position shown in FIG. 4 by attaching it, either directly or detachably with an auxiliary device such as an attachment, to the shoulder strap of a bag such as a shoulder bag or rucksack, or to the shoulder portion of clothing.
[0052]
The camera 51 can be worn at the position shown in FIG. 4 by attaching it, either directly or detachably with an auxiliary device such as an attachment, to the part of the strap of a bag such as a shoulder bag or rucksack that lies over the chest, or to the chest portion of clothing. For example, a brooch or pendant may be equipped with a camera and worn on the chest.
[0053]
The camera 52 can be worn at the position shown in FIG. 4 by mounting it detachably with an auxiliary device such as an attachment. For example, an earring (clip-on or pierced) can be equipped with a camera and attached to the ear.
[0054]
The camera 53 can be worn at the position shown in FIG. 4 by equipping the frame of glasses or sunglasses with the camera.
[0055]
The cameras 50 and 51 are effective for photographing the hand and fingers when they are in front of the body. The cameras 52 and 53 are effective for photographing the hand and fingers when they are in front of the face. Of course, the camera may be worn on another part of the body from which the hand and fingers can be seen; for example, a wristwatch may be worn on the arm opposite the hand and fingers being used, with the camera mounted on the wristwatch. Alternatively, a camera built into a portable device may be used.
[0056]
FIG. 5 is a schematic diagram showing the wearing positions of the display device constituting the display unit 103. Each of the display devices 60 to 62 is worn directly on the head, or mounted on an item worn on the head, so that it falls within the user's field of view. Worn this way, the display can be viewed hands-free, making it easy to check the display while pointing and selecting with the hand and fingers.
[0057]
The display device 60 is a display device that is attached to and worn on a fastener that uses the forehead and the temporal region as a fulcrum, and a display device that can be similarly worn is already on the market.
[0058]
The display device 61 is a display device that is attached to and worn by means of a fastener that uses the ear and the back of the head as fulcrums. A display attached to a headphone fastener can be worn in the same way.
[0059]
The display device 62 is a display device equipped to be fixed to the front surface of the lens such as glasses or sunglasses, or equipped inside the lens, and a display device that can be equipped in the same manner is already on the market.
[0060]
Of course, as long as it falls within the user's field of view, the display may be worn on another part of the body; for example, the display of a wristwatch or a wristwatch-type portable information terminal worn on the arm may be used. Alternatively, a display built into a portable device that is carried rather than worn may be used.
[0061]
FIG. 6 is a schematic diagram illustrating an example of the posture of the right hand and the finger taken by the image input unit 104 when using the character information input device according to the embodiment of the present invention. Here, as examples of hand and finger postures recognized by the finger image recognition function 300 or 400, postures 502 and 503 using one finger and postures 500 and 501 using two fingers are taken as examples.
[0062]
Of course, the left hand may be used, and the pointing posture and the selection posture may be recognized from other postures. For example, hand shapes such as those of rock-paper-scissors (janken), namely paper (pa), scissors (choki), and rock (gu), may be recognized, and each may be judged to be either a pointing posture or a selection posture.
[0063]
The pointing posture 500 is an example in which a posture with the index finger and thumb extended and their fingertips apart is judged to be the pointing posture. At this time, the position of the indication symbol determined by the symbol position determination function 301 or the symbol position determination function 401 may be, for example, the midpoint of the straight line connecting the two recognized fingertips, or the position of the fingertip of the index finger. Further, the relative relationship between the recognized fingers and the symbol position may be set so that it is easy for the user to use.
[0064]
The selection posture 501 is an example in which bringing the fingertips of the index finger and thumb together from the state of the pointing posture 500 is judged to be the selection posture. At this time, the position of the selection symbol determined by the symbol position determination function 301 or the symbol position determination function 401 may be, for example, the center of the recognized contact portion of the two fingers, or the position of the fingertip of the index finger. Further, the relative relationship between the recognized fingers and the symbol position may be set so that it is easy for the user to use.
[0065]
The pointing posture 502 is an example in which a posture with the index finger extended is judged to be the pointing posture. At this time, the position of the indication symbol determined by the symbol position determination function 301 or the symbol position determination function 401 may be, for example, the position of the recognized fingertip of the index finger or a position close to it. Further, the relative relationship between the recognized finger and the symbol position may be set so that it is easy for the user to use.
[0066]
The selection posture 503 is an example in which bending the second joint of the index finger from the state of the pointing posture 500 is judged to be the selection posture. Of course, a posture in which the third joint is bent may be used as the selection posture instead. The first joint may also be bent when the second joint is bent, and the first and second joints may also be bent when the third joint is bent. At this time, the position of the selection symbol determined by the symbol position determination function 301 or the symbol position determination function 401 may be, for example, the uppermost position of the recognized index finger in the image. Further, the relative relationship between the recognized finger and the symbol position may be set so that it is easy for the user to use.
[0067]
When a point indicated with the pointing posture is selected with the selection posture, the indication symbol position determined from the pointing posture recognized at the most recent time before the selection posture was recognized can be used as the selected position. Of course, instead of the most recent time, the indication symbol position from several tens to several hundreds of milliseconds earlier may be used as the selected position so that it is easy for the user to use.
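The look-back rule above can be sketched with a time-stamped history of indication symbol positions. This is a minimal illustration, not the patented implementation: the class `PointerHistory`, the function `selected_position`, and the `lookback` parameter (in seconds) are hypothetical names introduced here.

```python
from bisect import bisect_right


class PointerHistory:
    """Keeps time-stamped indication symbol positions (hypothetical helper)."""

    def __init__(self):
        self._times = []      # seconds, recorded in ascending order
        self._positions = []  # (x, y) screen coordinates

    def record(self, t, position):
        self._times.append(t)
        self._positions.append(position)

    def position_at(self, t):
        """Most recent recorded position at or before time t, or None."""
        i = bisect_right(self._times, t)
        return self._positions[i - 1] if i else None


def selected_position(history, t_select, lookback=0.0):
    """Position to treat as selected when the selection posture is seen.

    lookback = 0 uses the most recent pointing position; a value such as
    0.1 uses the position from about 100 ms earlier.
    """
    return history.position_at(t_select - lookback)


history = PointerHistory()
history.record(0.00, (10, 10))
history.record(0.05, (12, 10))
history.record(0.10, (20, 14))
history.record(0.15, (26, 30))  # finger already moving as it closes

latest = selected_position(history, 0.16)
earlier = selected_position(history, 0.16, lookback=0.1)
```

Looking back a short interval can compensate for the fingertip drifting while the hand changes from the pointing posture to the selection posture, which is one plausible reason for offering the adjustment.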
[0068]
FIG. 7 is a schematic diagram illustrating examples of the rectangle specified by the range specification function 302 or the range specification function 402. FIG. 7A is an example in which a rectangle is fitted to the closed loop expressed by the set of selection symbol positions from the time the selection posture 501 is recognized, after the pointing posture 500 has been recognized, until the pointing posture 500 is recognized again, and the fitted rectangle is taken as the designated image range.
[0069]
The closed loop is obtained by connecting adjacent symbol positions with straight lines, or by fitting a Bezier curve or a spline curve to the set of symbol positions. As the rectangle indicating the image range, for example, the largest-area rectangle inscribed in the closed loop or the smallest-area rectangle circumscribing the closed loop can be fitted. Of course, the relative relationship between the closed loop and the rectangle may be set so that it is easy for the user to use, and the image range may be the closed loop itself instead of a rectangle.
[0070]
Alternatively, a rectangle may be fitted directly to the set of symbol positions described above without actually constructing the closed loop. For example, the centroid of the set of symbol positions is obtained; the median of the symbol positions above the centroid is taken as the upper side of the rectangle, the lower, left, and right sides are determined in the same manner, and the points where the sides intersect are taken as the vertices. Of course, other methods such as the Hough transform may be applied.
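The centroid-and-median fitting described above can be sketched as follows. It assumes image coordinates with y increasing downward and an axis-aligned rectangle; the function name `fit_rectangle` and the tuple layout of the result are hypothetical.

```python
from statistics import mean, median


def fit_rectangle(points):
    """Fit an axis-aligned rectangle to loop-like symbol positions.

    points: list of (x, y) in image coordinates (y grows downward).
    Returns (left, top, right, bottom). Raises statistics.StatisticsError
    if no point lies strictly on one side of the centroid.
    """
    cx = mean(x for x, _ in points)
    cy = mean(y for _, y in points)
    top = median(y for _, y in points if y < cy)      # points above centroid
    bottom = median(y for _, y in points if y > cy)   # points below centroid
    left = median(x for x, _ in points if x < cx)
    right = median(x for x, _ in points if x > cx)
    return left, top, right, bottom


# A slightly jittery loop traced around a line of text.
loop = [
    (10, 20), (20, 21), (30, 19), (40, 20), (50, 20),   # upper stroke
    (51, 30), (50, 40),                                  # right stroke
    (50, 50), (40, 51), (30, 49), (20, 50), (10, 50),   # lower stroke
    (9, 40), (10, 30),                                   # left stroke
]
rect = fit_rectangle(loop)
```

Because medians are used, a few stray symbol positions on one stroke shift the corresponding side less than they would shift a bounding box.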
[0071]
FIG. 7B is an example in which a rectangle is set on the curve or polyline expressed by the set of selection symbol positions from the time the selection posture 501 is recognized, after the pointing posture 500 has been recognized, until the pointing posture 500 is recognized again, and the rectangle so set is taken as the designated image range.
[0072]
The polyline can be obtained, for example, by connecting adjacent symbol positions with straight lines, and the curve by fitting a Bezier curve or a spline curve to the set of symbol positions. The rectangle indicating the image range can be determined, for example, by obtaining a straight line segment that touches the underside of the polyline or curve, using it as the lower side (base), and constructing a rectangle whose left and right sides have a predetermined length. The lengths of the left and right sides may be set by the user relative to the length of the lower side.
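A simplified sketch of this construction for horizontally written text follows. It assumes the underline stroke is close to horizontal, so the base is taken as a horizontal segment through the lowest symbol position rather than a general tangent line; the function name `rectangle_above_underline` and the `height_ratio` parameter (side length as a fraction of the base length) are hypothetical.

```python
def rectangle_above_underline(points, height_ratio=0.25):
    """Build a rectangle whose base touches the underside of a stroke.

    points: (x, y) symbol positions in image coordinates (y grows downward).
    height_ratio: length of the left and right sides relative to the base
    (an assumed, user-adjustable setting).
    Returns (left, top, right, bottom).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, right = min(xs), max(xs)
    bottom = max(ys)  # y grows downward, so the largest y is the lowest point
    height = height_ratio * (right - left)
    return left, bottom - height, right, bottom


stroke = [(100, 200), (150, 203), (200, 198), (300, 201)]
rect = rectangle_above_underline(stroke)
```

For vertically written text the same idea would apply with the roles of x and y exchanged, using the right (or left) side of the stroke as the fixed side.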
[0073]
Note that the example of FIG. 7B is a case in which the underside of horizontally written characters is designated by a polyline or curve. For vertically written characters designated with the right hand, for example, a straight line segment touching the right side of the polyline or curve is obtained and used as the right side, and a rectangle whose upper and lower sides have a predetermined length is obtained, giving the rectangle that indicates the image range. The lengths of the upper and lower sides may be set by the user relative to the length of the right side. Of course, when vertically written characters are designated with the left hand, the left side of the rectangle may be designated instead.
[0074]
In practice, a rectangle may be fitted directly to the set of symbol positions without obtaining the curve or polyline. For example, a straight line may be fitted to the set of symbol positions by a statistical method such as the Hough transform, and a rectangle having that line as one side may be obtained in the same way as described above.
[0075]
Of course, the image range may also be designated by a closed loop, a polyline, a curve, or the like using the pointing posture 502 and the selection posture 503 in the same manner as described above, and other postures may be used as the selection posture and the pointing posture.
[0076]
Alternatively, the range may be specified by pointing at a single point with the pointing posture, selecting that point with the selection posture, and automatically extracting the character-like portion around it. Character likeness can be evaluated with techniques well known to those skilled in the field of information processing. For example, the region can be extracted by exploiting characteristics such as the regular appearance of high-frequency image components or steep luminance gradients, distributed long in the horizontal direction for horizontally written characters and long in the vertical direction for vertically written characters. Further, the image range determined from the set of symbol positions as described above may be corrected and adjusted according to the shape of the character-like portion.
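One very simple way to illustrate extraction around a selected point, for horizontally written text, is to grow a span left and right from the point while the horizontal luminance gradient stays strong. This is a toy stand-in for the well-known techniques referred to above, not a description of them; the function names and the `band`, `threshold`, and `max_gap` parameters are all hypothetical.

```python
def column_gradient_energy(image, top, bottom):
    """Per-column sum of absolute horizontal luminance differences
    over rows top..bottom-1 of a grayscale image (list of rows)."""
    width = len(image[0])
    energy = [0] * width
    for y in range(top, bottom):
        row = image[y]
        for x in range(1, width):
            energy[x] += abs(row[x] - row[x - 1])
    return energy


def grow_text_span(image, seed_x, seed_y, band=4, threshold=50, max_gap=3):
    """Grow a horizontal span around the seed while gradient energy is high.

    Returns (left, top, right, bottom) with right and bottom exclusive.
    Gaps of up to max_gap low-energy columns (e.g. between characters)
    are bridged.
    """
    top = max(0, seed_y - band)
    bottom = min(len(image), seed_y + band + 1)
    energy = column_gradient_energy(image, top, bottom)
    width = len(energy)

    def extend(start, step):
        edge, gap, x = start, 0, start
        while 0 <= x < width and gap <= max_gap:
            if energy[x] >= threshold:
                edge, gap = x, 0
            else:
                gap += 1
            x += step
        return edge

    return extend(seed_x, -1), top, extend(seed_x, +1) + 1, bottom


# Synthetic 40x20 image: white background with a striped "text" block
# occupying columns 10-29 and rows 5-14.
image = [[255] * 40 for _ in range(20)]
for y in range(5, 15):
    for x in range(10, 30):
        image[y][x] = 0 if x % 2 == 0 else 255

span = grow_text_span(image, seed_x=20, seed_y=10)
```

The vertical extent here is just the fixed band around the seed; a fuller version would also grow the span vertically, or swap the axes for vertically written text.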
[0077]
Next, a method of using the apparatus of the present invention, an operation procedure at that time, and processing of the apparatus will be described. Here, two methods of Embodiment 1 and Embodiment 2 will be described as usage methods of the present invention. The first embodiment is a case in which hand and finger position and posture information and character information are acquired from an image including a hand, a finger, and characters photographed by a camera of the image input unit 104 worn or carried by the user. In the second embodiment, the position and posture of the hand and finger are acquired from the image including the hand and finger photographed by the camera of the image input unit 104 worn or carried by the user, and stored in the storage unit 101 or 201. This is a case where character information is acquired from an image including a character.
[0078]
The following description covers the case where processing is performed by the terminal device 10 alone. However, as described above, the finger image recognition functions 300 and 400, the symbol position determination functions 301 and 401, the range specification functions 302 and 402, the character recognition functions 303 and 403, and the character code data conversion functions 304 and 404 are respectively equivalent, so it suffices for either the terminal device 10 or the host device 20 to have each function, and the processing may be shared between the terminal device 10 and the host device 20.
[0079]
FIG. 8 is a schematic diagram showing Embodiment 1 of the present invention. In this example, an image including a hand, a finger, and characters is photographed by the camera of the image input unit 104 worn or carried by the user, and the user looks at the photographed image displayed on the display unit 103, The character “18th conference room” included in the captured image is selected using the range designation function 302.
[0080]
In the example shown in FIG. 8, the indication symbol is displayed as an arrow pointer, the trajectory of the selection symbol from the time the selection posture is recognized, after the pointing posture has been recognized, until the pointing posture is recognized again is displayed as white circles, and the rectangle representing the image range is displayed as a dotted rectangle. Of course, other symbols and figures may be used.
[0081]
FIG. 9 is a flow of processing performed by the device of the present invention in the first embodiment of the present invention. FIG. 10 is an example of a flow of operations performed by the user when the processing in FIG. 9 reaches F114.
[0082]
First, the device captures an image with the image input unit 104, superimposes on the captured image presentation information for the user, including a virtual button (or menu) for starting Embodiment 1 and other virtual buttons and menus, and displays the result on the display unit 103 as a graphical user interface (GUI) (F100). Process F100 is repeated until it is recognized that the displayed virtual button for starting Embodiment 1 has been pointed at with the pointing posture and selected with the selection posture (S100), or that another virtual button or menu has been selected (F101). If it is recognized that a virtual button or menu other than the one for starting Embodiment 1 has been selected, the process moves to other processing (F116).
[0083]
When recognizing that the first embodiment is selected in the process F101, an image is taken and a GUI that does not include a virtual button for starting the first embodiment is displayed (F102). Next, the process F102 is repeated until the user's pointing posture (S101) is recognized by the finger image recognition function 300 (F103).
[0084]
When the pointing posture is recognized in process F103, the indication symbol position is determined by the symbol position determination function 301 (F104), an image is captured, and a GUI including the indication symbol is displayed (F105). Next, process F105 is repeated until the user's selection posture (S102) is recognized by the finger image recognition function 300 (F106).
[0085]
When the selection posture is recognized in process F106, the selection symbol position is determined by the symbol position determination function 301, the position is accumulated for the range specification function 302 (F107), an image is captured, and a GUI including the selection symbol and a symbol indicating its trajectory (white circles in the example of FIG. 8) is displayed (F108). Processes F107 and F108 are then repeated for as long as the finger image recognition function 300 continues to recognize the user's selection posture (S102) (F109). When the pointing posture (S103) is recognized in process F109, the indication symbol position is determined by the symbol position determination function 301 (F110). If neither the pointing posture nor the selection posture can be recognized in process F109, the process ends without storing character code data.
[0086]
After the position of the pointing symbol is determined in process F110, the range specification function 302 obtains the image range, the image at that time is saved (F111), an image is captured, and a GUI including the pointing symbol and a figure representing the image range (a dotted rectangle in the example of FIG. 8) is displayed (F112). Next, the character recognition function 303 performs character recognition on the portion of the image saved in process F111 that lies within the image range. If no character can be recognized, the process ends and no character code data is stored (F113).
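The range computation in process F111 can be illustrated with a minimal sketch. It assumes that the image range is the smallest axis-aligned rectangle enclosing the accumulated selection symbol positions, which is only one of the figures the range specification function 302 could use. The names `bounding_rect` and `crop`, and the representation of an image as a list of pixel rows, are hypothetical and are not taken from the specification.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[int, int]            # (x, y) in image pixel coordinates
Rect = Tuple[int, int, int, int]   # (left, top, right, bottom), all inclusive


def bounding_rect(points: Iterable[Point]) -> Rect:
    """Smallest axis-aligned rectangle containing every accumulated position."""
    pts = list(points)
    if not pts:
        raise ValueError("no selection symbol positions were accumulated")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return min(xs), min(ys), max(xs), max(ys)


def crop(image: Sequence[Sequence[int]], rect: Rect) -> List[List[int]]:
    """Cut the rectangle out of a row-major image, clamped to the image bounds."""
    left, top, right, bottom = rect
    height = len(image)
    width = len(image[0]) if height else 0
    left, top = max(left, 0), max(top, 0)
    right, bottom = min(right, width - 1), min(bottom, height - 1)
    return [list(row[left:right + 1]) for row in image[top:bottom + 1]]
```

The cropped region is what would be handed to character recognition. A closed loop or a line segment, which the claims also mention, would need a different geometric test.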
[0087]
If characters can be recognized, the recognized characters are confirmed, selected, and corrected (S104, F114). If characters remain after this confirmation, selection, and correction, their character code data is stored in the storage unit 101 (F115) and the process ends. Otherwise, no character code data is stored and the process ends.
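The posture-driven part of this flow (processes F102 to F112) behaves like a small state machine, sketched below for illustration. The sketch assumes that some recognizer reports one posture per captured frame. The names `Posture`, `State`, and `RangeGesture` are hypothetical, and the grouping of F-numbers into states is an interpretation of the description rather than part of it.

```python
from enum import Enum, auto
from typing import List, Optional, Tuple

Point = Tuple[int, int]


class Posture(Enum):
    NONE = auto()        # neither posture was recognized in this frame
    POINTING = auto()
    SELECTION = auto()


class State(Enum):
    WAIT_POINTING = auto()   # F102/F103: waiting for the pointing posture
    POINTING = auto()        # F104-F106: pointing symbol shown
    SELECTING = auto()       # F107-F109: selection positions accumulated
    RANGE_READY = auto()     # F110-F112: range can be computed
    ABORTED = auto()         # neither posture recognized during F109


class RangeGesture:
    """Tracks one range-specifying gesture, one captured frame at a time."""

    def __init__(self) -> None:
        self.state = State.WAIT_POINTING
        self.locus: List[Point] = []   # accumulated selection symbol positions

    def step(self, posture: Posture, pos: Optional[Point] = None) -> State:
        if self.state is State.WAIT_POINTING:
            if posture is Posture.POINTING:
                self.state = State.POINTING
        elif self.state is State.POINTING:
            if posture is Posture.SELECTION and pos is not None:
                self.locus.append(pos)
                self.state = State.SELECTING
        elif self.state is State.SELECTING:
            if posture is Posture.SELECTION and pos is not None:
                self.locus.append(pos)
            elif posture is Posture.POINTING:
                self.state = State.RANGE_READY
            else:
                # Assumption: a selection posture without a usable position
                # is treated like an unrecognized posture.
                self.state = State.ABORTED
        return self.state   # RANGE_READY and ABORTED are terminal
```

Once `RANGE_READY` is reached, `locus` holds the positions from which the image range would be derived.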
[0088]
FIG. 11 is a schematic diagram showing Embodiment 2 of the present invention. In this example, the display unit 103 displays a character recognition target image stored in the storage unit 101 together with an image, including the hand and fingers, captured by the camera of the image input unit 104 worn or carried by the user. While looking at the display, the user selects the characters “laboratory A” contained in the target image stored in the storage unit 101 by using the range specification function 302.
[0089]
In the example shown in FIG. 11, the image being captured is superimposed on the target image at the lower right of the display screen. Of course, it may be placed at another position or displayed without being superimposed; for example, the target image may be displayed at the upper left of the display screen, the captured image at the upper right, and other information below them.
[0090]
FIG. 12 is a flow of processing performed by the device of the present invention in Embodiment 2 of the present invention. FIG. 13 is a flow of operations performed by the user when the processing in FIG. 12 reaches F216.
[0091]
First, the apparatus captures an image with the image input unit 104, superimposes on the captured image presentation information for the user, including a virtual button (or menu) for starting Embodiment 2 and other virtual buttons and menus, and displays the result on the display unit 103 as a graphical user interface (GUI) (F200). Process F200 is repeated until it is recognized that the user has pointed, in the pointing posture, at the portion where the virtual button for starting Embodiment 2 is displayed and has selected it in the selection posture (S200), or that another virtual button or menu has been selected (F201). If it is recognized that a virtual button or menu other than the one for starting Embodiment 2 has been selected, the process proceeds to other processing (F218).
[0092]
When it is recognized in process F201 that Embodiment 2 has been selected, an image is captured, and a list is displayed of thumbnail images (reduced images) of the images stored in the storage unit 101, or of thumbnail images received from the Internet or a host device via the communication unit 102 (F202).
[0093]
FIG. 14 is a schematic diagram illustrating an example of a GUI for selecting a target image for character recognition. In this example, a list of thumbnail images and up and down scroll virtual buttons for scrolling the list are displayed. The user can select the target image: for example, the apparatus recognizes that the user has pointed, in the pointing posture, at the portion where the thumbnail of the desired image is displayed and has selected it in the selection posture (S201) (F203).
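Selecting a thumbnail or a virtual button in this way amounts to a hit test between the symbol position and the screen regions of the GUI elements. The following is a minimal sketch under the assumption that a selection counts only when the pointing position and the selection position fall on the same element. The class `VirtualButton` and the functions `button_at` and `selected_button` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[int, int]


@dataclass(frozen=True)
class VirtualButton:
    """A rectangular GUI element such as a thumbnail or a scroll button."""
    name: str
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def button_at(buttons: Sequence[VirtualButton], pos: Point) -> Optional[VirtualButton]:
    """Return the first button whose area contains pos, or None."""
    x, y = pos
    for button in buttons:
        if button.contains(x, y):
            return button
    return None


def selected_button(buttons: Sequence[VirtualButton],
                    pointed_at: Point,
                    selected_at: Point) -> Optional[VirtualButton]:
    """Button pointed at in the pointing posture and then selected in the
    selection posture; None if the two positions hit different elements."""
    pointed = button_at(buttons, pointed_at)
    if pointed is not None and pointed == button_at(buttons, selected_at):
        return pointed
    return None
```

The same test could serve the start buttons of processes F100 and F200 and the scroll buttons of FIG. 14.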
[0094]
When the target image is selected in process F203, an image is captured and a GUI including the captured image and the target image is displayed, as in the example of FIG. 11 (F204). Process F204 is then repeated until the finger image recognition function 300 recognizes the user's pointing posture (S202) (F205).
[0095]
When the pointing posture is recognized in process F205, the symbol position determination function 301 determines the position of the pointing symbol (F206), an image is captured, and a GUI including the pointing symbol is displayed (F207). Process F207 is then repeated until the finger image recognition function 300 recognizes the user's selection posture (S203) (F208).
[0096]
When the selection posture is recognized in process F208, the symbol position determination function 301 determines the position of the selection symbol, the position is accumulated for the range specification function 302 (F209), an image is captured, and a GUI including the selection symbol and symbols indicating the locus of the selection symbol (white circles in the example of FIG. 11) is displayed (F210). While the finger image recognition function 300 continues to recognize the user's selection posture (S203), processes F209 and F210 are repeated (F211). When the pointing posture (S204) is recognized in process F211, the symbol position determination function 301 determines the position of the pointing symbol (F212). If neither the pointing posture nor the selection posture is recognized in process F211, the process ends and no character code data is stored.
[0097]
After the position of the pointing symbol is determined in process F212, the range specification function 302 obtains the image range (F213), an image is captured, and a GUI including the pointing symbol and a figure representing the image range (a dotted rectangle in the example of FIG. 11) is displayed (F214). Next, the character recognition function 303 performs character recognition on the portion of the target image that lies within the image range. If no character can be recognized, the process ends and no character code data is stored (F215).
[0098]
If characters can be recognized, the recognized characters are confirmed, selected, and corrected (S205, F216). If characters remain after this confirmation, selection, and correction, their character code data is stored in the storage unit 101 (F217) and the process ends. Otherwise, no character code data is stored and the process ends.
[0099]
Of course, the character code data stored in the storage unit 101 when Embodiment 1 or Embodiment 2 finishes can be used as a search keyword for the Internet or a database, as a command or input characters for an operating system or application, or as additional information for the image used for character recognition (in this case, the captured image). For example, as post-processing, the character code data stored in the storage unit 101 can be sent to the GUI as keyboard events, or embedded in the header information of the target image.
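Two of these uses can be sketched with the Python standard library alone. The search endpoint `https://search.example/find`, the query parameter `q`, and the dictionary layout of a key event are all placeholders invented for this illustration; the specification does not name a search service or an event format.

```python
from typing import Dict, List, Union
from urllib.parse import urlencode

# Hypothetical endpoint, used only to show how a keyword would be encoded.
SEARCH_ENDPOINT = "https://search.example/find"


def search_url(text: str) -> str:
    """Build a search URL whose keyword is the recognized text."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': text})}"


def as_key_events(text: str) -> List[Dict[str, Union[str, int]]]:
    """Turn recognized text into one key event per character.

    The event layout is an assumption; a real GUI toolkit defines its own.
    """
    return [{"type": "keypress", "char": ch, "code_point": ord(ch)} for ch in text]
```

For instance, `search_url("laboratory A")` yields a URL ending in `?q=laboratory+A`, because `urlencode` encodes spaces as `+`.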
[0100]
Here, processes F114 and F216 and operations S104 and S205, that is, the processing (and user operations) for confirming, selecting, and correcting recognized characters, are described in detail. FIG. 15 is a schematic diagram illustrating an example of a GUI for selecting a recognized character candidate. In the example of this figure, four recognized character candidates are obtained from the character recognition function 303, and the characters corresponding to their character code data are rendered with character font data and displayed side by side like buttons.
[0101]
FIG. 16 is a schematic diagram illustrating an example of a GUI for correcting a recognized character. In the example of this figure, an editable text box is displayed at the top, and a numeric keypad type virtual keyboard widely used for mobile phones or the like is displayed below it.
[0102]
FIG. 17 is a flow of processing performed by the device of the present invention for confirmation, selection, and correction of recognized characters. FIG. 18 is a flow of operations performed by the user when the process of FIG. 17 reaches F307.
[0103]
First, the number of recognition result candidates obtained by the character recognition function 303 is checked (F300). If there are a plurality of candidates, an image is captured and a GUI including a list of the candidates is displayed as shown in FIG. 15 (F301). The user checks the list (S300) and can select a candidate: for example, the apparatus recognizes that the user has pointed, in the pointing posture, at the portion where the desired candidate is displayed and has selected it in the selection posture (S301) (F302).
[0104]
Next, an image is captured, and a GUI is displayed that includes the candidate (the sole candidate if there is only one, or the selected candidate if there were several) together with a decision virtual button, a correction virtual button, and a cancel virtual button (F303). Process F303 is repeated until one of the virtual buttons is selected (F304).
[0105]
If it is recognized that the cancel virtual button has been selected, the process is canceled. If it is recognized that the decision virtual button has been selected, the character code data of the candidate displayed in process F303 is passed to the subsequent processing, for example, F115 or F217. If it is recognized that the correction virtual button has been selected (S302), an image is captured, and a GUI including a text box showing the candidate displayed in process F303 and a virtual keyboard is displayed as shown in FIG. 16 (F305).
[0106]
The cursor in the text box can be moved to the position of the character to be corrected by pointing with the pointing posture and selecting with the selection posture. For example, if there is a character to be erased, the character is designated and selected, and then the clear key of the virtual keyboard is selected. If a character is to be input at the cursor position, each key on the virtual keyboard may be selected and input (F306, S303).
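The editing behavior described here can be modeled with a short sketch. It assumes that selecting a character places the cursor on that character, that the clear key deletes the character under the cursor, and that any other key inserts its character at the cursor and advances it. The class `TextBox` and its method names are hypothetical.

```python
from typing import List


class TextBox:
    """Editable text box operated through a virtual keyboard."""

    def __init__(self, text: str = "") -> None:
        self.chars: List[str] = list(text)
        self.cursor = len(self.chars)   # index at which the next key is inserted

    def select_char(self, index: int) -> None:
        """Point at and select the character at index; the cursor moves there."""
        if not 0 <= index < len(self.chars):
            raise IndexError("no character at that position")
        self.cursor = index

    def press_clear(self) -> None:
        """Clear key: erase the character under the cursor, if there is one."""
        if self.cursor < len(self.chars):
            del self.chars[self.cursor]

    def press_key(self, ch: str) -> None:
        """Character key: insert ch at the cursor and move the cursor past it."""
        self.chars.insert(self.cursor, ch)
        self.cursor += 1

    @property
    def text(self) -> str:
        return "".join(self.chars)
```

For example, a misrecognized "laboratary A" would be corrected by selecting the eighth character, pressing the clear key, and then pressing the "o" key.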
[0107]
During the correction, it is recognized whether or not the user has selected the decision virtual button (S304) (F307), and the correction process (operation) is continued unless selected. If the decision virtual button is selected, the character code data of the character currently displayed in the text box is transferred to subsequent processing, for example, F115 and F217.
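Processes F300 to F307 taken together can be summarized as a function that consumes a sequence of recognized GUI events and returns the text to be stored, or nothing if the process is canceled. This is a minimal sketch: the event names `pick`, `decide`, `correct`, `edit`, and `cancel` are invented for illustration, and ignoring `cancel` once correction has started is an assumption, since the description mentions only the decision virtual button at that stage.

```python
from typing import Any, Iterable, Optional, Sequence, Tuple

Event = Tuple[str, Any]   # (event name, payload); the names are hypothetical


def confirm_characters(candidates: Sequence[str],
                       events: Iterable[Event]) -> Optional[str]:
    """Return the confirmed text, or None if nothing is to be stored."""
    stream = iter(events)
    if not candidates:
        return None
    if len(candidates) > 1:                 # F300-F302: choose from the list
        first = next(stream, None)
        if first is None or first[0] != "pick":
            return None
        current = candidates[int(first[1])]
    else:
        current = candidates[0]
    correcting = False
    for kind, value in stream:              # F303/F304, then F305-F307
        if kind == "decide":
            return current                  # passed on to F115 or F217
        if kind == "cancel" and not correcting:
            return None
        if kind == "correct":
            correcting = True               # text box and virtual keyboard shown
        elif kind == "edit" and correcting:
            current = str(value)            # text currently in the text box
    return None
```

A caller would store the returned string as character code data only when the result is not `None`.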
[0108]
Needless to say, the characters to be recognized are not limited to the characters of letters and signs; handwritten characters may also be recognized. For example, character information can be input by handwriting characters on paper or a whiteboard, photographing them, and specifying a range with the hand and fingers. Furthermore, if the processing power of computers and the accuracy of character recognition improve, all characters in the image may be recognized without specifying a range, after which only the result candidates to be used are pointed at and selected with the hand and fingers, as shown in the figure.
[0109]
In addition, the operations performed above with virtual buttons and menus can also be performed by other methods. For example, processing could be canceled by pointing at and selecting a particular part of the display screen, such as the upper left corner. Further, various hand postures, such as the paper, scissors, and rock shapes of rock-paper-scissors (janken), may be recognized and used not only as the pointing posture and the selection posture but also as a decision posture and a cancel posture.
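Both alternatives can be sketched as a small dispatch function. The assignment of "paper" to decide and "rock" to cancel, the 10% size of the corner zone, and the function names are all assumptions made for this illustration; the description leaves these choices open.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[int, int]
Size = Tuple[int, int]   # (screen width, screen height) in pixels

# Hypothetical mapping from recognized hand shapes to operations.
SHAPE_TO_OPERATION: Dict[str, str] = {"paper": "decide", "rock": "cancel"}


def in_cancel_corner(pos: Point, screen: Size, fraction: float = 0.1) -> bool:
    """True if pos lies in the upper left corner zone of the screen."""
    x, y = pos
    width, height = screen
    return 0 <= x < width * fraction and 0 <= y < height * fraction


def operation(shape: Optional[str],
              selected_at: Optional[Point],
              screen: Size) -> Optional[str]:
    """Map a recognized hand shape or a selected screen position to an operation."""
    if shape in SHAPE_TO_OPERATION:
        return SHAPE_TO_OPERATION[shape]
    if selected_at is not None and in_cancel_corner(selected_at, screen):
        return "cancel"
    return None
```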
[0110]
Next, another embodiment according to the present invention will be described. Needless to say, the object of the present invention can also be achieved by supplying a system or apparatus with a recording medium on which a program realizing the functions of the above-described embodiments is recorded, and having a computer, CPU, or MPU of the system or apparatus read and execute the program stored on the recording medium.
[0111]
In this case, the program itself read from the recording medium realizes the functions of the above-described embodiment, and the recording medium on which the program is recorded constitutes the present invention.
[0112]
As the recording medium for recording the program and for recording data read by the program, it is possible to use, for example, tape media such as magnetic tape and cassette tape; disk media including magnetic disks such as floppy disks and hard disks and optical (magneto-optical) disks such as CD-ROM/CD-R/CD-RW/MO/MD/DVD-ROM/DVD-R/DVD-RW/DVD+RW/DVD-RAM; card media such as nonvolatile memory cards and optical cards; or a ROM or the like.
[0113]
In addition, the functions of the above-described embodiments are realized not only when the computer executes the program it has read. Needless to say, the invention also includes the case where an OS or the like running on the computer performs part or all of the actual processing based on the instructions of the program, and the functions of the above-described embodiments are realized by that processing.
[0114]
[Effect of the invention]
As described above, according to the present invention, characters displayed indoors or outdoors and handwritten characters are converted into character code data using a camera, a display, and the user's hand and fingers, so that character information can be entered quickly, reliably, and easily. In particular, the present invention provides a character information input interface that is effective for wearable or portable devices such as wearable computers, portable computers, mobile phones, PDAs, remote controllers, and digital cameras.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a character information input device according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating a configuration of a host of the character information input device according to the embodiment of this invention.
FIG. 3 is a schematic diagram showing a configuration of a communication system of the character information input device according to the embodiment of the present invention.
FIG. 4 is a schematic diagram showing a wearing position of a camera constituting the image input unit 104.
FIG. 5 is a schematic diagram illustrating a wearing position of a display device constituting the display unit 103.
FIG. 6 is a schematic diagram showing an example of hand and finger postures taken by the image input unit 104 when using the character information input device according to the embodiment of the present invention.
FIG. 7 is a schematic diagram illustrating an example of a rectangle specified by the range specification function 302 and the range specification function 402.
FIG. 8 is a schematic diagram showing Embodiment 1 of the present invention.
FIG. 9 is a flow of processing performed by the device of the present invention in Embodiment 1 of the present invention.
FIG. 10 is an example of a flow of operations performed by the user when the process of FIG. 9 reaches F114.
FIG. 11 is a schematic diagram showing Embodiment 2 of the present invention.
FIG. 12 is a flow of processing performed by the device of the present invention in Embodiment 2 of the present invention.
FIG. 13 is a flow of operations performed by the user when the processing in FIG. 12 reaches F216.
FIG. 14 is a schematic diagram illustrating an example of a GUI for selecting a target image for character recognition.
FIG. 15 is a schematic diagram showing an example of a GUI for selecting a recognized character candidate.
FIG. 16 is a schematic diagram illustrating an example of a GUI for correcting a recognized character.
FIG. 17 is a flow of processing performed by the device of the present invention for confirmation, selection, and correction of a recognized character.
FIG. 18 is a flow of operations performed by the user when the processing of FIG. 17 reaches F307.
[Explanation of symbols]
10 Terminal device
20 Host device
30 Base station
31 Switching network
50 Camera worn on shoulder
51 Camera worn on the chest
52 Camera worn on ear
53 Camera mounted on the frame of glasses, sunglasses, or the like
60 Display attached to a fixture supported at the forehead and temples
61 Display attached to a fixture supported at the ears and the back of the head
62 Display attached to the front or inside of a lens of glasses, sunglasses, or the like
100 Information processing control unit (equipped in terminal device 10)
101 Storage unit (equipped in the terminal device 10)
102 Communication unit (equipped in terminal device 10)
103 Display unit (equipped in the terminal device 10)
104 Image input unit (equipped in terminal device 10)
200 Information processing control unit (equipped in host device 20)
201 Storage unit (equipped in the host device 20)
202 Communication unit (equipped in host device 20)
300 Finger image recognition function (function of terminal device 10)
301 Symbol position determination function (function of terminal device 10)
302 Range specification function (function of terminal device 10)
303 Character recognition function (function of terminal device 10)
304 Character code data conversion function (function of terminal device 10)
305 Image compression / decompression function (function of terminal device 10)
400 Finger image recognition function (function of host device 20)
401 Symbol position determination function (function of host device 20)
402 Range specification function (function of host device 20)
403 Character recognition function (function of host device 20)
404 Character code data conversion function (function of host device 20)
405 Image compression / decompression function (function of host device 20)
500 Pointing posture using two fingers
501 Selection posture using two fingers
502 Pointing posture using one finger
503 Selection posture using one finger

Claims

A character information input device comprising: image input means for inputting, from a camera, a captured image containing characters to be recognized and a hand and fingers; image display means for displaying an image on the screen of a display; finger image recognition means for recognizing the position of the hand and fingers shown in the captured image and a pointing posture and a selection posture; symbol position determination means for determining the on-screen position, on the display, of a pointing symbol or selection symbol such as a cursor or pointer based on the hand and finger position and the posture recognized by the finger image recognition means; symbol display means for displaying the symbol at the on-screen position determined by the symbol position determination means; range specification means for specifying an image range by means of the symbol and the posture; character recognition means for recognizing characters from the image in the range specified by the range specification means; character code data conversion means for converting the characters recognized by the character recognition means into character code data; and character code data storage means for storing the character code data converted by the character code data conversion means in a storage medium.

The character information input device according to claim 1, further comprising: input image transmission means for transmitting the captured image input by the image input means, compressed or uncompressed, via a wireless or wired communication circuit; and input image reception means for receiving the compressed or uncompressed image and decompressing it if it is a compressed image.

The character information input device according to claim 1 or 2, further comprising: range image transmission means for transmitting the image in the range specified by the range specification means, compressed or uncompressed, via a wireless or wired communication circuit; and range image reception means for receiving the compressed or uncompressed range image and decompressing it if it is a compressed image.

The character information input device according to any one of claims 1 to 3, further comprising: image storage means for storing, in a storage medium, an image input by the image input means or received via a wireless or wired communication line; and image display means for displaying the image stored by the image storage means on the display.

The character information input device according to any one of claims 1 to 4, further comprising character display means for displaying, on the display, characters corresponding to the character code data stored by the character code data storage means, using character font data.

The character information input device according to any one of claims 1 to 5, wherein the character code data stored by the character code data storage means is used as a search keyword for the Internet or a database, as a command or input characters for an operating system or application, or as additional information for the image used for character recognition.

The character information input device according to any one of claims 1 to 6, wherein the camera is a camera worn directly on a part of the user's body, mounted on an item worn on a part of the body, or mounted on a portable device.

The character information input device according to any one of claims 1 to 7, wherein the display is a display that is worn directly on the head, mounted on an item worn on the head, worn directly on the arm, or mounted on an item worn on the arm, so as to be within the user's field of view.

The character information input device according to any one of claims 1 to 8, wherein the range specification means specifies the image range based on a figure, such as a rectangle, a closed loop, a straight line segment, or a curve segment, expressed by the set of positions of the selection symbol from when the selection posture is recognized after the pointing posture has been recognized until the pointing posture is recognized again.

The character information input device according to any one of claims 1 to 9, further comprising: means for converting, when there are a plurality of candidate character recognition results from the character recognition means, each of the candidates into character code data by the character code data conversion means; character recognition candidate display means for displaying, on the display, characters corresponding to the character code data using character font data; character recognition result selection means for pointing at one of the displayed candidates in the pointing posture and selecting it in the selection posture; and means for storing the character code data of the candidate selected by the character recognition result selection means in a storage medium by the character code data storage means.

The character information input device according to any one of claims 1 to 10, further comprising: virtual keyboard display means for displaying a virtual keyboard on the display when the candidate character recognition results from the character recognition means do not include the correct answer; virtual keyboard operation means for operating the virtual keyboard by the pointing posture and the selection posture; means for selecting one of the candidates by the character recognition candidate display means and the character recognition result selection means; character recognition result correction means for displaying the selected candidate by the character display means and correcting it by the virtual keyboard operation means; and means for storing the character code data of the characters corrected by the character recognition result correction means in a storage medium by the character code data storage means.

A character information input method for a character information input device, comprising: a step of inputting, from a camera, a captured image containing characters to be recognized and a hand and fingers; an image display step of displaying an image on the screen of a display; a step of recognizing the position of the hand and fingers shown in the captured image and a pointing posture and a selection posture; a step of determining the on-screen position, on the display, of a pointing symbol or selection symbol such as a cursor or pointer based on the hand and finger position and the posture recognized by the finger image recognition means; a step of displaying the symbol at the on-screen position determined by the symbol position determination means; a step of specifying an image range by means of the symbol and the posture; a step of recognizing characters from the image in the range specified by the range specification means; a step of converting the characters recognized by the character recognition means into character code data; and a step of storing the character code data converted by the character code data conversion means in a storage medium.

The character information input method according to claim 12, comprising: a step of transmitting the captured image, compressed or uncompressed, via a wireless or wired communication circuit; and a step of receiving the compressed or uncompressed image and decompressing it if it is a compressed image.

The character information input method according to claim 12 or 13, comprising: a step of transmitting the image in the specified range, compressed or uncompressed, via a wireless or wired communication circuit; and a step of receiving the compressed or uncompressed range image and decompressing it if it is a compressed image.

The character information input method according to any one of claims 12 to 14, comprising: a step of storing, in a storage medium, the captured image or an image received via a wireless or wired communication line; and a step of displaying the image stored by the image storage means on the display.

The character information input method according to any one of claims 12 to 15, comprising a step of displaying, on the display, characters corresponding to the character code data using character font data.

The character information input method according to any one of claims 12 to 16, wherein the character code data is used as a search keyword for the Internet or a database, as a command or input characters for an operating system or application, or as additional information for the image used for character recognition.

The character information input method according to any one of claims 12 to 17, wherein the camera is a camera worn directly on a part of the user's body, mounted on an item worn on a part of the body, or mounted on a portable device.

The character information input method according to any one of claims 12 to 18, wherein the display is a display that is worn directly on the head, mounted on an item worn on the head, worn directly on the arm, or mounted on an item worn on the arm, so as to be within the user's field of view.

The character information input method according to any one of claims 12 to 19, wherein, in the step of specifying the image range, the image range is specified based on a figure, such as a rectangle, a closed loop, a straight line segment, or a curve segment, expressed by the set of positions of the selection symbol from when the selection posture is recognized after the pointing posture has been recognized until the pointing posture is recognized again.

The character information input method according to any one of claims 12 to 20, comprising: a step of converting, when there are a plurality of candidate character recognition results, each of the candidates into character code data by the character code data conversion means; a step of displaying, on the display, characters corresponding to the character code data using character font data; a step of pointing at one of the displayed candidates in the pointing posture and selecting it in the selection posture; and a step of storing the character code data of the selected candidate in a storage medium by the step of storing the character code data.

The character information input method according to any one of claims 12 to 21, comprising: a step of displaying a virtual keyboard on the display when the candidate character recognition results do not include the correct answer; a step of operating the virtual keyboard by the pointing posture and the selection posture; a step of selecting one of the character recognition candidates; a step of correcting the selected candidate with the virtual keyboard displayed on the display; and a step of storing the character code data of the corrected characters in a storage medium.

23. A computer-readable recording medium on which is recorded a program for causing a computer to carry out a character information input method of a character information input device, the program causing the computer to: input, from a camera, a captured image containing the characters to be recognized and a hand and fingers; display the image on the screen of a display; recognize the position, the pointing posture, and the selection posture of the hand and fingers shown in the captured image; determine, based on the recognized position and posture of the hand and fingers, the position on the screen of the display of a symbol such as a cursor or a pointer; display the symbol at the determined position on the screen of the display; designate an image range by means of the symbol and the posture; recognize characters from the image within the designated range; convert the recognized characters into character code data; and store the converted character code data in a storage medium.
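
The sequence of operations in the preceding claim can be sketched, under stated assumptions, as a single per-frame function. The hand recognizer, the range designator, and the character recognizer are passed in as callables because the claim does not fix how they are implemented; `HandState`, `process_frame`, the identity mapping from hand position to symbol position, and the use of Unicode code points as character code data are all hypothetical choices for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Point = Tuple[int, int]
Region = Tuple[int, int, int, int]  # left, top, right, bottom


@dataclass
class HandState:
    position: Point  # recognized hand/finger position in the captured image
    posture: str     # e.g. "point" or "select" (labels assumed for this sketch)


def process_frame(frame: bytes,
                  recognize_hand: Callable[[bytes], HandState],
                  designate_range: Callable[[str, Point], Optional[Region]],
                  recognize_text: Callable[[bytes, Region], str],
                  storage: List[List[int]]) -> Tuple[Point, Optional[List[int]]]:
    """Run one pass: hand recognition, symbol placement, range, OCR, storage."""
    hand = recognize_hand(frame)
    # Assumption: the symbol is drawn at the recognized hand position.
    symbol_pos = hand.position
    region = designate_range(hand.posture, symbol_pos)
    if region is None:
        return symbol_pos, None  # no range designated yet; only move the symbol
    text = recognize_text(frame, region)
    codes = [ord(ch) for ch in text]  # character code data (Unicode assumed)
    storage.append(codes)             # stands in for the storage medium
    return symbol_pos, codes
```

Keeping the three recognizers as parameters lets the same loop run with stub functions during testing and with real image processing on the device.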

24. The computer-readable recording medium on which the program according to claim 23 is recorded, wherein the program causes the computer to transmit the captured image, compressed or uncompressed, via a wireless or wired communication circuit, to receive the compressed or uncompressed image, and to decompress the image if it is a compressed image.
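
The optional compression in the preceding claim implies that the receiver must be able to tell a compressed image from an uncompressed one. A minimal sketch, assuming a one-byte flag prefixed to the payload and zlib as the codec (neither of which is specified by the claim), could look as follows; `pack_image` and `unpack_image` are names invented for this example, and the actual wireless or wired transport is left out.

```python
import zlib

COMPRESSED = b"\x01"
UNCOMPRESSED = b"\x00"


def pack_image(pixels: bytes, compress: bool) -> bytes:
    """Prefix the payload with a flag so the receiver knows whether to expand it."""
    if compress:
        return COMPRESSED + zlib.compress(pixels)
    return UNCOMPRESSED + pixels


def unpack_image(message: bytes) -> bytes:
    """Recover the original bytes, decompressing only when the flag says so."""
    flag, body = message[:1], message[1:]
    if flag == COMPRESSED:
        return zlib.decompress(body)
    if flag == UNCOMPRESSED:
        return body
    raise ValueError("unknown compression flag")
```

zlib is lossless, so the round trip returns the image bytes unchanged; a lossy image codec would be an equally valid reading of the claim.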

25. The computer-readable recording medium on which the program according to claim 23 or 24 is recorded, wherein the program causes the computer to transmit the image of the designated range, compressed or uncompressed, via a wireless or wired communication circuit, to receive the compressed or uncompressed range image, and to decompress the image if it is a compressed image.

26. The computer-readable recording medium on which the program according to any one of claims 23 to 25 is recorded, wherein the program causes the computer to store the captured image, or an image received via a wireless or wired communication line, in a storage medium and to display the stored image on the display.

27. The computer-readable recording medium on which the program according to any one of claims 23 to 26 is recorded, wherein the program causes the computer to display the characters corresponding to the stored character code data on the display using character font data.

28. The computer-readable recording medium on which the program according to any one of claims 23 to 27 is recorded, wherein the program causes the computer to use the character code data as a search keyword for the Internet or a database, as a command or input characters for an operating system or an application, or as additional information for the image used for the character recognition.

29. The computer-readable recording medium on which the program according to any one of claims 23 to 28 is recorded, wherein, when causing the computer to designate the image range, the program causes the image range to be designated based on a figure, such as a rectangle, a closed loop, a straight line segment, or a curved line segment, expressed by the set of positions of the selection symbol during the period from when the selection posture is recognized, after the pointing posture has been recognized, until the pointing posture is recognized again.

30. The computer-readable recording medium on which the program according to any one of claims 23 to 29 is recorded, wherein, when there are a plurality of candidates for the result of the character recognition, the program causes the computer to convert each of the candidates into character code data, to display the characters corresponding to the character code data on the display using character font data, to have one of the displayed candidates pointed to with the pointing symbol and selected with the selection posture, and, when storing the character code data, to store the character code data of the selected candidate in the storage medium.

31. The computer-readable recording medium on which the program according to any one of claims 23 to 30 is recorded, wherein, when the candidates for the result of the character recognition do not include the correct answer, the program causes the computer to display a virtual keyboard on the display, to have the virtual keyboard operated with the pointing posture and the selection posture, to have one of the character recognition candidates selected, to display the selected candidate on the display and have it corrected with the virtual keyboard, and to store the character code data of the corrected characters in a storage medium.