JP2009169464A - Character input method

Info

Publication number: JP2009169464A
Application number: JP2008003472A
Authority: JP
Inventors: Takashi Saito (齊藤 剛史); Ryosuke Konishi (小西 亮介)
Original assignee: Tottori University NUC
Current assignee: Tottori University NUC
Priority date: 2008-01-10
Filing date: 2008-01-10
Publication date: 2009-07-30

Abstract

PROBLEM TO BE SOLVED: Mouth-shape recognition has been proposed as a way to lighten the burden of key operation when entering characters into electronic equipment. However, the proposed methods combine mouth-shape recognition with key operation or with gaze, so they do not necessarily lighten the burden on users who have difficulty making fine hand and finger movements.

SOLUTION: A method and a system are disclosed that recognize the shape of the mouth, display input character candidates one after another on a display screen, and select the input character when the user confirms the desired candidate with a simple movement of the face.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a method of inputting characters into an electronic device, and more particularly to a method of determining an input character from the shape of the user's mouth and the movement of the mouth and nose.

With the spread of devices such as computers and mobile phones, more and more people have occasion to use them. The usual interfaces for operating these devices are hand-operated: keyboards, mice, and buttons.

However, not everyone can enter characters by skillfully operating a keyboard or mouse. This is expected to become an increasingly important issue as society ages. Keyboards and mice are also difficult to operate for people with hand or finger disabilities.

Speech recognition is one candidate for a hands-free interface. It has been studied for a long time and is gradually coming into use, but it is affected by ambient noise. Interfaces that use gaze instead of speech have also been developed, but deliberately shifting one's gaze places a burden on the user.

Gaze-based input also makes it difficult to distinguish a user who is simply looking at an object from one who is looking at it with some intention toward the device.

Lip reading, which recognizes utterance content from image information, is also being studied, but its recognition rate is lower than that of speech recognition and it cannot yet be said to have reached practical use.

For example, Patent Document 1 proposes a text input system that combines mouth shape with key operation. Text input is realized by assigning keys to the different consonants of the Japanese kana table (the 50-sound table) and mouth shapes to the vowels.

Patent Document 2 proposes a system that replaces the key operation of Patent Document 1 with eye movement.

Cited patent documents: JP 2005-309952 A, JP 2002-269544 A

Patent Document 1 achieves fast text input by using mouth shape together with key operation, but entering text still requires key operation and therefore fine finger movement.

Patent Document 2 proposes text input using mouth shape and eye movement, but it does not describe the processing in detail, and it is unclear at what timing of eye movement and mouth shape a character is entered. Moreover, deliberately moving the eyes quickly to enter characters places a burden on the user.

Various character input methods have thus been proposed, but they do not necessarily offer a low-burden method for users who have difficulty with fine finger movement.

In view of the above problems, the present invention provides a method of entering characters using only mouth movement and simple face movement, as a low-burden input method for users who have difficulty with fine finger movement. That is, an object of the present invention is to provide a method of entering characters by mouth shape, without using the fingers, when operating devices such as computers and portable terminals such as mobile phones.

That is, a first aspect of the present invention provides a method of inputting characters into an electronic device, the method comprising: a step of recognizing the shape of the mouth; a step of selecting a candidate group of input characters based on the shape of the mouth; a step of displaying the characters belonging to the candidate group in order; a step of detecting movement of the mouth or nose each time a character is displayed; and a step of selecting, based on the movement of the mouth or nose, a process to be applied to the candidate group or to the character being displayed.

A second aspect of the present invention provides the character input method according to the first aspect, wherein the selected process includes a process of taking the character being displayed as the input character.

A third aspect of the present invention provides the character input method according to the first aspect, wherein the selected process includes a process of replacing the candidate group with another candidate group.

A fourth aspect of the present invention provides the character input method according to the first aspect, wherein the selected process includes a process of changing the order in which characters from the candidate group are displayed.

A fifth aspect of the present invention provides the character input method according to the first aspect, further comprising, before the step of recognizing the shape of the mouth, a step of detecting movement of the mouth and a step of deleting, based on that movement, a character already entered as an input character.

The character input method of the present invention is not as fast as methods that use key operation, but it allows characters to be entered with only mouth movement and simple face movement. Furthermore, by associating the mouth movements with those made in speech, characters can be entered with mouth movements modeled on speaking, which is a natural means of human communication.

<Configuration>
FIG. 1 shows the configuration of a system that implements the character input method of the present invention. The system for character input includes an imaging means 100, a computer 200, and a display device 300.

The region extraction means 2, shape feature measurement means 3, vowel DB 4, motion measurement means 6, vowel determination means 7, motion determination means 9, and character input means 11 are components, functions, and processed items inside the computer 200. They are realized mainly in software, but may also be realized using multiple MPUs (Micro Processor Units) with programs, or with dedicated hardware. Input characters determined by the character input method of the present invention are stored in the character input means 11 and made available to other application software such as word-processing or e-mail software.

The feature display means 5, input candidate character display means 8, input character display means 10, and motion display means 12 make up the display device 300 that presents information to the user, and may be a display. Because these means process the information to be shown and also produce the display format, they may be programs implemented on the computer side. Alternatively, the display itself may be equipped with an MPU and may process and display information received from the computer.

In the character input method of the present invention, the imaging device 1 photographs the user's face, and the input character is determined from the image and its movement. Because the user selects the desired character from the input candidates shown on the display device 300, the display device 300 is a required component.

The imaging means 100 is the imaging device 1, specifically a camera that outputs image data. Because it acquires image data of the user's face, in particular the mouth and nose, it needs a field of view covering at least the lower half of the face. In addition, the positions and shapes of the mouth and nose are evaluated by image processing of the acquired image data, so adequate evaluation is not possible if these parts are too small in the frame. The camera's angle of view, focal length, magnification, and so on are designed as appropriate for the conditions in which the system is used.

The region extraction means 2 extracts image data of two regions, the nostril region and the intraoral region, from the image Ps of the user's face acquired by the imaging device 1. The nostril region is a region that includes at least the nostrils, since the nostrils can serve as a reference for detecting movement of the nose. The intraoral region is the region enclosed by the lips.

The intraoral region may include the lips. The image data of the nostril region and the intraoral region are denoted Pn and Pm, respectively. Pn and Pm are extracted from the face image Ps by image processing; the extraction method is not particularly limited.

The shape feature measurement means 3 measures shape features (the area S and the aspect ratio Asp of the intraoral region, hereinafter collectively called Qc) from the extracted intraoral image data Pm. Qc is measured by image processing. For example, the area S can be obtained by counting the pixels of Pm, and the aspect ratio Asp by counting the vertical and horizontal pixel extents of Pm and taking their ratio. Other methods may of course be used.
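The pixel-counting measurement described above can be sketched as follows. This is only an illustration under stated assumptions: the intraoral region Pm is taken to be a binary mask (1 for intraoral pixels), the function name `shape_features` is hypothetical, and the aspect ratio is assumed to be height divided by width, since the text does not fix the convention.

```python
def shape_features(mask):
    """Return (area, aspect_ratio) for a binary mask of the intraoral region.

    mask is a list of rows; each row is a list of 0/1 values (1 = intraoral pixel).
    The height/width convention for the aspect ratio is an assumption.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, v in enumerate(row) if v]
    if not rows:
        return 0, 0.0  # no intraoral pixels, e.g. the mouth is closed
    area = sum(1 for row in mask for v in row if v)  # area S = pixel count
    height = max(rows) - min(rows) + 1               # vertical pixel extent
    width = max(cols) - min(cols) + 1                # horizontal pixel extent
    return area, height / width


mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
area, asp = shape_features(mask)
print(area, asp)  # 8 0.5
```

The vertical extent computed here is the same quantity the motion measurement uses as the vertical width Lm, so one pass over the mask could serve both purposes.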

The vowel DB 4 is a database that registers measured shape features Qc in association with the input character candidate group Csa corresponding to those features. Input character candidate groups are formed by grouping the characters to be entered according to a predetermined rule, and each group contains multiple candidate characters Ctx. Because the area S and aspect ratio Asp registered here serve as the criteria for selecting the input character candidate group Csa in the later character input mode, they are called the reference area SS and the reference aspect ratio SA. SS and SA are collectively called the reference shape features SQc.

The feature display means 5 displays the shape features Qc measured by the shape feature measurement means 3. In addition to the features Qc currently obtained from the user's nose and mouth, the pre-registered reference shape features SQc may be displayed as well. By changing the mouth shape to match the positions of these reference features registered in the vowel DB, the user can easily form the vowel shape of the character to be entered.

The motion measurement means 6 measures, from the nostril image data Pn extracted by the region extraction means 2, the change over time of the nostril position Nxy as face movement, and measures the opening and closing of the mouth from the change over time of the vertical width Lm of the intraoral image data Pm. When the nostril position moves, it outputs a motion signal Mn; for opening and closing of the mouth, it outputs a motion signal Mm. It also outputs the latest values of Lm and Nxy to the motion display means 12.

Because the motion measurement means 6 detects changes over time, it stores the image data Pn and Pm from a fixed time earlier and compares them with the current Pn and Pm. Alternatively, rather than storing the image data itself, it may extract only the necessary parameters from each image and compare those.

For example, the nostril position Nxy and the vertical width Lm of the intraoral region may be stored at each time step and compared with past values to judge whether a change has occurred.
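A minimal sketch of this parameter-comparison approach is given below. The class name `MotionMeter`, the lag of a few frames, the pixel tolerance, and the direction labels are all assumptions made for illustration. Image coordinates are assumed to have y increasing downward, so a positive change in y is treated as a downward nose movement.

```python
from collections import deque


class MotionMeter:
    """Keeps recent (Nxy, Lm) samples and compares the newest with an older one."""

    def __init__(self, lag=5, tolerance=3.0):
        self.lag = lag                  # how many samples back to compare (assumed)
        self.tolerance = tolerance      # changes at or below this count as no movement
        self.history = deque(maxlen=lag + 1)

    def update(self, nostril_xy, mouth_height):
        """Store a sample; return (dx, dy, dh) against the sample `lag` steps ago."""
        self.history.append((nostril_xy, mouth_height))
        if len(self.history) <= self.lag:
            return None                 # not enough history yet
        (x0, y0), h0 = self.history[0]
        (x1, y1), h1 = self.history[-1]
        return x1 - x0, y1 - y0, h1 - h0

    def nose_motion(self, delta):
        """Classify a delta as 'none', 'sideways', 'down', or 'up'."""
        if delta is None:
            return "none"
        dx, dy, _ = delta
        if max(abs(dx), abs(dy)) <= self.tolerance:
            return "none"
        if abs(dx) > abs(dy):
            return "sideways"
        return "down" if dy > 0 else "up"


meter = MotionMeter(lag=2, tolerance=3.0)
meter.update((100, 100), 10)
meter.update((100, 101), 10)
delta = meter.update((100, 110), 10)
print(meter.nose_motion(delta))  # down
```

Storing only the two parameters instead of whole images keeps the memory cost per frame constant, which is the advantage the text points to.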

The motion display means 12 receives the nostril position Nxy and the vertical width Lm of the intraoral region from the motion measurement means 6 and displays them. Displaying the data over a predetermined time interval is preferable because it makes it easy for the user to see the movement of the nose and mouth.

The vowel determination means 7 compares the measured shape features Qc (area S and aspect ratio Asp) with the reference shape features SQc (reference area SS and reference aspect ratio SA) registered in the vowel DB 4, retrieves the input character candidate group Csa associated with the matching features, and displays its candidate characters Ctx one after another on the input candidate character display means 8. In response to control signals from the motion determination means 9, it changes the display order of the candidate characters Ctx or reacquires the input character candidate group Csa.

The motion determination means 9 receives the motion signals Mn and Mm from the motion measurement means 6 and performs processing according to the movement. For example, if a candidate character Ctx is currently displayed on the input candidate character display means 8 and the nose moves downward, it sends that candidate character Ctx to the character input means 11 as an input character and also displays it on the input character display means 10.

If no candidate character Ctx is displayed and the mouth opens and closes, one of the characters displayed on the input character display means 10 is deleted. In this case a control signal Scd for deleting one character is sent to the input character display means 10. The control signal Scd is also sent to the character input means 11 to delete one input character.

If a candidate character Ctx is displayed and the mouth opens and closes, the input character candidate group Csa is switched to another input character candidate group. This is because several input character candidate groups may be associated with the same reference shape features SQc (reference area SS and reference aspect ratio SA). For example, the same mouth shape may be associated with two groups: the 'a' row of the Japanese 50-sound table, and a group of special characters such as the dakuten (voicing mark). In this case a control signal Sca is sent to the vowel determination means 7 or the vowel DB 4.

If a candidate character Ctx is displayed and the nose moves sideways, the order in which the candidate characters Ctx of the input character candidate group Csa are displayed is reversed. That is, instead of being displayed in order starting from the first candidate character, the candidates are displayed in order toward the first candidate character. In this case a control signal Scs is sent to the vowel determination means 7.

If a candidate character Ctx is displayed and the nose moves upward, the display on the input candidate character display means 8 is cleared and the selection of the input character candidate group Csa is redone. That is, the area S and aspect ratio Asp are obtained from the shape feature measurement means 3 and compared with the reference area SS and reference aspect ratio SA, and the input character candidate group Csa is reacquired. In this case a control signal Scr is sent to the vowel determination means 7.

The motion determination means 9 may additionally send each component other control signals defined for movements of the mouth and nose. Although this description of the configuration has been written as if hardware components exchanged control signals and data, when the means are realized in software the signals may be passed to the respective processing routines as control data or values.
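When the means are realized in software, the decisions of the motion determination means 9 described above could be expressed as a small lookup table, as in the sketch below. The signal names Scd, Sca, Scs, and Scr come from the text; the function name `decide`, the motion labels, and the label "commit" for entering the displayed character are assumptions made for illustration.

```python
def decide(candidate_shown, motion):
    """Map (is a candidate character displayed?, detected motion) to an action.

    Returns a control signal name from the description, the hypothetical label
    'commit' for entering the displayed character, or None when the movement
    has no assigned meaning in the current state.
    """
    table = {
        (True, "nose_down"): "commit",       # enter the displayed candidate
        (True, "nose_up"): "Scr",            # clear candidates, reselect the group
        (True, "nose_sideways"): "Scs",      # reverse the display order
        (True, "mouth_open_close"): "Sca",   # switch to another candidate group
        (False, "mouth_open_close"): "Scd",  # delete one entered character
    }
    return table.get((candidate_shown, motion))


print(decide(True, "nose_down"))          # commit
print(decide(False, "mouth_open_close"))  # Scd
```

Keeping the mapping in one table makes it straightforward to add further movement-to-signal assignments, which the description explicitly allows.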

<Processing flow>
The present invention recognizes the shape of the user's mouth by comparing it with mouth shape patterns registered in advance. The system therefore has a mouth shape registration mode in addition to the character input mode. The method of switching between these modes is not particularly limited; for example, a dedicated switch may be used, or some button may be pressed twice in succession.

The mode may also be switched by the shape, position, or movement of the nose or mouth, using the image processing of the system shown in FIG. 1. In any case, the mode needs to be switched only once per user, so some extra effort here does not run counter to the purpose of the present invention.

FIGS. 2 and 3 show the processing flows of the mouth shape registration mode and the character input mode, respectively. The system configuration is the one shown in FIG. 1, and the description below refers to it as appropriate without restating it.

FIG. 2 shows the processing flow of the mouth shape registration mode. When the process starts, the first character of the input character candidate group Csa to be registered is displayed to the user (step S1). The user's features are registered in advance so that features specific to the individual user can be extracted more easily. The user then forms the mouth pattern shape to be registered in front of the camera. Here, five shapes are considered, one for each of the five Japanese vowels ('あ', 'い', 'う', 'え', 'お'; that is, a, i, u, e, o).

The user's mouth pattern shape is acquired as an image by the camera (step S2). The acquired image data Ps is sent to the region extraction means 2 and the features are measured (step S3). Specifically, the image data Pn of the two nostril regions and the image data Pm of the intraoral region are extracted from the acquired image, and the coordinates Nxy of the midpoint between the two nostrils, the area S of the intraoral region, and its aspect ratio Asp are measured.

Next, if the movement measured by the motion measurement means 6 is zero or no greater than a predetermined tolerance, that is, if it can be judged that there was no movement (Y branch of step S4), the motion determination means 9 sends a control signal Scc to the vowel DB 4, and the area S and aspect ratio Asp at that moment are recorded in the vowel DB as the reference area SS and reference aspect ratio SA, in association with the input character candidate group Csa (step S5).

If there was movement (N branch of step S4), the process returns to step S1 and registration is redone. The process may instead return only as far as the image acquisition of step S2. This registration mode lets the system raise its recognition rate for the features of the user's mouth shape, because the shape of the lips, the state of the oral cavity, and habits of pronunciation differ from user to user.
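Steps S4 and S5 of the registration mode might look like the following sketch. It assumes the vowel DB is a plain dictionary from a candidate group label to its reference features, that movement is summarized as a single number, and that the tolerance value is arbitrary; the function name `register_shape` is hypothetical.

```python
def register_shape(vowel_db, group, area, aspect, movement, tolerance=2.0):
    """Record (area, aspect) as the reference features of `group` if the face was still.

    Returns True when the features were recorded (Y branch of step S4) and
    False when there was movement and the capture should be repeated (N branch).
    """
    if abs(movement) > tolerance:
        return False
    vowel_db[group] = (area, aspect)  # reference area SS, reference aspect ratio SA
    return True


vowel_db = {}
captures = {
    "a": (900, 0.8, 0.5),
    "i": (300, 0.3, 6.0),  # too much movement, so this capture is rejected
}
for group, (area, aspect, movement) in captures.items():
    register_shape(vowel_db, group, area, aspect, movement)

print(vowel_db)  # {'a': (900, 0.8)}
```

In an interactive system the rejected capture would simply be retried from step S1 or S2 until the user holds still.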

FIG. 3 shows the processing flow of the character input mode. In this mode, the user's intention is inferred from the mouth pattern shape and the movement of the face. That is, the vowel column (dan) of the character to be entered is determined from the mouth pattern shape, and character entry, cancellation of entry, switching of character type, and deletion of a character are determined from the movement of the face.

First, step S11 checks for two open-close movements of the mouth. The motion determination means 9 makes this judgment from the motion signal Mm from the motion measurement means 6. If the open-close movement is present (Y branch of step S11), the most recently entered character is deleted (step S12). This is done by the motion determination means 9 sending the control signal Scd to the character input means 11.

If no open-close movement is detected (N branch of step S11), the feature display means 5 displays the user's registered mouth pattern shapes together with the user's current mouth shape (step S13). Referring to the positions of the registered features shown on the screen, the user forms the mouth pattern shape for the vowel column of the character to be entered.

Next, if the shape of the intraoral region stays the same for a fixed time (here, 1 second) (Y branch of step S14), shape recognition is performed (step S15). Otherwise (N branch of step S14), the process returns to step S11. Whether the shape has stayed the same is judged by the motion determination means 9 from the motion signal Mm of the motion measurement means 6. If it has, the motion determination means 9 sends the control signal Scr to the vowel determination means 7 and resets it.
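The hold check of step S14 can be illustrated as follows, assuming that stability is judged from timestamped samples of the mouth's vertical width and that a small tolerance in pixels counts as "the same shape". The function name `held_still` and the tolerance are hypothetical; only the 1-second hold time comes from the text.

```python
def held_still(samples, hold_seconds=1.0, tolerance=2.0):
    """Return True if the mouth height stayed within `tolerance` for `hold_seconds`.

    samples is a time-ordered list of (timestamp_in_seconds, mouth_height).
    """
    if not samples:
        return False
    t_end = samples[-1][0]
    if t_end - samples[0][0] < hold_seconds:
        return False  # not enough history to cover the hold time
    window = [h for t, h in samples if t_end - t <= hold_seconds]
    return max(window) - min(window) <= tolerance


steady = [(0.0, 10), (0.25, 10), (0.5, 11), (0.75, 10), (1.0, 10)]
moving = [(0.0, 10), (0.25, 10), (0.5, 20), (0.75, 10), (1.0, 10)]
print(held_still(steady), held_still(moving))  # True False
```

The same check could be reused for the second hold in step S17, since the text uses the same 1-second criterion there.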

When reset, the vowel determination means 7 obtains the shape features from the shape feature measurement means 3 and displays the first character of the input character candidate group Csa associated with the reference area and reference aspect ratio of the shape nearest in Euclidean distance in the feature space. This is the shape recognition process.

Here, the Euclidean distance in the feature space is the square root of the sum of the squared differences between the shape features Qc (area S and aspect ratio Asp of the intraoral region) and the reference shape features SQc (reference area SS and reference aspect ratio SA). More specifically, the Euclidean distance SR in the feature space is given by equation (1):

$$SR = \sqrt{(S - SS)^2 + (Asp - SA)^2} \qquad (1)$$
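The shape recognition process, choosing the registered shape whose reference features (SS, SA) are nearest to the measured features (S, Asp) in Euclidean distance, can be sketched as below. The dictionary layout of the vowel DB, the function name `nearest_group`, and the sample values are assumptions for illustration only.

```python
import math


def nearest_group(area, aspect, vowel_db):
    """Return the key of the reference shape nearest to (area, aspect).

    vowel_db maps a candidate group label to (reference_area, reference_aspect).
    """
    def distance(reference):
        ref_area, ref_aspect = reference
        # Euclidean distance SR between (S, Asp) and (SS, SA)
        return math.hypot(area - ref_area, aspect - ref_aspect)

    return min(vowel_db, key=lambda group: distance(vowel_db[group]))


vowel_db = {"a": (900.0, 0.8), "i": (300.0, 0.3), "u": (150.0, 0.9)}
print(nearest_group(320.0, 0.35, vowel_db))  # i
```

Note that with raw values the area, measured in pixels, dominates the distance over the aspect ratio. Scaling the two features to comparable ranges before taking the distance is a common remedy, but the text does not say whether the described system does so.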

If the result of shape recognition is correct, the user keeps the mouth in the same shape for a further fixed time (here, 1 second). If the recognition result is wrong (N branch of step S17), the user changes the mouth shape, which returns the process to step S11. If the mouth shape stays the same for the fixed time (Y branch of step S17), the system displays the input character candidate group (step S18).

Whether the shape has stayed the same for the fixed time is judged by the motion determination means 9 using the motion signal Mm of the motion measurement means 6. The motion determination means 9 outputs to the vowel determination means 7 a control signal Sct that causes the candidate characters Ctx in the input character candidate group Csa to be displayed one after another. In response to Sct, the vowel determination means 7 displays the candidate characters Ctx one after another on the input candidate character display means 8.

In the shape hold of step S17, even when the recognition result is correct, a user who wants to stop entering the character can return to step S11 by changing the mouth shape. In step S18, the control signal Sct for displaying the candidate characters Ctx one after another is output; concretely, the list of characters in the vowel column belonging to the recognized vowel is shown on the screen, and the highlighted character in the list changes at a fixed interval (here, 1 second). When the highlighted character is the one the user wants, the user makes a downward movement; when this is detected (step S19), the character is entered (step S20).
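The cycling highlight of step S18, together with the order reversal triggered by a sideways nose movement, could be modeled with a small cursor object as sketched here. The class name `CandidateCursor`, the wrap-around behavior at the ends of the list, and the romanized sample candidates are assumptions; a caller would invoke `tick()` once per highlight interval (1 second in the text).

```python
class CandidateCursor:
    """Tracks which candidate character is currently highlighted."""

    def __init__(self, candidates):
        self.candidates = list(candidates)
        self.index = 0   # start at the first candidate
        self.step = 1    # +1 = forward, -1 = toward the first candidate

    def current(self):
        return self.candidates[self.index]

    def tick(self):
        """Advance the highlight by one position, wrapping around (assumed)."""
        self.index = (self.index + self.step) % len(self.candidates)
        return self.current()

    def reverse(self):
        """Flip the display direction, as on a sideways nose movement."""
        self.step = -self.step


cursor = CandidateCursor(["a", "ka", "sa", "ta"])
shown = [cursor.current(), cursor.tick(), cursor.tick()]
cursor.reverse()
shown += [cursor.tick(), cursor.tick()]
print(shown)  # ['a', 'ka', 'sa', 'ka', 'a']
```

A downward movement would then enter whatever `current()` returns at that moment.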

To cancel character input, the user looks up (upward movement); the input is cancelled and the flow returns to step S11 (step S21). To enter voiced (dakuon) or semi-voiced (handakuon) characters, the user opens and closes the mouth twice (Y branch of step S22), which switches the input character candidate group (step S23).

The mouth shape display of step S13 and the motion detection related to steps S19, S21, and S22 may be processed continuously. That is, rather than displaying the mouth shape only at the timing shown in the flow of FIG. 3, the mouth shape may be displayed at all times. Likewise, instead of starting to look for a "downward movement" at the timing of step S19, motion detection may run constantly so that the motion determination means 9 can learn at any time whether a movement has occurred.

Specifically, in the configuration of FIG. 1, this can be realized by running the region extraction means 2, the shape feature measurement means 3, and the motion measurement means 6 continuously on an MPU and control program separate from those of the vowel determination means 7 and the motion determination means 9. When these processes are implemented in software, it can be realized by, for example, running them as background processing of the program to ensure real-time performance.

A character input system using mouth shape patterns was actually built according to the present invention, and its effectiveness was examined in an experiment with test subjects.

<Inputtable characters>
FIG. 4 lists the characters that can be entered with this system. The system provides 81 kana characters, including voiced and semi-voiced sounds, but the table can be changed, for example to allow katakana, alphabetic characters, or digits. To enter 'さ' (sa), for example, the user forms the mouth shape pattern for 'あ' (a); if the recognition result is correct, 'あ', 'か', 'さ', 'た', 'な', 'は', 'ま', 'や', 'ら', 'わ', 'ん' (a, ka, sa, ta, na, ha, ma, ya, ra, wa, n) are displayed as the input candidate character list.

The characters are highlighted in order from 'あ' at 1-second intervals, and the user enters 'さ' by making a downward movement while 'さ' is highlighted. To enter 'べ' (be), the user forms the mouth shape pattern for 'え' (e); if the recognition result is correct, 'え', 'け', 'せ', 'て', 'ね', 'へ', 'め', 'れ' (e, ke, se, te, ne, he, me, re) are displayed as the input candidate character list. The user then opens and closes the mouth twice, and the list changes to 'げ', 'ぜ', 'で', 'べ', 'ぺ', 'ぇ' (ge, ze, de, be, pe, small e) from table (2) of FIG. 4. The user enters 'べ' by making a downward movement while 'べ' is highlighted.
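The relationship between a recognized vowel, its candidate list, and the list reached by opening and closing the mouth twice can be sketched as two lookup tables. Only the rows spelled out in the text are included; the names `PRIMARY`, `ALTERNATE`, and `candidate_group` are hypothetical, and the assumption that a second double open/close returns to the first list is not stated in the text.

```python
# Rows taken from the text; the full table of FIG. 4 is not reproduced here.
PRIMARY = {
    "あ": list("あかさたなはまやらわん"),
    "え": list("えけせてねへめれ"),
}
# Voiced / semi-voiced / small kana list reached by a double open/close.
ALTERNATE = {
    "え": list("げぜでべぺぇ"),
}


def candidate_group(vowel: str, toggles: int = 0) -> list:
    """Candidate list for `vowel` after `toggles` double open/close gestures.

    Assumption: toggling cycles through the available lists and wraps around.
    """
    groups = [PRIMARY[vowel]]
    if vowel in ALTERNATE:
        groups.append(ALTERNATE[vowel])
    return groups[toggles % len(groups)]
```

With these tables, `candidate_group("え", 1)` yields the list containing 'べ', matching the example in which the user forms the 'え' shape and then opens and closes the mouth twice.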

<Screen layout>
FIGS. 5 to 7 show examples of the screens displayed by this system. FIG. 5 shows the initial screen, which displays (1) the camera image, (2) the intraoral region image, (3) the change in the height of the intraoral region, (4) the change in nostril position, and (5) the transition of the feature values. These displays are updated in real time. In (1), the extracted nostrils and intraoral region are overlaid on the camera image. In (2), the intraoral region image normalized for size and rotation is displayed.

(3) The change in the height of the intraoral region shows how the normalized height of the region varies over time; it is plotted toward the top when the mouth is closed and toward the bottom when the mouth is open. The right edge is the current time, and values further left are older. The two horizontal lines are the thresholds used to judge mouth opening and closing.
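One plausible way to use two threshold lines is hysteresis: the mouth counts as open only after the height rises above the upper threshold and as closed again only after it falls below the lower one, so jitter between the lines is not counted twice. The text does not say how the two thresholds are combined, so this is an assumption, as are the function name `count_open_close` and the threshold values.

```python
from typing import Iterable


def count_open_close(heights: Iterable[float],
                     open_thr: float = 0.6,
                     close_thr: float = 0.3) -> int:
    """Count complete open-then-close cycles in a series of mouth heights.

    `heights` are normalized intraoral heights (larger = more open).
    The threshold values are illustrative, not taken from the text.
    """
    is_open = False
    cycles = 0
    for h in heights:
        if not is_open and h > open_thr:
            is_open = True            # crossed the upper line: mouth opened
        elif is_open and h < close_thr:
            is_open = False           # crossed the lower line: mouth closed
            cycles += 1
    return cycles
```

A count of 2 within a short window would then correspond to the "open and close the mouth twice" gesture.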

(4) The change in nostril position shows the coordinates of the midpoint between the extracted nostrils: the horizontal trace shows the change in the y coordinate and the vertical trace shows the change in the x coordinate. For example, when the midpoint moves down, the horizontal trace is plotted lower. As in (3), the right edge of the horizontal trace is the current time and values further left are older. For the vertical trace, the bottom is the current time and values further up are older.

(5) The transition of the feature values plots, at each measurement time, the shape feature values measured from the user's intraoral region. It is drawn in a display area whose vertical axis is the aspect ratio and whose horizontal axis is the area. The intersection of the cross marks the shape feature values at the current time. Whenever the shape of the user's mouth changes, the intersection moves to the position given by the aspect ratio and area at that moment. To visualize the change, each past position of the intersection remains displayed for a predetermined time and fades to a lighter color as time passes.

This display area also shows the feature values of the mouth shape patterns for 'あ', 'い', 'う', 'え', 'お' (a, i, u, e, o) registered in the vowel DB, each as a point of a different color. Watching the cross move, the user changes the shape of their mouth so that the intersection approaches the point of the intended registered pattern. This increases the likelihood that the system recognizes the user's mouth shape correctly.
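Moving the cross toward a registered point suggests a distance-based comparison in the (area, aspect ratio) plane. The sketch below uses a nearest-centroid rule purely as an illustration; this passage does not state the recognition rule, the function names are hypothetical, the numbers are made up, and both features are assumed to be scaled to comparable ranges.

```python
import math
from typing import Dict, Sequence, Tuple

Feature = Tuple[float, float]  # (area, aspect ratio), assumed pre-scaled


def centroid(samples: Sequence[Feature]) -> Feature:
    """Mean of the registered samples for one mouth shape."""
    areas, ratios = zip(*samples)
    return (sum(areas) / len(areas), sum(ratios) / len(ratios))


def nearest_vowel(feature: Feature, registered: Dict[str, Feature]) -> str:
    """Vowel whose registered point is closest to the current feature."""
    return min(registered, key=lambda v: math.dist(feature, registered[v]))


# Illustrative registered points (not measured data).
registered = {"a": (0.8, 0.9), "i": (0.3, 0.2), "u": (0.1, 0.6)}
```

For example, `nearest_vowel((0.75, 0.85), registered)` picks `"a"` with these illustrative points.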

FIG. 6 shows the state in which the user has formed the 'あ' mouth shape pattern and it has been correctly recognized as 'あ'. The recognition result is displayed at position (6). Item (7) displays the table of inputtable characters (the one corresponding to FIG. 5).

FIG. 7 shows the state in which, after the user has judged the recognition result to be correct and held the mouth shape for the fixed time, the list of input candidate characters is displayed at position (8). Here, the list of input candidate characters means the input character candidate group or a part of it. The character 50 in the center is the highlighted one, and the user enters a character by making a downward movement while the desired character is highlighted. In other words, the highlighted character is the candidate character Ctx. The list displays seven characters, but this number can be changed. The entered text is displayed at position (9); in this example, "さいとう" (Saito) has been entered. Highlighting may take any form, such as bold strokes, larger characters, or a different character color, as long as the difference from the other characters is visible.

<Operation example of the recognition mode>
FIG. 8 shows an example of recognition results for the five shapes (five vowels). The user changes the mouth shape in the order 'あ', 'い', 'う', 'え', 'お'. The feature values can be seen to shift with these movements, and the recognition results are displayed correctly.

<Operation example of the input candidate character list display mode>
FIG. 9 shows a display example of the input candidate character list. In the top row, the recognition result 'あ' is displayed. The user then keeps the mouth shape constant, which switches the system to the mode that displays the input candidate character list, and the 11 characters 51 of the 'あ' row are highlighted in turn.

<Character input example>
FIG. 10 shows an example of entering two characters ('さ' and 'い'). As described above, input is performed by a downward movement; the images show the user facing downward from the first to the second row and returning to the same position (facing front) in the third row. With this movement, the horizontal trace of the nostril position change forms a "valley" shape 53, and the character is entered. The circled characters are the highlighted ones.

<Example of changing the list display order>
FIG. 11 shows an example of changing the display order of the list. In the initial state, for the 'あ' row for example, the characters are highlighted in the order 'あ', 'か', 'さ', 'た', 'な', 'は', 'ま', 'や', 'ら', 'わ', 'ん'. The order is changed by turning the face to the left or right (leftward movement, rightward movement); the images show the user turning left (or right) from the first to the second row and returning to the same position (facing front) in the third row. With this movement, the vertical trace of the nostril position change becomes concave (or convex) at the portion marked 54, and the order is changed. The circled characters are the highlighted ones.

<Example of cancelling character input>
In the first row of FIG. 12, two characters ("さい") have been entered and the input candidate character list for the 'あ' row is displayed. FIG. 12 shows the movement that cancels input from the 'あ' row; as described above, character input is cancelled by an upward movement. The images show the user facing upward from the first to the second row and returning to the same position (facing front) in the third row. With this movement, the horizontal trace of the nostril position change forms a "mountain" shape 55, and the input candidate character list is no longer displayed.
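The "valley" that confirms a character and the "mountain" that cancels input can both be read as an excursion of the nostril midpoint away from its starting height followed by a return. The sketch below classifies a short window of vertical positions on that basis. The function name, the pixel thresholds, and the sign convention (negative values plotted lower, i.e. facing down) are assumptions for illustration.

```python
from typing import Optional, Sequence


def classify_vertical_gesture(ys: Sequence[float],
                              threshold: float = 10.0,
                              tolerance: float = 3.0) -> Optional[str]:
    """Classify a window of midpoint heights as 'down', 'up', or None.

    `ys` follow the plot convention assumed here: lower values mean the
    face moved down. A gesture needs an excursion of at least `threshold`
    from the first sample and a return to within `tolerance` of it.
    """
    if not ys:
        return None
    baseline = ys[0]
    # Sample that deviates most from the starting height.
    extreme = max(ys, key=lambda y: abs(y - baseline))
    excursion = extreme - baseline
    returned = abs(ys[-1] - baseline) <= tolerance
    if abs(excursion) < threshold or not returned:
        return None
    return "down" if excursion < 0 else "up"
```

In this sketch `"down"` would trigger character entry (step S20) and `"up"` would trigger cancellation (step S21); a window that never returns to the front-facing height is ignored.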

<Example of deleting a character>
FIG. 13 shows an example of deleting a character. As described above, an entered character is deleted by opening and closing the mouth twice. The images show the mouth being opened and closed twice between the second and sixth rows. With this movement, the graph of the intraoral region height shows two swings in the circled portion 56 of the sixth row, and the last character entered, 'み' (mi, reference 57), disappears in the sixth row.

<Example of switching the character type>
FIG. 14 shows an example of switching the character type. As described above, switching to character types such as voiced or semi-voiced sounds is performed by opening and closing the mouth twice while in the input candidate character list display mode. The images show the mouth being opened and closed twice between the first and fifth rows. With this movement, the graph of the intraoral region height shows two swings in the circled portion 58 of the fifth row, and between the fourth and fifth rows the input candidate character list changes from 'ら', 'わ', 'ん', 'あ', 'か', 'さ', 'た' (reference 60) to 'ゃ', 'ー', 'が', 'ざ', 'だ', 'ば', 'ぱ' (reference 61).

<Sentence input experiment>
A five-sentence input experiment was conducted with 10 subjects (9 able-bodied men and 1 able-bodied woman). The five sentences were as follows.
「とっとりだいがく」 (Tottori University)
「おはようございます」 (Good morning)
「おなかがすきました」 (I am hungry)
「きょうはにちようびです」 (Today is Sunday)
「わたしはだいがくせいです」 (I am a university student)

These are ordinary sentences of 8 to 12 characters. Before the input experiment, mouth shape patterns were registered for each subject, five samples per shape. The input experiment was then performed five times, each on a different date and time. The results are shown in FIG. 15.

In FIG. 15, the vertical axis is the average input speed in kpm (kana per minute), that is, the average number of characters entered per minute, and the horizontal axis is the trial number.

In the first trial, the input speed averaged over the 10 subjects was 4.7 kpm. The input speed rose with each trial and reached 6.8 kpm in the fifth. Although input speed varied between individuals, the results suggest that about five trials were enough for the subjects to become well accustomed to the operation.
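To put the kpm figures in concrete terms, the short calculation below converts them into the time needed for the shortest test sentence, 「とっとりだいがく」 (8 characters). The function names are hypothetical; the only inputs are the speeds reported above and the character count.

```python
def kana_per_minute(n_chars: int, seconds: float) -> float:
    """Input speed in kpm for `n_chars` characters entered in `seconds`."""
    return n_chars / seconds * 60.0


def seconds_for(n_chars: int, kpm: float) -> float:
    """Time in seconds needed to enter `n_chars` characters at `kpm`."""
    return n_chars / kpm * 60.0


first_trial = seconds_for(8, 4.7)    # roughly 102 s at the first-trial average
fifth_trial = seconds_for(8, 6.8)    # roughly 71 s at the fifth-trial average
relative_gain = (6.8 - 4.7) / 4.7    # roughly a 45 % increase in speed
```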

The fastest subject reached 7.7 kpm in the fifth trial. Analysis of the experiment logs showed that the character input error rate fell from 14.1% in the first trial to 4.3% in the fifth, confirming that input errors become rare as users grow accustomed to the operation.

The present invention can be used for character input to electronic devices such as mobile phones and personal computers.

FIG. 1 is a diagram showing an example of the configuration of a system that implements the character input method of the present invention.
FIG. 2 is a diagram showing the flow of processing when registering mouth shapes.
FIG. 3 is a diagram showing the flow of processing when inputting characters.
FIG. 4 is a diagram illustrating the list of characters that can be input in the present invention.
FIG. 5 is a diagram showing a display example on the display means of the system of the present invention.
FIG. 6 is a diagram showing an example of the screen when 'あ' is being input.
FIG. 7 is a diagram showing a display example on the display means of the system of the present invention.
FIG. 8 is a diagram showing an example of the screen display when the five vowels are input.
FIG. 9 is a diagram illustrating the input character candidate group of the 'あ' row.
FIG. 10 is a diagram showing an example of inputting two characters ('さ' and 'い').
FIG. 11 is a diagram showing an example of changing the display order of the list.
FIG. 12 is a diagram showing the movement that cancels input.
FIG. 13 is a diagram showing an example of deleting a character.
FIG. 14 is a diagram showing an example of switching the character type.
FIG. 15 is a diagram showing the results of the five-sentence input experiment with 10 subjects.

Explanation of symbols

1 Imaging device
2 Region extraction means
3 Shape feature measurement means
4 Vowel DB
5 Feature display means
6 Motion measurement means
7 Vowel determination means
8 Input candidate character display means
9 Motion determination means
10 Input character display means
11 Character input means
12 Motion display means

Claims

A character input method for inputting characters to an electronic device, comprising: a step of recognizing the shape of the mouth; a step of selecting a candidate group of input characters based on the shape of the mouth; a step of displaying the characters belonging to the candidate group in sequence; a step of detecting a movement of the mouth or nose each time a character is displayed; and a step of selecting, based on the movement of the mouth or nose, a process to be applied to the candidate group or to the character being displayed.

The character input method according to claim 1, wherein the selected process includes a process of taking the character being displayed as the input character.

The character input method according to claim 1, wherein the selected process includes a process of replacing the candidate group with another candidate group.

The character input method according to claim 1, wherein the selected process includes a process of changing the order in which the characters of the candidate group are displayed.

The character input method according to claim 1, further comprising, before the step of recognizing the shape of the mouth, a step of detecting a movement of the mouth and a step of deleting, based on the movement of the mouth, a character that has already been taken as an input character.