JP5464661B2

JP5464661B2 - Information terminal equipment

Info

Publication number: JP5464661B2
Application number: JP2010055804A
Authority: JP
Inventors: 晴久加藤; 恒夫加藤
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2010-03-12
Filing date: 2010-03-12
Publication date: 2014-04-09
Anticipated expiration: 2030-03-12
Also published as: JP2011191870A

Description

The present invention relates to an information terminal device that presents information, and more particularly to an information terminal device that can control the information shown on its display unit according to changes in the relative positional relationship between an imaging unit and an imaging target.

A device that changes the position of an imaging target with respect to an imaging unit and controls display information according to the relative position of the two allows the display information to be changed intuitively, which is useful for improving user convenience.

As a technique for realizing this, Patent Document 1 proposes a mixed reality system in which a marker unit is placed on an object. The marker unit consists of a plurality of marker elements that are colored to approximate the color of the object and that encode identification information through the combination of their outlines and arrangement. By analyzing a camera image of the object, the system performs the alignment needed to superimpose virtual information on the object in the real environment.

Patent Document 2 proposes an image composition device that generates a composite image in which the measurement ranges of a position and orientation sensor and of a camera are superimposed on a live-action image of the real space in which the position and orientation sensor is placed.

Patent Document 3 proposes a portable device that scrolls or zooms the displayed portion according to the direction and amount of movement and tilt detected by a motion sensor, and that stops the scrolling or zooming when a predetermined key is operated.

Non-Patent Document 1 proposes a method that detects fingertips in a camera image of a hand and, using ellipse fitting, aligns display information with the detected hand position. In this method, portions of large curvature are detected as fingertips.

Patent Document 1: JP 2009-020614 A
Patent Document 2: JP 2007-233971 A
Patent Document 3: JP 2009-003799 A

Non-Patent Document 1: T. Lee, et al., "Handy AR: Markerless Inspection of Augmented Reality Objects Using Fingertip Tracking," in Proc. IEEE International Symposium on Wearable Computers, pp. 83-90, Oct. 2007.

In the mixed reality system of Patent Document 1, the position of the virtual information superimposed on the real environment is controlled by the position of the marker unit in the camera image, so a plurality of artificial marker elements must be placed on the object in advance. The places where this technique can be used are therefore limited.

The image composition device of Patent Document 2 requires a position and orientation sensor, and the portable device of Patent Document 3 requires a motion sensor. The devices that can use these techniques are therefore limited. Mounting such sensors also raises the cost of the terminal and makes it harder to reduce the size and power consumption of the device.

Non-Patent Document 1 detects portions of particularly large curvature as fingertips, so the display is correct only while the fingers are extended straight. In a natural hand shape the fingers are slightly bent, and fingertips move a great deal, so alignment accuracy is insufficient and correspondence is difficult to establish. In addition, fingertips cannot be detected correctly when the fingers are closed.

An object of the present invention is to solve the above problems and to provide an information terminal device that can control the information shown on its display unit reliably and with high accuracy, without markers or sensors, simply by changing the spatial position and orientation of the imaging target with respect to the imaging unit.

To solve the above problems, the present invention is an information terminal device having an imaging unit that images a hand as the imaging target, comprising: an estimation unit that estimates changes in the position and orientation of the imaging target by associating feature points extracted from an image input from the imaging unit with reference points that are set in advance on the imaging target at a reference position and orientation and that indicate changes in the overall position and orientation of the imaging target; a display unit that displays information; a storage unit that stores information to be displayed on the display unit; and a control unit that controls the information read from the storage unit and displayed on the display unit according to the changes in position and orientation estimated by the estimation unit. The reference points are set inside or around the palm, and the estimation unit has a function of excluding boundaries between fingers, boundaries between fingers and the palm, or wrinkles from the skin region in the image input from the imaging unit and then extracting feature points within the skin region. These are the basic features of the invention.

Here, the reference points can be set, for example, at the finger crotches, at the end points of wrinkles such as joint creases, at the point on the outer side of the second joint of the thumb, and at the point corresponding to that point across the longitudinal line through the lateral center of the palm.

According to the present invention, display information on the display unit can be controlled simply by changing the relative position and orientation of the imaging unit and the imaging target. The user can therefore control the display information through the intuitive operation of changing the position and orientation of the imaging target relative to the imaging unit.

In addition, the display information is controlled by analyzing the image input to the imaging unit and using the resulting changes in the relative position and orientation of the imaging target with respect to the imaging unit. The control can therefore be implemented in software, and no special hardware such as sensors needs to be built into the information terminal.

Furthermore, in the present invention, changes in the position and orientation of the imaging target are estimated by associating feature points extracted from the image input from the imaging unit with reference points set in advance on the imaging target at a reference position and orientation, and the reference points are set at points that indicate changes in the overall position and orientation of the imaging target. Changes in the position and orientation of the imaging target can therefore be estimated reliably and with high accuracy, and the display information can be controlled reliably and with high accuracy.

FIG. 1 is a functional block diagram showing one embodiment of the information terminal device of the present invention.
FIG. 2 is a flowchart showing the processing procedure in the estimation unit.
FIG. 3 is a diagram showing an example of preset reference points.
FIG. 4 is a diagram showing an example of a skin region extracted from an image, the circumscribed rectangle containing the skin region, and feature points.
FIG. 5 is a diagram showing an example of a skin region extracted from an image and the separated skin region.
FIG. 6 is a diagram showing a display example according to the present invention.
FIG. 7 is a diagram showing another display example according to the present invention.
FIG. 8 is a diagram showing yet another display example according to the present invention.

The present invention will be described below with reference to the drawings. The following description uses a mobile phone as the information terminal device. However, the information terminal device of the present invention is not limited to a mobile phone and may be any information terminal device equipped with an imaging unit, for example a computer.

FIG. 1 is a functional block diagram showing one embodiment of the information terminal device of the present invention. The information terminal device of this embodiment includes an imaging unit 11, an estimation unit 12, a control unit 13, a storage unit 14, and a display unit 15.

The imaging unit 11 continuously images the imaging target (a hand) at a predetermined sampling period and outputs the images to the estimation unit 12 and the control unit 13. A digital camera of the kind fitted as standard to mobile phones can be used as the imaging unit 11.

In the estimation unit 12, reference points for the imaging target at a known position and orientation are registered in advance as the criterion for judging changes in the position and orientation of the imaging target. The reference points are set so as to indicate changes in the overall position and orientation of the imaging target.

The estimation unit 12 detects feature points corresponding to the reference points in the image input from the imaging unit 11 and estimates the correspondence between the feature points and the reference points. From the coordinates of the corresponding feature points and reference points, the estimation unit 12 then estimates, based on a preset conversion formula, the conversion coefficients that represent the relative position and orientation of the imaging target with respect to the imaging unit 11. The conversion formula used and the estimated conversion coefficients are output to the control unit 13. Details of the processing in the estimation unit 12 are described later.

The storage unit 14 stores in advance a plurality of items of display information to be displayed on the display unit 15. Through an input operation on the control unit 13, the user can select any item of display information stored in the storage unit 14 and have it displayed on the display unit 15.

When information is displayed on the display unit 15, the control unit 13 processes the display information by applying to it the conversion formula and conversion coefficients input from the estimation unit 12. The display information on the display unit 15 is thereby controlled according to the estimation result of the estimation unit 12.

FIG. 2 is a flowchart showing the processing procedure in the estimation unit 12. The estimation unit 12 repeatedly executes a region formation process S21, a feature detection process S22, and a posture estimation process S23, in that order, at predetermined time intervals.

First, in the region formation process S21, a skin region is extracted from the image input from the imaging unit 11. The skin region can be extracted based on the color information in the image. For example, when the image input from the imaging unit 11 is expressed in the RGB color space, it is converted to the HSV color space (value V, saturation S, hue H) using equations (1) to (3). The skin region is then obtained by extracting the pixels whose hue H falls within a preset range (that is, within the skin-color portion of the 0° to 360° hue range of the HSV color space).
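The hue-based extraction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it uses Python's standard `colorsys` conversion in place of equations (1) to (3), and the hue window of 0° to 50° and the optional `min_saturation` guard are assumptions, since the text only specifies "a preset range" of hue.

```python
import colorsys

# Hypothetical skin hue window in degrees; the text only says "a preset range".
HUE_MIN_DEG = 0.0
HUE_MAX_DEG = 50.0

def is_skin_hue(r, g, b, hue_min=HUE_MIN_DEG, hue_max=HUE_MAX_DEG,
                min_saturation=0.0):
    """Return True if an 8-bit RGB pixel's hue lies in [hue_min, hue_max] degrees.

    min_saturation is an optional guard that is not in the source text:
    colorsys reports hue 0 for gray pixels, which a pure hue test would accept.
    """
    h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue_deg = h * 360.0
    return hue_min <= hue_deg <= hue_max and s >= min_saturation

def skin_mask(image):
    """image: rows of (r, g, b) tuples. Returns rows of booleans (True = skin)."""
    return [[is_skin_hue(*px) for px in row] for row in image]
```

Calling `skin_mask` on a frame yields a binary mask that the later steps (morphological filtering, rectangle fitting) could operate on.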

Extraction of the skin region is not limited to the above method based on the position of the color information in the HSV color space; other methods can also be used. For example, the skin region can be determined and extracted from the skin likelihood given by a Gaussian mixture model (GMM). When a GMM is used, the probability distributions of skin and non-skin are learned in advance, each as a sum of a plurality of Gaussian distributions as expressed by equation (4).

Here, x and D denote the feature quantity of a pixel and its number of dimensions, respectively, and N denotes the number of Gaussian distributions. The pixel feature quantity may be the pixel information itself. Each Gaussian distribution has a weight coefficient w_i, and μ_i and Σ_i are its mean and covariance matrix, respectively. A maximum likelihood estimation method such as the EM algorithm can be used to determine the parameters of the Gaussian distributions.
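Equation (4) itself does not survive in this text. For orientation, the standard Gaussian mixture density that matches the symbols defined here (x, D, N, w_i, μ_i, Σ_i) is shown below. It is assumed, not confirmed, that equation (4) has this form, and the normalization of the weights is part of that assumption.

```latex
p(x) = \sum_{i=1}^{N} w_i \,
       \frac{1}{(2\pi)^{D/2}\,\lvert \Sigma_i \rvert^{1/2}}
       \exp\!\left( -\tfrac{1}{2}\,(x-\mu_i)^{\top} \Sigma_i^{-1} (x-\mu_i) \right),
\qquad \sum_{i=1}^{N} w_i = 1
```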

Let P(x|skin) be the probability that feature quantity x occurs in a skin region, and P(x|¬skin) the probability that it occurs in a non-skin region. The parameters of each probability function, namely the number N of Gaussian distributions and the mean μ_i, covariance matrix Σ_i, and weight coefficient w_i of each Gaussian distribution, are recorded as the learning result.

The user's own skin information may be reflected in the probability distributions. In this case, the probability distributions are corrected with pixel information extracted from the user's skin region, separately from the learning data. Let Pg(x|skin) and Pg(x|¬skin) be the skin and non-skin probability distributions learned from the learning data, and Pu(x|skin) and Pu(x|¬skin) those learned from the user's skin. The skin probability distribution P(x|skin) and the non-skin probability distribution P(x|¬skin) that reflect the user's skin information are then given by equations (5) and (6). It is also possible to obtain the skin and non-skin probability distributions by learning from the user's skin information alone.

When a GMM is used with the user's skin information reflected in the probability distributions, the region formation process S21 proceeds as follows. When an image is input from the imaging unit 11, a preset threshold TH1 and the learned distributions of equations (5) and (6) are used, and the pixels for which the ratio of the skin probability distribution P(x|skin) to the non-skin probability distribution P(x|¬skin) satisfies equation (7) are taken as the skin region.
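The ratio test can be sketched as below. Several points are assumptions: equation (7) is taken to mean "ratio greater than TH1", the mixtures use diagonal covariances for brevity (the text allows full covariance matrices), and the model parameters are placeholders rather than learned values.

```python
import math

def gmm_density(x, weights, means, variances):
    """Density of a Gaussian mixture with diagonal covariances (a simplification).

    x: sequence of D feature values; means/variances: one D-tuple per component.
    """
    total = 0.0
    for w, mu, var in zip(weights, means, variances):
        log_p = 0.0
        for xd, md, vd in zip(x, mu, var):
            log_p += -0.5 * math.log(2.0 * math.pi * vd) - (xd - md) ** 2 / (2.0 * vd)
        total += w * math.exp(log_p)
    return total

def is_skin(x, skin_model, nonskin_model, th1=1.0):
    """Assumed reading of equation (7): P(x|skin) / P(x|not skin) > TH1."""
    p_skin = gmm_density(x, *skin_model)
    p_non = gmm_density(x, *nonskin_model)
    if p_non == 0.0:
        return p_skin > 0.0  # avoid division by zero
    return p_skin / p_non > th1
```

Each model is a `(weights, means, variances)` tuple, so a one-component, one-dimensional skin model centered at 0.5 would be written `([1.0], [(0.5,)], [(0.01,)])`.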

Next, in the feature detection process S22, points corresponding to the preset reference points are extracted as feature points from the image input from the imaging unit 11, and the coordinates of the feature points are detected. The reference points are set at points that reliably identify changes in the overall position and orientation of the imaging target (the hand), so that the posture estimation in the subsequent posture estimation process S23 is robust against bending of the fingers and opening and closing of the fingers. For example, it is preferable to set as reference points a plurality of mutually separated points around the palm, such as the finger crotches. It is also preferable to set reference points at the point on the outer side of the second joint of the thumb and at the point on the opposite side of the longitudinal line through the lateral center of the palm from that point. These points may be combined with the finger crotches as reference points.

FIG. 3 shows an example of preset reference points. Here, (1) is a reference point set at the crotch of the thumb, and (2) to (4) are reference points set at the crotches of the four fingers. (5) is a reference point set on the outer side of the second joint of the thumb, and (6) is a reference point set on the opposite side of the longitudinal line through the lateral center of the palm from reference point (5). It is not necessary to set all of reference points (1) to (6); it suffices to select and set at least four of them. In that case, as described above, it is preferable to select points that are separated from one another.

The feature extraction process S22 will now be described in more detail. In the feature extraction process S22, the skin region extracted in the region formation process S21 is shaped, and feature points within or around the skin region are extracted. Only the points corresponding to the reference points need to be extracted as feature points, but depending on the extraction conditions, more points than the number of reference points may be extracted. In that case, the points having the same feature quantities as the reference points are selected from the extracted points as the feature points corresponding to the reference points.

Before the feature points are extracted, it is preferable to fill in partial defects in the skin region with a morphological filter. Such partial defects arise from, for example, the way light falls on the hand. Next, the circumscribed rectangle that contains the skin region and has the minimum area is formed. Principal component analysis or a similar method can be used to calculate the circumscribed rectangle. This circumscribed rectangle is used for the separation of fingers described later. It can also be used to limit the range from which feature points are extracted.
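One way principal component analysis could yield such a rectangle is sketched below: the major axis is taken as the direction of largest variance of the skin pixel coordinates, and the rectangle is the bounding box in that rotated frame. The function name is hypothetical, and a PCA-aligned box only approximates the true minimum-area rectangle.

```python
import math

def principal_axis_rectangle(points):
    """Fit a rectangle aligned with the principal axes of 2D points.

    points: list of (x, y) skin pixel coordinates.
    Returns (theta, length, width): the major axis angle in radians and the
    rectangle's extent along the major and minor axes.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Orientation of the largest-variance eigenvector of the 2x2 covariance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    # Coordinates of every point in the rotated (major, minor) frame.
    us = [(x - mx) * c + (y - my) * s for x, y in points]
    vs = [-(x - mx) * s + (y - my) * c for x, y in points]
    return theta, max(us) - min(us), max(vs) - min(vs)
```

The returned angle can serve as the finger direction assumed in the finger separation step.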

FIG. 4 shows an example of a skin region extracted from an input image, the circumscribed rectangle containing that skin region, and the feature points corresponding to reference points (1) to (6) of FIG. 3.

Because the hand being imaged may be closed, as shown in FIG. 4, the feature extraction process S22 also separates the fingers from one another before extracting the feature points. The fingers can be separated by applying edge detection to find edge portions with a low skin-color level and then, using the circumscribed rectangle, excluding the edge portions between fingers from the skin region so that they become non-skin regions.

The major axis of the circumscribed rectangle containing the skin region can be assumed to coincide approximately with the direction of the fingers. Therefore, among the detected edge portions, those whose edge direction makes an angle with the major axis of the circumscribed rectangle that falls within a preset threshold range are treated as non-skin portions, which separates the fingers from one another.
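The angle test can be sketched as follows. The edge representation (a position plus an orientation angle) and the 20° tolerance are assumptions for illustration; the text only specifies "a preset threshold range".

```python
import math

def angle_between_lines(a, b):
    """Smallest angle (radians) between two undirected lines with orientations a, b."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)

def finger_gap_edges(edges, major_axis_angle, max_diff=math.radians(20)):
    """Select edge pixels that run roughly parallel to the rectangle's major axis.

    edges: list of ((x, y), edge_direction_angle_in_radians).
    Returns the positions to be relabeled as non-skin.
    """
    return [pos for pos, ang in edges
            if angle_between_lines(ang, major_axis_angle) <= max_diff]
```

Because edge directions are undirected, an edge at -95° and a major axis at 90° differ by only 5°, which the modulo-π comparison handles.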

FIG. 5 shows an example of (a) a skin region extracted from an input image and (b) the skin region after separation by edge detection. The gaps between fingers have a low skin-color level and are therefore treated as non-skin regions.

Finally, feature points are extracted on the boundary of the separated skin region. An existing method such as SIFT or SURF is used to extract the same number of feature points as reference points. Alternatively, more feature point candidates (end points) than the number of reference points may be extracted, and those satisfying predetermined conditions may be selected from among them as feature points, as follows.

For each small region containing an extracted end point, the number of skin regions and the proportion of the small region occupied by skin are calculated. At an end point in a finger crotch, there is a single skin region and the proportion occupied by skin falls within a preset threshold range, so the end point is selected as a feature point. Even if an end point of a non-skin region at a finger joint is extracted as a feature point candidate, two skin regions exist at such an end point, one above and one below the joint, so it is not selected as a feature point. At fingertips and at the end points of wrinkles, the proportion occupied by skin does not fall within the preset threshold range, so these are not selected as feature points either.
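The selection rule above can be sketched on a small boolean patch around each candidate end point. The 4-connectivity, the patch representation, and the ratio range of 0.5 to 0.9 are assumptions; the text only specifies a single skin region and "a preset threshold range".

```python
def count_regions_and_ratio(patch):
    """patch: 2D list of booleans (True = skin) around a candidate end point.

    Returns (number of 4-connected skin regions, fraction of skin pixels).
    """
    h, w = len(patch), len(patch[0])
    seen = [[False] * w for _ in range(h)]
    regions = 0
    skin = 0
    for y in range(h):
        for x in range(w):
            if not patch[y][x]:
                continue
            skin += 1
            if seen[y][x]:
                continue
            regions += 1  # new region: flood-fill everything connected to it
            seen[y][x] = True
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and patch[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return regions, skin / (h * w)

def is_finger_crotch(patch, ratio_range=(0.5, 0.9)):
    """Accept a candidate only if one skin region fills a mid-range share of the patch."""
    regions, ratio = count_regions_and_ratio(patch)
    return regions == 1 and ratio_range[0] <= ratio <= ratio_range[1]
```

A joint crease that crosses the patch splits the skin into two regions and is rejected, and a fingertip patch with little skin fails the ratio test.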

The feature point on the outer side of the second joint of the thumb and the feature point on the opposite side of the longitudinal line through the lateral center of the palm are extracted as the two points at which the longest line segment within the skin region that is parallel to the minor axis of the circumscribed rectangle meets the boundary of the skin region.

In the posture estimation process S23, the extracted feature points are associated with the preset reference points, and the relative position and orientation of the imaging target with respect to the imaging unit 11 are estimated by comparing their coordinates.

To this end, the extracted feature points are first associated with the preset reference points (1) to (6). The feature point on the opposite side of the longitudinal line through the lateral center of the palm from the feature point on the outer side of the second joint of the thumb can be identified as the feature point farthest from the centroid of the convex polygon formed by the feature points. This feature point is associated with reference point (6), which lies on the opposite side of that line from the reference point on the outer side of the second joint of the thumb.

The feature point on the outer side of the second joint of the thumb can be identified as the feature point located on or near the line that passes through the feature point identified above (the one on the opposite side of the longitudinal line through the lateral center of the palm) and extends parallel to the minor axis of the circumscribed rectangle. This feature point is associated with reference point (5) on the outer side of the second joint of the thumb.

The feature point at the crotch of the thumb can be identified as the feature point closest to the point on the outer side of the second joint of the thumb, and is associated with reference point (1) at the crotch of the thumb.

The feature points at the crotches of the four fingers can be identified as the feature points in the most densely clustered location when all feature points are projected onto the minor axis direction of the circumscribed rectangle containing the skin region. They are associated with reference points (2) to (4) at the crotches between the fingers in order of increasing distance from the feature point at the crotch of the thumb.

Let the coordinates of the n feature points associated in this way be (x'_j, y'_j) (1 ≤ j ≤ n) and the coordinates of the reference points be (x_j, y_j) (1 ≤ j ≤ n). The conversion coefficients a_k (1 ≤ k ≤ m, m ≤ 2n) are determined so that the two sets of coordinates coincide under the preset conversion formula.

With the conversion formula based on the projective transformation of equations (8) and (9), when m = 2n the conversion coefficients a_k can be obtained by solving a system of 2n simultaneous equations. When m < 2n, the conversion coefficients a_k can be obtained by the least squares method.
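Equations (8) and (9) do not survive in this text. The sketch below assumes the usual eight-coefficient projective form, x' = (a1·x + a2·y + a3) / (a7·x + a8·y + 1) and y' = (a4·x + a5·y + a6) / (a7·x + a8·y + 1), mapping reference points to feature points; both the form and the mapping direction are assumptions. Solving the normal equations gives the exact solution for four point pairs (m = 2n = 8) and a least squares fit for more.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def estimate_projective(ref_pts, feat_pts):
    """Fit a1..a8 so that reference points (x, y) map to feature points (x', y')."""
    rows, rhs = [], []
    for (x, y), (xp, yp) in zip(ref_pts, feat_pts):
        # Each point pair gives two equations that are linear in a1..a8.
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -x * xp, -y * xp])
        rhs.append(xp)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -x * yp, -y * yp])
        rhs.append(yp)
    # Normal equations A^T A a = A^T b: exact for 4 pairs, least squares for more.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(8)]
    return solve_linear(ata, atb)

def apply_projective(a, x, y):
    """Map a point (x, y) through the fitted coefficients a1..a8."""
    d = a[6] * x + a[7] * y + 1.0
    return ((a[0] * x + a[1] * y + a[2]) / d, (a[3] * x + a[4] * y + a[5]) / d)
```

A control unit could then pass the corner coordinates of the display information through `apply_projective` to place it on the hand.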

The conversion coefficients a_k obtained in the posture estimation process S23 are sent to the control unit 13 together with the conversion formula. The control unit 13 processes the display information by applying to it the conversion formula and conversion coefficients input from the estimation unit 12. The display on the display unit 15 is thereby controlled based on the estimation result of the estimation unit 12. Computer graphics (CG) techniques can be used for the display control on the display unit 15.

For example, when an application such as an image list, a map, or a web browser, or information such as an address book or menu items, is displayed on the display unit 15, the user can move the imaging target (the hand) vertically or horizontally relative to the imaging surface of the imaging unit 11 to move the display information on the display unit 15 in step with the position of the imaging target. By moving the imaging target in the depth direction relative to the imaging surface of the imaging unit 11, the user can enlarge or reduce the display information in step with the apparent size of the imaging target. By tilting the imaging target, the user can also rotate the display information on the display unit 15 in step with the tilt of the imaging target.

FIGS. 6, 7, and 8 show display examples according to the present invention. FIG. 6 is a display example of an image list, in which the image list is displayed as if resting on the hand. If the hand is tilted relative to the imaging unit, the image list tilts in the same way while remaining on the hand; if the hand is moved toward or away from the imaging unit, the image list is enlarged or reduced while remaining on the hand. FIG. 7 is a display example of an address book represented by 3D CG avatars. In this case, the address book display can be controlled on the hand.

Although an embodiment has been described above, the present invention is not limited to that embodiment. For example, the display information can also be controlled by setting reference points on the back of the hand. FIG. 8 shows a display example for this case, in which the display can make it appear as if a ring were actually being worn.

Reference points may also be set at the boundaries between the fingers and the palm or at the end points of wrinkles. In this case, among the edge portions detected in the feature extraction process S22, those whose edge direction forms an angle with the short-axis direction of the circumscribed rectangle enclosing the skin region that falls within a preset threshold range are treated as finger-palm boundaries or as wrinkle portions such as finger joints. Treating these portions as non-skin separates the fingers from the palm, or the individual finger joints from one another. Feature points are then extracted at the boundaries of the separated skin regions and associated with the respective reference points.
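The angle test on edge directions can be illustrated with a short sketch. It assumes that each edge portion is summarized by a 2D direction vector and that the threshold range is "at most a given angle from the short axis"; the 20-degree default and the names `angle_to_axis` and `is_crease_edge` are hypothetical, not values or terms from the embodiment.

```python
import math


def angle_to_axis(edge_dir, axis_dir):
    """Undirected angle in degrees (0 to 90) between an edge direction and an axis."""
    ex, ey = edge_dir
    ax, ay = axis_dir
    norm = math.hypot(ex, ey) * math.hypot(ax, ay)
    if norm == 0.0:
        raise ValueError("direction vectors must be non-zero")
    # abs() makes the test independent of the sign of either vector.
    cos_angle = min(1.0, abs(ex * ax + ey * ay) / norm)
    return math.degrees(math.acos(cos_angle))


def is_crease_edge(edge_dir, short_axis, max_angle_deg=20.0):
    """True if the edge runs roughly along the short axis of the circumscribed rectangle.

    Such edges would be treated as finger-palm boundaries or joint wrinkles
    and marked as non-skin. The threshold is an assumed example value.
    """
    return angle_to_axis(edge_dir, short_axis) <= max_angle_deg
```

With a short axis of (1, 0), an edge direction of (1, 0.1) lies about 5.7 degrees from the axis and is classified as a crease, whereas (0, 1) lies at 90 degrees and is not.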

In the above embodiment, a circumscribed rectangle enclosing the skin region is formed and used to extract feature points and to limit the range in which feature points are extracted; however, the enclosing shape is not limited to a circumscribed rectangle and may be a circumscribed polygon. In this case, the feature point located at the deepest part (the part where the end point of the edge between fingers is deepest) inside the largest non-skin region touching a side of the circumscribed polygon can be judged to be the crotch of the thumb, and the reference points corresponding to the other feature points can be determined from their relationship to the crotch of the thumb.
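One way to read the circumscribed-polygon variant is sketched below. The sketch makes several assumptions that the text does not state: the circumscribed polygon is taken to be the convex hull of the skin-region contour, "largest non-skin region" is measured by area, and "deepest" is measured as perpendicular distance from the polygon side. The contour is assumed to be an ordered list of distinct (x, y) tuples, and all function names are hypothetical.

```python
import math


def cross(o, a, b):
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull vertices by Andrew's monotone chain (collinear points dropped)."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(poly):
    """Unsigned polygon area by the shoelace formula."""
    s = 0.0
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0


def deepest_point_of_largest_pocket(contour):
    """Return the contour point deepest inside the largest hull pocket, or None."""
    hull = set(convex_hull(contour))
    idx = [i for i, p in enumerate(contour) if p in hull]
    n = len(contour)
    best = None  # (pocket area, deepest point)
    for k in range(len(idx)):
        i, j = idx[k], idx[(k + 1) % len(idx)]
        span = (j - i) % n
        if span < 2:  # adjacent hull vertices enclose no pocket
            continue
        pocket = [contour[(i + t) % n] for t in range(span + 1)]
        a, b = pocket[0], pocket[-1]
        base = math.hypot(b[0] - a[0], b[1] - a[1])
        deepest = max(pocket[1:-1], key=lambda p: abs(cross(a, b, p)) / base)
        area = polygon_area(pocket)
        if best is None or area > best[0]:
            best = (area, deepest)
    return None if best is None else best[1]
```

Under these assumptions, the returned point would be the candidate for the crotch of the thumb, and the remaining feature points would then be matched to reference points relative to it.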

11 ... imaging unit, 12 ... estimation unit, 13 ... control unit, 14 ... storage unit, 15 ... display unit

Claims

An information terminal device having an imaging unit that images a hand as an imaging target, the device comprising:
an estimation unit that estimates changes in the position and orientation of the imaging target by associating feature points extracted from an image input from the imaging unit with reference points that are set in advance on the imaging target at a reference position and orientation and that indicate changes in the overall position and orientation of the imaging target;
a display unit that displays information;
a storage unit that stores information to be displayed on the display unit; and
a control unit that controls the information read from the storage unit and displayed on the display unit in accordance with the changes in the position and orientation of the imaging target estimated by the estimation unit,
wherein the reference points are set inside or around the palm, and the estimation unit has a function of excluding boundary portions between fingers, boundary portions between the fingers and the palm, or wrinkle portions from the skin region in the image input from the imaging unit and then extracting feature points present in the skin region.

The information terminal device according to claim 1, wherein the reference points include crotches or joints of the fingers, or end points of other wrinkles.

The information terminal device according to claim 2, wherein the reference points include one or both of a point on the outer side of the second joint of the thumb and a point set on the opposite side of that point with respect to a longitudinal line through the lateral center of the palm.

The information terminal device according to claim 1, wherein the estimation unit has a function of limiting the range in which feature points are extracted by using a circumscribed polygon that encloses the skin region in the image input from the imaging unit.

The information terminal device according to claim 1, wherein the estimation unit has a function of generating a circumscribed rectangle that encloses the skin region in the image input from the imaging unit and excluding boundary portions between fingers from the skin region by excluding image edge portions in the long-axis direction of the circumscribed rectangle from the skin region.

The information terminal device according to claim 1, wherein the estimation unit has a function of generating a circumscribed rectangle that encloses the skin region in the image input from the imaging unit and excluding boundary portions between the fingers and the palm, or wrinkle portions, from the skin region by excluding image edge portions in the short-axis direction of the circumscribed rectangle from the skin region.

The information terminal device according to claim 1, wherein the estimation unit has a function of extracting a skin region from the image input from the imaging unit and extracting from the skin region, as feature point candidates, a number of points equal to or greater than the number of preset reference points.

The information terminal device according to claim 7, wherein the estimation unit has a function of selecting as feature points, from among the feature point candidates extracted from the image input from the imaging unit, those candidates that are suitable for association with the reference points.

The information terminal device according to claim 8, wherein, when selecting as feature points the feature point candidates that match the reference points, the estimation unit uses the number of skin regions in a small region containing the feature point candidate and the proportion of that small region occupied by skin.

The information terminal device according to claim 8, wherein the estimation unit selects as feature points those feature point candidates for which the small region containing the candidate contains a single skin region and the proportion occupied by the skin region falls within a preset threshold range.

The information terminal device according to claim 3, wherein the estimation unit has a function of extracting as feature points the points at which the line segment that is parallel to the short axis of the circumscribed rectangle enclosing the skin region in the image input from the imaging unit and has the greatest length within the skin region intersects the boundary of the skin region.

The information terminal device according to claim 1, wherein the estimation unit has a function of associating the feature point farthest from the centroid of the convex polygon formed by the feature points extracted from the image input from the imaging unit with the reference point set on the opposite side, with respect to a longitudinal line through the lateral center of the palm, of the reference point set on the outer side of the second joint of the thumb.

The information terminal device according to claim 12, wherein the estimation unit has a function of associating, with the reference point set on the outer side of the second joint of the thumb, a feature point located on or near a line segment that passes through the feature point corresponding to the reference point set on the opposite side, with respect to a longitudinal line through the lateral center of the palm, of the reference point set on the outer side of the second joint of the thumb, and that extends parallel to the short axis of the circumscribed rectangle enclosing the skin region.

The information terminal device according to claim 1, wherein the estimation unit has a function of associating, with the reference points set at the crotches of the four fingers, the feature points in the most densely clustered location when the feature points extracted from the image input from the imaging unit are projected onto the short-axis direction of the circumscribed rectangle enclosing the skin region.

The information terminal device according to claim 14, wherein the estimation unit associates the feature points extracted from the image input from the imaging unit with the respective reference points set at the crotches of the four fingers in order of increasing distance from the feature point at the crotch of the thumb.

The information terminal device according to claim 1, wherein, in order to estimate changes in the position and orientation of the hand and fingers as the imaging target, the estimation unit has a function of obtaining conversion coefficients such that the associated feature points and reference points coincide under a preset conversion formula.

The information terminal device according to claim 1, wherein the control unit has a function of controlling the display information on the display unit in accordance with the changes in the position and orientation of the imaging target estimated by the estimation unit.