JP5597087B2

JP5597087B2 - Virtual object manipulation device

Info

Publication number: JP5597087B2
Application number: JP2010225154A
Authority: JP
Inventors: 聡古川; 栄次中元; 剛木本
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2010-10-04
Filing date: 2010-10-04
Publication date: 2014-10-01
Anticipated expiration: 2030-10-04
Also published as: JP2012079177A

Description

The present invention relates to a virtual object manipulation device that manipulates a virtual object placed in a virtual three-dimensional space constructed using computer graphics technology.

In general, product exhibitions and briefing sessions display mock-up or clay models of products, or display prototypes so that visitors can try them. Creating such exhibits is laborious, and preparing them takes a great deal of time.

One way to address this problem is to present, in place of a physical exhibit, a virtual object placed in a virtual three-dimensional space constructed using computer graphics technology. Because no physical exhibit has to be made, both the labor of creating it and the time required for preparation are expected to be reduced. The three-dimensional shape data for such a virtual object is often generated with three-dimensional CAD software.

A technique that generates three-dimensional shape data from image data obtained by imaging an object is also known (see, for example, Patent Document 1). The technique described in Patent Document 1 generates three-dimensional shape data of an object from a plurality of image data items captured from various directions.

It has also been proposed to create a physical solid model by machining a workpiece in real space on the basis of such three-dimensional shape data (see, for example, Patent Document 2). Patent Document 2 also describes deforming, correcting, integrating, and processing three-dimensional shape data generated by the technique described in Patent Document 1.

Patent Document 2 does not specifically describe what operations are carried out to deform, correct, integrate, or process the three-dimensional shape data. With ordinary techniques, however, the work would presumably be done in graphics software with three-dimensional functions, by operating a pointing device such as a mouse.

Separately, it has been proposed to generate a moving sequence of distance images, whose pixel values are distances to objects, to extract a specific part of a person from the distance images, and to recognize the temporal change of that part as a gesture (see, for example, Patent Document 3). In the technique described in Patent Document 3, a controlled device is operated by gesture: a control output associated with the gesture is supplied to the device.

Patent Document 1: Japanese Patent Application No. 10-124704
Patent Document 2: JP 2001-166809 A
Patent Document 3: JP 2006-99749 A

Three-dimensional shape data can be generated from image data obtained by imaging an object, as in the technique described in Patent Document 1, or created with three-dimensional CAD software. As described in Patent Document 2, such data can also be deformed, corrected, integrated, and processed. This work is generally done by operating a pointing device such as a mouse or a digitizer on a virtual three-dimensional space displayed on a monitor device.

An ordinary monitor device has only a two-dimensional display area, and a pointing device can only indicate a position on a plane. It is therefore difficult to make work in the virtual three-dimensional space shown on the monitor device match the feel of operating the pointing device, and skill is required to perform the above work on three-dimensional shape data as intended.

Likewise, when motion is to be given to a virtual object represented by three-dimensional shape data in the virtual three-dimensional space, it is difficult to specify that motion with a pointing device.

The technique described in Patent Document 3, on the other hand, recognizes the movement of a specific part of a person as a gesture and controls a controlled device according to the content of the gesture. However, Patent Document 3 uses the gesture merely as a switch for the controlled device. It does not consider using gestures for work such as deforming, correcting, integrating, or processing three-dimensional shape data, and it shows no configuration for such work. In other words, Patent Document 3 describes reflecting the movement of a specific part of a person in the operation of a controlled device, but it does not show a technique for reflecting that movement in a virtual object existing in a virtual three-dimensional space.

An object of the present invention is to provide a virtual object manipulation device in which the action of the movement of an object in real space is reflected in a virtual object in a virtual three-dimensional space.

The present invention comprises distance image generation means for generating a distance image of an object present in real space, input information acquisition means for extracting a specific part of the object from the distance image, and interaction means for causing the movement of the specific part to act on a virtual object in a virtual three-dimensional space and reflecting the result in the display on display means. The interaction means comprises: projection means for associating the specific part, using the pixel values of the specific part extracted by the input information acquisition means, with an index placed in a virtual three-dimensional space constructed by a computer; object arrangement means for placing a virtual object in the virtual three-dimensional space; tool providing means for providing a processing tool that exerts a predetermined action on the virtual object placed in the virtual three-dimensional space by the object arrangement means; tool combining means for combining the processing tool provided by the tool providing means with the index extracted by the input information acquisition means; and display processing means for reflecting the result of the action of the processing tool in the virtual object displayed on the display means, in accordance with the movement of the index, in a state in which the tool combining means has combined the processing tool with the index associated with the virtual three-dimensional space by the projection means. The object arrangement means has a function of placing the virtual object in the virtual three-dimensional space, a function of giving the virtual object a transparent texture through which the background is visible, and a function of placing in the virtual three-dimensional space, as the virtual object, a first virtual object having a prescribed shape and a second virtual object covering the virtual object. The tool providing means provides, as processing tools, a pasting tool that maps a predetermined opaque texture onto the virtual object present at the part through which the tool passes, an erasing tool that erases the second virtual object present at the part through which the tool passes, and a deformation tool that attaches the index to the surface of the virtual object and deforms the virtual object according to the direction of movement of the index. The tool combining means selects the processing tool to be combined with the index.

In this case, it is preferable to provide pattern classification means that classifies the movement pattern of the specific part using the pixel values of the specific part extracted by the input information acquisition means. Each processing tool is associated with a movement pattern, and the tool combining means combines with the index the processing tool that corresponds to the movement pattern classified by the pattern classification means.

When the virtual object is being deformed with the deformation tool and the deformation tool penetrates a virtual object, the display processing means may make the display state of the penetrated virtual object differ from that of the other virtual objects.

The object may be a human body, the specific part may include at least the limbs and the head, and the virtual object may be a three-dimensional model having limbs and a head. In that case the tool providing means may provide, as a processing tool, a linking tool that links the movement of the limbs and head of the human body to the movement of the limbs and head of the three-dimensional model.

The specific part may be a part whose position in real space changes with the movement of at least one of the limbs and the head of the human body. In that case the input information acquisition means extracts the distance of the specific part from the display means and the direction of movement of the specific part. The display processing means changes the range of the virtual three-dimensional space displayed on the display means so that it follows the position of the specific part extracted by the input information acquisition means, and changes the range and the coordinate-axis orientation of the displayed virtual three-dimensional space so that they follow the direction of movement of the specific part extracted by the input information acquisition means.

The object may be a human body and the specific part may be the head. In that case the input information acquisition means extracts the position and orientation of the head. The display processing means changes the range of the virtual three-dimensional space displayed on the display means so that it follows a viewpoint determined from the head position extracted by the input information acquisition means, and changes the coordinate-axis orientation of the displayed virtual three-dimensional space so that it follows a line of sight determined from the head orientation extracted by the input information acquisition means.

The display means preferably provides stereoscopic display of the virtual three-dimensional space.

According to the configuration of the present invention, a specific part of an object in real space is combined with a processing tool that exerts a predetermined action on a virtual object in a virtual three-dimensional space, so that the action of the movement of the object in real space can be reflected in the virtual object. As a result, the virtual object in the virtual three-dimensional space can be acted on intuitively without a pointing device, which has the advantage that the virtual object can be manipulated without special skill.

FIG. 1 is a block diagram showing an embodiment.
FIG. 2 is a diagram showing an example of the human body model used in the embodiment.
FIG. 3 is a block diagram showing a configuration example of the distance image generation means used in the embodiment.
FIG. 4 is a diagram showing a usage example of the embodiment.
FIG. 5 is a diagram showing another usage example of the embodiment.
FIG. 6 is a diagram showing still another usage example of the embodiment.
FIG. 7 is a diagram showing a further usage example of the embodiment.
FIG. 8 is a diagram showing yet a further usage example of the embodiment.

In the embodiments described below, an "object" in real space is a human body or an article worn or carried by a person. Such articles are assumed to include pointing sticks, writing instruments, gloves, and the like. When the object is a human body, a "specific part" of the object is a hand, a limb, the head, or the like. For an article such as a pointing stick or a writing instrument, it is the tip or a similar portion. The following description takes as an example the case where the object is a human body and the specific part is one hand, left or right.

As shown in FIG. 1, the present embodiment includes distance image generation means 10, which generates a distance image whose pixel values are the distances to objects present in a predetermined spatial region in real space. The predetermined spatial region is set within the range in which the distance image generation means 10 can detect the distance to an object. The configuration of the distance image generation means 10 is described later.

The distance image generated by the distance image generation means 10 is supplied to input information acquisition means 21, which extracts the hand of the human body from the distance image. To do so, the input information acquisition means 21 first extracts the region in which the human body is present, using the difference in distance from the background. As described later, the distance image generation means 10 detects the position of an object in real space in an orthogonal coordinate system set in real space. A reference plane is therefore defined in this orthogonal coordinate system, and the region in which the human body is present is extracted using the distance from the reference plane.

For example, an orthogonal coordinate system is defined by two coordinate axes that define a plane corresponding to the floor and one coordinate axis perpendicular to the floor, and a plane corresponding to a wall is used as the reference plane. The contour of the object corresponding to the human body can be extracted by applying edge enhancement to the distances from the reference plane to the object, in the same way as contour extraction in a grayscale image. For edge enhancement, as in differentiating a grayscale image, the pixel values around the pixel of interest are used to compute an evaluation value that measures the magnitude of the local change in distance, and pixels at which the evaluation value is a local maximum are taken as edge pixels.
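The edge-enhancement step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the central-difference gradient as the evaluation value, the `threshold` parameter, and the function name `edge_pixels` are all assumptions introduced here.

```python
def edge_pixels(depth, threshold):
    """Return the set of (row, col) pixels that lie on depth edges.

    depth: 2D list of distances from the reference plane to the object.
    threshold: assumed minimum evaluation value for a pixel to count as an edge.
    """
    rows, cols = len(depth), len(depth[0])
    # Evaluation value: gradient magnitude from central differences,
    # computed from the pixels around each interior pixel of interest.
    score = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = depth[r][c + 1] - depth[r][c - 1]
            gy = depth[r + 1][c] - depth[r - 1][c]
            score[r][c] = (gx * gx + gy * gy) ** 0.5

    edges = set()
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            s = score[r][c]
            if s < threshold:
                continue
            # Keep pixels whose evaluation value is a local maximum
            # along the horizontal or the vertical direction.
            horizontal_max = s >= score[r][c - 1] and s >= score[r][c + 1]
            vertical_max = s >= score[r - 1][c] and s >= score[r + 1][c]
            if horizontal_max or vertical_max:
                edges.add((r, c))
    return edges


# A 5 x 6 distance image with a vertical step between columns 2 and 3.
depth_image = [[3.0, 3.0, 3.0, 1.0, 1.0, 1.0] for _ in range(5)]
edges = edge_pixels(depth_image, threshold=1.0)
```

With this step-shaped input, the detected edge pixels are the interior pixels on either side of the step. A real distance image would need noise filtering before differentiation, which is omitted here.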

Once the contour of the human body has been extracted, the parts of the body can be separated by fitting the contour to a suitable human body model 30 such as that shown in FIG. 2. The illustrated human body model 30 consists of segments 31 corresponding to bones and joints 32 corresponding to joints. A human body model 30 with 15 joints is shown here, but a human body model 30 of another configuration may be used.

If a plurality of suitably defined feature points are extracted on the basis of the contour obtained from the distance image, the human body model 30 can be fitted to the human body present in real space. Fitting the human body model 30 in this way allows the parts of the human body in real space to be separated and recognized. Here the limbs, head, and torso are recognized separately, and the hand is then recognized by fitting the arm to the human body model 30.

It is desirable that the input information acquisition means 21 also recognize the shape of the hand. Recognizing the shape of the hand requires fitting the hand contour extracted from the distance image to a skeleton model of the hand, so it is desirable to track the position of the hand and enlarge the region containing it. It is sufficient for the recognized hand shape to indicate whether the hand is open or closed and which fingers are bent.

As described above, the input information acquisition means 21 recognizes the hand in real space by fitting the human body present in real space to the human body model 30. The input information acquisition means 21 also obtains the position of the hand in the orthogonal coordinate system that the distance image generation means 10 has set in real space. The coordinate position of the hand in real space obtained in this way is associated with a coordinate position in the virtual three-dimensional space constructed by a computer. An appropriate index is defined in the virtual three-dimensional space, and projection means 22 associates the hand with that index. In other words, because the projection means 22 associates the hand in real space with the index in the virtual reality space, moving the hand in real space moves the index in the virtual three-dimensional space.
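The association between a hand position in real space and the index in the virtual three-dimensional space can be sketched as a per-axis linear mapping. The source does not specify the mapping, so the linear form, the clamping to the sensed range, and the names `AxisMap` and `project_hand` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AxisMap:
    """Assumed linear mapping of one real-space axis onto one virtual axis."""
    real_min: float
    real_max: float
    virt_min: float
    virt_max: float

    def apply(self, x):
        # Normalize to 0..1 over the sensed range, clamping positions
        # that fall outside the predetermined spatial region.
        t = (x - self.real_min) / (self.real_max - self.real_min)
        t = min(1.0, max(0.0, t))
        return self.virt_min + t * (self.virt_max - self.virt_min)


def project_hand(hand_xyz, axis_maps):
    """Map a real-space hand position to the position of the virtual index."""
    return tuple(m.apply(v) for v, m in zip(hand_xyz, axis_maps))


# Real space: 0 m to 2 m on each axis. Virtual space: -1 to 1 on each axis.
maps = [AxisMap(0.0, 2.0, -1.0, 1.0) for _ in range(3)]
index_position = project_hand((1.0, 0.0, 3.0), maps)
```

Calling `project_hand` once per frame makes the index follow the hand, which is the behavior the paragraph describes.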

An appropriate virtual object is placed in the virtual three-dimensional space by object arrangement means 23. The virtual three-dimensional space is displayed on display means 25 through display processing means 24. The display means 25 may be an ordinary monitor device that provides two-dimensional flat display, or a configuration that provides three-dimensional stereoscopic display. Display means 25 for stereoscopic display include a configuration using goggles with shutters that open and close in synchronization with the displayed video, a display that allows stereoscopic viewing with the left and right eyes without goggles, and a head-mounted display in which displays are built into goggles. Examples of virtual object placement are described later.

Because the projection means 22 associates the hand in real space with the index defined in the virtual three-dimensional space, moving the hand moves the index in the virtual three-dimensional space. Treating the index like a mouse cursor therefore allows the hand to be used in place of a mouse. However, merely moving the index to follow the hand has no effect on the virtual three-dimensional space.

Therefore, tool providing means 27 is provided to supply processing tools that exert a predetermined action on a virtual object placed in the virtual three-dimensional space, and tool combining means 28 combines a processing tool supplied by the tool providing means 27 with the hand. With this configuration, when the hand is moved in real space, the processing tool combined with the hand acts on the virtual object placed in the virtual three-dimensional space, and the processing defined for that tool is carried out.

The tool providing means 27 may be configured to provide only one type of processing tool, but it is usually preferable to provide several types. The tool combining means 28 preferably selects, from the several types of processing tools, the one to be combined with the index (which corresponds to the hand).

To select the processing tool to be combined with the hand from several types, a tool palette in which the processing tools are arranged can be displayed on the screen of the display means 25, as in ordinary graphics software, and the processing tool can be selected from the tool palette.

Instead of using a tool palette, the processing tool may be selected using a movement pattern (which may include a shape pattern) of a specific part of the human body, such as the hand, detected by the projection means 22. In this case, movement patterns are registered in pattern classification means 29 in advance, and when the pattern classification means 29 detects a registered movement pattern, the tool combining means 28 combines the processing tool corresponding to that movement pattern with the index (which corresponds to the hand). A shape pattern is a pattern such as whether the hand is open or closed, or which fingers are bent.

The pattern classification means 29 uses the pixel values of the hand (the specific part) acquired by the input information acquisition means 21, and classifies the movement pattern of the hand from the temporal change in the position of a representative point (for example, the centroid) obtained from those pixel values. Several types of hand movement patterns are registered in advance, and a code is assigned to each type. The tool combining means 28 supplies this code to the tool providing means 27, so that the processing tool corresponding to the code can be combined with the index.

If codes are also assigned to the processing tools in the tool palette described above and made to match the codes of the pattern classification means 29, selecting a processing tool by a hand movement pattern and selecting one from the tool palette become the same process. That is, whether the instruction comes from a hand movement pattern or from the tool palette, the tool combining means 28 receives the same code for the same processing tool and can select the processing tool to be combined with the index by the same procedure.
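The classification of a hand movement from the time change of its centroid, and the shared code lookup for gesture and palette input, can be sketched as below. The pattern codes, the net-displacement rule, the `min_travel` threshold, and the assignment of codes to tools are hypothetical; the source states only that registered patterns carry codes and that the palette can share them.

```python
# Hypothetical pattern codes and tool assignments, for illustration only.
TOOL_BY_CODE = {
    "MOVE_X_POS": "paste",
    "MOVE_X_NEG": "erase",
    "MOVE_Y_POS": "deform",
}


def centroid(pixels):
    """Representative point: mean of the (x, y, z) positions of the hand pixels."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def classify_motion(frames, min_travel=0.2):
    """Classify a motion from the net displacement of the centroid.

    frames: list of per-frame lists of (x, y, z) hand pixel positions.
    Returns a pattern code, or None if the hand barely moved.
    """
    start, end = centroid(frames[0]), centroid(frames[-1])
    deltas = [e - s for s, e in zip(start, end)]
    # The dominant axis of travel decides the pattern.
    axis = max(range(3), key=lambda i: abs(deltas[i]))
    if abs(deltas[axis]) < min_travel:
        return None
    sign = "POS" if deltas[axis] > 0 else "NEG"
    return f"MOVE_{'XYZ'[axis]}_{sign}"


def select_tool(code):
    """Single lookup shared by gesture input and tool-palette input."""
    return TOOL_BY_CODE.get(code)


frames = [
    [(0.0, 1.0, 2.0), (0.1, 1.0, 2.0)],
    [(0.3, 1.0, 2.0), (0.4, 1.0, 2.0)],
    [(0.6, 1.0, 2.0), (0.7, 1.0, 2.0)],
]
gesture_code = classify_motion(frames)  # code derived from the hand movement
palette_code = "MOVE_X_POS"             # code as if picked from the tool palette
```

Because both input paths end in `select_tool`, the same code always yields the same tool regardless of how the instruction was given.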

The result of the action of the processing tool on the virtual object is reflected by the display processing means 24 in the content displayed on the display means 25. That is, while a processing tool is combined with the index (which corresponds to the hand) associated with the virtual three-dimensional space by the projection means 22, the processing tool acts on the virtual object according to the movement of the hand, and the result is reflected in the display of the display means 25.
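A toy sketch of an index with a combined processing tool acting on a virtual object as the index moves is given below. The cell-based object representation, the `Index` class, and the function names are assumptions for illustration; the pasting and erasing behaviors loosely follow the tools named in the summary of the invention.

```python
class Index:
    """Index in the virtual space, with an optionally combined processing tool."""

    def __init__(self):
        self.position = (0, 0, 0)
        self.tool = None


def paste_tool(virtual_object, cell):
    # Map an opaque texture onto the part of the object the index passes through.
    if cell in virtual_object:
        virtual_object[cell] = "opaque"


def erase_tool(virtual_object, cell):
    # Remove the part of the object the index passes through.
    virtual_object.pop(cell, None)


def move_index(index, path, virtual_object):
    """Move the index along a path, applying the combined tool at each cell."""
    for cell in path:
        index.position = cell
        if index.tool is not None:
            index.tool(virtual_object, cell)


# Virtual object modeled as cells carrying a texture label (assumed model).
obj = {(x, 0, 0): "transparent" for x in range(3)}
index = Index()

index.tool = paste_tool
move_index(index, [(0, 0, 0), (1, 0, 0)], obj)

index.tool = erase_tool
move_index(index, [(2, 0, 0)], obj)
```

After these moves, `obj` holds the state that a renderer standing in for the display processing means would draw.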

With the above configuration, the input information acquisition means 21 acquires the movement of a human body present in real space, that movement is made to act on a virtual object placed in the virtual three-dimensional space, and the result is reflected in the display on the display means 25. The projection means 22, the object arrangement means 23, the display processing means 24, the tool providing means 27, and the tool combining means 28 therefore function as interaction means that interactively link the movement of the human body in real space, acquired by the input information acquisition means 21, with the virtual object in the virtual three-dimensional space. The input information acquisition means 21 and the interaction means are implemented using a computer.

A more specific description follows. The distance image generated by the distance image generation means 10 is an image whose pixel values are distance values, and various techniques for generating distance images are known. As passive distance image generation means 10, the stereo image method is known, which obtains the distance to an object from the parallax between a plurality of imaging devices. As active distance image generation means 10, two methods are widely used: the light-section method, which obtains the distance to an object on the principle of triangulation, and the time-of-flight method, which measures the time from projecting light until the light reflected by the object is received. Distance image generation means 10 is also known that projects several types of light patterns with different spatial phases and obtains the three-dimensional shape of the object surface from the positional relationship of the light patterns formed on that surface.

The embodiments described below place no particular restriction on the technique for generating the distance image, but the time-of-flight (Time of Flight) method is used as an example. The time-of-flight method is abbreviated below as the "TOF method". Various configurations of distance image generation means 10 using the TOF method are known. The configuration adopted here projects intensity-modulated light, whose intensity is modulated by a modulation signal of constant period, into the target space, which is real space. It detects the time until the intensity-modulated light reflected by an object in the target space is received as the phase difference between the projected and received intensity-modulated light, and converts this phase difference into the distance to the object.
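The conversion from phase difference to distance follows from the round trip of the light: a phase difference Δφ at modulation frequency f corresponds to a delay of Δφ / (2πf), and the distance is half the path travelled in that time, d = c·Δφ / (4πf). The sketch below uses 20 MHz as an illustrative modulation frequency; the function names are assumptions, and how the phase difference itself is measured is outside the scope of this sketch.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def phase_to_distance(phase_rad, mod_freq_hz):
    """Distance for a measured phase difference between projected and received light.

    The light travels to the object and back, so the one-way distance is
    half the path covered during the delay phase / (2 * pi * f).
    """
    return SPEED_OF_LIGHT * phase_rad / (4.0 * math.pi * mod_freq_hz)


def unambiguous_range(mod_freq_hz):
    """Largest distance before the phase wraps past 2 * pi."""
    return SPEED_OF_LIGHT / (2.0 * mod_freq_hz)


MOD_FREQ_HZ = 20e6  # illustrative modulation frequency
half_cycle_distance = phase_to_distance(math.pi, MOD_FREQ_HZ)
max_range = unambiguous_range(MOD_FREQ_HZ)
```

At 20 MHz the phase wraps at roughly 7.5 m, so objects farther away than that would alias to shorter distances unless the ambiguity is resolved by other means.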

対象空間に投光する光は、多くの物体を透過することなく物体の表面で反射され、かつ人に知覚されない光が望ましい。そのため、投光する光には近赤外線を用いるのが望ましい。ただし、撮像領域を調節する場合のように、人に知覚されるほうが望ましい場合には可視光を用いることも可能である。 The light projected into the target space is preferably light that is reflected at the surface of most objects rather than transmitted through them, and that is not perceived by people. For this reason, it is desirable to use near-infrared light as the projected light. However, visible light can also be used when it is preferable for the light to be perceived by people, such as when adjusting the imaging region.

強度変調光の波形は、正弦波を想定しているが、三角波、鋸歯状波、方形波などを用いることができる。正弦波、三角波、鋸歯状波を用いる場合には強度変調光の周期を一定周期とする。なお、方形波を用いる場合に、強度変調光の周期を一定周期とするほか、オン期間（発光源の投光期間）とオフ期間（発光源の非投光期間）との比率を乱数的に変化させる技術を採用することも可能である。すなわち、オン期間とオフ期間とに対して十分に長い時間において、オン期間の生じる確率が５０％になるようにオン期間とオフ期間とを不規則に変化させ、十分に長い時間において累積した受光量を用いてもよい。 The waveform of the intensity-modulated light is assumed to be a sine wave, but a triangular wave, a sawtooth wave, a square wave, or the like can also be used. When a sine wave, a triangular wave, or a sawtooth wave is used, the intensity-modulated light has a constant period. When a square wave is used, besides giving the intensity-modulated light a constant period, it is also possible to adopt a technique that randomly varies the ratio between the on period (the period in which the light source emits light) and the off period (the period in which it does not). That is, the on and off periods may be varied irregularly so that, over a time sufficiently long compared with the on and off periods, the probability of the on state is 50%, and the amount of received light accumulated over that sufficiently long time may be used.

また、強度変調光を一定周期とする場合、たとえば、投光する光を２０ＭＨｚの変調信号により変調し、１００００周期程度の受光量を累積することによりショットノイズの影響を軽減させる。オン期間とオフ期間とを乱数的に変化させる場合にも、たとえば、単位期間を２０ＭＨｚの１周期に相当する期間（５×１０^−８ｓ）とし、単位期間の数倍程度の範囲でオン期間とオフ期間とを変化させ、単位期間の１００００倍程度の期間の受光量を累積する。この動作により、累積後の受光量は、一定周期の強度変調光を用いて受光量を累積した場合と同様に扱うことができる。 When the intensity-modulated light has a constant period, the projected light is modulated, for example, by a 20 MHz modulation signal, and the amount of received light is accumulated over about 10,000 periods to reduce the influence of shot noise. Likewise, when the on and off periods are varied randomly, the unit period is set, for example, to the period corresponding to one cycle at 20 MHz (5 × 10⁻⁸ s), the on and off periods are varied within a range of about several times the unit period, and the amount of received light is accumulated over about 10,000 unit periods. With this operation, the accumulated amount of received light can be handled in the same way as when it is accumulated using intensity-modulated light with a constant period.
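The random on/off scheme can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the run-length limit `max_run`, the seed, and the function name are hypothetical and do not appear in the description; only the 20 MHz unit period and the figure of about 10,000 unit periods come from the text.

```python
import random

MOD_FREQ_HZ = 20e6                 # modulation frequency from the text
UNIT_PERIOD_S = 1.0 / MOD_FREQ_HZ  # one cycle of 20 MHz, i.e. 5e-8 s
N_UNITS = 10_000                   # unit periods accumulated per read-out


def random_on_off_pattern(n_units, max_run=4, seed=0):
    """Return a list of booleans (True = emitter on), one per unit period.

    On and off runs alternate, and each run lasts 1..max_run unit periods
    chosen at random (max_run and seed are illustrative assumptions).
    """
    rng = random.Random(seed)
    pattern = []
    on = True
    while len(pattern) < n_units:
        run = rng.randint(1, max_run)
        pattern.extend([on] * run)
        on = not on
    return pattern[:n_units]


pattern = random_on_off_pattern(N_UNITS)
on_fraction = sum(pattern) / len(pattern)
accumulation_time_s = N_UNITS * UNIT_PERIOD_S  # about 0.5 ms

print(f"on fraction: {on_fraction:.3f}")
print(f"accumulation time: {accumulation_time_s * 1e3:.2f} ms")
```

Over 10,000 unit periods the on fraction settles close to 50%, which is the condition the text gives for treating the accumulated light like that of a constant-period signal.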

物体で反射された強度変調光は、複数個の画素が２次元に配列された撮像装置により受光する。撮像装置は、濃淡画像を撮像するため撮像素子と、撮像素子の受光面に光が入射する範囲を制限する受光光学系とを備える。撮像素子は、ＣＣＤイメージセンサやＣＭＯＳイメージセンサとして提供されている濃淡画像を撮像する周知構成の撮像素子を用いることができるが、距離画像生成手段１０に適する構造を有するように専用に設計された撮像素子を用いることが望ましい。 The intensity-modulated light reflected by the object is received by an imaging device in which a plurality of pixels are arranged two-dimensionally. The imaging device includes an image sensor for capturing a grayscale image and a light receiving optical system that limits the range from which light enters the light receiving surface of the image sensor. An image sensor of a well-known configuration for capturing grayscale images, available as a CCD image sensor or a CMOS image sensor, can be used, but it is desirable to use an image sensor designed specifically to have a structure suited to the distance image generation means 10.

以下では、距離画像生成手段１０の一例として下記構成を想定して説明するが、この構成に限定する趣旨ではなく、強度変調光の変調波形、撮像素子の構成、撮像素子の制御などに関して、周知の種々の距離画像生成手段１０に提供された構成を採用することができる。 The following description assumes the configuration below as an example of the distance image generation means 10. This is not intended as a limitation to that configuration; with regard to the modulation waveform of the intensity-modulated light, the configuration of the image sensor, the control of the image sensor, and so on, configurations provided in various well-known distance image generation means 10 can be adopted.

以下の説明で用いる距離画像生成手段１０は、図３に示すように、光を対象空間に投光する発光素子１１と、対象空間からの光を受光する撮像素子１２とを備える。発光素子１１は、発光ダイオードやレーザダイオードのように入力の瞬時値に比例した光出力が得られる素子を用いる。また、発光素子１１から出射した光は投光光学系１３を通して投光される。発光素子１１は、光出力を確保するために適数個設けられる。撮像素子１２の前方には、視野を決める受光光学系１４が配置される。 As shown in FIG. 3, the distance image generation means 10 used in the following description includes a light emitting element 11 that projects light into a target space, and an imaging element 12 that receives light from the target space. As the light emitting element 11, an element such as a light emitting diode or a laser diode that can obtain an optical output proportional to an instantaneous value of input is used. Further, the light emitted from the light emitting element 11 is projected through the light projecting optical system 13. An appropriate number of light emitting elements 11 are provided to ensure light output. A light receiving optical system 14 that determines the field of view is disposed in front of the image sensor 12.

発光素子１１から出射された強度変調光は投光光学系１３を通して所望の空間領域に投光される。撮像素子１２は、受光光学系１４を通して対象空間からの光を受光する。投光光学系１３と受光光学系１４とは、投受光の方向（光軸の方向）を平行にし、互いに近接して配置してある。ここに、投光光学系１３と受光光学系１４との距離は視野領域に対して実質的に無視することができるものとする。 The intensity-modulated light emitted from the light emitting element 11 is projected onto a desired spatial region through the light projecting optical system 13. The image sensor 12 receives light from the target space through the light receiving optical system 14. The light projecting optical system 13 and the light receiving optical system 14 are arranged close to each other with the light projecting and receiving directions (the direction of the optical axis) in parallel. Here, it is assumed that the distance between the light projecting optical system 13 and the light receiving optical system 14 can be substantially ignored with respect to the visual field region.

距離画像生成手段１０は、強度変調光を出射させるために発光素子１１に与える変調信号を生成する変調信号生成部１５を備える。また、距離画像生成手段１０は、撮像素子１２が対象空間から受光するタイミングを制御するために、撮像素子１２での受光タイミングを規定する受光タイミング信号を変調信号から生成するタイミング制御部１６を備える。撮像素子１２で得られた受光量に相当する電荷は撮像素子１２から読み出されて演算処理部１７に入力される。演算処理部１７は、受光タイミングと受光量との関係から対象空間に存在する物体までの距離を求める。また、後述するように、演算処理部１７は、物体までの距離と方位とを、実空間に設定した直交座標系の座標値に変換して出力する機能も備える。 The distance image generation means 10 includes a modulation signal generation unit 15 that generates the modulation signal supplied to the light emitting element 11 in order to emit intensity-modulated light. The distance image generation means 10 also includes a timing control unit 16 that generates, from the modulation signal, a light reception timing signal defining the light reception timing at the image sensor 12, in order to control when the image sensor 12 receives light from the target space. The charge corresponding to the amount of light received by the image sensor 12 is read out of the image sensor 12 and input to the arithmetic processing unit 17. The arithmetic processing unit 17 obtains the distance to an object in the target space from the relationship between the light reception timing and the amount of received light. As described later, the arithmetic processing unit 17 also has a function of converting the distance and direction to the object into coordinate values of an orthogonal coordinate system set in the real space and outputting those values.

変調信号生成部１５は、出力電圧が一定周波数（たとえば、２０ＭＨｚ）の正弦波形で変化する変調信号を生成する。発光素子１１はこの変調信号により駆動され、光出力が正弦波状に変化する強度変調光が発光素子１１から出射される。 The modulation signal generator 15 generates a modulation signal whose output voltage changes with a sine waveform having a constant frequency (for example, 20 MHz). The light emitting element 11 is driven by this modulation signal, and intensity modulated light whose light output changes in a sine wave shape is emitted from the light emitting element 11.

本実施形態において用いる撮像素子１２は、電子シャッタの技術を用いることにより、受光タイミング信号に同期する期間にのみ受光強度に応じた電荷を生成する。また、生成された電荷は、遮光された蓄積領域に転送され、蓄積領域において変調信号の複数周期（たとえば、１００００周期）に相当する蓄積期間に蓄積された後、撮像素子１２の外部に受光出力として取り出される。 By using electronic shutter technology, the image sensor 12 used in this embodiment generates charge corresponding to the received light intensity only during periods synchronized with the light reception timing signal. The generated charge is transferred to a light-shielded accumulation region, accumulated there over an accumulation period corresponding to a plurality of periods of the modulation signal (for example, 10,000 periods), and then taken out of the image sensor 12 as a light reception output.

タイミング制御部１６では、変調信号に同期する受光タイミング信号を生成する。ここでは、タイミング制御部１６が、変調信号の異なる４位相ごとに一定時間幅の受光期間を有した４種類の受光タイミング信号を生成する。また、上述した蓄積期間ごとに４種類の受光タイミング信号から選択した１種類の受光タイミング信号を撮像素子１２に与える。 The timing control unit 16 generates light reception timing signals synchronized with the modulation signal. Here, the timing control unit 16 generates four types of light reception timing signals, each having a light reception period of a fixed duration at one of four different phases of the modulation signal. For each accumulation period described above, one light reception timing signal selected from the four types is supplied to the image sensor 12.

すなわち、１回の蓄積期間において１種類の受光タイミング信号を撮像素子１２に与えることにより、変調信号の特定の位相期間に対応する受光期間における電荷を撮像素子１２の各画素で生成する。蓄積後の電荷は、受光出力として撮像素子１２から取り出される。蓄積期間ごとに異なる各受光タイミング信号を撮像素子１２に与え、撮像素子１２で生成された電荷を受光出力として取り出す動作を繰り返すと、４回の蓄積期間で４種類の受光タイミング信号に対応する受光出力が撮像素子１２から得られる。 That is, by supplying one type of light reception timing signal to the image sensor 12 during one accumulation period, each pixel of the image sensor 12 generates charge for the light reception period corresponding to a specific phase interval of the modulation signal. The accumulated charge is taken out of the image sensor 12 as a light reception output. When a different light reception timing signal is supplied to the image sensor 12 in each accumulation period and the operation of taking out the generated charge as a light reception output is repeated, the light reception outputs corresponding to the four types of light reception timing signals are obtained from the image sensor 12 in four accumulation periods.

いま、４種類の受光タイミング信号が、変調信号の１周期において９０度ずつ異なる位相に設定され、各受光タイミング信号に対応して撮像素子１２から出力された受光出力（電荷量）が、それぞれＡ０，Ａ１，Ａ２，Ａ３であったとする。このとき、三角関数の関係を用いると、強度変調光の投光時と受光時との位相差ψ〔ｒａｄ〕は、下式の形式で表すことができる。 Now, suppose that the four types of light reception timing signals are set to phases that differ by 90 degrees within one period of the modulation signal, and that the light reception outputs (charge amounts) output from the image sensor 12 for the respective light reception timing signals are A0, A1, A2, and A3. Then, using trigonometric relations, the phase difference ψ [rad] between projection and reception of the intensity-modulated light can be expressed in the form of the following equation.
ψ = tan⁻¹{(A0 − A2) / (A1 − A3)}
変調信号の周波数は一定であるから、位相差ψを投光から受光までの時間差に換算することができ、光速は既知であるから、時間差が求まれば物体までの距離を求めることができる。 Since the frequency of the modulation signal is constant, the phase difference ψ can be converted into the time difference from light projection to light reception, and since the speed of light is known, the distance to the object can be obtained once the time difference is known.
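The conversion from the four charge amounts to a distance can be sketched in Python as follows. This is a minimal sketch, not the arithmetic processing unit 17 itself: it takes the ratio (A0 − A2) / (A1 − A3) as tan ψ, assumes the sample model written in the comments, and uses function names that are hypothetical. The round trip of the light is accounted for by halving the path length.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light
MOD_FREQ_HZ = 20e6         # example modulation frequency from the text


def phase_from_four_samples(a0, a1, a2, a3):
    """Phase difference psi in [0, 2*pi) from four accumulated charges.

    atan2 is used instead of a bare ratio so that the quadrant is kept
    and a1 == a3 does not divide by zero. The differences cancel any
    constant ambient-light offset common to all four samples.
    """
    return math.atan2(a0 - a2, a1 - a3) % (2.0 * math.pi)


def distance_from_phase(psi, mod_freq_hz=MOD_FREQ_HZ):
    """Convert phase to distance; the light covers the distance twice."""
    time_of_flight = psi / (2.0 * math.pi * mod_freq_hz)
    return C_M_PER_S * time_of_flight / 2.0


# Synthetic charges under the assumed model:
#   A0 = B + A*sin(psi), A1 = B + A*cos(psi),
#   A2 = B - A*sin(psi), A3 = B - A*cos(psi)
true_psi, amp, offset = 1.0, 100.0, 250.0
a0 = offset + amp * math.sin(true_psi)
a1 = offset + amp * math.cos(true_psi)
a2 = offset - amp * math.sin(true_psi)
a3 = offset - amp * math.cos(true_psi)

psi = phase_from_four_samples(a0, a1, a2, a3)
print(psi, distance_from_phase(psi))
```

Because the phase wraps at 2π, the distance is unambiguous only up to c / (2f), which is roughly 7.5 m at 20 MHz.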

すなわち、４種類の受光出力（電荷量）Ａ０〜Ａ３により物体までの距離を求めることができる。なお、受光期間は、各画素において適正な受光量が得られるように、適宜に設定することができる（たとえば、変調信号の４分の１周期に相当する受光期間を用いることが可能である）。各受光期間の時間幅は互いに等しくすることが必要である。 That is, the distance to the object can be obtained from the four light reception outputs (charge amounts) A0 to A3. The light reception period can be set as appropriate so that a suitable amount of received light is obtained at each pixel (for example, a light reception period corresponding to a quarter of the period of the modulation signal can be used). The light reception periods must all have the same duration.

演算処理部１７では、受光出力（電荷量）Ａ０〜Ａ３に基づいて位相差ψを求め、距離に換算する上述の処理のほか、以下の実施形態において説明する処理も行う。演算処理部１７はマイコン、ＤＳＰ、ＦＰＧＡなどから選択されるデジタル信号処理装置を用いて構成され、上述した処理はデジタル信号処理装置においてプログラムを実行することにより実現される。また、演算処理部１７だけではなく、発光素子１１および撮像素子１２を除く構成は、上述したデジタル信号処理装置を用いて実現可能である。 The arithmetic processing unit 17 obtains the phase difference ψ based on the light reception outputs (charge amounts) A0 to A3, and performs the processing described in the following embodiment in addition to the above processing for converting to the distance. The arithmetic processing unit 17 is configured by using a digital signal processing device selected from a microcomputer, a DSP, an FPGA, and the like, and the above-described processing is realized by executing a program in the digital signal processing device. Further, not only the arithmetic processing unit 17 but also the configuration excluding the light emitting element 11 and the imaging element 12 can be realized by using the above-described digital signal processing apparatus.

上述の動作例では、４種類の受光タイミング信号を用いているが、３種類の受光タイミング信号でも位相差ψを求めることができ、環境光ないし周囲光が存在しない環境下では、２種類の受光タイミング信号でも位相差ψを求めることが可能である。 The operation example above uses four types of light reception timing signals, but the phase difference ψ can also be obtained with three types of light reception timing signals, and in an environment with no ambient light it can be obtained even with two types.
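The three-signal case can be illustrated as follows. The description does not say how the three light reception timing signals are spaced, so this sketch assumes three samples 120 degrees apart and the cosine sample model given in the docstring; both are assumptions made for illustration, as is the function name.

```python
import math


def phase_from_three_samples(a0, a1, a2):
    """Phase psi in [0, 2*pi) from samples taken 120 degrees apart.

    Assumed model: a_k = B + A*cos(psi - k*120deg), k = 0, 1, 2.
    Both arguments of atan2 are differences, so the offset B cancels.
    """
    return math.atan2(math.sqrt(3.0) * (a1 - a2),
                      2.0 * a0 - a1 - a2) % (2.0 * math.pi)


true_psi, amp, offset = 2.5, 80.0, 300.0
samples = [offset + amp * math.cos(true_psi - k * 2.0 * math.pi / 3.0)
           for k in range(3)]
psi3 = phase_from_three_samples(*samples)
print(psi3)
```

Three samples are the minimum needed to separate the three unknowns (offset, amplitude, and phase). With no ambient offset, one unknown drops out, which is consistent with the statement that two signals suffice in that case.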

さらに、上述した動作では、１画素について１種類の受光タイミング信号に対応する電荷を蓄積しているから、４種類の受光出力（電荷量）Ａ０〜Ａ３を撮像素子１２から取り出すために４回の蓄積期間が必要である。これに対して、１画素について２種類の受光タイミング信号に対応する電荷を蓄積すれば、撮像素子１２から２種類の受光タイミング信号に対応した受光出力を１回で読み出すことが可能になる。同様に、１画素について４種類の受光タイミング信号に対応する電荷を蓄積可能に構成すれば、４種類の受光タイミング信号に対応する受光出力を１回で読み出すことが可能になる。 Furthermore, in the operation described above, each pixel accumulates the charge corresponding to one type of light reception timing signal, so four accumulation periods are needed to take the four light reception outputs (charge amounts) A0 to A3 out of the image sensor 12. In contrast, if each pixel accumulates the charges corresponding to two types of light reception timing signals, the light reception outputs corresponding to those two types can be read out of the image sensor 12 in a single read-out. Similarly, if each pixel is configured to accumulate the charges corresponding to four types of light reception timing signals, the light reception outputs corresponding to all four types can be read out in a single read-out.

上述した距離画像生成手段１０は、対象空間からの光を受光するための受光素子として複数個の画素が２次元配列された撮像素子を用いているから、各画素の画素値として距離値を求めることにより距離画像が生成されることになる。すなわち、撮像素子の受光面が距離画像生成手段１０の視野領域を投影する仮想の投影面になる。 Since the above-described distance image generation means 10 uses an image sensor in which a plurality of pixels are two-dimensionally arranged as a light receiving element for receiving light from the target space, a distance value is obtained as a pixel value of each pixel. As a result, a distance image is generated. That is, the light receiving surface of the image sensor becomes a virtual projection surface for projecting the visual field region of the distance image generating means 10.

上述した距離画像生成手段１０は、対象空間に発光素子１１から投光し撮像素子１２の視野領域を対象空間として撮像するから、対象空間の形状は、距離画像生成手段１０を頂点として距離画像生成手段１０から離れるほど広がる形になる。たとえば、投光光学系１３および受光光学系１４がそれぞれ光軸の周りに等方的に形成されていると、対象空間の形状は、距離画像生成手段１０を頂点とする角錐状になる。 Since the distance image generation means 10 described above projects light from the light emitting element 11 into the target space and images the field of view of the image sensor 12 as the target space, the target space has a shape that has the distance image generation means 10 at its apex and widens with increasing distance from the distance image generation means 10. For example, when the light projecting optical system 13 and the light receiving optical system 14 are each formed isotropically around the optical axis, the target space has the shape of a pyramid with the distance image generation means 10 at its apex.

したがって、上述した仮想の投影面に配列された画素の位置は、距離画像生成手段１０から対象空間を見込む方向に対応することになり、各画素の画素値は当該方向に存在する物体までの距離を表すことになる。言い換えると、距離画像生成手段１０により生成された距離画像は、極座標系で物体の位置を表していることになる。このような距離画像を極座標系の距離画像と呼ぶことにする。 Accordingly, the position of each pixel arranged on the virtual projection plane described above corresponds to a direction in which the target space is viewed from the distance image generation means 10, and the pixel value of each pixel represents the distance to the object present in that direction. In other words, the distance image generated by the distance image generation means 10 represents the position of an object in a polar coordinate system. Such a distance image is hereinafter called a distance image of the polar coordinate system.

上述した極座標系の距離画像は、距離画像生成手段１０からの距離の情報が必要であるときには利便性が高いが、対象空間である実空間の各位置との対応関係がわかりにくく、実空間に存在する物体を基準にした領域を指定するには不便である。したがって、演算処理部１７では、極座標系の距離画像から直交座標系の各座標値を有した画像を生成する座標変換を行う。以下では、座標変換を行った後の画像を座標軸別画像と呼ぶ。 The distance image of the polar coordinate system described above is convenient when information on the distance from the distance image generation means 10 is needed, but its correspondence with positions in the real space that is the target space is hard to grasp, and it is inconvenient for specifying a region relative to an object present in the real space. Therefore, the arithmetic processing unit 17 performs a coordinate conversion that generates, from the distance image of the polar coordinate system, images holding the respective coordinate values of an orthogonal coordinate system. Hereinafter, an image obtained by this coordinate conversion is called a coordinate axis-specific image.

極座標系の距離画像から座標軸別画像を生成する手順について、簡単に説明する。極座標系の距離画像から座標軸別画像を生成するには、まず極座標系の距離画像を直交座標系であるカメラ座標系の３次元画像に変換し、この３次元画像を対象空間に規定した直交座標系であるグローバル座標系の３次元画像に変換する。グローバル座標系の３次元画像が得られると、各座標軸別画像に分解することができる。 The procedure for generating the coordinate axis-specific images from the distance image of the polar coordinate system is briefly described. To generate them, the distance image of the polar coordinate system is first converted into a three-dimensional image in the camera coordinate system, which is an orthogonal coordinate system, and this three-dimensional image is then converted into a three-dimensional image in the global coordinate system, which is an orthogonal coordinate system defined in the target space. Once the three-dimensional image in the global coordinate system is obtained, it can be decomposed into the individual coordinate axis-specific images.

極座標系の距離画像からカメラ座標系の３次元画像を生成するには以下の演算を行う。ここに、撮像素子１２の受光面における水平方向と垂直方向とをｕ方向とｖ方向とする。撮像素子１２と受光光学系１４とは、受光光学系１４の光軸が撮像素子１２の受光面の中心位置の画素を通るように配置する。また、撮像素子１２の受光面を受光光学系１４の焦点に位置させる。この位置関係において、撮像素子１２の受光面の中心位置の画素の座標（単位は、ピクセル）を（ｕｃ，ｖｃ）、撮像素子１２の画素のｕ方向とｖ方向とのピッチ（単位は、ｍｍ）を（ｓｕ，ｓｖ）とする。さらに、受光光学系１４の焦点距離をｆ［ｍｍ］とし、撮像素子１２の受光面における各画素の座標（単位はピクセル）の位置（ｕ，ｖ）に対応する方向に存在する物体について受光光学系１４の中心から物体までの距離をｄ［ｍｍ］とする。これらの値は、距離画像生成手段１０において物体までの距離ｄを各画素の位置に対応付けることにより既知の値になる。 To generate a three-dimensional image in the camera coordinate system from the distance image of the polar coordinate system, the following calculation is performed. Here, the horizontal and vertical directions on the light receiving surface of the image sensor 12 are taken as the u direction and the v direction. The image sensor 12 and the light receiving optical system 14 are arranged so that the optical axis of the light receiving optical system 14 passes through the pixel at the center of the light receiving surface of the image sensor 12. The light receiving surface of the image sensor 12 is positioned at the focal point of the light receiving optical system 14. In this arrangement, let (uc, vc) be the coordinates (in pixels) of the pixel at the center of the light receiving surface of the image sensor 12, and let (su, sv) be the pixel pitch (in mm) of the image sensor 12 in the u and v directions. Further, let f [mm] be the focal length of the light receiving optical system 14, and let d [mm] be the distance from the center of the light receiving optical system 14 to an object present in the direction corresponding to the pixel at position (u, v) (in pixels) on the light receiving surface of the image sensor 12. These values are known once the distance image generation means 10 has associated the distance d to the object with the position of each pixel.

物体についてカメラ座標系での座標値（Ｘ１，Ｙ１，Ｚ１）を求めると以下のようになる。座標値（Ｘ１，Ｙ１，Ｚ１）の各成分の単位はｍｍである。なお、撮像素子１２の受光面に設定した座標系の原点は矩形状である撮像素子１２の受光面の１つの角の位置とし、直交座標系の原点は受光光学系１４の中心とする。 The coordinate values (X1, Y1, Z1) of the object in the camera coordinate system are obtained as follows. The unit of each component of (X1, Y1, Z1) is mm. The origin of the coordinate system set on the light receiving surface of the image sensor 12 is at one corner of the rectangular light receiving surface, and the origin of the orthogonal coordinate system is the center of the light receiving optical system 14.
X1 = u1 · d / R
Y1 = v1 · d / R
Z1 = f · d / R
ただし、 where
u1 = su (u − uc)
v1 = sv (v − vc)
R = (u1² + v1² + f²)^(1/2)
ここでは、説明を簡単にするために、受光光学系１４の光学歪みの影響は省略している。ただし、光学中心からの距離Ｒを補正する歪み補正式を用いることにより、光学歪みを補正することができる。 Here, to simplify the description, the influence of optical distortion of the light receiving optical system 14 is ignored. Optical distortion can, however, be corrected by using a distortion correction formula that corrects the distance R from the optical center.
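The conversion from a pixel of the polar distance image to camera coordinates can be written as a short Python function. The sketch below follows the expressions for X1, Y1, Z1, u1, v1, and R given above; the sensor size, pixel pitch, and focal length in the usage lines are hypothetical values chosen only for illustration.

```python
import math


def pixel_to_camera(u, v, d, f, su, sv, uc, vc):
    """Map pixel (u, v) with measured distance d [mm] to (X1, Y1, Z1) [mm].

    f is the focal length [mm], (su, sv) the pixel pitch [mm/pixel] and
    (uc, vc) the pixel crossed by the optical axis. Lens distortion is
    ignored, as in the text.
    """
    u1 = su * (u - uc)
    v1 = sv * (v - vc)
    r = math.sqrt(u1 * u1 + v1 * v1 + f * f)
    return (u1 * d / r, v1 * d / r, f * d / r)


# Hypothetical sensor: 160 x 120 pixels, 0.02 mm pitch, 4 mm focal length.
params = dict(f=4.0, su=0.02, sv=0.02, uc=80, vc=60)

center = pixel_to_camera(80, 60, 2000.0, **params)
corner = pixel_to_camera(0, 0, 2000.0, **params)
print(center)
print(corner)
```

For the center pixel the whole distance lies along the optical axis, so X1 and Y1 are zero and Z1 equals d. For any other pixel, the three components still have a Euclidean norm of d, because (u1, v1, f) / R is a unit vector.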

上述したように、極座標系の距離画像をカメラ座標系の３次元画像に変換した後に、グローバル座標系への座標変換を行う。ここに、距離画像生成手段１０は、撮像素子１２の受光面における垂直方向（すなわち、ｖ方向）が室内の壁面および床面に平行になるように配置されているものとする。 As described above, after the distance image in the polar coordinate system is converted into a three-dimensional image in the camera coordinate system, the coordinate conversion to the global coordinate system is performed. Here, it is assumed that the distance image generation means 10 is arranged so that the vertical direction (that is, the v direction) on the light receiving surface of the image sensor 12 is parallel to the indoor wall surface and floor surface.

いま、距離画像生成手段１０において受光光学系１４の光軸の俯角をθとする。また、距離画像生成手段１０の視野角をφとする。カメラ座標系からグローバル座標系への変換には、カメラ座標系での座標値（Ｘ１，Ｙ１，Ｚ１）に対して、Ｙ軸周りで俯角θに相当する回転を行う。これにより、室内の壁面および床面に直交する座標軸を持つグローバル座標系の３次元画像が生成される。 Now, let the angle of depression of the optical axis of the light receiving optical system 14 in the distance image generating means 10 be θ. Further, the viewing angle of the distance image generating means 10 is assumed to be φ. For conversion from the camera coordinate system to the global coordinate system, rotation corresponding to the depression angle θ is performed around the Y axis with respect to the coordinate values (X1, Y1, Z1) in the camera coordinate system. Thereby, a three-dimensional image of a global coordinate system having a coordinate axis orthogonal to the indoor wall surface and floor surface is generated.

以下では、グローバル座標系での座標値を（Ｘ，Ｙ，Ｚ）とする。図示例では、壁に直交する方向で壁から離れる向きをＸ方向の正の向きとし、床に直交する方向の下向きをＺ方向の正の向きとする。また、座標系には右手系を用いる。俯角θに関する座標系の回転は、以下の計算により行う。 Hereinafter, the coordinate values in the global coordinate system are denoted (X, Y, Z). In the illustrated example, the direction orthogonal to the wall and pointing away from it is the positive X direction, and the downward direction orthogonal to the floor is the positive Z direction. A right-handed coordinate system is used. The rotation of the coordinate system for the depression angle θ is performed by the following calculation.
X = X1 · cos(90° − θ) + Z1 · sin(90° − θ)
Y = Y1
Z = −X1 · sin(90° − θ) + Z1 · cos(90° − θ)
座標軸別画像は、グローバル座標系にマッピングを行った画像そのものではなく、距離画像の各画素の座標位置（ｕ，ｖ）にＸ値とＹ値とＺ値とをそれぞれ個別に対応付けた画像である。すなわち、各座標位置（ｕ，ｖ）に、Ｘ（ｕ，ｖ）、Ｙ（ｕ，ｖ）、Ｚ（ｕ，ｖ）をそれぞれ対応付けた画像であり、１枚の極座標系の距離画像に対して３枚の座標軸別画像が生成される。極座標系の距離画像から座標軸別画像を得るには、上述の計算を行ってもよいが、極座標系の距離画像から座標値（Ｘ，Ｙ，Ｚ）に変換するテーブルを用意しておけば、処理負荷を軽減することができる。 A coordinate axis-specific image is not the image mapped into the global coordinate system itself, but an image in which the X value, the Y value, or the Z value is individually associated with the coordinate position (u, v) of each pixel of the distance image. That is, these are images in which X(u, v), Y(u, v), and Z(u, v) are respectively associated with each coordinate position (u, v), so three coordinate axis-specific images are generated for one distance image of the polar coordinate system. The coordinate axis-specific images may be obtained from the distance image of the polar coordinate system by performing the calculation described above, but the processing load can be reduced by preparing in advance tables that convert the distance image of the polar coordinate system into the coordinate values (X, Y, Z).
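The rotation about the Y axis can be checked with a small Python sketch that implements the three expressions for X, Y, and Z directly. The function name and the sample point are hypothetical.

```python
import math


def camera_to_global(x1, y1, z1, depression_deg):
    """Rotate camera coordinates about the Y axis into global coordinates."""
    a = math.radians(90.0 - depression_deg)
    x = x1 * math.cos(a) + z1 * math.sin(a)
    y = y1
    z = -x1 * math.sin(a) + z1 * math.cos(a)
    return (x, y, z)


# A point 1000 mm along the optical axis, seen with two depression angles.
level = camera_to_global(0.0, 0.0, 1000.0, depression_deg=0.0)
down = camera_to_global(0.0, 0.0, 1000.0, depression_deg=90.0)
print(level)
print(down)
```

With a depression angle of 0° the optical axis maps onto the positive X direction (away from the wall), and with 90° it maps onto the positive Z direction (downward), which matches the axis conventions stated in the text.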

座標軸別画像のうちＸ値を画素値とする画像をＸ画像、Ｙ値を画素値とする画像をＹ画像、Ｚ値を画素値とする画像をＺ画像と呼ぶことにする。上記構成では、座標変換を行うテーブルとしては、Ｘ画像に変換するＸ変換テーブルと、Ｙ画像に変換するＹ変換テーブルと、Ｚ画像に変換するＺ変換テーブルとの３種類が必要になる。 Of the images by coordinate axes, an image having an X value as a pixel value is referred to as an X image, an image having a Y value as a pixel value is referred to as a Y image, and an image having a Z value as a pixel value is referred to as a Z image. In the above configuration, three types of tables for performing coordinate conversion are required: an X conversion table for converting to an X image, a Y conversion table for converting to a Y image, and a Z conversion table for converting to a Z image.
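One way to realize the three conversion tables is sketched below. The description does not specify what the tables contain, so this sketch assumes that each table stores, per pixel, the factor by which the measured distance d is multiplied to obtain X, Y, or Z; this works because all three coordinates are linear in d. The function names and the tiny 4 x 3 sensor are hypothetical.

```python
import math


def build_tables(width, height, f, su, sv, uc, vc, depression_deg):
    """Precompute per-pixel factors kx, ky, kz with X = kx * d, and so on."""
    a = math.radians(90.0 - depression_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    kx = [[0.0] * width for _ in range(height)]
    ky = [[0.0] * width for _ in range(height)]
    kz = [[0.0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            u1, v1 = su * (u - uc), sv * (v - vc)
            r = math.sqrt(u1 * u1 + v1 * v1 + f * f)
            x1, y1, z1 = u1 / r, v1 / r, f / r   # camera frame, d = 1
            kx[v][u] = x1 * cos_a + z1 * sin_a
            ky[v][u] = y1
            kz[v][u] = -x1 * sin_a + z1 * cos_a
    return kx, ky, kz


def axis_images(distance_image, tables):
    """Turn one polar distance image into the X, Y and Z images."""
    return tuple(
        [[k * d for k, d in zip(k_row, d_row)]
         for k_row, d_row in zip(table, distance_image)]
        for table in tables
    )


# Hypothetical 4 x 3 sensor looking straight down (depression 90 degrees).
tables = build_tables(4, 3, f=4.0, su=0.02, sv=0.02, uc=2, vc=1,
                      depression_deg=90.0)
distance_image = [[1500.0] * 4 for _ in range(3)]
x_img, y_img, z_img = axis_images(distance_image, tables)
print(z_img[1][2])
```

The tables depend only on the optics and the mounting angle, so they are built once; each frame then costs one multiplication per pixel per axis instead of a square root and several trigonometric evaluations.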

上述したＸ画像とＹ画像とＺ画像とは、コンピュータのメモリに格納され、実空間である対象空間に対応する３次元の仮想空間を表している。したがって、仮想空間における条件を設定すれば、対象空間において条件を設定したことと等価になる。 The X image, the Y image, and the Z image described above are stored in a memory of a computer and represent a three-dimensional virtual space corresponding to a target space that is a real space. Therefore, setting the condition in the virtual space is equivalent to setting the condition in the target space.

距離画像生成手段１０において生成されたＸ画像とＹ画像とＺ画像とは、図１に示したように、入力情報取得手段２１に与えられる。ここで、距離画像生成手段１０により実空間における所定の空間領域に存在する人体までの距離画像を生成すると、仮想三次元空間に存在する仮想物体に対して上述のように手の位置に応じた作用を及ぼすことが可能になる。 The X image, Y image, and Z image generated by the distance image generating means 10 are given to the input information acquiring means 21 as shown in FIG. Here, when a distance image to a human body existing in a predetermined space area in the real space is generated by the distance image generating means 10, the virtual object existing in the virtual three-dimensional space corresponds to the position of the hand as described above. It becomes possible to act.

以下では、手に結合される処理ツールの例を説明する。図４には、背景を透過させる透明なテクスチャを有した仮想物体４１を仮想三次元空間に配置した例を示している。仮想物体４１は、たとえば自動車であるが、図４（ａ）に示すように、当初は表示手段２５の画面上では視認できない状態で配置される。図４（ａ）における破線は、表示手段２５には表示されていないが、透明な仮想物体４１が存在していることを示している。 Examples of processing tools coupled to the hand are described below. FIG. 4 shows an example in which a virtual object 41 having a transparent texture that lets the background show through is arranged in a virtual three-dimensional space. The virtual object 41 is, for example, an automobile, but as shown in FIG. 4A, it is initially arranged in a state in which it cannot be seen on the screen of the display means 25. The broken line in FIG. 4A indicates that the transparent virtual object 41 is present although it is not displayed on the display means 25.

この状態において、図４（ｂ）のように、表示手段２５の画面に表示される指標４２を仮想物体４１が存在する領域に重ねると、仮想物体４１に不透明なテクスチャがマッピングされ、仮想物体４１が表示手段２５の画面上で視認できるようになる。すなわち、図示例では、処理ツールとして、仮想物体４１に作用して不透明なテクスチャをマッピングする貼付ツールを提供する場合を示している。 In this state, as shown in FIG. 4B, when the index 42 displayed on the screen of the display means 25 is placed over the region where the virtual object 41 exists, an opaque texture is mapped onto the virtual object 41, and the virtual object 41 becomes visible on the screen of the display means 25. That is, the illustrated example shows the case in which a pasting tool that acts on the virtual object 41 and maps an opaque texture onto it is provided as the processing tool.

貼付ツールが手に結合されている場合には、図４（ｂ）に示すように、仮想物体４１を手でこする動作を行うことによって、透明である仮想物体４１に不透明なテクスチャが徐々にマッピングされる。図示例では、仮想三次元空間における指標の位置を確認できるように、手の形の指標４２を表示手段２５の画面上に表示している。そして、実空間において手を動かすことにより、この指標４２の通過部位に存在する仮想物体４１にテクスチャがマッピングされる。 When the pasting tool is coupled to the hand, rubbing the virtual object 41 with the hand, as shown in FIG. 4B, gradually maps an opaque texture onto the transparent virtual object 41. In the illustrated example, a hand-shaped index 42 is displayed on the screen of the display means 25 so that the position of the index in the virtual three-dimensional space can be confirmed. By moving the hand in the real space, the texture is mapped onto the parts of the virtual object 41 that the index 42 passes over.

上述のように貼付ツールを用いることによって、実空間での手の動きに合わせて、何も存在しない空間から仮想物体４１が現れるかのような印象で、表示手段２５の画面に徐々に仮想物体４１が表示される。その結果、展示会や説明会において、仮想物体４１を登場させる際に印象的な演出を行うことができる。 By using the pasting tool as described above, the virtual object 41 is gradually displayed on the screen of the display means 25 in step with the movement of the hand in the real space, giving the impression that the virtual object 41 is emerging from empty space. As a result, an impressive presentation can be staged when the virtual object 41 is introduced at an exhibition or a briefing session.
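The gradual reveal performed by the pasting tool can be sketched as a mask over the texture of the virtual object. The description does not state how the mapping is implemented, so everything here is an assumption for illustration: the boolean mask, the circular brush of radius `radius`, and the premise that the index position has already been projected to texel coordinates.

```python
def rub(revealed, index_u, index_v, radius):
    """Mark texels within `radius` of the index position as opaque."""
    height, width = len(revealed), len(revealed[0])
    for v in range(max(0, index_v - radius), min(height, index_v + radius + 1)):
        for u in range(max(0, index_u - radius), min(width, index_u + radius + 1)):
            if (u - index_u) ** 2 + (v - index_v) ** 2 <= radius ** 2:
                revealed[v][u] = True


def coverage(revealed):
    """Fraction of the texture that has been made opaque so far."""
    total = sum(len(row) for row in revealed)
    return sum(sum(row) for row in revealed) / total


# Hypothetical 32 x 32 texture, fully transparent at the start.
revealed = [[False] * 32 for _ in range(32)]
history = []
for step in range(8):                      # the hand sweeps left to right
    rub(revealed, index_u=4 * step + 2, index_v=16, radius=5)
    history.append(coverage(revealed))
print(history)
```

The same mask, used the other way round (texels switched from opaque to transparent), would give the behavior of an erasing tool acting on a covering object.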

図５に示す例では、２種類の仮想物体４３，４４を仮想三次元空間に配置している。第１の仮想物体４３は、たとえば自動車であって、第２の仮想物体４４は、第１の仮想物体４３を覆うように配置される。図５（ａ）は第２の仮想物体４４が第１の仮想物体４３を完全に覆っている状態を示している。図５（ａ）には第１の仮想物体４３を破線で示しているが、実際には表示手段２５の画面には第１の仮想物体４３は視認できない状態になっている。 In the example shown in FIG. 5, two types of virtual objects 43 and 44 are arranged in a virtual three-dimensional space. The first virtual object 43 is, for example, an automobile, and the second virtual object 44 is disposed so as to cover the first virtual object 43. FIG. 5A shows a state in which the second virtual object 44 completely covers the first virtual object 43. In FIG. 5A, the first virtual object 43 is indicated by a broken line, but in reality, the first virtual object 43 is not visible on the screen of the display means 25.

この状態において第２の仮想物体４４を消去すれば、第１の仮想物体４３を表示手段２５の画面に表示することができる。そこで、図示例では、処理ツールとして、第２の仮想物体４４にのみ作用して第２の仮想物体４４を消去する消去ツールを提供する場合を示している。消去ツールが手に結合されている場合には、図５（ｂ）に示すように、第２の仮想物体４４の表面を手でこする動作を行うことによって、第１の仮想物体４３を覆っていた第２の仮想物体４４が消しゴムで消されるかのように徐々に消去される。図示例では、仮想三次元空間における指標の位置を確認できるように、手の形の指標４２を表示手段２５の画面上に表示している。そして、実空間において手を動かすことにより、この指標４２の通過部位に存在する第２の仮想物体４４を消去することが可能になる。 If the second virtual object 44 is deleted in this state, the first virtual object 43 can be displayed on the screen of the display means 25. Therefore, in the illustrated example, a case is shown in which an erasing tool that operates only on the second virtual object 44 and erases the second virtual object 44 is provided as a processing tool. When the erasing tool is coupled to the hand, as shown in FIG. 5B, the first virtual object 43 is covered by performing an operation of rubbing the surface of the second virtual object 44 with the hand. The second virtual object 44 is erased gradually as if it were erased with an eraser. In the illustrated example, a hand-shaped index 42 is displayed on the screen of the display means 25 so that the position of the index in the virtual three-dimensional space can be confirmed. Then, by moving the hand in the real space, it is possible to erase the second virtual object 44 existing at the passage portion of the index 42.

上述のように消去ツールを用いることによって、実空間での手の動きに合わせて、第２の仮想物体４４がこすり取られるかのような印象で、表示手段２５の画面に徐々に第１の仮想物体４３が表示される。消去ツールを用いることによっても、貼付ツールを用いる場合と同様に、展示会や説明会において、第１の仮想物体４３を登場させる際に印象的な演出を行うことができる。 By using the erasing tool as described above, the first virtual object 43 is gradually displayed on the screen of the display means 25 in step with the movement of the hand in the real space, giving the impression that the second virtual object 44 is being rubbed away. As with the pasting tool, using the erasing tool makes it possible to stage an impressive presentation when the first virtual object 43 is introduced at an exhibition or a briefing session.

The tool providing means 27 also provides, as a processing tool, a deformation tool that deforms a virtual object. When a virtual object having a prescribed shape, such as a shape resembling an actual object or a geometric shape such as a cylinder, a cube, or a sphere, is arranged in the virtual three-dimensional space, the deformation tool has the function of attaching the index to the surface of the virtual object.

At the site where the index is attached to the surface of the virtual object, the deformation tool dents the virtual object when the index is moved in the direction that pushes it into the virtual object, and bulges the virtual object when the index is moved in the direction that pulls the virtual object. The deformation tool also allows the attachment site to be pinched and twisted, so that the virtual object can be deformed as if it were being shaped from clay.
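
As a non-limiting illustration, the push and pull deformation can be sketched as follows. The specification does not state how the deformation is computed; the sketch assumes that the surface is a list of vertices and that vertices near the attachment site follow the index with a linear falloff. The names `deform`, `classify`, `grab_point` and `radius` are hypothetical.

```python
import math

def deform(vertices, grab_point, displacement, radius):
    """Move the vertices near grab_point by displacement.

    The weight is 1 at the attachment site and falls linearly to 0 at
    `radius` (an assumed falloff), so the surface changes smoothly.
    """
    moved = []
    for v in vertices:
        d = math.dist(v, grab_point)
        w = max(0.0, 1.0 - d / radius)
        moved.append(tuple(vi + w * di for vi, di in zip(v, displacement)))
    return moved

def classify(displacement, outward_normal):
    """Return 'dent' when the index moves against the outward surface normal
    (pushing in) and 'bulge' when it moves along it (pulling out)."""
    dot = sum(a * b for a, b in zip(displacement, outward_normal))
    return "dent" if dot < 0 else "bulge"

# Three vertices on a flat surface z = 0 whose outward normal is +z.
surface = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
push = (0, 0, -0.5)          # the index is pushed into the object
dented = deform(surface, grab_point=(0, 0, 0), displacement=push, radius=2)
kind = classify(push, outward_normal=(0, 0, 1))
```

Pinching and twisting would need a rotation about the attachment site instead of a translation and is not covered by this sketch.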

When the deformation tool is used to dent a virtual object, the deformation tool may penetrate the virtual object. Therefore, when the deformation tool has penetrated a virtual object, the display state of the penetrated virtual object is made different from that of the other virtual objects. When the virtual object is composed of a plurality of parts, it is desirable to report whether the deformation tool has penetrated on a part-by-part basis.

Changing the display state in this way alerts the user who is deforming the virtual object. The display state may be changed through, for example, the color of the virtual object or a blinking display of the virtual object.
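
As a non-limiting illustration, part-by-part penetration reporting can be sketched as follows. The sketch assumes that the index pushes along the negative z axis and that each part is described by its footprint and the z coordinate of its far side; the `Part` class, its fields and the state names are hypothetical and are not taken from this specification.

```python
from dataclasses import dataclass

@dataclass
class Part:
    name: str
    x_range: tuple    # (min, max) footprint of the part along x
    y_range: tuple    # (min, max) footprint of the part along y
    bottom_z: float   # far side of the part as seen from the pushing index

def pierced_parts(parts, index_pos):
    """Return the names of the parts whose far side the index has passed."""
    x, y, z = index_pos
    hit = []
    for p in parts:
        inside = (p.x_range[0] <= x <= p.x_range[1]
                  and p.y_range[0] <= y <= p.y_range[1])
        if inside and z < p.bottom_z:
            hit.append(p.name)
    return hit

def display_states(parts, index_pos):
    """Map each part to 'blink' if pierced and 'normal' otherwise."""
    hit = set(pierced_parts(parts, index_pos))
    return {p.name: ("blink" if p.name in hit else "normal") for p in parts}

parts = [
    Part("door", (0.0, 1.0), (0.0, 1.0), bottom_z=-0.1),
    Part("hood", (1.0, 2.0), (0.0, 1.0), bottom_z=-0.1),
]
shallow = display_states(parts, (0.5, 0.5, -0.05))   # dent only
deep = display_states(parts, (0.5, 0.5, -0.2))       # index passed through the door
```

Only the pierced part changes its display state, so the user can see which part needs attention while the other parts stay as they were.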

Because the deformation tool is provided as described above, a desired deformation can be applied to the virtual object displayed on the display means 25 without producing a mock-up model or clay model of the product, which makes it easier to create virtual objects. It is also possible for several people to deform a virtual object while discussing it.

In the operations described above, the input information acquisition means 21 extracts the hand of a human body existing in real space as the specific part. As shown in FIG. 6, however, the head 34 of the human body 33 may be extracted as the specific part instead. When extracting the head 34 as the specific part, the input information acquisition means 21 extracts the position and orientation of the head 34. Once the position and orientation of the head 34 have been extracted, the position and direction from which the person in real space is viewing the screen of the display means 25 can be estimated.

It is therefore desirable for the display processing means 24 to change the range of the virtual three-dimensional space displayed on the display means 25 according to the position and direction from which the screen of the display means 25 is being viewed. That is, the position of the viewpoint is determined from the position of the head 34 and the range displayed on the display means 25 is adjusted accordingly, and the direction of the line of sight is determined from the position of the head 34 and the orientation of the coordinate axes of the virtual three-dimensional space displayed on the display means 25 is adjusted accordingly. These adjustments follow the position and orientation of the head 34 extracted by the input information acquisition means 21.

FIG. 6(a) shows a state in which the head 34 of the human body 33 directly faces the screen of the display means 25, and a virtual object 45 is displayed on the display means 25. When the head 34 is turned downward as shown in FIG. 6(b), the line of sight points downward, so the orientation of the coordinate axes of the virtual three-dimensional space is changed so that the upper surface of the virtual object 45 becomes visible. If the orientation of the virtual object 45 were to change again when the head 34 is raised back to the position facing the display means 25, the user, that is, the human body 33, could not examine the state of the virtual object 45 in detail. It is therefore desirable that the orientation of the coordinate axes not be changed while the head 34 is returned to its original position. In other words, it is desirable to change the orientation of the coordinate axes according to the direction in which the head 34 is turned away from the position facing the display means 25, and then to leave the orientation of the coordinate axes unchanged while the head 34 returns to the position facing the display means 25.
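
As a non-limiting illustration, the behaviour of changing the coordinate axes only while the head turns away from the facing position can be sketched as follows. The sketch is one possible reading of the paragraph above: it assumes a single pitch angle in degrees, with 0 meaning that the head faces the screen and negative values meaning that the head is turned downward. The names `update_view` and `track` are hypothetical.

```python
def update_view(view_angle, prev_head, head):
    """Return the view angle after one tracking step.

    The view follows the head only while the head turns further away from
    the facing position (|angle| increasing). While the head returns towards
    the facing position, the view is held unchanged.
    """
    moving_away = abs(head) > abs(prev_head)
    if moving_away:
        return view_angle + (head - prev_head)
    return view_angle

def track(head_angles, view_angle=0.0):
    """Apply update_view over a sequence of tracked head angles."""
    views = []
    prev = head_angles[0]
    for head in head_angles[1:]:
        view_angle = update_view(view_angle, prev, head)
        views.append(view_angle)
        prev = head
    return views

# The head looks down by 20 degrees and then returns to face the screen.
views = track([0, -10, -20, -10, 0])
```

In this reading the view stays at the tilted orientation after the head returns, so the user can look at the newly exposed surface while facing the screen.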

With the operation described above, the user, that is, the human body 33, can change the orientation of the virtual object 45 displayed on the screen of the display means 25 by moving the head 34. Accordingly, the user viewing the display means 25 can change the viewpoint from which the virtual object 45 is seen simply by moving the head 34 so as to direct the line of sight toward the part of the virtual object 45 that the user wishes to see (the human body 33 may also change its position relative to the display means 25 when doing so).

The operation shown in FIG. 6 uses the head of the human body 33, but the display of the virtual object on the display means 25 may instead be changed according to the position of the hand 35 of the human body 33 in real space. For example, as shown in FIG. 7, the range of the virtual three-dimensional space presented on the display means 25 may be changed according to the distance between the screen of the display means 25 and the hand 35. The display processing means 24 changes the displayed range.

The illustrated example shows a case in which the virtual three-dimensional space displayed on the display means 25 has its viewpoint above a virtual object 46, as in a map or an aerial photograph. In this case, when the distance from the screen of the display means 25 to the hand 35 is large, as in FIG. 7(a), the viewpoint is regarded as being high and the displayed range is widened. Conversely, when the distance from the screen of the display means 25 to the hand 35 is small, as in FIG. 7(b), the viewpoint is regarded as being low and the displayed range is narrowed. The range of the virtual three-dimensional space displayed on the display means 25 may be changed so as to follow the movement of the hand 35.
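
As a non-limiting illustration, the relation between the hand distance and the displayed range can be sketched as a clamped linear mapping. The specification does not give a formula; the function name `display_width`, the distance limits and the width limits below are hypothetical values chosen only for the example.

```python
def display_width(hand_distance, near=0.2, far=1.0,
                  min_width=100.0, max_width=1000.0):
    """Map the screen-to-hand distance to the width of the displayed range.

    hand_distance, near, far: distances from the screen (assumed to be metres).
    min_width, max_width:     displayed widths in assumed world units.
    A larger distance is treated as a higher viewpoint, so the range widens.
    """
    t = (hand_distance - near) / (far - near)
    t = min(1.0, max(0.0, t))     # clamp so the range stays within its limits
    return min_width + t * (max_width - min_width)

wide = display_width(1.0)     # hand far from the screen, as in FIG. 7(a)
narrow = display_width(0.2)   # hand close to the screen, as in FIG. 7(b)
```

Calling the function once per tracked frame makes the displayed range follow the movement of the hand.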

In this way, the range of the virtual three-dimensional space displayed on the display means 25 can be changed according to how near or far the hand 35 is from the display means 25. This operation can be combined with the operation shown in FIG. 6, and the displayed range may be changed using the distance between the display means 25 and the head 34 instead of the hand 35. Alternatively, the displayed range of the virtual three-dimensional space may be moved left and right so as to follow the left and right movement of the hand 35, or the displayed range may be panned or tilted (that is, the orientation of the coordinate axes may be changed about the viewpoint) so as to follow the orientation of the hand 35 (the orientation of the palm).

Furthermore, the orientation of the coordinate axes and the range displayed on the display means 25 may be changed by the movement of parts other than the head 34 and the hand 35, provided that the part is one whose position in real space changes with the movement of at least one of the limbs and the head of the human body 33. Such a part is not limited to a whole limb or the whole head. It may be a portion of a limb or of the head, or an article that moves together with a limb or the head, such as a pointer or a writing instrument held in the hand.

The example shown in FIG. 8 is yet another operation example, in which a three-dimensional model 47 having limbs and a head is arranged in the virtual three-dimensional space as a virtual object. A character, for example, can be used as the illustrated three-dimensional model 47. The input information acquisition means 21 extracts the movements of the limbs and the head of the human body 33 in real space. The tool providing means 27 provides an interlocking tool that links the movements of the limbs and the head of the human body 33 in real space to the movements of the limbs and the head of the three-dimensional model 47.

Accordingly, by moving his or her limbs and head, the user, that is, the human body 33, can move the limbs and head of the three-dimensional model 47, such as a character, displayed on the display means 25 in conjunction with them. In this operation, the three-dimensional model 47 serves as the user's three-dimensional avatar in the virtual three-dimensional space, so the user can control the avatar in the virtual three-dimensional space through the movement of his or her own limbs. In addition, as illustrated, when left and right are reversed between the human body 33 in real space and the three-dimensional model 47 in the virtual three-dimensional space, the three-dimensional model 47 can be moved like a mirror image of the user.
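
As a non-limiting illustration, the left/right reversal used for the mirror-image display can be sketched as follows. The sketch assumes that the tracked pose is a mapping from joint names to (x, y, z) positions, that x is the lateral axis, and that joints are labelled with `left_` and `right_` prefixes; these conventions and the name `mirror_pose` are hypothetical and are not taken from this specification.

```python
def mirror_pose(pose):
    """Return the pose to apply to the model for a mirror-image display.

    Each left joint of the user drives the right joint of the model and
    vice versa, and the lateral (x) coordinate is negated.
    """
    mirrored = {}
    for joint, (x, y, z) in pose.items():
        if joint.startswith("left_"):
            target = "right_" + joint[len("left_"):]
        elif joint.startswith("right_"):
            target = "left_" + joint[len("right_"):]
        else:
            target = joint          # unpaired joints such as the head
        mirrored[target] = (-x, y, z)
    return mirrored

user_pose = {
    "head": (0.0, 1.7, 0.0),
    "left_hand": (-0.4, 1.0, 0.2),
    "right_hand": (0.5, 1.2, 0.1),
}
model_pose = mirror_pose(user_pose)
```

Omitting the call and passing the tracked pose to the model unchanged gives the non-mirrored interlocking described at the start of the paragraph above.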

Each of the operations described above is an example, and the operations can be used in any appropriate combination. The content of the virtual three-dimensional space displayed on the display means 25 is likewise not limited to the examples of the embodiment described above. As described above, the technique of this embodiment allows the movement of an object in real space (in particular, a human body) to be reflected in a virtual object displayed on the screen of the display means 25. A product can therefore be designed by applying various deformations to a virtual object, without forming a model in real space. Moreover, applying the action of a processing tool to a virtual object arranged in the virtual three-dimensional space requires neither operating an input device nor attaching an interface device to the body, so even a young child can manipulate the virtual object. The technique can therefore also be applied to educational games for young children.

DESCRIPTION OF SYMBOLS
10 Distance image generation means
21 Input information acquisition means
22 Projection means
23 Object placement means
24 Display processing means
25 Display means
27 Tool providing means
28 Tool coupling means
29 Pattern classification means
30 Human body model
31 Segment
32 Joint
33 Human body
34 Head
35 Hand
42 Index
43 First virtual object
44 Second virtual object
45 Virtual object
46 Virtual object
47 Three-dimensional model

Claims

A virtual object manipulation device comprising:
distance image generation means for generating a distance image of an object existing in real space;
input information acquisition means for extracting a specific part of the object from the distance image; and
interaction means for causing the movement of the specific part to act on a virtual object in a virtual three-dimensional space and reflecting the result in the display on display means,
wherein the interaction means comprises:
projection means for associating the specific part with an index arranged in a virtual three-dimensional space constructed by a computer, using the pixel values of the specific part extracted by the input information acquisition means;
object placement means for placing a virtual object in the virtual three-dimensional space;
tool providing means for providing a processing tool that exerts a predetermined action on the virtual object placed in the virtual three-dimensional space by the object placement means;
tool coupling means for coupling the processing tool provided by the tool providing means to the index extracted by the input information acquisition means; and
display processing means for reflecting, in a state in which the tool coupling means has coupled the processing tool to the index associated with the virtual three-dimensional space by the projection means, the result of the action of the processing tool on the virtual object displayed on the display means in accordance with the movement of the index,
wherein the object placement means has:
a function of placing the virtual object in the virtual three-dimensional space;
a function of giving the virtual object a transparent texture through which the background is visible; and
a function of placing, as the virtual object, a first virtual object having a prescribed shape and a second virtual object covering the virtual object in the virtual three-dimensional space,
wherein the tool providing means provides, as processing tools:
a pasting tool that maps a predetermined opaque texture onto the virtual object present at a passage site;
an erasing tool that erases the second virtual object present at a passage site; and
a deformation tool that attaches the index to the surface of the virtual object and deforms the virtual object according to the direction of movement of the index, and
wherein the tool coupling means selects the processing tool to be coupled to the index.

The virtual object manipulation device according to claim 1, further comprising pattern classification means for classifying a movement pattern of the specific part using the pixel values of the specific part extracted by the input information acquisition means, wherein the processing tools are associated with movement patterns, and the tool coupling means couples to the index the processing tool corresponding to the movement pattern classified by the pattern classification means.

The virtual object manipulation device according to claim 1 or 2, wherein, when the deformation tool penetrates the virtual object while the virtual object is being deformed with the deformation tool, the display processing means makes the display state of the penetrated virtual object different from the display state of other virtual objects.

The virtual object manipulation device according to any one of claims 1 to 3, wherein the object is a human body and the specific part includes at least the limbs and the head, the virtual object is a three-dimensional model having limbs and a head, and the tool providing means provides, as a processing tool, an interlocking tool that links the movements of the limbs and the head of the human body to the movements of the limbs and the head of the three-dimensional model.

The virtual object manipulation device according to any one of claims 1 to 4, wherein the specific part is a part whose position in real space changes with the movement of at least one of the limbs and the head of a human body, the input information acquisition means extracts the distance of the specific part from the display means and the direction of movement of the specific part, and the display processing means changes the range of the virtual three-dimensional space displayed on the display means so as to follow the position of the specific part extracted by the input information acquisition means, and changes the range and the orientation of the coordinate axes of the virtual three-dimensional space displayed on the display means so as to follow the direction of movement of the specific part extracted by the input information acquisition means.

The virtual object manipulation device according to any one of claims 1 to 4, wherein the object is a human body and the specific part is the head, the input information acquisition means extracts the position and orientation of the head, and the display processing means changes the range of the virtual three-dimensional space displayed on the display means so as to follow a viewpoint determined from the position of the head extracted by the input information acquisition means, and changes the orientation of the coordinate axes of the virtual three-dimensional space displayed on the display means so as to follow a line-of-sight direction determined from the orientation of the head extracted by the input information acquisition means.

The virtual object manipulation device according to any one of claims 1 to 6, wherein the display means performs stereoscopic display of the virtual three-dimensional space.