JP2018107642A

JP2018107642A - Image processing system, control method for image processing system, and program

Info

Publication number: JP2018107642A
Application number: JP2016252282A
Authority: JP
Inventors: 宗士大志万; Soshi Oshima
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-12-27
Filing date: 2016-12-27
Publication date: 2018-07-05

Abstract

PROBLEM TO BE SOLVED: To notify a user of information urging the user to move an object so that the object placed on a document table does not overlap the area where an operation instruction image is projected. SOLUTION: In an image processing system that images an object placed on a document table, an operation instruction image for accepting an operation instruction to read the object on the document table is projected to a projection position, and an operating gesture of a user is recognized. An area where the projection means cannot project the operation instruction image is specified according to the stereoscopic shape of the object placed on the document table. When this area and the area where the operation instruction image is projected are determined to overlap, information urging movement of the object is projected onto the document table. SELECTED DRAWING: Figure 17

Description

The present invention relates to an image processing apparatus, a control method for the image processing apparatus, and a program.

Conventionally, scanners that scan a document and store it as electronic data include line scanners, which use a line sensor for imaging, and camera scanners, which use a two-dimensional imaging sensor. In particular, with a camera scanner in which a camera is arranged above a document table and a document is placed face up on the table and imaged, a single sheet can be scanned quickly simply by placing it on the table, and a thick document such as a book can also easily be placed on the table and scanned.

In the user interface system disclosed in Patent Document 1, a projector projects and displays a menu on a document placed on a document table. The system detects a blank area in the document and keeps the document content and the projected image from overlapping, so that visibility is not impaired.

Patent Document 1: JP-A-2005-236878

However, although the user interface system of Patent Document 1 can control the operation screen projected onto the desk by the projector, the objects placed on the desk are limited to documents such as paper documents and books. The system therefore does not consider, when projecting an operation screen for an object other than a document, such as a three-dimensional object placed on the desk, the portions that fall in the blind spot of the three-dimensional object and are not displayed, or the areas where a gesture operation cannot be detected.
In addition, even when a paper document is placed, it has been difficult to perform control such as dynamically changing the display position according to the position and orientation of the paper document.

The present invention has been made to solve the above problems. An object of the present invention is to provide a mechanism capable of notifying a user of information prompting the user to move a subject so that the area of the subject placed on the document table and the area of the projected operation instruction image do not overlap.

An image processing apparatus of the present invention that achieves the above object has the following configuration.
The image processing apparatus comprises: a first acquisition unit that images a subject placed on a document table and acquires image data; a second acquisition unit that acquires distance image data; a projection unit that projects, onto a projection position, an operation instruction image for accepting an operation instruction to read the subject on the document table; a recognition unit that recognizes, based on the distance image data acquired by the second acquisition unit, a gesture by which a user operates the operation instruction image on the document table; a specifying unit that specifies, according to the three-dimensional shape of the subject placed on the document table, an area onto which the projection unit cannot project the operation instruction image; and a determination unit that determines whether the area specified by the specifying unit overlaps the area onto which the projection unit projects the operation instruction image. When the determination unit determines that the two areas overlap, the projection unit projects information prompting movement of the subject onto the document table.
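The overlap determination described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that both the area the projector cannot reach and the area of the operation instruction image are approximated by axis-aligned rectangles on the document table plane; the names `Rect`, `overlaps`, and `decide_projection` are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle on the document table plane (hypothetical model)."""
    x: float
    y: float
    w: float
    h: float


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True when the interiors of the two rectangles intersect."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def decide_projection(unprojectable: Rect, ui_area: Rect) -> str:
    """Choose what to project: the UI itself, or a prompt to move the subject."""
    if overlaps(unprojectable, ui_area):
        return "prompt_move"   # information prompting movement of the subject
    return "project_ui"        # operation instruction image as planned


if __name__ == "__main__":
    blind_spot = Rect(200, 100, 150, 150)   # assumed blind-spot estimate
    menu = Rect(300, 200, 200, 80)          # assumed UI placement
    print(decide_projection(blind_spot, menu))
```

A real implementation would derive the unprojectable area from the measured three-dimensional shape rather than from a fixed rectangle; the rectangle test only shows the shape of the decision.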

According to the present invention, a user can be notified of information prompting the user to move a subject so that the area of the subject placed on the document table and the area of the projected operation instruction image do not overlap.

A diagram showing an example of the hardware configuration of the controller unit.
A diagram showing an example of the appearance of the camera scanner.
A diagram showing an example of the hardware configuration of the controller unit.
A diagram showing an example of the functional configuration of the camera scanner.
A diagram showing an example of the reading process sequence of the scanner system.
A flowchart explaining the control method of the image processing apparatus.
A diagram explaining the scanning process by the camera.
A flowchart explaining the control method of the image processing apparatus.
A schematic diagram showing the gesture reading principle of the scanner apparatus.
A flowchart explaining the control method of the image processing apparatus.
A flowchart explaining the control method of the image processing apparatus.
A schematic diagram explaining the reading process of the scanner apparatus.
A flowchart explaining the control method of the image processing apparatus.
A schematic diagram explaining the reading process of the scanner apparatus.
A flowchart explaining the control method of the image processing apparatus.
A schematic diagram explaining the reading process of the scanner apparatus.
A flowchart explaining the control method of the image processing apparatus.
A diagram showing an operation screen.
A flowchart explaining the control method of the image processing apparatus.
A schematic diagram explaining the reading process of the scanner apparatus.

Next, the best mode for carrying out the present invention will be described with reference to the drawings.
<Description of system configuration>
[First Embodiment]

FIG. 1 is a diagram illustrating the configuration of a scanner system including an image processing apparatus according to the present embodiment.

As shown in FIG. 1, a camera scanner 101 is connected to a host computer 102 and a printer 103 via a network 104 such as Ethernet (registered trademark). The camera scanner 101 is an example of an image processing apparatus.
In the system of FIG. 1, a scan function, in which the camera scanner 101 reads a planar image or a stereoscopic image of a subject, is executed in response to an instruction from the host computer 102. The system also executes a print function, in which the printer 103 outputs the scan data output by the camera scanner 101. The scan function and the print function can also be executed by a direct instruction to the camera scanner 101, without going through the host computer 102.

(Configuration of camera scanner)
FIG. 2 is a diagram showing an example of the appearance of the camera scanner 101 shown in FIG. 1.
As shown in FIG. 2A, the camera scanner 101 includes a controller unit 201, a camera unit 202, an arm unit 203, a short focus projector 207 (hereinafter referred to as the projector 207), and a distance image sensor unit 208. The controller unit 201, which is the main body of the camera scanner 101, is connected by the arm unit 203 to the camera unit 202 for imaging, the projector 207, and the distance image sensor unit 208. The arm unit 203 can be bent and extended using joints.
Here, the camera unit 202 is an example of an imaging unit that acquires a captured image. The projector 207 is an example of a projection unit that projects an operation screen (operation display), described later, with which the user performs operations. The distance image sensor unit 208 is an example of a three-dimensional measurement unit that acquires a distance image.

FIG. 2A also shows a document table 204 on which the camera scanner 101 is installed. The lenses of the camera unit 202 and the distance image sensor unit 208 face the document table 204 and can read an image of a subject placed in a reading area 205 surrounded by a broken line.
In the example of FIG. 2, the document 206 is placed in the reading area 205 and can therefore be read by the camera scanner 101. A turntable 209 is provided in the document table 204. The turntable 209 can be rotated by an instruction from the controller unit 201. Rotating the turntable 209 changes the angle at which the camera unit 202 photographs an object (subject) placed on the turntable 209. For example, when a three-dimensional subject is placed on the turntable 209, rotating the turntable 209 allows the stationary camera unit 202 to photograph the subject from any direction over 360° around it. Further, as will be described later, when the UI screen projected by the projector 207 falls in a blind spot because of the position of the subject, rotating the turntable 209 makes it possible to project the UI screen onto an area that is not in a blind spot.
The camera unit 202 may capture a subject at a single resolution, but is preferably capable of both high resolution and low resolution image capture.
Although not shown in FIG. 2, the camera scanner 101 may further include an LCD touch panel 330 and a speaker 340, which will be described later.
The document table 204 is configured so that a flat document, a book document, or the like can be placed on it and read as an image by the camera unit 202.

FIG. 2B shows the coordinate systems of the camera scanner 101.
In FIG. 2B, the camera scanner 101 defines a coordinate system for each hardware device: a camera coordinate system, a distance image coordinate system, and a projector coordinate system. In each of these, the image plane captured by the camera unit 202 or by the RGB camera of the distance image sensor unit 208, or the image plane projected by the projector 207, is defined as the XY plane, and the direction orthogonal to the image plane is defined as the Z direction.
Further, so that the three-dimensional data of these independent coordinate systems can be handled in a unified manner, an orthogonal coordinate system is defined in which the plane including the document table 204 is the XY plane and the direction perpendicular to and upward from this XY plane is the Z axis.

As an example of coordinate system conversion, FIG. 2C shows the relationship among the orthogonal coordinate system, the space expressed in the camera coordinate system centered on the camera unit 202, and the image plane captured by the camera unit 202. A three-dimensional point P [X, Y, Z] in the orthogonal coordinate system can be converted into a three-dimensional point Pc [Xc, Yc, Zc] in the camera coordinate system by Equation 1.

Here, R_c and t_c are external parameters determined by the orientation (rotation) and position (translation) of the camera with respect to the orthogonal coordinate system; R_c is called a 3 × 3 rotation matrix and t_c a translation vector. Conversely, a three-dimensional point defined in the camera coordinate system can be converted into the orthogonal coordinate system by Equation 2.
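Equations 1 and 2 are referenced but not reproduced in this text. Given that R_c is a 3 × 3 rotation matrix and t_c a translation vector, they presumably take the standard rigid-transform form; the following is a reconstruction under that assumption and may differ in notation from the published equations.

```latex
% Equation 1 (assumed form): orthogonal coordinate system -> camera coordinate system
\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}
  = \begin{bmatrix} R_c & t_c \end{bmatrix}
    \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
  = R_c \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + t_c

% Equation 2 (assumed form): camera coordinate system -> orthogonal coordinate system
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
  = \begin{bmatrix} R_c^{-1} & -R_c^{-1} t_c \end{bmatrix}
    \begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix}
  = R_c^{-1} \left( \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} - t_c \right)
```

Because R_c is a rotation matrix, its inverse equals its transpose, so the second form can be evaluated without a general matrix inversion.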

Further, the two-dimensional camera image plane captured by the camera unit 202 is the result of the camera unit 202 converting three-dimensional information in three-dimensional space into two-dimensional information. That is, a three-dimensional point Pc [X_c, Y_c, Z_c] in the camera coordinate system is converted into two-dimensional coordinates pc [x_p, y_p] on the camera image plane by the perspective projection of Equation 3.

Here, A is called the internal parameter of the camera unit 202 and is a 3 × 3 matrix expressed by the focal length, the image center, and the like.
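Equation 3 is likewise not reproduced in this text. With A as the 3 × 3 internal parameter matrix, a perspective projection is conventionally written as below; the scale factor λ and the entries f_x, f_y, c_x, c_y are assumed symbols introduced here for illustration, not taken from the specification.

```latex
% Equation 3 (assumed form): perspective projection onto the camera image plane
\lambda \begin{bmatrix} x_p \\ y_p \\ 1 \end{bmatrix}
  = A \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix},
\qquad
A = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
```

With this form of A, λ equals Z_c, so x_p = f_x X_c / Z_c + c_x and y_p = f_y Y_c / Z_c + c_y.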

As described above, by using Equations 1 and 3, a three-dimensional point group expressed in the orthogonal coordinate system can be converted into three-dimensional point group coordinates in the camera coordinate system or onto the camera image plane. The internal parameters of each hardware device and its position and orientation with respect to the orthogonal coordinate system (external parameters) are assumed to have been calibrated in advance by a known calibration method.
Hereinafter, unless otherwise noted, the term three-dimensional point group refers to three-dimensional data in the orthogonal coordinate system.
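The chain of conversions described above (orthogonal coordinate system to camera coordinate system to camera image plane, and back) can be sketched in a few lines. This is an illustration only: it assumes the standard rigid-transform and pinhole forms for Equations 1 to 3, and the function names and the numeric calibration values are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]


def mat_vec(m: Mat3, v: Sequence[float]) -> Vec3:
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


def transpose(m: Mat3) -> Mat3:
    """Transpose of a 3x3 matrix (equals the inverse for a rotation matrix)."""
    return [[m[c][r] for c in range(3)] for r in range(3)]


def world_to_camera(p: Vec3, r_c: Mat3, t_c: Vec3) -> Vec3:
    """Assumed Equation 1: Pc = R_c * P + t_c."""
    rp = mat_vec(r_c, p)
    return (rp[0] + t_c[0], rp[1] + t_c[1], rp[2] + t_c[2])


def camera_to_world(pc: Vec3, r_c: Mat3, t_c: Vec3) -> Vec3:
    """Assumed Equation 2: P = R_c^-1 * (Pc - t_c)."""
    d = (pc[0] - t_c[0], pc[1] - t_c[1], pc[2] - t_c[2])
    return mat_vec(transpose(r_c), d)


def project(pc: Vec3, a: Mat3) -> Tuple[float, float]:
    """Assumed Equation 3: divide A * Pc by its third component."""
    h = mat_vec(a, pc)
    if h[2] == 0:
        raise ValueError("point lies in the camera's principal plane")
    return (h[0] / h[2], h[1] / h[2])


if __name__ == "__main__":
    # Hypothetical calibration: camera 500 units above the table, looking down.
    R_C = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
    T_C = (0, 0, 500)
    A = [[1000, 0, 320], [0, 1000, 240], [0, 0, 1]]
    pc = world_to_camera((100, 50, 0), R_C, T_C)
    print(pc, project(pc, A), camera_to_world(pc, R_C, T_C))
```

In practice R_c, t_c, and A would come from the calibration step mentioned above rather than from hand-written constants.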

(Hardware configuration of camera scanner controller)
FIG. 3 is a diagram illustrating an example of the hardware configuration of the controller unit 201, which is the main body of the camera scanner 101, and related components.

As shown in FIG. 3, the controller unit 201 includes a CPU 302, a RAM 303, a ROM 304, and an HDD 305 connected to a system bus 301. The controller unit 201 further includes a network I/F 306, an image processor 307, a camera I/F 308, and a serial I/F 310 connected to the system bus 301. The controller unit 201 further includes a display controller 309, an audio controller 311, and a USB controller 312 connected to the system bus 301.

The CPU 302 is a central processing unit that controls the overall operation of the controller unit 201. The RAM 303 is a volatile memory. The ROM 304 is a nonvolatile memory that stores a startup program for the CPU 302. The HDD 305 is a hard disk drive (HDD) with a larger capacity than the RAM 303. The HDD 305 stores the control program for the camera scanner 101 that is executed by the controller unit 201. The CPU 302 executes the programs stored in the ROM 304 and the HDD 305, thereby realizing the functional configuration of the camera scanner 101 and the processing (information processing) of the flowcharts described later.

At startup, such as when the power is turned on, the CPU 302 executes the startup program stored in the ROM 304. The startup program causes the CPU 302 to read the control program stored in the HDD 305 and load it into the RAM 303. After executing the startup program, the CPU 302 goes on to execute the control program loaded into the RAM 303 and performs control. The CPU 302 also stores data used by the control program in the RAM 303 and reads and writes it there. The HDD 305 can further store various settings required for the operation of the control program, as well as image data generated from camera input, all of which are read and written by the CPU 302. The CPU 302 communicates with other devices on the network 104 via the network I/F 306.

The image processor 307 reads image data stored in the RAM 303, processes it, and writes it back to the RAM 303. The image processing executed by the image processor 307 includes rotation, scaling, color conversion, and the like.

The camera I/F 308 is connected to the camera unit 202 and the distance image sensor unit 208. In response to an instruction from the CPU 302, it acquires image data from the camera unit 202 and distance image data from the distance image sensor unit 208 and writes them to the RAM 303. The camera I/F 308 also transmits control commands from the CPU 302 to the camera unit 202 and the distance image sensor unit 208 to configure them.

The controller unit 201 can further include at least one of the display controller 309, the serial I/F 310, the audio controller 311, and the USB controller 312.

The display controller 309 controls the display of image data on a display in response to an instruction from the CPU 302. Here, the display controller 309 is connected to the projector 207 and the LCD touch panel 330.

The serial I/F 310, which inputs and outputs serial signals, is connected to the turntable 209 and transmits the CPU 302's instructions for starting and ending rotation and for the rotation angle to the turntable 209. The serial I/F 310 is also connected to the LCD touch panel 330; when the LCD touch panel 330 is pressed, the CPU 302 acquires the pressed coordinates via the serial I/F 310. The audio controller 311 is connected to the speaker 340, converts audio data into an analog audio signal in response to an instruction from the CPU 302, and outputs sound through the speaker 340.

The USB controller 312 controls external USB devices in response to an instruction from the CPU 302. Here, the USB controller 312 is connected to an external memory 350 such as a USB memory or an SD card, and reads and writes data from and to the external memory 350.

(Functional configuration of camera scanner)
FIG. 4A is a diagram illustrating an example of a functional configuration 401 of the camera scanner 101, realized by the CPU 302 shown in FIG. 3 executing the control program. FIG. 4B is a diagram illustrating an example of the sequence between the modules of the functional configuration 401 shown in FIG. 4A.
As described above, the control program for the camera scanner 101 is stored in the HDD 305, and the CPU 302 loads it into the RAM 303 and executes it at startup. The main control unit 402 is the center of control and controls the other modules in the functional configuration 401 according to the sequence example shown in FIG. 4B.

In FIG. 4A, the image acquisition unit 416 is a module that performs image input processing and is composed of a camera image acquisition unit 407 and a distance image acquisition unit 408. The camera image acquisition unit 407 acquires image data captured by the camera unit 202 via the camera I/F 308 and stores it in the RAM 303 (captured image acquisition process). The distance image acquisition unit 408 acquires distance image data detected by the distance image sensor unit 208 via the camera I/F 308 and stores it in the RAM 303 (distance image acquisition process). Details of the processing of the distance image acquisition unit 408 will be described later with reference to FIG. 5.

The recognition processing unit 417 is a module that detects and recognizes the movement of objects on the document table 204 from the image data acquired by the camera image acquisition unit 407 and the distance image acquisition unit 408, and is composed of a gesture recognition unit 409 and an object detection unit 410. The gesture recognition unit 409 continuously acquires images of the document table 204 from the image acquisition unit 416 and notifies the main control unit 402 when it detects a gesture, such as a fingertip touch by the user on the projected UI screen.
When the object detection unit 410 receives a notification of object placement waiting processing or object removal waiting processing from the main control unit 402, it acquires image data of the document table 204 from the image acquisition unit 416. The object detection unit 410 then detects the timing at which an object is placed on the document table 204 and comes to rest, or the timing at which the object is removed. Details of the processing of the gesture recognition unit 409 and the object detection unit 410 will be described later with reference to FIGS. 6 to 8.
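One way to picture the "placed and come to rest" timing is as a pair of thresholds on per-frame difference values. The sketch below is an assumption for illustration, not the method of the specification, whose details are given with the later figures: it takes precomputed difference scores as input, and the function name, thresholds, and frame count are all hypothetical.

```python
from typing import Optional, Sequence


def detect_placed_frame(diff_to_background: Sequence[float],
                        diff_to_previous: Sequence[float],
                        presence_thresh: float,
                        still_thresh: float,
                        still_frames: int) -> Optional[int]:
    """Return the index of the first frame at which an object is present
    (differs from the empty-table background) and has stayed still
    (barely differs from the previous frame) for still_frames consecutive
    frames. Return None if that never happens."""
    run = 0
    for i, (bg, prev) in enumerate(zip(diff_to_background, diff_to_previous)):
        if bg > presence_thresh and prev < still_thresh:
            run += 1
            if run >= still_frames:
                return i
        else:
            run = 0  # object absent or still moving: restart the count
    return None


if __name__ == "__main__":
    # Hypothetical scores: a hand brings an object in around frame 2,
    # then the object rests on the table.
    bg = [0, 5, 40, 42, 41, 41, 41]
    prev = [0, 5, 35, 2, 1, 0, 0]
    print(detect_placed_frame(bg, prev, presence_thresh=10,
                              still_thresh=3, still_frames=3))
```

Detecting removal could reuse the same loop with the presence condition inverted.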

The scan processing unit 418 is a module that actually scans an object, and is composed of a flat document image photographing unit 411, a book image photographing unit 412, and a three-dimensional shape measuring unit 413. The flat document image photographing unit 411 executes processing (reading processing) suited to a flat document and outputs image data (a read image) in a format corresponding to the flat document.
The book image photographing unit 412 executes processing (reading processing) suited to a book and outputs data (a read image) in a format corresponding to the book document.
The three-dimensional shape measuring unit 413 executes processing (reading processing) suited to a three-dimensional object and outputs data (a read image) in a format corresponding to the three-dimensional object. Details of the processing of these modules will be described later with reference to FIGS. 9 to 14.

The user interface unit 403 is composed of a GUI component generation/display unit 414 and a projection area detection unit 415. The GUI component generation/display unit 414 receives a request from the main control unit 402 and generates GUI components such as messages and buttons. A GUI component here is an example of an object constituting the operation display. The GUI component generation/display unit 414 then requests the display unit 406 to display the generated GUI components. The display location of the GUI components on the document table 204 is detected by the projection area detection unit 415.

The display unit 406 displays the requested GUI components on the projector 207 or the LCD touch panel 330 via the display controller 309. Since the projector 207 is installed facing the document table 204, it can project GUI components onto the document table 204.
The user interface unit 403 also receives gesture operations such as touches recognized by the gesture recognition unit 409, or input operations from the LCD touch panel 330 via the serial I/F 310, together with their coordinates. The user interface unit 403 then determines the operation content (such as which button was pressed) by matching the operation coordinates against the content of the operation screen being drawn. The user interface unit 403 notifies the main control unit 402 of this operation content, and in this way accepts operations made with the operator's fingertip. The network communication unit 404 communicates with other devices on the network 104 by TCP/IP via the network I/F 306.
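Matching operation coordinates against the operation screen amounts to a hit test over the GUI components currently drawn. The following is a minimal sketch under the assumption that each button occupies an axis-aligned rectangle in the same coordinate system as the reported touch; `Button` and `hit_test` are hypothetical names, not part of the described apparatus.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Button:
    """A drawn GUI component with its bounding box (hypothetical model)."""
    name: str
    x: float
    y: float
    w: float
    h: float


def hit_test(buttons: Iterable[Button], tx: float, ty: float) -> Optional[str]:
    """Return the name of the first button containing the touch point,
    or None when the touch falls outside every button."""
    for b in buttons:
        if b.x <= tx < b.x + b.w and b.y <= ty < b.y + b.h:
            return b.name
    return None


if __name__ == "__main__":
    screen = [Button("scan", 10, 10, 80, 40), Button("cancel", 100, 10, 80, 40)]
    print(hit_test(screen, 120, 30))  # touch inside the "cancel" button
    print(hit_test(screen, 95, 30))   # touch in the gap between buttons
```

The returned name corresponds to the operation content that would be reported to the main control unit.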

The data management unit 405 stores and manages, in a predetermined area on the HDD 305, various data such as work data generated when the CPU 302 executes the control program. The data stored in the HDD 305 is, for example, scan data generated by the flat document image photographing unit 411, the book image photographing unit 412, and the three-dimensional shape measuring unit 413.

(Description of distance image sensor and distance image acquisition unit)
FIG. 3 shows, as the distance image sensor unit 208, an example of a distance image sensor that uses an infrared pattern projection method.
In FIG. 3, the infrared pattern projection unit 361 of the distance image sensor unit 208 projects a three-dimensional measurement pattern onto an object using infrared light, which is invisible to the human eye. The infrared camera 362 is a camera that reads the three-dimensional measurement pattern projected onto the object. The RGB camera 363 is a camera that captures visible light as RGB signals. The processing of the distance image acquisition unit 408 will be described later with reference to the flowchart of FIG. 5A.

FIG. 5A is a flowchart illustrating a control method of the image processing apparatus according to the present embodiment. This example shows distance image acquisition processing by the image acquisition unit 416. Each step is realized by the CPU 302 executing a stored control program. In the following, the unit that performs each step is described with reference to FIG. 4A. Parts (a) to (c) of FIG. 5B illustrate the measurement principle of the distance image in the pattern projection method.

When the distance image acquisition unit 408 starts processing, in S501 it uses the infrared pattern projection unit 361 to project an infrared three-dimensional shape measurement pattern 522 onto the object 521, as shown in FIG. 5B(a).
In S502, the distance image acquisition unit 408 acquires an RGB camera image 523, obtained by photographing the object 521 with the RGB camera 363, and an infrared camera image 524, obtained by photographing with the infrared camera 362 the three-dimensional shape measurement pattern 522 projected in S501.
Because the infrared camera 362 and the RGB camera 363 are installed at different positions, the RGB camera image 523 and the infrared camera image 524 cover different shooting areas, as shown in FIG. 5B(b).

In S503, the distance image acquisition unit 408 aligns the infrared camera image 524 with the coordinate system of the RGB camera image 523, using the coordinate system conversion from the coordinate system of the infrared camera 362 to that of the RGB camera 363 (see Equations 1 to 3 above). The relative positions of the infrared camera 362 and the RGB camera 363, and the internal parameters of each, are assumed to be known from a prior calibration process.

In S504, the distance image acquisition unit 408 extracts corresponding points between the three-dimensional shape measurement pattern 522 and the infrared camera image 524 that underwent coordinate conversion in S503, as shown in FIG. 5B(c).
For example, the distance image acquisition unit 408 searches the three-dimensional shape measurement pattern 522 for a point on the infrared camera image 524 and associates the two when the same point is found. Alternatively, the distance image acquisition unit 408 may search the three-dimensional shape measurement pattern 522 for the pattern surrounding a pixel of the infrared camera image 524 and associate that pixel with the portion of highest similarity.

In S505, the distance image acquisition unit 408 calculates the distance from the infrared camera 362 by triangulation, taking the straight line connecting the infrared pattern projection unit 361 and the infrared camera 362 as the base line 525. For each pixel that was associated in S504, the distance image acquisition unit 408 calculates the distance between the infrared camera 362 and the object 521 at the position corresponding to that pixel, and stores the distance as the pixel value.

For each pixel that could not be associated, on the other hand, the distance image acquisition unit 408 stores an invalid value, marking it as a part where the distance could not be measured. By doing this for all pixels of the infrared camera image 524 that underwent coordinate conversion in S503, the distance image acquisition unit 408 generates distance image data in which each pixel holds a distance value (distance information).
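
The distance calculation in S505 and the handling of unmatched pixels can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a simplified rectified projector-camera geometry in which depth follows Z = f * B / d, and the names `INVALID`, `depth_from_disparity`, and `build_distance_row` are hypothetical.

```python
from typing import List, Optional

# Assumed sentinel for "distance could not be measured" (the text only says
# that an invalid value is stored; the concrete value is hypothetical).
INVALID = 0.0


def depth_from_disparity(focal_px: float, baseline_mm: float,
                         disparity_px: Optional[float]) -> float:
    """Triangulated depth for one pixel under a rectified geometry.

    disparity_px is the horizontal offset between where a pattern feature
    was projected and where the infrared camera observed it. None means
    that no corresponding point was found for this pixel.
    """
    if disparity_px is None or disparity_px <= 0:
        return INVALID
    return focal_px * baseline_mm / disparity_px


def build_distance_row(focal_px: float, baseline_mm: float,
                       disparities: List[Optional[float]]) -> List[float]:
    """Apply the per-pixel calculation to one image row."""
    return [depth_from_disparity(focal_px, baseline_mm, d)
            for d in disparities]
```

A longer base line or a larger focal length gives a larger disparity for the same depth, which is why the base line 525 appears explicitly in the calculation.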

In S506, the distance image acquisition unit 408 stores the RGB values of the RGB camera image 523 in each pixel of the distance image data, generating a distance image with four values per pixel: R, G, B, and distance. The distance image data acquired here is referenced to the distance image sensor coordinate system defined by the RGB camera 363 of the distance image sensor unit 208.

Therefore, in S507, the distance image acquisition unit 408 converts the distance information obtained in the distance image sensor coordinate system into a three-dimensional point group in the orthogonal coordinate system, as described above with reference to FIG. 2(c). Hereinafter, unless otherwise specified, the term three-dimensional point group refers to a three-dimensional point group in the orthogonal coordinate system.
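
The conversion in S507 can be sketched as a pinhole back-projection followed by a rigid transform. This is a minimal sketch under stated assumptions: the intrinsic parameters `fx`, `fy`, `cx`, `cy` and the extrinsic convention p_sensor = R * p_orthogonal + t are assumptions made for illustration, and all function names are hypothetical.

```python
def pixel_to_sensor(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with a depth value into sensor coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def sensor_to_orthogonal(p_s, R, t):
    """Invert p_s = R * p + t. R is a rotation, so its inverse is R^T."""
    d = [p_s[i] - t[i] for i in range(3)]
    return tuple(sum(R[k][i] * d[k] for k in range(3)) for i in range(3))


def depth_image_to_points(depth, fx, fy, cx, cy, R, t, invalid=0.0):
    """Convert a depth image (list of rows) into a 3D point group,
    skipping pixels that hold the invalid value."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z == invalid:
                continue
            p_s = pixel_to_sensor(u, v, z, fx, fy, cx, cy)
            points.append(sensor_to_orthogonal(p_s, R, t))
    return points
```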

As described above, the present embodiment adopts the infrared pattern projection method for the distance image sensor unit 208, but a distance image sensor of another type may also be used. For example, a stereo method that performs stereoscopic vision with two RGB cameras, or a TOF (Time of Flight) method that measures distance by detecting the flight time of laser light, may be used.

(Description of gesture recognition unit 409)
FIG. 6 is a flowchart illustrating a control method of the image processing apparatus according to the present embodiment. This example corresponds to the details of the processing of the gesture recognition unit 409. Each step is realized by the CPU 302 executing a stored control program.

When processing starts, in S601 the gesture recognition unit 409 performs a predetermined initialization process, in which it acquires one frame of distance image data from the distance image acquisition unit 408. Because no object is placed on the document table 204 when the gesture recognition unit 409 starts processing, it recognizes the plane of the document table 204 as the initial state.
That is, the gesture recognition unit 409 extracts the widest plane from the acquired distance image data, calculates its position and normal vector (hereinafter referred to as the plane parameters of the document table 204), and stores them in the RAM 303.

In S602, the gesture recognition unit 409 acquires the three-dimensional point group of the objects present on the document table 204, as shown in S621 to S622. Specifically, in S621, the gesture recognition unit 409 acquires one frame of distance image data and the three-dimensional point group from the distance image acquisition unit 408. In S622, the gesture recognition unit 409 uses the plane parameters of the document table 204 to remove, from the acquired three-dimensional point group, the points lying on the plane that contains the document table 204.
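
The plane removal in S622 can be sketched with a point-to-plane distance test. This is a minimal sketch: the plane parameters are represented as a point on the plane plus a normal vector, and the `tolerance` parameter is an assumption added to absorb sensor noise (the text only says that points on the plane are removed). The function names are hypothetical.

```python
import math


def signed_distance(point, plane_point, normal):
    """Signed distance from a 3D point to the plane (plane_point, normal)."""
    length = math.sqrt(sum(c * c for c in normal))
    dot = sum((point[i] - plane_point[i]) * normal[i] for i in range(3))
    return dot / length


def remove_plane_points(points, plane_point, normal, tolerance=5.0):
    """Keep only the points farther than `tolerance` from the table plane."""
    return [p for p in points
            if abs(signed_distance(p, plane_point, normal)) > tolerance]
```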

In S603, the gesture recognition unit 409 detects the shape of the user's hand and the fingertips from the acquired three-dimensional point group, as shown in S631 to S634. The processing of S603 is described with reference to FIG. 7, which schematically illustrates the fingertip detection method.

In S631, the gesture recognition unit 409 obtains the three-dimensional point group of the operator's hand by extracting, from the three-dimensional point group acquired in S602, the skin-colored points located at or above a predetermined height from the plane containing the document table 204. In FIG. 7(a), the three-dimensional point group 701 represents the extracted three-dimensional point group of the hand.

In S632, the gesture recognition unit 409 generates a two-dimensional image by projecting the extracted three-dimensional point group of the hand onto the plane of the document table 204, and detects the outer shape 702 of the hand. In FIG. 7(a), the outer shape 702 is detected from the three-dimensional point group projected onto the plane of the document table 204. The projection is performed by projecting each coordinate of the point group using the plane parameters of the document table 204. As shown in FIG. 7(b), if only the XY coordinate values are taken from the projected three-dimensional point group, the gesture recognition unit 409 can treat the result as a two-dimensional image 703 viewed from the Z-axis direction.

At this time, the gesture recognition unit 409 stores in the RAM 303 which coordinate of the two-dimensional image projected onto the plane of the document table 204 each point of the three-dimensional point group of the hand corresponds to.

In S633, for each point on the detected outer shape 702 of the hand, the gesture recognition unit 409 calculates the curvature of the outer shape at that point and detects, as a fingertip, a point whose radius of curvature is smaller than a predetermined value.
FIG. 7(c) schematically illustrates how fingertips are detected from the curvature of the outer shape 702. The points of interest 704 are some of the points representing the outer shape of the two-dimensional image 703 projected onto the plane of the document table 204. Consider drawing a circle that contains five adjacent points among the points representing the outer shape, such as the points of interest 704.

Circles 705 and 707 are examples. Such a circle is drawn in turn for every point of the outer shape, and a point is taken to be a fingertip when the diameter of its circle (for example, the diameters 706 and 708) is smaller than a predetermined value, that is, when the radius of curvature is small.
This example uses five adjacent points, but the number is not limited to five. Although curvature is used here, fingertips may also be detected by fitting an ellipse to the outer shape.
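
The circle test of S633 can be sketched as follows. This is a minimal sketch, not the patent's method in detail: instead of fitting a circle that contains all five adjacent points, it assumes the circle through the first, middle, and last point of each five-point window (the circumscribed circle), and the names `circumcircle_diameter` and `fingertip_candidates` are hypothetical.

```python
import math


def circumcircle_diameter(a, b, c):
    """Diameter of the circle through three 2D points (inf if collinear)."""
    ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
    # Twice the triangle area, from the 2D cross product.
    cross = abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))
    if cross == 0:
        return math.inf
    # R = ab*bc*ca / (4 * area), so the diameter is ab*bc*ca / (2 * area).
    return ab * bc * ca / cross


def fingertip_candidates(contour, max_diameter, span=5):
    """Indices of closed-contour points whose local circle is small."""
    n, half = len(contour), span // 2
    hits = []
    for i in range(n):
        a = contour[(i - half) % n]
        c = contour[(i + half) % n]
        if circumcircle_diameter(a, contour[i], c) < max_diameter:
            hits.append(i)
    return hits
```

A small circle alone does not distinguish a convex fingertip from the concave valley between two fingers, so a practical detector would add a convexity check. That check is outside what the text describes.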

In S634, the gesture recognition unit 409 calculates the number of detected fingertips and the coordinates of each fingertip. As described above, the correspondence between each point of the two-dimensional image projected onto the document table 204 and each point of the three-dimensional point group of the hand is stored in the RAM 303, so the gesture recognition unit 409 can obtain the three-dimensional coordinates of each fingertip. The method described here detects fingertips from an image obtained by projecting the three-dimensional point group onto a two-dimensional image, but the image used for fingertip detection is not limited to this.

For example, the hand region may be extracted by background subtraction of the distance image data or from the skin-colored region of the RGB image, and the fingertips in the hand region may be detected by the same method as above (calculating the curvature of the outer shape, and so on). In this case, the detected fingertip coordinates are coordinates on a two-dimensional image, such as the RGB image or the distance image data, so the gesture recognition unit 409 needs to convert them into three-dimensional coordinates in the orthogonal coordinate system using the distance information of the distance image at those coordinates. The center of the curvature circle used to detect the fingertip may be taken as the fingertip point instead of the point on the outer shape.

In S604, the gesture recognition unit 409 performs gesture determination based on the detected hand shape and fingertips, as shown in S641 to S646.
In S641, the gesture recognition unit 409 determines whether exactly one fingertip was detected in S603. If the gesture recognition unit 409 determines that the number of fingertips is not one, it advances the process to S646, determines that there is no gesture, and advances the process to S605.
If the gesture recognition unit 409 determines in S641 that exactly one fingertip was detected, it advances the process to S642 and calculates the distance between the detected fingertip and the plane containing the document table 204.

In S643, the gesture recognition unit 409 determines whether the distance calculated in S642 is equal to or less than a predetermined value. If it is, the gesture recognition unit 409 advances the process to S644, where it determines that there is a touch gesture, that is, that the fingertip has touched the document table 204.

If the gesture recognition unit 409 determines in S643 that the distance calculated in S642 is greater than the predetermined value, it advances the process to S645. In S645, the gesture recognition unit 409 determines that there is a fingertip movement gesture (the fingertip is above the document table 204 but is not touching it).

In S605, the gesture recognition unit 409 notifies the main control unit 402 of the determined gesture, returns the process to S602, and repeats the gesture recognition process.
Through the above processing, the gesture recognition unit 409 can recognize the user's gestures based on the distance image data.
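
The decision flow of S641 to S646 can be sketched as a single function. This is a minimal sketch: the gesture labels, the `touch_distance` default, and the function names are hypothetical, and the plane is represented as a point plus a normal vector.

```python
import math


def plane_distance(point, plane_point, normal):
    """Unsigned distance from a 3D point to the document table plane."""
    length = math.sqrt(sum(c * c for c in normal))
    dot = sum((point[i] - plane_point[i]) * normal[i] for i in range(3))
    return abs(dot) / length


def classify_gesture(fingertips, plane_point, normal, touch_distance=10.0):
    """Return 'none', 'touch', or 'move' for one frame of fingertips."""
    if len(fingertips) != 1:                                   # S641 -> S646
        return "none"
    d = plane_distance(fingertips[0], plane_point, normal)     # S642
    if d <= touch_distance:                                    # S643 -> S644
        return "touch"
    return "move"                                              # S645
```

For example, with the table plane at Z = 0, a single fingertip at height 5 is classified as a touch and one at height 40 as a movement, given the assumed threshold of 10.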

[Processing of object detection unit]
FIG. 8 is a flowchart illustrating a control method of the image processing apparatus according to the present embodiment. This example corresponds to the details of the processing of the object detection unit 410 shown in FIG. 4A. Each step is realized by the CPU 302 executing a stored control program.

When the object detection unit 410 starts processing, in S801 of FIG. 8(a) it performs the initialization process shown in S811 to S813.
Specifically, in S811, the object detection unit 410 acquires one frame of camera image data from the camera image acquisition unit 407 and one frame of distance image data from the distance image acquisition unit 408. In S812, the object detection unit 410 stores the acquired camera image data in the RAM 303 as the previous frame camera image data.
In S813, the object detection unit 410 stores the acquired camera image data and distance image data in the RAM 303 as the document table background camera image and the document table background distance image, respectively. Hereinafter, “document table background camera image” and “document table background distance image” refer to the camera image data and distance image data acquired here.
In S802, the object detection unit 410 detects that an object has been placed on the document table 204 (object placement detection processing). The details of this processing are described later with reference to FIG. 8(b).

In S803, the object detection unit 410 determines the type of the object placed on the document table 204 (object type determination processing). The details of this processing are described later with reference to FIG. 8(c). In S804, the object detection unit 410 detects that the object on the document table 204, whose placement was detected in S802, has been removed (object removal detection processing). The details of this processing are described later with reference to FIG. 8(d).

[Object placement detection processing]
FIG. 8(b) is a flowchart showing the object placement detection processing of S802 shown in FIG. 8(a). Each step is realized by the CPU 302 executing a stored control program.
When the object detection unit 410 starts the object placement detection processing, in S821 it acquires one frame of camera image data from the camera image acquisition unit 407. In S822, the object detection unit 410 calculates the difference between the acquired camera image data and the camera image data of the previous frame, and sums the absolute values to obtain a difference value.

In S823, the object detection unit 410 determines whether the difference value calculated in S822 is equal to or greater than a predetermined value (threshold). If the object detection unit 410 determines that the difference value is less than the predetermined value, it determines that there is no object on the document table 204 and advances the process to S828. In S828, the object detection unit 410 stores the camera image data of the current frame as the camera image data of the previous frame, returns the process to S821, and continues.
If, on the other hand, the object detection unit 410 determines in S823 that the difference value calculated in S822 is equal to or greater than the predetermined value, it advances the process to S824.
In S824, the object detection unit 410 calculates the difference value between the camera image data acquired in S821 and the camera image data of the previous frame, in the same manner as in S822.

In S825, the object detection unit 410 determines whether the calculated difference value is equal to or less than a predetermined value. If the object detection unit 410 determines in S825 that the difference value is greater than the predetermined value, it determines that the object on the document table 204 is moving and advances the process to S828. In S828, the object detection unit 410 stores the camera image data of the current frame as the camera image data of the previous frame, returns the process to S821, and continues.
If, on the other hand, the object detection unit 410 determines in S825 that the difference value is equal to or less than the predetermined value, it advances the process to S826.

In S826, the object detection unit 410 determines whether the determination in S825 that the difference value is equal to or less than the predetermined value has been made consecutively for a predetermined number of frames. In other words, the object detection unit 410 determines whether the object on the document table 204 has come to rest.
If the object detection unit 410 determines in S826 that the object on the document table 204 has not remained stationary for the predetermined number of frames, it advances the process to S828. In S828, the object detection unit 410 stores the camera image data of the current frame in the RAM 303 as the camera image data of the previous frame, and returns the process to S821.
If, on the other hand, the object detection unit 410 determines in S826 that the object on the document table 204 has remained stationary for the predetermined number of frames, it advances the process to S827. In S827, the object detection unit 410 notifies the main control unit 402 that an object has been placed on the document table 204, and ends the object placement detection processing.
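
The placement detection loop of S821 to S828 can be sketched as a small state machine over frame differences. This is one possible reading of the flowchart, not a confirmed implementation: it assumes that a large frame-to-frame change first signals that something has entered the scene, and that placement is reported once the following frames stay still for a set number of frames. Frames are plain lists of rows of grey levels, and all names and thresholds are hypothetical.

```python
def sum_abs_diff(a, b):
    """Sum of absolute pixel differences between two equally sized frames."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def detect_placement(frames, motion_threshold, still_threshold, still_frames):
    """Return the index of the frame at which placement is reported,
    or None if the sequence ends first."""
    prev = frames[0]
    entered = False   # has a large change been seen yet?
    still = 0         # consecutive still frames since then
    for i, cur in enumerate(frames[1:], start=1):
        diff = sum_abs_diff(cur, prev)        # S822 / S824
        if not entered:
            if diff >= motion_threshold:      # S823
                entered = True
        elif diff <= still_threshold:         # S825
            still += 1
            if still >= still_frames:         # S826 -> S827
                return i
        else:
            still = 0                         # object is still moving
        prev = cur                            # S828
    return None
```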

[Object type determination processing]
FIG. 8(c) is a flowchart showing an example of the object type determination processing of S803. Each step is realized by the CPU 302 executing a stored control program.

When the object detection unit 410 starts the object type determination processing, in S831 it acquires one frame of distance image data and the three-dimensional point group via the distance image acquisition unit 408.

In S832, the object detection unit 410 takes, as the height of the object, the height of the point farthest above the plane of the document table 204 among the three-dimensional point group belonging to the object on the document table 204, and determines whether that height is equal to or less than a predetermined value. If the object detection unit 410 determines in S832 that the height of the object is equal to or less than the predetermined value, it advances the process to S833. In S833, the object detection unit 410 notifies the main control unit 402 that a flat document has been placed on the document table 204, and ends the object type determination processing.
If, on the other hand, the object detection unit 410 determines in S832 that the height of the object is greater than the predetermined value, it advances the process to S834. In S834, the object detection unit 410 notifies the main control unit 402 that a three-dimensional object has been placed on the document table 204, and ends the object type determination processing.
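
The height test of S832 to S834 can be sketched as below. This is a minimal sketch that assumes the point group is expressed in an orthogonal coordinate system whose Z axis is the height above the document table; the labels, the `flat_max_height` default, and the function name are hypothetical.

```python
def classify_object_type(points, flat_max_height=5.0):
    """Classify the placed object from the highest point of its point group.

    points: iterable of (x, y, z) tuples, z being the height above the table.
    """
    object_height = max(z for _, _, z in points)    # S832
    if object_height <= flat_max_height:
        return "flat_document"                      # S833
    return "solid_object"                           # S834
```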

[Object removal detection processing]
Next, the object removal detection processing is described.
FIG. 8(d) is a flowchart showing the object removal detection processing of S804 shown in FIG. 8(a). Each step is realized by the CPU 302 executing a stored control program.
When the object detection unit 410 starts the object removal detection processing, in S841 it acquires one frame of camera image data from the camera image acquisition unit 407. In S842, the object detection unit 410 calculates the difference value between the acquired camera image data and the document table background camera image.

In S843, the object detection unit 410 determines whether the difference value calculated in S842 is equal to or less than a predetermined value. If the object detection unit 410 determines that the difference value calculated in S842 is greater than the predetermined value, it determines that an object is still present on the document table 204, returns the process to S841, and continues.
If, on the other hand, the object detection unit 410 determines that the difference value calculated in S842 is equal to or less than the predetermined value, it advances the process to S844. In S844, because the object on the document table 204 is gone (has been removed), the object detection unit 410 notifies the main control unit 402 of the object removal and ends the object removal detection processing.

Through the above processing, the object detection unit 410 can detect the placement and removal of an object on the document table 204 based on the camera image data. When the object is flat, such as a sheet of paper, the object detection unit 410 cannot detect its placement and removal from the distance image data alone, but it can do so by using the camera image data as described above.

(Description of flat document image photographing unit 411)
FIG. 9 is a flowchart illustrating a control method of the image processing apparatus according to the present embodiment. This example shows processing executed by the flat document image photographing unit 411 shown in FIG. 4A. Each step is realized by the CPU 302 executing a stored control program.
FIG. 10 is a schematic diagram illustrating the processing of the flat document image photographing unit 411 shown in FIG. 4A.
When the flat document image photographing unit 411 starts processing, in S901 it acquires one frame of image data from the camera unit 202 via the camera image acquisition unit 407. Because the coordinate system of the camera unit 202 does not directly face the document table 204, as shown in FIG. 2(b), both the object 1001 and the document table 204 appear distorted in the captured image, as shown in FIG. 10(a).

In S902, the flat document image photographing unit 411 generates pixel-by-pixel difference image data between the document table background camera image and the camera image data acquired in S901. The flat document image photographing unit 411 then binarizes the generated difference image data so that pixels with a difference are black and pixels without a difference are white.
The difference image generated here is therefore image data in which the area of the object 1001 is black (has a difference), like the region 1002 in FIG. 10(b).
In S903, the flat document image photographing unit 411 uses the binarized region 1002 to extract image data of the object 1001 only, as shown in FIG. 10(c). In S904, the flat document image photographing unit 411 performs gradation correction on the extracted document region image data.
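
The background difference, binarization, and extraction of S902 to S903 can be sketched on grey-level images stored as lists of rows. This is a minimal sketch: the per-pixel `threshold` that decides whether a pixel "has a difference" is an assumption, the mask uses True for differing pixels rather than black and white, and the function names are hypothetical.

```python
def binarize_difference(frame, background, threshold):
    """True where the frame differs from the background by more than
    `threshold`, False elsewhere."""
    return [[abs(p - b) > threshold for p, b in zip(row_f, row_b)]
            for row_f, row_b in zip(frame, background)]


def extract_region(frame, mask, fill=255):
    """Keep frame pixels inside the mask and paint the rest with `fill`."""
    return [[p if m else fill for p, m in zip(row_f, row_m)]
            for row_f, row_m in zip(frame, mask)]
```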

In S905, the flat document image photographing unit 411 applies a projective transformation from the camera coordinate system to the document table 204 to the extracted document region image, converting it into image data 1003 as viewed from directly above the document table 204, as shown in FIG. 10(d). The projective transformation parameters used here can be obtained from the camera coordinate system and the plane parameters calculated in S601 of FIG. 6, described above, in the processing of the gesture recognition unit 409.
As shown in FIG. 10(d), the image data 1003 obtained here may be tilted, depending on how the document is placed on the document table 204.
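As an illustration of the projective transformation in S905, the sketch below maps a single pixel through a 3x3 homography matrix. The matrix values are hypothetical; in the apparatus the parameters would be derived from the plane parameters of S601 and the camera coordinate system, which is not shown here.

```python
def apply_homography(h, x, y):
    """Map pixel (x, y) through the 3x3 matrix h and dehomogenize the result."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    return u / w, v / w


# Hypothetical matrix with a small perspective term in the last row.
H = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.001, 1.0],
]
print(apply_homography(H, 100.0, 200.0))
```

Warping a whole image applies the same mapping (usually its inverse, with interpolation) to every output pixel.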

Therefore, in S906, the flat document image photographing unit 411 approximates the image data 1003 with a rectangle and rotates it so that the rectangle becomes horizontal, obtaining image data without tilt, such as the image data 1004 shown in FIG. 10(e).
As shown in FIG. 10(f), the flat document image photographing unit 411 calculates the tilts θ1 and θ2 of the rectangle with respect to a reference line and selects the smaller tilt (here, θ1) as the rotation angle of the image data 1003.
Alternatively, as shown in FIGS. 10(g) and 10(h), the flat document image photographing unit 411 may perform OCR processing on the character strings contained in the image data 1003, and use the tilt of the character strings to calculate the rotation angle of the image data 1003 and to determine its top and bottom.
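The choice between θ1 and θ2 in S906 can be sketched as below. This is one possible reading, assuming θ1 is the tilt of one rectangle edge from the reference line and θ2 = 90° − θ1 is the tilt of the adjacent edge; the function name and the sign convention (counterclockwise positive, y axis pointing up) are assumptions.

```python
import math


def deskew_angle(p0, p1):
    """Return the smaller rotation (degrees) that makes the rectangle axis-aligned.

    p0 and p1 are the end points of one edge of the approximating rectangle.
    """
    theta1 = math.degrees(math.atan2(p1[1] - p0[1], p1[0] - p0[0])) % 90.0
    theta2 = 90.0 - theta1
    # Rotating by -theta1 makes this edge horizontal; rotating by +theta2
    # makes it vertical. Pick whichever rotation is smaller.
    return -theta1 if theta1 <= theta2 else theta2


print(deskew_angle((0, 0), (10, 1)))   # edge slightly above horizontal
print(deskew_angle((0, 0), (1, 10)))   # edge slightly off vertical
```

Choosing the smaller angle leaves a 90° ambiguity in orientation, which is why the description also mentions OCR-based top and bottom determination.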

In S907, the flat document image photographing unit 411 compresses the extracted image data 1004 and converts its file format according to a predetermined image format (for example, JPEG, TIFF, or PDF). It then stores the image data 1004 as a file in a predetermined area of the HDD 305 via the data management unit 405 and ends the processing.

[Processing of the Book Image Photographing Unit 412]
FIG. 11 is a flowchart illustrating a method for controlling the image processing apparatus according to the present embodiment. It shows an example of the processing executed by the book image photographing unit 412 shown in FIG. 4. Each step is realized by the CPU 302 executing a stored control program. FIG. 12 is a schematic diagram for explaining the processing of the book image photographing unit 412 shown in FIG. 4A.

When the processing shown in FIG. 11(a) starts, in S1101 the book image photographing unit 412 acquires one frame of camera image data from the camera unit 202 using the camera image acquisition unit 407. The book image photographing unit 412 also acquires one frame of distance image data from the distance image sensor unit 208 using the distance image acquisition unit 408. An example of the camera image data 1201 obtained here is shown in FIG. 12(a).

In FIG. 12(a), camera image data 1201 containing the document table 204 and the book to be photographed, which is the target object 1211, has been obtained. FIG. 12(b) is an example of the distance image data obtained here. In FIG. 12(b), pixels closer to the distance image sensor unit 208 are drawn in darker colors, and distance image data 1202 has been obtained in which each pixel on the target object 1212 holds distance information from the distance image sensor unit 208.

Also in FIG. 12(b), pixels whose distance from the distance image sensor unit 208 is greater than the distance to the document table 204 are drawn in white, and the part of the target object 1212 that is in contact with the document table 204 (the right-hand page of the target object 1212) is likewise white.

In S1102, the book image photographing unit 412 performs the processing of S1111 to S1116, which calculates a three-dimensional point cloud of the book object placed on the document table 204 from the acquired camera image data and distance image data.

In S1111, the book image photographing unit 412 computes the pixel-by-pixel difference between the camera image data and the document table background camera image data and binarizes it, generating camera difference image data 1203 in which the object region 1213 is shown in black, as in FIG. 12(c).

In S1112, the book image photographing unit 412 converts the camera difference image data 1203 from the camera coordinate system to the distance image sensor coordinate system, generating camera difference image data 1204 that contains the object region 1214 as seen from the distance image sensor unit 208, as in FIG. 12(d).

In S1113, the book image photographing unit 412 computes the pixel-by-pixel difference between the distance image data and the document table background distance image data and binarizes it, generating distance difference image data 1205 in which the object region 1215 is shown in black, as in FIG. 12(e). Here, any part of the target object 1211 that has the same color as the document table 204 produces only a small difference in pixel values and may therefore be missing from the object region 1213 in the camera difference image data 1203. Likewise, for any part of the target object 1212 whose height hardly differs from that of the document table 204, the distance value from the distance image sensor unit 208 differs little from the distance value to the document table 204, so that part may be missing from the object region 1215 in the distance difference image data 1205.

Therefore, in S1114, the book image photographing unit 412 takes the union (logical sum) of the camera difference image data 1203 and the distance difference image data 1205 to generate the object region image data 1206 shown in FIG. 12(f), and obtains the object region 1216. The object region 1216 is the region that differs from the document table 204 in either color or height. It represents the object region more accurately than using only the object region 1213 in the camera difference image data 1203 or only the object region 1215 in the distance difference image data 1205.
Because the object region image data 1206 is in the distance image sensor coordinate system, in S1115 the book image photographing unit 412 can extract only the object region 1216 of the object region image data 1206 from the distance image data 1202.
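The union of S1114 and the extraction of S1115 can be sketched as follows. The masks and distance values are hypothetical, both masks are assumed to be already in the distance image sensor coordinate system, and the function names are illustrative only.

```python
def union_mask(camera_mask, distance_mask):
    """Pixel is part of the object if either difference image flags it."""
    return [
        [c or d for c, d in zip(c_row, d_row)]
        for c_row, d_row in zip(camera_mask, distance_mask)
    ]


def extract_depth(depth, mask):
    """Keep distance values inside the object region; mark the rest as None."""
    return [
        [z if m else None for z, m in zip(z_row, m_row)]
        for z_row, m_row in zip(depth, mask)
    ]


# The camera mask misses a pixel with the same color as the table;
# the distance mask misses a flat page lying on the table.
camera_mask = [[False, True, False], [False, True, False]]
distance_mask = [[False, True, True], [False, False, False]]
depth = [[800, 750, 760], [800, 799, 800]]

object_mask = union_mask(camera_mask, distance_mask)   # S1114
object_depth = extract_depth(depth, object_mask)       # S1115
print(object_mask)
print(object_depth)
```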

In S1116, the book image photographing unit 412 converts the distance image data extracted in S1115 into the orthogonal coordinate system, generating the three-dimensional point cloud 1217 shown in FIG. 12(g). This three-dimensional point cloud 1217 is the three-dimensional point cloud of the book object.
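One common way to turn a distance image into three-dimensional points is pinhole back-projection, sketched below. This is an assumption for illustration: the intrinsic parameters `fx`, `fy`, `cx`, `cy` are hypothetical, and the result is in the sensor coordinate system. Converting to the orthogonal coordinate system of the document table would additionally require the sensor's rotation and translation, which are not shown.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project valid depth pixels (u, v, z) to 3D points in sensor coordinates."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z is None:          # pixel outside the object region
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Hypothetical 2x2 masked distance image and intrinsics.
depth = [[None, 500.0], [500.0, None]]
cloud = depth_to_points(depth, fx=500.0, fy=500.0, cx=0.0, cy=0.0)
print(cloud)
```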

In S1103, the book image photographing unit 412 performs the book image distortion correction processing, described in detail with reference to FIG. 11(b), using the acquired camera image and the calculated three-dimensional point cloud, and generates a two-dimensional book image.

[Book Image Distortion Correction Processing]
FIG. 11(b) is a flowchart illustrating a method for controlling the image processing apparatus according to the present embodiment. It corresponds to the detailed procedure of the book image distortion correction processing of S1103 shown in FIG. 11(a). Each step is realized by the CPU 302 executing a stored control program.
When the book image distortion correction processing starts, in S1121 the book image photographing unit 412 converts the object region image data 1206 from the distance image sensor coordinate system to the camera coordinate system.
In S1122, the book image photographing unit 412 extracts the object region image data from the camera image data 1201, using the object region 1216 of the object region image data 1206 after its conversion to the camera coordinate system. In S1123, the book image photographing unit 412 projectively transforms the extracted object region image data onto the document table plane.

In S1124, the book image photographing unit 412 approximates the projectively transformed object region image data with a rectangle and rotates it so that the rectangle becomes horizontal, generating the book image data 1208 in FIG. 12(h).
Because one side of the approximating rectangle of the book image data 1208 is parallel to the X axis, the book image photographing unit 412 subsequently performs distortion correction on the book image data 1208 in the X-axis direction.
In S1125, the book image photographing unit 412 sets the leftmost point of the book image data 1208 as P (point P in FIG. 12(h)). In S1126, the book image photographing unit 412 acquires the height of the point P (h1 in FIG. 12(h)) from the three-dimensional point cloud 1217 of the book object. In S1127, the book image photographing unit 412 sets as Q the point that lies a predetermined distance (x1 in FIG. 12(h)) from the point P of the book image data 1208 in the X-axis direction (point Q in FIG. 12(h)).

In S1128, the book image photographing unit 412 acquires the height of the point Q (h2 in FIG. 12(h)) from the three-dimensional point cloud 1217. In S1129, the book image photographing unit 412 calculates the distance between the points P and Q along the book object (l1 in FIG. 12(h)) by linear approximation, using Equation 4.

In S1130, the book image photographing unit 412 corrects the distance between P and Q to the calculated distance l1 and copies the pixels to the positions of the points P' and Q' on the image 1219 in FIG. 12(h).

In S1131, the book image photographing unit 412 makes the processed point Q the new point P and returns to S1126, repeating the same processing until the determination in S1132 becomes YES. In this way the correction between the points Q and R in FIG. 12(h) is carried out, and the result becomes the pixels at the points Q' and R' on the image 1219. By repeating this processing for all pixels, the book image photographing unit 412 makes the image 1219 the distortion-corrected image.
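The stepping from P to Q to R in S1125 to S1131 can be sketched for a single scanline as below. Equation 4 is not reproduced in this text, so the sketch assumes the linear approximation is the straight-line distance between the two surface points, l1 = sqrt(x1² + (h2 − h1)²); the function name and sample values are hypothetical.

```python
import math


def corrected_positions(heights, step):
    """Return corrected X positions (P', Q', R', ...) for samples `step` apart.

    heights[i] is the surface height at the i-th sample along the X axis.
    """
    positions = [0.0]
    for h_p, h_q in zip(heights, heights[1:]):
        # Assumed form of the linear approximation: straight-line surface distance.
        segment = math.hypot(step, h_q - h_p)
        positions.append(positions[-1] + segment)
    return positions


# Hypothetical page profile: rises by 3, stays flat, falls by 3; samples 4 apart.
print(corrected_positions([0, 3, 3, 0], step=4))
```

Sloped segments are stretched (4 becomes 5 here) while flat segments keep their length, which is the effect of flattening a curved page.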

In S1132, the book image photographing unit 412 determines whether the distortion correction has been completed for all points, and if so, ends the distortion correction processing for the book object.
In this way, by performing S1102 and S1103, the book image photographing unit 412 can generate distortion-corrected book image data.
Returning to the processing shown in FIG. 11(a): after generating the distortion-corrected book image data in S1103, the book image photographing unit 412 performs gradation correction on the generated book image data in S1104.

In S1105, the book image photographing unit 412 compresses the generated book image data and converts its file format according to a predetermined image format (for example, JPEG, TIFF, or PDF). In S1106, the book image photographing unit 412 stores the generated book image data as a file in a predetermined area of the HDD 305 via the data management unit 405 and ends the processing.

[Description of the Three-Dimensional Shape Measurement Unit 413]
FIG. 13 is a flowchart illustrating a method for controlling the image processing apparatus according to the present embodiment. It shows an example of the processing executed by the three-dimensional shape measurement unit 413 shown in FIG. 4A. Each step is realized by the CPU 302 executing a stored control program. FIG. 14 is a schematic diagram for explaining the processing of the three-dimensional shape measurement unit 413.
When this processing starts, in S1301 the three-dimensional shape measurement unit 413 sends a rotation instruction to the turntable 209 via the serial I/F 310 and rotates the turntable 209 by a predetermined angle. The smaller the rotation angle, the higher the final measurement accuracy, but the number of measurements increases and the measurement takes correspondingly longer, so a rotation angle appropriate for the apparatus should be determined in advance.

In S1302, the three-dimensional shape measurement unit 413 performs three-dimensional point cloud measurement processing on the object on the turntable 209 provided in the document table 204, using the camera unit 202 and the projector 207. The three-dimensional point cloud measurement processing executed in S1302 is described below with reference to the flowchart shown in FIG. 13(b).

[Three-Dimensional Point Cloud Measurement Processing]
When the three-dimensional point cloud measurement processing starts, in S1311 the three-dimensional shape measurement unit 413 projects a three-dimensional shape measurement pattern 1402 from the projector 207 onto the object 1401 on the turntable 209 shown in FIG. 14(a). In S1312, the three-dimensional shape measurement unit 413 acquires one frame of camera image data from the camera unit 202 via the camera image acquisition unit 407.
In S1313, the three-dimensional shape measurement unit 413 extracts corresponding points between the three-dimensional shape measurement pattern 1402 and the acquired camera image data in the same manner as in S504 of FIG. 5.

In S1314, the three-dimensional shape measurement unit 413 calculates the distance at each pixel of the camera image data from the positional relationship between the camera unit 202 and the projector 207, and generates distance image data. The measurement method is the same as that described for S505 of FIG. 5 in the processing of the distance image acquisition unit 408.
In S1315, the three-dimensional shape measurement unit 413 converts each pixel of the distance image data into the orthogonal coordinate system and calculates a three-dimensional point cloud. In S1316, the three-dimensional shape measurement unit 413 uses the plane parameters of the document table 204 to remove from the calculated point cloud the points that lie on the document table plane, extracting the three-dimensional point cloud of the object.

In S1317, the three-dimensional shape measurement unit 413 removes as noise any of the remaining points whose positions deviate greatly, generating the three-dimensional point cloud 1403 of the object 1401. Here, a point whose position deviates greatly is, for example, a point that lies outside a predetermined position. In S1318, the three-dimensional shape measurement unit 413 turns off the three-dimensional shape measurement pattern 1402 projected from the projector 207.
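The plane removal of S1316 and the noise removal of S1317 can be sketched as follows. The sketch assumes the plane parameters take the form (a, b, c, d) with ax + by + cz + d = 0, and interprets "outside a predetermined position" as lying outside a fixed axis-aligned box; the tolerance, the box limits, and the function names are hypothetical.

```python
import math


def remove_plane(points, plane, tolerance):
    """Drop points whose distance to the plane ax + by + cz + d = 0 is within tolerance."""
    a, b, c, d = plane
    norm = math.sqrt(a * a + b * b + c * c)
    return [
        p for p in points
        if abs(a * p[0] + b * p[1] + c * p[2] + d) / norm > tolerance
    ]


def remove_outliers(points, lower, upper):
    """Keep only points inside the axis-aligned box [lower, upper]."""
    return [
        p for p in points
        if all(lo <= v <= hi for v, lo, hi in zip(p, lower, upper))
    ]


# Hypothetical measurement: two table points, one object point, one stray point.
points = [(0, 0, 0), (1, 1, 0.5), (2, 2, 50), (900, 0, 30)]
table_plane = (0, 0, 1, 0)                      # the plane z = 0

above_table = remove_plane(points, table_plane, tolerance=1.0)              # S1316
object_cloud = remove_outliers(above_table, (-100, -100, 0), (100, 100, 200))  # S1317
print(object_cloud)
```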

In S1319, the three-dimensional shape measurement unit 413 acquires a camera image from the camera unit 202 via the camera image acquisition unit 407, stores it as the texture image as viewed from that angle, and ends the three-dimensional point cloud measurement processing. The description now returns to the steps shown in FIG. 13(a).

When the three-dimensional shape measurement unit 413 executes the three-dimensional point cloud measurement processing of S1302 for the second and subsequent times, the measurement is performed after the turntable 209 has been rotated in S1301. As shown in FIG. 14(c), the angle of the object 1401 on the turntable 209 relative to the projector 207 and the camera unit 202 has therefore changed. As a result, as shown in FIG. 14(d), the three-dimensional shape measurement unit 413 obtains a three-dimensional point cloud 1404 seen from a viewpoint different from that of the three-dimensional point cloud 1403 obtained earlier in S1302.

In other words, the three-dimensional point cloud 1404 contains points for parts that could not be calculated in the three-dimensional point cloud 1403 because they were in a blind spot of the camera unit 202 and the projector 207. Conversely, the three-dimensional point cloud 1403 contains points that are not included in the three-dimensional point cloud 1404. The three-dimensional shape measurement unit 413 therefore performs processing to superimpose the two three-dimensional point clouds 1403 and 1404, which are seen from different viewpoints.

In S1303, the three-dimensional shape measurement unit 413 rotates the three-dimensional point cloud 1404 measured in S1302 in the reverse direction by the angle through which the turntable has rotated from its initial position, thereby calculating a three-dimensional point cloud 1405 aligned with the three-dimensional point cloud 1403.
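The reverse rotation of S1303 can be sketched as below, assuming the turntable axis is parallel to the Z axis and passes through a known center, and that positive angles are counterclockwise when viewed from above. These conventions and the function name are assumptions.

```python
import math


def rotate_back(points, angle_deg, center=(0.0, 0.0)):
    """Rotate points by -angle_deg about a vertical axis through `center`."""
    t = math.radians(-angle_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    cx, cy = center
    result = []
    for x, y, z in points:
        dx, dy = x - cx, y - cy
        result.append((cx + dx * cos_t - dy * sin_t,
                       cy + dx * sin_t + dy * cos_t,
                       z))
    return result


# A point that sat at (1, 0, 5) is measured at (0, 1, 5) after a 90 degree turn.
print(rotate_back([(0.0, 1.0, 5.0)], 90.0))
```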

In S1304, the three-dimensional shape measurement unit 413 combines the three-dimensional point cloud calculated in S1303 with the three-dimensional point cloud that has already been combined. The combination uses an ICP (Iterative Closest Point) algorithm based on feature points. In the ICP algorithm, the three-dimensional shape measurement unit 413 extracts three-dimensional feature points at corners from each of the two three-dimensional point clouds 1403 and 1404 to be combined. It then establishes correspondences between the feature points of the three-dimensional point cloud 1403 and those of the three-dimensional point cloud 1404, calculates the distances between all pairs of corresponding points, and sums them. While moving the position of the three-dimensional point cloud 1404, the three-dimensional shape measurement unit 413 repeatedly searches for the position that minimizes the sum of the distances between corresponding points. When the number of iterations reaches its upper limit, or when the position that minimizes the sum has been found, the three-dimensional shape measurement unit 413 combines the point clouds.
Specifically, the three-dimensional shape measurement unit 413 moves the three-dimensional point cloud 1404 shown in FIG. 14(e) and then superimposes it on the three-dimensional point cloud 1403, thereby combining the two point clouds, 1403 and the 1404 shown in FIG. 14(d). In this way the three-dimensional shape measurement unit 413 generates the combined three-dimensional point cloud 1405 (see FIG. 14(f)) and ends the three-dimensional point cloud combination processing.
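The iteration described for S1304 can be illustrated with a deliberately simplified, translation-only variant of ICP. A full ICP also estimates rotation, and the corner feature extraction is not shown; the feature points below are hypothetical inputs and the function names are illustrative only.

```python
import math


def nearest(point, candidates):
    """Return the candidate closest to `point` (the assumed correspondence)."""
    return min(candidates, key=lambda c: math.dist(point, c))


def total_distance(fixed, moving):
    """Sum of distances between each moving point and its nearest fixed point."""
    return sum(math.dist(m, nearest(m, fixed)) for m in moving)


def icp_translation(fixed, moving, max_iterations=20, tolerance=1e-9):
    """Shift `moving` to reduce the summed correspondence distance (translation only)."""
    moving = [tuple(p) for p in moving]
    error = total_distance(fixed, moving)
    for _ in range(max_iterations):          # upper limit on iterations
        pairs = [(m, nearest(m, fixed)) for m in moving]
        shift = [sum(f[i] - m[i] for m, f in pairs) / len(pairs) for i in range(3)]
        candidate = [tuple(m[i] + shift[i] for i in range(3)) for m in moving]
        new_error = total_distance(fixed, candidate)
        if error - new_error < tolerance:    # no further improvement
            break
        moving, error = candidate, new_error
    return moving, error


fixed = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
moving = [(x + 1.0, y + 0.5, z - 0.5) for x, y, z in fixed]
aligned, error = icp_translation(fixed, moving)
print(aligned, error)
```

The two stopping conditions mirror the description: the loop ends when the iteration limit is reached or when the summed distance stops decreasing.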

After finishing the three-dimensional point cloud combination processing of S1304, the three-dimensional shape measurement unit 413 determines in S1305 whether the turntable 209 has completed one full rotation. If it determines that the turntable 209 has not yet completed one full rotation, it returns the processing to S1301. It then rotates the turntable 209 further in S1301, executes the processing of S1302, and measures a three-dimensional point cloud from another angle. In S1303 to S1304, it combines the already combined three-dimensional point cloud 1405 with the newly measured three-dimensional point cloud.
The three-dimensional shape measurement unit 413 repeats the processing of S1301 to S1304 in this way until it determines in S1305 that the turntable 209 has completed one full rotation, and then advances the processing to S1306. By completing the processing of S1301 to S1305, the three-dimensional shape measurement unit 413 can generate a full-circumference three-dimensional point cloud of the object 1401.
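The control flow of S1301 to S1305 can be sketched as a loop over turntable steps. The callbacks `measure`, `align`, and `merge` are hypothetical stand-ins for S1302, S1303, and S1304; the stubs below only record which angles were visited.

```python
import math


def scan_full_turn(step_deg, measure, align, merge):
    """Rotate in fixed steps until one full turn is covered, merging each view."""
    steps = math.ceil(360 / step_deg)
    merged = None
    for i in range(1, steps + 1):
        angle = i * step_deg                     # S1301: rotate the turntable
        cloud = measure(angle)                   # S1302: measure a point cloud
        aligned = align(cloud, angle)            # S1303: undo the rotation
        merged = aligned if merged is None else merge(merged, aligned)  # S1304
    return merged, steps                         # S1305: one full turn done


merged, steps = scan_full_turn(
    90,
    measure=lambda angle: [angle],
    align=lambda cloud, angle: cloud,
    merge=lambda a, b: a + b,
)
print(steps, merged)
```

Halving the step angle doubles the number of measurements, which is the accuracy versus time trade-off noted for S1301.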

When the three-dimensional shape measurement unit 413 determines in S1305 that the turntable 209 has completed one full rotation, it advances the processing to S1306 and executes S1331 to S1333 on the generated three-dimensional point cloud to generate a three-dimensional model.
When the three-dimensional model generation processing starts, in S1331 the three-dimensional shape measurement unit 413 removes noise from the three-dimensional point cloud and smooths it. In S1332, the three-dimensional shape measurement unit 413 performs meshing by generating triangular patches from the three-dimensional point cloud.

In S1333, the three-dimensional shape measurement unit 413 maps the textures stored in S1319 onto the faces obtained by the meshing. Through the above, the three-dimensional shape measurement unit 413 can generate a texture-mapped three-dimensional model.

In S1307, the three-dimensional shape measurement unit 413 converts the texture-mapped data into a standard three-dimensional model data format such as VRML or STL, stores it in a predetermined area of the HDD 305 via the data management unit 405, and ends the processing.
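As an illustration of the format conversion in S1307, the sketch below serializes triangular patches to ASCII STL. It is a minimal example with a hypothetical single triangle and function names; note that STL stores geometry only, so texture information would need a format such as VRML.

```python
import math


def facet_normal(a, b, c):
    """Unit normal of triangle (a, b, c) from the cross product of two edges."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(x * x for x in n)) or 1.0   # guard degenerate triangles
    return tuple(x / length for x in n)


def to_ascii_stl(triangles, name="model"):
    """Serialize a list of triangles (three xyz vertices each) as ASCII STL text."""
    lines = [f"solid {name}"]
    for a, b, c in triangles:
        nx, ny, nz = facet_normal(a, b, c)
        lines.append(f"  facet normal {nx:e} {ny:e} {nz:e}")
        lines.append("    outer loop")
        for x, y, z in (a, b, c):
            lines.append(f"      vertex {x:e} {y:e} {z:e}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


stl_text = to_ascii_stl([((0, 0, 0), (1, 0, 0), (0, 1, 0))])
print(stl_text)
```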

[Description of the Main Control Unit]
FIG. 15 is a flowchart illustrating a method for controlling the image processing apparatus according to the present embodiment. It shows an example of the scan application processing executed by the main control unit 402. In the scan application described in this embodiment, when an object such as a flat document or the three-dimensional object shown in FIG. 18(a) is placed on the document table 204 shown in FIG. 16(b), the projector 207 projects GUI components (such as a scan button), which are operation instruction images. The application also detects the user's touch gestures on the document table 204, as recognized by the gesture recognition unit 409 described above, and operates the GUI components accordingly. In doing so, the projection position of a GUI component may overlap the flat document.
FIG. 16(c) schematically shows a state in which the GUI components 1642 to 1644 overlap the flat document 1631.

When a three-dimensional object is placed, it may obstruct the projector 207 so that the GUI components cannot be displayed correctly, or it may prevent the distance image sensor unit 208 from detecting gestures on the GUI components.
FIG. 18(i) schematically shows a state in which a three-dimensional object acts as an obstacle and the GUI components cannot be displayed correctly.
This embodiment describes a method for appropriately notifying the user, in such cases, to move the object placed on the document table 204.

Each step of the flowchart in FIG. 15 is realized by the CPU 302 executing a stored control program (the modules shown in FIG. 4).

In S1501, the main control unit 402 performs object placement waiting processing, corresponding to S1511 to S1513, in which it waits for an object to be scanned to be placed on the document table 204.
When the main control unit 402 starts the object placement waiting processing in S1501, in S1511 it causes the projector 207 to project an initial screen onto the document table 204 via the user interface unit 403.

For example, the main control unit 402 generates and displays, via the user interface unit 403, a GUI component for a message 1641 that prompts the user to place an object on the document table 204, as shown in FIG. 16(a). In S1512, the main control unit 402 activates the processing of the object detection unit 410. The object detection unit 410 starts executing the processing described with reference to the flowchart of FIG. 8.

In S1513, the main control unit 402 waits for an object placement notification from the object detection unit 410. When the object detection unit 410 executes the processing of S827 in FIG. 8 and notifies the main control unit 402 that an object has been placed, the main control unit 402 ends the object placement waiting processing.

After finishing the object placement waiting processing of S1501, the main control unit 402 advances the processing to S1502. It then performs the processing, an example of which is shown in FIG. 17, for displaying via the user interface unit 403 the GUI components that form the menu for scan operations.

[GUI Component Display Processing]
FIG. 17(a) is a flowchart showing an example of the GUI component display processing executed by the user interface unit 403 in S1502. Each step is realized by the CPU 302 executing a stored control program.
In S1801, the main control unit 402 starts the processing shown in FIG. 17(b), which calculates the region of the document table 204 onto which the projector cannot project.
FIG. 17(b) is a flowchart explaining this processing, invoked from FIG. 17(a), for calculating the region of the document table 204 onto which the projector cannot project. Each step is realized by the CPU 302 executing a stored control program.
In S1806, the main control unit 402 determines the type of the object placed on the document table 204. The type of the object has been notified to the main control unit 402 in S803 of FIG. 8. If the main control unit 402 determines that the object is a flat document, it advances the processing to S1807; if it determines that the object is a three-dimensional object, it advances the processing to S1811.

In step S1807, the main control unit 402 acquires one frame of camera image data from the camera unit 202 via the camera image acquisition unit 407. In step S1808, the main control unit 402 sets the document area as an area onto which the projector cannot project.
To do so, in the same manner as S902 to S903 in FIG. 9, it generates a difference image between the camera image data acquired in S1807 and the document table background camera image data, and binarizes it.

そして、図１６(d)のように、書画台204の真上から見た画像（俯瞰画像）となるように平面射影変換を施す。図１６(d)では、原稿領域を白画素、それ以外の領域を黒画素で表現している。
メイン制御部402は、図１６(d)のような二値画像を、投影不可領域としてＲＡＭ303に設定する。 Then, as shown in FIG. 16D, plane projective transformation is performed so that an image (overhead image) viewed from directly above the document table 204 is obtained. In FIG. 16D, the document area is represented by white pixels and the other areas are represented by black pixels.
The main control unit 402 sets a binary image as shown in FIG. 16D in the RAM 303 as a non-projectable area.
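The difference-and-binarize step of S1808 can be illustrated with a minimal Python sketch. It assumes grayscale frames held as nested lists and a hypothetical threshold of 30; the function name `binarize_difference` and the sample pixel values are illustrative and do not come from the specification. The plane projective transformation to the overhead view is omitted here.

```python
def binarize_difference(frame, background, threshold=30):
    """Return a binary mask: 1 (white) where the frame differs from the
    background by more than the threshold, 0 (black) elsewhere."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Hypothetical 3x4 grayscale images: empty document table and table with a document.
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
frame = [
    [10, 200, 210, 10],
    [10, 205, 198, 10],
    [10, 10, 10, 12],  # the small change (12 vs 10) is treated as noise
]

document_mask = binarize_difference(frame, background)
for row in document_mask:
    print(row)
```

In this sketch the white pixels play the role of the document area in FIG. 16(d); a real system would choose the threshold from the camera's noise characteristics.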

In step S1811, the main control unit 402 measures, via the distance image acquisition unit 408, distance information for the object 1901 placed on the document table 204 as shown in FIG. 18(a), and acquires one frame of distance image data and the corresponding three-dimensional point group.

In step S1812, the projection area detection unit 415 uses the plane parameters of the document table 204 to remove, from the acquired three-dimensional point group, the points lying on the plane that contains the document table 204. In this way, the projection area detection unit 415 obtains the three-dimensional point group 1902 of the object 1901 on the document table 204, as shown in FIG. 18(b).
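As a rough illustration of S1812, the sketch below drops points that lie within a small distance of the document table plane. It assumes the plane parameters are given as coefficients (a, b, c, d) of ax + by + cz + d = 0 and uses a hypothetical tolerance of 5 mm; the specification does not state how the plane parameters are represented, so these are assumptions.

```python
import math


def remove_plane_points(points, plane, tolerance=0.005):
    """Keep only points farther than `tolerance` (metres) from the plane
    a*x + b*y + c*z + d = 0."""
    a, b, c, d = plane
    norm = math.sqrt(a * a + b * b + c * c)
    return [
        p for p in points
        if abs(a * p[0] + b * p[1] + c * p[2] + d) / norm > tolerance
    ]


# Hypothetical data: the document table is the plane z = 0.
stage_plane = (0.0, 0.0, 1.0, 0.0)
cloud = [
    (0.10, 0.20, 0.000),  # on the table
    (0.10, 0.20, 0.001),  # sensor noise, still treated as the table
    (0.10, 0.20, 0.050),  # part of the object
    (0.30, 0.10, 0.120),  # part of the object
]

object_points = remove_plane_points(cloud, stage_plane)
print(object_points)
```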

Ｓ1813で、投射領域検出部415は、Ｓ1812で取得した物体1901の３次元点群1902を、プロジェクタ207及び距離画像センサ部208を基準としたそれぞれの座標系で定義される画像平面に投射する。この投影された画像平面は、プロジェクタ207及び距離画像センサ部208から見た２次元画像平面である。上述の通り、プロジェクタ座標系又は距離画像センサ座標系と直交座標系との位置姿勢と、プロジェクタ207及び距離画像センサ部208の内部パラメータとはそれぞれキャリブレーション済みである。 In step S1813, the projection area detection unit 415 projects the three-dimensional point group 1902 of the object 1901 acquired in step S1812 onto an image plane defined by each coordinate system based on the projector 207 and the distance image sensor unit 208. This projected image plane is a two-dimensional image plane viewed from the projector 207 and the distance image sensor unit 208. As described above, the position and orientation of the projector coordinate system or the distance image sensor coordinate system and the orthogonal coordinate system, and the internal parameters of the projector 207 and the distance image sensor unit 208 are already calibrated.

そのため、投射領域検出部415は、数1及び数3を用いることによって、直交座標系で表される1点を画像平面上の1点へ投射することができる。
投射領域検出部415は、これを全点に対して行うことで各画像平面の画像データを生成できる。図１８(c)は、プロジェクタ画像平面での投射画像データを示している。図１８(d)は、距離画像センサ画像平面での投射画像データを示している。 Therefore, the projection area detection unit 415 can project one point represented by the orthogonal coordinate system to one point on the image plane by using Equation 1 and Equation 3.
The projection area detection unit 415 can generate image data of each image plane by performing this for all points. FIG. 18C shows the projection image data on the projector image plane. FIG. 18D shows the projection image data on the distance image sensor image plane.
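The projection of one point onto an image plane can be sketched with the standard pinhole camera model. This is an illustration under stated assumptions, not a reproduction of Equations 1 and 3: the rotation matrix, translation vector, and intrinsic parameters (`fx`, `fy`, `cx`, `cy`) below are hypothetical stand-ins for the calibrated extrinsic and internal parameters mentioned in the text.

```python
def project_point(point, rotation, translation, fx, fy, cx, cy):
    """Project a 3D point in the orthogonal (world) coordinate system onto
    the image plane of a device with the given pose and intrinsics.
    Returns None when the point is not in front of the device."""
    # World coordinates -> device coordinates: Xc = R * Xw + t
    xc, yc, zc = (
        sum(r * p for r, p in zip(row, point)) + t
        for row, t in zip(rotation, translation)
    )
    if zc <= 0:
        return None
    # Perspective division followed by the intrinsic mapping to pixels
    return (fx * xc / zc + cx, fy * yc / zc + cy)


# Hypothetical pose (identity) and intrinsics for illustration only.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
no_shift = [0.0, 0.0, 0.0]

pixel = project_point((0.25, -0.5, 2.0), identity, no_shift,
                      fx=400.0, fy=400.0, cx=320.0, cy=240.0)
print(pixel)
```

Applying `project_point` to every point of the object's point group, once with the projector's parameters and once with the distance image sensor's, would yield the two projected images described for FIG. 18(c) and FIG. 18(d).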

In step S1814, the projection area detection unit 415 applies a plane projective transformation to the projection image data generated on the image plane of each coordinate system so that it becomes image data viewed from directly above the document table 204 (overhead image data). FIG. 18(e) shows the overhead image data for the projector image plane, and FIG. 18(f) shows the overhead image data for the distance image sensor image plane.
These overhead image data capture, in plan view, where on the document table 204 the object 1901 appears as seen from the projector 207 and from the distance image sensor unit 208. The object position 1903 and the object position 1904 represented in the overhead image data are the blind spot areas of the projector 207 and the distance image sensor unit 208, respectively.

In step S1815, the projection area detection unit 415 adds together the overhead image data generated in S1814 to obtain a blind spot area 1905 (FIG. 18(g)) that is hidden from at least one of the projector 207 and the distance image sensor unit 208. The main control unit 402 binarizes the blind spot area 1905 as shown in FIG. 18(h) to obtain the final blind spot area 1906. When S1815 ends, the main control unit 402 executes S1807 described above and acquires one frame of camera image data.

In step S1816, the main control unit 402 first calculates the three-dimensional object area by the same method used to calculate the planar document area in S1808. It then calculates a non-projectable area by adding together the three-dimensional object area (coordinate information) calculated here and the final blind spot area 1906 generated in S1815. Further, the main control unit 402 sets in the RAM 303 the non-projectable area 1924 shown in FIG. 18(j), which corresponds to the area onto which operation UI components cannot be projected.
When S1808 or S1816 ends, the main control unit 402 ends the non-projectable area calculation processing and advances the process to S1802 in FIG. 17(a).
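The "add together and binarize" operations of S1815 and S1816 amount to a pixel-wise logical OR of overhead masks. The sketch below assumes the overhead images are already aligned binary masks of equal size held as nested lists; the helper name `combine_masks` and the sample masks are hypothetical.

```python
def combine_masks(*masks):
    """Add binary masks pixel by pixel and binarize the sum, so a pixel is 1
    when it is set in at least one input mask."""
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if sum(m[r][c] for m in masks) > 0 else 0 for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical overhead masks (1 = hidden or occupied).
projector_blind = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
sensor_blind = [[0, 0, 1, 1], [0, 0, 1, 0], [0, 0, 0, 0]]
object_area = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]

# S1815: area hidden from at least one of the projector and the sensor.
final_blind_spot = combine_masks(projector_blind, sensor_blind)
# S1816: blind spot area plus the area covered by the object itself.
unprojectable = combine_masks(final_blind_spot, object_area)

for row in unprojectable:
    print(row)
```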

Returning to the processing of FIG. 17(a), in S1802 the main control unit 402 determines whether the non-projectable area obtained in S1801 overlaps the area of the operation UI that is about to be displayed (for example, the areas of UI buttons). As shown in FIG. 18(k), this can be done by determining whether any pixel belongs both to the UI button areas 1941 to 1943 and to the non-projectable area 1924 set in the RAM 303.
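The overlap test of S1802 can be sketched as a search for any non-projectable pixel inside a button's rectangle. The sketch assumes each UI button area is an axis-aligned rectangle (x, y, width, height) in the same overhead pixel coordinates as the mask; the rectangle values and the helper name `overlaps_mask` are hypothetical, while 1941 to 1943 are the reference numerals used in the text.

```python
def overlaps_mask(mask, rect):
    """Return True if any pixel inside rect (x, y, width, height) is set in mask."""
    x, y, width, height = rect
    rows, cols = len(mask), len(mask[0])
    # Clip the rectangle to the mask so out-of-range buttons do not raise errors.
    for r in range(max(y, 0), min(y + height, rows)):
        for c in range(max(x, 0), min(x + width, cols)):
            if mask[r][c]:
                return True
    return False


# Hypothetical non-projectable area (1 = cannot project) and button rectangles.
unprojectable = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
ui_buttons = {
    "1941": (0, 3, 2, 1),
    "1942": (2, 2, 2, 2),
    "1943": (4, 3, 2, 1),
}

blocked = [name for name, rect in ui_buttons.items()
           if overlaps_mask(unprojectable, rect)]
print(blocked)
```

If `blocked` is non-empty, the flow would continue to S1803; otherwise it would continue to S1805.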

If the main control unit 402 determines in S1802 that the non-projectable area obtained in S1801 overlaps the area of the operation UI that is about to be displayed, the process advances to S1803. In step S1803, the main control unit 402 performs projection to notify the user that the object should be moved. FIG. 18(l) schematically shows a message (information) being projected that prompts the user to move the three-dimensional object 1922. In this example, an arrow indicates the direction in which the object should be moved on the document table.

In step S1804, the main control unit 402 waits for the object placement detection described for S827 in FIG. 8 to be notified. If it determines that object placement detection has been notified, it judges that the user has moved the object, returns the process to S1801, and calculates the non-projectable area again.

On the other hand, if the main control unit 402 determines that no object placement detection has been notified, it continues to wait for the notification.
If the main control unit 402 determines in S1802 that the non-projectable area obtained in S1801 does not overlap the area of the operation UI that is about to be displayed, the process advances to S1805. In step S1805, the main control unit 402 projects the GUI components for receiving operation instructions from the user (three buttons in this example) and ends the GUI component display processing.
By repeating the above processing, the main control unit 402 can wait until the user has moved the object and the operation UI area no longer overlaps the non-projectable area.
In the flowchart of FIG. 17(a), the GUI components are projected only once the non-projectable area and the operation UI area no longer overlap.
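The control flow of FIG. 17(a) (S1801 to S1805) can be summarized as a loop. The following is a sketch only: the callback names are hypothetical placeholders for the units described in the text, and the simulated run assumes that the user moves the object clear after a single prompt.

```python
def gui_component_display(compute_unprojectable, ui_overlaps, prompt_move,
                          wait_for_placement, project_gui):
    """Loop until the operation UI area no longer overlaps the
    non-projectable area, then project the GUI components."""
    while True:
        area = compute_unprojectable()   # S1801
        if not ui_overlaps(area):        # S1802
            project_gui()                # S1805
            return
        prompt_move()                    # S1803
        wait_for_placement()             # S1804


# Simulated run: the first calculation overlaps, the second does not.
log = []
areas = iter(["overlapping", "clear"])
gui_component_display(
    compute_unprojectable=lambda: next(areas),
    ui_overlaps=lambda area: area == "overlapping",
    prompt_move=lambda: log.append("S1803 prompt"),
    wait_for_placement=lambda: log.append("S1804 wait"),
    project_gui=lambda: log.append("S1805 project"),
)
print(log)
```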

Since the user may not understand why the object has to be moved, the GUI components may instead be projected first, on the premise that the non-projectable area and the operation UI area may initially overlap, as shown in FIG. 17(c).

〔第２実施形態〕
図１７(c)は、本実施形態を示す画像処理装置の制御方法を説明するフローチャートである。本例は、先にＧＵＩ部品を投影する場合のＧＵＩ部品表示処理例である。なお、各ステップは、ＣＰＵ302が記憶された制御プログラムを実行することで実現される。 [Second Embodiment]
FIG. 17C is a flowchart illustrating a method for controlling the image processing apparatus according to the present embodiment. This example is a GUI component display processing example when a GUI component is projected first. Each step is realized by the CPU 302 executing a stored control program.

Compared with the flowchart of FIG. 17(a), this flowchart differs in the order of the steps and in the addition of S1820 and S1821. Steps with the same step numbers as in FIG. 17(a) perform the same processing, so their detailed description is omitted.
In step S1801, the main control unit 402 calculates the non-projectable area and then performs the GUI component projection of S1805. Next, in S1802, the main control unit 402 determines whether the non-projectable area and the operation UI area overlap. If it determines that they overlap, the process advances to S1820; if it determines that they do not overlap, the process advances to S1821.
In step S1820, if gesture input for the operation UI is enabled, the main control unit 402 disables it.

This prevents the user from being able to operate the GUI components through a still-enabled gesture UI even when their projection overlaps a planar document or they cannot be projected onto the document table 204 because they fall in the blind spot area of a three-dimensional object.
When this step ends, the main control unit 402 proceeds to S1803 and notifies the user by projection that the object should be moved.

一方、Ｓ1821で、メイン制御部402は、操作ＵＩのジェスチャ入力が無効になっている場合は、操作ＵＩのジェスチャ入力を有効にし、ＧＵＩ部品表示処理を終了する。この処理により、ジェスチャ入力が可能となり、再びＧＵＩ部品を操作することが可能になる。メイン制御部402は、ＧＵＩ部品表示処理が終了すると、図１５のＳ1503に進み、スキャン実行処理を行う。 On the other hand, if the gesture input of the operation UI is disabled in S1821, the main control unit 402 enables the gesture input of the operation UI, and ends the GUI component display process. By this processing, it becomes possible to input a gesture and to operate the GUI component again. When the GUI component display process ends, the main control unit 402 advances to S1503 in FIG. 15 and performs a scan execution process.
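The second embodiment's flow in FIG. 17(c) can be sketched in the same style. The sketch assumes that S1803 is followed by S1804 and a return to S1801, as in FIG. 17(a); the text does not state this explicitly for FIG. 17(c). The class and callback names are hypothetical.

```python
class GestureInput:
    """Minimal stand-in for the gesture input state of the operation UI."""

    def __init__(self):
        self.enabled = True


def gui_display_projected_first(compute_unprojectable, ui_overlaps, project_gui,
                                prompt_move, wait_for_placement, gesture):
    while True:
        area = compute_unprojectable()   # S1801
        project_gui()                    # S1805 (projected before the check)
        if ui_overlaps(area):            # S1802
            gesture.enabled = False      # S1820: ignore gestures on hidden buttons
            prompt_move()                # S1803
            wait_for_placement()         # S1804 (assumed, see lead-in)
        else:
            gesture.enabled = True       # S1821: buttons are usable again
            return


# Simulated run: the object overlaps the UI once, then is moved away.
gesture = GestureInput()
events = []
areas = iter(["overlapping", "clear"])
gui_display_projected_first(
    compute_unprojectable=lambda: next(areas),
    ui_overlaps=lambda area: area == "overlapping",
    project_gui=lambda: events.append("project"),
    prompt_move=lambda: events.append(f"prompt (gesture enabled: {gesture.enabled})"),
    wait_for_placement=lambda: events.append("wait"),
    gesture=gesture,
)
print(events, gesture.enabled)
```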

スキャン実行処理開始時には、図１６(b)に示したようなスキャン開始画面が、ＧＵＩ部品生成表示部414を介して書画台204に投射されている。2Ｄスキャンボタン1642は、平面原稿の撮影指示を受け付けるボタンである。書籍スキャンボタン1643は、書籍原稿の撮影指示を受け付けるボタンである。3Ｄスキャンボタン1644は、立体形状の測定指示を受け付けるボタンである。
ユーザインタフェース部403は、前述したようにジェスチャー認識部409から通知されるタッチジェスチャーの座標と、これらのボタンを表示している座標とから、何れのボタンがユーザによって押下されたかを検知する。 At the start of the scan execution process, a scan start screen as shown in FIG. 16B is projected on the document stage 204 via the GUI component generation / display unit 414. The 2D scan button 1642 is a button for receiving an instruction to shoot a flat original. The book scan button 1643 is a button for accepting a book manuscript shooting instruction. The 3D scan button 1644 is a button for accepting a three-dimensional shape measurement instruction.
As described above, the user interface unit 403 detects which button is pressed by the user from the coordinates of the touch gesture notified from the gesture recognition unit 409 and the coordinates at which these buttons are displayed.
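Detecting which button was pressed from the touch coordinates and the button display coordinates is a point-in-rectangle test. The sketch below assumes rectangular buttons given as (x, y, width, height) in the same coordinate system as the touch gesture; the coordinates and the function name `find_touched_button` are hypothetical.

```python
def find_touched_button(touch, buttons):
    """Return the name of the button whose rectangle contains the touch
    point, or None when the touch falls outside every button."""
    tx, ty = touch
    for name, (x, y, width, height) in buttons.items():
        if x <= tx < x + width and y <= ty < y + height:
            return name
    return None


# Hypothetical layout of the three scan buttons.
buttons = {
    "2D scan button 1642": (100, 400, 120, 60),
    "book scan button 1643": (240, 400, 120, 60),
    "3D scan button 1644": (380, 400, 120, 60),
}

touched = find_touched_button((250, 430), buttons)
missed = find_touched_button((10, 10), buttons)
print(touched, missed)
```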

Hereinafter, the description of this detection by the user interface unit 403 is abbreviated to "detects a touch on a button."
When the user interface unit 403 detects a touch on the 2D scan button 1642, the book scan button 1643, or the 3D scan button 1644, it executes the selected scan.

また、ユーザが選択したスキャンの実行開始指示を受け付けるスキャン開始ボタンを別途配置し、2Ｄスキャンボタン1642、書籍スキャンボタン1643、3Ｄスキャンボタン1644のそれぞれを排他的に選択できるようにしてもよい。その際には、ユーザインタフェース部403は、ユーザによる何れかのボタンへのタッチを検知すると、タッチされたボタンを選択状態とし、他のボタンの選択を解除する。 Further, a scan start button for receiving an instruction to start execution of a scan selected by the user may be separately arranged so that the 2D scan button 1642, the book scan button 1643, and the 3D scan button 1644 can be exclusively selected. At that time, when the user interface unit 403 detects a touch on any of the buttons by the user, the user interface unit 403 sets the touched button in a selected state and cancels selection of the other buttons.

ここで、図１５の説明に戻り、スキャン実行処理では、Ｓ1531で、メイン制御部402は、スキャン開始ボタンのタッチを検知するまで待つ。メイン制御部402は、スキャン開始ボタンとしてのメニューボタンとして機能する2Ｄスキャンボタン1642、書籍スキャンボタン1643、3Ｄスキャンボタン1644のタッチを検知する。ここで、メイン制御部402は、2Ｄスキャンボタン1642がタッチされたと判断した場合、処理をＳ1532へ進める。Ｓ1532で、メイン制御部402は、平面原稿画像撮影部411の処理を実行する。 Returning to the description of FIG. 15, in the scan execution process, in S1531, the main control unit 402 waits until a touch of the scan start button is detected. The main control unit 402 detects touch of a 2D scan button 1642, a book scan button 1643, and a 3D scan button 1644 that function as menu buttons as scan start buttons. Here, if the main control unit 402 determines that the 2D scan button 1642 has been touched, the process proceeds to S1532. In step S1532, the main control unit 402 executes processing of the flat document image photographing unit 411.

On the other hand, if the main control unit 402 determines in S1531 that the book scan button 1643 has been touched, the process advances to S1533. In step S1533, the main control unit 402 executes the processing of the book image photographing unit 412.
Further, if the main control unit 402 determines in S1531 that the 3D scan button 1644 has been touched, the process advances to S1534. In step S1534, the main control unit 402 executes the processing of the three-dimensional shape measurement unit 413. When the processing of any of S1532 to S1534 ends, the main control unit 402 ends the scan execution process.
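The branching from S1531 to S1532, S1533, or S1534 can be expressed as a lookup from the touched button to the unit that handles it. This is a sketch with hypothetical handler functions that only return a description; in the apparatus they would correspond to the flat document image photographing unit 411, the book image photographing unit 412, and the three-dimensional shape measurement unit 413.

```python
def run_scan(touched_button, handlers):
    """Dispatch the scan that corresponds to the touched button (S1531)."""
    handler = handlers.get(touched_button)
    if handler is None:
        raise ValueError(f"no scan is assigned to button {touched_button}")
    return handler()


# Keys are the button reference numerals from the text; handlers are placeholders.
handlers = {
    1642: lambda: "S1532: flat document image photographing unit 411",
    1643: lambda: "S1533: book image photographing unit 412",
    1644: lambda: "S1534: three-dimensional shape measurement unit 413",
}

result = run_scan(1644, handlers)
print(result)
```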

When the scan execution process of S1503 shown in FIG. 15 ends, the main control unit 402 next performs the object removal waiting process of S1504.
When the object removal waiting process starts, in S1541 the main control unit 402 displays a scan end screen via the GUI component generation/display unit 414. For example, the GUI component generation/display unit 414 generates and projects a GUI component for a message 1645 that notifies the user that the scan has ended, as shown in FIG. 16(e).

Ｓ1542で、メイン制御部402は、物体検知部410からの物体除去通知を受信するのを待つ。ここで、物体除去通知は、物体検知部410が図８のＳ834で通知するものである。Ｓ1542でメイン制御部402は、物体除去通知を受信した場合、メイン制御部402は、物体除去待ち処理を終了する。 In step S1542, the main control unit 402 waits for reception of an object removal notification from the object detection unit 410. Here, the object removal notification is made by the object detection unit 410 in S834 in FIG. When the main control unit 402 receives the object removal notification in S1542, the main control unit 402 ends the object removal waiting process.

Ｓ1504の物体除去待ち処理を終了すると、メイン制御部402は、Ｓ1505へ進み、スキャン終了判定処理を行う。メイン制御部402は、ネットワークＩ/Ｆ306を介してホストコンピュータ102から送信されるスキャン終了命令や、ＬＣＤタッチパネル330から入力される終了命令を受け付けた場合にスキャン終了と判定する。
また、メイン制御部402は、タイマー設定等によりスキャン終了判定を行うようにしてもよい。 When the object removal waiting process in S1504 ends, the main control unit 402 advances to S1505 and performs a scan end determination process. When the main control unit 402 receives a scan end command transmitted from the host computer 102 via the network I / F 306 or an end command input from the LCD touch panel 330, the main control unit 402 determines that the scan has ended.
Further, the main control unit 402 may perform scan end determination by setting a timer or the like.

メイン制御部402は、スキャン処理を継続する場合、処理をＳ1501へ戻し、図１６(a)の初期画面を表示して書画台204への物体載置を待つ。このように制御することで、ユーザが複数の原稿をスキャンしたい場合に、書画台204上の原稿を取り換えたことを検知することができ、複数の原稿のスキャンを連続して実行できる。 When continuing the scanning process, the main control unit 402 returns the process to S1501, displays the initial screen of FIG. 16A, and waits for the object placement on the document table 204. By controlling in this way, when the user wants to scan a plurality of documents, it can detect that the document on the document table 204 has been replaced, and a plurality of documents can be continuously scanned.

With the above processing, when the GUI component generation/display unit 414 projects GUI components for a menu such as starting scan execution, it can project them so that they do not overlap the scan target placed on the document table 204. This prevents the visibility of the projected GUI components from being reduced.

In this embodiment, message displays that prompt an operation instruction and menu buttons for scan execution have been described as GUI components, but a GUI component may also be, for example, a preview display for supporting or confirming an operation.

The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

101 Camera scanner
201 Controller unit
202 Camera unit
204 Document table
207 Projector
208 Distance image sensor unit

Claims

An image processing apparatus comprising:
first acquisition means for capturing an image of a subject placed on a document table and acquiring image data;
second acquisition means for acquiring distance image data;
projection means for projecting, at a projection position, an operation instruction image for receiving an operation instruction to read the subject on the document table;
recognition means for recognizing, based on the distance image data acquired by the second acquisition means, a gesture by which a user operates the operation instruction image on the document table;
specifying means for specifying, according to a three-dimensional shape of the subject placed on the document table, an area onto which the projection means cannot project the operation instruction image; and
determination means for determining whether the area specified by the specifying means overlaps the area onto which the projection means projects the operation instruction image,
wherein, when the determination means determines that the area specified by the specifying means overlaps the area onto which the projection means projects the operation instruction image, the projection means projects information prompting movement of the subject onto the document table.

The image processing apparatus according to claim 1, wherein, when it is determined that the area specified by the specifying means and the area onto which the operation instruction image is projected do not overlap, the projection means projects the operation instruction image.

The image processing apparatus according to claim 1, further comprising judging means for judging, based on the image data acquired by the first acquisition means, whether the subject placed on the document table has been moved after the projection means has projected the information prompting movement of the subject onto the document table,
wherein, when the determination means determines that the area specified by the specifying means and the area onto which the operation instruction image is projected do not overlap, a gesture on the projected operation instruction image is recognized and the first acquisition means acquires image data of the subject.

The image processing apparatus according to claim 1, wherein the specifying means specifies the area onto which the projection means cannot project the operation instruction image by combining coordinate information of the image data of the subject acquired by the first acquisition means with coordinate information of the distance image data of the subject acquired by the second acquisition means.

前記書画台に載置される被写体の形状は、平面形状、立体形状、ブック形状を含むことを特徴とする請求項１に記載の画像処理装置。 The image processing apparatus according to claim 1, wherein the shape of the subject placed on the document table includes a planar shape, a three-dimensional shape, and a book shape.

A control method for an image processing apparatus that comprises first acquisition means for capturing an image of a subject placed on a document table and acquiring image data, second acquisition means for acquiring distance image data, projection means for projecting, at a projection position, an operation instruction image for receiving an operation instruction to read the subject on the document table, and recognition means for recognizing, based on the distance image data acquired by the second acquisition means, a gesture by which a user operates the operation instruction image on the document table, the control method comprising:
a specifying step of specifying, according to a three-dimensional shape of the subject placed on the document table, an area onto which the projection means cannot project the operation instruction image; and
a determination step of determining whether the area specified in the specifying step overlaps the area onto which the projection means projects the operation instruction image,
wherein, when it is determined in the determination step that the area specified in the specifying step overlaps the area onto which the projection means projects the operation instruction image, the projection means projects information prompting movement of the subject onto the document table.

請求項６に記載の画像処理装置の制御方法をコンピュータに実行させることを特徴とするプログラム。 A program for causing a computer to execute the control method of the image processing apparatus according to claim 6.