JP2017041668A

JP2017041668A - Image input device, image input system, control method for image input device, and program

Info

Publication number: JP2017041668A
Application number: JP2015160415A
Authority: JP
Inventors: 宗士大志万; Soshi Oshima
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2015-08-17
Filing date: 2015-08-17
Publication date: 2017-02-23

Abstract

PROBLEM TO BE SOLVED: To show the user a placement area in which a placed object can be read in its entirety when the object is photographed from above for reading. SOLUTION: The image input device includes a document table on which an object is placed, a projection means that projects an image onto the document table and its vicinity, and an imaging means that is disposed with respect to the document table and captures a two-dimensional image. The image input device acquires height information on an object placed on the document table and acquires angle-of-view information of the imaging means. On the basis of the acquired height information and the stored angle-of-view information, the image input device derives a placement area of the document table in which the entire object falls within the photographing area of the imaging means, and projects the derived placement area. SELECTED DRAWING: Figure 15

Description

The present invention relates to an image input device, an image input system, a control method for an image input device, and a program.

Conventionally, documents are scanned and stored as electronic data using either a line scanner, which uses a line sensor for imaging, or a camera scanner, which uses a two-dimensional imaging sensor. In particular, with a camera scanner in which a camera is arranged above the document table and the document is placed face up on the table, a single sheet can be scanned quickly just by placing it on the table, and a thick original such as a book can also be placed on the table and scanned easily.

When an object with a three-dimensional shape, such as a book, is captured as a two-dimensional image, the angle of view of the camera must be sufficiently wide. However, when a two-dimensional imaging sensor is used as a document scanner, the angle of view must be narrowed to the area on the document table to make full use of the sensor's resolution, so there is a trade-off.
When the camera is arranged to look down obliquely from above, as in a camera scanner, there is a problem specific to photographing three-dimensional objects: unless the angle of view is sufficiently wide, the upper part of a tall object cannot be captured on the side far from the camera.
The user has no way of knowing whether the entire object can be photographed at the position where it was placed, so reshooting may be required and operability may suffer.
To solve this problem with a narrowed angle of view, the device needs a way either to notify the user in advance of the range that can be photographed or to inform the user when a three-dimensional object is placed at a position where it cannot be photographed. In the technique disclosed in Patent Document 1, to obtain a suitable image of a three-dimensional object, a captured image is compared with an image registered in advance to determine how the object should be moved.

Japanese Patent Laid-Open No. 2007-20197 (JP 2007-20197 A)

However, the technique of Patent Document 1 requires an image to be registered in advance for comparison with the captured image, so it is difficult to determine whether an arbitrary object can be photographed at the moment it is placed.

The present invention has been made to solve the above problem. Its object is to provide a mechanism that can clearly show the user a placement area in which a placed object can be read in its entirety when the object is photographed and read from above.

The image input device of the present invention that achieves the above object has the following configuration.
An image input device comprising: a document table on which an object is placed; projection means for projecting an image onto the document table; imaging means for imaging the object placed on the document table from above the document table; acquisition means for acquiring a distance image of the object placed on the document table; height acquisition means for acquiring height information of the object placed on the document table from the distance image acquired by the acquisition means; and deriving means for deriving, from the acquired height information of the object and the angle-of-view information of the imaging means, a placement area in which the entire object placed on the document table can be photographed within the angle of view of the imaging means, wherein the projection means projects the derived placement area.
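The derivation of the placement area from the object height and the angle of view can be illustrated with a simple geometric sketch. The code below is not the method claimed in this document; it assumes an ideal pinhole camera whose viewing frustum meets the table in an axis-aligned rectangle (the reading area), and an upright object of uniform height. The function name `placement_area` and all parameters are hypothetical. By similar triangles, the cross-section of the frustum at the object's top is the reading area scaled toward the point directly below the camera by the factor (camera height - object height) / camera height.

```python
from typing import Optional, Tuple

# (x_min, y_min, x_max, y_max) on the table plane, e.g. in millimetres
Rect = Tuple[float, float, float, float]


def placement_area(reading_area: Rect, camera_xy: Tuple[float, float],
                   camera_height: float, object_height: float) -> Optional[Rect]:
    """Return the table region in which an upright object of the given height
    stays inside the viewing frustum, or None if no such region exists."""
    if not 0 <= object_height < camera_height:
        return None
    # Similar-triangles factor for the frustum cross-section at the object's top.
    s = (camera_height - object_height) / camera_height
    cx, cy = camera_xy
    x0, y0, x1, y1 = reading_area
    # Reading area scaled toward the point on the table directly below the camera.
    top = (cx + s * (x0 - cx), cy + s * (y0 - cy),
           cx + s * (x1 - cx), cy + s * (y1 - cy))
    # The base must lie in the reading area and the top in the cross-section.
    area = (max(x0, top[0]), max(y0, top[1]), min(x1, top[2]), min(y1, top[3]))
    if area[0] >= area[2] or area[1] >= area[3]:
        return None
    return area


# Camera 500 mm above the table, 100 mm behind the near edge of a 400 x 300 mm area.
area_for_tall_object = placement_area((0, 0, 400, 300), (200, -100), 500, 125)
```

In this sketch the far edge of the usable area moves toward the camera as the object gets taller, which matches the problem described above for obliquely mounted cameras.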

According to the present invention, when a placed object is photographed and read from above, a placement area in which the entire object can be read can be clearly shown to the user.

FIG. 1 is a diagram explaining the configuration of an information input system including an image processing apparatus.
FIG. 2 is a diagram showing a configuration example of the camera scanner.
FIG. 3 is a diagram showing a hardware configuration example of the controller unit.
FIG. 4 is a diagram explaining the configuration of the control program of the input assistance apparatus.
FIG. 5 is a diagram explaining the distance image acquisition processing.
FIG. 6 is a flowchart explaining a control method of the image processing apparatus.
FIG. 7 is a diagram schematically showing the fingertip detection method.
FIG. 8 is a flowchart explaining a control method of the image processing apparatus.
FIG. 9 is a flowchart explaining a control method of the image processing apparatus.
FIG. 10 is a schematic diagram for explaining the processing of the flat document image photographing unit.
FIG. 11 is a flowchart explaining a control method of the image processing apparatus.
FIG. 12 is a schematic diagram for explaining the processing of the book image photographing unit.
FIG. 13 is a flowchart explaining a control method of the image processing apparatus.
FIG. 14 is a schematic diagram for explaining the processing of the three-dimensional shape measurement unit.
FIG. 15 is a flowchart explaining a control method of the image processing apparatus.
FIG. 16 is a transition diagram showing the book document input processing state in the image processing apparatus.
FIG. 17 is a transition diagram showing the book document input processing state in the image processing apparatus.
FIG. 18 is a transition diagram showing the book document input processing state in the image processing apparatus.
FIG. 19 is a transition diagram showing the book document input processing state in the image processing apparatus.
FIG. 20 is a flowchart explaining a control method of the image processing apparatus.

Next, the best mode for carrying out the present invention will be described with reference to the drawings.
<Description of system configuration>
[First Embodiment]
FIG. 1 is a diagram illustrating the configuration of an image input system including an image processing apparatus according to the present embodiment. In this system, a camera scanner 101 is connected to a host computer 102 and a printer 103 via a network 104 such as Ethernet (registered trademark).

In FIG. 1, in response to instructions from the host computer 102, the system can execute a scan function that reads an image with the camera scanner 101 and a print function that outputs the scan data on the printer 103. The scan and print functions can also be executed by direct instruction to the camera scanner 101, without going through the host computer 102.

<Configuration of camera scanner>
FIG. 2 is a diagram showing a configuration example of the camera scanner 101 shown in FIG. 1.
As illustrated in FIG. 2(a), the camera scanner 101 includes a controller unit 201, a camera unit 202, an arm unit 203, a short-focus projector unit 207, and a distance image sensor unit 208. The controller unit 201, which is the main body of the camera scanner, is connected by the arm unit 203 to the camera unit 202 for imaging, the short-focus projector unit 207, and the distance image sensor unit 208. The arm unit 203 can be bent and extended by means of joints.

FIG. 2(a) also shows the document table 204 on which the camera scanner 101 is installed. The lenses of the camera unit 202 and the distance image sensor unit 208 are directed toward the document table 204 and can read an image within the reading area 205 enclosed by the broken line.
In the example of FIG. 2(a), the document 206 is placed within the reading area 205, so it can be read by the camera scanner 101. A turntable 209 is provided in the document table 204. The turntable 209 can rotate in response to instructions from the controller unit 201, changing the angle between an object placed on the turntable 209 and the camera unit 202. The camera unit 202 may capture images at a single resolution, but it is preferably capable of both high-resolution and low-resolution image capture.
Although not shown in FIG. 2, the camera scanner 101 can further include an LCD touch panel 330 and a speaker 340.

FIG. 2(b) shows the coordinate systems of the camera scanner 101. In the camera scanner 101, a coordinate system is defined for each hardware device: a camera coordinate system, a distance image coordinate system, and a projector coordinate system. In each, the image plane captured by the camera unit 202 or by the RGB camera 363 of the distance image sensor unit 208, or the image plane projected by the projector unit 207, is taken as the XY plane, and the direction orthogonal to the image plane is taken as the Z direction. Furthermore, so that three-dimensional data from these independent coordinate systems can be handled in a unified manner, an orthogonal coordinate system is defined in which the plane containing the document table 204 is the XY plane and the direction perpendicular to and upward from this plane is the Z axis.

As an example of coordinate system conversion, FIG. 2(c) shows the relationship among the orthogonal coordinate system, the space expressed in the camera coordinate system centered on the camera unit 202, and the image plane captured by the camera unit 202. A three-dimensional point P [X, Y, Z] in the orthogonal coordinate system can be converted to a three-dimensional point Pc [Xc, Yc, Zc] in the camera coordinate system by equation (1).
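Equation (1) itself does not appear in this text. Assuming the usual rigid-body convention, with the 3 × 3 rotation matrix Rc and the translation vector tc described in the next paragraph, it would take the following form (a reconstruction from those definitions, not a quotation):

```latex
\begin{equation}
\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}
= \left[\, R_c \mid t_c \,\right]
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
\tag{1}
\end{equation}
```

Written out, this is Pc = Rc P + tc.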

Here, Rc and tc are the external parameters determined by the orientation (rotation) and position (translation) of the camera with respect to the orthogonal coordinate system; Rc is called the 3 × 3 rotation matrix and tc the translation vector. Conversely, a three-dimensional point defined in the camera coordinate system can be converted to the orthogonal coordinate system by equation (2).
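Equation (2) is likewise not reproduced here. Assuming the forward conversion is the rigid-body transform Pc = Rc P + tc, solving for P gives P = Rc^{-1}(Pc - tc), which in matrix form would read (again a reconstruction under that assumption):

```latex
\begin{equation}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
= \left[\, R_c^{-1} \mid -R_c^{-1} t_c \,\right]
\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix}
\tag{2}
\end{equation}
```

Because Rc is a rotation matrix, its inverse equals its transpose.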

Further, the two-dimensional camera image plane photographed by the camera unit 202 is the result of the camera unit 202 converting three-dimensional information in three-dimensional space into two-dimensional information. That is, a three-dimensional point Pc [Xc, Yc, Zc] in the camera coordinate system is converted to two-dimensional coordinates pc [xp, yp] on the camera image plane by the perspective projection of equation (3).
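Equation (3) is not reproduced in this text. Assuming the standard pinhole camera model with a 3 × 3 internal parameter matrix A, it would take the following form, where λ is a homogeneous scale factor (a symbol introduced here for illustration):

```latex
\begin{equation}
\lambda
\begin{bmatrix} x_p \\ y_p \\ 1 \end{bmatrix}
= A
\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}
\tag{3}
\end{equation}
```

When the last row of A is [0, 0, 1], λ equals Zc, so the image coordinates are obtained by dividing the first two components of A Pc by Zc.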

Here, A is called the internal parameter of the camera and is a 3 × 3 matrix expressed in terms of the focal length, the image center, and the like.

As described above, by using equations (1) and (3), a three-dimensional point group expressed in the orthogonal coordinate system can be converted to three-dimensional point group coordinates in the camera coordinate system and to the camera image plane. The internal parameters of each hardware device and its position and orientation with respect to the orthogonal coordinate system (external parameters) are assumed to have been calibrated in advance by a known calibration method. Hereinafter, unless otherwise noted, the term three-dimensional point group refers to three-dimensional data in the orthogonal coordinate system.
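The conversion chain described above can be sketched in Python. This is a minimal illustration rather than the device's implementation. It assumes that equation (1) has the rigid-body form Pc = Rc P + tc, that equation (2) is its inverse, and that equation (3) is a pinhole projection with a homogeneous scale factor. The function names and the calibration values in the example are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]


def mat_vec(m: Mat3, v: Sequence[float]) -> Vec3:
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


def transpose(m: Mat3) -> List[List[float]]:
    """Transpose a 3x3 matrix (equal to the inverse for a rotation matrix)."""
    return [[m[c][r] for c in range(3)] for r in range(3)]


def world_to_camera(p: Vec3, rc: Mat3, tc: Vec3) -> Vec3:
    """Assumed form of equation (1): Pc = Rc * P + tc."""
    x, y, z = mat_vec(rc, p)
    return (x + tc[0], y + tc[1], z + tc[2])


def camera_to_world(pc: Vec3, rc: Mat3, tc: Vec3) -> Vec3:
    """Assumed form of equation (2): P = Rc^-1 * (Pc - tc)."""
    d = (pc[0] - tc[0], pc[1] - tc[1], pc[2] - tc[2])
    return mat_vec(transpose(rc), d)


def camera_to_image(pc: Vec3, a: Mat3) -> Tuple[float, float]:
    """Assumed form of equation (3): perspective projection with matrix A."""
    u, v, w = mat_vec(a, pc)
    if w <= 0:
        raise ValueError("point is not in front of the camera")
    return (u / w, v / w)


# Hypothetical calibration: camera 500 mm above the table, looking straight down.
RC = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
TC = (0.0, 0.0, 500.0)
A = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 480.0], [0.0, 0.0, 1.0]]

pc = world_to_camera((100.0, 50.0, 0.0), RC, TC)
pixel = camera_to_image(pc, A)
```

In practice Rc, tc, and A would come from the calibration mentioned above; the values here only make the example runnable.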

<Hardware configuration of camera scanner controller>
FIG. 3 is a diagram illustrating a hardware configuration example of the controller unit 201, which is the main body of the camera scanner 101 shown in FIG. 1.
In FIG. 3, the controller unit 201 includes a CPU 302, a RAM 303, a ROM 304, an HDD 305, a network I/F 306, an image processor 307, a camera I/F 308, a display controller 309, a serial I/F 310, an audio controller 311, and a USB controller 312, all connected to a system bus 301.

The CPU 302 is a central processing unit that controls the overall operation of the controller unit 201. The RAM 303 is a volatile memory. The ROM 304 is a non-volatile memory that stores the startup program for the CPU 302. The HDD 305 is a hard disk drive with a larger capacity than the RAM 303. The HDD 305 stores the control program for the camera scanner 101, which is executed by the controller unit 201.

At startup, such as when the power is turned on, the CPU 302 executes the startup program stored in the ROM 304. The startup program reads the control program stored in the HDD 305 and loads it into the RAM 303. After executing the startup program, the CPU 302 executes the control program loaded into the RAM 303 and performs control. The CPU 302 also stores data used by the control program in the RAM 303 and reads and writes it there. The HDD 305 can further store various settings required by the control program and image data generated from camera input, which are read and written by the CPU 302. The CPU 302 communicates with other devices on the network 104 via the network I/F 306.

The image processor 307 reads image data stored in the RAM 303, processes it, and writes it back to the RAM 303. The image processing executed by the image processor 307 includes rotation, scaling, color conversion, and the like.

The camera I/F 308 is connected to the camera unit 202 and the distance image sensor unit 208. In response to instructions from the CPU 302, it acquires image data from the camera unit 202 and distance image data from the distance image sensor unit 208 and writes them to the RAM 303. It also transmits control commands from the CPU 302 to the camera unit 202 and the distance image sensor unit 208 to configure them.

The controller unit 201 can further include at least one of the display controller 309, the serial I/F 310, the audio controller 311, and the USB controller 312.

The display controller 309 controls the display of image data on a display in response to instructions from the CPU 302. Here, the display controller 309 is connected to the short-focus projector unit 207 and the LCD touch panel 330.

The serial I/F 310 inputs and outputs serial signals. Here, the serial I/F 310 is connected to the turntable 209 and transmits the CPU 302's instructions for starting and stopping rotation and for the rotation angle to the turntable 209. The serial I/F 310 is also connected to the LCD touch panel 330; when the LCD touch panel 330 is pressed, the CPU 302 acquires the pressed coordinates via the serial I/F 310.
The audio controller 311 is connected to the speaker 340, converts audio data into an analog audio signal in response to instructions from the CPU 302, and outputs sound through the speaker 340.

The USB controller 312 controls external USB devices in response to instructions from the CPU 302. Here, the USB controller 312 is connected to an external memory 350, such as a USB memory or an SD card, and reads and writes data on the external memory 350.

<Description of distance image sensor and distance image acquisition unit>
FIG. 3 shows the configuration of the distance image sensor unit 208. The distance image sensor unit 208 is a distance image sensor that uses infrared pattern projection. The infrared pattern projection unit 361 projects a three-dimensional shape measurement pattern onto the object using infrared light, which is invisible to the human eye. The infrared camera 362 reads the three-dimensional shape measurement pattern projected onto the object. The RGB camera 363 captures visible light as RGB signals.

<Functional configuration of camera scanner control program>
FIG. 4 is a diagram explaining the configuration of the control program for the camera scanner 101 shown in FIG. 1. FIG. 4(a) shows the functional configuration 401 of the control program for the camera scanner 101 executed by the CPU 302. FIG. 4(b) is a sequence diagram showing the relationships among the modules of the functional configuration 401.
As described above, the control program for the camera scanner 101 is stored in the HDD 305, and the CPU 302 loads it into the RAM 303 and executes it at startup. The main control unit 402 is the center of control and controls the other modules 402, 403, 409, 410, and 416 in the functional configuration 401 as shown in FIG. 4(b).

The image acquisition unit 416 is a module that performs image input processing and consists of a camera image acquisition unit 407 and a distance image acquisition unit 408. The camera image acquisition unit 407 acquires the image data output by the camera unit 202 via the camera I/F 308 and stores it in the RAM 303. The distance image acquisition unit 408 acquires the distance image data output by the distance image sensor unit 208 via the camera I/F 308 and stores it in the RAM 303. Details of the processing of the distance image acquisition unit 408 will be described later with reference to FIG. 5.

The recognition processing unit 417 is a module that detects and recognizes the movement of objects on the document table 204 from the image data acquired by the camera image acquisition unit 407 and the distance image acquisition unit 408, and consists of a gesture recognition unit 409 and an object detection unit 410. The gesture recognition unit 409 continuously acquires images of the document table 204 from the image acquisition unit 416 and notifies the main control unit 402 when it detects a gesture such as a touch. When the object detection unit 410 receives a notification of an object placement waiting process or an object removal waiting process from the main control unit 402, it acquires an image of the document table 204 from the image acquisition unit 416 and detects the timing at which an object is placed on the document table 204 and comes to rest, or the timing at which the object is removed. Details of the processing of the gesture recognition unit 409 and the object detection unit 410 will be described later with reference to FIGS. 6 to 8.

The scan processing unit 418 is the module that actually scans the object and consists of a flat document image photographing unit 411, a book image photographing unit 412, and a three-dimensional shape measurement unit 413. The flat document image photographing unit 411 executes processing suited to flat documents, the book image photographing unit 412 to books, and the three-dimensional shape measurement unit 413 to three-dimensional objects, and each outputs data in the corresponding format. Details of the processing of these modules will be described later with reference to FIGS. 9 to 14.

The user interface unit 403 consists of a GUI component generation and display unit 414 and a projection area detection unit 415. The GUI component generation and display unit 414 receives requests from the main control unit 402 and generates GUI components such as messages and buttons. It then requests the display unit 406 to display the generated GUI components. The display location of the GUI components on the document table 204 is detected by the projection area detection unit 415. The display unit 406 displays the requested GUI components on the short-focus projector unit 207 or the LCD touch panel 330 via the display controller 309. Because the projector unit 207 is installed facing the document table 204, it can project GUI components onto the document table 204. The user interface unit 403 also receives gesture operations such as touches recognized by the gesture recognition unit 409, or input operations from the LCD touch panel 330 via the serial I/F 310, together with their coordinates. The user interface unit 403 then determines the operation content (such as which button was pressed) by matching the operation coordinates against the content of the operation screen being drawn. It accepts the operator's operation by notifying the main control unit 402 of this operation content.
The network communication unit 404 communicates with other devices on the network 104 by TCP/IP via the network I/F 306.
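The way the user interface unit 403 matches operation coordinates against the drawn operation screen can be illustrated as a simple hit test. The sketch below is an assumption about how such matching might be done, not the device's implementation. The `Button` class, its rectangular layout, and the rule that later-drawn components take priority are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Button:
    """A rectangular GUI component in screen (projection) coordinates."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        """True if the point lies inside the button's rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def hit_test(buttons: Sequence[Button], px: float, py: float) -> Optional[str]:
    """Return the name of the topmost button at (px, py), or None.

    Components later in the sequence are assumed to be drawn on top.
    """
    for button in reversed(buttons):
        if button.contains(px, py):
            return button.name
    return None


screen = [Button("scan", 10, 10, 100, 40), Button("cancel", 120, 10, 100, 40)]
pressed = hit_test(screen, 50, 30)
```

The returned name stands in for the operation content that would be reported to the main control unit 402.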

The data management unit 405 stores and manages various data, such as work data generated during execution of the control program included in the functional configuration 401, in a predetermined area on the HDD 305. Examples include the scan data generated by the flat document image photographing unit 411, the book image photographing unit 412, and the three-dimensional shape measurement unit 413.

FIG. 5 is a diagram explaining the distance image acquisition processing according to the present embodiment. FIG. 5(a) is a flowchart illustrating a processing example of the distance image acquisition unit 408. Each step is realized by the CPU 302 executing the stored control program. FIGS. 5(b) to 5(d) are diagrams explaining the principle of distance image measurement by the pattern projection method.

距離画像取得部４０８が処理を開始すると、Ｓ５０１では、図５（ｂ）に示すように赤外線パターン投射部３６１を用いて赤外線による３次元形状測定パターン５２２を対象物５２１に投射する。Ｓ５０２では、ＲＧＢカメラ３６３を用いて対象物を撮影したＲＧＢカメラ画像５２３および、赤外線カメラ３６２を用いてＳ５０１で投射した３次元形状測定パターン５２２を撮影した赤外線カメラ画像５２４を取得する。
なお、赤外線カメラ３６２とＲＧＢカメラ３６３とでは設置位置が異なるため、図５の（ｃ）に示すようにそれぞれで撮影される２つのＲＧＢカメラ画像５２３および赤外線カメラ画像５２４の撮影領域が異なる。そこで、赤外線カメラ３６２の座標系からＲＧＢカメラ３６３の座標系への座標系変換を用いて赤外線カメラ画像５２４をＲＧＢカメラ画像５２３の座標系に合わせる。なお、赤外線カメラ３６２とＲＧＢカメラ３６３の相対位置や、それぞれの内部パラメータは事前のキャリブレーション処理により既知であるとする。Ｓ５０３では、図５の（ｃ）、（ｄ）に示すように、３次元形状測定パターン５２２と座標変換を行った赤外線カメラ画像５２４間での対応点を抽出する。例えば、赤外線カメラ画像５２４上の１点を３次元形状測定パターン５２２上から探索して、同一の点が検出された場合に対応付けを行う。あるいは、赤外線カメラ画像５２４の画素の周辺のパターンを３次元形状測定パターン５２２上から探索し、一番類似度が高い部分と対応付けてもよい。
When the distance image acquisition unit 408 starts processing, in S501 the infrared pattern projection unit 361 projects a three-dimensional shape measurement pattern 522 in infrared light onto the object 521, as shown in FIG. 5. In S502, an RGB camera image 523 of the object is captured with the RGB camera 363, and an infrared camera image 524 of the three-dimensional shape measurement pattern 522 projected in S501 is captured with the infrared camera 362.
Because the infrared camera 362 and the RGB camera 363 are installed at different positions, the RGB camera image 523 and the infrared camera image 524 cover different shooting areas, as shown in FIG. 5C. The infrared camera image 524 is therefore aligned with the coordinate system of the RGB camera image 523 by a coordinate transformation from the coordinate system of the infrared camera 362 to that of the RGB camera 363. The relative positions of the infrared camera 362 and the RGB camera 363 and their internal parameters are assumed to be known from prior calibration. In S503, as shown in FIGS. 5C and 5D, corresponding points are extracted between the three-dimensional shape measurement pattern 522 and the coordinate-transformed infrared camera image 524. For example, a point on the infrared camera image 524 is searched for in the three-dimensional shape measurement pattern 522, and the two are associated when the same point is found. Alternatively, the pattern surrounding a pixel of the infrared camera image 524 may be searched for in the three-dimensional shape measurement pattern 522 and associated with the most similar portion.
In S504, the distance from the infrared camera 362 is calculated by triangulation, using the straight line connecting the infrared pattern projection unit 361 and the infrared camera 362 as the base line 525. For each pixel that was associated in S503, the distance from the infrared camera 362 is calculated and stored as the pixel value; for each pixel that could not be associated, an invalid value is stored to mark that the distance could not be measured. Performing this for every pixel of the infrared camera image 524 that was coordinate-transformed in S503 yields a distance image in which each pixel holds a distance value. In S505, the RGB value of the RGB camera image is acquired for each pixel of the distance image, producing a distance image with four values per pixel: R, G, B, and distance. The distance image acquired here is referenced to the distance image sensor coordinate system defined by the RGB camera 363 of the distance image sensor unit 208.
Therefore, in S506, to bring the coordinate systems into agreement, the distance data obtained in the distance image sensor coordinate system is converted into a three-dimensional point group in the orthogonal coordinate system, as described above with reference to FIG. 2B. Hereinafter, unless otherwise specified, the term three-dimensional point group means a three-dimensional point group in the orthogonal coordinate system. In S507, the RGB value is stored for each pixel of the distance image, and the process ends.
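The triangulation of S504 can be illustrated with the standard rectified pinhole relation, which this description does not spell out. The following is a minimal sketch under that assumption: the projector and the infrared camera are treated as a rectified pair so that a pattern point shifts only along the base line. The function name and the parameters `focal_px` and `baseline_mm` are hypothetical, and `None` stands in for the invalid value stored for unmatched pixels.

```python
from typing import Optional


def depth_from_disparity(x_pattern: float, x_observed: float,
                         focal_px: float, baseline_mm: float) -> Optional[float]:
    """Distance along the optical axis for one matched pattern point.

    Assumes a rectified projector/camera pair (an illustrative simplification).
    Returns None, standing in for the "invalid value", when the disparity
    is not positive and no distance can be computed.
    """
    disparity = x_pattern - x_observed
    if disparity <= 0:
        return None
    # Similar triangles: depth = focal length * base line / disparity.
    return focal_px * baseline_mm / disparity


if __name__ == "__main__":
    print(depth_from_disparity(120.0, 100.0, focal_px=600.0, baseline_mm=75.0))
```

A real sensor would also use the calibrated relative pose and internal parameters mentioned above; the sketch only shows why a longer base line or a larger shift of the pattern gives a better-conditioned distance estimate.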

As described above, the present embodiment uses an infrared pattern projection method for the distance image sensor unit 208, but a distance image sensor of another type may be used. For example, a stereo method that obtains depth from two RGB cameras, or a TOF (Time of Flight) method that measures distance by detecting the flight time of laser light, may be used.

<Description of gesture recognition unit>
FIG. 6 is a flowchart illustrating a control method of the image processing apparatus according to the present embodiment. This example shows the processing of the gesture recognition unit 409 illustrated in FIG. 4.
In FIG. 6, when the gesture recognition unit 409 starts processing, initialization is performed in S601. During initialization, the gesture recognition unit 409 acquires one frame of the distance image from the distance image acquisition unit 408. Because no object is placed on the document table 204 when the gesture recognition unit starts, the plane of the document table 204 is recognized as the initial state. That is, the widest plane is extracted from the acquired distance image, and its position and normal vector (hereinafter referred to as the plane parameters of the document table 204) are calculated and stored in the RAM 303.

Next, in S602, the three-dimensional point group of an object on the document table 204 is acquired, as detailed in S621 and S622. In S621, one frame of the distance image and the three-dimensional point group is acquired from the distance image acquisition unit 408. In S622, the plane parameters of the document table 204 are used to remove, from the acquired three-dimensional point group, the points lying on the plane that contains the document table 204.
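The plane removal of S622 can be sketched as a point-to-plane distance test. This is a minimal illustration, assuming the plane parameters are a point on the plane plus a normal vector as stored in S601; the function names and the `tolerance` value are hypothetical and not taken from this description.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def distance_to_plane(p: Sequence[float], plane_point: Sequence[float],
                      normal: Sequence[float]) -> float:
    """Signed distance from p to the plane given by a point and a normal."""
    length = math.sqrt(sum(c * c for c in normal))
    return sum((a - b) * n for a, b, n in zip(p, plane_point, normal)) / length


def remove_plane_points(points: List[Point], plane_point: Sequence[float],
                        normal: Sequence[float],
                        tolerance: float = 2.0) -> List[Point]:
    """Keep only the points farther than `tolerance` from the table plane."""
    return [p for p in points
            if abs(distance_to_plane(p, plane_point, normal)) > tolerance]


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (10.0, 5.0, 1.0), (3.0, 3.0, 40.0)]
    print(remove_plane_points(cloud, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)))
```

The same signed distance could serve for the height test of S631 and the touch test of S642, since both compare a point against the plane containing the document table 204.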

In S603, the shape of the user's hand and the fingertips are detected from the acquired three-dimensional point group, as detailed in S631 to S634. The fingertip detection method is described with reference to the schematic diagrams of FIG. 7.
In S631, the three-dimensional point group of the hand is obtained by extracting, from the three-dimensional point group acquired in S602, the skin-colored points located at or above a predetermined height from the plane containing the document table 204. Reference numeral 701 in FIG. 7A denotes the extracted three-dimensional point group of the hand. In S632, a two-dimensional image is generated by projecting the extracted three-dimensional point group of the hand onto the plane of the document table 204, and the outline of the hand is detected.
Reference numeral 702 in FIG. 7A denotes the three-dimensional point group projected onto the plane of the document table 204. The projection is performed by projecting the coordinates of each point using the plane parameters of the document table 204. As shown in FIG. 7B, taking only the xy coordinate values of the projected three-dimensional point group allows it to be handled as a two-dimensional image 703 viewed from the z-axis direction.
At this time, the correspondence between each point of the three-dimensional point group of the hand and each coordinate of the two-dimensional image projected onto the plane of the document table 204 is stored. In S633, for each point on the detected outline of the hand, the curvature of the outline at that point is calculated, and a point whose calculated curvature is smaller than a predetermined value is detected as a fingertip.
FIG. 7C schematically shows how fingertips are detected from the curvature of the outline. Reference numeral 704 denotes some of the points that represent the outline of the two-dimensional image 703 projected onto the plane of the document table 204.
Consider drawing a circle that contains five adjacent points among the outline points such as 704. Circles 705 and 707 are examples. Such a circle is drawn in turn for every outline point, and a point is taken as a fingertip when the diameter of its circle (for example, 706 or 708) is smaller than a predetermined value (referred to here as the curvature being small). Five adjacent points are used in this example, but the number is not limited to five. Although curvature is used here, fingertips may instead be detected by fitting an ellipse to the outline.
In S634, the number of detected fingertips and the coordinates of each fingertip are calculated. Because the correspondence between each point of the two-dimensional image projected onto the document table 204 and each point of the three-dimensional point group of the hand has been stored as described above, the three-dimensional coordinates of each fingertip can be obtained. The method described here detects fingertips from an image obtained by projecting the three-dimensional point group onto a two-dimensional image, but the image used for fingertip detection is not limited to this. For example, the hand region may be extracted by background subtraction on the distance image or from the skin-color region of the RGB camera image, and the fingertips in the hand region may be detected by the same method as above (calculating the curvature of the outline, and so on).
In that case, the detected fingertip coordinates are coordinates on a two-dimensional image such as the RGB camera image or the distance image, so they must be converted into three-dimensional coordinates in the orthogonal coordinate system using the distance information of the distance image at those coordinates. The center of the curvature circle used to detect the fingertip may be used as the fingertip point instead of the point on the outline.
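The circle test of S633 can be sketched as follows. This is an illustrative simplification rather than the method as claimed: instead of a circle containing all five adjacent points, it uses the circle passing through the first, middle, and last points of each five-point window, and it treats the outline as an open sequence, skipping the end points. The function names and the `max_diameter` threshold are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]


def circumcircle_diameter(p: Point2D, q: Point2D, r: Point2D) -> float:
    """Diameter of the circle through three points (inf if collinear)."""
    a, b, c = math.dist(q, r), math.dist(p, r), math.dist(p, q)
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    area = abs(cross) / 2.0
    if area == 0.0:
        return math.inf
    return 2.0 * (a * b * c) / (4.0 * area)


def fingertip_candidates(outline: Sequence[Point2D], max_diameter: float,
                         window: int = 5) -> List[int]:
    """Indices of outline points whose local circle is smaller than the limit."""
    half = window // 2
    hits = []
    for i in range(half, len(outline) - half):
        d = circumcircle_diameter(outline[i - half], outline[i], outline[i + half])
        if d < max_diameter:
            hits.append(i)
    return hits


if __name__ == "__main__":
    finger = [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 5),
              (2, 4), (2, 3), (2, 2), (2, 1), (2, 0)]
    print(fingertip_candidates(finger, max_diameter=2.6))
```

On the straight sides of the finger the three points are collinear and the diameter is infinite, so only the tightly bent tip falls below the threshold.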

In S604, gesture determination is performed from the detected hand shape and fingertips, as detailed in S641 to S646. In S641, it is determined whether exactly one fingertip was detected in S603. If not, the process proceeds to S646 and it is determined that there is no gesture. If exactly one fingertip was detected, the process proceeds to S642, where the distance between the detected fingertip and the plane containing the document table 204 is calculated. In S643, it is determined whether the distance calculated in S642 is less than or equal to a small predetermined value. If so, the process proceeds to S644 and it is determined that a touch gesture occurred, in which the fingertip touched the document table 204. If the distance is greater than the predetermined value, the process proceeds to S645 and it is determined that a fingertip movement gesture occurred (the fingertip is above the document table 204 but is not touching it). In S605, the determined gesture is reported to the main control unit 402, and the process returns to S602 to repeat gesture recognition.
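The branching of S641 to S646 reduces to a small decision function. The sketch below assumes the input is the list of distances between each detected fingertip and the plane of the document table 204; the function name, the string labels, and the `touch_threshold` value are hypothetical.

```python
from typing import Sequence


def classify_gesture(fingertip_heights: Sequence[float],
                     touch_threshold: float = 5.0) -> str:
    """Mirror of S641-S646: exactly one fingertip is required for a gesture."""
    if len(fingertip_heights) != 1:       # S641 -> S646: no gesture
        return "none"
    if fingertip_heights[0] <= touch_threshold:   # S643 YES -> S644
        return "touch"
    return "move"                          # S643 NO -> S645


if __name__ == "__main__":
    print(classify_gesture([1.5]), classify_gesture([30.0]), classify_gesture([]))
```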

<Processing of object detection unit>
FIG. 8 is a flowchart illustrating a control method of the image processing apparatus according to the present embodiment. Each step is realized by the CPU 302 executing a stored control program. This example shows the processing of the object detection unit 410 illustrated in FIG. 4.
When the object detection unit 410 starts processing, initialization is performed in S801 of FIG. 8A, as detailed in S811 to S813.
In S811, one frame of the camera image is acquired from the camera image acquisition unit 407 and one frame of the distance image from the distance image acquisition unit 408. In S812, the acquired camera image is stored as the previous frame camera image. In S813, the acquired camera image and distance image are stored as the document table background camera image and the document table background distance image, respectively. (Hereinafter, “document table background camera image” and “document table background distance image” refer to the camera image and distance image acquired here.) In S802, placement of an object on the document table 204 is detected (object placement detection process). Details are described later with reference to FIG. 8B.
In S803, removal of the object on the document table 204 whose placement was detected in S802 is detected (object removal detection process). Details are described later with reference to FIG. 8C.

FIG. 8B shows the details of the object placement detection process of S802. When the process starts, one frame of the camera image is acquired from the camera image acquisition unit 407 in S821. In S822, the difference between the acquired camera image and the previous frame camera image is calculated, and the absolute values are summed to obtain a difference value.
In S823, it is determined whether the calculated difference value is greater than or equal to a predetermined value. If the difference value is less than the predetermined value, it is determined that there is no object on the document table 204; the process proceeds to S828, stores the camera image of the current frame as the previous frame camera image, and returns to S821.
If the difference value is greater than or equal to the predetermined value in S823, the process proceeds to S824, where the difference value between the camera image acquired in S821 and the previous frame camera image is calculated in the same way as in S822. In S825, it is determined whether the calculated difference value is less than or equal to a predetermined value. If the difference value is greater than the predetermined value, it is determined that the object on the document table 204 is moving; the process proceeds to S828, stores the camera image of the current frame as the previous frame camera image, and returns to S821.
If the difference value is less than or equal to the predetermined value in S825, the process proceeds to S826. In S826, it is determined, from the number of consecutive times S825 has resulted in YES, whether the difference value has remained at or below the predetermined value, that is, whether the object on the document table 204 has remained stationary, for a predetermined number of frames. If not, the process proceeds to S828, stores the camera image of the current frame as the previous frame camera image, and returns to S821.
If it is determined in S826 that the object on the document table 204 has remained stationary for the predetermined number of frames, the process proceeds to S827, notifies the main control unit 402 that an object has been placed, and ends the object placement detection process.
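The placement detection loop can be sketched as frame differencing followed by a stillness counter. This is a minimal illustration under stated assumptions: frames are flat lists of pixel intensities, and an explicit `motion_seen` flag (not named in this description) records that the S823 threshold has been crossed. All function and parameter names are hypothetical.

```python
from typing import Iterable, Optional, Sequence


def frame_difference(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute pixel differences between two frames (as in S822)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def detect_placement(frames: Iterable[Sequence[int]], motion_threshold: int,
                     still_threshold: int, still_frames: int) -> Optional[int]:
    """Return the index of the frame at which placement would be reported."""
    it = iter(frames)
    previous = next(it, None)          # stands in for the frame saved in S812
    if previous is None:
        return None
    motion_seen = False
    still_count = 0
    for index, current in enumerate(it, start=1):
        diff = frame_difference(current, previous)
        previous = current             # S828: keep current frame as previous
        if not motion_seen:
            motion_seen = diff >= motion_threshold   # S823
            continue
        if diff <= still_threshold:    # S825 YES
            still_count += 1
            if still_count >= still_frames:          # S826 YES -> S827
                return index
        else:                          # S825 NO: object still moving
            still_count = 0
    return None


if __name__ == "__main__":
    frames = [[0], [0], [50], [80], [80], [80], [80]]
    print(detect_placement(frames, 30, 5, 3))
```

Requiring several consecutive still frames avoids reporting placement while the user's hand is still adjusting the object.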

FIG. 8C is a detailed flowchart of the object removal detection process of S803. When the process starts, one frame of the camera image is acquired from the camera image acquisition unit 407 in S831. In S832, the difference value between the acquired camera image and the document table background camera image is calculated. In S833, it is determined whether the calculated difference value is less than or equal to a predetermined value.
If the difference value is greater than the predetermined value in S833, an object is still present on the document table 204, so the process returns to S831. If the difference value is less than or equal to the predetermined value in S833, the object is no longer on the document table 204, so the removal is reported to the main control unit 402 (S834) and the object removal detection process ends.

<Description of flat document image photographing unit>
FIG. 9 is a flowchart illustrating a control method of the image processing apparatus according to the present embodiment. Each step is realized by the CPU 302 executing a stored control program. This example describes the processing executed by the flat document image photographing unit 411. FIG. 10 is a schematic diagram for explaining the processing of the flat document image photographing unit 411 illustrated in FIG. 4.
When the flat document image photographing unit 411 starts processing, one frame of the image from the camera unit 202 is acquired via the camera image acquisition unit 407 in S901. Because the coordinate system of the camera unit 202 does not directly face the document table 204, as shown in FIG. 2B, both the object 1001 and the document table 204 appear distorted in the captured image, as shown in FIG. 10A.
In S902, the pixel-by-pixel difference between the document table background camera image and the camera image acquired in S901 is calculated to generate a difference image, which is then binarized so that pixels with a difference are black and pixels without a difference are white. The resulting difference image is therefore an image in which the region of the object 1001 is black (has a difference), like the difference region 1002 in FIG. 10B. In S903, the difference region 1002 is used to extract an image of the object 1001 alone, as shown in FIG. 10C. In S904, gradation correction is applied to the extracted document region image. In S905, the extracted document region image is projectively transformed from the camera coordinate system onto the document table 204, converting it into an image 1003 viewed from directly above the document table 204, as shown in FIG. 10D.
The projective transformation parameters used here can be obtained from the camera coordinate system and the plane parameters calculated in S612 of FIG. 6B, described above, in the processing of the gesture recognition unit 409. As shown in FIG. 10D, the image 1003 may be tilted depending on how the document was placed on the document table 204. Therefore, in S906, the image 1003 is approximated by a rectangle and rotated so that the rectangle becomes horizontal, yielding an untilted image such as the image 1004 shown in FIG. 10E.
As shown in FIG. 10F, the inclinations θ1 and θ2 of the rectangle with respect to a reference line are calculated, and the smaller one (here, θ1) is chosen as the rotation angle of the image 1003. Alternatively, as shown in FIGS. 10G and 10H, OCR may be applied to the character strings contained in the image 1003, and the rotation angle of the image 1003 and its top-bottom orientation may be determined from the inclination of the character strings.
In S907, the extracted image 1004 is compressed and converted to a predetermined image format (for example, JPEG, TIFF, or PDF). The result is then stored as a file in a predetermined area of the HDD 305 via the data management unit 405, and the processing of the flat document image photographing unit 411 ends.
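The choice of rotation angle in S906 can be sketched as follows, assuming that the two inclinations θ1 and θ2 belong to adjacent sides of the approximating rectangle and therefore differ by 90 degrees. Folding the angle of one side into the range from -45 to 45 degrees then selects the smaller of the two. The function name and the corner-based input are hypothetical.

```python
import math
from typing import Tuple


def deskew_angle(corner_a: Tuple[float, float],
                 corner_b: Tuple[float, float]) -> float:
    """Smallest rotation (degrees, in [-45, 45)) aligning a rectangle side
    with the reference axes, given two adjacent corners of the rectangle."""
    dx = corner_b[0] - corner_a[0]
    dy = corner_b[1] - corner_a[1]
    angle = math.degrees(math.atan2(dy, dx))
    # Sides of a rectangle are 90 degrees apart, so fold modulo 90.
    return (angle + 45.0) % 90.0 - 45.0


if __name__ == "__main__":
    side = (math.cos(math.radians(80.0)), math.sin(math.radians(80.0)))
    print(round(deskew_angle((0.0, 0.0), side), 6))
```

Rotating by the negative of the returned value makes the rectangle horizontal; deciding which way is up still needs a separate step such as the OCR-based orientation check mentioned above.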

<Processing of book image photographing unit>
FIG. 11 is a flowchart illustrating a control method of the image processing apparatus according to the present embodiment. Each step is realized by the CPU 302 executing a stored control program. This example shows the processing executed by the book image photographing unit 412 illustrated in FIG. 4. FIG. 12 is a schematic diagram for explaining the processing of the book image photographing unit 412 illustrated in FIG. 4.
In the flowchart of FIG. 11A, when the book image photographing unit 412 starts processing, in S1101 one frame each of the camera image from the camera unit 202 and the distance image from the distance image sensor unit 208 is acquired using the camera image acquisition unit 407 and the distance image acquisition unit 408. FIG. 12A shows an example of the camera image obtained here.
In FIG. 12A, a camera image 1201 containing the document table 204 and the book 1211 to be photographed is obtained. FIG. 12B shows an example of the distance image 1202 obtained here. In FIG. 12B, pixels closer to the distance image sensor unit 208 are drawn in darker colors, and the distance image 1202 contains the distance from the distance image sensor unit 208 to each pixel on the target object 1212.
Also in FIG. 12B, pixels whose distance from the distance image sensor unit 208 is greater than that of the document table 204 are drawn in white, and the part of the target object 1212 that is in contact with the document table 204 (the right-hand page of the target object 1212) is likewise white.

In S1102, the three-dimensional point group of the book object placed on the document table 204 is calculated from the acquired camera image and distance image, as detailed in S1111 to S1116. In S1111, the pixel-by-pixel difference between the camera image 1201 and the document table background camera image is calculated and binarized, generating a camera difference image 1203 in which the object region 1213 is shown in black, as in FIG. 12C. In S1112, the camera difference image 1203 is converted from the camera coordinate system to the distance image sensor coordinate system, generating a camera difference image 1204 containing the object region 1214 as seen from the distance image sensor unit 208, as in FIG. 12D.
In S1113, the pixel-by-pixel difference between the distance image and the document table background distance image is calculated and binarized, generating a distance difference image 1205 in which the object region 1215 is shown in black, as in FIG. 12E. Here, any part of the book 1211 that has the same color as the document table 204 produces only a small difference in pixel value and may therefore be missing from the object region 1213 in the camera difference image 1203.
Likewise, any part of the target object 1212 whose height is nearly the same as that of the document table 204 produces only a small difference in distance from the distance image sensor unit 208 and may therefore be missing from the object region 1215 in the distance difference image 1205. Therefore, in S1114, the sum (union) of the camera difference image 1203 and the distance difference image 1205 is taken to generate the object region image 1206 shown in FIG. 12F, giving the object region 1216. The object region 1216 is the region that differs from the document table 204 in color or in height, and it represents the object region more accurately than either the object region 1213 in the camera difference image 1203 or the object region 1215 in the distance difference image 1205 used alone.
Because the object region image 1206 is in the distance image sensor coordinate system, in S1115 only the object region 1216 of the object region image 1206 can be extracted from the distance image 1202. In S1116, the distance image extracted in S1115 is converted into the orthogonal coordinate system, generating the three-dimensional point group 1217 shown in FIG. 12G. This three-dimensional point group 1217 is the three-dimensional point group of the book object.
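The combination of the two difference images in S1111 to S1114 can be sketched with plain nested lists. The sketch assumes both images are already in the same coordinate system and have the same size; the function names and threshold values are hypothetical, and `True` marks an object pixel (drawn in black in FIG. 12).

```python
from typing import List, Sequence

Mask = List[List[bool]]


def binarize_difference(image: Sequence[Sequence[float]],
                        background: Sequence[Sequence[float]],
                        threshold: float) -> Mask:
    """True where the pixel differs from the background by more than threshold."""
    return [[abs(p - b) > threshold for p, b in zip(row, bg_row)]
            for row, bg_row in zip(image, background)]


def union_masks(a: Mask, b: Mask) -> Mask:
    """Pixel-wise OR: object if it differs in color OR in height."""
    return [[x or y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


if __name__ == "__main__":
    color = binarize_difference([[10, 200, 10]], [[10, 10, 10]], threshold=30)
    depth = binarize_difference([[500, 500, 480]], [[500, 500, 500]], threshold=10)
    print(union_masks(color, depth))
```

In the example, the middle pixel is found only by the color difference and the right pixel only by the height difference, which is the situation the union is meant to cover.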

In S1103, distortion correction of the book image is performed using the acquired camera image and the calculated three-dimensional point group, generating a two-dimensional book image. The processing of S1103 is described in detail with reference to FIG. 11B.

The book image distortion correction process of S1103 is described with reference to the flowchart of FIG. 11B.
When the book image distortion correction process starts, in S1121 the object region image 1206 is converted from the distance image sensor coordinate system to the camera coordinate system. In S1122, the object region is extracted from the camera image 1201 using the object region 1216 of the object region image 1206 converted into the camera coordinate system. In S1123, the extracted object region image is projectively transformed onto the document table plane. In S1124, the projectively transformed object region image is approximated by a rectangle and rotated so that the rectangle becomes horizontal, generating the book image 1208 of FIG. 12H. Because one side of the approximating rectangle of the book image 1208 is parallel to the X axis, distortion correction in the X-axis direction is then applied to the book image 1208. In S1125, the leftmost point of the book image 1208 is taken as P (point P in FIG. 12H). In S1126, the height of point P (h1 in FIG. 12H) is acquired from the three-dimensional point group 1217 of the book object. In S1127, the point separated from point P of the book image 1208 by a predetermined distance (x1 in FIG. 12H) in the X-axis direction is taken as Q (point Q in FIG. 12H). In S1128, the height of point Q (h2 in FIG. 12H) is acquired from the three-dimensional point group 1217. In S1129, the distance between point P and point Q on the book object (l1 in FIG. 12H) is calculated by linear approximation.

In S1130, the distance between P and Q is corrected using the calculated distance l1, and the pixels are copied to the positions of points P' and Q' on the image 1219 in FIG. 12H. In S1131, the processed point Q is taken as the new point P and the process returns to S1128 to repeat the same processing, which corrects the interval between point Q and point R in FIG. 12H and gives the pixels at points Q' and R' on the image 1219. Repeating this for all pixels makes the image 1219 the distortion-corrected image. In S1132, it is determined whether distortion correction has been completed for all points, and if so, the distortion correction process for the book object ends. In this way, the processing of S1102 and S1103 generates a distortion-corrected book image.
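The linear approximation between neighboring points can be read as the length of a straight segment on the page surface, l1 = sqrt(x1^2 + (h2 - h1)^2); this formula is an interpretation of "linear approximation" and is not stated explicitly in this description. Under that assumption, the corrected X positions (P', Q', R', ...) are the running sum of these segment lengths. The function name and inputs below are hypothetical.

```python
import math
from typing import List, Sequence


def flattened_positions(heights: Sequence[float], step: float) -> List[float]:
    """Corrected X positions for points sampled every `step` along X.

    heights[i] is the page height at the i-th sample; each interval is
    replaced by the length of the straight segment on the page surface.
    """
    positions = [0.0]
    for h_prev, h_next in zip(heights, heights[1:]):
        segment = math.hypot(step, h_next - h_prev)   # l = sqrt(x^2 + dh^2)
        positions.append(positions[-1] + segment)
    return positions


if __name__ == "__main__":
    print(flattened_positions([0.0, 3.0, 3.0, 0.0], step=4.0))
```

Where the page rises or falls, the corrected interval is longer than the sampled interval, so the flattened image is stretched in exactly the places where the curved page appeared compressed.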

After the distortion-corrected book image has been generated, gradation correction is applied to it in S1104. In S1105, the generated book image is compressed and converted to a predetermined image format (for example, JPEG, TIFF, or PDF). In S1106, the generated image data is stored as a file in a predetermined area of the HDD 305 via the data management unit 405, and the processing of the book image photographing unit 412 ends.

<Description of the three-dimensional shape measurement unit>
FIG. 13 is a flowchart illustrating a control method of the image processing apparatus according to the present embodiment. Each step is realized by the CPU 302 executing a stored control program. This example shows the processing executed by the three-dimensional shape measurement unit 413. FIG. 14 is a schematic diagram for explaining the processing of the three-dimensional shape measurement unit 413 shown in FIG. 4.
When the three-dimensional shape measurement unit 413 starts processing, in S1301 it sends a rotation instruction to the turntable 209 via the serial I/F 310 and rotates the turntable 209 by a predetermined angle. The smaller the rotation angle, the higher the final measurement accuracy, but the number of measurements and the time required increase accordingly; an angle appropriate for the apparatus should therefore be determined in advance.

In S1302, three-dimensional point group measurement is performed on the object on the turntable 209 provided in the document table 204, using the camera unit 202 and the projector unit 207. The flowchart in FIG. 13B shows the three-dimensional point group measurement process executed in S1302. When this process starts, in S1311 a three-dimensional shape measurement pattern 1402 is projected from the projector unit 207 onto the object 1401 on the turntable 209 shown in FIG. 14A.
In S1312, one frame of the camera image is acquired from the camera unit 202 via the camera image acquisition unit 407. In S1313, corresponding points between the three-dimensional shape measurement pattern 1402 and the acquired camera image are extracted in the same manner as in S504 of FIG. 5. In S1314, the distance at each pixel of the camera image is calculated from the positional relationship between the camera unit 202 and the projector unit 207, and a distance image is generated. The measurement method is the same as the one described for S505 of FIG. 5 in the processing of the distance image acquisition unit 408.
In S1315, each pixel of the distance image is converted into the orthogonal coordinate system to calculate a three-dimensional point group. In S1316, the points lying on the document table plane are removed from the calculated three-dimensional point group using the plane parameters of the document table 204. In S1317, points whose positions deviate greatly from the remaining three-dimensional point group are removed as noise, generating the three-dimensional point group 1403 of the object 1401. In S1318, the three-dimensional shape measurement pattern 1402 projected from the projector unit 207 is turned off. In S1319, a camera image is acquired from the camera unit 202 via the camera image acquisition unit 407 and saved as the texture image seen from that angle, and the three-dimensional point group measurement process ends.
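The two filtering steps S1316 and S1317 can be illustrated with the following sketch. It assumes that the plane parameters are the coefficients (a, b, c, d) of ax + by + cz + d = 0 and that "greatly deviating" points are those far from the component-wise median of the cloud; both the representation and the outlier criterion are assumptions made for illustration, since the source does not specify them. The function names are hypothetical.

```python
import math
import statistics


def remove_plane_points(points, plane, tol):
    """Drop points lying within tol of the plane ax + by + cz + d = 0."""
    a, b, c, d = plane
    norm = math.sqrt(a * a + b * b + c * c)
    return [p for p in points
            if abs(a * p[0] + b * p[1] + c * p[2] + d) / norm > tol]


def remove_outliers(points, factor=3.0):
    """Drop points much farther from the cloud centre than is typical.

    The centre is the component-wise median (robust against the very
    outliers being removed); a point is kept if its distance to the
    centre is at most factor times the median distance.
    """
    if not points:
        return []
    center = tuple(statistics.median(axis) for axis in zip(*points))
    dists = [math.dist(p, center) for p in points]
    limit = factor * statistics.median(dists)
    return [p for p, dist in zip(points, dists) if dist <= limit]
```

Applying `remove_plane_points` first and `remove_outliers` second mirrors the order of S1316 and S1317.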

When the three-dimensional point group measurement process of S1302 is executed for the second and subsequent times, the measurement is performed after the turntable 209 has been rotated in S1301, so the angle of the object 1401 on the turntable 209 relative to the projector unit 207 and the camera unit 202 has changed, as shown in FIG. 14C. Therefore, as shown in FIG. 14D, a three-dimensional point group 1403 is obtained that is seen from a viewpoint different from that of the three-dimensional shape measurement pattern 1402 obtained in S1302. That is, the three-dimensional point group 1403 contains points for portions that were in a blind spot of the camera unit 202 and the projector unit 207, and so could not be calculated, for the three-dimensional shape measurement pattern 1402. (Conversely, the three-dimensional shape measurement pattern 1402 contains points that are not included in the three-dimensional point group 1403.)
Therefore, the two three-dimensional shape measurement patterns 1402 and 1403, which are seen from different viewpoints, are superimposed.

In S1303, the three-dimensional point group 1403 (FIG. 14B) measured in S1302 is rotated in the reverse direction by the angle through which the turntable has rotated from its initial position, yielding a three-dimensional point group 1404 whose position is roughly aligned with the three-dimensional point group 1403.
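The reverse rotation of S1303 can be sketched as below. The sketch assumes that the turntable axis is parallel to the Z axis of the orthogonal coordinate system, passes through a known centre, and that positive angles are counterclockwise when seen from +Z; these conventions and the name `rotate_back` are assumptions, not details given in the source.

```python
import math


def rotate_back(points, angle_deg, center=(0.0, 0.0)):
    """Rotate points by -angle_deg about the vertical axis through center."""
    theta = math.radians(-angle_deg)  # undo the turntable rotation
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated = []
    for x, y, z in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + cos_t * dx - sin_t * dy,
                        cy + sin_t * dx + cos_t * dy,
                        z))  # heights are unchanged by the rotation
    return rotated
```

Because the turntable angle is only known approximately, the result is a rough alignment that the subsequent combination step refines.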

In S1304, the three-dimensional point group calculated in S1303 is combined with the three-dimensional point group that has already been synthesized. The combination uses an ICP (Iterative Closest Point) algorithm based on feature points. In the ICP algorithm, three-dimensional feature points at corners are extracted from each of the two targets of the combination, the three-dimensional shape measurement pattern 1402 and the three-dimensional point group 1404.
Then, correspondences are established between the feature points of the three-dimensional shape measurement pattern 1402 and those of the three-dimensional point group 1404, the distances between all pairs of corresponding points are calculated and summed, and the position at which this sum is minimized is calculated repeatedly while the three-dimensional point group 1404 is moved. When the number of repetitions reaches its upper limit, or when the position that minimizes the sum of the distances between corresponding points has been found, the three-dimensional point group 1404 is moved to that position and superimposed on the three-dimensional point group 1403. The two three-dimensional shape measurement patterns 1402 and 1404 are thereby combined. In this way, the combined three-dimensional point group 1405 (FIG. 14D) is generated, and the three-dimensional point group combination process ends.
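The iteration described above (pair the feature points, sum the distances, move the point group, stop at an iteration limit or when the sum no longer decreases) can be illustrated with a deliberately simplified sketch. It estimates a translation only, whereas a full ICP implementation also estimates rotation; the function names, the stopping tolerance, and the brute-force nearest-neighbour search are assumptions made for illustration.

```python
import math


def nearest(p, targets):
    """Return the target point closest to p (brute-force search)."""
    return min(targets, key=lambda q: math.dist(p, q))


def icp_translation(source, target, max_iter=20, eps=1e-9):
    """Translation-only ICP: return the shift that aligns source to target."""
    shift = [0.0, 0.0, 0.0]
    prev_cost = float("inf")
    for _ in range(max_iter):  # upper limit on the number of repetitions
        moved = [tuple(c + s for c, s in zip(p, shift)) for p in source]
        pairs = [(p, nearest(p, target)) for p in moved]
        cost = sum(math.dist(p, q) for p, q in pairs)  # sum of distances
        if prev_cost - cost < eps:  # no further improvement
            break
        prev_cost = cost
        # With fixed correspondences, the mean offset is the translation
        # that minimizes the sum of squared distances.
        for axis in range(3):
            shift[axis] += sum(q[axis] - p[axis] for p, q in pairs) / len(pairs)
    return tuple(shift)
```

The rough alignment obtained from the turntable angle matters here: nearest-neighbour pairing only finds the right correspondences when the two point groups already start close together.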

When the three-dimensional point group combination process of S1304 ends, in S1305 it is determined whether the turntable 209 has made one full rotation. If it has not, the process returns to S1301, the turntable 209 is rotated further, and S1302 is executed to measure a three-dimensional point group 1406 (FIG. 14E) at another angle. Then, in S1303 to S1304, the already combined three-dimensional point group 1405 is combined with the newly measured three-dimensional point group. Repeating S1301 to S1305 in this way until the turntable 209 has made one full rotation generates a three-dimensional point group covering the entire circumference of the object 1401.
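The loop over S1301 to S1305 determines how many measurements one full rotation takes. The following sketch, in which the function name `measurement_angles` and the use of degrees are assumptions, lists the cumulative turntable angles at which a measurement would be taken for a given step.

```python
def measurement_angles(step_deg):
    """Cumulative angles (degrees) at which a measurement is taken."""
    if step_deg <= 0:
        raise ValueError("step_deg must be positive")
    angles = []
    angle = 0.0
    while angle < 360.0:      # S1305: has one full rotation been made?
        angle += step_deg     # S1301: rotate by the predetermined angle
        angles.append(angle)  # S1302 to S1304: measure and combine here
    return angles
```

Halving the step doubles the number of measurements, which is the accuracy-versus-time trade-off noted for S1301.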

If it is determined in S1305 that the turntable 209 has made one full rotation, the process proceeds to S1306, where a three-dimensional model is generated from the generated three-dimensional point group. When the three-dimensional model generation process starts, in S1331 noise removal and smoothing are applied to the three-dimensional point group. In S1332, the point group is meshed by generating triangular patches from it. In S1333, the texture saved in S1319 is mapped onto the planes obtained by the meshing. A texture-mapped three-dimensional model is thereby generated.

In S1307, the texture-mapped data is converted into a standard three-dimensional model data format such as VRML or STL and stored in a predetermined area of the HDD 305 via the data management unit 405, and the processing of the three-dimensional shape measurement unit 413 ends.

<Description of main control unit>
FIG. 15 is a flowchart illustrating a control method of the image processing apparatus according to the present embodiment. Each step is realized by the CPU 302 executing a stored control program. This example shows the processing of a scan application executed by the main control unit 402 shown in FIG. 4. The scan application described in this embodiment is explained with respect to error handling when an object having a three-dimensional shape, such as a book object, is 2D-scanned.
FIGS. 16, 17, and 18 are transition diagrams showing book document input processing states in the image processing apparatus according to the present embodiment.
FIG. 16A schematically shows the book object 1601 placed on the document table 204 as viewed from the viewpoint of the camera unit 202. Reference numeral 1602 denotes the camera angle of view of the camera unit 202. When a book with some height, such as the book object 1601, is placed on the document table 204, a part 1603 of the book object 1601 may protrude from the camera angle of view 1602 even if the entire bottom surface of the book object 1601 is within it. In this case, the upper surface of the book object 1601 cannot be scanned in its entirety.

FIG. 16B schematically shows the state of FIG. 16A as viewed from the side and from directly above. Reference numeral 1605 denotes the book object viewed from the side. Reference numeral 1608 denotes the camera angle of view (angle-of-view region) viewed from the side. Reference numeral 1607 denotes the shooting range on the document table 204 viewed from the side. Here, a part 1604 of the book object protrudes from the camera angle of view 1608 viewed from the side. This part cannot be photographed.

Similarly, reference numeral 1609 denotes the book object viewed from directly above. Reference numeral 1610 denotes the shooting range on the document table 204 viewed from directly above. Reference numerals 1611 and 1612 denote the portions that cannot be photographed. The purpose of this embodiment is to give the user appropriate feedback in such a case.

When the main control unit 402 starts processing in FIG. 15, in S1501 it performs an object placement waiting process, which waits for an object to be scanned to be placed on the document table 204. When the object placement waiting process starts in S1501, in S1511 the screen of FIG. 17A is projected onto the document table 204 by the projector unit 207 via the user interface unit 403. The screen of FIG. 17A projects a message 1701 prompting the user to place an object on the document table 204. In S1512, the processing of the object detection unit 410 is started. The object detection unit 410 begins executing the process described with reference to the flowchart of FIG. 8. In S1513, the main control unit 402 waits for an object placement notification from the object detection unit 410. When the object detection unit 410 executes the process of S827 in FIG. 8 and notifies the main control unit 402 that an object has been placed, it is determined in S1513 that the notification has been received, and the object placement waiting process ends.

In S1541, the main control unit 402 calculates the area in which the placed object having a three-dimensional shape can be photographed. In the present embodiment, this area is described as being projected onto the surface of the document table 204.

First, in S1542, the main control unit 402 acquires the height information of the object. The height information is calculated from the distance image acquired from the distance image acquisition unit 408. The process of cutting out the book object portion from that distance image and converting it into a three-dimensional point group has already been described in the processing of the book image photographing unit. FIG. 16C schematically shows the distance image sensor unit 208 and its angle of view 1614 viewed from the side.

Here, a method of processing the three-dimensional point group of the book object in the orthogonal coordinate system and calculating the height of the book object is described with reference to schematic diagrams. The viewpoint on the acquired three-dimensional point group of the book object is set in the +z-axis direction of the orthogonal coordinate system, and the resulting two-dimensional image is temporarily stored in the RAM 303. FIG. 18A shows this two-dimensional image, which is an image of the document table viewed from directly above. A rectangular area is detected within the document table area 1802 of this image.
Reference numeral 1803 in FIG. 18B denotes the detected rectangular area. The x' axis is set along the long side of the detected rectangle 1803. The origin may be, for example, the vertex of the rectangle 1803 closest to the origin of the orthogonal coordinate system. Similarly, the y' axis is set along the short side of the rectangle 1803. Extracting only the detected rectangle 1803 and the three-dimensional point group 1801 of the book object gives FIG. 18C.
In FIG. 18C, a straight line 1804 parallel to the x' axis is drawn at an arbitrary value of y', and the three-dimensional points on the line 1804 are sampled. The largest z' value among the sampled points is taken as the height of the book object. Because the shape of the book object can be assumed to be constant in the y'-axis direction, sampling the three-dimensional points on a single straight line in this way is sufficient to obtain the height of the book object. Alternatively, the largest z' value in the entire three-dimensional point group of the book object may be used as the height. Furthermore, under the same assumption that the shape of the book object is constant in the y' direction, the height can be obtained in the same way even if the book object is not entirely within the angle of view 1614 of the distance image sensor unit 208. In S1543, the main control unit 402 acquires the angle-of-view information of the camera. The angle-of-view information of the camera unit 202 is stored in the RAM 303 in advance.
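The height estimate from a single sampling line can be sketched as follows. The sketch assumes that the points are already expressed in the rectangle-aligned coordinates (x', y', z') and that "on the line 1804" means within a small tolerance of the chosen y' value; the tolerance and the name `book_height` are assumptions.

```python
def book_height(points, y_line, tol):
    """Largest z' among the points within tol of the line y' = y_line."""
    sampled = [z for (_x, y, z) in points if abs(y - y_line) <= tol]
    if not sampled:
        raise ValueError("no points near the sampling line")
    return max(sampled)
```

The variant that uses the whole point group corresponds to choosing a tolerance large enough to include every point.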

In S1544, the main control unit 402 derives by calculation the shootable area corresponding to the placement area. The calculation uses the height of the book object obtained in S1542, the angle-of-view information of the camera unit 202 obtained in S1543, and the transformation matrix (external parameters) between the coordinate system of the camera unit 202 and the orthogonal coordinate system. The external parameters are as already described in the section <Camera scanner configuration>. Once the external parameters are known, the arrangement of the camera unit 202 relative to the document table 204, including its angle, can be specified.
In the example of FIG. 16B, the plane 1606 passing through the intersection of the book object 1605 and the camera angle of view 1608 can be obtained from the height of the book object 1605 and the angle of view and arrangement of the camera unit 202. The straight line along which the plane 1606 intersects the document table 204 is the boundary up to which the book object can be placed. Replacing one side of the shooting range 1610 with this straight line gives the area in which the book object 1605 can be placed. Reference numeral 1615 in FIGS. 16D and 16E denotes the shootable area.
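The boundary calculation can be illustrated in the side view of FIG. 16B with similar triangles. The sketch below assumes a two-dimensional side view in which the camera optical centre is at (cam_x, cam_z), the edge of the shooting range on the table is at edge_x, and the plane 1606 is perpendicular to the table; these simplifications and the name `placement_boundary` are assumptions made for illustration.

```python
def placement_boundary(cam_x, cam_z, edge_x, height):
    """X position on the table up to which an object of this height fits.

    The edge ray of the angle of view runs from the camera at
    (cam_x, cam_z) to the table at (edge_x, 0). It reaches the object
    height at the fraction t = 1 - height / cam_z of the way along the
    ray; the boundary is that point dropped vertically onto the table.
    """
    if not 0.0 <= height < cam_z:
        raise ValueError("height must be in [0, cam_z)")
    t = 1.0 - height / cam_z
    return cam_x + t * (edge_x - cam_x)
```

For example, with the camera 400 above the table at x = 0 and the shooting range ending at x = 300, an object of height 100 must stay within x = 225; a flat sheet (height 0) can use the full range up to 300.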

Next, in S1545, the shootable area 1615 obtained in S1544 is projected onto the document table 204 by the projector unit 207. As shown in FIG. 16D, the shootable area 1615 can be obtained on an overhead image of the plane Z = 0 of the orthogonal coordinate system viewed from directly above. This image can easily be converted into an image that the projector can project by using the external parameters of the projector unit 207. The external parameters of the projector unit 207 are assumed to have been obtained in advance by a known calibration method (see FIG. 2C).
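Converting a point of the overhead image into projector pixels can be sketched with a standard pinhole model. The external parameters are taken here to be a rotation matrix and a translation vector from the orthogonal coordinate system to the projector coordinate system; the intrinsic values fx, fy, cx, cy and the name `to_projector_pixel` are assumptions added for illustration, since the passage mentions only the external parameters.

```python
def to_projector_pixel(point, rotation, translation, fx, fy, cx, cy):
    """Map a 3D point in the orthogonal coordinate system to a projector pixel."""
    # External parameters: orthogonal coordinates -> projector coordinates.
    xp, yp, zp = (sum(r * p for r, p in zip(row, point)) + t
                  for row, t in zip(rotation, translation))
    if zp <= 0:
        raise ValueError("point is behind the projector")
    # Pinhole projection with the assumed intrinsic parameters.
    return (fx * xp / zp + cx, fy * yp / zp + cy)
```

Applying this to each corner of the shootable area (with Z = 0) gives the outline to draw in the projector image.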

Note that if the rectangle of the projectable area is projected as it is, its outline falls on the book object currently placed on the table, which may make it hard for the user to see where the projectable area is. In such a case, as with the shootable area 1615 shown in FIG. 16F, an image of the outline of the projectable area that does not overlap the book is created on the overhead image and projected. The outline of the projectable area then does not fall on the book, and the area can be shown in a form that is easy for the user to understand.

In FIG. 16C, the distance image sensor unit 208 is drawn above the camera unit 202, and the shooting range 1607 of the camera angle of view is drawn so as to coincide with the shooting range of the distance image sensor unit 208; however, the arrangement is not limited to this.
The above method can also be used when the book object 1605 is not entirely within the angle of view 1614 of the distance image sensor unit 208.

The main control unit 402 then performs the scan execution process of S1502. When the scan execution process of S1502 starts, in S1531 the scan start screen shown in FIG. 17B is projected onto the document table 204 via the user interface unit 403.
In FIG. 17B, the object 1711 is the object to be scanned that the user has placed. The 2D scan button 1712 accepts an instruction to photograph a flat document. The book scan button 1713 accepts an instruction to photograph a book document. The scan start button 1714 accepts an instruction to measure a three-dimensional shape. The scan start button 1715 accepts an instruction to start executing the selected scan.
As described above, the user interface unit 403 detects that one of the buttons has been pressed by the user from the coordinates of the touch gesture notified by the gesture recognition unit 409 and the coordinates at which the buttons are displayed (hereinafter, the description of detection by the user interface unit is omitted, and this is written as "detecting a touch on a button"). The user interface unit 403 allows the 2D scan button 1712, the book scan button 1713, and the scan start button 1714 to be selected exclusively. When a touch by the user on any of these buttons is detected, the touched button is set to the selected state and the selection of the other buttons is canceled.

In S1532, the process waits until a touch on the scan start button 1715 is detected. When the touch is detected in S1532, the process proceeds to S1533, where it is determined whether the 2D scan button 1712 is selected. If the 2D scan button 1712 is selected in S1533, the process proceeds to S1534, the processing of the flat document image photographing unit 411 is executed, and the scan execution process ends.
If the 2D scan button 1712 is not selected in S1533, the process proceeds to S1535, where it is determined whether the book scan button 1713 is selected. If the book scan button 1713 is selected in S1535, the process proceeds to S1536, the processing of the book image photographing unit 412 is executed, and the scan execution process ends. If the book scan button 1713 is not selected in S1535, the process proceeds to S1537, where it is determined whether the scan start button 1714 is selected.
If the scan start button 1714 is selected in S1537, the process proceeds to S1538, the processing of the three-dimensional shape measurement unit 413 is executed, and the scan execution process ends. If the scan start button 1714 is not selected in S1537, none of the 2D scan button 1712, the book scan button 1713, and the scan start button 1714 is selected. The process therefore returns to S1532 and waits for a touch on the scan start button 1715 to be detected after one of the buttons has been selected.
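The exclusive selection of the three mode buttons and the dispatch in S1533 to S1538 can be sketched as a small state holder. The class name `ScanScreen`, the mode keys, and the returned strings are hypothetical; only the unit names and reference numerals come from the description.

```python
class ScanScreen:
    """Tracks which scan mode button (1712, 1713, or 1714) is selected."""

    UNITS = {
        "2d": "flat document image photographing unit 411",
        "book": "book image photographing unit 412",
        "3d": "three-dimensional shape measurement unit 413",
    }

    def __init__(self):
        self.selected = None  # no mode button selected initially

    def touch_mode(self, mode):
        """Select one mode; any previously selected mode is deselected."""
        if mode not in self.UNITS:
            raise ValueError(f"unknown mode: {mode}")
        self.selected = mode

    def touch_start(self):
        """Touch on button 1715: return the unit to run, or None.

        None means that no mode is selected, so the caller keeps
        waiting (the return to S1532).
        """
        return self.UNITS.get(self.selected)
```

Storing a single `selected` value makes the selection exclusive by construction, so no separate step is needed to cancel the other buttons.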

When the scan execution process of S1502 ends, the main control unit 402 performs the object removal waiting process of S1503. When the object removal waiting process of S1503 starts, in S1521 the scan end screen shown in FIG. 17C is displayed. The scan end screen of FIG. 17C projects a message 1721 asking the user to remove the scanned document and an end button 1722 that accepts an instruction to end the processing of the main control unit.
In S1522, the process waits to receive an object removal notification from the object detection unit 410. The object removal notification is the one issued by the object detection unit 410 in S834 of FIG. 8. When the object removal notification is received in S1522, the object removal waiting process ends.

When the object removal waiting process of S1503 ends, in S1504 it is determined whether the end button 1722 was touched during the execution of the object removal waiting process of S1503. If it is determined in S1504 that the end button 1722 was touched, the processing of the main control unit 402 ends. If it is determined that the end button 1722 was not touched, the process returns to S1501, the initial screen of FIG. 17A is displayed, and the process waits for an object to be placed on the document table 204. In this way, when the user wants to scan a plurality of documents, replacement of the document on the document table 204 can be detected, and the plurality of documents can be scanned.

In the first embodiment, the user can select whether to scan a flat document, scan a thick book, or measure a three-dimensional shape. Not all three scan modes may be needed; for example, depending on user settings, it may be sufficient to execute only two, the scan of a flat document and the scan of a thick book. In that case, the display should allow the two scans to be executed to be selected.
Specifically, projecting only the 2D scan button 1712, the book scan button 1713, and the scan start button 1715 in FIG. 17B makes it possible to accept the user's selection between the two types of scan. A single scan mode may also be sufficient; for example, depending on user settings, it may be enough to execute only the scan of a flat document or only the scan of a book.
In that case, only the scan start button 1715 is projected in FIG. 17B, and the scan is executed when a touch on the scan start button 1715 is detected, without accepting a selection of the scan type from the user. Moreover, when there is only one scan mode in this way, the scan may be executed immediately upon detecting that an object has been placed on the document table 204, without projecting a scan operation screen such as that of FIG. 17B.

Ｓ１５４１の撮影可能領域算出処理は、Ｓ１５０１の物体載置待ち処理の後に実行したが、Ｓ１５０２のスキャン実行処理中、各種スキャンボタンが選択された後で実行するようにしてもよい。上記の方法によって、ユーザは、静止画像の撮影前に、高さのある物体が撮影可能な領域を知ることが可能となる。尚、本実施形態は書籍物体を例に処理を説明したが、任意の形状を有する物体であっても構わない。
〔第２実施形態〕
第１実施形態では、本の高さとカメラ画角の情報から、立体形状を有する書籍物体が撮影可能な領域をユーザに知らせる方法を示した。本実施形態では、ユーザに対して、より分かりやすくスキャン可能な領域を知らせるため、書籍物体上にスキャン可能領域をプロジェクションマッピングすることを考える。
図１９は、本実施形態を示す画像処理装置における書籍文書入力処理状態を示す遷移図である。
図１９の（ａ）は、本実施形態における、書画台２０４上に載置された書籍物体１９０１と、カメラ部２０２、距離画像センサ部２０８、プロジェクタ部２０７の位置関係、およびそれらの画角の関係を横から見た図として模式的にあらわしたものである。また、１９０６は、真横から見た書画台２０４上の撮影範囲を示している。１９０７は、プロジェクタ部２０７の投影可能範囲を表している。
図１９の（ｂ）は、図１９の（ａ）の書籍文書に対するプロジェクションマッピングしている様子を模式的示した図である。これは、カメラ部２０２の撮影可能範囲１９０８を真上から俯瞰した図に相当する。
書籍物体のうち、撮影可能領域１９０７に含まれる部分１９０９にプロジェクションマッピングを施している様子が示されている。まず、本実施形態での、距離画像センサ部２０８、カメラ部２０２、プロジェクタ部２０７の配置条件について説明する。
第１実施形態では、距離画像センサ部２０８とカメラ部２０２の、それぞれの画角と位置関係がどのような場合であっても、成立した。本実施形態では、距離画像センサ部２０８の画角が、書籍物体の全体をとらえられるだけ、充分に広い必要がある。 The imageable area calculation process of S1541 is executed after the object placement waiting process of S1501, but may be executed after various scan buttons are selected during the scan execution process of S1502. By the above method, the user can know a region where a tall object can be photographed before photographing a still image. In the present embodiment, the processing has been described using a book object as an example, but an object having an arbitrary shape may be used.
[Second Embodiment]
In the first embodiment, the method of notifying the user of the area where the book object having a three-dimensional shape can be photographed from the information of the book height and the camera angle of view is shown. In the present embodiment, in order to inform the user of a scannable area more easily, it is considered that the scanpable area is projected and mapped on the book object.
FIG. 19 is a transition diagram illustrating a book document input processing state in the image processing apparatus according to the present embodiment.
FIG. 19A schematically shows, as seen from the side, the positional relationship among the book object 1901 placed on the document table 204, the camera unit 202, the distance image sensor unit 208, and the projector unit 207, together with the relationship among their angles of view. Reference numeral 1906 denotes the shooting range on the document table 204 as seen from the side. Reference numeral 1907 denotes the projectable range of the projector unit 207.
FIG. 19B schematically shows projection mapping being performed on the book document of FIG. 19A. It corresponds to an overhead view of the shootable range 1908 of the camera unit 202 seen from directly above.
The figure shows projection mapping applied to the portion 1909 of the book object that lies within the shootable area 1907. First, the arrangement conditions of the distance image sensor unit 208, the camera unit 202, and the projector unit 207 in this embodiment are described.
The first embodiment works regardless of the angles of view and the positional relationship of the distance image sensor unit 208 and the camera unit 202. In this embodiment, the angle of view of the distance image sensor unit 208 must be wide enough to capture the entire book object.

尚、これらの位置関係やそれぞれの画角は、この配置や画角に限ったものではなく、距離画像センサ部２０８、プロジェクタ部２０７の画角が書籍物体１９０１を含むのに充分の配置及び画角になっていればそれでよい。
次に、プロジェクションマッピングの方法を説明する。
プロジェクションマッピングを行うのは、第１実施形態のフローチャート（図１５）の、Ｓ１５４５である。そのほかの処理は実施形態１で説明済みであるので、詳細は割愛する。
図１９の（ｄ）は、書籍物体の３次元点群１９１３を書画台２０４上で真上（直交座標系の＋Ｚの方向）から見た俯瞰図である。書籍物体３次元点群の俯瞰図を作成する方法は、第１実施形態で述べた通りである。
書籍物体の３次元点群１９１３を、プロジェクタ部２０７のカメラ座標系１９１４に変換したものが、図１９の（ｅ）の変換済み書籍物体の３次元点群１９１５である。この変換済み書籍物体の３次元点群１９１５には前述のプロジェクタ部２０７の外部パラメータを用いる。図１９の（ｆ）の直線１９１９は、図１６の（ｂ）で説明した平面１６０６に対応している。
すなわち、直線１９１９より下側が、書画台２０４上の投影可能領域である。これらの対応関係は、直交座標系で簡単に計算可能である。
ここで図１９の（ｆ）の俯瞰画像において、直線１９１９より下側の書籍画像の３次元点群１９１２の領域を、プロジェクタ座標系に変換する。変換された撮影可能な書籍物体の３次元点群が図１９の（ｇ）の１９１６である。
図１９の（ｇ）の状態の画像を２次元画像として扱い、１９１６の部分を画像処理によって切り出したものが、図１９の（ｈ）である。
１９１７が、プロジェクタ部２０７の座標系１９１８で見た、書籍物体上の撮影可能領域を表している。
つまり図１９の（ｈ）の画像を、プロジェクタ部２０７を用いて投影すれば、書籍物体上の撮影可能な部分だけをプロジェクションマッピングすることができる。また、図１９の（ｃ）のように、撮影不可能な部分１９１０，１９１１のみをプロジェクションすることも可能である。この場合、図１９の（ｄ）の俯瞰画像をカメラ座標系１９１４に変換する。
そうすると、カメラの撮影可能領域からはみ出す部分があるので、その部分の点群データをプロジェクタ座標系に変換する。あとは同様に、変換後の画像を２次元画像として扱い、はみ出した部分を画像処理によって切り出し、投影すればよい。上記の方法によって、ユーザによりわかりやすく撮影可能な範囲を示すことが可能となる。
〔第３実施形態〕
第１、第２実施形態では、撮影可能領域を、書籍物体を置いた時点で表示するための方法を示した。本実施形態では、載置位置が修正された後、当該物体がカメラ部２０２の画角域で特定される撮影可能領域をはみ出していることを検知して、ユーザにその旨を通知する例を説明する。また、撮影可能になった時点で、撮影開始ボタンを表示する例も合わせて説明する。
本実施形態でも、距離画像センサ部２０８の画角が、立体的な書籍物体を撮影するのに充分なものであることを前提とする。
図２０は、本実施形態を示す画像処理装置の制御方法を説明するフローチャートである。本例は、図４に示したメイン制御部４０２が実行するスキャンアプリケーションの処理例である
なお、図中のＳ１５ｘｘは、図１５で既に説明したそれらと同じ処理なので、詳しい説明は割愛する。
Ｓ２００１は書籍物体がカメラ画角から３次元的にはみ出していないかどうかを確認し、はみ出している場合はその旨をユーザにフィードバックする処理である。Ｓ２００２で、メイン制御部４０２は、物体がカメラの画角からはみ出していないかどうかを計算する。
これは、第２実施形態で示したのと同様に、書籍物体の３次元点群をカメラ座標系に変換し、カメラ画角からはみ出しがないか調べればよい。
Ｓ２００３では、はみ出しているかどうかの判定を行う。はみ出していた場合は、Ｓ２００４へ移行し、はみ出していない場合はＳ１５３１へ移行してスキャン開始画面を表示する。Ｓ２００４では、ユーザに対して撮影可能範囲からはみ出している旨を通知する。そのためには、プロジェクタ部２０７がテキストメッセージを所定位置にプロジェクションすればよい。
またこの時、カメラの撮影範囲１６１０の一番右側に見開きの書籍物体の長辺の一つを付き当てるように表示してもよい。 Note that the positional relationships and angles of view are not limited to this arrangement. Any arrangement and angles of view suffice as long as the angles of view of the distance image sensor unit 208 and the projector unit 207 are wide enough to include the book object 1901.
Next, a projection mapping method will be described.
Projection mapping is performed in S1545 of the flowchart (FIG. 15) of the first embodiment. Since other processes have already been described in the first embodiment, details are omitted.
FIG. 19D is an overhead view of the three-dimensional point group 1913 of the book object as viewed from directly above (the direction of + Z in the orthogonal coordinate system) on the document table 204. The method for creating an overhead view of a book object three-dimensional point group is as described in the first embodiment.
Converting the three-dimensional point group 1913 of the book object into the camera coordinate system 1914 of the projector unit 207 yields the converted three-dimensional point group 1915 of the book object shown in FIG. 19E. This conversion uses the external parameters of the projector unit 207 described above. The straight line 1919 in FIG. 19F corresponds to the plane 1606 described with reference to FIG. 16B.
That is, the area below the straight line 1919 is the projectable area on the document table 204. These correspondences can be easily calculated in an orthogonal coordinate system.
Next, in the overhead image of FIG. 19F, the region of the three-dimensional point group 1912 of the book image below the straight line 1919 is converted into the projector coordinate system. The resulting three-dimensional point group of the shootable part of the book object is 1916 in FIG. 19G.
FIG. 19H is obtained by treating the image in the state of FIG. 19G as a two-dimensional image and cutting out the portion 1916 by image processing.
Reference numeral 1917 denotes the shootable area on the book object as seen in the coordinate system 1918 of the projector unit 207.
That is, projecting the image of FIG. 19H with the projector unit 207 projection-maps only the shootable part of the book object. As shown in FIG. 19C, it is also possible to project only onto the portions 1910 and 1911 that cannot be shot. In this case, the overhead image of FIG. 19D is converted into the camera coordinate system 1914.
Some of the points then fall outside the shootable area of the camera, and the point cloud data of that part is converted into the projector coordinate system. As before, the converted image is treated as a two-dimensional image, and the protruding part is cut out by image processing and projected. This method shows the shootable range to the user in a more easily understood way.
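The second embodiment's pipeline can be illustrated with a minimal sketch. It is not the implementation of the embodiment: it assumes a simple pinhole model, and every name here (`to_device`, `in_view`, `split_by_camera_view`, `projector_pixels`, the 3x4 `[R|t]` matrices, and the intrinsic tuples `(fx, fy, cx, cy)`) is hypothetical. The point cloud is split by whether each point falls inside the camera's view, and the chosen subset is then mapped to projector pixels.

```python
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float, float]
Matrix = Sequence[Sequence[float]]          # assumed 3x4 extrinsic matrix [R|t]
Intrinsics = Tuple[float, float, float, float]  # assumed (fx, fy, cx, cy)


def to_device(p: Point, rt: Matrix) -> Point:
    """Transform a point from table coordinates into a device (camera/projector) frame."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in rt)


def project(p_dev: Point, intr: Intrinsics) -> Tuple[float, float]:
    """Pinhole projection of a device-frame point to pixel coordinates."""
    fx, fy, cx, cy = intr
    x, y, z = p_dev
    return (fx * x / z + cx, fy * y / z + cy)


def in_view(p_dev: Point, intr: Intrinsics, size: Tuple[int, int]) -> bool:
    """True if the point is in front of the device and lands inside its image."""
    if p_dev[2] <= 0:
        return False
    u, v = project(p_dev, intr)
    return 0 <= u < size[0] and 0 <= v < size[1]


def split_by_camera_view(points: List[Point], cam_rt: Matrix,
                         cam_intr: Intrinsics, cam_size: Tuple[int, int]):
    """Split the point cloud into shootable and protruding parts."""
    inside, outside = [], []
    for p in points:
        target = inside if in_view(to_device(p, cam_rt), cam_intr, cam_size) else outside
        target.append(p)
    return inside, outside


def projector_pixels(points: List[Point], proj_rt: Matrix,
                     proj_intr: Intrinsics, proj_size: Tuple[int, int]) -> Set[Tuple[int, int]]:
    """Projector pixels to light so that the given 3D points are painted."""
    pixels = set()
    for p in points:
        q = to_device(p, proj_rt)
        if in_view(q, proj_intr, proj_size):
            u, v = project(q, proj_intr)
            pixels.add((int(u), int(v)))
    return pixels
```

Passing the `inside` list to `projector_pixels` corresponds to highlighting the shootable part as in FIG. 19B, and passing the `outside` list corresponds to highlighting only the protruding part as in FIG. 19C.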
[Third Embodiment]
The first and second embodiments described methods for displaying the shootable area at the time the book object is placed. The present embodiment describes an example in which, after the placement position has been corrected, the apparatus detects that the object protrudes from the shootable area defined by the angle of view of the camera unit 202 and notifies the user accordingly. An example of displaying a shooting start button once shooting becomes possible is also described.
Also in this embodiment, it is assumed that the angle of view of the distance image sensor unit 208 is sufficient to capture a three-dimensional book object.
FIG. 20 is a flowchart illustrating a method for controlling the image processing apparatus according to the present embodiment. It is an example of the scan application processing executed by the main control unit 402 shown in FIG. 4. Steps labeled S15xx in the figure are the same as those already described with reference to FIG. 15, so their detailed description is omitted.
S2001 is a process that checks whether the book object protrudes three-dimensionally from the camera's angle of view and, if so, reports this to the user. In S2002, the main control unit 402 calculates whether the object protrudes from the angle of view of the camera.
As in the second embodiment, this can be done by converting the three-dimensional point group of the book object into the camera coordinate system and checking whether any part lies outside the camera's angle of view.
In S2003, it is determined whether the object protrudes. If it does, the process proceeds to S2004; if it does not, the process proceeds to S1531 and the scan start screen is displayed. In S2004, the user is notified that the object protrudes from the shootable range. For this purpose, the projector unit 207 may project a text message at a predetermined position.
At this time, a display may also prompt the user to butt one of the long sides of the open book against the rightmost edge of the shooting range 1610 of the camera.

これは、図１６の（ｂ）のような配置の場合、書籍物体がはみ出すのは、その端で画角の角度が寝ている左側で起きるのであり、画角の端の線と書画台２０４のなす角１６１６が９０度以上となる右端では、付き当てたところではみ出しが起こらないためである。尚、なす角１６１６が９０度以下の場合は、付き当てるよう指示することは出来ない。
また、ユーザが手で本の両端を押さえつけ、本の高さを低くするように指示してもよい。 This is because, in an arrangement like that of FIG. 16B, the book object protrudes on the left side, where the edge of the angle of view leans inward, whereas at the right end, where the angle 1616 between the edge line of the angle of view and the document table 204 is 90 degrees or more, no protrusion occurs when the book is butted against that edge. When the angle 1616 is 90 degrees or less, the instruction to butt the book against the edge cannot be given.
Alternatively, the user may be instructed to press down both ends of the book by hand to reduce its height.
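The geometric argument about the angle 1616 can be checked with a small side-view sketch. It rests on assumptions not stated in the source: the edge angles are measured on the inside of the shooting range, an angle of exactly 90 degrees is treated as safe, and the function name `placeable_interval` and its parameters are hypothetical. An edge leaning inward at angle θ cuts off a strip of width h / tan θ for an object of height h, while an edge at 90 degrees or more cuts off nothing.

```python
import math
from typing import Optional, Tuple


def placeable_interval(x_left: float, x_right: float,
                       angle_left_deg: float, angle_right_deg: float,
                       height: float) -> Optional[Tuple[float, float]]:
    """Side-view range on the table where an object of the given height stays in view.

    x_left and x_right are where the view edges meet the table. The angles are
    measured between each edge line and the table, on the inside of the view
    (an assumption of this sketch).
    """
    def inset(angle_deg: float) -> float:
        # An edge at 90 degrees or more does not narrow the view with height.
        if angle_deg >= 90.0:
            return 0.0
        return height / math.tan(math.radians(angle_deg))

    lo = x_left + inset(angle_left_deg)
    hi = x_right - inset(angle_right_deg)
    return (lo, hi) if lo <= hi else None
```

With a left edge at 45 degrees and a right edge at 90 degrees, a 50 mm tall object loses about 50 mm of usable range on the left but can still be butted against the right edge, which matches the reasoning above.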

Ｓ２００５では、ユーザによる置きなおしが検知されたかどうかを確認する。置きなおしが検知された場合はＳ２００２に戻る。なお、置きなおしを検知する処理は、距離画像取得部４０８から取得する距離画像の、フレーム間の差分を見ることにより、書画台２０４上に載置された物体が動き出し、再び静止したことを検知すればよい。
以上の処理を繰り返すことにより、ユーザに対して、書籍物体が、撮影可能領域からはみ出していることを通知出来る。また、撮影可能領域に載置されるまで繰り返し処理を実行するため、撮影可能になった段階でスキャンメニューを表示することが可能となる。
尚、Ｓ２００５で、第１実施形態や第２実施形態における図１５のＳ１５４４、Ｓ１５４５を実行するようにしてもよい。Ｓ１５４４、Ｓ１５４５を繰り返し実行することで、ユーザはどこに置けば撮影可能になるのかを判断できる上、今置かれている書籍物体のどこがはみ出しているのかも視認することが可能となる。 In S2005, it is checked whether re-placement by the user has been detected. If re-placement is detected, the process returns to S2002. Re-placement can be detected by examining the difference between frames of the distance image acquired from the distance image acquisition unit 408 and detecting that the object placed on the document table 204 has started moving and then come to rest again.
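The re-placement detection based on frame differences can be sketched as a small state machine. This is only an illustration under assumptions: each distance image is a flat sequence of depth values, and the threshold `move_thresh`, the count `still_frames`, and the function names are hypothetical, not values from the embodiment.

```python
from typing import Iterable, Optional, Sequence

Frame = Sequence[float]  # assumed: one distance image flattened to depth values


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two distance images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def detect_replacement(frames: Iterable[Frame],
                       move_thresh: float = 5.0,
                       still_frames: int = 3) -> Optional[int]:
    """Return the index of the frame at which the object has come to rest
    again after moving, or None if no re-placement is observed."""
    prev = None
    moved = False      # becomes True once motion has been seen
    still_run = 0      # consecutive still frames counted after motion
    for i, frame in enumerate(frames):
        if prev is not None:
            if mean_abs_diff(frame, prev) > move_thresh:
                moved = True
                still_run = 0
            elif moved:
                still_run += 1
                if still_run >= still_frames:
                    return i
        prev = frame
    return None
```

Requiring several consecutive still frames is a design choice of this sketch, intended to avoid treating a brief pause in the user's hand movement as the end of the re-placement.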
By repeating the above processing, the user can be notified that the book object protrudes from the shootable area. In addition, because the processing is repeated until the object is placed within the shootable area, the scan menu can be displayed as soon as shooting becomes possible.
In S2005, S1544 and S1545 of FIG. 15 in the first and second embodiments may also be executed. By repeatedly executing S1544 and S1545, the user can judge where to place the object so that it can be shot, and can also see which part of the currently placed book object protrudes.
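The loop of FIG. 20 can be summarized in a short control-flow sketch. The callbacks (`protrudes`, `notify`, `wait_for_replacement`, `show_scan_start`), the message text, and the convention that `wait_for_replacement` returns False when it gives up are all assumptions made for illustration; the source does not specify what happens if re-placement is never detected.

```python
from typing import Callable


def guide_until_shootable(protrudes: Callable[[], bool],
                          notify: Callable[[str], None],
                          wait_for_replacement: Callable[[], bool],
                          show_scan_start: Callable[[], None]) -> bool:
    """Repeat the protrusion check until the object fits, then show the scan start screen."""
    while True:
        if not protrudes():            # S2002 and S2003: protrusion check
            show_scan_start()          # S1531: display the scan start screen
            return True
        # S2004: tell the user the object is outside the shootable range
        notify("The object extends outside the shootable area. Please move it.")
        if not wait_for_replacement():  # S2005: wait for the user to re-place the object
            return False               # assumed: give up if no re-placement is detected
```

For example, if the object protrudes on the first two checks and fits on the third, the user sees two notifications followed by the scan start screen.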

本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステムまたは装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサがプログラムを読み出し実行する処理でも実現可能である。また、１以上の機能を実現する回路（例えばＡＳＩＣ）によっても実現可能である。 The present invention can also be realized by a process in which a program that realizes one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that realizes one or more functions.

１０１カメラスキャナ
２０１コントローラ部 101 camera scanner 201 controller unit

Claims

画像入力装置であって、
物体を載置する書画台と、
前記書画台に画像を投影する投影手段と、
前記書画台に載置される物体を当該書画台の上方から撮像する撮像手段と、
前記書画台に載置された物体の距離画像を取得する取得手段と、
前記取得手段が取得する距離画像から前記書画台の上に載置される物体の高さ情報を取得する高さ取得手段と、
取得される前記物体の前記高さ情報と、前記撮像手段が撮像する画角情報とから、前記撮像手段が撮像する画角域で前記書画台に載置される前記物体の全体を撮影できる前記物体の載置領域を導出する導出手段と、を備え、
前記投影手段は、導出された載置領域を投影することを特徴とする画像入力装置。 An image input device,
A document table on which objects are placed;
Projecting means for projecting an image onto the document table;
Imaging means for imaging an object placed on the document table from above the document table;
Obtaining means for obtaining a distance image of an object placed on the document table;
Height acquisition means for acquiring height information of an object placed on the document table from a distance image acquired by the acquisition means;
Deriving means for deriving, from the acquired height information of the object and the angle-of-view information of the imaging means, a placement area in which the entire object placed on the document table can be captured within the angle-of-view area of the imaging means,
The image input apparatus characterized in that the projecting means projects the derived placement area.

前記投影手段は、導出された載置領域を投影して、前記物体の載置位置を移動した後、取得される前記物体の高さ情報と前記撮像手段の画角情報とから前記撮像手段の画角域で前記物体の全体を撮影できるかどうかを判断する判断手段を備え、
前記撮像手段の画角域で前記物体の全体を撮影できないと判断した場合、前記投影手段は、前記物体の載置位置を変更する指示を前記書画台に投影することを特徴とする請求項１に記載の画像入力装置。 The image input device according to claim 1, further comprising determination means for determining, after the projecting means has projected the derived placement area and the placement position of the object has been moved, whether the entire object can be captured within the angle-of-view area of the imaging means, based on the acquired height information of the object and the angle-of-view information of the imaging means,
wherein, when it is determined that the entire object cannot be captured within the angle-of-view area of the imaging means, the projecting means projects onto the document table an instruction to change the placement position of the object.

前記導出手段が導出する載置領域は、前記物体の全体が前記撮像手段の画角域からはみ出さないように載置可能な矩形領域であることを特徴とする請求項２に記載の画像入力装置。 The image input device according to claim 2, wherein the placement area derived by the deriving means is a rectangular area in which the object can be placed so that no part of it protrudes from the angle-of-view area of the imaging means.

前記判断手段は、前記物体を移動した後の載置位置で、取得される前記物体の高さ情報と前記撮像手段の画角情報とから、前記撮像手段の画角域で前記物体の全体を撮影できるかどうかを判断することを特徴とする請求項２に記載の画像入力装置。 The image input device according to claim 2, wherein the determination means determines, at the placement position after the object has been moved, whether the entire object can be captured within the angle-of-view area of the imaging means, based on the acquired height information of the object and the angle-of-view information of the imaging means.

前記撮像手段の画角域で前記物体の全体を撮影できると判断した場合、前記投影手段が前記書画台に前記物体の読み取りを開始するためのボタンを投影することを特徴とする請求項２に記載の画像入力装置。 The image input device according to claim 2, wherein, when it is determined that the entire object can be captured within the angle-of-view area of the imaging means, the projecting means projects onto the document table a button for starting reading of the object.

前記撮像手段の画角域で前記物体の全体を撮影できないと判断した場合、前記投影手段が前記書画台に載置される前記物体の移動を通知するメッセージを投影することを特徴とする請求項４に記載の画像入力装置。 The image input device according to claim 4, wherein, when it is determined that the entire object cannot be captured within the angle-of-view area of the imaging means, the projecting means projects a message notifying the user to move the object placed on the document table.

前記物体は、立体形状を有する立体物であることを特徴とする請求項１乃至６のいずれか１項に記載の画像入力装置。 The image input apparatus according to claim 1, wherein the object is a three-dimensional object having a three-dimensional shape.

前記立体物は、前記書画台に見開き状態で載置されるブックであることを特徴とする請求項７に記載の画像入力装置。 The image input apparatus according to claim 7, wherein the three-dimensional object is a book placed in a spread state on the document table.

前記投影手段は、導出された載置領域から外れる読み取れない領域を投影することを特徴とする請求項１または２に記載の画像入力装置。 The image input device according to claim 1, wherein the projecting unit projects an unreadable region that deviates from the derived placement region.

前記投影手段は、導出された載置領域のうち、載置される前記物体で遮られる領域を外して投影することを特徴とする請求項１または２に記載の画像入力装置。 The image input device according to claim 1 or 2, wherein the projecting means projects the derived placement area excluding the region blocked by the placed object.

前記撮像手段の画角情報は、前記撮像手段に対応づけられる第１の座標系を、前記書画台に対応づけられる第２の座標系に変換する所定のパラメータで特定されることを特徴とする請求項１に記載の画像入力装置。 The angle-of-view information of the image pickup means is specified by a predetermined parameter for converting a first coordinate system associated with the image pickup means into a second coordinate system associated with the document table. The image input device according to claim 1.

画像入力システムであって、
物体を載置する書画台と、
前記書画台に画像を投影する投影手段と、
前記書画台に載置される物体を当該書画台の上方から撮像する撮像手段と、
前記書画台に載置された物体の距離画像を取得する取得手段と、
前記取得手段が取得する距離画像から前記書画台の上に載置される物体の高さ情報を取得する高さ取得手段と、
取得される前記物体の前記高さ情報と、前記撮像手段が撮像する画角情報とから、前記撮像手段が撮像する画角域で前記書画台に載置される前記物体の全体を撮影できる前記物体の載置領域を導出する導出手段と、を備え、
前記投影手段は、導出された載置領域を投影することを特徴とする画像入力システム。 An image input system,
A document table on which objects are placed;
Projecting means for projecting an image onto the document table;
Imaging means for imaging an object placed on the document table from above the document table;
Obtaining means for obtaining a distance image of an object placed on the document table;
Height acquisition means for acquiring height information of an object placed on the document table from a distance image acquired by the acquisition means;
Deriving means for deriving, from the acquired height information of the object and the angle-of-view information of the imaging means, a placement area in which the entire object placed on the document table can be captured within the angle-of-view area of the imaging means,
The image input system, wherein the projection unit projects the derived placement area.

物体を載置する書画台と、前記書画台に画像を投影する投影手段と、前記書画台に載置される物体を当該書画台の上方から撮像する撮像手段とを有する画像入力装置の制御方法であって、
前記書画台に載置された物体の距離画像を取得する取得工程と、
前記取得工程が取得する距離画像から前記書画台の上に載置される物体の高さ情報を取得する高さ取得工程と、
取得される前記物体の前記高さ情報と、前記撮像手段が撮像する画角情報とから、前記撮像手段が撮像する画角域で前記書画台に載置される前記物体の全体を撮影できる前記物体の載置領域を導出する導出工程と、
前記投影手段で、導出された前記載置領域を投影する投影工程と、
を備えることを特徴とする画像入力装置の制御方法。 A method for controlling an image input device having a document table on which an object is placed, projecting means for projecting an image onto the document table, and imaging means for imaging an object placed on the document table from above the document table, the method comprising:
An acquisition step of acquiring a distance image of the object placed on the document table;
A height acquisition step of acquiring height information of the object placed on the document table from the distance image acquired in the acquisition step;
A derivation step of deriving, from the acquired height information of the object and the angle-of-view information of the imaging means, a placement area in which the entire object placed on the document table can be captured within the angle-of-view area of the imaging means; and
A projecting step of projecting the derived placement area with the projecting means.

請求項１３に記載の画像入力装置の制御方法をコンピュータに実行させることを特徴とするプログラム。 A program for causing a computer to execute the method for controlling the image input apparatus according to claim 13.