JP2018129735A

JP2018129735A - Image reading device, image forming system, image reading method, and image reading program

Info

Publication number: JP2018129735A
Application number: JP2017022650A
Authority: JP
Inventors: Kunihiko Tanaka (田中 邦彦)
Original assignee: Kyocera Document Solutions Inc
Current assignee: Kyocera Document Solutions Inc
Priority date: 2017-02-09
Filing date: 2017-02-09
Publication date: 2018-08-16
Anticipated expiration: 2037-02-09
Also published as: JP6624106B2

Abstract

PROBLEM TO BE SOLVED: To achieve page-turning detection that is highly robust to shaking of an imaging unit.
SOLUTION: An image reading device comprises: an imaging unit that captures a document image presented on a document surface of a book document and generates a plurality of frame image data at preset time intervals; and an image analysis unit that analyzes the plurality of frame image data, determines, for at least some of the plurality of frame image data, whether each represents an image of the book document captured in a stationary state, and, on the basis of the determination, extracts the frame image data representing an image captured in the stationary state.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for acquiring still images by imaging a document with a camera, and more particularly to a technique for acquiring still images of a book document.

Document images are commonly read using an overhead scanner. Some overhead scanners detect page turning with an area sensor and then acquire a high-resolution image by scanning with a line scanner. Regarding the scanning timing of the line scanner, Patent Document 1, for example, proposes a technique in which an area sensor continuously acquires a plurality of images, a motion pattern is calculated from the acquired images by image difference extraction, a page-turning operation is detected on the basis of the motion pattern, and, when a page-turning operation is detected, the start of reading by the line scanner (linear sensor) is determined. Meanwhile, with the spread of smartphones having an imaging function, there is also a growing demand to read document images using a smartphone.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2014-168168 (JP 2014-168168 A)

However, page-turning detection by an area sensor presupposes that the imaging unit of the overhead scanner is fixed. When a document image is read using a smartphone, image changes caused by shaking of the smartphone may therefore lead to false detection.

The present invention has been made in view of this situation, and its object is to provide a technique that achieves page-turning detection that is highly robust to shaking of the imaging unit.

The image reading device of the present invention includes: an imaging unit that captures a document image presented on a document surface of a book document and generates a plurality of frame image data at preset time intervals; and an image analysis unit that analyzes the plurality of frame image data, determines, for at least some of the plurality of frame image data, whether each represents an image of the book document captured in a stationary state, and, on the basis of the determination, extracts the frame image data representing an image captured in the stationary state.

The image forming system of the present invention includes the image reading device and an image forming apparatus that forms an image on a print medium.

The image reading method of the present invention includes: an imaging step of capturing a document image presented on a document surface of a book document and generating a plurality of frame image data at preset time intervals; and an image analysis step of analyzing the plurality of frame image data, determining, for at least some of the plurality of frame image data, whether each represents an image of the book document captured in a stationary state, and, on the basis of the determination, extracting the frame image data representing an image captured in the stationary state.

The image reading program of the present invention controls an image reading device. The image reading program causes the image reading device to function as: an imaging unit that captures a document image presented on a document surface of a book document and generates a plurality of frame image data at preset time intervals; and an image analysis unit that analyzes the plurality of frame image data, determines, for at least some of the plurality of frame image data, whether each represents an image of the book document captured in a stationary state, and, on the basis of the determination, extracts the frame image data representing an image captured in the stationary state.

According to the present invention, page-turning detection that is highly robust to shaking of the imaging unit can be achieved.

FIG. 1 is a block diagram showing the functional configuration of an image reading system 10 according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the still image acquisition process according to the embodiment.
FIG. 3 is an explanatory diagram showing how imaging is started by the smartphone 200 according to the embodiment.
FIG. 4 is an explanatory diagram showing the subject stillness determination process according to the embodiment.

Modes for carrying out the present invention (hereinafter referred to as "embodiments") are described below with reference to the drawings.

FIG. 1 is a block diagram showing the functional configuration of an image reading system 10 according to an embodiment of the present invention. The image reading system 10 includes an image forming apparatus 100 and a smartphone 200. The image forming apparatus 100 includes a control unit 110, an image forming unit 120, an operation display unit 130, a storage unit 140, a communication interface unit 150 (also called a communication I/F unit), and an automatic document feeder (ADF) 160. The image forming unit 120 forms an image on a print medium.

The smartphone 200 includes a control unit 210, an operation display unit 230, a storage unit 240, a communication interface unit 250 (also called a communication I/F unit), and an imaging unit 260. The control unit 210 has an image analysis unit 211 and a moving image data generation unit 212, whose functions are described later.

The smartphone 200 is connected to the image forming apparatus 100 by short-range wireless communication using the communication interface unit 250 and the communication interface unit 150. In this embodiment, the short-range wireless communication uses BLUETOOTH (registered trademark) Class 1. BLUETOOTH (registered trademark) Class 1 is communication with an output of 100 mW, and allows communication when the distance between the image forming apparatus 100 and the smartphone 200 is within about 100 m.

The operation display unit 130 of the image forming apparatus 100 and the operation display unit 230 of the smartphone 200 each function as a touch panel, display various menus as input screens, and accept operation input from the user.

The control units 110 and 210 and the image forming unit 120 include main storage means such as RAM and ROM, and control means such as an MPU (Micro Processing Unit) or a CPU (Central Processing Unit). The control units 110 and 210 also have controller functions related to interfaces such as various I/O, USB (Universal Serial Bus), buses, and other hardware. The control units 110 and 210 control the whole of the image forming apparatus 100 and the smartphone 200, respectively.

The storage units 140 and 240 are storage devices consisting of non-transitory recording media such as hard disk drives or flash memory, and store the control programs and data for the processes executed by the control units 110 and 210, respectively.

The storage unit 140 stores a document image acquisition application program 141 (also simply called the application) for installation on the smartphone 200. The storage unit 240 has a frame memory 241 for temporarily storing frame image data, and a still image storage area 242.

In this example, it is assumed that the smartphone 200 has already downloaded the document image acquisition application program 141 from the storage unit 140 of the image forming apparatus 100 and installed it in the storage unit 240.

FIG. 2 is a flowchart showing the still image acquisition process according to the embodiment. In step S10, the user operates the operation display unit 230 to set the operating mode of the smartphone 200 to the book document imaging mode. The book document imaging mode is an imaging mode supported by the document image acquisition application program 141, and is an operating mode configured for acquiring images of a book document with the smartphone 200.

FIG. 3 is an explanatory diagram showing how imaging is started by the smartphone 200 according to the embodiment. The operation display unit 230 displays an image of the document surface of the document D, a save icon 231, and an imaging stop icon 232. The imaging stop icon 232 is an icon for pausing imaging. The save icon 231 is an icon for saving the plurality of still image data acquired by imaging the document D.

The document D serving as the subject can be placed on a desk or in any other location. The document D is a book laid open as a two-page spread (also called a book document, as noted above). With a book document such as the document D, the document images on the document surfaces are generally acquired while the pages are turned.

In step S20, the user starts imaging the whole of the document D using the imaging unit 260 of the smartphone 200. In step S30, the smartphone 200 starts acquiring frame images on the premise that moving image data will be generated, and generates a plurality of frame image data. The smartphone 200 stores the plurality of frame image data in the frame memory 241. All frame image data are stored in the frame memory 241 as uncompressed RAW image data, without being converted to JPEG or the like by the discrete cosine transform (also simply called the DCT).

A smartphone such as the smartphone 200 can generally use a frame rate of 60 fps (frames per second) or 30 fps. In this embodiment, however, when imaging the document D, the frame rate is set to a low rate of, for example, 5 fps to 10 fps, while the resolution is set to the maximum resolution available for still image acquisition.

In step S40, the user starts turning pages. In the smartphone 200, the imaging unit 260 images the document D both in its stationary state and during the page-turning operation, including the state in which a page is being turned, and generates a plurality of frame image data at the frame rate described above.

FIG. 4 is an explanatory diagram showing the subject stillness determination process according to the embodiment. FIG. 4 shows a data flow diagram in its upper part and a GOP (Group of Pictures) in its lower part. The data flow diagram shows the flow of the frame image data generated by the imaging process of the imaging unit 260. The frame image data is configured as RAW image data (RGB image data).

The RAW image data is subjected to the moving image data generation process performed by the moving image data generation unit 212. The moving image data generation process includes, for example, the processing defined in MPEG-4 (ISO/IEC 14496) or H.264. In the moving image data generation process, the RAW image data is converted into YUV image data, which consists of luminance data and color-difference data, in order to improve compression efficiency. The YUV image data is then subjected to the discrete cosine transform (DCT). The DCT is executed for each pixel block of, for example, 8 × 8 or 16 × 16 pixels, and outputs transform coefficients. The transform coefficients are then quantized.
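The chain described above (RGB to YUV, block DCT, quantization) can be illustrated with a minimal pure-Python sketch. It is not the encoder used in the embodiment: the BT.601-style conversion coefficients, the single uniform quantization step, and the function names `rgb_to_yuv`, `dct_2d`, and `quantize` are all assumptions made for illustration.

```python
import math

def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV (BT.601-style coefficients, assumed here)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue color difference
    v = 0.877 * (r - y)                     # red color difference
    return y, u, v

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square pixel block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def quantize(coeffs, step):
    """Uniform quantization of transform coefficients (one step size, assumed)."""
    return [[round(c / step) for c in row] for row in coeffs]

if __name__ == "__main__":
    # A flat 8 x 8 luminance block has energy only in the DC coefficient.
    flat = [[100] * 8 for _ in range(8)]
    print(quantize(dct_2d(flat), 16)[0][:4])
```

Real encoders use per-frequency quantization tables rather than a single step; the uniform step is used here only to keep the sketch short.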

In this way, the moving image data generation unit 212 can reduce the amount of data in a manner that takes human visual sensitivity into account. It processes the data in the YUV color space, which allows color-difference data, to which vision is less sensitive, to be quantized more coarsely than luminance data, to which vision is more sensitive. It also applies the DCT, which allows high-frequency components, to which vision is less sensitive, to be quantized more coarsely than low-frequency components, to which vision is more sensitive.

The moving image data generation unit 212 can thereby generate an I-frame (intra-coded frame). An I-frame is a frame encoded without using inter-frame prediction, and is also called an intra frame or a key frame. I-frames form a GOP together with P-frames (predicted frames) and B-frames (bi-directional predicted frames).

A P-frame is a frame encoded using only forward prediction. A B-frame is a frame encoded by selecting one of forward prediction, backward prediction, and bidirectional prediction.

Moving image data is generated from a plurality of frame image data arranged in time-series order. Temporally adjacent frames are often similar to each other. Inter-frame prediction is a technique that exploits this property of moving image data to predict the current frame image from a temporally preceding frame image.

Specifically, it is a technique that estimates the movement of each pixel block, and applies the DCT and quantization to the difference between frames of each pixel block after the movement, thereby raising the compression ratio per GOP. A P-frame can be generated from an I-frame using motion vectors. A motion vector is the displacement vector of a pixel block. A P-frame can thus be compressed into motion vectors and data obtained by quantizing the DCT coefficients of the differences within the pixel blocks at their destinations.
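Motion estimation for a single pixel block can be sketched as follows. This is an illustrative exhaustive block-matching search using the sum of absolute differences (SAD), not the estimator of any particular codec. The function names, the 4 × 4 block size, and the ±2 pixel search range are assumptions. The returned vector points from the block in the current frame to its best match in the previous frame.

```python
def sad(prev, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the previous-frame block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total

def estimate_motion(prev, cur, bx, by, size=4, search=2):
    """Return ((dx, dy), residual_sad) for the block whose top-left corner
    is (bx, by), searching displacements within +/- `search` pixels."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            cost = sad(prev, cur, bx, by, dx, dy, size)
            # Prefer the lowest cost, then the shortest displacement.
            candidate = (cost, abs(dx) + abs(dy), dx, dy)
            if best is None or candidate < best:
                best = candidate
    cost, _, dx, dy = best
    return (dx, dy), cost
```

The residual SAD returned alongside the vector plays the role of the inter-frame difference that would subsequently be transformed and quantized.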

Thus, in inter-frame prediction, the moving image data generation unit 212 generates motion vectors for the purpose of data compression. The image analysis unit 211 acquires the motion vectors from the moving image data generation unit 212 and analyzes them to determine whether a change between frames corresponds to shaking (so-called panning) of the imaging unit 260. The moving image data may be discarded after the inter-frame prediction processing. In this example, the moving image data generation unit 212 effectively functions as part of the image analysis unit 211.

In step S50, the image analysis unit 211 of the smartphone 200 executes the subject stillness determination process (see FIGS. 2 and 4). This process determines that the document D serving as the subject is stationary rather than in the middle of a page-turning operation. Specifically, on the basis of the inter-frame prediction result, the image analysis unit 211 can identify a change between frames as shaking when the quantized DCT coefficients of the inter-frame differences within the pixel blocks are almost all zero and the motion vectors of almost all pixel blocks agree. Whether the motion vectors agree may be judged, for example, using thresholds that define whether they fall within a preset range.
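The judgment described above can be sketched as a simple test over per-block prediction results. This is a minimal illustration under stated assumptions, not the determination logic of the embodiment: each block is represented as a motion vector plus a count of nonzero quantized residual coefficients, and the function name `subject_is_still`, the tolerances, and the 95% agreement fraction are hypothetical values chosen for the example.

```python
from collections import Counter

def subject_is_still(blocks, vector_tol=1, residual_tol=0, min_fraction=0.95):
    """Judge whether an inter-frame change is explained by camera shake alone.

    blocks: list of ((dx, dy), nonzero_residual_count), one entry per pixel block.
    Returns True when almost all blocks share (within vector_tol) the dominant
    motion vector and have an almost-zero quantized residual.
    """
    if not blocks:
        return False
    # The most frequent vector is taken as the global (camera) motion.
    dominant, _ = Counter(mv for mv, _ in blocks).most_common(1)[0]
    agreeing = sum(
        1
        for (dx, dy), nonzero in blocks
        if abs(dx - dominant[0]) <= vector_tol
        and abs(dy - dominant[1]) <= vector_tol
        and nonzero <= residual_tol
    )
    return agreeing / len(blocks) >= min_fraction

if __name__ == "__main__":
    panning = [((3, -1), 0)] * 100                        # uniform shift of the whole image
    turning = [((0, 0), 0)] * 60 + [((5, 2), 12)] * 40    # part of the image moves on its own
    print(subject_is_still(panning), subject_is_still(turning))
```

A uniform shift of every block is treated as camera shake over a stationary document, whereas a page being turned produces blocks whose vectors disagree and whose residuals are large.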

In step S60, if the image analysis unit 211 of the smartphone 200 determines that the subject is stationary, it advances the process to step S70; if it determines that the subject is not stationary, it advances the process to step S80.

In step S70, the image analysis unit 211 executes the frame image data saving process. This process saves the frame image data stored in the frame memory 241 to the still image storage area 242 and then discards the frame image data stored in the frame memory 241.

In step S80, the image analysis unit 211 executes the frame image data discarding process. This process discards the frame image data stored in the frame memory 241 without saving it to the still image storage area 242.
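The branch in steps S60 to S80 can be summarized in a short sketch. The list-based `frame_memory` and `still_store` stand in for the frame memory 241 and the still image storage area 242, and `is_still` is a hypothetical callable standing in for the determination of step S50; none of these names come from the embodiment.

```python
def process_frames(frames, is_still):
    """Keep frames judged stationary; drop the rest.

    frames:   iterable of frame objects in capture order.
    is_still: callable returning True when the frame shows a stationary document.
    """
    frame_memory = []   # stand-in for the temporary frame memory
    still_store = []    # stand-in for the still image storage area
    for frame in frames:
        frame_memory.append(frame)            # frame arrives uncompressed
        if is_still(frame):                   # S50 / S60
            still_store.append(frame)         # S70: save before discarding
        frame_memory.clear()                  # S70 and S80 both empty the memory
    return still_store
```

Either branch ends with the temporary copy being discarded; the only difference is whether the frame is first copied to persistent storage.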

In step S90, the image analysis unit 211 executes the still image data selection process. In this process, the image analysis unit 211 groups together the data capturing the same page from among the plurality of frame image data saved in the still image storage area 242, and discards the out-of-focus frame image data within each group.

The image analysis unit 211 repeats this processing (steps S50 to S90) up to the final frame image of the moving image data (step S100).

Out-of-focus frame image data can be identified, for example, by dividing the frame into a plurality of pixel blocks and applying the DCT, as an image with markedly few high-frequency components. This allows the image analysis unit 211 to extract, from among the plurality of frame image data representing the same page, the frame image data that has many pixel blocks containing a relatively large amount of high-frequency data. This works because a book document generally contains a large amount of text, and the outlines of text contain high-frequency components.
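A minimal sketch of this selection is shown below: each frame is divided into blocks, the DCT of each block is taken, and the frame with the most blocks rich in high-frequency energy is chosen. The function names, the cutoff `u + v >= 4` that defines "high frequency", and the energy threshold are assumptions for illustration, and frames are plain lists of grayscale rows.

```python
import math

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)
    scale = [math.sqrt(1.0 / n)] + [math.sqrt(2.0 / n)] * (n - 1)
    cos = [[math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n)]
           for k in range(n)]
    return [[scale[u] * scale[v] * sum(block[x][y] * cos[u][x] * cos[v][y]
                                       for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

def high_freq_energy(block, cutoff=4):
    """Energy of the coefficients whose index sum is at least `cutoff`."""
    c = dct_2d(block)
    n = len(c)
    return sum(c[u][v] ** 2 for u in range(n) for v in range(n) if u + v >= cutoff)

def count_detailed_blocks(frame, size=8, cutoff=4, threshold=1000.0):
    """Count blocks whose high-frequency energy exceeds `threshold`."""
    count = 0
    for by in range(0, len(frame) - size + 1, size):
        for bx in range(0, len(frame[0]) - size + 1, size):
            block = [row[bx:bx + size] for row in frame[by:by + size]]
            if high_freq_energy(block, cutoff) > threshold:
                count += 1
    return count

def pick_sharpest(frames, **kwargs):
    """Index of the frame with the most high-frequency-rich blocks."""
    return max(range(len(frames)),
               key=lambda i: count_detailed_blocks(frames[i], **kwargs))
```

Given several frames of the same page, `pick_sharpest` returns the index of the frame to keep; the others in the group would be discarded as out of focus.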

The smartphone 200 further transmits the extracted frame image data to the image forming apparatus 100 via short-range wireless communication. The image forming apparatus 100 estimates the three-dimensional shape of the document D as a book document from the plurality of frame image data, and executes distortion correction on the basis of the estimate. The distortion correction may instead be executed on the smartphone 200.

As described above, according to this embodiment, frame image data captured while a page is being turned is automatically discarded, and frame image data captured while the document D serving as the subject is stationary is extracted. Furthermore, the stationary state of the document D can be determined with the influence of shaking of the smartphone 200 excluded. Page-turning detection that is highly robust to shaking of the imaging unit is thereby achieved.

The present invention can be implemented not only in the embodiment described above but also in the following modifications.

Modification 1: In the embodiment above, the YUV image data is subjected to the DCT. The transform is not limited to the DCT, however. For example, the discrete Fourier transform (DFT) may be used. Any transform is acceptable as long as it converts the pixel values of each pixel block into frequency-domain data and allows the high-frequency components, to which vision is less sensitive, to be quantized coarsely.

Modification 2: In the embodiment above, still image data is not taken out of the moving image data. Instead, the inter-frame prediction results and DCT data produced during moving image processing are used to extract, from the plurality of frame image data held as uncompressed RAW image data, the frame image data captured while the document D is stationary. The invention is not limited to this method, however, and the frame image data may be restored from the moving image data.

However, in RGB image data restored from moving image data, the high-frequency components and YUV color-difference information lost through the DCT and its quantization are not fully restored, and the result is an image in which, for example, the outlines of text are blurred. The embodiment above therefore has the advantage that the frame image data from the time of imaging is available in complete form as uncompressed RAW image data. Because a book document, unlike a natural image, depends on faithful reproduction of text whose edges are expressed by high-frequency components, the embodiment is particularly effective. In this specification, the term "uncompressed" is used in a broad sense and may include lossless compression from which the data can be completely restored.

Modification 3: In the embodiment above, the plurality of frame image data is generated on the assumption that a moving image will be generated, but this assumption is not essential. For example, the plurality of frame image data may be generated using the burst (continuous shooting) function of the smartphone 200. In that case, the stationary state of the document D may be determined using the functions for generating a moving image, or functions equivalent to those for generating a moving image may be implemented in the control unit 210. In short, any imaging unit that generates a plurality of frame image data at preset time intervals can be used in the present invention.

Modification 4: In the embodiment above, the present invention is embodied as a function of the smartphone 200 (also called the image reading device). The processing need not be performed by the smartphone 200 alone, however. Part of the processing may be executed by the image forming apparatus 100, so that the invention is embodied as an image reading system.

Modification 5: In the embodiment above, a smartphone is used, but the present invention is applicable to any portable terminal capable of imaging, such as a notebook PC or a tablet.

DESCRIPTION OF SYMBOLS
10 Image reading system
100 Image forming apparatus
110 Control unit
120 Image forming unit
130 Operation display unit
140, 240 Storage unit
150, 250 Communication interface unit
160 Automatic document feeder (ADF)
200 Smartphone
210 Control unit
230 Operation display unit
260 Imaging unit

Claims

An image reading device comprising:
an imaging unit that captures a document image presented on a document surface of a book document and generates a plurality of frame image data at preset time intervals; and
an image analysis unit that analyzes the plurality of frame image data, determines, for at least some of the plurality of frame image data, whether each represents an image of the book document captured in a stationary state, and, on the basis of the determination, extracts the frame image data representing an image captured in the stationary state.

The image reading device according to claim 1, further comprising:
a frame memory that temporarily stores the plurality of frame image data in an uncompressed state,
wherein the image analysis unit extracts, from the plurality of frame image data stored in the frame memory, the uncompressed frame image data representing an image captured in the stationary state.

The image reading device according to claim 1 or 2,
wherein the image analysis unit divides each of the plurality of frame image data into a plurality of pixel blocks each composed of a plurality of pixels, generates a motion vector for each of the divided pixel blocks, and makes the determination on the basis of a judgment that the generated motion vectors agree.

The image reading device according to claim 3,
wherein the image analysis unit generates moving image data by performing inter-frame prediction on the plurality of frame image data, and makes the determination using a result of the inter-frame prediction.

The image reading device according to claim 4,
wherein, in generating the moving image data, the image analysis unit converts the pixel values of the plurality of pixel blocks into frequency-domain data, and extracts, from among the plurality of frame image data representing an image of the same page, the frame image data that has many pixel blocks containing a relatively large amount of high-frequency component data.

An image forming system comprising:
the image reading device according to any one of claims 1 to 5; and
an image forming apparatus that forms an image on a print medium.

An image reading method comprising:
an imaging step of capturing a document image presented on a document surface of a book document and generating a plurality of frame image data at preset time intervals; and
an image analysis step of analyzing the plurality of frame image data, determining, for at least some of the plurality of frame image data, whether each represents an image of the book document captured in a stationary state, and, on the basis of the determination, extracting the frame image data representing an image captured in the stationary state.

An image reading program for controlling an image reading device, the program causing the image reading device to function as:
an imaging unit that captures a document image presented on a document surface of a book document and generates a plurality of frame image data at preset time intervals; and
an image analysis unit that analyzes the plurality of frame image data, determines, for at least some of the plurality of frame image data, whether each represents an image of the book document captured in a stationary state, and, on the basis of the determination, extracts the frame image data representing an image captured in the stationary state.