JP2012074852A

JP2012074852A - Image processing device, image formation device, image reading device, image processing method, image processing program and recording medium

Info

Publication number: JP2012074852A
Application number: JP2010217342A
Authority: JP
Inventors: Akito Yoshida (吉田 章人)
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2010-09-28
Filing date: 2010-09-28
Publication date: 2012-04-12

Abstract

PROBLEM TO BE SOLVED: To provide an image processing device that, when converting image data obtained by reading a document into an image file in a format suitable for output and storage, uses layout analysis and character recognition processing to detect the situation regarding the amount of characters on the document and automatically selects a file format according to that situation.

SOLUTION: From the respective results of a character recognition unit 111 and a layout analysis unit 112, a document determination unit 113 determines whether the document image indicated by input image data is a character-centered image containing many characters. Based on the determination result of the document determination unit 113, a file format determination unit 114 selects a file format suitable for a document image when the document image is a character-centered image, selects a file format suitable for a photographic image when it is not, and outputs a designation signal specifying the file format to a formatting processing unit 110.

Description

The present invention relates to an image processing apparatus and an image processing method that convert image data read by a scanner or the like into a file format suitable for output and storage, and to an image forming apparatus and an image reading apparatus that use them.

It is common practice to read an image recorded on a paper medium such as a document with an image input device such as a scanner to create image data, convert the image data into a file format suitable for output and storage, such as PDF or TIFF, and then save it in a predetermined folder or transmit it over a network.

Image data read by an image input device is also sometimes divided into a plurality of bands (regions) and converted into a PDF file band by band. For example, Patent Document 1 discloses a technique that, when dividing image data into a plurality of bands (regions) and converting each band into a PDF file, automatically switches the image compression format applied to each band according to the free memory capacity.

Specifically, it is determined whether the free capacity of the memory used for region discrimination (characters, background, and so on) is at or above a specified value. If it is, the character region of the band is MMR-compressed and its background region is JPEG-compressed. If the free capacity is below the specified value, the entire band is JPEG-compressed. Furthermore, for a document (special document) that is considered highly likely to drive the free memory capacity below the specified value partway through the storage processing, such as a document with an extremely large number of characters on one page, JPEG compression is applied to all bands.
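The prior-art selection rule described above can be sketched as follows. This is only an illustration of the rule as summarized in this text, not code from Patent Document 1; the names `Compression` and `choose_band_compression` and the parameter names are hypothetical.

```python
from enum import Enum


class Compression(Enum):
    MMR = "MMR"
    JPEG = "JPEG"


def choose_band_compression(free_memory: int, threshold: int,
                            special_document: bool) -> dict:
    """Pick the compression for the character and background regions of one band.

    Hypothetical sketch of the prior-art rule: the choice depends on free
    memory, not on what the document actually contains.
    """
    if special_document or free_memory < threshold:
        # Low memory, or a document expected to exhaust memory:
        # the whole band is JPEG-compressed, characters included.
        return {"character": Compression.JPEG, "background": Compression.JPEG}
    # Enough memory: characters get MMR, background gets JPEG.
    return {"character": Compression.MMR, "background": Compression.JPEG}
```

Note that a page with an extremely large number of characters takes the `special_document` branch, so its characters end up JPEG-compressed regardless of free memory.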

Japanese Unexamined Patent Application Publication No. 2007-6411 (published January 11, 2007)
Japanese Unexamined Patent Application Publication No. H7-192086 (published July 28, 1995)
Japanese Unexamined Patent Application Publication No. H6-189083 (published July 8, 1994)
Japanese Unexamined Patent Application Publication No. 2002-94805 (published March 29, 2002)

However, when the image compression format is switched according to the free memory capacity as in Patent Document 1, the image may be compressed in a format that does not suit the type of document (the nature of the input image), so the stored image data cannot be used effectively later.

The clearest example is the handling of the document with an extremely large number of characters described above. For an image of such a document, MMR compression of the character region and JPEG compression of the background region is inherently preferable. As described above, however, Patent Document 1 subjects such a document to the undesirable processing of JPEG-compressing all bands. In the worst case, characters become difficult to make out in the image read back from the storage device and reproduced.

The present invention has been made in view of the above problem. Its object is to provide an image processing apparatus and an image processing method that, when converting image data obtained by reading a document into an image file in a format suitable for output and storage, use layout analysis and character recognition processing to detect the situation regarding the amount of characters in the document, and automatically select and convert to a file format suited to that situation.

To solve the above problem, an image processing apparatus of the present invention includes formatting processing means for converting input image data, obtained by reading a document image with an image input device, into a designated file format, and is characterized by including: character recognition means for performing character recognition processing on the input image data; layout analysis means for analyzing the layout of the image of each page of the input image data; document determination means for determining, from the analysis result of the layout analysis means and the recognition result of the character recognition means, whether the document image indicated by the input image data is a character-centered image containing many characters; and file format determination means for determining a file format based on the determination result of the document determination means and designating the determined file format to the formatting processing means. The file format determination means selects a file format suitable for a document image when the image is determined to be a character-centered image, and selects a file format suitable for a photographic image when it is determined not to be a character-centered image.

Likewise, to solve the above problem, an image processing method of the present invention includes a formatting processing step of converting input image data, obtained by reading a document image with an image input device, into a designated file format, and is characterized by including: a character recognition step of performing character recognition processing on the input image data; a layout analysis step of analyzing the layout of the image of each page of the input image data; a document determination step of determining, from the analysis result of the layout analysis step and the recognition result of the character recognition step, whether the document image indicated by the input image data is a character-centered image containing many characters; and a file format determination step of determining a file format based on the determination result of the document determination step and designating the determined file format for the formatting processing step. In the file format determination step, a file format suitable for a document image is selected when the document determination step determines that the image is a character-centered image, and a file format suitable for a photographic image is selected when the document determination step determines that it is not.

According to the above image processing apparatus and image processing method, whether the document image is a character-centered image containing many characters is determined from the character recognition result of the character recognition means and the layout analysis result of the layout analysis means. The target file format is switched depending on whether or not the image is character-centered, and a file format suited to each case is selected automatically.

In other words, the file format is automatically switched to one suited to the document image according to the situation regarding the amount of characters in the input document image.

The user can therefore have the data converted to an appropriate file format without having to check the amount of characters in the document image personally and manually switch to a file format suited to it.

As the file format suitable for a document image, for example, a high-compression PDF format can be used, in which the character portions of the document image are stored as a binary image and the other portions are stored as a multi-tone image. As the output file format suitable for a photographic image, a normal PDF format can be used, in which the entire document image is stored as a multi-tone image.
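The mapping from the determination result to these two example formats can be sketched as follows. The names `FileFormat`, `LAYER_ENCODING`, and `select_file_format` are hypothetical and serve only to illustrate the choice described above.

```python
from enum import Enum


class FileFormat(Enum):
    HIGH_COMPRESSION_PDF = "high-compression PDF"
    NORMAL_PDF = "normal PDF"


# How each example format stores the page, as described in the text.
LAYER_ENCODING = {
    FileFormat.HIGH_COMPRESSION_PDF: {"characters": "binary", "other": "multi-tone"},
    FileFormat.NORMAL_PDF: {"whole page": "multi-tone"},
}


def select_file_format(is_character_centered: bool) -> FileFormat:
    """Document-style format for character-centered pages, photo-style otherwise."""
    if is_character_centered:
        return FileFormat.HIGH_COMPRESSION_PDF
    return FileFormat.NORMAL_PDF
```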

In the image processing apparatus of the present invention, the document determination means may further include: character region ratio calculation means for calculating, from the results of the layout analysis means and the character recognition means, the proportion of the document image occupied by character regions; character counting means for counting the total number of characters per page recognized by the character recognition means; and first determination processing means for determining, based on the number of characters counted by the character counting means and the character region ratio calculated by the character region ratio calculation means, that the image is a character-centered image at least when the number of characters is large or the character region ratio is high, and that the image is not a character-centered image when the number of characters is small and the character region ratio is also low. Here, a character region means the circumscribed rectangle of a character.

With the above configuration, the image is determined to be a character-centered image at least when the number of characters is large or the character region ratio is high, and is determined not to be a character-centered image when the number of characters is small and the character region ratio is also low.

In other words, the situation regarding the amount of characters in the document image is detected from the number of characters or the character region ratio. Both allow the amount of characters in the document image to be detected with high accuracy. Automatic switching of the file format according to the amount of characters in the input document image can therefore be realized with higher accuracy.
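A minimal sketch of this first determination is given below, assuming the OCR and layout stages supply one circumscribed rectangle per recognized character. The class and function names and the two threshold defaults (`min_chars`, `min_ratio`) are hypothetical; the text only says "large" and "high" without giving values here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CharBox:
    """Circumscribed rectangle of one recognized character (pixels)."""
    x: float
    y: float
    width: float
    height: float


def character_region_ratio(boxes: Sequence[CharBox],
                           page_width: float, page_height: float) -> float:
    """Share of the page covered by character rectangles.

    Assumes the rectangles do not overlap, so their areas can simply be summed.
    """
    covered = sum(b.width * b.height for b in boxes)
    return covered / (page_width * page_height)


def is_character_centered(char_count: int, region_ratio: float,
                          min_chars: int = 200, min_ratio: float = 0.3) -> bool:
    """Character-centered if EITHER measure is high; otherwise not.

    The thresholds are illustrative placeholders, not values from the source.
    """
    return char_count >= min_chars or region_ratio >= min_ratio
```

For example, ten 10x10 character rectangles on a 100x100 page give a ratio of 0.1; with only ten characters the page is not character-centered under these placeholder thresholds, while the same ratio with 500 characters is.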

In the image processing apparatus of the present invention, the document determination means may further include: title region determination means for determining the presence or absence of a title region from the analysis result of the layout analysis means and the recognition result of the character recognition means; and second determination processing means for determining that the image is a character-centered image when a title region is present and that it is not a character-centered image when no title region is present.

With the above configuration, the image is determined to be a character-centered image when a title region is present, and is determined not to be a character-centered image when no title region is present.

In other words, the situation regarding the amount of characters in the document image is detected from whether a title region is present. A document image that has a title region is highly likely to be a text document containing many characters. The presence or absence of a title region therefore allows the amount of characters in the document image to be detected with high accuracy. This, too, allows automatic switching of the file format according to the amount of characters in the input document image to be realized with higher accuracy.

In this case, the title region determination means may include: provisional determination means for provisionally determining that a title region is present when, according to the analysis result of the layout analysis means, a band-shaped block located at a position where a title is generally expected, made up of consecutive character circumscribed rectangles that are larger than those in other regions of the document image, has at least a specified length; and final determination means for finally determining that the portion is a title region when the character recognition result for the portion provisionally determined to be a title region shows a recognition rate at or above a specified value.

The title of a text document is usually placed at a fixed position, such as the top or the right edge of the document. Its characters are also typically larger than those in other parts.

With the above configuration, these characteristics are used first to determine from the layout whether a title region is present. If some portion can be judged from the layout to be a title region (provisional determination), it is then determined whether that portion consists of characters, and if it does, it is determined to be a title region (final determination). The presence or absence of a title region can thereby be determined accurately.
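The two-stage title check can be sketched as below. This is a hedged illustration only: the `BandBlock` fields, the position fractions (`top_fraction`, `right_fraction`), and the recognition-rate threshold are hypothetical choices made for the example, since the text specifies only "a position where a title is generally expected", "a specified length", and "a specified recognition rate".

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BandBlock:
    """A band of consecutive character rectangles found by layout analysis."""
    x: float
    y: float
    width: float
    height: float
    char_size: float          # typical size of the character rectangles in the band
    recognition_rate: float   # share of the band's characters OCR recognized, 0..1


def is_provisional_title(block: BandBlock, page_width: float, page_height: float,
                         body_char_size: float, min_length: float,
                         top_fraction: float = 0.25,
                         right_fraction: float = 0.2) -> bool:
    """Layout-only check: expected position, larger characters, sufficient length."""
    at_top = block.y + block.height <= page_height * top_fraction
    at_right = block.x >= page_width * (1.0 - right_fraction)
    larger_chars = block.char_size > body_char_size
    # A horizontal title is long in width, a vertical one in height.
    long_enough = max(block.width, block.height) >= min_length
    return (at_top or at_right) and larger_chars and long_enough


def has_title_region(blocks: Iterable[BandBlock], page_width: float,
                     page_height: float, body_char_size: float,
                     min_length: float, min_recognition_rate: float = 0.8) -> bool:
    """Final check: a provisional title must also be recognized as characters."""
    return any(
        is_provisional_title(b, page_width, page_height, body_char_size, min_length)
        and b.recognition_rate >= min_recognition_rate
        for b in blocks
    )
```

Under the second determination described earlier, the return value of `has_title_region` would be used directly as the character-centered decision. The recognition-rate check is what keeps a large graphic band at the top of a photograph from being mistaken for a title.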

In the image processing apparatus of the present invention, it is further preferable that a selection signal indicating the intended use of the input image data is input to the file format determination means, and that, on determining from the selection signal that the intended use is image filing, the file format determination means instructs the formatting processing means to convert to the file format set as the default.

The output file format suited to document images gives priority to character portions and tends not to preserve the detail of non-character portions, so it is unsuitable for digitizing account books and similar records, which is one kind of filing use.

Therefore, a selection signal indicating the intended use is input to the file format determination means, and the output file format is not switched automatically when the use is electronic filing. This prevents account books and the like from being mistakenly saved in the output file format intended for document images.
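The override by the selection signal can be sketched as follows. The enum members and the function name are hypothetical, and the default format is passed in by the caller because the text does not say which format is configured as the default.

```python
from enum import Enum


class Usage(Enum):
    GENERAL = "general"
    FILING = "image filing"


class FileFormat(Enum):
    HIGH_COMPRESSION_PDF = "high-compression PDF"
    NORMAL_PDF = "normal PDF"


def decide_file_format(usage: Usage, is_character_centered: bool,
                       default_format: FileFormat) -> FileFormat:
    """Skip automatic switching for filing use; otherwise follow the page content."""
    if usage is Usage.FILING:
        # Filing: always use the configured default, whatever the page contains.
        return default_format
    if is_character_centered:
        return FileFormat.HIGH_COMPRESSION_PDF
    return FileFormat.NORMAL_PDF
```

With `default_format=FileFormat.NORMAL_PDF`, a character-heavy ledger scanned for filing would stay in the normal PDF format instead of being switched to the high-compression one.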

An image forming apparatus and an image reading apparatus equipped with the image processing apparatus of the present invention described above also fall within the scope of the invention.

The image processing apparatus may be realized by a computer. In that case, an image processing program that realizes the image processing apparatus on the computer by causing the computer to operate as each of the above units, and a computer-readable recording medium on which the program is recorded, also fall within the scope of the present invention.

As described above, the image processing apparatus of the present invention includes formatting processing means for converting input image data, obtained by reading a document image with an image input device, into a designated file format, and includes: character recognition means for performing character recognition processing on the input image data; layout analysis means for analyzing the layout of the image of each page of the input image data; document determination means for determining, from the analysis result of the layout analysis means and the recognition result of the character recognition means, whether the document image indicated by the input image data is a character-centered image containing many characters; and file format determination means for determining a file format based on the determination result of the document determination means and designating the determined file format to the formatting processing means. The file format determination means selects a file format suitable for a document image when the image is determined to be a character-centered image, and selects a file format suitable for a photographic image when it is determined not to be a character-centered image.

The image processing method of the present invention includes a formatting processing step of converting input image data, obtained by reading a document image with an image input device, into a designated file format, and includes: a character recognition step of performing character recognition processing on the input image data; a layout analysis step of analyzing the layout of the image of each page of the input image data; a document determination step of determining, from the analysis result of the layout analysis step and the recognition result of the character recognition step, whether the document image indicated by the input image data is a character-centered image containing many characters; and a file format determination step of determining a file format based on the determination result of the document determination step and designating the determined file format for the formatting processing step. In the file format determination step, a file format suitable for a document image is selected when the document determination step determines that the image is a character-centered image, and a file format suitable for a photographic image is selected when the document determination step determines that it is not.

As a result, the file format is automatically switched to a suitable one according to the situation regarding the amount of characters in the input document image. The user can therefore have the data converted to an appropriate file format without having to check the amount of characters in the document image personally and manually switch to a file format suited to it.

FIG. 1 is a block diagram showing the configuration of the main part of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the signal preprocessing unit in the image processing apparatus.
FIG. 3 is an explanatory diagram showing the layout analysis processing of the layout analysis unit in the image processing apparatus.
FIG. 4 is an explanatory diagram showing an example of a layout analysis result of the layout analysis unit in the image processing apparatus.
FIG. 5 is a block diagram showing the configuration of a first document determination unit, which is a first specific example of the document determination unit in the image processing apparatus.
FIG. 6 is a flowchart outlining the processing performed by the signal preprocessing unit, the character recognition unit, the layout analysis unit, and the first document determination unit of the image processing apparatus, up to the determination from the input image data of whether the document is a character-centered document.
FIG. 7 is a block diagram showing the configuration of a second document determination unit, which is a second specific example of the document determination unit in the image processing apparatus.
FIGS. 8(a) to 8(d) are explanatory diagrams showing the processing in which the second document determination unit provisionally determines from the layout analysis result that a title region is present.
FIG. 9 is a flowchart outlining the processing performed by the signal preprocessing unit, the character recognition unit, the layout analysis unit, and the second document determination unit of the image processing apparatus, up to the determination from the input image data of whether the document is a character-centered document.
FIG. 10 is a flowchart showing the procedure by which the file format determination unit in the image processing apparatus determines the file format and outputs the designation signal.
FIG. 11 is a block diagram showing the configuration of the formatting processing unit in the image processing apparatus.
FIG. 12 is a block diagram of a digital color multifunction peripheral equipped with the image processing apparatus, showing the flow of data in the so-called copy mode, in which a document image is read by the image input device to generate image data and an image based on that image data is produced and output by the image output device.
FIG. 13 is a block diagram of a digital color multifunction peripheral equipped with the image processing apparatus, showing the flow of data in the transmission mode, in which a document image is read by the image input device to generate image data and the image data is converted to another file format and output from the transmission/reception device.
FIGS. 14(a) and 14(b) are graphs showing examples of gamma curves used for halftone correction processing in the image processing apparatus.
FIG. 15 is a block diagram showing a modification of the digital color multifunction peripheral equipped with the image processing apparatus.
FIG. 16 is a block diagram showing a modification of the digital color multifunction peripheral equipped with the image processing apparatus.
FIG. 17 is a block diagram of a digital color scanner equipped with the image processing apparatus.

An embodiment of the present invention is described below with reference to FIGS. 1 to 17.

First, the main part of the image processing apparatus 100 of this embodiment is described with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the main part of the image processing apparatus 100.

As shown in FIG. 1, the image processing apparatus 100 includes a formatting processing unit (formatting processing means) 110, a signal preprocessing unit 115, a character recognition unit (character recognition means) 111, a layout analysis unit (layout analysis means) 112, a document determination unit (document determination means) 113, and a file format determination unit (file format determination means) 114.

The formatting processing unit 110 converts input image data, obtained by reading a document image with an image reading apparatus such as a scanner, into a designated file format. The converted image data is transmitted over a network as output image data via a transmission/reception device (not shown), or saved at a predetermined address in a storage device outside the image processing apparatus.

The input image data is binary image data when read by a monochrome image reading apparatus, and RGB color image data when read by a color image reading apparatus. This embodiment illustrates the case where the input image data is RGB color image data.

The file format used by the formatting processing unit 110 is specified by a designation signal from the file format determination unit 114, described later. The formatting processing unit 110 converts the input image data into the file format indicated by the designation signal input from the file format determination unit 114. Details of the formatting processing unit 110 are described later with reference to FIG. 11.

The signal preprocessing unit 115 applies signal conversion, binarization, resolution conversion, document skew correction, and similar processing to the RGB input image data that is input to the formatting processing unit 110, so that the data is in a state suitable for the downstream character recognition unit 111 and layout analysis unit 112. The output of the signal preprocessing unit 115 is supplied to the character recognition unit 111 and the layout analysis unit 112. Details of the signal preprocessing unit 115 are described later with reference to FIG. 2.
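As a rough illustration of what such preprocessing involves, the sketch below converts RGB pixels to luminance, binarizes them, and reduces resolution by simple decimation. It is not the processing of FIG. 2: the ITU-R BT.601 luminance weights, the fixed threshold of 128, and decimation as the resolution conversion are all assumptions made for this example, and skew correction is omitted.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def rgb_to_luminance(pixel: Pixel) -> float:
    """Signal conversion: RGB to luminance using the BT.601 weights (assumed)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(image: List[List[Pixel]], threshold: float = 128) -> List[List[int]]:
    """Binarization: 1 for dark (ink) pixels, 0 for light ones."""
    return [[1 if rgb_to_luminance(p) < threshold else 0 for p in row]
            for row in image]


def downsample(binary: List[List[int]], factor: int) -> List[List[int]]:
    """Resolution conversion by keeping every `factor`-th pixel and row."""
    return [row[::factor] for row in binary[::factor]]
```

A lower-resolution binary image of this kind is the sort of input an OCR or layout analysis stage typically works on, while the full RGB data still goes to the formatting stage.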

The character recognition unit 111 performs character recognition processing on the input image data using OCR technology. The character recognition method is not particularly limited, and any of various conventionally known methods can be used. The character recognition result is output to the document determination unit 113 and to the formatting processing unit 110. In this embodiment, the character recognition unit 111 is also used for document orientation determination, which determines which way up the document image is. Details of the character recognition unit 111 are also described later.

The layout analysis unit 112 analyzes the layout of the image on each page of the input image data. The layout analysis result is output to the document determination unit 113 and the character recognition unit 111. The layout analysis method is described later with reference to FIGS. 3 and 4.

Based on the analysis result of the layout analysis unit 112 and the recognition result of the character recognition unit 111, the document determination unit 113 determines whether the document image represented by the input image data is a character-centered image, that is, an image containing many characters. The determination result is output to the file format determination unit 114. Two example methods for this determination are described below. Details of the document determination unit 113 are described later with reference to FIGS. 5 to 9.

The file format determination unit 114 determines the file format based on the determination result of the document determination unit 113, and outputs a designation signal indicating the determined file format to the formatting processing unit 110. When the image is determined to be a character-centered image, the file format determination unit 114 selects a file format suitable for document images; when the image is determined not to be a character-centered image, it selects a file format suitable for photographic images. In the present embodiment, a selection signal indicating the intended use of the input image data is also input to the file format determination unit 114, and depending on the use indicated by the selection signal, the unit instructs the formatting processing unit 110 to select the default file format. Details of the file format determination unit 114 are described later with reference to FIG. 10.

The character recognition unit 111, layout analysis unit 112, document determination unit 113, and file format determination unit 114 shown in FIG. 1 together constitute the recognition processing unit 109 shown in FIGS. 12, 13, and 15 to 17, described later.

Each part of the image processing apparatus 100 is described in detail below. First, the configuration of the signal preprocessing unit 115 is described with reference to FIG. 2. FIG. 2 is a block diagram showing the configuration of the signal preprocessing unit 115.

As shown in FIG. 2, the signal preprocessing unit 115 includes a signal conversion unit 121, a binarization processing unit 122, a resolution conversion unit 123, a document tilt detection unit 124, and a tilt correction unit 125. The input image data is input to the signal conversion unit 121.

The signal conversion unit 121 achromatizes the RGB input image data and converts it into a luminance signal. For example, the signal conversion unit 121 converts the RGB signal into a luminance signal Y by computing Yi = 0.30Ri + 0.59Gi + 0.11Bi. Here, Y is the luminance signal of each pixel, and R, G, and B are the color components of the RGB signal of each pixel. The subscript i is an index assigned to each pixel (i is an integer of 1 or more).
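The luminance conversion can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name `rgb_to_luminance` and the representation of pixels as a list of (R, G, B) tuples are hypothetical and do not come from the embodiment; only the weights 0.30, 0.59, and 0.11 are taken from the text.

```python
def rgb_to_luminance(pixels):
    """Convert (R, G, B) tuples to luminance values Yi = 0.30Ri + 0.59Gi + 0.11Bi."""
    return [0.30 * r + 0.59 * g + 0.11 * b for (r, g, b) in pixels]


# Hypothetical sample pixels: white, black, and pure red.
pixels = [(255, 255, 255), (0, 0, 0), (255, 0, 0)]
luma = rgb_to_luminance(pixels)
```

Because the three weights sum to 1, a neutral gray pixel keeps its value, while saturated colors map to a gray level weighted toward the green component.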

Alternatively, the signal conversion unit 121 may achromatize the RGB input image data by converting it into a CIE 1976 L*a*b* signal (CIE: Commission Internationale de l'Eclairage; L*: lightness; a*, b*: chromaticity).

The binarization processing unit 122 binarizes the image data by comparing the achromatized image data input from the signal conversion unit 121 (a luminance value, since a luminance signal is used in this example) with a preset threshold. For example, when the image data is 8-bit, the threshold is set to 128. Alternatively, the average density (pixel value) of a block consisting of multiple pixels (for example, 5 × 5 pixels) may be used as the threshold.
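The two thresholding options can be sketched in Python as follows. The function names, the nested-list image representation, and the polarity (1 for a dark pixel, 0 for a light pixel) are assumptions made for illustration; the text specifies only the threshold of 128 and the 5 × 5 block average.

```python
def binarize_fixed(gray, threshold=128):
    """Binarize with a fixed threshold; 1 = dark pixel, 0 = light pixel (assumed polarity)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def binarize_block_mean(gray, block=5):
    """Binarize each block x block tile against the mean pixel value of that tile."""
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for by in range(0, height, block):
        for bx in range(0, width, block):
            ys = range(by, min(by + block, height))
            xs = range(bx, min(bx + block, width))
            values = [gray[y][x] for y in ys for x in xs]
            mean = sum(values) / len(values)
            for y in ys:
                for x in xs:
                    out[y][x] = 1 if gray[y][x] < mean else 0
    return out
```

A fixed threshold is cheap but sensitive to uneven illumination; the block-mean variant adapts to local brightness at the cost of one extra pass per tile.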

The resolution conversion unit 123 converts the image data binarized by the binarization processing unit 122 to a lower resolution. For example, image data read at 1200 dpi or 600 dpi is converted to 300 dpi. The resolution conversion method is not particularly limited; for example, the known nearest-neighbor, bilinear, or bicubic method can be used.

In the present embodiment, the resolution conversion unit 123 reduces the image data to two resolutions, 300 dpi and 75 dpi. The 300 dpi image data is output to the document tilt detection unit 124, and the 75 dpi image data (a resolution sufficient to show the approximate layout) is output to the layout analysis unit.
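As an illustration of the resolution reduction, the following Python sketch uses the nearest-neighbor method, one of the known methods named in the text. The function name `downsample_nearest` and the nested-list image representation are hypothetical, and the embodiment does not state which method it actually uses.

```python
def downsample_nearest(image, src_dpi, dst_dpi):
    """Reduce resolution by taking the nearest source pixel for each output pixel."""
    scale = src_dpi / dst_dpi  # e.g. 600 / 300 = 2.0
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, int(src_h / scale))
    dst_w = max(1, int(src_w / scale))
    return [
        [image[min(src_h - 1, int(y * scale))][min(src_w - 1, int(x * scale))]
         for x in range(dst_w)]
        for y in range(dst_h)
    ]


# Hypothetical 4 x 4 page, reduced once for skew detection and once for layout analysis.
page = [[row * 4 + col for col in range(4)] for row in range(4)]
for_skew = downsample_nearest(page, 600, 300)
for_layout = downsample_nearest(page, 300, 75)
```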

Based on the image data converted to low resolution (300 dpi) by the resolution conversion unit 123, the document tilt detection unit 124 detects the tilt angle of the document relative to the scan range (the regular document position) at the time of image reading. The tilt angle is detected when the document was placed at an angle to the scan range of the image reading apparatus. The method of detecting the tilt angle is not particularly limited, and various conventionally known methods can be used. Information on the detected tilt angle is output to the tilt correction unit 125 in the subsequent stage.

Here, the case of using the method described in Patent Document 2 is explained. In this method, a plurality of boundary points between black pixels and white pixels (for example, the coordinates of the white/black boundary points at the upper edge of each character) are extracted from the binarized image data, and the coordinate data of the resulting point sequence is obtained. A regression line is then fitted to this coordinate data, and its regression coefficient b is calculated from formula (1).

$$b = \frac{S_{xy}}{S_x} \quad (1)$$

Sx and Sy are the residual sums of squares of the variables x and y, respectively, and Sxy is the sum of the products of the residuals of x and the residuals of y. That is, Sx, Sy, and Sxy are given by formulas (2) to (4), where n is the number of boundary points and x̄ and ȳ are the means of the xi and yi.

$$S_x = \sum_{i=1}^{n} (x_i - \bar{x})^2 \quad (2)$$

$$S_y = \sum_{i=1}^{n} (y_i - \bar{y})^2 \quad (3)$$

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \quad (4)$$

The tilt angle θ is then calculated from the regression coefficient b using formula (5).

$$\tan\theta = b \quad (5)$$

The tilt correction unit 125 corrects the tilt of the document by correcting the document coordinates based on the tilt angle θ detected by the document tilt detection unit 124.
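The regression-based estimate of formulas (1) and (5) can be sketched in Python as follows. The function name `skew_angle` and the representation of boundary points as (x, y) tuples are hypothetical, and the extraction of boundary points from the binarized image is assumed to have been done already.

```python
import math


def skew_angle(points):
    """Estimate the tilt angle in radians from boundary points via b = Sxy / Sx."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_x = sum((x - mean_x) ** 2 for x, _ in points)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if s_x == 0:
        raise ValueError("all boundary points share the same x coordinate")
    b = s_xy / s_x       # regression coefficient, formula (1)
    return math.atan(b)  # tan(theta) = b, formula (5)
```

For points lying exactly on the line y = x the function returns π/4, and for a perfectly horizontal run of points it returns 0.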

When tilt correction is performed by rotating counterclockwise by the angle θ about the origin, the relationship between the coordinates before correction (X, Y) and the coordinates after correction (X', Y') is expressed by the following formula, which is the standard rotation about the origin. The document tilt is corrected using this formula.

$$\begin{pmatrix} X' \\ Y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}$$
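A coordinate rotation of this kind can be sketched in Python as follows. The sketch assumes the standard counterclockwise rotation about the origin in a coordinate system whose Y axis points up; the function name `rotate_point` is hypothetical. In image coordinates with the Y axis pointing down, or depending on how the sign of the detected angle is defined, the sign of θ may need to be reversed.

```python
import math


def rotate_point(x, y, theta):
    """Rotate (x, y) counterclockwise by theta radians about the origin."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (x * cos_t - y * sin_t, x * sin_t + y * cos_t)


# A quarter turn moves a point on the X axis onto the Y axis.
rx, ry = rotate_point(1.0, 0.0, math.pi / 2)
```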

When rotating counterclockwise by 90°, 180°, or 270° about the origin, the relationship between the coordinates before rotation (X, Y) and the coordinates after rotation (X', Y') is expressed by the following formulas. The document is corrected using these formulas.

(90° rotation)
X' = (Y-direction size of original image) − 1 − Y
Y' = X
(180° rotation)
X' = (X-direction size of original image) − 1 − X
Y' = (Y-direction size of original image) − 1 − Y
(270° rotation)
X' = Y
Y' = (X-direction size of original image) − 1 − X
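The right-angle rotation formulas translate directly into a small Python helper. The function name `rotate_coords` and its argument order are hypothetical; `width` and `height` stand for the X-direction and Y-direction sizes of the original image.

```python
def rotate_coords(x, y, width, height, angle):
    """Map (x, y) in a width x height image to its position after a right-angle rotation."""
    if angle == 0:
        return (x, y)
    if angle == 90:
        return (height - 1 - y, x)
    if angle == 180:
        return (width - 1 - x, height - 1 - y)
    if angle == 270:
        return (y, width - 1 - x)
    raise ValueError("angle must be 0, 90, 180, or 270")
```

Note that after a 90° or 270° rotation the image dimensions are swapped, so applying the 90° mapping twice (with swapped dimensions the second time) gives the same result as the 180° mapping.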

As a result of the coordinate correction by the tilt correction unit 125, neither the image data input to the character recognition unit 111 (300 dpi) nor the image data input to the layout analysis unit 112 (75 dpi) contains document tilt.

The 300 dpi image data whose tilt has been corrected by the tilt correction unit 125 is output to the character recognition unit 111, and the tilt-corrected 75 dpi image data is output to the layout analysis unit 112.

Although not shown in FIG. 1, a document correction unit provided upstream of the formatting processing unit 110 also uses the functions of the document tilt detection unit 124 and the tilt correction unit 125 to correct the document tilt in the image data (RGB color image data) that is input to the formatting processing unit 110. The document correction unit 15 shown in FIGS. 12, 13, and 15 to 17, described later, corresponds to this unit.

Next, an example of the layout analysis procedure in the layout analysis unit 112 is described with reference to FIGS. 3 and 4.
(1) First, for the tilt-corrected image data (75 dpi), the first line is taken as the line of interest and its black pixels are labeled.
(2) Next, the line of interest is shifted down by one line, and the black pixels on it are given a label different from that of the line above.
(3) The connectivity between the black pixels on the line of interest and those on the line immediately above is examined. Pixels that are connected are judged to belong together, and their label is replaced with the label of the line above.
(4) The above processing is repeated to extract characters. For each extracted character, a circumscribed rectangle (corresponding to the character area of one character) is extracted from the pixel positions of its top, bottom, left, and right edges, as shown in FIG. 3. The pixel coordinates are obtained with the left end of the input image data as the origin.
(5) The average vertical length of the extracted circumscribed rectangles is obtained, excluding rectangles whose vertical length is small. This is done for each row (each unit for which circumscribed rectangles are obtained). The average of the bottom coordinates of the circumscribed rectangles is also calculated.
(6) Starting from the average bottom coordinate of the circumscribed rectangles, it is determined whether a new black pixel exists within a predetermined range based on the average vertical length. If a black pixel exists, the circumscribed rectangles of the next row are obtained, and their average vertical length is obtained in the same manner as in (5). The predetermined range is, for example, 1.5 times the average vertical length. If the next row lies within the predetermined range, it is determined to belong to the same block; otherwise, it is determined to belong to a different block.
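Steps (1) to (6) can be approximated by the following Python sketch. It is not the line-by-line relabeling described above: for brevity it finds connected black pixels with a breadth-first flood fill, which yields the same circumscribed rectangles under the assumption of 4-connectivity. The names `character_boxes` and `same_block` are hypothetical, and black pixels are assumed to be encoded as 1.

```python
from collections import deque


def character_boxes(binary):
    """Return (left, top, right, bottom) for each connected group of black pixels."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            left = right = x
            top = bottom = y
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nx, ny = cx + dx, cy + dy
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


def same_block(avg_bottom, avg_height, next_row_top, factor=1.5):
    """Step (6): the next row joins the block if it starts within factor x average height."""
    return next_row_top - avg_bottom <= factor * avg_height
```

The `factor` default of 1.5 follows the example range given in step (6); how the gap is measured (here, from the average bottom coordinate to the top of the next row) is an assumption.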

FIG. 4 shows the result of layout analysis of the first page of an academic paper. Five blocks A to E are extracted.

Of these, block A is the title. A title is usually located at the top or at the right edge of a document. A title is also characterized by a character size larger than that of the body text. That is, its circumscribed rectangles are larger than those of the region regarded as body text, for example at least 1.5 times as large.

Block B is the part containing the author names, block C is the abstract, and blocks D and E are the body text. The abstract does not need to be distinguished from the body text. Tables and graphs can also be extracted.

The character recognition unit 111 extracts feature quantities from the tilt-corrected image data (300 dpi) and performs character recognition by comparing them with dictionary data.

In the present embodiment, as described above, the character recognition unit 111 is also used for top/bottom determination of the document image. For top/bottom determination, character recognition is performed on a sample of the characters, and the orientation of those characters is detected to determine the top and bottom of the document. Character recognition of all the characters in the image data is then performed after the top and bottom of the document have been determined.

The top/bottom determination is described next. The image processing apparatus 100 of the present embodiment includes the layout analysis unit 112, and layout analysis divides the document image into a plurality of blocks as shown in FIG. 4. The image processing apparatus 100 therefore first performs top/bottom determination for each block, and then uses the per-block results to make the final top/bottom determination for the document.

The method described in Patent Document 3, for example, can be used for top/bottom determination of the document. The procedure is as follows.
(A) Character recognition is performed using OCR technology: the characters in the document are cut out one by one, and each character is converted into a pattern. Because it is only necessary to determine the top and bottom of the document, this is done for a sampled subset of the characters rather than for all of them.
(B) The features of each character pattern are compared with character pattern information held in a database. For matching, the cut-out character pattern is superimposed on each character pattern in the database and the black and white values are compared pixel by pixel. If a database pattern matches in every pixel, the input pattern is identified as that character. If no pattern matches in every pixel, the input pattern is identified as the character whose pattern has the most matching pixels. If a predetermined matching ratio is not reached, the character is judged to be unidentifiable.
(C) The cut-out character pattern is rotated by 90°, 180°, and 270°, and the processing of (B) is repeated.
(D) The numbers of identifiable characters obtained in (B) and (C) are compared across rotation angles, and the rotation angle with the largest number of identifiable characters is taken as the character direction, which is the top-bottom direction of the block.
(E) Based on the per-block results, the direction indicated by the largest number of blocks is determined to be the top-bottom direction of the document. If the blocks do not all agree, the image may be displayed on the display unit to prompt the user to confirm the top and bottom.
(F) The result of the top/bottom determination is fed back to the character recognition unit 111, so that the character recognition unit 111 can recognize all the characters in the image data with the top and bottom of the document known.
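The voting in steps (D) and (E) can be sketched in Python as follows. The sketch assumes that pattern matching has already produced, for each block, the number of identifiable characters at each rotation angle; the function names and the dictionary layout are hypothetical, and tie-breaking is not specified in the text (here the first maximum encountered wins).

```python
from collections import Counter


def block_direction(identifiable_counts):
    """Step (D): pick the rotation angle with the most identifiable characters."""
    return max(identifiable_counts, key=identifiable_counts.get)


def document_direction(blocks):
    """Step (E): majority vote over blocks; also report whether all blocks agree."""
    votes = Counter(block_direction(counts) for counts in blocks)
    angle, count = votes.most_common(1)[0]
    return angle, count == len(blocks)


# Hypothetical per-block counts keyed by rotation angle in degrees.
blocks = [
    {0: 12, 90: 1, 180: 3, 270: 0},
    {0: 8, 90: 0, 180: 2, 270: 1},
    {0: 2, 90: 0, 180: 5, 270: 0},
]
angle, unanimous = document_direction(blocks)
```

When `unanimous` is false, a caller could prompt the user to confirm the direction, as suggested in step (E).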

When recognizing all the characters in the image data, the character recognition unit 111 likewise uses the layout analysis result of the layout analysis unit 112 and performs character recognition block by block.

Although not shown in FIG. 1, as with tilt correction, the document correction unit provided upstream of the formatting processing unit 110 uses the result of this top/bottom determination to correct the top and bottom of the document in the image data (RGB color image data) that is input to the formatting processing unit 110.

Next, the document determination unit 113 is described with reference to FIGS. 5 to 9. As noted above, two methods of determining whether an image is a character-centered image are described here. FIG. 5 is a block diagram showing the configuration of the first document determination unit 113A, and FIG. 7 is a block diagram showing the configuration of the second document determination unit 113B.

First, the first document determination unit 113A is described with reference to FIGS. 5 and 6. The first document determination unit 113A detects the amount of characters in the document image from the number of characters or the ratio of the character area. The number of characters and the character area ratio allow the amount of characters in the document image to be detected with high accuracy.

As shown in FIG. 5, the first document determination unit 113A includes a character counting unit (character counting means) 130, a character area ratio calculation unit (character area ratio calculation means) 131, and a first determination processing unit (first determination processing means) 132.

The character recognition result is input to the character counting unit 130 from the character recognition unit 111. The character counting unit 130 counts the characters recognized by the character recognition unit 111 and obtains the total number of characters for each page.

The character area ratio calculation unit 131 receives the layout analysis result from the layout analysis unit 112 and the character recognition result from the character recognition unit 111, and from these calculates the proportion of the document image occupied by character areas. Specifically, for each character recognized by the character recognition unit 111, it sums the areas of the circumscribed rectangles obtained by the layout analysis unit 112, and calculates the ratio of the total circumscribed-rectangle area of all recognized characters to the total area of the document image (the whole image of one page).
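The character area ratio can be sketched in Python as follows. The function name `character_area_ratio` and the (left, top, right, bottom) rectangle format with inclusive pixel coordinates are assumptions for illustration.

```python
def character_area_ratio(char_boxes, page_width, page_height):
    """Ratio of the summed circumscribed-rectangle area to the area of the whole page."""
    total = sum((right - left + 1) * (bottom - top + 1)
                for left, top, right, bottom in char_boxes)
    return total / (page_width * page_height)


# Two hypothetical 10 x 10 pixel characters on a 100 x 100 pixel page.
ratio = character_area_ratio([(0, 0, 9, 9), (20, 0, 29, 9)], 100, 100)
```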

As described above, in the image processing apparatus 100 the character recognition unit 111 performs character recognition block by block based on the layout analysis result of the layout analysis unit 112. The counting of recognized characters by the character counting unit 130 and the summation of circumscribed-rectangle areas by the character area ratio calculation unit 131 are therefore also performed block by block.

The first determination processing unit 132 determines whether the image is a character-centered image based on the total number of characters on one page counted by the character counting unit 130 and the character area ratio calculated by the character area ratio calculation unit 131. The determination result is sent to the file format determination unit 114.

The first determination processing unit 132 receives from the character counting unit 130 the count of characters recognized in the document image (per page), and from the character area ratio calculation unit 131 the calculated ratio of the total circumscribed-rectangle area of the recognized characters to the total area of the document image (per page), hereinafter referred to as the character area ratio.

The first determination processing unit 132 compares the number of characters and the character area ratio with predetermined reference values. It determines that the image is a character-centered image when at least one of the two conditions holds, that is, when the number of characters is large or the character area ratio is high. It determines that the image is not a character-centered image when the number of characters is small and the character area ratio is also low.

In other words, when the number of characters is equal to or greater than its reference value, the first determination processing unit 132 determines that the image is a character-centered image even if the character area ratio is below its reference value. Likewise, when the character area ratio is equal to or greater than its reference value, the image is determined to be a character-centered image even if the number of characters is below its reference value. The image is determined not to be a character-centered image only when both the number of characters and the character area ratio are below their reference values.

Examples of the reference values are 150 characters or more for the number of characters and 20% or more for the character area ratio. The reference values may be freely set by the user, or the user may select from a plurality of sets prepared in advance.
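The decision rule of the first determination processing unit 132 reduces to a logical OR of two threshold tests, as the following Python sketch shows. The function name `is_character_centered` is hypothetical; the default thresholds are the example values of 150 characters and 20% given in the text.

```python
def is_character_centered(char_count, area_ratio, min_chars=150, min_ratio=0.20):
    """True if either the character count or the character area ratio reaches its threshold."""
    return char_count >= min_chars or area_ratio >= min_ratio
```

Passing `min_chars` and `min_ratio` as parameters mirrors the option of letting the user set the reference values or choose among prepared sets.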

The flowchart of FIG. 6 shows the general flow of processing performed by the signal preprocessing unit 115, the character recognition unit 111, the layout analysis unit 112, and the first document determination unit 113A, from the input image data up to the determination of whether the document is a character-centered document.

The signal preprocessing unit 115 detects the document tilt (S1) and corrects it (S2). The layout analysis unit 112 performs layout analysis using the tilt-corrected image data (S3). Based on the layout analysis result and using character recognition by the character recognition unit 111, the direction indicated by the largest number of extracted blocks is taken as the top-bottom direction of the document (S4).

Once the top and bottom of the document have been determined, the character recognition unit 111 starts character recognition of the entire image. Character recognition is performed block by block using the layout analysis result. When the character recognition unit 111 has recognized the characters of the first block (S5), the first document determination unit 113A uses the result to count the recognized characters (S6) and to obtain the area of their character regions (S7).

In S8, it is determined whether all the blocks in the document image of the page have been processed. If blocks remain, the process returns to S5, and S5 to S7 are performed for the next block, so that the count of recognized characters and the area of their character regions are added to the running totals. This is repeated until it is determined in S8 that all blocks have been processed.

When it is determined in S8 that all blocks have been processed, the process proceeds to S9, where the first document determination unit 113A obtains the character area ratio from the accumulated character region area and determines whether the image is a character-centered image based on the character area ratio and the total number of characters. The determination result is sent to the file format determination unit 114.

The file format determination unit 114 then determines the file format using the determination result (S10). Details of S10 are described later with reference to the flowchart of FIG. 10.

Next, the second document determination unit 113B is described with reference to FIGS. 7 to 9. The second document determination unit 113B detects the amount of characters in the document image from whether a title area is present. A document image that has a title area is highly likely to be a document consisting largely of text. The presence or absence of a title area therefore allows the amount of characters in the document image to be detected with high accuracy.

As shown in FIG. 7, the second document determination unit 113B includes a title area determination unit (title area determination means) 140 and a second determination processing unit (second determination processing means) 141.

The title area determination unit 140 determines whether the document image has a title area, based on the analysis result of the layout analysis unit 112 and the recognition result of the character recognition unit 111. The title area determination unit 140 includes a provisional determination unit (provisional determination means) 142 and a main determination unit (main determination means) 143.

Based on the analysis result of the layout analysis unit 112, the provisional determination unit 142 provisionally determines a block to be a title area when, at a position where a title is generally expected, the circumscribed rectangles of the characters are larger than in other regions of the document image (that is, the character size is larger) and the band-shaped block formed by those consecutive circumscribed rectangles has at least a specified length.

The procedure by which the provisional determination unit 142 provisionally determines a title area from the layout analysis result is described here with reference to FIGS. 8(a) to 8(d). Based on the analysis result of the layout analysis unit 112, the provisional determination unit 142 provisionally determines an extracted block to be a title area when the block satisfies the following conditions.

FIGS. 8(a) to 8(d) show the layout analysis results of the image data obtained when the document having the layout analysis result of FIG. 4 is input rotated in each of four directions. FIG. 8(a) shows the upward orientation, FIG. 8(b) the rightward, FIG. 8(c) the downward, and FIG. 8(d) the leftward.

For each assumed orientation of the image data (upward, rightward, downward, or leftward), a block is determined to be a title area when it lies on the upper side for that orientation (the position of block A in FIGS. 8(a) to 8(d)) or on the right side, has a circumscribed rectangle size at or above a specified value, and has a length at or above a specified value.

Examples of the specified values include a circumscribed rectangle size of 18 points or more, or 1.5 times or more the circumscribed rectangle size of the body text area, and a length of 40% or more of the width or height of the image data. Here, the width and height of the image data are the dimensions denoted by reference signs L1 and L2 in FIG. 8(a). With these values, block A is determined to be the title area in FIGS. 8(a) to 8(d). The specified values may be set freely by the user, or several sets may be prepared for the user to choose from.
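The provisional check can be sketched as follows. This is a minimal illustration rather than the apparatus's actual implementation: the `Block` structure, its field names, and the function name are hypothetical, and treating the 18-point and 1.5-times criteria as alternatives is an assumption. Only the example values (18 points, 1.5 times, 40%) come from the description.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A text block from layout analysis (hypothetical structure)."""
    char_size_pt: float      # circumscribed-rectangle size of its characters, in points
    length_px: float         # length of the band-shaped block
    at_title_position: bool  # True if the block lies on the upper or right side


def is_provisional_title(block, body_char_size_pt, page_extent_px,
                         min_size_pt=18.0, min_size_ratio=1.5,
                         min_length_ratio=0.40):
    """Return True if the block passes the provisional title check."""
    if not block.at_title_position:
        return False
    # Assumption: either size criterion is sufficient on its own.
    large_enough = (block.char_size_pt >= min_size_pt
                    or block.char_size_pt >= min_size_ratio * body_char_size_pt)
    # The block must span at least 40% of the page width or height.
    long_enough = block.length_px >= min_length_ratio * page_extent_px
    return large_enough and long_enough
```

Passing the thresholds as parameters mirrors the remark that the specified values may be set by the user or chosen from prepared sets.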

The main determination unit 143 makes the main determination that a portion provisionally determined to be a title area by the provisional determination unit 142 is a title area when the character recognition result for that portion shows a recognition rate at or above a specified value. The main determination unit 143 receives information on the provisionally determined blocks from the provisional determination unit 142 and the character recognition result from the character recognition unit 111.

For example, the main determination unit 143 finally determines a block that the provisional determination unit 142 judged to be a title area to be a title area when its character recognition rate is at or above the specified value.

One example of the specified character recognition rate is that 50% or more of the circumscribed rectangles in the block are recognized as characters. This value, too, may be set freely by the user or selected by the user from several sets prepared in advance.

Based on the determination result of the title area determination unit 140, the second determination processing unit 141 determines that the image is a character-based image when a title area is present and that it is not a character-based image when no title area is present. The determination result is sent to the file format determination unit 114.
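The main determination and the second determination can be sketched together. This is an illustrative sketch under stated assumptions: the function names and the representation of a block as a list of per-rectangle booleans are hypothetical. The 50% figure is the example value from the description.

```python
def is_confirmed_title(recognized_flags, min_recognition_rate=0.5):
    """Main determination for one provisionally determined block.

    recognized_flags holds one boolean per circumscribed rectangle in the
    block, True where character recognition succeeded (assumed encoding).
    """
    if not recognized_flags:
        return False
    rate = sum(recognized_flags) / len(recognized_flags)
    return rate >= min_recognition_rate


def is_character_based(provisional_title_blocks):
    """Second determination: character-based if any title block is confirmed."""
    return any(is_confirmed_title(flags) for flags in provisional_title_blocks)
```

With no provisionally determined blocks the result is `False`, which matches the flow in which the main determination is skipped and no title area is found.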

The flowchart of FIG. 9 outlines the processing performed by the signal preprocessing unit 115, the character recognition unit 111, the layout analysis unit 112, and the second document determination unit 113B, from the input image data up to the determination of whether the document is a character-based document.

The signal preprocessing unit 115 detects the document skew (S21) and corrects it (S22). The layout analysis unit 112 performs layout analysis on the skew-corrected image data (S23). Based on the layout analysis result, the second document determination unit 113B provisionally determines whether a block that is a title area exists (S24).

Next, using character recognition by the character recognition unit 111, the direction indicated by the largest number of extracted blocks is taken as the top-bottom orientation of the document (S25).
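The orientation decision in S25 amounts to a majority vote over the blocks. The sketch below assumes each block has already been assigned a direction label by character recognition; the labels and the function name are hypothetical, and the description does not say how ties are resolved.

```python
from collections import Counter


def decide_orientation(block_directions):
    """Return the direction reported by the most blocks, or None if empty.

    block_directions is a list such as ["up", "up", "right"], one entry per
    extracted block. On a tie, the direction encountered first wins, which is
    an arbitrary choice made for this sketch.
    """
    if not block_directions:
        return None
    counts = Counter(block_directions)
    return counts.most_common(1)[0][0]
```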

Once the orientation of the document has been determined, the character recognition unit 111 starts character recognition of the entire image. Character recognition is performed block by block using the layout analysis result (S26).

When character recognition is complete, the process proceeds to S27 and checks whether any block was provisionally determined to be a title area in S24. If so, the process proceeds to S28 and makes the main determination of the title area using the character recognition result. If not, S28 is skipped.

In S29, whether the image is a character-based image is determined from the presence or absence of a title area. The determination result is sent to the file format determination unit 114.

The file format determination unit 114 then determines the file format using the determination result (S30). S30 is the same as S10 in the flowchart of FIG. 6; details are described later with reference to the flowchart of FIG. 10.

Next, the processing performed by the file format determination unit 114 is described with reference to FIG. 10. FIG. 10 is a flowchart showing the procedure by which the file format determination unit 114 determines the file format and outputs a designation signal.

As described above, the file format determination unit 114 determines the file format based on the determination result of the document determination unit 113 (the first document determination unit 113A or the second document determination unit 113B). The determined file format is output to the formatting processing unit 110 as a designation signal.

When the image is determined to be a character-based image, the file format determination unit 114 selects a file format suitable for document images; when it is determined not to be a character-based image, the unit selects a file format suitable for photographic images.

More specifically, in this embodiment, the file format determination unit 114 selects, as the file format suitable for document images, a high-compression PDF format that stores the character portions of the document image as a binary image and the remaining portions as a multi-tone image. As the output file format suitable for photographic images, it selects a normal PDF format that stores the entire document image as a multi-tone image.

In this embodiment, a selection signal indicating the intended use of the input image data is also input to the file format determination unit 114. When the unit determines from the selection signal that the use is image filing, it instructs the formatting processing unit 110 to convert to the file format set as the default.

The output file format for document images gives priority to the character portions, so detail in the non-character portions tends to be lost. It is therefore unsuitable for digitizing account books and similar records, which is one of the filing uses.

For this reason, a selection signal indicating the intended use is input to the file format determination unit 114, and automatic switching of the output file format is disabled for electronic filing. This prevents account books and similar records from being saved by mistake in the output file format for document images.

When the image processing apparatus 100 is installed in an image forming apparatus with an image filing function (for example, an MFP), a selection signal indicating the processing mode selected by the user on the operation panel or the like (scan to e-mail mode, scan to FTP mode, scan to shared folder mode, scan to Desktop mode, and so on) is input to the file format determination unit 114. The unit determines that the use is image filing when the selection signal shows that the selected processing mode is one preset as an image filing mode (for example, scan to shared folder mode). Of the modes listed, scan to FTP mode and scan to shared folder mode can be image filing modes. Depending on the device specifications, both or either one of them is preset as an image filing mode.

The flowchart of FIG. 10 shows the processing procedure of the file format determination unit 114. The unit first judges whether the selection signal is OK, that is, whether or not the intended use is image filing (S31). If the selection signal indicates image filing as the use, the unit judges that the selection signal is not OK, proceeds to S35, and selects the default format.

If the selection signal does not indicate image filing, the unit judges that the selection signal is OK, proceeds to S32, and checks whether the determination result of the document determination unit 113 is a character-based document.

If the document is character-based, the process proceeds to S33 and the high-compression PDF format is selected. Otherwise, the process proceeds to S34 and the normal PDF format is selected.

When the file format has been selected in S33, S34, or S35, the process proceeds to S36, and a designation signal specifying the selected file format is sent to the formatting processing unit 110.
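The decision flow of S31 to S36 can be sketched as a small function. The enum, the mode strings, and the function name are hypothetical, and the set of modes treated as image filing is an assumption, since the description leaves it to the device specifications.

```python
from enum import Enum


class FileFormat(Enum):
    DEFAULT = "default"
    HIGH_COMPRESSION_PDF = "high_compression_pdf"
    NORMAL_PDF = "normal_pdf"


# Assumption: which modes count as image filing depends on the device.
FILING_MODES = frozenset({"scan_to_ftp", "scan_to_shared_folder"})


def decide_file_format(processing_mode, is_character_based,
                       filing_modes=FILING_MODES):
    """Return the format that the designation signal would specify (S36)."""
    # S31: image filing use means the selection signal is "not OK".
    if processing_mode in filing_modes:
        return FileFormat.DEFAULT               # S35
    # S32: branch on the document determination result.
    if is_character_based:
        return FileFormat.HIGH_COMPRESSION_PDF  # S33
    return FileFormat.NORMAL_PDF                # S34
```

For example, `decide_file_format("scan_to_email", True)` selects the high-compression PDF format, while any filing mode falls back to the default format regardless of the document determination.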

Next, the formatting processing unit 110 is described with reference to FIG. 11. FIG. 11 is a block diagram showing the configuration of the formatting processing unit 110, which includes a transparent text creation unit 150 and an image file generation unit 151.

The character recognition result is input from the character recognition unit 111 to the transparent text creation unit 150. The transparent text creation unit 150 generates transparent text based on the character recognition result and outputs it to the image file generation unit 151. Transparent text is data for superimposing (or embedding) the recognized characters on the image data as text information in a form that is not visible. In PDF files, for example, image files in which transparent text has been added to the image data are in common use.

The image file generation unit 151 converts the input image data into the file format specified by the designation signal. It compresses the input image data in the manner prescribed for the specified file format and generates an image file of that format from the compressed image data and the transparent text.

In this embodiment, the input image data, which is an RGB signal, is converted to the normal PDF format or the high-compression PDF format as specified by the file format determination unit 114, and the transparent text generated from the character recognition result is embedded in the image file.

When the high-compression PDF format is specified, the image file generation unit 151 compresses the character portions of the document image in the input image data as an MMR (Modified Modified READ) image at a resolution at which characters are easy to read (for example, 300 dpi), and compresses the remaining portions as a JPEG (Joint Photographic Experts Group) image at a resolution of, for example, 150 dpi. When the normal PDF format is specified, it compresses the entire document image as a JPEG image at, for example, 150 dpi.
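The per-format compression settings can be summarized as a lookup. This is only a sketch: the function name, the format strings, and the dictionary layout are hypothetical, while the codecs and the example resolutions are those given in the description.

```python
def compression_plan(file_format):
    """Return the layers to encode for the given format (illustrative only)."""
    if file_format == "high_compression_pdf":
        return [
            {"layer": "text", "codec": "MMR", "dpi": 300},         # binary, lossless
            {"layer": "background", "codec": "JPEG", "dpi": 150},  # multi-tone, lossy
        ]
    if file_format == "normal_pdf":
        return [{"layer": "page", "codec": "JPEG", "dpi": 150}]
    raise ValueError(f"unsupported file format: {file_format}")
```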

The procedure for generating a high-compression PDF is as follows.
(I) Foreground mask generation
A foreground mask representing the character pixels is extracted from the input image. Pixels determined to belong to character areas in the region separation processing are binarized to extract the character pixels.
(II) Foreground color indexing
The foreground pixel colors are indexed to generate a foreground layer representing the index image and a foreground index color table that stores each character color of the foreground layer, the maximum and minimum coordinates of each character color area, and the number of pixels belonging to each index. The method described in Patent Document 4 can be used for this processing. It is a method for indexing the foreground colors that represents all foreground pixels with a limited number of colors when the foreground layer is generated. Specifically, the foreground image is indexed by updating the foreground index color table pixel by pixel. For each foreground pixel, if its color is judged to be already registered in the foreground index color table, the index value of the closest color in the table is assigned. If its color is judged not to be registered, a new index value is assigned and registered in the table. Repeating this processing indexes the foreground image.
(III) Background layer generation
The foreground pixels are removed from the input image to generate the background layer. To improve the compression ratio of the background layer, the holes are filled using background layer pixels around the foreground pixels that are not themselves foreground pixels. The non-foreground background pixels around each foreground pixel are referenced, and their average value is used to fill the foreground pixel positions in the background layer. When no non-foreground background pixel exists nearby, the result of hole filling for nearby pixels is used.
(IV) Binary image generation
A binary image for each index is output using the input foreground layer and the coordinate information generated in the foreground color indexing.
(V) Compression
Compression suited to each layer is applied. As described above, the foreground layer is compressed with MMR, a lossless compression technique, while the background layer is compressed with JPEG, a lossy compression technique.
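Step (II) can be illustrated with a short sketch. It is not the method of Patent Document 4, whose details are not given here: the rule for judging a color "already registered" (squared RGB distance within a tolerance), the table layout, and the function name are all assumptions made for illustration.

```python
def index_foreground(pixels, tolerance=30):
    """Index foreground pixel colors into a small color table.

    pixels is an iterable of (x, y, (r, g, b)) foreground pixels. Returns
    (index_map, table), where index_map maps (x, y) to an index and each
    table entry records the color, pixel count, and bounding coordinates.
    """
    table = []
    index_map = {}
    for x, y, color in pixels:
        # Find the closest color already in the table.
        best, best_dist = None, None
        for i, entry in enumerate(table):
            dist = sum((a - b) ** 2 for a, b in zip(color, entry["color"]))
            if best_dist is None or dist < best_dist:
                best, best_dist = i, dist
        if best is not None and best_dist <= tolerance ** 2:
            # Treated as registered: reuse the closest index, update its stats.
            entry = table[best]
            entry["count"] += 1
            entry["min_x"] = min(entry["min_x"], x)
            entry["min_y"] = min(entry["min_y"], y)
            entry["max_x"] = max(entry["max_x"], x)
            entry["max_y"] = max(entry["max_y"], y)
        else:
            # Not registered: assign a new index and add it to the table.
            best = len(table)
            table.append({"color": color, "count": 1,
                          "min_x": x, "min_y": y, "max_x": x, "max_y": y})
        index_map[(x, y)] = best
    return index_map, table
```

The per-index bounding coordinates kept in the table are what step (IV) would use to cut out a binary image for each index.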

When the image processing apparatus 100 is installed in a multifunction peripheral, a compression processing unit may be provided downstream of the halftone generation unit shown in FIGS. 12 and 13 (described later), and data from that compression processing unit may be used when generating the high-compression PDF.

Although the PDF format is used as the example here, any file format will do as long as it supports a layer structure in which a high-resolution binary image and a low-resolution background image can be overlaid. For example, many applications, such as Microsoft (trademark) PowerPoint (trademark), allow images to be overlaid, and saving in the native file format of such an application is also conceivable.

As described above, when the formatting processing unit 110 converts the file format of image data obtained by reading a document, the image processing apparatus 100 uses layout analysis and character recognition to detect the situation regarding the amount of characters in the document image, and automatically selects and applies a file format suited to that situation.

The user can thus have the data converted to an appropriate file format without having to check the amount of characters in the document image and switch file formats manually, which makes the apparatus more convenient.

The image processing apparatus 100 can be installed in a copier, a multifunction peripheral, an image reading apparatus, and the like.

FIGS. 12 and 13 are block diagrams of a digital color multifunction peripheral 1 equipped with the image processing apparatus 100. FIG. 12 shows the data flow in the so-called copy mode, in which an image input device 2 such as a scanner reads a document image to generate image data and an image output device 4 forms and outputs an image based on that data.

FIG. 13 shows the data flow in the transmission mode, in which the image input device 2 reads a document image to generate image data, and the image data is converted to another file format and output from the transmission/reception device 5.

As shown in FIGS. 12 and 13, the digital color multifunction peripheral 1 includes the image input device 2, an image processing device 3, the image output device 4, the transmission/reception device 5, and an operation panel 6. The image processing apparatus 100 is incorporated in the image processing device 3.

The image input device 2 reads the image of a document and generates image data. It consists of a scanner unit (not shown) that includes a device for converting optical information into an electrical signal, such as a CCD (Charge Coupled Device). In this embodiment, the image input device 2 outputs the reflected light image from the document to the image processing device 3 as RGB (R: red, G: green, B: blue) analog signals.

As shown in FIGS. 12 and 13, the image processing device 3 includes an A/D conversion unit 11, a shading correction unit 12, an input processing unit 13, the signal preprocessing unit 115, a document correction unit 15, a color correction unit (color conversion unit) 16, a black generation/under color removal unit 17, a spatial filter unit 18, an output tone correction unit 19, a halftone generation unit 20, a region separation unit 21, a recognition processing unit 109, the formatting processing unit 110, a storage unit 23, and a control unit (CPU) 24.

The storage unit 23 is storage means for storing the various data (image data and the like) handled by the image processing device 3. Its configuration is not particularly limited; a hard disk, for example, can be used. The control unit 24 is control means for controlling the operation of each unit of the image processing device 3. It may be part of the main control unit (not shown) of the digital color multifunction peripheral 1, or it may be provided separately and operate in cooperation with the main control unit.

In the copy mode, the image processing device 3 outputs to the image output device 4 the CMYK image data obtained by applying various kinds of image processing to the image data input from the image input device 2.

In the transmission mode, the image processing device 3 applies image processing such as skew correction, document image area extraction, scaling, and rotation to the image data input from the image input device 2. From the processed image data it generates R'G'B' image data (for example, sRGB data) matched to the display characteristics of commonly used display devices. The formatting processing unit 110 then converts the R'G'B' image data to a file format suited to the document and outputs it to the transmission/reception device 5.

The image output device 4 outputs the image data input from the image processing device 3 onto a recording material (for example, paper). Its configuration is not particularly limited; an image output device using an electrophotographic or inkjet method, for example, can be used.

The transmission/reception device 5 consists of, for example, a modem or a network card. It performs data communication with other devices connected to the network (for example, personal computers, server devices, display devices, other digital multifunction peripherals, and facsimile machines) via a network card, a LAN cable, or the like. When transmitting image data, the transmission/reception device 5 carries out the transmission procedure with the destination and, once a state in which transmission is possible has been secured, reads the image data compressed in a predetermined format from memory, applies any necessary processing such as changing the compression format, and transmits the data sequentially to the destination over the communication line. When receiving image data, the transmission/reception device 5 carries out the communication procedure, receives the image data sent from the other party, and inputs it to the image processing device 3. The received image data undergoes predetermined processing in the image processing device 3, such as decompression, rotation, resolution conversion, output tone correction, and tone reproduction, and is output by the image output device 4. The received image data may instead be saved in a storage device (not shown) and read out by the image processing device 3 as needed for the predetermined processing.

The operation panel 6 consists of, for example, a display unit such as a liquid crystal display and setting buttons (neither is shown). It displays information on the display unit in accordance with instructions from the main control unit (not shown) of the digital color multifunction peripheral 1 and passes information entered by the user through the setting buttons to the main control unit. Through the operation panel 6, the user can enter various information for the input image data, such as the processing mode, the number of copies, the paper size, and the destination address.

The main control unit consists of, for example, a CPU (Central Processing Unit) and controls the operation of each part of the digital color multifunction peripheral 1 based on programs and data stored in a ROM or the like (not shown) and on information input from the operation panel 6.

Next, the operation of the image processing device 3 in the copy mode is described in more detail. In the copy mode, as shown in FIG. 12, the A/D conversion unit 11 first converts the RGB analog signals input from the image input device 2 into digital signals and outputs them to the shading correction unit 12.

The shading correction unit 12 removes from the digital RGB signals sent from the A/D conversion unit 11 the various distortions that arise in the illumination, imaging, and image capture systems of the image input device 2, and outputs the result to the input processing unit 13.

入力処理部（入力階調補正部）１３は、シェーディング補正部１２にて各種の歪みが取り除かれたＲＧＢ信号に対して、カラーバランスを整えると同時に、濃度信号など画像処理装置３に採用されている画像処理システムの扱い易い信号に変換する処理を施すものである。また、下地濃度の除去やコントラストなどの画質調整処理を行う。また、入力処理部１３は、上記の各処理を施した画像データを記憶部２３に記憶させる。 The input processing unit (input gradation correction unit) 13 adjusts the color balance of the RGB signals from which the shading correction unit 12 has removed the various distortions and, at the same time, converts them into signals, such as density signals, that are easy for the image processing system employed in the image processing apparatus 3 to handle. It also performs image quality adjustment such as background density removal and contrast adjustment. The input processing unit 13 then stores the processed image data in the storage unit 23.

信号前処理部１１５は、上述したように、画像データを、後段の認識処理部１０９に適した信号となるように処理するものである。但し、コピーモードでは、認識処理部１０９よりフォーマット化処理部１１０へと指定信号が出力されたり、文字認識結果が出力されることはない。コピーモードの場合、信号前処理部１１５は、原稿傾き検知部１２４で検知された傾き角度の情報、あるいは傾き補正部１２５で補正した座標情報を、原稿補正部１５へと出力する。また、認識処理部１０９は、原稿の天地判定結果を原稿補正部１５へと出力する。 As described above, the signal preprocessing unit 115 processes the image data so as to be a signal suitable for the recognition processing unit 109 in the subsequent stage. However, in the copy mode, no designation signal or character recognition result is output from the recognition processing unit 109 to the formatting processing unit 110. In the copy mode, the signal preprocessing unit 115 outputs information on the tilt angle detected by the document tilt detection unit 124 or coordinate information corrected by the tilt correction unit 125 to the document correction unit 15. Further, the recognition processing unit 109 outputs the document top / bottom determination result to the document correction unit 15.

原稿補正部１５は、信号前処理部１１５、認識処理部１０９から出力される、原稿傾きの情報や原稿の天地の情報に従って、入力処理部１３から入力される画像データ、あるいは記憶部２３に保存されている画像データに対して、傾き補正処理、天地補正処理などを行う。 In accordance with the document skew information and the document top/bottom orientation information output from the signal preprocessing unit 115 and the recognition processing unit 109, the document correction unit 15 performs skew correction, top/bottom correction, and the like on the image data input from the input processing unit 13 or on the image data stored in the storage unit 23.
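As an illustration of the top/bottom correction step only (skew correction by an arbitrary angle is not covered), the sketch below rotates a raster by a multiple of 90 degrees. It assumes the orientation determination yields the clockwise rotation needed to bring the page upright; the function name `correct_orientation` and the list-of-rows image representation are hypothetical.

```python
def correct_orientation(rows, rotation_deg):
    """Rotate a raster so that the detected top of the page ends up on top.

    rows: image as a list of equal-length rows (assumption of this sketch).
    rotation_deg: clockwise rotation to apply; must be 0, 90, 180 or 270.
    """
    if rotation_deg % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    turns = (rotation_deg // 90) % 4
    for _ in range(turns):
        # One clockwise quarter turn: reverse the row order, then transpose.
        rows = [list(col) for col in zip(*rows[::-1])]
    return [list(row) for row in rows]


# An upside-down page is fixed with a 180 degree rotation.
upright = correct_orientation([[1, 2], [3, 4]], 180)
```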

また、原稿補正部１５によって上記の各処理がなされた画像データを、記憶部２３においてファイリングデータとして管理するようにしてもよい。 Further, the image data that has been subjected to the above-described processes by the document correction unit 15 may be managed as filing data in the storage unit 23.

画像データが、ファイリングデータとして管理される場合、上記画像データは、例えば、ＪＰＥＧ圧縮アルゴリズムに基づいてＪＰＥＧコードに圧縮されて記憶部２３に格納される。そして、この画像データに対するコピー出力動作やプリント出力動作が指示された場合には、記憶部２３からＪＰＥＧコードが引き出されて不図示のＪＰＥＧ伸張部に引き渡され、復号化処理が施されてＲＧＢデータに変換される。また、上記の画像データに対して送信動作が指示された場合には、記憶部２３からＪＰＥＧコードが引き出され、ネットワーク網や通信回線を介して送受信装置５から外部装置へ送信される。 When image data is managed as filing data, it is, for example, compressed into JPEG code based on the JPEG compression algorithm and stored in the storage unit 23. When a copy output or print output operation is requested for this image data, the JPEG code is retrieved from the storage unit 23, passed to a JPEG decompression unit (not shown), and decoded into RGB data. When a transmission operation is requested for the image data, the JPEG code is retrieved from the storage unit 23 and transmitted from the transmission/reception device 5 to an external device via a network or communication line.

色補正部１６は、色再現の忠実化実現のために、不要吸収成分を含むＣＭＹ（Ｃ：シアン・Ｍ：マゼンタ・Ｙ：イエロー）色材の分光特性に基づいた色濁りを取り除く処理を行うものである。 To achieve faithful color reproduction, the color correction unit 16 removes the color impurity that arises from the spectral characteristics of the CMY (C: cyan, M: magenta, Y: yellow) color materials, which contain unwanted absorption components.

黒生成／下色除去部１７は、色補正後のＣＭＹの３色信号から黒（Ｋ）信号を生成する黒生成、元のＣＭＹ信号から黒生成で得たＫ信号を差し引いて新たなＣＭＹ信号を生成する処理を行うものである。これにより、ＣＭＹの３色信号はＣＭＹＫの４色信号に変換される。 The black generation / under color removal unit 17 performs black generation, which generates a black (K) signal from the color-corrected CMY three-color signals, and generates new CMY signals by subtracting the K signal obtained by black generation from the original CMY signals. The CMY three-color signals are thereby converted into CMYK four-color signals.
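The conversion described above can be sketched as follows. The text states only that K is generated from CMY and then subtracted from the original CMY; the choice of K as a fraction of min(C, M, Y) is an assumption of this sketch (a common skeleton-black method), and the name `black_generation_ucr` and the parameter `ucr_rate` are hypothetical.

```python
def black_generation_ucr(c, m, y, ucr_rate=1.0):
    """Convert one CMY pixel (integers 0..255) into CMYK.

    Black generation: K is taken as ucr_rate * min(C, M, Y), which is an
    assumption of this sketch, not a formula given in the specification.
    Under color removal: K is subtracted from each original CMY component.
    """
    if not 0.0 <= ucr_rate <= 1.0:
        raise ValueError("ucr_rate must lie in [0, 1]")
    k = round(min(c, m, y) * ucr_rate)
    return c - k, m - k, y - k, k


# A dark, slightly reddish pixel: the common gray component moves into K.
cmyk = black_generation_ucr(200, 150, 100)
```

Because K never exceeds the smallest of the three inputs, the new CMY components cannot become negative.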

空間フィルタ部１８は、黒生成／下色除去部１７より入力されるＣＭＹＫ信号の画像データに対して、領域識別信号を基にデジタルフィルタによる空間フィルタ処理（強調処理および／または平滑化処理）を行い、空間周波数特性を補正する。これにより、出力画像のぼやけや粒状性劣化を軽減することができる。 The spatial filter unit 18 applies spatial filter processing (enhancement and/or smoothing) with a digital filter to the CMYK image data input from the black generation / under color removal unit 17, based on the region identification signal, thereby correcting the spatial frequency characteristics. This reduces blurring and graininess degradation in the output image.

中間調生成部２０は、空間フィルタ部１８と同様、ＣＭＹＫ信号の画像データに対して領域識別信号を基に所定の処理を施すものである。 Similar to the spatial filter unit 18, the halftone generation unit 20 performs predetermined processing on the image data of the CMYK signal based on the region identification signal.

例えば、領域分離部２１にて文字に分離された領域は、特に黒文字あるいは色文字の再現性を高めるために、空間フィルタ部１８による空間フィルタ処理における鮮鋭強調処理で高周波数の強調量が大きくされる。同時に、中間調生成部２０においては、高域周波数の再現に適した高解像度のスクリーンでの二値化または多値化処理が選択される。また、領域分離部２１にて網点領域に分離された領域に関しては、空間フィルタ部１８において、入力網点成分を除去するためのローパス・フィルタ処理が施される。そして、出力階調補正部１９では、濃度信号などの信号を画像出力装置４の特性値である網点面積率に変換する出力階調補正処理を行った後、中間調生成部２０で、最終的に画像を画素に分離してそれぞれの階調を再現できるように処理する階調再現処理（中間調生成）が施される。領域分離部２１にて写真に分離された領域に関しては、階調再現性を重視したスクリーンでの二値化または多値化処理が行われる。 For example, for a region classified as text by the region separation unit 21, the amount of high-frequency enhancement applied by the sharpening in the spatial filter processing of the spatial filter unit 18 is increased, in order to improve the reproducibility of black text and color text in particular. At the same time, the halftone generation unit 20 selects binarization or multi-level processing with a high-resolution screen suited to reproducing high frequencies. For a region classified as halftone dots by the region separation unit 21, the spatial filter unit 18 applies low-pass filtering to remove the input halftone-dot component. The output tone correction unit 19 then performs output tone correction, which converts signals such as density signals into halftone dot area ratios, a characteristic value of the image output device 4, after which the halftone generation unit 20 performs tone reproduction processing (halftone generation), which finally separates the image into pixels and processes it so that each tone can be reproduced. For a region classified as photograph by the region separation unit 21, binarization or multi-level processing is performed with a screen that prioritizes tone reproducibility.

領域分離部２１は、ＲＧＢ信号より、入力画像中の各画素を黒文字領域、色文字領域、網点領域、印画紙写真（連続階調領域）領域の何れかに分離するものである。領域分離部２１は、分離結果に基づき、画素がどの領域に属しているかを示す領域分離信号を、黒生成／下色除去部１７、空間フィルタ部１８、および中間調生成部２０へと出力する。 Based on the RGB signals, the region separation unit 21 classifies each pixel of the input image into one of a black text region, a color text region, a halftone dot region, and a photographic-paper photograph (continuous tone) region. Based on the classification result, the region separation unit 21 outputs a region separation signal, which indicates the region to which each pixel belongs, to the black generation / under color removal unit 17, the spatial filter unit 18, and the halftone generation unit 20.
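How the region separation signal steers the later stages can be sketched as a simple lookup. The four region classes and the processing named for each come from the description above; the identifiers (`Region`, `PROCESSING`, `select_processing`) and the string labels are hypothetical, and `None` marks a choice the text does not specify.

```python
from enum import Enum


class Region(Enum):
    BLACK_TEXT = "black_text"
    COLOR_TEXT = "color_text"
    HALFTONE = "halftone_dot"
    PHOTO = "photograph"


# (spatial filter, halftone screen) per region class.
# None means the description does not state which processing is used.
PROCESSING = {
    Region.BLACK_TEXT: ("sharpen", "high_resolution_screen"),
    Region.COLOR_TEXT: ("sharpen", "high_resolution_screen"),
    Region.HALFTONE: ("low_pass", None),
    Region.PHOTO: (None, "tone_priority_screen"),
}


def select_processing(region):
    """Return the (filter, screen) pair selected by the region separation signal."""
    return PROCESSING[region]


spatial_filter, screen = select_processing(Region.HALFTONE)
```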

なお、認識処理部１０９における原稿判定部１１３およびファイル形式決定部１１４、フォーマット化処理部１１０は、コピーモードでは動作を行わない。 The document determination unit 113, the file format determination unit 114, and the formatting processing unit 110 in the recognition processing unit 109 do not operate in the copy mode.

上述した各処理が施された画像データは、一旦、図示しないメモリに記憶されたのち、所定のタイミングで読み出されて画像出力装置４に入力される。 The image data subjected to the above-described processes is temporarily stored in a memory (not shown), read out at a predetermined timing, and input to the image output device 4.

次に、送信モードにおける画像処理装置３の動作について、図１３を参照しながら説明する。なお、送信モードにおけるＡ/Ｄ変換部１１、シェーディング補正部１２、入力処理部１３、原稿補正部１５、および領域分離部２１の処理は、コピーモード時と同様である。なお、領域分離部２１は領域分離信号を空間フィルタ部１８および中間調生成部２０に対して出力する。 Next, the operation of the image processing apparatus 3 in the transmission mode will be described with reference to FIG. Note that the processing of the A / D conversion unit 11, the shading correction unit 12, the input processing unit 13, the document correction unit 15, and the region separation unit 21 in the transmission mode is the same as in the copy mode. The region separation unit 21 outputs the region separation signal to the spatial filter unit 18 and the halftone generation unit 20.

信号前処理部１１５は、画像形成モード時と同様の動作を行うとともに、２値化、低解像度化、傾き補正を行った画像データを認識処理部１０９に出力する。 The signal preprocessing unit 115 performs the same operation as in the image forming mode, and outputs the image data subjected to binarization, resolution reduction, and inclination correction to the recognition processing unit 109.

認識処理部１０９は、信号前処理部１１５より入力された画像データに基づいて、文字認識を行う。また、文字中心原稿であるかどうか、および処理モードがscan to e-mailモード、ファイリング等の処理ではないかどうかを判断して、ファイル形式を指定する指定信号を出力する。文字認識結果は、フォーマット化処理部１１０の透明テキスト作成部１５０に入力され、指定信号は画像ファイル生成部１５１に入力される。 The recognition processing unit 109 performs character recognition based on the image data input from the signal preprocessing unit 115. Further, it determines whether the document is a character-centered original and whether the processing mode is scan-to-e-mail mode, filing processing, or the like, and outputs a designation signal for designating a file format. The character recognition result is input to the transparent text creation unit 150 of the formatting processing unit 110, and the designation signal is input to the image file generation unit 151.
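The selection carried by the designation signal can be sketched as below, following the rule stated in the abstract and claims of this document: a text-centered document gets the format suited to document images (the high-compression PDF format), any other document gets the format suited to photographic images (the normal PDF format), and image filing uses the default format. The function name, the `usage` strings, and the return labels are hypothetical.

```python
def decide_file_format(is_text_centered, usage, default_format):
    """Choose the output file format for one scanned document.

    is_text_centered: result of the document determination.
    usage: intended use of the image data; the value "filing" is a label
           chosen for this sketch.
    default_format: the format configured as the device default.
    """
    if usage == "filing":
        # Filing jobs keep the default format regardless of content.
        return default_format
    if is_text_centered:
        return "high_compression_pdf"
    return "normal_pdf"


# A text-heavy page sent by e-mail.
fmt = decide_file_format(True, "scan_to_email", default_format="tiff")
```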

一方、色補正部１６は、原稿補正部１５から入力されたＲＧＢの画像データを、一般に普及している表示装置の表示特性に適合したＲ’Ｇ’Ｂ’の画像データ（例えば、ｓＲＧＢデータ）に変換し、黒生成／下色除去部１７に出力する。黒生成／下色除去部１７は、色補正部１６から入力された画像データをそのまま空間フィルタ部１８に出力（スルー）する。 Meanwhile, the color correction unit 16 converts the RGB image data input from the document correction unit 15 into R′G′B′ image data (for example, sRGB data) matched to the display characteristics of commonly used display devices, and outputs it to the black generation / under color removal unit 17. The black generation / under color removal unit 17 passes the image data input from the color correction unit 16 through to the spatial filter unit 18 unchanged.

空間フィルタ部１８は、黒生成／下色除去部１７より入力されるＲ’Ｇ’Ｂ’の画像データに対して、領域識別信号を基にデジタルフィルタによる空間フィルタ処理（強調処理および／または平滑化処理）を行って、出力階調補正部１９に出力する。出力階調補正部１９は、空間フィルタ部１８から入力されたＲ’Ｇ’Ｂ’の画像データに対して領域識別信号を基に所定の処理を施し、フォーマット化処理部１１０における画像ファイル生成部１５１に出力する。 The spatial filter unit 18 applies spatial filter processing (enhancement and/or smoothing) with a digital filter to the R′G′B′ image data input from the black generation / under color removal unit 17, based on the region identification signal, and outputs the result to the output tone correction unit 19. The output tone correction unit 19 applies predetermined processing to the R′G′B′ image data input from the spatial filter unit 18, based on the region identification signal, and outputs the result to the image file generation unit 151 of the formatting processing unit 110.

例えば、出力階調補正部１９は、文字領域に対しては図１４（ｂ）に実線で示したガンマ曲線を用いた補正を行い、文字領域以外の領域に対しては図１４（ａ）に示したガンマ曲線を用いた補正を行う。図１４（ｂ）の破線で示したガンマ曲線は、図１４（ａ）に示したガンマ曲線である。図１４（ａ）に示したガンマ曲線としては、例えば送信先の外部装置に備えられる表示装置の表示特性に応じた曲線を設定しておき、文字領域のガンマ曲線は文字をくっきり表示できるように設定しておくことが好ましい。 For example, the output tone correction unit 19 corrects text regions using the gamma curve drawn as a solid line in FIG. 14(b), and corrects regions other than text regions using the gamma curve shown in FIG. 14(a). The gamma curve drawn as a broken line in FIG. 14(b) is the gamma curve of FIG. 14(a). It is preferable to set the gamma curve of FIG. 14(a) to, for example, a curve matched to the display characteristics of the display device of the destination external device, and to set the gamma curve for text regions so that text is displayed crisply.
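The region-dependent tone correction can be sketched with two lookup tables. The actual curves of FIG. 14 are not reproduced in the text, so both curves below are hypothetical stand-ins: a power-law curve for non-text regions and a steeper, contrast-raising curve for text regions. All names (`build_lut`, `display_curve`, `text_curve`, `correct_tone`) are likewise assumptions.

```python
def build_lut(curve, size=256):
    """Sample a curve defined on [0, 1] into an integer lookup table."""
    top = size - 1
    return [min(max(round(curve(i / top) * top), 0), top) for i in range(size)]


def display_curve(x):
    # Hypothetical curve for non-text regions (stand-in for FIG. 14(a)).
    return x ** (1 / 2.2)


def text_curve(x):
    # Hypothetical steeper curve for text regions (stand-in for the solid
    # line of FIG. 14(b)): contrast is stretched around mid-gray and clipped.
    return min(max((x - 0.5) * 1.5 + 0.5, 0.0), 1.0)


def correct_tone(pixels, is_text, text_lut, other_lut):
    """Apply the text LUT to text pixels and the other LUT elsewhere."""
    return [(text_lut if t else other_lut)[p] for p, t in zip(pixels, is_text)]


text_lut = build_lut(text_curve)
other_lut = build_lut(display_curve)
out = correct_tone([0, 255, 10], [True, False, True], text_lut, other_lut)
```

Precomputing the tables keeps the per-pixel work to a single lookup, with the region signal choosing which table to read.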

出力階調補正部１９から出力されたＲ’Ｇ’Ｂ’の画像データは、中間調生成部２０に出力する。中間調生成部２０は、出力階調補正部１９から入力されたＲ’Ｇ’Ｂ’の画像データをそのまま後段のフォーマット化処理部１１０における画像ファイル生成部１５１に出力（スルー）する。 The R′G′B ′ image data output from the output tone correction unit 19 is output to the halftone generation unit 20. The halftone generation unit 20 outputs (through) the R′G′B ′ image data input from the output tone correction unit 19 to the image file generation unit 151 in the subsequent formatting processing unit 110 as it is.

フォーマット化処理部１１０の透明テキスト作成部１５０は、文字認識結果より透明テキストを作成し、画像ファイル生成部１５１へと出力する。画像ファイル生成部１５１は、中間調生成部２０からスルーされてきたＲ’Ｇ’Ｂ’の画像データを、指定信号に従って、通常のＰＤＦ形式あるいは高圧縮ＰＤＦ形式、あるいはデフォルト設定されているファイル形式に変更するとともに、作成された透明テキストを各画像ファイルに埋め込む。 The transparent text creation unit 150 of the formatting processing unit 110 creates transparent text from the character recognition result and outputs it to the image file generation unit 151. In accordance with the designation signal, the image file generation unit 151 converts the R′G′B′ image data passed through from the halftone generation unit 20 into the normal PDF format, the high-compression PDF format, or the file format set as the default, and embeds the created transparent text in each image file.

なお、本実施形態では、画像データに透明テキストを付加して送信するものとしているが、これに限るものではなく、ファイル形式のみ変更して、透明テキストを付加せずに送信するようにしてもよい。なお、透明テキストを付加しない場合には、認識処理部１０９からフォーマット化処理部１１０へのデータ出力を行わないようにしてもよい。 In this embodiment, transparent text is added to the image data before transmission; however, this is not a requirement, and the image data may be transmitted with only the file format changed and no transparent text added. When transparent text is not added, data output from the recognition processing unit 109 to the formatting processing unit 110 may be omitted.

送受信装置５は、フォーマット化処理部１１０から入力された画像ファイルを、ネットワークを介して通信可能に接続された外部装置に送信する。例えば、送受信装置５は、上記の画像ファイルを図示しないメール処理部（ジョブ装置）によって電子メールに添付して送信する。また、scan to e-mailモードが選択された場合は、指定されたメールアドレスへ、上記画像ファイルを電子メールに添付して送信する。また、scan to FTPモード、scan to 共有フォルダモードが選択された場合は、デジタルカラー複合機１外部のファイルサーバ内の指定のアドレス（例えば、ＩＰアドレス）へ、上記画像ファイルを送信する。なお、デジタルカラー複合機１内部に設けた図示しないハードディスクを、ファイルサーバの代わりに使用することもできる。 The transmission / reception device 5 transmits the image file input from the formatting processing unit 110 to an external device that is communicably connected via a network. For example, the transmission / reception device 5 transmits the image file attached to an electronic mail by a mail processing unit (job device) (not shown). If the scan to e-mail mode is selected, the image file is attached to the e-mail and sent to the designated e-mail address. When the scan to FTP mode or the scan to shared folder mode is selected, the image file is transmitted to a specified address (for example, an IP address) in the file server outside the digital color multifunction peripheral 1. Note that a hard disk (not shown) provided in the digital color multifunction peripheral 1 can be used instead of the file server.

なお、図１２、図１３のデジタルカラー複合機１では、認識処理部１０９を領域分離部の前段に設けている。しかしながら、図１５に示すように、認識処理部１０９を領域分離部２１の後段に設けて、領域分離信号より作成されたテキストマップ（文字エッジと判定された画素よりなる画像領域）を参照して、レイアウト解析、文字領域に対する文字認識を行うようにしても良い。 In the digital color multifunction peripheral 1 shown in FIGS. 12 and 13, the recognition processing unit 109 is provided in front of the area separation unit. However, as shown in FIG. 15, the recognition processing unit 109 is provided in the subsequent stage of the region separation unit 21, and a text map (an image region made up of pixels determined to be character edges) created from the region separation signal is referred to. Further, layout analysis and character recognition for the character area may be performed.

あるいは、図１６に示すように、認識処理部１０９の前段に、原稿種別自動判別３０を設け、原稿種別信号を認識処理部１０９の文字認識部１１１に入力させ、文字認識部１１１は、天地方向が判定された原稿画像全体に対して文字認識を行う場合、原稿種別信号に基づいて、文字原稿、文字印刷写真原稿、文字印画紙写真原稿と判別されたときに、文字認識を行うようにしても良い。 Alternatively, as shown in FIG. 16, an automatic document type discrimination unit 30 may be provided upstream of the recognition processing unit 109 so that a document type signal is input to the character recognition unit 111 of the recognition processing unit 109. In that case, when performing character recognition on the entire document image whose top/bottom orientation has been determined, the character recognition unit 111 may perform character recognition when the document type signal indicates a text document, a text and printed-photograph document, or a text and photographic-paper-photograph document.

なお、本実施形態では、本画像処理装置１００をカラー画像データに対応した構成とし、デジタルカラー複合機１に適用する場合について説明したが、これに限らず、モノクロの複合機に適用してもよい。 In this embodiment, the image processing apparatus 100 is configured to handle color image data and is applied to the digital color multifunction peripheral 1; however, this is not a limitation, and it may also be applied to a monochrome multifunction peripheral.

また、本画像処理装置１００を、例えば単体の画像読取装置に適用してもよい。図１７に、本画像処理装置１００を搭載したデジタルカラースキャナ３００の構成を示す。 The image processing apparatus 100 may be applied to, for example, a single image reading apparatus. FIG. 17 shows the configuration of a digital color scanner 300 equipped with the image processing apparatus 100.

図１７に示すように、デジタルカラースキャナ３００は、画像入力装置２、画像処理装置３’、送信装置５’、および操作パネル６を備えている。画像入力装置２、および操作パネル６の構成および機能は上述したデジタルカラー複合機１の場合と略同様なので、ここではその説明を省略する。送信装置５’は、前記した送受信装置５の送信機能のみを備えた構成である。本画像処理装置１００は画像処理装置３’に搭載されている。 As shown in FIG. 17, the digital color scanner 300 includes an image input device 2, an image processing device 3 ′, a transmission device 5 ′, and an operation panel 6. Since the configurations and functions of the image input device 2 and the operation panel 6 are substantially the same as those of the digital color multifunction peripheral 1 described above, the description thereof is omitted here. The transmission device 5 ′ is configured to have only the transmission function of the transmission / reception device 5 described above. The image processing apparatus 100 is mounted on the image processing apparatus 3 '.

画像処理装置３’は、Ａ／Ｄ変換部１１、シェーディング補正部１２、入力処理部１３、信号前処理部１１５、原稿補正部１５、色補正部１６、認識処理部１０９、フォーマット化処理部１１０、記憶部２３、および制御部２４を備えている。 The image processing apparatus 3 ′ includes an A / D conversion unit 11, a shading correction unit 12, an input processing unit 13, a signal preprocessing unit 115, a document correction unit 15, a color correction unit 16, a recognition processing unit 109, and a formatting processing unit 110. A storage unit 23 and a control unit 24.

なお、コピーモードを備えていない点、および、色補正部１６が色補正処理後の画像データを、フォーマット化処理部１１０の画像ファイル生成部１５１に出力し、画像ファイル生成部１５１が色補正部１６から入力された画像データに基づいて外部装置に送信する画像ファイルを生成する点以外の画像処理装置３’の機能は、上述したデジタルカラー複合機１の画像処理装置３の場合と略同様である。 The functions of the image processing apparatus 3′ are substantially the same as those of the image processing apparatus 3 of the digital color multifunction peripheral 1 described above, except that it has no copy mode, and that the color correction unit 16 outputs the color-corrected image data to the image file generation unit 151 of the formatting processing unit 110, which generates the image file to be transmitted to the external device from the image data input from the color correction unit 16.

画像処理装置３’において上述した各処理が施されて生成された画像ファイルは、送信装置５’により、ネットワークを介して通信可能に接続されたコンピュータやサーバなどに送信される。 The image file generated by performing the above-described processes in the image processing apparatus 3 ′ is transmitted by the transmission apparatus 5 ′ to a computer or server that is communicably connected via the network.

また、上記実施形態において、画像処理装置１００、デジタルカラー複合機１、デジタルカラースキャナ３００に備えられる各部（各ブロック）を、ＣＰＵ等のプロセッサを用いてソフトウェアによって実現してもよい。この場合、画像処理装置１００、デジタルカラー複合機１、デジタルカラースキャナ３００は、各機能を実現する制御プログラムの命令を実行するＣＰＵ（central processing unit）、上記プログラムを格納したＲＯＭ（read only memory）、上記プログラムを展開するＲＡＭ（random access memory）、上記プログラムおよび各種データを格納するメモリ等の記憶装置（記録媒体）などを備えている。 In the above embodiment, each unit (each block) provided in the image processing apparatus 100, the digital color multifunction peripheral 1, and the digital color scanner 300 may be realized by software using a processor such as a CPU. In this case, the image processing apparatus 100, the digital color multifunction peripheral 1, and the digital color scanner 300 include a CPU (central processing unit) that executes instructions of a control program that realizes each function, and a ROM (read only memory) that stores the program. A RAM (random access memory) for expanding the program, and a storage device (recording medium) such as a memory for storing the program and various data.

そして、本発明の目的は、上述した機能を実現するソフトウェアである画像処理装置１００、デジタルカラー複合機１、デジタルカラースキャナ３００の制御プログラムのプログラムコード（実行形式プログラム、中間コードプログラム、ソースプログラム）をコンピュータで読み取り可能に記録した記録媒体を、画像処理装置１００、デジタルカラー複合機１、デジタルカラースキャナ３００に供給し、そのコンピュータ（またはＣＰＵやＭＰＵ）が記録媒体に記録されているプログラムコードを読み出し実行することによって達成される。 The object of the present invention is also achieved by supplying the image processing apparatus 100, the digital color multifunction peripheral 1, or the digital color scanner 300 with a recording medium on which the program code (executable program, intermediate code program, or source program) of their control program, which is software that implements the functions described above, is recorded in computer-readable form, and by having the computer (or CPU or MPU) read and execute the program code recorded on the recording medium.

上記記録媒体としては、例えば、磁気テープやカセットテープ等のテープ系、フロッピー（登録商標）ディスク／ハードディスク等の磁気ディスクやＣＤ−ＲＯＭ／ＭＯ／ＭＤ／ＤＶＤ／ＣＤ−Ｒ等の光ディスクを含むディスク系、ＩＣカード（メモリカードを含む）／光カード等のカード系、あるいはマスクＲＯＭ／ＥＰＲＯＭ／ＥＥＰＲＯＭ／フラッシュＲＯＭ等の半導体メモリ系などを用いることができる。 Examples of the recording medium include tape media such as magnetic tape and cassette tape; disk media, including magnetic disks such as floppy (registered trademark) disks and hard disks and optical disks such as CD-ROM, MO, MD, DVD, and CD-R; card media such as IC cards (including memory cards) and optical cards; and semiconductor memory such as mask ROM, EPROM, EEPROM, and flash ROM.

また、画像処理装置１００、デジタルカラー複合機１、デジタルカラースキャナ３００を通信ネットワークと接続可能に構成し、通信ネットワークを介して上記プログラムコードを供給してもよい。この通信ネットワークとしては、特に限定されず、例えば、インターネット、イントラネット、エキストラネット、ＬＡＮ、ＩＳＤＮ、ＶＡＮ、ＣＡＴＶ通信網、仮想専用網（virtual private network）、電話回線網、移動体通信網、衛星通信網等が利用可能である。また、通信ネットワークを構成する伝送媒体としては、特に限定されず、例えば、ＩＥＥＥ１３９４、ＵＳＢ、電力線搬送、ケーブルＴＶ回線、電話線、ＡＤＳＬ回線等の有線でも、ＩｒＤＡやリモコンのような赤外線、Ｂｌｕｅｔｏｏｔｈ（登録商標）、８０２．１１無線、ＨＤＲ、携帯電話網、衛星回線、地上波デジタル網等の無線でも利用可能である。なお、本発明は、上記プログラムコードが電子的な伝送で具現化された、搬送波に埋め込まれたコンピュータデータ信号の形態でも実現され得る。 Alternatively, the image processing apparatus 100, the digital color multifunction peripheral 1, and the digital color scanner 300 may be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited; for example, the Internet, an intranet, an extranet, a LAN, ISDN, a VAN, a CATV communication network, a virtual private network, a telephone network, a mobile communication network, or a satellite communication network can be used. The transmission medium constituting the communication network is also not particularly limited; it may be wired, such as IEEE 1394, USB, power line carrier, cable TV line, telephone line, or ADSL line, or wireless, such as infrared (IrDA or a remote control), Bluetooth (registered trademark), 802.11 wireless, HDR, a mobile phone network, a satellite link, or a terrestrial digital network. The present invention can also be realized in the form of a computer data signal embedded in a carrier wave, in which the program code is embodied by electronic transmission.

また、画像処理装置１００、デジタルカラー複合機１、デジタルカラースキャナ３００の各ブロックは、ソフトウェアを用いて実現されるものに限らず、ハードウェアロジックによって構成されるものであってもよく、処理の一部を行うハードウェアと当該ハードウェアの制御や残余の処理を行うソフトウェアを実行する演算手段とを組み合わせたものであってもよい。 The blocks of the image processing apparatus 100, the digital color multifunction peripheral 1, and the digital color scanner 300 are not limited to those realized using software, and may be configured by hardware logic. A combination of hardware that performs a part and arithmetic means that executes software for controlling the hardware and performing the remaining processing may be used.

本発明は上述した実施形態に限定されるものではなく、請求項に示した範囲で種々の変更が可能である。すなわち、請求項に示した範囲で適宜変更した技術的手段を組み合わせて得られる実施形態についても本発明の技術的範囲に含まれる。 The present invention is not limited to the above-described embodiments, and various modifications can be made within the scope shown in the claims. That is, embodiments obtained by combining technical means appropriately modified within the scope of the claims are also included in the technical scope of the present invention.

本発明は、原稿から読み取った画像データのファイル形式を変更して他の装置に送信する画像処理装置、画像読取装置、および画像形成装置に適用できる。 The present invention can be applied to an image processing apparatus, an image reading apparatus, and an image forming apparatus that change the file format of image data read from a document and transmit it to another apparatus.

１デジタルカラー複合機（画像処理装置、画像形成装置）
２画像入力装置
３画像処理装置
３’ 画像処理装置
４画像出力装置
５送受信装置
５’ 送信装置
６操作パネル
１５原稿補正部
１６色補正部
２１領域分離部
２３記憶部
２４制御部
１００画像処理装置
１１０フォーマット化処理部（フォーマット化処理手段）
１１１文字認識部（文字認識手段）
１１２レイアウト解析部（レイアウト解析手段）
１１３原稿判定部（原稿判定手段）
１１３Ａ原稿判定部（原稿判定手段）
１１３Ｂ原稿判定部（原稿判定手段）
１１４ファイル形式決定部（ファイル形式決定手段）
１１５信号前処理部
１３０文字計数部（文字計数手段）
１３１文字領域割合算出部（文字領域割合算出手段）
１３２第１判定処理部（第１判定処理手段）
１４０タイトル領域判定部（タイトル領域判定手段）
１４１第２判定処理部（第２判定処理手段）
３００デジタルカラースキャナ（画像読取装置）
1 Digital color multifunction peripheral (image processing apparatus, image forming apparatus)
2 Image input device
3 Image processing apparatus
3' Image processing apparatus
4 Image output device
5 Transmission/reception device
5' Transmission device
6 Operation panel
15 Document correction unit
16 Color correction unit
21 Region separation unit
23 Storage unit
24 Control unit
100 Image processing apparatus
110 Formatting processing unit (formatting processing means)
111 Character recognition unit (character recognition means)
112 Layout analysis unit (layout analysis means)
113 Document determination unit (document determination means)
113A Document determination unit (document determination means)
113B Document determination unit (document determination means)
114 File format determination unit (file format determination means)
115 Signal preprocessing unit
130 Character counting unit (character counting means)
131 Character area ratio calculation unit (character area ratio calculation means)
132 First determination processing unit (first determination processing means)
140 Title area determination unit (title area determination means)
141 Second determination processing unit (second determination processing means)
300 Digital color scanner (image reading apparatus)

Claims

原稿画像を画像入力装置にて読み取ることで得られた入力画像データを指定されたファイル形式に変更するフォーマット化処理手段を備えた画像処理装置であって、
上記入力画像データに対して文字認識処理を行う文字認識手段と、
上記入力画像データに対して、ページ毎の画像のレイアウトを解析するレイアウト解析手段と、
上記レイアウト解析手段の解析結果および上記文字認識手段の認識結果より、上記入力画像データが示す原稿画像が文字を多く含んでいる文字中心画像であるか否かを判定する原稿判定手段と、
上記原稿判定手段の判定結果に基づいてファイル形式を決定し、上記フォーマット化処理手段に対して決定したファイル形式を指定するファイル形式決定手段とを有し、
上記ファイル形式決定手段は、文字中心画像であると判定された場合に、文書画像に適したファイル形式を選択し、文字中心画像ではないと判定された場合に、写真画像に適したファイル形式を選択することを特徴とする画像処理装置。 An image processing apparatus comprising formatting processing means for changing input image data obtained by reading an original image with an image input apparatus into a designated file format,
Character recognition means for performing character recognition processing on the input image data;
Layout analysis means for analyzing the layout of the image for each page with respect to the input image data;
A document determination unit that determines whether or not the document image indicated by the input image data is a character-centered image including many characters, based on the analysis result of the layout analysis unit and the recognition result of the character recognition unit;
File format determining means for determining a file format based on the determination result of the document determining means and designating the determined file format for the formatting processing means;
The file format determining means selects a file format suitable for a document image when it is determined that the image is a character-centered image, and selects a file format suitable for a photographic image when it is determined that the image is not a character-centered image. An image processing apparatus characterized by selecting.

上記原稿判定手段は、
上記レイアウト解析手段の解析結果および上記文字認識手段の認識結果より原稿画像における文字領域の占める割合を算出する文字領域割合算出手段と、
上記文字認識手段において認識された文字のページ毎の総数を数える文字計数手段と、
上記文字計数手段にて計数された文字数および上記文字領域割合算出手段にて算出された文字領域の割合に基づいて、少なくとも文字数が多いあるいは文字領域の割合が高い場合に文字中心画像と判定し、文字数が少なくかつ文字領域の割合も低い場合に文字中心画像ではないと判定する第１判定処理手段とを含むことを特徴とする請求項１に記載の画像処理装置。 The document determination means includes
A character area ratio calculating means for calculating a ratio of the character area in the document image from the analysis result of the layout analysis means and the recognition result of the character recognition means;
Character counting means for counting the total number of characters recognized by the character recognition means per page;
The image processing apparatus according to claim 1, wherein the document determination means further includes first determination processing means that, based on the number of characters counted by the character counting means and the character area ratio calculated by the character area ratio calculating means, determines that the image is a character-centered image at least when the number of characters is large or the character area ratio is high, and determines that it is not a character-centered image when the number of characters is small and the character area ratio is also low.

上記原稿判定手段は、
上記レイアウト解析手段の解析結果および上記文字認識手段の認識結果よりタイトル領域の有無を判定するタイトル領域判定手段と、
タイトル領域が有る場合に文字中心画像と判定し、タイトル領域が無い場合に文字中心画像ではないと判定する第２判定処理手段とを含むことを特徴とする請求項１に記載の画像処理装置。 The document determination means includes
Title area determination means for determining the presence or absence of a title area from the analysis result of the layout analysis means and the recognition result of the character recognition means;
The image processing apparatus according to claim 1, wherein the document determination means further includes second determination processing means that determines that the image is a character-centered image when a title area is present, and determines that it is not a character-centered image when no title area is present.

上記タイトル領域判定手段は、
上記レイアウト解析手段の解析結果より、一般的にタイトルが位置すると予想される位置に、原稿画像の他の領域よりも文字の外接矩形のサイズが大きく、かつ該外接矩形が連続してなる帯状ブロックが規定以上の長さを有する場合に、タイトル領域と仮判定する仮判定手段と、
上記仮判定手段にてタイトル領域と仮判定された部分に対する文字認識結果が規定以上の認識率である場合に、タイトル領域と本判定する本判定手段とを含むことを特徴とする請求項３に記載の画像処理装置。 The image processing apparatus according to claim 3, wherein the title area determination means includes:
provisional determination means for provisionally determining that a title area is present when, based on the analysis result of the layout analysis means, a band-shaped block that is made up of consecutive character bounding rectangles larger than those in other areas of the document image, and that has at least a prescribed length, is found at a position where a title is generally expected to be located; and
final determination means for finally determining that a title area is present when the character recognition result for the portion provisionally determined to be a title area by the provisional determination means has a recognition rate at or above a prescribed level.

上記文書画像に適したファイル形式が、原稿画像における文字部分を２値画像として格納し、原稿画像におけるその他の部分を多階調画像として格納する高圧縮ＰＤＦ形式であり、
上記写真画像に適した出力ファイル形式が、原稿画像全体を多階調画像として格納する通常ＰＤＦ形式であることを特徴とする請求項１〜４の何れか１項に記載の画像処理装置。 The file format suitable for the document image is a high-compression PDF format that stores character portions in a document image as a binary image and stores other portions in the document image as a multi-tone image.
The image processing apparatus according to any one of claims 1 to 4, wherein the file format suitable for the photographic image is a normal PDF format in which the entire document image is stored as a multi-tone image.

上記ファイル形式決定手段には、上記入力画像データの利用用途を示す選択信号が入力されるようになっており、
上記ファイル形式決定手段は、上記選択信号より利用用途が画像ファイリングであることを判別すると、上記フォーマット化処理手段に対してデフォルト設定されているファイル形式に変更するよう指定することを特徴とする請求項１〜５の何れか１項に記載の画像処理装置。 The image processing apparatus according to any one of claims 1 to 5, wherein a selection signal indicating the intended use of the input image data is input to the file format determining means, and
the file format determining means, upon determining from the selection signal that the intended use is image filing, instructs the formatting processing means to convert to the file format set as the default.
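The override described in this claim can be sketched as a small decision function. The enum names `FileFormat` and `Usage`, the function `decide_format`, and the way the default format is passed in are hypothetical; the claim states only that the default-set file format is designated when the selection signal indicates image filing.

```python
from enum import Enum


class FileFormat(Enum):
    HIGH_COMPRESSION_PDF = "high_compression_pdf"
    NORMAL_PDF = "normal_pdf"


class Usage(Enum):
    """Intended use carried by the selection signal (hypothetical values)."""
    IMAGE_FILING = "image_filing"
    OTHER = "other"


def decide_format(usage: Usage, is_character_centered: bool,
                  default_format: FileFormat) -> FileFormat:
    """Pick the file format to designate to the formatting stage."""
    # Image filing bypasses the automatic choice and uses the default.
    if usage is Usage.IMAGE_FILING:
        return default_format
    # Otherwise follow the document determination result.
    if is_character_centered:
        return FileFormat.HIGH_COMPRESSION_PDF
    return FileFormat.NORMAL_PDF
```

The default format is a required argument here because the claim does not say which format is set as the default.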

請求項１〜６の何れか１項に記載の画像処理装置を備えることを特徴とする画像形成装置。 An image forming apparatus comprising the image processing apparatus according to any one of claims 1 to 6.

請求項１〜６の何れか１項に記載の画像処理装置を備えることを特徴とする画像読取装置。 An image reading apparatus comprising the image processing apparatus according to any one of claims 1 to 6.

原稿画像を画像入力装置にて読み取ることで得られた入力画像データを指定されたファイル形式に変更するフォーマット化処理工程を含む画像処理方法であって、
上記入力画像データに対して文字認識処理を行う文字認識工程と、
上記入力画像データに対して、ページ毎の画像のレイアウトを解析するレイアウト解析工程と、
上記レイアウト解析工程の解析結果および上記文字認識工程の認識結果より、上記入力画像データが示す原稿画像が文字画像を多く含んでいる文字中心画像であるか否かを判定する原稿判定工程と、
上記原稿判定工程の判定結果に基づいてファイル形式を決定し、上記フォーマット化処理工程に対して決定したファイル形式を指定するファイル形式決定工程とを有し、
上記ファイル形式決定工程においては、上記原稿判定工程にて文字中心画像であると判定された場合に、文書画像に適したファイル形式を選択し、上記原稿判定工程にて文字中心画像ではないと判定された場合に、写真画像に適したファイル形式を選択することを特徴とする画像処理方法。 An image processing method including a formatting process step of converting input image data, obtained by reading a document image with an image input device, into a designated file format, the method comprising:
a character recognition step of performing character recognition processing on the input image data;
a layout analysis step of analyzing the layout of the image of each page of the input image data;
a document determination step of determining, from the analysis result of the layout analysis step and the recognition result of the character recognition step, whether or not the document image indicated by the input image data is a character-centered image containing many character images; and
a file format determination step of determining a file format based on the determination result of the document determination step and designating the determined file format to the formatting process step,
wherein, in the file format determination step, a file format suitable for a document image is selected when the document determination step determines that the document image is a character-centered image, and a file format suitable for a photographic image is selected when the document determination step determines that the document image is not a character-centered image.
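The flow of the method claim, from the two analysis results to the selected file format, can be sketched as below. The claim does not state how the analysis and recognition results are combined, so the summary fields `text_area_ratio` and `recognition_rate`, the thresholds, and the returned labels `"document"` and `"photo"` are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class LayoutResult:
    # Hypothetical summary of the layout analysis step:
    # fraction of the page area occupied by character regions (0.0 to 1.0).
    text_area_ratio: float


@dataclass
class OcrResult:
    # Hypothetical summary of the character recognition step:
    # fraction of characters recognised with confidence (0.0 to 1.0).
    recognition_rate: float


def is_character_centered(layout: LayoutResult, ocr: OcrResult, *,
                          min_text_ratio: float = 0.5,
                          min_rate: float = 0.8) -> bool:
    """Document determination step (criteria and thresholds are assumed)."""
    return (layout.text_area_ratio >= min_text_ratio
            and ocr.recognition_rate >= min_rate)


def determine_file_format(layout: LayoutResult, ocr: OcrResult) -> str:
    """File format determination step.

    Returns "document" for a format suitable for document images and
    "photo" for a format suitable for photographic images.
    """
    return "document" if is_character_centered(layout, ocr) else "photo"
```

For example, `determine_file_format(LayoutResult(0.7), OcrResult(0.9))` selects the document-oriented format under these assumed thresholds, and the result would then be passed to the formatting process step as the designated file format.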

請求項１〜６の何れか１項に記載の画像処理装置を動作させるための画像処理プログラムであって、コンピュータを上記の各手段として機能させるための画像処理プログラム。 An image processing program for operating the image processing apparatus according to any one of claims 1 to 6, wherein the image processing program causes a computer to function as each of the above means.

請求項１０に記載された画像処理プログラムを記憶したことを特徴とした記録媒体。 A recording medium storing the image processing program according to claim 10.