JP2021082264A

JP2021082264A - Image processing system, image processing method, and program

Info

Publication number: JP2021082264A
Application number: JP2020163290A
Authority: JP
Inventors: Taeko Yamazaki; Tomotoshi Kanatsu
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-11-13
Filing date: 2020-09-29
Publication date: 2021-05-27

Abstract

To provide an image processing system capable of swiftly identifying appropriate information by executing character recognition only on the required parts of a document image. SOLUTION: Character areas in an image are detected, and a priority for character recognition processing is set for each character area. A time factor value related to the processing time of character recognition processing is set for each character area based on information on the size of the pixel blocks in that character area, and the character areas on which character recognition processing is executed and their execution order are determined based on the set priorities and time factor values. SELECTED DRAWING: Figure 2

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

Some systems use the result of recognizing character information in a document image to index the image. One indexing method is, for example, to use the title of a document image as the file name of that image. Other systems extract information by identifying the areas of a document image in which a name, an amount of money, or the like is written and performing character recognition on them. In these systems, the position of the character information to be extracted from the document image differs depending on the user's request, and it is not always necessary to perform character recognition on every character area of the input document image.

In consideration of responsiveness to the user and the resources of the processing device, there is a demand to perform character recognition on only the necessary parts of a document image within a predetermined time. Patent Document 1 proposes a technique in which a time limit for character recognition is set for each text block in a document image according to its area, and the execution order is controlled based on the number of characters and the character size obtained by preliminary character recognition. Patent Document 1 also describes that, when the character recognition processing of a text block reaches the previously set time limit, processing moves on to the character recognition processing of the next text block. Patent Document 2 proposes a technique in which attributes such as character size and character color are analyzed for each area of a document image, character recognition is executed starting from areas of large characters in a predetermined color, and the character recognition processing ends when a predetermined number of characters is reached.

Japanese Unexamined Patent Publication No. 2013-161268
Japanese Unexamined Patent Publication No. 2012-178007

In Patent Document 1 and Patent Document 2, the execution order and processing time of character recognition are set based on the position of an area in the document image, the character size, and the character color. However, recognizing a low-quality character image, specifically a character that is faint or in contact with surrounding pixels, requires more processing than recognizing an ideal character image. That is, in character recognition processing there are factors that increase the processing time and that cannot be determined from the coordinate position, size, or color of the character area. Character recognition must be executed on the necessary areas while preventing such unexpected increases in processing time. Further, in Patent Document 1, the result of character recognition may fluctuate because the time actually required for character recognition processing changes with the system load.
An object of the present invention is to perform character recognition only on the necessary parts of a document image so that appropriate information can be identified at high speed.

The image processing apparatus according to the present invention comprises: a detection means for detecting character areas in an image; a priority setting means for setting, for each character area, a priority related to character recognition processing; a time factor value setting means for setting, for each character area, a time factor value related to the processing time of character recognition processing based on information on the size of the pixel blocks in that character area; and an execution order setting means for determining, based on the set priority and time factor value, the character areas on which character recognition processing is executed and the execution order.

According to the present invention, it is possible to perform character recognition only on the necessary parts of a document image and to identify appropriate information at high speed.

A diagram showing a configuration example of the image processing system according to the first embodiment.
A flowchart showing an example of character recognition processing according to the first embodiment.
A diagram showing an example of a scanned document image.
A diagram showing an example of a character area detection result.
A flowchart showing an example of the recognition priority setting process according to the first embodiment.
A flowchart showing an example of the time factor value setting process according to the first embodiment.
A diagram showing an example of an intermediate result of the character recognition execution order according to the first embodiment.
A diagram showing an example of the character recognition execution order.
A flowchart showing an example of character recognition processing and time factor value setting processing according to the second embodiment.
A diagram showing an example of a character area according to the second embodiment.
A diagram showing another example of a character area according to the second embodiment.
A flowchart showing an example of character recognition processing and direction determination processing according to the third embodiment.
A diagram explaining the execution target range of character recognition according to the third embodiment.
A diagram showing an example of an execution image for direction determination according to the third embodiment.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

(First Embodiment)
The first embodiment of the present invention will be described. FIG. 1 is a block diagram showing a configuration example of an image processing system according to the first embodiment. The image processing system in this embodiment includes a reading device 100 and an image processing device 110.

The reading device 100 has a scanner unit 101 and a communication unit 102. The scanner unit 101 reads a document and generates a scanned document image. The communication unit 102 communicates with external devices via a network. For example, the communication unit 102 transmits the document image generated by the scanner unit 101 to the image processing device 110.

The image processing device 110 includes a system control unit 111, a ROM 112, a RAM 113, a hard disk drive (HDD) 114, a display unit 115, an input unit 116, and a communication unit 117. The system control unit 111 reads the control program stored in the ROM (Read Only Memory) 112 and executes various processes. The RAM (Random Access Memory) 113 is used as temporary storage, such as the main memory and work area of the system control unit 111. The HDD 114 stores various data, various programs, and the like. The functions and processing of the image processing device 110 described later are realized by the system control unit 111 reading a program stored in the ROM 112 or the HDD 114 and executing it. By reading and executing a program from the ROM 112 or the like, the system control unit 111 realizes functions such as the detection means, the priority setting means, the time factor value setting means, and the execution order setting means.

The display unit 115 displays various information. The input unit 116 has a keyboard and a mouse and accepts various operations by the user. The communication unit 117 performs communication processing with external devices via the network. For example, the communication unit 117 receives a document image from the reading device 100. The display unit 115 and the input unit 116 may be provided integrally, as in a touch panel. Further, the display unit 115 may project images with a projector, and the input unit 116 may recognize the position of a fingertip relative to the projected image with a camera or the like.

In the present embodiment, the scanner unit 101 of the reading device 100 reads a paper document such as a form and generates a scanned document image. The scanned document image is transmitted to the image processing device 110 by the communication unit 102 of the reading device 100. In the image processing device 110, the communication unit 117 receives the scanned document image from the reading device 100, and the received document image is stored in a storage device such as the HDD 114. Some functions of the display unit 115 and the input unit 116 may be provided in the reading device 100.

FIG. 2 is a flowchart showing an example of character recognition processing in the image processing device 110 according to the present embodiment. The processing of the flowchart shown in FIG. 2 is realized in the image processing device 110 by the system control unit 111 executing a program stored in the ROM 112.

In S200, the reading device 100 scans a paper document with the scanner unit 101 according to an instruction from the user, and transmits the scanned document image to the image processing device 110 through the communication unit 102. The transmitted document image is received by the communication unit 117 of the image processing device 110 and stored by the system control unit 111 in a storage unit such as the HDD 114. FIG. 3A shows an example of an image (scanned document image) obtained by the processing in S200.

In S201, the system control unit 111 performs character area detection processing on the scanned document image and stores the detection result in the RAM 113. The character area detection process detects, in the scanned document image, areas (character areas) containing characters to be subjected to character recognition. The method for detecting character areas is not particularly limited, and a known technique can be used. One example of a character area detection technique is the processing described in U.S. Pat. No. 5,680,478. In that example, sets of pixel blocks and white pixel blocks in a document image are extracted, and characteristic areas such as characters, pictures, figures, tables, frames, and lines are extracted based on their shape, size, grouping state, and the like. FIG. 3B is a diagram showing an example of the character area detection result obtained by the processing in S201. In the example shown in FIG. 3B, character areas 301 to 325 are detected in the scanned document image 300.

In S202, the system control unit 111 performs the recognition priority setting process on each of the character areas included in the scanned document image and stores the result in the RAM 113. Here, the recognition priority is a parameter value that ranks how suitable a character area is for the purpose of character recognition. In the example described below, 1 is the highest recognition priority and 5 is the lowest. A character area containing many pixel blocks presumed to be noise or small-point characters has a low priority for character recognition. On the other hand, in the case of a slip image such as the one shown in FIGS. 3A and 3B, character areas composed of pixel blocks of normal size or larger, such as amounts of money and dates, should be given priority for character recognition processing. Therefore, in the present embodiment, the recognition priority, which indicates the priority for performing character recognition processing, is determined based on the distribution of the sizes of the pixel blocks in each character area obtained by the character area detection process in S201.

FIG. 4 is a flowchart showing an example of the recognition priority setting process in the present embodiment, which is executed in S202 shown in FIG. 2.
In S400, the system control unit 111 refers to the RAM 113 and acquires the character area to be processed.

Next, in S401, the system control unit 111 classifies the pixel blocks extracted from the character area acquired in S400 by size, counts them, and stores the result in the RAM 113. Here, pixel blocks are classified by height into three categories: noise size, small point size, and main character size. For the classification thresholds, in the case of extracting an amount or a date from a slip image, for example, the expected character point sizes are converted into pixel counts, and the pixel blocks are classified according to the following conditional expression (1).
Noise size: Hi < Tq_1
Small point size: Tq_1 ≤ Hi < Tq_2 ... (1)
Main character size: Tq_2 ≤ Hi
In conditional expression (1), Hi is the height of the pixel block, Tq_1 is the number of pixels corresponding to a character size of 3 pt (points), and Tq_2 is the number of pixels corresponding to a character size of 6 pt.

Then, the system control unit 111 stores the number of pixel blocks in the character area for each category of conditional expression (1) in the following variables, and stores the result in the RAM 113. The number of noise-size pixel blocks is stored in the variable Cnt_noise, the number of small-point-size pixel blocks in the variable Cnt_small, and the number of main-character-size pixel blocks in the variable Cnt_normal.
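The classification and counting in S401 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the scan resolution of 300 dpi and the function names `pt_to_px`, `classify_height`, and `count_by_size` are assumptions. The text specifies only that Tq_1 and Tq_2 are the pixel counts corresponding to 3 pt and 6 pt.

```python
from collections import Counter


def pt_to_px(points, dpi=300):
    """Convert a size in points (1 pt = 1/72 inch) to pixels.

    The default of 300 dpi is an assumed scan resolution.
    """
    return points * dpi / 72.0


def classify_height(height_px, tq_1, tq_2):
    """Classify one pixel block by height per conditional expression (1)."""
    if height_px < tq_1:
        return "noise"
    if height_px < tq_2:
        return "small"
    return "normal"


def count_by_size(heights_px, dpi=300):
    """Count the pixel blocks of one character area in each size category."""
    tq_1 = pt_to_px(3, dpi)  # threshold for 3 pt
    tq_2 = pt_to_px(6, dpi)  # threshold for 6 pt
    counts = Counter(classify_height(h, tq_1, tq_2) for h in heights_px)
    return {
        "Cnt_noise": counts["noise"],
        "Cnt_small": counts["small"],
        "Cnt_normal": counts["normal"],
    }
```

At 300 dpi the thresholds come out to 12.5 and 25 pixels, so a block 24 pixels high would count as small point size and one 25 pixels high as main character size.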

Next, in S402, the system control unit 111 refers to the RAM 113 and determines whether the character area to be processed contains a pixel block classified as main character size. That is, the system control unit 111 determines whether Cnt_normal > 0. If it determines that a pixel block classified as main character size exists (Yes), the system control unit 111 proceeds to S403. If it determines that no such pixel block exists (No), the system control unit 111 proceeds to S405.

In S403, the system control unit 111 refers to the RAM 113 and determines whether, in the character area to be processed, the number of pixel blocks classified as main character size is larger than the total number of pixel blocks classified as noise size or small point size. That is, the system control unit 111 determines whether Cnt_normal > (Cnt_small + Cnt_noise).

If it determines that the number of pixel blocks classified as main character size is larger than the total number of pixel blocks classified as noise size or small point size (Yes in S403), the system control unit 111 proceeds to S406. In S406, the system control unit 111 sets the recognition priority of the character area to be processed to 1, stores it in the RAM 113, and ends the recognition priority setting process.

On the other hand, if it determines that the number of pixel blocks classified as main character size is not larger than the total number of pixel blocks classified as noise size or small point size (No in S403), the system control unit 111 proceeds to S404. In S404, the system control unit 111 refers to the RAM 113 and determines whether, in the character area to be processed, the total number of pixel blocks classified as noise size or small point size is larger than a predetermined threshold Th. That is, the system control unit 111 determines whether (Cnt_small + Cnt_noise) > Th.

If it determines that the total number of pixel blocks classified as noise size or small point size is larger than the predetermined threshold Th (Yes in S404), the system control unit 111 proceeds to S407. In S407, the system control unit 111 sets the recognition priority of the character area to be processed to 2, stores it in the RAM 113, and ends the recognition priority setting process.

On the other hand, if it determines that the total number of pixel blocks classified as noise size or small point size is not larger than the predetermined threshold Th (No in S404), the system control unit 111 proceeds to S408. In S408, the system control unit 111 sets the recognition priority of the character area to be processed to 3, stores it in the RAM 113, and ends the recognition priority setting process.

In S405, the system control unit 111 refers to the RAM 113 and determines whether, in the character area to be processed, the number of pixel blocks classified as small point size is larger than the number of pixel blocks classified as noise size. That is, the system control unit 111 determines whether Cnt_small > Cnt_noise.

If it determines that the number of pixel blocks classified as small point size is larger than the number of pixel blocks classified as noise size (Yes in S405), the system control unit 111 proceeds to S409. In S409, the system control unit 111 sets the recognition priority of the character area to be processed to 4, stores it in the RAM 113, and ends the recognition priority setting process.

On the other hand, if it determines that the number of pixel blocks classified as small point size is not larger than the number of pixel blocks classified as noise size (No in S405), the system control unit 111 proceeds to S410. In S410, the system control unit 111 sets the recognition priority of the character area to be processed to 5, stores it in the RAM 113, and ends the recognition priority setting process.

The above is the recognition priority setting process in S202. A character area with many normal-sized pixel blocks and few small-point characters or noise, that is, an area suited to the purpose of character recognition, is given a smaller recognition priority value and therefore a higher priority. In the present embodiment, the recognition priority is determined using only the height of the pixel blocks, but it may also be determined using other features such as the coordinate information of the character area. For example, a character area located near the top of the image plane is likely to be the title of the document, so a process that raises its recognition priority may be added.
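The decision flow of S402 to S410 can be summarized as a short Python sketch. The function name `recognition_priority` and the parameter `th` are illustrative assumptions, and the text does not give a value for the threshold Th. The S402 branch tests the main-character-size count, following the prose description of that step.

```python
def recognition_priority(cnt_noise, cnt_small, cnt_normal, th):
    """Return the recognition priority (1 = highest, 5 = lowest) of one character area.

    th is the predetermined threshold Th; its value is not specified in the
    text, so the caller must supply one.
    """
    if cnt_normal > 0:                            # S402: any main-size block?
        if cnt_normal > cnt_small + cnt_noise:    # S403
            return 1                              # S406
        if cnt_small + cnt_noise > th:            # S404
            return 2                              # S407
        return 3                                  # S408
    if cnt_small > cnt_noise:                     # S405
        return 4                                  # S409
    return 5                                      # S410
```

For instance, with a hypothetical Th of 5, an area with 10 main-size blocks, 2 small blocks, and 1 noise block is assigned priority 1, while an area with no main-size blocks and more noise than small blocks is assigned priority 5.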

Returning to FIG. 2, next, in S203, the system control unit 111 performs a time factor value setting process on each of the character areas included in the scanned document image and stores the result in the RAM 113. Here, the time factor value is an estimate of the processing time required when character recognition is executed on the character area to be processed; the larger the value, the longer the processing time. The time factor value basically depends on the number of characters contained in the character area. Here, a virtual character count is estimated from pixel block information alone, which can be obtained from the image without executing character recognition, and the time factor value is calculated from the estimated virtual character count.

In addition, accurately recognizing characters in a so-called obfuscated (hard-to-read) character area, in which characters are crushed, faint, or in contact with an underline, requires more processing, and the processing time is expected to increase. Furthermore, symbols such as the period and the middle dot, or the hyphen and the minus sign, often have similar shapes. Dotted lines and long horizontal bars also increase the number of attempts made by the single-character segmentation performed inside character recognition. For these reasons, the processing time until the final character recognition result is output is often long. Therefore, the system control unit 111 classifies pixel blocks by whether they are in an obfuscated state based on their arrangement and shape, and sets a penalty coefficient according to the classification. Hereinafter, this classification of whether a pixel block is in an obfuscated state is referred to as the obfuscation level.

That is, in the present embodiment, the time factor value Tf is defined by the following formula.
Tf = Σ (Ci × Ni) ... (2)
In formula (2), Ni is the virtual character count of obfuscation level i, and Ci is the coefficient of obfuscation level i. The obfuscation level coefficient Ci is defined as follows.
Ci = αi × βi ... (3)
In formula (3), αi is the size factor coefficient of obfuscation level i, and βi is the contact factor coefficient of obfuscation level i.

In the present embodiment, using only information obtained from the pixel blocks, the obfuscation level is divided into five patterns: noise, small point isolated, main size isolated, small point contact, and main size contact. A virtual character count and an obfuscation level coefficient are set for each, and the time factor value is calculated. FIG. 5(A) is a flowchart showing an example of the time factor value setting process executed using formula (2) above.

In S500, the system control unit 111 refers to the RAM 113 and acquires the character area to be processed. The system control unit 111 executes the processes of S501 to S513 described below for all pixel blocks included in the character area to be processed.

Next, in S501, the system control unit 111 acquires the height and width of one pixel block from the character area and stores them in the RAM 113. Since the height of the pixel block was already calculated in S401, that information may instead be acquired from the RAM 113.

Next, in S502, the system control unit 111 refers to the RAM 113 and determines the size factor coefficient of the pixel block based on its height information. The system control unit 111 proceeds to S503 if it classifies the pixel block as noise, to S504 if it classifies the pixel block as a small-point character, and to S505 if it classifies the pixel block as a main-size character.

In S503, the system control unit 111 sets the size factor coefficient of the pixel block classified as noise to α0 and stores the result in the RAM 113. It then proceeds to S509.
In S504, the system control unit 111 sets the size factor coefficient of the pixel block classified as a small-point character to α1 and stores the result in the RAM 113. It then proceeds to S506.
In S505, the system control unit 111 sets the size factor coefficient of the pixel block classified as a main-size character to α2 and stores the result in the RAM 113. It then proceeds to S506.

In S506, the system control unit 111 refers to the RAM 113, determines the contact factor coefficient of the pixel block based on its width information, and stores the result in the RAM 113. The system control unit 111 proceeds to S507 if it classifies the pixel block as an isolated character, and to S508 if it classifies the pixel block as contacting characters.
In S507, the system control unit 111 sets the contact factor coefficient of the pixel block classified as an isolated character to β1 and stores the result in the RAM 113. It then proceeds to S510.
In S508, the system control unit 111 sets the contact factor coefficient of the pixel block classified as contacting characters to β2 and stores the result in the RAM 113. It then proceeds to S511.

In S509, the system control unit 111 sets the virtual character count N of the pixel block whose reading-difficulty level is noise to 1, stores the result in the RAM 113, and transitions to S512.
In S510, the system control unit 111 sets the virtual character count N of the pixel block whose reading-difficulty level is single character to 1, stores the result in the RAM 113, and transitions to S512.
In S511, the system control unit 111 sets the virtual character count N of the pixel block whose reading-difficulty level is contact character to n, stores the result in the RAM 113, and transitions to S512. Because n is only an estimate of the number of characters, it is obtained by dividing the width of the pixel block by a predetermined threshold. The predetermined threshold may be, for example, the height of the pixel block in pixels, or a fixed value. In other words, the virtual character count N is set to a value of 2 or more.

In S512, the system control unit 111 calculates the reading-difficulty coefficient Ci according to equation (3) described above. That is, the system control unit 111 sets the coefficient Ci of the pixel block to the product of the previously determined coefficients α and β, stores the result in the RAM 113, and transitions to S513.

In S513, the system control unit 111 reads the reading-difficulty coefficient C and the virtual character count N of the pixel block from the RAM 113, multiplies them, and adds the product to the time factor value of the character area being processed. The processing in S513 corresponds to equation (2) described above. That is, the system control unit 111 computes the product of C and N for every pixel block in the character area, sets the sum as the time factor value Tf of that character area, and stores it in the RAM 113.
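The flow from S502 to S513 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the height thresholds (`NOISE_MAX_HEIGHT`, `SMALL_MAX_HEIGHT`), the width test (`touch_ratio`), and every coefficient value are hypothetical, and the sketch treats β as 1 on the noise path because the text sets no contact factor there.

```python
from dataclasses import dataclass

# Hypothetical coefficient values; the text only says that noise, small-point
# and contact characters receive larger coefficients than main-size single ones.
ALPHA = {"noise": 5.0, "small": 3.0, "main": 1.0}   # size factor (S503-S505)
BETA = {"single": 1.0, "contact": 2.0}              # contact factor (S507-S508)

# Hypothetical height thresholds in pixels used for the S502 classification.
NOISE_MAX_HEIGHT = 4
SMALL_MAX_HEIGHT = 12


@dataclass
class PixelBlock:
    width: int
    height: int


def block_cost(block: PixelBlock, touch_ratio: float = 1.5) -> float:
    """Return C_i * N_i for one pixel block (S502 to S512)."""
    if block.height <= NOISE_MAX_HEIGHT:
        # S503 -> S509: noise, N = 1. No contact factor is set on this path,
        # so beta is treated as 1 here (an assumption of this sketch).
        return ALPHA["noise"] * 1.0 * 1
    alpha = ALPHA["small"] if block.height <= SMALL_MAX_HEIGHT else ALPHA["main"]
    if block.width <= touch_ratio * block.height:
        # S507 -> S510: single character, N = 1.
        return alpha * BETA["single"] * 1
    # S508 -> S511: contact characters. N is estimated as width / height
    # (the height is one of the thresholds the text allows), kept at 2 or more.
    n = max(2, round(block.width / block.height))
    return alpha * BETA["contact"] * n


def time_factor(blocks: list[PixelBlock]) -> float:
    """S513: sum C_i * N_i over every pixel block in the character area."""
    return sum(block_cost(b) for b in blocks)
```

Only the width and height of each pixel block are used, so no character recognition has to run in order to estimate the cost of an area.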

FIG. 5B lists the coefficient C and the virtual character count N for each reading-difficulty level set in the recognition priority setting process described above.
In the present embodiment, pixel blocks are roughly classified, based on their height, into noise (regions for which character recognition need not be executed), small-point characters (characters so crushed that they are hard to analyze), and main-size characters. The coefficients set in each step are small for main-size characters and single characters, and large for noise, small-point characters, and contact characters. In view of the computational cost of the judgment, the thresholds used to judge whether character recognition will take processing time are the same as those of the recognition priority setting process, but they may be changed depending on the target image set or on how the reading-difficulty states are classified.

In addition, the smallest point size that can be recognized depends on the characteristics of the character recognition algorithm actually used and on the target language, so the judgment threshold may be set to match the character recognition process actually used. Likewise, the classes of characters that an algorithm handles poorly differ from algorithm to algorithm, so reading-difficulty levels tailored to the algorithm's characteristics may be prepared separately. For example, if the character recognition algorithm requires extra processing time for italic characters, a case that classifies a pixel block as italic from the tendency of its shape may be added. Furthermore, the coefficients may vary with the characteristics of the character recognition algorithm and with the language to be recognized. For example, character recognition of East Asian languages that use Chinese characters, such as Japanese and Chinese, has a higher processing cost than character recognition of Western languages. Accordingly, a larger coefficient is set when the character area is in an East Asian language.
The above is the time factor value setting process in S203.

Returning to FIG. 2, in S204 the system control unit 111 sets the character recognition execution order for the character areas in the acquired document image based on the recognition priority set in S202 and the time factor value set in S203, and stores the result in the RAM 113. The system control unit 111 then transitions to S205.

FIG. 6A shows an example (intermediate result 600) in which the processing of S202 and S203 shown in FIG. 2 has been executed on the character areas 301 to 325 detected in the scanned document image 300 to calculate their recognition priorities and time factor values. To determine the execution order, the character areas are first sorted from highest recognition priority (smallest number first), and character areas with the same rank are then sorted from lowest time factor value (smallest number first). The resulting order places first the character areas that best suit the purpose of character recognition and take the least time to recognize. FIG. 6B shows an example (intermediate result 601) in which the intermediate result 600 has been sorted in this way.
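The two-level sort can be expressed with a compound sort key. The sketch below is illustrative only: the `Area` record and its field names are assumptions, and the values in the usage note are made up rather than taken from FIG. 6.

```python
from typing import NamedTuple


class Area(NamedTuple):
    area_id: int
    priority: int       # smaller number = higher recognition priority
    time_factor: float  # smaller number = less expected processing time


def execution_order(areas: list[Area]) -> list[Area]:
    """Sort by priority first, then by time factor value among equal priorities."""
    return sorted(areas, key=lambda a: (a.priority, a.time_factor))
```

For instance, given areas with (priority, time factor) of (2, 40), (1, 90), and (1, 10), the two priority-1 areas come first, with the cheaper one ahead of the more expensive one.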

Next, in S205, the system control unit 111 calculates the running total of the time factor values from the top of the execution order set in S204 and, based on that running total, sets the character areas on which character recognition is executed (the character recognition execution areas). The system control unit 111 stores in the RAM 113 the execution areas set within the range where the total of the time factor values stays below a preset threshold, and transitions to S206. The threshold on the total can be a fixed value determined in advance from the CPU speed of the system and the required response time. FIG. 6C shows an example (intermediate result 602) in which the running total of the time factor values has been calculated from the intermediate result 601. For example, if the threshold serving as the upper limit of the running total is set to 250, the character areas actually recognized are those in range 603 of the list, which correspond to the character areas indicated by 604 in FIG. 7.
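The selection in S205 amounts to taking the longest prefix of the sorted list whose running total stays within the limit. The following is a minimal sketch; the function name and the input values in the usage note are hypothetical, and allowing the total to equal the limit is an assumption.

```python
def select_execution_areas(ordered, limit=250):
    """Pick the leading areas whose running time factor total stays within limit.

    ordered: list of (area_id, time_factor) pairs, already in execution order.
    Returns (selected_ids, running_total).
    """
    selected = []
    total = 0
    for area_id, time_factor in ordered:
        if total + time_factor > limit:
            break  # stop at the first area that would pass the upper limit
        total += time_factor
        selected.append(area_id)
    return selected, total
```

With time factor values 100, 120, 50, and 10 in execution order and a limit of 250, the first two areas are selected (running total 220). The fourth area is not selected even though it would fit, because the selection is a contiguous range from the top of the list.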

In S206, the system control unit 111 executes character recognition in the order and over the range of character areas set in S204 and S205, and stores the result in a storage unit such as the RAM 113 or the HDD 114.

As described above, character recognition can be performed on the character areas of a document image in an order that favors areas suited to the purpose of character recognition and requiring little processing time. In addition, because the actual elapsed time of character recognition is not used to limit the areas to be processed, the character recognition result does not vary with system load. This makes it possible to extract the text of the character areas the user needs while maintaining responsiveness, and to identify appropriate information quickly from the necessary character areas in the document image.

(Second Embodiment)
Next, a second embodiment of the present invention will be described. In the first embodiment, the execution order of character recognition is set in units of the character areas as first extracted. However, a single extracted character area may contain both characters to be recognized and noise, symbols, or the like that have high time factor values. Therefore, when the document image contains a character area whose recognition priority is high but whose time factor value is also large, the pixel blocks chiefly responsible for raising the time factor value may be identified and deleted, after which the time factor value is recalculated and character recognition is executed. Descriptions of matters common to the first embodiment, such as the configuration and the control flow, are omitted; the following describes the time factor value resetting process (S800 shown in FIG. 8A), which differs from the first embodiment. The time factor value resetting process is executed after the time factor value setting process in S203.

FIG. 8B is a flowchart showing an example of the time factor value resetting process in the present embodiment.
In S801, the system control unit 111 refers to the RAM 113 and acquires a character area whose time factor value has not yet been recalculated, together with the cumulative time factor value within that character area.
In S802, the system control unit 111 determines whether the acquired cumulative time factor value is larger than a predetermined threshold. If it is (Yes in S802), the system control unit 111 transitions to S803. If it is not (No in S802), the system control unit 111 transitions to S801.

In S803, the system control unit 111 refers to the RAM 113, identifies the pixel blocks that raise the time factor value of the character area (factor pixel blocks), and stores the result in the RAM 113. Specifically, it identifies the pixel blocks to which a large coefficient was assigned in the time factor value calculation of S513 in the flowchart of FIG. 5. In the present embodiment, the pixel blocks in the character area to which the noise coefficient is assigned are treated as factor pixel blocks.

In S804, the system control unit 111 removes the identified factor pixel blocks from the character area. For example, when the character area 325 shown in FIG. 3B is input, this processing removes the pixel blocks with the noise coefficient from the character area 325 shown in FIG. 9A and outputs the character area 325_1 shown in FIG. 9B. Then, in S805, the system control unit 111 performs the time factor value setting process on the character area from which the factor pixel blocks have been deleted, in the same manner as S203 shown in FIG. 2.
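For a single character area, S802 to S805 can be sketched as below. This is an illustration under assumptions: each pixel block is represented as a hypothetical `(kind, cost)` pair, where `cost` stands for the block's contribution C × N to the time factor value, and blocks of kind `"noise"` are taken as the factor pixel blocks, as in this embodiment.

```python
def reset_time_factor(blocks, threshold):
    """Drop noise blocks from an expensive character area and recompute its cost.

    blocks: list of (kind, cost) pairs, cost being C * N for that pixel block.
    Returns (remaining_blocks, time_factor_value).
    """
    total = sum(cost for _, cost in blocks)
    if total <= threshold:
        # S802 "No": the area is cheap enough and is left unchanged.
        return blocks, total
    # S803-S804: identify the factor pixel blocks (noise) and remove them.
    kept = [(kind, cost) for kind, cost in blocks if kind != "noise"]
    # S805: set the time factor value again for the reduced area.
    return kept, sum(cost for _, cost in kept)
```

An area whose blocks cost 1, 5 (noise), 6, and 5 (noise) has a time factor value of 17; with a threshold of 10 the two noise blocks are removed and the value drops to 7.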

In this way, pixel blocks with large time factor values can be excluded from a character area so that the area moves up in the character recognition execution order, which makes it possible to recognize the characters expected to be useful to the user even within a single character area.

In the example above, pixel blocks with high time factor values are removed from the character area and the time factor value is recalculated. Alternatively, the pixel blocks with high time factor values may be set as a new, separate character area, a new recognition priority and time factor value calculated, and the character recognition execution order set again. For example, the character area 325 shown in FIG. 10A may be separated into two character areas 325_2 and 325_3 as shown in FIG. 10B, with the recognition priority and time factor value calculated for each and the execution order set again. As a result, for example, where responsiveness matters the higher-ranked areas are recognized first and the remaining areas are recognized afterwards, so that character recognition can eventually be executed on every character area in the image.
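One way to picture the separation is to partition the pixel blocks of an area into two groups and give each group its own bounding rectangle. The sketch below is only an assumption about how such a split could be represented: the dictionary keys (`x`, `y`, `w`, `h`, `kind`) and the `is_costly` predicate are hypothetical, and the text does not specify how the new areas are delimited.

```python
def bounding_box(blocks):
    """Return (x, y, width, height) of the rectangle enclosing all blocks."""
    left = min(b["x"] for b in blocks)
    top = min(b["y"] for b in blocks)
    right = max(b["x"] + b["w"] for b in blocks)
    bottom = max(b["y"] + b["h"] for b in blocks)
    return (left, top, right - left, bottom - top)


def split_area(blocks, is_costly):
    """Split one character area into a cheap area and a costly area.

    Returns a list of (bounding_box, blocks) pairs; an empty group is omitted.
    """
    costly = [b for b in blocks if is_costly(b)]
    rest = [b for b in blocks if not is_costly(b)]
    return [(bounding_box(group), group) for group in (rest, costly) if group]
```

Each returned area would then go through the recognition priority and time factor value setting steps again before the execution order is rebuilt.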

In each of the embodiments described above, the recognition priority is calculated for every character area in the scanned document image, and the time factor value is also calculated for every character area. Alternatively, the character areas may be sorted once as soon as all recognition priorities have been calculated. The time factor values are then calculated area by area in descending order of recognition priority, processing ends when the time factor value exceeds a predetermined value, and the areas to be recognized and their execution order are fixed at that point. In this case, the time factor value setting process can be skipped for character areas with lower recognition priority, which further improves responsiveness.
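This early-exit variation can be sketched as a loop that computes time factor values on demand. The sketch rests on assumptions: `compute_time_factor` is a hypothetical callable standing in for the time factor value setting process, ties in priority keep their input order, and the loop stops when the running total would pass the limit.

```python
def plan_lazily(areas, compute_time_factor, limit):
    """Build the execution plan while computing time factors only as needed.

    areas: list of (area_id, priority) pairs; smaller priority number = higher.
    compute_time_factor: callable mapping an area_id to its time factor value.
    """
    plan = []
    total = 0.0
    # sorted() is stable, so areas with equal priority keep their input order.
    for area_id, _priority in sorted(areas, key=lambda a: a[1]):
        time_factor = compute_time_factor(area_id)
        if total + time_factor > limit:
            break  # lower-priority areas are never evaluated
        total += time_factor
        plan.append(area_id)
    return plan
```

The saving comes from the `break`: every area after the stopping point skips the time factor computation entirely.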

(Third Embodiment)
Next, a third embodiment of the present invention will be described. In the first and second embodiments described above, the character areas to be processed and their execution order are determined, and character recognition is executed on the character areas one at a time in that order. However, a document image acquired from the scanner unit 101 may be rotated in units of 90 degrees. For this reason, before the character recognition process is executed, a direction determination process must be executed to determine the direction in which the characters in the document image are upright. Direction determination generally works by internally executing character recognition on each of four images rotated in 90-degree steps and judging the direction with the most reliable character recognition result to be the upright direction of the document.
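The general four-direction approach can be sketched as follows. This is an illustration, not the method of any particular recognizer: the image is a plain row-major list of lists, `confidence` is a hypothetical callable standing in for running character recognition and returning a reliability score, and the returned angle is the clockwise rotation that this sketch applies to make the text upright.

```python
def rotate90(image):
    """Rotate a row-major 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def detect_upright_angle(image, confidence):
    """Try 0, 90, 180 and 270 degrees and return the best-scoring rotation."""
    best_angle, best_score = 0, float("-inf")
    candidate = image
    for angle in (0, 90, 180, 270):
        score = confidence(candidate)  # recognition reliability for this rotation
        if score > best_score:
            best_angle, best_score = angle, score
        candidate = rotate90(candidate)
    return best_angle
```

Because recognition runs up to four times, the cost of this step grows with the amount of text examined, which is why limiting the character areas used for it pays off.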

Accordingly, the character areas to be processed may also be set for direction determination based on the recognition priority and the time factor value. Descriptions of matters common to the first embodiment, such as the configuration and the control flow, are omitted. In the third embodiment, the system control unit 111 reads a program from the ROM 112 or the like and executes it, thereby realizing functions such as a detection means, a priority setting means, a time factor value setting means, an execution area setting means, an image generation means, and a direction determination means. The following describes the direction determination process and the image rotation process (S1101 and S1102 shown in FIG. 11A), which differ from the first embodiment.

FIG. 11A is a flowchart showing an example of the character recognition process in the image processing apparatus 110 according to the third embodiment. The direction determination process of S1101 is executed after the scanned document image is acquired in S200.

In S1101, the system control unit 111 determines the direction in which the characters of the scanned document image are upright and stores the angle of that direction in the RAM 113. Here, the character direction is defined as the direction of the characters in the document relative to upright characters, whose direction is taken as 0 degrees. S1101 outputs the character direction that makes the characters in the document upright in units of 90 degrees (0, 90, 180, or 270 degrees).

In S1102, the system control unit 111 refers to the RAM 113 to obtain the angle of the upright character direction and rotates the scanned document image based on that angle. This yields a document image in which the characters are upright. The system control unit 111 then transitions to S201 (character area detection process).

FIG. 11B is a flowchart showing an example of the direction determination process in the third embodiment. For the flowchart shown in FIG. 11B as well, descriptions of the recognition priority and time factor value setting steps that are common to the first embodiment are omitted.

In S205, the system control unit 111 sets the character areas on which character recognition is executed (the character recognition execution areas) within the range where the running total of the time factor values stays below a preset threshold, and stores the result in the RAM 113. Because direction determination for a document image is preprocessing for the actual character recognition, it needs to be even faster. At the same time, although direction determination needs a certain number of characters to judge, it only has to find the upright character direction, so it can work with fewer character areas than the actual character recognition. The threshold serving as the upper limit of the running total may therefore be smaller than the threshold used when executing character recognition. In the present embodiment, the threshold is 50 as an example. In this case, the character areas actually used for direction determination are those in range 1201 of the list shown in FIG. 12.

In S1103, the system control unit 111 generates an execution image for the direction determination process that consists only of the character areas in the range set in S205, and stores the result in a storage unit such as the RAM 113 or the HDD 114. FIG. 13 shows an example of an execution image for direction determination in the present embodiment. The execution image 1301 shown in FIG. 13 is generated from the character areas in range 1201 shown in FIG. 12.
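One possible way to build such an execution image is to start from a blank canvas and copy in only the selected rectangles. The sketch below is an assumption about the construction, since the text does not say whether the areas keep their original positions or are packed together: here they stay in place, the image is a row-major list of pixel values, and `background=255` (white) is a hypothetical default.

```python
def build_execution_image(image, rects, background=255):
    """Return a copy of image that keeps only the pixels inside rects.

    image: row-major 2D list of pixel values.
    rects: list of (x, y, width, height) character area rectangles.
    """
    height, width = len(image), len(image[0])
    canvas = [[background] * width for _ in range(height)]
    for x, y, w, h in rects:
        for row in range(y, min(y + h, height)):
            # Equal-length slices, so rectangles reaching past the edge are clipped.
            canvas[row][x:x + w] = image[row][x:x + w]
    return canvas
```

Everything outside the selected character areas becomes background, so the direction determination step has less content to examine.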

In S1104, the system control unit 111 refers to a storage unit such as the RAM 113 or the HDD 114 and determines the upright direction of the document image based on the execution image for direction determination generated in S1103. A known technique may be used for this processing; for example, the processing described in Japanese Patent No. 3727971 can be applied. The system control unit 111 stores the resulting angle of the upright character direction in the RAM 113.

As described above, direction determination, which is preprocessing for the actual character recognition, can also be made faster by generating an image that consists only of the character areas selected using the recognition priority and the time factor value and performing direction determination on that image.

(Other Embodiments of the Present Invention)
The present invention can also be realized by processing in which a program that implements one or more functions of the embodiments described above is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Each of the embodiments described above is merely a specific example of carrying out the present invention, and the technical scope of the present invention must not be interpreted restrictively on their basis. That is, the present invention can be carried out in various forms without departing from its technical idea or its main features.

100: Reading device
101: Scanner unit
102: Communication unit
110: Image processing apparatus
111: System control unit
112: ROM
113: RAM
114: HDD
115: Display unit
116: Input unit
117: Communication unit

Claims

An image processing apparatus comprising:
a detection means for detecting character areas in an image;
a priority setting means for setting, for each character area, a priority for character recognition processing;
a time factor value setting means for setting, for each character area, a time factor value relating to the processing time of character recognition processing, based on information on the size of the pixel blocks in the character area; and
an execution order setting means for determining, based on the set priority and time factor value, the character areas on which character recognition processing is executed and their execution order.

The image processing apparatus according to claim 1, wherein the time factor value setting means sets the time factor value for each pixel block based on at least one of the height and the width of the pixel block.

The image processing apparatus according to claim 1 or 2, wherein the execution order setting means determines the character areas on which character recognition processing is executed and their execution order based on the priority and on the sum of the time factor values set for the respective pixel blocks in each character area.

The image processing apparatus according to any one of claims 1 to 3, wherein the time factor value setting means sets the time factor value based on information on the size of the pixel blocks acquired from the image without executing character recognition processing.

The image processing apparatus according to any one of claims 1 to 4, wherein the time factor value setting means:
calculates a virtual character count from the set of pixel blocks in the character area;
classifies a reading-difficulty level and sets a coefficient for each calculated virtual character count; and
sets the virtual character count to 2 or more when a pixel block is determined to be a contact character in setting the coefficient.

The image processing apparatus according to any one of claims 1 to 5, wherein the execution order setting means:
determines the execution order of character recognition processing in descending order of priority and, where priorities are equal, in ascending order of time factor value; and
limits the character areas on which character recognition processing is executed to those for which the sum of the time factor values accumulated in execution order does not exceed a predetermined threshold.

The image processing apparatus according to any one of claims 1 to 6, wherein, when the time factor value set for a character area is larger than a predetermined threshold, the time factor value setting means identifies and deletes from that character area the pixel blocks that increase the time factor value, and sets the time factor value of the character area.

The image processing apparatus according to any one of claims 1 to 6, wherein, when the time factor value set for a character area is larger than a predetermined threshold, the pixel blocks that increase the time factor value are identified and separated from that character area, and the priority and the time factor value are set for each of the separated character areas.

The image processing apparatus according to any one of claims 1 to 8, wherein the time factor value setting means varies the time factor value to be set according to at least one of the character recognition algorithm and the target language.

設定された前記優先度及び前記時間要因値に基づいて、方向判定に用いる実行画像を生成する画像生成手段と、
生成された前記実行画像を用いて画像の方向を判定する方向判定手段とを有し、
前記方向判定手段による判定結果に基づいて、画像を回転させ文字認識処理を実行させることを特徴とする請求項１〜９の何れか１項に記載の画像処理装置。 The image processing apparatus according to any one of claims 1 to 9, further comprising:
an image generation means for generating an execution image used for direction determination, based on the set priority and the set time factor value; and
a direction determination means for determining the direction of the image using the generated execution image,
wherein the image is rotated based on the determination result of the direction determination means, and the character recognition process is then executed.

設定された前記優先度及び前記時間要因値に基づいて、方向判定に係る文字認識処理を実行する前記文字領域を設定する実行領域設定手段を有し、
前記画像生成手段は、前記実行領域設定手段により設定された前記文字領域で構成される前記実行画像を生成することを特徴とする請求項１０に記載の画像処理装置。 The image processing apparatus according to claim 10, further comprising an execution area setting means for setting, based on the set priority and the set time factor value, the character areas on which the character recognition process for direction determination is executed,
wherein the image generation means generates the execution image composed of the character areas set by the execution area setting means.

画像中の文字領域を検出する検出手段と、
前記文字領域ごとに文字認識処理に係る優先度を設定する優先度設定手段と、
前記文字領域ごとに、当該文字領域内の画素塊の大きさに係る情報に基づいて、文字認識処理の処理時間に係る時間要因値を設定する時間要因値設定手段と、
設定された前記優先度及び前記時間要因値に基づいて、方向判定に用いる実行画像を生成する画像生成手段と、
生成された前記実行画像を用いて画像の方向を判定する方向判定手段とを有することを特徴とする画像処理装置。 An image processing apparatus comprising:
a detection means for detecting character areas in an image;
a priority setting means for setting, for each character area, a priority related to the character recognition process;
a time factor value setting means for setting, for each character area, a time factor value related to the processing time of the character recognition process, based on information related to the size of the pixel blocks in the character area;
an image generation means for generating an execution image used for direction determination, based on the set priority and the set time factor value; and
a direction determination means for determining the direction of the image using the generated execution image.

設定された前記優先度及び前記時間要因値に基づいて、文字認識処理を実行する前記文字領域を設定する実行領域設定手段を有し、
前記画像生成手段は、前記実行領域設定手段により設定された前記文字領域で構成される前記実行画像を生成することを特徴とする請求項１２に記載の画像処理装置。 The image processing apparatus according to claim 12, further comprising an execution area setting means for setting, based on the set priority and the set time factor value, the character areas on which the character recognition process is executed,
wherein the image generation means generates the execution image composed of the character areas set by the execution area setting means.
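
Generating an execution image from the selected character areas and using it for direction determination can be sketched as follows. This is an illustration under assumptions, not the patented method: the image is modeled as a row-major list of lists, each area as a hypothetical `(x, y, w, h)` rectangle, and the direction is chosen by trying four rotations and scoring each with a caller-supplied `score` function (for example, a mean recognition confidence). The claims do not specify how the direction is actually determined.

```python
def make_execution_image(image, areas, background=0):
    """Copy only the selected rectangles (x, y, w, h) onto a blank canvas."""
    height, width = len(image), len(image[0])
    canvas = [[background] * width for _ in range(height)]
    for x, y, w, h in areas:
        for row in range(y, min(y + h, height)):
            canvas[row][x:x + w] = image[row][x:x + w]
    return canvas


def rotate90(image):
    """Rotate a row-major image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def determine_direction(execution_image, score):
    """Return the clockwise rotation (0, 90, 180 or 270) with the best score.
    `score` is a hypothetical callable supplied by the caller."""
    best_angle, best_score = 0, float("-inf")
    candidate = execution_image
    for angle in (0, 90, 180, 270):
        current = score(candidate)
        if current > best_score:
            best_angle, best_score = angle, current
        candidate = rotate90(candidate)
    return best_angle
```

Because only the selected rectangles are copied, the scoring step sees far fewer pixel blocks than the full page, which is the point of building a reduced execution image before determining the direction.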

画像処理装置による画像処理方法であって、
画像中の文字領域を検出する検出工程と、
前記文字領域ごとに文字認識処理に係る優先度を設定する優先度設定工程と、
前記文字領域ごとに、当該文字領域内の画素塊の大きさに係る情報に基づいて、文字認識処理の処理時間に係る時間要因値を設定する時間要因値設定工程と、
設定された前記優先度及び前記時間要因値に基づいて、文字認識処理を実行する前記文字領域及び実行順を決定する実行順設定工程とを有することを特徴とする画像処理方法。 An image processing method performed by an image processing apparatus, the method comprising:
a detection step of detecting character areas in an image;
a priority setting step of setting, for each character area, a priority related to the character recognition process;
a time factor value setting step of setting, for each character area, a time factor value related to the processing time of the character recognition process, based on information related to the size of the pixel blocks in the character area; and
an execution order setting step of determining, based on the set priority and the set time factor value, the character areas on which the character recognition process is executed and their execution order.

画像処理装置による画像処理方法であって、
画像中の文字領域を検出する検出工程と、
前記文字領域ごとに文字認識処理に係る優先度を設定する優先度設定工程と、
前記文字領域ごとに、当該文字領域内の画素塊の大きさに係る情報に基づいて、文字認識処理の処理時間に係る時間要因値を設定する時間要因値設定工程と、
設定された前記優先度及び前記時間要因値に基づいて、方向判定に用いる実行画像を生成する画像生成工程と、
生成された前記実行画像を用いて画像の方向を判定する方向判定工程とを有することを特徴とする画像処理方法。 An image processing method performed by an image processing apparatus, the method comprising:
a detection step of detecting character areas in an image;
a priority setting step of setting, for each character area, a priority related to the character recognition process;
a time factor value setting step of setting, for each character area, a time factor value related to the processing time of the character recognition process, based on information related to the size of the pixel blocks in the character area;
an image generation step of generating an execution image used for direction determination, based on the set priority and the set time factor value; and
a direction determination step of determining the direction of the image using the generated execution image.

画像処理装置のコンピュータに、
画像中の文字領域を検出する検出ステップと、
前記文字領域ごとに文字認識処理に係る優先度を設定する優先度設定ステップと、
前記文字領域ごとに、当該文字領域内の画素塊の大きさに係る情報に基づいて、文字認識処理の処理時間に係る時間要因値を設定する時間要因値設定ステップと、
設定された前記優先度及び前記時間要因値に基づいて、文字認識処理を実行する前記文字領域及び実行順を決定する実行順設定ステップとを実行させるためのプログラム。 A program for causing a computer of an image processing apparatus to execute:
a detection step of detecting character areas in an image;
a priority setting step of setting, for each character area, a priority related to the character recognition process;
a time factor value setting step of setting, for each character area, a time factor value related to the processing time of the character recognition process, based on information related to the size of the pixel blocks in the character area; and
an execution order setting step of determining, based on the set priority and the set time factor value, the character areas on which the character recognition process is executed and their execution order.

画像処理装置のコンピュータに、
画像中の文字領域を検出する検出ステップと、
前記文字領域ごとに文字認識処理に係る優先度を設定する優先度設定ステップと、
前記文字領域ごとに、当該文字領域内の画素塊の大きさに係る情報に基づいて、文字認識処理の処理時間に係る時間要因値を設定する時間要因値設定ステップと、
設定された前記優先度及び前記時間要因値に基づいて、方向判定に用いる実行画像を生成する画像生成ステップと、
生成された前記実行画像を用いて画像の方向を判定する方向判定ステップとを実行させるためのプログラム。 A program for causing a computer of an image processing apparatus to execute:
a detection step of detecting character areas in an image;
a priority setting step of setting, for each character area, a priority related to the character recognition process;
a time factor value setting step of setting, for each character area, a time factor value related to the processing time of the character recognition process, based on information related to the size of the pixel blocks in the character area;
an image generation step of generating an execution image used for direction determination, based on the set priority and the set time factor value; and
a direction determination step of determining the direction of the image using the generated execution image.