JP2008040598A

JP2008040598A - Image input apparatus

Info

Publication number: JP2008040598A
Application number: JP2006210988A
Authority: JP
Inventors: Tomoshi Yoshida; 知史吉田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2006-08-02
Filing date: 2006-08-02
Publication date: 2008-02-21

Abstract

PROBLEM TO BE SOLVED: To provide an image input apparatus that corrects erroneous recognition caused by positional deviation between an actually scanned image and a previously set OCR area.

SOLUTION: The image input apparatus comprises a scanning means, a form registering means, a zone OCR means, an index correcting means, an indexing means for using the data acquired by the index correcting means as an index of a document, and a storing means for storing the image and the index data in a document management system as a document.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to an image input apparatus that inputs an image from a scanner device or from an MFP (digital multifunction peripheral) having a scanner unit, creates the necessary information as index data by zone OCR, and stores the image and the index data in a document management system.

Conventionally, an image input apparatus performs OCR processing on an image read from a scanner, uses the OCR result as index data, and stores the image and the index as a document in a document management system. In that OCR processing, a partial area of the image containing the character information needed as index data is cut out, and the OCR result is obtained from it. However, depending on the characteristics and performance of the feeder (paper feeding device) of the scanner device, the acquired image data may be skewed relative to the original document, or may be displaced from the area set in advance for zone OCR. Naturally, when a partial area of the image is set in advance for zone OCR, the range is specified somewhat wider to allow for this.

In addition, a method has been proposed in which entry frames are provided on the document in advance for zone OCR, and characters that touch or extend beyond an entry frame are read without lowering the recognition rate by means of entry-frame dropout or ruled-line removal techniques.

Further, a technique has been proposed that improves OCR accuracy by correcting positional deviation through comparison of the actually scanned document with the document registered as a form in advance for setting the OCR areas.

For example, Patent Literature 1 can be cited as a conventional example.
Patent Literature 1: JP 11-282959 A

The present invention has been made in view of the above conventional examples. Even when a partial area of the image is set in advance for zone OCR and its range is specified somewhat wider, image displacement caused by the scanner device may slightly cut off the left edge of the first character, the right edge of the last character, or the top or bottom of the character string, which adversely affects the OCR accuracy of the image input apparatus. Processing that automatically widens the partial area to absorb small displacements has also been proposed, but widening the area can cause other data to enter the rectangle, which is a problem in itself.

In addition, in an image input apparatus implemented as application software that uses a black-box OCR recognition engine, the only OCR result available is the text data returned by the OCR engine, so it is not possible to judge whether the zone OCR result is correct in the first place.

Further, techniques that automatically determine the area containing the desired character string before the partial image is cut out for zone OCR require time-consuming image processing, such as evaluating the pixel density of the image. This is a performance problem for an image input apparatus that indexes large volumes of documents and stores them (images and indexes) in a document management system.

As for the method that handles characters touching or extending beyond an entry frame by dropping out the frame, once the character data to be recognized touches or crosses the frame, the character strokes and the frame lines intersect, so a drop in the character recognition rate is unavoidable. Moreover, that proposal can be applied only to documents that have ruled lines or frame lines.

Further, the technique that improves OCR accuracy by comparing the actually scanned document with the document registered as a form requires form recognition to decide whether the scanned document is the same as the registered one. If form recognition fails, processing never reaches the OCR stage, and as a result no correct zone OCR result is obtained.

To solve the problems of the conventional examples described above, the present invention comprises the following means.

The present invention comprises: scanning means for reading an image with a scanner device; form registration means for specifying a plurality of zone OCR areas on a scanned image; zone OCR means for performing zone OCR processing on a scanned image according to the area settings registered in the form; index correction means for automatically expanding the area of a partial image and repeating zone OCR to create plausible index data; index means for using the data acquired by the index correction means as the index of a document; and storage means for storing the image and the index data as a document in a document management system.

Needless to say, as long as the above configuration and means are provided, the present invention may take a form in which all components reside in a single computer such as a PC, or may be provided as a system in which the components reside on a plurality of PCs connected via a network.

Apart from the scanner device, the image input apparatus and the document management system that have the characteristic means of the present invention are built on a computer system as computer software.

Even without a scanner device, as long as image data can be input to the image input apparatus by an import process, the invention may be built on a computer as a software-only system.

The present invention described in the embodiments has the following effects.

(1) Misrecognition caused by cut-off characters in the partial image during zone OCR processing can be reduced, which improves operability for users of the image input apparatus.

(2) The present invention corrects the character string for positional deviation between the OCR area set by the application and the actual scanned image, and does not rely on ruled lines or entry frames to determine the position of the character string. It can therefore prevent a drop in the recognition rate in zone OCR processing of documents other than those with ruled lines or entry frames, which were the only documents that conventional ruled-line discrimination or entry-frame dropout could correct.

(3) Because the present invention corrects misrecognition due to positional deviation from the OCR results themselves, it improves zone OCR accuracy without comparing the actually scanned document against the document registered as a form (form recognition). A zone OCR system with a high recognition rate can therefore be developed without incorporating expensive and complex form recognition technology, and such a system can in turn be provided to end users.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

FIG. 1 is a diagram showing the basic configuration of the system of the present invention.

Reference numeral 1 denotes a scanner device that scans an image from a paper document.

Reference numeral 2 denotes an image input apparatus that performs the processing characteristic of the present invention.

Reference numeral 3 denotes a document management system that stores documents (images and indexes) received from the image input apparatus.

As an embodiment, the scanner device, the image input apparatus, and the document management system may be connected over a network so that they can exchange data with one another. Alternatively, the image input apparatus and the document management system may reside in a single PC. In either case the present invention can be realized as long as the functional elements and means are provided.

FIG. 2 is a flowchart of the overall processing according to the embodiment of the present invention.

FIG. 3 conceptually illustrates index correction according to the embodiment of the present invention. Because this invention aims to improve the recognition accuracy of zone OCR for partial areas of an image, the index correction process is deliberately shown in a conceptual, visually accessible form.

FIG. 4 shows an example of setting the range of a partial area on a document. A rectangular area is selected around a required entry field of the document on the image.

Next, the processing of the embodiment of the present invention will be described in detail with reference to the flowchart of FIG. 2.

First, in step S101, initialization is performed. Initialization means the general processing by which the scanner device, the image input apparatus, and the document management system are started and become ready for use. After initialization, the image input apparatus 2 waits for image input from the scanner device.

Next, in step S102, an image is scanned.

Next, in step S103, it is determined whether the current processing phase is the form registration step or the image input step. The determination is based on a command that the user enters interactively through the user interface of the image input apparatus. In the case of the form registration step, the process proceeds to S104. In the case of the image input step, the process proceeds to step S107.

In step S104, the image input in S102 is registered as a form.

Next, in step S105, an area for zone OCR is set on the image input in S102. The concept of area setting is shown in FIG. 4, where a rectangular area is selected around the part of the document that contains character string information. This operation is performed on the user interface of the image input apparatus.

Next, in step S106, it is determined whether to repeat the area setting of S105 for another area. If so, the process returns to step S105 and the next area is set. In the example of FIG. 4 the document contains character string information in only one place, but in practice several zone OCR areas may be set on one page, which is why the loop of S105 and S106 is needed.

The form registered by the processing of steps S104 to S106 is stored in the image input apparatus and used in the image input step.

If no further area is to be set in step S106, the process returns to step S102.
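The form registration loop of S104 to S106 can be pictured as building a named list of rectangles. The following is only a minimal sketch under stated assumptions: the names `Zone`, `Form`, and `FormRegistry`, the pixel-coordinate fields, and the in-memory dictionary are hypothetical and do not appear in the source, which says only that the form is stored in the image input apparatus.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Zone:
    """One rectangular zone OCR area, in pixel coordinates (assumed)."""
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class Form:
    """A registered form: a name plus the zones set in S105/S106."""
    name: str
    zones: List[Zone] = field(default_factory=list)

    def add_zone(self, zone: Zone) -> None:
        # Corresponds to one pass through S105.
        self.zones.append(zone)


class FormRegistry:
    """Hypothetical in-memory store standing in for the apparatus storage."""

    def __init__(self) -> None:
        self._forms: Dict[str, Form] = {}

    def register(self, form: Form) -> None:
        self._forms[form.name] = form

    def zones_for(self, name: str) -> List[Zone]:
        # Corresponds to reading the area settings back in S107.
        return list(self._forms[name].zones)
```

In the image input step, a caller would look up the zones with `zones_for` and run zone OCR on each rectangle in turn.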

If it is determined in step S103 that the process is in the image input mode, the process proceeds to step S107.

In step S107, the zone OCR area settings are read from the form registered by the processing of steps S104 to S106.

Next, in step S108, the partial image defined by the rectangular area read in S107 is cut out and passed to the OCR engine, and zone OCR is executed.

Next, in step S109, the OCR result (text data) of the zone OCR in S108 is acquired as the "first OCR result". Its character count and the characters that make up the string are analyzed and temporarily stored in the system.

Next, in step S110, the area read in S107 is automatically expanded. The concept is shown in FIG. 3.
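The automatic expansion of S110 could be sketched as follows. The source does not state how far the area is expanded or in which directions, so the uniform `margin` parameter, the clamping to the image bounds, and the names `Rect` and `expand_rect` are all assumptions made for illustration.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Rectangle in pixel coordinates; right and bottom are exclusive (assumed)."""
    left: int
    top: int
    right: int
    bottom: int


def expand_rect(rect: Rect, margin: int, image_width: int, image_height: int) -> Rect:
    """Grow rect by margin pixels on every side without leaving the image.

    The uniform margin is a hypothetical choice; the source only says the
    area is expanded automatically.
    """
    return Rect(
        left=max(0, rect.left - margin),
        top=max(0, rect.top - margin),
        right=min(image_width, rect.right + margin),
        bottom=min(image_height, rect.bottom + margin),
    )
```

A larger margin recovers more severely cut-off characters but makes it more likely that a neighboring line enters the rectangle, which is the case the later line-break handling addresses.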

Next, in step S111, a partial image is cut out according to the area expanded in S110 and passed to the OCR engine, and zone OCR is executed.

Next, in step S112, the OCR result (text data) of the zone OCR in S111 is acquired as the "second OCR result". Its character count and the characters that make up the string are analyzed and temporarily stored in the system.

Next, in S113, the "first OCR result" acquired in S109 is compared with the "second OCR result" acquired in S112.

Next, in S114, the comparison result of S113 is evaluated. If the "first OCR result" acquired in S109 and the "second OCR result" acquired in S112 are judged to be the same character string, the process proceeds to step S120. In this case, the "first OCR result" is adopted as the index data.

The feature of the present invention is that it corrects misrecognition in the OCR result (a state in which the desired character string is not obtained) caused by small cut-offs of characters in the partial image. Therefore, if the OCR result of the original area and that of the expanded area are identical, it can be judged that the "first OCR result" contained no misrecognition due to cut-off characters in the first place.

If it is found in S114 that the "first OCR result" acquired in S109 and the "second OCR result" acquired in S112 differ, the process proceeds to step S115.

In step S115, it is determined whether the character string of the "second OCR result" contains a line feed code.

If no line feed code is found in S115, the process proceeds to step S120. In this case, the "second OCR result" is adopted as the index data. The reasoning is as follows. Because the OCR result obtained after expanding the area differs from the one before expansion, the "first OCR result" can be judged to have come from a partial image with cut-off characters. Further, the absence of a line feed code in the "second OCR result" indicates that expanding the area did not bring data from another line of the document into the partial image. At this point the "second OCR result" is therefore judged to be more plausible than the "first OCR result" and is the one to adopt as index data.

No figure illustrates this case, but suppose, for example, that the "first OCR result" is the string "BC" and the "second OCR result" is "ABC". Because the comparison in step S113 shows that the two results differ, the "first OCR result" is not adopted as index data. Because the "second OCR result" contains no line feed code, the determination in step S115 leads to the "second OCR result" being adopted as index data. In other words, as these comparisons make clear, the present invention performs correction on the premise that it is not known whether the "first OCR result" is correct, and adopts the more plausible result as index data.

If it is determined in step S115 that the "second OCR result" contains a line feed code, the process proceeds to step S116.

In step S116, the character string of the "second OCR result" is split at the line breaks, and for each resulting string the character count and the constituent characters are analyzed and temporarily stored. The concept is shown in FIG. 3.

Next, in step S117, the "first OCR result" is compared with each string obtained by splitting the "second OCR result". This is a comparison between the analysis result of S109 and the analysis result for the split string from S116. The result of the comparison is temporarily stored in the image input apparatus in S118. In the example of FIG. 3, the "second OCR result" is "---\nABC", so the first split string is "---". The string "---" has three characters, each of which is "-". The "first OCR result", on the other hand, is "PBC", which has three characters: "P", "B", and "C". Comparing "---" with "PBC" therefore shows that the character counts are the same but the constituent characters differ. In S118, the fact that the character counts match and all constituent characters differ is temporarily stored in the image input apparatus as flags held in program variables.

Next, in step S119, it is determined whether the "second OCR result" has any other split strings. If so, the process returns to step S117, and S117 and S118 are repeated. In the example of FIG. 3, the split string compared this time is "ABC". The string "ABC" has three characters: "A", "B", and "C". Compared with the "first OCR result", the character count is the same, and two of the three constituent characters ("B" and "C") match the "first OCR result". In S118, this too is temporarily stored in the image input apparatus as flags held in program variables.
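The per-string comparison of S117 and S118 can be sketched as below. The source says only that the character count and the constituent characters are compared, so matching characters position by position is an assumption, as are the function name `compare_with_first` and the tuple used in place of the stored flags.

```python
from typing import Tuple


def compare_with_first(first: str, candidate: str) -> Tuple[bool, int]:
    """Compare one split string against the first OCR result.

    Returns (same_length, matching_chars). Characters are matched position
    by position, which is an assumed interpretation of the source.
    """
    same_length = len(first) == len(candidate)
    # zip stops at the shorter string, so unequal lengths are handled safely.
    matching_chars = sum(1 for a, b in zip(first, candidate) if a == b)
    return same_length, matching_chars


# The FIG. 3 example: first result "PBC", second result "---\nABC".
for piece in "---\nABC".splitlines():
    print(piece, compare_with_first("PBC", piece))
```

For the FIG. 3 example this yields no matching characters for "---" and two for "ABC", in line with the description above.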

Next, in S120, if the process has come from step S118, the results stored in S118 for the split strings are ranked, and the string most similar to the "first OCR result" is adopted as index data.

If the process has come to S120 from step S114, the "first OCR result" is adopted as index data.

If the process has come to S120 from step S115, the "second OCR result" is adopted as index data.

Through this processing in S120, the "first OCR result" is corrected if it was misrecognized because characters were cut off in the partial image, and is adopted as the index unchanged if it was not.
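Taken together, the decisions of S113 to S120 can be sketched as a single function. This is an illustration under assumptions rather than the patented implementation: the name `choose_index`, the position-by-position character matching, and the ranking order (matching characters first, equal length as tie-breaker) are not specified in the source, which asks only for the split string closest to the first result in character count and constituent characters.

```python
def choose_index(first: str, second: str) -> str:
    """Pick the index data from the first and second OCR results."""
    # Condition 1 (S114): identical results mean nothing was cut off.
    if first == second:
        return first

    # Condition 2 (S115): results differ and no other line was picked up,
    # so the expanded area is taken as the more plausible reading.
    if "\n" not in second:
        return second

    # Condition 3 (S116 to S120): the expanded area picked up other lines.
    # Split at line breaks and keep the piece closest to the first result.
    def similarity(piece: str):
        matching = sum(1 for a, b in zip(first, piece) if a == b)
        return (matching, len(piece) == len(first))  # assumed ranking

    return max(second.splitlines(), key=similarity)


print(choose_index("PBC", "---\nABC"))  # FIG. 3 example
print(choose_index("BC", "ABC"))        # cut-off first character
```

Because only the two text results are inspected, the sketch works with a black-box OCR engine that returns nothing but text, which is the situation the description has in mind.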

Because an input image may have more than one index, the index data determined in S120 is held temporarily in the image input apparatus until it is stored in the document management system.

Next, in step S121, it is determined whether there is another area setting. If there is, the process returns to S107, and the processing from S107 to S120 is repeated for the next area to obtain the most plausible index data for it. If there is no other area setting in S121, the process proceeds to step S122.

In step S122, the image input in S102 and the index data acquired by the processing of S107 to S120 are stored as a document in the document management system.

Needless to say, the document management system referred to here may be any system in which images and index data are stored in association with each other and which later allows index search, retrieval of images from the search results, and display of those images.

Next, in S123, it is determined whether to end the overall processing. If not, the process returns to S102 and waits for the next image to be input.

If the overall processing is to end in S123, general shutdown processing of the apparatus is performed and the entire processing ends.

FIG. 1 is a system configuration block diagram according to the first embodiment of the present invention.
FIG. 2 is a flowchart of the overall processing according to the embodiment of the present invention.
FIG. 3 is a conceptual diagram of the scan and save phase processing according to the embodiment of the present invention.
FIG. 4 is a diagram showing the area setting of a partial area image according to the embodiment of the present invention.

Claims

An image input apparatus that inputs an image from a scanner device, performs optical character recognition on partial areas of the input image (hereinafter, zone OCR), uses the text data acquired as the OCR result as index data of a document, and finally stores the document (image and index data) in a document management system, the apparatus including, in the processing that creates index data by OCR, processing that corrects misrecognition caused by characters cut off when a partial image is cropped and adopts a plausible OCR output as index data, the image input apparatus comprising:
a form registration step of registering in advance, as a form, OCR area settings on a document for the OCR processing of partial areas used to create index data; and
an image input step of scanning an actual document, creating an index by OCR, and storing the document (image and index) in the document management system,
wherein, in the image input step, in the index creation processing for the scanned image, a partial area of the scanned image is cropped according to the OCR area settings of the form registered in the form registration step, a first OCR process is executed on the partial area image, and the partial area is then automatically expanded and a second OCR process is executed,
as condition 1, the "first OCR result" and the "second OCR result" are compared and, if they are the same text data, the "first OCR result" is adopted as index data,
as condition 2, if the "first OCR result" and the "second OCR result" are different text data and the "second OCR result" contains no line feed code, the "second OCR result" is adopted as index data, and
as condition 3, if the "first OCR result" and the "second OCR result" are different text data and the "second OCR result" contains a line feed code, the character string of the "second OCR result" is split at the line feed codes, each piece of split text data is compared with the "first OCR result", and the piece whose character count and constituent characters are closest to the "first OCR result" is adopted as index data,
whereby misrecognition caused by characters cut off in partial image cropping is corrected through the determinations of condition 1, condition 2, and condition 3, and the index data is acquired.