JP2022125220A

JP2022125220A - Image processing apparatus, image processing method, and program

Info

Publication number: JP2022125220A
Application number: JP2022108466A
Authority: JP
Inventors: 隼哉秋山; Junya Akiyama; 克彦近藤; Katsuhiko Kondo; 哲 ▲瀬▼川; Satoru Segawa; 裕一中谷; Yuichi Nakatani; 充杉本; Mitsuru Sugimoto; 康日高; Yasushi Hidaka
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2017-06-22
Filing date: 2022-07-05
Publication date: 2022-08-26
Also published as: JP2019008775A

Abstract

PROBLEM TO BE SOLVED: To record, even if there are many types of document formats, an arbitrary character to be recorded in an arbitrary document according to the type of document format.

SOLUTION: An image processing apparatus is configured to: acquire image data of a document form and a recorded character string that an operator has recorded among character strings included in the image data; extract a feature amount of the character string that matches the recorded character string among the character strings recognized by character recognition processing on the image data; machine-learn the feature amount of the character string that is extracted from the image data of each of the plurality of document forms of the same format to generate the feature amount to be read; and record the character string recognized from the image data of a new document form if the feature amount extracted from the character string recognized by the character recognition processing for the image data of the acquired new document form matches the feature amount to be read. Further, the feature amount is information indicating an attribute of the character string and a range of the character string in the image of the document form.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

Some image processing apparatuses read a document by performing OCR (Optical Character Recognition) processing on an image of the document acquired with a scanner or the like. When documents come in various formats, such an apparatus stores format definition information for the documents in advance and uses it to identify the reading target area, which improves the accuracy of OCR processing. A related technique is disclosed in Patent Document 1.

Patent Document 1: JP 2016-48444 A

However, when there are many types of document formats as described above, storing the format definition information for all of them in the image processing apparatus places a heavy burden on the operator.

Accordingly, an object of the present invention is to provide an image processing apparatus, an image processing method, and a program that solve the above problem.

According to a first aspect of the present invention, an image processing apparatus includes: an acquisition unit that acquires image data of a document form and a recorded character string that an operator has recorded from among the character strings included in the image data; a feature amount extraction unit that extracts a feature amount of the character string that matches the recorded character string, from among the character strings recognized by character recognition processing on the image data; a reading target feature amount generation unit that machine-learns the feature amounts of the character strings extracted from the image data of each of a plurality of document forms of the same format to generate a feature amount to be read; and a recording unit that records the character string recognized from the image data of a new document form when the feature amount extracted from a character string recognized by character recognition processing on the image data of the new document form acquired by the acquisition unit matches the feature amount to be read. The feature amount is information indicating an attribute of the character string and the range of the character string in the image of the document form.

According to a second aspect of the present invention, an image processing method includes: acquiring image data of a document form and a recorded character string that an operator has recorded from among the character strings included in the image data; extracting a feature amount of the character string that matches the recorded character string, from among the character strings recognized by character recognition processing on the image data; machine-learning the feature amounts of the character strings extracted from the image data of each of a plurality of document forms of the same format to generate a feature amount to be read; and recording the character string recognized from the image data of a new document form when the feature amount extracted from a character string recognized by character recognition processing on the image data of the new document form matches the feature amount to be read. The feature amount is information indicating an attribute of the character string and the range of the character string in the image of the document form.

According to a third aspect of the present invention, a program causes a computer to execute: a step of acquiring image data of a document form and a recorded character string that an operator has recorded from among the character strings included in the image data; a step of extracting a feature amount of the character string that matches the recorded character string, from among the character strings recognized by character recognition processing on the image data; a step of machine-learning the feature amounts of the character strings extracted from the image data of each of a plurality of document forms of the same format to generate a feature amount to be read; and a step of recording the character string recognized from the image data of a new document form when the feature amount extracted from a character string recognized by character recognition processing on the image data of the new document form acquired in the acquiring step matches the feature amount to be read. The feature amount is information indicating an attribute of the character string and the range of the character string in the image of the document form.

According to the present invention, even when there are many types of document formats, arbitrary characters to be recorded in an arbitrary document can be recorded according to the type of format.

FIG. 1 is a diagram showing an overview of an image processing system including an image processing apparatus.
FIG. 2 is a diagram showing the hardware configuration of the image processing apparatus.
FIG. 3 is a functional block diagram of the image processing apparatus.
FIG. 4 is a diagram showing an example of a document form.
FIG. 5 is a diagram showing an overview of a data table stored in the database.
FIG. 6 is a first diagram showing the processing flow of the image processing apparatus according to the first embodiment.
FIG. 7 is a second diagram showing the processing flow of the image processing apparatus according to the first embodiment.
FIG. 8 is a functional block diagram of the image processing apparatus according to the second embodiment.
FIG. 9 is a first diagram showing the processing flow of the image processing apparatus according to the second embodiment.
FIG. 10 is a second diagram showing the processing flow of the image processing apparatus according to the second embodiment.
FIG. 11 is a diagram showing the processing flow of the image processing apparatus according to the sixth embodiment.
FIG. 12 is a diagram showing an example of a connection between the image processing system according to the seventh embodiment and an existing OCR system.
FIG. 13 is a diagram showing the processing flow of the image processing apparatus according to the sixth embodiment.
FIG. 14 is a diagram showing an overview of the image processing system 200 according to the eighth embodiment.
FIG. 15 is a diagram showing the processing flow of the terminal device according to the eighth embodiment.
FIG. 16 is a diagram showing the processing flow of the server device according to the eighth embodiment.
FIG. 17 is a diagram showing the minimum configuration of the image processing apparatus.

An image processing apparatus according to an embodiment of the present invention will be described below with reference to the drawings.
FIG. 1 is a diagram showing an overview of an image processing system including the image processing apparatus according to this embodiment. As shown in FIG. 1, the image processing system 100 is composed of an image processing apparatus 1, an image reading device 2, a recording device 3, and a database 4.
The image processing apparatus 1 is connected to the image reading device 2 by a communication cable. The image reading device 2 optically acquires image data of a document form or the like and outputs it to the image processing apparatus 1. The image processing apparatus 1 performs OCR processing on the image data of the document form to recognize its characters. The image processing apparatus 1 outputs the character recognition result to the recording device 3, and the recording device 3 records the character recognition result in the database. The database 4 is connected to the image processing apparatus 1 and the recording device 3. The database 4 stores the correspondence between the image data of a plurality of document forms registered in the past from the recording device 3 and the recorded character strings, which indicate the character strings to be recorded among the character strings included in that image data. A recorded character string is an important character string, among those written on the document form, that should be recorded and kept in the database 4. An operator who uses the image processing system 100 registers in the database 4 in advance, using the recording device 3, the image data of a plurality of document forms and the recorded character strings among the character strings included in that image data.

It is assumed that, in the recording device 3, the correspondence between the image data of a document form and the information on the recorded character strings, which indicate the character strings to be recorded among the character strings included in that image data, has been sufficiently recorded for many document forms. The image processing apparatus 1 performs its processing in this state.

FIG. 2 is a diagram showing the hardware configuration of the image processing apparatus.
As shown in FIG. 2, the image processing apparatus 1 is a computer that includes a CPU (Central Processing Unit) 11, an IF (Interface) 12, a communication module 13, a ROM (Read Only Memory) 14, a RAM (Random Access Memory) 15, an HDD (Hard Disk Drive) 16, and the like. The communication module 13 may communicate with the image reading device 2, the recording device 3, and the database 4 wirelessly or by wire, or may have both functions.

FIG. 3 is a functional block diagram of the image processing apparatus.
By executing a stored program, the CPU 11 of the image processing apparatus 1 provides the functions of a control unit 101, an acquisition unit 102, a feature amount extraction unit 103, a reading target feature amount generation unit 104, and a recording unit 105.

The control unit 101 controls the other functional units.
The acquisition unit 102 acquires image data of a document form.
The feature amount extraction unit 103 extracts, for each piece of document form image data, a feature amount indicating the features of the recorded character strings included in that image data, based on the results of character recognition processing on the image data of a plurality of document forms.
The reading target feature amount generation unit 104 generates a first feature amount of a recorded character string using the feature amounts corresponding to the image data of the document forms.
The recording unit 105 uses the first feature amount to extract and record the recorded character strings from the character string information read from the image data of a new document form.

Through this processing, the image processing apparatus 1 reduces the labor of recording the character string information that should be recorded from the image data of a new document form.

FIG. 4 is a diagram showing an example of a document form.
As shown in this figure, the document form describes the mark of the company that created the document, the creation date, the person in charge of creation, and the document content in a format specific to that document form. If the document form is an order form, for example, the document content consists of one or more sets of information such as the name of an ordered product and the quantity ordered. Based on one document form, the operator uses the recording device 3 to record in the database 4 the specific recorded character strings that should be recorded among the character strings written on that document form. Specifically, while looking at the document form, the operator enters the specific recorded character strings that the recording device 3 should record in the database 4. The operator also has the image reading device 2 read the document form. The image reading device 2 reads the document form in response to the operator's operation and outputs the image data to the image processing apparatus 1. Then, based on the operator's operation and the control of the image processing apparatus 1, the recording device 3 records in the database 4 the image data of one document form in association with the specific recorded character strings among the character strings written on that document form. In the example of FIG. 4, the date 51, the supplier 52, the product name 53, the quantity 54, and the amount 55 are the specific recorded character strings to be recorded. Other information, such as non-recorded character strings that the operator does not record, is also printed on the document form 5. This information includes, for example, the name 501 of the orderer who issued the document form, the emblem image 502 of the orderer, the title 503 of the document form, and the greeting 504.

FIG. 5 is a diagram showing an overview of a data table stored in the database.
As shown in FIG. 5, the database 4 stores, in a recording table, the image data of a document form in association with the specific recorded character strings that should be recorded among the character strings written on that document form.
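The association held by the recording table can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: the names `RecordRow`, `RecordTable`, `form_id`, `image_path`, and the item keys are hypothetical and do not appear in the publication, which does not specify the table schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RecordRow:
    """One row: the image data of a form and its recorded character strings."""
    form_id: str     # hypothetical identifier of the document form
    image_path: str  # hypothetical reference to the stored image data
    recorded_strings: Dict[str, str] = field(default_factory=dict)


class RecordTable:
    """Minimal stand-in for the recording table kept in database 4."""

    def __init__(self) -> None:
        self._rows: List[RecordRow] = []

    def register(self, row: RecordRow) -> None:
        # Corresponds to the operator registering a form via recording device 3.
        self._rows.append(row)

    def rows_for_form(self, form_id: str) -> List[RecordRow]:
        # Return every registered sheet of the given document form.
        return [r for r in self._rows if r.form_id == form_id]
```

Keying rows by a form identifier makes it easy to collect multiple sheets of the same form, which the later learning step relies on.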

<First Embodiment>
FIG. 6 is a first diagram showing the processing flow of the image processing apparatus according to the first embodiment.
Next, the processing flow of the image processing apparatus 1 will be described step by step.
First, the database 4 holds combinations of the image data of a certain document form and the specific recorded character strings written on that document form, recorded for multiple sheets of the same document form. For example, assume that the specific recorded character string information for the document form 5 shown in FIG. 4 has been recorded for multiple sheets of that same form. In this state, the operator starts the image processing apparatus 1 and instructs it to begin processing.

The acquisition unit 102 of the image processing apparatus 1 reads the image data of the document form 5 and the information on the recorded character strings corresponding to that image data from the database 4 (step S601). The acquisition unit 102 outputs the image data and the recorded character strings to the feature amount extraction unit 103. The feature amount extraction unit 103 performs OCR processing on the image data and detects all the character strings in the image data together with the coordinates in the image data that indicate the range of each character string (step S602). A character string is a group of characters made up of a plurality of characters. The feature amount extraction unit 103 analyzes the range of one such group based on, for example, the spacing from other characters, extracts the one or more characters included in that range as a character string, and detects the coordinates indicating the range of that character string in the image data. The characters included in a character string may include symbols such as ideograms and phonograms, marks, icon images, and the like.
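The grouping of characters into character strings by their spacing can be sketched as follows. The publication does not give the grouping rule, so this is a minimal sketch under assumptions: characters lie on a single text line, each has an axis-aligned box, and a hypothetical threshold `max_gap` separates one string from the next. `CharBox` and `group_into_strings` are illustrative names.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in image coordinates


@dataclass
class CharBox:
    char: str
    x0: float
    y0: float
    x1: float
    y1: float


def group_into_strings(chars: List[CharBox], max_gap: float) -> List[Tuple[str, Box]]:
    """Group characters on one text line into strings by horizontal gap."""
    groups: List[List[CharBox]] = []
    current: List[CharBox] = []
    for c in sorted(chars, key=lambda ch: ch.x0):
        # A gap wider than max_gap starts a new character string.
        if current and c.x0 - current[-1].x1 > max_gap:
            groups.append(current)
            current = []
        current.append(c)
    if current:
        groups.append(current)

    result: List[Tuple[str, Box]] = []
    for g in groups:
        text = "".join(c.char for c in g)
        # The range of the string is the box enclosing all of its characters.
        box = (min(c.x0 for c in g), min(c.y0 for c in g),
               max(c.x1 for c in g), max(c.y1 for c in g))
        result.append((text, box))
    return result
```

A real OCR engine would also handle multiple lines and vertical text; the sketch only shows how a spacing threshold yields both the string and its coordinate range.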

The feature amount extraction unit 103 compares the character strings extracted from the image data by OCR processing with the recorded character strings read from the database 4 together with the image data. From among the character strings extracted from the image data by OCR processing, the feature amount extraction unit 103 identifies each character string in the image data that matches the character information of a recorded character string, the attributes of the characters included in that character string, and the coordinates of its range (step S603). The character attributes are information expressed in terms of digits, alphabetic characters, hiragana, kanji, the number of characters, character height, font, and the like. The coordinates of the character string range are information indicating, for example, the coordinates of the first character and the coordinates of the last character in the character string. The feature amount extraction unit 103 generates feature amounts including the identified information for multiple sheets of the same document form 5 (step S604).
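Steps S603 and S604 can be illustrated with a short sketch that keeps only the OCR strings equal to a recorded character string and attaches coarse attributes to them. The function names, the dictionary layout, and the choice of attributes (character classes and length only, without height or font) are assumptions made for illustration; exact string equality is used as the matching rule.

```python
import unicodedata
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def char_class(ch: str) -> str:
    """Classify one character into a coarse attribute class."""
    if ch.isdigit():
        return "digit"
    if ch.isascii() and ch.isalpha():
        return "alphabet"
    name = unicodedata.name(ch, "")
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"
    return "other"


def extract_features(ocr_strings: List[Tuple[str, Box]],
                     recorded: List[str]) -> List[Dict]:
    """Build one feature record per OCR string that matches a recorded string."""
    wanted = set(recorded)
    features: List[Dict] = []
    for text, box in ocr_strings:
        if text in wanted:
            features.append({
                "text": text,
                "classes": sorted({char_class(c) for c in text}),
                "length": len(text),
                "box": box,
            })
    return features
```

Strings such as the form title are recognized by OCR but dropped here because the operator never recorded them.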

The feature amount extraction unit 103 generates a feature amount for each recorded character string in the document form 5. The reading target feature amount generation unit 104 acquires the feature amounts for the multiple sheets of the same document form 5, analyzes the character attributes and the coordinates indicating the character string range contained in the feature amounts corresponding to each recorded character string, and generates one feature amount for each. This analysis of the feature amounts is performed by processing such as machine learning. Machine learning is also referred to simply as learning.
A feature amount that the reading target feature amount generation unit 104 generates through analysis such as machine learning is called a first feature amount. That is, the reading target feature amount generation unit 104 uses multiple sheets of the same document form 5 to generate a first feature amount for each recorded character string in that document form 5 (step S605). The first feature amount is a feature amount for recognizing a recorded character string and includes the character attributes and the coordinates indicating the character string range. The reading target feature amount generation unit 104 records the first feature amount of each of the one or more recorded character strings in the document form 5 in the database 4 in association with the identifier of the document form 5 (step S606).
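The publication says only that the first feature amount is produced by "machine learning or the like" and does not name an algorithm. The sketch below therefore swaps in a plain statistical aggregation as a stand-in, not a trained model: it merges the per-sheet features of one recorded character string into an enclosing coordinate range, the union of observed character classes, and the observed length range. The function name and dictionary keys are hypothetical.

```python
from typing import Dict, List


def generate_first_feature(per_sheet: List[Dict]) -> Dict:
    """Merge the features of one recorded string across sheets of the same form.

    Each input dict is assumed to hold "box" (x0, y0, x1, y1),
    "classes" (list of character classes) and "length" (character count).
    """
    if not per_sheet:
        raise ValueError("at least one sheet is required")
    boxes = [f["box"] for f in per_sheet]
    classes = set()
    for f in per_sheet:
        classes.update(f["classes"])
    lengths = [f["length"] for f in per_sheet]
    return {
        # Smallest box that encloses the string on every observed sheet.
        "box": (min(b[0] for b in boxes), min(b[1] for b in boxes),
                max(b[2] for b in boxes), max(b[3] for b in boxes)),
        "classes": sorted(classes),
        "min_length": min(lengths),
        "max_length": max(lengths),
    }
```

The more sheets are aggregated, the more of the real variation in position and content the resulting range covers, which is the role the learned first feature amount plays in the text.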

For example, the reading target feature amount generation unit 104 records in the database 4, in association with the identifier of the document form 5, the first feature amounts indicating the character attributes, the coordinates of the character string range, and the like for each of the date 51, the supplier 52, the product name 53, the quantity 54, and the amount 55, which are the recorded character strings included in the document form 5.

Through the above processing, the image processing apparatus 1 can generate, by machine learning or the like, the information (first feature amounts) used to reduce the operator's labor of recording the recorded character strings, and can accumulate it in the database 4. The image processing apparatus 1 can then automatically record the recorded character strings in the database 4 based on the image data of a new document form. This processing is described below.

FIG. 7 is a second diagram showing the processing flow of the image processing apparatus according to the first embodiment.
The operator performs an operation to have the image reading device 2 read a new document form. The image reading device 2 then generates image data of the document form and outputs it to the image processing apparatus 1. The acquisition unit 102 of the image processing apparatus 1 acquires the image data (step S701). The acquisition unit 102 outputs the image data to the feature amount extraction unit 103. The feature amount extraction unit 103 performs OCR processing on the image data and detects the character strings, the features of the characters included in each character string, and the coordinates of the range of each character string in the image data (step S702). The feature amount extraction unit 103 generates a third feature amount including the detected information for each character string in the image data (step S703). That is, a third feature amount is information indicating the features of a character string included in the document form of the newly read image data. After that, the group identification unit 107 reads from the database 4 the one or more first feature amounts for the document form of the image data being processed (step S704). The group identification unit 107 outputs the third feature amounts and the first feature amounts to the recording unit 105.

The recording unit 105 acquires the third feature amounts for the one or more character strings in the image data and the one or more first feature amounts. Using the coordinates indicating the character string range included in each first feature amount, the recording unit 105 determines whether, for every first feature amount, there is a third feature amount whose coordinates correspond to the coordinates indicated by that first feature amount (step S705). If such a third feature amount exists for every first feature amount, characters are written in every item of the document form that corresponds to a recorded character string. If a third feature amount with corresponding coordinates is missing for any first feature amount, some item in the document form has no characters written in it.

If the result of step S705 is YES, the recording unit 105 determines whether the character attributes included in each first feature amount match the character attributes included in the corresponding third feature amount identified based on the coordinates (step S706).
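The two checks in steps S705 and S706 can be sketched as below. The publication does not define what "corresponding" coordinates or "matching" attributes mean precisely, so the sketch assumes that a third feature amount corresponds when its box lies inside the first feature amount's box within a hypothetical tolerance `tol`, and that attributes match when the observed character classes are a subset of the learned ones. All names are illustrative.

```python
from typing import Dict, List, Optional, Tuple


def box_within(inner, outer, tol: float = 0.0) -> bool:
    """True if the inner box lies inside the outer box, allowing a tolerance."""
    return (inner[0] >= outer[0] - tol and inner[1] >= outer[1] - tol
            and inner[2] <= outer[2] + tol and inner[3] <= outer[3] + tol)


def find_corresponding(first: Dict, thirds: List[Dict], tol: float) -> Optional[Dict]:
    """Return the first third-feature whose box corresponds to this first-feature."""
    for third in thirds:
        if box_within(third["box"], first["box"], tol):
            return third
    return None


def check_form(firsts: List[Dict], thirds: List[Dict],
               tol: float = 2.0) -> Tuple[str, List[Dict]]:
    """Run the S705 (coordinates) and S706 (attributes) checks in order."""
    pairs = []
    for first in firsts:
        third = find_corresponding(first, thirds, tol)
        if third is None:
            return "S705_NO", []  # some item has no characters written in it
        pairs.append((first, third))
    for first, third in pairs:
        if not set(third["classes"]) <= set(first["classes"]):
            return "S706_NO", []  # character attributes do not match
    return "OK", [third for _, third in pairs]
```

Only when the result is "OK" would the flow proceed to the confirmation screen; either NO result leads to the manual input path.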

If the determination in step S706 is YES and the character attributes match, the recording unit 105 generates a confirmation screen in which a rectangular frame is displayed over the range of each recorded character string in the image data being processed, based on the coordinates indicated by the one or more third feature amounts. The recording unit 105 outputs the confirmation screen to the monitor (step S707). By checking the rectangular areas displayed on the confirmation screen, the operator can confirm which recorded character strings the image processing apparatus 1 is about to record, and can therefore check whether any recorded character string is missing. The confirmation screen displays icon images of an OK button and an NG button. By selecting the OK button, the operator indicates that nothing is missing from the selection of recorded character strings. By selecting the NG button, the operator indicates that something is missing from the selection.

The reason for outputting the confirmation screen to the monitor is explained with reference to FIG. 4. In FIG. 4, six product names 53 are entered among the recorded character strings. If six product names 53 was also the maximum in past document forms, then for a new document form the product names 53 are automatically determined to be recorded character strings only within the range of one to six entries. Therefore, if a new form lists seven product names 53, for example, steps S705 and S706 both result in YES for the first through sixth entries, and the image processing apparatus 1 would finish without recording the seventh character string. To avoid this, the image processing apparatus 1 displays the confirmation screen in step S707 before recording the recorded character strings and asks the operator whether it may record them and finish.

In response to the operator pressing one of the button icon images, the recording unit 105 determines whether anything is missing from the selection of recorded character strings (step S708). If nothing is missing, the recording unit 105 records the character strings included in the third feature amounts in the recording table in association with the identification information of the document form (step S709).

For example, assume that a third feature amount a3, a third feature amount b3, a third feature amount c3, and a third feature amount d3 have been acquired from the image data of the document form. Assume further that the third feature amount a3 matches the first feature amount a1 recorded in the database in advance, the third feature amount b3 matches the first feature amount b1, the third feature amount c3 matches the first feature amount c1, and the third feature amount d3 matches the first feature amount d1. In this case, the recording unit 105 records the character strings included in the third feature amounts a3, b3, c3, and d3 in the recording table of the document form.

上述のステップＳ７０５でＮＯの場合、またはステップＳ７０６でＮＯの場合、またはステップＳ７０８でＮＯの場合、記録部１０５は、第一特徴量が示す当該座標に対応する座標を有する第三特徴量が存在しなかった場合の処理を行う。具体的には記録部１０５は、画像データ中の対応する座標の第三特徴量が存在しなかった第一特徴量の座標の範囲に入力欄を設けた帳票画像の入力用画像データを生成してモニタに出力する（ステップＳ７１０）。入力用画像データはＨＴＭＬやＸＭＬなどのマークアップ言語で記述されたデータであってよい。作業者はこの入力用画像データを見ながら、画像処理装置１のキーボード等の入力装置を操作して、モニタに表示されている入力用画像データ内の入力欄に記録文字列を入力する。当該入力用画像データには保存ボタンが表示されており、保存ボタンの押下操作をすると記録部１０５は既に文書帳票について取得した第三特徴量の他、新たに入力用画像データの入力欄に入力された文字列を含む第三特徴量を生成する（ステップＳ７１１）。記録部１０５は帳票画像データの識別子と入力欄に入力された文字列とを対応付けてデータベース４に記録する。画像処理装置１は図６で示した処理フローを再度実施することにより第一特徴量が更新され、自動的に記録できる文字列の範囲を拡張することができる。これにより、次に同じ文書帳票を処理したときには、自動的に文字列を記録できるようになり、作業者が文字列を入力する手間を省くことができる。記録部１０５は、全ての第三特徴量それぞれに含まれる文字列を、文書帳票の記録テーブルに記録する（ステップＳ７１２）。 If the result is NO in step S705, step S706, or step S708 described above, the recording unit 105 performs the processing for the case where no third feature amount has coordinates corresponding to the coordinates indicated by a first feature amount. Specifically, the recording unit 105 generates input image data of the form image in which an input field is provided at the coordinate range of each first feature amount for which no third feature amount with corresponding coordinates exists in the image data, and outputs it to the monitor (step S710). The input image data may be data described in a markup language such as HTML or XML. While viewing this input image data, the operator operates an input device of the image processing apparatus 1, such as a keyboard, and enters the recorded character string in the input field of the input image data displayed on the monitor. A save button is displayed in the input image data; when the save button is pressed, the recording unit 105 generates, in addition to the third feature amounts already acquired for the document form, a third feature amount that includes the character string newly entered in the input field (step S711). The recording unit 105 records the identifier of the form image data and the character string entered in the input field in the database 4 in association with each other. By executing the processing flow shown in FIG. 6 again, the image processing apparatus 1 updates the first feature amounts and can expand the range of character strings that it can record automatically. As a result, the next time the same document form is processed, the character string can be recorded automatically, saving the operator the effort of entering it. The recording unit 105 records the character strings included in all of the third feature amounts in the record table of the document form (step S712).
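As an illustration of step S710, the following is a minimal sketch of how input image data with input fields could be emitted as HTML. The function name `build_input_form`, the field names, and the use of absolutely positioned `<input>` elements over the form image are assumptions made for this example; the description only states that the data may be written in a markup language such as HTML or XML.

```python
from html import escape


def build_input_form(image_url, missing_fields):
    """Build an HTML form that overlays input fields on a form image.

    missing_fields maps a hypothetical field name to the coordinate range
    (x0, y0, x1, y1), in pixels, of a first feature amount for which no
    third feature amount was found.
    """
    parts = [
        '<form method="post" style="position:relative">',
        f'<img src="{escape(image_url, quote=True)}" alt="form image">',
    ]
    for name, (x0, y0, x1, y1) in missing_fields.items():
        # One input field per missing item, placed at the expected range.
        parts.append(
            f'<input name="{escape(name, quote=True)}" '
            f'style="position:absolute;left:{x0}px;top:{y0}px;'
            f'width:{x1 - x0}px;height:{y1 - y0}px">'
        )
    parts.append('<button type="submit">Save</button>')
    parts.append('</form>')
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_input_form("form.png", {"quantity": (100, 200, 160, 220)}))
```

The save button here stands in for the one described above; how the submitted values are turned into third feature amounts in step S711 is outside this sketch.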

このような処理によれば、画像処理装置１は予め作業者が記録しておいた文書帳票の画像データと記録文字列によって、新たに入力させた文書帳票の画像データにおける記録文字列を自動的に記録することができる。したがって画像処理装置１は文書帳票における記録文字列の記録の作業者の労力を軽減することができる。 With this processing, the image processing apparatus 1 can automatically record the recorded character strings in the image data of a newly input document form, based on the image data of document forms and the recorded character strings that the operator recorded in advance. The image processing apparatus 1 can therefore reduce the operator's effort in recording the recorded character strings of document forms.
また文書帳票に記録文字列が記載されていない場合でも、本来、記載されているべき記録文字列に対応する記載事項が記載されていない場合には画像処理装置１は入力用画像データを出力する。これにより文書帳票において記載すべき記載事項に対して入力していない誤りが見つかると共に、その記載事項が示す記録文字列を容易に記録することができる。 Furthermore, even when a recorded character string is missing from the document form, that is, when an item that should contain a recorded character string has been left blank, the image processing apparatus 1 outputs the input image data. This makes it possible to detect the error of a required item having been left blank, and to record the recorded character string for that item easily.

＜第二実施形態＞ <Second Embodiment>
図８は第二実施形態による画像処理装置の機能ブロック図である。 FIG. 8 is a functional block diagram of the image processing apparatus according to the second embodiment.
この図が示すように第二実施形態による画像処理装置１は、図３で示した各機能部に加え、さらにグループ分類部１０６、グループ特定部１０７の機能を有する。画像処理装置１のハードウェア構成は図２で示した構成と同様である。 As shown in this figure, the image processing apparatus 1 according to the second embodiment has the functions of a group classification unit 106 and a group specifying unit 107 in addition to the functional units shown in FIG. 3. The hardware configuration of the image processing apparatus 1 is the same as the configuration shown in FIG. 2.

図９は第二実施形態による画像処理装置の処理フローを示す第一の図である。 FIG. 9 is a first diagram showing the processing flow of the image processing apparatus according to the second embodiment.
次に第二実施形態による画像処理装置１の処理フローについて順を追って説明する。データベース４には異なる複数の文書帳票についての画像データと、各文書帳票に記述されている特定の記録文字列の組み合わせが、その文書帳票ごとに多数記録されている。例えば図４で示す異なる文書帳票５それぞれについての特定の記録文字列情報が複数枚分記録されているとする。このような状態で作業者が画像処理装置１を起動し、当該画像処理装置１へ処理開始を指示する。 Next, the processing flow of the image processing apparatus 1 according to the second embodiment will be explained step by step. In the database 4, a large number of combinations of image data of a plurality of different document forms and the specific recorded character strings written in each document form are recorded for each document form. For example, assume that specific recorded character string information for multiple sheets is recorded for each of the different document forms 5 shown in FIG. 4. In this state, the operator starts the image processing apparatus 1 and instructs it to begin processing.

画像処理装置１の取得部１０２はデータベース４から文書帳票５の画像データとその画像データに対応する記録文字列の情報を全て読み込んだかを判定する（ステップＳ９０１）。ＮＯの場合、取得部１０２はデータベース４から文書帳票５の画像データとその画像データに対応する記録文字列の情報を読み取る（ステップＳ９０２）。取得部１０２は画像データと記録文字列とを特徴量抽出部１０３へ出力する。特徴量抽出部１０３は画像データをＯＣＲ処理して画像データ中の全ての文字列とその画像データ内の座標を検出する（ステップＳ９０３）。なお文字列は複数の文字によって構成される文字の纏まりである。特徴量抽出部１０３は他の文字との間隔などによってその１つの纏まりの範囲を解析し、その範囲に含まれる１つまたは複数の文字を文字列として抽出すると共に、その画像データ内の文字列の範囲を示す座標を検出する。文字列として含まれる文字は、表意文字、表音文字などの記号、マーク、アイコン画像などを含んでよい。 The acquisition unit 102 of the image processing apparatus 1 determines whether all of the image data of the document forms 5 and the recorded character string information corresponding to that image data have been read from the database 4 (step S901). If NO, the acquisition unit 102 reads the image data of a document form 5 and the recorded character string information corresponding to that image data from the database 4 (step S902). The acquisition unit 102 outputs the image data and the recorded character strings to the feature amount extraction unit 103. The feature amount extraction unit 103 performs OCR processing on the image data and detects all character strings in the image data together with their coordinates in the image data (step S903). A character string is a group of characters made up of a plurality of characters. The feature amount extraction unit 103 analyzes the extent of each such group based on, for example, the spacing from other characters, extracts the one or more characters contained in that extent as a character string, and detects the coordinates indicating the range of that character string in the image data. The characters included in a character string may include symbols such as ideograms and phonograms, marks, icon images, and the like.

特徴量抽出部１０３はＯＣＲ処理により画像データから抽出した文字列と、画像データと共にデータベース４から読み取った記録文字列とを比較する。特徴量抽出部１０３はＯＣＲ処理により画像データから抽出した文字列のうち、記録文字列の文字情報と一致した画像データ中の文字列と、その文字列に含まれる文字の属性と、その範囲の座標とを特定する（ステップＳ９０４）。文字の属性は、数字、アルファベット、ひらがな、漢字、文字数、文字高さ、フォントなどにより表される情報である。また文字列の範囲の座標には文字列に含まれる先頭文字の座標、終了文字の座標などを示す情報である。特徴量抽出部１０３はそれら特定した情報を含む特徴量を１つの文書帳票５について生成する（ステップＳ９０５）。 The feature quantity extraction unit 103 compares the character string extracted from the image data by OCR processing with the recorded character string read from the database 4 together with the image data. Among the character strings extracted from the image data by OCR processing, the feature amount extraction unit 103 extracts character strings in the image data that match character information of the recorded character strings, attributes of the characters included in the character strings, and their range. coordinates are identified (step S904). Character attributes are information represented by numbers, alphabets, hiragana, kanji, number of characters, character height, font, and the like. The coordinates of the character string range are information indicating the coordinates of the leading character and the coordinates of the ending character included in the character string. The feature quantity extraction unit 103 generates a feature quantity including the specified information for one document form 5 (step S905).

特徴量は特徴量抽出部１０３によって文書帳票５における記録文字列ごとに生成される。すなわち、特徴量抽出部１０３は、文書帳票毎かつ記録文字列毎に特徴量を生成する。この文書帳票毎かつ記録文字列毎の特徴量を個別第一特徴量と称する。個別第一特徴量は、文字の属性、文字列の範囲を示す座標の何れか一方または両方を含んでいてもよい。特徴量抽出部１０３は個々の文書帳票５における１つ又は複数の記録文字列それぞれの個別第一特徴量を、文書帳票５の識別子および記録文字列の識別子に紐づけてデータベース４に記録する（ステップＳ９０６）。記録文字列の識別子として、例えばその記録文字列の位置を示す座標値を用いることができる。 A feature amount is generated for each recorded character string in the document form 5 by the feature amount extraction unit 103 . That is, the feature quantity extraction unit 103 generates a feature quantity for each document form and for each recorded character string. The feature amount for each document form and for each recorded character string is called an individual first feature amount. The first individual feature amount may include either one or both of the character attribute and the coordinates indicating the range of the character string. The feature amount extraction unit 103 records the individual first feature amount of each of one or more recorded character strings in each document form 5 in the database 4 in association with the identifier of the document form 5 and the identifier of the recorded character string ( step S906). As the identifier of the recorded character string, for example, a coordinate value indicating the position of the recorded character string can be used.

例えば特徴量抽出部１０３は、文書帳票５に含まれる記録文字列である日付５１、発注先５２、商品名５３、数量５４、金額５５それぞれの、文字属性、文字列の範囲を示す座標などを示す各個別第一特徴量を、文書帳票５の識別子および記録文字列の識別子に紐づけてデータベース４に記録する。 For example, the feature amount extraction unit 103 extracts the character attributes of each of the date 51, orderer 52, product name 53, quantity 54, and amount 55, which are recorded character strings included in the document form 5, coordinates indicating the character string range, and the like. Each individual first feature quantity shown is recorded in the database 4 in association with the identifier of the document form 5 and the identifier of the recording character string.

特徴量抽出部１０３はまた、記録文字列に含まれる文字情報と一致しない画像データ中の非記録文字列と、その非記録文字列に含まれる文字の属性と、その範囲の座標とを特定する（ステップＳ９０７）。特徴量抽出部１０３はそれら特定した情報を含む特徴量を文書帳票５について生成する（ステップＳ９０８）。 The feature quantity extraction unit 103 also identifies non-recorded character strings in the image data that do not match the character information included in the recorded character strings, attributes of the characters included in the non-recorded character strings, and coordinates of the range. (Step S907). The feature quantity extraction unit 103 generates a feature quantity including the specified information for the document form 5 (step S908).

特徴量は特徴量抽出部１０３によって文書帳票５における非記録文字列ごとに生成される。すなわち、特徴量抽出部１０３は、文書帳票毎かつ非記録文字列毎に特徴量を生成する。この文書帳票毎かつ非記録文字列毎の特徴量を個別第二特徴量と称する。個別第二特徴量は、文字の属性、文字列の範囲を示す座標の何れか一方または両方を含んでいてもよい。特徴量抽出部１０３は個々の文書帳票５における１つ又は複数の非記録文字列それぞれの個別第二特徴量を、文書帳票５の識別子および非記録文字列の識別子に紐づけてデータベース４に記録する（ステップＳ９０９）。非記録文字列の識別子として、例えばその記録文字列の位置を示す座標値を用いることができる。 A feature amount is generated for each non-recorded character string in the document form 5 by the feature amount extraction unit 103 . That is, the feature quantity extraction unit 103 generates a feature quantity for each document form and for each non-recorded character string. The feature amount for each document form and for each non-recorded character string is called an individual second feature amount. The individual second feature amount may include either one or both of the character attribute and the coordinates indicating the range of the character string. The feature amount extraction unit 103 records the individual second feature amount of each of one or more non-recorded character strings in each document form 5 in the database 4 in association with the identifier of the document form 5 and the identifier of the non-recorded character string. (step S909). As the identifier of the non-recorded character string, for example, a coordinate value indicating the position of the recorded character string can be used.
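The split into individual first and second feature amounts (steps S904 to S909) can be sketched as follows. This is only an illustration under stated assumptions: the `OcrString` type, the attribute set in `char_attributes`, and the exact-text comparison are hypothetical choices, and the bounding box is used as the string identifier because the description mentions a coordinate value as one possible identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrString:
    """One character string found by OCR (hypothetical structure)."""
    text: str
    box: tuple  # (x0, y0, x1, y1) in image pixels


def char_attributes(text):
    # A small, assumed attribute set: length plus digit and letter counts.
    return {
        "length": len(text),
        "digits": sum(c.isdigit() for c in text),
        "letters": sum(c.isalpha() for c in text),
    }


def split_features(form_id, ocr_strings, recorded_strings):
    """Return (individual first features, individual second features)."""
    wanted = set(recorded_strings)
    first, second = [], []
    for s in ocr_strings:
        feature = {
            "form_id": form_id,
            "string_id": s.box,  # coordinate value used as the identifier
            "text": s.text,
            "attrs": char_attributes(s.text),
            "box": s.box,
        }
        # Strings matching a recorded string go to the first features,
        # all other strings to the second features.
        (first if s.text in wanted else second).append(feature)
    return first, second
```

For example, calling `split_features("form-1", ocr, ["2017-06-22"])` on OCR output containing a date and a title would place the date in the first list and the title in the second.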

例えば特徴量抽出部１０３は、文書帳票５に含まれる非記録文字列である発注者の名称５０１、発注者のエンブレム画像、文書帳票のタイトル５０３、挨拶文５０４などを示す各個別第二特徴量を、文書帳票５の識別子および記録文字列の識別子に紐づけてデータベース４に記録する。 For example, the feature amount extraction unit 103 extracts each individual second feature amount representing the name of the orderer 501, the emblem image of the orderer, the title 503 of the document form, the greeting sentence 504, etc., which are non-recorded character strings included in the document form 5. is recorded in the database 4 in association with the identifier of the document form 5 and the identifier of the recording character string.

データベース４には異なる複数の文書帳票５の画像データとその画像データに対応する記録文字列の情報が記録されている。画像処理装置１の取得部１０２は全ての文書帳票５についての画像データと記録文字列の情報を読み込むまでステップＳ９０１～ステップＳ９１１の処理を繰り返す。そしてステップＳ９０１において全て読み込んだと判定したとする（ステップＳ９０１：ＹＥＳ）。 Image data of a plurality of different document forms 5 and information of recorded character strings corresponding to the image data are recorded in the database 4 . The acquiring unit 102 of the image processing apparatus 1 repeats the processing of steps S901 to S911 until the image data and recorded character string information for all the document forms 5 are read. Then, it is assumed that it is determined that all have been read in step S901 (step S901: YES).

その場合、グループ分類部１０６が文書帳票５の画像データに含まれる個別第二特徴量に基づいて、文書帳票５をグループ分けする（ステップＳ９１２）。例えばグループ分類部１０６は、各文書帳票５を、個別第二特徴量が示す非記録文字列の一致度や、エンブレム画像の一致度、非記録文字列の座標範囲の一致度などに基づいてグループ分けする。グループ分類部１０６はこのグループ分けの処理において文書帳票５のグループ識別子を決定する。グループ分類部１０６は全ての文書帳票５についてグループ分けが終了したかを判定する（ステップＳ９１３）。 In this case, the grouping unit 106 groups the document forms 5 based on the individual second feature amount included in the image data of the document forms 5 (step S912). For example, the grouping unit 106 groups each document form 5 based on the degree of matching of non-recorded character strings indicated by the individual second feature amount, the degree of matching of emblem images, the degree of matching of coordinate ranges of non-recorded character strings, and the like. Divide. The grouping unit 106 determines the group identifier of the document form 5 in this grouping process. The grouping unit 106 determines whether all the document forms 5 have been grouped (step S913).

グループ分類部１０６は全ての文書帳票５のグループ分けが完了していない場合にはステップＳ９１２の処理を繰り返す。グループ分類部１０６は、全ての文書帳票５のグループ分けが完了した場合には、文書帳票５の識別子とその文書帳票５に付与されたグループ識別子とを対応付けてデータベース４のグループテーブルに記録する（ステップＳ９１４）。 The grouping unit 106 repeats the process of step S912 if grouping of all the document forms 5 has not been completed. When grouping of all the document forms 5 is completed, the group classification unit 106 associates the identifier of the document form 5 with the group identifier given to the document form 5 and records them in the group table of the database 4. (Step S914).
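One way to picture the grouping of steps S912 to S914 is a greedy grouping by the overlap of non-recorded character strings. The sketch below is an assumption-laden illustration: the Jaccard similarity, the threshold of 0.5, and the function names are choices made for this example, whereas the description only says that grouping is based on the degree of matching of non-recorded strings, emblem images, coordinate ranges, and the like.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of strings."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_forms(forms, threshold=0.5):
    """Assign a group identifier to each form.

    forms maps a form identifier to the non-recorded strings found on
    that form. Returns a dict mapping form identifier to group identifier.
    """
    representatives = []  # (group_id, strings of the first form in the group)
    assignment = {}
    for form_id, strings in forms.items():
        strings = set(strings)
        for group_id, rep in representatives:
            if jaccard(strings, rep) >= threshold:
                assignment[form_id] = group_id
                break
        else:
            # No existing group is similar enough, so start a new one.
            group_id = len(representatives)
            representatives.append((group_id, strings))
            assignment[form_id] = group_id
    return assignment
```

The returned mapping corresponds to the association of form identifiers and group identifiers that is recorded in the group table.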

そして読取対象特徴量生成部１０４はあるグループに属する１つまたは複数の文書帳票５の各個別第一特徴量および各個別第二特徴量をデータベース４から読み取り、グループに属する文書帳票５の各個別第一特徴量および各個別第二特徴量に対応する各グループ第一特徴量、各グループ第二特徴量を生成する（ステップＳ９１５）。各グループ第一特徴量はグループに属する文書帳票５の各個別第一特徴量の平均等の値であってもよい。同様に各グループ第二特徴量はグループに属する文書帳票５の各個別第二特徴量の平均等の値であってもよい。各グループ第一特徴量、各グループ第二特徴量は、各個別第一特徴量の平均、各個別第二特徴量の平均でなくとも、所定の統計処理や機械学習等の手法を用いて、グループに属する１つ又は複数の文書帳票５の記録文字列や非記録文字列を特定できるよう算出された特徴量であれば、どのようなもの手法を用いて、各グループ第一特徴量、各グループ第二特徴量を生成してもよい。読取対象特徴量生成部１０４は、グループそれぞれについて各グループ第一特徴量、各グループ第二特徴量を算出し、グループの識別子に対応付けてデータベース４に記録する（ステップＳ９１６）。 Then, the reading target feature amount generation unit 104 reads from the database 4 the individual first feature amounts and the individual second feature amounts of the one or more document forms 5 belonging to a given group, and generates the group first feature amounts and group second feature amounts corresponding to the individual first feature amounts and individual second feature amounts of the document forms 5 belonging to that group (step S915). Each group first feature amount may be a value such as the average of the corresponding individual first feature amounts of the document forms 5 belonging to the group. Similarly, each group second feature amount may be a value such as the average of the corresponding individual second feature amounts of the document forms 5 belonging to the group. The group first feature amounts and group second feature amounts need not be such averages; they may be generated by any method, such as predetermined statistical processing or machine learning, as long as the resulting feature amounts are calculated so that the recorded character strings and non-recorded character strings of the one or more document forms 5 belonging to the group can be identified. The reading target feature amount generation unit 104 calculates the group first feature amounts and group second feature amounts for each group and records them in the database 4 in association with the group identifier (step S916).
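The averaging option mentioned for step S915 can be sketched as below for the coordinate part of the feature amounts. The field names (such as "date") and the dictionary layout are hypothetical, and only the simple mean is shown; the description explicitly allows other statistical or machine learning methods.

```python
from statistics import mean


def average_boxes(individual_features):
    """Average the coordinate ranges of each field over a group.

    individual_features is a list with one dict per document form in the
    group, each mapping a hypothetical field name to (x0, y0, x1, y1).
    Returns a dict mapping each field name to the averaged range.
    """
    fields = set().union(*individual_features)
    group = {}
    for field in sorted(fields):
        # Only forms on which the field was found contribute to the mean.
        boxes = [f[field] for f in individual_features if field in f]
        group[field] = tuple(mean(coords) for coords in zip(*boxes))
    return group
```

Averaging the ranges `(10, 10, 50, 20)` and `(12, 10, 54, 20)` of a "date" field, for instance, yields a group range of `(11, 10, 52, 20)`.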

以上の処理により画像処理装置１は、作業者の記録文字列を記録する労力を軽減するために必要な情報を文書帳票のグループ毎に生成してデータベース４に蓄積することができる。これにより画像処理装置１は新たな文書帳票についての画像データに基づいて記録文字列を自動でデータベース４に記録していくことができる。以下、その処理について説明する。 Through the above-described processing, the image processing apparatus 1 can generate information necessary for each group of document forms and store it in the database 4 in order to reduce the labor of the operator to record the record character string. As a result, the image processing apparatus 1 can automatically record the record character string in the database 4 based on the image data of the new document form. The processing will be described below.

図１０は第二実施形態による画像処理装置の処理フローを示す第二の図である。 FIG. 10 is a second diagram showing the processing flow of the image processing apparatus according to the second embodiment.
作業者は新たな文書帳票を画像読取装置２に読み取らせる操作を行う。これにより画像読取装置２は文書帳票の画像データを生成して画像処理装置１へ出力する。画像処理装置１の取得部１０２は画像データを取得する（ステップＳ１００１）。取得部１０２は画像データを特徴量抽出部１０３へ出力する。特徴量抽出部１０３は画像データをＯＣＲ処理して、文字列と、文字列に含まれる文字の特徴と、その文字列の範囲の画像データ中の座標を検出する（ステップＳ１００２）。特徴量抽出部１０３はそれら検出した情報を含む第三特徴量を、画像データ中の文字列ごとに生成する（ステップＳ１００３）。第三特徴量は新たに読み込んだ画像データの文書帳票に含まれる文字列の特徴を示す情報である。 The operator performs an operation to have the image reading device 2 read a new document form. The image reading device 2 then generates image data of the document form and outputs it to the image processing apparatus 1. The acquisition unit 102 of the image processing apparatus 1 acquires the image data (step S1001). The acquisition unit 102 outputs the image data to the feature amount extraction unit 103. The feature amount extraction unit 103 performs OCR processing on the image data and detects each character string, the features of the characters included in the character string, and the coordinates of the range of the character string in the image data (step S1002). The feature amount extraction unit 103 generates a third feature amount including the detected information for each character string in the image data (step S1003). A third feature amount is information indicating the features of a character string included in the document form of the newly read image data.

次にグループ特定部１０７が、データベース４からあるグループ第二特徴量のうち新たな文書帳票のグループ特定に利用するグループ第二特徴量を読み取る。当該グループ第二特徴量は例えば文書帳票の画像データに表示される発注者のエンブレム画像５０２に対応する特徴量であってよい。グループ特定部１０７はあるグループ第二特徴量に示す情報が、ステップＳ１００１で取得した文書帳票の画像データから特定できるかどうかを判定する。グループ特定部１０７は全てのグループについてのグループ第二特徴量を用いて同様の処理を行う。グループ特定部１０７はデータベース４から読み取ったグループ第二特徴量に一致する情報が新たに読み込んだ文書帳票の画像データから特定できた場合、そのグループ第二特徴量を有するグループを、新たに読み込んだ文書帳票の画像データのグループと特定する（ステップＳ１００４）。その後、グループ特定部１０７はデータベース４からそのグループについての１つまたは複数のグループ第一特徴量を読み出す（ステップＳ１００５）。グループ特定部１０７は記録部１０５へ第三特徴量と１つまたは複数のグループ第一特徴量を出力する。グループ第一特徴量はそのグループに属する文書帳票内の１つまたは複数の記録文字列を特定するための特徴量である。 Next, the group specifying unit 107 reads from the database 4, among the group second feature amounts, the group second feature amount to be used for identifying the group of the new document form. This group second feature amount may be, for example, a feature amount corresponding to the orderer's emblem image 502 shown in the image data of the document form. The group specifying unit 107 determines whether the information indicated by a given group second feature amount can be identified in the image data of the document form acquired in step S1001. The group specifying unit 107 performs the same processing using the group second feature amounts of all groups. If information matching a group second feature amount read from the database 4 can be identified in the image data of the newly read document form, the group specifying unit 107 identifies the group having that group second feature amount as the group of the image data of the newly read document form (step S1004). After that, the group specifying unit 107 reads one or more group first feature amounts for that group from the database 4 (step S1005). The group specifying unit 107 outputs the third feature amounts and the one or more group first feature amounts to the recording unit 105. A group first feature amount is a feature amount for identifying one or more recorded character strings in the document forms belonging to that group.

記録部１０５は画像データ中の１つまたは複数の文字列についての第三特徴量と、１つまたは複数のグループ第一特徴量とを取得する。記録部１０５は各グループ第一特徴量に含まれる文字列の範囲を示す座標を用いて、各グループ第一特徴量が示す当該座標に対応する座標を有する第三特徴量が全て存在するかを判定する（ステップＳ１００６）。各グループ第一特徴量の座標に対応する座標を有する第三特徴量が全て存在する場合には、記録文字列に対応する文書帳票内の全ての記載事項に文字の記載が存在する。一方、各グループ第一特徴量の座標に対応する座標を有する第三特徴量が全て存在しない場合には、文書帳票内の何れかの記載事項に文字の記載が無い状態である。 The recording unit 105 acquires the third feature amounts for the one or more character strings in the image data and the one or more group first feature amounts. Using the coordinates indicating the range of the character string included in each group first feature amount, the recording unit 105 determines whether, for every group first feature amount, a third feature amount having coordinates corresponding to those coordinates exists (step S1006). If such a third feature amount exists for every group first feature amount, characters are written in all of the items in the document form that correspond to recorded character strings. If, on the other hand, such a third feature amount is missing for at least one group first feature amount, at least one item in the document form has no characters written in it.

ステップＳ１００６でＹＥＳの場合、記録部１０５は、グループ第一特徴量に含まれる文字属性と、座標に基づいて特定された対応する第三特徴量に含まれる文字属性がそれぞれ一致するかどうかを判定する（ステップＳ１００７）。 In the case of YES in step S1006, the recording unit 105 determines whether or not the character attribute included in the group first feature amount matches the character attribute included in the corresponding third feature amount specified based on the coordinates. (step S1007).
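The two checks of steps S1006 and S1007 can be sketched as follows. The notion of "corresponding coordinates" is not defined numerically in the description, so the overlap ratio, the 0.5 threshold, and the dictionary layout of the feature amounts are assumptions made purely for illustration.

```python
def overlap_ratio(a, b):
    """Intersection area divided by the area of box a.

    Boxes are (x0, y0, x1, y1) tuples.
    """
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    return (w * h) / area_a if area_a > 0 else 0.0


def match_features(group_first, third, min_overlap=0.5):
    """Return (matched, ok).

    group_first maps a hypothetical field name to {"box", "attrs"}.
    third is a list of {"text", "box", "attrs"} dicts from the new form.
    ok is False as soon as a coordinate check (S1006) or an attribute
    check (S1007) fails.
    """
    matched = {}
    for field, g in group_first.items():
        candidates = [
            t for t in third
            if overlap_ratio(g["box"], t["box"]) >= min_overlap
        ]
        if not candidates:  # corresponds to NO in step S1006
            return matched, False
        best = max(candidates, key=lambda t: overlap_ratio(g["box"], t["box"]))
        if best["attrs"] != g["attrs"]:  # corresponds to NO in step S1007
            return matched, False
        matched[field] = best
    return matched, True
```

When `ok` is true, the matched third feature amounts are the ones whose ranges would be framed on the confirmation screen; when it is false, the flow would fall through to the input screen.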

記録部１０５は、ステップＳ１００７の判定結果がＹＥＳとなり文字属性が一致する場合、現在処理している画像データにおいて１つまたは複数の第三特徴量が示す座標に基づく記録文字列の範囲に矩形枠を表示した確認画面を生成する。記録部１０５はその確認画面をモニタに出力する（ステップＳ１００８）。作業者はこの確認画面に表示された矩形領域を確認して、画像処理装置１が記録しようとする記録文字列を確認することができる。これにより作業者は記録文字列に不足が無いかを確認することができる。確認画面にはＯＫまたはＮＧの何れかのボタンのアイコン画像が表示されている。このボタンのアイコン画像のうちＯＫのボタンを選択することにより作業者は記録文字列としての選択に不足がないことを指示することができる。他方、ボタンのアイコン画像のうちＮＧのボタンを選択することにより作業者は記録文字列としての選択に不足があることを指示することができる。 If the determination result in step S1007 is YES and the character attributes match, the recording unit 105 generates a confirmation screen in which rectangular frames are displayed on the ranges of the recorded character strings, based on the coordinates indicated by the one or more third feature amounts, in the image data currently being processed. The recording unit 105 outputs the confirmation screen to the monitor (step S1008). By checking the rectangular areas displayed on this confirmation screen, the operator can see which recorded character strings the image processing apparatus 1 is about to record, and can thus check whether any recorded character string is missing. Icon images of an OK button and an NG button are displayed on the confirmation screen. By selecting the OK button, the operator can indicate that nothing is missing from the selection of recorded character strings. By selecting the NG button, the operator can indicate that something is missing from the selection.

記録部１０５は作業者のボタンのアイコン画像の押下に応じて、記録文字列の選択に不足が無いかを判定する（ステップＳ１００９）。記録部１０５は不足が無い場合には、第三特徴量に含まれる文字列を、文書帳票の識別情報に対応付けて記録テーブルに記録する（ステップＳ１０１０）。 The recording unit 105 determines whether or not the selected character strings to be recorded are insufficient in response to the pressing of the icon image of the button by the operator (step S1009). If there is no shortage, the recording unit 105 records the character string included in the third feature quantity in the recording table in association with the identification information of the document form (step S1010).

例えば、文書帳票の画像データ中から第三特徴量ａ３、第三特徴量ｂ３、第三特徴量ｃ３、第三特徴量ｄ３が取得できたとする。そして第三特徴量ａ３が予めデータベースに記録されているグループ第一特徴量ｇ１１と、第三特徴量ｂ３がグループ第一特徴量ｇ１２と、第三特徴量ｃ３がグループ第一特徴量ｇ１３と、第三特徴量ｄ３がグループ第一特徴量ｇ１４とそれぞれ特徴量が一致したとする。この場合、記録部１０５は、第三特徴量ａ３、第三特徴量ｂ３、第三特徴量ｃ３、第三特徴量ｄ３それぞれに含まれる文字列を、文書帳票の記録テーブルに記録する。 For example, assume that a third feature amount a3, a third feature amount b3, a third feature amount c3, and a third feature amount d3 have been acquired from the image data of the document form. Then, the third feature amount a3 is the group first feature amount g11 recorded in advance in the database, the third feature amount b3 is the group first feature amount g12, the third feature amount c3 is the group first feature amount g13, Assume that the third feature amount d3 matches the group first feature amount g14. In this case, the recording unit 105 records the character strings included in the third feature amount a3, the third feature amount b3, the third feature amount c3, and the third feature amount d3 in the record table of the document form.

上述のステップＳ１００６でＮＯの場合、またはステップＳ１００７でＮＯの場合、またはステップＳ１００９でＮＯの場合、記録部１０５は、グループ第一特徴量が示す当該座標に対応する座標を有する第三特徴量が存在しなかった場合の処理を行う。具体的には記録部１０５は、画像データ中の対応する座標の第三特徴量が存在しなかったグループ第一特徴量の座標の範囲に入力欄を設けた帳票画像の入力用画像データを生成してモニタに出力する（ステップＳ１０１１）。入力用画像データはＨＴＭＬやＸＭＬなどのマークアップ言語で記述されたデータであってよい。作業者はこの入力用画像データを見ながら、画像処理装置１のキーボード等の入力装置を操作して、モニタに表示されている入力用画像データ内の入力欄に記録文字列を入力する。当該入力用画像データには保存ボタンが表示されており、保存ボタンの押下操作をすると記録部１０５は既に文書帳票について取得した第三特徴量の他、新たに入力用画像データの入力欄に入力された文字列を含む第三特徴量を生成する（ステップＳ１０１２）。記録部１０５は帳票画像データの識別子と入力欄に入力された文字列とを対応付けてデータベース４に記録する。画像処理装置１は図９で示した処理フローを再度実施することによりグループ第一特徴量およびグループ第二特徴量が更新され、自動的に記録できる文字列の範囲を拡張することができる。これにより、次に同じ文書帳票を処理したときには、自動的に文字列を記録できるようになり、作業者が文字列を入力する手間を省くことができる。記録部１０５は、全ての第三特徴量それぞれに含まれる文字列を、文書帳票の記録テーブルに記録する（ステップＳ１０１３）。 If the result is NO in step S1006, step S1007, or step S1009 described above, the recording unit 105 performs the processing for the case where no third feature amount has coordinates corresponding to the coordinates indicated by a group first feature amount. Specifically, the recording unit 105 generates input image data of the form image in which an input field is provided at the coordinate range of each group first feature amount for which no third feature amount with corresponding coordinates exists in the image data, and outputs it to the monitor (step S1011). The input image data may be data described in a markup language such as HTML or XML. While viewing this input image data, the operator operates an input device of the image processing apparatus 1, such as a keyboard, and enters the recorded character string in the input field of the input image data displayed on the monitor. A save button is displayed in the input image data; when the save button is pressed, the recording unit 105 generates, in addition to the third feature amounts already acquired for the document form, a third feature amount that includes the character string newly entered in the input field (step S1012). The recording unit 105 records the identifier of the form image data and the character string entered in the input field in the database 4 in association with each other. By executing the processing flow shown in FIG. 9 again, the image processing apparatus 1 updates the group first feature amounts and group second feature amounts and can expand the range of character strings that it can record automatically. As a result, the next time the same document form is processed, the character string can be recorded automatically, saving the operator the effort of entering it. The recording unit 105 records the character strings included in all of the third feature amounts in the record table of the document form (step S1013).

このような処理によれば、画像処理装置１は予め作業者が記録しておいた複数の異なる文書帳票の画像データと記録文字列によって、新たに入力させた文書帳票の種別によらずにその文書帳票の画像データにおける記録文字列を自動的に記録することができる。したがって画像処理装置１は文書帳票における記録文字列の記録の作業者の労力を軽減することができる。 With this processing, the image processing apparatus 1 can automatically record the recorded character strings in the image data of a newly input document form, regardless of the type of that document form, based on the image data and recorded character strings of a plurality of different document forms that the operator recorded in advance. The image processing apparatus 1 can therefore reduce the operator's effort in recording the recorded character strings of document forms.
また文書帳票に記録文字列が記載されていない場合でも、本来、記載されているべき記録文字列に対応する記載事項が記載されていない場合には画像処理装置１は入力用画像データを出力する。これにより文書帳票において記載すべき記載事項に対して入力していない誤りが見つかると共に、その記載事項が示す記録文字列を容易に記録することができる。 Furthermore, even when a recorded character string is missing from the document form, that is, when an item that should contain a recorded character string has been left blank, the image processing apparatus 1 outputs the input image data. This makes it possible to detect the error of a required item having been left blank, and to record the recorded character string for that item easily.

＜第三実施形態＞ <Third Embodiment>
なお、画像処理装置１の処理の他の例としては、作業者が予め文書帳票のグループを画像処理装置１に登録しておいてもよい。例えば作業者は、過去において文書帳票の画像データを登録する際、文書帳票の種類に合わせてグループ識別子を入力しておき文書帳票の画像データと紐づけてデータベース４に登録しておく。これにより、同一グループ内に画像処理装置１の処理誤り等により異種の帳票が混じることがなくなり、精度のよいグループ第一特徴量およびグループ第二特徴量を抽出することができる。なおこの場合、登録時は作業者が文書帳票のグループを入力するが、新たな帳票に対しては、ステップＳ１００４と同じく、グループ第二特徴量を用いてグループ特定する。 As another example of the processing of the image processing apparatus 1, the operator may register the groups of document forms in the image processing apparatus 1 in advance. For example, when registering image data of document forms in the past, the operator enters a group identifier according to the type of each document form and registers it in the database 4 in association with the image data of that document form. This prevents forms of different types from being mixed into the same group because of, for example, a processing error of the image processing apparatus 1, so that accurate group first feature amounts and group second feature amounts can be extracted. In this case, the operator enters the group of each document form at registration, but for a new form the group is identified using the group second feature amounts, as in step S1004.

<Fourth Embodiment>
As another example of the processing of the image processing apparatus 1, in step S912 the image processing apparatus 1 may group document forms not only using the individual second feature amount but also using the individual first feature amount, either alone or together with the individual second feature amount. The individual first feature amount is the feature amount of the recorded character string. For document forms of the same type, the coordinates and character attributes of the recorded character string are considered to be the same, so forms can be grouped using the individual first feature amount.
In this case, the acquisition unit 102 acquires a plurality of pieces of form image data and, for each, the recorded character string to be recorded among the character strings included in that form image data. The group classification unit 106 then groups the form image data based on the individual first feature amounts. The reading target feature amount generation unit 104 generates a group first feature amount of the recorded character string for each group, using the individual first feature amounts corresponding to the form image data included in that group.
Alternatively, the initial grouping may be performed by the operator as shown in the third embodiment, and a new document form may be assigned to a group using the individual first feature amount in the process of step S1004. This makes it possible to read the recorded character string accurately in OCR processing.
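The grouping by individual first feature amount described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the feature amount is reduced to a tuple of (attribute, x, y), two forms are treated as the same type when the attributes are equal and the coordinates differ by at most a tolerance `tol`, and the function name `group_forms` is hypothetical. The source does not specify the matching criterion or the clustering method.

```python
def group_forms(forms, tol=10.0):
    """Greedily group forms by the feature of their recorded character string.

    forms: dict mapping a form id to (attr, x, y), an assumed simplification
    of the individual first feature amount (attribute and position).
    Returns a list of groups, each a list of form ids.
    """
    groups = []  # list of (representative feature, member ids)
    for form_id, (attr, x, y) in forms.items():
        for rep, members in groups:
            # Assumed criterion: same attribute and nearby coordinates.
            if rep[0] == attr and abs(rep[1] - x) <= tol and abs(rep[2] - y) <= tol:
                members.append(form_id)
                break
        else:
            # No existing group matched, so this form starts a new group.
            groups.append(((attr, x, y), [form_id]))
    return [members for _, members in groups]


forms = {
    "a": ("date", 100, 50),
    "b": ("date", 103, 52),
    "c": ("digits", 400, 600),
}
print(group_forms(forms))
```

In this sketch the first form of each group serves as its representative. A group-level feature learned from all members, as the reading target feature amount generation unit 104 produces, would be a natural refinement.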

<Fifth Embodiment>
In the second embodiment, the group of a new form is specified in step S1004 based on the second feature amount. As another processing mode, however, the image processing apparatus 1 may skip the group-specifying process and instead select each of the groups set by the operator in turn, read out the group first feature amount of that group, and count the number of matches with the third feature amounts. For the correct group, the group first feature amounts and the third feature amounts should match most often, so in step S1008 the image processing apparatus 1 records the character strings included in each of the third feature amounts of the group for which the number of matches is largest. This makes it possible to record the recorded character string without specifying the group.
In this case, the image data of the document forms stored in the database 4 for generating the group first feature amount may be grouped in advance by the operator. The acquisition unit 102 acquires a plurality of pieces of form image data and, for each, the recorded character string to be recorded among the character strings included in that form image data. The feature amount extraction unit 103 extracts the individual first feature amount, which indicates the feature of the recorded character string, based on the result of character recognition processing on the form image data acquired by the acquisition unit 102. The reading target feature amount generation unit 104 generates a group first feature amount of the recorded character string for each group, using the individual first feature amounts corresponding to the form image data included in each predetermined group.
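The match-counting approach of this embodiment can be illustrated with a short sketch. The `Feature` class, the tolerance-based `matches` test, and the function name `best_group` are assumptions made for illustration. The source states only that the feature amount indicates an attribute and a range of the character string, and that the group with the most matches is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    attr: str   # assumed attribute label, e.g. "date" or "digits"
    x: float    # left edge of the character-string range
    y: float    # top edge of the character-string range


def matches(a, b, tol=10.0):
    # Assumed criterion: same attribute and coordinates within a tolerance.
    return a.attr == b.attr and abs(a.x - b.x) <= tol and abs(a.y - b.y) <= tol


def best_group(group_first, third):
    """Try every group and keep the one with the most matching strings.

    group_first: dict mapping a group id to its list of group first features.
    third: list of (text, Feature) pairs recognized on the new form.
    Returns (group id, list of texts to record).
    """
    best_id, best_hits = None, []
    for gid, firsts in group_first.items():
        hits = [text for text, f in third if any(matches(f, g) for g in firsts)]
        if len(hits) > len(best_hits):
            best_id, best_hits = gid, hits
    return best_id, best_hits


group_first = {
    "invoice": [Feature("date", 100, 50), Feature("digits", 400, 600)],
    "receipt": [Feature("digits", 50, 300)],
}
third = [
    ("2022-07-05", Feature("date", 102, 48)),
    ("12,000", Feature("digits", 398, 603)),
    ("NEC", Feature("alpha", 10, 10)),
]
print(best_group(group_first, third))
```

Because every group is tried, the cost grows with the number of groups. That is the price paid for not needing the second feature amount to identify the group first.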

<Sixth Embodiment>
In the processing of FIG. 7 of the first embodiment, the image processing apparatus 1 may perform machine learning of the feature amount both when character recognition succeeds and when it fails. The sixth embodiment describes this point.
The sixth embodiment is the same as the first embodiment except for the processing of FIG. 7, and FIGS. 1 to 6 apply as they are.

FIG. 11 is a diagram showing the processing flow of the image processing apparatus according to the sixth embodiment. Steps S1101 to S1108 in FIG. 11 are the same as steps S701 to S708 in FIG. 7. Steps S1110 to S1112 in FIG. 11 are the same as steps S709 to S711 in FIG. 7. Step S1114 in FIG. 11 is the same as step S712 in FIG. 7.
Accordingly, the processing of FIG. 11 corresponds to that of FIG. 7 with step S1109 inserted between steps S708 and S709 and step S1113 inserted between steps S711 and S712. Otherwise it is the same as FIG. 7.
In both steps S1109 and S1113, the reading target feature amount generation unit 104 of the image processing apparatus 1 performs the processing of FIG. 6 to update the first feature amount. The reading target feature amount generation unit 104 does not need to redo the processing of FIG. 6 from the unlearned state. It only needs to perform additional learning for the one piece of image data.

One conceivable alternative is to learn only when OCR processing fails (that is, when the recorded character string could not be obtained properly by OCR processing), by skipping machine learning in step S1109 of FIG. 11 and performing it only in step S1113. In that case, however, the image processing apparatus 1 does not perform machine learning when OCR processing succeeds, and as a result statistical information about the format of the document forms may not be reflected in the machine learning.
For example, consider a case where 99 out of 100 document forms have the recorded character string at the lower left and only one has it at the upper right. If the image processing apparatus 1 succeeds in OCR processing (step S1108: YES) for document forms whose recorded character string is at the same position, it learns only once for the lower-left case and once for the upper-right case. Although the actual ratio is 99:1, the image processing apparatus 1 learns at a ratio of 1:1, and it may over-learn the case where the recorded character string is at the upper right.
In contrast, when the image processing apparatus 1 performs machine learning (updating the first feature amount) both when OCR processing succeeds and when it fails, as in FIG. 11, statistical information about the format of the document forms can be reflected in the machine learning.
Similarly, in the second to fifth embodiments, the image processing apparatus 1 may perform the processing of FIG. 9 both between steps S1009 and S1010 and between steps S1012 and S1013 of FIG. 10.
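The 99:1 example can be reproduced with a small simulation. The sketch below rests on a simplifying assumption that is not in the source: OCR is taken to succeed for a form exactly when a form with the same position has already been learned. The function name `simulate` and the position labels are hypothetical.

```python
from collections import Counter


def simulate(positions, learn_on_success):
    """Count how many times each position is learned.

    positions: sequence of position labels, one per processed form.
    learn_on_success: if False, learn only when OCR fails (step S1113 only);
    if True, learn after every form (steps S1109 and S1113).
    """
    learned = Counter()
    for pos in positions:
        # Assumed success model: OCR works once the position has been seen.
        ocr_ok = pos in learned
        if learn_on_success or not ocr_ok:
            learned[pos] += 1
    return dict(learned)


forms = ["lower-left"] * 99 + ["upper-right"]
print(simulate(forms, learn_on_success=False))  # learns each position once
print(simulate(forms, learn_on_success=True))   # keeps the true 99:1 ratio
```

Under this assumption, failure-only learning sees each layout exactly once, so the learned statistics carry no information about how common each layout is.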

<Seventh Embodiment>
The image processing apparatus 1 may acquire data from a system that scans or photographs document forms, such as an existing OCR system, and perform machine learning on that data. The seventh embodiment describes this point. The following description uses the case where the image processing apparatus 1 acquires data from an existing OCR system as an example. However, the source of the data provided to the image processing apparatus 1 is not limited to an existing OCR system and can be any of various systems or devices. For example, the image processing apparatus 1 may acquire data from a system (for example, a sales information system or an accounting information system) that stores sales dates, customers, amounts, and the image data serving as an audit trail in a database. In particular, the system or device providing the information does not need to have an OCR function.

FIG. 12 is a diagram showing an example of the connection between the image processing system according to the seventh embodiment and an existing OCR system. In the configuration shown in FIG. 12, an existing OCR system 9 is connected to the image processing apparatus 1 in addition to the configuration shown in FIG. 1. In this configuration, the image data generation function (scan function) and the user interface of the existing OCR system 9 are used, and its character recognition function is not used. Character recognition is performed by the image processing apparatus 1. The other points are the same as in the first embodiment. FIGS. 1 to 7 apply, with the acquisition of image data and the user interface read as being provided by the existing OCR system 9.

FIG. 13 is a diagram showing the processing flow of the image processing apparatus according to the seventh embodiment. FIG. 13 shows the relationship between the processing performed by the image processing apparatus 1 and the processing performed by the existing OCR system 9. In the processing of FIG. 13, the existing OCR system 9 scans a document form according to the operator's operation and generates image data (sequence S1301). The existing OCR system 9 transmits the obtained image data to the image processing apparatus 1 (sequence S1302).
Upon receiving the image data, the image processing apparatus 1 performs OCR processing (sequence S1303). For example, the image processing apparatus 1 performs the processing from step S702 onward in FIG. 7 to acquire the recorded character string. In particular, the recording unit 105 identifies the recorded character string from among the character strings resulting from the OCR processing.
The image processing apparatus 1 transmits the character string of the character recognition result (the recorded character string) to the existing OCR system 9 (sequence S1304).
The existing OCR system 9 presents the character recognition result to the operator and, after the operator has checked and corrected it, receives a confirmation operation (sequence S1305). The existing OCR system 9 transmits the confirmed character string (the corrected recorded character string) to the image processing apparatus 1 (sequence S1306).
Upon receiving the correction result, the image processing apparatus 1 performs machine learning (sequence S1307).
The image processing apparatus 1 has acquired the image data in sequence S1302 and the corrected character string confirmed by the operator in sequence S1306. Using this image data and character string, it can perform machine learning by, for example, the processing procedure of FIG. 6. The image processing apparatus 1 may perform machine learning in a batch after a certain amount of data has accumulated, or may perform additional learning each time data is obtained.
Thus, according to the seventh embodiment, the image processing apparatus 1 can perform machine learning by taking advantage of the operator's ordinary use of the existing OCR system 9. In particular, the reading target feature amount generation unit 104 can update the first feature amount.
Similarly, in the second to sixth embodiments, the image processing apparatus may perform machine learning using an existing OCR system.

<Eighth Embodiment>
The learning function of the image processing apparatus may be placed on a server to accelerate learning. The eighth embodiment describes this point.
FIG. 14 is a diagram showing an outline of an image processing system 200 according to the eighth embodiment. In the configuration of FIG. 14, the image reading device 2 is the same as in FIG. 1.
The functions of the image processing apparatus 1 in FIG. 1 are divided between a learning function unit 1b, which performs machine learning, and an OCR function unit 1a, which performs the other functions, forming a server-client configuration. In particular, the learning function unit 1b generates and updates the first feature amount. The learning function unit 1b also performs the functions of the recording device 3 and manages the learning result database 4b. The OCR function unit 1a performs functions such as OCR processing, extraction of the third feature amount, identification of the recorded character string, and the user interface for the operator. In OCR processing, the OCR function unit 1a detects character strings, detects the features of the characters included in each character string, and detects the coordinates of the range of each character string.
The OCR function unit 1a resides in the terminal device 6. The learning function unit 1b resides in the server device 7.
The database 4 in FIG. 1 is divided into an OCR result database 4a and a learning result database 4b. The OCR result database 4a stores the results of OCR processing, such as the recording table. The learning result database 4b stores data obtained during machine learning, such as the first feature amount. The learning result database 4b resides in the server device 7. The OCR result database 4a, on the other hand, is configured as a single device.

FIG. 15 is a diagram showing the processing flow of the terminal device according to the eighth embodiment.
In the processing of FIG. 15, the terminal device 6 reads the image data of a document form using the image reading device 2 and downloads the learning result corresponding to the image data from the server device 7 (step S1501). In particular, the terminal device 6 acquires the first feature amount as the learning result.
The terminal device 6 performs OCR processing on all character strings included in the image data and displays on the screen those that fit the learning result (step S1502). Specifically, the terminal device 6 uses the first feature amount, which is the learning result, to identify the recorded character strings that fit the first feature amount.

The terminal device 6 determines whether all OCR target character strings in the image data have been read correctly (step S1503). If it determines that some character string has not been read correctly (step S1503: NO), the terminal device 6 presents the OCR result to the operator on the screen and receives a correction operation (step S1504).
The terminal device 6 transmits the read character strings (or the corrected character strings, if the operator made corrections) to the OCR result database 4a for recording (step S1505). Specifically, the terminal device 6 stores the recorded character strings in the OCR result database 4a. The processing also proceeds to step S1505 when the result of step S1503 is YES.
The terminal device 6 transmits learning data, such as the read character strings and their position information, to the server device 7 (step S1506).
Through the processing of FIG. 15, the terminal device 6 records the recorded character strings to be kept as the OCR result in the OCR result database 4a and also provides learning data to the server device 7.

FIG. 16 is a diagram showing the processing flow of the server device according to the eighth embodiment.
In the processing of FIG. 16, the server device 7 receives learning data from the terminal device 6 and stores it (step S1601).
The server device 7 determines whether N pieces (N is a positive integer) of unlearned learning data have accumulated (step S1602). If it determines that they have not (step S1602: NO), the processing returns to step S1601.
If it determines that N pieces of unlearned learning data have accumulated (step S1602: YES), the server device 7 reads, from the learning result database 4b, the learning result data for the forms received as learning data (step S1603).
Next, the server device 7 adds the new learning data to the learning result data it has read and re-learns (step S1604).
The server device 7 registers the re-learning result in the learning result database 4b (step S1605).
The server device 7 then waits until it receives learning data from the terminal device 6 (step S1606). After step S1606, the processing returns to step S1601.
Through the processing of FIG. 16, the server device 7 updates the first feature amount as the learning result. The updated first feature amount is registered in the learning result database 4b and made available to the terminal device 6.
According to the eighth embodiment, placing the learning function of the image processing apparatus on a server makes it possible to aggregate learning data and accelerate learning.
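The server-side loop of steps S1601 to S1605 can be sketched as a small class. Everything about the model here is an assumption made for illustration: the learning result is reduced to a running mean of the (x, y) position of the recorded character string, and the names `LearningServer` and `receive` are hypothetical. The source does not specify the learning algorithm.

```python
class LearningServer:
    """Buffers learning data and re-learns once N samples have accumulated."""

    def __init__(self, n):
        self.n = n              # batch size N from step S1602
        self.pending = []       # unlearned learning data
        self.count = 0          # samples already reflected in the model
        self.mean = (0.0, 0.0)  # stand-in for the stored learning result

    def receive(self, xy):
        """Handle one learning sample; return True if re-learning ran."""
        self.pending.append(xy)                # S1601: receive and store
        if len(self.pending) < self.n:         # S1602: NO, keep waiting
            return False
        for x, y in self.pending:              # S1603-S1604: add data, re-learn
            self.count += 1
            mx, my = self.mean
            self.mean = (mx + (x - mx) / self.count,
                         my + (y - my) / self.count)
        self.pending.clear()                   # S1605: result is now registered
        return True


server = LearningServer(n=2)
print(server.receive((100, 50)))   # False: only one sample buffered
print(server.receive((104, 54)))   # True: batch of two triggers re-learning
print(server.mean)
```

An incremental mean is used so that re-learning only touches the new samples, in the spirit of the additional learning mentioned for the sixth embodiment.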

Note that the timing at which the server device 7 performs machine learning such as re-learning is not limited to the point at which a certain number of pieces of learning data have accumulated, as shown in step S1602 of FIG. 16. For example, the server device 7 may perform learning at regular intervals. In particular, the server device 7 may start learning according to a time-based condition in addition to, or instead of, the condition on the number of pieces of data.

One example of the server device 7 learning at regular intervals is learning at night or on non-business days, when the terminal devices 6 are not operating. If the server device 7 performs learning while a terminal device 6 is operating, the learning function unit 1b of the server device 7 accesses the learning result database 4b. This may delay access from the terminal device 6 to the learning result database 4b, or may prevent the terminal device 6 from accessing it when access is needed. By having the server device 7 perform learning while the terminal devices 6 are not operating, this impact on access from the terminal devices 6 to the learning result database 4b can be avoided.
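A time-based trigger of this kind, optionally combined with the data-count condition, might look like the following sketch. The business hours (09:00 to 18:00 on weekdays) and the function names `is_off_hours` and `should_train` are assumptions for illustration. The source says only that learning may run at night or on non-business days.

```python
from datetime import datetime, time


def is_off_hours(now, open_at=time(9, 0), close_at=time(18, 0)):
    """Return True when terminals are assumed idle: weekends or outside hours."""
    weekend = now.weekday() >= 5  # Saturday is 5, Sunday is 6
    return weekend or not (open_at <= now.time() < close_at)


def should_train(pending, now, n=None):
    """Decide whether the server should start learning.

    pending: number of unlearned samples.
    n: optional count threshold; None means the time condition alone decides.
    """
    if pending == 0 or not is_off_hours(now):
        return False
    return n is None or pending >= n


print(should_train(10, datetime(2022, 7, 5, 14, 0)))        # weekday afternoon
print(should_train(10, datetime(2022, 7, 5, 23, 0)))        # weekday night
print(should_train(10, datetime(2022, 7, 5, 23, 0), n=50))  # night, too few samples
```

Passing `n=None` corresponds to using the time condition instead of the count condition, and passing a number corresponds to using both.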

The terminal device 6 may perform machine learning in addition to, or instead of, the server device 7.
When the server device 7 performs machine learning and the terminal device 6 does not, the processing accuracy of the terminal device 6 does not improve until the server device 7 has performed machine learning, updated the learning result (the first feature amount), and shared the learning result with the terminal device 6. For example, if the server device 7 performs machine learning only at night each day, the terminal device 6 cannot use the updated learning result until the next morning. Even if the terminal device 6 generates learning data in the morning and transmits it to the server device 7, the result of that learning is not reflected in the processing of the terminal device 6 until the following morning, about one day later.

To address this, each terminal device 6 performs machine learning every time it processes one form. The terminal device 6 reflects the learning result in its own processing and also transmits the learning result or the learning data to the server device 7. The terminal device 6 generates a feature amount (the third feature amount) when processing a form, and uses the generated feature amount to update the first feature amount acquired from the learning result database 4b. For example, in step S1506 of FIG. 15, the terminal device 6 transmits the learning data to the server device 7 and also performs machine learning itself using that learning data, updating the first feature amount that the terminal device 6 itself stores. Alternatively, in step S1506 of FIG. 15, the terminal device 6 may transmit the learning result (the obtained feature amount) to the server device 7 in addition to the learning data.
By learning on its own, the terminal device 6 can reflect its own learning result in its own processing. In this respect, the processing accuracy of the terminal device 6 is expected to improve sooner than when the terminal device 6 does not perform machine learning itself and waits for the server device 7 to update the learning result. The machine learning performed by each terminal device 6 is referred to as provisional learning.

When the terminal device 6 transmits learning data to the server device 7, the processing performed by the server device 7 is the same as that described with reference to FIG. 16.
When the terminal device 6 transmits learning results to the server device 7, on the other hand, the server device 7 accumulates the learning results from each terminal device 6. The server device 7 then, for example at night, reflects the learning results obtained from the terminal devices 6 in the learning result stored in the learning result database 4b. Specifically, the learning function unit 1b of the server device 7 uses the feature amounts that the terminal devices 6 acquired through learning to update the first feature amount stored in the learning result database 4b.
The process of updating the first feature amount stored in the learning result database 4b is referred to as main learning.
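The split between provisional learning on a terminal and main learning on the server can be sketched with mergeable statistics. As an assumption for illustration only, the learned feature is reduced to a sample count and a mean position, and the class name `Stats` with its methods `add` and `merge` is hypothetical. The source does not say how a terminal's learning result is combined with the stored first feature amount.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Assumed stand-in for a learned feature: sample count and mean position."""
    count: int = 0
    mean_x: float = 0.0
    mean_y: float = 0.0

    def add(self, x, y):
        # Provisional learning: update after each processed form.
        self.count += 1
        self.mean_x += (x - self.mean_x) / self.count
        self.mean_y += (y - self.mean_y) / self.count

    def merge(self, other):
        # Main learning: fold a terminal's result into the stored result,
        # weighting each mean by the number of samples behind it.
        total = self.count + other.count
        if total == 0:
            return
        self.mean_x = (self.mean_x * self.count + other.mean_x * other.count) / total
        self.mean_y = (self.mean_y * self.count + other.mean_y * other.count) / total
        self.count = total


stored = Stats(count=2, mean_x=100.0, mean_y=50.0)  # server-side learning result
local = Stats()                                     # one terminal's daytime result
local.add(104, 54)
local.add(108, 58)
stored.merge(local)                                 # nightly main learning
print(stored)
```

Weighting by sample count keeps the merged result equal to what a single pass over all samples would give, so the order in which terminals report does not matter in this simplified model.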

FIG. 17 is a diagram showing the minimum configuration of the image processing apparatus.
As shown in this figure, the image processing apparatus 1 need only include at least the feature amount extraction unit 103 and the reading target feature amount generation unit 104.
The feature amount extraction unit 103 extracts, for each piece of form image data, a feature amount indicating the features of the character strings included in that form image data, based on the results of character recognition processing on a plurality of pieces of form image data registered in the past.
The reading target feature amount generation unit 104 uses the feature amounts corresponding to the form image data to generate the first feature amount of the recorded character string in that form image data.

Each of the devices described above contains a computer system. A program for causing each device to perform the processes described above is stored in a computer-readable recording medium of that device, and the processes are carried out when the computer of each device reads and executes the program. Here, a computer-readable recording medium means a magnetic disk, a magneto-optical disk, a CD-ROM, a DVD-ROM, a semiconductor memory, or the like. The computer program may also be distributed to a computer via a communication line, and the computer receiving the distribution may execute the program.

The program may implement only part of the functions of the processing units described above. It may also be a so-called difference file (difference program), which implements those functions in combination with a program already recorded in the computer system.

Reference Signs List
1 ... Image processing apparatus
1a ... OCR function unit
1b ... Learning function unit
2 ... Image reading device
3 ... Recording device
4 ... Database
4a ... OCR result database
4b ... Learning result database
5 ... Document form
6 ... Terminal device
7 ... Server device
101 ... Control unit
102 ... Acquisition unit
103 ... Feature amount extraction unit
104 ... Reading target feature amount generation unit
105 ... Recording unit
106 ... Group classification unit
107 ... Group identification unit
108 ... Work target determination unit
109 ... Work data generation unit

Claims

An image processing apparatus comprising:
an acquisition unit that acquires image data of a document form and a recorded character string recorded by an operator among the character strings included in the image data;
a feature amount extraction unit that extracts a feature amount of the character string that matches the recorded character string, from among the character strings recognized by character recognition processing on the image data;
a reading target feature amount generation unit that performs machine learning on the feature amounts of the character strings extracted from the image data of each of a plurality of document forms of the same format, to generate a feature amount of a reading target; and
a recording unit that records the character string recognized from the image data of a new document form acquired by the acquisition unit when the feature amount extracted from a character string recognized by character recognition processing on the image data of the new document form matches the feature amount of the reading target,
wherein the feature amount is information indicating an attribute of the character string and a range of the character string in the image of the document form.

The image processing apparatus according to claim 1, wherein the reading target feature amount generation unit performs machine learning, for each group into which the image data of the plurality of document forms has been grouped in advance, on the feature amounts of the character strings extracted from the image data, to generate the feature amount of the reading target for each group.

The image processing apparatus according to claim 2, further comprising a group classification unit that groups the image data of each of the plurality of document forms based on a degree of matching that indicates how closely the feature amounts of the character strings matching the recorded character string, extracted from the image data of each of the plurality of document forms, match one another.

The image processing apparatus according to claim 2, wherein the feature amount extraction unit further extracts feature amounts of character strings other than the character string matching the recorded character string,
the image processing apparatus further comprising a group classification unit that groups the image data of each of the plurality of document forms based on a degree of matching that indicates how closely the feature amounts of the character strings other than the character string matching the recorded character string, extracted from the image data of each of the plurality of document forms, match one another.

前記読取対象特徴量生成部は、前記記録文字列と一致する文字列以外の文字列の特徴量に基づいて、前記グループ毎の特徴量を生成し、
前記グループ毎の特徴量と、前記新たな文書帳票の画像データから抽出された前記特徴量と、に基づいて、前記新たな文書帳票の画像データの属する前記グループを特定するグループ特定部と、を備え、
前記記録部は、前記新たな文書帳票の画像データから抽出された前記特徴量と、特定された前記グループの前記読取対象の特徴量とが一致する場合に、前記新たな文書帳票の画像データから認識された文字列を記録する、
請求項４に記載の画像処理装置。 The image processing apparatus according to claim 4, wherein the reading target feature amount generation unit generates a feature amount for each group based on the feature amounts of character strings other than the character string matching the recorded character string,
the image processing apparatus comprises a group identification unit that identifies the group to which the image data of the new document form belongs, based on the feature amount of each group and the feature amount extracted from the image data of the new document form, and
the recording unit records the character string recognized from the image data of the new document form when the feature amount extracted from the image data of the new document form matches the feature amount of the reading target of the identified group.
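The group identification and the subsequent recording might, as a non-limiting sketch, proceed as below. It is assumed here that each group is characterized by the ranges of its non-recorded character strings (for example, printed labels), that similarity is measured by IoU, and that the group with the highest mean similarity is selected. All names, data layouts, and thresholds are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two ranges given as (x0, y0, x1, y1)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def identify_group(group_features, form_boxes):
    """Return the group whose non-recorded-string ranges best match the new form.

    group_features: dict of group_id -> list of ranges learned for that group.
    form_boxes: ranges of all strings recognized in the new form.
    """
    def score(ref_boxes):
        if not ref_boxes:
            return 0.0
        best = (max((iou(r, b) for b in form_boxes), default=0.0) for r in ref_boxes)
        return sum(best) / len(ref_boxes)

    return max(group_features, key=lambda g: score(group_features[g]))


def record_strings(recognized, target, threshold=0.5):
    """Keep recognized strings whose feature matches the group's reading target.

    recognized: list of (text, attribute, range); target: (attribute, range).
    """
    attribute, box = target
    return [text for text, attr, b in recognized
            if attr == attribute and iou(b, box) >= threshold]
```

Identifying the group first narrows the comparison to a single reading target, so a string is recorded only when it matches the target learned for forms of the same layout.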

文書帳票の画像データと、前記画像データに含まれる文字列のうち作業者が記録した記録文字列と、を取得し、
前記画像データに対する文字認識処理により認識された文字列のうち、前記記録文字列と一致する前記文字列の特徴量を抽出し、
同じフォーマットの複数の文書帳票それぞれの画像データから抽出された前記文字列の特徴量を機械学習して、読取対象の特徴量を生成し、
新たな文書帳票の画像データに対する文字認識処理により認識された文字列から抽出された特徴量と、前記読取対象の特徴量とが一致する場合に、前記新たな文書帳票の画像データから認識された文字列を記録し、
前記特徴量は、前記文字列の属性と、前記文書帳票の画像における前記文字列の範囲とを示す情報である
画像処理方法。 An image processing method comprising:
acquiring image data of a document form and a recorded character string recorded by an operator among the character strings included in the image data;
extracting a feature amount of the character string that matches the recorded character string, from among the character strings recognized by character recognition processing on the image data;
machine-learning the feature amount of the character string extracted from the image data of each of a plurality of document forms of the same format to generate a feature amount of a reading target; and
recording a character string recognized from the image data of a new document form when a feature amount extracted from the character string recognized by character recognition processing on the image data of the new document form matches the feature amount of the reading target,
wherein the feature amount is information indicating an attribute of the character string and a range of the character string in the image of the document form.
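The steps of the method above can be sketched end to end as follows. The sketch assumes that character recognition has already produced `(text, range)` pairs, that the attribute is a coarse character type, that averaging stands in for machine learning, and that two features match when the attributes agree and each coordinate differs by no more than a tolerance. None of these choices is stated in the claim; they are illustrative only.

```python
from collections import Counter
from statistics import mean


def attribute_of(text):
    """Coarse character-type attribute (assumed labels)."""
    if text.isdigit():
        return "digits"
    if text.isalpha():
        return "alpha"
    return "mixed"


def extract_feature(ocr_strings, recorded):
    """Feature (attribute, range) of the OCR string equal to the recorded string."""
    for text, box in ocr_strings:
        if text == recorded:
            return attribute_of(text), box
    return None


def learn_target(features):
    """Majority attribute and mean range over forms of the same format."""
    attribute = Counter(a for a, _ in features).most_common(1)[0][0]
    box = tuple(mean(b[i] for _, b in features) for i in range(4))
    return attribute, box


def read_new_form(ocr_strings, target, tolerance=5.0):
    """Record strings of a new form whose feature matches the reading target."""
    attribute, box = target
    return [text for text, b in ocr_strings
            if attribute_of(text) == attribute
            and all(abs(b[i] - box[i]) <= tolerance for i in range(4))]


# Usage: two training forms with operator-recorded totals, then a new form.
training = [
    ([("Total", (10, 10, 40, 20)), ("1200", (100, 10, 140, 20))], "1200"),
    ([("Total", (10, 11, 40, 21)), ("980", (102, 10, 138, 20))], "980"),
]
features = [extract_feature(ocr, recorded) for ocr, recorded in training]
target = learn_target([f for f in features if f is not None])
new_form = [("Total", (10, 10, 40, 20)), ("4500", (101, 11, 141, 21))]
recorded_strings = read_new_form(new_form, target)
```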

コンピュータに、
文書帳票の画像データと、前記画像データに含まれる文字列のうち作業者が記録した記録文字列と、を取得する工程と、
前記画像データに対する文字認識処理により認識された文字列のうち、前記記録文字列と一致する前記文字列の特徴量を抽出する工程と、
同じフォーマットの複数の文書帳票それぞれの画像データから抽出された前記文字列の特徴量を機械学習して、読取対象の特徴量を生成する工程と、
前記取得する工程により取得された新たな文書帳票の画像データに対する文字認識処理により認識された文字列から抽出された特徴量と、前記読取対象の特徴量とが一致する場合に、前記新たな文書帳票の画像データから認識された文字列を記録する工程と、
を実行させ、
前記特徴量は、前記文字列の属性と、前記文書帳票の画像における前記文字列の範囲とを示す情報である
プログラム。 A program causing a computer to execute:
a step of acquiring image data of a document form and a recorded character string recorded by an operator among the character strings included in the image data;
a step of extracting a feature amount of the character string that matches the recorded character string, from among the character strings recognized by character recognition processing on the image data;
a step of machine-learning the feature amount of the character string extracted from the image data of each of a plurality of document forms of the same format to generate a feature amount of a reading target; and
a step of recording a character string recognized from the image data of a new document form acquired in the acquiring step, when a feature amount extracted from the character string recognized by character recognition processing on the image data of the new document form matches the feature amount of the reading target,
wherein the feature amount is information indicating an attribute of the character string and a range of the character string in the image of the document form.