JPH0934984A

JPH0934984A - Character recognizing device

Info

Publication number: JPH0934984A
Application number: JP18421395A
Authority: JP
Inventors: Shigeki Ozawa; 茂樹小澤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1995-07-20
Filing date: 1995-07-20
Publication date: 1997-02-07

Abstract

PROBLEM TO BE SOLVED: To automatically remove extra characters caused by duplicate entry when concatenating recognized characters. SOLUTION: A character recognition device generates a prescribed recognized character string 35 by concatenating first and second recognized characters 30 and 31, each recognized from a different image. The device has a match verification unit 32 that compares, digit by digit, the prescribed characters on the concatenation side of the first recognized characters with the second recognized characters to verify whether they match; a character deletion unit 33 that deletes the prescribed characters from the first recognized characters when the verification finds a match; and a character concatenation unit 34 that concatenates the first recognized characters after deletion with the second recognized characters.

Description

[Detailed Description of the Invention]

[0001]

[Field of Industrial Application] The present invention relates to a character recognition device that removes extra characters caused by duplicate entry from recognized characters, for example when concatenating recognized characters read from images that were entered item by item.

[0002] In recent years, as recognition rates have improved through knowledge processing and similar techniques, business operations that use OCR devices (including FAX) have been increasing. However, the OCR forms read by these devices are no longer filled in only by people with specialized knowledge. As with transfer forms in financial services, members of the general public increasingly fill them in, and depending on how a form is filled in, the recognition rate drops.

[0003] For example, suppose a form asks for the proper name of a financial institution, "ニホン" (Nihon), and the institution type name (business type name), "ギンコウ" (Ginkou, bank), in separate fields, and the device recognizes each field and generates the single institution name "ニホンギンコウ" (Nihon Ginkou). Members of the public often write the everyday name "ニホンギンコウ" in the proper-name field and then write "ギンコウ" again as the business type. Concatenating the two yields "ニホンギンコウギンコウ" (Nihon Ginkou Ginkou). Consequently, even when knowledge processing is performed against the registered entry "ニホンギンコウ", the correction does not work effectively, for reasons such as the differing number of digits, and the operator has to correct the result manually.

[0004] Thus, even an OCR device introduced to improve efficiency may bring little improvement in work efficiency, depending on how forms are filled in, and a character recognition device that resolves this problem is needed.

[0005]

[Prior Art] FIG. 6 is a configuration diagram of a conventional example, FIG. 7 is an explanatory diagram of a processing procedure (part 1) of the conventional example, and FIG. 8 is an explanatory diagram of a processing procedure (part 2) of the conventional example.

[0006] FIG. 6 shows a configuration example of a financial terminal device in which transaction data written on a handwritten form 10 is read and input by an OCR device 1. When a customer fills in the form 10 by hand and submits it at the counter, the operator inserts it into the OCR device 1. The character-recognized transaction data is then displayed on the display device 8 through the following processing.
(1) The OCR device 1 reads the image of each item on the form 10, performs character recognition, and converts the result into character codes. If recognition is uncertain and candidate characters other than the first candidate character exist, they are proposed for each character. The form 10 also has items selected by marks, and the selection information (binary data) is output at the same time.
(2) The character string conversion unit 2 refers to the character string conversion table 6 and converts the selection information into the corresponding character string.
(3) For each item, the knowledge processing unit 3 compares the recognized character strings formed by combining all candidate characters with the character strings registered in the knowledge dictionary 7, and outputs the recognized character string that matches a registered character string as the correctly recognized character string.
(4) The screen display unit 4 displays the determined recognized character string of each item based on the screen definition body 5, which defines a screen in a predetermined format, for example one corresponding to the layout of the form 10.

[0007] The operator compares the displayed recognized characters with the handwritten form 10 and corrects any misrecognized characters using a keyboard (not shown). FIGS. 7 and 8 show a specific processing procedure.

[0008] The form 10 has items for handwriting the account number, the financial institution name, and the branch name, and a selection item in which the business type name, such as bank or shinkin (credit union), is selected with a mark. [Financial institution name + business type name] is input to the financial terminal device as the official financial institution name. The example in FIG. 7 shows a case in which the financial institution name should correctly have been entered as "イロハ" (Iroha) but was entered as "イロハシンキン" (Iroha Shinkin).

[0009] Table (イ) in FIG. 7 shows the candidate characters recognized by the OCR device 1 for each item. Among these, the character judged most likely to be correct is called the recognized character or the first candidate character. The business type name has been recognized as "0100" from the mark position.

[0010] From this recognition data, the selection information "0100" is first converted by the character string conversion table 6 into the character string "シンキン" (Shinkin), as shown in table (ロ). Next, the recognized financial institution name "イロハシ？キソ" (first candidate characters) and the business type "シンキン" are concatenated, and knowledge processing using the knowledge dictionary 7 (such as verifying correctness by comparison) is performed on "イロハシ？キンシンキン". The knowledge dictionary 7 contains "イロハシンキン" (Iroha Shinkin), but not "イロハシンキンシンキン" (Iroha Shinkin Shinkin), in which "シンキン" is concatenated twice. No matching registered entry is found, and the result is judged to be a recognition error.
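The failure described in this paragraph can be reproduced in a few lines. The following is a minimal Python sketch, not the device's actual implementation: the dictionary contents, the table mapping, and the function name `prior_art_join` are assumptions introduced for illustration, and the handwritten field is assumed to be recognized without error so that only the duplication problem is visible.

```python
# Hypothetical stand-ins for knowledge dictionary 7 and conversion table 6.
KNOWLEDGE_DICTIONARY = {"イロハシンキン"}
MARK_TO_BUSINESS_TYPE = {"0100": "シンキン"}


def prior_art_join(handwritten_name: str, mark_code: str) -> str:
    """Concatenate the two fields with no check for a duplicated tail."""
    return handwritten_name + MARK_TO_BUSINESS_TYPE[mark_code]


# The customer wrote the full everyday name and also marked the business type.
joined = prior_art_join("イロハシンキン", "0100")
is_registered = joined in KNOWLEDGE_DICTIONARY

print(joined)         # the business type appears twice
print(is_registered)  # the dictionary lookup fails
```

Because the lookup is an exact match against registered names, the duplicated suffix defeats it even though every individual character was read correctly.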

[0011] If other candidate characters exist, knowledge processing is next performed on character strings in which those candidates are substituted into the first candidate characters. Here too, however, the comparison is made against a string in which "シンキン" (Shinkin) is concatenated twice, so knowledge processing cannot extract the correct recognized characters for any combination of candidates, and the first candidate string "イロハシ？キソシンキン" is displayed as it is. As a result, the operator has to delete "シ？キソ" manually. For the account number, the first candidate characters "01234567" are adopted here, and the branch name "アイウエオマチ" (Aiueo-machi) is determined by knowledge processing.

[0012]

[Problems to Be Solved by the Invention] Character recognition devices that read and input handwritten forms use forms designed to reduce the amount of handwriting, in order to raise the recognition rate and reduce operator corrections. That is, business type names such as "シンキン" (Shinkin) and "ギンコウ" (Ginkou) are selected with a mark, which improves the recognition rate.

[0013] With this design, instead of writing "イロハシンキン" (Iroha Shinkin) as the financial institution name, the customer only needs to handwrite the proper-name part "イロハ" (Iroha) and specify the business type "シンキン" (Shinkin) with a mark. The character recognition device concatenates the two items and recognizes them as "イロハシンキン". Of course, if the customer handwrites "イロハシンキン" and leaves the mark blank, the result is also recognized as "イロハシンキン", but the recognition rate drops because more of the name is handwritten.

[0014] With this entry method, ordinary customers often write the name by which the institution is usually called, for example "イロハシンキン" (Iroha Shinkin), in the proper-name field and then also mark "信金" (shinkin). In the character recognition device, "シンキン" is then duplicated, giving "イロハシンキンシンキン" (Iroha Shinkin Shinkin). Because this string is not registered in the knowledge dictionary, knowledge processing such as comparison cannot extract the correct recognized characters from the candidates, and "イロハシンキンシンキン" is displayed as it is. Even if every individual character was recognized correctly, the operator must still correct the result.

[0015] Thus, depending on how the form is filled in, knowledge processing cannot be expected to improve recognition, and operator corrections cannot be reduced. In view of this problem, an object of the present invention is to provide a character recognition device that deletes extra characters when concatenating recognized character strings.

[0016]

[Means for Solving the Problems] To achieve the above object, the character recognition device of the present invention is configured as follows, as shown in the principle diagram of FIG. 1.
(1) A character recognition device that generates a prescribed recognized character string 35 by concatenating first recognized characters 30 and second recognized characters 31, each recognized from a different image, comprises: a match verification unit 32 that compares, digit by digit, the prescribed characters (B) on the concatenation side of the first recognized characters 30 with the second recognized characters 31 (C) to verify whether they match; a character deletion unit 33 that deletes the prescribed characters (B) from the first recognized characters 30 when the verification finds a match; and a character concatenation unit 34 that concatenates the first recognized characters after deletion (A) with the second recognized characters (C).
(2) In (1), the second recognized characters 31 are characters recognized by mark reading and then converted.
(3) In (1) or (2), when a plurality of candidate characters are presented digit by digit as the first recognized characters 30, each candidate character is compared digit by digit with the second recognized characters 31, and if a candidate character matching the second recognized characters 31 exists at every digit, the prescribed characters, including those candidate characters, are deleted from the first recognized characters 30.
(4) In (1), (2), or (3), the device has a knowledge processing unit that corrects recognized characters by comparing them with registered characters, and when knowledge processing is performed on the concatenated recognized character string 35, it is performed on the recognized character string concatenated after the deletion processing.
(5) In (1), (2), (3), or (4), the device has a display device and displays the concatenated recognized character string 35 on it.
(6) In (1), (2), (3), (4), or (5), the device has a character recognition unit that performs character recognition on image data and outputs the candidate characters.
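Configuration (1) can be sketched as three small functions, one per unit. This is a minimal Python illustration under stated assumptions, not the patented implementation: the function names are hypothetical, strings stand in for recognized characters, and each Python character is treated as one digit. The labels A, B, and C follow FIG. 1 as described in the text.

```python
def verify_match(first: str, second: str) -> bool:
    """Match verification unit 32: does the tail B of `first` equal C (`second`)?"""
    if not second or len(first) < len(second):
        return False
    tail_b = first[-len(second):]
    return tail_b == second


def delete_tail(first: str, second: str) -> str:
    """Character deletion unit 33: drop B from `first` when it matches C, leaving A."""
    if verify_match(first, second):
        return first[: len(first) - len(second)]
    return first


def concatenate(first: str, second: str) -> str:
    """Character concatenation unit 34: join A and C into recognized string 35."""
    return delete_tail(first, second) + second


print(concatenate("イロハシンキン", "シンキン"))  # duplicate tail removed before joining
print(concatenate("イロハ", "シンキン"))          # no duplicate, plain concatenation
```

Comparing only the tail of the first string, rather than searching anywhere inside it, mirrors the text's restriction to the characters "on the concatenation side".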

[0017]

[Operation]

(1) The match verification unit 32 compares the prescribed character string (B in FIG. 1) on the concatenation side of the first recognized characters 30 with the second recognized characters 31, each of which was recognized from a different image. If the verification finds a match, the character deletion unit 33 deletes the prescribed character string (B) from the first recognized characters 30, and the character concatenation unit 34 concatenates the first recognized characters after deletion (A in FIG. 1) with the second recognized characters 31 (C in FIG. 1).

[0018] As a result, the duplication of characters caused by double entry, which arises when the first recognized characters 30 and the second recognized characters 31 are concatenated, is removed automatically.
(2) In (1), the second recognized characters 31 are characters recognized by mark reading and then converted. Characters recognized by mark reading generally have a higher recognition rate than characters read from an image, so they make a good comparison reference for verification, and erroneous deletions are reduced.
(3) In (1) or (2), when a plurality of recognized candidate characters are presented digit by digit for the first recognized characters 30, each candidate character is compared digit by digit with the second character string 31. If a candidate character matching the second character string 31 exists at every digit, a double entry is deemed to exist, and the prescribed characters, including the candidate characters, are deleted from the first recognized characters 30.

[0019] As a result, even when the superfluous entry is recognized poorly, it can be deleted as long as matching candidate characters exist.
(4) When the knowledge processing unit performs knowledge processing on the concatenated character string, it does so on the recognized character string concatenated after the deletion processing. Because the duplicated part has been deleted, the knowledge processing is effective.
(5) When a character recognition correction device is configured, a display device is provided to display the concatenated recognized character string. This realizes a device in which the operator can check and correct the result.
(6) Providing a character recognition unit that performs character recognition on image data makes it possible, for example, to recognize characters in image data sent by FAX. If an OCR device is provided, an input device for handwritten forms can be realized.

[0020] As described above, superfluous duplicate entries can be deleted when items are concatenated, which reduces operator corrections and lets knowledge processing achieve its intended effect.

[0021]

[Embodiment] FIG. 2 is a configuration diagram of an embodiment, FIG. 3 is a processing flowchart, FIG. 4 is an explanatory diagram of a processing procedure (part 1) of the embodiment, and FIG. 5 is an explanatory diagram of a processing procedure (part 2) of the embodiment.

[0022] This embodiment describes an example applied to a character recognition device that uses an OCR device to read the transaction form shown in the conventional example, displays the recognized characters after knowledge processing, and lets the operator correct them.

[0023] In FIG. 2, reference numeral 1 denotes an OCR device. It reads the image and marks of each item on the inserted handwritten form 10 and outputs the recognized characters, the digit-by-digit candidate characters for them (together corresponding to the first recognized characters 30), and the selection information specified by the marks.

[0024] Reference numeral 2 denotes a character string conversion unit. Using the binary selection information recognized by mark reading, it refers to the character string conversion table 6 and converts the information into a predetermined character string (corresponding to the second recognized characters 31 and called the converted character string).

[0025] Reference numeral 14 denotes a character deletion unit, which includes the match verification unit 32 shown in FIG. 1. It verifies whether the converted character string produced by the character string conversion unit 2 (here, the business type name) is present in the recognized characters of the financial institution name to be concatenated, and if so, deletes the business type name from the financial institution name.

[0026] Reference numeral 15 denotes a character concatenation unit, which concatenates the financial institution name processed by the character deletion unit 14 with the business type name. If there are candidate characters other than the first candidate characters, every financial institution name formed by combining them with the first candidate characters is concatenated with the business type name.

[0027] Reference numeral 3 denotes a knowledge processing unit. It compares the character strings registered in the knowledge dictionary with the recognized characters (including candidate characters) formed by concatenating the financial institution name and the business type name, and when one matches, judges that recognized string to be correct. If no recognized string matches a registered character string, the first candidate characters are output as the recognized characters.

[0028] Reference numeral 4 denotes a screen display unit, which concatenates the finally determined recognized characters based on the linked item names defined in the screen definition body 5 and displays them on the display unit 16. Reference numeral 17 denotes a keyboard. When a displayed recognized character needs correction, the operator moves the cursor to that character and types the correct one.

[0029] Reference numeral 13 denotes a correction unit, which corrects recognized characters according to correction input from the keyboard 17. Reference numeral 12 denotes a memory, used for loading the file data described below, for storing the recognized character data that has been read, and as working memory for the units described above.

[0030] Reference numeral 11 denotes a central processing unit (CPU), which operates the above units connected by a bus and carries out the character recognition processing. Reference numeral 7 denotes a knowledge dictionary, which is stored in a file and in which the data for each item is registered as shown in FIG. 8.

[0031] Reference numeral 5 denotes a screen definition body stored in a file for each form. It defines screen display attributes for each item, for example the linked item name, display row position, display column position, item length, and display color, and the screen display unit 4 refers to it when displaying. The character concatenation unit 15 also refers to the screen definition body 5 when performing concatenation, stores the result in the memory 12, and makes it available for knowledge processing.
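One way to picture the screen definition body is as a list of per-item records. The Python sketch below is only an illustration: the class name, field names, item identifiers, and all positions, lengths, and colors are hypothetical, since the source lists the kinds of attributes but not their encoding or values.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ScreenItem:
    """One item entry in a screen definition (attribute kinds taken from the text)."""
    name: str
    linked_item: Optional[str]  # item to concatenate with, if any
    row: int                    # display row position
    column: int                 # display column position
    length: int                 # item length
    color: str                  # display color


def linked_pairs(definition: List[ScreenItem]) -> List[Tuple[str, str]]:
    """List (item, linked item) pairs that the concatenation step should join."""
    return [(item.name, item.linked_item) for item in definition if item.linked_item is not None]


# Hypothetical definition for the transfer form; every value here is made up.
TRANSFER_FORM_SCREEN = [
    ScreenItem("account_number", None, 1, 10, 8, "white"),
    ScreenItem("institution_name", "business_type", 2, 10, 15, "white"),
    ScreenItem("business_type", None, 2, 25, 4, "white"),
    ScreenItem("branch_name", None, 3, 10, 15, "white"),
]

pairs = linked_pairs(TRANSFER_FORM_SCREEN)
print(pairs)
```

Keeping the link in the definition rather than in code matches the text's point that both the display unit and the concatenation unit consult the same definition.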

[0032] Reference numeral 6 denotes a character string conversion table stored in a file. It consists of a form ID indicating the form type, the number of conversion items for that form, item names (for example, business type), the number of table entries (number of business types), selection information (such as 0100), the corresponding converted character strings (such as シンキン), and so on. For example, for a transfer form, when mark reading of the form ID yields "01001" and reading of the business type yields "0100", the character string conversion table for transfer forms is consulted and the value is converted to "シンキン" (Shinkin).
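The lookup described in this paragraph can be sketched as a nested mapping keyed by form ID, item name, and selection code. This is a hedged Python illustration: the nesting, the key name `business_type`, the function name, and the "1000" entry are assumptions; only the form ID "01001" and the pair "0100" to "シンキン" come from the text.

```python
from typing import Dict, Optional

# form ID -> item name -> selection information -> converted character string
CONVERSION_TABLES: Dict[str, Dict[str, Dict[str, str]]] = {
    "01001": {  # transfer form
        "business_type": {
            "0100": "シンキン",  # given in the text
            "1000": "ギンコウ",  # hypothetical code, for illustration only
        },
    },
}


def convert_selection(form_id: str, item_name: str, selection: str) -> Optional[str]:
    """Return the converted character string, or None when nothing is registered."""
    return CONVERSION_TABLES.get(form_id, {}).get(item_name, {}).get(selection)


print(convert_selection("01001", "business_type", "0100"))
print(convert_selection("01001", "business_type", "0000"))  # unregistered code
```

Returning `None` for an unregistered code is a design choice of this sketch; the source does not say how the device handles an unknown mark pattern.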

[0033] With the above configuration, the following recognition processing is performed (see FIG. 3). Assume that on the form 10 the account number "01234567", the financial institution name "イロハシンキン" (Iroha Shinkin), and the branch name "アイウエオマチ" (Aiueo-machi) have been written by hand, and that "信金" (shinkin) has been marked as the business type. In other words, "シンキン" has been entered twice. When this form 10 is inserted into the OCR device 1, the OCR device 1 reads the image data of each item, the marks, and the image data of the form ID "01001" at the upper left of the form 10, and converts them into character data. For image data, if candidate characters other than the recognized first candidate characters exist, they are output as well.
(1) First, the form ID is obtained.
(2) The character string conversion unit 2 loads the character string conversion table 6 into the memory 12.
(3) It loops over the items in the conversion table for form ID "01001" and searches the registered items. The character string (converted character string) corresponding to the selection information "0100", here "シンキン", is found, so the selection is converted to "シンキン".
(4) Next, the character deletion unit 14 compares the character string on the concatenation side of the recognized financial institution name with the converted character string.
(5) If they do not match, processing proceeds directly to step (7).
(6) If they match, the characters identical to the converted character string are deleted from the recognized character string.
(7) The character concatenation unit 15 refers to the screen definition body 5 to identify the linked items and concatenates the recognized character string output by the character deletion unit 14 with the converted character string. The knowledge processing unit 3 then refers to the knowledge dictionary 7 and checks whether the concatenated recognized character string is in the knowledge dictionary.
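Steps (1) through (7) can be strung together as follows. This is a simplified Python sketch under several assumptions: it handles a single recognized string with no alternative candidates, the table layout and the names `recognize_institution`, `tables`, and `dictionary` are hypothetical, and the screen-definition lookup in step (7) is omitted.

```python
from typing import Dict, Set, Tuple

Tables = Dict[str, Dict[str, Dict[str, str]]]


def recognize_institution(
    form_id: str,
    recognized_name: str,
    selection: str,
    tables: Tables,
    dictionary: Set[str],
) -> Tuple[str, bool]:
    """Return the concatenated name and whether the knowledge dictionary contains it."""
    # Steps (1)-(3): pick the table for this form ID and convert the mark selection.
    converted = tables[form_id]["business_type"][selection]
    # Steps (4)-(6): compare the concatenation-side tail and delete it on a match.
    if recognized_name.endswith(converted):
        recognized_name = recognized_name[: len(recognized_name) - len(converted)]
    # Step (7): concatenate, then check the result against the dictionary.
    joined = recognized_name + converted
    return joined, joined in dictionary


tables = {"01001": {"business_type": {"0100": "シンキン"}}}
dictionary = {"イロハシンキン"}

double_entry = recognize_institution("01001", "イロハシンキン", "0100", tables, dictionary)
normal_entry = recognize_institution("01001", "イロハ", "0100", tables, dictionary)
print(double_entry)
print(normal_entry)
```

Both the double entry and the intended entry now arrive at the same registered name, which is the effect the flow is meant to achieve.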

[0034] Because the OCR device 1 generally outputs a plurality of candidate characters, if the first candidate string is not in the dictionary, recognized strings that combine the first candidate characters with other candidate characters are generated, concatenated, and compared. If any recognized string matches a character string registered in the knowledge dictionary 7, it is output as the correctly recognized string. Otherwise, the first candidate characters are output.
(8) The screen display unit 4 extracts the character strings of the linked items based on the information in the screen definition body 5, concatenates them, and displays the result. Since the concatenated recognized characters have already been obtained in step (7), they may also be displayed as they are.

[0035] FIGS. 4 and 5 show the processing procedure of the embodiment, using the same data as the conventional example. Table (イ) is the OCR output including candidate characters, on which character string conversion is performed first. As a result, "シンキン" (Shinkin) is obtained as the business type name, as shown in table (ロ). Up to this point the processing is the same as in the conventional example.

[0036] Next, as shown in table (ヘ), character deletion is performed on the financial institution name. Because the converted character string "シンキン" has four digits, the four digits on the concatenation side of the financial institution name are compared with "シンキン". In this case, those four digits have multiple candidate characters, so for each digit the corresponding character of "シンキン" is compared with all candidate characters for that digit. The candidates corresponding to the "シ" of "シンキン" are "ソ", "シ", and "ン", so "シ" is present among the candidates for that digit. The candidates corresponding to the "ン" of "シンキン" are "？", "ツ", and "ン", so "ン" is present among the candidates for that digit.

[0037] When each digit is compared in this way, a candidate character matching "シンキン" exists at every compared digit, as indicated by the highlighted (inverted) characters in table (ト). The last four digits of the financial institution name are therefore judged to be "シンキン". Because this duplicates the converted character string "シンキン", the last four digits of the candidate strings "イロハシ？キン", "リ−−ソツエン", and "−−−ツン−−" are deleted, leaving "イロハ" as the first candidate characters and "リ−−" and "−−−" as the next candidates.
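The per-digit comparison and deletion walked through above can be sketched in Python as follows. The function names and the row-per-rank data layout are assumptions; the three candidate rows are taken from the paragraph, with an ASCII hyphen standing in for the "no candidate" placeholder, which is also an assumption about what that symbol means.

```python
from typing import List


def tail_matches(candidate_rows: List[str], converted: str) -> bool:
    """True when every tail digit has some candidate equal to the converted character."""
    n = len(converted)
    if n == 0 or any(len(row) < n for row in candidate_rows):
        return False
    for offset, expected in enumerate(converted):
        position = offset - n  # negative index counted from the concatenation side
        if not any(row[position] == expected for row in candidate_rows):
            return False
    return True


def strip_duplicate_tail(candidate_rows: List[str], converted: str) -> List[str]:
    """Delete the tail digits from every candidate row when the tail matches."""
    if tail_matches(candidate_rows, converted):
        return [row[: len(row) - len(converted)] for row in candidate_rows]
    return list(candidate_rows)


# First, second, and third candidates per digit ("-" = no candidate, an assumption).
rows = ["イロハシ？キン", "リ--ソツエン", "---ツン--"]
stripped = strip_duplicate_tail(rows, "シンキン")
print(stripped)
```

Note that the second digit of the tail matches only through the third candidate row, which is exactly the case the text highlights: a poorly recognized duplicate can still be detected and removed.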

[0038] The result is shown in table (チ). The candidate strings after deletion are concatenated with the business type name "シンキン", and knowledge processing using the knowledge dictionary 7 is performed. In this case, "イロハ" + "シンキン" = "イロハシンキン" (Iroha Shinkin) exists in the knowledge dictionary 7, so the name is recognized as "イロハシンキン", and as shown in table (リ), the financial institution name is determined to be "イロハ" and the business type name "シンキン". The screen display unit 4 then concatenates the financial institution name and the business type name and displays "イロハシンキン".

【００３９】As noted above, when the transaction processing does not need "Iroha" and "Shinkin" to be kept separate, the corrected recognized characters have already been linked by the character linking unit 15 and stored, so no linking by the screen display unit 4 is required.

【００４０】The above embodiment describes an apparatus that has the OCR device 1, but the invention can of course also be applied to an apparatus that receives an image of a transfer form read by FAX and performs recognition processing on it.

【００４１】As described above, when items are entered in separate fields, characters entered in duplicate can be deleted automatically by comparing the two sets of recognized characters, which improves the recognition rate achieved by knowledge processing.

【００４２】[0042]

【発明の効果】[Effects of the Invention] As described above, when linking recognized item data, the character recognition device of the present invention deletes data that was entered in duplicate. Knowledge processing can therefore work effectively, and correction operations by the operator can be reduced.

【図面の簡単な説明】[Brief description of drawings]

【図１】FIG. 1 is a diagram illustrating the principle of the present invention.

【図２】FIG. 2 is a configuration diagram of an embodiment.

【図３】FIG. 3 is a processing flowchart.

【図４】FIG. 4 is an explanatory diagram of the processing procedure of the embodiment (part 1).

【図５】FIG. 5 is an explanatory diagram of the processing procedure of the embodiment (part 2).

【図６】FIG. 6 is a configuration diagram of a conventional example.

【図７】FIG. 7 is an explanatory diagram of the processing procedure of the conventional example (part 1).

【図８】FIG. 8 is an explanatory diagram of the processing procedure of the conventional example (part 2).

【符号の説明】[Explanation of symbols]

1 OCR device
2 Character string conversion unit
3 Knowledge processing unit
4 Screen display unit
5 Screen definition body
6 Character string conversion table
7 Knowledge dictionary
8 Display device
10 Handwritten form
11 Central processing unit (CPU)
12 Memory
13 Correction unit
14 Character string deletion unit
15 Character linking unit
16 Display unit
17 Keyboard
30 First recognized character
31 Second recognized character
32 Match verification unit
33 Character deletion unit
34 Character linking unit
35 Recognized character string

Continuation of the front page (51) Int.Cl.⁶ (identification symbol, office reference number, FI, technical indication location): G06F 15/30 320

Claims

【特許請求の範囲】[Claims]

【請求項１】1. A character recognition device that generates a predetermined recognized character string by linking a first recognized character and a second recognized character, each recognized from a different image, the device comprising: a match verification unit that compares predetermined characters on the linking side of the first recognized character with the second recognized character digit by digit to verify whether they match; a character deletion unit that deletes the predetermined characters from the first recognized character when the verification determines that they match; and a character linking unit that links the first recognized character after the deletion with the second recognized character.

【請求項２】2. The character recognition device according to claim 1, wherein the second recognized character is a character obtained by recognizing a mark through mark reading and converting it.

【請求項３】3. The character recognition device according to claim 1 or claim 2, wherein, when a plurality of candidate characters are presented for each digit as the first recognized character, each candidate character is compared with the second recognized character digit by digit, and, if every digit has a candidate character that matches the corresponding character of the second recognized character, the predetermined characters, including those candidate characters, are deleted from the first recognized character.

【請求項４】4. The character recognition device according to claim 1, claim 2, or claim 3, further comprising a knowledge processing unit that corrects recognized characters by comparing them with registered characters, wherein, when knowledge processing is performed on the linked recognized character string, the knowledge processing uses the recognized character string that has been linked after the deletion processing.

【請求項５】5. The character recognition device according to claim 1, claim 2, claim 3, or claim 4, further comprising a display device, wherein the linked recognized character string is displayed on the display device.

【請求項６】6. The character recognition device according to claim 1, claim 2, claim 3, claim 4, or claim 5, further comprising a character recognition unit that performs character recognition on image data and outputs candidate characters.