JP5609236B2

JP5609236B2 - Letter sorting system and destination estimation method

Info

Publication number: JP5609236B2
Application number: JP2010101029A
Authority: JP
Inventors: 正安達
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2010-04-26
Filing date: 2010-04-26
Publication date: 2014-10-22
Anticipated expiration: 2030-04-26
Also published as: JP2011232868A

Description

The present invention relates to a letter sorting system that has an optical reading function and sorts bulk letters, and to a destination estimation method.

Letters produced in large quantities by the same sender are called bulk letters. In general, bulk letters are produced in large quantities by machine-printing destinations taken from a destination list held by the bulk sender. Because bulk letters are produced in one batch by the same sender, they share the same design, such as the pattern and graphics on the letter. Because the destination information indicating the delivery address is machine-printed, it also appears at the same position on every letter. Furthermore, because the notation and character size of the destination information are identical, the rectangle enclosing the area in which the destination information is printed is almost uniform in size.

Because bulk letters share several similar image features in this way, when an optical character reader (OCR) can read the destination information automatically, it can be expected to read the destination information of almost all letters of the same kind. Conversely, when the OCR fails to read the destination information of a bulk letter, a large number of unreadable letters results. When OCR reading fails, the image is sent to a video encoding device or the like and the destination is read manually by a keying operator. In other words, the less effective the OCR device is, the greater the manual workload on the keying operators.

Bulk senders tend to send bulk letters repeatedly. Although the destination list for repeatedly sent bulk letters gains, loses, or changes some entries through updates, its contents do not change drastically all at once. Moreover, unless the bulk sender changes the letter it produces or its destination printing equipment, features such as the image characteristics of the letter, the position of the destination on the letter, the typeface, and the character size are presumed to stay the same.

Patent Document 1 describes an automatic mail sorting system that efficiently collects images that the OCR device may have misread. The system samples only images with a high likelihood of misreading by the OCR device and distributes them to the video encoding devices, so it collects such images more efficiently than random sampling would. As a result, the burden on the keying operators who operate the video encoding devices is lower than when randomly sampled images are distributed.

Patent Document 2 describes a mail destination reading device that appropriately detects the characters indicating the addressee's name and address. The destination reading device makes a determination so that the name and address of the sender of a bulk letter are not misread as the destination.

Patent Document 1: JP 2008-226066 A (paragraphs 0007-0009, 0028-0033)
Patent Document 2: JP H06-223218 A (paragraphs 0007-0010)

It is self-evident that improving OCR reading performance for bulk letters saves labor. However, improving the character recognition performance of an OCR device is technically difficult, and no particularly effective technique has existed for improving OCR reading performance on bulk letters specifically.

The automatic mail sorting system described in Patent Document 1 reduces the burden on the keying operators of the video encoding devices when collecting images that the OCR device may have misread, but it does not reduce their burden for letters that the OCR cannot read at all. Consequently, when the OCR fails to read the destination information of bulk letters automatically, the destinations of a large number of bulk letters must be read manually at the video encoding devices.

The destination reading device described in Patent Document 2 can detect the location of the destination more reliably, but it is not intended to reduce the burden on the keying operators of the video encoding devices when OCR reading of the detected location fails.

Accordingly, an object of the present invention is to provide a letter sorting system and a destination estimation method for bulk letters that reduce the manual workload and improve the efficiency of automatic processing.

A letter sorting system according to the present invention sorts a series of bulk letters sent by a bulk sender according to their destinations, and comprises: an image scanner unit that scans the surface of each letter on which the destination is written and generates a letter image; a reading unit that extracts, from the letter image generated by the image scanner unit, a destination area image in which the destination is written, and reads destination information from the destination area image; a destination database that stores, for each set of bulk letters, the destination area image in association with the destination information read from it by the reading unit; and a destination estimation unit that, when information on past bulk letters from the same bulk sender is stored in the destination database, compares an unreadable image, from which the reading unit could not read destination information, with the destination area images of the past bulk letters and estimates the destination information of the unreadable image.

A destination estimation method according to the present invention is used in a letter sorting system that sorts a series of bulk letters sent by a bulk sender according to their destinations. In the method, an image scanner unit scans the surface of each letter on which the destination is written and generates a letter image; a reading unit extracts, from the generated letter image, a destination area image in which the destination is written; the reading unit reads destination information from the extracted destination area image; a destination estimation unit stores the destination area image in a destination database in association with the destination information read from it, for each set of bulk letters; and, when information on past bulk letters from the same bulk sender is stored in the destination database, the destination estimation unit compares an unreadable image, from which destination information could not be read, with the destination area images of the past bulk letters and estimates the destination information of the unreadable image.

According to the present invention, estimating the destinations of bulk letters that the reading device could not read automatically reduces the number of bulk letters whose destinations cannot be read automatically, which reduces the manual workload and improves the efficiency of automatic processing.

FIG. 1 is a block diagram showing the configuration of an embodiment of the letter sorting system according to the present invention.
FIG. 2 is an explanatory diagram showing the processing in the letter sorting system during the initial operation.
FIG. 3 is an explanatory diagram showing an example of a letter image of bulk mail.
FIG. 4 is an explanatory diagram showing the flow of processing of a letter image whose areas have been separated.
FIG. 5 is an explanatory diagram showing an example of the data stored in the destination database.
FIG. 6 is an explanatory diagram showing the processing in the letter sorting system during the repeat operation.
FIG. 7 is an explanatory diagram showing an example of a bulk list.
FIG. 8 is an explanatory diagram showing the flow of data in the letter sorting system shown in FIG. 1.
FIG. 9 is a block diagram showing the main part of the letter sorting system according to the present invention.

FIG. 1 is a block diagram showing the configuration of an embodiment of the letter sorting system according to the present invention. The configuration of this embodiment is described with reference to FIG. 1.

The letter sorting system shown in FIG. 1 includes a destination recognition device 100, an optical character reader (OCR) 200, and a video encoding device 300. The OCR 200 and the video encoding device 300 may be provided inside the destination recognition device 100. Conversely, part of the destination recognition device 100 may be provided as a device separate from and independent of the destination recognition device 100.

The destination recognition device 100 determines whether the letters fed into it are bulk letters and identifies the destination to which each letter is to be delivered. It includes an image scanner unit 110, a bulk letter determination unit 120, a bulk letter image feature storage unit 130, an information integration unit 140, a destination database 150, and a bulk list storage unit 160.

The image scanner unit 110 optically scans the surface of a letter fed into the destination recognition device 100 and generates a letter image, which is image data of the surface on which the destination is written. The image scanner unit 110 outputs the generated letter image to the bulk letter determination unit 120 and to the OCR reading unit 210 of the OCR 200.

The bulk letter determination unit 120 cuts out, from the letter image input from the image scanner unit 110, the image of everything other than the destination area in which the destination is written. When bulk letters are fed into the destination recognition device 100 in succession, the images obtained by cutting out everything other than the destination area from the letter images successively input from the image scanner unit 110 are identical or similar. Hereinafter, an image obtained by cutting out everything other than the destination area from a letter image is therefore called a common area image.
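As an illustration only, the separation of a letter image into a destination area image and a common area image can be sketched as follows. The sketch assumes a grayscale image held as a list of pixel rows and a destination area given as a rectangle; the names `split_letter_image` and `dest_box` and the choice to blank the destination rectangle with white are hypothetical and do not come from the embodiment.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale rows, 0 (black) .. 255 (white)
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def split_letter_image(letter: Image, dest_box: Box, fill: int = 255) -> Tuple[Image, Image]:
    """Return (destination_area_image, common_area_image) for one letter image."""
    top, left, bottom, right = dest_box
    # Destination area: the rectangle that holds the printed destination.
    destination = [row[left:right] for row in letter[top:bottom]]
    # Common area: the whole letter image with the destination rectangle blanked,
    # so that only the sender's logo, design, and so on remain (assumed approach).
    common = [
        [fill if top <= y < bottom and left <= x < right else pixel
         for x, pixel in enumerate(row)]
        for y, row in enumerate(letter)
    ]
    return destination, common
```

Blanking rather than cropping keeps every common area image the same size as the letter image, which makes pixel-wise comparison between consecutive letters straightforward.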

The bulk letter determination unit 120 compares the successively cut-out common area images using an image processing technique, such as taking the image difference between consecutive common area images, and measures their similarity. When the measured similarity satisfies a predetermined threshold, the unit judges the common area images to be similar images. When common area images judged to be similar are input consecutively at least a predetermined number of times, the unit determines that the corresponding letters are the same bulk letters.
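The determination above can be sketched as follows, as one possible reading rather than the embodiment itself. The sketch assumes that similarity is the mean absolute pixel difference between consecutive common area images and that a run is counted in similar comparisons; the function names and the values of `max_difference` and `min_run` are hypothetical stand-ins for the "predetermined" threshold and count.

```python
from typing import Iterable, List, Optional

Image = List[List[int]]  # grayscale rows of equal size


def mean_abs_difference(a: Image, b: Image) -> float:
    """Mean absolute pixel difference between two equally sized images."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def detect_bulk_run(common_images: Iterable[Image],
                    min_run: int = 3,
                    max_difference: float = 10.0) -> bool:
    """Return True once `min_run` consecutive comparisons are judged similar."""
    previous: Optional[Image] = None
    run = 0
    for image in common_images:
        if previous is not None and mean_abs_difference(previous, image) <= max_difference:
            run += 1
            if run >= min_run:
                return True
        else:
            run = 0  # a dissimilar letter (or the first letter) restarts the run
        previous = image
    return False
```

A small difference threshold tolerates scanning noise between letters of the same kind, while the run length guards against two unrelated letters that happen to look alike.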

When it determines that the letters are the same bulk letters, the bulk letter determination unit 120 performs a feature calculation that quantifies the features of the common area image of those bulk letters. The feature may be, for example, the common area image itself; alternatively, as long as the common area image can be uniquely identified, density histograms along the X and Y directions, for example, may be used to reduce the amount of data. As a result of this calculation, the unit obtains a bulk letter image feature that uniquely identifies the common area image.
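A minimal sketch of the histogram option follows. It assumes that a "density histogram along the X and Y directions" means the column-wise and row-wise sums of pixel darkness (a projection profile); that interpretation and the name `density_histograms` are assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, 0 (black) .. 255 (white)


def density_histograms(image: Image, white: int = 255) -> Tuple[List[int], List[int]]:
    """Return (x_histogram, y_histogram) of accumulated darkness per column and per row."""
    height = len(image)
    width = len(image[0]) if height else 0
    x_hist = [0] * width   # one bin per column (X direction)
    y_hist = [0] * height  # one bin per row (Y direction)
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            darkness = white - pixel  # darker pixels contribute more
            x_hist[x] += darkness
            y_hist[y] += darkness
    return x_hist, y_hist
```

For an image of width W and height H this stores W + H values instead of W × H pixels, which is the kind of data reduction the paragraph mentions.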

The bulk letter image feature storage unit 130 assigns a unique bulk ID number that identifies the bulk letters determined by the bulk letter determination unit 120, and stores the bulk letter image feature of those bulk letters together with the bulk ID number.

The information integration unit 140 attaches, as a key, the bulk ID number of the letter stored in the bulk letter image feature storage unit 130 to the image of the letter's destination area (destination area image) and its destination information, which are input from the OCR reading unit 210 or the video encoding device 300, and outputs the result to the destination database 150. The information integration unit 140 may associate the destination area image and destination information with the bulk ID number by using processing timing, a management number, or any other common method.
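One way to picture the association by management number is a simple join, sketched below. The sketch assumes that each letter receives a per-letter management number at scanning time and that both the reading result and the bulk ID number are reported under that number; the function `integrate` and its data layout are hypothetical.

```python
from typing import Dict, List, Tuple

ReadResult = Tuple[bytes, str]   # (destination area image, destination information)
Record = Tuple[int, bytes, str]  # (bulk ID number, image, destination information)


def integrate(read_results: Dict[int, ReadResult],
              bulk_ids: Dict[int, int]) -> List[Record]:
    """Join reading results with bulk ID numbers on a shared management number."""
    records: List[Record] = []
    for management_no in sorted(read_results):
        bulk_id = bulk_ids.get(management_no)
        if bulk_id is None:
            continue  # letter not (yet) determined to be a bulk letter
        area_image, destination = read_results[management_no]
        records.append((bulk_id, area_image, destination))
    return records
```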

During the repeat operation, the information integration unit 140 also compares the set of address area images in the bulk list with the set of OCR reject images and estimates the destination information of each OCR reject image. The repeat operation is described later.
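The text at this point states only that the two sets of images are compared. As a hedged illustration, the sketch below assumes a nearest-match rule: each OCR reject image is compared with every address area image in the bulk list by mean absolute pixel difference, and the destination of the closest image is adopted if the difference is small enough. The rule, the function names, and `max_difference` are assumptions.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]
BulkList = List[Tuple[Image, str]]  # (address area image, destination information)


def mean_abs_difference(a: Image, b: Image) -> Optional[float]:
    """Mean absolute pixel difference, or None when the image sizes differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return None
    pixels = [abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(pixels) / len(pixels) if pixels else None


def estimate_destination(reject_image: Image,
                         bulk_list: BulkList,
                         max_difference: float = 10.0) -> Optional[str]:
    """Return the destination of the most similar stored image, or None if none is close."""
    best: Optional[Tuple[float, str]] = None  # (difference, destination)
    for area_image, destination in bulk_list:
        difference = mean_abs_difference(reject_image, area_image)
        if difference is None:
            continue
        if best is None or difference < best[0]:
            best = (difference, destination)
    if best is not None and best[0] <= max_difference:
        return best[1]
    return None  # left for manual entry at the video encoding device
```

Returning `None` when no stored image is close enough keeps uncertain cases on the manual path instead of assigning a wrong destination.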

The destination database 150 stores the bulk ID numbers, destination area images, and destination information successively input from the information integration unit 140 as pairs of destination area image and destination information keyed by bulk ID number. That is, for the bulk letters fed into the destination recognition device 100, the destination database 150 stores the destination area image and destination information of each letter as a pair under the bulk ID number.

During the repeat operation, the bulk list storage unit 160 temporarily stores the set (list) of destination area images and destination information that belong to a specific bulk ID number among the data stored in the destination database 150.

The OCR 200 has an optical reading function and includes an OCR reading unit 210 and an OCR reject image storage unit 220.

The OCR reading unit 210 cuts out, from the letter image input from the image scanner unit 110, a destination area image showing the destination area in which the destination is written. It reads the destination information indicating the destination from the cut-out destination area image, and outputs the destination area image and the destination information read from it to the information integration unit 140. If the OCR reading unit 210 cannot read the destination information from a destination area image during the initial operation, it outputs that destination area image (an OCR reject image) to the video encoding device 300. If it cannot read the destination information during the repeat operation, it outputs the OCR reject image to the OCR reject image storage unit 220. The initial operation is described later.
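The routing of read results and OCR reject images described above can be summarized in a short sketch. The queues are modeled as plain lists and a failed read as `None`; the names `Mode` and `route_ocr_result` are hypothetical.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Mode(Enum):
    INITIAL = "initial"  # first time these bulk letters are processed
    REPEAT = "repeat"    # bulk letters from the same sender seen before


def route_ocr_result(area_image: bytes,
                     destination: Optional[str],
                     mode: Mode,
                     integration_input: List[Tuple[bytes, str]],
                     video_encoding_input: List[bytes],
                     reject_store: List[bytes]) -> str:
    """Send one OCR result to the unit named in the description; return that unit's label."""
    if destination is not None:
        # Successful read: image and destination go to the information integration unit.
        integration_input.append((area_image, destination))
        return "information_integration"
    if mode is Mode.INITIAL:
        # Reject during the initial operation: straight to manual entry.
        video_encoding_input.append(area_image)
        return "video_encoding"
    # Reject during the repeat operation: held until the whole batch has been read.
    reject_store.append(area_image)
    return "reject_store"
```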

During the repeat operation, the OCR reject image storage unit 220 temporarily stores the OCR reject images input from the OCR reading unit 210 until reading of all the bulk letters has finished.

The video encoding device 300 encodes a destination into destination information through manual entry. At the video encoding device 300, an operator visually determines the destination in each OCR reject image input from the OCR reject image storage unit 220 and keys in the determined destination to generate the destination information. The video encoding device 300 then outputs the destination area image of the OCR reject image and the generated destination information to the information integration unit 140.

During the initial operation, the OCR reading unit 210 and the video encoding device 300 may each temporarily hold the destination area images and destination information they generate until the number of letters fed into the destination recognition device 100 exceeds the number that the bulk letter determination unit 120 needs in order to determine whether the letters are bulk letters. The destination area is located in the letter image by the same method that general character recognition devices use for this purpose.

The operation of the letter sorting system shown in FIG. 1 is described below using mail as an example of letters; bulk letters are therefore bulk mail. The destination recognition device 100 performs two kinds of operation: an initial operation, which builds the destination address list of the bulk mail when that bulk mail is first sent and fed into the destination recognition device 100, and a repeat operation, which makes the OCR 200 refer to the destination address list built in the initial operation when bulk mail from the same sender is sent again and fed into the destination recognition device 100.

FIG. 2 is an explanatory diagram showing the processing in the letter sorting system during the initial operation. The processing in the letter sorting system shown in FIG. 1 when the destination recognition device 100 performs the initial operation is described with reference to FIG. 2.

First, when a mail piece is fed into the destination recognition device 100, the image scanner unit 110 scans the surface on which the destination is written and generates a letter image (step S11). The image scanner unit 110 outputs the generated letter image to the OCR reading unit 210 and the bulk letter determination unit 120.

FIG. 3 is an explanatory diagram showing an example of a letter image of bulk mail. In addition to the destination address, the bulk mail piece in the letter image of FIG. 3 carries the logo, advertisements, and so on of the bulk sender. The destination address differs from one mail piece to the next, but the logo, advertisements, and so on are the same for all pieces of the same bulk mail. In the letter image of FIG. 3, therefore, the area in which the address is written is the destination area and the remaining area is the common area.

When the letter image is input from the image scanner unit 110 in step S11, each receiving unit separates it into a common area and a non-common area, and the processing by the OCR reading unit 210 and related units in steps S12 to S14 runs in parallel with the processing by the bulk letter determination unit 120 and related units in steps S15 to S18.

FIG. 4 is an explanatory diagram showing the flow of processing of a letter image whose areas have been separated. When the letters are mail, for example, the bulk letter determination unit 120 and related units extract image features from the common area image and calculate the image feature of the bulk mail, as shown in the bulk determination flow of FIG. 4 (corresponding to steps S15 to S18, described later). The OCR reading unit 210 and related units extract the address data, which corresponds to the destination information, from the address area image, which corresponds to the non-common area image (destination area image), as shown in the OCR processing flow of FIG. 4 (corresponding to steps S12 to S14, described later).

The OCR reading unit 210, having received the letter image from the image scanner unit 110 in step S11, cuts out from it the address area image in which the destination is written (step S12). It reads the address data from the cut-out address area image (step S13) and outputs the address area image and the address data read from it to the information integration unit 140.

If the OCR reading unit 210 cannot read the address data from the address area image in step S13 of the initial operation, it outputs the unread address area image to the video encoding device 300 as an OCR reject image. At the video encoding device 300, an operator visually determines the destination in the OCR reject image input from the OCR reading unit 210 and keys in the determined destination to generate the address data (step S14). The video encoding device 300 outputs the address area image that was rejected by the OCR and the generated address data to the information integration unit 140.

Meanwhile, the bulk letter determination unit 120, having received the letter image from the image scanner unit 110 in step S11, cuts out from it the common area image, that is, everything other than the address area (step S15).

The bulk letter determination unit 120 cuts out a common area image from the letter image of each successively input mail piece, compares consecutive common area images using an image processing technique such as taking their image difference, and measures the similarity of the common area images of the series of mail pieces. It then determines whether the series of mail pieces whose common area images were compared is the same bulk mail (step S16).

If the series of mail pieces is determined in step S16 to be the same bulk mail, the bulk letter determination unit 120 performs a predetermined feature calculation that quantifies the features of the common area image of the bulk mail and obtains a bulk letter image feature that uniquely identifies that common area image (step S17). The unit then checks whether the bulk letter image feature calculated in step S17 is already stored in the bulk letter image feature storage unit 130 (step S18), for example by taking the difference between bulk letter image features. If the calculated feature is not found in the bulk letter image feature storage unit 130 in step S18, the processing for the initial operation is performed.

If the calculated bulk letter image feature is not stored in the bulk letter image feature storage unit 130 in step S18, the bulk letter image feature storage unit 130 assigns a unique bulk ID number that identifies the bulk mail determined by the bulk letter determination unit 120 (step S19) and stores the bulk letter image feature of that bulk mail together with the bulk ID number.
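The check in step S18 and the assignment in step S19 can be sketched as a small feature store. The sketch assumes that features are integer vectors (for example, concatenated X and Y histograms), that "difference comparison" is the sum of absolute differences, and that bulk ID numbers are sequential integers; the class `BulkFeatureStore` and the threshold `max_distance` are hypothetical.

```python
from itertools import count
from typing import Dict, List, Optional, Tuple

Feature = List[int]


class BulkFeatureStore:
    """Holds one feature vector per bulk ID number (illustrative only)."""

    def __init__(self, max_distance: float = 50) -> None:
        self._features: Dict[int, Feature] = {}
        self._next_id = count(1)
        self.max_distance = max_distance

    @staticmethod
    def distance(a: Feature, b: Feature) -> float:
        """Sum of absolute differences; infinite when the lengths differ."""
        if len(a) != len(b):
            return float("inf")
        return sum(abs(x - y) for x, y in zip(a, b))

    def find(self, feature: Feature) -> Optional[int]:
        """Step S18: return the bulk ID of the closest stored feature within the threshold."""
        best_id: Optional[int] = None
        best_distance: Optional[float] = None
        for bulk_id, stored in self._features.items():
            d = self.distance(stored, feature)
            if d <= self.max_distance and (best_distance is None or d < best_distance):
                best_id, best_distance = bulk_id, d
        return best_id

    def find_or_register(self, feature: Feature) -> Tuple[int, bool]:
        """Return (bulk_id, is_initial); register a new ID (step S19) when none matches."""
        existing = self.find(feature)
        if existing is not None:
            return existing, False
        bulk_id = next(self._next_id)
        self._features[bulk_id] = list(feature)
        return bulk_id, True
```

The boolean returned by `find_or_register` corresponds to the choice between the initial operation (no stored feature matches) and the repeat operation (a stored feature matches).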

The information integration unit 140 attaches, as a key, the bulk ID number of the mail piece, stored in the bulk letter image feature storage unit 130 in step S18, to the address area image and address data input from the OCR reading unit 210 in step S13 or from the video encoding device 300 in step S14, and successively outputs the result to the destination database 150 (step S20).

FIG. 5 is an explanatory diagram showing an example of the data stored in the destination database. As shown in FIG. 5, the destination database 150 stores pairs of a destination area image (an address area image when the letters are mail) and destination information (address data when the letters are mail), keyed by bulk ID number (step S21). Storing the data as shown in FIG. 5 means that, for example, when bulk mail is fed into the destination recognition device 100, the address area image and address data of every mail piece are grouped by series of bulk mail. The destination database 150 may store the letter image of the mail piece instead of the address area image, or may store the letter image in addition to the address area image.
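The layout described for FIG. 5 amounts to a mapping from bulk ID number to a list of image and address pairs. A minimal in-memory sketch follows; the class `DestinationDatabase`, its method names, and the use of `bytes` for images are assumptions, and an actual system would presumably use persistent storage.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[bytes, str]  # (destination area image, destination information)


class DestinationDatabase:
    """Pairs of destination area image and destination information, keyed by bulk ID number."""

    def __init__(self) -> None:
        self._records: Dict[int, List[Record]] = defaultdict(list)

    def add(self, bulk_id: int, area_image: bytes, destination: str) -> None:
        """Store one letter's pair under its bulk ID number (step S21)."""
        self._records[bulk_id].append((area_image, destination))

    def bulk_list(self, bulk_id: int) -> List[Record]:
        """Return a copy of all pairs for one bulk ID number (empty if unknown)."""
        return list(self._records.get(bulk_id, []))

    def bulk_ids(self) -> List[int]:
        """Return the bulk ID numbers that currently have records."""
        return sorted(self._records)
```

The list returned by `bulk_list` corresponds to what the bulk list storage unit 160 holds temporarily for one bulk ID number during the repeat operation.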

例えば、１００００通の大口郵便物が宛先認識装置１００に掛けられた場合に、大口書状判定部１２０が行う類似判定の精度が十分高ければ、宛先データベース１５０には新たに１つの大口ＩＤ番号とそれに属する１００００対の住所領域画像と住所データとが追加される。大口書状判定部１２０の類似判定の精度は、大口郵便差出人の作成する封筒のデザインに依存するが、封筒には差出人を明確にするために差出人の名称、住所とともに差出人独特のロゴマークを有したり、封筒に独自の意匠を持たせたりすることが多いという一般的な事実から、別々の大口差出人の封筒同士が類似判定で誤判定されるほどに類似することはまれである。以上のようにして、宛先データベース１５０に新たに追加された大口ＩＤ番号に属する住所の集合は、大口差出人がその大口郵便物を製作するために使用した住所リストにある住所の集合と、ほぼ同一であると推定される。 For example, when 10,000 pieces of large mail are put on the destination recognition device 100 and the similarity determination performed by the large letter determination unit 120 is sufficiently accurate, one new large ID number and the 10,000 pairs of address area images and address data belonging to it are added to the destination database 150. The accuracy of the similarity determination depends on the envelope design used by the large mail sender. However, to make the sender clear, envelopes generally carry the sender's name and address together with a distinctive logo mark, or have a design of their own, so envelopes from different large senders are rarely similar enough to be confused in the similarity determination. The set of addresses belonging to the large ID number newly added to the destination database 150 in this way is therefore presumed to be almost identical to the set of addresses in the address list that the large sender used to produce the large mail.
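The keyed storage described for FIG. 5 can be illustrated with a small in-memory structure. The names `AddressRecord` and `DestinationDatabase`, and the use of `bytes` as a placeholder for image data, are assumptions made for this sketch; the patent does not prescribe a storage technology.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressRecord:
    area_image: bytes   # address area image (placeholder for pixel data)
    address: str        # address data from OCR or from video coding


class DestinationDatabase:
    """Toy stand-in for the destination database 150, keyed by large ID number."""

    def __init__(self):
        self._records = defaultdict(list)

    def add(self, large_id, area_image, address):
        """Steps S20 to S21: store one image/address pair under its large ID."""
        self._records[large_id].append(AddressRecord(area_image, address))

    def records_for(self, large_id):
        """Return all pairs stored for one large sender (empty list if unknown)."""
        return list(self._records.get(large_id, []))
```

With this layout, feeding the 10,000-piece mailing from the example above would add 10,000 records under a single key.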

図６は、繰り返し動作時の書状物区分システムにおける処理を示す説明図である。図６を参照して、宛先認識装置１００が繰り返し動作を行う場合の、図１に示す書状物区分システムにおける処理を説明する。 FIG. 6 is an explanatory diagram showing processing in the letter sorting system during the repetitive operation. With reference to FIG. 6, processing in the letter sorting system shown in FIG. 1 when the destination recognition apparatus 100 repeatedly performs operations will be described.

大口郵便物の差出人によって再度差し出された大口郵便物が、宛先認識装置１００に掛けられた場合に、画像スキャナ部１１０が書状画像を生成し（ステップＳ３１）、生成した書状画像をＯＣＲ読取部２１０および大口書状判定部１２０に出力する処理は、図２のステップＳ１１に示す初回動作時の処理と同じである。 When large mail sent again by the same large mail sender is put on the destination recognition device 100, the image scanner unit 110 generates a letter image (step S31) and outputs it to the OCR reading unit 210 and the large letter determination unit 120. This processing is the same as the initial-operation processing shown in step S11 of FIG. 2.

画像スキャナ部１１０から入力された書状画像に対して、ＯＣＲ読取部２１０が、住所領域画像を切りだし（ステップＳ３２）、住所データを読み取る（ステップＳ３３）処理は、図２のステップＳ１２〜Ｓ１３に示す初回動作時の処理と同じである。ステップＳ３３において住所領域画像から住所データの読み取りを行ったＯＣＲ読取部２１０は、住所領域画像と、住所領域画像から読み取った住所データとを情報統合部１４０に出力する。 For the letter image input from the image scanner unit 110, the OCR reading unit 210 cuts out the address area image (step S32) and reads the address data (step S33). This processing is the same as the initial-operation processing shown in steps S12 to S13 of FIG. 2. Having read the address data from the address area image in step S33, the OCR reading unit 210 outputs the address area image and the address data read from it to the information integration unit 140.

大口書状判定部１２０が、共通領域画像を切り出し（ステップＳ３５）、類似画像判定と大口書状物判定とを行い（ステップＳ３６）、大口郵便物と判定した場合に大口書状画像特徴量を算出し（ステップＳ３７）、算出した大口書状画像特徴量が大口書状画像特徴量格納部１３０に格納済みであるか否かを確認する（ステップＳ３８）処理は、図２のステップＳ１５〜Ｓ１８に示す初回動作時の処理と同じである。ステップＳ３８において、算出した大口書状画像特徴量が大口書状画像特徴量格納部１３０に格納済みであった場合には、宛先認識装置１００に掛けられた書状物が、過去の大口差出人と同じ大口差出人による大口書状物であると判断され、繰り返し動作時の処理が行われる。 The large letter determination unit 120 cuts out the common area image (step S35), performs similar image determination and large letter determination (step S36), calculates a large letter image feature amount when the mail is determined to be large mail (step S37), and checks whether the calculated feature amount is already stored in the large letter image feature amount storage unit 130 (step S38). This processing is the same as the initial-operation processing shown in steps S15 to S18 of FIG. 2. If, in step S38, the calculated large letter image feature amount is already stored in the storage unit 130, the letters put on the destination recognition device 100 are judged to be large letters from the same large sender as in the past, and the repeated-operation processing is performed.

ステップＳ３８において、算出した大口書状画像特徴量が大口書状画像特徴量格納部１３０に格納済みであった場合に、大口書状判定部１２０は、格納済みであると確認された大口書状画像特徴量に対応付けられた大口ＩＤ番号を大口書状画像特徴量格納部１３０から取得する。そして、大口書状判定部１２０は、初回動作時に当該大口ＩＤ番号をキーとして宛先データベース１５０に格納された住所領域画像と住所データとの集合を取得し、大口リストを作成する。 In step S38, when the calculated large letter image feature value has been stored in the large letter image feature value storage unit 130, the large letter determination unit 120 determines that the large letter image feature value has been stored. The associated large ID number is acquired from the large letter image feature amount storage unit 130. Then, the large letter determination unit 120 acquires a set of address area images and address data stored in the destination database 150 using the large ID number as a key during the initial operation, and creates a large list.

図７は、大口リストの一例を示す説明図である。大口リストは、例えば、図７に示すように、宛先領域画像（書状物が郵便物の場合には、住所領域画像）、宛先情報（書状物が郵便物の場合には、住所データ）およびフラグを含む。フラグはＯＮとＯＦＦの値を持ち、最初はＯＦＦの値を持つ。大口リスト格納部１６０は、大口書状判定部１２０によって作成された大口リストを格納する（ステップＳ３９）。 FIG. 7 is an explanatory diagram showing an example of a large list. As shown in FIG. 7, the large list includes, for example, a destination area image (an address area image when the letter is mail), destination information (address data when the letter is mail), and a flag. The flag takes the value ON or OFF and is initially OFF. The large list storage unit 160 stores the large list created by the large letter determination unit 120 (step S39).

情報統合部１４０は、ＯＣＲ読取部２１０によって読み取られて情報統合部１４０に入力された住所データと、大口リスト格納部１６０に格納されている大口リストの住所データとを照合し、一致する住所データがあった場合には、当該住所データが含まれる大口リストのフラグをＯＮにする。 The information integration unit 140 collates the address data read by the OCR reading unit 210 and input to the information integration unit 140 with the address data of the large list stored in the large list storage unit 160 and matches the address data. If there is, the flag of the large list including the address data is turned ON.
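The large list and its flag handling can be sketched as below. The names `BulkListEntry`, `build_bulk_list`, and `mark_read` are hypothetical, and treating "matching" address data as exact string equality is an assumption; the text does not say whether addresses are normalized before comparison.

```python
from dataclasses import dataclass


@dataclass
class BulkListEntry:
    area_image: bytes    # address area image stored during the initial operation
    address: str         # address data stored during the initial operation
    flag: bool = False   # OFF initially, turned ON once OCR reads this address


def build_bulk_list(records):
    """Step S39: build the large list from (area_image, address) pairs of one large ID."""
    return [BulkListEntry(image, address) for image, address in records]


def mark_read(bulk_list, ocr_address):
    """Turn ON the flag of every entry whose address equals the OCR result.

    Returns True if at least one entry matched.
    """
    matched = False
    for entry in bulk_list:
        if entry.address == ocr_address:   # assumed: exact string match
            entry.flag = True
            matched = True
    return matched
```

After the whole mailing has been processed, the entries whose `flag` is still `False` are the candidates for the OCR reject images.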

一方、ステップＳ３３において、ＯＣＲ読取部２１０は、住所領域画像から住所データを読み取ることができなかった場合には、読み取りができなかった住所領域画像をＯＣＲリジェクト画像としてＯＣＲリジェクト画像格納部２２０に格納する（ステップＳ３４）。 On the other hand, in step S33, when the address data cannot be read from the address area image, the OCR reading unit 210 stores the address area image that could not be read as an OCR reject image in the OCR reject image storage unit 220. (Step S34).

そして、大口郵便物がすべて宛先認識装置１００に掛けられた後に、情報統合部１４０は、フラグがＯＮになっていない大口リスト、すなわちフラグがＯＦＦのままの大口リストの住所領域画像の集合と、ＯＣＲリジェクト画像格納部２２０に格納されたＯＣＲリジェクト画像の集合とで画像比較を行う。情報統合部１４０は、画像比較の結果、類似度が非常に高い画像同士の組み合わせを得ることができる。類似度が非常に高い画像同士の組み合わせが得られる理由としては、大口書状物の数量が大量であっても１人の差出人が差し出す数量は最大でも数万通程度の有限であること、宛先の位置、文字サイズおよび書体は、初回動作時に記憶した画像のそれと全く同一であることが期待されることが挙げられる。従って、例えば、単純な画像比較手法である画像差分比較を用いた場合でも、容易に同定することが可能である。 Then, after all of the large mail has been put on the destination recognition device 100, the information integration unit 140 compares the set of address area images of large list entries whose flag has not been turned ON, that is, whose flag is still OFF, with the set of OCR reject images stored in the OCR reject image storage unit 220. This comparison can yield pairs of images with very high similarity, for two reasons: however large a mailing is, the number of pieces from one sender is finite, at most on the order of tens of thousands; and the position, character size, and typeface of the destination are expected to be identical to those of the image stored during the initial operation. Consequently, matching pairs can be identified easily even with a simple image comparison method such as image difference comparison.

住所領域画像とＯＣＲリジェクト画像との画像比較の結果、所定の水準を超える類似度を示す住所領域画像とＯＣＲリジェクト画像との組み合わせが見つかった場合に、情報統合部１４０は、見つかった組み合わせの画像は、同じ住所データを示す郵便物であると見なす。そして、情報統合部１４０は、当該住所領域画像を含む大口リストの住所データを当該ＯＣＲリジェクト画像が持つ住所データとする（ステップＳ４０）。 When the image comparison finds a pair consisting of an address area image and an OCR reject image whose similarity exceeds a predetermined level, the information integration unit 140 regards the two images as belonging to mail pieces with the same address data. The information integration unit 140 then takes the address data of the large list entry containing that address area image as the address data of the OCR reject image (step S40).
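The comparison and estimation in step S40 can be sketched with a plain image-difference measure. The following is a minimal sketch under stated assumptions: images are equal-size 8-bit grayscale arrays given as nested lists, similarity is defined as one minus the normalized mean absolute pixel difference, the threshold value 0.95 is arbitrary, and picking the single best-scoring stored image per reject image is a choice made here for illustration. None of these details are fixed by the text, which only requires a similarity above a predetermined level.

```python
def similarity(img_a, img_b):
    """Similarity in [0, 1] based on mean absolute pixel difference.

    Both images are lists of rows of 8-bit grayscale values with the same shape.
    """
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same shape")
    pixel_count = sum(len(row) for row in img_a)
    if pixel_count == 0:
        raise ValueError("images must not be empty")
    total_diff = sum(
        abs(a - b) for row_a, row_b in zip(img_a, img_b) for a, b in zip(row_a, row_b)
    )
    return 1.0 - total_diff / (255 * pixel_count)


def estimate_addresses(reject_images, unflagged_entries, threshold=0.95):
    """Estimate addresses for OCR reject images (step S40).

    unflagged_entries: list of (area_image, address) pairs whose flag is still OFF.
    Returns a dict mapping the index of each matched reject image to an address.
    """
    estimates = {}
    for index, reject in enumerate(reject_images):
        best_score, best_address = 0.0, None
        for image, address in unflagged_entries:
            score = similarity(reject, image)
            if score > best_score:
                best_score, best_address = score, address
        # Accept the best candidate only if it exceeds the predetermined level.
        if best_address is not None and best_score > threshold:
            estimates[index] = best_address
    return estimates
```

Reject images that match no stored image above the threshold are simply absent from the result and would still need manual handling.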

図８は、図１に示す書状物区分システムにおけるデータの流れを示す説明図である。図８では、初回動作と繰り返し動作とにおけるデータの流れがまとめて記載されている。図８を参照して、図１に示す書状物区分システムにおけるデータの流れを説明する。 FIG. 8 is an explanatory diagram showing a data flow in the letter sorting system shown in FIG. In FIG. 8, the data flow in the initial operation and the repetitive operation is collectively described. With reference to FIG. 8, the flow of data in the letter sorting system shown in FIG. 1 will be described.

書状画像は、大口書状物を走査した画像スキャナ部１１０から、ＯＣＲ読取部２１０および大口書状判定部１２０に出力される。大口書状画像特徴量は、大口書状判定部１２０によって算出され、大口書状画像特徴量格納部１３０に出力される。大口ＩＤ番号は、大口書状画像特徴量格納部１３０で付与され、情報統合部１４０に出力される。 The letter image is output from the image scanner unit 110, which has scanned the large letter, to the OCR reading unit 210 and the large letter determination unit 120. The large letter image feature amount is calculated by the large letter determination unit 120 and output to the large letter image feature amount storage unit 130. The large ID number is assigned by the large letter image feature amount storage unit 130 and output to the information integration unit 140.

宛先領域画像および宛先情報は、ＯＣＲ読取部２１０またはビデオ符号化装置３００から、情報統合部１４０に出力される。ＯＣＲ読み取りができなかったＯＣＲリジェクト画像は、ＯＣＲ読取部２１０から、ビデオ符号化装置３００またはＯＣＲリジェクト画像格納部２２０に出力される。 The destination area image and the destination information are output from the OCR reading unit 210 or the video encoding device 300 to the information integration unit 140. An OCR reject image, that is, an image that could not be read by OCR, is output from the OCR reading unit 210 to the video encoding device 300 or to the OCR reject image storage unit 220.

大口ＩＤ番号、宛先領域画像および宛先情報は、情報統合部１４０において関連付けされ、宛先データベース１５０に格納される。繰り返し動作時には、宛先データベース１５０に格納されたデータから大口リストが作成されて大口リスト格納部１６０に出力される。 The large ID number, the destination area image, and the destination information are associated with one another in the information integration unit 140 and stored in the destination database 150. During repeated operation, a large list is created from the data stored in the destination database 150 and output to the large list storage unit 160.

このような書状物区分システムでは、宛先認識装置１００が、初回動作時に、大口書状物の差出人が保有する宛先リストと同等な内容を持つデータベースを宛先データベース１５０に自動的に作成し、繰り返し動作時に、初回動作時に作成したデータベースを利用して、ＯＣＲ読み取りができなかった大口書状物の宛先を推定することができるので、ビデオ符号化装置３００において手作業で処理しなければならないＯＣＲリジェクト画像の量を軽減し、自動処理効率を向上させることができる。 In such a letter sorting system, the destination recognition device 100 automatically builds, in the destination database 150 during the initial operation, a database whose content is equivalent to the destination list held by the sender of the large letters. During repeated operation, it uses that database to estimate the destinations of large letters that could not be read by OCR. This reduces the number of OCR reject images that must be processed manually at the video encoding device 300 and improves automatic processing efficiency.

図９は、本発明による書状物区分システムの主要部を示すブロック図である。図９に示すように、書状物区分システム１（例えば、図１に示す宛先認識装置１００およびＯＣＲ２００に相当）は、大口差出人から差し出される一連の大口書状物の宛先を区分する書状物区分システムであって、それぞれの書状物の宛先が記載された面を走査して書状画像を生成する画像スキャナ部２（例えば、図１に示す画像スキャナ部１１０に相当）と、画像スキャナ部２によって生成された書状画像から宛先が記載されている宛先領域画像を抽出し、宛先領域画像から宛先情報を読み取る読取部３（例えば、図１に示すＯＣＲ読取部２１０に相当）と、宛先領域画像と、宛先領域画像から読取部３によって読み取られた宛先情報とを対応付けて大口書状物ごとに格納する宛先データベース４（例えば、図１に示す宛先データベース１５０に相当）と、宛先データベース４に、同じ大口差出人による過去の大口書状物の情報が格納されている場合に、読取部３によって宛先情報を読み取ることができなかった読取不能画像（例えば、ＯＣＲリジェクト画像）と、過去の大口書状物の宛先領域画像とを比較して、読取不能画像の宛先情報を推定する宛先推定部５（例えば、図１に示す情報統合部１４０に相当）とを備えるように構成されている。 FIG. 9 is a block diagram showing the main part of the letter sorting system according to the present invention. As shown in FIG. 9, the letter sorting system 1 (for example, corresponding to the destination recognition device 100 and the OCR 200 shown in FIG. 1) sorts the destinations of a series of large letters sent from a large sender. It comprises: an image scanner unit 2 (for example, corresponding to the image scanner unit 110 shown in FIG. 1) that scans the surface of each letter on which the destination is written and generates a letter image; a reading unit 3 (for example, corresponding to the OCR reading unit 210 shown in FIG. 1) that extracts, from the letter image generated by the image scanner unit 2, a destination area image in which the destination is written and reads destination information from the destination area image; a destination database 4 (for example, corresponding to the destination database 150 shown in FIG. 1) that stores, for each large letter, the destination area image in association with the destination information read from it by the reading unit 3; and a destination estimation unit 5 (for example, corresponding to the information integration unit 140 shown in FIG. 1) that, when the destination database 4 stores information on past large letters from the same large sender, compares an unreadable image from which the reading unit 3 could not read destination information (for example, an OCR reject image) with the destination area images of the past large letters to estimate the destination information of the unreadable image.

また、上記の実施形態では、以下の（１）〜（５）に示すような書状物区分システムも開示されている。 Moreover, in said embodiment, the letter sort system as shown to the following (1)-(5) is also disclosed.

（１）宛先推定部５は、宛先データベース４に格納されている同じ大口差出人による過去の大口書状物の宛先領域画像のうち、新たに読取部３によって読み取られた宛先情報とは異なる宛先情報に対応付けられた宛先領域画像と、読取不能画像とを比較して、読取不能画像の宛先情報を推定する（例えば、図６のステップＳ４０に示す動作によって実現される。）書状物区分システム。 (1) A letter sorting system in which the destination estimation unit 5 estimates the destination information of an unreadable image by comparing the unreadable image with those destination area images, among the destination area images of past large letters from the same large sender stored in the destination database 4, that are associated with destination information different from the destination information newly read by the reading unit 3 (for example, realized by the operation shown in step S40 of FIG. 6).

（２）宛先推定部５は、読取不能画像と過去の大口書状物の宛先領域画像とが、所定の水準を超える類似度を示す場合に、当該宛先領域画像に対応付けられている宛先情報を、読取不能画像の宛先情報とする（例えば、図６のステップＳ４０に示す動作によって実現される。）書状物区分システム。 (2) A letter sorting system in which, when an unreadable image and a destination area image of a past large letter show a similarity exceeding a predetermined level, the destination estimation unit 5 takes the destination information associated with that destination area image as the destination information of the unreadable image (for example, realized by the operation shown in step S40 of FIG. 6).

（３）書状画像から宛先領域を含まない共通領域画像を抽出し、一連の他の書状物の共通領域画像との類似度に基づいて、一連の書状物が大口書状物であるか否かを判定する（例えば、図２のステップＳ１６に示す動作によって実現される。）大口書状物判定部（例えば、図１に示す大口書状判定部１２０に相当）を備えた書状物区分システム。 (3) A common area image not including the destination area is extracted from the letter image, and whether or not the series of letters is a large letter based on the similarity with the common area image of another series of letters. A letter sorting system including a large letter determination unit (e.g., equivalent to the large letter determination unit 120 shown in FIG. 1) for determination (for example, realized by the operation shown in step S16 in FIG. 2).

（４）大口書状物判定部は、大口書状物ごとの共通領域画像の特徴を数値化した画像特徴量（例えば、大口書状画像特徴量に相当）に基づいて、書状物区分システム１に掛けられた大口書状物が、過去の大口差出人と同じ大口差出人による大口書状物か否か判定する（例えば、図２のステップＳ１８に示す動作によって実現される。）書状物区分システム。 (4) The large letter determination unit is applied to the letter classification system 1 based on an image feature value (for example, equivalent to a large letter image feature value) obtained by quantifying the feature of the common area image for each large letter. A letter sorting system for determining whether or not the large letter is a large letter by the same large sender as the previous large sender (for example, realized by the operation shown in step S18 of FIG. 2).

（５）目視による宛先の判断と、手入力による宛先の符号化とを行うことができるビデオ符号化装置（例えば、図１に示すビデオ符号化装置３００に相当）を備え、読取部３は、宛先データベース４に、同じ大口差出人による過去の大口書状物のデータが格納されていない場合に、読取不能画像をビデオ符号化装置に出力し（例えば、図２のステップＳ１４に示す動作によって実現される。）、宛先データベースは、読取不能画像と、ビデオ符号化装置によって符号化された読取不能画像の宛先情報とを対応付けて格納する（例えば、図２のステップＳ２１に示す動作によって実現される。）書状物区分システム。 (5) A letter sorting system comprising a video encoding device (for example, corresponding to the video encoding device 300 shown in FIG. 1) with which a destination can be judged visually and encoded by manual input, in which the reading unit 3 outputs an unreadable image to the video encoding device when the destination database 4 stores no data on past large letters from the same large sender (for example, realized by the operation shown in step S14 of FIG. 2), and the destination database stores the unreadable image in association with the destination information of the unreadable image encoded by the video encoding device (for example, realized by the operation shown in step S21 of FIG. 2).

また、上記の実施形態の一部または全部は、以下の付記のようにも記載されうるが、以下には限られない。 Moreover, although a part or all of said embodiment can be described also as the following additional remarks, it is not restricted to the following.

（付記１）大口差出人から差し出される一連の大口書状物の宛先を区分する書状物区分システムであって、それぞれの書状物の宛先が記載された面を走査して書状画像を生成する画像スキャナ部と、前記画像スキャナ部によって生成された書状画像から宛先が記載されている宛先領域画像を抽出し、前記宛先領域画像から宛先情報を読み取る読取部と、宛先領域画像と、宛先領域画像から前記読取部によって読み取られた宛先情報とを対応付けて大口書状物ごとに格納する宛先データベースと、前記宛先データベースに、同じ大口差出人による過去の大口書状物の情報が格納されている場合に、前記読取部によって宛先情報を読み取ることができなかった読取不能画像と、前記過去の大口書状物の宛先領域画像とを比較して、前記読取不能画像の宛先情報を推定する宛先推定部とを備えたことを特徴とする書状物区分システム。 (Supplementary note 1) A letter sorting system that sorts the destinations of a series of large letters sent from a large sender, comprising: an image scanner unit that scans the surface of each letter on which the destination is written and generates a letter image; a reading unit that extracts, from the letter image generated by the image scanner unit, a destination area image in which the destination is written and reads destination information from the destination area image; a destination database that stores, for each large letter, the destination area image in association with the destination information read from it by the reading unit; and a destination estimation unit that, when the destination database stores information on past large letters from the same large sender, compares an unreadable image from which the reading unit could not read destination information with the destination area images of the past large letters to estimate the destination information of the unreadable image.

（付記２）宛先推定部は、宛先データベースに格納されている同じ大口差出人による過去の大口書状物の宛先領域画像のうち、新たに読取部によって読み取られた宛先情報とは異なる宛先情報に対応付けられた宛先領域画像と、読取不能画像とを比較して、前記読取不能画像の宛先情報を推定する付記１記載の書状物区分システム。 (Additional remark 2) The destination estimation part matches with destination information different from the destination information newly read by the reading part among the destination area images of the past large-sized letters by the same large sender stored in the destination database. The letter matter classification system according to appendix 1, wherein the destination area image and the unreadable image are compared to estimate the destination information of the unreadable image.

（付記３）宛先推定部は、読取不能画像と過去の大口書状物の宛先領域画像とが、所定の水準を超える類似度を示す場合に、当該宛先領域画像に対応付けられている宛先情報を、前記読取不能画像の宛先情報とする付記１または付記２記載の書状物区分システム。 (Additional remark 3) A destination estimation part, when the unreadable image and the destination area image of the past large letter document show the similarity exceeding a predetermined level, the destination information associated with the destination area image The letter matter classification system according to Supplementary Note 1 or Supplementary Note 2, which is used as destination information of the unreadable image.

（付記４）書状画像から宛先領域を含まない共通領域画像を抽出し、一連の他の書状物の共通領域画像との類似度に基づいて、一連の書状物が大口書状物であるか否かを判定する大口書状物判定部を備えた付記１から付記３のうちのいずれか１つに記載の書状物区分システム。 (Appendix 4) Extracting a common area image that does not include a destination area from a letter image, and whether or not the series of letters is a large letter based on the similarity to the common area image of another series of letters The letter sort system according to any one of Supplementary Note 1 to Supplementary Note 3 including a large-sized letter determination unit that determines whether or not.

（付記５）大口書状物判定部は、大口書状物ごとの共通領域画像の特徴を数値化した画像特徴量に基づいて、書状物区分システムに掛けられた大口書状物が、過去の大口差出人と同じ大口差出人による大口書状物か否か判定する付記４記載の書状物区分システム。 (Supplementary note 5) The large-sized letter determination unit determines whether the large-sized letter applied to the letter classification system is based on the image feature value obtained by quantifying the characteristics of the common area image for each large-sized letter. The letter classification system according to appendix 4, wherein it is determined whether the letter is a large letter by the same large sender.

（付記６）目視による宛先の判断と、手入力による宛先の符号化とを行うことができるビデオ符号化装置を備え、読取部は、宛先データベースに、同じ大口差出人による過去の大口書状物のデータが格納されていない場合に、読取不能画像を前記ビデオ符号化装置に出力し、宛先データベースは、前記読取不能画像と、前記ビデオ符号化装置によって符号化された前記読取不能画像の宛先情報とを対応付けて格納する付記１から付記５のうちのいずれか１つに記載の書状物区分システム。 (Additional remark 6) It is provided with the video encoding apparatus which can perform the judgment of the destination by visual observation, and the encoding of the destination by manual input, and the reading part stores the data of the past large-scale letters by the same large sender in the destination database. Is not stored, the unreadable image is output to the video encoding device, and the destination database includes the unreadable image and the destination information of the unreadable image encoded by the video encoding device. The letter matter classification system according to any one of Supplementary Note 1 to Supplementary Note 5 stored in association with each other.

（付記７）大口差出人から差し出される一連の大口書状物の宛先を区分する書状物区分システムにおける宛先推定方法であって、それぞれの書状物の宛先が記載された面を走査して書状画像を生成し、前記生成した書状画像から宛先が記載されている宛先領域画像を抽出し、前記抽出した宛先領域画像から宛先情報を読み取り、宛先領域画像と、宛先領域画像から読み取った宛先情報とを対応付けて大口書状物ごとに宛先データベースに格納し、前記宛先データベースに、同じ大口差出人による過去の大口書状物の情報が格納されている場合に、宛先情報を読み取ることができなかった読取不能画像と、前記過去の大口書状物の宛先領域画像とを比較して、前記読取不能画像の宛先情報を推定することを特徴とする宛先推定方法。 (Supplementary note 7) A destination estimation method in a letter sort system for sorting a series of large letter destinations sent from a large sender, and scanning the surface on which the address of each letter is written Generate and extract a destination area image in which a destination is described from the generated letter image, read destination information from the extracted destination area image, and correspond the destination area image to the destination information read from the destination area image In addition, each large letter is stored in the destination database, and when the information of the past large letter by the same large sender is stored in the destination database, the unreadable image in which the destination information could not be read and A destination estimation method, wherein destination information of the unreadable image is estimated by comparing with a destination area image of the past large letter.

（付記８）宛先データベースに格納されている同じ大口差出人による過去の大口書状物の宛先領域画像のうち、新たに読み取られた宛先情報とは異なる宛先情報に対応付けられた宛先領域画像と、読取不能画像とを比較して、前記読取不能画像の宛先情報を推定する付記７記載の宛先推定方法。 (Supplementary Note 8) Among destination area images of past large-sized letters by the same large sender stored in the destination database, destination area images associated with destination information different from newly read destination information, and reading The destination estimation method according to appendix 7, wherein destination information of the unreadable image is estimated by comparing with an unreadable image.

DESCRIPTION OF SYMBOLS
１書状物区分システム 1 Letter sorting system
２画像スキャナ部 2 Image scanner unit
３読取部 3 Reading unit
４宛先データベース 4 Destination database
５宛先推定部 5 Destination estimation unit
１００宛先認識装置 100 Destination recognition device
１１０画像スキャナ部 110 Image scanner unit
１２０大口書状物判定部 120 Large letter determination unit
１３０大口書状画像特徴量格納部 130 Large letter image feature amount storage unit
１４０情報統合部 140 Information integration unit
１５０宛先データベース 150 Destination database
１６０大口リスト格納部 160 Large list storage unit
２００光学式読取装置 200 Optical reader (OCR)
２１０ＯＣＲ読取部 210 OCR reading unit
２２０ＯＣＲリジェクト画像格納部 220 OCR reject image storage unit
３００ビデオ符号化装置 300 Video encoding device

Claims

大口差出人から差し出される一連の大口書状物の宛先を区分する書状物区分システムであって、
それぞれの書状物の宛先が記載された面を走査して書状画像を生成する画像スキャナ部と、
前記画像スキャナ部によって生成された書状画像から宛先が記載されている宛先領域画像を抽出し、前記宛先領域画像から宛先情報を読み取る読取部と、
宛先領域画像と、宛先領域画像から前記読取部によって読み取られた宛先情報とを対応付けて大口書状物ごとに格納する宛先データベースと、
前記宛先データベースに、同じ大口差出人による過去の大口書状物の情報が格納されている場合に、前記読取部によって宛先情報を読み取ることができなかった読取不能画像と、前記過去の大口書状物の宛先領域画像とを比較して、前記読取不能画像の宛先情報を推定する宛先推定部とを備えた
ことを特徴とする書状物区分システム。
A letter sorting system that sorts the destinations of a series of large letters sent from a large sender, comprising:
an image scanner unit that generates a letter image by scanning the surface of each letter on which the destination is written;
a reading unit that extracts, from the letter image generated by the image scanner unit, a destination area image in which the destination is written, and reads destination information from the destination area image;
a destination database that stores, for each large letter, the destination area image in association with the destination information read from it by the reading unit; and
a destination estimation unit that, when the destination database stores information on past large letters from the same large sender, compares an unreadable image from which the reading unit could not read destination information with the destination area images of the past large letters to estimate the destination information of the unreadable image.

宛先推定部は、宛先データベースに格納されている同じ大口差出人による過去の大口書状物の宛先領域画像のうち、新たに読取部によって読み取られた宛先情報とは異なる宛先情報に対応付けられた宛先領域画像と、読取不能画像とを比較して、前記読取不能画像の宛先情報を推定する
請求項１記載の書状物区分システム。 The destination estimation unit is a destination region associated with destination information that is different from the destination information newly read by the reading unit, among the destination region images of past large-sized letters by the same large sender stored in the destination database. The letter sorting system according to claim 1, wherein destination information of the unreadable image is estimated by comparing the image and the unreadable image.

宛先推定部は、読取不能画像と過去の大口書状物の宛先領域画像とが、所定の水準を超える類似度を示す場合に、当該宛先領域画像に対応付けられている宛先情報を、前記読取不能画像の宛先情報とする
請求項１または請求項２記載の書状物区分システム。 The letter sorting system according to claim 1 or claim 2, wherein, when the unreadable image and a destination area image of a past large letter show a similarity exceeding a predetermined level, the destination estimation unit takes the destination information associated with that destination area image as the destination information of the unreadable image.

書状画像から宛先領域を含まない共通領域画像を抽出し、一連の他の書状物の共通領域画像との類似度に基づいて、一連の書状物が大口書状物であるか否かを判定する大口書状物判定部を備えた
請求項１から請求項３のうちのいずれか１項に記載の書状物区分システム。 A common area image that does not include a destination area is extracted from a letter image, and whether or not a series of letters is a large letter is determined based on the similarity of a series of other letters to the common area image. The letter sort system according to any one of claims 1 to 3, further comprising a letter decision unit.

大口書状物判定部は、大口書状物ごとの共通領域画像の特徴を数値化した画像特徴量に基づいて、書状物区分システムに掛けられた大口書状物が、過去の大口差出人と同じ大口差出人による大口書状物か否か判定する
請求項４記載の書状物区分システム。 The letter sorting system according to claim 4, wherein the large letter determination unit determines, based on an image feature amount obtained by quantifying the features of the common area image of each large letter, whether a large letter put on the letter sorting system is a large letter from the same large sender as a past large sender.

目視による宛先の判断と、手入力による宛先の符号化とを行うことができるビデオ符号化装置を備え、
読取部は、宛先データベースに、同じ大口差出人による過去の大口書状物のデータが格納されていない場合に、読取不能画像を前記ビデオ符号化装置に出力し、
宛先データベースは、前記読取不能画像と、前記ビデオ符号化装置によって符号化された前記読取不能画像の宛先情報とを対応付けて格納する
請求項１から請求項５のうちのいずれか１項に記載の書状物区分システム。
The letter sorting system according to any one of claims 1 to 5, further comprising a video encoding device with which a destination can be judged visually and encoded by manual input, wherein
the reading unit outputs an unreadable image to the video encoding device when the destination database stores no data on past large letters from the same large sender, and
the destination database stores the unreadable image in association with the destination information of the unreadable image encoded by the video encoding device.

大口差出人から差し出される一連の大口書状物の宛先を区分する書状物区分システムにおける宛先推定方法であって、
画像スキャナ部が、それぞれの書状物の宛先が記載された面を走査して書状画像を生成し、
読取部が、前記生成した書状画像から宛先が記載されている宛先領域画像を抽出し、
前記読取部が、前記抽出した宛先領域画像から宛先情報を読み取り、
宛先推定部が、宛先領域画像と、宛先領域画像から読み取った宛先情報とを対応付けて大口書状物ごとに宛先データベースに格納し、
前記宛先推定部が、前記宛先データベースに、同じ大口差出人による過去の大口書状物の情報が格納されている場合に、宛先情報を読み取ることができなかった読取不能画像と、前記過去の大口書状物の宛先領域画像とを比較して、前記読取不能画像の宛先情報を推定する
ことを特徴とする宛先推定方法。
A destination estimation method in a letter sorting system that sorts the destinations of a series of large letters sent from a large sender, wherein:
an image scanner unit generates a letter image by scanning the surface of each letter on which the destination is written;
a reading unit extracts, from the generated letter image, a destination area image in which the destination is written;
the reading unit reads destination information from the extracted destination area image;
a destination estimation unit stores, for each large letter, the destination area image in association with the destination information read from it in a destination database; and
the destination estimation unit, when the destination database stores information on past large letters from the same large sender, compares an unreadable image from which destination information could not be read with the destination area images of the past large letters to estimate the destination information of the unreadable image.

宛先推定部が、宛先データベースに格納されている同じ大口差出人による過去の大口書状物の宛先領域画像のうち、新たに読み取られた宛先情報とは異なる宛先情報に対応付けられた宛先領域画像と、読取不能画像とを比較して、前記読取不能画像の宛先情報を推定する
請求項７記載の宛先推定方法。 A destination area image associated with destination information different from the newly read destination information out of destination area images of past large letter documents by the same large sender stored in the destination database; and The destination estimation method according to claim 7, wherein destination information of the unreadable image is estimated by comparing with an unreadable image.