JP3379327B2

JP3379327B2 - Character recognition device

Info

Publication number: JP3379327B2
Application number: JP06098496A
Authority: JP
Inventors: 大助藤野; 知也岡崎
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1996-03-18
Filing date: 1996-03-18
Publication date: 2003-02-24
Anticipated expiration: 2016-03-18
Also published as: JPH09251514A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a character recognition method and apparatus, and in particular to a character recognition device that reads low-quality character images, captured by a camera or the like, in which adjacent characters overlap or are joined.

[0002]

[Prior Art] A character recognition device that reads characters in a captured image generally consists of a character cutout unit, which cuts out individual character patterns from the image, and a character recognition unit, which recognizes the cut-out character patterns. FIG. 44 shows the general configuration of such a character recognition device.

[0003] In the figure, reference numeral 101 denotes an image input unit, which scans an image containing the characters to be recognized and stores it in an image memory 102 as a binary signal. A character string cutout unit 103 cuts out a character string containing the characters to be recognized from the image memory 102, using the character string direction set by the operator. A character cutout unit 104 cuts out the characters to be recognized one at a time, as rectangular areas, from the character string image cut out by the character string cutout unit 103. A dictionary memory 105 stores a recognition dictionary holding the reference patterns that correspond to the characters to be recognized. A character recognition unit 106 matches the rectangular area of each character cut out by the character cutout unit 104 against the reference patterns stored in the dictionary memory 105 and takes the most similar character as the recognition result. A recognition result output unit 107 outputs the recognition result of the character recognition unit 106 to a display or another device.

[0004] The operation is described next, taking the input image 110 of FIG. 45 as an example. The image 110 input from the image input unit 101 is binarized into pixels of value "1" that form the characters (hereinafter "black pixels") and pixels of value "0" that form the background (hereinafter "white pixels"), and is stored in the image memory 102. In FIG. 46, the character string cutout unit 103 scans the pre-designated area 111 of this input image in the designated character string direction (here, the X direction), counts the black pixels on each scan line to obtain a histogram 120 along the Y direction, and extracts rectangular areas 123 and 124 circumscribing the character strings from the Y coordinates of both ends of the continuous black portions 121 and 122 of this histogram.

[0005] Next, in FIG. 47, the character cutout unit 104 processes each rectangular area extracted as a character string by the character string cutout unit 103. It first scans the rectangular area 123 in the direction perpendicular to the character string direction (here, the Y direction), counts the black pixels on each scan line to obtain a histogram 130 along the X direction, and extracts rectangular areas 135, 136, 137, and 138 circumscribing the individual characters from the X coordinates of both ends of the continuous black portions 131, 132, 133, and 134 of this histogram. The same processing is then applied to the rectangular area 124 to extract the rectangular areas circumscribing its characters.
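
The two projection steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is a list of rows of 0/1 values, and the function names (`black_runs`, `row_profile`, `column_profile`, `segment`) are hypothetical and do not come from the patent. Each box takes its vertical extent from the character string, not from the individual character.

```python
def black_runs(profile):
    """Return (start, end) pairs, end exclusive, of consecutive non-zero bins."""
    runs, start = [], None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i
        elif count == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def row_profile(image):
    """Black-pixel count of each scan line along X (the Y-direction histogram)."""
    return [sum(row) for row in image]


def column_profile(image, y0, y1):
    """Black-pixel count of each column within rows y0..y1-1 (the X-direction histogram)."""
    return [sum(image[y][x] for y in range(y0, y1)) for x in range(len(image[0]))]


def segment(image):
    """Return boxes (x0, y0, x1, y1): first cut strings by rows, then characters by columns."""
    boxes = []
    for y0, y1 in black_runs(row_profile(image)):
        for x0, x1 in black_runs(column_profile(image, y0, y1)):
            boxes.append((x0, y0, x1, y1))
    return boxes


image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(segment(image))  # [(0, 1, 2, 3), (3, 1, 5, 3)]
```

If two characters touch or overlap, no empty column lies between them, so the column profile has no gap there and both characters end up in a single box.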

[0006] For each rectangular area circumscribing a target character extracted in this way, the character recognition unit 106 matches it, as shown in FIG. 48, against the reference patterns of all characters to be recognized stored in the dictionary memory 105, and takes the character with the highest similarity as the recognition result.

[0007]

[Problems to Be Solved by the Invention] Because the conventional character recognition device is configured as described above, the character recognition unit can recognize each character only if the character to be recognized has been reliably cut out as a single rectangular area. In other words, the imaged characters must be clearly separated from one another.

[0008] However, depending on the arrangement of the characters, characters may overlap in the direction perpendicular to the character string direction, or adjacent characters may be joined, as shown in FIG. 4. In such cases it is difficult to cut out each character from the histogram, as the rectangular areas 42 and 43 of FIG. 4 show. In addition, when characters are extremely close together, adjacent characters may merge because of the resolution of the image input unit, the influence of noise, and so on. In this case too, it is difficult to cut out the individual characters from the histogram.

[0009] To address this, techniques have been proposed that separate characters by examining the connectivity of the background region between adjacent characters within a rectangular area circumscribing several overlapping characters; one example is the technique disclosed in Japanese Examined Patent Publication No. 7-046371. Such conventional techniques can separate characters that overlap at right angles to the character string direction. For joined characters, however, no background region separates the adjacent characters, so the individual character regions cannot be identified.

[0010] The present invention has been made to solve these problems, and its object is to provide a character recognition device that can recognize characters by a unified method even when adjacent characters overlap or are joined.

[0011]

[Means for Solving the Problems] A character recognition device according to the present invention comprises: an image input unit that inputs an image containing characters to be recognized; a character string cutout unit that cuts out a character string by means of a pixel density histogram generated by scanning the image input by the image input unit in the character string direction; a character cutout unit that cuts out rectangular areas containing the characters to be recognized by using a pixel density histogram generated by scanning the character string image cut out by the character string cutout unit in the direction perpendicular to the character string; a first character recognition unit that recognizes a character by comparing a rectangular area cut out by the character cutout unit with the reference patterns of the characters to be recognized; and a second character recognition unit that compares a rectangular area not recognized by the first character recognition unit with the reference patterns of the characters to be recognized while shifting the position of each reference pattern, thereby extracting characters with high similarity to a reference pattern as candidates together with the position of the reference pattern, and that recognizes the characters by sorting the extracted candidates in order of appearance of their positions. The second character recognition unit recognizes the characters using reference patterns in which the portions where characters adjacent in the character string direction may overlap or join have been masked.

[0012] Another character recognition device according to the present invention comprises: an image input unit that inputs an image containing characters to be recognized; a character string cutout unit that cuts out a character string by means of a pixel density histogram generated by scanning the image input by the image input unit in the character string direction; a character cutout unit that cuts out rectangular areas containing the characters to be recognized by using a pixel density histogram generated by scanning the character string image cut out by the character string cutout unit in the direction perpendicular to the character string; a first character recognition unit that recognizes a character by comparing a rectangular area cut out by the character cutout unit with the reference patterns of the characters to be recognized; and a second character recognition unit. The second character recognition unit classifies the reference patterns of the characters to be recognized according to whether they have a line element extending between the top and bottom of the rectangular area in the direction perpendicular to the character string. It compares a rectangular area not recognized by the first character recognition unit with the reference patterns that have no such line element while shifting the position of each reference pattern, thereby extracting characters with high similarity to a reference pattern as candidates together with the position of the reference pattern. It also superimposes the reference patterns that have such a line element on the character string of the rectangular area not recognized by the first character recognition unit, thereby extracting characters with high similarity to a reference pattern as candidates together with the position of the reference pattern. It then recognizes the characters by sorting the extracted candidates in order of appearance of their positions.

[0013] A further character recognition device according to the present invention comprises: an image input unit that inputs an image containing characters to be recognized; a character string cutout unit that cuts out a character string by means of a pixel density histogram generated by scanning the image input by the image input unit in the character string direction; a character cutout unit that cuts out rectangular areas containing the characters to be recognized by using a pixel density histogram generated by scanning the character string image cut out by the character string cutout unit in the direction perpendicular to the character string; a first character recognition unit that recognizes a character by comparing a rectangular area cut out by the character cutout unit with the reference patterns of the characters to be recognized; and a second character recognition unit that recognizes a character by superimposing the reference patterns on one end of a rectangular area not recognized by the first character recognition unit, and then sequentially recognizes further characters by superimposing the reference patterns on the end, on the same side as said one end, of the rectangular area that remains after the recognized character is removed. When a high position of the pixel density histogram generated by scanning in the direction perpendicular to the character string exists at that end of the rectangular area to be recognized, the second character recognition unit recognizes the character by superimposing those reference patterns that have a line element extending between the top and bottom of the rectangular area in the direction perpendicular to the character string. When no such high position exists at that end, it recognizes the character by superimposing those reference patterns that have no such line element.

[0014]

[Embodiments of the Invention] Embodiment 1. A first embodiment of the present invention is described with reference to FIGS. 1 to 11. FIG. 1 shows the main configuration of a character recognition device according to the first embodiment. In the figure, reference numeral 1 denotes an image input unit, which scans an image containing the characters to be recognized and stores it in an image memory 2 as a binary signal. A character string cutout unit 3 cuts out a character string containing the characters to be recognized from the image memory 2, using the character string direction set by the operator. A character cutout unit 4 cuts out the characters to be recognized one at a time, as rectangular areas, from the character string image cut out by the character string cutout unit 3. A dictionary memory 5 stores a recognition dictionary holding the reference patterns that correspond to the characters to be recognized. Reference numeral 6 denotes a first character recognition unit and 7 a second character recognition unit; each matches the rectangular area of a character cut out by the character cutout unit 4 against the reference patterns stored in the dictionary memory 5 and takes the most similar character as the recognition result. A recognition result output unit 8 outputs the recognition results of the first character recognition unit 6 and the second character recognition unit 7 to a display or another device.

[0015] The operation is described next, taking the input image 10 of FIG. 2 as an example. The image 10 input from the image input unit 1 is binarized into pixels of value "1" that form the characters (hereinafter "black pixels") and pixels of value "0" that form the background (hereinafter "white pixels"), and is stored in the image memory 2. In FIG. 3, the character string cutout unit 3 scans the pre-designated area 11 of this input image in the designated character string direction (here, the X direction), counts the black pixels on each scan line to obtain a histogram 20 along the Y direction, and extracts a rectangular area 23 circumscribing the character string from the Y coordinates of both ends of the continuous black portion 21 of this histogram.

[0016] Next, in FIG. 4, the character cutout unit 4 scans the rectangular area 23, extracted as a character string by the character string cutout unit 3 as described above, in the direction perpendicular to the character string direction (here, the Y direction). It counts the black pixels on each scan line to obtain a histogram 30 along the X direction, and extracts rectangular areas 41, 42, 43, 44, and 45 from the X coordinates of both ends of the continuous black portions 31, 32, 33, 34, and 35 of this histogram.

[0017] For each rectangular area extracted in this way, the first character recognition unit 6 recognizes the character by the processing flow shown in FIG. 5 while referring to the contents of the dictionary memory 5. FIG. 6 shows the contents of the dictionary memory: for each of the n character categories to be recognized, the reference pattern and its horizontal and vertical sizes are recorded. The processing flow of FIG. 5 is described below.

[0018] Let the character pattern to be recognized be the rectangular area 41 of FIG. 7(a). First, in step S101, the character pattern, of width W1 and height H1, is normalized to the reference size of width W and height H. The result is the rectangular area 41' shown in FIG. 7(b).
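
Step S101 can be illustrated with a minimal sketch. The patent does not say how the pattern is resampled, so nearest-neighbour sampling is an assumption here, as is the hypothetical function name `normalize`; patterns are lists of rows of 0/1 values.

```python
def normalize(pattern, width, height):
    """Resample a binary pattern to width x height by nearest-neighbour sampling.

    Assumption: the resampling method is not specified in the source text.
    """
    src_h, src_w = len(pattern), len(pattern[0])
    return [
        [pattern[j * src_h // height][i * src_w // width] for i in range(width)]
        for j in range(height)
    ]


small = [
    [1, 0],
    [0, 1],
]
for row in normalize(small, 4, 4):
    print(row)
# [1, 1, 0, 0]
# [1, 1, 0, 0]
# [0, 0, 1, 1]
# [0, 0, 1, 1]
```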

[0019] Next, in step S102, the i-th category to be recognized (initially i = 1) is read from the dictionary memory 5, and the ratio between the aspect ratio of the cut-out character pattern and the aspect ratio of the reference pattern is checked:

R = (H1 / W1) / (hi / wi)    (1)

If R = 1, the cut-out character pattern and the reference pattern have the same aspect ratio. If R differs from 1 by a predetermined value or more, the shapes of the cut-out character pattern and the reference pattern are judged to be too different; the process proceeds to step S105 and the similarity is recorded as 0. Otherwise, the similarity is calculated in step S103 and the following steps.
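
The aspect-ratio gate of step S102 amounts to a one-line test on R. In the sketch below the tolerance of 0.3 is a made-up placeholder for the "predetermined value", which the text does not give, and the function name is hypothetical.

```python
def aspect_ratio_compatible(w1, h1, wi, hi, tolerance=0.3):
    """Return True when R = (H1/W1)/(hi/wi) is within `tolerance` of 1.

    `tolerance` is an assumed placeholder; the source does not state the value.
    """
    r = (h1 / w1) / (hi / wi)
    return abs(r - 1.0) < tolerance


# A 10x20 cut-out against a 5x10 reference pattern: identical proportions, R = 1.
print(aspect_ratio_compatible(10, 20, 5, 10))  # True
# A wide 30x20 cut-out (for example two joined characters) against the same reference.
print(aspect_ratio_compatible(30, 20, 5, 10))  # False
```

When the gate fails, the similarity for that category would simply be recorded as 0 without any pattern comparison.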

[0020] In step S103, the reference character pattern, of width wi and height hi, is normalized to the reference size of width W and height H.

[0021] In step S104, the cut-out character pattern and the reference pattern, each normalized to height H and width W, are superimposed, and their similarity is calculated and recorded. Let P be the normalized cut-out character pattern and Q the normalized reference pattern, with P = [Pij] and Q = [Qij] (i = 1 to W, j = 1 to H). The similarity is then the quantity S given below.

[0022]

[Equation 1]
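
Because Equation 1 itself is not reproduced in this text, the following sketch substitutes a commonly used measure, the normalized correlation of the two patterns. That choice is an assumption made for illustration and may differ from the formula in the patent; the name `similarity` is likewise hypothetical.

```python
import math


def similarity(p, q):
    """Normalized correlation of two equally sized patterns (assumed measure).

    Returns a value in [0, 1] for 0/1 patterns; 1.0 means the patterns coincide.
    """
    dot = sum(pv * qv for prow, qrow in zip(p, q) for pv, qv in zip(prow, qrow))
    p_norm = sum(v * v for row in p for v in row)
    q_norm = sum(v * v for row in q for v in row)
    if p_norm == 0 or q_norm == 0:
        return 0.0  # an empty pattern matches nothing
    return dot / math.sqrt(p_norm * q_norm)


a = [[1, 0], [1, 1]]
b = [[0, 1], [0, 0]]
print(similarity(a, a))  # 1.0
print(similarity(a, b))  # 0.0
```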

[0023] In step S106, it is checked whether the similarity has been calculated for all categories to be recognized (i = n?). If "No", the process returns to step S102 and the similarity is calculated for the next reference pattern.

[0024] The record of similarities calculated for all categories looks like FIG. 8. In step S107, this record is examined for the category with the highest similarity, provided that similarity is at or above the reference value; if such a category exists, it is taken as the recognition result for the cut-out character pattern (step S108). In the case of FIG. 8, the category "E" is the recognition result. If no similarity qualifies in step S107, the pattern is marked unrecognizable in step S109 (the recognition result is "reject").
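
The decision of steps S107 to S109 can be sketched as a maximum with a threshold. The similarity values below are invented for illustration, since the contents of FIG. 8 are not reproduced in this text, and `classify` is a hypothetical name.

```python
def classify(similarities, threshold):
    """Return the best-scoring category, or None ("reject") if it is below threshold.

    similarities: dict mapping category -> similarity recorded in steps S104/S105.
    """
    if not similarities:
        return None
    category, best = max(similarities.items(), key=lambda item: item[1])
    return category if best >= threshold else None


# Hypothetical record for one cut-out pattern.
record = {"A": 0.41, "E": 0.93, "F": 0.78}
print(classify(record, threshold=0.8))  # E

# A merged pattern scores poorly against every category and is rejected.
print(classify({"A": 0.35, "E": 0.52}, threshold=0.8))  # None
```

Rejected areas are the ones handed on to the second character recognition unit.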

[0025] By repeating the above processing, recognition results are obtained for all cut-out character patterns; the character patterns of rectangular areas 41, 44, and 45 are recognized as "E", "O", and "R" respectively. The character patterns of rectangular areas 42 and 43, however, were cut out with overlapping or joined characters and cannot be recognized: either the aspect ratio of the pattern falls outside the permitted range in step S102, or the similarity calculated after normalizing the pattern size is low. Here, "rectangular area" and "character pattern" are used interchangeably.

[0026] For each cut-out character pattern that could not be recognized in this way, the second character recognition unit 7 performs recognition by the flow shown in FIG. 9. The processing flow of FIG. 9 is described below.

[0027] Let the character pattern to be recognized be the rectangular area 42 of FIG. 4. First, in step S201, the character pattern, of width W42 and height H42, is normalized to the reference height H, and its width is scaled by the same factor H/H42 used for the height normalization.

[0028] Next, in step S202, the i-th category to be recognized (initially i = 1) is read from the dictionary memory 5. Its reference pattern, of height hi and width wi, is normalized to the reference height H, and its width is scaled by the same factor H/hi used for the height normalization.
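
Steps S201 and S202 differ from step S101 in that only the height is fixed and the width follows with the same scale factor, so the proportions of a multi-character area are kept. A minimal sketch, again assuming nearest-neighbour sampling and using the hypothetical name `normalize_height`:

```python
def normalize_height(pattern, height):
    """Scale a binary pattern to the given height, scaling the width by the same factor.

    Assumption: nearest-neighbour sampling; the source does not name a method.
    """
    src_h, src_w = len(pattern), len(pattern[0])
    width = max(1, round(src_w * height / src_h))
    return [
        [pattern[j * src_h // height][i * src_w // width] for i in range(width)]
        for j in range(height)
    ]


# A 2-row, 6-column area scaled to height 4 becomes 12 columns wide (factor 2).
area = [
    [1, 0, 0, 1, 1, 1],
    [1, 1, 0, 1, 0, 0],
]
scaled = normalize_height(area, 4)
print(len(scaled), len(scaled[0]))  # 4 12
```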

[0029] Next, in step S203, the reference pattern normalized to height H is superimposed on the left end of the cut-out pattern normalized to height H, and the similarity is calculated. The reference pattern is then moved step by step to the right, with the similarity calculated at each position, until it reaches the right end of the cut-out pattern. If the size of the cut-out pattern is H × Wc and the size of the reference pattern is H × Wd, the similarity S(x) at each position x (with x = 0 at the left end of the cut-out pattern) is given by the following equation.

[0030]

[Equation 2]

[0031] Here, Pij (i = 1 to Wc, j = 1 to H) is the normalized cut-out pattern and Qij (i = 1 to Wd, j = 1 to H) is the normalized reference pattern.
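
The sliding comparison of step S203 can be sketched as follows. Equation 2 is not reproduced in this text, so the per-position score used here, a normalized correlation between the reference pattern and the window of the cut-out pattern beneath it, is an assumption; `window_similarity` and `slide` are hypothetical names.

```python
import math


def window_similarity(p, q, x):
    """Assumed score: normalized correlation of q with the window of p starting at column x."""
    wd = len(q[0])
    dot = p_norm = q_norm = 0
    for prow, qrow in zip(p, q):
        for i in range(wd):
            pv, qv = prow[x + i], qrow[i]
            dot += pv * qv
            p_norm += pv * pv
            q_norm += qv * qv
    if p_norm == 0 or q_norm == 0:
        return 0.0
    return dot / math.sqrt(p_norm * q_norm)


def slide(p, q):
    """Return S(x) for x = 0 .. Wc - Wd, moving the reference pattern q across p."""
    wc, wd = len(p[0]), len(q[0])
    return [window_similarity(p, q, x) for x in range(wc - wd + 1)]


# Cut-out pattern (H = 2, Wc = 5) that contains the reference pattern at x = 2.
p = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
]
q = [
    [1, 0],
    [1, 1],
]
scores = slide(p, q)
print(scores.index(max(scores)))  # 2
```

Both patterns must already share the same height H, which is what the height normalization in steps S201 and S202 provides.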

[0032] For the similarity S(x) calculated at each position as described above, step S205 checks whether the value of S(x) is at or above the reference value. If it is, the position x and the category are recorded. Otherwise, the process proceeds to step S206 without recording anything.

[0033] In step S206, it is checked whether all categories to be recognized have been processed (i = n?). If "No", the process returns to step S102 and the similarity is calculated for the next reference pattern.

[0034] The record after all categories have been processed looks like FIG. 10. Category "E" shows a high similarity at position x2 and category "L" shows a high similarity at position x1, while the other categories show low similarities, below the reference value, over the whole area. In step S207, the categories in this record are arranged in ascending order of position x, and the recognition result for the rectangular area 42 of the cut-out pattern is "LE".
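
Steps S205 to S207 reduce to collecting the (position, category) pairs whose similarity reaches the reference value and reading them off from left to right. The scores below are invented for illustration, and the function names are hypothetical.

```python
def collect_hits(scores_by_category, threshold):
    """Record (x, category) wherever S(x) reaches the threshold (step S205)."""
    hits = []
    for category, scores in scores_by_category.items():
        for x, score in enumerate(scores):
            if score >= threshold:
                hits.append((x, category))
    return hits


def edit_result(hits):
    """Arrange the recorded categories in ascending order of x (step S207)."""
    return "".join(category for _, category in sorted(hits))


# Hypothetical S(x) curves for one merged area.
scores_by_category = {
    "E": [0.10, 0.20, 0.30, 0.95],
    "L": [0.90, 0.30, 0.10, 0.10],
    "T": [0.20, 0.20, 0.20, 0.20],
}
hits = collect_hits(scores_by_category, threshold=0.8)
print(edit_result(hits))  # LE
```

This sketch records every position at or above the threshold. With real similarity curves, neighbouring positions around a peak may all qualify, so an implementation would probably keep only the local maximum for each peak; the text does not describe how this is handled.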

[0035] In the same way, the second character recognition unit 7 performs recognition on the remaining cut-out areas that were not recognized by the first character recognition unit 6. For the cut-out rectangular area 43, a record like FIG. 11 is obtained by step S207, and editing it yields the recognition result "VAT".

[0036] The recognition result output unit 8 then outputs the character recognition results obtained as described above to a display or other processing device.

[0037] In this way, the height of a cut-out area containing several characters is normalized to H, a reference pattern normalized to height H is superimposed on it, and the similarity is calculated at each position while the reference pattern is moved from the left end to the right. Characters can therefore be recognized directly even when the individual characters cannot be cut out. Moreover, the reference patterns are slid and compared only over the cut-out areas in which individual characters could not be recognized, not over the whole character string, so processing is faster.

[0038] Embodiment 2. Next, a second embodiment of the present invention is described with reference to FIGS. 12 to 18. FIG. 12 shows the configuration of the character recognition device according to this embodiment. This embodiment handles the case where the characters to be recognized include a character whose pattern matches part of another character's pattern with a high degree of agreement. FIG. 12 has the same configuration as FIG. 1 of the first embodiment, but the processing of the second character recognition unit 7a differs. Since the processing of the other components is the same as in Embodiment 1, only the processing of the second character recognition unit 7a is described below. FIG. 13 shows the processing flow of the second character recognition unit 7a.

[0039] The character pattern to be recognized is explained using the rectangular area 41 of FIG. 4 as an example. Here, the dictionary memory 5 contains "I" as a character whose pattern matches part of other character patterns with a high degree of agreement (FIG. 14).

【００４０】[0040]

Steps S301 to S304 of FIG. 13 are the same as steps S201 to S204 of FIG. 9, respectively. For the similarity S(x) obtained at each position in step S304, step S305 checks whether S(x) is equal to or greater than a reference value. If it is, the character category, the position x, and the position of the right end of the reference pattern are recorded.

【００４１】[0041]

The loop of step S306 repeats this processing for all character categories. For the reference pattern of the character category "I", the similarity S(x) becomes large when the pattern is superimposed on the left end of "L" and on the left end of "E" in the cut-out region (FIG. 15), so the final record contains four extracted characters, as shown in FIG. 16.

【００４２】[0042]

In step S307, characters that cannot exist simultaneously are identified from this record. When the existence ranges of the four extracted characters are plotted as in FIG. 17, the No. 2 "I" is contained within the No. 4 "L", so the two are judged to be mutually exclusive. Similarly, the No. 3 "I" is contained within the No. 1 "E", so those two are also judged to be mutually exclusive.

【００４３】[0043]

In step S308, the possible character string combinations and their existence ranges are calculated from the above determination and from the positional relationship of the characters. The combinations that arise in this case are (a) to (d) below.

(a) No. 4 "L" × No. 3 "I"
(b) No. 4 "L" × No. 1 "E"
(c) No. 2 "I" × No. 3 "I"
(d) No. 2 "I" × No. 1 "E"

Their existence ranges are the regions drawn with thick lines in FIGS. 18(a) to 18(d).

【００４４】[0044]

In step S309, the existence ranges obtained above are compared with the extent of the cut-out region, and the character combination with the highest degree of agreement is selected as the recognition result. In FIG. 18, for example, the length of the thick line representing the cut-out region is compared with the length of the thick lines representing the existence range of each of (a) to (d) (the sum of x1 and x2); the closer the two lengths, the higher the agreement. In this embodiment, FIG. 18 shows that combination (b), "LE", is the recognition result.

【００４５】[0045]

In this way, when a plurality of characters whose positions are mutually exclusive are extracted from a single cut-out region containing a plurality of characters, the possible character combinations are generated, the range occupied by each combination is compared with the range of the cut-out region, and the combination whose existence range is closest to that of the cut-out region is taken as the recognized combination. Character recognition with fewer misrecognitions thus becomes possible.
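The selection in steps S307 to S309 can be illustrated with the following sketch. It is an assumption-laden illustration, not the patented implementation: the `Detection` record, the half-open pixel intervals, and the use of a general interval-overlap test (of which the containment described for FIG. 17 is a special case) are all choices made here for brevity, and the coordinates are invented.

```python
from itertools import combinations
from typing import NamedTuple


class Detection(NamedTuple):
    label: str
    left: int   # left edge of the matched reference pattern
    right: int  # right edge (exclusive)


def overlaps(a, b):
    """True when two detections occupy a common range (mutually exclusive)."""
    return a.left < b.right and b.left < a.right


def best_combination(detections, region_width):
    """Pick the non-conflicting combination whose total extent is closest
    to the width of the cut-out region."""
    best, best_gap = None, None
    for size in range(1, len(detections) + 1):
        for combo in combinations(detections, size):
            if any(overlaps(a, b) for a, b in combinations(combo, 2)):
                continue  # contains characters that cannot coexist
            covered = sum(d.right - d.left for d in combo)
            gap = abs(region_width - covered)
            if best_gap is None or gap < best_gap:
                best, best_gap = combo, gap
    if best is None:
        return ""
    return "".join(d.label for d in sorted(best, key=lambda d: d.left))


# Hypothetical coordinates for a region holding "L" followed by "E".
detections = [
    Detection("E", 50, 100),  # No. 1
    Detection("I", 0, 12),    # No. 2, inside "L"
    Detection("I", 50, 62),   # No. 3, inside "E"
    Detection("L", 0, 50),    # No. 4
]
result = best_combination(detections, region_width=100)
```

With these invented coordinates the pair "L" and "E" covers the whole region, while any combination involving "I" leaves a large part of it unexplained.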

【００４６】[0046]

Embodiment 3. A third embodiment of the present invention will now be described with reference to FIGS. 19 to 21. FIG. 19 shows the configuration of the character recognition device according to this embodiment. FIG. 19 has the same configuration as FIG. 1 of Embodiment 1, but the processing of the character cutout unit 4a, the contents of the dictionary memory 5a, and the processing of the second character recognition unit 7b differ. Since the processing of the other components is the same as in Embodiment 1, only these differences are described below.

【００４７】[0047]

The character cutout unit 4a scans the rectangular area 23 (see FIG. 3), which the character string cutout unit 3 extracted as a character string, in the direction perpendicular to the character string direction (the Y direction in this case) and counts the black pixels on each scan line to obtain a histogram 30 along the X direction. From the X coordinates of both ends of the contiguous black portions 31, 32, 33, 34, and 35 of this histogram, it extracts the rectangular areas 41, 42, 43, 44, and 45. It also records each position at which the histogram value H(x) equals the character string height (see x1 to x5 in FIG. 4). The positions belonging to rectangular areas 41 to 45 are as follows: x1 in rectangular area 41; x2 and x3 in rectangular area 42; x4 in rectangular area 43; none in rectangular area 44; and x5 in rectangular area 45.
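The projection step described above can be sketched as follows, assuming the character string is given as a binary image (a list of rows, 1 for black). The function names are hypothetical, and the rule that a column counts as a vertical stroke when its black-pixel count equals the image height follows the description of H(x) above.

```python
def column_histogram(image):
    """Black-pixel count of every column (projection onto the X axis)."""
    return [sum(column) for column in zip(*image)]


def black_runs(histogram):
    """(start, end) X coordinates of each contiguous non-empty stretch."""
    runs, start = [], None
    for x, count in enumerate(histogram):
        if count > 0 and start is None:
            start = x
        elif count == 0 and start is not None:
            runs.append((start, x - 1))
            start = None
    if start is not None:
        runs.append((start, len(histogram) - 1))
    return runs


def full_height_columns(histogram, height):
    """Columns whose black count equals the string height (vertical strokes)."""
    return [x for x, count in enumerate(histogram) if count == height]


image = [
    [1, 1, 1, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 1, 1, 0, 1, 0],
]
hist = column_histogram(image)
areas = black_runs(hist)                         # candidate rectangular areas
strokes = full_height_columns(hist, len(image))  # vertical line positions
```

Each run gives the left and right X coordinates of one cut-out area, and the stroke list plays the role of x1 to x5.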

【００４８】[0048]

The contents of the dictionary memory 5a are as shown in FIG. 20 and are divided into two broad classes. One class consists of the characters "B, D, E, F, ..." that have a vertical line element running from top to bottom at the left end of the character pattern; the other class consists of the characters "A, C, G, J, ..." that have no such vertical line element.

【００４９】[0049]

The second character recognition unit 7b recognizes the rectangular areas 42 and 43 that the first character recognition unit 6 could not recognize. FIG. 21 shows the processing flow of the second character recognition unit 7b.

【００５０】[0050]

Steps S401 and S402 are the same as steps S201 and S202 of FIG. 9. Step S4031 checks whether the reference pattern to be compared belongs to class A, which has a vertical line, or to class B, which has none. If it belongs to class A, processing proceeds to step S4033; if it belongs to class B, processing proceeds to step S4032.

【００５１】[0051]

In step S4033, the left end of the reference pattern normalized to height H is aligned with each vertical line position of the cut-out pattern normalized to height H, and the similarity at each such position is obtained. For the cut-out rectangular area 42, the reference pattern is aligned with the region at vertical line position x2 and with the region at vertical line position x3 shown in FIG. 4, and the similarity at each position is obtained. That is, S(x2) and S(x3) of Equation (3) are obtained.

【００５２】[0052]

In step S4032, the reference pattern normalized to height H is superimposed on the left end of the cut-out pattern normalized to height H and the similarity is obtained; the reference pattern is then moved step by step to the right, with the similarity obtained at each position, until it reaches the right end of the cut-out pattern. That is, S(x) of Equation (3) is calculated while x is varied over the required range.

【００５３】[0053]

The processing from step S404 onward is the same as the corresponding processing from step S204 onward in FIG. 9.

【００５４】[0054]

In step S4033 above, the similarity for a reference pattern belonging to class A, which has a vertical line, is obtained only at the positions of the cut-out region where a vertical line exists. Since the similarity clearly cannot be high at any other position, this method still extracts the high-similarity characters accurately. The number of computationally expensive similarity calculations is therefore greatly reduced, and the processing time is shortened.
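The saving can be made concrete with a small sketch of how the set of positions to evaluate might be chosen. The function name, its arguments, and the one-pixel sliding step are assumptions made for illustration; the class test corresponds to the branch on whether the reference pattern has a left vertical stroke.

```python
def candidate_positions(has_left_stroke, stroke_positions,
                        region_width, pattern_width):
    """X positions at which the similarity would be evaluated.

    Patterns with a left vertical stroke are tried only where the
    cut-out region itself has a full-height stroke; all other patterns
    are slid across the whole region one pixel at a time (assumed step).
    """
    if has_left_stroke:
        return list(stroke_positions)
    return list(range(region_width - pattern_width + 1))


# Hypothetical numbers: a 100-pixel-wide region with strokes at x = 10 and 40.
with_stroke = candidate_positions(True, [10, 40], 100, 20)
without_stroke = candidate_positions(False, [10, 40], 100, 20)
evaluations_saved = len(without_stroke) - len(with_stroke)
```

In this invented case a stroke-class pattern needs 2 similarity evaluations instead of 81.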

【００５５】[0055]

Embodiment 4. A fourth embodiment of the present invention will now be described with reference to FIGS. 22 to 25. This embodiment handles the case where the typeface of the imaged recognition target characters is scaled differently in the vertical and horizontal directions relative to the registered reference patterns (so-called condensed and flattened typefaces). FIG. 22 shows the configuration of the character recognition device of this embodiment. FIG. 22 has the same configuration as FIG. 1 of Embodiment 1, but the processing of the first character recognition unit 6a and of the second character recognition unit 7c differs. Since the processing of the other components is the same as in Embodiment 1, only the processing of the first character recognition unit 6a and the second character recognition unit 7c is described below.

【００５６】[0056]

FIG. 23 shows the processing flow of the first character recognition unit 6a. In step S501, all cut-out patterns are processed in the same way as by the first character recognition unit 6 of Embodiment 1: the character patterns of rectangular areas 41, 44, and 45 are recognized as "E", "O", and "R", respectively, and the character patterns of rectangular areas 42 and 43 are found to be unrecognizable. Even if the aspect ratio of a cut-out character pattern differs from that of the reference pattern registered in the dictionary, both are normalized to the reference size (height H, width W) before the similarity is calculated, so recognition is not affected by the difference in aspect ratio.

【００５７】[0057]

Next, in step S502, the ratio between the aspect ratio of the cut-out rectangular area and the aspect ratio of the reference pattern is obtained for each recognized character, and these ratios are averaged. Let the cut-out rectangular area have height Hi and width Wi, and let the reference pattern have height hi and width wi. The ratio of the two aspect ratios is then given by ki = (hi·Wi)/(Hi·wi). As the relation hi/(ki·wi) = Hi/Wi shows, ki is the factor by which the width of the recognized reference pattern must be multiplied to obtain a pattern with the same aspect ratio as the corresponding cut-out pattern. When m characters have been recognized, the mean value k of the ratios ki is obtained by the following equation.

【００５８】[0058]

【数３】 [Equation 3]

$$k = \frac{1}{m}\sum_{i=1}^{m} k_i \tag{4}$$
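As a worked illustration of the ratio ki = (hi·Wi)/(Hi·wi) and its mean k over the m recognized characters, the following sketch uses invented sizes; the function names are hypothetical.

```python
def aspect_ratio_factor(cut_h, cut_w, ref_h, ref_w):
    """k_i = (h_i * W_i) / (H_i * w_i): the factor by which the reference
    width must be multiplied to match the cut-out pattern's aspect ratio."""
    return (ref_h * cut_w) / (cut_h * ref_w)


def mean_aspect_ratio_factor(sizes):
    """Mean k of the factors k_i over the m recognized characters.

    `sizes` is a list of (H_i, W_i, h_i, w_i) tuples.
    """
    factors = [aspect_ratio_factor(*size) for size in sizes]
    return sum(factors) / len(factors)


# A condensed character: cut-out 40 x 10 against a 20 x 10 reference.
k1 = aspect_ratio_factor(40, 10, 20, 10)
# Check of the relation h_i / (k_i * w_i) = H_i / W_i for this character.
relation_holds = abs(20 / (k1 * 10) - 40 / 10) < 1e-12
k = mean_aspect_ratio_factor([(40, 10, 20, 10), (40, 30, 20, 10)])
```

Here k1 is 0.5, meaning the reference pattern must be halved in width to match the condensed cut-out character.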

【００５９】[0059]

Next, for each character pattern of a cut-out rectangular area that the first character recognition unit 6a could not recognize, the second character recognition unit 7c performs recognition according to the flow shown in FIG. 24. The processing flow of FIG. 24 is described below.

【００６０】[0060]

Let the character pattern to be recognized be the rectangular area 42a of FIG. 25(a). First, in step S601, the character pattern of width Wa and height Ha is normalized to the reference size of height H, and its width is scaled by the same factor H/Ha used to normalize the height (FIG. 25(b)).

【００６１】[0061]

Next, in step S602, the i-th recognition target category stored in the dictionary memory 5 is taken out. Its reference pattern, of height hi and width wi, is first normalized in height to the reference size H, and its width is scaled by the height normalization factor H/hi multiplied by k of Equation (4). The reference pattern is thereby converted into a pattern of height H whose aspect ratio equals that of the imaged character pattern. When the i-th recognition target category is "L", the pattern of FIG. 25(c) is converted into that of FIG. 25(d).
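The conversion of a reference pattern in step S602 might look like the sketch below. Nearest-neighbour resampling is an assumption chosen for simplicity (the text does not specify the resampling method), and `resize_nearest` and `normalize_reference` are hypothetical names.

```python
def resize_nearest(pattern, new_h, new_w):
    """Nearest-neighbour resampling of a binary pattern (assumed method)."""
    old_h, old_w = len(pattern), len(pattern[0])
    return [
        [pattern[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def normalize_reference(pattern, target_h, k):
    """Scale height to target_h and width by (target_h / h_i) * k."""
    old_h, old_w = len(pattern), len(pattern[0])
    new_w = max(1, round(old_w * (target_h / old_h) * k))
    return resize_nearest(pattern, target_h, new_w)


reference = [[1, 0],
             [1, 1]]  # a tiny "L"-like pattern, 2 x 2
condensed = normalize_reference(reference, target_h=4, k=0.5)
regular = normalize_reference(reference, target_h=4, k=1.0)
```

With k = 0.5 the 2 × 2 pattern becomes 4 × 2 rather than 4 × 4, so it has the same proportions as a condensed input character.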

【００６２】[0062]

Next, in step S603, the reference pattern converted as described above is superimposed on the left end of the cut-out pattern normalized to height H and the similarity is obtained; the converted pattern is then moved step by step to the right, with the similarity obtained at each position, until it reaches the right end of the cut-out pattern. The similarity S(x) is calculated in accordance with Equation (3) above.

【００６３】[0063]

The processing from step S604 onward is the same as the corresponding processing from step S204 onward in FIG. 9, so its description is omitted.

【００６４】[0064]

As described above, in this embodiment, even when characters whose aspect ratio differs from that of the reference patterns (condensed or flattened characters) are input, the first character recognition unit 6a obtains the ratio between the two aspect ratios, and the second character recognition unit 7c uses this ratio to convert each reference pattern to the same aspect ratio as the input characters before obtaining the similarity. Good recognition results are therefore obtained even when characters whose aspect ratio differs from that of the reference patterns overlap or are joined.

【００６５】[0065]

Embodiment 5. A fifth embodiment of the present invention will now be described with reference to FIGS. 26 to 31. In Embodiment 1, when the second character recognition unit 7 recognizes, for example, the cut-out rectangular area 43 shown in FIG. 26 and the reference pattern of the target character "A" is superimposed on that area, part of an adjacent character intrudes into the superimposed region as shown in the figure, and the calculated similarity may be somewhat lowered as a result. The purpose of this embodiment is to prevent such a drop in similarity. FIG. 27 shows the configuration of the character recognition device of this embodiment. FIG. 27 has the same configuration as FIG. 1 of Embodiment 1, but the contents of the dictionary memory 5b and the processing of the second character recognition unit 7d differ. Since the processing of the other components is the same as in Embodiment 1, only these differences are described below.

【００６６】[0066]

FIG. 28 shows the contents of the dictionary memory 5b. They consist of the contents shown in FIG. 6 with a mask pattern added. As shown in FIG. 29, the mask pattern assigns the value "0" to those parts of the reference pattern region that may overlap or join with an adjacent character, and the value "1" to the remaining region (the hatched part of FIG. 29).

【００６７】[0067]

Next, the processing of the second character recognition unit 7d is described. FIG. 30 shows the processing flow of the second character recognition unit 7d. The overall processing is the same as that of FIG. 9; only the method of calculating the similarity in step S702 differs. The description of the processing flow is therefore omitted, and only the similarity calculation of step S702 is described.

【００６８】[0068]

Here, the similarity S(x) is calculated by the following equation.

【００６９】[0069]

【数４】 [Equation 4]

【００７０】[0070]

Here, R = [Rij] is the mask pattern normalized to the reference size. Equation (5) means that, when the similarity between the cut-out pattern and the reference pattern is obtained, only the region where the mask pattern has the value 1 (Rij = 1) is considered. Thus, for example, when the reference pattern of "A" is superimposed on the cut-out rectangular area 43 in FIG. 31, only the hatched region where the mask pattern has the value 1 is compared at the position where "A" should be extracted, so the similarity is obtained without being affected by the overlapping part of the adjacent character pattern.
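The effect of the mask can be illustrated with the sketch below. The similarity used here, the fraction of agreeing pixels among those where the mask is 1, is only an assumed stand-in; the actual form of Equation (5) is not reproduced in this text. The function name and the tiny patterns are hypothetical.

```python
def masked_similarity(window, reference, mask):
    """Pixel agreement restricted to positions where the mask is 1.

    Assumed stand-in for Equation (5); all three arguments are binary
    images of identical size.
    """
    considered = matched = 0
    for w_row, p_row, m_row in zip(window, reference, mask):
        for w, p, m in zip(w_row, p_row, m_row):
            if m == 1:
                considered += 1
                matched += (w == p)
    return matched / considered if considered else 0.0


reference = [[0, 1, 0],
             [1, 1, 1]]
# The window contains a stray pixel (top left) from a neighbouring character.
window = [[1, 1, 0],
          [1, 1, 1]]
mask = [[0, 1, 1],       # 0 where a neighbour may intrude
        [1, 1, 1]]
all_ones = [[1, 1, 1],
            [1, 1, 1]]

with_mask = masked_similarity(window, reference, mask)
without_mask = masked_similarity(window, reference, all_ones)
```

Ignoring the pixel that the neighbour may occupy restores a perfect score, whereas comparing every pixel lowers it to 5/6.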

【００７１】[0071]

As described above, in this embodiment the similarity is calculated with the parts that may overlap or join masked out, so individual characters can be extracted and recognized stably even when the overlap or joint is wide.

【００７２】[0072]

Embodiment 6. A sixth embodiment of the present invention will now be described with reference to FIGS. 32 to 36. FIG. 32 shows the configuration of the character recognition device of the sixth embodiment. FIG. 32 has the same configuration as FIG. 1 of the first embodiment, but the processing of the second character recognition unit 7e differs. Since the processing of the other components is the same as in Embodiment 1, only the processing of the second character recognition unit 7e is described below.

【００７３】[0073]

FIG. 33 shows the processing flow of the second character recognition unit 7e. The character pattern to be recognized is explained using rectangular area 43 of FIG. 4 as an example. In step S801, the rectangular area to be processed is scaled so that its vertical size becomes the reference height H, and its horizontal size is scaled by the same factor. The rectangular area thus normalized to height H is called A (see FIG. 34(a)). Next, in step S802, the character detection number j is initialized to j = 1 and the character detection position xc to xc = 0. The character detection number j indicates which character, counted from the left of rectangular area A, is about to be detected. The character detection position xc indicates that the region to be searched starts at coordinate xc, with the left end of rectangular area A as the origin.

【００７４】[0074]

From step S803 onward, the contents of the dictionary memory 5 shown in FIG. 14 are compared with the contents of rectangular area A to extract characters. First, in step S803, the character category number i is initialized to i = 1. In step S804, the reference pattern of character category number i (for i = 1, the reference pattern of the character category "A" in FIG. 14) is scaled so that its vertical size becomes the reference height H, and its horizontal size is scaled by the same factor. The reference pattern of character category number i thus normalized to height H is called Pi.

【００７５】[0075]

Next, step S805 checks the value of the character detection number j. If j = 1 (detection of the first character), step S806 obtains the similarity Si = S(x) between rectangular area A and the reference pattern Pi. The similarity S(x) is given by Equation (3). Here x = xc (= 0), so the left end of the reference pattern Pi is aligned with the left end of rectangular area A as shown in FIG. 34(b) and the similarity S(x) between the two is obtained. Then, in step S808, the character category number i, the character detection coordinate xi, the obtained similarity Si, and the horizontal size wni of the normalized reference pattern Pi of category number i are recorded.

【００７６】[0076]

Next, step S809 checks whether all n character categories registered in the dictionary memory 5 have been processed (i = n?). If not, step S810 increments the character category number i and processing returns to step S804 for the next character category. By this procedure, the normalized patterns of all character categories are superimposed on the left end of rectangular area A (x = xc = 0), the similarity of each is obtained, and the required items are recorded in step S808. After all character categories have been processed, the similarity record is as shown in FIG. 35: for each character category (1 ≤ i ≤ n), it holds the character category number i, the character detection coordinate xi (xi = xc = 0 when j = 1), the similarity Si, and the horizontal size wni of the normalized reference pattern Pi.

【００７７】[0077]

When all character categories have been processed (i = n), the loop is exited at step S809. Step S811 checks the similarity record, and if any similarity Si is equal to or greater than the set reference value, step S812 takes the character category giving the maximum similarity Si as the recognition result for the j-th character. In this embodiment, the category "V" with character category number i = 22 has the maximum similarity and is selected as the recognition result for the first character (j = 1). Let the character detection position at this time be xj and the horizontal size of the normalized reference pattern be wj (see FIG. 34(c)).

【００７８】[0078]

Next, in step S813, the character detection coordinate xc is updated to xj + wj (FIG. 34(d)). Step S814 checks the distance R in FIG. 34(d), which is the remaining part of rectangular area A. If the distance R is equal to or greater than a reference value, characters remain to be detected, so step S815 increments the character detection number j, processing returns to step S803, and the next character is detected by the same procedure. For j = 2, the similarity between rectangular area A and the normalized reference pattern of each character category is obtained using the xc shown in FIG. 34(e) as the reference position.

【００７９】[0079]

In the similarity procedure, for the second and subsequent characters (character detection number j ≥ 2), step S805 branches to step S807. In step S807, the similarity between rectangular area A and the reference pattern Pi is obtained as Si = max{S(x)} (xc − ε1 ≤ x ≤ xc + ε2). That is, S(x) of Equation (5) is evaluated while x is varied over the small range xc − ε1 ≤ x ≤ xc + ε2, and its maximum value is taken as the similarity Si. The value of x that gives the maximum is taken as xi. The value of x is varied so that, relative to the character detection coordinate xc of the j-th character, which is based on the right end of the (j−1)-th detected character, characters that overlap as shown in FIG. 36(a) or are joined as shown in FIG. 36(b) can be detected. Suitable values are set for ε1 and ε2 in advance.

【００８０】[0080]

By the above processing, the j = 1, 2, ...-th characters from the left are extracted in sequence from rectangular area A. When step S814 determines that no part of rectangular area A remains, processing proceeds to step S816, where the recognition results are compiled and processing ends. In this embodiment, the character string "VAT" is compiled as the recognition result. If step S811 finds no similarity Si equal to or greater than the reference value, the character at that character detection number is treated as unrecognizable, and in step S816 a recognition result such as "V?" (where "?" denotes an unrecognizable character) is compiled.
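The left-to-right extraction of steps S802 to S816 can be sketched as follows. To keep the sketch self-contained, the similarity is passed in as a function `score(category, x)`; that interface, the parameter names, the toy scoring table, and the progress guard on `xc` are assumptions made for illustration and are not taken from the flow charts.

```python
def extract_left_to_right(region_width, widths, score, threshold,
                          eps_left=1, eps_right=1, min_remaining=2):
    """Greedy left-to-right extraction of characters from one region.

    widths: normalized width of each category's reference pattern.
    score(category, x): similarity with the pattern's left edge at x.
    """
    result, xc, first = [], 0, True
    while region_width - xc >= min_remaining:
        # First character: only x = xc. Later characters: a small window
        # around xc, to tolerate overlapping or joined neighbours.
        xs = [xc] if first else range(max(0, xc - eps_left), xc + eps_right + 1)
        best = None  # (similarity, category, x)
        for category in widths:
            for x in xs:
                s = score(category, x)
                if best is None or s > best[0]:
                    best = (s, category, x)
        if best is None or best[0] < threshold:
            result.append("?")  # unrecognizable at this detection number
            break
        _, category, x = best
        result.append(category)
        # Advance past the detected character; the max() guard (added here
        # for safety) guarantees progress even for very narrow patterns.
        xc = max(x + widths[category], xc + 1)
        first = False
    return "".join(result)


# Toy data: "A" starts one pixel before the right edge of "V" (overlap).
widths = {"V": 10, "A": 10, "T": 8}
truth = {("V", 0), ("A", 9), ("T", 19)}
text = extract_left_to_right(
    27, widths, lambda c, x: 1.0 if (c, x) in truth else 0.2, threshold=0.8)
```

Because only a few positions are tested per character, this search evaluates far fewer similarities than sliding every pattern across the whole region.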

【００８１】[0081]

In this way, for a cut-out region containing a plurality of characters, the first character is extracted by superimposing the reference patterns on the left end of the cut-out region and obtaining the similarity; the second character is extracted in the same way by superimposing the reference patterns on the left end of the region that remains after the first character is removed; and this is repeated to extract the characters in sequence. Characters can therefore be recognized directly even when the individual characters cannot be cut out.

【００８２】[0082]

Embodiment 7. A seventh embodiment of the present invention will now be described with reference to FIGS. 37 to 40. FIG. 37 shows the configuration of the character recognition device of the seventh embodiment. This embodiment addresses the case where the recognition target characters include a character whose pattern matches a part of another character pattern with a high degree of similarity. FIG. 37 has the same configuration as FIG. 32 of the sixth embodiment, but the processing of the second character recognition unit 7f differs. Since the processing of the other components is the same as in the sixth embodiment, only the processing of the second character recognition unit 7f is described below.

【００８３】[0083]

FIG. 38 shows the processing flow of the second character recognition unit 7f. This flow follows the same sequence as the processing of the sixth embodiment shown in FIG. 33, extracting characters in order from the left of the rectangular area to be recognized. The difference lies in step S911, which checks whether any similarity Si is equal to or greater than the reference value. Processing branches according to whether the number of such similarities is "two or more", "one", or "none", and for the "two or more" case step S917 is added to decide which character category to adopt as the recognition result. The "one" and "none" cases are identical to the "yes" and "no" branches of step S811 in FIG. 33, respectively. In step S917, each of the character categories whose similarity is equal to or greater than the reference value is assumed in turn to be the recognition result, and the recognition result is selected according to whether a proper character recognition result is obtained to the right of the assumed character pattern. The processing of step S917 is described below.

[0084] The recognition target character pattern is explained using the example of rectangular region 42 in FIG. 4. In step S901, rectangular region 42 is normalized by its vertical size H, and the resulting pattern is denoted A. Here the dictionary memory 5 contains "I" as a character whose pattern shows a high similarity to a part of another character pattern (see FIG. 14). Because the reference pattern of the character category "I" matches closely the vertical line at the left end of the pattern "L" located at the left of rectangular region A, the contents recorded in step S908 are, at the time of step S911, as shown in FIG. 39. Since the two similarities S9 and S12 of the categories "I" and "L" are at or above the reference value, the processing branches to step S917. In step S917, a selection is made between these two categories whose Si is at or above the reference value by the following procedure.

[0085] First, the character category "I" (i = 9) is assumed to be the recognition result, and it is checked whether a proper character recognition result is obtained to its right. That is, in FIG. 40(a), with respect to the character detection position xc, xc' = xc + wn9 is taken as the new character detection position, the normalized reference pattern Pi of each character category is superimposed there, and the similarity Si (i = 9) is obtained. Here Si is given by

Si = max{S(x)} (xc' − ε1 ≤ x ≤ xc' + ε2).

As can be seen from FIG. 40(a), in the shaded region whose character detection reference position is xc', no pattern fit is found for the reference pattern of any category, so no high value of the similarity Si is obtained.
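
The windowed maximum Si = max{S(x)} over xc' − ε1 ≤ x ≤ xc' + ε2 can be illustrated with a short sketch. The similarity measure S(x) is not defined in this passage, so the pixel-agreement ratio used below is an assumption chosen only to make the example concrete; the names `similarity_at` and `best_similarity` are likewise hypothetical.

```python
from typing import List

Pattern = List[List[int]]  # binary image stored as rows of 0/1 pixels


def similarity_at(region: Pattern, template: Pattern, x: int) -> float:
    """Assumed S(x): fraction of template pixels that agree with the
    region when the template's left edge is placed at column x."""
    height = len(template)
    width = len(template[0])
    if x < 0 or x + width > len(region[0]):
        return 0.0  # the template would fall outside the region
    agree = sum(
        1
        for row in range(height)
        for col in range(width)
        if region[row][x + col] == template[row][col]
    )
    return agree / (height * width)


def best_similarity(
    region: Pattern, template: Pattern, xc: int, eps1: int, eps2: int
) -> float:
    """Si = max S(x) over xc - eps1 <= x <= xc + eps2."""
    return max(
        similarity_at(region, template, x)
        for x in range(xc - eps1, xc + eps2 + 1)
    )
```

Searching a small window around the detection position, rather than a single column, tolerates a few pixels of misalignment between the assumed character width and the actual stroke position.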

[0086] Next, the character category "L" (i = 12) is assumed to be the recognition result, and it is checked whether a proper character recognition result is obtained to its right. That is, in FIG. 40(b), with respect to the character detection position xc, xc' = xc + wn12 is taken as the new character detection position, the normalized reference pattern Pi of each character category is superimposed there, and the similarity Si (i = 12) is obtained in the same way. In this case, as can be seen from FIG. 40(b), the similarity Si shows a high value when the character category "E" is superimposed on the shaded region whose character detection reference position is xc'.

[0087] As described above, when each of the character categories "I" and "L" is assumed to be the recognition result, the category for which a proper character recognition result is obtained to the right of the assumed character pattern is "L". Therefore, in step S910 the character category "L" is taken as the recognition result for the j-th character. The case where no area remains to the right of the assumed character pattern is treated as equivalent to obtaining a proper character recognition result. After the processing moves on to step S912, it is the same as the processing from step S812 onward described for FIG. 33.
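
The selection in step S917 might be sketched as below. This is an illustration under stated assumptions: wn_i is read here as the normalized width of reference pattern i, `best_score_at` is a hypothetical callback returning the highest similarity over all categories at a given position, and the function returns every surviving candidate because the text does not say how a tie would be resolved.

```python
from typing import Callable, Dict, List


def select_by_lookahead(
    candidates: List[str],
    widths: Dict[str, int],
    xc: int,
    region_width: int,
    best_score_at: Callable[[int], float],
    threshold: float,
) -> List[str]:
    """Keep the candidates whose right-hand side yields a valid recognition."""
    accepted: List[str] = []
    for cat in candidates:
        next_x = xc + widths[cat]  # xc' = xc + wn_i (wn_i assumed to be a width)
        if next_x >= region_width:
            # No remaining area to the right: treated as a valid result.
            accepted.append(cat)
        elif best_score_at(next_x) >= threshold:
            # Some category fits well immediately to the right.
            accepted.append(cat)
    return accepted
```

With the "LE" example from the text, assuming "I" leaves the rest of the "L" stroke unexplained, so nothing matches at xc', while assuming "L" places xc' at the start of "E".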

[0088] As described above, in this embodiment, even when the recognition target characters include a character whose pattern shows a high degree of matching with a part of another character pattern, one character is selected appropriately from a rectangular region containing a plurality of characters by checking whether a proper character can be extracted to the right of each extracted character. The character string can therefore be recognized with fewer recognition errors.

[0089] Embodiment 8. Next, an eighth embodiment of the present invention is described with reference to FIGS. 41 to 43. FIG. 41 shows the configuration of the character recognition device of the eighth embodiment. FIG. 41 has the same configuration as FIG. 32 of the sixth embodiment, but the contents of the dictionary memory 5a and the processing of the second character recognition unit 7g differ. Since the processing of the other components is the same as in the sixth embodiment, only these differences are described below.

[0090] The contents of the dictionary memory 5a are as shown in FIG. 20 and are divided broadly into two classes. One class consists of the characters "B, D, E, F, ..." that have a vertical line element running from top to bottom at the left end of the character pattern (class A). The other consists of the characters "A, C, G, J, ..." that have no such vertical line element (class B).

[0091] FIG. 42 shows the processing flow of the second character recognition unit 7g. In step S1001, the rectangular region to be processed is normalized by its vertical size H. The rectangular region normalized in this way is denoted A, and the following description uses the example of FIG. 43(a). In step S1002, the character detection number j is initialized to j = 1 and the character detection position xc to xc = 0.

[0092] In step S1002a, it is checked whether the vertical histogram of rectangular region A has a portion at or above the reference value in the vicinity of the character detection position xc. If there is such a portion, flag is set to ON in step S1002b; if there is not, flag is set to OFF in step S1002c. In this example, the vertical histogram of rectangular region A is as shown in FIG. 43(b). Because the histogram is at or above the reference value near x = xc = 0, flag is set to ON.
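
The flag decision of steps S1002a to S1002c can be sketched as follows. The sketch assumes that the vertical histogram counts the black pixels in each column and that "vicinity" means a fixed number of columns on either side of xc; the parameter name `margin` and both function names are hypothetical.

```python
from typing import List


def column_histogram(region: List[List[int]]) -> List[int]:
    """Count the black (1) pixels in each column of a binary image."""
    return [sum(column) for column in zip(*region)]


def has_vertical_stroke(
    region: List[List[int]], xc: int, margin: int, reference: int
) -> bool:
    """Return True (flag = ON) when some column within `margin` columns of
    xc has a histogram value at or above the reference value."""
    hist = column_histogram(region)
    lo = max(0, xc - margin)
    hi = min(len(hist), xc + margin + 1)
    return any(value >= reference for value in hist[lo:hi])
```

A tall column count near xc suggests a full-height vertical stroke at the left end of the next character, which is the property used to split the dictionary into two classes.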

[0093] In step S1003, the character category number is initialized to i = 1. While the category number is incremented in step S1010 under the check of step S1009, the similarity between the normalized reference pattern of each character category and rectangular region A is obtained with x = xc as the reference position. The similarity computation of steps S1004 to S1008 is the same as that of steps S804 to S808 in FIG. 33 described for the sixth embodiment, except that in this embodiment the class of the reference pattern is checked in step S1003a before each character category is handled.

[0094] Step S1003a checks whether one of the following conditions holds: flag = ON and the reference pattern of character category number i belongs to class A, which has a vertical line element at the left end; or flag = OFF and the reference pattern of character category number i belongs to class B, which has no vertical line element at the left end. If the condition holds, the similarity computation following step S1003 is performed. If it does not hold, the similarity is not computed and the processing jumps to step S1009.
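
The class check of step S1003a amounts to filtering the dictionary before any similarity is computed. The sketch below uses only a small illustrative subset of each class ("L" is placed in class A because the text later detects it among the class A categories); the set names and the function `categories_to_score` are assumptions for the example.

```python
from typing import Iterable, List

# Illustrative subsets only; the full classes are those shown in FIG. 20.
CLASS_A = {"B", "D", "E", "F", "L"}  # vertical line element at the left end
CLASS_B = {"A", "C", "G", "J"}       # no vertical line element at the left end


def categories_to_score(all_categories: Iterable[str], flag_on: bool) -> List[str]:
    """Return the categories whose similarity should be computed:
    class A when flag = ON, class B when flag = OFF."""
    wanted = CLASS_A if flag_on else CLASS_B
    return [cat for cat in all_categories if cat in wanted]
```

Because the costly similarity computation runs only on the returned categories, roughly half of the dictionary is skipped at each detection position when the two classes are of similar size.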

[0095] At xc = 0, flag = ON, so only class A, whose reference patterns have a vertical line element at the left end, satisfies the condition. The similarity is therefore computed for "B", "D", "E", ... of class A, and in steps S1011 and S1012 the character category "L", which shows the highest similarity, is detected as the recognition result.

[0096] From step S1013 onward, as in the description of the processing from step S813 onward in FIG. 33 of the sixth embodiment, the character detection coordinate xc is updated, the remaining area is checked, the character detection number j is incremented, and the processing returns to step S1002a to detect the next character. In this example, at j = 2 the character detection coordinate xc is at position x2 shown in FIG. 43(b). The histogram is below the reference value in this vicinity, so flag is set to OFF in steps S1002a and S1002c. Accordingly, in step S1003a only class B, whose reference patterns have no vertical line element at the left end, satisfies the condition. The similarity is computed for "A", "C", "G", ... of class B from step S1004 onward, and in steps S1011 and S1012 the character category "A", which shows the highest similarity, is detected as the recognition result.

[0097] As is clear from the above description, in this embodiment, whether the character fitting the region to be recognized has a vertical line at its left end is judged from whether the histogram is high near the left end of that region. From the character categories classified in advance by whether or not they have a vertical line, the similarity is computed only for the categories of the class that satisfies the condition. This greatly reduces the number of similarity computations, which are computationally expensive, and shortens the processing time.

【００９８】[0098]

[Effects of the Invention] Because the present invention is configured as described above, it provides the effects described below.

[0099] The character recognition device according to the present invention comprises: an image input unit that inputs an image containing recognition target characters; a character string cutout unit that cuts out a character string using a pixel density histogram generated by scanning the image input by the image input unit in the character string direction; a character cutout unit that cuts out a rectangular region containing recognition target characters using a pixel density histogram generated by scanning the character string image cut out by the character string cutout unit in the direction perpendicular to the character string; a first character recognition unit that recognizes a recognition target character by comparing the rectangular region cut out by the character cutout unit with the reference patterns of the recognition target characters; and a second character recognition unit that compares a rectangular region not recognized by the first character recognition unit with the reference patterns of the recognition target characters while shifting the position of each reference pattern, thereby extracts characters having a high similarity to a reference pattern as recognition target character candidates together with the position of the reference pattern, and recognizes the recognition target characters by rearranging the extracted candidates in the order in which their positions appear. The second character recognition unit recognizes the recognition target characters using reference patterns in which the portions where overlap or joining may occur between characters adjacent in the character string direction are masked. As a result, a plurality of recognition target character candidates are recognized together with their positions from a rectangular region containing a plurality of recognition target characters, and the character string is recognized by rearranging these characters in order of position. Even when overlap or joining between adjacent recognition target characters prevents the individual characters from being cut out using the histogram, the recognition target characters can be recognized directly, and the recognition is made stable.
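
The rearrangement of candidates in order of position, as performed by the second character recognition unit described above, can be illustrated with a few lines. The `Candidate` record and the function name are hypothetical and serve only to show the ordering step; how the candidates are found and filtered is outside this sketch.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    category: str      # recognized character category
    position: int      # position at which the reference pattern matched
    similarity: float  # similarity at that position


def read_in_position_order(candidates: List[Candidate]) -> str:
    """Sort the extracted candidates by position and join their categories
    into the recognized character string."""
    ordered = sorted(candidates, key=lambda c: c.position)
    return "".join(c.category for c in ordered)
```

Candidates may be found in any order, for example category by category, so sorting by position is what restores the reading order of the string.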

[0100] In another aspect, the device comprises: an image input unit that inputs an image containing recognition target characters; a character string cutout unit that cuts out a character string using a pixel density histogram generated by scanning the image input by the image input unit in the character string direction; a character cutout unit that cuts out a rectangular region containing recognition target characters using a pixel density histogram generated by scanning the character string image cut out by the character string cutout unit in the direction perpendicular to the character string; a first character recognition unit that recognizes a recognition target character by comparing the rectangular region cut out by the character cutout unit with the reference patterns of the recognition target characters; and a second character recognition unit. The second character recognition unit classifies the reference patterns of the recognition target characters according to the presence or absence of a line element running from top to bottom of the rectangular region in the direction perpendicular to the character string. It compares a rectangular region not recognized by the first character recognition unit with the reference patterns that have no such line element while shifting the position of each reference pattern, thereby extracting characters having a high similarity to a reference pattern as recognition target character candidates together with the position of the reference pattern. It also superimposes the reference patterns that have such a line element on the character string of the rectangular region not recognized by the first character recognition unit, thereby extracting characters having a high similarity to a reference pattern as recognition target character candidates together with the position of the reference pattern. It then recognizes the recognition target characters by rearranging the extracted candidates in the order in which their positions appear. As a result, a plurality of recognition target character candidates are recognized together with their positions from a rectangular region containing a plurality of recognition target characters, and the character string is recognized by rearranging these characters in order of position. Even when overlap or joining between adjacent recognition target characters prevents the individual characters from being cut out using the histogram, the recognition target characters can be recognized directly. Moreover, the number of recognition operations is reduced, so the recognition target characters can be recognized at high speed.

[0101] In a further aspect, the device comprises: an image input unit that inputs an image containing recognition target characters; a character string cutout unit that cuts out a character string using a pixel density histogram generated by scanning the image input by the image input unit in the character string direction; a character cutout unit that cuts out a rectangular region containing recognition target characters using a pixel density histogram generated by scanning the character string image cut out by the character string cutout unit in the direction perpendicular to the character string; a first character recognition unit that recognizes a recognition target character by comparing the rectangular region cut out by the character cutout unit with the reference patterns of the recognition target characters; and a second character recognition unit. The second character recognition unit recognizes a recognition target character by superimposing the reference patterns of the recognition target characters at one end of a rectangular region not recognized by the first character recognition unit, and then recognizes the recognition target characters one after another by superimposing the reference patterns at the end, on the same side as said one end, of the rectangular region that remains after the recognized character is removed. When a high position of the pixel density histogram generated by scanning in the direction perpendicular to the character string exists at said one end of the rectangular region to be recognized, the second character recognition unit superimposes those reference patterns that have a line element running from top to bottom of the rectangular region in the direction perpendicular to the character string. When no such high position exists, it superimposes those reference patterns that have no such line element. As a result, the character string is recognized while the recognized characters are removed from the rectangular region one after another. Even when overlap or joining between adjacent recognition target characters prevents the individual characters from being cut out using the histogram, the recognition target characters can be recognized directly. Moreover, the number of recognition operations is reduced, so the recognition target characters can be recognized at high speed.

【図面の簡単な説明】[Brief description of drawings]

【図１】この発明の第１の実施の形態による文字認識装
置の構成を示すブロック図である。FIG. 1 is a block diagram showing a configuration of a character recognition device according to a first embodiment of the present invention.

【図２】この発明の第１の実施の形態による処理例を示
すための入力画像を示す説明図である。FIG. 2 is an explanatory diagram showing an input image for showing a processing example according to the first embodiment of the present invention.

【図３】この発明の第１の実施の形態による入力画像か
ら文字列領域を切り出す課程を示す説明図である。FIG. 3 is an explanatory diagram showing a process of cutting out a character string region from an input image according to the first embodiment of the present invention.

【図４】この発明の第１の実施の形態による文字列領域
から文字領域を切り出す課程を示す説明図である。FIG. 4 is an explanatory diagram showing a process of cutting out a character area from a character string area according to the first embodiment of the present invention.

【図５】この発明の第１の実施の形態による第１の文字
認識部６の処理内容を示すフロー図である。FIG. 5 is a flowchart showing the processing contents of the first character recognition unit 6 according to the first embodiment of the present invention.

【図６】この発明の第１の実施の形態による辞書メモリ
５の内容を示す説明図である。FIG. 6 is an explanatory diagram showing contents of the dictionary memory 5 according to the first embodiment of the present invention.

【図７】この発明の第１の実施の形態による第１の文字
認識部６における文字サイズの正規化を示す説明図であ
る。FIG. 7 is an explanatory diagram showing normalization of a character size in the first character recognition unit 6 according to the first embodiment of the present invention.

【図８】この発明の第１の実施の形態による一つの切り
出しパターンの類似度の記録を示す説明図である。FIG. 8 is an explanatory diagram showing recording of the similarity of one cutout pattern according to the first embodiment of the present invention.

【図９】この発明の第１の実施の形態による第２の文字
認識部７の処理内容を示すフロー図である。FIG. 9 is a flowchart showing the processing contents of the second character recognition unit 7 according to the first embodiment of the present invention.

【図１０】この発明の第１の実施の形態による接合を生
じた文字切り出し領域から各文字カテゴリの類似度を求
めた記録を示す説明図である。FIG. 10 is an explanatory diagram showing a record in which the degree of similarity of each character category is obtained from the character cutout area in which joining has occurred according to the first embodiment of the present invention.

【図１１】この発明の第１の実施の形態による重なりを
生じた文字切り出し領域から各文字カテゴリの類似度を
求めた記録を示す説明図である。FIG. 11 is an explanatory diagram showing a record in which the degree of similarity of each character category is obtained from a character cutout area in which an overlap has occurred according to the first embodiment of the present invention.

【図１２】この発明の第２の実施の形態による文字認識
装置の構成を示すブロック図である。FIG. 12 is a block diagram showing a configuration of a character recognition device according to a second embodiment of the present invention.

【図１３】この発明の第２の実施の形態による第２の文
字認識部７ａの処理内容を示すフロー図である。FIG. 13 is a flowchart showing the processing contents of a second character recognition unit 7a according to the second embodiment of the present invention.

【図１４】この発明の第２の実施の形態による辞書メモ
リ５の内容を示す説明図である。FIG. 14 is an explanatory diagram showing the contents of a dictionary memory 5 according to the second embodiment of the present invention.

【図１５】この発明の第２の実施の形態による切り出し
パターン「ＬＥ」の各位置における基準パターン「Ｉ」
の類似度を示す説明図である。FIG. 15 is a reference pattern “I” at each position of the cutout pattern “LE” according to the second embodiment of the present invention.
It is an explanatory view showing the degree of similarity of.

【図１６】この発明の第２の実施の形態による切り出し
パターン「ＬＥ」から抽出される文字とその位置を示す
説明図である。FIG. 16 is an explanatory diagram showing characters and their positions extracted from a cutout pattern “LE” according to the second embodiment of the present invention.

【図１７】この発明の第２の実施の形態による抽出され
た各文字の存在範囲を示す説明図である。FIG. 17 is an explanatory diagram showing the existence range of each extracted character according to the second embodiment of the present invention.

【図１８】この発明の第２の実施の形態による抽出され
た文字組み合わせの存在範囲を示す説明図である。FIG. 18 is an explanatory diagram showing the existence range of an extracted character combination according to the second embodiment of the present invention.

【図１９】この発明の第３の実施の形態による文字認識
装置の構成を示すブロック図である。FIG. 19 is a block diagram showing a configuration of a character recognition device according to a third embodiment of the present invention.

【図２０】この発明の第３の実施の形態による辞書メモ
リ５ａの内容を示す説明図である。FIG. 20 is an explanatory diagram showing the contents of a dictionary memory 5a according to the third embodiment of the present invention.

【図２１】この発明の第３の実施の形態による第２の文
字認識部７ｂの処理内容を示すフロー図である。FIG. 21 is a flowchart showing the processing contents of the second character recognition unit 7b according to the third embodiment of the present invention.

【図２２】この発明の第４の実施の形態による文字認識
装置の構成を示すブロック図である。FIG. 22 is a block diagram showing a configuration of a character recognition device according to a fourth embodiment of the present invention.

【図２３】この発明の第４の実施の形態による第１の文
字認識部６ａの処理内容を示すフロー図である。FIG. 23 is a flowchart showing the processing contents of the first character recognition unit 6a according to the fourth embodiment of the present invention.

【図２４】この発明の第４の実施の形態による第２の文
字認識部７ｃの処理内容を示すフロー図である。FIG. 24 is a flowchart showing the processing contents of the second character recognition unit 7c according to the fourth embodiment of the present invention.

【図２５】この発明の第４の実施の形態による切り出し
パターンと基準パターンとを同サイズのパターンとする
パターンサイズの変換についての説明図である。FIG. 25 is an explanatory diagram of pattern size conversion in which the cutout pattern and the reference pattern have the same size according to the fourth embodiment of the present invention.

【図２６】この発明の第５の実施の形態による類似度の
低下する切り出し矩形領域（パターン）を示す説明図で
ある。FIG. 26 is an explanatory diagram showing a cutout rectangular area (pattern) in which the degree of similarity is reduced according to the fifth embodiment of the present invention.

【図２７】この発明の第５の実施の形態による文字認識
装置の構成を示すブロック図である。FIG. 27 is a block diagram showing a configuration of a character recognition device according to a fifth embodiment of the present invention.

【図２８】この発明の第５の実施の形態による辞書メモ
リ５ｂの内容を示す説明図である。FIG. 28 is an explanatory diagram showing the contents of a dictionary memory 5b according to the fifth embodiment of the present invention.

【図２９】この発明の第５の実施の形態によるマスクパ
ターンの内容を示す説明図である。FIG. 29 is an explanatory diagram showing the contents of a mask pattern according to the fifth embodiment of the present invention.

【図３０】この発明の第５の実施の形態による第２の文
字認識部７ｄの処理内容を示すフロー図である。FIG. 30 is a flowchart showing the processing contents of the second character recognition unit 7d according to the fifth embodiment of the present invention.

【図３１】この発明の第５の実施の形態によるマスクパ
ターンの効果を示す説明図である。FIG. 31 is an explanatory diagram showing the effect of the mask pattern according to the fifth embodiment of the present invention.

【図３２】この発明の第６の実施の形態による文字認識
装置の構成を示すブロック図である。FIG. 32 is a block diagram showing a configuration of a character recognition device according to a sixth embodiment of the present invention.

【図３３】この発明の第６の実施の形態による第２の文
字認識部７ｅの処理内容を示すフロー図である。FIG. 33 is a flowchart showing the processing contents of the second character recognition unit 7e according to the sixth embodiment of the present invention.

【図３４】この発明の第６の実施の形態による第２の文
字認識部７ｅの処理手順の説明図である。FIG. 34 is an explanatory diagram of a processing procedure of the second character recognition unit 7e according to the sixth embodiment of the present invention.

【図３５】この発明の第６の実施の形態による第２の文
字認識部７ｅの記録内容を示す説明図である。FIG. 35 is an explanatory diagram showing recorded contents of the second character recognition unit 7e according to the sixth embodiment of the present invention.

【図３６】この発明の第６の実施の形態による第２の文
字認識部７ｅの類似度を求める範囲についての説明図で
ある。FIG. 36 is an explanatory diagram of a range for obtaining the degree of similarity of the second character recognition unit 7e according to the sixth embodiment of the present invention.

【図３７】この発明の第７の実施の形態による文字認識
装置の構成を示すブロック図である。FIG. 37 is a block diagram showing a configuration of a character recognition device according to a seventh embodiment of the present invention.

【図３８】この発明の第７の実施の形態による第２の文
字認識部７ｆの処理内容を示すフロー図である。FIG. 38 is a flowchart showing the processing contents of the second character recognition unit 7f according to the seventh embodiment of the present invention.

【図３９】この発明の第７の実施の形態による第２の文
字認識部７ｆの記録内容を示す説明図である。FIG. 39 is an explanatory diagram showing recorded contents of the second character recognition unit 7f according to the seventh embodiment of the present invention.

【図４０】この発明の第７の実施の形態による第２の文
字認識部７ｆにおいて複数の認識候補文字から１つを選
択する方法の説明図である。FIG. 40 is an explanatory diagram of a method of selecting one from a plurality of recognition candidate characters in the second character recognition unit 7f according to the seventh embodiment of the present invention.

【図４１】この発明の第８の実施の形態による文字認識
装置の構成を示すブロック図である。FIG. 41 is a block diagram showing a configuration of a character recognition device according to an eighth embodiment of the present invention.

【図４２】この発明の第８の実施の形態による第２の文
字認識部７ｇの処理内容を示すフロー図である。FIG. 42 is a flowchart showing the processing contents of the second character recognition unit 7g according to the eighth embodiment of the present invention.

【図４３】この発明の第８の実施の形態による第２の文
字認識部７ｇの文字検出位置のヒストグラムによる文字
カテゴリの分類についての説明図である。FIG. 43 is an explanatory diagram of classification of character categories by a histogram of character detection positions of the second character recognition unit 7g according to the eighth embodiment of the present invention.

【図４４】従来の文字認識装置の構成を示すブロック図
である。FIG. 44 is a block diagram showing a configuration of a conventional character recognition device.

【図４５】従来の文字認識装置の処理を示すための入力
画像の説明図である。FIG. 45 is an explanatory diagram of an input image showing the processing of the conventional character recognition device.

【図４６】従来の入力画像から文字列領域を切り出す課
程を示す説明図である。FIG. 46 is an explanatory diagram showing the conventional process of cutting out a character string region from an input image.

【図４７】従来の文字列領域から文字領域を切り出す課
程を示す説明図である。FIG. 47 is an explanatory diagram showing the conventional process of cutting out character regions from a character string region.

【図４８】従来の個々に切り出された文字の類似度を求
める説明図である。FIG. 48 is an explanatory diagram showing how the similarity of individually cut-out characters is obtained in the conventional device.

【符号の説明】[Explanation of symbols]

１画像入力部、２画像メモリ、３文字列切り出し
部、４、４ａ文字切り出し部、５、５ａ、５ｂ辞書
メモリ、６、６ａ第１の文字認識部、７、７ａ、７
ｂ、７ｃ、７ｄ，７ｅ、７ｆ、７ｇ第２の文字認識
部、１０画像、２０、３０画素濃度ヒストグラム、
２３文字列、４１〜４５矩形領域。1: image input unit; 2: image memory; 3: character string cutout unit; 4, 4a: character cutout unit; 5, 5a, 5b: dictionary memory; 6, 6a: first character recognition unit; 7, 7a, 7b, 7c, 7d, 7e, 7f, 7g: second character recognition unit; 10: image; 20, 30: pixel density histogram; 23: character string; 41 to 45: rectangular areas.

フロントページの続き (58)調査した分野(Int.Cl.⁷，ＤＢ名) G06K 9/00 - 9/82 Continuation of the front page (58) Fields surveyed (Int.Cl.⁷, DB name) G06K 9/00 - 9/82

Claims

(57)【特許請求の範囲】(57) [Claims]

【請求項１】認識対象文字を含む画像を入力する画像
入力部と、この画像入力部で入力した画像を文字列方向
に走査して生成した画素濃度ヒストグラムにより文字列
を切り出す文字列切り出し部と、この文字列切り出し部
で切り出した文字列画像を前記文字列と直角方向に走査
して生成した画素濃度ヒストグラムを用いて認識対象文
字を含む矩形領域を切り出す文字切り出し部と、この文
字切り出し部で切り出した前記矩形領域を認識対象文字
の基準パターンと比較することにより認識対象文字を認
識する第１の文字認識部と、この第１の文字認識部で認
識されなかった前記矩形領域と認識対象文字の基準パタ
ーンとをこの基準パターンの位置をずらしながら比較す
ることにより、認識対象文字の基準パターンと類似度の
高い文字を認識対象文字の候補として基準パターンの位
置と共に抽出し、この抽出された複数の認識対象文字の
候補をその位置の出現順に並べ替えることにより認識対
象文字を認識する第２の文字認識部と、を備え、前記第
２の文字認識部は、文字列方向に隣り合う文字間に重な
りや接合を生じる可能性の有る部分をマスキングした認
識対象文字の基準パターンを用いて認識対象文字を認識
することを特徴とする文字認識装置。1. A character recognition device comprising: an image input unit that inputs an image including characters to be recognized; a character string cutout unit that cuts out a character string using a pixel density histogram generated by scanning the image input by the image input unit in the character string direction; a character cutout unit that cuts out a rectangular area containing characters to be recognized using a pixel density histogram generated by scanning the character string image cut out by the character string cutout unit in the direction perpendicular to the character string; a first character recognition unit that recognizes a character to be recognized by comparing the rectangular area cut out by the character cutout unit with reference patterns of the characters to be recognized; and a second character recognition unit that compares the rectangular area not recognized by the first character recognition unit with the reference patterns of the characters to be recognized while shifting the position of each reference pattern, thereby extracting characters having a high degree of similarity to a reference pattern, together with the position of that reference pattern, as candidates for the characters to be recognized, and that recognizes the characters to be recognized by rearranging the extracted plurality of candidates in the order in which their positions appear, wherein the second character recognition unit recognizes the characters to be recognized using reference patterns in which the portions where overlap or joining may occur between characters adjacent in the character string direction are masked.
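
The second character recognition unit of claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the binary-grid representation, the function names `masked_similarity` and `recognize_by_sliding`, the agreement-ratio similarity measure, the threshold, and the 3x3 example patterns are all assumptions made for illustration. The sketch slides each masked reference pattern across an unrecognized rectangular area, keeps the positions where the similarity is high, and orders the candidates by position.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]  # binary image stored as rows of 0/1 (assumed representation)


def masked_similarity(region: Grid, template: Grid, mask: Grid, x: int) -> float:
    """Fraction of unmasked template pixels that agree with the region at offset x."""
    agree = total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            if mask[r][c] == 0:  # masked pixel (possible overlap or join): ignored
                continue
            total += 1
            agree += int(region[r][x + c] == value)
    return agree / total if total else 0.0


def recognize_by_sliding(
    region: Grid,
    templates: Dict[str, Grid],
    masks: Dict[str, Grid],
    threshold: float = 0.9,  # hypothetical value, not taken from the patent
) -> Tuple[str, List[Tuple[int, str, float]]]:
    """Shift every reference pattern over the region and sort the hits by position."""
    width = len(region[0])
    candidates = []
    for label, template in templates.items():
        for x in range(width - len(template[0]) + 1):
            score = masked_similarity(region, template, masks[label], x)
            if score >= threshold:
                candidates.append((x, label, score))
    candidates.sort(key=lambda hit: hit[0])  # order of appearance along the string
    return "".join(label for _, label, _ in candidates), candidates


# Hypothetical 3x3 reference patterns; in the mask, 1 means the pixel is compared.
TEMPLATES = {
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
MASKS = {
    "L": [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
    "T": [[1, 1, 1], [1, 1, 1], [0, 1, 0]],  # bottom corners may touch a neighbour
}
# "L" and "T" side by side, with one smeared pixel (row 2, column 3) joining them.
REGION = [
    [1, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 0],
]

text, hits = recognize_by_sliding(REGION, TEMPLATES, MASKS, threshold=1.0)
print(text)  # LT
```

In this example the unmasked "T" pattern agrees with the region on only 8 of 9 pixels at offset 3 because of the smeared pixel, whereas the masked pattern agrees on every compared pixel. A practical version would presumably use a lower threshold and suppress overlapping hits, which this sketch omits.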

【請求項２】認識対象文字を含む画像を入力する画像
入力部と、この画像入力部で入力した画像を文字列方向
に走査して生成した画素濃度ヒストグラムにより文字列
を切り出す文字列切り出し部と、この文字列切り出し部
で切り出した文字列画像を前記文字列と直角方向に走査
して生成した画素濃度ヒストグラムを用いて認識対象文
字を含む矩形領域を切り出す文字切り出し部と、この文
字切り出し部で切り出した前記矩形領域を認識対象文字
の基準パターンと比較することにより認識対象文字を認
識する第１の文字認識部と、前記認識対象文字の基準パ
ターンを文字列と直角方向で矩形領域の上下に連なる線
要素の存在の有無で分類し、前記第１の文字認識部で認
識されなかった前記矩形領域と前記線要素の存在しない
認識対象文字の基準パターンとをこの基準パターンの位
置をずらしながら比較することにより、認識対象文字の
基準パターンと類似度の高い文字を認識対象文字の候補
として基準パターンの位置と共に抽出するとともに、前
記第１の文字認識部で認識されなかった矩形領域の文字
列と前記前記線要素の存在する認識対象文字の基準パタ
ーンとを重ね合わせることにより、認識対象文字の基準
パターンと類似度の高い文字を認識対象文字の候補とし
て基準パターンの位置と共に抽出し、この抽出された複
数の認識対象文字の候補をその位置の出現順に並べ替え
ることにより認識対象文字を認識する第２の文字認識部
と、を備えたことを特徴とする文字認識装置。2. A character recognition device comprising: an image input unit that inputs an image including characters to be recognized; a character string cutout unit that cuts out a character string using a pixel density histogram generated by scanning the image input by the image input unit in the character string direction; a character cutout unit that cuts out a rectangular area containing characters to be recognized using a pixel density histogram generated by scanning the character string image cut out by the character string cutout unit in the direction perpendicular to the character string; a first character recognition unit that recognizes a character to be recognized by comparing the rectangular area cut out by the character cutout unit with reference patterns of the characters to be recognized; and a second character recognition unit that classifies the reference patterns of the characters to be recognized according to the presence or absence of a line element running continuously from the top to the bottom of the rectangular area in the direction perpendicular to the character string, that compares the rectangular area not recognized by the first character recognition unit with the reference patterns having no such line element while shifting the position of each reference pattern, thereby extracting characters having a high degree of similarity to a reference pattern, together with the position of that reference pattern, as candidates for the characters to be recognized, that superimposes the reference patterns having such a line element on the character string in the rectangular area not recognized by the first character recognition unit, thereby extracting characters having a high degree of similarity to a reference pattern, together with the position of that reference pattern, as candidates for the characters to be recognized, and that recognizes the characters to be recognized by rearranging the extracted plurality of candidates in the order in which their positions appear.
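
The classification step of claim 2 can be sketched as follows. This is a minimal illustration under stated assumptions: reference patterns are binary grids, a "line element running from top to bottom" is taken to mean a column whose pixels are all set, and the names `has_vertical_stroke`, `split_by_vertical_stroke` and the example patterns are hypothetical.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]  # binary reference pattern stored as rows of 0/1 (assumed)


def has_vertical_stroke(template: Grid) -> bool:
    """True if some column is set in every row, i.e. a stroke spans top to bottom."""
    height = len(template)
    width = len(template[0])
    return any(sum(row[c] for row in template) == height for c in range(width))


def split_by_vertical_stroke(
    templates: Dict[str, Grid],
) -> Tuple[Dict[str, Grid], Dict[str, Grid]]:
    """Split reference patterns into (with stroke, without stroke) groups."""
    with_stroke = {k: t for k, t in templates.items() if has_vertical_stroke(t)}
    without_stroke = {k: t for k, t in templates.items() if not has_vertical_stroke(t)}
    return with_stroke, without_stroke


# Hypothetical 3x3 reference patterns used only for illustration.
EXAMPLE_TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
    "X": [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
    "-": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}

with_stroke, without_stroke = split_by_vertical_stroke(EXAMPLE_TEMPLATES)
print(sorted(with_stroke), sorted(without_stroke))
```

Under the claim, the group without a stroke would then be matched while shifting the pattern position, and the group with a stroke would be matched by superimposing the pattern on the character string; the matching itself is not shown here.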

【請求項３】認識対象文字を含む画像を入力する画像
入力部と、この画像入力部で入力した画像を文字列方向
に走査して生成した画素濃度ヒストグラムにより文字列
を切り出す文字列切り出し部と、この文字列切り出し部
で切り出した文字列画像を前記文字列と直角方向に走査
して生成した画素濃度ヒストグラムを用いて認識対象文
字を含む矩形領域を切り出す文字切り出し部と、この文
字切り出し部で切り出した前記矩形領域を認識対象文字
の基準パターンと比較することにより認識対象文字を認
識する第１の文字認識部と、この第１の文字認識部で認
識されなかった前記矩形領域の一方の端に認識対象文字
の基準パターンを重ね合わせて認識対象文字を認識し、
更にこの認識された認識対象文字を前記矩形領域から除
いた残りの矩形領域の前記矩形領域の一方の端と同じ側
の端に認識対象文字の基準パターンを重ね合わせること
により認識対象文字を順次認識する第２の文字認識部
と、を備え、前記第２の文字認識部は、認識すべき矩形
領域の一方の端に文字列と直角方向に走査して生成した
画素濃度ヒストグラムの高い位置が存在するときは、認
識対象文字の基準パターンの内で文字列と直角方向で前
記矩形領域の上下に連なる線要素の存在する認識対象文
字の基準パターンを重ね合わせて認識対象文字を認識
し、認識すべき矩形領域の一方の端に文字列と直角方向
に走査して生成した画素濃度ヒストグラムの高い位置が
存在しないときは、認識対象文字の基準パターンの内で
文字列と直角方向で前記矩形領域の上下に連なる線要素
の存在しない認識対象文字の基準パターンを重ね合わせ
て認識対象文字を認識することを特徴とする文字認識装
置。3. A character recognition device comprising: an image input unit that inputs an image including characters to be recognized; a character string cutout unit that cuts out a character string using a pixel density histogram generated by scanning the image input by the image input unit in the character string direction; a character cutout unit that cuts out a rectangular area containing characters to be recognized using a pixel density histogram generated by scanning the character string image cut out by the character string cutout unit in the direction perpendicular to the character string; a first character recognition unit that recognizes a character to be recognized by comparing the rectangular area cut out by the character cutout unit with reference patterns of the characters to be recognized; and a second character recognition unit that recognizes a character to be recognized by superimposing a reference pattern on one end of the rectangular area not recognized by the first character recognition unit, and that sequentially recognizes further characters by superimposing a reference pattern on the end, on the same side as said one end, of the remaining rectangular area obtained by removing the recognized character from the rectangular area, wherein, when a high position of the pixel density histogram generated by scanning in the direction perpendicular to the character string exists at the one end of the rectangular area to be recognized, the second character recognition unit recognizes the character by superimposing those reference patterns that have a line element running continuously from the top to the bottom of the rectangular area in the direction perpendicular to the character string, and, when no such high position exists at the one end of the rectangular area to be recognized, the second character recognition unit recognizes the character by superimposing those reference patterns that have no such line element.
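
The sequential procedure of claim 3 can be sketched in Python as below. The sketch rests on several assumptions that are not stated in the claim: the "one end" is the left end, the end is examined over a window of `edge_width` columns, a "high position" means a histogram value equal to the full height, similarity is the fraction of agreeing pixels, and recognition stops when the best score falls below `min_score`. All names and example patterns are hypothetical.

```python
from typing import Dict, List

Grid = List[List[int]]  # binary image stored as rows of 0/1 (assumed representation)


def column_histogram(image: Grid) -> List[int]:
    """Pixel count per column, i.e. a scan perpendicular to the character string."""
    return [sum(row[c] for row in image) for c in range(len(image[0]))]


def has_vertical_stroke(template: Grid) -> bool:
    """True if some column of the pattern is set from top to bottom."""
    return max(column_histogram(template)) == len(template)


def similarity(region: Grid, template: Grid, x: int) -> float:
    """Fraction of template pixels that agree with the region at offset x."""
    cells = [(r, c) for r in range(len(template)) for c in range(len(template[0]))]
    agree = sum(region[r][x + c] == template[r][c] for r, c in cells)
    return agree / len(cells)


def recognize_from_left(
    region: Grid,
    templates: Dict[str, Grid],
    edge_width: int = 3,     # assumed width of the "end" window
    min_score: float = 0.8,  # assumed acceptance threshold
) -> str:
    height, width = len(region), len(region[0])
    hist = column_histogram(region)
    x, labels = 0, []
    while x < width:
        # Choose the pattern group from the histogram at the current left end.
        edge_is_high = max(hist[x:x + edge_width]) == height
        group = {
            label: t for label, t in templates.items()
            if has_vertical_stroke(t) == edge_is_high and x + len(t[0]) <= width
        }
        if not group:
            break
        best = max(group, key=lambda label: similarity(region, group[label], x))
        if similarity(region, group[best], x) < min_score:
            break
        labels.append(best)
        x += len(group[best][0])  # remove the recognized character and continue
    return "".join(labels)


# Hypothetical 3x3 reference patterns and a joined "XI" area with one smeared pixel.
PATTERNS = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "X": [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
    "-": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}
JOINED = [
    [1, 0, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 0],
]

print(recognize_from_left(JOINED, PATTERNS))  # XI
```

Restricting each step to one pattern group reduces the number of comparisons per position, which appears to be the purpose of the histogram test in the claim.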