JPH076208A - Character recognition device - Google Patents

Character recognition device

Info

Publication number
JPH076208A
JPH076208A JP5147976A JP14797693A JPH076208A JP H076208 A JPH076208 A JP H076208A JP 5147976 A JP5147976 A JP 5147976A JP 14797693 A JP14797693 A JP 14797693A JP H076208 A JPH076208 A JP H076208A
Authority
JP
Japan
Prior art keywords
character
unit
feature amount
stored
recognition target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP5147976A
Other languages
Japanese (ja)
Inventor
Yoshitake Tsuji
善丈 辻
Daisuke Nishiwaki
大輔 西脇
Makoto Uchida
誠 内田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to JP5147976A
Publication of JPH076208A

Landscapes

  • Character Discrimination (AREA)

Abstract

PURPOSE: To identify more accurately similar characters whose shapes differ only locally. CONSTITUTION: For each character type, a distance calculation unit 6 multiplies the difference between each feature amount of that character type and the corresponding feature amount of the recognition target character, obtained by a feature extraction unit 4, by a weighting coefficient stored in a weighting coefficient storage unit 7, and sums the resulting products to obtain a distance value. A determination unit 8 outputs the character type code of the character type with the smallest distance value as the character recognition result.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character recognition device that recognizes input characters.

[0002]

[Prior Art] A conventional character recognition device calculates the distance between the feature amounts obtained from the recognition target character and the features of each character type stored in advance in a dictionary, and outputs the character type code with the smallest distance (see, for example, Meguro et al., "Multi-font printed kanji recognition experimental device," Technical Report of the Institute of Electronics and Communication Engineers of Japan, Pattern Recognition and Learning, PRL82-46).

[0003]

[Problems to Be Solved by the Invention] However, when the character type at the minimum distance between the feature amounts of the recognition target character and the features stored in the dictionary is taken as the recognition result, characters whose shapes differ only locally, such as "テ" and "ラ", are difficult to tell apart: the distance produced by an accumulation of minute differences becomes close to the distance produced by the local difference.
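The difficulty can be illustrated with a small worked equation. The symbols G_j (feature amounts of the input pattern), F_ij (stored features of character type i), and D_i (distance) follow the notation used later in the description. The symbols k, ε, and δ are illustrative and do not appear in the original, and the conventional distance is assumed here to be an unweighted sum of absolute differences.

```latex
% Assumed conventional (unweighted) distance to character type i
D_i = \sum_{j} \lvert G_j - F_{ij} \rvert

% Correct type c: a small deviation \varepsilon in each of k features
D_c = k\,\varepsilon

% Similar type s: identical to the input except for one local feature deviating by \delta
D_s = \delta

% The two types cannot be separated reliably when
k\,\varepsilon \approx \delta \quad\Longrightarrow\quad D_c \approx D_s
```

Under these assumptions, eight deviations of 1 produce the same distance as a single local deviation of 8, so the minimum-distance rule has no basis for preferring either character type.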

[0004] An object of the present invention is to provide a character recognition device that can recognize characters more accurately even when their shapes differ only locally.

[0005]

[Means for Solving the Problems] The character recognition device of the present invention comprises:
an image input unit that inputs an image containing recognition target characters;
an image memory that stores the image input by the image input unit;
a character cutout unit that cuts out, as a rectangle, each character to be recognized from the image stored in the image memory;
a feature extraction unit that extracts the feature amounts required for recognition from the pattern of the recognition target character obtained by the character cutout unit;
a dictionary unit that stores standard feature amounts for each recognition target character type;
a weighting coefficient storage unit that stores a weighting coefficient for each standard feature amount of each recognition target character stored in the dictionary unit;
a distance calculation unit that, for every recognition target character type whose standard feature amounts are stored in the dictionary unit, calculates a distance value by multiplying the absolute value of the difference between each feature amount of the recognition target character pattern obtained by the feature extraction unit and the corresponding standard feature amount stored in the dictionary unit by the weighting coefficient stored in the weighting coefficient storage unit for that standard feature amount, and adding the products over all feature amounts of the recognition target character pattern; and
a determination unit that outputs, as the character recognition result, the character type code of the recognition target character type corresponding to the smallest of the distance values calculated by the distance calculation unit.

[0006]

[Operation] When the distance between the feature amounts extracted from the recognition target character and the standard feature amounts of each recognition target character type stored in the dictionary unit is calculated, the difference for each feature amount is multiplied by a weighting coefficient chosen to emphasize local differences. This allows similar characters whose shapes differ only locally to be identified more accurately. Note that Japanese Patent Laid-Open No. 62-55783 normalizes the distance obtained by a distance calculation unit using a weighting coefficient assigned to each character, and thus differs from the present invention.

[0007]

[Embodiment] An embodiment of the present invention will now be described with reference to the drawings.

[0008] FIG. 1 is a block diagram of a character recognition device according to an embodiment of the present invention, FIG. 2 shows an example of an input image, FIG. 3 shows an example of a recognition target character cut out by the character cutout unit 3, FIG. 4 shows the features of a character, and FIG. 5 shows the contents of the weighting coefficient storage unit 7.

[0009] As shown in FIG. 1, the character recognition device of this embodiment comprises an image input unit 1, an image memory 2, a character cutout unit 3, a feature extraction unit 4, a dictionary unit 5, a distance calculation unit 6, a weighting coefficient storage unit 7, a determination unit 8, a control unit 9, and a data bus 10.

[0010] The image input unit 1 inputs an image containing recognition target characters, such as the one shown in FIG. 2, and stores it in the image memory 2. The character cutout unit 3 binarizes each character to be recognized in the image stored in the image memory 2 and cuts it out as a rectangle; for the digit "2", for example, the result is as shown in FIG. 3. The feature extraction unit 4 obtains the feature amounts required for recognition from the character pattern obtained by the character cutout unit 3. Here, as shown in FIG. 4(2), the feature amounts are the lengths of the line segments in each of the eight directions A0, A1, A2, ..., A7. As shown in FIG. 4(1), the character pattern of FIG. 3 consists of a first concave and a second concave (or a first convex and a second convex). The first concave starts with a line segment in direction A1, followed by line segments in directions A0, A7, A6, and A5, with lengths 2, 4, 2, 2, and 12, respectively. The feature amounts G_j (j = 1, 2, ..., 8) of the first concave are therefore as shown in Table 1. The feature amounts of the second concave are obtained in the same way.
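The construction of the direction features can be sketched in Python. This is only an illustration under stated assumptions, not the patented extraction method: the function name `direction_features` is hypothetical, directions that do not occur in a concave are assumed to contribute 0, lengths of repeated directions are assumed to be summed, and the vector is assumed to be ordered A0 to A7. Tracing the outline of the character to obtain the segments is outside the scope of the sketch.

```python
from typing import Iterable, List, Tuple

NUM_DIRECTIONS = 8  # directions A0 .. A7, as in FIG. 4(2)


def direction_features(segments: Iterable[Tuple[int, int]]) -> List[int]:
    """Accumulate segment lengths per direction index (0 for A0, ..., 7 for A7).

    Assumption: directions that never occur keep the value 0, and lengths of
    segments sharing a direction are added together.
    """
    features = [0] * NUM_DIRECTIONS
    for direction, length in segments:
        if not 0 <= direction < NUM_DIRECTIONS:
            raise ValueError(f"direction must be in 0..7, got {direction}")
        features[direction] += length
    return features


# First concave of the pattern in FIG. 3, as described in the text:
# segments in directions A1, A0, A7, A6, A5 with lengths 2, 4, 2, 2, 12.
first_concave = [(1, 2), (0, 4), (7, 2), (6, 2), (5, 12)]
print(direction_features(first_concave))  # [4, 2, 0, 0, 0, 12, 2, 2]
```

A fixed-length vector indexed by direction makes the later comparison with the dictionary a simple element-by-element operation.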

[0011]

[Table 1]

The dictionary unit 5 stores standard feature amounts F_ij for each of the directions A0, A1, ..., A7 for each recognition target character type C1, C2, ..., Cn. As shown in FIG. 5, the weighting coefficient storage unit 7 stores weighting coefficients α_11, α_12, ..., α_1m, α_22, ..., α_33, ..., α_ij, ..., α_nm for the standard feature amounts F_ij of the directions A0, A1, ..., A7 of the recognition target character types C1, C2, ..., Cn. The distance calculation unit 6 obtains the distance D_i from the feature amounts G_j of the recognition target character pattern obtained by the feature extraction unit 4, the feature amounts F_ij of the character type C_i stored in the dictionary unit 5, and the weighting coefficients α_ij stored in the weighting coefficient storage unit 7, using the following equation.

[0012]

[Equation 1]

$$D_i = \sum_{j=1}^{m} \alpha_{ij} \, \lvert G_j - F_{ij} \rvert$$

The determination unit 8 outputs, as the recognition result, the code of the character type corresponding to the smallest of the distances D1, D2, ..., Dn obtained by the distance calculation unit 6.
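The distance calculation and the minimum-distance decision can be sketched in Python as follows. This is a sketch, not the device's implementation: the function names `weighted_distance` and `classify`, the dictionary-based storage, and the three-element example data are all hypothetical. The only part taken from the description is the rule itself: multiply each absolute difference by its weighting coefficient, sum the products, and output the character type with the smallest sum.

```python
from typing import Dict, Sequence


def weighted_distance(g: Sequence[float], f: Sequence[float],
                      alpha: Sequence[float]) -> float:
    """Sum of alpha_j * |g_j - f_j| over all feature amounts."""
    if not (len(g) == len(f) == len(alpha)):
        raise ValueError("feature, standard and weight vectors must have equal length")
    return sum(a * abs(gj - fj) for gj, fj, a in zip(g, f, alpha))


def classify(g: Sequence[float],
             dictionary: Dict[str, Sequence[float]],
             weights: Dict[str, Sequence[float]]) -> str:
    """Return the character type code with the smallest weighted distance."""
    distances = {code: weighted_distance(g, dictionary[code], weights[code])
                 for code in dictionary}
    return min(distances, key=distances.get)


# Hypothetical data, for illustration only.
dictionary = {"X": [1, 5, 0], "Y": [3, 5, 0]}
weights = {"X": [1, 1, 1], "Y": [1, 1, 1]}
print(classify([1, 4, 0], dictionary, weights))  # X
```

In this sketch a tie is resolved in favor of the character type that appears first in `dictionary`, because `min` returns the first minimal item; the description does not say how the determination unit handles ties.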

[0013] Next, the operation of this embodiment is described with reference to Tables 2 and 3 for the case where the character cutout unit 3 has obtained a character pattern such as the one shown in FIG. 6.

[0014]

[Table 2]

[0015]

[Table 3]

As shown in Tables 2 and 3, the feature amounts of the character pattern shown in FIG. 6 are as follows for directions A0, A1, A2, A3, A4, A5, A6, A7:
first concave: 5, 0, 0, 0, 0, 12, 0, 0
first convex: 2, 12, 0, 0, 10, 0, 0, 0

The standard feature amounts of the character type "テ" are, as shown in Table 2:
first concave: 3, 0, 0, 0, 0, 13, 0, 0
first convex: 5, 14, 0, 0, 10, 0, 0, 0

The standard feature amounts of the character type "ラ" are, as shown in Table 3:
first concave: 8, 0, 0, 0, 0, 13, 0, 0
first convex: 0, 14, 0, 0, 10, 0, 0, 0

The differences (distances) between corresponding feature amounts G_j and standard feature amounts F_ij are obtained and added. For the character type "テ", the sum is 8 (see Table 2). For the character type "ラ", the difference of 2 between the feature amount and the standard feature amount in direction A0 of the first convex is multiplied by the weighting coefficient 2 to give 4, so the total distance is 10 (see Table 3). Since the total distance is smaller for "テ", the determination unit 8 outputs the character type code of "テ" as the character recognition result.
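The arithmetic of this example can be checked with a short Python sketch. The feature values are those given in the text. Two things are assumptions: the first concave and first convex are concatenated into one 16-element vector, and every weighting coefficient other than the one stated (2 for direction A0 of the first convex of "ラ") is taken to be 1, which is consistent with the totals of 8 and 10 reported above. The names `G`, `F`, `ALPHA`, and `distance` are hypothetical.

```python
# Feature amounts of the input pattern (FIG. 6): first concave A0..A7, then first convex A0..A7.
G = [5, 0, 0, 0, 0, 12, 0, 0] + [2, 12, 0, 0, 10, 0, 0, 0]

# Standard feature amounts from the text (Tables 2 and 3), same layout.
F = {
    "テ": [3, 0, 0, 0, 0, 13, 0, 0] + [5, 14, 0, 0, 10, 0, 0, 0],
    "ラ": [8, 0, 0, 0, 0, 13, 0, 0] + [0, 14, 0, 0, 10, 0, 0, 0],
}

# Assumed weights: all 1 except the coefficient 2 for first convex A0 of "ラ".
ALPHA = {
    "テ": [1] * 16,
    "ラ": [1] * 8 + [2] + [1] * 7,
}


def distance(g, f, alpha):
    """Weighted sum of absolute differences between two feature vectors."""
    return sum(a * abs(x - y) for x, y, a in zip(g, f, alpha))


unweighted = {c: distance(G, F[c], [1] * 16) for c in F}
weighted = {c: distance(G, F[c], ALPHA[c]) for c in F}
best = min(weighted, key=weighted.get)

print(unweighted)  # {'テ': 8, 'ラ': 8}
print(weighted)    # {'テ': 8, 'ラ': 10}
print(best)        # テ
```

With all weights equal to 1, both character types come out at a distance of 8 and cannot be separated. Doubling the weight on the one feature where "テ" and "ラ" differ locally raises the distance of "ラ" to 10 and resolves the tie.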

[0016]

[Effect of the Invention] As described above, when calculating the distance between the feature amounts extracted from the recognition target character and the standard feature amounts of each recognition target character type stored in the dictionary unit, the present invention multiplies the difference for each feature amount by a weighting coefficient chosen to emphasize local differences. This has the effect that similar characters whose shapes differ only locally can be identified more accurately.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a character recognition device according to an embodiment of the present invention.

FIG. 2 is a diagram showing an example of an input image.

FIG. 3 is a diagram showing an example of a recognition target character cut out by the character cutout unit 3.

FIG. 4 is a diagram illustrating the feature amounts of a character.

FIG. 5 is a diagram showing the contents of the weighting coefficient storage unit 7.

FIG. 6 is a diagram showing an example of a cut-out character pattern.

[Explanation of Reference Numerals]

1 Image input unit
2 Image memory
3 Character cutout unit
4 Feature extraction unit
5 Dictionary unit
6 Distance calculation unit
7 Weighting coefficient storage unit
8 Determination unit
9 Control unit
10 Data bus

─────────────────────────────────────────────────────

[Procedural Amendment]

[Submission Date] November 16, 1993

[Procedural Amendment 1]

[Name of Document to Be Amended] Specification

[Name of Item to Be Amended] Claims

[Method of Amendment] Change

[Contents of Amendment]

[Claims]

Claims (1)

[Claim 1] A character recognition device comprising:
an image input unit that inputs an image containing recognition target characters;
an image memory that stores the image input by the image input unit;
a character cutout unit that cuts out, as a rectangle, each character to be recognized from the image stored in the image memory;
a feature extraction unit that extracts the feature amounts required for recognition from the pattern of the recognition target character obtained by the character cutout unit;
a dictionary unit that stores standard feature amounts for each recognition target character type;
a weighting coefficient storage unit that stores a weighting coefficient for each standard feature amount of each recognition target character stored in the dictionary unit;
a distance calculation unit that, for every recognition target character type whose standard feature amounts are stored in the dictionary unit, calculates a distance value by multiplying the absolute value of the difference between each feature amount of the recognition target character pattern obtained by the feature extraction unit and the corresponding standard feature amount stored in the dictionary unit by the weighting coefficient stored in the weighting coefficient storage unit for that standard feature amount, and adding the products over all feature amounts of the recognition target character pattern; and
a determination unit that outputs, as the character recognition result, the character type code of the recognition target character type corresponding to the smallest of the distance values calculated by the distance calculation unit.
JP5147976A 1993-06-18 1993-06-18 Character recognition device Pending JPH076208A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP5147976A JPH076208A (en) 1993-06-18 1993-06-18 Character recognition device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP5147976A JPH076208A (en) 1993-06-18 1993-06-18 Character recognition device

Publications (1)

Publication Number Publication Date
JPH076208A true JPH076208A (en) 1995-01-10

Family

ID=15442360

Family Applications (1)

Application Number Title Priority Date Filing Date
JP5147976A Pending JPH076208A (en) 1993-06-18 1993-06-18 Character recognition device

Country Status (1)

Country Link
JP (1) JPH076208A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS51151031A (en) * 1975-06-20 1976-12-25 Ricoh Co Ltd Character reader unit
JPS57153387A (en) * 1981-03-19 1982-09-21 Ricoh Co Ltd Character recognizing method
JPS6175982A (en) * 1984-09-21 1986-04-18 Nec Corp Pattern recognizer
JPS61133000A (en) * 1984-12-01 1986-06-20 日本電気株式会社 Distance calculation circuit
JPS63225883A (en) * 1987-03-13 1988-09-20 Matsushita Electric Ind Co Ltd Character recognition device

Similar Documents

Publication Publication Date Title
US5664027A (en) Methods and apparatus for inferring orientation of lines of text
JP2000089786A (en) Method for correcting speech recognition result and apparatus therefor
JPH076208A (en) Character recognition device
JP2675303B2 (en) Character recognition method
JP3236732B2 (en) Character recognition device
JP3930174B2 (en) Character recognition method and character recognition device
JPH09274645A (en) Method and device for recognizing character
JPH08161426A (en) Handwritten character stroke segmenting device
JPH07271921A (en) Character recognizing device and method thereof
JPH08101880A (en) Character recognition device
JP2963474B2 (en) Similar character identification method
JP2755738B2 (en) Character recognition device
JPH0950488A (en) Method for reading different size characters coexisting character string
JP2922949B2 (en) Post-processing method for character recognition
JPS63150788A (en) Character recognition device
JP3345469B2 (en) Word spacing calculation method, word spacing calculation device, character reading method, character reading device
JP3245241B2 (en) Character recognition apparatus and method
JP2658154B2 (en) Character identification method
JPS6240579A (en) Translating device
JPS6240574A (en) Word processor
JPH0318987A (en) Dictionary registering method
JPH07254048A (en) Character recognition method
JPH0765128A (en) Method for generating dictionary for type character recognition
JP3074981B2 (en) Character segmentation device
JP2658153B2 (en) Character identification method