JPH0113583B2 - Character recognizing system - Google Patents

Info

Publication number
JPH0113583B2
Authority
JP
Japan
Prior art keywords
horizontal
character
vertical
features
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
JP56034243A
Other languages
Japanese (ja)
Other versions
JPS57147783A (en)
Inventor
Eiichiro Yamamoto
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP56034243A
Publication of JPS57147783A
Publication of JPH0113583B2
Granted

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/18Extraction of features or characteristics of the image
    • G06V30/184Extraction of features or characteristics of the image by analysing segments intersecting the pattern
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a character recognition system that scans a character in two directions, vertically and horizontally, extracts the regions (background portions) enclosed by the character lines (strokes), and creates a feature pattern of the character.

In character recognition, methods that extract features of the character strokes and methods that encode the background portion of the character are both known. Conventional methods of the latter kind examine, at each point of the background, the directions in which character line segments lie. As a result, in FIG. 3 (a) and (b), for example, a different code label ends up attached to a background portion of (b) even though both are the same character. FIG. 3 (a) and (b) both take the kanji "文" as an example. Codes are assigned according to criteria such as: line segments exist to the right and below; line segments exist to the left and below; line segments exist above, below, to the left, and to the right. In case (a), labels are attached to the background portions shown in the figure. In case (b), however, part of a diagonal stroke is elongated because of deformation in handwriting, which produces a background portion with a line segment below it that does not exist in (a), and a label is attached to this portion. Such a difference in features works against recognizing (a) and (b) as the same character. In general, when complex character shapes such as kanji are targeted, differences of this kind arise frequently depending on the length of the character lines, and it becomes difficult to build discrimination logic for handwritten characters. In addition, complex character shapes require many kinds of labels, which makes processing cumbersome.

Alternatively, if a histogram of the portions bearing the same code is built and used as the feature for character recognition, different features are extracted when the character is deformed, as in FIG. 3 (c) and (d). In this example both are the kanji "文", but in (c) the region A1 enclosed by the central line segments is drawn small, whereas in (d) the corresponding region A2 is drawn large. Both regions fall under the same classification criterion and therefore receive the same label. However, because A1 and A2 differ in area, counting the number of labels yields different histograms H1 and H2, which makes character recognition difficult. (The image signal of one character consists of, for example, 50 × 50 bits; each bit is written to one memory cell, and a label is attached to each bit.)

The present invention provides a character recognition system that adopts a feature extraction method which is little affected by the deformation of handwritten characters and is easy to handle. The character recognition system of the present invention extracts features of input character image information, matches the extracted feature pattern against dictionary patterns prepared in advance, and determines the input character from the matching result. It is characterized in that the background portion of the character strokes is divided into portions enclosed by the strokes above and below, portions enclosed on the left and right, and portions enclosed above, below, left, and right; a different label is given to each kind of portion; the number of times the horizontal and vertical scanning lines cross each label region is counted; and the resulting histogram is used as the feature pattern of the input character. This is described in detail below with reference to the illustrated embodiment.
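The contrast between a per-pixel label histogram and a per-line crossing count can be illustrated with a toy case. The following is only a sketch and is not taken from the patent: the square-ring test shapes and the helper names `make_box`, `labelled_pixel_count`, and `crossings` are hypothetical, and '#' and '.' stand for black and white pixels.

```python
def make_box(inner):
    """Square ring of black '#' pixels around an inner x inner white '.' region."""
    size = inner + 2
    middle = ["#" + "." * inner + "#" for _ in range(inner)]
    return ["#" * size] + middle + ["#" * size]


def labelled_pixel_count(grid):
    # Every white pixel of these shapes lies inside the ring, so all of them
    # would receive the same label in a per-pixel labelling scheme.
    return sum(row.count(".") for row in grid)


def crossings(row):
    # Number of white runs bounded by black on both sides along one scan line.
    return sum(1 for run in row.strip(".").split("#") if run)


small, large = make_box(2), make_box(6)

# Per-pixel label counts grow with the area of the enclosed region.
pixel_counts = (labelled_pixel_count(small), labelled_pixel_count(large))

# A scan line through the enclosed region crosses it once in both cases.
crossing_counts = (crossings(small[1]), crossings(large[1]))

print(pixel_counts, crossing_counts)
```

In this sketch the label count differs by a factor of nine between the two shapes, while the crossing count on a line through the region is the same. The number of scan lines that meet the region still depends on its size, so the example suggests reduced sensitivity rather than strict invariance.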

FIG. 1 shows an embodiment of the present invention. In the figure, 1 is an observation unit, 2 is a video memory, 3 is a vertical scanning address counter, 4 is a horizontal scanning address counter, 5 is a horizontal start point detection unit, 6 is a horizontal end point detection unit, 7 is a vertical start point detection unit, 8 is a vertical end point detection unit, 9 is a horizontal start point register, 10 is a horizontal end point register, 11 is a vertical start point register, 12 is a vertical end point register, 13 to 16 are comparators, 17 and 18 are flip-flops, 19 is a horizontal plane feature memory, 20 is a vertical plane feature memory, 21 to 23 are AND circuits, 24 is a horizontal address counter, 25 is a vertical address counter, 26 is a horizontal/vertical enclosure feature counting unit, 27 is a horizontal enclosure feature counting unit, 28 is a vertical enclosure feature counting unit, and 29 is a matching unit.

The observation unit 1 converts optical image information from a form into electrical image information, and its output image information is stored in the video memory 2. The vertical scanning address counter 3 is used to scan the video memory 2 in the vertical direction, and the horizontal scanning address counter 4 is used to scan it in the horizontal direction. The horizontal start point detection unit 5 detects a transition from "black" to "white" in a horizontal scan (character portions are called "black" and background portions "white"). The horizontal end point detection unit 6 detects a transition from white to black that occurs after the first horizontal start point has been detected in that scan. This is explained with reference to FIG. 2. In the figure, 31 is the image stored in the video memory 2. For a given horizontal scan 32, the black-to-white transition points are as shown at 33, and the white-to-black transition points after detection of the first horizontal start point are as shown at 34. The horizontal start point register 9 and the horizontal end point register 10 are set at the timing of horizontal end point detection, so the values (position information) marked by black dots in signals 33 and 34 are set in registers 9 and 10.
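The start and end point detection on one scan line can be sketched in software as follows. This is a hypothetical illustration, not the patent's circuit: the function name `scan_transitions`, the encoding (1 = black, 0 = white), and the choice of recording the index of the pixel just after each transition are all assumptions.

```python
def scan_transitions(row):
    """Return (starts, ends) for one scan line of 0/1 pixels (1 = black).

    starts: indices where the line changes from black to white.
    ends:   indices where it changes from white to black, counted only
            after the first start point has been seen.
    """
    starts, ends = [], []
    for x in range(1, len(row)):
        previous, current = row[x - 1], row[x]
        if previous == 1 and current == 0:
            starts.append(x)
        elif previous == 0 and current == 1 and starts:
            ends.append(x)
    return starts, ends


row = [0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0]
starts, ends = scan_transitions(row)
print(starts, ends)  # [4, 8, 11] [7, 9]
```

The white-to-black change at index 2 is ignored because no start point precedes it, and the last start point at index 11 has no matching end point, so that trailing white run is not enclosed by strokes on both sides.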

The vertical start point detection unit 7, the vertical end point detection unit 8, the vertical start point register 11, and the vertical end point register 12 operate in the same way as their horizontal counterparts, so the following description mainly takes horizontal scanning as an example. When one horizontal scan ends, the horizontal start point and end point values (X addresses) for that scan row are held in the horizontal start point register 9 and the horizontal end point register 10. The horizontal address counter 24 then starts counting. The comparator 13 turns on at the point where the count matches the value in the horizontal start point register 9, and the comparator 14 turns on at the point where it matches the value in the horizontal end point register 10. The flip-flop 17 is on from the time the comparator 13 turns on until the time the comparator 14 turns on. The output of the flip-flop 17 is written into the horizontal plane feature memory 19 at the corresponding address, so the contents of the memory become as shown at 19a in FIG. 2. The many horizontal line segments in that figure (in the memory, for example, runs of cells storing "1") represent the spans between the start and end points (between character strokes) obtained as described above. Spans that have only an end point or only a start point are written as "0" in the memory 19 in this example. The same applies to vertical scanning: the output of the flip-flop 18 is written into the vertical plane feature memory 20, and its contents become as shown at 20a in FIG. 2.

The AND gate 21 receives the read outputs 19a and 20a of the memories 19 and 20 and outputs their logical product. The two memories are read by applying the same address signal to both at the same time, so the contents of the memory cells at the same address enter the gate 21 together, and the output of the gate is the cross-hatched portion 38 marked with both vertical and horizontal lines in FIG. 2. The AND gate 22 receives the output of the memory 19 and the output of the memory 20 inverted by the inverter 22a, and therefore produces an output indicating portions that have horizontal enclosure only, with no vertical enclosure. In this example there is no such output. The AND gate 23 receives the output of the memory 20 and the output of the memory 19 inverted by the inverter 23b, and produces an output indicating the portion 37 that has vertical enclosure only, with no horizontal enclosure.
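The two feature planes and the three gate outputs can be mimicked in software. The sketch below is an assumption-laden illustration rather than the patent's hardware: the names `enclosed_runs` and `feature_planes` are hypothetical, images are lists of 0/1 rows (1 = black), and every white run bounded by black on both sides is marked, which is one possible reading of the start and end point logic.

```python
def enclosed_runs(line):
    """Mark with 1 the white pixels lying between a start point and the next end point."""
    marks = [0] * len(line)
    start = None
    for x in range(1, len(line)):
        if line[x - 1] == 1 and line[x] == 0:
            start = x                      # black-to-white: start point
        elif line[x - 1] == 0 and line[x] == 1 and start is not None:
            for i in range(start, x):      # white-to-black: end point
                marks[i] = 1
            start = None
    return marks


def feature_planes(image):
    """Return (both, h_only, v_only) planes, analogous to gates 21, 22, and 23."""
    h_plane = [enclosed_runs(row) for row in image]             # left/right enclosed
    v_columns = [enclosed_runs(list(col)) for col in zip(*image)]
    v_plane = [list(row) for row in zip(*v_columns)]            # top/bottom enclosed
    pairs = [list(zip(hr, vr)) for hr, vr in zip(h_plane, v_plane)]
    both = [[h & v for h, v in row] for row in pairs]
    h_only = [[h & (1 - v) for h, v in row] for row in pairs]
    v_only = [[v & (1 - h) for h, v in row] for row in pairs]
    return both, h_only, v_only


# Two horizontal bars: the gap is enclosed above and below but not left and right.
bars = [[1, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 1, 1, 1]]
both, h_only, v_only = feature_planes(bars)
print(v_only)  # [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
```

For a closed ring of black pixels, the same function would place the interior in the `both` plane and leave the other two planes empty.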

The horizontal/vertical enclosure feature counting unit 26, the horizontal enclosure feature counting unit 27, and the vertical enclosure feature counting unit 28 monitor the outputs 38, 36, and 37 of the AND circuits 21, 22, and 23, respectively, in the horizontal or vertical direction and count the number of white-to-black transitions. Taking the horizontal/vertical enclosure feature as an example, the portion 38 marked with both vertical and horizontal lines in FIG. 2 is the output of the AND circuit 21, as described above. At the position of the scanning line 39 there are three white-to-black transition points, so the horizontal/vertical enclosure feature counting unit 26 outputs the value 3. Such an output is produced for each horizontal scanning line, of which there are 50 in the example above. The horizontal and vertical enclosure feature counting units 27 and 28 operate in the same way, except that for the vertical enclosure the count is output for each vertical scanning line. The counting units 26, 27, and 28 therefore produce 50 outputs each in this example, 150 in total. The matching unit 29 calculates the distance between the features extracted from these inputs and standard patterns prepared in advance, and outputs the category (character) of the nearest standard pattern as the result.
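The counting and matching stages can be sketched as follows. This is a hypothetical illustration: the function names, the reading of a "crossing" as an entry into a labelled region along the scan line, the squared Euclidean distance, and the dictionary entries `pattern_a` and `pattern_b` are all assumptions, since the text does not specify the distance measure or the stored patterns.

```python
def count_crossings(line):
    """Count how many times a scan line enters a labelled (1) region."""
    count, previous = 0, 0
    for value in line:
        if value == 1 and previous == 0:
            count += 1
        previous = value
    return count


def feature_vector(both, h_only, v_only):
    """Per-row counts for the first two planes, per-column counts for the third."""
    features = [count_crossings(row) for row in both]
    features += [count_crossings(row) for row in h_only]
    features += [count_crossings(col) for col in zip(*v_only)]
    return features


def nearest_category(features, dictionary):
    """Return the dictionary key whose stored pattern is closest to the features."""
    def squared_distance(pattern):
        return sum((a - b) ** 2 for a, b in zip(features, pattern))
    return min(dictionary, key=lambda name: squared_distance(dictionary[name]))


# One scan line that enters the labelled region three times.
print(count_crossings([0, 1, 1, 0, 1, 0, 0, 1, 1, 1]))  # 3

# Tiny 2 x 4 planes; a 50 x 50 image would give 50 + 50 + 50 = 150 values.
both = [[0, 1, 0, 1], [0, 0, 0, 0]]
h_only = [[1, 1, 0, 0], [0, 0, 0, 0]]
v_only = [[0, 0, 0, 0], [1, 0, 1, 1]]
vector = feature_vector(both, h_only, v_only)

dictionary = {
    "pattern_a": [2, 0, 1, 0, 1, 1, 1, 1],
    "pattern_b": [0, 1, 0, 0, 0, 0, 0, 0],
}
print(vector, nearest_category(vector, dictionary))
```

Counting per line rather than per pixel keeps the feature length fixed by the image size, which is what makes a simple vector distance to each standard pattern possible here.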

Experimental data for character recognition using the method of the present invention are shown below.

Recognition experiments were conducted on the following 37 similar categories.

Target categories: 悪, 意, 炎, 恩, 害, 完, 患, 危, 鬼, 急, 愚, 恵, 憲, 更, 克, 思, 慈, 充, 是, 泉, 走, 束, 息, 怠, 態, 忠, 展, 東, 唐, 念, 売, 尾, 免, 吏, 竜, 昆

Data used: 100 samples per category, 3,700 samples in total

Recognition experiment
Features of the method according to the present invention: recognition rate 65.3%
Features of FIG. 3 (c) and (d): recognition rate 62.7%
Features of the method according to the present invention combined with two other features: 86.2%
Features of FIG. 3 (c) and (d) combined with the two other features: 83.4%

As these results show, using the features of the method according to the present invention improves the recognition rate by about 3 percentage points compared with using the features of FIG. 3 (c) and (d) (65.3% versus 62.7% alone, and 86.2% versus 83.4% in combination).
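The size of the reported improvement follows directly from the four recognition rates. A small check, using only the figures stated above (the dictionary keys are hypothetical labels chosen for this sketch):

```python
# Recognition rates in percent, as reported in the experiment.
rates = {
    "invention_alone": 65.3,
    "conventional_alone": 62.7,
    "invention_combined": 86.2,
    "conventional_combined": 83.4,
}

# Differences in percentage points, rounded to one decimal place.
gain_alone = round(rates["invention_alone"] - rates["conventional_alone"], 1)
gain_combined = round(rates["invention_combined"] - rates["conventional_combined"], 1)

# Total sample count: 37 categories with 100 samples each.
total_samples = 37 * 100

print(gain_alone, gain_combined, total_samples)  # 2.6 2.8 3700
```

Both differences are a little under 3 percentage points, consistent with the "about 3%" stated in the text.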

As is clear from the above description, the present invention makes it possible to extract character features that are unaffected by the slant of characters or the length of lines arising from a writer's individual habits, and that are easy to handle.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention, FIG. 2 is a diagram explaining its operation, and FIG. 3 is a diagram explaining conventional character recognition systems. In the figures, 31 is the input image information, 20a is the portion enclosed above and below, 19a is the portion enclosed on the left and right, and 38 is the portion enclosed above, below, left, and right.

Claims (1)

[Claims]
1. A character recognition system which extracts features of input character image information, matches the extracted feature pattern against dictionary patterns prepared in advance, and determines the input character from the matching result, characterized in that the background portion of the character strokes is divided into portions enclosed by the strokes above and below, portions enclosed on the left and right, and portions enclosed above, below, left, and right, a different label is given to each kind of portion, the number of times the horizontal and vertical scanning lines cross each label region is counted, and the resulting histogram is used as the feature pattern of the input character.
JP56034243A 1981-03-10 1981-03-10 Character recognizing system Granted JPS57147783A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP56034243A JPS57147783A (en) 1981-03-10 1981-03-10 Character recognizing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP56034243A JPS57147783A (en) 1981-03-10 1981-03-10 Character recognizing system

Publications (2)

Publication Number Publication Date
JPS57147783A JPS57147783A (en) 1982-09-11
JPH0113583B2 JPH0113583B2 (en) 1989-03-07

Family

ID=12408712

Family Applications (1)

Application Number Title Priority Date Filing Date
JP56034243A Granted JPS57147783A (en) 1981-03-10 1981-03-10 Character recognizing system

Country Status (1)

Country Link
JP (1) JPS57147783A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4276865A1 (en) 2022-05-12 2023-11-15 Zippy Technology Corp. Micro switch

Also Published As

Publication number Publication date
JPS57147783A (en) 1982-09-11

Similar Documents

Publication Publication Date Title
US4408342A (en) Method for recognizing a machine encoded character
JPS6077274A (en) Recognition of character
KR19980018029A (en) Character recognition device
Lehal et al. Feature extraction and classification for OCR of Gurmukhi script
Biswas et al. Writer identification of Bangla handwritings by radon transform projection profile
Ayesh et al. A robust line segmentation algorithm for Arabic printed text with diacritics
Zhou et al. Discrimination of characters by a multi-stage recognition process
Lakshmi et al. An optical character recognition system for printed Telugu text
Verma et al. A novel approach for structural feature extraction: contour vs. direction
Tarafdar et al. A two-stage approach for word spotting in graphical documents
JPH0430070B2 (en)
Vasantha Lakshmi et al. OCR of printed Telugu text with high recognition accuracies
JPH0113583B2 (en)
Srivastava et al. Separation of machine printed and handwritten text for Hindi documents
US10657404B2 (en) Character recognition device, character recognition method, and character recognition program
Tou et al. Automatic recognition of handwritten characters via feature extraction and multi-level decision
Fernández-Mota et al. On the influence of key point encoding for handwritten word spotting
JP3476595B2 (en) Image area division method and image binarization method
Pourasad et al. A word spotting method for Farsi machine-printed document images
JP2917427B2 (en) Drawing reader
Lehal et al. A complete OCR system for Gurmukhi script
Lohakan et al. Single-character segmentation for handprinted Thai word
KR100332752B1 (en) Method for recognizing character
Parker et al. Vector templates for symbol recognition
Hwang et al. Segmentation of a text printed in Korean and English using structure information and character recognizers