CA1054718A - Method for recognizing characters - Google Patents

Method for recognizing characters

Info

Publication number
CA1054718A
CA1054718A CA253532A CA253532A CA1054718A CA 1054718 A CA1054718 A CA 1054718A CA 253532 A CA253532 A CA 253532A CA 253532 A CA253532 A CA 253532A CA 1054718 A CA1054718 A CA 1054718A
Authority
CA
Canada
Prior art keywords
character
features
feature
values
weights
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
CA253532A
Other languages
French (fr)
Inventor
Arie A. Spanjersberg
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nederlanden Volksgezondheid Welzijn en Sport VWS
Original Assignee
Nederlanden Volksgezondheid Welzijn en Sport VWS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nederlanden Volksgezondheid Welzijn en Sport VWS

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/192Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/192Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V30/194References adjustable by an adaptive method, e.g. learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Character Discrimination (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Character Input (AREA)

Abstract

The present invention relates to a method and device for recognizing characters, in which in a learning phase as well as in a subsequent working phase features of character patterns in a number of aspects thereof are classified in a number of groups, during the learning phase the results of these classifications being recorded in a store as statistic frequencies and during the working phase the result of the feature classification of a freshly offered character being utilized in determining, for each class of characters, the probabilities of the features found in this character, and in which the features of a character pattern to be recognized are weighted, the weight attributed to each feature depending on the shape of the pattern and in which, on the basis of the values of these weights, features are selected, the stored statistic frequencies of which are multiplied by the values of the weights and the largest value among the results is utilized for indicating the class.

Description

The invention relates to a method for recognizing characters, in which in a learning phase as well as in a subsequent working phase features of character patterns in a number of aspects thereof are classified in a number of groups, during the learning phase the results of these classifications being recorded in a store as statistic frequencies and during the working phase the results of the feature classification of a freshly offered character being utilized in determining, for each class of characters, the probability of the features found in this character.
A method of this kind is known from Canadian Patent application No. 204.920. By this method certain characteristic features can be detected by finding out whether the character to be recognized contains portions whose geometry is in accordance with previously given definitions of characteristic features. It is generally an inconvenience, however, that if the geometry of the character portion and the definition of the feature do not fully agree, this feature is considered not to occur. This difficulty is especially experienced when dealing with handwritten characters. It can be met by so establishing the definitions of the features that the requirements are met when the features do occur, though not in their ideal or clearest form. This gives rise to another difficulty: information is obtained as to whether a character element does or does not satisfy the definition of a feature, but there are no quantitative data with regard to a more or less pronounced or distinct appearance of the relevant feature in the examined character.
The invention offers a solution for the said difficulties, consisting in that the features of a character pattern to be recognized are weighted, the weight attributed to each feature depending on the shape of the pattern, after which, on the basis of the values of these weights, features are selected, the stored statistic frequencies of which are multiplied by the values of the weights and then the largest value among the results is utilized for indicating the class. It is to be recommended that configurations of character elements contributing to some specified feature are taken into account with the positive sign and configurations of character elements contributing to the remaining features in the group are taken into account with the negative sign, after which in each group the feature with the largest weight is selected.
The numerical values of the weights are preferably normalized so as to make them independent of the sizes of the images. According to the patent application mentioned above, for each outside aspect and for each inside aspect of every character a choice is made from four groups of features, one feature being chosen from each of the groups.
In the learning phase the probability P for the feature K to be found in a character to be recognized, e.g. one of the figures 0...9, belonging to class i, is determined for each of the K possible features in the system.
This is indicated in the following table.

aspects        groups of features    features        classes i
top aspect     jumps                 S0 to S6
               slopes                H0 to H14       ... PKi ...
               end points            EP0 to EP6
               islands               EL0 to EL6
left aspect    jumps                 S0 to S6
...

The quantity PKi is determined by counting the characters of one class i in which the feature K is found. Normalization is effected by dividing the values found by ni, which is the number of characters of class i occurring in the learning material.

Example: PKi = 9/10, which means that the feature K is found in 9000 of the 10000 class i characters offered.
The detection of the features always consists in the choice of one feature out of a group, e.g. the group of the jumps. So this choice is always concerned with one of the possible configurations of jumps employed in the system. The chosen feature is given the weight 1, the weight 0 being attributed to all the other features.
According to the invention a more sophisticated version is obtained by indicating the weights attributed to the features more precisely. The process is such that for each aspect the weight attributed to each of the features is computed and then recorded as a numerical quantity. From each group (jumps, slopes, etc.) the feature having the largest weight (gk) is chosen in the first instance.
In the well-known manner, a subset of the set of possible characteristics is used for deriving the features of each pattern. Thus it is imaginable that for end points and islands the inside aspects are not used, so that instead of 4 x 8 = 32 aspects for four groups of features according to the above diagram only 8 + 8 + 4 + 4 = 24 aspects are utilized. A character is assigned to the class i for which the sum Σ gk . PKi, taken over k = 1 ... 24, has the largest value. The method for determining the weights (gk) will now be described for each of the groups of features separately, reference being had to the drawing, in which the respective figures show:
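The decision rule stated above can be illustrated with a minimal Python sketch. It assumes the learning results are held as raw counts per feature and class; the function names `learn_frequencies` and `classify` and the data layout are hypothetical and do not come from the patent.

```python
def learn_frequencies(counts, n_per_class):
    """Normalize raw counts into P[k][i] = count[k][i] / n_i.

    counts[k][i] is the number of class-i learning characters in which
    feature k was found; n_per_class[i] is the number of class-i characters.
    """
    return [[c / n for c, n in zip(row, n_per_class)] for row in counts]


def classify(selected, P):
    """Assign the class i that maximizes the sum of g_k * P[k][i].

    selected is a list of (k, g_k) pairs, one per aspect and feature group
    (24 pairs in the configuration described in the text).
    """
    n_classes = len(P[0])
    scores = [sum(g * P[k][i] for k, g in selected) for i in range(n_classes)]
    best = max(range(n_classes), key=scores.__getitem__)
    return best, scores


# Two features, two classes, 10000 learning characters per class.
P = learn_frequencies([[9000, 1000], [2000, 8000]], [10000, 10000])
best, scores = classify([(0, 3), (1, 1)], P)
```

With the sample counts, feature 0 occurs in 9000 of the 10000 class 0 characters, so P[0][0] is 0.9, matching the normalization example given earlier.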
Fig. 1 a jump in a character pattern;
Fig. 2 an additional jump;
Fig. 3 a too small jump;
Fig. 4 two positive jumps and three slopes;
Fig. 5 a slope;
Fig. 6 a long gentle slope;
Fig. 7 a slope having a positive and a negative part;
Fig. 8 negative-positive-negative slope;
Fig. 9 an end point;
Fig. 10 ditto;
Fig. 11 an example of end points;
Fig. 12 islands;
Fig. 13 a block diagram concerning the processing of weight information;
Fig. 14 a block diagram concerning a second processing method.
Jumps.
The weight of a jump is measured by the number of white image elements between the end of an intersection and the beginning of an intersection on the next line (Fig. 1) of an image into which the character has been converted by a well-known process.
As in the definition stated in the patent application mentioned above, a distinction is drawn between positive and negative jumps. For determining the weights of the jump configurations use is made of a counter which is capable of counting the jumps in an aspect. The maximum number that can be counted is three. Further there are two registers for each jump. Whenever a jump occurs, the white image elements that determine the weight of the jump are counted. Depending on whether it is a positive or a negative jump, this number is stored in the corresponding register.
jump    pos(itive)    neg(ative)
S1      p1            n1
S2      p2            n2
S3      p3            n3
The references p1 ... p3 and n1 ... n3 indicate the weights.
The following jump configurations and associated weights are distinguished:
code  description               weight  definition
S0    no jumps                  g0      B/2 - Σe
S1    1 pos.jump                g1      p max - (Σe - p max) = 2 . p max - Σe
S2    1 neg.jump                g2      n max - (Σe - n max) = 2 . n max - Σe
S3    1 pos.jump + 1 pos.jump   g3      p max-1 - (p,n) min
S4    1 neg.jump + 1 neg.jump   g4      n max-1 - (p,n) min
S5    1 pos.jump + 1 neg.jump   g5      (p max, n max) min - [Σe - (p max + n max)]
S6    more than two jumps       g6      (S3) . (p,n) min
In this diagram:
B = the width or height of the character image;
Σe = the sum of all the intermediate white image elements included in the jumps in one aspect;
p max = the largest of the three numbers p1, p2 and p3, indicating the largest positive jump;
n max = the largest of the three numbers n1, n2 and n3, indicating the largest negative jump;
p max-1 = the second largest value of p1, p2 and p3;
n max-1 = the second largest value of n1, n2 and n3;
(p,n) min = the smallest of the numbers p1 ... p3 and n1 ... n3;
(p max, n max) min = the smaller of the numbers p max and n max;
(S3) = 1, if (p3, n3) ≠ 0;
     = 0, if (p3, n3) = 0.
It can be seen from this list of formulae and from similar lists and examples to be given in what follows that configurations of image elements contributing to a specified feature are taken into account with a positive sign and that configurations of image elements contributing to the other features of the relevant group are taken into account with a negative sign.
In this connection two examples will be given, it being observed that, if the distance between the beginning points of two intersections that do join is equal to or larger than half the width or height of the character image, this will be regarded as a jump (additional jump, Fig. 2).
Example 1 (Fig. 3).
Scanning result (left aspect):
S1 : p1 = 0   n1 = 3
S2 : p2 = 0   n2 = 0
S3 : p3 = 0   n3 = 0
Computation of the weights:
g0 = 8 - 3 = 5
g1 = 0 - 3 = -3
g2 = 6 - 3 = 3
g3 = 0 - 0 = 0
g4 = 0 - 0 = 0
g5 = 0 - 0 = 0
g6 = 0 - 0 = 0
Conclusion: In a character of the class according to Fig. 3 the weight g0 is the largest, i.e. the feature S0 (no jumps) is preferred. Evidently the one (negative) jump is too small to be of any interest. Next the weight g2 is the largest for feature S2 (one negative jump). The weights g1 and g5 are negative, i.e. the associated features are still less important.
Example 2 (Fig. 4).
Scanning result (left aspect):
B = Bs = 18
S1 : p1 = 4   n1 = 0
S2 : p2 = 7   n2 = 0
S3 : p3 = 0   n3 = 0
Computation of the weights:
g0 = 9 - 11 = -2
g1 = 7 - 4 = 3
g2 = 0 - 11 = -11
g3 = 4 - 0 = 4
g4 = 0 - 0 = 0
g5 = 0 - 4 = -4
g6 = 0 - 0 = 0
Conclusion: In a character of the class according to Fig. 4 the weight g3 is the largest, i.e. the feature S3 (two positive jumps) is preferred, immediately followed by the feature of one positive jump (cf. g1 for S1).
The weight g2 for the occurrence of one negative jump (S2) is very negative (-11), so that the corresponding feature is not fit for being considered.
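The jump weight definitions can be checked against the two examples with a short Python sketch. The function name `jump_weights` is hypothetical. The sketch assumes that g0 is B/2 - Σe, which is what the arithmetic of Example 2 (9 - 11 with B = 18) suggests, and it takes B = 16 for Example 1, a value the text does not state but which yields the 8 used there.

```python
def jump_weights(p, n, B):
    """Return [g0, ..., g6] for the jump configurations S0..S6 of one aspect.

    p, n: the three positive and three negative jump registers (0 = empty).
    B:    width or height of the character image.
    """
    total = sum(p) + sum(n)                       # sum of white elements in all jumps
    p_max, p_second = sorted(p, reverse=True)[:2]
    n_max, n_second = sorted(n, reverse=True)[:2]
    pn_min = min(list(p) + list(n))               # (p,n) min over all six registers
    s3 = 1 if (p[2] != 0 or n[2] != 0) else 0     # (S3): a third jump was recorded
    return [
        B / 2 - total,                                   # S0: no jumps (assumed B/2)
        2 * p_max - total,                               # S1: one positive jump
        2 * n_max - total,                               # S2: one negative jump
        p_second - pn_min,                               # S3: two positive jumps
        n_second - pn_min,                               # S4: two negative jumps
        min(p_max, n_max) - (total - (p_max + n_max)),   # S5: one pos. + one neg.
        s3 * pn_min,                                     # S6: more than two jumps
    ]


example_1 = jump_weights(p=(0, 0, 0), n=(3, 0, 0), B=16)
example_2 = jump_weights(p=(4, 7, 0), n=(0, 0, 0), B=18)
selected = max(range(7), key=example_2.__getitem__)   # index of the largest weight
```

For Example 2 the largest weight falls on index 3, i.e. feature S3, in agreement with the conclusion drawn in the text.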
Slopes.
The weight attributed to a slope configuration is determined by the number of scanning lines covered by the configuration. In what follows the slope is defined in a somewhat different manner than in the system according to the patent application mentioned above. There is a slope when in two or more successive lines the distance to the side of the enclosing rectangle is different. This definition also includes a one element shift on one line, when no change of distance occurs on the next line. See Fig. 5. In this case the left aspect exhibits a negative slope on lines 3 and 4. When the line segment continues beyond that point every second line is counted. See Fig. 6.
Thus the weight of the negative slope is 5. A slope always ends in a jump or in an additional jump.
The maximum number of successive slopes that can be recorded in an aspect is limited to 4. In a compound slope configuration, e.g. a negative slope followed by a positive slope, the weight is determined by the degree of similarity of the two partial areas (symmetry of the slopes). See Fig.7.
The first slope in the left aspect is negative (nh 1) over three lines.
The second slope is positive (ph 2) over five lines. The weight of the feature "negative slope, positive slope" is found by determining the smaller value of the two numbers nh 1 and ph 2 and multiplying this value by 2.
Notation: 2 (nh 1, ph 2) min. So the weight is 2 x 3 = 6.
In the example according to Fig. 8 the feature "negative slope, positive slope, negative slope" must get the largest weight. The weight is determined by taking the smallest value of the numbers nh 1, ph 2, and nh 3 and multiplying it by 3. Notation: 3 (nh 1, ph 2, nh 3) min.
So in the above example the weight of the feature:"neg., pos., neg."
is 3 x 3 = 9.
To record the values of the successive slopes eight registers are utilized, viz: ph 1 ... ph 4 and nh 1 ... nh 4. The ph-registers contain the data of the positive slopes and the nh-registers those of the negative slopes. Cases are distinguished in which both registers ph 4 and nh 4 are empty; notation: (p,n) h 4 = 0. So in these cases an aspect presents three slopes at the most. The following slope configurations and associated weights can be distinguished.
code  description        weight  definition
H0    no slopes          g0      [B/2 - Σh] . HE 4
H1    positive slope     g1      [ph max - (Σh - ph max)] . HE 4
H2    negative slope     g2      [nh max - (Σh - nh max)] . HE 4
H3    pos.pos.slope      g3      largest value of:
                                 [2{(ph 1, ph 2) min - (p, n) h3}] . HE 4
                                 [2{(ph 2, ph 3) min - (p, n) h1}] . HE 4
H4    pos.neg.slope      g4      largest value of:
                                 [2{(ph 1, nh 2) min - (p, n) h3}] . HE 4
                                 [2{(ph 2, nh 3) min - (p, n) h1}] . HE 4
H5    neg.pos.slope      g5      largest value of:
                                 [2{(nh 1, ph 2) min - (p, n) h3}] . HE 4
                                 [2{(nh 2, ph 3) min - (p, n) h1}] . HE 4
H6    neg.neg.slope      g6      largest value of:
                                 [2{(nh 1, nh 2) min - (p, n) h3}] . HE 4
                                 [2{(nh 2, nh 3) min - (p, n) h1}] . HE 4
H7    pos.pos.neg.       g7      largest value of:
                                 3{(ph 1, ph 2, nh 3) min - (p, n) h4}
                                 3{(ph 2, ph 3, nh 4) min - (p, n) h1}
H8    pos.neg.pos.       g8      largest value of:
                                 3{(ph 1, nh 2, ph 3) min - (p, n) h4}
                                 3{(ph 2, nh 3, ph 4) min - (p, n) h1}
H9    pos.neg.neg.       g9      largest value of:
                                 3{(ph 1, nh 2, nh 3) min - (p, n) h4}
                                 3{(ph 2, nh 3, nh 4) min - (p, n) h1}
H10   neg.pos.pos.       g10     largest value of:
                                 3{(nh 1, ph 2, ph 3) min - (p, n) h4}
                                 3{(nh 2, ph 3, ph 4) min - (p, n) h1}
H11   neg.pos.neg.       g11     largest value of:
                                 3{(nh 1, ph 2, nh 3) min - (p, n) h4}
                                 3{(nh 2, ph 3, nh 4) min - (p, n) h1}
H12   neg.neg.pos.       g12     largest value of:
                                 3{(nh 1, nh 2, ph 3) min - (p, n) h4}
                                 3{(nh 2, nh 3, ph 4) min - (p, n) h1}
H13   neg.pos.neg.pos.   g13     4 (nh 1, ph 2, nh 3, ph 4) min
H14   4 slopes           g14     4 {(p, n) h1, (p, n) h2, (p, n) h3, (p, n) h4} min . {G 13}

In this diagram:
B = width or height of pattern;
Σh = sum of all slope values;
HE 4 = 1, if (p, n) h4 = 0;
     = 0, if (p, n) h4 ≠ 0;
G 13 = 1, if g13 = 0;
     = 0, if g13 ≠ 0.
The slope configurations "pos.pos.pos." and "neg.neg.neg." have been cancelled, because they occur very seldom. On the other hand the configuration having four slopes, "neg.pos.neg.pos.", has been included (H13), because this very configuration is expected to occur frequently in the characters 3 and 8.
Example (Fig. 4, left aspect).
B = Bh = 17   Σh = 13
ph 1 = 0   nh 1 = 3
ph 2 = 0   nh 2 = 9
ph 3 = 1   nh 3 = 0
ph 4 = 0   nh 4 = 0
g0 = 9 - 13 = -4
g1 = 1 - 12 = -11
g2 = 9 - (13 - 9) = +5
g3 = -2
g4 = -2
g5 = 0 - 2 = -2
g6 = 6 - 2 = +4
g7 = 0
g8 = 0
g9 = 0
g10 = 0
g11 = 0
g12 = 3 - 0 = +3
g13 = 0
g14 = 0.
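The slope weight definitions can be reproduced in a compact Python sketch that slides each slope pattern over the registers. Several points are assumptions rather than statements of the patent: (p, n) h i is taken as whichever of ph i and nh i is filled, B/2 is rounded up (the example uses 9 for B = 17), the second alternative of each pattern is obtained by shifting the first one down by one slope, and g14 uses the minimum of the four register values. The names `PATTERNS` and `slope_weights` are hypothetical.

```python
import math

# Slope sequences for codes H3..H12 ("p" = positive slope, "n" = negative slope).
PATTERNS = {3: "pp", 4: "pn", 5: "np", 6: "nn", 7: "ppn", 8: "pnp",
            9: "pnn", 10: "npp", 11: "npn", 12: "nnp"}


def slope_weights(ph, nh, B):
    """Return [g0, ..., g14] for the slope configurations H0..H14 of one aspect."""
    reg = {"p": ph, "n": nh}
    either = [max(p, n) for p, n in zip(ph, nh)]   # assumed meaning of (p, n) h i
    total = sum(ph) + sum(nh)                      # sum of all slope values
    he4 = 1 if either[3] == 0 else 0               # HE 4: fourth registers empty
    g = [0] * 15
    g[0] = (math.ceil(B / 2) - total) * he4
    g[1] = (max(ph) - (total - max(ph))) * he4
    g[2] = (max(nh) - (total - max(nh))) * he4
    for code, pattern in PATTERNS.items():
        m = len(pattern)
        candidates = []
        for start in (0, 1):                       # pattern on slopes 1.. or 2..
            matched = min(reg[sign][start + j] for j, sign in enumerate(pattern))
            left_out = either[m] if start == 0 else either[0]
            candidates.append(m * (matched - left_out))
        # Two-slope patterns are only valid when the fourth registers are empty.
        g[code] = max(candidates) * (he4 if m == 2 else 1)
    g[13] = 4 * min(nh[0], ph[1], nh[2], ph[3])
    g13_is_zero = 1 if g[13] == 0 else 0           # G 13
    g[14] = 4 * min(either) * g13_is_zero
    return g


weights = slope_weights(ph=[0, 0, 1, 0], nh=[3, 9, 0, 0], B=17)
```

Applied to the register values of the example, the sketch returns the weights listed there, with g2 (one negative slope) the largest.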

End points.
End points are characterized in that the number of image elements s occupied by the line thickness is smaller than the number of lines p covered by the line segment (Fig. 9). When the left outside aspect is being dealt with for detecting jumps and slopes, the end point configuration for the outside aspect is formed at the same time. Going from top to bottom, the lines containing image information are shifted to the left out of the matrix. At every step during this process it is tested how many united image elements s the line thickness comprises. This number is temporarily stored in a working register. When a value of s is found which exceeds the value in the register, this fresh value (s max) is recorded. Further the number of lines p covered by the end point is counted.
At every line the ratio p / s max is determined anew. If the freshly determined value of p / s max exceeds the value temporarily stored in a register, the new value is recorded. Thus the maximum value of the ratio p / s max occurring during the forming of the end point configuration is obtained.
In the example of Fig. 9 the maximum of the ratio p / s max = 2½. This value determines the weight attributed to the feature.
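The running computation of the ratio p / s max can be sketched as follows. This is an illustration under assumptions: the input is a hypothetical list giving the line thickness s on each successive line of one end point, every listed line is taken to contain image information (s at least 1), and the sample values are invented rather than read from Fig. 9.

```python
def end_point_value(thicknesses):
    """Return the maximum of p / s_max over the lines of one end point.

    thicknesses[j] is the number of united image elements s on line j + 1;
    each value is assumed to be at least 1.
    """
    s_max = 0
    best = 0.0
    for p, s in enumerate(thicknesses, start=1):   # p = number of lines covered so far
        s_max = max(s_max, s)                      # keep the largest thickness seen
        best = max(best, p / s_max)                # keep the largest ratio seen
    return best


thin_stroke = end_point_value([2, 2, 2, 2, 2])   # thickness 2 over five lines
widening = end_point_value([1, 1, 4])            # the stroke widens on the third line
```

A stroke that widens abruptly lowers the ratio on that line, so the recorded maximum is the one reached before the widening.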
In each outside aspect the positions of the extremities are determined first. An extremity occurs when, reckoned from the outside of the rectangle, the first intersection with the image area is found. The position of the extremity is the extreme black image element of the intersection.
In the example of Fig. 10 the extremity, in the top aspect, is found at the position k = 10, r = 4. The count of the number of lines covered by the end point is terminated as soon as no correct connection of two successive intersections is found. An OR-function is made of the successive lines of the image. When line n is added to the OR-function, a test is made as to whether the intersection in line (n-1), associated with an end point, consists of united black image elements within the OR-area. Only if that is the case, line (n-1) contributes to the value of the end point.

In the new method of determining features the same end point configurations are distinguished as in the system according to the patent application mentioned above. The coding indicates whether the extremities of end points occur in the upper part or in the lower part of an aspect.
The values found for the end points, which are related to the positions of the relevant extremities, are recorded in four registers b 1,2 and o 1,2.
The values of the end points found in the upper part of the aspect are recorded in the registers b1 and b2; the values of the end points in the lower part of the aspect are recorded in the registers o1 and o2.
End point configurations and associated weights:
code  description                   weight  definition
EP 0  no end points                 g0      2 - (b1 + b2 + o1 + o2)
EP 1  1 end point above             g1      (b 1,2) max - {(b 1,2) min + o1 + o2}
EP 2  1 end point below             g2      (o 1,2) max - {(o 1,2) min + b1 + b2}
EP 3  2 end points above            g3      (b 1,2) min - (o 1,2) max
EP 4  1 end point above, 1 below    g4      {(b 1,2) max, (o 1,2) max} min - {(b 1,2) min + (o 1,2) min}
EP 5  2 end points below            g5      (o 1,2) min - (b 1,2) max
EP 6  more than 2 end points        g6      3 x (b 1,2, o 1,2) min, this minimum being ≠ 0 and not more than one register being empty.

For the end points of the left aspect in the example of Fig. 11 the following maximum values of the ratio p / s max are recorded:
b1 = 1
b2 = 1
o1 = 5
o2 = 0
The weights computed by the definitions given:
g0 = -5
g1 = -5
g2 = +3
g3 = -4
g4 = 0
g5 = -1
g6 = +3
So the feature "more than two end points" has the same weight as the feature "one end point below".
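The end point weights can be checked with a small Python sketch. It rests on assumptions: the register values b1 = 1, b2 = 1, o1 = 5, o2 = 0 are the ones consistent with the weights listed in the example, g0 is taken as 2 minus the register sum, and g6 is taken as three times the smallest non-empty register when at most one register is empty, and 0 otherwise. The function name `end_point_weights` is hypothetical.

```python
def end_point_weights(b, o):
    """Return [g0, ..., g6] for the end point configurations EP 0..EP 6.

    b: (b1, b2), values of end points with the extremity in the upper part.
    o: (o1, o2), values of end points with the extremity in the lower part.
    """
    b_max, b_min = max(b), min(b)
    o_max, o_min = max(o), min(o)
    regs = list(b) + list(o)
    filled = [v for v in regs if v != 0]
    # Assumed reading of the EP 6 condition: at most one register may be empty.
    g6 = 3 * min(filled) if len(filled) >= 3 else 0
    return [
        2 - sum(regs),                          # EP 0: no end points
        b_max - (b_min + sum(o)),               # EP 1: one end point above
        o_max - (o_min + sum(b)),               # EP 2: one end point below
        b_min - o_max,                          # EP 3: two end points above
        min(b_max, o_max) - (b_min + o_min),    # EP 4: one above, one below
        o_min - b_max,                          # EP 5: two end points below
        g6,                                     # EP 6: more than two end points
    ]


fig_11_left = end_point_weights(b=(1, 1), o=(5, 0))
```

The result shows the tie noted in the text: g2 and g6 both equal 3.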
Islands.
The manner in which islands are formed from the original character pattern is the same as in the known system. The further processing operations, however, are entirely different.
After the islands have been formed, the extremity, i.e. the line on which the island is observed for the first time, is determined for each island and a test is made as to whether it occurs in the upper half or in the lower half of an aspect. It is obvious that this is done for each of the four outside aspects.
We distinguish the following configurations:
EL 0 no islands;
EL 1 one island, extremity above;
EL 2 one island, extremity below;
EL 3 two islands, two extremities above;
EL 4 two islands, one extremity above, one below;
EL 5 two islands, two extremities below;
EL 6 more than two islands.
The weight of each feature is determined by the number of lines covered by the island. In order to determine, in a simple manner, the extreme value and the number of lines of an island, the "OR"-function is formed of every two successive lines. As long as the intersection of an island constitutes an uninterrupted series of black image elements, it belongs to the relevant island. Four registers e1, e2, f1 and f2 ensure the recording.
In the case of islands the extremities of which lie in the upper part of the aspect the total number of lines for the relevant aspect is recorded in the registers e1 and e2. In the case of islands the extremities of which occur in the lower part of the aspect the relevant total is recorded in the registers f1 and f2. So in one image half two island extremities can be treated at the most. If there are more in one image half, the excess extremities and, consequently, the relevant islands are ignored.
The following island configurations and associated weights are distinguished:

code  description                                weight  definition
EL 0  no islands                                 g0      B/4 - (e1 + e2 + f1 + f2)
EL 1  1 island, extr. above                      g1      (e 1,2) max - {(e 1,2) min + f1 + f2}
EL 2  1 island, extr. below                      g2      (f 1,2) max - {(f 1,2) min + e1 + e2}
EL 3  2 islands, 2 extr. above                   g3      2 (e 1,2) min - (f1 + f2)
EL 4  2 islands, 1 extr. above, 1 extr. below    g4      2 {(e 1,2) max, (f 1,2) max} min - {(e 1,2) min + (f 1,2) min}
EL 5  2 islands, 2 extr. below                   g5      2 (f 1,2) min - (e1 + e2)
EL 6  more than 2 islands                        g6      3 x (e 1,2, f 1,2) min, this minimum being ≠ 0 and not more than 1 register being empty.

For the data of the top aspect, according to the example of Fig. 12, the following weights are found:
e1 = 1   e2 = 3   f1 = 5   f2 = 0   B = 20
g0 = 20/4 - (1 + 3 + 5 + 0) = 5 - 9 = -4
g1 = 3 - (1 + 5 + 0) = 3 - 6 = -3
g2 = 5 - (0 + 1 + 3) = 5 - 4 = +1
g3 = 2 x 1 - (5 + 0) = 2 - 5 = -3
g4 = 2 x 3 - (1 + 0) = 6 - 1 = +5
g5 = 2 x 0 - (1 + 3) = 0 - 4 = -4
g6 = 3 x 1 = +3

The feature "two islands, one extremity above, one extremity below" has the largest weight. From the same image the following data are found for the left aspect:
e1 = 7   e2 = 6   f1 = 5   f2 = 0   B = 12
weights:
g0 = 12/4 - (7 + 6 + 5 + 0) = 3 - 18 = -15
g1 = 7 - (6 + 5 + 0) = 7 - 11 = -4
g2 = 5 - (0 + 7 + 6) = 5 - 13 = -8
g3 = 2 x 6 - (5 + 0) = 12 - 5 = +7
g4 = 2 x 5 - (6 + 0) = 10 - 6 = +4
g5 = 2 x 0 - (7 + 6) = 0 - 13 = -13
g6 = 3 x 5 = +15
The feature "more than two islands" has the largest weight.
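Both island examples can be reproduced with a short Python sketch. It assumes that g0 is B/4 minus the register sum, as the arithmetic of the examples suggests (5 for B = 20, 3 for B = 12), and that g6 is three times the smallest non-empty register when at most one register is empty, and 0 otherwise. The function name `island_weights` is hypothetical.

```python
def island_weights(e, f, B):
    """Return [g0, ..., g6] for the island configurations EL 0..EL 6.

    e: (e1, e2), line counts of islands with the extremity in the upper part.
    f: (f1, f2), line counts of islands with the extremity in the lower part.
    B: width or height of the character image.
    """
    e_max, e_min = max(e), min(e)
    f_max, f_min = max(f), min(f)
    regs = list(e) + list(f)
    filled = [v for v in regs if v != 0]
    # Assumed reading of the EL 6 condition: at most one register may be empty.
    g6 = 3 * min(filled) if len(filled) >= 3 else 0
    return [
        B / 4 - sum(regs),                          # EL 0 (assumed B/4)
        e_max - (e_min + sum(f)),                   # EL 1
        f_max - (f_min + sum(e)),                   # EL 2
        2 * e_min - sum(f),                         # EL 3
        2 * min(e_max, f_max) - (e_min + f_min),    # EL 4
        2 * f_min - sum(e),                         # EL 5
        g6,                                         # EL 6
    ]


top = island_weights(e=(1, 3), f=(5, 0), B=20)
left = island_weights(e=(7, 6), f=(5, 0), B=12)
```

The largest entry is g4 for the top aspect and g6 for the left aspect, matching the features named in the text.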

The judging processor.
a. Maximum selector binary feature components.
Now that the weights of all the features can be determined, all kinds of variations of the judging processor can be imagined. The simplest method is the one in which the existing judging method is approximated as much as possible. Thus it could appear whether merely a better determination of the features already leads to an improvement.
The block diagram of the system can be as indicated in Fig. 13.
In this figure the reference numeral 1 designates a character image, 2 an image manipulator by means of which the four outside aspects and the four inside aspects are formed. For each aspect processing units 3 compute, in the described manner, the weight of each of the configurations in each of the groups of features (jumps S, slopes H, end points EP and islands EL). For each group selectors 4 are utilized for determining the highest of the computed weights.
As a result of these operations four values gs, gh, gp and gl are available per aspect for the four respective feature groups S, H, EP and EL. The judging processor in its simplest embodiment does not utilize the numerical values gs, gh, gp and gl, but only employs the statistic frequencies of occurrence of the selected features in characters of the various classes. When a feature x is selected, because the value of weight gx is the highest in the group of features to which it belongs, the relevant character belonging to class y, the content of the storage position Pxy in a learning matrix 6 is increased by one in the learning phase. The storage position is indicated by means of code converters 5, which indicate the relevant line of the learning matrix.
At the end of the learning phase the content of a storage position Pxy has the value Sxy, which indicates the number of times the feature (x) was selected in images of class y. The probability of a character of class y, when the feature x has been selected, is K(y|x), for which
K(y|x) = Sxy / Σ Sxy   (the sum being taken over y = 1 ... k)
In this expression Σ Sxy is the sum of all the characters in the learning set for which the feature x was selected.
Then, at the end of the learning phase, the logarithm of all the probabilities K(y|x) is determined.
Consequently: K(y|x)L = log K(y|x).
The (logarithmic) probability for a character to belong to class y is found in the working phase by summing the logarithmic probabilities of all the selected features, K(y)L = Σ K(y|x)L, the highest sum being determined by means of a maximum selector 7.
If there are features which, within one group, have the same weights, operations in the learning phase are carried out as if there are different images, of which, consequently, all the selected features are recorded. In the working phase, in that case, the recognition is carried out for the various representations. If the results are contradictory, the relevant character is rejected.
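The first embodiment of the judging processor can be sketched in Python as a table of counts that is turned into logarithmic probabilities. The class name `LearningMatrix`, the feature labels and the small probability floor used to avoid log(0) for combinations never seen in the learning phase are all assumptions of this sketch; the patent does not say how empty storage positions are handled.

```python
import math
from collections import defaultdict


class LearningMatrix:
    """Counts S_xy and derives log K(y|x) = log(S_xy / sum over y of S_xy)."""

    def __init__(self, classes):
        self.classes = list(classes)
        # One row of per-class counters for every feature x.
        self.counts = defaultdict(lambda: {y: 0 for y in self.classes})

    def learn(self, selected_features, y):
        """Learning phase: add one to position (x, y) for each selected feature."""
        for x in selected_features:
            self.counts[x][y] += 1

    def log_prob(self, x, y, floor=1e-6):
        """Return log K(y|x); the floor is an assumed guard against log(0)."""
        row = self.counts[x]
        total = sum(row.values())
        k = row[y] / total if total else 0.0
        return math.log(max(k, floor))

    def classify(self, selected_features):
        """Working phase: sum log K(y|x) per class and pick the largest sum."""
        scores = {y: sum(self.log_prob(x, y) for x in selected_features)
                  for y in self.classes}
        return max(scores, key=scores.get), scores


matrix = LearningMatrix(classes=["0", "1"])
for _ in range(3):
    matrix.learn(["top:S0", "top:H2"], "0")
for _ in range(2):
    matrix.learn(["top:S3", "top:H2"], "1")
matrix.learn(["top:S0", "top:H6"], "1")

label_a, _ = matrix.classify(["top:S0", "top:H2"])
label_b, _ = matrix.classify(["top:S3", "top:H6"])
```

Summing logarithms replaces the product of the individual probabilities, which is why the learning matrix stores log K(y|x) rather than K(y|x) itself.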

b. Judging processor with weighted feature components.
Fig. 14 shows how during the learning phase a learning matrix is filled and how during the working phase an image is recognized, in such a manner that use can be made of the numerical values of the features.
During the learning phase as well as during the working phase aspects are offered, by means of an image manipulator 9, of each character image 8 to be processed. The weights are computed in a processor unit 10. In each aspect a maximum selector 11 determines the largest weight (gx) for each feature group. By means of a coding device 12 that line in a learning matrix 13 is indicated which corresponds to the relevant selected feature. In the known way, during the learning phase, the content of a storage position xy is increased by one, if feature x is found in a character of class y. So during the learning phase each storage position of the learning matrix contains a sum S(y|x), being the number of times the feature (x) was selected in images of class y. At the end of the learning phase, by a process identical to the one utilized in the first embodiment of the judging processor (Fig. 13), the probability of a character of class y can be determined, when the feature x has been selected:
K(y|x) = S(y|x) / Σ S(y|x)   (the sum being taken over y = 1 ... k)
As, for the classification of images, the product probabilities for all the selected features are utilized, in this embodiment of the judging processor too the calculations are carried out with logarithmic values of the probabilities. At the end of the learning phase each storage position of the learning matrix contains the logarithmic probability of a character of class y, when the feature x has been selected:
K(y|x)L = log K(y|x).
In the subsequent working phase the largest weight (gx) of each character image offered is determined, equally per aspect, for each group of features. The numerical value of the weight is normalized by a division by the dimension B (width or height) of the image. This is done in a normalizing device 14. Next the logarithmic values of the normalized weights are determined by logic circuits 15.
The logarithmic weight of the selected feature:
gxL = log (gx / B).
The weighted probability for an image of class y to occur, when a feature x having a weight gx has been selected, is:
Ky(x) = K(y|x) . gx / B.
The logarithmic probability:
KyL(x) = log Ky(x).
This probability is found by summing the logarithmic probability K(y|x)L stored in the learning matrix and the logarithmic normalized weight gxL:
KyL(x) = K(y|x)L + gxL.
As shown in Fig. 14 there is a summing device 16 for each group of features. In each group of features the summation for the selected feature is carried out per aspect for all the classes. Consequently, a second matrix 17 having k columns is formed, the number k corresponding to the number of character classes, the number of lines corresponding to the number of selected features multiplied by the number of aspects.

After the second matrix has been filled in view of the recognition of an image, the logarithmic probability for the character to belong to class y is found by summing the probabilities KyL(x).

This sum is Σ KyL(x).

The pattern is assigned to the class for which Σ KyL(x) is the highest. The highest value is determined by means of a maximum selector 18. The introduction of weights as described above provides an improved method of determining the features of a character image. Moreover, the computed weights can be used for the classification of images, for which purpose the selected weights are preferably normalized in accordance with the dimensions of the images. The classification of an image is then determined, on one hand, by the statistical frequencies, occurring in the utilized learning set, of the features selected for the relevant character image and, on the other hand, by the weights of the features selected for the relevant image.
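The weighted embodiment can be sketched in Python as the per-class sum of the stored logarithmic probability and the logarithm of the normalized weight. The function name `weighted_scores`, the nested-dictionary layout of the learning matrix and the choice to skip features whose selected weight is not positive (the logarithm being undefined there) are assumptions of this sketch and are not taken from the patent.

```python
import math


def weighted_scores(log_k, selected, classes):
    """Sum log K(y|x) + log(gx / B) over the selected features for each class.

    log_k[x][y] is the stored logarithmic probability for feature x and class y;
    selected is a list of (x, gx, B) tuples, one per aspect and feature group.
    """
    scores = {y: 0.0 for y in classes}
    for x, gx, B in selected:
        if gx <= 0:
            continue                      # assumed handling: log(gx / B) is undefined
        g_log = math.log(gx / B)          # logarithmic normalized weight gxL
        for y in classes:
            scores[y] += log_k[x][y] + g_log
    return scores


# Hypothetical learning matrix with a single feature and two classes.
log_k = {"left:S3": {"0": math.log(0.2), "1": math.log(0.8)}}
scores = weighted_scores(log_k, [("left:S3", 4, 18)], classes=["0", "1"])
best = max(scores, key=scores.get)        # role of the maximum selector 18
```

The sample weight 4 and dimension 18 are the g3 and B of the second jump example; the probabilities 0.2 and 0.8 are invented for illustration.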

Claims (4)

WHAT WE CLAIM IS:
1. Method for recognizing characters, in which in a learning phase as well as in a subsequent working phase features of character patterns in a number of aspects thereof are classified in a number of groups, during the learning phase the results of these classifications being recorded in a store as statistic frequencies and during the working phase the results of the feature classification of a freshly offered character being utilized in determining, for each class of characters, the probability of the features found in this character, characterized in that the features of a character pattern to be recognized are weighted, the weight attributed to each feature depending on the shape of the pattern, after which, on the basis of the values of these weights, features are selected, the stored statistic frequencies of which are multiplied by the values of the weights and then the largest value among the results is utilized for indicating the class.
2. Method according to claim 1, characterized in that configurations of character elements contributing to some specified feature are taken into account with the positive sign and configurations of character elements contributing to the remaining features in the group are taken into account with the negative sign, after which in each group the feature with the largest weight is selected.
3. Method according to claim 1, characterized in that the numerical values of the weights are normalized so as to make them independent of the sizes of the images.
4. Device for carrying out the method according to the preceding claims, comprising an image manipulator and a learning matrix, characterized by a number of processing units for computing the weights of the features found, per aspect, of the character image; a number of maximum selectors for choosing, from each group of features per aspect, the feature having the highest weight;
a number of coding devices for completing, during the learning phase, the store of the learning matrix for each of the highest values; a number of normalizing devices for establishing a relation between the value determined by the selectors and the size of the image in the same aspect; a number of logic circuits for determining the logarithmic values of the results obtained by the normalizing devices; a number of summing circuits for adding the logarithmic values obtained from the logic circuits to the logarithmic values of the values recorded per aspect in the learning matrix for the relevant feature in all the character classes; a second matrix for storing, for all the character classes, the logarithmic values found in the summing devices and a maximum selector for determining the highest value of the sum of the feature values for all the aspects of all classes of characters.
CA253532A 1975-06-02 1976-05-27 Method for recognizing characters Expired CA1054718A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
NL7506520.A NL165863C (en) 1975-06-02 1975-06-02 DEVICE FOR RECOGNIZING SIGNS.

Publications (1)

Publication Number Publication Date
CA1054718A (en) 1979-05-15

Family

ID=19823862

Family Applications (1)

Application Number Title Priority Date Filing Date
CA253532A Expired CA1054718A (en) 1975-06-02 1976-05-27 Method for recognizing characters

Country Status (10)

Country Link
US (1) US4066999A (en)
JP (1) JPS527636A (en)
BE (1) BE842422A (en)
CA (1) CA1054718A (en)
CH (1) CH616255A5 (en)
DE (1) DE2623861C3 (en)
FR (1) FR2313717A1 (en)
GB (1) GB1515202A (en)
NL (1) NL165863C (en)
SE (1) SE435219B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4177448A (en) * 1978-06-26 1979-12-04 International Business Machines Corporation Character recognition system and method multi-bit curve vector processing
DE2920041C2 (en) * 1979-05-18 1986-09-04 Philips Patentverwaltung Gmbh, 2000 Hamburg Method for verifying signals, and arrangement for carrying out the method
US4447715A (en) * 1980-10-30 1984-05-08 Vincent Vulcano Sorting machine for sorting covers
NL8006241A (en) * 1980-11-14 1982-06-01 Nederlanden Staat DEVICE FOR AUTOMATIC READING OF CHARACTERS.
US4578812A (en) * 1982-12-01 1986-03-25 Nec Corporation Digital image processing by hardware using cubic convolution interpolation
US4599693A (en) * 1984-01-16 1986-07-08 Itt Corporation Probabilistic learning system
US4593367A (en) * 1984-01-16 1986-06-03 Itt Corporation Probabilistic learning element
US4620286A (en) * 1984-01-16 1986-10-28 Itt Corporation Probabilistic learning element
US4599692A (en) * 1984-01-16 1986-07-08 Itt Corporation Probabilistic learning element employing context drive searching
US4805225A (en) * 1986-11-06 1989-02-14 The Research Foundation Of The State University Of New York Pattern recognition method and apparatus
JPH04346187A (en) * 1991-05-23 1992-12-02 Matsushita Electric Ind Co Ltd Quality decision method for subject to be detected
DE4133590A1 (en) * 1991-07-03 1993-01-14 Bosch Gmbh Robert METHOD FOR CLASSIFYING SIGNALS
DE4436408C1 (en) * 1994-10-12 1995-12-07 Daimler Benz Ag Pattern recognition system for automatic character reader
US7167587B2 (en) * 2002-08-30 2007-01-23 Lockheed Martin Corporation Sequential classifier for use in pattern recognition system
GB2434496B (en) * 2005-07-14 2007-10-31 Snell & Wilcox Ltd Method and apparatus for analysing image data
JP5663148B2 (en) * 2009-06-29 2015-02-04 アズビル株式会社 Counting device, physical quantity sensor, counting method and physical quantity measuring method
JP2011033525A (en) * 2009-08-04 2011-02-17 Yamatake Corp Counter, physical quantity sensor, counting method, and physical quantity measuring method
JP5702536B2 (en) * 2010-01-05 2015-04-15 アズビル株式会社 Velocity measuring apparatus and method
US8982336B2 (en) 2010-03-10 2015-03-17 Azbil Corporation Physical quantity sensor and physical quantity measuring method
CN114934467B (en) * 2022-07-08 2024-04-30 江苏永达电力金具有限公司 Parking space barrier control method, parking space barrier system and medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3275986A (en) * 1962-06-14 1966-09-27 Gen Dynamics Corp Pattern recognition systems
US3341814A (en) * 1962-07-11 1967-09-12 Burroughs Corp Character recognition
US3588823A (en) * 1968-03-28 1971-06-28 Ibm Mutual information derived tree structure in an adaptive pattern recognition system
GB1233459A (en) * 1968-07-15 1971-05-26
JPS518699B1 (en) * 1970-11-09 1976-03-19

Also Published As

Publication number Publication date
NL165863B (en) 1980-12-15
DE2623861C3 (en) 1979-03-08
SE7605957L (en) 1976-12-03
GB1515202A (en) 1978-06-21
JPS527636A (en) 1977-01-20
FR2313717B1 (en) 1979-06-22
DE2623861B2 (en) 1978-07-13
DE2623861A1 (en) 1976-12-09
CH616255A5 (en) 1980-03-14
BE842422A (en) 1976-09-16
SE435219B (en) 1984-09-10
JPS5744186B2 (en) 1982-09-20
US4066999A (en) 1978-01-03
NL7506520A (en) 1976-12-06
FR2313717A1 (en) 1976-12-31
NL165863C (en) 1981-05-15
