JPH0397085A - Segmenting system for routine format character - Google Patents

Segmenting system for routine format character

Info

Publication number
JPH0397085A
JPH0397085A JP1234852A JP23485289A JPH0397085A JP H0397085 A JPH0397085 A JP H0397085A JP 1234852 A JP1234852 A JP 1234852A JP 23485289 A JP23485289 A JP 23485289A JP H0397085 A JPH0397085 A JP H0397085A
Authority
JP
Japan
Prior art keywords
character
scanning direction
characters
horizontal scanning
vertical scanning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP1234852A
Other languages
Japanese (ja)
Inventor
Narihide Yamada
成英 山田
Yasuyoshi Kamata
鎌田 保善
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuji Electric Co Ltd
Fuji Facom Corp
Original Assignee
Fuji Electric Co Ltd
Fuji Facom Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuji Electric Co Ltd, Fuji Facom Corp
Priority to JP1234852A
Publication of JPH0397085A

Landscapes

  • Character Input (AREA)

Abstract

PURPOSE: To segment characters accurately and stably even when a binary picture contains noise or a missing character part, by taking the positions where the sum total of picture element cumulative values is largest for the character group as a whole as the character segmenting positions and setting a character segmenting frame from them.

CONSTITUTION: For the whole binary picture G composed of plural routine format characters, the sum total of the picture element cumulative values is calculated in the horizontal scanning direction and in the vertical scanning direction, and the position giving the maximum value in each direction is defined as the character segmenting position. Thus, even when part of the binary picture G contains noise or a missing character part, its influence is kept to a minimum because the picture is evaluated as a whole. The character segmenting positions in the horizontal and vertical scanning directions are therefore determined accurately, and a character segmenting frame 1 can be set.

Description

DETAILED DESCRIPTION OF THE INVENTION

(Field of Industrial Application) The present invention relates to a segmentation method for an optical character reader, which binarizes an image input by optical input means and reads characters such as letters and symbols (hereinafter collectively called "characters"). As preprocessing for character recognition, the method cuts fixed-format characters out of the binary image one character at a time.

(Prior Art) A known method of this kind accumulates the number of pixels making up the binary image along the horizontal scanning direction and the vertical scanning direction, successively obtains the sum of those cumulative values within the width of one character, and cuts out the characters one by one, taking the position where the sum is largest as the character segmentation position.
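The prior-art procedure can be sketched in a few lines of Python. This is an illustrative reading of the description rather than code from the patent: the function names `column_projection` and `best_window`, the list-of-rows image representation (1 for a foreground pixel), and the sample image are all assumptions.

```python
def column_projection(image):
    """Count foreground pixels (value 1) in each column of a binary image.

    `image` is a list of equal-length rows. Summing each column corresponds
    to accumulating pixels along the vertical scanning direction.
    """
    return [sum(column) for column in zip(*image)]


def best_window(projection, char_width):
    """Slide a window one character wide and return (best_offset, sums).

    sums[s] is the total of the projection inside the window that starts at
    column s. The prior-art method takes the offset with the largest total.
    """
    sums = [sum(projection[s:s + char_width])
            for s in range(len(projection) - char_width + 1)]
    best_offset = max(range(len(sums)), key=lambda s: sums[s])
    return best_offset, sums


# Hypothetical 3x5 image: a two-column stroke plus one stray pixel.
image = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 1, 0, 0],
]
projection = column_projection(image)
offset, sums = best_window(projection, char_width=2)
```

The same two helpers apply to the vertical scanning direction if the rows are summed instead of the columns.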

(Problem to Be Solved by the Invention) In the conventional method, when a binary image G such as the one in the upper part of Fig. 4 is input, the cumulative count of pixels g along the vertical scanning direction, plotted along the horizontal scanning direction, is as shown in the middle part of the figure. In the figure, n denotes noise in the binary image G. Adding these cumulative values within the range of the character width W to obtain a sum, while shifting the character width W one pixel at a time along the horizontal scanning direction, gives the result shown in the lower part of the figure.

In the lower part of Fig. 4, P1 to P15 denote the positions of the character width W as it is shifted one pixel at a time along the horizontal scanning direction, and the number attached to each character width W is the sum of the cumulative pixel values at that position. When each character of the binary image G is cut out one at a time within the range of the character width W, the noise n lying inside the windows makes the sum equal to 30 at position P(?) and at positions P(?) to P14 alike (the subscripts marked (?) are illegible in the source text), so the true segmentation position of the character "8" cannot be determined. Likewise, the true segmentation position of the character "2" is P2, where the sum is 24, but the sum at position P4 is larger, at 25, so P4 may be mistaken for the segmentation position. The character segmentation position is therefore unstable.
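The instability can be reproduced with a toy projection. The numbers below are invented for illustration and are not the values of Fig. 4; `window_sums` and `argmax` are hypothetical helper names. A character three columns wide with a thin left stroke sits at offset 2, and a three-pixel noise column directly to its right is enough to move the maximum.

```python
def window_sums(projection, char_width):
    """Sum of the projection inside each window of one character width."""
    return [sum(projection[s:s + char_width])
            for s in range(len(projection) - char_width + 1)]


def argmax(values):
    """Index of the first largest element."""
    return max(range(len(values)), key=lambda i: values[i])


# Column projection of one character (columns 2..4), without and with noise.
clean = [0, 0, 2, 6, 6, 0, 0, 0]
noisy = [0, 0, 2, 6, 6, 3, 0, 0]   # noise of 3 pixels in column 5

clean_sums = window_sums(clean, 3)  # [2, 8, 14, 12, 6, 0]
noisy_sums = window_sums(noisy, 3)  # [2, 8, 14, 15, 9, 3]

clean_position = argmax(clean_sums)  # 2, the true position
noisy_position = argmax(noisy_sums)  # 3, shifted by the noise
```

Because each character is judged from its own window only, a single noise column next to a weak stroke can outweigh that stroke.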

The present invention was proposed to solve these problems. Its object is to provide a segmentation method for fixed-format characters that cuts out characters stably even when the binary image contains noise or the like.

(Means for Solving the Problem) To achieve the above object, the present invention concerns a segmentation method for fixed-format characters in which, as preprocessing for recognizing fixed-format characters whose number, spacing, and circumscribing-frame size are known, a character segmentation frame is set based on the segmentation positions of the characters in the horizontal and vertical scanning directions and each character is cut out. For the binary image as a whole, reference positions along the horizontal scanning direction and the vertical scanning direction are set based on the number of characters, the spacing between the characters, and the circumscribing frame of each character. For each position obtained by shifting these reference positions in the horizontal and vertical scanning directions, the sum of the cumulative values of the constituent pixels of the binary image within that position is obtained. The positions at which these sums are largest are taken as the character segmentation positions in the horizontal and vertical scanning directions, respectively, and the character segmentation frame is set based on these positions.
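One way to read this step in code is as a comb-shaped template that covers all characters at once. The sketch below is an assumption-laden illustration, not the patent's implementation: it handles the horizontal direction only, assumes a single row of equally wide characters, and the names `group_sums`, `group_position`, and the parameter `pitch` (distance between the left edges of neighboring characters) are hypothetical.

```python
def group_sums(projection, n_chars, char_width, pitch):
    """Total of the projection under all character windows, for every shift.

    The template consists of n_chars windows of char_width columns whose
    left edges are pitch columns apart. sums[shift] adds up the projection
    under every window when the template starts at column `shift`.
    """
    span = (n_chars - 1) * pitch + char_width   # total template width
    sums = []
    for shift in range(len(projection) - span + 1):
        total = 0
        for k in range(n_chars):
            start = shift + k * pitch
            total += sum(projection[start:start + char_width])
        sums.append(total)
    return sums


def group_position(projection, n_chars, char_width, pitch):
    """Return (best_shift, sums), where best_shift maximizes the group total."""
    sums = group_sums(projection, n_chars, char_width, pitch)
    best_shift = max(range(len(sums)), key=lambda s: sums[s])
    return best_shift, sums


# Two invented characters with profile [2, 6, 6] at columns 1 and 6,
# plus a 3-pixel noise column right after the first character.
projection = [0, 2, 6, 6, 3, 0, 2, 6, 6, 0, 0, 0]
best_shift, sums = group_position(projection, n_chars=2, char_width=3, pitch=5)
```

With this input the group total peaks at shift 1, the true position of the first character, even though a window over the first character alone would score higher one column to the right (15 against 14).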

(Operation) According to the present invention, the sum of the cumulative pixel values in the horizontal scanning direction and in the vertical scanning direction is obtained for the whole binary image consisting of a plurality of fixed-format characters, and the position giving the maximum value is taken as the character segmentation position. Consequently, even when part of the binary image contains noise or a missing character part, the effect is kept to a minimum when seen over the whole image, so the segmentation positions in both scanning directions can be determined accurately and the character segmentation frame can be set.

(Embodiment) An embodiment of the present invention is described below with reference to the drawings. In this embodiment, as shown in Fig. 1, when a binary image G is input that consists of a plurality of fixed-format characters whose number, circumscribing-frame size, and spacing are known, a circumscribing frame 1 is first set for each character for convenience. Reference positions Fa and Fb in the horizontal scanning direction (X direction) and the vertical scanning direction (Y direction) are then set based on the circumscribing frames 1, the character spacing, and so on.

First, following the horizontal reference position Fa, the sum of the cumulative pixel values along the vertical scanning direction is obtained for the whole binary image G. As illustrated, for the positions P1 to P9 obtained by shifting the reference position Fa one pixel at a time along the horizontal scanning direction, the sums are 64 at P1, 78 at P2, 58 at P3, 67 at P4, 71 at P5, 61 at P6, 58 at P7, 46 at P8, and 54 at P9. The sum is largest at position P2, so P2 is taken as the character segmentation position in the horizontal scanning direction.
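The selection in this example is a plain maximum over the nine listed sums. In the short check below, the values are those given for Fig. 1; reading them in order as P1 to P9 is an assumption, since several subscripts are damaged in the source, and the variable names are illustrative.

```python
# Group sums listed for Fig. 1, assumed to run in order from P1 to P9.
fig1_sums = {
    "P1": 64, "P2": 78, "P3": 58,
    "P4": 67, "P5": 71, "P6": 61,
    "P7": 58, "P8": 46, "P9": 54,
}

# The horizontal segmentation position is the key with the largest sum.
horizontal_position = max(fig1_sums, key=fig1_sums.get)

# Margin between the best position and the runner-up.
ranked = sorted(fig1_sums.values(), reverse=True)
margin = ranked[0] - ranked[1]
```

The largest sum, 78, leads the next largest, 71, by 7, which agrees with the text's choice of P2.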

Similarly, for the vertical scanning direction, the sums of the cumulative pixel values are obtained following the reference position Fb, and the position Pn at which the sum is largest becomes the character segmentation position in the vertical scanning direction.

Based on the character segmentation positions Pa (= P2) and Pb (= Pn) determined in this way for the horizontal and vertical scanning directions, the character segmentation frame 1' is finally set as shown in Fig. 2, and the characters are cut out according to this frame 1' and recognized.
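Once the two positions are known, the frames follow from the known layout. The sketch below is one possible realization under stated assumptions: a single row of characters, frames given as `(left, top, right, bottom)` with exclusive right and bottom edges, and hypothetical names `segmentation_frames`, `crop`, and `pitch`. None of these come from the patent.

```python
def segmentation_frames(pa, pb, n_chars, char_width, char_height, pitch):
    """Build one frame per character from the positions Pa and Pb.

    pa is the column of the first character's left edge, pb the row of the
    top edge; pitch is the distance between neighboring left edges.
    """
    return [
        (pa + k * pitch, pb, pa + k * pitch + char_width, pb + char_height)
        for k in range(n_chars)
    ]


def crop(image, frame):
    """Cut the pixels inside `frame` out of a list-of-rows image."""
    left, top, right, bottom = frame
    return [row[left:right] for row in image[top:bottom]]


# Invented layout: two characters, 3 columns wide and 4 rows high, pitch 5.
frames = segmentation_frames(pa=1, pb=2, n_chars=2,
                             char_width=3, char_height=4, pitch=5)
```

Each cropped sub-image can then be handed to the recognizer.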

Accordingly, when a binary image G such as the one shown in Fig. 3 is input, the sums of the cumulative pixel values at the positions obtained by shifting the horizontal reference position one pixel at a time are 37 at P1, 43 at P2, 30 at P3, 34 at P4, 30 at P5, and so on, and position P2, which gives the maximum, becomes the character segmentation position Pa in the horizontal scanning direction.

The character segmentation position Pb is obtained in the same way for the vertical scanning direction, and the character segmentation frame 1' is obtained from these positions.

As a result, even when the binary image G contains noise n as shown in Figs. 2 and 3, or a slightly missing part n' of a character as shown in Fig. 3, the noise n and the missing part n' have little effect on the sums of the cumulative pixel values in the horizontal and vertical scanning directions, because those sums are computed over the image as a whole. Accurate detection of the character segmentation positions is therefore not impaired.

(Effects of the Invention) As described above, for a plurality of fixed-format characters whose number, circumscribing-frame size, and spacing are known, the present invention sets the character segmentation frame by taking, in each of the horizontal and vertical scanning directions, the position where the sum of the cumulative pixel values is largest for the character group as a whole. Compared with the conventional method, which obtains the sum for each character individually and derives the segmentation position from its maximum, characters can be cut out accurately and stably even when the binary image contains noise or missing character parts, and the accuracy of the subsequent character recognition can be greatly improved.

In addition, because the segmentation positions are obtained from one-dimensional values, namely the sums of the cumulative pixel values along the horizontal and vertical scanning directions, the segmentation processing as a whole takes little time.
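Working on one-dimensional projections also makes the search cheap to implement. As one possible optimization, which the patent itself does not describe, a prefix-sum array lets each character window be evaluated with a single subtraction. The function name `group_sums_prefix` and the parameter `pitch` are hypothetical, and a single row of equally wide characters is assumed.

```python
from itertools import accumulate


def group_sums_prefix(projection, n_chars, char_width, pitch):
    """Group totals for every shift, using prefix sums of the projection.

    prefix[i] holds the sum of projection[0:i], so the total inside a window
    [start, start + char_width) is prefix[start + char_width] - prefix[start].
    """
    prefix = [0] + list(accumulate(projection))
    span = (n_chars - 1) * pitch + char_width
    sums = []
    for shift in range(len(projection) - span + 1):
        total = 0
        for k in range(n_chars):
            start = shift + k * pitch
            total += prefix[start + char_width] - prefix[start]
        sums.append(total)
    return sums


# Invented projection: two characters with profile [2, 6, 6] and one noise column.
projection = [0, 2, 6, 6, 3, 0, 2, 6, 6, 0, 0, 0]
fast_sums = group_sums_prefix(projection, n_chars=2, char_width=3, pitch=5)
best_shift = max(range(len(fast_sums)), key=lambda s: fast_sums[s])
```

The cost per shift then depends on the number of characters rather than on the number of pixels under the template.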

[Brief Description of the Drawings]

Figs. 1 to 3 illustrate, for an embodiment of the present invention, the character segmentation positions in the horizontal and vertical scanning directions obtained from a binary image. Fig. 4 illustrates the prior art and shows the sums of the cumulative pixel values for a binary image together with the character segmentation positions in the horizontal scanning direction.

G: binary image
g: pixel
Fa, Fb: reference positions
P1, ..., Pn-1, Pn, Pn+1: positions
Pa, Pb: character segmentation positions

Claims (1)

[Claims] A segmentation method for fixed-format characters in which, as preprocessing for recognizing fixed-format characters for which the number of characters making up a binary image, the spacing between the characters, and the size of the circumscribing frame of each character are known, a character segmentation frame is set based on segmentation positions of the characters in the horizontal scanning direction and the vertical scanning direction and each character is cut out, characterized in that: reference positions along the horizontal scanning direction and the vertical scanning direction are set for the binary image as a whole based on the number of characters, the spacing between the characters, and the circumscribing frame of each character; for each position obtained by shifting these reference positions in the horizontal scanning direction and the vertical scanning direction, the sum of the cumulative values of the constituent pixels of the binary image within that position is obtained; the positions at which these sums are largest are determined as the character segmentation positions in the horizontal scanning direction and the vertical scanning direction, respectively; and the character segmentation frame is set based on these character segmentation positions.
JP1234852A 1989-09-11 1989-09-11 Segmenting system for routine format character Pending JPH0397085A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP1234852A JPH0397085A (en) 1989-09-11 1989-09-11 Segmenting system for routine format character

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP1234852A JPH0397085A (en) 1989-09-11 1989-09-11 Segmenting system for routine format character

Publications (1)

Publication Number Publication Date
JPH0397085A true JPH0397085A (en) 1991-04-23

Family

ID=16977365

Family Applications (1)

Application Number Title Priority Date Filing Date
JP1234852A Pending JPH0397085A (en) 1989-09-11 1989-09-11 Segmenting system for routine format character

Country Status (1)

Country Link
JP (1) JPH0397085A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20160133634A (en) * 2015-05-13 2016-11-23 이해룡 A Mask Unit Having Defensing of a Glass Steam Removing

Similar Documents

Publication Publication Date Title
US7336813B2 (en) System and method of determining image skew using connected components
EP0543593B1 (en) Method for determining boundaries of words in text
JP2822189B2 (en) Character recognition apparatus and method
EP0543590B1 (en) Method for comparing word shapes
RU2621601C1 (en) Document image curvature eliminating
EP0543594A2 (en) A method for deriving wordshapes for subsequent comparison
JPH09311905A (en) Line detecting method and character recognition device
JPH0397085A (en) Segmenting system for routine format character
JPH07230525A (en) Method for recognizing ruled line and method for processing table
JPS58201182A (en) Character and graph demarcating method
JPH0581474A (en) Character string extracting method and character area detecting method
JP6613625B2 (en) Image processing program, image processing apparatus, and image processing method
JPS6316392A (en) Character recognizing device
JPS6254380A (en) Character recognizing device
JPH03160582A (en) Method for separating ruled line and character in document picture data
JPH02166583A (en) Character recognizing device
JPS6343788B2 (en)
JPS63101983A (en) Character string extracting system
JPH02187883A (en) Document reader
JPH01194086A (en) Character reader
CN115115818A (en) Subtitle recognition method and system based on twin network and image feature matching
JPS62169286A (en) Character segmenting system
JPH0934992A (en) On-line handwritten character string segmenting device
JPH0773273A (en) Pattern segmenting and recognizing method and its system
JPH05274472A (en) Image recognizing device