JP2518063B2 - Character cutting method and device - Google Patents

Character cutting method and device

Info

Publication number
JP2518063B2
Authority
JP
Japan
Prior art keywords
character
character string
function
model function
characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP1280482A
Other languages
Japanese (ja)
Other versions
JPH03141484A (en)
Inventor
弥生 佐藤
淳 津雲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
Nippon Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Electric Co Ltd
Priority to JP1280482A
Publication of JPH03141484A
Application granted
Publication of JP2518063B2
Anticipated expiration
Expired - Fee Related (current legal status)

Landscapes

  • Character Input (AREA)

Description

DETAILED DESCRIPTION OF THE INVENTION

(Field of Industrial Application) The present invention relates to a character segmentation (character cutting) method and apparatus for an optical character reader (OCR).

(Prior Art) Optical character readers are in practical use as one means of entering information into a computer. Most character recognition methods used for handwritten characters, however, target characters written in individual character entry boxes, that is, characters that are already separated one by one. In recent years, methods for recognizing handwritten character strings written in a string entry field, such as one bounded by ruled lines, have also been proposed with the aim of making OCR easier to use. Character segmentation is one processing step in character string recognition. Known studies of character segmentation include "A Basic Study of Character Segmentation from Handwritten Japanese Character Strings," Babaguchi, Tsukamoto, and Aihara, Transactions of the Institute of Electronics and Communication Engineers of Japan (IECE) (D), Vol. J68-D, No. 12, 1985, pp. 2123-2131, and "A Study of Handwritten Character Segmentation Using Marginal Distributions," Ariyoshi, Okazawa, and Maeda, Proceedings of the 1985 (Showa 60) IECE General National Convention, No. 1555. These studies assume the following:

1. The circumscribed rectangle of a character is nearly square.

2. Characters within the same character string are roughly equal in size.

The methods in these studies hold only under these assumptions. In practice, much data does not satisfy them, and character widths and the aspect ratios of circumscribed rectangles vary widely. These methods are also ineffective for patterns in which adjacent characters touch, for characters that split into a left-hand radical (hen) and a right-hand component (tsukuri) each nearly one character wide, and for strings whose character widths vary greatly. Studies that use character recognition results, in addition to the graphical features of the string image, for segmentation include "Online Recognition of Unframed Handwritten Character Strings by the Candidate Character Lattice Method," Murase, Wakahara, and Umeda, Transactions of the IECE (D), J68-D, No. 4, 1989, pp. 765-772, and "A Character Segmentation Method for Handwritten Japanese Documents," Yoda, Matsuura, Maeda, and Nanbu, IECE Technical Report, SP86-35, 1986, pp. 67-76. Some of these methods share the problems described above. Others try to absorb the variation by writing rules that contain parameters, but the parameters are difficult to tune for better segmentation performance.

(Problem to Be Solved by the Invention) An object of the present invention is to provide a character segmentation method for character strings whose number of characters is known in advance, such as ID numbers, product model numbers, and telephone numbers. The method removes the drawbacks of the prior art described above: it absorbs variation in character width and character spacing, is robust to touching characters, and requires almost no parameter tuning. A further object is to provide an apparatus that implements the method.

(Means for Solving the Problem) According to the present invention, a character segmentation method is provided for optically reading a character string whose number of characters is known and cutting out, from the resulting character string image, partial images each corresponding to one character. The method comprises: extracting a one-dimensional sequence feature from the character string image; defining a model function that corresponds to the number of characters and the one-dimensional sequence feature and from which character cut positions can be identified; non-linearly matching the one-dimensional sequence feature against the model function; obtaining, from the non-linear correspondence function produced by the matching, the cut positions in the character string image that correspond to the cut positions of the model function; and cutting out the partial image corresponding to each character at those cut positions.

The invention further provides a character segmentation apparatus for the same purpose, comprising: character string image storage means for storing an optically scanned character string image; character count storage means for storing the number of characters; one-dimensional sequence feature extraction means for reading the character string image from the character string image storage means and extracting a one-dimensional sequence feature of the string; one-dimensional sequence feature storage means for storing the one-dimensional sequence feature; model function generation means for creating, according to the one-dimensional sequence feature and the number of characters, a model function from which character cut positions can be identified; model function storage means for storing the model function; non-linear matching means for performing non-linear matching between the one-dimensional sequence feature and the model function and storing the resulting non-linear correspondence function; character cut position determination means for finding the cut positions of the model function, determining the corresponding cut positions in the character string image from the non-linear correspondence function, and storing them; and character cutting means for cutting out, at those cut positions, the partial images each corresponding to one character of the character string image.

(Operation) The principle of the character segmentation method of the present invention is described in detail with reference to the drawings. The description uses a horizontally written character string image, but the same principle applies to vertically written strings. Let N be the number of characters in the character string. FIG. 2(a) shows an example of a horizontally written character string.

A one-dimensional sequence feature is extracted from this character string image. One example is the projection function of FIG. 2(b), a projection histogram obtained by counting the black pixels in the vertical direction, column by column. Even where characters touch, this projection function is expected to take a local minimum at each character cut position. Finding the cut positions therefore amounts to finding N-1 local minima of the projection function such that a character-like mass lies between each adjacent pair.

To do this, a model function such as that of FIG. 2(c) is defined from quantities obtained from the one-dimensional sequence feature; it has N local maxima alternating with N-1 local minima. The distance between adjacent minima corresponds to the character pitch, and each minimum corresponds to a character cut position. The model function and the one-dimensional sequence feature are then matched non-linearly, for example by DP matching: the two are stretched and compressed along the one-dimensional axis, with a stretch ratio that is not uniform over all points but varies locally. DP matching is a well-known technique, described for example in "Continuous Speech Recognition Based on Time Normalization Using Dynamic Programming," Sakoe and Chiba, Journal of the Acoustical Society of Japan, 27-9 (1971), and is explained in detail later. This matching absorbs variation in character width and character spacing. Because it can align each minimum of the model function with the neighborhood of a minimum of the one-dimensional sequence feature, it also works for touching characters, which were difficult for the prior art to segment.

From the correspondence function, which expresses the correspondence between the one-dimensional sequence feature and the model function under the non-linear matching, the point of the feature corresponding to each minimum of the model function is obtained; these points are the character cut positions in the string image. The partial image between adjacent cut positions is treated as the partial image of one character, which completes the segmentation. In this way the invention makes it possible to segment characters from string images with large variation in character width and spacing, and from string images in which characters touch, both of which were problems for the prior art.
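The projection function described above can be illustrated with a minimal Python sketch. The image representation (a list of rows holding 0 for white and 1 for black) and the function name `projection` are assumptions made for illustration; the source does not prescribe any data layout.

```python
def projection(image):
    """Count black pixels (value 1) in each column of a binary image.

    `image` is a list of rows; each row is a list of 0/1 values
    (an assumed layout). Returns f, where f[x] is the number of
    black pixels in column x.
    """
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


# Two blobs joined by a one-pixel bridge: the touching columns still
# have the lowest counts between the two blobs.
image = [
    [0, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1, 0],
]
f = projection(image)
print(f)  # [1, 3, 3, 1, 1, 3, 3, 1]
```

The toy image shows why a projection remains informative for touching characters: the bridge keeps the column count above zero, yet the count still dips where the characters meet.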

(Embodiment) FIG. 1 is a block diagram showing the configuration of one embodiment of the character segmentation apparatus of the present invention. The character string image storage means 1 is an ordinary storage means that stores a binarized character string image. The character count storage means 2 is an ordinary storage means that stores the number of characters contained in the string image. The one-dimensional sequence feature extraction means 3 reads the string image from storage means 1 as signal 10, scans the image in the direction perpendicular to the string direction, counts the black pixels, and extracts as the one-dimensional sequence feature a projection function f(x) representing a projection histogram like that of FIG. 2(b). This is easily realized with conventional techniques. The one-dimensional sequence feature storage means 4 is an ordinary storage means that stores the feature received as signal 11. The model function generation means 5 reads the feature from storage means 4 as signal 13 and the character count N from storage means 2 as signal 12, and defines a one-variable model function corresponding to the feature and the character count. One example is the function m(x) of FIG. 2(c). The width W of the whole string and the total number A of black pixels in the string image are obtained from the one-dimensional sequence feature, and m(x) is defined for the character count N by the following equations, which is easily realized with conventional techniques.

$$m(x) = \left| h \sin\!\left( \frac{\pi x}{w} \right) \right|, \qquad w = \frac{W}{N}, \qquad h = \frac{\pi A}{2W}$$

Here h is defined so that the area under the model function m(x) equals the area under the projection function f(x): the area under m(x) over the string width W is 2hW/π, which equals A for this h. The model function storage means 6 is an ordinary storage means that stores, as signal 14, the model function defined by the model function generation means 5. The non-linear matching means 7 reads the one-dimensional sequence feature from storage means 4 via signal 15 and the model function from storage means 6 via signal 16, performs a non-linear matching such as DP matching, which stretches and compresses the two functions along the one-dimensional axis, and stores in an ordinary storage means the correspondence function expressing the correspondence between the two functions. DP matching, one form of non-linear matching, is a well-known technique. Here it serves as the means of searching for the path c = (c_1, c_2), shown in FIG. 3, that as the correspondence function gives the minimum value T of the following evaluation function.

[Evaluation function: equation missing from this text.]
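As a rough illustration of DP matching between a sampled feature and a sampled model function, the sketch below uses a generic dynamic-time-warping recurrence. Its local cost (absolute difference) and step pattern (advance one index in either or both sequences) are assumptions chosen for simplicity; they are not the patent's evaluation function, which is not reproduced in this text. The name `dp_match` is likewise hypothetical.

```python
def dp_match(f, m):
    """Align non-empty sequences f and m by dynamic programming.

    Returns (total_cost, path), where path is a list of index pairs
    (i, j) running from (0, 0) to (len(f) - 1, len(m) - 1).
    Assumed local cost: |f[i] - m[j]|. Assumed steps: advance i, j,
    or both by one.
    """
    n, k = len(f), len(m)
    inf = float("inf")
    cost = [[inf] * k for _ in range(n)]
    back = [[None] * k for _ in range(n)]
    for i in range(n):
        for j in range(k):
            d = abs(f[i] - m[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            candidates = []
            if i > 0:
                candidates.append((cost[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((cost[i][j - 1], (i, j - 1)))
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], (i - 1, j - 1)))
            best, prev = min(candidates)
            cost[i][j] = best + d
            back[i][j] = prev
    # Trace the minimum-cost path back from the end point.
    i, j = n - 1, k - 1
    path = [(i, j)]
    while back[i][j] is not None:
        i, j = back[i][j]
        path.append((i, j))
    path.reverse()
    return cost[n - 1][k - 1], path


# A feature with one wide and one narrow hump, matched to a regular model.
feature = [0, 2, 2, 0, 0, 2, 0]
model = [0, 2, 0, 2, 0]
total, path = dp_match(feature, model)
print(total, path)
```

In this toy case the path pairs both samples of the wide hump with the single model peak, which is the locally varying stretch the description refers to.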

The character cut position determination means 8 reads the correspondence function from the non-linear matching means 7 via signal 17 and the model function from storage means 6 via signal 18. It first finds the positions of the N-1 local minima of the model function, then uses the correspondence function to find the corresponding points in the one-dimensional sequence feature, and stores them. These steps are easily realized with conventional techniques. The character cutting means 9 reads the cut positions from the determination means 8 via signal 19 and the string image from the character string image storage means 1 via signal 20, finds the circumscribed rectangle of the black pixels lying between adjacent cut positions, and stores that partial image as a single-character image, thereby cutting the string image into individual characters. This too is easily realized with conventional techniques.
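The step of carrying the model function's minima over to the image through the correspondence function might look like the following sketch. It assumes the correspondence is available as a list of index pairs (feature index, model index) that covers every model index; the midpoint rule for ties and the names `cut_positions` and `column_ranges` are illustrative choices, not taken from the source.

```python
def cut_positions(path, model_minima):
    """Map each model minimum j to a feature index via the matching path.

    `path` is a list of (i, j) pairs in increasing order, pairing
    feature index i with model index j. When several i are paired with
    the same j, their midpoint is used (an assumption; the source does
    not state a tie rule).
    """
    by_model = {}
    for i, j in path:
        by_model.setdefault(j, []).append(i)
    cuts = []
    for j in model_minima:
        matched = by_model[j]
        cuts.append((matched[0] + matched[-1]) // 2)
    return cuts


def column_ranges(cuts, width):
    """Turn cut positions into half-open column ranges, one per character."""
    edges = [0] + list(cuts) + [width]
    return list(zip(edges[:-1], edges[1:]))


# Hypothetical correspondence: feature columns 3 and 4 both map to the
# model's single interior minimum at model index 2.
path = [(0, 0), (1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 4)]
cuts = cut_positions(path, model_minima=[2])
print(cuts, column_ranges(cuts, width=7))  # [3] [(0, 3), (3, 7)]
```

Each column range would then be narrowed to the circumscribed rectangle of its black pixels, as the description of the character cutting means 9 states.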

The present invention has been described in detail by way of an embodiment, but it is not limited to that embodiment. For example, the embodiment assumes a binary string image, but multi-level images can also be handled. In that case a projection histogram that accumulates pixel values in the direction perpendicular to the string direction can be used as the one-dimensional sequence feature. Even for binary images, a function like that of FIG. 4(a) can be used as the feature: the image is scanned perpendicular to the string direction, and the function gives the maximum distance between black pixels along each scan line. The model function of the embodiment is likewise only one example. The invention can be practiced with any one-variable function in which, for character count N, N-1 local minima alternate with N local maxima and which represents the grouping, as character shapes, of the one-dimensional sequence feature obtained from the input image. For example, the following function can be used as the model function m(x) (see FIG. 4(b)).

$$m(x) = h \left( 1 - \left| \cos\!\left( \frac{\pi x}{w} \right) \right| \right)$$

where h and w are defined by the following equations, with W the width of the whole string, A the total number of black pixels in the string image, and N the number of characters.

$$h = \frac{\pi A}{W (\pi - 2)}, \qquad w = \frac{W}{N}$$

Here h is defined so that the area under the projection function f(x) equals the area under the model function.
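Both model functions given in this description can be checked numerically. The sketch below evaluates each one and estimates its area over the string width with a midpoint rule; the values of W, N, and A and the function names are hypothetical, chosen only to show that each choice of h makes the area match the black-pixel total A.

```python
import math


def model_sine(x, W, N, A):
    """m(x) = |h * sin(pi * x / w)| with w = W / N and h = pi * A / (2 * W)."""
    w = W / N
    h = math.pi * A / (2 * W)
    return abs(h * math.sin(math.pi * x / w))


def model_cosine(x, W, N, A):
    """m(x) = h * (1 - |cos(pi * x / w)|) with h = pi * A / (W * (pi - 2))."""
    w = W / N
    h = math.pi * A / (W * (math.pi - 2))
    return h * (1 - abs(math.cos(math.pi * x / w)))


def area(model, W, N, A, steps=20000):
    """Midpoint-rule estimate of the integral of the model over [0, W]."""
    dx = W / steps
    return sum(model((k + 0.5) * dx, W, N, A) for k in range(steps)) * dx


# Hypothetical string: 120 columns wide, 4 characters, 900 black pixels.
W, N, A = 120.0, 4, 900.0
print(area(model_sine, W, N, A), area(model_cosine, W, N, A))  # both close to A
print(model_sine(30.0, W, N, A), model_cosine(30.0, W, N, A))  # near 0 at x = w
```

Both functions vanish at multiples of w = W/N, so each has N-1 interior minima at x = w, 2w, ..., (N-1)w and N maxima halfway between them.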

(Effects of the Invention) As described above, according to the present invention, when the number of characters in a character string is given, individual characters can be cut out from string images with relatively large variation in character width or spacing and from string images in which characters touch, and this is done by a simple method with relatively few parameters.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of one embodiment of the character segmentation apparatus of the present invention, FIGS. 2 and 3 are diagrams for explaining the principle of the present invention, and FIG. 4 is a diagram showing another embodiment. In the figures, 1 is the character string image storage means, 2 is the character count storage means, 3 is the one-dimensional sequence feature extraction means, 4 is the one-dimensional sequence feature storage means, 5 is the model function generation means, 6 is the model function storage means, 7 is the non-linear matching means, 8 is the character cut position determination means, and 9 is the character cutting means.

Claims (2)

(57) [Claims]

1. A character segmentation method for optically reading a character string whose number of characters is known and cutting out, from the resulting character string image, a partial image corresponding to one character, the method characterized by: extracting a one-dimensional sequence feature from the character string image; defining a model function that corresponds to the number of characters and the one-dimensional sequence feature and from which character cut positions can be identified; non-linearly matching the one-dimensional sequence feature and the model function; obtaining, from the non-linear correspondence function of the non-linear matching, the character cut positions of the character string image that correspond to the character cut positions of the model function; and cutting out the partial image corresponding to one character at the character cut positions.

2. A character segmentation apparatus for optically reading a character string whose number of characters is known and cutting out, from the resulting character string image, a partial image corresponding to one character, the apparatus characterized by comprising: character string image storage means for storing an optically scanned character string image; character count storage means for storing the number of characters; one-dimensional sequence feature extraction means for reading the character string image from the character string image storage means and extracting a one-dimensional sequence feature of the character string; one-dimensional sequence feature storage means for storing the one-dimensional sequence feature; model function generation means for creating, according to the one-dimensional sequence feature and the number of characters, a model function from which character cut positions can be identified; model function storage means for storing the model function; non-linear matching means for performing non-linear matching between the one-dimensional sequence feature and the model function and storing the non-linear correspondence function; character cut position determination means for determining, from the non-linear correspondence function, the character cut positions of the character string image that correspond to the character cut positions of the model function, and storing them; and character cutting means for cutting out, at the character cut positions, the partial image corresponding to one character of the character string image.

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP1280482A JP2518063B2 (en) 1989-10-26 1989-10-26 Character cutting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP1280482A JP2518063B2 (en) 1989-10-26 1989-10-26 Character cutting method and device

Publications (2)

Publication Number Publication Date
JPH03141484A JPH03141484A (en) 1991-06-17
JP2518063B2 true JP2518063B2 (en) 1996-07-24

Family

ID=17625693

Family Applications (1)

Application Number Title Priority Date Filing Date
JP1280482A Expired - Fee Related JP2518063B2 (en) 1989-10-26 1989-10-26 Character cutting method and device

Country Status (1)

Country Link
JP (1) JP2518063B2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPWO2006068269A1 (en) 2004-12-24 2008-08-07 日本電気株式会社 Image structuring apparatus and method
CN110516125B (en) * 2019-08-28 2020-05-08 拉扎斯网络科技(上海)有限公司 Method, device and equipment for identifying abnormal character string and readable storage medium

Also Published As

Publication number Publication date
JPH03141484A (en) 1991-06-17

Similar Documents

Publication Publication Date Title
CA2077970C (en) Optical word recognition by examination of word shape
Huang et al. Off-line signature verification based on geometric feature extraction and neural network classification
Kim et al. An architecture for handwritten text recognition systems
Kim et al. A lexicon driven approach to handwritten word recognition for real-time applications
US5321770A (en) Method for determining boundaries of words in text
Sabourin et al. Off-line identification with handwritten signature images: survey and perspectives
US5687253A (en) Method for comparing word shapes
US5390259A (en) Methods and apparatus for selecting semantically significant images in a document image without decoding image content
Cattoni et al. Geometric layout analysis techniques for document image understanding: a review
US5050222A (en) Polygon-based technique for the automatic classification of text and graphics components from digitized paper-based forms
US5640466A (en) Method of deriving wordshapes for subsequent comparison
Chakraborty et al. Does deeper network lead to better accuracy: a case study on handwritten Devanagari characters
Demilew et al. Ancient Geez script recognition using deep learning
US20010033694A1 (en) Handwriting recognition by word separation into sillouette bar codes and other feature extraction
Hamida et al. Handwritten arabic words recognition system based on hog and gabor filter descriptors
Waqar et al. Meter digit recognition via Faster R-CNN
Lyu et al. The early japanese books text line segmentation base on image processing and deep learning
JP2518063B2 (en) Character cutting method and device
Zou et al. Extracting strokes from static line images based on selective searching
Srinivas et al. An overview of OCR research in Indian scripts
Roth An approach to recognition of printed music
Tripathy et al. System for Oriya handwritten numeral recognition
Shah et al. Signature recognition and verification: The most acceptable biometrics for security
JP2518067B2 (en) Character cutting method and device
Nishida et al. A model-based split-and-merge method for recognition and segmentation of character strings

Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090517

Year of fee payment: 13

LAPS Cancellation because of no payment of annual fees