JP2518063B2 - Character cutting method and device

Info

Publication number: JP2518063B2
Application number: JP1280482A
Authority: JP
Inventors: 弥生佐藤; 淳津雲
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1989-10-26
Filing date: 1989-10-26
Publication date: 1996-07-24
Anticipated expiration: 2011-07-24
Also published as: JPH03141484A

Description

[Detailed Description of the Invention] (Field of Industrial Application) The present invention relates to a character segmentation method and apparatus for use in an optical character reader (OCR).

(Prior Art) Optical character readers have been put to practical use as one means of entering information into a computer. Most character recognition methods used for handwritten characters, however, target characters written inside character entry boxes, that is, characters that are already separated one by one. In recent years, with the aim of making OCR easier to use, methods have also been proposed for recognizing handwritten character strings written inside a character-string entry frame such as ruled lines. Character segmentation is one processing step in character string recognition. Known studies of character segmentation include the techniques described in "Basic Ideas on Character Segmentation from Handwritten Japanese Character Strings," Babaguchi, Tsukamoto, and Aihara, Transactions of the Institute of Electronics and Communication Engineers (D), Vol. J68-D, No. 12, 1985, pp. 2123-2131, and "A Study of Handwritten Character Segmentation Using Marginal Distributions," Ariyoshi, Okazawa, and Maeda, Proceedings of the 1985 (Showa 60) National Convention of the Institute of Electronics and Communication Engineers, No. 1555. These studies rely on assumptions such as the following:

1. The circumscribed rectangle of a character is nearly square.

2. The characters in a single character string are nearly equal in size.

The methods in these studies are character segmentation methods that depend on such assumptions. In practice, however, much data does not satisfy them, and both character width and the aspect ratio of the circumscribed rectangle vary widely. These methods are also ineffective for patterns in which characters touch each other, for characters that split into a left-hand radical (hen) and a right-hand component (tsukuri) each nearly one character wide, and for strings in which character width varies greatly. Studies that use character recognition results in addition to the geometric features of the character string image include "Online Recognition of Unframed Handwritten Character Strings by the Candidate Character Lattice Method," Murase, Wakahara, and Umeda, Transactions of the Institute of Electronics and Communication Engineers (D), J68-D, No. 4, 1989, pp. 765-772, and "A Character Segmentation Method for Handwritten Japanese Documents," Yoda, Matsuura, Maeda, and Nanbu, Technical Report of the Institute of Electronics and Communication Engineers, SP86-35, 1986, pp. 67-76. These methods either share the problems described above or try to absorb the variation by building rules that contain parameters, in which case tuning the parameters to improve segmentation performance is difficult.

(Problems to Be Solved by the Invention) An object of the present invention is to provide a character segmentation method for character strings whose number of characters is known in advance, such as ID numbers, product model numbers, and telephone numbers. The method eliminates the drawbacks of the prior art described above: it absorbs variation in character width and character spacing, is robust to touching characters, and requires almost no parameter tuning. A further object is to provide an apparatus that implements the method.

(Means for Solving the Problems) According to the present invention, a character segmentation method is provided for optically reading a character string whose number of characters is known and cutting out, from the character string image, partial images each corresponding to one character. The method is characterized by: extracting a one-dimensional sequence feature from the character string image; defining a model function that corresponds to the number of characters and to the one-dimensional sequence feature and from which character segmentation positions can be identified; nonlinearly matching the one-dimensional sequence feature against the model function; obtaining, from the nonlinear correspondence function of the nonlinear matching, the segmentation positions in the character string image that correspond to the segmentation positions of the model function; and cutting out the partial images each corresponding to one character at those segmentation positions.

The invention further provides a character segmentation apparatus for optically reading a character string whose number of characters is known and cutting out, from the character string image, partial images each corresponding to one character. The apparatus is characterized by comprising: character string image storage means for storing an optically scanned character string image; character count storage means for storing the number of characters; one-dimensional sequence feature extraction means for reading the character string image from the character string image storage means and extracting a one-dimensional sequence feature of the character string; one-dimensional sequence feature storage means for storing the one-dimensional sequence feature; model function generation means for creating a model function that depends on the one-dimensional sequence feature and the number of characters and from which character segmentation positions can be identified; model function storage means for storing the model function; nonlinear matching means for performing nonlinear matching between the one-dimensional sequence feature and the model function and storing the resulting nonlinear correspondence function; character segmentation position determination means for obtaining the segmentation positions of the model function, determining the segmentation positions of the character string image from the nonlinear correspondence function, and storing them; and character cutting means for cutting out, at those segmentation positions, the partial images of the character string image each corresponding to one character.

(Operation) The principle of the character segmentation method of the present invention is described in detail with reference to the drawings. The description uses a horizontally written character string image, but the same principle applies to vertically written character strings. Let N be the number of characters in the character string. FIG. 2(a) shows an example of a horizontally written character string.

A one-dimensional sequence feature is extracted from this character string image. One possible feature is a projection function representing a projection histogram, as in FIG. 2(b), obtained by counting the black pixels in the vertical direction. Even when characters touch, this projection function can be expected to take a local minimum at each character segmentation position. Finding the segmentation positions therefore amounts to finding, in the projection function, N-1 local minima between which character-like masses lie.

To that end, a model function such as the one in FIG. 2(c) is defined from quantities obtained from the one-dimensional sequence feature so that it has N local maxima and N-1 local minima in alternation. The distance between adjacent minima corresponds to the character pitch, and each minimum corresponds to a character segmentation position. The model function and the one-dimensional sequence feature are then matched nonlinearly, for example by DP matching, while being stretched and compressed along the one-dimensional axis; the stretching is not uniform at every point but varies locally. DP matching is a well-known technique, described for example in "Continuous Speech Recognition Based on Time Normalization Using Dynamic Programming," Sakoe and Chiba, Journal of the Acoustical Society of Japan, 27-9 (1971), and is explained in detail later. This matching absorbs variation in character width and character spacing. Because it can also bring the minima of the model function into correspondence with the neighborhood of the minima of the one-dimensional sequence feature, it works effectively for touching characters, which were difficult for the prior art.

From the correspondence function that expresses how the one-dimensional sequence feature and the model function correspond in the nonlinear matching, the point of the one-dimensional sequence feature corresponding to each minimum of the model function is obtained; these points are the character segmentation positions of the character string image. Treating the partial image between adjacent segmentation positions as the partial image of one character accomplishes the segmentation. In this way, the present invention makes it possible to cut out characters from character string images with large variation in character width and character spacing and from character string images in which characters touch, both of which were problems for the prior art.
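As a minimal sketch of the projection function of FIG. 2(b), the following Python code counts the black pixels in each column of a horizontally written string image. The function name `projection_function`, the use of NumPy, and the convention that black pixels are stored as 1 are assumptions made for illustration; the patent does not prescribe an implementation.

```python
import numpy as np

def projection_function(binary_image):
    """Count black pixels in each column of a horizontally written string image.

    Assumption: black (ink) pixels are stored as 1 and background pixels as 0.
    """
    image = np.asarray(binary_image)
    return image.sum(axis=0)

# Two blobs, each 3 columns wide, joined by a one-pixel bridge in column 3
# (a toy stand-in for two touching characters).
image = np.array([
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
])
f = projection_function(image)
print(f)  # [3 3 3 1 3 3 3]
```

In this toy image the two blobs touch, so no column is empty, yet the projection still has a local minimum at the column where they join. That is the property the matching step relies on.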

(Embodiment) FIG. 1 is a block diagram showing the configuration of an embodiment of the character segmentation apparatus of the present invention. The character string image storage means 1 is an ordinary storage means that stores a binarized character string image. The character count storage means 2 is an ordinary storage means that stores the number of characters contained in the character string image. The one-dimensional sequence feature extraction means 3 reads the character string image from the character string image storage means 1 as signal 10, scans the image in the direction perpendicular to the character string direction, and extracts as the one-dimensional sequence feature a projection function f(x) representing a projection histogram of black pixel counts, as in FIG. 2(b). This can be readily implemented with conventional techniques. The one-dimensional sequence feature storage means 4 is an ordinary storage means that stores the one-dimensional sequence feature received as signal 11. The model function generation means 5 reads the one-dimensional sequence feature from the one-dimensional sequence feature storage means 4 as signal 13 and the number of characters N from the character count storage means 2 as signal 12, and defines a one-variable model function corresponding to the one-dimensional sequence feature and the number of characters. One possible choice is a function m(x) such as the one in FIG. 2(c). The width W of the entire character string and the total number A of black pixels in the character string image are obtained from the one-dimensional sequence feature, and m(x) is defined for the number of characters N by the following expression. This can be readily implemented with conventional techniques.

$$m(x) = \left| h \cdot \sin\left(\frac{\pi x}{w}\right) \right|, \qquad w = \frac{W}{N}, \qquad h = \frac{\pi A}{2W}$$

Here, h is defined so that the area under the model function m(x) equals the area under the projection function f(x). The model function storage means 6 is an ordinary storage means that stores, as signal 14, the model function defined by the model function generation means. The nonlinear matching means 7 reads the one-dimensional sequence feature from the storage means 4 via signal 15, reads the model function from the storage means 6 via signal 16, performs a nonlinear matching such as DP matching that stretches and compresses the two functions along the one-dimensional axis, and stores the correspondence function expressing the correspondence between the two functions in an ordinary storage means. DP matching, one form of nonlinear matching, is a well-known technique. Here it uses the following expression as its evaluation function and serves as a means of searching for the path c = (c₁, c₂), as shown in FIG. 3, that gives the minimum value T as the correspondence function.
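The evaluation expression itself does not appear in this text, so the following Python sketch is only one plausible reading of the matching step. It assumes a local cost of |f(i) - m(j)|, the three basic steps (1,0), (0,1), and (1,1), and fixed end points; these choices, the function names `model_function` and `dp_match`, and the sample profile are all assumptions, not the patent's definition. The model function follows the sine form with w = W/N and h = πA/(2W), with W taken as the length of the profile.

```python
import numpy as np

def model_function(width, total_black, n_chars):
    """Sample m(x) = |h sin(pi x / w)| at x = 0..width-1, w = W/N, h = pi A / (2 W)."""
    w = width / n_chars
    h = np.pi * total_black / (2.0 * width)
    x = np.arange(width)
    return np.abs(h * np.sin(np.pi * x / w))

def dp_match(f, m):
    """Return (T, path) for a basic DP matching of f against m.

    Assumed local cost: |f[i] - m[j]|. Assumed steps: (1,0), (0,1), (1,1).
    path is a list of (i, j) pairs from (0, 0) to (len(f)-1, len(m)-1).
    """
    f = np.asarray(f, dtype=float)
    m = np.asarray(m, dtype=float)
    cost = np.abs(f[:, None] - m[None, :])
    g = np.full(cost.shape, np.inf)  # accumulated cost table
    g[0, 0] = cost[0, 0]
    for i in range(len(f)):
        for j in range(len(m)):
            if i == 0 and j == 0:
                continue
            best = np.inf
            if i > 0:
                best = min(best, g[i - 1, j])
            if j > 0:
                best = min(best, g[i, j - 1])
            if i > 0 and j > 0:
                best = min(best, g[i - 1, j - 1])
            g[i, j] = cost[i, j] + best
    # Trace the minimum-cost path back from the end point to the start point.
    i, j = len(f) - 1, len(m) - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((g[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((g[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((g[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return g[-1, -1], path

# Hypothetical projection profile of a two-character string.
f = np.array([0, 2, 3, 2, 1, 1, 2, 3, 3, 2, 0], dtype=float)
m = model_function(len(f), f.sum(), n_chars=2)
T, path = dp_match(f, m)
```

The returned `path` plays the role of the correspondence function: each pair (i, j) states that column i of the projection function is matched to position j of the model function.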

The character segmentation position determination means 8 reads the correspondence function from the nonlinear matching means 7 via signal 17 and the model function from the storage means 6 via signal 18. It first finds the positions of the N-1 minima of the model function, then uses the correspondence function to find the corresponding points in the one-dimensional sequence feature, and stores them. This processing can be readily implemented with conventional techniques. The character cutting means 9 reads the segmentation positions from the determination means 8 via signal 19 and the character string image from the character string image storage means 1 via signal 20, finds the circumscribed rectangle of the black pixels lying between adjacent segmentation positions, and stores that partial image as a single-character image, thereby cutting the characters out of the character string image one at a time. This too can be readily implemented with conventional techniques.
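The two steps just described, mapping model minima to image columns and cropping circumscribed rectangles, can be sketched as follows. The sketch assumes the correspondence function is available as a list of (i, j) index pairs that covers every model index, with i indexing image columns and j indexing the model function. The names `model_minima`, `cut_positions`, and `cut_characters`, the rule of taking the middle matched column, and the sample image and path are all hypothetical.

```python
import numpy as np

def model_minima(width, n_chars):
    """Interior minima of m(x) = |h sin(pi x / w)|: x = k * w for k = 1..N-1."""
    w = width / n_chars
    return [int(round(k * w)) for k in range(1, n_chars)]

def cut_positions(path, minima):
    """Map each model minimum j to an image column i through the path.

    If several columns are matched to the same minimum, the middle one is
    taken (an assumption; the text does not specify a tie rule).
    """
    cuts = []
    for j_min in minima:
        matched = [i for i, j in path if j == j_min]
        cuts.append(matched[len(matched) // 2])
    return cuts

def cut_characters(image, cuts):
    """Crop the circumscribed rectangle of the black pixels between cuts."""
    image = np.asarray(image)
    bounds = [0] + list(cuts) + [image.shape[1]]
    pieces = []
    for left, right in zip(bounds[:-1], bounds[1:]):
        segment = image[:, left:right]
        rows = np.flatnonzero(segment.any(axis=1))
        cols = np.flatnonzero(segment.any(axis=0))
        if rows.size == 0:
            pieces.append(segment[:0, :0])  # no black pixels in this span
            continue
        pieces.append(segment[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1])
    return pieces

# Hypothetical 3x8 image: a wide character in columns 0-3, a narrow one in 5-7.
image = np.array([
    [1, 1, 1, 1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
])
# Hypothetical matching path in which model position 4 is matched to column 5.
path = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 3), (5, 4), (6, 5), (7, 6), (7, 7)]

minima = model_minima(width=8, n_chars=2)
cuts = cut_positions(path, minima)
pieces = cut_characters(image, cuts)
```

Here the model function places its single minimum at the midpoint of the string, but the path shifts the cut to column 5, which is how unequal character widths are absorbed.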

The present invention has been described in detail above by way of an embodiment, but it is not limited to that embodiment. For example, the embodiment assumes a binary character string image, but the invention also applies to multilevel images. In that case, a projection histogram obtained by accumulating pixel values in the direction perpendicular to the character string direction can be used as the one-dimensional sequence feature. For binary images as well, a function such as the one in FIG. 4(a), obtained by scanning the image in the direction perpendicular to the character string direction and giving the maximum distance between black pixels, can be used as the one-dimensional sequence feature. The model function of the embodiment is likewise only one example. The invention can be practiced without difficulty using any one-variable function in which N-1 minima and N maxima alternate for the number of characters N and which represents the grouping, as shapes, of the characters in the one-dimensional sequence feature obtained from the input image. For example, the following function can be used as the model function m(x) (see FIG. 4(b)).

$$m(x) = h \cdot \left(1 - \left|\cos\left(\frac{\pi x}{w}\right)\right|\right)$$

where h and w are the values defined by the following expressions, with W the width of the entire character string, A the total number of black pixels in the character string image, and N the number of characters.

$$h = \frac{A \cdot \pi}{W \cdot (\pi - 2)}, \qquad w = \frac{W}{N}$$

Here, h is defined so that the area under the projection function f(x) equals the area under the model function.
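The two amplitude definitions can be checked by equating areas. The derivation below is a sketch that treats x as continuous over [0, W] and takes the area under f(x) to be A, the total number of black pixels; both are simplifying assumptions, since the actual profile is discrete.

```latex
% Sine model: N identical arches of width w = W/N
\int_0^{W} \left| h \sin\frac{\pi x}{w} \right| dx
  = N h \int_0^{w} \sin\frac{\pi x}{w}\, dx
  = N h \cdot \frac{2w}{\pi}
  = \frac{2 h W}{\pi} = A
  \quad\Longrightarrow\quad h = \frac{\pi A}{2W}

% Cosine model: subtract the area of N arches of |cos| from the rectangle hW
\int_0^{W} h \left( 1 - \left| \cos\frac{\pi x}{w} \right| \right) dx
  = h \left( W - N \cdot \frac{2w}{\pi} \right)
  = \frac{h W (\pi - 2)}{\pi} = A
  \quad\Longrightarrow\quad h = \frac{A \pi}{W (\pi - 2)}
```

Both results agree with the expressions for h given in the text, so either model function encloses the same area as the projection function before matching.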

(Effects of the Invention) As described above, according to the present invention, given the number of characters in a character string, characters can be cut out one at a time from character string images in which character width and character spacing vary relatively widely and from character string images in which characters touch. Moreover, this can be done by a simple method with relatively few parameters.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the character segmentation apparatus of the present invention, FIGS. 2 and 3 are diagrams for explaining the principle of the present invention, and FIG. 4 is a diagram showing other embodiments. In the figures, 1 is the character string image storage means, 2 is the character count storage means, 3 is the one-dimensional sequence feature extraction means, 4 is the one-dimensional sequence feature storage means, 5 is the model function generation means, 6 is the model function storage means, 7 is the nonlinear matching means, 8 is the character segmentation position determination means, and 9 is the character cutting means.

Claims

(57) [Claims]

[Claim 1] A character segmentation method for optically reading a character string whose number of characters is known and cutting out, from the character string image, a partial image corresponding to one character, the method characterized by: extracting a one-dimensional sequence feature from the character string image; defining a model function that corresponds to the number of characters and to the one-dimensional sequence feature and from which character segmentation positions can be identified; nonlinearly matching the one-dimensional sequence feature against the model function; obtaining, from the nonlinear correspondence function of the nonlinear matching, the character segmentation positions of the character string image that correspond to the character segmentation positions of the model function; and cutting out, at those character segmentation positions, the partial image corresponding to one character.

[Claim 2] A character segmentation apparatus for optically reading a character string whose number of characters is known and cutting out, from the character string image, a partial image corresponding to one character, the apparatus characterized by comprising: character string image storage means for storing an optically scanned character string image; character count storage means for storing the number of characters; one-dimensional sequence feature extraction means for reading the character string image from the character string image storage means and extracting a one-dimensional sequence feature of the character string; one-dimensional sequence feature storage means for storing the one-dimensional sequence feature; model function generation means for creating a model function that depends on the one-dimensional sequence feature and the number of characters and from which character segmentation positions can be identified; model function storage means for storing the model function; nonlinear matching means for performing nonlinear matching between the one-dimensional sequence feature and the model function and storing the resulting nonlinear correspondence function; character segmentation position determination means for determining, from the nonlinear correspondence function, the character segmentation positions of the character string image that correspond to the character segmentation positions of the model function, and storing them; and character cutting means for cutting out, at those character segmentation positions, the partial image of the character string image corresponding to one character.