JPH04223584A - Optical character reader

Info

Publication number: JPH04223584A
Application number: JP2418914A
Authority: JP
Inventors: Michio Terai (寺井道夫); Naoto Aoki (青木直人); Satoshi Miyashita (宮下聡); Shizuko Kawada (川田志津子)
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1990-12-25
Filing date: 1990-12-25
Publication date: 1992-08-13

Abstract

PURPOSE: To read a form easily and accurately by detecting lines, fields, and the sizes of character frames from projections of the image data in the main scanning and sub-scanning directions, and reading the character frame portions. CONSTITUTION: Projections in the main scanning and sub-scanning directions are extracted from image data obtained by optically reading a read object on a form 11 that has character frames. A line/field/character frame extraction part 18a detects lines, fields, and character frames from these projections. Read character information, which is stored in a read character information storage part 18b and indicates the correspondence between character frame size and read object, is then selected according to the detected character frame size. Data corresponding to the character frame is erased from the image data, and only the read object inside the character frame is recognized.

Description

[Detailed description of the invention]

[0001]

[Industrial Application Field] The present invention relates to an optical character reader (hereinafter abbreviated as OCR), and in particular to the detection of character frames on forms.

[0002]

[Prior Art] FIG. 7 shows the configuration of the main part of a conventional OCR. In the figure, a form 31 is conveyed in the direction of arrow 3a by a conveyance mechanism (not shown). When the form 31 reaches the reading position, light emitted from a light source 32 is reflected from the reading field 3b containing the form 31, and the reflected light passes through a lens 33 and is focused onto the light-receiving surface of an image sensor 34. The output of the image sensor 34 is amplified by an amplifier 35, converted into a digital signal by an A/D converter 36, and stored in a memory 37 as image data in units of lines or pages. The stored multi-level data is converted into black-and-white binary data by a binarization unit 38 at an appropriate slice level, and recognition is performed on the basis of this binary data and the format information registered in advance in a format information storage unit 39. The format information specifies the size of the form, the number of lines, the fields in each line, the number of characters in each field, the character size, the character type, and so on; one set of format information is required for each type of form. The format information is stored in advance, and the applicable format is selected either by designation at reading time or by reading an ID number at a specific position on the form.

[0003]

[Problems to be Solved by the Invention] In the device configured as above, however, one format is required for each type of form, and format information such as the size of the form to be read, the number of lines, the fields in each line, the number of characters in each field, the character size, and the character type must be registered in advance. Moreover, even a slight discrepancy in the format information makes it impossible to read the form accurately.

[0004] To solve these problems, an object of the present invention is to provide an OCR that can read forms accurately without registered format information, by detecting lines, fields, and the sizes of character frames from projections of the image data in the main scanning and sub-scanning directions and reading the detected character frame portions.

[0005]

[Means for Solving the Problems] To solve the above problems, the present invention is an optical character reader that optically reads a read object on a form having character frames and recognizes the read object inside each character frame, characterized by comprising: main/sub-scanning direction projection extraction means for extracting projections, in the main scanning and sub-scanning directions, of image data obtained by optically reading the read object on the form; line/field/character frame extraction means for detecting lines, fields, and character frames from the extracted projections in the main and sub-scanning directions; read character information storage means for storing read character information indicating the correspondence between character frame size and read object; means for selecting the corresponding read character information from the size of the character frame detected by the line/field/character frame extraction means; and character frame removing means for removing data corresponding to the character frame from the image data.

[0006]

[Operation] According to the present invention configured as above, projections in the main scanning and sub-scanning directions are extracted from image data obtained by optically reading a read object on a form having character frames. Lines, fields, and character frames are detected from the extracted projections. From the size of the detected character frame, the read character information stored in the read character information storage means, which indicates the correspondence between character frame size and read object, is selected. Data corresponding to the character frame is then removed from the image data, and only the read object inside the character frame is recognized.

[0007] The present invention can therefore solve the above problems and provide an OCR that reads forms accurately without registered format information, by detecting lines, fields, and the sizes of character frames from projections of the image data in the main scanning and sub-scanning directions and reading the detected character frame portions.

[0008]

[Embodiment] FIG. 1 shows the configuration of the main part of an embodiment of the present invention. In the figure, when a reading operation starts, a form 11 is conveyed in the direction of arrow 1a by a form conveyance mechanism (not shown). As the form 11 passes over the reading field 1b of the image sensor, light emitted from a light source 12 is reflected from the reading field 1b containing the form 11, and the reflected light passes through a lens 13 and is focused onto the light-receiving surface of an image sensor 14. The output of the image sensor 14 is amplified by one or several stages of amplifiers 15, converted by an A/D converter 16, and stored in a memory 17 as multi-level data. On the basis of the stored multi-level data, a character frame detection unit 18a detects lines and fields and then detects character frames, and for each detected character frame portion sends the data inside the frame to a binarization unit 19 so that the character can be recognized. The detailed configuration of the character frame detection unit 18a is described later. If a correspondence between character frame size and character type to be read has been defined and stored in a read character information storage unit 18b, this read character information is sent together with the binary data to a recognition unit (not shown), where the character is recognized.

[0009] Next, the method of detecting lines, fields, and character frames is described. First, an appropriate slice level is set for the multi-level data stored in the memory 17, and the projections in the main scanning and sub-scanning directions are extracted; for the portion to be read, the result is as shown in FIG. 2. The figure is an example of a line having two fields. An isolated projection portion 4b is detected in the projection in the main scanning direction and is cut out as a line. In the projection of the cut-out line in the sub-scanning direction, projections separated by a distance 4e that is at or below a predetermined value are regarded as belonging to the same field, and projections separated by a distance 4e exceeding the predetermined value are regarded as belonging to different fields.
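The projection and line cut-out steps of paragraph [0009] can be sketched as follows. This is a minimal illustration, not the patent's implementation. It assumes the image is a list of rows indexed `[y][x]` with darker pixels having lower values, and it takes the main scanning projection to be the black-pixel count for each y (consistent with paragraph [0017], where that projection spans y1 to y2). The names `projections` and `isolated_runs` are hypothetical.

```python
def projections(gray, slice_level):
    """Threshold a grayscale image; return (per-row, per-column) black-pixel counts."""
    height, width = len(gray), len(gray[0])
    row_profile = [0] * height   # indexed by y: used to find lines
    col_profile = [0] * width    # indexed by x: used to find characters and fields
    for y, row in enumerate(gray):
        for x, px in enumerate(row):
            if px < slice_level:          # assumption: darker pixels have lower values
                row_profile[y] += 1
                col_profile[x] += 1
    return row_profile, col_profile


def isolated_runs(profile):
    """Return (start, end) pairs, end exclusive, of consecutive non-zero entries."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                     # a run of non-zero projection begins
        elif not value and start is not None:
            runs.append((start, i))       # the run ended at the previous index
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


W, B = 255, 0
gray = [
    [W, W, W, W, W, W],
    [W, B, B, W, B, W],
    [W, B, B, W, B, W],
    [W, W, W, W, W, W],
    [W, B, W, W, W, W],
    [W, W, W, W, W, W],
]
row_profile, col_profile = projections(gray, slice_level=128)
lines = isolated_runs(row_profile)   # each isolated run is cut out as one line
```

Because each isolated run of the per-row profile becomes one line, the number of lines does not have to be supplied in advance.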

[0010] FIG. 3 shows the configuration of the character frame detection unit 18a of FIG. 1. Its operation is described following the character extraction flow shown in FIG. 4. First, the multi-level data stored in the memory is read out (step 101), and the main scanning direction projection extraction unit 51 of FIG. 3 extracts the projection in the main scanning direction (step 102). This projection is extracted by creating binary data from the multi-level data on the basis of a preset slice value. Next, the line extraction unit 52 cuts out lines, treating each isolated portion of the extracted projection as one line (step 103). For each cut-out line, the sub-scanning direction projection extraction unit 53 extracts the projection in the sub-scanning direction (step 104). Each isolated projection extracted here corresponds to one character, but a line may contain a plurality of fields. The method of detecting these fields is described below.

[0011] When a plurality of fields are provided in a line, the field spacing (4e in FIG. 2) is set to be larger than the character spacing within a field (4f in FIG. 2). The spacing between adjacent isolated projections in the sub-scanning direction is then measured by the inter-projection distance determination unit 54 of FIG. 3 to determine whether a plurality of fields are present (step 105).
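The gap test performed by the inter-projection distance determination unit can be illustrated with a short sketch. It assumes each isolated projection is given as a hypothetical `(start, end)` pair with an exclusive end, and that `field_gap` stands for the predetermined value separating character spacing 4f from field spacing 4e; neither the representation nor the threshold value comes from the patent.

```python
def split_into_fields(char_runs, field_gap):
    """Group character runs into fields; a gap larger than field_gap starts a new field."""
    fields = []
    for run in char_runs:
        # Gap = start of this run minus end of the last run in the current field.
        if fields and run[0] - fields[-1][-1][1] <= field_gap:
            fields[-1].append(run)     # gap within the limit: same field
        else:
            fields.append([run])       # first run, or gap exceeds the limit: new field
    return fields


# Five character projections: gaps of 2, 2, 16, 2 pixels between neighbours.
char_runs = [(2, 8), (10, 16), (18, 24), (40, 46), (48, 54)]
fields = split_into_fields(char_runs, field_gap=5)
```

With these sample values the 16-pixel gap exceeds the threshold, so the line is split into a three-character field and a two-character field.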

[0012] If the result shows that the line consists of a single field, the data is sent as is to the character extraction unit 56, which cuts out each isolated projection as a character (step 107). If the line consists of a plurality of fields (step 106), the data is sent to the field extraction unit 55 and separated into individual fields, and is then sent to the character extraction unit 56, which cuts out each isolated projection as a character (step 107).

[0013] In this way, line 4a in FIG. 3 is divided into two fields, 4c and 4d. Lines and fields are detected as described above. Because each isolated portion of the projection in the main scanning direction is cut out as a line, the number of lines does not need to be specified either.

[0014] From the projections in the main scanning and sub-scanning directions of the line and field portions detected above, the character frame projections 2c and 2d are obtained for a character frame portion to be read, regardless of whether a character 2a is present, as shown in FIG. 5(a). As shown in FIG. 5(a), the character 2a is written inside the character frame 2b. The lengths of the projections of the character in the main scanning and sub-scanning directions are therefore never greater than those of the character frame, so when a character frame is present, the projection 2c in the main scanning direction and the projection 2d in the sub-scanning direction are those of the character frame, whether or not the character 2a is present.
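The point of paragraph [0014], that the projection extent of a framed cell is set by the frame and not by its contents, can be checked with a small sketch. The cell size, the one-pixel frame, and the helper names `extent` and `framed_cell` are assumptions made for illustration.

```python
def extent(binary):
    """Return ((y_min, y_max), (x_min, x_max)) of black pixels, inclusive, or None if blank."""
    ys = [y for y, row in enumerate(binary) if any(row)]
    xs = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    if not ys:
        return None
    return (ys[0], ys[-1]), (xs[0], xs[-1])


def framed_cell(size, glyph=()):
    """Draw a one-pixel square frame of the given size plus optional glyph pixels inside it."""
    cell = [[0] * size for _ in range(size)]
    for i in range(size):
        cell[0][i] = cell[size - 1][i] = 1   # top and bottom frame lines
        cell[i][0] = cell[i][size - 1] = 1   # left and right frame lines
    for y, x in glyph:
        cell[y][x] = 1
    return cell


glyph = [(3, 4), (4, 4), (5, 4)]             # a short vertical stroke
empty = extent(framed_cell(9))               # frame only
filled = extent(framed_cell(9, glyph))       # frame with a character inside

bare = [[0] * 9 for _ in range(9)]           # the same stroke with no frame around it
for y, x in glyph:
    bare[y][x] = 1
unframed = extent(bare)
```

The framed cell yields the same extent with or without the stroke, while the unframed stroke yields a much smaller one, which is the difference the description relies on to tell read positions from stray marks.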

[0015] Even if a character 2e, dirt, or the like is present where there is no character frame, the absence of a frame yields projections 2f and 2g that differ from those obtained when a frame is present, as shown in FIG. 5(b). The portion to be read can therefore be clearly separated from everything else by detecting the character frame.

[0016] The character frame, that is, the portion to be read, is detected as described above, but the frame itself is unnecessary data for character recognition. The character frame data must therefore be removed before the image data is sent to the recognition unit. One method of removing character frame data from the image data is described below. Detecting the character frame involves extracting its projection, and the character to be read lies inside the extracted projection of the frame. Accordingly, attention is directed to the region lying inward from the outermost edge of the frame projection by the frame line width plus α (where α is one to several pixels), and only the data in this region is sent to the binarization unit 19 of FIG. 1. The subsequent data is then unaffected by the character frame.

[0017] Suppose the image data in memory is as shown in FIG. 6(a). The projection in the main scanning direction then spans y1 to y2, and the projection in the sub-scanning direction spans x1 to x2. If the line width of the character frame is 2 bits, the result is as shown in FIG. 6(b). Only the data inside the character frame is cut out so that it is unaffected by the frame. Taking a margin α to allow for sensor blur and similar effects, and setting α to 1 bit, the data from coordinates (x1+3, y1+3) to (x2−3, y2−3) is written back to memory. If the data in memory has been cleared beforehand, the character frame is removed. This operation is performed by the character frame removing unit 57 of FIG. 3.
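A minimal sketch of the frame removal in paragraph [0017] follows. It assumes a binary image stored as rows indexed `[y][x]`, inclusive frame coordinates (x1, y1) to (x2, y2) at the outermost frame pixels, and the example values from the text (line width 2, α = 1, giving an inset of 3). The function name `remove_frame` and its parameters are hypothetical.

```python
def remove_frame(binary, x1, y1, x2, y2, line_width=2, margin=1):
    """Copy only the interior of the frame into a cleared image of the same size."""
    inset = line_width + margin              # 2 + 1 = 3 in the example from the text
    height, width = len(binary), len(binary[0])
    cleared = [[0] * width for _ in range(height)]   # memory cleared beforehand
    # Write back only (x1+inset, y1+inset) through (x2-inset, y2-inset), inclusive.
    for y in range(y1 + inset, y2 - inset + 1):
        for x in range(x1 + inset, x2 - inset + 1):
            cleared[y][x] = binary[y][x]
    return cleared


size = 12
binary = [[0] * size for _ in range(size)]
for i in range(size):
    for t in range(2):                       # two-pixel-wide frame lines
        binary[t][i] = binary[size - 1 - t][i] = 1
        binary[i][t] = binary[i][size - 1 - t] = 1
binary[5][5] = binary[6][5] = 1              # character stroke inside the frame

result = remove_frame(binary, x1=0, y1=0, x2=size - 1, y2=size - 1)
black = [(y, x) for y in range(size) for x in range(size) if result[y][x]]
```

Only the stroke pixels survive in `result`; the frame lines and the one-pixel margin next to them are left cleared.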

[0018] In conventional form formats, the character type to be read is specified in addition to the reading position, and specifying the character type is also necessary to raise reading accuracy. The method of specifying the character type in the present invention is described below. A read character information storage unit is provided so that a character type can be freely set and stored for each character frame size. For example, if 3 × 4 mm is assigned to printed characters, 5 × 6 mm to handwritten numerals, 6 × 6 mm to handwritten alphabetic characters, and 8 × 8 mm to kanji, the character type can be distinguished by the size of the character frame. Sending this read character information to the recognition unit together with the character image data raises reading accuracy.
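The size-to-type correspondence held in the read character information storage unit could be modelled as a simple lookup table, as in the sketch below. The four entries are the examples given in paragraph [0018]; the width-by-height ordering of each size, the matching tolerance, and the name `select_character_type` are assumptions, since the text does not say how a measured frame is matched to a registered size.

```python
# Example settings from the description: frame size in mm -> character type.
READ_CHARACTER_INFO = {
    (3, 4): "printed characters",
    (5, 6): "handwritten numerals",
    (6, 6): "handwritten alphabetic characters",
    (8, 8): "kanji",
}


def select_character_type(width_mm, height_mm, tolerance_mm=0.5):
    """Return the type whose registered frame size is closest within tolerance, else None."""
    best, best_err = None, None
    for (w, h), kind in READ_CHARACTER_INFO.items():
        # Worst-case deviation of the measured frame from the registered size.
        err = max(abs(width_mm - w), abs(height_mm - h))
        if err <= tolerance_mm and (best_err is None or err < best_err):
            best, best_err = kind, err
    return best


numerals = select_character_type(5.1, 5.9)
kanji = select_character_type(8.2, 7.9)
unknown = select_character_type(10.0, 10.0)   # no registered size is close enough
```

Returning `None` for an unregistered size leaves it to the caller to decide whether to recognize the frame without a type hint.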

[0019] If character types are separated by character frame size, numerals need only be recognized with a numerals-only dictionary and alphabetic characters with an alphabet-only dictionary, so the range of characters to be recognized is narrow. For example, the numeral 1 and the lowercase letter l, or the numeral 9 and the lowercase letter q, can be distinguished by the size of the character frame, so misreads and rejections of similar characters of different types are eliminated and reading accuracy improves. In addition, because the character frame may be of any color, copies of forms and the like can also be read.
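The effect of restricting the dictionary can be shown with a toy example. The similarity scores, the dictionary contents, and the function `recognize` are all hypothetical; the patent does not describe the recognition unit's internals.

```python
import string

# Hypothetical per-type dictionaries: the set of characters each type may produce.
DICTIONARIES = {
    "handwritten numerals": set(string.digits),
    "handwritten alphabetic characters": set(string.ascii_letters),
}


def recognize(scores, character_type):
    """Pick the best-scoring candidate among those allowed for the character type."""
    allowed = DICTIONARIES[character_type]
    candidates = {ch: s for ch, s in scores.items() if ch in allowed}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical similarity scores for one ambiguous vertical stroke.
scores = {"1": 0.48, "l": 0.52, "7": 0.10}
as_digit = recognize(scores, "handwritten numerals")
as_letter = recognize(scores, "handwritten alphabetic characters")
```

The same stroke is read as 1 in a numeral frame and as l in an alphabetic frame, because the near-tie between the two candidates never arises once the other type is excluded.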

[0020]

[Effects of the Invention] As described in detail above, according to the present invention, lines, fields, and character frames are detected from projections of the image data in the main scanning and sub-scanning directions, the inside of each character frame is read, and information such as the character type to be read can be set according to the size of the character frame. The format of the form to be read therefore does not need to be registered in advance, and any form that has character frames can be read easily. Furthermore, because the reading position is determined from the image data, the cut-out errors that arose with conventional format information from discrepancies between that information and the actual form data are eliminated, so a reduction in the reading error rate can also be expected.

[Brief description of the drawings]

[FIG. 1] Configuration diagram of the main part of an embodiment of the present invention.

[FIG. 2] Diagram showing projections in the main and sub-scanning directions.

[FIG. 3] Configuration diagram of the character frame detection unit in this embodiment.

[FIG. 4] Flowchart showing the character extraction operation in this embodiment.

[FIG. 5] Diagram showing the difference in projections depending on the presence or absence of a character frame.

[FIG. 6] Diagram showing how a character frame is removed.

[FIG. 7] Configuration diagram of the main part of a conventional OCR.

[Explanation of reference numerals]

11 Form
12 Light source
13 Lens
14 Image sensor
15 Amplifier
16 A/D converter
17 Memory
18a Character frame detection unit
18b Read character information storage unit
19 Binarization unit
51 Main scanning direction projection extraction unit
52 Line extraction unit
53 Sub-scanning direction projection extraction unit
54 Inter-projection distance determination unit
55 Field extraction unit
56 Character extraction unit
57 Character frame removing unit

Claims

[Claims]

[Claim 1] An optical character reader that optically reads a read object on a form having character frames and recognizes the read object inside each character frame, comprising: main/sub-scanning direction projection extraction means for extracting projections, in the main scanning and sub-scanning directions, of image data obtained by optically reading the read object on the form; line/field/character frame extraction means for detecting lines, fields, and character frames from the extracted projections in the main and sub-scanning directions; read character information storage means for storing read character information indicating the correspondence between character frame size and read object; means for selecting the corresponding read character information from the size of the character frame detected by the line/field/character frame extraction means; and character frame removing means for removing data corresponding to the character frame from the image data.