JPS6245581B2

Info

Publication number: JPS6245581B2
Application number: JP55040890A
Authority: JP
Inventors: Hitoshi Myai; Masamichi Shudo; Yoshasu Kikuchi; Yasufumi Mitsuzawa
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1980-03-28
Filing date: 1980-03-28
Publication date: 1987-09-28
Also published as: JPS56137480A

Description

[Detailed Description of the Invention] The present invention relates to a layout device for drawings, photographs, characters, and the like, and in particular to an input device for layout formats.

Conventionally, in printing systems for newspapers, magazines, documents, and the like, formatting (layout) of given drawings, photographs, characters, and so on onto a page of fixed size, that is, positioning them on the page, has been done either manually by skilled workers or by displaying the entire page on a special display and interactively specifying the positions of photographs, drawings, and text. Formatting means deciding where on the fixed page each photograph, drawing, or character is to be placed. For a rectangular photograph, for example, formatting is possible once two pieces of information are obtained: the coordinates of the upper left corner on the page and the vertical and horizontal size of the photograph. (This pair of items is called format information, and a set of them is called a format.) With the large increase in the volume of printed matter, computers have been introduced to save labor in formatting work, but in most cases formatting is still performed with a special display as described above. The drawbacks of such a display are that the equipment is expensive and that the displayed screen almost always differs from the page actually printed in size and aspect ratio, which leads to layout mistakes or errors. Another format input method has an image scanner, such as a facsimile, read a sheet (form) on which formatting areas (for example, rectangular frames) are drawn, and has a computer calculate and detect the format.

However, when several drawings are formatted onto one page, several pieces of format information are generated, and since the computer cannot tell them apart, it merely assigns them sequential numbers in the order of input. In that case a person must match the drawings to the format information by consulting a correspondence table and using numbers that have nothing to do with the drawings, which makes formatting errors likely.

An object of the present invention is to eliminate the drawbacks described above and to provide a format input device that works at close to the size of the actual form and allows automatic identification and classification.

According to the present invention, the device comprises: an image input device that scans a two-dimensional image and binarizes and outputs the resulting grayscale image data; a character recognition device that cuts out one or more rectangular areas C₁, C₂, …, Cn from the two-dimensional image data obtained by the image input device, performs character recognition on the one or more characters contained in each area, and outputs the corresponding character codes together with the position and size of each character on the two-dimensional image data; a character rectangular area calculation device that, from the characters recognized by the character recognition device, takes out the character rectangular areas S₁, S₂, …, Sk containing character codes of a specific type and calculates, for each area, its position from a reference point on the two-dimensional image data and its size; a format rectangular area calculation device that cuts out one or more format rectangular areas R₁, R₂, …, Rk from the two-dimensional image data obtained by the image input device and calculates, for each area, its position from the reference point on the two-dimensional image data and its size; and an area identification device that, from the positions and sizes of the rectangular areas R₁, R₂, …, Rk obtained by the format rectangular area calculation device, calculates the positional relationship between R₁, R₂, …, Rk and the character rectangular areas S₁, S₂, …, Sk obtained by the character rectangular area calculation device, and assigns to each format rectangular area, as its identification code, the character code string contained in the corresponding character rectangular area. The device can thus calculate the position and size of one or more format rectangular areas drawn on a two-dimensional image and automatically assign an identifying character code string to each area for classification.

Next, the present invention is described in detail with reference to the drawings. In what follows, the number attached to a signal line in the drawings also denotes the signal on that line.

FIG. 1 is a block diagram showing an embodiment of the format input device of the present invention, which consists of an image input device 1, a character recognition device 2, a character rectangular area calculation device 3, a format rectangular area calculation device 4, and an area identification device 5. FIGS. 2a, 2b, and 2c are schematic diagrams explaining the principle of the format input device. FIG. 2a shows the rectangular areas C₁, C₂, …, Cn of the format input form that the character recognition device 2 processes. FIG. 2b shows the character rectangular areas S₁, S₂, …, Sk of the same form, in which the characters processed by the character rectangular area calculation device 3 are written. FIG. 2c shows the format rectangular areas R₁, R₂, …, Rk of the same form that the format rectangular area calculation device 4 processes.

The image input device 1 can be an image scanner such as an FSS, ITV, or CCD scanner. It scans a two-dimensional image to obtain two-dimensional image data; if the scanned image data is multi-valued, a circuit that binarizes the multi-valued data is placed after the image scanner.

The character recognition device 2 can use the recognition section of an OCR (optical character recognition device). From the two-dimensional image data supplied by the image input device 1, it cuts out one or more rectangular areas C₁, C₂, …, Cn (FIG. 2a) and performs character recognition on the one or more characters contained in each area. The rectangular areas C₁, C₂, …, Cn are cut out by position and size relative to a reference point on the two-dimensional image data (for example, the upper left corner taken as the origin), and this information is output. In addition, for each area, a recognized character yields the character code corresponding to that character, and an unrecognized character yields a character code indicating that recognition failed.

The character rectangular area calculation device 3 extracts, from the output characters of the character recognition device 2, the character code strings of a predetermined character type (for example, alphabetic characters or digits) as format identification characters. It selects from the rectangular areas C₁, C₂, …, Cn the character rectangular areas S₁, S₂, …, Sk (FIG. 2b) that contain the extracted character code strings, calculates the position of each character rectangular area from the reference point on the two-dimensional image data obtained by the image input device 1 (the same reference point as for C₁, C₂, …, Cn), and outputs it together with the corresponding character code string.

The format rectangular area calculation device 4 takes out, from the two-dimensional image data obtained by the image input device 1, the format indication line segments that carry the information defining the position and size of one or more format rectangular areas R₁, R₂, …, Rk (FIG. 2c). That information may be, for example, one corner coordinate and the diagonally opposite corner coordinate of the rectangle, one corner coordinate with the height and width, or all four corner coordinates. For each area R₁, R₂, …, Rk the device calculates the position from the reference point on the two-dimensional image data and the size.

The area identification device 5 uses the positions and sizes of the format rectangular areas R₁, R₂, …, Rk obtained by the format rectangular area calculation device 4 to examine their positional relationship to the character rectangular areas S₁, S₂, …, Sk obtained by the character rectangular area calculation device 3 (for example, a character rectangular area Si contained in a format rectangular area Ri, or a character rectangular area Si lying at a fixed distance, or within a certain range of distances, from a format rectangular area), and outputs corresponding areas as pairs. Specifically, the character code string contained in the corresponding character rectangular area (called the format identification characters) is assigned as the identification code to the position and size of the format rectangular area Ri, and the position and size of Ri are output as a pair with the format identification characters appended.

Before the detailed block-level description, the processing procedure by which a format is input according to the present invention is explained with reference to an embodiment.

FIG. 3 shows part of a format input form and is an example of a two-dimensional image on which one or more formats and their accompanying format identification character strings are written. What corresponds to Ci in FIG. 2a is the rectangular area represented by a cell of the reference grid 90 in FIG. 3. What corresponds to Si in FIG. 2b is the rectangular area (character rectangular area) in which the format indication line segment ELi and the format identification characters EDi are written. What corresponds to Ri in FIG. 2c is the format indication line segment ELi. A format is written as a format indication line segment ELi that starts at point P_LUi and ends at point P_RDi, and it is defined by the format rectangular area (hatched portion) whose upper left corner is P_LUi and whose lower right corner is P_RDi. The format indication line segment ELi connecting P_LUi and P_RDi need not be a right-angled line as in FIG. 3; any line will do as long as it connects the two points. The format identification character string is shown here inside the format rectangular area (hatched portion), but it may instead be specified as lying at a fixed distance or a fixed position relative to the format rectangular area. The dashed grid in FIG. 3 is the reference grid 90, which serves as a guide when the format indication line segments ELi and the format identification characters are written. It is printed in a drop-out color that is visible to the human eye but not sensed by the image scanner of the image input device 1, and as noted above it corresponds to Ci in FIG. 2a.

FIG. 4 shows the binarized two-dimensional image data obtained after the two-dimensional image on the format input form of FIG. 3 has been read by the image input device 1. The data contains the pattern FLi of the format indication line segment and the pattern FDi of the format identification character string. The reference grid 91 drawn in FIG. 4 does not actually exist in the two-dimensional image data read by the image input device 1; it is shown only to make the following explanation easier.

In FIG. 4, with respect to the coordinate origin (x_o, y_o), the preset reference point on the two-dimensional image data, let the upper left corner Q_LUi of the format rectangular area Ri (hatched portion) have coordinates (x_LUi, y_LUi), let the lower right corner Q_RDi have coordinates (x_RDi, y_RDi), let the width of Ri be m_i and its height n_i, and let the cell size of the reference grid 91 be Δm horizontally and Δn vertically. All of these are variables or constants measured in dots, not in reference grid units, and the following relation holds among m_i, n_i, x_LUi, y_LUi, x_RDi, and y_RDi.
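The relation referred to here, cited later as equation (1), is not reproduced in this text. Assuming the size is simply the coordinate difference between the two corners (the original may count dots inclusively and so carry an extra +1), a form consistent with the definitions above would be:

```latex
m_i = x_{RDi} - x_{LUi}, \qquad n_i = y_{RDi} - y_{LUi} \qquad (i = 1, 2, \dots, k) \tag{1}
```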

Here i is a subscript that takes values from 1 to k and distinguishes the individual format rectangular areas.

FIG. 5 shows the character codes obtained by feeding the two-dimensional image data of FIG. 4 to the character recognition device 2 and performing character recognition on what is written in each reference grid cell; the codes are displayed on a reference grid so that they can be matched against the format input form of FIG. 3. The reference grid 92 is drawn in FIG. 5 for the same reason, but it does not actually exist in the data. Rows and columns are labeled so that individual grid cells can be referred to. Character recognition proceeds in two steps: cutting out the recognition target area, and the recognition processing itself. In this embodiment the recognition target area is one cell of the reference grid 92, and recognition is performed on the character or symbol written in that cell. Specifically, the cut-out position and area size of each recognition target area are obtained by computing, in dots, the position relative to the coordinate origin (x_o, y_o) on the two-dimensional image data of FIG. 4, which gives the position and size of each cell of the reference grid 92. Since the cell size is Δm by Δn as stated above, the position of the character code written in row I, column J of the reference grid (the upper left corner of that cell) is given by (Δm×(I−1), Δn×(J−1)). The area size, that is, the cell size, is Δm, Δn. In FIG. 5 the recognized character codes are placed inside the cells of the reference grid 92 to make the correspondence easy to see. Besides recognizing the format identification characters, the character recognition device 2 also runs recognition on the patterns of format indication line segments and on blanks; since these are not characters, they are assigned and output a code different from the format identification characters (for example, a prearranged code indicating that nothing was recognized). The character recognition device 2 thus outputs the recognized character code string together with the coordinates and size of each character. All recognized character codes may be output, or only those of a predetermined type; in that sense the output of this device is described as a specific character code string.

The character code string recognized in this way is input to the character rectangular area calculation device 3, which extracts and outputs the character code strings corresponding to format identification characters together with the coordinates Li (x_Li, y_Li), on the two-dimensional image data, of the first character of each string. Format identification character codes are extracted by code or by code type. The coordinates of the first character of an extracted string are obtained in the same way as the cut-out of the recognition target area in the character recognition device 2: for a character written in row I, column J of the reference grid, x_Li = Δm×(I−1) and y_Li = Δn×(J−1). They can also be obtained from the character coordinates and sizes supplied by the character recognition device 2. In FIG. 5 alphabetic characters are designated as format identification characters, so the character codes 41₁₆, 42₁₆, 43₁₆ corresponding to the characters A, B, C written in grid cells (I, J), (I, J+1), (I, J+2) are extracted as format identification characters. They are output together with the position of the first character A, which is the position of the character rectangular area occupied by the string, x_Li = Δm×(I−1), y_Li = Δn×(J−1), and the area size Δm×3, Δn. The width Δm×3 is Δm multiplied by three because the characters A, B, C run three in a row in the row direction. In general, the size uses the number of recognized characters that are consecutive in the row direction and in the column direction.

When the two-dimensional image data (FIG. 4) obtained by the image input device 1 is input to the format rectangular area calculation device 4, the area given by the position of the character rectangular area output by the character rectangular area calculation device 3 (Li (x_Li, y_Li) in FIG. 5) and its area size (Δm×3, Δn in FIG. 4) is masked, which removes the pattern of the format identification character string from the two-dimensional image data. This leaves only the format indication line segment FLi, which specifies the format, to be detected. From the positions of the end points Q_LUi and Q_RDi of the format indication line segment FLi obtained in this way, the upper left corner coordinates and the area size of the format rectangular area Ri are output.
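The masking step can be sketched as follows. This is only an illustration under stated assumptions: the binarized image is taken to be a list of rows indexed as `image[y][x]` with dots equal to 0 or 1, and the function name `mask_rect` is hypothetical, not something named in the patent.

```python
def mask_rect(image, x, y, width, height):
    """Return a copy of a binary image with one rectangle cleared to 0.

    image is a list of rows (image[y][x]); (x, y) is the upper left corner
    of the character rectangular area, width and height its size in dots.
    """
    masked = [row[:] for row in image]  # leave the input untouched
    for yy in range(max(y, 0), min(y + height, len(masked))):
        row = masked[yy]
        for xx in range(max(x, 0), min(x + width, len(row))):
            row[xx] = 0
    return masked


# A right-angled indication line (top row and right column) with a 2x2 blob
# standing in for the identification characters.
image = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
masked = mask_rect(image, x=1, y=2, width=2, height=2)
```

After the call, only the dots of the indication line remain set in `masked`, which is the condition the later end-point detection relies on.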

The coordinates of the end points Q_LUi and Q_RDi of the format indication line segment FLi in FIG. 4 are obtained as follows, with reference to FIG. 4. First the data is scanned sequentially in the y direction, and the first dot detected is taken as Q_LUi. Next, starting from Q_LUi, the dots of the format indication line segment FLi are traced, and the dot with the largest y coordinate is taken as Q_RDi.
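A minimal sketch of these two selections is given below. It assumes that "scanning in the y direction" means visiting rows from top to bottom and, within a row, dots from left to right; the text does not spell out the order within a row. The function names are hypothetical, and the traced path is supplied by hand here rather than computed.

```python
def find_start_dot(image):
    """Return (x, y) of the first set dot met in a top-to-bottom scan, or None."""
    for y, row in enumerate(image):
        for x, dot in enumerate(row):
            if dot:
                return (x, y)
    return None


def lowest_dot(path):
    """Return the traced dot with the largest y coordinate (taken as Q_RD)."""
    return max(path, key=lambda p: p[1])


image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
]
q_lu = find_start_dot(image)
# Dots of the line in tracing order, written out by hand for this example.
path = [(1, 1), (2, 1), (3, 1), (3, 2), (3, 3)]
q_rd = lowest_dot(path)
```

For this image the scan yields `(1, 1)` as Q_LU and the path yields `(3, 3)` as Q_RD.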

FIGS. 6a, 6b, and 6c are schematic diagrams explaining one example of how a format indication line segment is traced. FIG. 6a shows the dot D_i,j currently under consideration and its upper, lower, left, and right neighbors D_i−1,j, D_i+1,j, D_i,j−1, and D_i,j+1. The direction in which tracing proceeds from D_i,j follows the state transition diagram of FIG. 6b. As shown there, there are four states, upward ST₁, rightward ST₂, downward ST₃, and leftward ST₄, each indicating the current tracing direction. Transitions between states are possible in the directions of the arrows, and the states have priorities: when two states can be reached from the current one, the transition goes to the one with higher priority, and the lower-priority state is taken only when the higher one is impossible. The upward state ST₁ has the highest priority, followed in descending order by rightward ST₂, downward ST₃, and leftward ST₄.

In the case of FIG. 6c, tracing from the start point A₁ to the end point A₁₉ proceeds as follows. Assume the initial state is upward, ST₁. At the start point A₁ upward tracing is impossible, so the state changes to rightward ST₂ and tracing moves to A₂. At A₂ and A₃ upward tracing is impossible, so priority plays no part and tracing continues to A₄. At A₄ upward tracing is possible, so the state changes to upward ST₁ and tracing moves to A₅. On moving to A₆ the state changes to rightward ST₂. At A₆ neither upward nor rightward tracing is possible, so the state changes to downward ST₃ and tracing moves to A₇. At A₇ tracing upward or rightward is possible, but from the downward state ST₃ a transition is possible only to the rightward state ST₂, so tracing moves to A₈. At A₁₀ the tracing direction turns downward by the same operation as at A₆, and tracing reaches A₁₄ still in the downward state ST₃. From A₁₄ the state transition priorities allow tracing only to the right or left, so tracing moves to A₁₅ and the state becomes leftward ST₄. At A₁₅ tracing again turns downward according to the state transition priorities; then A₁₇ behaves like A₇ and A₁₈ like A₆, and tracing reaches the end point A₁₉.
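The walk-through above can be approximated by the sketch below. Since FIG. 6b is not available here, the sketch rests on assumptions: the only forbidden transition is a reversal of the current direction, the first step is unrestricted (the text instead starts in ST₁), y grows downward, and a visited set is added as a guard against loops. The names `trace_line` and `PRIORITY` are hypothetical.

```python
# Directions as (dx, dy) with y growing downward, in priority order
# ST1 up, ST2 right, ST3 down, ST4 left.
UP, RIGHT, DOWN, LEFT = (0, -1), (1, 0), (0, 1), (-1, 0)
PRIORITY = (UP, RIGHT, DOWN, LEFT)


def trace_line(dots, start):
    """Follow 4-connected set dots from start; return them in visiting order.

    dots is a set of (x, y) coordinates whose value is 1.
    """
    path = [start]
    visited = {start}
    heading = None  # assumption: no restriction on the very first step
    x, y = start
    while True:
        for dx, dy in PRIORITY:
            # Assumed rule: never turn straight back on the current heading.
            if heading is not None and (dx, dy) == (-heading[0], -heading[1]):
                continue
            nxt = (x + dx, y + dy)
            if nxt in dots and nxt not in visited:
                heading = (dx, dy)
                x, y = nxt
                path.append(nxt)
                visited.add(nxt)
                break
        else:
            return path  # no direction is traceable: the end point is reached


# A line that runs right, up, right, down, left, and down again.
dots = {(0, 3), (1, 3), (2, 3), (2, 2), (2, 1), (3, 1), (4, 1),
        (4, 2), (4, 3), (4, 4), (4, 5), (3, 5), (3, 6)}
path = trace_line(dots, (0, 3))
q_rd = max(path, key=lambda p: p[1])  # dot with the largest y coordinate
```

In this example the trace visits all 13 dots and ends at `(3, 6)`, which is also the dot with the largest y coordinate.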

In this way the upper left and lower right corner coordinates (x_LUi, y_LUi) and (x_RDi, y_RDi) of the format rectangular area Ri are obtained. The area size of Ri is expressed from these two coordinate values (upper left corner, lower right corner) as in equation (1) above.

The area identification device 5 takes the format identification character code strings and their coordinates output by the character rectangular area calculation device 3 and the format information output by the format rectangular area calculation device 4 (the upper left corner coordinates of each format rectangular area Ri and its area size), and outputs those that correspond to each other as sets. Let the output of the format rectangular area calculation device 4 be the following k sets of data.

Here m_i and n_i are the horizontal and vertical sizes of the format rectangular area given by equation (1). If the format rectangular areas R_i described by equation (2) do not overlap one another, the following coordinate relation holds.

The coordinates of the k format identification character strings corresponding to the character rectangular areas S_i are expressed as follows.

The correspondence between R_i in equation (2) and C_i in equation (5) is established as follows.

In this way the position (x_LUi, y_LUi) and area size m_i, n_i of each format rectangular area R_i are output from the area identification device 5 as a set with the corresponding format identification character string.
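The equations that define this correspondence are not reproduced in this text. As a hedged illustration only, the sketch below uses the containment case mentioned earlier (an identification string lying inside its format rectangular area) and assumes the rectangles do not overlap. The function name and the tuple layouts are hypothetical.

```python
def assign_identifiers(format_rects, id_strings):
    """Pair each format rectangle with the identification string inside it.

    format_rects: list of (x_lu, y_lu, m, n), upper left corner and size.
    id_strings:   list of (text, x_l, y_l), string and its first-character position.
    Returns a list of (text_or_None, x_lu, y_lu, m, n).
    """
    result = []
    for x_lu, y_lu, m, n in format_rects:
        ident = None
        for text, x_l, y_l in id_strings:
            # Containment test on the first character's upper left corner.
            if x_lu <= x_l < x_lu + m and y_lu <= y_l < y_lu + n:
                ident = text
                break
        result.append((ident, x_lu, y_lu, m, n))
    return result


rects = [(0, 0, 100, 50), (120, 0, 80, 50)]
labels = [("ABC", 130, 10), ("XY", 20, 30)]
pairs = assign_identifiers(rects, labels)
```

Here `pairs` associates "XY" with the first rectangle and "ABC" with the second, regardless of the order in which the strings were recognized. A distance-based rule, which the text also allows, would replace only the containment test.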

Next, one embodiment is described in detail. Since the image input device 1 and the character recognition device 2 are well known, the description covers the remaining character rectangular area calculation device 3, format rectangular area calculation device 4, and area identification device 5.

第７図は文字矩形領域算出装置３の一実施例を
示すブロツク図である。図において、３０は文字
コード比較回路、３１はフオーマツト識別用とし
て取り決められている文字コードが登録されてい
るフオーマツト識別文字コードテーブル、３２は
文字矩形領域演算回路である。文字認識装置２か
らの文字コード列２０１は文字コード比較回路３
０によつて、あらかじめフオーマツト識別文字コ
ードテーブル３１に登録されている文字コードと
比較され、一致した場合は、一致信号３０１、一
致しない場合は、不一致信号３０２が出力され
る。文字矩形領域演算回路３２は、文字認識装置
２から、フオーマツト識別帳票の基準格子サイズ
Δm202およびΔn203と、フオーマツト識別帳票
の基準格子の行列サイズM204およびN205を受け
取り、第５図における第１行第１列の文字コード
から、最初列番号Ｊが変化し、ＪがＭと等しくな
るたびに行番号Ｉが１増加し、そのとき列番号Ｊ
があらためて１からカウントをはじめるという順
に発生する行列番号に従つて直列に送られてくる
文字コード列に対して、一致信号３０１に同期し
て行列番号を決定する。さらに連続しているフオ
ーマツト識別文字コード列については、先頭の文
字の左上隅座標ｘ_Li321、ｙ_Li322および連続して
いる文字列の領域サイズ（Δｍ×行方向文字数
323、Δｎ×列方向文字数324）を出力する。連続
しない文字については、左上隅座標ｘ_Li321、ｙ_Li
322および文字の領域サイズ（Δm323、Δn324）
を出力する。 FIG. 7 is a block diagram showing an embodiment of the character rectangular area calculation device 3. In the figure, 30 is a character code comparison circuit, 31 is a format identification character code table in which the character codes agreed upon for format identification are registered, and 32 is a character rectangular area calculation circuit. The character code comparison circuit 30 compares the character code string 201 from the character recognition device 2 with the character codes registered in advance in the format identification character code table 31; it outputs a match signal 301 on a match and a mismatch signal 302 otherwise. The character rectangular area calculation circuit 32 receives from the character recognition device 2 the reference grid sizes Δm 202 and Δn 203 of the format identification form and the matrix sizes M 204 and N 205 of its reference grid. The character codes arrive serially, starting from the character code in row 1, column 1 of FIG. 5: the column number J changes first, and each time J becomes equal to M the row number I increases by 1 and J starts counting again from 1. For this serial character code string, the circuit determines the row and column numbers in synchronization with the match signal 301. For a run of consecutive format identification character codes, it outputs the upper left corner coordinates x_Li 321 and y_Li 322 of the first character and the area size of the run (Δm × number of characters in the row direction 323, Δn × number of characters in the column direction 324). For a character that is not part of a run, it outputs the upper left corner coordinates x_Li 321, y_Li 322 and the character area size (Δm 323, Δn 324).
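The row and column counting and the run grouping described for circuit 32 can be sketched in software. This is a minimal illustration, not the patented circuit: the function name `character_rects`, the 0-based origin at the upper left corner, the reading of Δm as the horizontal cell size, and the restriction to horizontal runs within one row are all assumptions made for the example.

```python
from typing import List, Set, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical layout


def character_rects(codes: List[str], m_cols: int, dm: int, dn: int,
                    id_codes: Set[str]) -> List[Rect]:
    """Return one rectangle per horizontal run of identification codes.

    codes    : serial character codes, row by row (column J changes first)
    m_cols   : number of grid columns (M in the text)
    dm, dn   : grid cell width and height (assumed meaning of Δm and Δn)
    id_codes : stand-in for the format identification character code table 31
    """
    rects: List[Rect] = []
    run_start = None  # 0-based column where the current run began
    for k, code in enumerate(codes):
        i, j = divmod(k, m_cols)       # row I and column J, 0-based
        match = code in id_codes       # plays the role of match signal 301
        if match and run_start is None:
            run_start = j
        at_row_end = (j == m_cols - 1)
        if run_start is not None and (not match or at_row_end):
            end = j + 1 if match else j  # exclusive end column of the run
            rects.append((run_start * dm, i * dn, (end - run_start) * dm, dn))
            run_start = None
    return rects
```

A run is closed either by a non-matching code or by the end of a grid row, so a single isolated character yields a rectangle of exactly one cell, as the text states for non-consecutive characters.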

第８図はフオーマツト矩形領域算出装置４の一
実施例を示すブロツク図である。 FIG. 8 is a block diagram showing one embodiment of the format rectangular area calculation device 4.

図において、４０は２次元画像データ１０１か
ら文字矩形領域を除去するマスク回路であり、前
記文字矩形領域算出装置、３より転送される文字
矩形領域情報321、322、323、324をレジスタ群４
３にラツチしておき、マスク対象領域を指示す
る。 In the figure, 40 is a mask circuit that removes the character rectangular areas from the two-dimensional image data 101. The character rectangular area information 321, 322, 323, 324 transferred from the character rectangular area calculation device 3 is latched in a register group 43 and designates the areas to be masked.
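As a rough software analogue of the mask circuit 40, the sketch below clears every pixel inside the latched character rectangles so that only the format instruction line segments remain. The function name `mask_rects`, the list-of-rows image representation, and the convention that 0 means "white" are assumptions for illustration only.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical layout


def mask_rects(image: List[List[int]], rects: List[Rect]) -> List[List[int]]:
    """Return a copy of a binary image with the given rectangles cleared to 0."""
    out = [row[:] for row in image]          # leave the input untouched
    height = len(out)
    width = len(out[0]) if out else 0
    for x, y, w, h in rects:
        # Clip each rectangle to the image bounds before clearing it.
        for yy in range(max(y, 0), min(y + h, height)):
            for xx in range(max(x, 0), min(x + w, width)):
                out[yy][xx] = 0
    return out
```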

このようにしてマスクされた２次元画像データ
４０１は前述たように、始点検出回路４１により
フオーマツト矩形領域の左上隅点の座標を求めら
れる。本回路４１は２次元画像データをｙ方向に
走査し、始点を求める回路である。 As described above, the start point detection circuit 41 obtains the coordinates of the upper left corner point of the format rectangular area from the two-dimensional image data 401 masked in this manner. Circuit 41 scans the two-dimensional image data in the y direction to find the start point.

第９図に示すように、２次元画像データ９５の
中で隣り合う２点Ａ（ｘ_c−１、ｙ_c）、Ｂ（ｘ_c、
ｙ_c）の値が、走査（矢印の方向）を開始して初
めて現われたとき、そのときのｘ_c、ｙ_cを始点座
標Ｑ_LUiとする。 As shown in FIG. 9, when the values of two adjacent points A (x_c − 1, y_c) and B (x_c, y_c) in the two-dimensional image data 95 appear for the first time after scanning starts (in the direction of the arrow), the x_c and y_c at that moment are taken as the start point coordinates Q_LUi.

次に終点検出回路４２により、フオーマツト指
示線分の終点即ちフオーマツト矩形領域の右下隅
点の座標を求め、フオーマツト矩形情報としてフ
オーマツト矩形領域位置421、422およびサイズ
423、424を得ることができる。 Next, the end point detection circuit 42 finds the coordinates of the end point of the format instruction line segment, that is, the lower right corner point of the format rectangular area. This yields the format rectangular area position 421, 422 and size 423, 424 as the format rectangle information.

終点検出回路４２は、前述のように追跡回路が
含まれているので第１０図ａ，ｂ，ｃのブロツク
図を用いて説明する。第１０図ａにおいて６１，
６２，６３，６４は、第６図ａに対応して２次元
画像データのドツトをラツチするフリツプフロツ
プである。第６図ａに示すＤ_ijに対応するフリツ
プフロツプは描かれていないがＤ_ijは常に値１を
とるからフリツプフロツプは不要である。 Since the end point detection circuit 42 contains the tracking circuit mentioned above, it is explained using the block diagrams of FIGS. 10a, b, and c. In FIG. 10a, 61, 62, 63, and 64 are flip-flops that latch dots of the two-dimensional image data, corresponding to FIG. 6a. No flip-flop is drawn for D_ij of FIG. 6a: because D_ij always takes the value 1, a flip-flop for it is unnecessary.

CKは各フリツプフロツプ６１，６２，６３，
６４に、データｕ，ｒ，ｄ，ｌをラツチするため
に用いる。各フリツプフロツプ６１，６２，６
３，６４の出力は、それぞれＵ，Ｒ，Ｄ，Ｌであ
る。第１０図ｂは、シフトレジスタ７０を示し、
信号SLによつて下シフト（７１→７４方向）、信
号SRによつて右シフト（７４→７１方向）を行
う。シフトレジスタ７０は、４つのビツト71、
72、73、74をもち、それぞれ前述の上向き、右向
き、下向き、左向き状態を保持する。第１０図ｃ
は、状態遷移を動作させる回路である。８１は、
入力データＵ，Ｒ，Ｄ，Ｌに対し、現在の追跡状
態ST_i＝（ｉ＝１、２、３、４）において許され
る次の追跡方向（例えばST₂ならば上、右、下方
向）にあるデータをゲートして出力する追跡方向
限定回路８１である。例えばST₂状態において８
１より出力されるのはU′，R′，D′で、L′は出力
されない。８２は、８１においてゲートされたデ
ータU′，R′，D′，L′のうち、前述の優先順位に
応じて、値１を持ちかつ優先順位の最も高いもの
を選び出す優先方向選択回路８２である。この出
力U″，R″，D″，L″が次段の状態遷移決定回路８
３に入力されると、現在の追跡状態と、次に追跡
する方向（U″，R″，D″，L″のいずれか一つで決
まる。）とから、第１０図ｂのシフトレジスタの
シフト方向を決定するSR、SLを出力する。 CK is used to latch the data u, r, d, l into the flip-flops 61, 62, 63, 64, whose outputs are U, R, D, and L, respectively. FIG. 10b shows the shift register 70, which shifts down (71→74 direction) on signal SL and shifts right (74→71 direction) on signal SR. The shift register 70 has four bits 71, 72, 73, and 74, which hold the upward, rightward, downward, and leftward states described above, respectively. FIG. 10c shows the circuit that drives the state transitions. 81 is a tracking direction limiting circuit that, for the input data U, R, D, L, gates and outputs the data lying in the next tracking directions permitted in the current tracking state ST_i (i = 1, 2, 3, 4); for ST_2, for example, these are the up, right, and down directions. In state ST_2, therefore, 81 outputs U′, R′, and D′, and L′ is not output. 82 is a priority direction selection circuit that, from the data U′, R′, D′, L′ gated by 81, selects the one that has the value 1 and the highest priority according to the priority order described above. When its outputs U″, R″, D″, L″ are input to the next-stage state transition determination circuit 83, that circuit outputs SR and SL, which determine the shift direction of the shift register of FIG. 10b, from the current tracking state and the next tracking direction (determined by whichever one of U″, R″, D″, L″ is set).
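The three stages 81, 82, and 83 can be mimicked in a few lines. The sketch below rests on assumptions that this section does not confirm: the priority order is passed in as a parameter because it is defined elsewhere in the description, the rule "every direction except the reverse of the current one is permitted" is generalized from the single ST_2 example, and the shift register is treated as circular. All function names are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

# One-hot state positions, mirroring bits 71, 72, 73, 74 of shift register 70.
UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3


def limit_directions(state: int, dots: Dict[int, int]) -> Dict[int, int]:
    """Stage 81: gate off the direction opposite to the current state (assumed rule)."""
    opposite = (state + 2) % 4
    return {d: (v if d != opposite else 0) for d, v in dots.items()}


def select_priority(gated: Dict[int, int], priority: Sequence[int]) -> Optional[int]:
    """Stage 82: pick the highest-priority direction whose dot is 1."""
    for d in priority:
        if gated.get(d, 0) == 1:
            return d
    return None


def shift_signals(state: int, nxt: int) -> Tuple[bool, bool]:
    """Stage 83: return (SL, SR) that moves the one-hot bit from state to nxt."""
    diff = (nxt - state) % 4
    return (diff == 1, diff == 3)  # SL: 71→74 direction, SR: 74→71 direction


def step(state: int, dots: Dict[int, int],
         priority: Sequence[int]) -> Tuple[int, Tuple[bool, bool]]:
    """Run one tracking step; keep the state if no permitted neighbour is set."""
    nxt = select_priority(limit_directions(state, dots), priority)
    if nxt is None:
        return state, (False, False)
    return nxt, shift_signals(state, nxt)
```

For instance, in the rightward state with dots set below and to the left, the left dot is gated off and the step moves to the downward state with SL asserted.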

領域識別装置５は、単に式(6)の大小関係を満た
す対を見いだすための回路である。第１１図にブ
ロツク図を示す。文字矩形領域算出装置３より得
られる文字矩形領域Ｓ_iの位置ｘ_Lj321とｘ_j322と
フオーマツト矩形領域算出装置４より得られるフ
オーマツト矩形領域Ｒ_iの位置ｘ_LUi421、ｙ_LUi422
および領域サイズｍ_i423、ｎ_i424より比較回路１
１００，１２００，１３００，１４００を用いて
式(6)を満足するとき一致信号１１００１をｘ_LU
_ｉ、ｙ_LUi、ｍ_i、ｎ_iと共に出力する。 The area identification device 5 is simply a circuit for finding the pairs that satisfy the magnitude relationships of equation (6). FIG. 11 shows its block diagram. The inputs are the position x_Lj 321, y_Lj 322 of the character rectangular area S_j obtained from the character rectangular area calculation device 3, and the position x_LUi 421, y_LUi 422 and area size m_i 423, n_i 424 of the format rectangular area R_i obtained from the format rectangular area calculation device 4. The comparison circuits 1100, 1200, 1300, and 1400 test equation (6) on these values; when it is satisfied, a match signal 11001 is output together with x_LUi, y_LUi, m_i, and n_i.
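The pairing step can be illustrated as follows. Equation (6) is not reproduced in this passage, so the sketch assumes it is a containment test with four comparisons (matching the four comparison circuits): the upper left corner of a character rectangle must lie within the format rectangle. That reading, the pairing of m with x and n with y, and the name `identify` are assumptions, not statements from the patent.

```python
from typing import List, Optional, Tuple

FormatRect = Tuple[int, int, int, int]  # (x_LU, y_LU, m, n)
CharRect = Tuple[int, int, str]         # (x_L, y_L, identification code string)


def identify(format_rects: List[FormatRect],
             char_rects: List[CharRect]) -> List[Tuple[FormatRect, Optional[str]]]:
    """Pair each format rectangle with the first character rectangle inside it."""
    result: List[Tuple[FormatRect, Optional[str]]] = []
    for (x_lu, y_lu, m, n) in format_rects:
        label: Optional[str] = None
        for (x_l, y_l, codes) in char_rects:
            # Four comparisons, standing in for circuits 1100, 1200, 1300, 1400.
            if x_lu <= x_l <= x_lu + m and y_lu <= y_l <= y_lu + n:
                label = codes
                break
        result.append(((x_lu, y_lu, m, n), label))
    return result
```

A format rectangle with no identification characters inside it is returned with `None`, which a caller could treat as an unlabelled area.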

以上述べたように、本発明によるフオーマツト
入力装置は、実際の帳票サイズに近い形で、フオ
ーマツトを入力することができ、しかもフオーマ
ツト情報に対し、自動識別および分類が可能であ
る。 As described above, the format input device according to the present invention is capable of inputting a format in a form close to the actual document size, and is also capable of automatically identifying and classifying format information.

【図面の簡単な説明】[Brief Description of the Drawings]

第１図は本発明の構成を示すためのブロツク
図、第２図ａは２次元画像上での認識された文字
領域を示す模式図、第２図ｂは、認識された文字
領域のうちフオーマツト識別文字の記述されてい
る文字矩形領域を示す模式図、第２図ｃはフオー
マツト矩形領域をフオーマツト指示線分によつて
記述した模式図、第３図は、２次元画像上に記述
されたフオーマツト指示線分とフオーマツト識別
文字の１例を示す模式図、第４図は第３図におけ
る２次元画像が画像入力装置１によつて読み取ら
れた結果である２次元画像データを示す模式図、
第５図は第４図における２次元画像データを文字
認識装置２によつて認識した結果を示す模式図、
第６図ａはフオーマツト指示線分を追跡する際の
参照領域を示す模式図、第６図ｂは追跡方向状態
の遷移を示すため模式図、第６図ｃは追跡手順を
説明するためのフオーマツト指示線分例の１部を
表わす模式図、第７図は文字矩形領域算出装置３
の１実施例を示すブロツク図、第８図はフオーマ
ツト矩形領域算出装置４の１実施例を示すブロツ
ク図、第９図は、フオーマツト指示線分の始点検
出回路の１実施例を示すブロツク図、第１０図は
フオーマツト矩形領域算出装置４内におけるフオ
ーマツト指示線分追跡を実行する１実施例を示す
ブロツク図、第１１図は、領域識別装置の１実施
例を示すブロツク図である。なお図において、１……画像入力装置、２……
文字認識装置、３……文字矩形領域算出装置、４
……フオーマツト矩形領域算出装置、５……矩形
識別装置、９０，９１，９２……基準格子、３０
……文字コード比較回路、３１……フオーマツト
識別文字コードテーブル、３２……文字矩形領域
算出回路、４０……マスク回路、４１……始点検
出回路、４２……終点検出回路、４３……文字矩
形領域情報格納回路、６１，６２，６３，６４…
…ドツトレジスタ、７０……シフトレジスタ、８
１……追跡方向限定回路、８２……優先方向選択
回路、８３……状態遷移決定回路。 FIG. 1 is a block diagram showing the configuration of the present invention. FIG. 2a is a schematic diagram showing recognized character areas on a two-dimensional image, FIG. 2b is a schematic diagram showing, among the recognized character areas, the character rectangular areas in which format identification characters are written, and FIG. 2c is a schematic diagram in which format rectangular areas are described by format instruction line segments. FIG. 3 is a schematic diagram showing an example of format instruction line segments and format identification characters described on a two-dimensional image. FIG. 4 is a schematic diagram showing the two-dimensional image data that results when the two-dimensional image of FIG. 3 is read by the image input device 1. FIG. 5 is a schematic diagram showing the result of recognizing the two-dimensional image data of FIG. 4 with the character recognition device 2. FIG. 6a is a schematic diagram showing the reference area used when tracking a format instruction line segment, FIG. 6b is a schematic diagram showing the transitions of the tracking direction state, and FIG. 6c is a schematic diagram showing part of an example format instruction line segment used to explain the tracking procedure. FIG. 7 is a block diagram showing an embodiment of the character rectangular area calculation device 3. FIG. 8 is a block diagram showing an embodiment of the format rectangular area calculation device 4. FIG. 9 is a block diagram showing an embodiment of the start point detection circuit for the format instruction line segment. FIG. 10 is a block diagram showing an embodiment that performs format instruction line segment tracking within the format rectangular area calculation device 4. FIG. 11 is a block diagram showing an embodiment of the area identification device. In the figures, 1: image input device, 2: character recognition device, 3: character rectangular area calculation device, 4: format rectangular area calculation device, 5: rectangle identification device, 90, 91, 92: reference grid, 30: character code comparison circuit, 31: format identification character code table, 32: character rectangular area calculation circuit, 40: mask circuit, 41: start point detection circuit, 42: end point detection circuit, 43: character rectangular area information storage circuit, 61, 62, 63, 64: dot registers, 70: shift register, 81: tracking direction limiting circuit, 82: priority direction selection circuit, 83: state transition determination circuit.

Claims

【特許請求の範囲】[Claims]

１２次元画像を走査して得られた濃淡画像デー
タを２値化して出力する画像入力手段と、前記画
像入力手段より得られた２次元画像データから１
或いは複数個の矩形領域C₁、C₂、…、Cnを切り
出し、該各領域に含まれる１或いは複数個の文字
に対し文字認識を行い、対応する文字コードおよ
び２次元画像データ上での文字位置とサイズを出
力する文字認識手段と、前記文字認識手段により
認識された文字の中、特定の種類の文字コードを
含む文字矩形領域S₁、S₂、…、Skを取り出し、
該各領域に対し前記２次元画像データ上の基準点
からの位置およびサイズを算出する文字矩形領域
算出手段と、前記画像入力手段より得られた２次
元画像データから、１或いは複数個のフオーマツ
ト矩形領域R₁、R₂、…、Rkを切り出し、該各領
域に対し前記２次元画像データ上の基準点からの
位置およびサイズを算出するフオーマツト矩形領
域算出手段と、前記フオーマツト矩形領域算出手
段より得られた矩形領域R₁、R₂、…、Rkの位置
およびサイズより、矩形領域R₁、R₂、…、Rkと
前記文字矩形領域算出手段より得られた文字矩形
領域S₁、S₂、…、Skとの位置関係を計算し、フ
オーマツト矩形領域R₁、R₂、…、Rkに対し、対
応する文字矩形領域S₁、S₂、…、Skに含まれる
文字コード列を識別コードとして割り当てる領域
識別手段を有し、２次元画像上に記述された１或
いは複数個のフオーマツト矩形領域の位置とサイ
ズを計算し、該各領域に対し識別文字コード列を
割り当てることを特徴とするフオーマツト入力装
置。 1. A format input device comprising: image input means for binarizing and outputting grayscale image data obtained by scanning a two-dimensional image; character recognition means for cutting out one or more rectangular areas C₁, C₂, …, Cn from the two-dimensional image data obtained from the image input means, performing character recognition on the one or more characters contained in each area, and outputting the corresponding character codes together with the character positions and sizes on the two-dimensional image data; character rectangular area calculation means for extracting, from among the characters recognized by the character recognition means, character rectangular areas S₁, S₂, …, Sk that contain character codes of a specific type, and calculating for each area its position from a reference point on the two-dimensional image data and its size; format rectangular area calculation means for cutting out one or more format rectangular areas R₁, R₂, …, Rk from the two-dimensional image data obtained from the image input means, and calculating for each area its position from the reference point on the two-dimensional image data and its size; and area identification means for calculating, from the positions and sizes of the rectangular areas R₁, R₂, …, Rk obtained from the format rectangular area calculation means, the positional relationship between the rectangular areas R₁, R₂, …, Rk and the character rectangular areas S₁, S₂, …, Sk obtained from the character rectangular area calculation means, and assigning to each of the format rectangular areas R₁, R₂, …, Rk, as an identification code, the character code string contained in the corresponding character rectangular area S₁, S₂, …, Sk; whereby the positions and sizes of one or more format rectangular areas described on a two-dimensional image are calculated and an identification character code string is assigned to each area.