JPH03260885A

JPH03260885A - Systolic array for high-speed character recognition preprocessing

Info

Publication number: JPH03260885A
Application number: JP2058043A
Authority: JP
Inventors: Masayuki Kimura; 木村　正行; Hirotomo Aso; 阿曽　弘具; Shinichiro Omachi; 真一郎大町; Yutaka Katsuyama; 裕勝山
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1990-03-12
Filing date: 1990-03-12
Publication date: 1991-11-20

Abstract

PURPOSE: To speed up various kinds of processing by arranging dedicated programmable cells as a systolic array and executing the processing in parallel as a pipeline. CONSTITUTION: Each cell 1 comprises a memory 2, an ALU 3, and first and second registers 4 and 5, and the cells are connected as a systolic array. Image data is supplied to each cell one dot at a time, and each cell processes the input data dot by dot. The memory 2 stores a program for processing the input dot data together with the data supplied by the neighboring cell in the direction perpendicular to the input direction, as well as data in process. The ALU 3 performs at least addition and comparison on the data processed by the program. The register 4 stores the input data and outputs it to the next stage, and the register 5 stores the processing result and outputs it to the neighboring cell on the side opposite the cell that supplied data. Preprocessing that handles a large amount of data can thus be executed at high speed.

Description

[Detailed Description of the Invention]

[Summary]

The invention relates to a systolic array for high-speed character recognition preprocessing, which preprocesses image data before character recognition. Its object is to provide a systolic array that performs, at high speed, the preprocessing stage of a character recognition device, where the amount of data handled is large. Cells that receive image data in parallel, at least one dot at a time, and pass the dot data on to the next stage are connected as a systolic array. Each cell comprises: a memory that stores a program for processing the input dot data together with the data supplied by the neighboring cell in the direction perpendicular to the input direction, and that also stores data in process; an ALU that performs at least addition and comparison on the data during that processing; a first register that stores the input data and outputs it to the next stage; and a second register that stores the processing result and outputs it to the neighboring cell on the side opposite the cell that supplied data.

[Industrial Field of Application]

The present invention relates to character recognition devices, and more particularly to a systolic array for high-speed character recognition preprocessing, which preprocesses image data before character recognition.

[Prior Art]

In recent years, documents have come to be managed by computer. As a result, social demand has grown rapidly for optical character readers (OCR) that automatically read existing documents into a computer and enter them efficiently. To meet this demand, OCR is required to be fast, accurate, and compact, and such devices are now being realized.

Character recognition devices likewise call for faster and more accurate OCR, and a variety of recognition methods aimed at higher speed and accuracy have been developed. Their overall flow can be divided into image data input, preprocessing (segmentation and normalization), feature extraction, and recognition. Of these, the preprocessing stage is common to almost all character recognition methods. Preprocessing such as normalization is particularly important, but because it needs global information such as the size of a character, it has conventionally been carried out sequentially.

Character segmentation and normalization are, on the other hand, forms of image processing, and research is under way on applying the small general-purpose parallel processors developed for image processing to character recognition.

However, simply using an image processing processor for character recognition differs greatly, in processing time and other respects, from building a simpler and faster system around a processor dedicated to character recognition; a processor that is not dedicated to character recognition falls short in processing speed.

Some character recognition algorithms also require two-dimensional image processing such as thinning or line-segment extraction after normalization. Thinning and line-segment extraction are iterative operations based on local information around each pixel, and methods for performing them efficiently with pipelined algorithms have already been proposed. No comparable method, however, has been available for realizing normalization efficiently.

[Problem to Be Solved by the Invention]

The character recognition devices described above are required to achieve both a high recognition rate and high speed. Raising the recognition rate requires high-precision scanning, but high-precision scanning increases the amount of data and slows down recognition.

Conversely, raising the speed requires reducing the amount of data, which lowers the recognition rate.

In other words, even when the OCR scanner itself is accurate and fast, the processing needed for recognition is slow, so the recognition device as a whole cannot be both accurate and fast. In particular, the preprocessing described above operates on image data, takes a long time, and delays the entire process.

An object of the present invention is to provide a systolic array for high-speed character recognition preprocessing that performs, at high speed, the preprocessing stage of a character recognition device, where the amount of data handled is large.

[Means for Solving the Problem]

FIG. 1 is a block diagram showing the principle of the present invention.

Each cell 1 has a memory 2, an ALU 3, a first register 4, and a second register 5, and the cells are connected as a systolic array.

The memory 2 stores a program for processing the input dot data together with the data supplied by the neighboring cell in the direction perpendicular to the direction in which the dot data is input, and also stores data in process.

The ALU 3 performs at least addition and comparison on the data processed by the program.

The first register 4 stores the input data and outputs it to the next stage.

The second register 5 stores the processing result and outputs it to the neighboring cell on the side opposite the neighboring cell that supplied data.

[Operation]

Image data is supplied one dot at a time to each of the cells arranged in the systolic array, and each cell processes the data it receives dot by dot.

The memory 2 stores a program for processing the image data. The cell 1 executes this program, performs simple operations such as addition and comparison, and places the result in the second register 5. During this processing the cell also receives a result from the neighboring cell on the side opposite the cell to which its second register outputs, and it uses that result together with the image data held in the first register 4.

The cell places the result in the second register 5 and outputs it to the neighboring cell. The dot data held in the first register 4 is also passed to the cell of the next stage, where the same processing is carried out in turn.

Because the cells 1 form a systolic array, each cell operates on its own dot of the input data, outputs the dot to the next stage, and outputs its result to the neighboring cell. Dot data related in the direction perpendicular to the input direction can therefore be processed in parallel, and because the stages are pipelined, various kinds of processing can be performed at high speed. The operations required are limited to addition, comparison, and the like, so the cell can be built with a simple ALU and made inexpensive.

[Embodiment]

The present invention is described in detail below with reference to the drawings.

FIG. 2 is a flowchart of character recognition in an embodiment of the present invention. To recognize characters printed on paper or a similar medium, the image is first read (input image, S1). A histogram is then calculated over the page that was read (calculate histogram, S2). This histogram calculation adds up the number of dots projected in the vertical and horizontal directions within the page. Because step S2 is also used to correct the skew of the paper, a histogram is calculated for each candidate skew angle within the range of skew to be handled. Skew correction (correct inclination, S3) follows. Step S3 determines the skew of the paper from the histograms calculated for the individual angles in S2 and applies the corresponding correction. The difference between the maximum and minimum histogram values depends on the skew and is largest when the page is in its proper orientation. That is, when histograms are calculated for several candidate angles, the angle at which the difference between the maximum and minimum values is largest is the skew of the paper, and S3 finds this angle.
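As an illustration of the skew search in S2 and S3, the following Python sketch scores a set of candidate skews by the spread (maximum minus minimum) of the row histogram and keeps the best one. It is a minimal sketch under stated assumptions, not the patented circuit: the function names, the candidate list, and the choice to model a skew as a shift of `block_width`-column blocks by whole rows are all hypothetical choices made for the example.

```python
def sheared_row_histogram(image, block_width, direction):
    """Row histogram of black pixels after a hypothetical skew correction.

    Assumption: a skew is modelled by moving the n-th block of
    `block_width` columns (n = 0, 1, 2, ...) by n rows, downward when
    direction is +1 and upward when direction is -1.
    """
    rows = len(image)
    hist = [0] * rows
    for i, row in enumerate(image):
        for j, pixel in enumerate(row):
            target = i + direction * (j // block_width)
            if pixel and 0 <= target < rows:
                hist[target] += 1
    return hist


def estimate_skew(image, candidates):
    """Return the candidate whose histogram has the largest max - min."""
    best, best_spread = None, -1
    for block_width, direction in candidates:
        hist = sheared_row_histogram(image, block_width, direction)
        spread = max(hist) - min(hist)
        if spread > best_spread:
            best, best_spread = (block_width, direction), spread
    return best


# A horizontal line that drops one row every two columns.
page = [[0] * 8 for _ in range(8)]
for j in range(8):
    page[3 + j // 2][j] = 1

candidates = [(2, -1), (2, 1), (4, -1), (4, 1), (8, 1)]
print(estimate_skew(page, candidates))
```

For this test page the candidate that moves each two-column block back up by one row per block gathers the whole line into a single row, so its histogram has the largest spread.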

The result is passed to character segmentation (extract a region of character pattern, S4).

Character segmentation S4 determines the character regions from the histogram, for example from the period of its maxima, and cuts out one character at a time, region by region, applying the correction value obtained in skew correction S3.

In this embodiment a document (a sheet of paper) is read one page at a time, and steps S2 to S4 finally divide the page into regions of one character each. Steps S2 to S4 thus perform the segmentation of character patterns, and normalization is then performed character by character on the resulting regions.

Normalization begins with creation of a transformation table (make transform function for normalization, S5).

As described later, step S5 creates a table that specifies, dot by dot, how each side is to be enlarged or reduced so that the input character is normalized to a specified size. After the table has been created, scaling (map with the transform function, S6) is performed, so that the character in the region read for one character is brought to the specified size. This is the normalization proper.

For example, if the region cut out for one character is M × M dots and the normalized character size is D × D dots, every character found in an M × M region is converted to D × D.

After normalization by steps S5 and S6, features are extracted for each character (extract features, S7), and the distance between those features and each character in a previously prepared dictionary is calculated. Recognition (recognize, S8) is performed on the basis of these distances, and the character code of the dictionary entry with the smallest distance is output as the recognition result (output, S9).

In character recognition of this kind, the steps from histogram calculation S2 through scaling S6 are preprocessing for recognition. All of this processing must be performed dot by dot, and the amount of computation is very large. In the present invention, the preprocessing steps S2 to S6 are mapped onto a systolic array so that they run at high speed. How each step is mapped onto the systolic array is described below.

The normalization algorithm is described first, followed by the systolic array that implements it. Unless otherwise stated, the input image is an M × M binary image and is normalized to a size of D × D.

[Algorithm for Creating the Transformation Functions for Normalization]

Normalization enlarges (or reduces) the original image to a fixed size in order to absorb the effects of differences in the position and size of the character region within the input image. The simplest normalization stretches or shrinks the input image linearly to a fixed size and is called linear normalization. One of the other, nonlinear, normalizations obtains the line densities f(i) and g(j) of the input image in the row and column directions (the number of transitions from a white pixel to a black pixel), defines the transformation functions F(i) and G(j) as

$$F(i) = \sum_{k=i_s}^{i} \bigl( b\,f(k) + 1 \bigr) \qquad (1)$$

$$G(j) = \sum_{k=j_s}^{j} \bigl( b\,g(k) + 1 \bigr) \qquad (2)$$

and maps the image with them. Here i_s and j_s are the topmost row and the leftmost column of the region containing black pixels, and b is a positive weighting coefficient that determines the degree of nonlinearity; it is normally 1.
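The construction of the line densities and the transformation functions can be sketched sequentially in Python. This is a minimal sketch rather than the FIG. 4 program itself: it assumes that F and G accumulate b × density + 1 starting from the first row or column that contains black pixels, which is how the r_acc and c_acc cell functions given later in this description behave, and that H and W are taken at the last row or column with nonzero density. The function names are hypothetical.

```python
def line_density(line):
    """Number of white-to-black (0 -> 1) transitions along one row or column."""
    count, previous = 0, 0
    for pixel in line:
        if previous == 0 and pixel == 1:
            count += 1
        previous = pixel
    return count


def transform_function(densities, b=1):
    """Accumulate b * density + 1, starting at the first non-empty line."""
    values, acc = [], 0
    for density in densities:
        if not (density == 0 and acc == 0):  # inside or below the region
            acc = density * b + acc + 1
        values.append(acc)
    return values


def make_transform(image, b=1):
    """Return (F, G, H, W) for a binary image given as a list of rows."""
    f = [line_density(row) for row in image]
    g = [line_density(column) for column in zip(*image)]
    F = transform_function(f, b)
    G = transform_function(g, b)
    # H and W: value of F / G at the last line that contains black pixels.
    H = max((F[i] for i, d in enumerate(f) if d > 0), default=0)
    W = max((G[j] for j, d in enumerate(g) if d > 0), default=0)
    return F, G, H, W


glyph = [
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
print(make_transform(glyph, b=1))
print(make_transform(glyph, b=0))
```

With b = 0 the densities drop out and F simply counts rows from the top of the character region, which is the linear case.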

FIG. 3 illustrates nonlinear normalization. Normalization in the column direction is described first with reference to FIG. 3. For enlargement, column j' of (a) is mapped to columns ⌈G(j') × D/W⌉ through ⌈G(j'+1) × D/W⌉ − 1 of (b). For reduction, column j of (b) is mapped to the run of consecutive columns of (a) determined by G⁻¹(⌈j × W/D⌉). The row direction is handled in the same way using F(i) and H. Here H and W are the maximum values of F(i) and G(j), respectively, within the region containing black pixels, and ⌈ ⌉ denotes rounding up. Nonlinear normalization of this kind is effective for recognizing handwritten kanji.

Linear normalization corresponds to setting b = 0 in equations (1) and (2), so in what follows all normalization is performed as a mapping by these transformation functions.

The transformation functions F(i) and G(j) can be obtained with the double-loop program of FIG. 4. In the program, r_den(i) and c_den(j) correspond to f(i) and g(j), and r_acc(i) and c_acc(j) correspond to F(i) and G(j). img(i)(j) denotes the pixel in row i, column j of the input image and is 1 for a black pixel and 0 for a white pixel. In addition, H = r_max(M) and W = c_max(M).

[Normalization (Mapping by the Transformation Functions) Algorithm]

Normalization is carried out with the transformation functions in the sequence "row-direction normalization → row/column conversion → column-direction normalization → row/column conversion" (FIG. 6). The row/column conversion (a 90° rotation) is not described in detail here; it is implemented by a memory circuit with two dedicated input/output ports.

Row-direction normalization is performed by the algorithm of equation (3).

Here i_img(i) denotes row i of the input image, o_img(i) denotes row i of the normalized image, and D is the size after normalization.

In general, a loop program such as equation (4), in which the subscript of the array on the right-hand side is given as a function, is difficult to convert into a systolic algorithm. Equation (4) is therefore rewritten, at the cost of some redundancy, as the double-loop program of equation (5).

Here h(i) is assumed to take only integer values between M1 and M2. It is easy to see that equations (4) and (5) are equivalent. With this rewriting, a systolic array that implements equation (4) can be obtained.
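Equations (4) and (5) themselves are not reproduced in this text, so the following Python sketch shows only one plausible reading of the rewriting: an assignment whose right-hand side is indexed through a function h(i) is replaced by a scan over every index j in [M1, M2] guarded by the test j == h(i). The function names, the 0-based indexing, and the exact loop shapes are assumptions made for the illustration.

```python
def gather_direct(source, h, n):
    """Assumed form of equation (4): result[i] = source[h(i)]."""
    return [source[h(i)] for i in range(n)]


def gather_scan(source, h, n, m1, m2):
    """Assumed form of equation (5): scan all j in [m1, m2] and test j == h(i).

    The inner loop visits every candidate index in a fixed order, which is
    the regular access pattern a systolic array needs; the subscript no
    longer depends on data.
    """
    result = [None] * n
    for i in range(n):
        for j in range(m1, m2 + 1):
            if j == h(i):
                result[i] = source[j]
    return result


source = [10, 20, 30, 40, 50]
h = lambda i: (2 * i) % 5  # any function with values in [m1, m2]
print(gather_direct(source, h, 4))
print(gather_scan(source, h, 4, 0, 4))
```

The scan form does more work than the direct form, which is the redundancy the text mentions, but each step touches a fixed neighbor, so the inner loop can be spread over a chain of cells.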

Applying this rewriting to equation (3) and restating the condition in terms of the function F(i) gives the loop program of FIG. 5, in which the transformation function F(i) is represented by the array F(i).

[Normalization Systolic Array]

From the loop programs of FIG. 4 and FIG. 5, a one-dimensional systolic array of M cells that creates the transformation functions and performs normalization can be constructed as shown in FIG. 7. The cell for row i is called "cell i".

FIG. 8 shows the overall array and the structure of one cell. A cell consists of an ALU that can add and compare register values and control which values are stored in the registers, six registers for holding computed values, one register for image data input, and a RAM that describes the cell function. Four of the registers can send their values to the neighboring cell.

The two sets of bracketed labels give the register names used when creating the transformation functions and when normalizing, respectively; these names are used below to describe the cell functions.

A microprogram describing the cell function is stored in the RAM in advance. At every clock, each cell updates the value of each register according to the operation described in the RAM.

The input of data to this systolic array and the cell function of cell i are described next. The input from register r of cell (i−1) is written r(−1), and operation starts at time t = 1.

[Creation of the Transformation Functions on the Systolic Array]

Cell initialization: all registers are set to 0.

Data input: at time t, cell i receives the pixel in row i, column (t − i + 1) of the input image. The data is thus input with each row shifted by one clock relative to the row before it, as shown in FIG. 9.
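The skewed input timing can be tabulated with a short Python sketch. The function name and the use of `None` for clocks at which a cell receives no pixel are assumptions for the illustration; the rule itself, that cell i receives column t − i + 1 of row i at time t, is the one stated above.

```python
def skewed_schedule(image):
    """List, for each clock t = 1, 2, ..., what every cell receives.

    Cell i (1-based) receives the pixel in row i, column t - i + 1 at
    time t, and nothing (None) when that column does not exist.
    """
    m, n = len(image), len(image[0])
    schedule = []
    for t in range(1, m + n):  # the last pixel enters at t = m + n - 1
        step = []
        for i in range(1, m + 1):
            j = t - i + 1
            step.append(image[i - 1][j - 1] if 1 <= j <= n else None)
        schedule.append(step)
    return schedule


for t, step in enumerate(skewed_schedule([[1, 2], [3, 4]]), start=1):
    print(t, step)
```

Each row starts one clock later than the row above it, so pixels of the same column reach neighboring cells on consecutive clocks.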

Cell function: the loop program of FIG. 4 determines the following functions.

    r_den = if img == 0 and input == 1 then r_den + 1 else r_den;
    r_acc = if r_den == 0 and r_acc(-1) == 0 then 0 else r_den × b + r_acc(-1) + 1;
    r_max = if r_den == 0 then r_max(-1) else r_acc;
    c_den = if img(-1) == 0 and input == 1 then c_den(-1) + 1 else c_den(-1);
    c_acc = if c_den == 0 and c_acc == 0 then 0 else c_den × b + c_acc + 1;
    c_max = if c_den == 0 then c_max else c_acc;
    img = input;

In cell 1, r_acc(-1) = r_max(-1) = c_den(-1) = 0. The value of b is normally 1 (nonlinear) or 0 (linear). The functions above include a multiplication by b, but if b is restricted to factors such as 2 or 4 the multiplication can be done by shifting and no multiplier is needed.
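The cell functions for transformation-function creation can be exercised with a small clock-by-clock simulation in Python. This is a sketch under stated assumptions, not a model of the actual hardware: within one clock a cell updates its own registers in the listed order, values marked (-1) are the neighbor's values from the end of the previous clock, cell 1 also sees img(-1) = 0, and a cell that receives no pixel sees input = 0. The register names follow the description (r_den, r_acc, r_max, c_den, c_acc, c_max, img); the function name and the exact clock at which values are sampled are choices made for this simulation.

```python
def simulate_transform_creation(image, b=1):
    """Simulate M cells on an M x M binary image; return (F, G, H, W)."""
    m = len(image)
    names = ("r_den", "r_acc", "r_max", "c_den", "c_acc", "c_max", "img")
    cells = [dict.fromkeys(names, 0) for _ in range(m)]
    boundary = dict.fromkeys(names, 0)  # what cell 1 sees as cell 0
    G = []
    for t in range(1, 2 * m):
        previous = [dict(c) for c in cells]  # snapshot of clock t - 1
        for i in range(1, m + 1):
            c = cells[i - 1]
            up = previous[i - 2] if i > 1 else boundary
            j = t - i + 1  # column reaching cell i at time t
            pixel = image[i - 1][j - 1] if 1 <= j <= m else 0
            if c["img"] == 0 and pixel == 1:
                c["r_den"] += 1
            if c["r_den"] == 0 and up["r_acc"] == 0:
                c["r_acc"] = 0
            else:
                c["r_acc"] = c["r_den"] * b + up["r_acc"] + 1
            c["r_max"] = up["r_max"] if c["r_den"] == 0 else c["r_acc"]
            rising = up["img"] == 0 and pixel == 1
            c["c_den"] = up["c_den"] + (1 if rising else 0)
            if not (c["c_den"] == 0 and c["c_acc"] == 0):
                c["c_acc"] = c["c_den"] * b + c["c_acc"] + 1
            if c["c_den"] != 0:
                c["c_max"] = c["c_acc"]
            c["img"] = pixel
        if t >= m:  # cell M has just finished one more column
            G.append(cells[-1]["c_acc"])
    F = [c["r_acc"] for c in cells]
    return F, G, cells[-1]["r_max"], cells[-1]["c_max"]


glyph = [
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
print(simulate_transform_creation(glyph, b=1))
```

In this simulation the row values F(i) settle in register r_acc of each cell, while the column values G(j) appear one per clock in register c_acc of the last cell, which mirrors the data output described for the array.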

Data output: creation of the transformation functions completes in 2M clocks. The value of the row-direction transformation function for row i is held in register r_acc of cell i. The value of the column-direction transformation function for column j is output from register c_acc of cell M at time t = M + j.

The maximum values H and W in the row and column directions are the values of registers r_max and c_max of cell M when operation completes (time t = 2M).

[Operation during Normalization]

Cell initialization: register i_img is loaded, through the image input register (or directly), with the data of row i of the input image (one row, M bits), and acc1 and acc2 are loaded with D × F(i−1) and D × F(i), respectively. For normalization in the column direction, the column-direction transformation function G(j) is used in place of F(i). All other registers are set to 0.

Data input: none.

Cell function: the loop program of FIG. 5 determines the following functions.

    max = max(-1);
    sum = if max == 0 then 0 else sum + max;
    o_img = if acc1 < sum and sum ≤ acc2 then i_img else o_img(-1);

In cell 1, o_img(-1) = 0 and max(-1) = r_max, where r_max is the row-direction maximum obtained when the transformation functions were created. For normalization in the column direction, the column-direction maximum c_max is used.

Data output: normalization completes in M + D clocks, and row i of the normalized image is output from register o_img of cell M at time t = M + i.
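The normalization pass can be simulated in the same style. The Python sketch below is an illustration under assumptions rather than the actual microprogram: a cell updates its own registers in the listed order within a clock, values marked (-1) come from the neighbor's state at the end of the previous clock, and the output of the last cell is sampled from clock M onward. F and H are taken as given (for example from the transformation-function creation step); the function name is hypothetical.

```python
def simulate_row_normalization(image, F, H, D):
    """Row-direction normalization of an M-row image to D rows."""
    m = len(image)
    blank = [0] * len(image[0])
    cells = []
    for i in range(m):
        cells.append({
            "i_img": list(image[i]),
            "acc1": D * (F[i - 1] if i > 0 else 0),
            "acc2": D * F[i],
            "max": 0,
            "sum": 0,
            "o_img": blank,
        })
    boundary = {"max": H, "o_img": blank}  # what cell 1 sees as cell 0
    output = []
    for t in range(1, m + D):
        previous = [dict(c) for c in cells]  # snapshot of clock t - 1
        for i, c in enumerate(cells):
            up = previous[i - 1] if i > 0 else boundary
            c["max"] = up["max"]
            c["sum"] = 0 if c["max"] == 0 else c["sum"] + c["max"]
            if c["acc1"] < c["sum"] <= c["acc2"]:
                c["o_img"] = c["i_img"]  # this cell's row is selected
            else:
                c["o_img"] = up["o_img"]  # pass the neighbor's row along
        if t >= m:
            output.append(cells[-1]["o_img"])
    return output


glyph = [
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
for row in simulate_row_normalization(glyph, F=[0, 3, 5, 6], H=5, D=4):
    print(row)
```

In this example the two rows that contain black pixels are stretched to fill all four output rows, and the blank rows above and below the character are dropped.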

When register i_img has to be limited to N bits (N < M), the image is normalized N columns at a time. The values of acc1 and acc2 can be computed without a multiplier by using the cell function for transformation-function creation with "+D" in place of "+1".

The description so far has covered normalization, including nonlinear normalization, on the systolic array. Further varieties of normalization are described below.

Using the normalization systolic array and changing only the data loaded into the registers, various normalizations that are likely to be useful in character recognition can be realized. The changes are described below for the row direction; the column direction can be changed in exactly the same way, and the row and column directions can also be combined.

In what follows, the value stored in register acc2 of cell i and in register acc1 of cell (i+1) is F(i) × size + pos, and the value assigned to register max of cell 1 is written max. The various normalizations are obtained by the specific choice of size, pos, and max. Ordinary normalization uses size = D, pos = 0, and max = r_max.

Specified-size normalization: when the characters to be recognized are printed by a word processor, commonly used character sizes include full-width, half-width, and quarter-width. All of them could be normalized to the same size before recognition, but full-width and half-width characters generally use different fonts, and it is sometimes necessary to recognize a full-width character as full-width and a half-width character as half-width. It is therefore better if the normalized size is not fixed at D × D but can be changed freely.

With such normalization the normalized image does not fill the whole D × D area, and blank space remains. It is therefore convenient to be able to specify where within the D × D area the character region is placed.

To make the normalized size d, set size = d and max = r_max. The value of pos depends on where the character region is placed. If no multiplier is used, d is limited to powers of 2.

Equal-aspect-ratio normalization: with ordinary normalization, a tall or wide character such as "1" or "-" has its black-pixel region spread over the entire D × D frame, and the information about its original shape is lost. In such cases it is desirable to normalize the vertical and horizontal directions by the same ratio.

When the input image is taller than it is wide, the row direction is processed in the ordinary way. When it is wider than it is tall (c_max > r_max), the two directions are normalized by the same ratio by setting size = D and max = c_max. This approach can be combined with specified-size normalization, so that a character can, for example, be fitted within a d × d area with its aspect ratio preserved.

Position normalization: when recognition is performed by the so-called pattern matching method, a normalization that only aligns the position, without scaling or deforming the character, may also be needed. For this normalization, set size = 1 and max = 1.

FIG. 10 shows an example in which the three normalizations described above were actually carried out with the processor of FIG. 8.

Normalization has been described in detail above using the embodiment. Application of the embodiment to character segmentation is described next.

The normalization systolic array is applied to the black-pixel histogram calculation and the skew correction needed for character segmentation. The data handled here is an M × N binary image, where N corresponds to the horizontal length of the input image data.

[Histogram calculation]

Histogram calculation uses almost the same algorithm as the creation of the normalization conversion function and can be written as the loop program of FIG. 11. With the data input method left unchanged, the cell function can be realized as follows.

r_den = if input == 1 then r_den + 1 else r_den ;
c_den = if input == 1 then c_den(-1) + 1 else c_den(-1) ;

If only the histogram is needed, the other registers are not used. However, if r_acc and c_acc are also used and the cell function is defined as

r_acc = r_den + r_acc(-1) ;
c_acc = c_den + c_acc ;

then the cumulative values of the histograms in the row and column directions, the number of black pixels in the whole image, and similar quantities can be obtained. Some character segmentation algorithms are likely to use these values.
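The quantities computed by the cell functions above can be checked against a simple sequential sketch. This is not a simulation of the systolic timing. It only computes the same results, reading r_den and c_den as the black-pixel counts per row and per column and r_acc and c_acc as their running sums (an interpretation of the register names). The function name `histograms` is hypothetical.

```python
from itertools import accumulate


def histograms(image):
    """Return (r_den, c_den, r_acc, c_acc) for a binary image.

    image is a list of M rows, each a list of N dots (0 or 1).
    """
    rows, cols = len(image), len(image[0])
    r_den = [0] * rows                  # black pixels in each row
    c_den = [0] * cols                  # black pixels in each column
    for i, row in enumerate(image):
        for j, dot in enumerate(row):
            if dot == 1:
                r_den[i] += 1
                c_den[j] += 1
    r_acc = list(accumulate(r_den))     # cumulative row histogram
    c_acc = list(accumulate(c_den))     # cumulative column histogram
    return r_den, c_den, r_acc, c_acc
```

The last element of either cumulative list is the number of black pixels in the whole image, which is the total mentioned in the text.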

[Tilt correction]

As shown in FIG. 12, tilt correction divides the input image into blocks of K columns each and shifts the L-th block up or down by L−1 rows. The tilt is assumed to be small. K is a constant determined from the tilt of the input image, chosen so that the block shifts described above correct the tilt.
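The block shift described above can be sketched sequentially as follows. This is an illustration of the image transformation only, not of the cell program. The function name `correct_tilt`, the `direction` parameter, and the filling of vacated dots with 0 are assumptions.

```python
def correct_tilt(image, K, direction=1):
    """Shift the L-th block of K columns by (L - 1) rows.

    image is a list of M rows of N dots (0 or 1). direction = +1 shifts
    blocks down, -1 shifts them up. Vacated dots become 0 (assumption).
    """
    M, N = len(image), len(image[0])
    out = [[0] * N for _ in range(M)]
    for j in range(N):
        block = j // K + 1                  # 1-based block number L
        shift = direction * (block - 1)     # block L moves L - 1 rows
        for i in range(M):
            src = i - shift                 # source row for output row i
            if 0 <= src < M:
                out[i][j] = image[src][j]
    return out


# A stroke that drops one row per block is straightened by shifting up.
tilted = [[1, 1, 0, 0],
          [0, 0, 1, 1],
          [0, 0, 0, 0]]
print(correct_tilt(tilted, K=2, direction=-1))
```

A smaller K gives more blocks and therefore corrects a steeper tilt, which is why K has to be derived from the tilt of the input image.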

This tilt correction can be regarded as position normalization applied to each block. The operation when handling the L-th block is as follows.

Cell initialization: register i_img stores the data of the i-th row of the input image (K bits).

Data input: there is none.

Cell function: cell i is given the following function.

max = max(-1) ;
o_img = if max == i then i_img else o_img(-1) ;

In cell 1, however, o_img(-1) = 0 ;. Here t denotes the time and L the block number.

Data output: the operation for each block completes in 2M clocks, and the i-th row of the converted image is output from register o_img of cell M at time t = M + i.

The embodiments of the present invention have been described above for conversion table creation, normalization, histogram calculation, and tilt correction. All of these use almost the same configuration, and the processor in each cell need only be an ALU capable of comparison, addition, and the like, so it can be built as an LSI with far fewer elements than a conventional processor. In addition, various kinds of processing can be performed merely by changing the program.

[Effect of the invention]

As described above, according to the present invention, dedicated programmable cells are arranged as a systolic array and processing is executed in parallel in a pipeline, so that various kinds of processing can be accelerated. Because each programmable cell consists of a simple ALU, it can be made inexpensively. Furthermore, because the cells are programmable, the same array can be used for conversion table creation, normalization, histogram calculation, tilt correction, and so on.

[Brief description of the drawings]

FIG. 1 is a block diagram of the principle of the present invention.
FIG. 2 is a flowchart of character recognition in an embodiment of the present invention.
FIG. 3 is a diagram showing an example of nonlinear normalization.
FIG. 4 is a diagram showing the creation of the conversion function.
FIG. 5 is a diagram showing the loop program for normalization.
FIG. 6 is a diagram showing the flow of normalization.
FIG. 7 is a configuration diagram of the array.
FIG. 8 is a configuration diagram of the cell.
FIG. 9 is a data flow chart.
FIG. 10 is a diagram showing various normalizations.
FIG. 11 is a diagram showing histogram calculation.
FIG. 12 is a diagram showing tilt correction.

1: cell, 2: memory, 3: ALU, 4: first register, 5: second register.

Claims

[Claims] A systolic array for high-speed character recognition preprocessing, characterized in that cells (1), to which image data is applied in parallel at least in units of dots and which output the dot data to the next stage, are connected as a systolic array, each cell comprising: a memory (2) that stores a program for processing the input dot data and data applied from an adjacent cell in the direction perpendicular to the direction in which the data is input, together with the data being processed; an ALU (3) that performs at least addition and comparison of data in said processing; a first register (4) that stores the input data and outputs it to the next stage; and a second register (5) that stores the result of said processing and outputs it to the adjacent cell opposite said adjacent cell.