JP2903599B2 - Character recognition device - Google Patents

Character recognition device

Info

Publication number
JP2903599B2
JP2903599B2 JP2037604A JP3760490A
Authority
JP
Japan
Prior art keywords
character
image information
voiced
recognition
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP2037604A
Other languages
Japanese (ja)
Other versions
JPH03240891A (en)
Inventor
学人 杉本
真司 近藤
哲也 松本
禎造 成本
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd
Priority to JP2037604A
Publication of JPH03240891A
Application granted
Publication of JP2903599B2
Anticipated expiration
Expired - Fee Related

Landscapes

  • Character Discrimination (AREA)

Description

Detailed Description of the Invention

Field of Industrial Application
The present invention relates to a character recognition device that recognizes printed and handwritten characters in newspapers, magazines, and the like, including voiced (dakuon) and semi-voiced (handakuon) hiragana and katakana characters, and converts them into information such as JIS codes.

Prior Art
Some character recognition devices automatically correct errors among voiced, semi-voiced, and unvoiced (seion) characters in the result of applying a known character recognition technique to an input image containing voiced and semi-voiced characters. Among these, some split the recognition result into words according to the grammar of the language and make corrections by matching each word against a word dictionary registered in advance.

Problems to be Solved by the Invention
However, the conventional technique described above has the drawback that the word dictionary against which words are matched is enormous, so matching takes a very long time.

The present invention has been made in view of this point. Its object is to provide a character recognition device that corrects, quickly, automatically, and by a simple method, errors among voiced, semi-voiced, and unvoiced characters in the result of recognizing an input image containing voiced and semi-voiced characters with a character recognition technique.

Means for Solving the Problems
To achieve the above object, the present invention is a character recognition device comprising: an image information input unit that inputs image information including a character recognition target; an image information memory unit that stores the image information input to the image information input unit; a character cutout unit that extracts character image information one character at a time from the image information stored in the image information memory unit; a character image information memory unit that stores the character image information extracted by the character cutout unit; a character recognition unit that performs character recognition on the character image information stored in the character image information memory unit to obtain a recognition result; and a voiced/semi-voiced character processing unit that corrects misrecognition of voiced, semi-voiced, and unvoiced characters using the recognition result obtained by the character recognition unit and the character image information stored in the character image information memory unit.

Operation
With the above configuration, the present invention stores the image information input by the image information input unit in the image information memory unit. The character cutout unit extracts character image information for each character from the stored image information and stores it in the character image information memory unit. The character recognition unit performs character recognition on the stored character image information and extracts a recognition result. The voiced/semi-voiced character processing unit then corrects misrecognition of voiced, semi-voiced, and unvoiced characters using the extracted recognition result and the character image information stored in the character image information memory unit.

Embodiment
An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a configuration diagram of one embodiment of a character recognition device according to the present invention. Reference numeral 1 denotes an image information input unit, which scans an image including a character recognition target and stores it in the image information memory unit 2 as a binary signal. Reference numeral 3 denotes a character cutout unit, which extracts character image information for each character from the image information stored in the image information memory unit 2 and stores it in the character image information memory unit 4. Reference numeral 5 denotes a character recognition unit, which performs character recognition on the character image information stored in the character image information memory unit 4 and extracts a recognition result. Reference numeral 6 denotes a voiced/semi-voiced character processing unit, which corrects misrecognition of voiced, semi-voiced, and unvoiced characters using the recognition result extracted by the character recognition unit 5 and the character image information stored in the character image information memory unit 4.
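The data flow of FIG. 1 can be sketched as a small function that chains the three processing units. This is only an illustrative outline, not the patent's implementation: the function name `recognize` and the three callables passed to it are hypothetical stand-ins for units 3, 5, and 6. The point it shows is that the correction stage receives both the recognized label and the stored character image.

```python
def recognize(image, segment, classify, correct):
    """Chain the processing units of FIG. 1 (all names here are hypothetical).

    image    : binary image as stored in the image information memory unit 2
    segment  : stand-in for the character cutout unit 3
    classify : stand-in for the character recognition unit 5
    correct  : stand-in for the voiced/semi-voiced character processing unit 6
    """
    # Unit 3: cut the page image into per-character images (memory unit 4).
    char_images = segment(image)
    # Unit 5: one recognition result per character image.
    labels = [classify(char) for char in char_images]
    # Unit 6: correction uses BOTH the label and the stored character image.
    return [correct(label, char) for label, char in zip(labels, char_images)]
```

Passing the stages in as functions keeps the sketch independent of how each unit is actually realized.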

The operation of the character recognition device configured as described above is explained using the input image 7 shown in FIG. 2 as an example.

The image 7 input from the image information input unit 1 is stored in the image information memory unit 2 as binary data, with black pixels of the character portion as 1 and white pixels of the background portion as 0.
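As a minimal sketch of this encoding, the function below maps a grayscale image to the 1 = black, 0 = white convention. The text does not say how the binary signal is produced, so the fixed threshold of 128 and the function name `binarize` are assumptions made purely for illustration.

```python
def binarize(gray, threshold=128):
    """Convert rows of 0-255 gray levels to 1 (black, character) / 0 (white, background).

    The fixed threshold is an assumption; the source only states the 1/0 encoding.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]
```

For example, `binarize([[0, 255], [200, 10]])` marks the two dark pixels as 1 and the two light pixels as 0.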

The character cutout unit 3 scans the input image 7 stored in the image information memory unit 2 in the horizontal direction and calculates the distances between black pixels. It likewise scans the input image 7 in the vertical direction and calculates the distances between black pixels. Based on the black-pixel distance information obtained from the vertical and horizontal scans, it extracts character image information 8 for each character and stores it in the character image information memory unit 4. FIG. 3 shows the character image information 8 extracted character by character from the input image 7 of FIG. 2 by the character cutout unit 3.
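One common way to realize this kind of gap-based cutting is with projection profiles: rows with no black pixels separate text lines, and sufficiently wide runs of empty columns separate characters. The sketch below uses that technique as a stand-in; the text does not specify the exact rule applied to the distance information, and the names `runs`, `segment_characters`, and `min_col_gap` are hypothetical.

```python
def runs(profile, min_gap=1):
    """Return (start, end) spans (end exclusive) of non-zero entries in a profile.

    Spans separated by fewer than min_gap zero entries are merged, so that
    narrow gaps inside one character do not split it.
    """
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged = []
    for s, e in spans:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged


def segment_characters(image, min_col_gap=1):
    """Cut a binary image (1 = black) into per-character sub-images."""
    chars = []
    row_profile = [sum(row) for row in image]            # black pixels per row
    for top, bottom in runs(row_profile):                # one span per text line
        line = image[top:bottom]
        col_profile = [sum(col) for col in zip(*line)]   # black pixels per column
        for left, right in runs(col_profile, min_col_gap):
            chars.append([row[left:right] for row in line])
    return chars
```

A gap threshold matters for kana such as ハ, whose two strokes are separated by white space; with `min_col_gap` too small, one character would be cut in two.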

The character recognition unit 5 divides the character image information of each character stored in the character image information memory unit 4 into four parts horizontally and four parts vertically, for a total of 16 small regions. FIG. 4 shows the character image information 9 of 『す』 (su) divided in this way. For each of the 16 small regions, the number of black pixels in the character portion and the number of white pixels in the background portion are calculated as feature amounts. These are compared with the feature amounts registered in advance for each character, and the most similar character, 『す』, is taken as the recognition result. FIG. 5 shows the recognition result 10 obtained when the character recognition unit 5 recognizes the character image information 8 of FIG. 3.
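The 4 x 4 zoning features and the template lookup can be sketched as follows. The text only says the "most similar" registered character is chosen, so the sum-of-absolute-differences distance is an assumption, as are the names `zone_features` and `classify` and the premise that character images have been normalized to a common size before comparison.

```python
def zone_features(char, grid=4):
    """Return [black, white, black, white, ...] counts for grid x grid zones."""
    height, width = len(char), len(char[0])
    features = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            black = sum(char[y][x] for y in range(y0, y1) for x in range(x0, x1))
            area = (y1 - y0) * (x1 - x0)
            features.extend((black, area - black))
    return features


def classify(char, templates):
    """Return the label whose registered features are closest to those of char.

    templates maps label -> feature list from zone_features().
    The L1 distance used here is an assumed similarity measure.
    """
    features = zone_features(char)

    def distance(label):
        return sum(abs(a - b) for a, b in zip(features, templates[label]))

    return min(templates, key=distance)
```

With 16 zones and two counts per zone, each character is described by 32 numbers.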

The voiced/semi-voiced character processing unit 6 picks out the voiced, semi-voiced, and unvoiced characters in the recognition result 10 extracted by the character recognition unit 5 and, for the corresponding character image information 8 in the character image information memory unit 4, calculates the number of black pixel blocks, that is, groups of connected black pixels. FIG. 6 shows the black pixel blocks calculated for 『ハ, バ, パ』 (ha, ba, pa). The calculated number of black pixel blocks is then compared with the per-character black pixel block count information 11 registered in advance. The character whose block count matches is taken to be correct, and the recognition result 10 extracted by the character recognition unit 5 is corrected accordingly to give the corrected recognition result 12. FIG. 7 shows the per-character black pixel block count information 11 registered in advance. FIG. 8 shows the recognition result 12 corrected by the voiced/semi-voiced character processing unit 6.
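The block-counting correction can be sketched with a connected-component count followed by a table lookup. Several details here are assumptions: the text does not state whether connectivity is 4- or 8-neighbour (8 is used below), the contents of FIGS. 6 and 7 are not reproduced in the text, so the counts for ハ, バ, and パ are illustrative values typical of printed katakana, and the names `count_black_blocks`, `correct`, `BLOCK_COUNTS`, and `FAMILIES` are hypothetical.

```python
from collections import deque


def count_black_blocks(char):
    """Count 8-connected groups of black pixels (1s) in a binary character image."""
    height, width = len(char), len(char[0])
    seen = [[False] * width for _ in range(height)]
    blocks = 0
    for y in range(height):
        for x in range(width):
            if char[y][x] != 1 or seen[y][x]:
                continue
            blocks += 1                      # new, unvisited block found
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:                     # breadth-first flood fill
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and char[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return blocks


# Illustrative table only; the real values are in FIG. 7, which is not in the text.
BLOCK_COUNTS = {"ハ": 2, "バ": 4, "パ": 3}
# Characters that differ only by the voiced / semi-voiced mark.
FAMILIES = {label: ("ハ", "バ", "パ") for label in ("ハ", "バ", "パ")}


def correct(label, char, families=FAMILIES, block_counts=BLOCK_COUNTS):
    """Replace label with the family member whose registered block count matches."""
    family = families.get(label)
    if family is None:
        return label                         # not a voiced/semi-voiced candidate
    blocks = count_black_blocks(char)
    if block_counts[label] == blocks:
        return label                         # recognition result already consistent
    matches = [c for c in family if block_counts[c] == blocks]
    return matches[0] if len(matches) == 1 else label
```

Because only a count per character is stored, the lookup is far smaller than a word dictionary, which is the speed advantage the description claims over the prior art.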

With the character recognition device configured as described above, character image information can be extracted from an input image to obtain a recognition result, and misrecognition of voiced, semi-voiced, and unvoiced characters can also be corrected.

Effects of the Invention
As described above, according to the present invention, a recognition result can be obtained by a simple method from an input image including a recognition target, and misrecognized voiced, semi-voiced, and unvoiced characters can be corrected automatically.

Brief Description of the Drawings

FIG. 1 is a configuration diagram of a character recognition device according to one embodiment of the present invention; FIG. 2 is an explanatory diagram of an input image; FIG. 3 is an explanatory diagram of the character image information for each character; FIG. 4 is an explanatory diagram of the divided state of the character image information of 『す』; FIG. 5 is an explanatory diagram of the recognition result before correction; FIG. 6 is an explanatory diagram of the black pixel blocks calculated for 『ハ, バ, パ』; FIG. 7 is an explanatory diagram of the per-character black pixel block count information; and FIG. 8 is an explanatory diagram of the recognition result after correction.

1: image information input unit; 2: image information memory unit; 3: character cutout unit; 4: character image information memory unit; 5: character recognition unit; 6: voiced/semi-voiced character processing unit; 7: input image; 8: character image information; 9: divided state of the character image information of 『す』; 10: recognition result before correction; 11: per-character black pixel block count information; 12: recognition result after correction.

Continuation of front page (72) Inventor: 成本 禎造, c/o Matsushita Electric Industrial Co., Ltd., 1006 Oaza Kadoma, Kadoma-shi, Osaka (58) Field searched (Int. Cl.6, DB name): G06K 9/03

Claims (1)

(57) [Claims]
Claim 1. A character recognition device characterized by comprising: an image information input unit that inputs image information including a character recognition target; an image information memory unit that stores the image information input to the image information input unit; a character cutout unit that extracts character image information one character at a time from the image information stored in the image information memory unit; a character image information memory unit that stores the character image information extracted by the character cutout unit; a character recognition unit that performs character recognition on the character image information stored in the character image information memory unit to obtain a recognition result; and a voiced/semi-voiced character processing unit that corrects misrecognition of voiced, semi-voiced, and unvoiced characters using the recognition result obtained by the character recognition unit and the character image information stored in the character image information memory unit.
JP2037604A 1990-02-19 1990-02-19 Character recognition device Expired - Fee Related JP2903599B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2037604A JP2903599B2 (en) 1990-02-19 1990-02-19 Character recognition device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2037604A JP2903599B2 (en) 1990-02-19 1990-02-19 Character recognition device

Publications (2)

Publication Number Publication Date
JPH03240891A JPH03240891A (en) 1991-10-28
JP2903599B2 JP2903599B2 (en) 1999-06-07

Family

ID=12502180

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2037604A Expired - Fee Related JP2903599B2 (en) 1990-02-19 1990-02-19 Character recognition device

Country Status (1)

Country Link
JP (1) JP2903599B2 (en)

Also Published As

Publication number Publication date
JPH03240891A (en) 1991-10-28

Similar Documents

Publication Publication Date Title
JP3139521B2 (en) Automatic language determination device
JP2713622B2 (en) Tabular document reader
JP2000067164A (en) Method and device for pattern recognition and record medium where template generating program is recorded
JP2903599B2 (en) Character recognition device
KR102043693B1 (en) Machine learning based document management system
JP3812719B2 (en) Document search device
JPH11328315A (en) Character recognizing device
JP2985813B2 (en) Character string recognition device and knowledge database learning method
JP2902097B2 (en) Information processing device and character recognition device
KR20200003667A (en) Method for post-processing RESULTS OF Optical character recognition and Optical character recognition apparatus performing thereof
JPH051512B2 (en)
JP2538543B2 (en) Character information recognition device
JP3173363B2 (en) OCR maintenance method and device
JP2856409B2 (en) Character recognition apparatus and method
JP3006823B2 (en) Character and word recognition methods
JPS6095689A (en) Optical character reader
JP2746345B2 (en) Post-processing method for character recognition
CN114115542A (en) Braille processing method, device, storage medium and electronic device
JP2827288B2 (en) Character recognition device
JP2891368B2 (en) Post-processing method of character recognition result
JP3270590B2 (en) Character recognition device
JP2939945B2 (en) Roman character address recognition device
Jose et al. Transcript mapping for handwritten English documents
JP2977244B2 (en) Character recognition method and character recognition device
JPH07107700B2 (en) Character recognition device

Legal Events

Date Code Title Description
LAPS Cancellation because of no payment of annual fees