JPS592191A - Recognizing and processing system of handwritten japanese sentence - Google Patents

Recognizing and processing system of handwritten japanese sentence

Info

Publication number
JPS592191A
Authority
JP
Japan
Prior art keywords
character
text
candidates
kanji
japanese
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP57111912A
Other languages
Japanese (ja)
Other versions
JPH0247788B2 (en)
Inventor
Naoki Morimoto
直樹 森本
Michiaki Nakanishi
道明 中西
Masahiro Okawa
大川 正廣
Yasunao Isaki
伊崎 保直
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1982-06-29
Filing date
1982-06-29
Publication date
1984-01-07
Application filed by Fujitsu Ltd
Priority to JP57111912A
Publication of JPS592191A
Publication of JPH0247788B2

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/28 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
    • G06V30/287 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of Kanji, Hiragana or Katakana characters

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Abstract

PURPOSE: To raise the recognition accuracy for kanji (Chinese characters) and to discriminate kanji from non-kanji characters of the same shape, by using furigana (Japanese phonetic syllabary glosses), which are easier to recognize than kanji.

CONSTITUTION: The characters on a medium 1 to be read (furigana 12 and main text 11) are scanned and read by a photoelectric conversion device 2 such as a CCD, and are input to a recognition preprocessing circuit 3. The circuit 3 extracts the features of each character, selects candidates for each character, and outputs them. The candidate output from the circuit 3 is routed, according to whether it is furigana or main text, to buffers 4 and 5, which store the furigana and the main text respectively. The kana-kanji correspondence output that is read out for the furigana and the text candidate output from the buffer 5 are collated successively, so that the kanji candidates in the main text are narrowed from 20 candidate categories per character down to 2 or 3. The selected character candidates are stored in a candidate buffer 8, whose output is input to a selection circuit 9. There the character candidates are arranged side by side to form a sentence, and the final correct category is selected.

Description

DETAILED DESCRIPTION OF THE INVENTION
(1) Technical field of the invention
The present invention relates to a character recognition device that recognizes handwritten Japanese text, and in particular to a recognition processing method that improves the recognition accuracy for handwritten Japanese text by attaching furigana only to the kanji portions.

(2) Technical background
In a character recognition device, the pattern is generally observed first, and the shape of the character is then extracted. The features of the pattern are compared with features obtained from past data stored in the device, and the most similar pattern is selected. In other words, a matching step is performed, and a decision that identifies the given pattern is then made from its result. Characters are recognized in this way, and recent character recognition devices have been developed not only for handwritten kana but also for handwritten kanji.
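The matching step described above can be pictured as a nearest-prototype search. The following is a minimal sketch under stated assumptions: the feature vectors, the squared Euclidean distance, and the name `rank_candidates` are illustrative choices and are not taken from the patent, which does not specify the features or the similarity measure.

```python
def rank_candidates(features, prototypes):
    """Rank stored category labels by similarity to an observed feature vector.

    `prototypes` maps a category label to a stored feature vector
    (hypothetical data; the patent does not define the features).
    The most similar category comes first.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return sorted(prototypes, key=lambda label: squared_distance(features, prototypes[label]))


# Toy stored features for three categories.
PROTOTYPES = {"A": (0.0, 0.0), "B": (1.0, 1.0), "C": (5.0, 5.0)}
ranking = rank_candidates((0.9, 1.2), PROTOTYPES)
```

The first element of `ranking` plays the role of the decision; the remaining elements are the lower-ranked candidates that a later stage may still consider.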

(3) Prior art and problems
Kanji recognition has the problem that its accuracy does not improve, because feature extraction is harder than in conventional recognition, which extended only as far as katakana. In addition, Japanese text raises the following problem.

[...] is very difficult. For example, the katakana エ in "エア" (air) and the kanji 工 in "工事" (construction), or the kanji 力 in "力士" (sumo wrestler) and the katakana カ in "カサ", cannot be distinguished by shape. As a countermeasure, a method is known in which the recognition device judges the context of the sentence in order to distinguish non-kanji from kanji, but this method is complicated and does not always succeed in distinguishing them.
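To make the look-alike problem concrete, the short sketch below checks the two pairs named in the text using Python's standard `unicodedata` module. It only shows that each pair consists of two different characters from different scripts even though their shapes are nearly identical. The helper `script_of` and its coarse labels are illustrative and not part of the patent.

```python
import unicodedata

# Look-alike pairs from the text, written as escapes so the two
# characters of each pair cannot be confused in the source code.
KATAKANA_E, KANJI_KOU = "\u30a8", "\u5de5"       # katakana E vs. kanji "work"
KATAKANA_KA, KANJI_CHIKARA = "\u30ab", "\u529b"  # katakana KA vs. kanji "power"
LOOKALIKES = [(KATAKANA_E, KANJI_KOU), (KATAKANA_KA, KANJI_CHIKARA)]


def script_of(ch):
    """Return a coarse script label derived from the Unicode character name."""
    name = unicodedata.name(ch)
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"
    return "other"


pairs = [(script_of(a), script_of(b)) for a, b in LOOKALIKES]
```

A shape-based recognizer sees almost the same strokes for both members of a pair, which is why some information beyond shape is needed to decide between them.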

As another method, attaching identification marks for non-kanji and for kanji to the handwritten Japanese text itself has also been considered, but writing such marks inside the character frames is troublesome and impractical.

(4) Object of the invention
The object of the present invention is to eliminate the above drawbacks of the prior art and to provide a recognition processing method for handwritten Japanese text that improves the recognition accuracy for kanji and can easily distinguish non-kanji from kanji.

(5) Structure of the invention
According to the present invention, this object is achieved by providing a recognition processing method for handwritten Japanese characterized in that a handwritten Japanese text, and handwritten furigana characters attached only to the kanji of that text, are read separately from each other; the kanji are recognized by collating the recognition output of the handwritten Japanese text with the kana-kanji correspondence output for the recognition output of the handwritten furigana characters; and any portion for which no corresponding furigana exists is recognized as non-kanji.

(6) Embodiment of the invention
An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 1 shows a block diagram for realizing the recognition processing method for handwritten Japanese text according to the present invention.

In FIG. 1, reference numeral 1 denotes a medium to be read, on which a handwritten Japanese text 11 is written together with handwritten furigana characters 12 attached only to the kanji of the text 11. The characters on the medium 1 (furigana 12 and main text 11) are scanned and read by a photoelectric conversion device 2 such as a CCD, and are input as image information to a recognition preprocessing circuit 3.

The recognition preprocessing circuit 3 extracts the features of each character, selects candidates for each character, and outputs them. For the furigana, the correct answer is obtained in almost every case, so it suffices to select only the top-ranked candidate.
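As a minimal sketch of this candidate selection, the function below keeps only the top-ranked candidate for a furigana frame and passes on the ranked list for a main-text frame. The function name, the list representation, and the default cap of 20 (taken from the figure of 20 candidate categories per character stated elsewhere in this document) are assumptions made for illustration.

```python
def select_candidates(ranked, is_furigana, text_limit=20):
    """Choose which ranked candidates to pass on for one character frame.

    `ranked` lists candidate characters, best match first.
    Furigana frames keep only the first candidate; main-text frames
    keep up to `text_limit` candidates (hypothetical parameter).
    """
    if is_furigana:
        return ranked[:1]
    return ranked[:text_limit]
```

For example, `select_candidates(["と", "ヒ"], True)` yields a single candidate, while a main-text frame keeps its whole ranked list so that a later stage can narrow it.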

However, for the main text, which contains kanji, the candidates for each character are output as they are. Reference numerals 4 and 5 denote buffers that store the furigana and the main text respectively; the candidate output from the recognition preprocessing circuit 3 is routed to one or the other according to whether it is furigana or main text. Reference numeral 6 denotes a read controller. It accesses the reading (yomi) dictionary memory 7 according to the furigana output from the furigana buffer 4, and sequentially collates the kana-kanji correspondence output thus read out with the candidate-group output for the main text from the text buffer 5. In this way it sifts the kanji candidates in the main text, narrowing the 20 candidate categories for each character down to two or three. When collation against the kana-kanji correspondence output shows that a character of the main text has no corresponding furigana, the read controller 6 judges that character to be non-kanji: katakana, hiragana, a letter of the alphabet, an Arabic numeral, or another symbol.
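The narrowing performed by the read controller can be sketched as a set intersection between the shape-based candidates and the kanji that the dictionary lists for a reading. This is a simplified illustration, not the patent's circuit: it assumes one reading per character frame, a hypothetical in-memory `YOMI_DICT` with toy entries, and a kanji test based on the CJK Unified Ideographs range.

```python
def is_kanji(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def narrow(candidates, reading, yomi_dict):
    """Narrow the shape-based candidates for one main-text character.

    reading   : furigana reading for this character, or None if it has none
    yomi_dict : hypothetical mapping from a reading to the set of kanji with it
    """
    if reading is None:
        # No furigana: the character is judged to be non-kanji.
        return [c for c in candidates if not is_kanji(c)]
    allowed = yomi_dict.get(reading, set())
    return [c for c in candidates if c in allowed]


KANJI_KOU = "\u5de5"   # kanji read "kou"
KATAKANA_E = "\u30a8"  # katakana with almost the same shape
YOMI_DICT = {"こう": {KANJI_KOU, "高"}}  # toy dictionary entry

with_furigana = narrow([KATAKANA_E, KANJI_KOU, "土"], "こう", YOMI_DICT)
without_furigana = narrow([KANJI_KOU, KATAKANA_E], None, YOMI_DICT)
```

With a reading present, only the kanji that the dictionary lists for that reading survive; with no reading, the same look-alike pair resolves to the katakana.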

The character candidates selected by the read controller 6 in this way (two or three candidates for each character) are stored in a candidate buffer 8. The character candidate output from the candidate buffer 8 is input to a selection circuit 9, where the candidates are arranged side by side so as to form a sentence and processing is performed to select the single, final correct category for each character. The character output from the selection circuit 9 is passed to an output circuit 10 such as a floppy disk or a computer.
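The patent does not say how the selection circuit chooses among the remaining candidates once they are lined up as a sentence. The sketch below is therefore only one possible reading: it enumerates every combination of the per-character candidates and keeps the one containing the most words from a hypothetical lexicon. The scoring rule, the lexicon, and the name `pick_sentence` are all assumptions.

```python
from itertools import product


def pick_sentence(candidates_per_char, lexicon):
    """Choose one candidate per character so the resulting string scores best.

    candidates_per_char : list of candidate lists, one per character position
    lexicon             : hypothetical set of known words used for scoring
    Returns the best string, or None if some position has no candidates.
    """
    best, best_score = None, -1
    for combo in product(*candidates_per_char):
        sentence = "".join(combo)
        score = sum(1 for word in lexicon if word in sentence)
        if score > best_score:
            best, best_score = sentence, score
    return best


chosen = pick_sentence([["当", "肖"], ["社", "杜"]], {"当社"})
```

Exhaustive enumeration is affordable here only because each position has been cut to two or three candidates beforehand; with 20 candidates per position the number of combinations would grow far too quickly.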

FIG. 3 is a diagram for explaining the operation of the read controller 6 in FIG. 1. FIG. 3(a) shows an example of the main text 11 and its furigana 12, each numbered consecutively by character frame. FIGS. 3(b) to 3(e) show how the kanji candidates in the main text 11 that correspond to a numbered run of frames in the furigana 12 of FIG. 3(a) are recognized and decided. Each shows, for a different starting point within that run of furigana, whether the kana-kanji correspondence output for the furigana reading matches any of the candidate categories from the text buffer 5 (FIG. 1). In the illustrated case, the kanji corresponding to one run of furigana frames are determined to be "当社" (our company). In the same way, the kanji corresponding to a further run of furigana frames are determined to be "工事業者" (construction contractor), and the portion "はエアコン" of the main text is therefore recognized as non-kanji, in this case katakana.
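The idea of trying different split points in a run of furigana until the dictionary output agrees with the text candidates can be sketched as a small recursive search. This is an illustration under assumptions: readings are looked up per character in a hypothetical `YOMI_DICT`, and the function `match_run` with its `max_len` parameter is invented here; the figure itself may well use word-level dictionary entries.

```python
def match_run(kana, cand_sets, yomi_dict, max_len=4):
    """Align a run of furigana with consecutive kanji frames.

    kana      : furigana string covering the run, e.g. "とうしゃ"
    cand_sets : shape-based candidate lists, one per kanji frame
    yomi_dict : hypothetical mapping from a reading to the set of kanji with it
    Returns one kanji per frame, or None if no split of the reading fits.
    """
    if not cand_sets:
        return [] if not kana else None
    for n in range(1, min(max_len, len(kana)) + 1):
        # Kanji that have this reading and are also shape candidates.
        hits = yomi_dict.get(kana[:n], set()) & set(cand_sets[0])
        for kanji in sorted(hits):
            rest = match_run(kana[n:], cand_sets[1:], yomi_dict, max_len)
            if rest is not None:
                return [kanji] + rest
    return None


YOMI_DICT = {"と": {"戸"}, "とう": {"当", "東"}, "しゃ": {"社", "車"}}  # toy entries
aligned = match_run("とうしゃ", [["当", "肖"], ["社", "杜"]], YOMI_DICT)
```

In this toy run the split "と" + ... fails because 戸 is not among the shape candidates, and the split "とう" + "しゃ" succeeds, which mirrors the way a match is found only for the right starting point.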

(7) Effects of the invention
As described above, the present invention uses furigana, which are easier to recognize than kanji, to discriminate between kanji and non-kanji of the same shape. It therefore raises the recognition accuracy for kanji and allows non-kanji and kanji to be distinguished and recognized easily.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram for realizing the recognition processing method of the present invention; FIG. 2 is an example of the characters written on the medium 1 to be read in FIG. 1; and FIG. 3 is a diagram for explaining the operation of the read controller 6 in FIG. 1.
In the figures, 1 denotes the medium to be read, on which the Japanese text 11 and the furigana 12 are written; 2 a photoelectric conversion device; 3 a recognition preprocessing circuit; 4 a furigana buffer; 5 a text buffer; 6 a read controller; and 7 a reading (yomi) dictionary memory.

Claims (1)

CLAIMS
A recognition processing method for handwritten Japanese text, characterized in that a handwritten Japanese text and handwritten furigana characters attached only to the kanji of that text are read separately from each other; kanji are recognized by collating the recognition output of the handwritten Japanese text with the kana-kanji correspondence output for the recognition output of the handwritten furigana characters; and any portion for which no corresponding furigana exists is recognized as non-kanji.
JP57111912A 1982-06-29 1982-06-29 Recognizing and processing system of handwritten japanese sentence Granted JPS592191A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57111912A JPS592191A (en) 1982-06-29 1982-06-29 Recognizing and processing system of handwritten japanese sentence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57111912A JPS592191A (en) 1982-06-29 1982-06-29 Recognizing and processing system of handwritten japanese sentence

Publications (2)

Publication Number Publication Date
JPS592191A (en) 1984-01-07
JPH0247788B2 (en) 1990-10-22

Family

ID=14573230

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57111912A Granted JPS592191A (en) 1982-06-29 1982-06-29 Recognizing and processing system of handwritten japanese sentence

Country Status (1)

Country Link
JP (1) JPS592191A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6334680A (en) * 1986-07-29 1988-02-15 Toshiba Corp Character reader
JPH04242491A (en) * 1991-01-17 1992-08-31 Nec Corp Optical character reader
US5326713A (en) * 1992-09-04 1994-07-05 Taiwan Semiconductor Manufacturies Company Buried contact process
US5814541A (en) * 1987-12-04 1998-09-29 Kabushiki Kaisha Toshiba Method for manufacturing semiconductor device
EP4047519A1 (en) 2021-02-22 2022-08-24 Carl Zeiss Vision International GmbH Devices and methods for processing eyeglass prescriptions
EP4101367A1 (en) 2021-06-09 2022-12-14 Carl Zeiss Vision International GmbH Method and device for determining a visual performance

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5699573A (en) * 1980-01-09 1981-08-10 Hitachi Ltd Kanji (chinese character) distinction system using katakana (square form of japanese syllabary)
JPS5699581A (en) * 1980-01-10 1981-08-10 Toshiba Corp Kanji (chinese character) read method

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6334680A (en) * 1986-07-29 1988-02-15 Toshiba Corp Character reader
US5814541A (en) * 1987-12-04 1998-09-29 Kabushiki Kaisha Toshiba Method for manufacturing semiconductor device
JPH04242491A (en) * 1991-01-17 1992-08-31 Nec Corp Optical character reader
US5326713A (en) * 1992-09-04 1994-07-05 Taiwan Semiconductor Manufacturies Company Buried contact process
EP4047519A1 (en) 2021-02-22 2022-08-24 Carl Zeiss Vision International GmbH Devices and methods for processing eyeglass prescriptions
WO2022175511A1 (en) 2021-02-22 2022-08-25 Carl Zeiss Vision International Gmbh Devices and methods for processing eyeglass prescriptions
EP4101367A1 (en) 2021-06-09 2022-12-14 Carl Zeiss Vision International GmbH Method and device for determining a visual performance
WO2022258647A1 (en) 2021-06-09 2022-12-15 Carl Zeiss Vision International Gmbh Method and device for determining a visual performance

Also Published As

Publication number Publication date
JPH0247788B2 (en) 1990-10-22

Similar Documents

Publication Publication Date Title
JP2713622B2 (en) Tabular document reader
JPS592191A (en) Recognizing and processing system of handwritten japanese sentence
KR970059977A (en) Online Character Recognition Method and Online Character Recognition Device
JPS5842904B2 (en) Handwritten kana/kanji character recognition device
Maheshwari et al. Offline handwriting recognition with emphasis on character recognition: A comprehensive survey
JPH0896081A (en) Character recognizing device and character recognizing method
JP3507720B2 (en) Online handwritten character recognition device and computer-readable recording medium
JP3763262B2 (en) Handwritten character recognition device
JPS60217483A (en) Recognizer of character
JP2972443B2 (en) Character recognition device
JPS607586A (en) Character information recognizer
KR900005141B1 (en) Handwritter character recognizing device
JPH06119497A (en) Character recognizing method
JPS6293776A (en) Information recognizing device
JPS61260354A (en) Kana and written kanji converting system
JP3138665B2 (en) Handwritten character recognition method and recording medium
JPH10328624A (en) Document understanding device and mail sorter
JPH1011542A (en) Character recognition device
JPS60138689A (en) Character recognizing method
JPS63268082A (en) Pattern recognizing device
JPS6162986A (en) Recognition order determining system
JPS6162985A (en) Recognition order determining system
JPS60225987A (en) Pattern recognizer
JPS62138989A (en) On-line character recognizing device
JPH04274580A (en) Optical character reader