JPH0554072A

JPH0554072A - Digital translation device

Info

Publication number: JPH0554072A
Application number: JP3211711A
Authority: JP
Inventors: Hitoshi Nakamura (中村仁); Takashi Sato (佐藤隆); Masumi Sato (佐藤眞澄); Kenichi Hasegawa (長谷川健一)
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1991-08-23
Filing date: 1991-08-23
Publication date: 1993-03-05

Abstract

PURPOSE: To shorten translation time and make the translation result easier to read by recognizing only the characters in a designated area, and translating only those recognized characters, when a text original is read optically and its characters are recognized.

CONSTITUTION: A mark area detecting part 4 extracts, from the image information stored in a storage means 3, the image information corresponding to the words in a mark-designated area. A character recognition means 5 segments the image extracted by the detecting part 4 into character units, extracts word information from the gaps between the segmented characters, and recognizes the characters word by word. A translation means 6 translates each recognized word, and an output image forming means 8 writes the character image of the translated word to an output image memory and outputs the written information, so that a translation image corresponding to the mark-designated original image is formed.

Description

[Detailed Description of the Invention]

[0001]

[Field of Industrial Application] The present invention relates to a digital translation device, and more particularly to a digital translation device that, when optically reading a text original and recognizing its characters, recognizes only the characters in a designated area and translates the recognized characters.

[0002]

[Prior Art] In a translation device, for example, the text original to be translated is read optically, characters are recognized from the read image, and translation is performed by searching a dictionary with the recognition result. The input character images are recognized using an OCR (character recognition device). A typical OCR uses a standard dictionary in which standard characters are registered, and recognizes each character from the degree of similarity between the read character image and the characters in the standard dictionary.

[0003]

[Problems to Be Solved by the Invention] In a conventional translation device, translation is performed on every character in the read image, so words that need no translation and words that have already been translated once are translated as well, and translation takes a long time. In addition, when the translation result is output, a space for the translation is created below the original text and the original characters are shifted to match the length of the translated text, which makes the result very hard to read.

[0004] The present invention was devised to solve the above problems, and its object is to shorten the translation time and to make the translation result easier to read.

[0005]

[Means for Solving the Problems] The digital translation device of the present invention comprises: document reading means (2) for reading an original image; storage means (3) for storing the image information of the original read by the document reading means (2); extraction means (4) for extracting the image information of a mark-designated area from the image information stored in the storage means (3); character recognition means (5) for segmenting the image extracted by the extraction means (4) into character units, extracting word information from the spacing between the segmented character units, and recognizing characters word by word; translation means (6) for translating the recognized words; and output image forming means (8) for writing the character images of the translated words to an output image memory and outputting the written information. The symbols in parentheses denote the corresponding elements of the embodiments described later.

[0006]

[Operation] With this configuration, the document reading means (2) first reads the original image, and the storage means (3) stores the image information of the original read by the document reading means (2). The extraction means (4) then extracts, from the image information stored in the storage means (3), the image information corresponding to the words in the mark-designated area, and the character recognition means (5) segments the image extracted by the extraction means (4) into character units, extracts word information from the spacing between the segmented character units, and recognizes characters word by word. Thus, for example, for an English text original in which arbitrary words have been marked, only the image of the marked portions is extracted first, character information is then cut out letter by letter within the extracted areas, and characters can be recognized word by word on that basis. The translation means (6) translates the recognized words, and the output image forming means (8) writes the character images of the translated words to the output image memory and outputs the written information, so that a translation image corresponding to the mark-designated original image is formed.

[0007] As described above, in the present invention, character recognition and translation can be omitted for everything except the marked words that are actually needed, so the processing time required for translation can be shortened.

[0008] In a preferred embodiment of the present invention, the device further comprises area designation input means (14) for inputting a mark designation for an arbitrary area of the original image. This removes the need to mark the original itself, and a mark designation for an arbitrary area of the original image can be made easily by operation input.

[0009] Further, in a preferred embodiment of the present invention, the output image forming means (8) outputs the character images of the translated words for the characters in all designated areas of the original image in tabular form according to a predetermined format. For example, if this format is a table in which the original characters of each designated area are paired with their translated characters, a translation output that is very easy to read is obtained. Other objects and features of the present invention will become apparent from the following description of embodiments with reference to the drawings.

[0010]

[Embodiments]

(Embodiment 1) FIG. 1 is a schematic block diagram of the configuration of the digital translation device of the present invention. In FIG. 1, 1 is a CPU that controls the entire device; 2 is a scanner for reading the original to be translated; 3 is an image memory that stores the image data read by the scanner 2; 4 is a mark area detection unit that reads the image data stored in the image memory 3 and detects the areas marked with a marker pen; 5 is a character recognition unit that cuts out the characters in the mark areas detected by the mark area detection unit 4 and recognizes them; 6 is a translation unit that takes the characters recognized by the character recognition unit 5 as words and translates them; 7 is an operation unit for instructing operations such as reading of the original by the scanner 2; and 8 is an output unit that outputs the translation result.

[0011] FIG. 2 outlines the control operation of the CPU 1 in response to the operation unit 7. When an English text original whose areas have been designated with a marker pen (an original in which the parts to be translated have been painted over with the marker pen) is set on the document table (not shown) of the scanner 2 and a "start" input is made on the operation unit 7, the CPU 1 first reads the set original with the document reading section of the scanner 2 and obtains an image of the original such as the one shown in FIG. 3 (step 100; hereinafter, within parentheses, the words "subroutine" and "step" are omitted and only the reference number is given). In FIG. 3, the words enclosed in boxes are the words designated (painted over) with the marker pen.

[0012] Next, the mark area detection unit 4 performs mark area detection (200). In this processing (200), a histogram of black pixels in the main scanning direction is first computed for the image read in step 100, as shown in FIG. 4, and the character image is cut out line by line. Then, for each one-line image cut out in this way, the density values along a line 10 (a line one pixel wide in the sub-scanning direction) passing through the center of the line image (the center in the main scanning direction) are detected, as shown in FIG. 5. A line image contains three kinds of image area: character image areas, areas marked with the marker pen, and areas with no image. In FIG. 5, 11 denotes the density of a character image area, 12 that of an area marked with the marker pen, and 13 that of an area with no image. The density values along this single line differ among the three kinds of area. This difference in density therefore makes it possible to extract only the images of the words in the marked areas of the original.
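The line segmentation and density-based mark detection of paragraph [0012] can be sketched roughly as follows. This is only an illustration under stated assumptions: the page is assumed to be a list of pixel rows, the function names (`find_runs`, `text_lines`, `classify`, `marked_spans`) are hypothetical, and the gray-level thresholds are placeholder values that the patent does not specify.

```python
# Illustrative sketch only; names and thresholds are assumptions, not from the patent.

def find_runs(values, threshold=0):
    """Return (start, end) pairs, end exclusive, of consecutive entries > threshold."""
    runs, start = [], None
    for i, v in enumerate(values):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(values)))
    return runs


def text_lines(binary):
    """Cut a binarized page (1 = black pixel) into text lines.

    The histogram holds the black-pixel count of each pixel row;
    rows with no black pixels separate one text line from the next.
    """
    histogram = [sum(row) for row in binary]
    return find_runs(histogram)


def classify(gray, text_max=80, blank_min=200):
    """Label a gray level (0 = black, 255 = white) as text, mark, or blank.

    The two thresholds are assumed values for illustration.
    """
    if gray >= blank_min:
        return "blank"
    if gray <= text_max:
        return "text"
    return "mark"


def marked_spans(center_line):
    """Return (start, end) spans along the center line that a marker pen covers.

    A span is a maximal run of non-blank pixels containing at least one
    mid-density (marker) pixel; runs of pure text pixels are unmarked strokes.
    """
    spans, start, has_mark = [], None, False
    for x, gray in enumerate(list(center_line) + [255]):  # blank sentinel closes the last run
        label = classify(gray)
        if label != "blank":
            if start is None:
                start, has_mark = x, False
            has_mark = has_mark or label == "mark"
        elif start is not None:
            if has_mark:
                spans.append((start, x))
            start = None
    return spans
```

A real device would also need to tolerate scanner noise, for example by smoothing the density profile before classification; the patent does not describe that step, so it is omitted here.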

[0013] Further, as shown in FIG. 6, the character recognition unit 5 computes a histogram of black pixels in the sub-scanning direction for the character image of the extracted mark area and cuts out character images one character at a time (300). FIG. 7 shows an example of character images cut out in this way. When the distance between one cut-out character image and the next (the inter-character gap) is equal to or greater than the width of one character, the character recognition unit 5 judges that position to be a word boundary, and groups the single-character images into words (400). FIG. 8 shows an example of the extracted words. The character recognition unit 5 then recognizes the characters of each grouped word one by one (500).
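Steps 300 and 400 can be sketched as below. The sketch assumes a binarized one-line image stored as a list of pixel rows, and it assumes that "the width of one character" may be estimated as the widest cut-out character when no value is supplied; the patent does not say how that width is obtained. The names `char_spans` and `group_words` are hypothetical.

```python
# Illustrative sketch only; function names and the width estimate are assumptions.

def char_spans(binary):
    """Cut a one-line binary image (1 = black) into per-character column spans.

    The histogram holds the black-pixel count of each pixel column;
    empty columns separate neighboring characters.
    """
    histogram = [sum(column) for column in zip(*binary)]
    spans, start = [], None
    for x, count in enumerate(histogram):
        if count > 0 and start is None:
            start = x
        elif count == 0 and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, len(histogram)))
    return spans


def group_words(spans, char_width=None):
    """Group character spans into words.

    A gap of at least one character width between neighbors is treated
    as a word boundary, as in step 400.
    """
    if not spans:
        return []
    if char_width is None:
        # Assumption: use the widest character as the reference width.
        char_width = max(end - start for start, end in spans)
    words = [[spans[0]]]
    for previous, current in zip(spans, spans[1:]):
        gap = current[0] - previous[1]
        if gap >= char_width:
            words.append([current])
        else:
            words[-1].append(current)
    return words
```

Each inner list returned by `group_words` corresponds to one word, whose character spans would then be passed to character recognition one at a time (step 500).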

[0014] The translation unit 6 searches the translation dictionary for the Japanese corresponding to the character codes of each word recognized by the character recognition unit 5, and outputs the translation result (600). The character codes of the translation result are then converted into character images, and an image is formed in the output image memory of the output unit 8 (700). Steps 300 to 700 are repeated until output images (translation images) have been formed for all of the designated words; once they have (800), the output image written in the output image memory is transferred to paper and ejected (900).
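The dictionary lookup of step 600 and the loop closed by step 800 might look like the following minimal sketch. The dictionary contents are placeholder entries, the function name `translate_words` is hypothetical, and both the lowercase normalization and the use of `None` for words missing from the dictionary are assumptions; the patent does not describe either.

```python
# Illustrative sketch only; the lookup rules here are assumptions, not from the patent.

def translate_words(words, dictionary):
    """Look up each recognized word and pair it with its translation.

    Words are lowercased before lookup (assumption). A word that is not
    in the dictionary is paired with None (assumption).
    """
    results = []
    for word in words:  # repeated until every designated word is handled
        results.append((word, dictionary.get(word.lower())))
    return results


# Placeholder dictionary for illustration.
sample_dictionary = {"device": "装置", "image": "画像"}
pairs = translate_words(["Device", "image"], sample_dictionary)
```

The resulting pairs are what the output stage would rasterize into the output image memory (step 700).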

[0015] As described above, if areas are designated on the original in advance with a marker pen, only the words in the marked areas are extracted and translated, which saves the time that would otherwise be spent translating words that need no translation.

[0016] (Embodiment 2) In Embodiment 1, the areas to be translated were designated by marking the original directly with a marker pen; in Embodiment 2, they are designated by operation input.

[0017] FIG. 9 is a schematic block diagram of the configuration of the digital translation device of the present invention (Embodiment 2). In FIG. 9, the CPU 1, scanner 2, image memory 3, character recognition unit 5, translation unit 6, operation unit 7, and output unit 8 have the same configuration and operation as the corresponding units shown in FIG. 1; the mark area detection unit 4 differs in its mark designation operation. Reference numeral 14 denotes a display-integrated tablet (area designation operation unit and display unit) for designating mark areas, which is provided with a stylus pen for designating positions.

[0018] FIG. 10 outlines the control operation of the CPU 1 in response to the operation unit 7. When an English text original is set on the document table (not shown) of the scanner 2 and a "start" input is made on the operation unit 7, the CPU 1 first reads the set original with the document reading section of the scanner 2 and obtains an image of the original such as the one shown in FIG. 3 (step 100; hereinafter, within parentheses, the words "subroutine" and "step" are omitted and only the reference number is given). Note that in Embodiment 2 the boxed mark areas shown in FIG. 3 are not present on the original.

[0019] Next, the words to be translated are designated (200a). In this processing (200a), the character segmentation means of the mark area detection unit 4 first computes a histogram of black pixels in the main scanning direction for the image read in step 100 and cuts out the character image line by line. All of the cut-out lines (an image identical to the original image) are displayed on the display-integrated tablet 14. When the position of an area to be translated is designated on the displayed image with the stylus pen (not shown), the designated area (position information) is stored.
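Storing a stylus-designated area and later cutting it out of the scanned image could be sketched as follows. The patent only says that the designated area is stored as position information; the assumption here is that the user touches two opposite corners of a rectangle, and the names `Region`, `region_from_points`, and `crop` are hypothetical.

```python
# Illustrative sketch only; the two-corner designation is an assumption.
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Rectangle in image coordinates; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def region_from_points(first, second):
    """Build a Region from two stylus points given as (x, y), in any order."""
    (x1, y1), (x2, y2) = first, second
    return Region(min(x1, x2), min(y1, y2), max(x1, x2) + 1, max(y1, y2) + 1)


def crop(image, region):
    """Return the part of a row-major image that lies inside the region."""
    return [row[region.left:region.right] for row in image[region.top:region.bottom]]
```

The cropped image would then go through the same character segmentation and word grouping as the marker-pen case in Embodiment 1.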

[0020] For the image of the designated area, the character recognition unit 5 computes a histogram of black pixels in the sub-scanning direction and cuts out character images one character at a time (300). When the distance between one cut-out character image and the next (the inter-character gap) is equal to or greater than the width of one character, the character recognition unit 5 judges that position to be a word boundary, and groups the single-character images into words (400). The character recognition unit 5 then recognizes the characters of each grouped word one by one (500).

[0021] The translation unit 6 searches the translation dictionary for the Japanese corresponding to the character codes of each word recognized by the character recognition unit 5, and outputs the translation result (600). The character codes of the translation result are then converted into character images, and an image is formed in the output image memory of the output unit 8 (700). Steps 300 to 700 are repeated until output images (translation images) have been formed for all of the designated words; once they have (800), the output image written in the output image memory is transferred to paper and ejected (900).

[0022] As described above, the read original image is displayed on the display-integrated tablet 14, and if only the words to be translated are designated on this display, only the words in the designated areas are extracted and translated, which saves the time that would otherwise be spent translating words that need no translation.

[0023] The output format of the translation result is described here. Possible output formats include (1) outputting the translation directly below each English word of the input text, and (2) outputting the translation in the lower half or right half of the output sheet, or outputting the translation alone. In format (1), an area for the translation is created below each line of English, and when the image (area) of the Japanese translation of an English word is long, the English text is shifted; the word spacing therefore stretches and shrinks relative to the input original, which is very hard to read. In format (2), it is difficult for the user reading the result to see which translation corresponds to which English word. For these reasons, Embodiments 1 and 2 output the English words detected in the designated areas and their translations in tabular form, as shown in FIG. 13. This can be done easily, simply by specifying positions (addresses) when, in step 700 of Embodiments 1 and 2, the character codes of the translation result are converted into character images and the image is formed in the output image memory of the output unit 8.
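The position (address) specification for the tabular output can be illustrated with a small layout routine. This is a sketch under assumptions: the two-column arrangement, the row height, the column width, and the name `table_layout` are all illustrative, and rendering the character images themselves is left out.

```python
# Illustrative sketch only; the layout parameters are assumed values.

def table_layout(pairs, origin=(0, 0), row_height=40, column_width=300):
    """Compute write positions in the output image memory for a two-column table.

    Each (source, translation) pair occupies one row: the source word goes in
    the left column and its translation in the right column. Returns a list of
    (x, y, text) tuples giving the top-left address of each cell.
    """
    x0, y0 = origin
    cells = []
    for row_index, (source, translation) in enumerate(pairs):
        y = y0 + row_index * row_height
        cells.append((x0, y, source))
        cells.append((x0 + column_width, y, translation))
    return cells
```

Because every cell address depends only on the row index, the original words never have to be shifted to fit a long translation, which is the readability problem of format (1).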

[0024]

[Effects of the Invention] As described above, according to the present invention, the document reading means (2) first reads the original image, and the storage means (3) stores the image information of the original read by the document reading means (2). The extraction means (4) then extracts, from the image information stored in the storage means (3), the image information corresponding to the words in the mark-designated area, and the character recognition means (5) segments the image extracted by the extraction means (4) into character units, extracts word information from the spacing between the segmented character units, and recognizes characters word by word. Thus, for example, for an English text original in which arbitrary words have been marked, only the image of the marked portions is extracted first, character information is then cut out letter by letter within the extracted areas, and characters can be recognized word by word on that basis. The translation means (6) translates the recognized words, and the output image forming means (8) writes the character images of the translated words to the output image memory and outputs the written information, so that a translation image corresponding to the mark-designated original image is formed.

[0025] As described above, in the present invention, character recognition and translation can be omitted for everything except the marked words that are actually needed, so the processing time required for translation can be shortened.

[0026] In addition, because the device comprises area designation input means (14) for inputting a mark designation for an arbitrary area of the original image, there is no need to mark the original itself, and a mark designation for an arbitrary area of the original image can be made easily by operation input.

[0027] Further, the output image forming means (8) outputs the character images of the translated words for the characters in all designated areas of the original image in tabular form according to a predetermined format. Therefore, for example, if this format is a table in which the original characters of each designated area are paired with their translated characters, a translation output that is very easy to read is obtained.

[Brief Description of the Drawings]

FIG. 1 is a schematic block diagram of the configuration of the digital translation device of the present invention.

FIG. 2 is a flowchart outlining the control operation of the CPU 1 in response to the operation unit 7 shown in FIG. 1.

FIG. 3 is a plan view showing an example of an image of an original.

FIG. 4 is a plan view showing an example of the image of the original when a histogram of black pixels in the main scanning direction is computed for the original image shown in FIG. 3 and the character image is cut out line by line.

FIG. 5 is a graph showing the density values along a line 10 (a line one pixel wide in the sub-scanning direction) passing through the center of the image (the center in the main scanning direction), for a one-line image cut out line by line.

FIG. 6 is a plan view showing an example of the image of the original when a histogram of black pixels in the sub-scanning direction is computed for the line-by-line character images shown in FIG. 4 and single-character images are cut out.

FIG. 7 is a plan view showing an example of character images cut out one character at a time.

FIG. 8 is a plan view showing an example in which single-character images have been grouped and extracted word by word.

FIG. 9 is a schematic block diagram of the configuration of a digital translation device (Embodiment 2) different from the digital translation device (Embodiment 1) shown in FIG. 1.

FIG. 10 is a flowchart outlining the control operation of the CPU 1 in response to the operation unit 7 shown in FIG. 9.

FIG. 11 is a plan view showing the tablet 14 shown in FIG. 9.

FIG. 12 is a plan view showing an example of the translation output.

[Explanation of Symbols]

1: CPU
2: Scanner (document reading means)
3: Image memory (storage means)
4: Mark area detection unit (extraction means)
5: Character recognition unit (character recognition means)
6: Translation unit (translation means)
7: Operation unit
8: Output unit (output image forming means)
14: Display-integrated tablet (area designation input means)

Continuation of front page: (72) Inventor: Kenichi Hasegawa (長谷川健一), c/o Ricoh Co., Ltd., 1-3-6 Nakamagome, Ota-ku, Tokyo

Claims

[Claims]

[Claim 1] A digital translation device comprising: document reading means for reading an original image; storage means for storing the image information of the original read by the document reading means; extraction means for extracting the image information of a mark-designated area from the image information stored in the storage means; character recognition means for segmenting the image extracted by the extraction means into character units, extracting word information from the spacing between the segmented character units, and recognizing characters word by word; translation means for translating the recognized words; and output image forming means for writing the character images of the translated words to an output image memory and outputting the written information.

[Claim 2] The digital translation device according to claim 1, further comprising area designation input means for inputting a mark designation for an arbitrary area of the original image.

[Claim 3] The digital translation device according to claim 1 or claim 2, wherein the output image forming means outputs the character images of the translated words for the characters in all designated areas of the original image in tabular form according to a predetermined format.