JP2006309314A

JP2006309314A - Translation device

Info

Publication number: JP2006309314A
Application number: JP2005127784A
Authority: JP
Inventors: Nobue Nojima; 伸江野島; Mari Tanaka; 真理田中
Original assignee: Konica Minolta Photo Imaging Inc
Current assignee: Konica Minolta Photo Imaging Inc
Priority date: 2005-04-26
Filing date: 2005-04-26
Publication date: 2006-11-09

Abstract

PROBLEM TO BE SOLVED: To achieve a translation device that improves the character recognition rate and reduces the burden on the user, thereby enhancing usefulness. SOLUTION: An image processing unit 30 corrects parallax in still image data on the basis of the distance to a subject measured by a photographing unit 10. A character string is then extracted from the corrected still image data. If the extraction fails, image processing such as edge enhancement and gradation correction is applied to the still image data, and character extraction is performed again. A language processing unit 40 transmits the character string extracted by the image processing unit 30 to a translation engine 60. The translation engine 60 retrieves the translated word of the character string input from the language processing unit 40 and returns it to the language processing unit 40. The language processing unit 40 displays the word returned from the translation engine 60 on a display unit 70. The optical image of the word is then projected onto a holographic optical element and led to the eye of the user by diffraction reflection. COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to a translation device that translates a character string and acquires and displays a translation of that character string.

In recent years, translation devices in various forms have come into wide use, such as electronic dictionaries that display translations of input character strings and camera-equipped mobile terminals that extract character strings from still images taken with a camera and display their translations. They are used, for example, for reading foreign-language documents and when traveling abroad.

The following is known as one example of a translation device: a camera-equipped portable information terminal that shoots a moving image, extracts a still image from the moving image, and performs edge detection. If edges are detected, it performs character recognition; if not, it extracts another still image from the moving image and repeats the same processing (see Patent Document 1).

As another example of a translation device, a head mount display (hereinafter referred to as "HMD") device that employs a half mirror is known. This HMD device extracts, from an image captured by a CCD (Charge Coupled Device) camera, the image of a region designated according to the movement of the user's line of sight, and extracts a character string from that image. It then translates the extracted character string and projects the result to the user's eyes (see Patent Document 2).

Patent Document 1: JP 2003-216893 A
Patent Document 2: JP 2001-56446 A

An HMD device is required to be comfortable to wear: it should cause no discomfort when worn, should not narrow the user's field of view, and should not darken it. It is desirable that the device place no burden on the user.

However, because the HMD device of Patent Document 2 employs a half mirror, the area in front of the user's eyes must be covered with a housing to create a dark region, which darkens the user's field of view. In addition, because the real image is viewed through an objective lens, the user's field of view is restricted. The large housing in front of the eyes also gives the user a feeling of pressure. For these reasons, the HMD device of Patent Document 2 causes the user considerable discomfort, and wearing it while walking around town is not realistic.

An electronic dictionary, for its part, requires the user to enter the foreign-language characters to be looked up by hand, which is a troublesome operation. The camera-equipped mobile terminal of Patent Document 1 saves the effort of entering characters. However, the user must photograph the subject containing the character string to be translated appropriately, so that the terminal can detect edges. In practice, the subject image is often unclear: the subject may be far away so that the character string is captured small, or the captured image may be out of focus or blurred. In such cases character recognition by the terminal becomes difficult, and the probability of successful character recognition, that is, the character recognition rate, decreases.

Furthermore, the user must hold the camera-equipped mobile terminal and point its camera at the desired subject to shoot, which occupies the user's hands and becomes a burden when the user is also carrying an umbrella or luggage. Pointing a camera at a subject is also plainly a photographing action to any third party, which can make the user reluctant to do it. The translation devices of Patent Documents 1 and 2 thus lack practicality and are unsuitable, for example, for use at overseas travel destinations, where translation is needed frequently while walking.

The present invention has been made in view of the problems described above, and its object is to realize a highly practical translation device by improving the character recognition rate and reducing the load on the user.

In order to solve the above problems, the translation device according to claim 1 comprises:
photographing means for photographing a subject and generating a photographed image;
character recognition means for performing character recognition on the photographed image generated by the photographing means;
translation means for translating the character string recognized by the character recognition means and acquiring a translation of the character string;
video projection means for projecting an optical image of the translation acquired by the translation means; and
an eyepiece optical system that is formed of a transmissive material including a holographic optical element and guides the optical image projected by the video projection means to the user's eye by the optical effect of the holographic optical element.

The invention according to claim 2 is the translation device according to claim 1, wherein
the photographing means is arranged on the translation device so as to be able to photograph a subject within the user's field of view.

The invention according to claim 3 is the translation device according to claim 1 or 2, further comprising:
estimation means for estimating the deviation between the center position of the user's field of view and the center position of the photographed image; and
correction means for aligning the center position of the photographed image with the center position of the user's field of view on the basis of the deviation estimated by the estimation means.

The invention according to claim 4 is the translation device according to any one of claims 1 to 3, wherein
the estimation means has measurement means for measuring the distance between the subject and the photographing means, and
the correction means aligns the center position of the photographed image with the center position of the user's field of view on the basis of the distance between the center position of the user's field of view and the center position of the photographing means, and the distance measured by the measurement means.

The invention according to claim 5 is the translation device according to any one of claims 1 to 4, further comprising
setting means for setting photographing conditions, wherein
the photographing means photographs the subject on the basis of the photographing conditions set by the setting means.

The invention according to claim 6 is the translation device according to any one of claims 1 to 5, further comprising
shake correction means for correcting shake of the photographing means.

The invention according to claim 7 is the translation device according to any one of claims 1 to 6, further comprising
image processing means for applying image processing to the photographed image generated by the photographing means, wherein
the character recognition means performs character recognition on the photographed image to which the image processing means has applied image processing.

The invention according to claim 8 is the translation device according to any one of claims 1 to 7, wherein the video projection means has recognized character projection means for projecting an optical image of the character string recognized by the character recognition means.

The invention according to claim 9 is the translation device according to any one of claims 1 to 8, wherein
the translation means has multiple-translation acquisition means for acquiring a plurality of translations of the recognized character string,
the translation device further comprises translation selection means for selecting, from the plurality of translations acquired by the multiple-translation acquisition means, a translation that satisfies a translation selection condition, and
the video projection means projects an optical image of the translation selected by the translation selection means.

The invention according to claim 10 is the translation device according to any one of claims 1 to 9, wherein
the translation device is formed in the shape of eyeglasses that can be worn on the user's head, and the eyepiece optical system is positioned in front of the user's eye when the device is worn.

According to the invention of claim 1, a character string obtained by performing character recognition on a photographed image is translated, and the optical image of its translation is guided to the user's eye through the holographic optical element. Using the holographic optical element maintains the amount of light entering the user's eye and does not block the user's view, so the load on the user is reduced. In addition, projecting the translation to the user's eye through the eyepiece optical system shows the translation to the user directly. The user can therefore read the translation easily, without looking it up in a dictionary or shifting the line of sight to consult one. The user also does not need to perform a photographing action, so the hands remain free. A practical translation device is thus realized, and the user can enjoy sightseeing and shopping abroad comfortably and without discomfort.

According to the invention of claim 2, a subject within the user's field of view can be photographed, so the translation of a character string contained in the subject the user is looking at can be projected to the user's eye. The user can therefore see both the character string and its translation within the same field of view.

According to the invention of claim 3, in addition to the effects of the invention of claim 1 or 2, the deviation between the center position of the user's field of view and the center position of the photographed image is estimated, and the center position of the photographed image is aligned with the center position of the user's field of view. The error between the two center positions is thereby corrected, so the character recognition means can appropriately recognize the character string of the subject located at the center of the user's field of view.

According to the invention of claim 4, in addition to the effects of the invention of any one of claims 1 to 3, the center position of the photographed image is aligned with the center position of the user's field of view on the basis of two distances: the distance between the center position of the field of view and the center position of the photographing means, and the distance between the subject and the photographing means. The error between the center position of the user's field of view and the center position of the photographed image can thereby be corrected appropriately.
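The correction described for claim 4 can be illustrated with a simple pinhole-camera model. This is only a sketch under assumptions the source does not state: the camera axis and the user's line of sight are taken to be parallel, the focal length is expressed in pixels, and the names `parallax_shift_px` and `corrected_center` are hypothetical.

```python
import math


def parallax_shift_px(baseline_m: float, subject_distance_m: float,
                      focal_length_px: float) -> float:
    """Pixel offset caused by the eye/camera separation along one axis.

    Assumes a pinhole camera whose axis is parallel to the line of sight
    (an assumption for illustration, not taken from the source).
    """
    if subject_distance_m <= 0:
        raise ValueError("subject distance must be positive")
    return focal_length_px * baseline_m / subject_distance_m


def corrected_center(image_center, baseline_xy_m, subject_distance_m,
                     focal_length_px):
    """Point in the photographed image that corresponds to the center of
    the user's field of view, given the eye offset from the camera
    (x to the right, y downward, in metres)."""
    cx, cy = image_center
    bx, by = baseline_xy_m
    return (cx + parallax_shift_px(bx, subject_distance_m, focal_length_px),
            cy + parallax_shift_px(by, subject_distance_m, focal_length_px))


# Eye 3 cm to the right of the camera, subject 3 m away, f = 1000 px.
center = corrected_center((320.0, 240.0), (0.03, 0.0), 3.0, 1000.0)
assert math.isclose(center[0], 330.0) and math.isclose(center[1], 240.0)
```

Under this model the offset shrinks as the subject moves away, which is consistent with the source's use of the measured subject distance in the correction.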

According to the invention of claim 5, in addition to the effects of the invention of any one of claims 1 to 4, setting the photographing conditions appropriately yields a clear photographed image, so the character recognition rate can be improved.

According to the invention of claim 6, in addition to the effects of the invention of any one of claims 1 to 5, shake of the photographing means is corrected, which reduces blur in the photographed image and yields a clear photographed image. The character recognition rate can therefore be improved.

According to the invention of claim 7, in addition to the effects of the invention of any one of claims 1 to 6, applying image processing to the photographed image yields a clear photographed image. Therefore, even when character recognition has failed, for example, applying image processing to the photographed image raises the probability that character recognition succeeds, that is, the character recognition rate.
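The retry behaviour described here (recognize, and on failure apply image processing and recognize again) can be sketched as follows. The function `recognize_with_retry` and its callback parameters are hypothetical names introduced for illustration; the abstract names edge enhancement and gradation correction as example processing steps, but the source does not specify how they are sequenced.

```python
from typing import Callable, Iterable, Optional, TypeVar

Image = TypeVar("Image")


def recognize_with_retry(
    image: Image,
    recognize: Callable[[Image], Optional[str]],
    enhancements: Iterable[Callable[[Image], Image]],
) -> Optional[str]:
    """Try character recognition; after each failure apply the next
    enhancement step (e.g. edge enhancement, gradation correction) and
    try again. Returns None if every attempt fails."""
    text = recognize(image)
    if text:
        return text
    for enhance in enhancements:
        image = enhance(image)
        text = recognize(image)
        if text:
            return text
    return None


# Stand-in stubs: the "image" is just a sharpness score, and recognition
# succeeds once the score reaches 2.
def fake_ocr(sharpness: int) -> Optional[str]:
    return "EXIT" if sharpness >= 2 else None


def sharpen(sharpness: int) -> int:
    return sharpness + 1


assert recognize_with_retry(0, fake_ocr, [sharpen, sharpen]) == "EXIT"
assert recognize_with_retry(0, fake_ocr, [sharpen]) is None
```

Passing the recognizer and the enhancement steps as callables keeps the sketch independent of any particular OCR or image library.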

According to the invention of claim 8, in addition to the effects of the invention of any one of claims 1 to 7, the video projection means projects an optical image of the recognized character string, so both the optical image of the character string and that of its translation are projected to the user's eye. The user can thus easily see which character string was translated and can judge, for example, whether the translation projected with it is the translation of the intended character string.

According to the invention of claim 9, in addition to the effects of the invention of any one of claims 1 to 8, a translation that satisfies the translation selection condition is selected from a plurality of translations, and the optical image of the selected translation is projected to the user's eye. A highly appropriate translation can therefore be projected, for example one that is frequently chosen in translation or one that is frequently used in general.
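As an illustration of a translation selection condition, the sketch below picks the candidate with the highest usage frequency. The `Candidate` class, its `usage_frequency` field, and the sample data are assumptions made for this example; the source only says that frequency of use is one possible criterion.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    word: str
    usage_frequency: int  # hypothetical score; higher means more common


def select_translation(
    candidates: Sequence[Candidate],
    score: Callable[[Candidate], float] = lambda c: c.usage_frequency,
) -> Optional[Candidate]:
    """Return the candidate that best satisfies the selection condition,
    here modelled as the maximum of a scoring function."""
    if not candidates:
        return None
    return max(candidates, key=score)


# Made-up candidates for a single source word.
options = [
    Candidate("exit", 120),
    Candidate("way out", 45),
    Candidate("egress", 3),
]
best = select_translation(options)
assert best is not None and best.word == "exit"
```

A different condition, such as preferring the shortest translation, can be swapped in by passing another `score` function.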

According to the invention of claim 10, in addition to the effects of the invention of any one of claims 1 to 9, the eyepiece optical system is positioned in front of the user's eye when the eyeglass-type translation device is worn. The user can therefore wear and use the translation device without discomfort, for example in daily life or at overseas travel destinations. Moreover, because the device neither covers nor darkens the user's view, the load on the user is reduced.

[Embodiment]
Hereinafter, an embodiment of the present invention will be described in detail with reference to FIGS. 1 to 9. The following embodiment is described using the eyeglass-type translation device shown in FIG. 1 (hereinafter simply referred to as the "translation device") as an example, but the applicable scope is not limited to the illustrated example.

[Overview and mechanism of the translation device]
FIG. 1 is an example of a perspective view of the translation device 1. As shown in FIG. 1, the translation device 1 consists of a head mount display (hereinafter referred to as "HMD") 200, formed so that it can be worn on the user's head, and a control unit 300, which are connected by a cable 290.

The HMD 200 includes symmetrically arranged eyeglass lenses (hereinafter simply referred to as "lenses") 210R and 210L, nose pads 230R and 230L, temples 240R and 240L, and a bridge 270.

The temples 240R and 240L are support members that hold the HMD 200 on the user's head. When the user hooks the temples 240R and 240L over the left and right ears along the sides of the head and rests the nose pads 230R and 230L on the bridge of the nose, the lenses 210R and 210L are positioned in front of the right and left eyes E, respectively.

The HMD 200 further includes a display unit 250 and a camera unit 260. The display unit 250, the camera unit 260, and the control unit 300 are communicably connected via the cable 290, which is a communication medium that electrically connects the units. Although the translation device 1 performs its data communication via the cable 290, each unit may instead be provided with a wireless communication device so that data is exchanged wirelessly. With a wireless arrangement, the loose cable 290 no longer catches on the user or gets in the way.

The display unit 250 (video projection means, recognized character projection means) is provided at the upper part of the lens 210R and projects an optical image onto the eyepiece optical system 220. The eyepiece optical system 220 is a functional unit that is formed integrally with the lens 210R and guides the optical image projected from the display unit 250 to the user's eye E.

The optical image projected by the eyepiece optical system 220 appears to the user's eye E as a virtual image. The display unit 250 and the eyepiece optical system 220 thereby function as a display device that presents a virtual image.

FIG. 2 shows an example of a cross-sectional view of the display unit 250 and the eyepiece optical system 220. As shown in FIG. 2, the display unit 250 consists of a light source 252 such as an LED (Light Emitting Diode), a condensing lens 254 such as a condenser lens, and a transmissive display panel 256 such as an LCD (Liquid Crystal Display), all held in a housing 258.

When the light source 252 emits illumination light toward the display panel 256, the condensing lens 254 guides that light uniformly over the entire surface of the display panel 256. The display panel 256 is arranged at an inclination to the upper surface of the prism 214. When it displays an image to be shown to the user's eye E, it receives the illumination light from the light source 252 and emits the optical image of that image, that is, image light, toward the prism 214 of the eyepiece optical system 220.

The housing 258 is provided so as to clamp the upper part of the prism 214, and it covers and holds the light source 252, the condensing lens 254, and the display panel 256.

The eyepiece optical system 220 has the prism 214 and a holographic optical element (hereinafter referred to as "HOE") 212, and is formed integrally with the lens 210R.

The prism 214 is made of transparent glass, resin, or the like and is formed in a flat plate shape. It is embedded in a part of the right-eye lens 210R, and the HOE 212 is bonded to its lower end. The HOE 212 is sandwiched between the prism 214 and the lens 210R, which serve as its hologram substrates.

The HOE 212 has two interference fringe patterns, each consisting of interference fringes that are not parallel to the hologram substrate surface. An optical image incident on the HOE 212 is diffracted and reflected by the diffraction action of these interference fringe patterns and guided to the user's eye E.

The display unit 250 is connected to the control unit 300 via the cable 290 and displays video according to instructions from the control unit 300. The image light emitted from the display panel 256 is guided to the HOE 212 while being reflected multiple times inside the prism 214, is diffracted by the HOE 212 into a nearly parallel light beam, and enters the user's eye E. Video can thereby be displayed virtually to the user.

The eyepiece optical system 220 and the lens 210R also transmit the optical image of the scene in the user's field of view and guide it to the user's eye E. The user's eye E therefore sees the scene in front of the lens 210R (in the direction of arrow V2 in FIG. 2) with the image displayed by the display panel 256 superimposed on it.

The camera unit 260 includes an optical lens, a photoelectric conversion element such as a CCD, and an A/D conversion unit. The CCD photoelectrically converts the optical image of the subject that enters through the optical lens. The A/D conversion unit then converts the resulting electrical signal from analog to digital to generate digital image data (a photographed image). Performing this photoelectric conversion and A/D conversion continuously produces a moving image of the subject.

As shown in FIG. 1, the control unit 300 includes an operation switch group 310. The operation switch group 310 consists of a power button, cursor keys, an enter switch, a mode setting switch, and the like. By pressing these switches, the user operates the translation device 1, for example to switch its operation mode, to start moving image shooting, or to instruct that translation continue.

The operation modes of the translation device 1 include a translation mode, a music mode, and a route guidance mode. The translation mode displays the translation of a character string contained in a photographed image captured and generated by the camera unit 260. The music mode plays back music data stored in the storage unit 50 and outputs the sound from earphones (not shown). The route guidance mode determines the current position from GPS signals received by a GPS receiver (not shown) and displays a map from the current position to the destination according to map information stored in the storage unit 50.

The user switches to the desired operation mode by operating the mode setting switch. The operation modes of the translation device 1 are not limited to those described above, and modes can be added or changed as appropriate.
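The mode switching described above can be pictured as a small state machine. The sketch below is illustrative only: the source names three modes and a mode setting switch but does not say how the switch steps through them, so the cyclic order, the `Mode` enum, and `next_mode` are assumptions.

```python
from enum import Enum


class Mode(Enum):
    TRANSLATION = "translation"
    MUSIC = "music"
    ROUTE_GUIDANCE = "route guidance"


def next_mode(current: Mode) -> Mode:
    """Mode selected by one press of the mode setting switch, assuming
    the switch cycles through the modes in declaration order."""
    modes = list(Mode)
    return modes[(modes.index(current) + 1) % len(modes)]


assert next_mode(Mode.TRANSLATION) is Mode.MUSIC
assert next_mode(Mode.ROUTE_GUIDANCE) is Mode.TRANSLATION
```

Because the cycle is derived from the enum itself, adding a further mode requires only a new enum member, which matches the source's remark that modes can be added or changed.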

[Functional configuration of the translation device]
FIG. 3 is a block diagram showing an example of the functional configuration of the translation device 1. As shown in FIG. 3, the translation device 1 includes a photographing unit 10, an operation input unit 20, an image processing unit 30, a language processing unit 40, a storage unit 50, a translation engine 60, and a display unit 70. The operation input unit 20, the image processing unit 30, the language processing unit 40, the storage unit 50, and the translation engine 60 (translation means) are provided in the control unit 300.

The photographing unit 10 (photographing means) is a functional unit that includes an optical lens prism made of glass, plastic, or the like, an image sensor such as a CCD or CMOS sensor, and an A/D converter, and it corresponds to the camera unit 260 described above. The photographing unit 10 shoots a moving image by having the image sensor convert the optical image of the subject, which enters through the optical lens, into electrical signals frame by frame to generate image data, and it outputs the image data of each frame to the image processing unit 30 as it is generated.

The photographing unit 10 also provides an autofocus (hereinafter abbreviated as "AF") function that focuses automatically by combining a plurality of optical lens prisms or by adjusting the distance between the optical lens prisms and the image sensor. In this embodiment, the AF function is implemented by the so-called passive method, which focuses by exploiting the fact that contrast on the image sensor drops when the image is out of focus. The photographing unit 10 (measurement means) acquires the focal length from the result of the focus adjustment by the AF function and uses it to measure the distance from the photographing unit 10 to the subject. This measured distance (measured distance 53) is used for the parallax correction described later.
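The passive, contrast-based focusing and the distance estimate that follows from it can be sketched as below. This is a simplified illustration, not the device's actual procedure: the contrast metric, the exhaustive search over lens positions, and the function names are assumptions, and the thin-lens relation 1/f = 1/d_o + 1/d_i is a standard optics formula that the source does not spell out.

```python
import math
from typing import Callable, Sequence

Frame = Sequence[Sequence[int]]  # rows of grey levels


def contrast(frame: Frame) -> int:
    """Sum of squared differences between horizontally adjacent pixels.
    An in-focus frame has sharper edges and therefore a larger value."""
    return sum((row[i + 1] - row[i]) ** 2
               for row in frame for i in range(len(row) - 1))


def best_focus_position(capture_at: Callable[[int], Frame],
                        positions: Sequence[int]) -> int:
    """Lens position whose captured frame has the highest contrast."""
    return max(positions, key=lambda p: contrast(capture_at(p)))


def subject_distance_mm(focal_length_mm: float,
                        image_distance_mm: float) -> float:
    """Subject distance from the thin-lens equation, given the
    lens-to-sensor distance at which the image is in focus."""
    if image_distance_mm <= focal_length_mm:
        raise ValueError("image distance must exceed the focal length")
    return (focal_length_mm * image_distance_mm
            / (image_distance_mm - focal_length_mm))


# Stub camera: only lens position 3 gives a sharp frame.
def fake_capture(position: int) -> Frame:
    return [[0, 255, 0, 255]] if position == 3 else [[128, 128, 128, 128]]


assert best_focus_position(fake_capture, range(6)) == 3
assert math.isclose(subject_distance_mm(5.0, 5.01), 2505.0, rel_tol=1e-6)
```

A real implementation would typically climb toward the contrast peak rather than scan every position, but the exhaustive scan keeps the principle visible.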

Although the passive method is adopted here for implementing the AF function and measuring the distance to the subject, an active method may of course be adopted instead, in which the photographing unit 10 is provided with an infrared irradiation unit and uses the reflection of infrared light emitted toward the subject to measure the distance to the subject and to focus.

The photographing unit 10 (blur correction means) also includes an angular velocity sensor, which provides an anti-shake (hereinafter "AS") function. The AS function corrects blur at the time of shooting and is implemented as follows. The angular velocity sensor detects the amount, speed, and direction of the shake, and the resulting displacement on the CCD is calculated in real time according to the angle-of-view characteristics of the optical lens. The CCD is then driven vertically and horizontally based on that displacement to cancel the blur. By correcting blur caused by movement of the user's head and by fine vibrations, the AS function yields a clearer captured image.
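As a rough illustration of how a displacement on the CCD could be derived from the angular velocity sensor, the sketch below integrates the angular velocity over one sampling interval and projects the resulting angle through the lens. The pinhole-style relation (displacement = focal length × tan(angle)), the function name, and all numeric values are assumptions made for illustration; the description does not specify the calculation.

```python
import math


def image_displacement_pixels(angular_velocity_dps, dt_s, focal_length_mm, pixel_pitch_mm):
    """Estimate how far the image moves on the sensor during one sample.

    angular_velocity_dps: shake rate from the gyro, in degrees per second.
    dt_s: sampling interval in seconds.
    focal_length_mm, pixel_pitch_mm: assumed lens and sensor parameters.
    """
    angle_rad = math.radians(angular_velocity_dps * dt_s)
    shift_mm = focal_length_mm * math.tan(angle_rad)
    return shift_mm / pixel_pitch_mm


# Illustrative values: 10 deg/s shake, 10 ms sample, 5 mm lens, 5 um pixels.
shift = image_displacement_pixels(10.0, 0.01, 5.0, 0.005)
print(round(shift, 2))
```

In a CCD shift scheme, the sensor would be driven by this amount so that the image stays at the same place on the sensor.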

Although the AS function is implemented here by the CCD shift method, which drives the CCD based on the displacement, it may instead be implemented by, for example, an optical method that corrects blur by moving the optical lens prism based on the amount of shake.

The operation input unit 20 is an input device that includes the operation switch 310 described above, character input keys, and cursor keys, and that outputs a press signal for the switch pressed by the user.

The image processing unit 30 (image processing means) includes a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. The CPU loads the various programs stored in the ROM into the RAM and executes processing according to those programs. In particular, the image processing unit 30 creates still image data from the image data of a moving image and performs parallax correction processing, image processing that enhances the contrast of the still image data, character recognition processing, and the like.

The image processing unit 30 operates as follows. When the user presses the operation input unit 20 and an operation signal indicating an instruction to start moving-image shooting is input, the image processing unit 30 drives the photographing unit 10 to start moving-image shooting. It then creates image data of a still image (hereinafter "still image data") from the moving-image data input from the photographing unit 10 as needed, and stores it in the storage unit 50 as the still image data 51.

Any known technique may be used to create the still image data. For example, image data at an arbitrary timing may be selected from the frame-by-frame image data input from the photographing unit 10 and used as the still image data 51, or image data for several frames (for example, three frames) may be acquired and combined to create the still image data 51.
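One simple way to combine several frames into a single still image is pixel-by-pixel averaging, sketched below. Averaging is an assumption chosen for illustration; the description only says that the frames are combined and leaves the method open. The function name and the pixel values are hypothetical.

```python
def combine_frames(frames):
    """Average equally sized grayscale frames pixel by pixel."""
    if not frames:
        raise ValueError("at least one frame is required")
    count = len(frames)
    # zip(*frames) groups the values of the same pixel across all frames.
    return [round(sum(values) / count) for values in zip(*frames)]


# Three illustrative frames of three pixels each.
frames = [
    [100, 102, 98],
    [104, 100, 96],
    [102, 101, 100],
]
still = combine_frames(frames)
print(still)  # prints [102, 101, 98]
```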

Because the camera unit 260 and the lens 210R are mounted at different positions on the HMD 100, parallax arises between the center position of the still image data 51 created by the image processing unit 30, that is, of the captured image, and the center position of the subject as seen by the user through the lens 210R.

More specifically, when a distant scene is the subject OB2 as in FIG. 6(a) and the distance between the subject OB2 and the HMD 100 is sufficiently long, the center point P1 of the angle of view of the camera unit 260 substantially coincides with the center point P2 of the user's field of view through the lens 210R (the center position of the angle of view of the lens 210R), and the influence of parallax is small.

However, when a document held by the user is the subject OB3 as in FIG. 6(b), for example, the shorter the distance between the HMD 100 and the subject OB3, the larger the parallax between the center point P1 of the angle of view of the camera unit 260 and the center point P2 of the user's field of view through the lens 210R.

The image processing unit 30 therefore performs parallax correction, which corrects the center point P1 of the angle of view of the camera unit 260 (the viewpoint of the camera unit 260) so that it matches the center point P2 of the subject with respect to the lens 210R (the center of the user's viewpoint). Parallax correction is a known technique, but it is briefly described below.

First, as shown in FIG. 5(b), the photographing unit 10 uses the AF function described above to measure the distance D1 (measurement distance 53) between the camera unit 260 and the center point P1 in the forward direction of the camera unit 260 (the direction of arrow V1). The image processing unit 30 (distance calculation means) then calculates the angle R, using a trigonometric function, from the measured distance D1 and the distance D2 between the center of the camera unit 260 and the center of the lens 210R. The angle R is the angle between the forward direction V1 of the camera unit 260 and the direction V3 from the camera unit 260 to the center point P2, which lies in the forward direction V2 of the lens 210R. The image processing unit 30 (estimating means, correcting means) shifts the still image data 51 based on the angle R so that the center position of the still image data 51 (the center position of the captured image) matches the position of the center point P2 of the subject OB1 as seen through the lens 210R (the center position of the user's field of view).

The image processing unit 30 performs the parallax correction and updates the still image data 51. With this correction, the character string that reaches the user's eye E through the center point of the lens 210R matches the character string that the image processing unit 30 targets for character recognition.
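The geometry of the parallax correction can be sketched as follows. The sketch assumes that the forward directions V1 and V2 are parallel and that P2 lies at the measured distance D1 in front of the lens 210R, so that tan R = D2 / D1; the description itself only states that a trigonometric function is used. The offset D2 of 3 cm, the focal length in pixels, and the function names are hypothetical values chosen for illustration.

```python
import math


def parallax_angle(d1_m, d2_m):
    """Angle R (radians) between the camera's forward direction V1 and the
    direction V3 toward P2, assuming tan(R) = D2 / D1."""
    return math.atan2(d2_m, d1_m)


def pixel_shift(d1_m, d2_m, focal_length_px):
    """Shift, in pixels, that moves the image center from P1 toward P2."""
    return focal_length_px * math.tan(parallax_angle(d1_m, d2_m))


D2 = 0.03          # assumed camera-to-lens offset: 3 cm
FOCAL_PX = 800.0   # assumed focal length expressed in pixels

near = pixel_shift(0.30, D2, FOCAL_PX)   # subject held at 30 cm
far = pixel_shift(100.0, D2, FOCAL_PX)   # distant subject at 100 m
print(round(near, 2), round(far, 2))     # about 80 pixels versus about 0.24 pixels
```

Under these assumptions the shift is tens of pixels at reading distance and well below one pixel for a distant scene, which is consistent with the observation that parallax matters mainly for nearby subjects.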

The image processing unit 30 (character recognition means) crops away the peripheral portion of the still image data 51 and performs character recognition on the remaining fixed region. Specifically, templates of the written characters of foreign languages, such as the alphabet and Hangul characters, are prepared in advance. Each pixel within the fixed central region of the still image data 51 is then compared with the templates. If the comparison finds a substantial match with a template character, that character is extracted as being contained in the still image data 51. The character recognition technique is not limited to pattern matching; known techniques such as the sonde method or the stroke analysis method may be used as appropriate. The image processing unit 30 stores the character string extracted by character recognition in the storage unit 50 as the extracted character string 55.
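The template comparison can be illustrated with a toy pattern-matching sketch. It assumes binarized 3x3 glyphs and scores a match as the fraction of agreeing pixels; the templates, the 0.85 threshold, and the function names are hypothetical and far simpler than a practical recognizer.

```python
# Hypothetical 3x3 binary templates ("1" = ink, "0" = background).
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def match_score(glyph, template):
    """Fraction of pixels that agree between a glyph and a template."""
    total = sum(len(row) for row in template)
    same = sum(
        g == t
        for g_row, t_row in zip(glyph, template)
        for g, t in zip(g_row, t_row)
    )
    return same / total


def recognize(glyph, threshold=0.85):
    """Return the best-matching character, or None if nothing matches well."""
    best_char, best_score = max(
        ((char, match_score(glyph, tpl)) for char, tpl in TEMPLATES.items()),
        key=lambda item: item[1],
    )
    return best_char if best_score >= threshold else None


noisy_t = ["111", "010", "011"]   # a "T" with one flipped pixel
print(recognize(noisy_t))          # prints T
```

The threshold expresses the "substantial match" idea: a glyph that differs from the template in one pixel is still accepted, while a blank region is rejected.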

The language processing unit 40 includes a CPU, a ROM, a RAM, and the like. It determines the language type 57 of the extracted character string 55, selects a translated word from the plurality of translated words acquired by the translation engine 60 based on a translated-word selection condition, and controls the display of translated words on the display unit 70, among other tasks.

The language processing unit 40 operates as follows. First, it determines the language type from the written form of the individual characters in the extracted character string 55 extracted by the image processing unit 30, the arrangement of the characters, and the like. For example, if the written form shows that the characters are Hangul, the language type is determined to be Korean and is stored in the storage unit 50 as the language type 57. If the characters of the extracted character string 55 are alphabetic, the language type 57, such as English, German, or French, is determined from the arrangement of the individual letters.
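A minimal sketch of this two-stage determination is shown below. The first stage classifies the script from the Unicode range of the characters (Hangul syllables occupy U+AC00 to U+D7A3). The second stage, which guesses among alphabetic languages from a small set of hint words, is a stand-in assumption; the description does not say how the arrangement of letters is analyzed, and the hint-word table and function names are hypothetical.

```python
def detect_script(text):
    """Classify text as 'hangul', 'latin', or 'unknown' from its characters."""
    letters = [ch for ch in text if not ch.isspace()]
    if letters and all("\uac00" <= ch <= "\ud7a3" for ch in letters):
        return "hangul"
    if letters and all(ch.isascii() and ch.isalpha() for ch in letters):
        return "latin"
    return "unknown"


# Hypothetical hint words standing in for a real letter-arrangement analysis.
HINT_WORDS = {
    "english": {"the", "and", "exit"},
    "german": {"der", "und", "ausgang"},
    "french": {"le", "et", "sortie"},
}


def detect_language(text):
    """Return a language name, or None when it cannot be determined."""
    script = detect_script(text)
    if script == "hangul":
        return "korean"
    if script == "latin":
        words = set(text.lower().split())
        for language, hints in HINT_WORDS.items():
            if words & hints:
                return language
    return None


print(detect_language("호텔"))      # prints korean
print(detect_language("Ausgang"))  # prints german
```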

The language processing unit 40 then transmits the extracted character string 55 and the language type 57 stored in the storage unit 50 to the translation engine 60, receives the translated words of the extracted character string 55 obtained by the translation processing of the translation engine 60, and stores them in the storage unit 50. When there are a plurality of translated words 59, the language processing unit 40 selects the translated words that satisfy the translated-word selection condition of being in common, frequent use.

Other possible translated-word selection conditions include, for example, whether a translated word relates to a field matching the user's purpose in using the translation apparatus 1, such as tourism or English literature, and whether it is close to the field of translated words previously acquired by the translation engine 60. The condition can be changed as appropriate. Selecting translated words that satisfy the selection condition in this way limits the displayed translated words to those close to the user's purpose, those in common use, and so on.

After selecting the translated words 59 that satisfy the translated-word selection condition, the language processing unit 40 (translated-word selection means) generates display data for displaying those translated words 59 and the extracted character string 55, for example as the list LST in FIG. 8, and outputs the display data to the display unit 70.
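The selection conditions discussed above (frequency of use and relevance to the user's field) could be applied as in the following sketch. The record layout, the frequency scores, the field labels, and the function name are all hypothetical; only the two translated words 旅館 and ホテル appear in this description.

```python
def select_translations(candidates, preferred_field=None, max_results=2):
    """Pick the translated words to display.

    candidates: list of dicts with 'word', 'frequency', and 'field' keys.
    Candidates in the preferred field are favored when any exist; the
    remainder are ranked by frequency of use, highest first.
    """
    if preferred_field is not None:
        in_field = [c for c in candidates if c["field"] == preferred_field]
        if in_field:
            candidates = in_field
    ranked = sorted(candidates, key=lambda c: c["frequency"], reverse=True)
    return [c["word"] for c in ranked[:max_results]]


# Illustrative candidates; the frequencies and fields are made up.
candidates = [
    {"word": "旅館", "frequency": 0.4, "field": "tourism"},
    {"word": "ホテル", "frequency": 0.9, "field": "tourism"},
    {"word": "rare-term", "frequency": 0.1, "field": "literature"},
]
print(select_translations(candidates, preferred_field="tourism"))
```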

The storage unit 50 is a functional unit that reads and writes data on an optical or magnetic storage medium, and is configured with, for example, an HDD (Hard Disk Drive). FIG. 4 is a diagram illustrating an example of the data configuration of the storage unit 50. As shown in FIG. 4, the storage unit 50 stores the still image data 51, the measurement distance 53, the extracted character string 55, the language type 57, and the translated words 59.

The translation engine 60 (translation means, multiple-translated-word acquisition means) includes a translation database (hereinafter, "database" is abbreviated "DB") in addition to a CPU, a ROM, a RAM, and the like. Based on the extracted character string 55 and its language type 57 input from the language processing unit 40, it searches the translation DB for translated words of the extracted character string 55 and outputs all of the search results to the language processing unit 40.

The translation DB is a data table that stores headwords in association with their translated words (in Japanese), and one is provided for each language type. For example, if the language type 57 output from the language processing unit 40 is Korean, the translation engine 60 selects the Korean translation DB and searches the headwords stored in it for one that matches the extracted character string 55. If a matching headword is found, the translated words associated with it are read from the translation DB and output to the language processing unit 40.
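The per-language lookup can be sketched with one dictionary per language type. The in-memory dictionaries, their entries, and the function name are assumptions for illustration; the description does not state how the translation DB is stored.

```python
# Hypothetical translation DBs: one headword-to-translations table per language.
TRANSLATION_DBS = {
    "english": {"HOTEL": ["旅館", "ホテル"]},
    "korean": {"호텔": ["ホテル"]},
}


def lookup(extracted, language):
    """Return every translated word for an exact headword match.

    An unknown language or headword yields an empty list.
    """
    db = TRANSLATION_DBS.get(language)
    if db is None:
        return []
    return list(db.get(extracted, []))


print(lookup("HOTEL", "english"))  # prints ['旅館', 'ホテル']
print(lookup("HOTEL", "korean"))   # prints []
```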

The display unit 70 displays various screens based on the display data input from the language processing unit 40, and corresponds to the display panel 256 shown in FIG. 2. When the translated words of the character string extracted from the still image data 51 are displayed on the display unit 70, an optical image of the translated words is presented to the user's eye E through the eyepiece optical system 220.

[Specific operation of the translation apparatus]
Next, the specific operation of the translation apparatus 1 will be described using the flowchart of FIG. 7, with reference to FIG. 8.

First, the operation mode is set according to the user's operation of the mode setting switch (mode setting processing; step S1). If the operation input unit 20 determines that the translation mode has been selected (step S3; Yes), it outputs an operation signal indicating that the translation mode is ON to the image processing unit 30. If it determines that another operation mode has been selected (step S3; No), processing for that operation mode is started (step S5).

The image processing unit 30 waits until an instruction to start moving-image shooting (for example, a press of the confirm switch) is input through the user's operation of the operation input unit 20. When the start instruction is input (step S7; Yes), it drives the photographing unit 10 to start moving-image shooting (step S9).

The image processing unit 30 then creates the still image data 51 from the moving-image data input from the photographing unit 10 as needed and stores it in the storage unit 50 (step S11), after which it performs parallax correction on the still image data 51 (step S15).

Next, the character string contained in the parallax-corrected still image data 51 is extracted (step S17), and whether the extraction succeeded is determined (step S19). If the image processing unit 30 determines that extraction of the character string failed (step S19; No), it applies various kinds of image processing to the still image data 51 (step S21) and then extracts the character string again.

The image processing in step S23 includes, for example, edge enhancement, which sharpens the outline of the character string contained in the still image data 51, and gradation correction, which increases (or decreases) the contrast of the still image data 51 as a whole. The image processing performed in step S21 is not limited to edge enhancement and gradation correction; known techniques such as scaling, binarization, and noise removal may be used as appropriate.

If character string extraction is performed on still image data 51 that has undergone such image processing and fails again (step S19; No), other kinds of image processing are applied in turn (steps S21 → S23). The same kind of image processing may also be applied in stages, for example by performing edge enhancement several times with different parameters.

Therefore, even if the still image data 51 created in step S11 is poorly suited to character recognition, repeatedly applying image processing to make the image easier to recognize improves the success rate of character recognition, that is, the character recognition rate.

When the image processing unit 30 determines that extraction of the character string has failed, it performs image processing in step S21. If every kind of image processing has already been applied (step S21; Yes), however, the process returns to step S9, moving-image shooting is performed again, and the processing of steps S11 to S19 is repeated.
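The retry behavior described in the preceding paragraphs can be summarized in a short control-flow sketch. The image is modeled as a small dictionary and the extraction and enhancement functions are toy stand-ins; these names and the success criterion are hypothetical and exist only to make the loop runnable.

```python
def extract_with_retries(image, extract, enhancements):
    """Try extraction; on failure apply each enhancement in turn and retry.

    Returns the extracted string, or None once every enhancement has been
    used without success (the caller would then shoot again).
    """
    text = extract(image)
    for enhance in enhancements:
        if text:
            break
        image = enhance(image)
        text = extract(image)
    return text or None


# Toy stand-ins for edge enhancement, gradation correction, and OCR.
def edge_enhance(img):
    return {**img, "sharpness": img["sharpness"] + 1}


def tone_correct(img):
    return {**img, "contrast": img["contrast"] + 1}


def toy_extract(img):
    readable = img["sharpness"] >= 1 and img["contrast"] >= 1
    return "HOTEL" if readable else ""


blurry = {"sharpness": 0, "contrast": 0}
print(extract_with_retries(blurry, toy_extract, [edge_enhance, tone_correct]))  # prints HOTEL
print(extract_with_retries(blurry, toy_extract, [edge_enhance]))                # prints None
```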

If the image processing unit 30 determines that the character string was extracted successfully (step S19; Yes), the language processing unit 40 determines the language type of the extracted character string 55 (step S25) and passes the language type 57 and the extracted character string 55 to the translation engine 60 (step S27).

The translation engine 60 selects the translation DB corresponding to the language type 57 and searches for translated words of the extracted character string 55 (translation processing; step S29). The translated words 59 obtained by the translation processing are returned to the language processing unit 40 and stored in the storage unit 50.

The language processing unit 40 selects, from among the translated words 59, those that satisfy the translated-word selection condition, that is, the most appropriate translated words (step S31). It then outputs the extracted character string 55 and the selected translated words to the display unit 70 for display (step S33).

For example, suppose the user, looking at the distant subject OB2 shown in FIG. 6(a), aligns the center point P2 of the lens 210R with the signboard SB and then inputs the instruction to start moving-image shooting. The photographing unit 10 then starts shooting the subject OB2. If the AF function measures the distance between the camera unit 260 and the subject OB2 as, for example, 100 m, the measurement distance 53 is sufficiently long, that is, the parallax is small, so the parallax correction in step S15 need not be performed.

The image processing unit 30 then extracts the character string "HOTEL" written on the signboard SB from the still image data 51. The language type 57 of the extracted character string 55 "HOTEL" is determined to be English, and the English translation DB is searched for translated words of "HOTEL".

When the extracted character string 55 "HOTEL" and its translated words "旅館" (ryokan, a Japanese-style inn) and "ホテル" (hotel) are displayed on the display unit 70 as the list LST, the optical image of the subject OB2 seen through the lens 210R and the optical image of the list LST are both guided to the user's eye E, producing an image like that in FIG. 8(a). The user can therefore wear the translation apparatus 1 and, for example, while walking around town, check the translation of anything of interest simply by looking at it and giving the instruction to start moving-image shooting.

Similarly, when the user, looking at the hand-held subject OB3 in FIG. 6(b), aligns the center point P2 of the lens 210R with "HOTELB", the image processing unit 30 shoots the subject OB3 through the camera unit 260. If the distance between the camera unit 260 and the subject OB3 is measured as, for example, 30 cm, the measurement distance 53 is short, that is, the parallax is large, so parallax correction is performed to bring the center point P1 of the angle of view of the camera unit 260 into line with the center point P2 of the user's viewpoint.

As a result of this parallax correction, the image processing unit 30 extracts the character string "HOTEL" located at the center point P2 on the still image data 51. As in the case of the distant subject OB2, when the list LST is displayed on the display unit 70, an image like that in FIG. 8(b) is presented to the user's eye E.

Because the extracted character string 55 is displayed in the list LST in this way, the user can confirm whether the character string recognized by the translation apparatus 1 is the intended one.

After the list LST is displayed in step S33, the user performs a translation continuation operation (for example, a press of the confirm switch) if the character string translated by the translation apparatus 1 was not the intended one, if the displayed translated word was not the desired one, or if the user wants the translation of another character string.

The operation input unit 20 determines whether the translation continuation operation has been performed (step S35). If it has (step S35; Yes), the process returns to step S9. If it has not (step S35; No), the processing of FIG. 7 in the translation mode ends.

Although the list LST is displayed at a position shifted from the center point P2 of the user's field of view as in FIGS. 8(a) and 8(b), it may instead be displayed at the center position of the subject OB2 (the center of the user's field of view), superimposed on the signboard SB, as shown in FIG. 8(c). In addition, the background color of the region of the still image data 51 where the list LST is displayed may be detected, and the display color of the list LST may be set to a color different from the detected background color. For example, when the background color is black, giving the list LST a different color makes it clearly visible.
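One possible way to pick a display color that differs from the detected background is to compare the background's brightness with a midpoint, as sketched below. Choosing between white and black text, the use of the Rec. 601 luma weights (0.299, 0.587, 0.114), and the function names are assumptions for illustration; the description only requires that the display color differ from the background color.

```python
def average_color(pixels):
    """Mean (R, G, B) of a list of RGB tuples."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def display_color(background_pixels):
    """Return white text for a dark background and black text for a light one."""
    r, g, b = average_color(background_pixels)
    luma = 0.299 * r + 0.587 * g + 0.114 * b   # Rec. 601 luma weights
    return (255, 255, 255) if luma < 128 else (0, 0, 0)


print(display_color([(0, 0, 0), (10, 10, 10)]))   # prints (255, 255, 255)
print(display_color([(250, 250, 250)]))            # prints (0, 0, 0)
```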

In step S33, the extracted character string 55 may also be output as audio in addition to being projected onto the user's eye E. In this case, the translation apparatus 1 is provided with an earphone, and audio data representing the pronunciation of each written character is stored in the storage unit 50 in advance. The language processing unit 40 reads the audio data corresponding to the extracted character string 55, synthesizes it, and outputs audio representing the pronunciation of the extracted character string 55 through the earphone. This also lets the user learn the pronunciation of the foreign-language text being translated. Alternatively, the translation engine 60 may store audio data representing the pronunciation of each translated word and output that audio data to the language processing unit 40.

[Evaluation of the translation apparatus]
Next, seven translation apparatuses 1a to 1g with different functional specifications were prepared and evaluated; the evaluation results are described below. The translation apparatuses 1a to 1g were evaluated as follows.

(1) A mock set 500 as shown in FIG. 9(a) is built, and signboards 502 to 518 are installed at the branch points.
(2) Walking instructions for the evaluator (for example, "turn right" or "turn left") are written on the signboards 502 to 518 in a foreign language (for example, Korean).
(3) The evaluator wears each of the translation apparatuses 1a to 1g in turn and walks through the mock set 500.
(4) The evaluator walks according to the translated walking instructions displayed by the translation apparatus.
(5) The evaluation ends when the evaluator, having followed five signboards, reaches a predetermined position; the time required to reach it and the ease of walking are recorded as the evaluation result.
(6) There are 20 evaluators.
(7) The content written on the signboards 502 to 518 is changed for each evaluation.
(8) To reduce time lost because the evaluator does not know how to use the translation apparatus, each evaluator evaluates the same translation apparatus twice, and the better of the two results is used.
(9) For each of the translation apparatuses 1a to 1g, the evaluation results of the 20 evaluators are averaged.
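Steps (8) and (9) of the procedure amount to a best-of-two selection per evaluator followed by an average, which can be sketched as below. Treating the shorter time as the better result is an assumption, as are the function name and the sample times, which are invented for illustration and are not measured data.

```python
from statistics import mean


def device_average(trials_per_evaluator):
    """Average required time for one apparatus.

    trials_per_evaluator: list of (first_run_s, second_run_s) pairs, one per
    evaluator. The better (shorter) run of each pair is kept, then averaged.
    """
    return mean(min(pair) for pair in trials_per_evaluator)


# Illustrative times in seconds for three evaluators (the procedure uses 20).
sample = [(130, 110), (100, 120), (90, 95)]
print(device_average(sample))  # prints 100
```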

The functional specifications of the translation apparatuses 1a to 1g were varied as shown in FIG. 9(b). The translation apparatuses 1a to 1e use an HOE as the display means and differ in their combination of parallax correction processing, the AS function, and image processing. In particular, the translation apparatus 1a uses an HOE as the display means and performs parallax correction, blur correction, and image processing; it corresponds to the translation apparatus 1 described above.

In addition, a translation apparatus 1f using a small half mirror as the display means and a translation apparatus 1g using a large half mirror were prepared. The translation apparatus 1g corresponds to the HMD device of Patent Document 1 described above.

According to the evaluation results in FIG. 9(b), with the translation apparatuses 1f and 1g, which use a half mirror, the housing covers the area in front of the evaluator's eyes, so the view becomes dark and the field of view narrows, making walking itself difficult.

By contrast, the translation apparatuses 1a to 1e, which use an HOE, leave the evaluator's view unobstructed and were found to make walking easy. As for the average time required to reach the predetermined position, the translation apparatuses 1f and 1g took 4 minutes or more and were therefore impractical, whereas the translation apparatuses 1a to 1e achieved a good result of less than 2 minutes, showing that they can be used in town or on public roads without major problems.

Translation devices 1b to 1d, each obtained by adding one of parallax correction, the AS function, or image processing to translation device 1e, had shorter average required times than translation device 1e, showing that each function is useful. Translation device 1a, which corresponds to the translation device 1 of the present embodiment, shortened the processing time from moving image shooting to display of the translated word compared with translation devices 1b to 1e, and had the shortest average required time among translation devices 1a to 1g.

As described above, according to the present embodiment, a character string is extracted from the still image data 51 and the translated word of that character string is displayed on the display unit 70, whereby an optical image of the translated word is projected onto the user's eye E via the HOE 212. When a user wearing the translation device 1 operates the operation input unit 20 while viewing a desired character string through the lens 210R, the translated word of that character string is displayed virtually, so the user can easily learn the translation of a foreign word while walking naturally through town, without any sense of incongruity. Because the user need not look up words in a dictionary while traveling abroad, it is less obvious that the user is a traveler, which also provides a crime-prevention benefit. Furthermore, unlike the conventional approach, there is no need to consult a dictionary or take a picture with a camera-equipped mobile terminal, so both of the user's hands are free.

Because the translation device 1 performs parallax correction on the still image data 51, it can appropriately extract the character string at the center of the user's field of view. In addition, equipping the photographing unit 10 with the AS function prevents blur in the still image data 51, which reduces the influence of the user's movement on character string extraction.
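
As a rough illustration of the parallax correction, the following sketch estimates how far the center of the photographed image is displaced from the center of the user's field of view for a subject at the measured distance. It assumes a pinhole camera whose optical axis is parallel to the user's line of sight, which this description does not state; the function name `parallax_shift_px`, the eye-to-camera offset, and the focal length in pixels are hypothetical.

```python
def parallax_shift_px(eye_camera_offset_m: float,
                      subject_distance_m: float,
                      focal_length_px: float) -> float:
    """Displacement, in pixels, between the image center and the view center.

    Assumes a pinhole camera with its optical axis parallel to the user's
    line of sight (an illustrative assumption, not taken from the description).
    """
    if subject_distance_m <= 0:
        raise ValueError("subject distance must be positive")
    # Similar triangles: the shift shrinks as the subject moves farther away.
    return focal_length_px * eye_camera_offset_m / subject_distance_m


# Hypothetical values: camera mounted 3 cm from the eye, focal length 1000 px.
near_shift = parallax_shift_px(0.03, 0.5, 1000.0)  # subject at 0.5 m
far_shift = parallax_shift_px(0.03, 5.0, 1000.0)   # subject at 5 m
```

Under these assumptions a near subject needs a larger correction than a distant one, which is consistent with using the measured distance as an input to the correction.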

When extraction of the character string fails, various kinds of image processing such as edge enhancement and gradation correction are applied in sequence and character string extraction is performed again. Such image processing makes the character string in the still image data 51 stand out, which improves the accuracy of character extraction. Even if character extraction fails once, performing character extraction on the still image data 51 as updated by each successive image processing step raises the probability that the character string is successfully extracted, that is, the character recognition rate.
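
The retry behavior described above can be sketched as follows. This is a minimal sketch, not the embodiment's implementation: `extract_with_retries`, the `extract` callable, and the list of enhancement callables are hypothetical stand-ins for the character extraction and for image processing such as edge enhancement and gradation correction.

```python
from typing import Callable, Iterable, Optional, TypeVar

Image = TypeVar("Image")


def extract_with_retries(
    image: Image,
    extract: Callable[[Image], Optional[str]],
    enhancements: Iterable[Callable[[Image], Image]],
) -> Optional[str]:
    """Try extraction; on failure apply the next enhancement and try again."""
    text = extract(image)
    for enhance in enhancements:
        if text:
            return text
        # Each step updates the same working image, so effects accumulate.
        image = enhance(image)
        text = extract(image)
    return text or None
```

The function stops at the first successful extraction, so later image processing steps are skipped once a character string has been obtained.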

[Modification]
In the present embodiment, moving image shooting is started by the user's operation of the operation input unit 20; however, the following configuration may be used instead. First, a line-of-sight detection unit that senses the user's eye movement and detects the line-of-sight direction is provided. This line-of-sight detection unit comprises an infrared light emitting unit, an infrared light receiving unit, an optical system, a calculation unit, and the like.

The infrared light emitting unit irradiates the user's eyeball with infrared light, and the infrared light receiving unit receives the reflected light. Based on the reflected light obtained from the infrared light receiving unit, the center of the pupil is extracted and the user's line of sight is detected. When the direction of the user's line of sight detected by the line-of-sight detection unit has remained constant for a predetermined time (for example, 3 seconds), the image processing unit 30 drives the photographing unit 10 to start moving image shooting.
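
The gaze-dwell condition above can be sketched as a small state machine. This is an illustrative sketch under stated assumptions: the class name `GazeDwellTrigger`, the representation of the line-of-sight direction as yaw and pitch angles, and the 2-degree tolerance used to decide that the direction is "constant" are all hypothetical; only the 3-second dwell time comes from the description.

```python
import math
from typing import Optional, Tuple


class GazeDwellTrigger:
    """Fires once when the gaze stays near one direction for dwell_s seconds."""

    def __init__(self, dwell_s: float = 3.0, tolerance_deg: float = 2.0) -> None:
        self.dwell_s = dwell_s
        self.tolerance_deg = tolerance_deg  # hypothetical "constant" threshold
        self._anchor: Optional[Tuple[float, float]] = None
        self._since_s = 0.0
        self._fired = False

    def update(self, t_s: float, yaw_deg: float, pitch_deg: float) -> bool:
        """Feed one gaze sample; return True when shooting should start."""
        moved = (
            self._anchor is None
            or math.hypot(yaw_deg - self._anchor[0], pitch_deg - self._anchor[1])
            > self.tolerance_deg
        )
        if moved:
            # Gaze left the tolerance window: restart the dwell timer here.
            self._anchor = (yaw_deg, pitch_deg)
            self._since_s = t_s
            self._fired = False
            return False
        if not self._fired and t_s - self._since_s >= self.dwell_s:
            self._fired = True
            return True
        return False
```

The trigger fires only once per fixation, so continued staring at the same character string would not restart shooting until the gaze moves away and settles again.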

As a result, the user can check the translated word simply by gazing at the character string to be translated, and no longer needs to operate the control unit 300.

In the embodiment, image processing of the still image data 51 is performed after the character extraction in step S17; however, image processing may instead be applied as preprocessing to the still image data 51 after the parallax correction in step S15. This raises the probability that the first character extraction succeeds in the character extraction repeated in steps S17 to S21.

Although the image processing unit 30, the language processing unit 40, the storage unit 50, and the translation engine 60 are provided in the control unit 300, all of these functional units may be miniaturized and provided in the display unit 250, or the operation input unit 20 may be formed integrally with the display unit 250 so that the translation device 1 can be operated on the HMD 200.

Although the translation engine 60 is built into the translation device 1, it may instead be provided on a server on the Internet. In this case, the translation device 1 further includes a communication unit that establishes a communication connection to the Internet, and exchanges data with the translation engine 60 via that communication unit. As described above, the translation engine 60 includes a translation DB, and the amount of data in the translation DB is enormous. Providing the translation engine 60 separately from the translation device 1 in this way can therefore improve the processing performance of the translation device 1.

The translation engine 60 provided on the server may also search the Internet for various information related to the extracted character string 55 and return the search results to the language processing unit 40. For example, when the extracted character string 55 is the name of a tourist spot, the translation engine 60 searches the Internet for that name and returns information about the tourist spot to the language processing unit 40. The language processing unit 40 displays the information returned from the translation engine 60 on the display unit 70. The user can thus check various information related to the extracted character string 55. Moreover, because the content stored by the translation engine 60 on the server can be updated, the latest information can always be displayed on the translation device 1.

Although the translation engine 60 searches for translated words based on the extracted character string 55 and the language type 57 transmitted from the language processing unit 40, a keyword set in advance by the user may also be used in the search.

For example, suppose that difficulty levels such as A to C are associated with the translated words stored in the translation DB. When the keyword "difficulty level C" is entered by the user's operation of the character input keys, those of the retrieved translated words whose difficulty level is C may be selected and output to the language processing unit 40. When "daily conversation" is entered as the keyword, translated words frequently used in daily conversation may be retrieved and output. This reduces the number of translated words input to the language processing unit 40 and thus the processing load of translated-word selection in the language processing unit 40.
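
The keyword-based narrowing described above might look like the following sketch. The record layout is an assumption: the description only says that difficulty levels are associated with translated words, so the `Translation` class, its `everyday` flag (standing in for "frequently used in daily conversation"), and the function `filter_translations` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Translation:
    text: str
    difficulty: str  # "A", "B", or "C", as in the example above
    everyday: bool   # hypothetical flag for daily-conversation usage


def filter_translations(candidates: Sequence[Translation],
                        keyword: Optional[str]) -> List[Translation]:
    """Narrow retrieved translated words using the user's preset keyword."""
    if keyword is None:
        return list(candidates)
    if keyword.startswith("difficulty level "):
        level = keyword.rsplit(" ", 1)[-1]
        return [c for c in candidates if c.difficulty == level]
    if keyword == "daily conversation":
        return [c for c in candidates if c.everyday]
    # Unknown keywords leave the candidate list unchanged.
    return list(candidates)
```

Filtering on the translation engine side, before the results reach the language processing unit 40, is what reduces the selection load mentioned above.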

Although the translation engine 60 searches for the translated word of an extracted character string 55 that is a single word, when the extracted character string 55 is a sentence, it may be translated according to the grammar rules of the language type 57. Translation according to grammar rules is a known technique, so its description is omitted.

Although the language processing unit 40 determines the language type 57 from the character types and character arrangement of the extracted character string 55, a positioning module that receives GPS signals transmitted from GPS satellites and measures the current position of the translation device 1 may be provided, and the language type 57 may be determined based on that current position. More specifically, the language processing unit 40 determines the country in which the translation device 1 is located based on the current position measured by the positioning module.

Then, the language type corresponding to the country in which the translation device 1 is located is read from a language-in-use table that stores in advance country names (for example, Germany) in association with the language types used in those countries (for example, German). Next, the translation DB for the read language type is selected, and the translated word of the extracted character string 55 is acquired based on that translation DB. Determining the language type 57 from the current position measured by the positioning module in this way makes it easy to select among, for example, English, German, and French, all of which are written in the Latin alphabet.
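
The table lookup described above can be sketched as follows. Only the Germany-to-German pair comes from the description; the other table entries, the names `LANGUAGE_BY_COUNTRY` and `select_language`, and the fallback to the language determined from the character string itself are illustrative assumptions.

```python
from typing import Optional

# Language-in-use table: country name -> language type.
# Only "Germany" -> "German" appears in the description; the rest are examples.
LANGUAGE_BY_COUNTRY = {
    "Germany": "German",
    "France": "French",
    "United Kingdom": "English",
}


def select_language(country: str,
                    detected_from_text: Optional[str] = None) -> Optional[str]:
    """Return the language type for the country the device is located in.

    Falls back to the language determined from the extracted character string
    when the country is not in the table (an assumption of this sketch).
    """
    return LANGUAGE_BY_COUNTRY.get(country, detected_from_text)
```

The returned language type would then be used to choose which translation DB to search.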

Prior to the moving image shooting in step S9, shooting conditions of the photographing unit 10 such as the zoom ratio and the exposure amount may be set in advance. For example, when the translation device 1 is used under strong external light, the exposure amount is set low in advance with the operation input unit 20 (setting means). When the translation device 1 is used outdoors, the zoom ratio is set high. Setting the shooting conditions appropriately before moving image shooting in this way yields still image data 51 that contains a character string of appropriate size with clear outlines, improving the character recognition rate.

FIG. 1 is an example of a perspective view of the translation device.
FIG. 2 is an example of a cross-sectional view of the display unit and the eyepiece optical system.
FIG. 3 is a block diagram showing an example of the functional configuration of the translation device.
FIG. 4 is a diagram showing an example of the data structure of the storage unit.
FIG. 5 is a diagram for explaining parallax correction.
FIG. 6 is a diagram showing the positional relationship between the center of the user's field of view and the center of the angle of view of the camera unit when the subject is a distant view and a near view.
FIG. 7 is a flowchart for explaining the specific operation of the translation device.
FIG. 8 is an example of the image that the translation device projects onto the user's eye.
FIG. 9 shows (a) an example of the mock set used to evaluate the translation devices and (b) a table of the evaluation results.

Explanation of symbols

1 Translation device
10 Photographing unit
20 Operation input unit
30 Image processing unit
40 Language processing unit
50 Storage unit
51 Still image data
53 Measured distance
55 Extracted character string
57 Language type
59 Translated word
60 Translation engine
70 Display unit
220 Eyepiece optical system
210R Lens
214 Prism
250 Display unit
252 Light source
254 Condenser lens
256 Display panel
260 Camera unit
290 Cable
300 Control unit
310 Operation switch
E Eye

Claims

A translation device comprising:
photographing means for photographing a subject and generating a photographed image;
character recognition means for performing character recognition on the photographed image generated by the photographing means;
translation means for translating the character string recognized by the character recognition means and acquiring a translated word of the character string;
video projection means for projecting an optical image of the translated word acquired by the translation means; and
an eyepiece optical system that is formed of a transmissive material including a holographic optical element and that guides the optical image projected by the video projection means to a user's eye by the optical effect of the holographic optical element.

The translation device according to claim 1, wherein the photographing means is disposed on the translation device so as to be able to photograph a subject within the user's field of view.

The translation device according to claim 1 or 2, further comprising:
estimation means for estimating a deviation between the center position of the user's field of view and the center position of the photographed image; and
correction means for aligning the center position of the photographed image with the center position of the user's field of view based on the deviation estimated by the estimation means.

The translation device according to any one of claims 1 to 3, wherein
the estimation means includes measurement means for measuring the distance between the subject and the photographing means, and
the correction means aligns the center position of the photographed image with the center position of the user's field of view based on the distance between the center position of the user's field of view and the center position of the photographing means and on the distance measured by the measurement means.

The translation device according to any one of claims 1 to 4, further comprising setting means for setting shooting conditions,
wherein the photographing means photographs the subject based on the shooting conditions set by the setting means.

The translation device according to claims 1 to 5, further comprising blur correction means for correcting blur of the photographing means.

The translation device according to any one of claims 1 to 6, further comprising image processing means for performing image processing on the photographed image generated by the photographing means,
wherein the character recognition means performs character recognition on the photographed image that has undergone image processing by the image processing means.

The translation device according to any one of claims 1 to 7, wherein the video projection means includes recognized-character projection means for projecting an optical image of the character string recognized by the character recognition means.

The translation device according to any one of claims 1 to 8, wherein
the translation means includes multiple-translation acquisition means for acquiring a plurality of translated words of the recognized character string,
the translation device further comprises translated-word selection means for selecting, from among the plurality of translated words acquired by the multiple-translation acquisition means, a translated word that satisfies a translated-word selection condition, and
the video projection means projects an optical image of the translated word selected by the translated-word selection means.

The translation device according to any one of claims 1 to 9, wherein the translation device is formed in the shape of eyeglasses wearable on the user's head, and the eyepiece optical system is positioned in front of the user's eye when the device is worn.